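I had a similar problem. It turned out that the image ids and the predictions were getting mixed up in my submission file: with shuffle=True the generator yields the images in a random order, so the predictions no longer line up with the generator's file list. Changing shuffle from True to False in gen.flow_from_directory in Keras fixed it for me: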
test = gen.flow_from_directory(directory,
                               target_size=(224, 224),
                               class_mode='categorical',
                               shuffle=False,
                               batch_size=batch_size)
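
In case it helps, here is a rough sketch of how the ids and predictions can then be paired up for the submission file. The helper build_submission and the sample data are made up for illustration. In a real run I would expect the file list to come from test.filenames and the scores from model.predict(test), and the submission format of your competition may differ.

```python
import os


def build_submission(filenames, predictions):
    """Pair each image id with the index of its highest-scoring class.

    Hypothetical helper: it assumes both sequences are in the same order,
    which is what shuffle=False is meant to guarantee.
    """
    if len(filenames) != len(predictions):
        raise ValueError("filenames and predictions must have the same length")
    rows = []
    for name, probs in zip(filenames, predictions):
        # Use the file name without directory or extension as the image id.
        image_id = os.path.splitext(os.path.basename(name))[0]
        # Index of the largest score, i.e. the predicted class.
        predicted = max(range(len(probs)), key=lambda i: probs[i])
        rows.append((image_id, predicted))
    return rows


# Stand-in data (assumption): with Keras these would be
# test.filenames and model.predict(test).
filenames = ["unknown/10.jpg", "unknown/2.jpg", "unknown/7.jpg"]
predictions = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]

rows = build_submission(filenames, predictions)
for image_id, predicted in rows:
    print(image_id, predicted)
```

Note that the ids come out in the generator's own file order, so sort the rows by id afterwards if your submission needs a particular order.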