[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437)

* Using loaded checkpoint with --do_predict

Without this fix, I get near-random validation performance for a trained model, and the validation performance differs between runs. I think this happens because the `model` variable is never set to the loaded checkpoint, so evaluation runs on a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (with this fix, they don't).

* Update checkpoint loading

* Fixing model loading
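
The root cause is that `load_from_checkpoint` is a classmethod-style constructor: it returns a new model instance rather than mutating the one it is called on, so the returned value must be assigned back to `model`. A minimal sketch in plain Python (no PyTorch Lightning dependency; the class and checkpoint path are illustrative):

```python
class Transformer:
    def __init__(self, weights):
        self.weights = weights

    @classmethod
    def load_from_checkpoint(cls, path):
        # The real API reads weights from the checkpoint file;
        # here we fake it to show the binding behavior.
        return cls(weights="trained")

model = Transformer(weights="random")

# Buggy pattern: the returned, trained model is discarded.
Transformer.load_from_checkpoint("checkpointepoch=2.ckpt")
assert model.weights == "random"  # still the randomly initialized model

# Fixed pattern: rebind `model` to the loaded checkpoint.
model = model.load_from_checkpoint("checkpointepoch=2.ckpt")
assert model.weights == "trained"
```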
Ethan Perez 2020-03-30 16:06:08 -05:00 committed by GitHub
parent 8deff3acf2
commit e5c393dceb
2 changed files with 2 additions and 2 deletions


@@ -192,5 +192,5 @@ if __name__ == "__main__":
# Optionally, predict on dev set and write to output_dir
if args.do_predict:
checkpoints = list(sorted(glob.glob(os.path.join(args.output_dir, "checkpointepoch=*.ckpt"), recursive=True)))
-        GLUETransformer.load_from_checkpoint(checkpoints[-1])
+        model = model.load_from_checkpoint(checkpoints[-1])
trainer.test(model)


@@ -192,5 +192,5 @@ if __name__ == "__main__":
# https://github.com/PyTorchLightning/pytorch-lightning/blob/master\
# /pytorch_lightning/callbacks/model_checkpoint.py#L169
checkpoints = list(sorted(glob.glob(os.path.join(args.output_dir, "checkpointepoch=*.ckpt"), recursive=True)))
-        NERTransformer.load_from_checkpoint(checkpoints[-1])
+        model = model.load_from_checkpoint(checkpoints[-1])
trainer.test(model)