[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437)

* Using loaded checkpoint with --do_predict

Without this fix, I get near-random validation performance for a trained model, and the validation performance differs between runs. I think this happens because the `model` variable is never set to the loaded checkpoint, so evaluation runs on a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (with this fix, they don't).

* Update checkpoint loading

* Fixing model loading
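
The root cause is that `load_from_checkpoint` is a classmethod-style constructor: it returns a new model instance rather than mutating the one it is called on, so the returned value must be assigned back to `model`. A minimal sketch in plain Python (no PyTorch Lightning dependency; the class and checkpoint path are illustrative):

```python
class Transformer:
    def __init__(self, weights):
        self.weights = weights

    @classmethod
    def load_from_checkpoint(cls, path):
        # The real API reads weights from the checkpoint file;
        # here we fake it to show the binding behavior.
        return cls(weights="trained")

model = Transformer(weights="random")

# Buggy pattern: the returned, trained model is discarded.
Transformer.load_from_checkpoint("checkpointepoch=2.ckpt")
assert model.weights == "random"  # still the randomly initialized model

# Fixed pattern: rebind `model` to the loaded checkpoint.
model = model.load_from_checkpoint("checkpointepoch=2.ckpt")
assert model.weights == "trained"
```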
Ethan Perez 2020-03-30 16:06:08 -05:00 committed by GitHub
parent 8deff3acf2
commit e5c393dceb
2 changed files with 2 additions and 2 deletions


@@ -192,5 +192,5 @@ if __name__ == "__main__":
# Optionally, predict on dev set and write to output_dir
if args.do_predict:
checkpoints = list(sorted(glob.glob(os.path.join(args.output_dir, "checkpointepoch=*.ckpt"), recursive=True)))
-        GLUETransformer.load_from_checkpoint(checkpoints[-1])
+        model = model.load_from_checkpoint(checkpoints[-1])
trainer.test(model)


@@ -192,5 +192,5 @@ if __name__ == "__main__":
# https://github.com/PyTorchLightning/pytorch-lightning/blob/master\
# /pytorch_lightning/callbacks/model_checkpoint.py#L169
checkpoints = list(sorted(glob.glob(os.path.join(args.output_dir, "checkpointepoch=*.ckpt"), recursive=True)))
-        NERTransformer.load_from_checkpoint(checkpoints[-1])
+        model = model.load_from_checkpoint(checkpoints[-1])
trainer.test(model)