transformers/pytorch_pretrained_bert
Abhi Sharma 9e666aaa29
Fix gradient overflow issue during attention mask
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing the attention-mask constant from 1e10 to 1e4, and the outputs were identical. Please let me know if there is anything else I can do to improve this pull request. Thanks for all your work!
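A minimal NumPy sketch (not the PR's actual code, which lives in modeling_gpt2.py) of why the magnitude of the mask constant matters in mixed precision: fp16 can only represent magnitudes up to about 65504, so a fill value of -1e10 overflows to -inf, whereas -1e4 is representable and still drives masked softmax weights to effectively zero.

```python
import numpy as np

# fp16 (half precision) tops out around 65504, so -1e10 overflows to -inf,
# which can poison gradients during mixed-precision training. -1e4 stays
# finite while still making masked positions vanish after softmax.
big = np.float16(-1e10)   # overflows to -inf in fp16
safe = np.float16(-1e4)   # representable in fp16

scores = np.array([2.0, 1.0, 0.5], dtype=np.float16)
mask = np.array([1, 1, 0], dtype=np.float16)  # hide the last position

# additive attention masking, as in the GPT-2 attention layer
masked = scores * mask + safe * (1 - mask)

# numerically stable softmax (computed in fp32 for the demo)
m = masked.astype(np.float32)
p = np.exp(m - m.max())
p /= p.sum()
# the masked position receives (numerically) zero probability mass
```

With -1e4 the masked logit is finite, so both the forward softmax and its gradient stay well-behaved in fp16; with -1e10 the -inf can propagate NaNs through the backward pass.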
2019-04-16 11:42:34 -07:00
__init__.py added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
__main__.py adding gpt2 2019-02-17 23:38:51 +01:00
convert_gpt2_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
convert_openai_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
convert_tf_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
convert_transfo_xl_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
file_utils.py added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
modeling_gpt2.py Fix gradient overflow issue during attention mask 2019-04-16 11:42:34 -07:00
modeling_openai.py Merge pull request #489 from huggingface/tokenization_serialization 2019-04-16 08:49:54 +02:00
modeling_transfo_xl_utilities.py fix typo - logger info 2019-03-06 10:05:21 +01:00
modeling_transfo_xl.py load all models on cpu 2019-04-15 15:43:01 +02:00
modeling.py Merge pull request #489 from huggingface/tokenization_serialization 2019-04-16 08:49:54 +02:00
optimization_openai.py same 2019-03-18 15:13:35 +01:00
optimization.py branches, optim cosine fix 2019-03-18 13:18:07 +01:00
tokenization_gpt2.py fixing tests 2019-04-15 12:55:38 +02:00
tokenization_openai.py fix openai special tokens loading 2019-04-15 16:07:45 +02:00
tokenization_transfo_xl.py tokenization updates 2019-04-15 14:24:52 +02:00
tokenization.py tokenization updates 2019-04-15 14:24:52 +02:00