transformers/pytorch_pretrained_bert
Abhi Sharma 9e666aaa29
Fix gradient overflow issue during attention mask
This fix is in reference to issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing the attention-mask constant from 1e10 to 1e4, and the outputs were identical. Please let me know if there is anything else I can do to improve this pull request. Thanks for all your work!
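A minimal NumPy sketch (not the PR's actual code, which lives in modeling_gpt2.py) of why the magnitude of the mask constant matters in mixed precision: fp16 can only represent magnitudes up to about 65504, so a fill value of -1e10 overflows to -inf, whereas -1e4 is representable and still drives masked softmax weights to effectively zero.

```python
import numpy as np

# fp16 (half precision) tops out around 65504, so -1e10 overflows to -inf,
# which can poison gradients during mixed-precision training. -1e4 stays
# finite while still making masked positions vanish after softmax.
big = np.float16(-1e10)   # overflows to -inf in fp16
safe = np.float16(-1e4)   # representable in fp16

scores = np.array([2.0, 1.0, 0.5], dtype=np.float16)
mask = np.array([1, 1, 0], dtype=np.float16)  # hide the last position

# additive attention masking, as in the GPT-2 attention layer
masked = scores * mask + safe * (1 - mask)

# numerically stable softmax (computed in fp32 for the demo)
m = masked.astype(np.float32)
p = np.exp(m - m.max())
p /= p.sum()
# the masked position receives (numerically) zero probability mass
```

With -1e4 the masked logit is finite, so both the forward softmax and its gradient stay well-behaved in fp16; with -1e10 the -inf can propagate NaNs through the backward pass.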
2019-04-16 11:42:34 -07:00
__init__.py added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
__main__.py adding gpt2 2019-02-17 23:38:51 +01:00
convert_gpt2_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
convert_openai_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
convert_tf_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
convert_transfo_xl_checkpoint_to_pytorch.py fix typo - logger info 2019-03-06 10:05:21 +01:00
file_utils.py added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
modeling_gpt2.py Fix gradient overflow issue during attention mask 2019-04-16 11:42:34 -07:00
modeling_openai.py Merge pull request #489 from huggingface/tokenization_serialization 2019-04-16 08:49:54 +02:00
modeling_transfo_xl_utilities.py fix typo - logger info 2019-03-06 10:05:21 +01:00
modeling_transfo_xl.py load all models on cpu 2019-04-15 15:43:01 +02:00
modeling.py Merge pull request #489 from huggingface/tokenization_serialization 2019-04-16 08:49:54 +02:00
optimization_openai.py same 2019-03-18 15:13:35 +01:00
optimization.py branches, optim cosine fix 2019-03-18 13:18:07 +01:00
tokenization_gpt2.py fixing tests 2019-04-15 12:55:38 +02:00
tokenization_openai.py fix openai special tokens loading 2019-04-15 16:07:45 +02:00
tokenization_transfo_xl.py tokenization updates 2019-04-15 14:24:52 +02:00
tokenization.py tokenization updates 2019-04-15 14:24:52 +02:00