transformers/docs/source
Patrick von Platen dca34695d0
Reformer (#3351)
* first copy & paste commit from BERT and Morgan's LSH code

* add easy way to compare to trax original code

* translate most of the functions

* make trax lsh self attention deterministic with numpy seed + copy paste code

* add same config

* add same config

* make layer init work

* implemented hash_vectors function for lsh attention

* continue reformer translation

* hf LSHSelfAttentionLayer gives same output as trax layer

* refactor code

* refactor code

* refactor code

* refactor

* refactor + add reformer config

* delete bogus file

* split reformer attention layer into two layers

* save intermediate step

* save intermediate step

* make test work

* add complete reformer block layer

* finish reformer layer

* implement causal and self mask

* clean reformer test and refactor code

* fix merge conflicts

* fix merge conflicts

* update init

* fix device for GPU

* fix chunk length init for tests

* include Morgan's optimization

* improve memory a bit

* improve comment

* factorize num_buckets

* better testing parameters

* make whole model work

* make lm model work

* add T5 copy-paste tokenizer

* add chunking feed forward

* clean config

* add improved assert statements

* make tokenizer work

* improve test

* correct typo

* extend config

* add more complex test

* add new axial position embeddings

* add local block attention layer

* clean tests

* refactor

* better testing

* save intermediate progress

* clean test file

* make shorter input length work for model

* allow variable input length

* refactor

* make forward pass for pretrained model work

* add generation support

* finish dropout and init

* make style

* refactor

* add first version of RevNet Layers

* make forward pass work and add convert file

* make uploaded model forward pass work

* make uploaded model forward pass work

* refactor code

* add namedtuples and cache buckets

* correct head masks

* refactor

* made reformer more flexible

* make style

* remove set max length

* add attention masks

* fix up tests

* fix lsh attention mask

* make random seed optional for the moment

* improve memory in reformer

* add tests

* make style

* make sure masks work correctly

* detach gradients

* save intermediate

* correct backprop through gather

* make style

* change back num hashes

* rename to labels

* fix rotation shape

* fix detach

* update

* fix trainer

* fix backward dropout

* make reformer more flexible

* fix conflict

* fix

* fix

* add tests for fixed seed in reformer layer

* fix trainer typo

* fix typo in activations

* add fp16 tests

* add fp16 training

* support fp16

* correct gradient bug in reformer

* add fast gelu

* re-add dropout for embedding dropout

* better naming

* better naming

* renaming

* finalize test branch

* finalize tests

* add more tests

* finish tests

* fix

* fix type in trainer

* fix fp16 tests

* fix tests

* fix tests

* fix tests

* fix issue with dropout

* fix dropout seeds

* correct random seed on GPU

* finalize random seed for dropout

* finalize random seed for dropout

* remove duplicate line

* correct half precision bug

* make style

* refactor

* refactor

* docstring

* remove sinusoidal position encodings for reformer

* move chunking to modeling_utils

* make style

* clean config

* make style

* fix tests

* fix auto tests

* pretrained models

* fix docstring

* update conversion file

* Update pretrained_models.rst

* fix rst

* fix rst

* update copyright

* fix test path

* fix test path

* fix small issue in test

* include reformer in generation tests

* add docs for axial position encoding

* finish docs

* Update convert_reformer_trax_checkpoint_to_pytorch.py

* remove isort

* include Sam's comments

* remove wrong comment in utils

* correct typos

* fix typo

* Update reformer.rst

* apply Morgan's optimization

* make style

* make gpu compatible

* remove bogus file

* big test refactor

* add example for chunking

* fix typo

* add to README
2020-05-07 10:17:01 +02:00
_static Adding usage examples for common tasks (#2850) 2020-02-25 13:48:24 -05:00
imgs GPU text generation: Moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
main_classes Reformer (#3351) 2020-05-07 10:17:01 +02:00
model_doc Reformer (#3351) 2020-05-07 10:17:01 +02:00
benchmarks.md GPU text generation: Moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
bertology.rst Fixes #3877 2020-04-22 01:15:10 +00:00
conf.py Release: v2.8.0 2020-04-06 10:03:53 -04:00
converting_tensorflow_models.rst GPU text generation: Moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
examples.md [docs] Doc tweaks 2019-09-26 18:19:51 -04:00
favicon.ico Adding usage examples for common tasks (#2850) 2020-02-25 13:48:24 -05:00
glossary.rst Reformer (#3351) 2020-05-07 10:17:01 +02:00
index.rst Reformer (#3351) 2020-05-07 10:17:01 +02:00
installation.md CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186) 2020-03-17 10:17:11 -04:00
migration.md weigths → weights 2020-04-04 15:03:26 -04:00
model_sharing.md [doc] --organization tweak 2020-03-10 16:52:44 -04:00
multilingual.rst docs: add xlm-roberta section to multi-lingual section (#4101) 2020-05-01 11:06:58 -04:00
notebooks.md Update notebooks (#3620) 2020-04-06 14:32:39 -04:00
pretrained_models.rst Reformer (#3351) 2020-05-07 10:17:01 +02:00
quickstart.md Delete all mentions of Model2Model (#3019) 2020-02-26 11:36:27 -05:00
serialization.rst [docs] The use of do_lower_case in scripts is on its way to deprecation (#3738) 2020-04-10 12:34:04 -04:00
torchscript.rst GPU text generation: Moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
usage.rst [Docs] Add usage examples for translation and summarization (#3538) 2020-03-31 09:36:03 -04:00