Commit Graph

57 Commits

Sylvain Gugger
011cc0be51
Fix all sphinx warnings (#5068) 2020-06-16 16:50:02 -04:00
Sam Shleifer
a9f1fc6c94
Add bart-base (#5014) 2020-06-15 13:29:26 -04:00
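
A minimal sketch of loading the newly added checkpoint; the `facebook/bart-base` model id follows the post-archive-map hub naming, and the tokenizer call style is the current API, both assumptions beyond this commit:

```python
from transformers import BartModel, BartTokenizer

# "facebook/bart-base" assumes the hub model id used after the archive-map removal
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs)  # last_hidden_state: (batch, seq_len, hidden)
```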
Julien Chaumond
d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
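
After this change, weights resolve by hub model id rather than through hard-coded archive-map URL constants. A minimal sketch of the resulting loading path, using `bert-base-uncased` purely as an example id:

```python
from transformers import AutoModel, AutoTokenizer

# the model id is resolved on the model hub; no *_ARCHIVE_MAP constant is consulted
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```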
Patrick von Platen
48c3a70b4e
[Longformer] Docs and clean API (#4464)
* add longformer docs

* improve docs
2020-05-19 21:52:36 +02:00
Iz Beltagy
8f1d047148
Longformer (#4352)
* first commit

* bug fixes

* better examples

* undo padding

* remove wrong VOCAB_FILES_NAMES

* License

* make style

* make isort happy

* unit tests

* integration test

* make `black` happy by undoing `isort` changes!!

* lint

* no need for the padding value

* batch_size not bsz

* remove unused type casting

* seqlen not seq_len

* staticmethod

* `bert` self-attention instead of `n2`

* uint8 instead of bool + lints

* pad inputs_embeds using embeddings not a constant

* black

* unit test with padding

* fix unit tests

* remove redundant unit test

* upload model weights

* resolve todo

* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_

* increase unittest coverage
2020-05-19 16:04:43 +02:00
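
A hedged sketch of using the sliding-window attention this commit adds. The `global_attention_mask` keyword follows the later, current API; the initial release signalled global attention through attention_mask values instead:

```python
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("A very long document ...", return_tensors="pt")
# 0 = local sliding-window attention, 1 = global attention
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1  # let the <s> token attend globally
outputs = model(**inputs, global_attention_mask=global_attention_mask)
```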
Patrick von Platen
ac7d5f67a2
[Reformer] Add Enwiki8 Reformer Model - Adapt convert script (#4282)
* adapt convert script

* update convert script

* finish

* fix marian pretrained docs
2020-05-11 16:38:07 +02:00
Sam Shleifer
3487be75ef
[Marian] documentation and AutoModel support (#4152)
- MarianSentencepieceTokenizer -> MarianTokenizer
- Start using unk token.
- add docs page
- add better generation params to MarianConfig
- more conversion utilities
2020-05-10 13:54:57 -04:00
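
A sketch of translation with the renamed `MarianTokenizer`; the batching call below uses today's tokenizer API, which postdates this PR, and `Helsinki-NLP/opus-mt-en-de` is just one of the converted checkpoints:

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # example checkpoint, en -> de
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["Hello, how are you?"], return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```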
Patrick von Platen
dca34695d0
Reformer (#3351)
* first copy & paste commit from Bert and Morgan's LSH code

* add easy way to compare to trax original code

* translate most of function

* make trax lsh self attention deterministic with numpy seed + copy paste code

* add same config

* add same config

* make layer init work

* implemented hash_vectors function for lsh attention

* continue reformer translation

* hf LSHSelfAttentionLayer gives same output as trax layer

* refactor code

* refactor code

* refactor code

* refactor

* refactor + add reformer config

* delete bogus file

* split reformer attention layer into two layers

* save intermediate step

* save intermediate step

* make test work

* add complete reformer block layer

* finish reformer layer

* implement causal and self mask

* clean reformer test and refactor code

* fix merge conflicts

* fix merge conflicts

* update init

* fix device for GPU

* fix chunk length init for tests

* include Morgan's optimization

* improve memory a bit

* improve comment

* factorize num_buckets

* better testing parameters

* make whole model work

* make lm model work

* add t5 copy paste tokenizer

* add chunking feed forward

* clean config

* add improved assert statements

* make tokenizer work

* improve test

* correct typo

* extend config

* add more complex test

* add new axial position embeddings

* add local block attention layer

* clean tests

* refactor

* better testing

* save intermediate progress

* clean test file

* make shorter input length work for model

* allow variable input length

* refactor

* make forward pass for pretrained model work

* add generation possibility

* finish dropout and init

* make style

* refactor

* add first version of RevNet Layers

* make forward pass work and add convert file

* make uploaded model forward pass work

* make uploaded model forward pass work

* refactor code

* add namedtuples and cache buckets

* correct head masks

* refactor

* made reformer more flexible

* make style

* remove set max length

* add attention masks

* fix up tests

* fix lsh attention mask

* make random seed optional for the moment

* improve memory in reformer

* add tests

* make style

* make sure masks work correctly

* detach gradients

* save intermediate

* correct backprop through gather

* make style

* change back num hashes

* rename to labels

* fix rotation shape

* fix detach

* update

* fix trainer

* fix backward dropout

* make reformer more flexible

* fix conflict

* fix

* fix

* add tests for fixed seed in reformer layer

* fix trainer typo

* fix typo in activations

* add fp16 tests

* add fp16 training

* support fp16

* correct gradient bug in reformer

* add fast gelu

* re-add dropout for embedding dropout

* better naming

* better naming

* renaming

* finalize test branch

* finalize tests

* add more tests

* finish tests

* fix

* fix type trainer

* fix fp16 tests

* fix tests

* fix tests

* fix tests

* fix issue with dropout

* fix dropout seeds

* correct random seed on gpu

* finalize random seed for dropout

* finalize random seed for dropout

* remove duplicate line

* correct half precision bug

* make style

* refactor

* refactor

* docstring

* remove sinusoidal position encodings for reformer

* move chunking to modeling_utils

* make style

* clean config

* make style

* fix tests

* fix auto tests

* pretrained models

* fix docstring

* update conversion file

* Update pretrained_models.rst

* fix rst

* fix rst

* update copyright

* fix test path

* fix test path

* fix small issue in test

* include reformer in generation tests

* add docs for axial position encoding

* finish docs

* Update convert_reformer_trax_checkpoint_to_pytorch.py

* remove isort

* include Sam's comments

* remove wrong comment in utils

* correct typos

* fix typo

* Update reformer.rst

* applied Morgan's optimization

* make style

* make gpu compatible

* remove bogus file

* big test refactor

* add example for chunking

* fix typo

* add to README
2020-05-07 10:17:01 +02:00
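
A sketch of generation with the resulting model, assuming the `google/reformer-crime-and-punishment` checkpoint uploaded alongside this PR:

```python
from transformers import ReformerModelWithLMHead, ReformerTokenizer

model_name = "google/reformer-crime-and-punishment"
tokenizer = ReformerTokenizer.from_pretrained(model_name)
model = ReformerModelWithLMHead.from_pretrained(model_name)

input_ids = tokenizer("A few months later", return_tensors="pt").input_ids
# LSH attention and reversible layers keep long-context generation memory-efficient
generated = model.generate(input_ids, do_sample=True, max_length=100)
print(tokenizer.decode(generated[0]))
```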
Patrick von Platen
d22894dfd4
[Docs] Add DialoGPT (#3755)
* add dialoGPT

* update README.md

* fix conflict

* update readme

* add code links to docs

* Update README.md

* Update dialo_gpt2.rst

* Update pretrained_models.rst

* Update docs/source/model_doc/dialo_gpt2.rst

Co-Authored-By: Julien Chaumond <chaumond@gmail.com>

* change filename of dialogpt

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-16 09:04:32 +02:00
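
A sketch of a single conversational turn with DialoGPT. The `AutoModelForCausalLM` class postdates this commit (the contemporary class was `AutoModelWithLMHead`); `microsoft/DialoGPT-medium` is the published checkpoint id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# each dialogue turn ends with the EOS token; the model continues the conversation
input_ids = tokenizer.encode("Does money buy happiness?" + tokenizer.eos_token, return_tensors="pt")
reply_ids = model.generate(input_ids, max_length=100, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```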
Sam Shleifer
7a7fdf71f8
Multilingual BART (#3602)
- support mbart-en-ro weights
- add MBartTokenizer
2020-04-10 11:25:39 -04:00
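
A sketch of en-ro translation with the new tokenizer and weights. The `MBartForConditionalGeneration` class and the plain tokenizer call are later API (this PR used the Bart classes plus a translation-batch helper), and the claim that the checkpoint config starts decoding with the ro_RO language code is an assumption:

```python
from transformers import MBartForConditionalGeneration, MBartTokenizer

model_name = "facebook/mbart-large-en-ro"
tokenizer = MBartTokenizer.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)

batch = tokenizer("UN Chief Says There Is No Military Solution in Syria", return_tensors="pt")
# assumes the fine-tuned en-ro checkpoint is configured to start decoding with ro_RO
translated = model.generate(**batch)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))
```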
Patrick von Platen
fa9af2468a
Add T5 to docs (#3461)
* add t5 docs basis

* improve docs

* add t5 docs

* improve t5 docstring

* add t5 tokenizer docstring

* finish docstring

* make style

* add pretrained models

* correct typo

* make examples work

* finalize docs
2020-03-27 10:57:16 -04:00
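
A sketch of the text-to-text usage these docs describe, with the task expressed as a prefix in the input string:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 frames every task as text-to-text; the prefix selects the task
input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```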
Sam Shleifer
b54ef78d0c
Bart-CNN (#3059)
`generate` code that produces 99% identical summarizations to fairseq on CNN test data, with caching.
2020-03-02 10:35:53 -05:00
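
A sketch of that summarization path, assuming the `facebook/bart-large-cnn` checkpoint and generation parameters close to the fairseq defaults:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

article = "..."  # a long news article
inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")
# beam search with caching; the beam/length values here are illustrative
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=142, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```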
Sam Shleifer
53ce3854a1
New BartModel (#2745)
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
VictorSanh
ee5a6856ca distilbert-base-cased weights + Readmes + omissions 2020-02-07 15:28:13 -05:00
Lysandre
93dccf527b Pretrained models 2020-01-30 10:04:18 -05:00
Wietse de Vries
f5a236c3ca Add Dutch pre-trained BERT model 2020-01-27 21:00:34 -05:00
alberduris
81d6841b4b GPU text generation: moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to the correct device 2020-01-06 15:11:12 +01:00
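
The fix these two commits make, sketched: in the generation example the encoded prompt stayed on CPU while the model sat on GPU, so the tensor has to be moved explicitly. The surrounding code here is illustrative, not the script's exact contents:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)

encoded_prompt = tokenizer.encode("Once upon a time", return_tensors="pt")
encoded_prompt = encoded_prompt.to(device)  # the fix: inputs must live on the model's device
output = model.generate(encoded_prompt, max_length=50)
```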
Julien Chaumond
ac1b449cc9 [doc] move distilroberta to more appropriate place
cc @lysandrejik
2019-12-21 00:09:01 -05:00
Stefan Schweter
dd7a958fd6 docs: add XLM-RoBERTa to pretrained model list (incl. all parameters) 2019-12-18 19:45:46 +01:00
Antti Virtanen
abc43ffbff Add pretrained model documentation for FinBERT. 2019-12-17 20:35:25 -05:00
thomwolf
5c00e344c1 update model doc - switch 3B/11B to 3b/11b 2019-12-13 16:33:29 +01:00
Thomas Wolf
110394b2ba
Merge branch 'master' into t5 2019-12-13 16:03:32 +01:00
Julien Chaumond
1748fdf657 [doc] Fix rst table 2019-12-11 18:32:27 -05:00
Masatoshi Suzuki
c03c0dfd23 Add support for Japanese BERT models by cl-tohoku 2019-12-11 18:32:27 -05:00
Stefan Schweter
030faccb8d doc: fix pretrained models table 2019-12-11 12:19:21 -05:00
thomwolf
0558c9cb9b Merge branch 'master' into t5 2019-12-10 12:58:48 +01:00
Pierric Cistac
5c877fe94a
fix albert links 2019-12-09 18:53:00 -05:00
VictorSanh
552c44a9b1 release distilm-bert 2019-12-05 10:14:58 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german 2019-11-27 18:11:49 +01:00
Lysandre
ce02550d50 Fix pretrained models table 2019-11-26 15:47:02 -05:00
Lysandre
cf26a0c85e Fix pretrained models table 2019-11-26 15:40:03 -05:00
Lysandre
668aac45d2 Pretrained models 2019-11-26 14:52:42 -05:00
Stefan Schweter
e631383d4f docs: add new German distilbert model to pretrained models 2019-11-19 19:52:40 +01:00
Louis MARTIN
035fea5315 Add CamemBERT to auto files and docs 2019-11-16 00:11:07 -05:00
thomwolf
f03c0c1423 adding models in readme and auto classes 2019-11-08 11:49:46 +01:00
Julien Chaumond
1c542df7e5 Add RoBERTa-based GPT-2 Output Detector from OpenAI
converted from https://github.com/openai/gpt-2-output-dataset/tree/master/detector

Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-Authored-By: Jong Wook Kim <jongwook@nyu.edu>
Co-Authored-By: Jeff Wu <wuthefwasthat@gmail.com>
2019-11-06 16:26:31 -05:00
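
A hedged sketch of scoring text with the converted detector, assuming the `roberta-base-openai-detector` model id; the label order of the classifier head is an assumption here, not verified against the checkpoint:

```python
import torch
from transformers import RobertaForSequenceClassification, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base-openai-detector")
model = RobertaForSequenceClassification.from_pretrained("roberta-base-openai-detector")

inputs = tokenizer("Some text whose provenance we want to score.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
fake_prob, real_prob = torch.softmax(logits, dim=-1)[0].tolist()  # label order assumed
```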
Lysandre
d7d36181fd GPT-2 XL 2019-11-05 13:31:58 -05:00
VictorSanh
8ad5c591cd [RELEASE] DistilRoBERTa 2019-10-23 10:29:47 -04:00
Stefan Schweter
5f25a5f367 model: add support for new German BERT models (cased and uncased) from @dbmdz 2019-10-11 10:20:33 +02:00
thomwolf
d9e60f4f0d Merge branch 'master' into pr/1383 2019-10-09 17:25:08 +02:00
thomwolf
48b438ff2a doc and conversion 2019-10-09 17:06:30 +02:00
VictorSanh
c1689ac301 fix name 2019-10-03 10:56:39 -04:00
VictorSanh
4a790c40b1 update doc for distil* 2019-10-03 10:54:02 -04:00
LysandreJik
ebb32261b1 fix #1401 2019-10-02 17:52:56 -04:00
LysandreJik
cf5c5c9e1c Documentation 2019-09-26 07:43:13 -04:00
thomwolf
31c23bd5ee [BIG] pytorch-transformers => transformers 2019-09-26 10:15:53 +02:00
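
For downstream code the rename is purely a package-name change; a migration sketch:

```python
# before the rename:
# from pytorch_transformers import BertModel, BertTokenizer

# after the rename only the package name changes; class names are untouched
from transformers import BertModel, BertTokenizer
```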
thomwolf
c88f05163d fix typo in XLM models 2019-09-16 13:42:20 +02:00
LysandreJik
9ce42dc540 Pretrained models table fix 2019-08-28 13:56:28 -04:00
LysandreJik
75bc2a03cc Updated article link 2019-08-28 10:05:15 -04:00