transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-07 23:00:08 +06:00

Author	SHA1	Message	Date
NielsRogge	29b0aef871	Improve detr (#12147 ) * Remove unused variables * Improve docs * Fix docs of segmentation masks Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-06-17 10:37:54 -04:00
Bhadresh Savani	700cee3446	[Docs] fixed broken link (#12205 ) * fixed broken link * Update docs/source/benchmarks.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/benchmarks.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-16 15:14:53 -04:00
Philipp Schmid	afa414d060	updated DLC images and sample notebooks (#12191 )	2021-06-16 07:24:00 -04:00
Patrick von Platen	ccca510276	Hubert (#11889 ) * fix_torch_device_generate_test * remove @ * add hubert * add first test file * more docs * fix bugs * fix bug * finish * finish * finish docstring * fix * fix * finalize * add to ignored * finish * Apply suggestions from code review * correct naming * finish * fix auto config * finish * correct convert script * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com> * apply suggestions lysandre & suraj Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-16 12:14:12 +01:00
Patrick von Platen	c3c39f7e84	[Flax] Add Beam Search (#12131 ) * fix_torch_device_generate_test * remove @ * push new logit processors * add processors * save first working version * save intermediate * finish * make style * make fix-copies * finish * Update tests/test_modeling_flax_bart.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-16 09:43:54 +01:00
Kilian Kluge	a79585bbf9	Update AutoModel classes in summarization example (#12178 ) - Convert use of deprecated AutoModelWithLMHead to AutoModelForSeq2SeqLM - Add newly required `truncation=True` to `tokenizer.encode` with `max_length` This silences all warnings.	2021-06-15 10:36:10 -04:00
Sylvain Gugger	a8694b8850	Adjust banner width	2021-06-15 09:37:15 -04:00
Sylvain Gugger	60b1d6b45b	Add course banner (#12157 ) * Add course banner * Update course banner	2021-06-15 09:25:49 -04:00
Sylvain Gugger	a55dc157e3	Add video links to the documentation (#12162 )	2021-06-15 06:37:37 -04:00
Stas Bekman	040283170c	consistent nn. and nn.functional: part 5 docs (#12161 )	2021-06-14 13:34:32 -07:00
Vasudev Gupta	d9c0d08f9a	Flax Big Bird (#11967 ) * add flax bert * bert -> bigbird * original_full ported * add debugger * init block sparse * fix copies ; gelu_fast -> gelu_new * block sparse port * fix block sparse * block sparse working * all ckpts working * fix-copies * make quality * init tests * temporary fix for FlaxBigBirdForMultipleChoice * skip test_attention_outputs * fix * gelu_fast -> gelu_new ; fix multiple choice model * remove nsp * fix sequence classifier * fix * make quality * make fix-copies * finish * Delete debugger.ipynb * Update src/transformers/models/big_bird/modeling_flax_big_bird.py * make style * finish * bye bye jit flax tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 20:01:03 +01:00
Will Rice	d438eee030	Adding TFWav2Vec2Model (#11617 ) * [WIP] Add TFWav2Vec2Model Work in progress for adding a tensorflow version of Wav2Vec2 * feedback changes * small fix * Test Feedback Round 1 * Add SpecAugment and CTC Loss * correct spec augment mask creation * docstring and correct copyright * correct bugs * remove bogus file * finish tests correction * del unnecessary layers * Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style * correct final bug * Feedback Changes Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-14 18:58:54 +01:00
Daniel Stancl	4a51b1dd9b	FlaxBart (#11537 ) * Start working on FlaxBart * Create modeling_flax_bart.py * Write FlaxBartAttention * Add FlaxBartEncoderLayer * Add FlaxBartDecoderLayer and some typing * Add helepr function for FlaxBart * shift_tokens_right * _make_causal_mask * _expand_mask * Add PositionalEmbedding and fix init_std naming * Add FlaxBartPretrainedModel * Add FlaxBartEncoder * Add FlaxBartEncoder * Add FlaxBartEncoder among modules to be imported * YET WE CANNOT INITIALIZE THAT!! :( * Make BartEncoder working Change BartEncoder to instance of nn.Module so far * Add FlaxBartDecoder * Add FlaxBartModel * TODO to make model run -> Prepapre model inputs * Resolve padding * Add FlaxBartModel * Add FlaxBartModel into importable modules * Remove FlaxBartEncoder and FlaxBartDecoder from importable modules * make style; not properly working * make style; make quality not pass due to some import I left * Remove TODO for padding_idx in nn.Embed so far * Add FlaxBartForConditionalGeneration * Incorporate Flax model output classes, i.e. return_dict * Add another models and incorporate use_cache arg * Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering * Incorporate use_cache arg from PyTorch implementation * Add all necessary Flax output utils * Add FlaxBartForCausalLM; not working yet' * Add minor improvements; still lacks some functionality * Update docs, src and tests * Add support of FlaxBart to docs/source * Fix some bugs in FlaxBart souce code * Add some neccessary tests for FlaxBart models - jit_compilation not passing * Fix tests and add test_head_masking * Fix tests for @jax.jit computation * Add test_head_masking * Migrate FlaxBart tests from jax.numpy to numpy * Remove FlaxBartForCausalLM * Clean repo * fix bart model weight structure * Fix FlaxBartForSequenceClassification Slicing is not possible to use below jit, therefore, selecting sentence representation from hidden_states must be changed. 
* Allow FlaxBartForSequenceClassification for testing pt_flax equivalence * Allow testing for FlaxBartForQA for pt_flax equivalence * Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6 * remove past_key_values * remove inputs_mebeds and make input_ids required * add position ids * re-write attention layer * fix dataclass * fix pos embeds and attention output * fix pos embeds * expose encode method * expose decode method * move docstring to top * add cache for causal attn layer * remove head masking for now * s2s greedy search first pass * boom boom * fix typos * fix greedy generate for bart * use encoder, decoder layers instead of num_hidden_layers * handle encoder_outputs * cleanup * simplify decoding * more clean-up * typos * Change header + add {decoder_,}position_ids into 2 models * add BartConfig * fix existing tests * add encode, decode methods * Fix shift_tokens_right for JIT compilation + clarify one condition * fix decode * encoder => encode * simplify generate * add tests for encode and decode * style * add tests for cache * fix equivalence tests * sample generate now works with seq2seq * generation tests * initialize dense layers * docstring and cleanup * quality * remove get/set input_embeddings * address Patricks suggestions * decode for every model, remove encoder_outputs from call * update tests accordingly * decode returns only decoder outputs and logits * fix arguments * doc encode, decode methods * correct base_model_prefix * fix test for seq classif model * fix docs Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-14 15:16:08 +05:30
Jayendra	9a9314f6d9	Flax VisionTransformer (#11951 ) * adding vit for flax * added test for Flax-vit and some bug-fixes * overrided methods where variable changes were necessary for flax_vit test * added FlaxViTForImageClassification for test * Update src/transformers/models/vit/modeling_flax_vit.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * made changes suggested in PR * Adding jax-vit models for autoimport * swapping num_channels and height,width dimension * fixing the docstring for torch-like inputs for VIT * add model to main init * add docs * doc, fix-copies * docstrings * small test fixes * fix docs * fix docstr * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * style Co-authored-by: jayendra <jayendra@infocusp.in> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-10 21:17:13 +05:30
Anton Lozhkov	d472bd7b18	Wav2Vec2 Pretraining (#11306 ) * Working quantizer forward * Working quantizer forward * Clean up unused model parts, test reproducibility * Working quantizer forward * Clean up unused model parts, test reproducibility * Remove custom outputs from the shared ones * correct conversion * correct bug * add first pretrain script * save intermediate * static shapes * save intermediate * finish first pretrain script version * more refactor * remove wanddb * refactor more * improve test * correct perplexity compute bug * finish model implementation * add to docs * finish docs * finish pretraining script * finish pretraining script * remove wandb * finish PR for merge * finish config * finish * make deepspeed work * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions * fix flaky test Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-09 18:40:56 +01:00
NielsRogge	d3eacbb829	Add DETR (#11653 ) * Squash all commits of modeling_detr_v7 branch into one * Improve docs * Fix tests * Style * Improve docs some more and fix most tests * Fix slow tests of ViT, DeiT and DETR * Improve replacement of batch norm * Restructure timm backbone forward * Make DetrForSegmentation support any timm backbone * Fix name of output * Address most comments by @LysandreJik * Give better names for variables * Conditional imports + timm in setup.py * Address additional comments by @sgugger * Make style, add require_timm and require_vision to tests * Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone * Add png files to fixtures * Fix type hint * Add timm to workflows * Add `BatchNorm2d` to the weight initialization * Fix retain_grad test * Replace model checkpoints by Facebook namespace * Fix name of checkpoint in test * Add user-friendly message when scipy is not available * Address most comments by @patrickvonplaten * Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner * Better initialization * Scipy is necessary to get sklearn metrics * Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel * Make style * Improve docs and add 2 community notebooks Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-06-09 11:51:13 -04:00
Stas Bekman	0e82f0cbc2	typo	2021-06-08 12:55:17 -07:00
Stas Bekman	32290d87f6	[Deepspeed] various fixes (#12058 ) * replace deprecated config * sub_group_size was too big * complete deprecation removal	2021-06-08 08:36:15 -07:00
Stas Bekman	2c73b93099	[Deepspeed] Assert on mismatches between ds and hf args (#12021 ) * wip * add mismatch validation + test * renames * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * renames Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-04 08:58:23 -07:00
Stas Bekman	640318befa	[deepspeed] Move code and doc into standalone files (#11984 ) * move code and docs * style * moved * restore	2021-06-02 09:56:00 -07:00
Stas Bekman	d406a2729a	[docs] fix xref to `PreTrainedModel.generate` (#11049 ) * fix xref to generate * do the same for search methods * style * style	2021-06-02 09:21:05 -07:00
Gunjan Chhablani	88ca6a231d	VisualBERT (#10534 ) * Init VisualBERT * Add cookie-cutter, Config, and Embeddings * Add preliminary Model * Add Bert analogous classes * Add basic code for NLVR, VQA, Flickr * Update Init * Fix VisualBert Downstream Models * Rename classifier to cls * Comment position_ids buffer * Remove sentence image predictor output * Update output dicts * Remove unnecessary files * Fix Auto Modeling * Fix transformers init * Add conversion script * Add conversion script * Fix docs * Update visualbert modelling * Update configuration * Style fixes * Add model and integration tests * Add all tests * Update model mapping * Add simple detector from original repository * Update docs and configs * Fix style * Fix style * Update docs * Fix style * Fix import issues in style * Fix style * Add changes from review * Fix style * Fix style * Update docs * Fix style * Fix style * Update docs/source/model_doc/visual_bert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add changes from review * Remove convert run script * Add changes from review * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update 
src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/visual_bert/modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add changes from review * Add changes from review * Add visual embedding example in docs * Fix "copied from" comments * Add changes from review * Fix error, style, checkpoints * Update docs * Fix integration tests * Fix style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-02 18:13:08 +05:30
Stas Bekman	7ec596ecda	[DeepSpeed] decouple `DeepSpeedConfigHF` from `Trainer` (#11966 ) * decouple DeepSpeedConfigHF from Trainer * add LoggingLevel ctx manager; add new test * cleanup * add docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * implemented suggested renames * formatter workaround Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-01 13:24:52 -07:00
Alberto Villa	1c3ab3e5d6	Typo in usage example, changed to device instead of torch_device (#11979 )	2021-06-01 14:58:49 -04:00
Patrick von Platen	47a98fc4cb	ByT5 model (#11971 ) * allow tf to use uneven num of layers * add tokenizer * finish docs * finish docs * Apply suggestions from code review * include in index * finish * Update docs/source/model_doc/byt5.rst Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * apply sylvais suggestions * make style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2021-06-01 19:07:37 +01:00
Stas Bekman	79712e7e7a	[deepspeed] docs (#11940 ) * deepspeed docs * cleanup * cleanup	2021-06-01 09:21:21 -07:00
Suraj Patil	ad25fd62bd	Add FlaxCLIP (#11883 ) * add flax CLIP * default input_shape * add tests * fix test * fix name * fix docs * fix shapes * attend at least 1 token * flax conv to torch conv * return floats * fix equivalence tests * fix import * return attention_weights and update tests * fix dosctrings * address patricks comments * input_shape arg * add tests for get_image_features and get_text_features methods * fix tests	2021-06-01 09:44:31 +05:30
Bhadresh Savani	e1205e478a	Added Sequence Classification class in GPTNeo (#11906 ) * seq classification changes * fix tests	2021-05-28 06:27:02 -04:00
Patrick von Platen	42fe0dc23e	Add Emotion Speech Notebook (#11900 )	2021-05-27 10:46:10 +01:00
Patrick von Platen	996a315e76	Flax Generate (#11777 ) * fix_torch_device_generate_test * remove @ * add * indexing * correct a couple of tests * fix tests * add logits processor * finish top_k, top_p, temp * add docs * correct flax prng key default * improve generate * add generation docs * add docs * make style * revert model outputs change * make style * correct typo * fix tests * fix slow test * add raise * finish generation Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-27 00:18:17 +01:00
Nick Lane-Smith	eaab9397cd	Fix two typos in docs (#11852 ) * typo2 * fix typo	2021-05-24 14:26:02 -04:00
yujun	206f06f2dd	Add new model RoFormer (use rotary position embedding ) (#11684 ) * add roformer * Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * update * add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding * update docs * make style and make quality * roback * unchanged * rm copies from , this is a error in TFMarianSinusoidalPositionalEmbedding * update Copyright year * move # Add modeling imports here to the correct position * max_position_embeddings can be set to 1536 * # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer * # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer * update tokenization_roformer * make style * add staticmethod apply_rotary_position_embeddings * add TF staticmethod apply_rotary_position_embeddings * update torch apply_rotary_position_embeddings * fix tf apply_rotary_position_embeddings error * make style * add pytorch RoFormerSelfAttentionRotaryPositionEmbeddingTest * add TF rotary_position_embeddings test * update test_modeling_rofomer * Update docs/source/model_doc/roformer.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update 
src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/roformer/modeling_tf_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refact roformer tokenizer * add RoFormerTokenizerFast * add RoFormerTokenizationTest * add require_jieba * update Copyright * update tokenizer & add copy from * add option rotary_value * use rust jieba * use rjieba * use rust jieba * fix test_alignement_methods * slice normalized_string is too slow * add config.embedding_size when embedding_size!=hidden_size * fix pickle tokenizer * Update docs/source/model_doc/roformer.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style and make quality Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-20 08:00:34 -04:00
Suraj Patil	ca33278fdb	FlaxGPT2 (#11556 ) * flax gpt2 * combine masks * handle shared embeds * add causal LM sample * style * add tests * style * fix imports, docs, quality * don't use cache * add cache * add cache 1st version * make use cache work * start adding test for generation * finish generation loop compilation * rewrite test * finish * update * update * apply sylvains suggestions * update * refactor * fix typo Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-18 22:50:51 +01:00
Vyom Pathak	fd3b12e8c3	Fixed: Better names for nlp variables in pipelines' tests and docs. (#11752 ) * Fixed: Better names for nlp variables in pipelines' tests and docs. * Fixed: Better variable names	2021-05-18 09:47:28 -04:00
Patrick von Platen	cebb96f53a	Add more subsections to main doc (#11758 ) * add headers to main doc * Apply suggestions from code review * update * upload	2021-05-18 14:38:56 +01:00
Julien Chaumond	0fc56df5fb	Add visual + link to Premium Support webpage (#11740 ) * Update README.md * Update index.rst	2021-05-17 05:28:56 -04:00
Sylvain Gugger	cbbf49f644	Fix doc deployment	2021-05-13 10:34:14 -04:00
NielsRogge	fa84540e98	Vit deit fixes (#11309 ) * Improve docs of DeiT and ViT, add community notebook * Add gitignore for test_samples * Add notebook with Trainer Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-05-12 11:46:02 -04:00
Lysandre	d77eb0cf92	Docs for v4.7.0.dev0	2021-05-12 17:08:35 +02:00
Suraj Patil	f063c56d94	Fix clip docs (#11694 ) * fix doc url * fix example	2021-05-12 15:28:30 +05:30
Suraj Patil	8719afa1ad	CLIP (#11445 ) * begin second draft * fix import, style * add loss * fix embeds, logits_scale, and projection * fix imports * add conversion script * add feature_extractor and processor * style * add tests for tokenizer, extractor and processor * add vision model tests * add weight init * add more tests * fix save_load test * model output, dosstrings, causal mask * config doc * add clip model tests * return dict * bigin integration test * add integration tests * fix-copies * fix init * Clip => CLIP * fix module name * docs * fix doc * output_dim => projection_dim * fix checkpoint names * remoe fast tokenizer file * fix conversion script * fix tests, quality * put causal mask on device * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix attribute test * style * address sylvains comments * style * fix docstrings * add qucik_gelu in activations, docstrings * clean-up attention test * fix act fun * fix config * fix torchscript tests * even batch_size * remove comment * fix ouput tu_tuple * fix save load tests * fix add tokens test * add fast tokenizer * update copyright * new processor API * fix docs * docstrings * docs * fix doc * fix doc * fix tokenizer * fix import in doc example * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * check types of config * valhalla => openai * load image using url * fix test * typo Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-05-12 13:48:15 +05:30
Vasudev Gupta	575c979144	Update community.md (#11654 )	2021-05-10 09:48:21 +01:00
Tanmay Laud	f7f872955d	Big Bird Fast Tokenizer implementation (#11075 ) * Added Big Bird Fast Tokenizer initial file * style fixes * flake fixes * Added big bird fast tokenizer to init files * Added big bird fast to Auto tokenization * fix styles * minor quality fixes * Added initial test code * Fix SpmConverter when precompiled_charsmap doesn't exist * fixed post processor * minor style fix * minor fix input names * Actually fix identity normalization * style * Added token type ids to fast tokenizer * style * flake fix * fix copies Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>	2021-05-10 03:01:23 -04:00
Lysandre Debut	39084ca663	Add the ImageClassificationPipeline (#11598 ) * Add the ImageClassificationPipeline * Code review Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com> * Have `load_image` at the module level Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2021-05-07 08:08:40 -04:00
Vasudev Gupta	dc3f6758cf	Add BigBirdPegasus (#10991 ) * init bigbird pegasus * add debugging nb ; update config * init conversion * update conversion script * complete conversion script * init forward() * complete forward() * add tokenizer * add some slow tests * commit current * fix copies * add docs * add conversion script for bigbird-roberta-summarization * remove TODO * small fixups * correct tokenizer * add bigbird core for now * fix config * fix more * revert pegasus-tokenizer back * make style * everything working for pubmed; yayygit status * complete tests finally * remove bigbird pegasus tok * correct tokenizer * correct tests * add tokenizer files * finish make style * fix test * update * make style * fix tok utils base file * make fix-copies * clean a bit * small update * fix some suggestions * add to readme * fix a bit, clean tests * fix more tests * Update src/transformers/__init__.py * Update src/transformers/__init__.py * make fix-copies * complete attn switching, auto-padding left * make style * fix auto-padding test * make style * fix batched attention tests * put tolerance at 1e-1 for stand-alone decoder test * fix docs * fix tests * correct slow tokenizer conversion * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * complete remaining suggestions * fix test Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-05-07 09:27:43 +02:00
Stas Bekman	c065025c47	[trainer] document resume randomness (#11588 ) * document resume randomness * fix link * reword * fix * reword * style	2021-05-04 14:17:11 -07:00
Patrick Fernandes	0afe4a90f9	[Flax] Add Electra models (#11426 ) * add electra model to flax * Remove Electra Next Sentence Prediction model added by mistake * fix parameter sharing and loosen equality threshold * fix styling issues * add mistaken removen imports * fix electra table * Add FlaxElectra to automodels and fixe docs * fix issues pointed out the PR * fix flax electra to comply with latest changes * remove stale class * add copied from Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-04 20:56:09 +02:00
Patrick von Platen	084a187da3	[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470 ) * add flax roberta * make style * correct initialiazation * modify model to save weights * fix copied from * fix copied from * correct some more code * add more roberta models * Apply suggestions from code review * merge from master * finish * finish docs Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-04 19:57:59 +02:00
Sylvain Gugger	fe82b1bfa0	Update training tutorial (#11533 ) * Update training tutorial * Apply suggestions from code review Co-authored-by: Hamel Husain <hamelsmu@github.com> * Address review comments * Update docs/source/training.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * More review comments * Last review comments Co-authored-by: Hamel Husain <hamelsmu@github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-05-03 13:18:46 -04:00
NielsRogge	f3cf8ae7b3	Add LUKE (#11223 ) * Rebase with master * Minor bug fix in docs * Copy files from adding_luke_v2 and improve docs * change the default value of use_entity_aware_attention to True * remove word_hidden_states * fix head models * fix tests * fix the conversion script * add integration tests for the pretrained large model * improve docstring * Improve docs, make style * fix _init_weights for pytorch 1.8 * improve docs * fix tokenizer to construct entity sequence with [MASK] entity when entities=None * Make fix-copies * Make style & quality * Bug fixes * Add LukeTokenizer to init * Address most comments by @patil-suraj and @LysandreJik * rename _compute_extended_attention_mask to get_extended_attention_mask * add comments to LukeSelfAttention * fix the documentation of the tokenizer * address comments by @patil-suraj, @LysandreJik, and @sgugger * improve docs * Make style, quality and fix-copies * Improve docs * fix docs * add "entity_span_classification" task * update example code for LukeForEntitySpanClassification * improve docs * improve docs * improve the code example in luke.rst * rename the classification layer in LukeForEntityClassification from typing to classifier * add bias to the classifier in LukeForEntitySpanClassification * update docs to use fine-tuned hub models in code examples of the head models * update the example sentences * Make style & quality * Add require_torch to tokenizer tests * Add require_torch to tokenizer tests * Address comments by @sgugger and add community notebooks * Make fix-copies Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>	2021-05-03 09:07:29 -04:00
Stas Bekman	4e7bf94e72	[DeepSpeed] fp32 support (#11499 ) * prep for deepspeed==0.3.16 * new version * too soon * support and test fp32 mode * troubleshooting doc start * workaround no longer needed * add fp32 doc * style * cleanup, add tf32 note * clarify * release was made	2021-04-30 12:51:48 -07:00
Stas Bekman	282f3ac3ef	[debug utils] activation/weights underflow/overflow detector (#11274 ) * sync * add activation overflow debug utility * cleanup * document detect_overflow * import torch * add deprecation warning * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * convert to rst, add note * add class * fix docs * improve the doc * rework to dump a lot more info about each frame * complete expansion * cleanup * format * cleanup * doesn't have to be transformers * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * wrap long line * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-30 11:15:46 -07:00
Hamel Husain	804c2974d5	Improve task summary docs (#11513 ) * fix task summary docs * refactor to use model.config.id2label instead of list * fix nit * Update docs/source/task_summary.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-30 09:06:47 -04:00
Shubham Sanghavi	30ede8994e	Implement Fast Tokenization for Deberta (#11387 )	2021-04-30 08:08:15 -04:00
Nicolas Patry	db9dd09cf9	Adding `AutomaticSpeechRecognitionPipeline`. (#11337 ) * Adding `AutomaticSpeechRecognitionPipeline`. - Because we added everything to enable this pipeline, we probably should add it to `transformers`. - This PR tries to limit the scope and focuses only on the pipeline part (what should go in, and out). - The tests are very specific for S2T and Wav2vec2 to make sure both architectures are supported by the pipeline. We don't use the mixin for tests right now, because that requires more work in the `pipeline` function (will be done in a follow up PR). - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of sense from a user perspective, it does not add any additional dependencies (as in hard dependency, because users can always use their own load mechanism). Meanwhile, it feels slightly clunky to have so much optional preprocessing. - The pipeline is not done to support streaming audio right now. Future work: - Add `automatic-speech-recognition` as a `task`. And add the FeatureExtractor.from_pretrained within `pipeline` function. - Add small models within tests - Add the Mixin to tests. - Make the logic between ForCTC vs ForConditionalGeneration better. * Update tests/test_pipelines_automatic_speech_recognition.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding docs + main import + type checking + LICENSE. * Doc style !. * Fixing TYPE_HINT. * Specifying waveform shape in the docs. * Adding asserts + specify in the documentation the shape of the input np.ndarray. * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Adding require to tests + move the `feature_extractor` doc. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-30 11:54:08 +02:00
Hamel Husain	3f6add8bab	fix #1149 (#11493 )	2021-04-28 11:16:41 -04:00
Sylvain Gugger	2d27900b5d	Update min versions in README and add Flax (#11472 ) * Update min versions in README and add Flax * Adapt index	2021-04-28 09:10:06 -04:00
Hamel Husain	7ceff67e1a	Finish Making Quick Tour respect the model object (#11467 ) * finish quicktour * fix import * fix print * explain config default better * Update docs/source/quicktour.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-27 10:04:12 -04:00
Hamel Husain	88ac60f7b5	update QuickTour docs to reflect model output object (#11462 ) * update docs to reflect model output object * run make style	2021-04-26 22:18:37 -04:00
Stas Bekman	bc2571e61c	[Deepspeed] ZeRO-Infinity integration plus config revamp (#11418 ) * adding Z-inf * revamp config process * up version requirement * wip * massive rewrite * cleanup * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * consistent json commas * act on suggestions * leave this feature for 0.3.16 * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-26 10:40:32 -07:00
Stas Bekman	a753cafdc0	[docs] fix invalid class name (#11438 ) * fix invalid class name * proper ref * proper ref	2021-04-26 08:37:32 -07:00
Sylvain Gugger	bf2e0cf70b	Trainer push to hub (#11328 ) * Initial support for upload to hub * push -> upload * Fixes + examples * Fix torchhub test * Torchhub test I hate you * push_model_to_hub -> push_to_hub * Apply mixin to other pretrained models * Remove ABC inheritance * Add tests * Typo * Run tests * Install git-lfs * Change approach * Add push_to_hub to all * Staging test suite * Typo * Maybe like this? * More deps * Cache * Adapt name * Quality * MOAR tests * Put it in testing_utils * Docs + torchhub last hope * Styling * Wrong method * Typos * Update src/transformers/file_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address review comments * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-23 09:17:37 -04:00
Stas Bekman	9f72e8f4e1	[testing doc] bring doc up to date (#11359 ) * bring doc up to date * fix	2021-04-21 08:51:00 -07:00
Sylvain Gugger	dabeb15292	Examples reorg (#11350 ) * Base move * Examples reorganization * Update references * Put back test data * Move conftest * More fixes * Move test data to test fixtures * Update path * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments and clean Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-21 11:11:20 -04:00
Sylvain Gugger	74712e22f3	Honor contributors to models (#11329 ) * Honor contributors to models * Fix typo * Address review comments * Add more authors	2021-04-21 09:47:27 -04:00
Stas Bekman	63ca402380	[troubleshooting] add 2 points of reference to the offline mode (#11236 ) * add 2 points of reference to the offline mode * link the new doc * add error message * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * rename * Trigger CI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-14 08:39:23 -07:00
Yusuke Mori	075e821d1d	Add prefix to examples in model_doc rst (#11226 ) * Add prefix to examples in model_doc rst * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-14 10:58:55 -04:00
Sylvain Gugger	f38cd4373f	Indent code block in the documentation (#11233 ) * Indent code block * Indent code blocks version 2 * Quality	2021-04-13 15:36:36 -04:00
Sylvain Gugger	3312e96bfb	Doc check: a bit of clean up (#11224 )	2021-04-13 12:14:25 -04:00
Sylvain Gugger	893e51a53f	Document v4.5.1	2021-04-13 11:28:17 -04:00
Yusuke Mori	22fa0a6004	Add documentation for BertJapanese (#11219 ) * Start writing BERT-Japanese doc * Fix typo, Update toctree * Modify model file to use comment for document, Add examples * Clean bert_japanese by make style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Split a big code block into two * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add prefix >>> to all lines in code blocks * Clean bert_japanese by make fixup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-13 09:49:15 -04:00
NielsRogge	9f1260971f	Add DeiT (PyTorch) (#11056 ) * First draft of deit * More improvements * Remove DeiTTokenizerFast from init * Conversion script works * Add DeiT to ViT conversion script * Add tests, add head model, add support for deit in vit conversion script * Update model checkpoint names * Update image_mean and image_std, set resample to bicubic * Improve docs * Docs improvements * Add DeiTForImageClassificationWithTeacher to init * Address comments by @sgugger * Improve feature extractors * Make fix-copies * Minor fixes * Address comments by @patil-suraj * All models uploaded * Fix tests * Remove labels argument from DeiTForImageClassificationWithTeacher * Fix-copies, style and quality * Fix tests * Fix typo * Multiple docs improvements * More docs fixes	2021-04-12 18:07:10 -04:00
fghuman	0c6fcd3034	Added documentation for data collator. (#10941 ) * Added documentation for data collator. * Update docs/source/data_collator.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Added documentation for data collator. * Added documentation for the data collator. * Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts. * Update documentation for the data collator. * Update documentation for the data collator. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>	2021-04-12 11:59:46 -04:00
Kevin Canwen Xu	fb41f9f50c	Add a special tokenizer for CPM model (#11068 ) * Add a special tokenizer for CPM model * make style * fix * Add docs * styles * cpm doc * fix ci * fix the overview * add test * make style * typo * Custom tokenizer flag * Add README.md Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-04-10 02:07:47 +08:00
Sylvain Gugger	45fc8c7951	Make `get_special_tokens_mask` consider all tokens (#11163 )	2021-04-09 11:57:44 -04:00
Niklas Muennighoff	8b78a32be1	[Community notebooks] Add Wav2Vec notebook for creating captions for YT Clips (#11142 ) * Add Wav2Vec Inference notebook * Update docs/source/community.md Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-04-09 12:10:37 +05:30
Stas Bekman	0311ba2153	typo (#11152 ) * typo * style	2021-04-08 19:47:31 -07:00
Stas Bekman	c2e0fd5283	[setup] make fairscale and deepspeed setup extras (#11151 ) * make fairscale and deepspeed setup extras * fix default * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * no reason not to ask for the good version * update the CIs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-08 15:46:54 -07:00
Stas Bekman	66446909b2	[tests] relocate core integration tests (#11146 ) * relocate core integration tests * add sys.path context manager * cleanup * try * try2 * fix path * doc * style * add dep * add 2 more deps	2021-04-08 13:13:17 -07:00
Julien Demouth	02ec02d6d3	Add nvidia megatron models (#10911 ) * Add support for NVIDIA Megatron models * Add support for NVIDIA Megatron GPT2 and BERT Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This commit includes a script to convert a Megatron-GPT2 checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details. Add the megatron_bert model. That model is implemented as a modification of the existing BERT model in Transformers. This commit includes a script to convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details. * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Remove model.half in tests + add "# Copied ..." Remove the model.half() instruction which makes tests fail on the CPU. Add a comment "# Copied ..." before many classes in the model to enable automatic tracking in CI between the new Megatron classes and the original Bert ones. * Fix issues * Fix Flax/TF tests * Fix copyright * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/configuration_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/megatron_bert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/megatron_gpt2.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/megatron_bert/modeling_megatron_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Resolve most of 'sgugger' comments * Fix conversion issue + Run make fix-copies/quality/docs * Apply suggestions from code review * Causal LM & merge * Fix init * Add CausalLM to last auto class Co-authored-by: Julien Demouth <jdemouth@nvidia.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-04-08 14:09:11 -04:00
Stas Bekman	c6d664849b	[DeepSpeed] ZeRO Stage 3 (#10753 ) * synced gpus * fix * fix * need to use t5-small for quality tests * notes * complete merge * fix a disappearing std stream problem * start zero3 tests * wip * tune params * sorting out the pre-trained model loading * reworking generate loop wip * wip * style * fix tests * split the tests * refactor tests * wip * parameterized * fix * workout the resume from non-ds checkpoint pass + test * cleanup * remove no longer needed code * split getter/setter functions * complete the docs * suggestions * gpus and their compute capabilities link * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * style * remove invalid paramgd * automatically configure zero3 params that rely on hidden size * make _get_resized_embeddings zero3-aware * add test exercising resize_token_embeddings() * add docstring Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-08 09:53:01 -07:00
Yusuke Mori	5bf5d50c8d	Typo fix of the name of BertLMHeadModel in BERT doc (#11133 )	2021-04-08 08:22:58 -04:00
Sylvain Gugger	403d530eec	Auto feature extractor (#11097 ) * AutoFeatureExtractor * Init and first tests * Tests * Damn you gitignore * Quality * Defensive test for when not all backends are here * Use pattern for Speech2Text models	2021-04-06 19:20:08 -04:00
Stas Bekman	520198f56f	[doc] gpt-neo (#11098 ) make the example work	2021-04-06 16:42:06 -04:00
Lysandre	9853c5dd58	Development on v4.6.0dev0	2021-04-06 12:53:25 -04:00
Philipp Schmid	b219d6b5a5	added social thumbnail for docs (#11083 )	2021-04-06 14:56:18 +02:00
Sylvain Gugger	6c1bee7d89	Link to new blog	2021-04-06 08:55:40 -04:00
Amala Deshmukh	e1c02e018c	Add example for registering callbacks with trainers (#10928 ) * Add example for callback registry Resolves: #9036 * Update callback registry documentation * Added comments for other ways to register callback	2021-04-05 12:27:23 -04:00
Lysandre Debut	9f4e0c23d6	Documentation about loading a fast tokenizer within Transformers (#11029 ) * Documentation about loading a fast tokenizer within Transformers * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-05 10:51:16 -04:00
Sylvain Gugger	6c25f5228e	Refactor AutoModel classes and add Flax Auto classes (#11027 ) * Refactor AutoModel classes and add Flax Auto classes * Add new objects to the init * Fix hubconf and sort models * Fix TF tests * Missing comma * Update src/transformers/models/auto/auto_factory.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Fix init * Fix dummies * Other init to fix Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-05 10:11:28 -04:00
Lysandre Debut	773e4c7263	Remove unnecessary space (#11060 )	2021-04-05 09:36:20 -04:00
Eren Şahin	6e31014110	[doc] update code-block rendering (#11053 ) double : prevents code-block section to be rendered, so made it single :	2021-04-05 09:06:07 -04:00
Philipp Schmid	34e1bec649	added new notebook and merge of trainer (#11015 ) * added new notebook and merge of trainer * Update docs/source/sagemaker.md Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-04-01 23:13:47 +02:00
Julien Chaumond	e8da77d181	[doc] no more bucket	2021-04-01 14:25:47 -04:00
Joe Davison	f4ad3d8cea	minor typo fix negative log-likelihood	2021-04-01 11:58:37 -06:00
NielsRogge	30677dc743	Add Vision Transformer and ViTFeatureExtractor (#10950 ) * Squash all commits into one * Update ViTFeatureExtractor to use image_utils instead of torchvision * Remove torchvision and add Pillow * Small docs improvement * Address most comments by @sgugger * Fix tests * Clean up conversion script * Pooler first draft * Fix quality * Improve conversion script * Make style and quality * Make fix-copies * Minor docs improvements * Should use fix-copies instead of manual handling * Revert "Should use fix-copies instead of manual handling" This reverts commit `fd4e591bce`. * Place ViT in alphabetical order Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-01 11:16:05 -04:00
Patrick von Platen	01068abdb9	add blog to docs (#10997 )	2021-03-31 18:36:00 +03:00
Patrick von Platen	b6dddda4d2	add notebook (#10995 )	2021-03-31 17:00:56 +03:00
Patrick von Platen	e87505f3a1	[Flax] Add other BERT classes (#10977 ) * add first code structures * add all bert models * add to init and docs * correct docs * make style	2021-03-31 09:45:58 +03:00
Philipp Schmid	e3c8443f08	improved sagemaker documentation for git_config and examples (#10966 ) * improved branch usage * fixed grammar and comma	2021-03-30 18:00:52 +02:00
Suraj Patil	83d38c9ff3	GPT Neo few fixes (#10968 ) * fix checkpoint names * auto model * fix doc	2021-03-30 11:15:55 -04:00
Suraj Patil	860264379f	GPT Neo (#10848 ) * lets begin * boom boom * fix out proj in attn * fix attention * fix local attention * add tokenizer * fix imports * autotokenizer * fix checkpoint name * cleanup * more clean-up * more cleanup * output attentions * fix attn mask creation * fix imports * config doc * add tests * add slow tests * quality * add conversion script * copyright * typo * another bites the dust * fix attention tests * doc * add embed init in convert function * fix copies * remove tokenizer * enable caching * address review comments * improve config and create attn layer list internally * more consistent naming * init hf config from mesh-tf config json file * remove neo tokenizer from doc * handle attention_mask in local attn layer * attn_layers => attention_layers * add tokenizer_class in config * fix docstring * raise if len of attention_layers is not same as num_layers * remove tokenizer_class from config * more consistent naming * fix doc * fix checkpoint names * fp16 compat * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-30 09:42:30 -04:00
Vasudev Gupta	6dfd027279	BigBird (#10183 ) * init bigbird * model.__init__ working, conversion script ready, config updated * add conversion script * BigBirdEmbeddings working :) * slightly update conversion script * BigBirdAttention working :) ; some bug in layer.output.dense * add debugger-notebook * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :) * BigBirdModel working in block-sparse attention mode :) * add BigBirdForPreTraining * small fix * add tokenizer for BigBirdModel * fix config & hence modeling * fix base prefix * init testing * init tokenizer test * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements * remove position_embedding_type arg * complete normal tests * add comments to block sparse attention * add attn_probs for sliding & global tokens * create fn for block sparse attn mask creation * add special tests * restore pos embed arg * minor fix * attn probs update * make big bird fully gpu friendly * fix tests * remove pruning * correct tokenizer & minor fixes * update conversion script , remove norm_type * tokenizer-inference test add * remove extra comments * add docs * save intermediate * finish trivia_qa conversion * small update to forward * correct qa and layer * better error message * BigBird QA ready * fix rebased * add trivia-qa debugger notebook * qa setup * fixed till embeddings * some issue in q/k/v_layer * fix bug in conversion-script * fixed till self-attn * qa fixed except layer norm * add qa end2end test * fix gradient ckpting ; other qa test * speed-up big bird a bit * hub_id=google * clean up * make quality * speed up einsum with bmm * finish perf improvements for big bird * remove wav2vec2 tok * fix tokenizer * include docs * correct docs * add helper to auto pad block size * make style * remove fast tokenizer for now * fix some * add pad test * finish * fix some bugs * fix another bug * fix buffer tokens * fix comment and merge from master * add comments * make style * commit some suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix typos * fix some more suggestions * add another patch Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * another path Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * update * update nit suggestions * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-30 08:51:34 +03:00
Sylvain Gugger	06a6fea782	Instantiate model only once in pipeline (#10888 ) * Instantiate model only once in pipeline * Remove documentation of deprecated method * Add FutureWarning * Update src/transformers/pipelines/base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-29 10:39:14 -04:00
Sylvain Gugger	b0595d33c1	Add ImageFeatureExtractionMixin (#10905 ) * Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test	2021-03-26 11:23:56 -04:00
Tomy Hsieh	4b2b50aa7b	Rename NLP library to Datasets library (#10920 ) * Rename NLP library to Datasets library * Update github template * Fix styling	2021-03-26 08:07:59 -04:00
Amir Tahmasbi	4684bfc757	Layout lm tf 2 (#10636 ) * Added embeddings layer * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Added model to doc README * Added tests * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Fixed a typo in embeddings layer * Removed imports * Fixed formatting issues, imports, tests * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Removed imports * Fixed small formatting issues * Removed duplicates import from main __init__.py * Changed default arg to true for adding pooling layer to tf layoutlm * Fixed formatting issues * Style * Added copied from to classes copied from bert * Fixed doc strings examples to work with layoutlm inputs * Removed PyTorch reference in doc strings example * Added integration tests * Cleaned up initialization file * Updated model checkpoint identifiers * Fixed imports Co-authored-by: Amir Tahmasbi <amir@ehsai.ca> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-03-25 12:32:38 -04:00
Philipp Schmid	1a3e0c4fe6	make local setup clearer and added missing links (#10899 )	2021-03-25 09:01:31 -04:00
Eliza Szczechla	1f5ea9e04a	Add notebook on fine-tuning Bart (#10883 ) Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>	2021-03-24 11:03:37 -04:00
Philipp Schmid	77ffd5edd5	Amazon SageMaker Documentation (#10867 ) * added finished documentation * changed version from 1.6 to 1.6.0 for distributed * updated versions * updated urls	2021-03-23 10:56:44 -04:00
Patrick von Platen	77bf3fe787	[Generate] Add save mode logits processor to remove nans and infs if necessary (#10769 ) * push * finish * finish * make fix copies * change name	2021-03-23 01:00:05 +03:00
Eric Lam	be87b84276	Add new community notebook - wav2vec2 with GPT (#10794 ) * Add new community notebook - wav2vec2 with GPT * Update:community.md, new nb add * feat: notebook of wav2vec xlsr ctc decoding with gpt logit adjustment * Update: Wav2vec2 CTC decoding with gpt2 adjustment * Update docs/source/community.md Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-03-21 13:29:53 +05:30
Sylvain Gugger	dcebe254fa	Document v4.4.2	2021-03-18 15:19:25 -04:00
Stas Bekman	8715d20c97	[doc] [testing] extend the pytest -k section with more examples (#10761 ) * [doc] [testing] extend -k section This PR adds more examples on using `pytest -k` - I always forget that I want to use `-k A OR B` when I want several tests - I keep trying AND and it doesn't match any. * style	2021-03-17 09:23:38 -04:00
Cheng Li	c83fbc5f2d	[Deepspeed] Allow HF optimizer and scheduler to be passed to deepspeed (#10464 ) * pass hf optimizer and scheduler to deepspeed if not specified in ds config * pass hf optimizer and scheduler to deepspeed if not specified in ds config * update * make init_deepspeed support config dict * fix docstring formatting * clean up trainer's comments * add new tests * fix type * composite argparse doesn't work * style * add a new test, rename others * document new functionality * complete tests, add docs * style * correct level * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add new methods to the doc * must tell DS we are using a non-native optimizer * add protection against cpu_offload + HF optimizer combo * fix the cli overrides * sync docs + tests * restore AdamW * better docs * need new version * no longer needed * remove outdated information * refactor duplicated code Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-16 15:51:09 -07:00
Lysandre	73fe40898d	Docs for v4.4.1	2021-03-16 15:41:49 -04:00
Lysandre	1b5ce1e63b	Development on v4.5.0dev0	2021-03-16 11:41:15 -04:00
Lysandre	c988db5af2	Release v4.4.0	2021-03-16 11:33:35 -04:00
Suraj Patil	d3d388b934	fix M2M100 example (#10745 )	2021-03-16 20:20:00 +05:30
Lysandre Debut	5dcc08f1df	Fix S2T example (#10741 )	2021-03-16 08:55:07 -04:00
Théo Matussière	6f840990a7	split seq2seq script into summarization & translation (#10611 ) * split seq2seq script, update docs * needless diff * fix readme * remove test diff * s/summarization/translation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * cr * fix arguments & better mbart/t5 refs * copyright Co-authored-by: Suraj Patil <surajp815@gmail.com> * reword readme Co-authored-by: Suraj Patil <surajp815@gmail.com> * s/summarization/translation * short script names * fix tests * fix isort, include mbart doc * delete old script, update tests * automate source prefix * automate source prefix for translation * s/translation/trans Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * fix script name (short version) * typos Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * exact parameter Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * remove superfluous source_prefix calls in docs * rename scripts & warn for source prefix * black * flake8 Co-authored-by: theo <theo@matussie.re> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-03-15 09:11:42 -04:00
Stas Bekman	4c32f9f26e	AdamW is now supported by default (#9624 )	2021-03-12 13:40:07 -08:00
Sylvain Gugger	e8246f78f9	Add auto_wrap option in fairscale integration (#10673 ) * Add auto_wrap option in fairscale integration * Style	2021-03-12 07:50:20 -05:00
Nicolas Patry	543d0549f8	Adding new parameter to `generate`: `max_time`. (#9846 ) * [WIP] Adding new parameter to `generate`: `max_time`. Generation by tokens number is sometimes a bit clunky because we don't know how many tokens are good enough or even how many tokens are in the payload (for pipelines users for instance). This leads to hard to understand behavior. This PR proposes a new argument `max_time` which is a float of seconds for the allowed time for `generate` to run on. Ideally combinations of `max_tokens=None`, `max_time=2` could be used to generate as many tokens as possible within time budget. NB: Another possible approach consists of passing a callback to `generate` putting the caller in charge of the actual decision of when to stop generating tokens. It opens the door to 'which args should we pass' to this callback. It's hard to imagine other use-cases for this early stopping behavior than time (that are not already covered by parameters of generate) * Revamp with StoppingCriteria * Removing deprecated mentions. * Forgot arguments to stopping criteria. * Readding max_length it's not just used as a stopping criteria. * Default value for `stopping_criteria`. * Address @patrickvonplaten comments. - More docstrings - Actual doc - Include in global namespace - Remove TF work. * Put back `max_length` (deprecation different PR). * Doc quality. * Fixing old behavior without `stopping_criteria` but with `max_length`. Making sure we don't break that in the future. * Adding more tests for possible inconsistencies between `max_length` and `stopping_criteria`. * Fixing the torch imports.	2021-03-12 10:11:50 +01:00
WybeKoper	2f8485199c	Fix broken link (#10656 ) * Fixed broken link * fixed max length violation Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>	2021-03-11 14:29:02 -05:00
Suraj Patil	055ed78f52	[S2T] fix example in docs (#10667 )	2021-03-11 22:43:37 +05:30
Patrick von Platen	602d63f05c	[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648 ) * add conversion script * add wav2vec2 xlsr models * finish * Update docs/source/model_doc/xlsr_wav2vec2.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-11 17:44:18 +03:00
Sylvain Gugger	26a33cfd8c	Document Trainer limitation on custom models (#10635 )	2021-03-10 14:58:22 -05:00
Suraj Patil	d26b37e744	Speech2TextTransformer (#10175 ) * s2t * fix config * conversion script * fix import * add tokenizer * fix tok init * fix tokenizer * first version working * fix embeds * fix lm head * remove extra heads * fix convert script * handle encoder attn mask * style * better enc attn mask * override _prepare_attention_mask_for_generation * handle attn_mask in encoder and decoder * input_ids => input_features * enable use_cache * remove old code * expand embeddings if needed * remove logits bias * masked_lm_loss => loss * hack tokenizer to support feature processing * fix model_input_names * style * fix error message * doc * remove inputs_embeds * remove input_embeds * remove unnecessary docstring * quality * SpeechToText => Speech2Text * style * remove shared_embeds * subsample => conv * remove Speech2TextTransformerDecoderWrapper * update output_lengths formula * fix table * remove max_position_embeddings * update conversion scripts * add possibility to do upper case for now * add FeatureExtractor and Processor * add tests for extractor * require_torch_audio => require_torchaudio * add processor test * update import * remove classification head * attention mask is now 1D * update docstrings * attention mask should be of type long * handle attention mask from generate * always return attention_mask * fix test * style * doc * Speech2TextTransformer => Speech2Text * Speech2TextTransformerConfig => Speech2TextConfig * remove dummy_inputs * nit * style * multilingual tok * fix tokenizer * add tgt_lang setter * save lang_codes * fix tokenizer * add forced_bos_token_id to tokenizer * apply review suggestions * add torchaudio to extra deps * add speech deps to CI * fix dep * add libsndfile to ci * libsndfile1 * add speech to extras all * libsndfile1 -> libsndfile1 * libsndfile * libsndfile1-dev * apt update * add sudo to install * update deps table * install libsndfile1-dev on CI * tuple to list * init conv layer * add model tests * quality * add integration tests * skip_special_tokens * add speech_to_text_transformer in toctree * fix tokenizer * fix fp16 tests * add tokenizer tests * fix copyright * input_values => input_features * doc * add model in readme * doc * change checkpoint names * fix copyright * fix code example * add max_model_input_sizes in tokenizer * fix integration tests * add do_lower_case to tokenizer * remove clamp trick * fix "Add modeling imports here" * fix copyrights * fix tests * SpeechToTextTransformer => SpeechToText * fix naming * fix table formatting * fix typo * style * fix typos * remove speech dep from extras[testing] * fix copies * rename doc file, * put imports under is_torch_available * run feat extract tests when torch is available * dummy objects for processor and extractor * fix imports in tests * fix import in modeling test * fix imports * fix torch import * fix imports again * fix positional embeddings * fix typo in import * adapt new extractor refactor * style * fix torchscript test * doc * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix docs, copied from, style * fix docstring * handle imports * remove speech from all extra deps * remove s2t from seq2seq lm mapping * better names * skip training tests * add install instructions * List => Tuple * doc * fix conversion script * fix urls * add instruction for libsndfile * fix fp16 test Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-10 21:42:04 +05:30
Patrick von Platen	9a06b6b11b	[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor (#10594 ) * save first version * finish refactor * finish refactor * correct naming * correct naming * shorter names * Update src/transformers/feature_extraction_common_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * change name * finish Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-09 12:16:59 +03:00
Ratthachat (Jung)	696e8a4365	Add TFRag (#9002 ) * Create modeling_tf_dpr.py * Add TFDPR * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot last commit accidentally deleted these 4 lines, so I recover them back * Add TFDPR * Add TFDPR * clean up some comments, add TF input-style doc string * Add TFDPR * Make return_dict=False as default * Fix return_dict bug (in .from_pretrained) * Add get_input_embeddings() * Create test_modeling_tf_dpr.py The current version is already passed all 27 tests! Please see the test run at : https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing * fix quality * delete init weights * run fix copies * fix repo consis * del config_class, load_tf_weights They shoud be 'pytorch only' * add config_class back after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion * newline after .. note:: * import tf, np (Necessary for ModelIntegrationTest) * slow_test from_pretrained with from_pt=True At the moment we don't have TF weights (since we don't have official official TF model) Previously, I did not run slow test, so I missed this bug * Add simple TFDPRModelIntegrationTest Note that this is just a test that TF and Pytorch gives approx. the same output. However, I could not test with the official DPR repo's output yet * upload correct tf model * remove position_ids as missing keys * create modeling_tf_rag * add tests for tf * add tf tests * revert wrong pt commit * further refactor * further refactor * refactor * Update modeling_tf_rag.py - input_processing - fix prepare_input_for_generation (mostly fix generate bug) - bring back from_pretrained hack in order to test generate * delete colab pieces of code * Show case of greedy "generate" Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output. 
* cosmetic update * correct typos * update * push some progress * make easy check * fix rag save from pretrained * Update src/transformers/modeling_tf_utils.py * remove commented out lines * delete unnecessary lines * add simple test case for nq_checkpoint Add nq_checkpoint test to show that current version without hack still fails * temporarily put ugly hack back again * Add TFRagSequenceForGeneration!! * __init__.py , import TFRagSequenceForGeneration * Add TFRagSequence tests! * rag init.py - add TFRagSequenceForGeneration * fix from_pretrained * fix prepare_inputs_for_generation * Beam search for RagToken! * minor clean up * add tf.cast in TFRagModel * More tf.cast * Add all remaining tests (still have issues) * delete all T5 related * make style * fix load weight prefix * fix bart * fix return_dict for tf_rag make all tests pass .. Hooray * fix some tests * fix code quality * fix qualtiy check * finish tests tf rag * add tf rag to docs * remove TFT5 from docstring Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * remove TFT5 from docstring Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Delete outdated comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * improve doc strings * add generative model classes * fix adjust token logic * refactor generate for TFRag * using shape_list, not _get_shape Co-authored-by: Julien Plu <plu.julien@gmail.com> * axis=[1]->axis=1 * delete NEED_HELP comment * improve readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Indicating model is in a developing state in docstrings As suggested by Julien * small last changes * apply sylvains suggestions * finish tf rag Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: 
patrickvonplaten <patrick@huggingface.co> Co-authored-by: Julien Plu <plu.julien@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-09 00:49:51 +03:00
Sylvain Gugger	5469369480	Fix version control with anchors (#10595 ) * Fix version control with anchors * Simplify	2021-03-08 10:19:22 -05:00
Suraj Patil	f6e74a63ca	Add m2m100 (#10236 ) * m2m_100 * no layernorm_embedding * sinusoidal positional embeddings * update pos embeddings * add default config values * tokenizer * add conversion script * fix config * fix pos embed * remove _float_tensor * update tokenizer * update lang codes * handle lang codes * fix pos embeds * fix spm key * put embedding weights on device * remove qa and seq classification heads * fix convert script * lang codes pn one line * fix embeds * fix tokenizer * fix tokenizer * add fast tokenizer * style * M2M100MT => M2M100 * fix copyright, style * tokenizer converter * vocab file * remove fast tokenizer * fix embeds * fix tokenizer * fix tests * add tokenizer tests * add integration test * quality * fix model name * fix test * doc * doc * fix doc * add copied from statements * fix tokenizer tests * apply review suggestions * fix urls * fix shift_tokens_right * apply review suggestions * fix * fix doc * add lang code to id * remove unused function * update checkpoint names * fix copy * fix tokenizer * fix checkpoint names * fix merge issue * style	2021-03-06 22:14:16 +05:30
Stas Bekman	88a951e3cc	offline mode for firewalled envs (#10407 ) * offline mode start * add specific values * fix fallback * add test * better values check and range * test that actually works * document the offline mode * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more strict check * cleaner test * pt-only test * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-03-05 17:27:48 -08:00
lewtun	12b66215cf	Fix example of custom Trainer to reflect signature of compute_loss (#10537 )	2021-03-05 07:44:53 -05:00
Sylvain Gugger	948b730f97	Remove unsupported methods from ModelOutput doc (#10505 )	2021-03-03 14:55:18 -05:00
Jeff Yang	39f70a4058	feat(docs): navigate with left/right arrow keys (#10481 ) * feat(docs): navigate with left/right arrow keys * fix: add missing comma	2021-03-03 11:17:12 -05:00
Lysandre Debut	0c2325198f	Add I-BERT to README (#10462 )	2021-03-01 12:12:31 -05:00
Patrick von Platen	0234de8418	Add Fine-Tuning for Wav2Vec2 (#10145 ) * add encode labels function to tokenizer * start adding finetuning * init dropout * upload * correct convert script * apply changes * fix second typo * make first dummy training run * adapt convert script * push confg for comparison * remove conf * finish training * adapt data collator * add research folder * update according to fairseq feedback * some minor corrections * refactor masking indices a bit * some minor changes * clean tokenizer * finish clean-up * remove previous logic * update run script * correct training * finish changes * finish model * correct bug * fix training a bit more * add some tests * finish gradient checkpointing * finish example * correct gradient checkpointing * improve tokenization method * revert changes in tokenizer * revert general change * adapt fine-tuning * update * save intermediate test * Update README.md * finish finetuning * delete conversion script * Update src/transformers/models/wav2vec2/configuration_wav2vec2.py * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * finish wav2vec2 script * finish wav2vec2 fine-tuning * finalize test * correct test * adapt tests * finish * remove test file Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-03-01 12:13:17 +03:00
Patrick von Platen	3c733f3208	Update ibert.rst (#10445 )	2021-02-28 19:03:49 +03:00
Darigov Research	aeba4f95bb	Adds terms to Glossary (#10443 ) * feat: Adds three definitions to glossary from @cronoik Needed a definition for transformer which in turn needed 2 more definitions To do with issue https://github.com/huggingface/transformers/issues/9078 * fix: Adjusts definition of neural network to make it easier to read	2021-02-28 08:27:54 -05:00
Tanmay Garg	256482ac92	Introduce save_strategy training argument (#10286 ) * Introduce save_strategy training argument * deprecate EvaluationStrategy * collapse EvaluationStrategy and LoggingStrategy into a single IntervalStrategy enum * modify tests to use modified enum	2021-02-27 19:34:22 -05:00
Andrea Bacciu	b040e6efc1	Fix None in add_token_positions - issue #10210 (#10374 ) * Fix None in add_token_positions - issue #10210 Fix None in add_token_positions related to the issue #10210 * add_token_positions fix None values in end_positions vector add_token_positions fix None in end_positions vector as proposed by @joeddav	2021-02-25 09:18:33 -07:00
Sylvain Gugger	9d14be5c20	Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354 ) * Add support for ZeRO-2/3 and ZeRO-offload in fairscale * Quality * Rework from review comments * Add doc * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-02-25 11:07:53 -05:00
Sehoon Kim	63645b3b11	I-BERT model support (#10153 ) * IBertConfig, IBertTokentizer added * IBert Model names moified * tokenizer bugfix * embedding -> QuantEmbedding * quant utils added * quant_mode added to configuration * QuantAct added, Embedding layer + QuantAct addition * QuantAct added * unused path removed, QKV quantized * self attention layer all quantized, except softmax * temporarl commit * all liner layers quantized * quant_utils bugfix * bugfix: requantization missing * IntGELU added * IntSoftmax added * LayerNorm implemented * LayerNorm implemented all * names changed: roberta->ibert * config not inherit from ROberta * No support for CausalLM * static quantization added, quantize_model.py removed * import modules uncommented * copyrights fixed * minor bugfix * quant_modules, quant_utils merged as one file * import * fixed * unused runfile removed * make style run * configutration.py docstring fixed * refactoring: comments removed, function name fixed * unused dependency removed * typo fixed * comments(Copied from), assertion string added * refactoring: super(..) -> super(), etc. * refactoring * refarctoring * make style * refactoring * cuda -> to(x.device) * weight initialization removed * QuantLinear set_param removed * QuantEmbedding set_param removed * IntLayerNorm set_param removed * assert string added * assertion error message fixed * is_decoder removed * enc-dec arguments/functions removed * Converter removed * quant_modules docstring fixed * conver_slow_tokenizer rolled back * quant_utils docstring fixed * unused aruments e.g. 
use_cache removed from config * weight initialization condition fixed * x_min, x_max initialized with small values to avoid div-zero exceptions * testing code for ibert * test emb, linear, gelu, softmax added * test ln and act added * style reformatted * force_dequant added * error tests overrided * make style * Style + Docs * force dequant tests added * Fix fast tokenizer in init * Fix doc * Remove space * docstring, IBertConfig, chunk_size * test_modeling_ibert refactoring * quant_modules.py refactoring * e2e integration test added * tokenizers removed * IBertConfig added to tokenizer_auto.py * bugfix * fix docs & test * fix style num 2 * final fixes Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-25 10:06:42 -05:00
Patrick von Platen	cb38ffcc5e	[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324 ) * push to show * small improvement * small improvement * Update src/transformers/feature_extraction_utils.py * Update src/transformers/feature_extraction_utils.py * implement base * add common tests * make all tests pass for wav2vec2 * make padding work & add more tests * finalize feature extractor utils * add call method to feature extraction * finalize feature processor * finish tokenizer * finish general processor design * finish tests * typo * remove bogus file * finish docstring * add docs * finish docs * small fix * correct docs * save intermediate * load changes * apply changes * apply changes to doc * change tests * apply surajs recommend * final changes * Apply suggestions from code review * fix typo * fix import * correct docstring	2021-02-25 17:42:46 +03:00
abhishek thakur	9dc7825744	Remove unused variable in example for Q&A (#10392 )	2021-02-25 09:18:47 -05:00
Lysandre	3591844306	v4.3.3 docs	2021-02-24 15:19:01 -05:00
Stas Bekman	eab0afc19c	[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration (#10310 ) * implement gradient_accumulation_steps support in DeepSpeed integration * typo * cleanup * cleanup	2021-02-22 11:15:59 -08:00
Sylvain Gugger	9e147d31f6	Deprecate prepare_seq2seq_batch (#10287 ) * Deprecate prepare_seq2seq_batch * Fix last tests * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com> * More review comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-02-22 12:36:16 -05:00
Lysandre Debut	cd8c4c3fc2	DeBERTa-v2 fixes (#10328 ) Co-authored-by: Pengcheng He <penhe@microsoft.com> Co-authored-by: Pengcheng He <penhe@microsoft.com>	2021-02-22 07:45:18 -05:00
Pengcheng He	9a7e63729f	Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018 ) * Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models; * DeBERTa-v2 * Fix v2 model loading issue (#10129) * Doc members * Update src/transformers/models/deberta/modeling_deberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address Sylvain's comments * Address Patrick's comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-19 18:34:44 -05:00
Sylvain Gugger	f6e53e3c2b	Fix example links in the task summary (#10291 )	2021-02-19 18:04:15 -05:00
Stas Bekman	5da7c78ed8	update to new script; notebook notes (#10241 )	2021-02-17 15:58:08 -08:00
Joe Davison	4210cd96fc	fix add_token_positions fn (#10217 )	2021-02-16 14:00:05 -05:00
Suraj Patil	6fc940ed09	Add mBART-50 (#10154 ) * add tokenizer for mBART-50 * update tokenizers * make src_lang and tgt_lang optional * update tokenizer test * add setter * update docs * update conversion script * update docs * update conversion script * update tokenizer * update test * update docs * doc * address Sylvain's suggestions * fix test * fix formatting * nits	2021-02-15 20:58:54 +05:30
Sylvain Gugger	803498318c	[Doc] Fix version control in internal pages (#10124 )	2021-02-13 08:52:30 -05:00
Stas Bekman	b54cb0bd82	[DeepSpeed in notebooks] Jupyter + Colab (#10130 ) * init devices/setup explicitly * docs + test * simplify * cleanup * cleanup * cleanup * correct the required dist setup * derive local_rank from env LOCAL_RANK	2021-02-11 14:02:05 -08:00
Tanmay Thakur	2f3b5f4dcc	Add new community notebook - Blenderbot (#10126 ) * Update:community.md, new nb add * feat: updated grammar on nb description * Update: Train summarizer for BlenderBotSmall	2021-02-11 12:53:40 +03:00
Stas Bekman	7c07a47dfb	[DeepSpeed docs] new information (#9610 ) * how to specify a specific gpu * new paper * expand on buffer sizes * style * where to find config examples * specific example * small updates	2021-02-09 22:16:20 -08:00
Boris Dayma	7c7962ba89	doc: update W&B related doc (#10086 ) * doc: update W&B related doc * doc(wandb): mention report_to * doc(wandb): commit suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * doc(wandb): fix typo * doc(wandb): remove WANDB_DISABLED Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-09 14:47:52 -05:00
Sylvain Gugger	0c3d23dff7	Add patch releases to the doc	2021-02-09 14:17:09 -05:00
Lysandre Debut	78f4a0e7e5	Logging propagation (#10092 ) * Enable propagation by default * Document enable/disable default handler	2021-02-09 10:27:49 -05:00
Patrick von Platen	b972125ced	Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC (#10089 ) * add wav2vec2CTC and deprecate for maskedlm * remove from docs	2021-02-09 03:49:02 -05:00
Juan Cruz-Benito	e4bf9910dc	Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py (#10066 ) * Removing run_pl_glue.py from seq classification docs * Adding run_tf_text_classification.py * Using :prefix_link: to refer local files * Applying "make style" to the branch * Update docs/source/task_summary.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removing last underscores Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-08 13:04:21 -05:00
Lysandre	0dd579c9cf	Docs for v4.3.0	2021-02-08 18:53:24 +01:00
Sylvain Gugger	45aaf5f7ab	A few fixes in the documentation (#10033 )	2021-02-08 05:02:01 -05:00
Patrick von Platen	89be094e29	[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921 ) * add big bird * change teacher to mentor * add proposal template * adapt template * delete old template * correct some links * finish template * create big bird from template * add big bird * improve boxes * finish boxes * add pointers for BigBird * finish big bird * up * up * up * up * apply lysandres and sylvains suggestions * delete bogus file * correct markdown * try different style * try different style * finalize	2021-02-05 15:47:54 +03:00
Sylvain Gugger	3be965c5db	Update doc for pre-release (#10014 ) * Update doc for pre-release * Use stable as default * Use the right commit :facepalms:	2021-02-04 16:52:27 -05:00
Sylvain Gugger	b72f16b3ec	Fix doc for TFConvBertModel	2021-02-04 10:14:46 -05:00
demSd	00031785a8	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128 ) * initiliaze bart4causalLM * create BartDecoderWrapper, setters/getters * delete spaces * forward and additional methods * update cache function, loss function, remove ngram* params in data class. * add bartcausallm, bartdecoder testing * correct bart for causal lm * remove at * add mbart as well * up * fix typo * up * correct * add pegasusforcausallm * add blenderbotforcausallm * add blenderbotsmallforcausallm * add marianforcausallm * add test for MarianForCausalLM * add Pegasus test * add BlenderbotSmall test * add blenderbot test * fix a fail * fix an import fail * a fix * fix * Update modeling_pegasus.py * fix models * fix inputs_embeds setting getter * adapt tests * correct repo utils check * finish test improvement * fix tf models as well * make style * make fix-copies * fix copies * run all tests * last changes * fix all tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-04 11:56:12 +03:00
yylun	5442a11f5f	fix steps_in_epoch variable in trainer when using max_steps (#9969 ) * fix steps_in_epoch variable when using max_steps * redundant sentence * Revert "redundant sentence" This reverts commit `ad5c0e9b6e`. * remove redundant sentence Co-authored-by: wujindou <wujindou@sogou-inc.com>	2021-02-03 09:30:37 -05:00
Patrick von Platen	d6217fb30c	Wav2Vec2 (#9659 ) * add raw scaffold * implement feat extract layers * make style * remove + * correctly convert weights * make feat extractor work * make feature extraction proj work * run forward pass * finish forward pass * Succesful decoding example * remove unused files * more changes * add wav2vec tokenizer * add new structure * fix run forward * add other layer norm architecture * finish 2nd structure * add model tests * finish tests for tok and model * clean-up * make style * finish docstring for model and config * make style * correct docstring * correct tests * change checkpoints to fairseq * fix examples * finish wav2vec2 * make style * apply sylvains suggestions * apply lysandres suggestions * change print to log.info * re-add assert statement * add input_values as required input name * finish wav2vec2 tokenizer * Update tests/test_tokenization_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * apply sylvains suggestions Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-02-02 15:52:10 +03:00
Sylvain Gugger	de38a6e4d2	Fix 9918 (#9932 ) * Initial work * Fix doc styler and other models	2021-02-02 05:22:20 -05:00
Patrick von Platen	0e3be1ac8f	Add new model docs (#9667 ) * add new model logic * fix docs * change structure * improve add_new_model * push new changes * up * up * correct spelling * improve docstring * correct line length * update readme * correct links * correct typos * only add rst file for now * Apply suggestions from code review 1 Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> * Apply suggestions from code review Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> * finish adding all suggestions * make style * apply Niels feedback * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply sylvains suggestions Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-01 17:55:10 +03:00
Stas Bekman	40cfc355f1	[doc] nested markup is invalid in rst (#9898 ) Apparently nested markup in RST is invalid: https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible So currently this line doesn't get rendered properly, leaving inner markdown unrendered, resulting in: ``` https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible ``` This PR removes the bold which fixes the link.	2021-01-30 09:59:19 -05:00
Stas Bekman	15e4ce353a	[docs] expand install instructions (#9817 ) * expand install instructions * fix * white space * rewrite as discussed in the PR * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change the wording to encourage issue report Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-28 09:36:46 -08:00
Joe Davison	caddf9126b	tutorial typo	2021-01-28 09:21:58 -05:00
Stefan Schweter	5ed5a54684	ADD BORT (#9813 ) * tests: add integration tests for new Bort model * bort: add conversion script from Gluonnlp to Transformers 🚀 * bort: minor cleanup (BORT -> Bort) * add docs * make fix-copies * clean doc a bit * correct docs * Update docs/source/model_doc/bort.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/bort.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct dialogpt doc * correct link * Update docs/source/model_doc/bort.rst * Update docs/source/model_doc/dialogpt.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-27 21:25:11 +03:00
abhishek thakur	f617490e71	ConvBERT Model (#9717 ) * finalize convbert * finalize convbert * fix * fix * fix * push * fix * tf image patches * fix torch model * tf tests * conversion * everything aligned * remove print * tf tests * fix tf * make tf tests pass * everything works * fix init * fix * special treatment for sepconv1d * style * 🙏🏽 * add doc and cleanup * add electra test again * fix doc * fix doc again * fix doc again * Update src/transformers/modeling_tf_pytorch_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/conv_bert/configuration_conv_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/conv_bert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/conv_bert/configuration_conv_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * conv_bert -> convbert * more fixes from review * add conversion script * dont use pretrained embed * unused config * suggestions from julien * some more fixes * p -> param * fix copyright * fix doc * Update src/transformers/models/convbert/configuration_convbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * comments from reviews * fix-copies * fix style * revert shape_list Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-01-27 03:20:09 -05:00
Yusuke Mori	cb73ab5a38	Fix broken links in the converting tf ckpt document (#9791 ) * Fix broken links in the converting tf ckpt document * Update docs/source/converting_tensorflow_models.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Reflect the review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-26 03:37:57 -05:00
Sylvain Gugger	7acfa95afb	Add missing new line	2021-01-20 14:13:16 -05:00
Darigov Research	5a307ece82	Adds flashcards to Glossary & makes small corrections (#8949 ) * fix: Makes small typo corrections & standardises glossary * feat: Adds introduction & links to transformer flashcards * feat: Adds attribution & adjustments requested in #8949 * feat: Adds flashcards to community.md * refactor: Removes flashcards from glossary	2021-01-20 13:28:40 -05:00
NielsRogge	88583d4958	Add notebook (#9696 )	2021-01-20 10:19:26 -05:00
NielsRogge	d1370d29b1	Add DeBERTa head models (#9691 ) * Add DebertaForMaskedLM, DebertaForTokenClassification, DebertaForQuestionAnswering * Add docs and fix quality * Fix Deberta not having pooler	2021-01-20 10:18:50 -05:00
acul3	8940c7662d	Add t5 convert to transformers-cli (#9654 ) * Update run_mlm.py * add t5 model to transformers-cli convert * update run_mlm.py same as master * update converting model docs * update converting model docs * Update convert.py * Trigger notification * update import sorted * fix typo t5	2021-01-20 09:34:27 -05:00
Sylvain Gugger	76f36e183a	Add a community page to the docs (#9682 )	2021-01-20 04:54:36 -05:00
Stas Bekman	82498cbc37	[deepspeed doc] install issues + 1-gpu deployment (#9582 ) * [doc] install + 1-gpu deployment * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improvements Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-14 11:05:04 -08:00
Lysandre	e43f3b6190	v4.2.1 in docs	2021-01-14 14:25:30 +01:00
Lysandre	33a8497db8	v4.2.0 documentation	2021-01-13 16:15:40 +01:00
Lysandre	7d9a9d0c72	Release: v4.2.0	2021-01-13 16:01:51 +01:00
Julien Chaumond	247a7b2029	Doc: Update pretrained_models wording (#9545 ) * Update pretrained_models.rst To clarify things cf. this tweet for instance https://twitter.com/RTomMcCoy/status/1349094111505211395 * format	2021-01-13 05:58:05 -05:00
Stas Bekman	2df34f4aba	[trainer] deepspeed integration (#9211 ) * deepspeed integration * style * add test * ds wants to do its own backward * fp16 assert * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * for clarity extract what args are being passed to deepspeed * introduce the concept of self.wrapped_model * s/self.wrapped_model/self.model_wrapped/ * complete transition to self.wrapped_model / self.model * fix * doc * give ds its own init * add custom overrides, handle bs correctly * fix test * clean up model_init logic, fix small bug * complete fix * collapse --deepspeed_config into --deepspeed * style * start adding doc notes * style * implement hf2ds optimizer and scheduler configuration remapping * oops * call get_num_training_steps absolutely when needed * workaround broken auto-formatter * deepspeed_config arg is no longer needed - fixed in deepspeed master * use hf's fp16 args in config * clean * start on the docs * rebase cleanup * finish up --fp16 * clarify the supported stages * big refactor thanks to discovering deepspeed.init_distributed * cleanup * revert fp16 part * add checkpoint-support * more init ds into integrations * extend docs * cleanup * unfix docs * clean up old code * imports * move docs * fix logic * make it clear which file it's referring to * document nodes/gpus * style * wrong format * style * deepspeed handles gradient clipping * easier to read * major doc rewrite * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * docs * switch to AdamW optimizer * style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * clarify doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-12 19:05:18 -08:00
NielsRogge	e45eba3b1c	Improve LayoutLM (#9476 ) * Add LayoutLMForSequenceClassification and integration tests Improve docs Add LayoutLM notebook to list of community notebooks * Make style & quality * Address comments by @sgugger, @patrickvonplaten and @LysandreJik * Fix rebase with master * Reformat in one line * Improve code examples as requested by @patrickvonplaten Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-12 09:26:32 -05:00
Patrick von Platen	7f28613213	[TFBart] Split TF-Bart (#9497 ) * make templates ready * make add_new_model_command_ready * finish tf bart * prepare tf mbart * finish tf bart * add tf mbart * add marian * prep pegasus * add tf pegasus * push blenderbot tf * add blenderbot * add blenderbot small * clean-up * make fix copy * define blend bot tok * fix * up * make style * add to docs * add copy statements * overwrite changes * improve * fix docs * finish * fix last slow test * fix missing git conflict line * fix blenderbot * up * fix blenderbot small * load changes * finish copied from * upload fix	2021-01-12 02:06:32 +01:00
Sylvain Gugger	8d25df2c7a	Make doc styler detect lists on rst (#9488 )	2021-01-11 08:53:41 -05:00
Patrick von Platen	9e1ea846bc	[README] Add new models (#9465 ) * add new models * make fix-copies	2021-01-08 05:49:43 -05:00
Patrick von Platen	ae5a32bb0d	up (#9454 )	2021-01-07 11:51:02 +01:00
Simon Brandeis	c89f1bc92e	Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150 ) * Define new output dataclasses for greedy generation * Add output_[...] flags in greedy generation methods Added output_attentions, output_hidden_states, output_scores flags in generate and greedy_search methods in GenerationMixin. * [WIP] Implement logic and tests for output flags in generation * Update GreedySearchOutput classes & docstring * Implement greedy search output accumulation logic Update greedy_search unittests Fix generate method return value docstring Properly init flags with the default config * Update configuration to add output_scores flag * Fix test_generation_utils Sort imports and fix isinstance tests for GreedySearchOutputs * Fix typo in generation_utils * Add return_dict_in_generate for backwards compatibility * Add return_dict_in_generate flag in config * Fix tyPo in configuration * Fix handling of attentions and hidden_states flags * Make style & quality * first attempt attentions * some corrections * improve tests * special models requires special test * disable xlm test for now * clean tests * fix for tf * isort * Add output dataclasses for other generation methods * Add logic to return dict in sample generation * Complete test for sample generation - Pass output_attentions and output_hidden_states flags to encoder in encoder-decoder models - Fix import satements order in test_generation_utils file * Add logic to return dict in sample generation - Refactor tests to avoid using self.assertTrue, which provides scarce information when the test fails - Add tests for the three beam_search methods: vanilla, sample and grouped * Style doc * Fix copy-paste error in generation tests * Rename logits to scores and refactor * Refactor group_beam_search for consistency * make style * add sequences_scores * fix all tests * add docs * fix beam search finalize test * correct docstring * clean some files * Made suggested changes to the documentation * Style doc ? * Style doc using the Python util * Update src/transformers/generation_utils.py * fix empty lines * fix all test Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-01-06 17:11:42 +01:00
Qbiwan	ecfcac223c	Improve documentation coverage for Phobert (#9427 ) * first commit * change phobert to phoBERT as per author in overview * v3 and v4 both runs on same code hence there is no need to differentiate them Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-06 10:04:32 -05:00
Qbiwan	be898998bb	Improve documentation coverage for Herbert (#9428 ) * first commit * changed XLMTokenizer to HerbertTokenizer in code example	2021-01-06 09:13:43 -05:00
Patrick von Platen	b972c1bfb0	finalize (#9431 )	2021-01-06 14:36:55 +01:00
Sylvain Gugger	bcb55d33ce	Upgrade styler to better handle lists (#9423 ) * Add missing lines before a new list. * Update doc styler and restyle some files. * Fix docstrings of LED and Longformer	2021-01-06 07:46:17 -05:00
NielsRogge	b7e548976f	Fix URLs to TAPAS notebooks (#9435 )	2021-01-06 07:20:41 -05:00
Stas Bekman	d64372fdfc	[docs] outline sharded ddp doc (#9208 ) * outline sharded dpp doc * fix link * add example * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * narrow the command and remove non-essentials Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-05 17:34:15 -08:00
Patrick von Platen	eef66035a2	[PyTorch Bart] Split Bart into different models (#9343 ) * first try * remove old template * finish bart * finish mbart * delete unnecessary line * init pegasus * save intermediate * correct pegasus * finish pegasus * remove cookie cutter leftover * add marian * finish blenderbot * replace in file * correctly split blenderbot * delete "old" folder * correct "add statement" * adapt config for tf comp * correct configs for tf * remove ipdb * fix more stuff * fix mbart * push pegasus fix * fix mbart * more fixes * fix research projects code * finish docs for bart, mbart, and marian * delete unnecessary file * correct attn typo * correct configs * remove pegasus for seq class * correct peg docs * correct peg docs * finish configs * further improve docs * add copied from statements to mbart * fix copied from in mbart * add copy statements to marian * add copied from to marian * add pegasus copied from * finish pegasus * finish copied from * Apply suggestions from code review * make style * backward comp blenderbot * apply lysandres and sylvains suggestions * apply suggestions * push last fixes * fix docs * fix tok tests * fix imports code style * fix doc	2021-01-05 22:00:05 +01:00
Patrick von Platen	189387e9b2	LED (#9278 ) * create model * add integration * save current state * make integration tests pass * add one more test * add explanation to tests * remove from bart * add padding * remove unnecessary test * make all tests pass * re-add cookie cutter tests * finish PyTorch * fix attention test * Update tests/test_modeling_common.py * revert change * remove unused file * add string to doc * save intermediate * make tf integration tests pass * finish tf * fix doc * fix docs again * add led to doctree * add to auto tokenizer * added tips for led * make style * apply jplus statements * correct tf longformer * apply lysandres suggestions * apply sylvains suggestions * Apply suggestions from code review	2021-01-05 13:14:30 +01:00
Sugeeth	314cca2842	Fix documentation links always pointing to master. (#9217 ) * Use extlinks to point hyperlink with the version of code * Point to version on release and master until then * Apply style * Correct links * Add missing backtick * Simple missing backtick after all. Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-01-05 06:18:48 -05:00
Qbiwan	086718ac6e	Improve documentation coverage for Bertweet (#9379 ) * bertweet docs coverage * style doc max len 119 * maxlen style rst * run main() from style_doc * changed according to comments	2021-01-04 13:12:59 -05:00
Patrick von Platen	75ff530551	correct docs (#9378 )	2021-01-04 17:27:29 +01:00
Patrick von Platen	52b3a05e83	[Bart doc] Fix outdated statement (#9299 ) * fix bart doc * fix docs	2020-12-24 14:47:53 +01:00
Suraj Patil	88ef8893cd	Add caching mechanism to BERT, RoBERTa (#9183 ) * add past_key_values * add use_cache option * make mask before cutting ids * adjust position_ids according to past_key_values * flatten past_key_values * fix positional embeds * fix _reorder_cache * set use_cache to false when not decoder, fix attention mask init * add test for caching * add past_key_values for Roberta * fix position embeds * add caching test for roberta * add doc * make style * doc, fix attention mask, test * small fixes * adress patrick's comments * input_ids shouldn't start with pad token * use_cache only when decoder * make consistent with bert * make copies consistent * add use_cache to encoder * add past_key_values to tapas attention * apply suggestions from code review * make coppies consistent * add attn mask in tests * remove copied from longformer * apply suggestions from code review * fix bart test * nit * simplify model outputs * fix doc * fix output ordering	2020-12-23 23:01:32 +05:30
Connor Brinton	bcc87c639f	Minor documentation revisions from copyediting (#9266 ) * typo: Revise "checkout" to "check out" * typo: Change "seemlessly" to "seamlessly" * typo: Close parentheses in "Using the tokenizer" * typo: Add closing parenthesis to supported models aside * docs: Treat ``position_ids`` as plural Alternatively, the word "argument" could be added to make the subject singular. * docs: Remove comma, making subordinate clause * docs: Remove comma separating verb and direct object * docs: Fix typo ("next" -> "text") * docs: Reverse phrase order to simplify sentence * docs: "quicktour" -> "quick tour" * docs: "to throw" -> "from throwing" * docs: Remove disruptive newline in padding/truncation section * docs: "show exemplary" -> "show examples of" * docs: "much harder as" -> "much harder than" * docs: Fix typo "seach" -> "search" * docs: Fix subject-verb disagreement in WordPiece description * docs: Fix style in preprocessing.rst	2020-12-23 10:15:49 -05:00
Sylvain Gugger	490b39e614	Seq2seq trainer (#9241 ) * Add label smoothing in Trainer * Add options for scheduler and Adafactor in Trainer * Put Seq2SeqTrainer in the main lib * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments and adapt scripts * Documentation * Move test not using script to tests folder Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-22 11:33:44 -05:00
Sylvain Gugger	1fc7119181	Fix script that check objects are documented (#9259 )	2020-12-22 11:12:58 -05:00
Suraj Patil	f4432b7e01	add base model classes to bart subclassed models (#9230 ) * add base model classes to bart subclassed models * add doc	2020-12-21 19:56:46 +05:30
Stas Bekman	3ff5e8955a	[t5 doc] typos (#9199 ) * [t5 doc] typos a few run away backticks @sgugger * style	2020-12-18 16:03:26 -08:00
Sylvain Gugger	3e56e2ce04	Fix typo	2020-12-18 10:11:07 -05:00
sandip	467e9158b4	Added TF CTRL Sequence Classification (#9151 ) * Added TF CTRL Sequence Classification * code refactor	2020-12-17 18:10:57 -05:00
Lysandre	bd40345d3e	v4.1.1 docs	2020-12-17 11:28:38 -05:00
Lysandre	bfa4ccf77d	Release: v4.1.1	2020-12-17 11:25:49 -05:00
Lysandre	e0790cca78	Fix TAPAS doc	2020-12-17 11:25:05 -05:00
Sylvain Gugger	6d2e864db7	Put all models in the constants (#9170 ) * Put all models in the constants * Add Google AI mention in the main README	2020-12-17 11:23:21 -05:00
Lysandre	f83d9c8da7	v4.1.0 docs	2020-12-17 10:16:07 -05:00
Lysandre	f5438ab8a2	Release: v4.1.0	2020-12-17 10:04:55 -05:00
Lysandre	ac2c7e398f	Remove erroneous character	2020-12-17 09:47:19 -05:00
Lysandre Debut	1aca3d6afa	Add disclaimer to TAPAS rst file (#9167 ) Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2020-12-17 09:34:06 -05:00
Lysandre Debut	1c1a2ffbff	TableQuestionAnsweringPipeline (#9145 ) * AutoModelForTableQuestionAnswering * TableQuestionAnsweringPipeline * Apply suggestions from Patrick's code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Sylvain and Patrick comments * Better PyTorch/TF error message * Add integration tests * Argument Handler naming Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com> * Fix docs to appease the documentation gods Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-16 12:31:50 -05:00
Lysandre Debut	07384baf7a	AutoModelForTableQuestionAnswering (#9154 ) * AutoModelForTableQuestionAnswering * Update src/transformers/models/auto/modeling_auto.py * Style	2020-12-16 12:14:33 -05:00
Hayden Housen	34334662df	Add message to documentation that longformer doesn't support token_type_ids (#9152 ) * Add message to documentation that longformer doesn't support token_type_ids * Format changes	2020-12-16 11:06:14 -05:00
Patrick von Platen	640e6fe190	[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054 ) * save intermediate * save intermediate * save intermediate * correct flax bert model file * new module / model naming * make style * almost finish BERT * finish roberta * make fix-copies * delete keys file * last refactor * fixes in run_mlm_flax.py * remove pooled from run_mlm_flax.py` * fix gelu \| gelu_new * remove Module from inits * splits * dirty print * preventing warmup_steps == 0 * smaller splits * make fix-copies * dirty print * dirty print * initial_evaluation argument * declaration order fix * proper model initialization/loading * proper initialization * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug * removed tokenizers warning hack, fixed model re-initialization * reverted training_args.py changes * fix flax from pretrained * improve test in flax * apply sylvains tips * update init * make 0.3.0 compatible * revert tevens changes * revert tevens changes 2 * finalize revert * fix bug * add docs * add pretrained to init * Update src/transformers/modeling_flax_utils.py * fix copies * final improvements Co-authored-by: TevenLeScao <teven.lescao@gmail.com>	2020-12-16 13:03:32 +01:00
NielsRogge	1551e2dc6d	[WIP] Tapas v4 (tres) (#9117 ) * First commit: adding all files from tapas_v3 * Fix multiple bugs including soft dependency and new structure of the library * Improve testing by adding torch_device to inputs and adding dependency on scatter * Use Python 3 inheritance rather than Python 2 * First draft model cards of base sized models * Remove model cards as they are already on the hub * Fix multiple bugs with integration tests * All model integration tests pass * Remove print statement * Add test for convert_logits_to_predictions method of TapasTokenizer * Incorporate suggestions by Google authors * Fix remaining tests * Change position embeddings sizes to 512 instead of 1024 * Comment out positional embedding sizes * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES * Added more model names * Fix truncation when no max length is specified * Disable torchscript test * Make style & make quality * Quality * Address CI needs * Test the Masked LM model * Fix the masked LM model * Truncate when overflowing * More much needed docs improvements * Fix some URLs * Some more docs improvements * Test PyTorch scatter * Set to slow + minify * Calm flake8 down * First commit: adding all files from tapas_v3 * Fix multiple bugs including soft dependency and new structure of the library * Improve testing by adding torch_device to inputs and adding dependency on scatter * Use Python 3 inheritance rather than Python 2 * First draft model cards of base sized models * Remove model cards as they are already on the hub * Fix multiple bugs with integration tests * All model integration tests pass * Remove print statement * Add test for convert_logits_to_predictions method of TapasTokenizer * Incorporate suggestions by Google authors * Fix remaining tests * Change position embeddings sizes to 512 instead of 1024 * Comment out positional embedding sizes * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES * Added more model names * Fix truncation when no max length is specified * Disable torchscript test * Make style & make quality * Quality * Address CI needs * Test the Masked LM model * Fix the masked LM model * Truncate when overflowing * More much needed docs improvements * Fix some URLs * Some more docs improvements * Add add_pooling_layer argument to TapasModel Fix comments by @sgugger and @patrickvonplaten * Fix issue in docs + fix style and quality * Clean up conversion script and add task parameter to TapasConfig * Revert the task parameter of TapasConfig Some minor fixes * Improve conversion script and add test for absolute position embeddings * Improve conversion script and add test for absolute position embeddings * Fix bug with reset_position_index_per_cell arg of the conversion cli * Add notebooks to the examples directory and fix style and quality * Apply suggestions from code review * Move from `nielsr/` to `google/` namespace * Apply Sylvain's comments Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Rogge Niels <niels.rogge@howest.be> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2020-12-15 17:08:49 -05:00
sandip	389aba34bf	Added TF OpenAi GPT1 Sequence Classification (#9105 ) * TF OpenAI GPT Sequence Classification * Update src/transformers/models/openai/modeling_tf_openai.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-12-15 11:27:08 -05:00
Julien Plu	df3f4d2aef	Fix T5 and BART for TF (#9063 ) * Fix T5 for graphe compilation+execution * Fix BART * Fix import * Fix naming * fix attribute name * Oops * fix import * fix tests * fix tests * Update test * Add mising import * Address Patrick's comments * Style * Address Patrick's comment	2020-12-14 18:47:00 +01:00
Ahmed Elnaggar	a9c8bff724	Add parallelization support for T5EncoderModel (#9082 ) * add model parallelism to T5EncoderModel add model parallelism to T5EncoderModel * remove decoder from T5EncoderModel parallelize * uodate T5EncoderModel docs * Extend T5ModelTest for T5EncoderModel * fix T5Stask using range for get_device_map * fix style Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>	2020-12-14 12:00:45 -05:00
Stas Bekman	b00eb4fb02	Testing Experimental CI Features (#9070 )	2020-12-14 10:34:59 -05:00
Simon Brandeis	74daf1f954	Fixed a broken link in documentation (#9101 )	2020-12-14 09:12:27 -05:00
Julien Chaumond	3552d0e0d8	[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013 ) * rm all model cards * Update the .rst @sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler * Add a rootlevel README.md with simple instructions/context * Update docs/source/model_sharing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style * rm all model cards Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-11 18:24:42 -05:00
Sylvain Gugger	1310e1a758	Enforce all objects in the main init are documented (#9014 )	2020-12-10 11:57:12 -05:00
Sylvain Gugger	51e81e5895	MPNet copyright files (#9015 )	2020-12-10 09:29:38 -05:00
Patrick von Platen	06971ac4f9	[Bart] Refactor - fix issues, consistency with the library, naming (#8900 ) * remove make on the fly linear embedding * start refactor * big first refactor * save intermediate * save intermediat * correct mask issue * save tests * refactor padding masks * make all tests pass * further refactor * make pegasus test pass * fix bool if * fix leftover tests * continue * bart renaming * delete torchscript test hack * fix imports in tests * correct shift * fix docs and repo cons * re-add fix for FSTM * typo in test * fix typo * fix another typo * continue * hot fix 2 for tf * small fixes * refactor types linting * continue * finish refactor * fix import in tests * better bart names * further refactor and add test * delete hack * apply sylvains and lysandres commens * small perf improv * further perf improv * improv perf * fix typo * make style * small perf improv	2020-12-09 20:55:24 +01:00
StillKeepTry	df2af6d8b8	Add MP Net 2 (#9004 )	2020-12-09 10:32:43 -05:00
Patrick von Platen	da37a21c89	push (#9008 )	2020-12-09 15:14:33 +01:00
Sylvain Gugger	7e1d709e2a	Fix link to stable version in the doc navbar (#9007 )	2020-12-09 09:11:39 -05:00
Patrick von Platen	02d0e0355c	Diverse beam search 2 (#9006 ) * diverse beam search * bug fixes * bug fixes * bug fix * separate out diverse_beam_search function * separate out diverse_beam_search function * bug fix * improve code quality * bug fix * bug fix * separate out diverse beam search scorer * code format * code format * code format * code format * add test * code format * documentation changes * code quality * add slow integration tests * more general name * refactor into logits processor * add test * avoid too much copy paste * refactor * add to docs * fix-copies * bug fix * Revert "bug fix" This reverts commit `c99eb5a8dc`. * improve comment * implement sylvains feedback Co-authored-by: Ayush Jain <a.jain@sprinklr.com> Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>	2020-12-09 15:00:37 +01:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970 ) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Navjot	c108d0b5a4	add max_length to showcase the use of truncation (#8975 )	2020-12-07 18:35:39 -05:00
sandip	483e13273f	Add TFGPT2ForSequenceClassification based on DialogRPT (#8714 ) * Add TFGPT2ForSequenceClassification based on DialogRPT * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * code refactor for latest other TF PR * code refactor * code refactor * Update modeling_tf_gpt2.py	2020-12-07 16:58:37 +01:00
Lysandre Debut	0c5615af66	Put Transformers on Conda (#8918 ) * conda * Guide * correct tag * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/installation.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Sylvain's comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-03 14:28:49 -05:00
Julien Chaumond	9ad6194318	Tweak wording + Add badge w/ number of models on the hub (#8914 ) * Add badge w/ number of models on the hub * try to apease @sgugger 😇 * not sure what this `c` was about [ci skip] * Fix script and move stuff around * Fix doc styling error Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2020-12-03 10:56:55 -05:00
sandip	f6b44e6190	Transfoxl seq classification (#8868 ) * Transfoxl sequence classification * Transfoxl sequence classification	2020-12-02 10:08:32 -05:00
elk-cloner	4a9e502a36	Ctrl for sequence classification (#8812 ) * add CTRLForSequenceClassification * pass local test * merge with master * fix modeling test for sequence classification * fix deco * fix assert	2020-12-01 09:49:27 +01:00
LysandreJik	9995a341c9	Update docs	2020-11-30 12:07:52 -05:00
LysandreJik	22b0ff757a	Release: v4.0.0	2020-11-30 12:07:43 -05:00
Sylvain Gugger	75f8100fc7	Add a direct link to the big table (#8850 )	2020-11-30 10:29:23 -05:00
Ahmed Elnaggar	40ecaf0c2b	Add T5 Encoder for Feature Extraction (#8717 ) * Add T5 Encoder class for feature extraction * fix T5 encoder add_start_docstrings indent * update init with T5 encoder * update init with TFT5ModelEncoder * remove TFT5ModelEncoder * change T5ModelEncoder order in init * add T5ModelEncoder to transformers init * clean T5ModelEncoder * update init with TFT5ModelEncoder * add TFModelEncoder for Tensorflow * update init with TFT5ModelEncoder * Update src/transformers/models/t5/modeling_t5.py change output from Seq2SeqModelOutput to BaseModelOutput Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * remove encoder_outputs 1. remove encoder_outputs from the function call. 2. remove the encoder_outputs If statement. 3. remove isinstance from return_dict. * Authorize missing decoder keys * remove unnecessary input parameters remove pask_key_values and use_cache * remove use_cache remove use_cache from the forward method * add doctoring for T5 encoder add doctoring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING * change return_dict to dot access * add T5_ENCODER_INPUTS_DOCSTRING for TF T5 * change TFT5Encoder output type to BaseModelOutput * remove unnecessary parameters for TFT5Encoder * remove unnecessary if statement * add import BaseModelOutput * fix BaseModelOutput typo to TFBaseModelOutput * update T5 doc with T5ModelEncoder * add T5ModelEncoder to tests * finish pytorch * finish docs and mt5 * add mtf to init * fix init * remove n_positions * finish PR * Update src/transformers/models/mt5/modeling_mt5.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/t5/modeling_tf_t5.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/mt5/modeling_tf_mt5.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-30 08:34:40 +01:00
Lysandre Debut	610cb106a2	Migration guide from v3.x to v4.x (#8763 ) * Migration guide from v3.x to v4.x * Better wording * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Sylvain's comments * Better wording. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-29 20:13:07 -05:00
Guy Rosin	3a08cc1ce7	Minor docs typo fixes (#8797 ) * Fix minor typos * Additional typos * Style fix Co-authored-by: guyrosin <guyrosin@assist-561.cs.technion.ac.il>	2020-11-29 11:27:00 -05:00
Stas Bekman	00ea45659f	suggest a numerical limit of 50MB for determining @slow (#8824 )	2020-11-27 16:04:54 -05:00
Moussa Kamal Eddine	81fe0bf085	Add barthez model (#8393 ) * Add init barthez * Add barthez model, tokenizer and docs BARThez is a pre-trained french seq2seq model that uses BART objective. * Apply suggestions from code review docs typos Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add license * Change URLs scheme * Remove barthez model keep tokenizer * Fix style * Fix quality * Update tokenizer * Add fast tokenizer * Add fast tokenizer test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-27 12:31:42 -05:00
Patrick von Platen	2a6fbe6a40	[XLNet] Fix mems behavior (#8567 ) * fix mems in xlnet * fix use_mems * fix use_mem_len * fix use mems * clean docs * fix tf typo * make xlnet tf for generation work * fix tf test * refactor use cache * add use cache for missing models * correct use_cache in generate * correct use cache in tf generate * fix tf * correct getattr typo * make sylvain happy * change in docs as well * do not apply to cookie cutter statements * fix tf test * make pytorch model fully backward compatible	2020-11-25 16:54:59 -05:00
Sylvain Gugger	4821ea5aeb	Big model table (#8774 ) * First draft * Styling * With all changes staged * Update docs/source/index.rst Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Styling Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-11-25 12:02:15 -05:00
Lysandre Debut	02f48b9bfc	Model parallel documentation (#8741 ) * Add parallelize methods to the .rst files * Correct format	2020-11-23 20:14:48 -05:00
Colin Brochtrup	8ffc01a76a	Add early stopping callback to pytorch trainer (#8581 ) * Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer * Add early stopping test * Set patience counter to 0 if best metric not defined yet * Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on. * Run make style * make funciton name sensible * Improve new argument docstring wording and hope that flakey CI test passes. * Use on_evaluation callback instead of custom. Remove some debug printing * Move early stopping arguments and state into early stopping callback * Run make style * Remove old code * Fix docs formatting. make style went rogue on me. * Remove copied attributes and fix variable * Add assertions on training arguments instead of mutating them. Move comment out of public docs. * Make separate test for early stopping callback. Add test of invalid arguments. * Run make style... I remembered before CI this time! * appease flake8 * Add EarlyStoppingCallback to callback docs * Make docstring EarlyStoppingCallabck match other callbacks. * Fix typo in docs	2020-11-23 17:25:35 -05:00
Sylvain Gugger	900024273b	Change default cache path (#8734 ) * Change default cache path * Document changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-23 13:56:45 -05:00
Santiago Castro	e1f3156b21	Fix many typos (#8708 )	2020-11-21 22:58:10 -05:00
Sylvain Gugger	cb3e5c33f7	Fix a few last paths for the new repo org (#8666 )	2020-11-19 11:56:42 -05:00
elk-cloner	5362bb8a6b	Tf longformer for sequence classification (#8231 ) * working on LongformerForSequenceClassification * add TFLongformerForMultipleChoice * add TFLongformerForTokenClassification * use add_start_docstrings_to_model_forward * test TFLongformerForSequenceClassification * test TFLongformerForMultipleChoice * test TFLongformerForTokenClassification * remove test from repo * add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice * add requested classes to modeling_tf_auto.py update dummy_tf_objects fix tests fix bugs in requested classes * pass all tests except test_inputs_embeds * sync with master * pass all tests except test_inputs_embeds * pass all tests * pass all tests * work on test_inputs_embeds * fix style and quality * make multi choice work * fix TFLongformerForTokenClassification signature * fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature * fix mult choice * fix mc hint * fix input embeds * fix input embeds * refactor input embeds * fix copy issue * apply sylvains changes and clean more Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-19 10:37:27 -05:00
cronoik	dcc9c64299	Updated the Extractive Question Answering code snippets (#8636 ) * Updated the Extractive Question Answering code snippets The Extractive Question Answering code snippets do not work anymore since the models return task-specific output objects. This commit fixes the pytorch and tensorflow examples but adding `.values()` to the model call. * Update task_summary.rst	2020-11-18 18:56:47 -05:00
Patrick von Platen	cdfa56afe0	[Tokenizer Doc] Improve tokenizer summary (#8622 ) * improve summary * small fixes * cleaned line length * correct "" formatting * apply sylvains suggestions	2020-11-18 17:14:15 +01:00
Lysandre Debut	3095ee9dab	Tokenizers should be framework agnostic (#8599 ) * Tokenizers should be framework agnostic * Run the slow tests * Not testing * Fix documentation * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-17 14:03:03 -05:00
Julien Chaumond	042a6aa777	Tokenizers: ability to load from model subfolder (#8586 ) * <small>tiny typo</small> * Tokenizers: ability to load from model subfolder * use subfolder for local files as well * Uniformize model shortcut name => model id * from s3 => from huggingface.co Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	2020-11-17 08:58:45 -05:00
Patrick von Platen	5104223552	[MT5] More docs (#8589 ) * add docs * make style	2020-11-17 12:47:57 +01:00
Patrick von Platen	86822a358b	T5 & mT5 (#8552 ) * add mt5 and t5v1_1 model * fix tests * correct some imports * add tf model * finish tf t5 * improve examples * fix copies * clean doc	2020-11-17 12:23:09 +01:00
Sylvain Gugger	c89bdfbe72	Reorganize repo (#8580 ) * Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in convert command	2020-11-16 21:43:42 -05:00
Sylvain Gugger	1073a2bde5	Switch `return_dict` to `True` by default. (#8530 ) * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Run on the real suite * Fix slow tests	2020-11-16 11:43:00 -05:00
Julien Chaumond	725269746b	Model sharing doc: more tweaks (#8520 ) * More doc tweaks * Update model_sharing.rst * make style * missing newline * Add email tip Co-authored-by: Pierric Cistac <pierric@huggingface.co>	2020-11-13 12:10:26 -05:00
Sylvain Gugger	bb03a14edd	Update doc for v3.5.1	2020-11-13 10:29:58 -05:00
Sylvain Gugger	7933054638	Model sharing doc (#8498 ) * Model sharing doc * Style	2020-11-12 11:53:23 -05:00
Chengxi Guo	d65e0bfea3	Fix doc bug (#8500 ) * fix doc bug Signed-off-by: mymusise <mymusise1@gmail.com> * fix example bug Signed-off-by: mymusise <mymusise1@gmail.com>	2020-11-12 11:47:23 -05:00
Funtowicz Morgan	a5b682329c	Flax/Jax documentation (#8331 ) * First addition of Flax/Jax documentation Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * make style * Ensure input order match between Bert & Roberta Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Install dependencies "all" when building doc Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * wraps build_doc deps with "" Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Addressing @sgugger comments. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use list to highlight JAX features. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Make style. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Let's not look to much into the future for now. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-11-11 14:53:36 -05:00
Ratthachat (Jung)	026a2ff225	Add TFDPR (#8203 ) * Create modeling_tf_dpr.py * Add TFDPR * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot last commit accidentally deleted these 4 lines, so I recover them back * Add TFDPR * Add TFDPR * clean up some comments, add TF input-style doc string * Add TFDPR * Make return_dict=False as default * Fix return_dict bug (in .from_pretrained) * Add get_input_embeddings() * Create test_modeling_tf_dpr.py The current version is already passed all 27 tests! Please see the test run at : https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing * fix quality * delete init weights * run fix copies * fix repo consis * del config_class, load_tf_weights They shoud be 'pytorch only' * add config_class back after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion * newline after .. note:: * import tf, np (Necessary for ModelIntegrationTest) * slow_test from_pretrained with from_pt=True At the moment we don't have TF weights (since we don't have official official TF model) Previously, I did not run slow test, so I missed this bug * Add simple TFDPRModelIntegrationTest Note that this is just a test that TF and Pytorch gives approx. the same output. However, I could not test with the official DPR repo's output yet * upload correct tf model * remove position_ids as missing keys Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: patrickvonplaten <patrick@huggingface.co>	2020-11-11 12:28:09 -05:00
Santiago Castro	8fe6629bb4	Add missing tasks to `pipeline` docstring (#8428 )	2020-11-10 13:44:25 -05:00
Stas Bekman	02bdfc0251	using multi_gpu consistently (#8446 ) * s|multiple_gpu|multi_gpu|g; s|multigpu|multi_gpu|g' * doc	2020-11-10 13:23:58 -05:00
Stas Bekman	e21340da7a	[testing utils] get_auto_remove_tmp_dir more intuitive behavior (#8401 ) * [testing utils] get_auto_remove_tmp_dir default change Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal. 99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug. * simplify things * update docs * fix doc layout * style * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * better 3-state doc * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * s/tmp/temporary/ + style * correct the statement Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-10 11:57:21 -05:00
Sam Shleifer	c314b1fd3b	[docs] improve bart/marian/mBART/pegasus docs (#8421 )	2020-11-10 10:18:34 -05:00
Lysandre	aec51e5696	v3.5.0 documentation	2020-11-10 08:58:47 -05:00
Lysandre	818878dc88	Release: v3.5.0	2020-11-10 08:50:43 -05:00
Lysandre Debut	9cebee38ad	Model sharing rst (#8439 ) * Update RST * Finer details * Re-organize * Style	2020-11-10 08:35:11 -05:00
Stas Bekman	ef032ddd1e	[docs] [testing] gpu decorators table (#8422 ) * gpu decorators table * whitespace * Update docs/source/testing.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * whitespace Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-09 14:27:42 -05:00
Sam Shleifer	46509d1c19	[docs] remove sshleifer from issue-template :( (#8418 )	2020-11-09 12:51:38 -05:00
Patrick von Platen	9c83b96e62	[Tests] Add Common Test for Training + Fix a couple of bugs (#8415 ) * add training tests * correct longformer * fix docs * fix some tests * fix some more train tests * remove ipdb * fix multiple edge case model training * fix funnel and prophetnet * clean gpt models * undo renaming of albert	2020-11-09 18:24:41 +01:00
Yossi Synett	bc0d26d1de	[All Seq2Seq model + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071 ) * Output cross-attention with decoder attention output * Update src/transformers/modeling_bert.py * add cross-attention for t5 and bart as well * fix tests * correct typo in docs * add sylvains and sams comments * correct typo Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-06 19:34:48 +01:00
Leandro von Werra	17450397a7	Docs bart training ref (#8330 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-05 17:20:57 -05:00
Stas Bekman	d787935a14	[s2s] test_distributed_eval (#8315 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-05 16:01:15 -05:00
Guillaume Filion	27b402cab0	Output global_attentions in Longformer models (#7562 ) * Output global_attentions in Longformer models * make style * small refactoring * fix tests * make fix-copies * add for tf as well * remove comments in test * make fix-copies * make style * add docs * make docstring pretty Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-11-05 21:10:43 +01:00
Patrick von Platen	a1bbcf3f6c	Refactoring the generate() function (#6949 ) * first draft * show design proposition for new generate method * up * make better readable * make first version * gpt2 tests pass * make beam search for gpt2 work * add first encoder-decoder code * delete typo * make t5 work * save indermediate * make bart work with beam search * finish beam search bart / t5 * add default kwargs * make more tests pass * fix no bad words sampler * some fixes and tests for all distribution processors * fix test * fix rag slow tests * merge to master * add nograd to generate * make all slow tests pass * speed up generate * fix edge case bug * small fix * correct typo * add type hints and docstrings * fix typos in tests * add beam search tests * add tests for beam scorer * fix test rag * finish beam search tests * move generation tests in seperate file * fix generation tests * more tests * add aggressive generation tests * fix tests * add gpt2 sample test * add more docstring * add more docs * finish doc strings * apply some more of sylvains and sams comments * fix some typos * make fix copies * apply lysandres and sylvains comments * final corrections on examples * small fix for reformer	2020-11-03 16:04:22 +01:00
Sam Shleifer	566b083eb1	TFMarian, TFMbart, TFPegasus, TFBlenderbot (#7987 ) * Start plumbing * Marian close * Small stubs for all children * Fixed bart * marian working * pegasus test is good, but failing * Checkin tests * More model files * Subtle marian, pegasus integration test failures * Works well * rm print * boom boom * Still failing model2doc * merge master * Equivalence test failing, all others fixed * cleanup * Fix embed_scale * Cleanup marian pipeline test * Undo extra changes * Smaller delta * Cleanup model testers * undo delta * fix tests import structure * cross test decorator * Cleaner set_weights * Respect authorized_unexpected_keys * No warnings * No warnings * style * Nest tf import * black * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * functional dropout * fixup * Fixup * style_doc * embs * shape list * delete slow force_token_id_to_be_generated func * fixup Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-30 11:23:16 -04:00
Santiago Castro	969859d5f6	Fix doc errors and typos across the board (#8139 ) * Fix doc errors and typos across the board * Fix a typo * Fix the CI * Fix more typos * Fix CI * More fixes * Fix CI * More fixes * More fixes	2020-10-29 10:33:33 -04:00
Sylvain Gugger	6241c873cd	Document the various LM Auto models (#8118 )	2020-10-28 13:41:56 -04:00
Stas Bekman	5423f2a9d4	[testing] port test_trainer_distributed to distributed pytest + TestCasePlus enhancements (#8107 ) * move the helper code into testing_utils * port test_trainer_distributed to work with pytest * improve docs * simplify notes * doc * doc * style * doc * further improvements * torch might not be available * real fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-28 11:51:32 -04:00
Davide Fiocco	995006eabb	Add AzureML in integrations via dedicated callback (#8062 ) * first attempt to add AzureML callbacks * func arg fix * var name fix, but still won't fix error... * fixing as in https://discuss.huggingface.co/t/how-to-integrate-an-azuremlcallback-for-logging-in-azure/1713/2 * Avoid lint check of azureml import * black compliance * Make isort happy * Fix point typo in docs * Add AzureML to Callbacks docs * Attempt to make sphinx happy * Format callback docs * Make documentation style happy * Make docs compliant to style Co-authored-by: Davide Fiocco <davide.fiocco@frontiersin.net>	2020-10-27 14:21:54 -04:00
Sylvain Gugger	08f534d2da	Doc styling (#8067 ) * Important files * Styling them all * Revert "Styling them all" This reverts commit `7d029395fd`. * Styling them for realsies * Fix syntax error * Fix benchmark_utils * More fixes * Fix modeling auto and script * Remove new line * Fixes * More fixes * Fix more files * Style * Add FSMT * More fixes * More fixes * More fixes * More fixes * Fixes * More fixes * More fixes * Last fixes * Make sphinx happy	2020-10-26 18:26:02 -04:00
Sylvain Gugger	04a17f8550	Doc fixes in preparation for the docstyle PR (#8061 ) * Fixes in preparation for doc styling * More fixes * Better syntax * Fixes * Style * More fixes * More fixes	2020-10-26 15:01:09 -04:00
Yusuke Mori	a9ac1db276	Minor error fix of 'bart-large-cnn' details in the pretrained_models doc (#8053 )	2020-10-26 11:05:16 -04:00
Samuel	fc2d6eac3c	Minor typo fixes to the preprocessing tutorial in the docs (#8046 ) * Fix minor typos Fix minor typos in the docs. * Update docs/source/preprocessing.rst Clearer data structure description. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-26 10:22:29 -04:00
noise-field	c48b16b8da	Mlflow integration callback (#8016 ) * Add MLflow integration class Add integration code for MLflow in integrations.py along with the code that checks that MLflow is installed. * Add MLflowCallback import Add import of MLflowCallback in trainer.py * Handle model argument Allow the callback to handle model argument and store model config items as hyperparameters. * Log parameters to MLflow in batches MLflow cannot log more than a hundred parameters at once. Code added to split the parameters into batches of 100 items and log the batches one by one. * Fix style * Add docs on MLflow callback * Fix issue with unfinished runs The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed of and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.	2020-10-26 09:41:58 -04:00
Stas Bekman	101186bc1f	[docs] [testing] distributed training (#7993 ) * distributed training * fix * fix formatting * wording	2020-10-26 08:15:05 -04:00
Samuel	9aa2826687	Minor typo fixes to the tokenizer summary (#8045 ) Minor typo fixes to the tokenizer summary	2020-10-26 08:08:33 -04:00
Stas Bekman	8348105692	[testing] slow tests should be marked as slow (#7895 ) * slow tests should be slow * exception note * style * integrate LysandreJik's notes with some expansions * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * another slow test * fix link, and prose * clarify. * note from Sam * typo Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-22 06:34:05 -04:00
Sam Shleifer	829842159e	Add TFBartForConditionalGeneration (#5411 ) * half done * doc improvement * Cp test file * brokedn * broken test * undo some mess * ckpt * borked * Halfway * 6 passing * boom boom * Much progress but still 6 * boom boom * merged master * 10 passing * boom boom * Style * no t5 changes * 13 passing * Integration test failing, but not gibberish * Frustrated * Merged master * 4 fail * 4 fail * fix return_dict * boom boom * Still only 4 * prepare method * prepare method * before delete classif * Skip tests to avoid adding boilerplate * boom boom * fast tests passing * style * boom boom * Switch to supporting many input types * remove FIXMENORM * working * Fixed past_key_values/decoder_cached_states confusion * new broken test * Fix attention mask kwarg name * undo accidental * Style and reviewers * style * Docs and common tests * Cleaner assert messages * copy docs * style issues * Sphinx fix * Simplify caching logic * test does not require torch * copy _NoLayerEmbedTokens * Update src/transformers/modeling_tf_bart.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update tests/test_modeling_tf_bart.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_bart.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_bart.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_bart.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Line length and dont document None * Add pipeline test coverage * assert msg * At parity * Assert messages * mark slow * Update compile test * back in init * Merge master * Fix tests Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-21 13:10:16 +02:00
Joe Davison	13842e413c	PPL guide minor code snippet fix (#7938 )	2020-10-20 16:17:39 -06:00
Lysandre Debut	96f4828ace	Respect the 119 line chars (#7928 )	2020-10-20 11:02:47 -04:00
Lysandre	ef0ac063c9	Docs for v3.4.0	2020-10-20 16:29:00 +02:00
Lysandre	eb0e0ce2ad	Release: v3.4.0	2020-10-20 16:22:26 +02:00
Patrick von Platen	ffd675b42c	add summary (#7927 )	2020-10-20 10:11:02 -04:00
Lysandre Debut	5547b40b13	labels and decoder_input_ids to Glossary (#7906 ) * labels and decoder_input_ids to Glossary * Formatting fixes * Update docs/source/glossary.rst Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * sam's comments Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-20 09:50:47 -04:00
Stas Bekman	3e31e7f956	[testing] rename skip targets + docs (#7863 ) * rename skip targets + docs * fix quotes * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * small improvements * fix Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-20 04:39:13 -04:00
Patrick von Platen	e3d2bee8d0	fix t5 training docstring (#7911 )	2020-10-19 21:49:47 +02:00
Quentin Lhoest	033f29c625	Allow Custom Dataset in RAG Retriever (#7763 ) * add CustomHFIndex * typo in config * update tests * add custom dataset example * clean script * update test data * minor in test * docs * docs * style * fix imports * allow to pass the indexed dataset directly * update tests * use multiset DPR * address thom and patrick's comments * style * update dpr tokenizer * add output_dir flag in use_own_knowledge_dataset.py * allow custom datasets in examples/rag/finetune.py * add test for custom dataset in distributed rag retriever	2020-10-19 19:42:45 +02:00
Weizhen	2422cda01b	ProphetNet (#7157 ) * add new model prophetnet prophetnet modified modify codes as suggested v1 add prophetnet test files * still bugs, because of changed output formats of encoder and decoder * move prophetnet into the latest version * clean integration tests * clean tokenizers * add xlm config to init * correct typo in init * further refactoring * continue refactor * save parallel * add decoder_attention_mask * fix use_cache vs. past_key_values * fix common tests * change decoder output logits * fix xlm tests * make common tests pass * change model architecture * add tokenizer tests * finalize model structure * no weight mapping * correct n-gram stream attention mask as discussed with qweizhen * remove unused import * fix index.rst * fix tests * delete unnecessary code * add fast integration test * rename weights * final weight remapping * save intermediate * Descriptions for Prophetnet Config File * finish all models * finish new model outputs * delete unnecessary files * refactor encoder layer * add dummy docs * code quality * fix tests * add model pages to doctree * further refactor * more refactor, more tests * finish code refactor and tests * remove unnecessary files * further clean up * add docstring template * finish tokenizer doc * finish prophetnet * fix copies * fix typos * fix tf tests * fix fp16 * fix tf test 2nd try * fix code quality * add test for each model * merge new tests to branch * Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/modeling_prophetnet.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update utils/check_repo.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * apply sams and sylvains comments * make style * remove unnecessary code * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/configuration_prophetnet.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * implement lysandres comments * correct docs * fix isort * fix tokenizers * fix copies Co-authored-by: weizhen <weizhen@mail.ustc.edu.cn> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-19 17:36:09 +02:00
Stas Bekman	4eb61f8e88	remove USE_CUDA (#7861 )	2020-10-19 07:08:34 -04:00
Thomas Wolf	ba8c4d0ac0	[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 ) * splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * spliting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece 🎉 * and removed hard dependency on tokenizers 🎉 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update ad fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-18 20:51:24 +02:00
Katarina Slama	dfa4c26bc0	Typo and fix the input of labels to `cross_entropy` (#7841 ) The current version caused some errors. The changes fixed it for me. Hope this is helpful!	2020-10-15 19:36:31 -04:00
Sylvain Gugger	a1d1b332d0	Add predict step accumulation (#7767 ) * Add eval_accumulation_step and clean distributed eval * Add TPU test * Add TPU stuff * Fix arg name * Fix Seq2SeqTrainer * Fix total_size * Update src/transformers/trainer_pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Doc and add test to TPU * Add unit test * Adapt name Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-14 11:41:45 -04:00
Tiger	7e73c12805	fixed lots of typos. (#7758 )	2020-10-13 10:00:20 -04:00
Felipe Curti	dcba9ee03b	Gpt1 for sequence classification (#7683 ) * Add Documentation for GPT-1 Classification * Add GPT-1 with Classification head * Add tests for GPT-1 Classification * Add GPT-1 For Classification to auto models * Remove authorized missing keys, change checkpoint to openai-gpt	2020-10-13 05:06:15 -04:00
Sylvain Gugger	2c9e83f7b8	Fix title level in Blenderbot doc (#7687 )	2020-10-09 19:24:10 -04:00
Sylvain Gugger	a3cea6a8cc	Better links for models in README and doc index (#7680 )	2020-10-09 11:17:16 -04:00
sgugger	bc00b37a0d	Revert "Better model links in the README and index" This reverts commit `76e05518bb`.	2020-10-09 10:56:13 -04:00
sgugger	76e05518bb	Better model links in the README and index	2020-10-09 10:54:40 -04:00
Noah Trenaman	5668fdb09e	Update XLM-RoBERTa details (#7669 )	2020-10-09 05:16:58 -04:00
Thomas Wolf	9aeacb58ba	Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141 ) * [WIP] SP tokenizers * fixing tests for T5 * WIP tokenizers * serialization * update T5 * WIP T5 tokenization * slow to fast conversion script * Refactoring to move tokenzier implementations inside transformers * Adding gpt - refactoring - quality * WIP adding several tokenizers to the fast world * WIP Roberta - moving implementations * update to dev4 switch file loading to in-memory loading * Updating and fixing * advancing on the tokenizers - updating do_lower_case * style and quality * moving forward with tokenizers conversion and tests * MBart, T5 * dumping the fast version of transformer XL * Adding to autotokenizers + style/quality * update init and space_between_special_tokens * style and quality * bump up tokenizers version * add protobuf * fix pickle Bert JP with Mecab * fix newly added tokenizers * style and quality * fix bert japanese * fix funnel * limite tokenizer warning to one occurence * clean up file * fix new tokenizers * fast tokenizers deep tests * WIP adding all the special fast tests on the new fast tokenizers * quick fix * adding more fast tokenizers in the fast tests * all tokenizers in fast version tested * Adding BertGenerationFast * bump up setup.py for CI * remove BertGenerationFast (too early) * bump up tokenizers version * Clean old docstrings * Typo * Update following Lysandre comments Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2020-10-08 11:32:16 +02:00
Sam Shleifer	960faaaf28	Blenderbot (#7418 ) Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-07 19:09:23 -04:00
Sylvain Gugger	08ba4b4902	Trainer callbacks (#7596 ) * Initial callback proposal * Finish various callbacks * Post-rebase conflicts * Fix tests * Don't use something that's not set * Documentation * Remove unwanted print. * Document all models can work * Add tests + small fixes * Update docs/source/internal/trainer_utils.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Fix TF tests * Real fix this time * This one should work * Fix typo * Really fix typo Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-07 10:50:21 -04:00
Lysandre Debut	5982431814	Add GPT2ForSequenceClassification based on DialogRPT (#7501 ) * Add GPT2ForSequenceClassification based on DialogRPT * Better documentation * Code quality	2020-10-06 17:31:21 -04:00
Lysandre Debut	0257992e4a	Fix squeezebert docs (#7587 ) * Configuration * Modeling * Tokenization * Obliterate the trailing spaces * From underlines to long underlines	2020-10-06 06:22:04 -04:00
Lysandre Debut	818c294fdd	The toggle actually sticks (#7586 )	2020-10-05 11:23:57 -04:00
Sylvain Gugger	b2b7fc7814	Check and update model list in index.rst automatically (#7527 ) * Check and update model list in index.rst automatically * Check and update model list in index.rst automatically * Adapt template	2020-10-05 09:40:45 -04:00
Amine Abdaoui	0d79de7322	docs(pretrained_models): fix num parameters (#7575 ) * docs(pretrained_models): fix num parameters * fix(pretrained_models): correct typo Co-authored-by: Amin <amin.geotrend@gmail.com>	2020-10-05 07:50:56 -04:00
Forrest Iandola	02ef825be2	SqueezeBERT architecture (#7083 ) * configuration_squeezebert.py thin wrapper around bert tokenizer fix typos wip sb model code wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working set up squeezebert to use BertModelOutput when returning results. squeezebert documentation formatting allow head mask that is an array of [None, ..., None] docs docs cont'd path to vocab docs and pointers to cloud files (WIP) line length and indentation squeezebert model cards formatting of model cards untrack modeling_squeezebert_scratchpad.py update aws paths to vocab and config files get rid of stub of NSP code, and advise users to pretrain with mlm only fix rebase issues redo rebase of modeling_auto.py fix issues with code formatting more code format auto-fixes move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert tests for squeezebert modeling and tokenization fix typo move squeezebert before bert in modeling_auto.py to fix inheritance problem disable test_head_masking, since squeezebert doesn't yet implement head masking fix issues exposed by the test_modeling_squeezebert.py fix an issue exposed by test_tokenization_squeezebert.py fix issue exposed by test_modeling_squeezebert.py auto generated code style improvement issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head() update copyright resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask docs add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli autogenerated formatting tweaks integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings * tiny change to order of imports	2020-10-05 04:25:43 -04:00
Sylvain Gugger	e2c935f561	Cleanup documentation for BART, Marian, MBART and Pegasus (#7523 ) * Cleanup documentation for BART, Marian, MBART and Pegasus * Cleanup documentation for BART, Marian, MBART and Pegasus	2020-10-05 04:22:12 -04:00
Alexandr	9a92afb6d0	Update LayoutLM doc (#7388 ) Co-authored-by: Alexandr Maslov <avmaslov3@gmail.com>	2020-10-01 09:11:42 -04:00
Sylvain Gugger	be51c1039d	Add forgotten return_dict argument in the docs (#7483 )	2020-10-01 04:41:29 -04:00
Sylvain Gugger	dc7d2daa4c	Alphabetize model lists (#7478 )	2020-09-30 10:43:58 -04:00
François REMY	cc4eff8087	Make transformers install check positive (#7473 ) When transformers is correctly installed, you should get a positive message ^_^	2020-09-30 07:44:40 -04:00
Pengcheng He	7a0cf0ec93	Add DeBERTa model (#5929 ) * Add DeBERTa model * Remove dependency of deberta * Address comments * Patch DeBERTa Documentation Style * Add final tests * Style * Enable tests + nitpicks * position IDs * BERT -> DeBERTa * Quality * Style * Tokenization * Last updates. * @patrickvonplaten's comments * Not everything can be a copy * Apply most of @sgugger's review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Last reviews * DeBERTa -> Deberta Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-30 07:07:30 -04:00
Sylvain Gugger	a1c2ef7bd0	Add documentation for v3.3.1	2020-09-29 14:31:43 -04:00
Sylvain Gugger	1ba08dc221	Release: v3.3.1	2020-09-29 14:17:34 -04:00
Lysandre	16c213820e	Update docs to version v3.3.0	2020-09-28 16:32:00 +02:00
Lysandre	0613f05226	Release: v3.3.0	2020-09-28 16:24:43 +02:00
Sylvain Gugger	ca3fc36de3	Reorganize documentation navbar (#7423 ) * Reorganize documentation navbar * Update css to have clear sections	2020-09-28 16:22:58 +02:00
Sylvain Gugger	0611eab5e3	Document RAG again (#7377 ) Do not merge before Monday	2020-09-28 08:31:46 -04:00
Boris Dayma	1749ca317e	docs: fix model sharing file names (#5855 ) * docs: fix model sharing file names * Update docs/source/model_sharing.rst Co-authored-by: Julien Chaumond <chaumond@gmail.com> * docs(model_sharing.rst): fix new line Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-28 08:17:30 -04:00
Sylvain Gugger	a8e7982f84	Remove mentions of RAG from the docs (#7376 ) * Remove mentions of RAG from the docs * Deactivate check	2020-09-24 17:07:14 -04:00
Lysandre Debut	8d3bb781ee	Formatter (#7368 ) * Formatter * Docs	2020-09-24 10:59:21 -04:00
Sylvain Gugger	0ccb6f5c6d	Clean RAG docs and template docs (#7348 ) * Clean RAG docs and template docs * Fix typo * Better doc	2020-09-24 09:24:41 -04:00
Sylvain Gugger	3323146e90	Models doc (#7345 ) * Clean up model documentation * Formatting * Preparation work * Long lines * Main work on rst files * Cleanup all config files * Syntax fix * Clean all tokenizers * Work on first models * Models beginning * FlauBERT * All PyTorch models * All models * Long lines again * Fixes * More fixes * Update docs/source/model_doc/bert.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/electra.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Last fixes Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-23 13:20:45 -04:00
Stas Bekman	28cf873036	[testing] skip decorators: docs, tests, bugs (#7334 ) * skip decorators: docs, tests, bugs * another important note * style * bloody style * add @pytest.mark.parametrize * add note * no idea what it wants :(	2020-09-23 05:16:19 -04:00
Ola Piktus	c754c41c61	RAG (#6813 ) * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * Formatting / renaming prior to actual work * First commit * improve comments * Retrieval evaluation scripts * refactor to include modeling outputs + MPI retriever * Fix rag-token model + refactor * Various fixes + finetuning logic * use_bos fix * Retrieval refactor * Finetuning refactoring and cleanup * Add documentation and cleanup * Remove set_up_rag_env.sh file * Fix retrieval wit HF index * Fix import errors * Fix quality errors * Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867 * fix quality * Fix RAG Sequence generation * minor cleanup plus initial tests * fix test * fix tests 2 * Comments fix * post-merge fixes * Improve readme + post-rebase refactor * Extra dependencied for tests * Fix tests * Fix tests 2 * Refactor test requirements * Fix tests 3 * Post-rebase refactor * rename nlp->datasets * RAG integration tests * add tokenizer to slow integration test and allow retriever to run on cpu * add tests; fix position ids warning * change structure * change structure * add from encoder generator * save working solution * make all integration tests pass * add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained * don't save paths * delete unnecessary imports * pass config to AutoTokenizer.from_pretrained for Rag tokenizers * init wiki_dpr only once * hardcode legacy index and passages paths (todo: add the right urls) * finalize config * finalize retriver api and config api * LegacyIndex index download refactor * add dpr to autotokenizer * make from pretrained more flexible * fix ragfortokengeneration * small name changes in tokenizer * add labels to models * change default index name * add retrieval tests * finish token 
generate * align test with previous version and make all tests pass * add tests * finalize tests * implement thoms suggestions * add first version of test * make first tests work * make retriever platform agnostic * naming * style * add legacy index URL * docstrings + simple retrieval test for distributed * clean model api * add doc_ids to retriever's outputs * fix retrieval tests * finish model outputs * finalize model api * fix generate problem for rag * fix generate for other modles * fix some tests * save intermediate * set generate to default * big refactor generate * delete rag_api * correct pip faiss install * fix auto tokenization test * fix faiss install * fix test * move the distributed logic to examples * model page * docs * finish tests * fix dependencies * fix import in __init__ * Refactor eval_rag and finetune scripts * start docstring * add psutil to test * fix tf test * move require torch to top * fix retrieval test * align naming * finish automodel * fix repo consistency * test ragtokenizer save/load * add rag model output docs * fix ragtokenizer save/load from pretrained * fix tokenizer dir * remove torch in retrieval * fix docs * fixe finetune scripts * finish model docs * finish docs * remove auto model for now * add require torch * remove solved todos * integrate sylvains suggestions * sams comments * correct mistake on purpose * improve README * Add generation test cases * fix rag token * clean token generate * fix test * add note to test * fix attention mask * add t5 test for rag * Fix handling prefix in finetune.py * don't overwrite index_name Co-authored-by: Patrick Lewis <plewis@fb.com> Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair> Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair> Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair> Co-authored-by: Your Name <you@example.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	
2020-09-22 18:29:58 +02:00
Lysandre	6e21f24220	Documentation version	2020-09-22 18:04:39 +02:00
Lysandre	3ebb1b3a2b	Release: v3.2.0	2020-09-22 17:36:51 +02:00
Sylvain Gugger	21ca148090	is_pretokenized -> is_split_into_words (#7236 ) * is_pretokenized -> is_split_into_words * Fix tests	2020-09-22 09:34:35 -04:00
Minghao Li	cd9a0585ea	Add LayoutLM Model (#7064 ) * first version * finish test docs readme model/config/tokenization class * apply make style and make quality * fix layoutlm GitHub link * fix conflict in index.rst and add layoutlm to pretrained_models.rst * fix bug in test_parents_and_children_in_mappings * reformat modeling_auto.py and tokenization_auto.py * fix bug in test_modeling_layoutlm.py * Update docs/source/model_doc/layoutlm.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/layoutlm.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove inh, add tokenizer fast, and update some doc * copy and rename necessary class from modeling_bert to modeling_layoutlm * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/configuration_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_layoutlm.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * add mish to activations.py, import ACT2FN and import logging from utils Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-22 09:28:02 -04:00
Stas Bekman	47ab3e8262	@slow has to be last (#7251 ) Found an issue when `@slow` isn't the last decorator (gets ignored!), so documenting this significance.	2020-09-20 09:17:29 -04:00
Stas Bekman	1eeb206bef	[ported model] FSMT (FairSeq MachineTranslation) (#6940 ) * ready for PR * cleanup * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST * fix * perfectionism * revert change from another PR * odd, already committed this one * non-interactive upload workaround * backup the failed experiment * store langs in config * workaround for localizing model path * doc clean up as in https://github.com/huggingface/transformers/pull/6956 * style * back out debug mode * document: run_eval.py --num_beams 10 * remove unneeded constant * typo * re-use bart's Attention * re-use EncoderLayer, DecoderLayer from bart * refactor * send to cuda and fp16 * cleanup * revert (moved to another PR) * better error message * document run_eval --num_beams * solve the problem of tokenizer finding the right files when model is local * polish, remove hardcoded config * add a note that the file is autogenerated to avoid losing changes * prep for org change, remove unneeded code * switch to model4.pt, update scores * s/python/bash/ * missing init (but doesn't impact the finetuned model) * cleanup * major refactor (reuse-bart) * new model, new expected weights * cleanup * cleanup * full link * fix model type * merge porting notes * style * cleanup * have to create a DecoderConfig object to handle vocab_size properly * doc fix * add note (not a public class) * parametrize * - add bleu scores integration tests * skip test if sacrebleu is not installed * cache heavy models/tokenizers * some tweaks * remove tokens that aren't used * more purging * simplify code * switch to using decoder_start_token_id * add doc * Revert "major refactor (reuse-bart)" This reverts commit `226dad15ca`. 
* decouple from bart * remove unused code #1 * remove unused code #2 * remove unused code #3 * update instructions * clean up * move bleu eval to examples * check import only once * move data+gen script into files * reuse via import * take less space * add prepare_seq2seq_batch (auto-tested) * cleanup * recode test to use json instead of yaml * ignore keys not needed * use the new -y in transformers-cli upload -y * [xlm tok] config dict: fix str into int to match definition (#7034) * [s2s] --eval_max_generate_length (#7018) * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models * sync with changes * s/fsmt-wmt/wmt/ in model names * s/fsmt-wmt/wmt/ in model names (p2) * s/fsmt-wmt/wmt/ in model names (p3) * switch to a better checkpoint * typo * make non-optional args such - adjust tests where possible or skip when there is no other choice * consistency * style * adjust header * cards moved (model rename) * use best custom hparams * update info * remove old cards * cleanup * s/stas/facebook/ * update scores * s/allen_nlp/allenai/ * url maps aren't needed * typo * move all the doc / build /eval generators to their own scripts * cleanup * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix indent * duplicated line * style * use the correct add_start_docstrings * oops * resizing can't be done with the core approach, due to 2 dicts * check that the arg is a list * style * style Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-17 11:31:29 -04:00
Stas Bekman	f8590c56e6	[doc] improve/expand the Parametrization section (#7156 )	2020-09-16 08:45:50 -04:00
Stas Bekman	b00cafbde5	[docs] add testing documentation (#7101 ) * [docs] add testing documentation * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweaks as suggested * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweaks * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/testing.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more tweaks * suggestions from @LysandreJik Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-15 19:25:25 -04:00
sgugger	5636cbb25d	Extra )	2020-09-14 09:37:55 -04:00
Sylvain Gugger	ccc8e30c8a	Clean up autoclass doc (#7081 )	2020-09-14 09:26:41 -04:00
Bartosz Telenczuk	15d18e0307	fix link to paper (#7116 )	2020-09-14 07:43:40 -04:00
Sylvain Gugger	4cbd50e611	Compute loss method (#7074 )	2020-09-11 12:06:31 -04:00
Sylvain Gugger	e841b75dec	Automate the lists in auto-xxx docs (#7061 ) * More readable dict * More nlp -> datasets * Revert "More nlp -> datasets" This reverts commit `3cd1883d22`. * Automate the lists in auto-xxx docs * More readable dict * Revert "More nlp -> datasets" This reverts commit `3cd1883d22`. * Automate the lists in auto-xxx docs * nlp -> datasets * Fix new key	2020-09-11 10:42:09 -04:00
Patrick von Platen	db38f7ce29	[BertGeneration, Docs] Fix another old name in docs (#7050 ) * correct docs for bert generation * upload	2020-09-10 17:12:33 +02:00
Patrick von Platen	3bd95b0faf	correct docs for bert generation (#7048 )	2020-09-10 17:08:40 +02:00
Sylvain Gugger	15a189049e	Add TF Funnel Transformer (#7029 ) * Add TF Funnel Transformer * Proper dummy input * Formatting * Update src/transformers/modeling_tf_funnel.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * One review comment forgotten Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-10 10:41:56 -04:00
Patrick von Platen	7fd1febf38	Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. (#6594 ) * add conversion script * improve conversion script * make style * add tryout files * fix * update * add causal bert * better names * add tokenizer file as well * finish causal_bert * fix small bugs * improve generate * change naming * renaming * renaming * renaming * remove leftover files * clean files * add fix tokenizer * finalize * correct slow test * update docs * small fixes * fix link * adapt check repo * apply sams and sylvains recommendations * fix import * implement Lysandres recommendations * fix logger warn	2020-09-10 16:40:51 +02:00
Stas Bekman	4ee1053dcf	add -y to bypass prompt for transformers-cli upload (#7035 )	2020-09-10 04:58:29 -04:00
Stas Bekman	d0963486c1	adding TRANSFORMERS_VERBOSITY env var (#6961 ) * introduce TRANSFORMERS_VERBOSITY env var + test + test helpers * cleanup * remove helper function	2020-09-09 04:08:01 -04:00
Sam Shleifer	f0fc0aea6b	pegasus.rst: fix expected output (#7017 )	2020-09-08 13:29:16 -04:00
Sylvain Gugger	d155b38d6e	Funnel transformer (#6908 ) * Initial model * Fix upsampling * Add special cls token id and test * Formatting * Test and fist FunnelTokenizerFast * Common tests * Fix the check_repo script and document Funnel * Doc fixes * Add all models * Write doc * Fix test * Initial model * Fix upsampling * Add special cls token id and test * Formatting * Test and fist FunnelTokenizerFast * Common tests * Fix the check_repo script and document Funnel * Doc fixes * Add all models * Write doc * Fix test * Fix copyright * Forgot some layers can be repeated * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/modeling_funnel.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Update src/transformers/modeling_funnel.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments * Update src/transformers/modeling_funnel.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Slow integration test * Make small integration test * Formatting * Add checkpoint and separate classification head * Formatting * Expand list, fix link and add in pretrained models * Styling * Add the model in all summaries * Typo fixes Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-08 08:08:08 -04:00
Antonio V Mendoza	ea2c6f1afc	Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793 ) * added template files for LXMERT and competed the configuration_lxmert.py * added modeling, tokization, testing, and finishing touched for lxmert [yet to be tested] * added model card for lxmert * cleaning up lxmert code * Update src/transformers/modeling_lxmert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_lxmert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_lxmert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_lxmert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * tested torch lxmert, changed documtention, updated outputs, and other small fixes * Update src/transformers/convert_pytorch_checkpoint_to_tf2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/convert_pytorch_checkpoint_to_tf2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/convert_pytorch_checkpoint_to_tf2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * renaming, other small issues, did not change TF code in this commit * added lxmert question answering model in pytorch * added capability to edit number of qa labels for lxmert * made answer optional for lxmert question answering * add option to return hidden_states for lxmert * changed default qa labels for lxmert * changed config archive path * squshing 3 commits: merged UI + testing improvments + more UI and testing * changed some variable names for lxmert * TF LXMERT * Various fixes to LXMERT * Final touches to LXMERT * AutoTokenizer order * Add LXMERT to index.rst and README.md * Merge commit test fixes + Style update * TensorFlow 2.3.0 sequential model changes variable names Remove inherited test * Update src/transformers/modeling_tf_pytorch_utils.py * Update 
docs/source/model_doc/lxmert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/lxmert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_lxmert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added suggestions * Fixes * Final fixes for TF model * Fix docs Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-03 04:02:25 -04:00
Suraj Patil	4230d30f77	[pipelines] Text2TextGenerationPipeline (#6744 ) * add Text2TextGenerationPipeline * remove max length warning * remove comments * remove input_length * fix typo * add tests * use TFAutoModelForSeq2SeqLM * doc * typo * add the doc below TextGenerationPipeline * doc nit * style * delete comment	2020-09-02 07:34:35 -04:00
Harry Wang	ee1bff06f8	minor docs grammar fixes (#6889 )	2020-09-02 06:45:19 -04:00
Patrick von Platen	4d1a3ffde8	[EncoderDecoder] Add xlm-roberta to encoder decoder (#6878 ) * finish xlm-roberta * finish docs * expose XLMRobertaForCausalLM	2020-09-01 21:56:39 +02:00
Lysandre Debut	1461aac8d7	Update docs stable version	2020-09-01 11:02:24 -04:00
Lysandre	3726754a6c	v3.1.0 documentation	2020-09-01 14:39:07 +02:00
Lysandre	4b3ee9cbc5	Release: v3.1.0	2020-09-01 14:27:52 +02:00
Patrick von Platen	afc4ece462	[Generate] Facilitate PyTorch generate using `ModelOutputs` (#6735 ) * fix generate for GPT2 Double Head * fix gpt2 double head model * fix bart / t5 * also add for no beam search * fix no beam search * fix encoder decoder * simplify t5 * simplify t5 * fix t5 tests * fix BART * fix transfo-xl * fix conflict * integrating sylvains and sams comments * fix tf past_decoder_key_values * fix enc dec test	2020-09-01 12:38:25 +02:00
Sylvain Gugger	d5f1ffa0d8	Logging doc (#6852 ) * Add logging doc * Formatting * Update docs/source/main_classes/logging.rst * Update src/transformers/utils/logging.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-01 03:16:34 -04:00
Lysandre Debut	41aa2b4ef1	Adafactor docs (#6765 )	2020-08-27 05:16:50 -04:00
Patrick von Platen	fa8ee8e855	fix torchscript docs (#6740 )	2020-08-26 04:51:56 -04:00
Quentin Lhoest	0f16dd0ac2	Add DPR to models summary (#6690 ) * add dpr to models summary * minor * minor * Update docs/source/model_summary.rst qa -> question answering Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_summary.rst qa -> question answering (cont'd) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-25 09:57:28 +02:00
Sam Shleifer	0ebc9699fa	[fixdoc] Add import to pegasus usage doc (#6698 )	2020-08-24 15:54:57 -04:00
Stas Bekman	912a21ec78	remove BartForConditionalGeneration.generate (#6659 ) As suggested here: https://github.com/huggingface/transformers/issues/6651#issuecomment-678594233 this removes generic `generate` doc with examples not-relevant to bart.	2020-08-25 00:42:34 +08:00
Suraj Patil	cbda72932c	[Doc model summary] add MBart model summary (#6649 )	2020-08-21 13:42:59 -04:00
Patrick von Platen	a4db4e3032	[Docs model summaries] Add pegasus to docs (#6640 ) * add pegasus to docs * Update docs/source/model_summary.rst	2020-08-21 16:22:10 +02:00
Suraj Patil	d0e42a7bed	CamembertForCausalLM (#6577 ) * added CamembertForCausalLM * add in __init__ and auto model * style * doc	2020-08-21 13:52:54 +02:00
Morgan Funtowicz	b105f2c6b3	Update ONNX doc to match the removal of --optimize argument. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>	2020-08-21 10:37:09 +02:00
Joe Davison	039d8d65fc	add intro to nlp lib & dataset links to custom datasets tutorial (#6583 ) * add intro to nlp lib + links * unique links...	2020-08-20 10:32:51 -04:00
Romain Rigaux	cabfdfafc0	Docs copy button misses ... prefixed code (#6518 ) Tested in a local build of the docs. e.g. Just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling Copy will copy the full code, e.g. for token in top_5_tokens: print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token]))) Instead of currently only: for token in top_5_tokens: >>> for token in top_5_tokens: ... print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token]))) Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint. Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint. Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint. Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint. Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint. Docs for the option fix: https://sphinx-copybutton.readthedocs.io/en/latest/	2020-08-20 17:35:06 +08:00
Sylvain Gugger	18ca0e9140	Fix #6575 (#6596 )	2020-08-19 13:04:33 -04:00
Suraj Patil	fb6844aff5	[Pegasus Doc] minor typo (#6579 ) Minor typo correction @sshleifer	2020-08-18 12:47:47 -04:00
Romain Rigaux	7516bcf273	[docs] Fix number of 'ug' occurrences in tokenizer_summary (#6574 )	2020-08-18 10:23:25 -04:00
Romain Rigaux	5a5af22ed5	[docs] Fix wrong newline in the middle of a paragraph (#6573 )	2020-08-18 10:22:43 -04:00
Sam Shleifer	12d7624199	[marian] converter supports models from new Tatoeba project (#6342 )	2020-08-17 23:55:42 -04:00
Suraj Patil	c9564f5343	[Doc] add more MBart and other doc (#6490 ) * add mbart example * add Pegasus and MBart in readme * typo * add MBart in Pretrained models * add pre-proc doc * add DPR in readme * fix indent * doc fix	2020-08-17 12:30:26 -04:00
Stas Bekman	f68c873100	replace _ with __ rst links (#6541 )	2020-08-17 12:27:02 -04:00
Stas Bekman	b732e7e111	[doc] multiple corrections to "Summary of the tasks" (#6509 ) * [doc] multiple corrections to "Summary of the tasks" * fix indentation * correction * fix links, add links to examples/seq2seq/README.md instead of non-existing script	2020-08-17 11:49:16 -04:00
Stas Bekman	84d33317ae	[doc] make the text more readable, fix some typos, add some disambiguation (#6508 ) * [doc] make the text more readable, fix some typos, add some disambiguation * Update docs/source/glossary.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-17 11:07:58 -04:00
Joe Davison	d0c2389f48	add custom datasets tutorial (#6466 ) * add custom datasets tutorial * python -> bash code blocks * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor review feedback changes * add working native QA snippet Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-17 09:15:34 -04:00
Patrick von Platen	36010cb1e2	fix pegasus doc (#6533 )	2020-08-17 12:24:43 +02:00
Stas Bekman	49d8076fa2	[doc] Summary of the models fixes (#6511 ) * [doc] Summary of the models fixes * correction	2020-08-17 16:04:53 +08:00
Stas Bekman	423eb5b1d7	[doc] fix invalid env vars (#6504 ) - remove invalid `ENV_` prefix. - add a few ':' while at it	2020-08-17 11:11:40 +08:00
Stas Bekman	df15c7c226	typos (#6505 )	2020-08-17 10:57:36 +08:00
Sylvain Gugger	895ed8f451	Generation doc (#6470 ) * Generation doc * MBartForConditionalGeneration (#6441) * add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions * Use hash to clean the test dirs (#6475) * Use hash to clean the test dirs * Use hash to clean the test dirs * Use hash to clean the test dirs * fix * [EncoderDecoder] Add Cross Attention for GPT2 (#6415) * add cross attention layers for gpt2 * make gpt2 cross attention work * finish bert2gpt2 * add explicit comments * remove attention mask since not yet supported * revert attn mask in pipeline * Update src/transformers/modeling_gpt2.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_encoder_decoder.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Sort unique_no_split_tokens to make it deterministic (#6461) * change unique_no_split_tokens's type to set * use sorted list instead of set * style * Import accuracy_score (#6480) * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address comments * Styling * Generation doc * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address comments * Styling Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Kevin Canwen Xu <canwenxu@126.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> Co-authored-by: gijswijnholds <gijswijnholds@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-14 09:46:39 -04:00
gijswijnholds	b5ba758ba9	Import accuracy_score (#6480 )	2020-08-14 08:16:16 -04:00
Suraj Patil	680f1337c3	MBartForConditionalGeneration (#6441 ) * add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions	2020-08-14 03:21:16 -04:00
Patrick von Platen	0735def8e1	[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411 ) * add encoder-decoder for roberta * fix headmask * apply Sylvains suggestions * fix typo * Apply suggestions from code review	2020-08-12 18:23:30 +02:00
Sylvain Gugger	a8db954cda	Activate check on the CI (#6427 ) * Activate check on the CI * Fix repo inconsistencies * Don't document too much	2020-08-12 08:42:14 -04:00
Sylvain Gugger	34fabe1697	Move prediction_loss_only to TrainingArguments (#6426 )	2020-08-12 08:03:45 -04:00
Sam Shleifer	be1520d3a3	rename prepare_translation_batch -> prepare_seq2seq_batch (#6103 )	2020-08-11 15:57:07 -04:00
Sam Shleifer	66fa8ceaea	PegasusForConditionalGeneration (torch version) (#6340 ) Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>	2020-08-11 14:31:23 -04:00
Patrick von Platen	00bb0b25ed	TF Longformer (#5764 ) * improve names and tests longformer * more and better tests for longformer * add first tf test * finalize tf basic op functions * fix merge * tf shape test passes * narrow down discrepancies * make longformer local attn tf work * correct tf longformer * add first global attn function * add more global longformer func * advance tf longformer * finish global attn * upload big model * finish all tests * correct false any statement * fix common tests * make all tests pass except keras save load * fix some tests * fix torch test import * finish tests * fix test * fix torch tf tests * add docs * finish docs * Update src/transformers/modeling_longformer.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_longformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply Lysandres suggestions * reverse to assert statement because function will fail otherwise * applying sylvains recommendations * Update src/transformers/modeling_longformer.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/modeling_tf_longformer.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-10 23:25:06 +02:00
Sylvain Gugger	06bc347c97	Fix links for open in colab (#6391 )	2020-08-10 11:16:17 -04:00
Sylvain Gugger	3e0fe3cf5c	Colab button (#6389 ) * Add colab button * Add colab link for tutorials	2020-08-10 11:12:29 -04:00
Sylvain Gugger	6028ed92bd	Small docfile fixes (#6328 )	2020-08-10 05:37:12 -04:00
Sylvain Gugger	6ba540b747	Add a script to check all models are tested and documented (#6298 ) * Add a script to check all models are tested and documented * Apply suggestions from code review Co-authored-by: Kevin Canwen Xu <canwenxu@126.com> * Address comments Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-08-07 09:18:37 -04:00
Sylvain Gugger	c67d1a0259	Tf model outputs (#6247 ) * TF outputs and test on BERT * Albert to DistilBert * All remaining TF models except T5 * Documentation * One file forgotten * TF outputs and test on BERT * Albert to DistilBert * All remaining TF models except T5 * Documentation * One file forgotten * Add new models and fix issues * Quality improvements * Add T5 * A bit of cleanup * Fix for slow tests * Style	2020-08-05 11:34:39 -04:00
Joe Davison	972535ea74	fix zero shot pipeline docs (#6245 )	2020-08-04 16:37:49 -04:00
Kevin Canwen Xu	3c289fb38c	Remove outdated BERT tips (#6217 ) * Remove out-dated BERT tips * Update modeling_outputs.py * Update bert.rst * Update bert.rst	2020-08-04 01:17:56 +08:00
Sylvain Gugger	e4920c92d6	Doc pipelines (#6175 ) * Init work on pipelines doc * Work in progress * Work in progress * Doc pipelines * Rm unwanted default * Apply suggestions from code review Lysandre comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-03 11:44:46 -04:00
Faiaz Rahman	a39dfe4fb1	Fixed typo in Longformer (#6180 )	2020-08-01 18:20:48 +08:00
Sylvain Gugger	86caab1e0b	Harmonize both Trainers API (#6157 ) * Harmonize both Trainers API * Fix test * main_prcess -> process_zero	2020-07-31 09:43:23 -04:00
Paul O'Leary McCann	cf3cf304ca	Replace mecab-python3 with fugashi for Japanese tokenization (#6086 ) * Replace mecab-python3 with fugashi This replaces mecab-python3 with fugashi for Japanese tokenization. I am the maintainer of both projects. Both projects are MeCab wrappers, so the underlying C++ code is the same. fugashi is the newer wrapper and doesn't use SWIG, so for basic use of the MeCab API it's easier to use. This code insures the use of a version of ipadic installed via pip, which should make versioning and tracking down issues easier. fugashi has wheels for Windows, OSX, and Linux, which will help with issues with installing old versions of mecab-python3 on Windows. Compared to mecab-python3, because fugashi doesn't use SWIG, it doesn't require a C++ runtime to be installed on Windows. In adding this change I removed some code dealing with `cursor`, `token_start`, and `token_end` variables. These variables didn't seem to be used for anything, it is unclear to me why they were there. I ran the tests and they passed, though I couldn't figure out how to run the slow tests (`--runslow` gave an error) and didn't try testing with Tensorflow. * Style fix * Remove unused variable Forgot to delete this... * Adapt doc with install instructions * Fix typo Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-31 04:41:14 -04:00
Funtowicz Morgan	7231f7b503	Enable ONNX/ONNXRuntime optimizations through converter script (#6131 ) * Add onnxruntime transformers optimization support Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added Optimization section in ONNX/ONNXRuntime documentation. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Improve note reference Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fixing imports order. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Add warning about different level of optimization between torch and tf export. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Address @LysandreJik wording suggestion Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address @LysandreJik wording suggestion Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Always optimize model before quantization for maximum performances. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Address comments on the documentation. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Improve TensorFlow optimization message as suggested by @yufenglee Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Removed --optimize parameter Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Warn the user about current quantization limitation when model is larger than 2GB. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Trigger CI for last check * Small change in print for the optimization section. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-07-31 09:45:13 +02:00
Sylvain Gugger	f3065abdb8	Doc tokenizer (#6110 ) * Start doc tokenizers * Tokenizer documentation * Start doc tokenizers * Tokenizer documentation * Formatting after rebase * Formatting after merge * Update docs/source/main_classes/tokenizer.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address comment * Update src/transformers/tokenization_utils_base.py Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address Thom's comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-07-30 14:51:19 -04:00
guillaume-be	e642c78908	Addition of a DialoguePipeline (#5516 ) * initial commit for pipeline implementation Addition of input processing and history concatenation * Conversation pipeline tested and working for single & multiple conversation inputs * Added docstrings for dialogue pipeline * Addition of dialogue pipeline integration tests * Delete test_t5.py * Fixed max code length * Updated styling * Fixed test broken by formatting tools * Removed unused import * Added unit test for DialoguePipeline * Fixed Tensorflow compatibility * Fixed multi-framework support using framework flag * - Fixed docstring - Added `min_length_for_response` as an initialization parameter - Renamed `args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]` - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input - renamed pipeline name from dialogue to conversational - removed hardcoded default value of 1000 and use config.max_length instead - added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation - fixed bug in history truncation method * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised) * - Simplified input tensor conversion * - Updated attention_mask value for Tensorflow compatibility * - Updated last dialogue reference to conversational & fixed integration tests * Fixed conflict with master * Updates following review comments * Updated formatting * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs * Update src/transformers/pipelines.py Updated docsting following review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-30 14:11:39 -04:00
Sylvain Gugger	91cb95461e	Switch from return_tuple to return_dict (#6138 ) * Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Plu <plu.julien@gmail.com>	2020-07-30 09:17:00 -04:00
Oren Amsalem	d24ea708d7	Actually the extra_id are from 0-99 and not from 1-100 (#5967 ) a = tokenizer.encode("we got a <extra_id_99>", return_tensors='pt',add_special_tokens=True) print(a) >tensor([[ 62, 530, 3, 9, 32000]]) a = tokenizer.encode("we got a <extra_id_100>", return_tensors='pt',add_special_tokens=True) print(a) >tensor([[ 62, 530, 3, 9, 3, 2, 25666, 834, 23, 26, 834, 2915, 3155]])	2020-07-30 06:13:29 -04:00
Funtowicz Morgan	6c002853a6	Added capability to quantize a model while exporting through ONNX. (#6089 ) * Added capability to quantize a model while exporting through ONNX. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> We do not support multiple extensions Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Reformat files Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * More quality Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Ensure test_generate_identified_name compares the same object types Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added documentation everywhere on ONNX exporter Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use pathlib.Path instead of plain-old string Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use f-string everywhere Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use the correct parameters for black formatting Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use Python 3 super() style. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use packaging.version to ensure installed onnxruntime version match requirements Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fixing imports sorting order. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Missing raise(s) Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added quantization documentation Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix some spelling. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix bad list header format Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-07-29 13:21:29 +02:00
Funtowicz Morgan	640550fc7a	ONNX documentation (#5992 ) * Move torchscript and add ONNX documentation under modle_export Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Let's follow guidelines by the gurus: Renamed torchscript.rst to serialization.rst Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Remove previously introduced tree element Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * WIP doc Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * ONNX documentation Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix invalid link Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Improve spelling Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Final wording pass Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-07-29 11:02:35 +02:00
Xin Wen	b9b11795cf	Update model_summary.rst (#5737 ) Add '-' to make the reference of Transformer-XL more accurate and formal.	2020-07-27 05:34:02 -04:00
Sylvain Gugger	3b44aa935a	Model utils doc (#6005 ) * Document TF modeling utils * Document all model utils	2020-07-24 09:16:28 -04:00
Sylvain Gugger	33d7506ea1	Update doc of the model page (#5985 )	2020-07-22 18:14:57 -04:00
Sylvain Gugger	e714412fe6	Update doc to new model outputs (#5946 ) * Update doc to new model outputs * Fix outputs in quicktour	2020-07-21 18:13:55 -04:00
Sylvain Gugger	a20969170b	Add AlbertForPretraining to doc (#5914 )	2020-07-20 17:53:21 -04:00
Joe Davison	5d178954c9	tiny ppl doc typo fix (#5751 )	2020-07-14 10:39:44 -06:00
Stas Bekman	45addfe96d	FlaubertForTokenClassification (#5644 ) * implement FlaubertForTokenClassification as a subclass of XLMForTokenClassification * fix mapping order * add the doc * add common tests	2020-07-13 14:59:53 -04:00
Stas Bekman	0a19a49dfe	doc improvements (#5688 )	2020-07-13 18:10:17 +08:00
Sylvain Gugger	7fad617dc1	Document model outputs (#5673 ) * Document model outputs * Update docs/source/main_classes/output.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-07-10 17:31:02 -04:00
Sylvain Gugger	b2747af543	Improvements to PretrainedConfig documentation (#5642 ) * Update PretrainedConfig doc * Formatting * Small fixes * Forgotten args and more cleanup	2020-07-10 10:31:47 -04:00
Sylvain Gugger	760f726e51	Add forum link in the docs (#5637 )	2020-07-09 15:13:22 -04:00
Lysandre Debut	1158e56551	Correct extension (#5631 )	2020-07-09 11:03:07 -04:00
Stas Bekman	fa5423b169	doc fixes (#5613 )	2020-07-08 19:52:44 -04:00
Joe Davison	b4b33fdf25	Guide to fixed-length model perplexity evaluation (#5449 ) * add first draft ppl guide * upload imgs * expand on strides * ref typo * rm superfluous past var * add tokenization disclaimer	2020-07-07 16:04:15 -06:00
Sam Shleifer	353b8f1e7a	Add mbart-large-cc25, support translation finetuning (#5129 ) improve unittests for finetuning, especially w.r.t testing frozen parameters fix freeze_embeds for T5 add streamlit setup.cfg	2020-07-07 13:23:01 -04:00
Suraj Patil	33e43edddc	[docs] fix model_doc links in model summary (#5566 ) * fix model_doc links * update model links	2020-07-07 11:06:12 -04:00
Quentin Lhoest	fbd8792195	Add DPR model (#5279 ) * beginning of dpr modeling * wip * implement forward * remove biencoder + better init weights * export dpr model to embed model for nlp lib * add new api * remove old code * make style * fix dumb typo * don't load bert weights * docs * docs * style * move the `k` parameter * fix init_weights * add pretrained configs * minor * update config names * style * better config * style * clean code based on PR comments * change Dpr to DPR * fix config * switch encoder config to a dict * style * inheritance -> composition * add messages in assert startements * add dpr reader tokenizer * one tokenizer per model * fix base_model_prefix * fix imports * typo * add convert script * docs * change tokenizers conf names * style * change tokenizers conf names * minor * minor * fix wrong names * minor * remove unused convert functions * rename convert script * use return_tensors in tokenizers * remove n_questions dim * move generate logic to tokenizer * style * add docs * docs * quality * docs * add tests * style * add tokenization tests * DPR full tests * Stay true to the attention mask building * update docs * missing param in bert input docs * docs * style Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-07-07 08:56:12 -04:00
Lysandre	1d2332861f	Post v3.0.2 release commit	2020-07-06 18:56:47 -04:00
Lysandre	b0892fa0e8	Release: v3.0.2	2020-07-06 18:49:44 -04:00
Arnav Sharma	b2309cc6bf	Typo fix in `training` doc (#5495 )	2020-07-06 09:15:22 -04:00
ELanning	7ecff0ccbb	Fix typo in training (#5510 )	2020-07-06 09:14:57 -04:00
Sylvain Gugger	6b735a7253	Tokenizer summary (#5467 ) * Work on tokenizer summary * Finish tutorial * Link to it * Apply suggestions from code review Co-authored-by: Anthony MOI <xn1t0x@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add vocab definition Co-authored-by: Anthony MOI <xn1t0x@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-07-02 17:07:42 -04:00
George Ho	84e56669af	Fix typo in glossary (#5466 )	2020-07-02 09:19:33 -04:00
Patrick von Platen	d16e36c7e5	[Reformer] Add Masked LM Reformer (#5426 ) * fix conflicts * fix * happy rebasing	2020-07-01 22:43:18 +02:00
Patrick von Platen	fe81f7d12c	finish reformer qa head (#5433 )	2020-07-01 12:27:14 -04:00
Sylvain Gugger	6c55e9fc32	Fix dropdown bug in searches (#5440 ) * Trigger CI * Fix dropdown bug in searches	2020-07-01 11:02:59 -04:00
Sylvain Gugger	4ade7491f4	Fix examples titles and optimization doc page (#5408 )	2020-07-01 08:11:25 -04:00
Sylvain Gugger	87716a6d07	Documentation for the Trainer API (#5383 ) * Documentation for the Trainer API * Address review comments * Address comments	2020-06-30 11:43:43 -04:00
Sylvain Gugger	0607b88945	How to share model cards with the CLI (#5374 ) * How to share model cards * Switch the two options * Fix bad copy/cut * Julien's suggestion	2020-06-30 08:59:32 -04:00
Lysandre Debut	b9ee87f5c7	Doc for v3.0.0 (#5366 ) * Doc for v3.0.0 * Update docs/source/_static/js/custom.js Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/_static/js/custom.js Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-06-29 11:08:54 -04:00
Lysandre	b62ca59527	Release: v3.0.0	2020-06-29 10:40:13 -04:00
Patrick von Platen	4bcc35cd69	[Docs] Benchmark docs (#5360 ) * first doc version * add benchmark docs * fix typos * improve README * Update docs/source/benchmarks.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix naming and docs Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-29 16:08:57 +02:00
Julien Chaumond	c950fef545	[docs] Small tweaks to #5323	2020-06-29 14:24:33 +02:00
Sylvain Gugger	1af58c0706	New model sharing tutorial (#5323 )	2020-06-27 11:10:02 -04:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Joe Davison	2ffef0d0c7	Training & fine-tuning quickstart (#5034 ) * add initial fine-tuning guide * split code blocks to smaller segments * fix up trianer section of fine-tune doc * a few last typos * Update usage -> task summary link Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-06-25 15:11:11 -06:00
Lysandre Debut	364a5ae1f0	Refactor Code samples; Test code samples (#5036 ) * Refactor code samples * Test docstrings * Style * Tokenization examples * Run rust of tests * First step to testing source docs * Style and BART comment * Test the remainder of the code samples * Style * let to const * Formatting fixes * Ready for merge * Fix fixture + Style * Fix last tests * Update docs/source/quicktour.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Addressing @sgugger's comments + Fix MobileBERT in TF Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-06-25 16:46:00 -04:00
Sylvain Gugger	d12ceb48ba	Tokenization tutorial (#5257 ) * All done * Link to the tutorial * Typo fixes Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Add metnion of the return_xxx args Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-06-24 18:43:20 -04:00
Sylvain Gugger	6894b486d0	Fix version controller links (for realsies) (#5251 )	2020-06-24 12:13:43 -04:00
Sylvain Gugger	609e0c583f	Fix links (#5248 )	2020-06-24 11:35:55 -04:00
Sylvain Gugger	7c41057d50	Add hugs (#5225 )	2020-06-24 07:56:14 -04:00
Sylvain Gugger	173528e368	Add version control menu (#5222 ) * Add version control menu * Constify things Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-23 17:05:12 -04:00
Sylvain Gugger	417e492f1e	Quick tour (#5145 ) * Quicktour part 1 * Update * All done * Typos Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address comments in quick tour * Update docs/source/quicktour.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update from feedback Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-22 16:08:09 -04:00
Sylvain Gugger	1262495a91	Add TF auto model to the docs + fix sphinx warnings (#5187 )	2020-06-22 14:43:52 -04:00
Sylvain Gugger	eb0ca71ef6	Update glossary (#5148 ) * Update glossary * Update docs/source/glossary.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-06-22 08:30:49 -04:00
Vasily Shamporov	9a3f91088c	Add MobileBert (#4901 ) * Add MobileBert * Quality + Conversion script * style * Update src/transformers/modeling_mobilebert.py * Links to S3 * Style * TFMobileBert Slight fixes to the pytorch MobileBert Style * MobileBertForMaskedLM (PT + TF) * MobileBertForNextSentencePrediction (PT + TF) * MobileFor{MultipleChoice, TokenClassification} (PT + TF) ss * Tests + Auto * Doc * Tests * Addressing @sgugger's comments * Adressing @patrickvonplaten's comments * Style * Style * Integration test * style * Model card Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-19 16:38:36 -04:00
Suraj Patil	18177a1a60	lm_labels => labels (#5080 )	2020-06-18 09:16:29 +02:00
Sylvain Gugger	204ebc25e6	Update installation page and add contributing to the doc (#5084 ) * Update installation page and add contributing to the doc * Remove mention of symlinks	2020-06-17 14:01:10 -04:00
Sylvain Gugger	7291ea0bff	Reorganize documentation (#5064 ) * Reorganize topics and add all models	2020-06-17 07:55:20 -04:00
Sylvain Gugger	011cc0be51	Fix all sphynx warnings (#5068 )	2020-06-16 16:50:02 -04:00
Yacine Jernite	49c5202522	Eli5 examples (#4968 ) * add eli5 examples * add dense query script * query_di * merging * merging * add_utils * adds nearest neighbor wikipedia * batch queries * training_retriever * new notebooks * moved retriever traiing script * finished wiki40b * max_len_fix * train_s2s * retriever_batch_checkpointing * cleanup * merge * dim_fix * fix_indexer * fix_wiki40b_snippets * fix_embed_for_r * fp32 index * fix_sparse_q * joint_training * remove obsolete datasets * add_passage_nn_results * add_passage_nn_results * add_batch_nn * add_batch_nn * add_data_scripts * notebook * notebook * notebook * fix_multi_gpu * add_app * full_caching * full_caching * notebook * sparse_done * images * notebook * add_image_gif * with_Gif * add_contr_image * notebook * notebook * notebook * train_functions * notebook * min_retrieval_length * pandas_option * notebook * min_retrieval_length * notebook * notebook * eval_Retriever * notebook * images * notebook * add_example * add_example * notebook * fireworks * notebook * notebook * joe's notebook comments * app_update * notebook * notebook_link * captions * notebook * assing RetriBert model * add RetriBert to Auto * change AutoLMHead to AutoSeq2Seq * notebook downloads from hf models * style_black * style_black * app_update * app_update * fix_app_update * style * style * isort * Delete WikiELI5training.ipynb * Delete evaluate_eli5.py * Delete WikiELI5explore.ipynb * Delete ExploreWikiELI5Support.html * Delete explainlikeimfive.py * Delete wiki_snippets.py * children before parent * children before parent * style_black * style_black_only * isort * isort_new * Update src/transformers/modeling_retribert.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * typo fixes * app_without_asset * cleanup * Delete ELI5animation.gif * Delete ELI5contrastive.svg * Delete ELI5wiki_index.svg * Delete choco_bis.svg * Delete fireworks.gif * Delete huggingface_logo.jpg * Delete huggingface_logo.svg * Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb * Delete eli5_app.py * Delete eli5_utils.py * readme * Update README.md * unused imports * moved_info * default_beam * ftuned model * disclaimer * Update src/transformers/modeling_retribert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * black * add_doc * names * isort_Examples * isort_Examples * Add doc to index Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-16 16:36:58 -04:00
Sylvain Gugger	439aa1d6e9	Remove old section + caching in install (#5027 )	2020-06-16 13:03:41 -04:00
Sylvain Gugger	f9f8a5312e	Add DistilBertForMultipleChoice (#5032 ) * Add `DistilBertForMultipleChoice`	2020-06-15 18:31:41 -04:00
Anthony MOI	36434220fc	[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510 ) * Use tokenizers pre-tokenized pipeline * failing pretrokenized test * Fix is_pretokenized in python * add pretokenized tests * style and quality * better tests for batched pretokenized inputs * tokenizers clean up - new padding_strategy - split the files * [HUGE] refactoring tokenizers - padding - truncation - tests * style and quality * bump up requied tokenizers version to 0.8.0-rc1 * switched padding/truncation API - simpler better backward compat * updating tests for custom tokenizers * style and quality - tests on pad * fix QA pipeline * fix backward compatibility for max_length only * style and quality * Various cleans up - add verbose * fix tests * update docstrings * Fix tests * Docs reformatted * __call__ method documented Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-15 17:12:51 -04:00
Sam Shleifer	a9f1fc6c94	Add bart-base (#5014 )	2020-06-15 13:29:26 -04:00
Suraj Patil	e93ccb3290	BartForQuestionAnswering (#4908 )	2020-06-12 15:47:57 -04:00
Sylvain Gugger	538531cde5	Add AlbertForMultipleChoice (#4959 ) * Add AlbertForMultipleChoice * Make up to date and add all models to common tests	2020-06-12 14:20:19 -04:00
Suraj Patil	ef2dcdccaa	ElectraForQuestionAnswering (#4913 ) * ElectraForQuestionAnswering * udate __init__ * add test for electra qa model * add ElectraForQuestionAnswering in auto models * add ElectraForQuestionAnswering in all_model_classes * fix outputs, input_ids defaults to None * add ElectraForQuestionAnswering in docs * remove commented line	2020-06-10 15:17:52 -04:00
Sylvain Gugger	41a1d27cde	Add XLMRobertaForQuestionAnswering (#4855 ) * Add XLMRobertaForQuestionAnswering * Formatting * Make test happy	2020-06-08 21:22:37 -04:00
Sylvain Gugger	37be3786cf	Clean documentation (#4849 ) * Clean documentation	2020-06-08 11:28:19 -04:00
Sylvain Gugger	56d5d160cd	Add model and doc badges (#4811 ) * Add badges for models and docs	2020-06-05 18:45:42 -04:00
Sylvain Gugger	5c0cfc2cf0	Add link to community models (#4804 )	2020-06-05 15:29:20 -04:00
Sylvain Gugger	fa661ce749	Add model summary (#4789 ) * Add model summary * Add link to pretrained models	2020-06-05 12:22:50 -04:00
Julien Chaumond	99207bd112	Pipelines: miscellanea of QoL improvements and small features... (#4632 ) * [hf_api] Attach all unknown attributes for future-proof compatibility * [Pipeline] NerPipeline is really a TokenClassificationPipeline * modelcard.py: I don't think we need to force the download * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer * FillMaskPipeline: also output token in string form * TextClassificationPipeline: option to return all scores, not just the argmax * Update docs/source/main_classes/pipelines.rst	2020-06-03 03:51:31 -04:00
Julien Chaumond	b42586ea56	Fix CI after killing archive maps (#4724 ) * 🐛 Fix model ids for BART and Flaubert	2020-06-02 10:21:09 -04:00
Lysandre	b43c78e5d3	Release: v2.11.0	2020-06-02 09:49:09 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636 ) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Patrick von Platen	56ee2560be	[Longformer] Better handling of global attention mask vs local attention mask (#4672 ) * better api * improve automatic setting of global attention mask * fix longformer bug * fix global attention mask in test * fix global attn mask flatten * fix slow tests * update docstring * update docs and make more robust * improve attention mask	2020-05-29 17:58:42 +02:00
Patrick von Platen	9c17256447	[Longformer] Multiple choice for longformer (#4645 ) * add multiple choice for longformer * add models to docs * adapt docstring * add test to longformer * add longformer for mc in init and modeling auto * fix tests	2020-05-29 13:46:08 +02:00
Lysandre Debut	6a17688021	per_device instead of per_gpu/error thrown when argument unknown (#4618 ) * per_device instead of per_gpu/error thrown when argument unknown * [docs] Restore examples.md symlink * Correct absolute links so that symlink to the doc works correctly * Update src/transformers/hf_argparser.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Warning + reorder * Docs * Style * not for squad Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-27 11:36:55 -04:00
Patrick von Platen	c589eae2b8	[Longformer For Question Answering] Conversion script, doc, small fixes (#4593 ) * add new longformer for question answering model * add new config as well * fix links * fix links part 2	2020-05-26 14:58:47 +02:00
Patrick von Platen	3e3e552125	[Reformer] fix reformer num buckets (#4564 ) * fix reformer num buckets * fix * adapt docs * set num buckets in config	2020-05-25 16:04:45 -04:00
Alexander Measure	95a26fcf2d	link to paper was broken (#4526 ) changed from https://https://arxiv.org/abs/2001.04451.pdf to https://arxiv.org/abs/2001.04451.pdf	2020-05-22 15:17:09 -04:00
Lysandre	e0db6bbd65	Release: v2.10.0	2020-05-22 10:37:44 -04:00
Patrick von Platen	48c3a70b4e	[Longformer] Docs and clean API (#4464 ) * add longformer docs * improve docs	2020-05-19 21:52:36 +02:00
Iz Beltagy	8f1d047148	Longformer (#4352 ) * first commit * bug fixes * better examples * undo padding * remove wrong VOCAB_FILES_NAMES * License * make style * make isort happy * unit tests * integration test * make `black` happy by undoing `isort` changes!! * lint * no need for the padding value * batch_size not bsz * remove unused type casting * seqlen not seq_len * staticmethod * `bert` selfattention instead of `n2` * uint8 instead of bool + lints * pad inputs_embeds using embeddings not a constant * black * unit test with padding * fix unit tests * remove redundant unit test * upload model weights * resolve todo * simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_ * increase unittest coverage	2020-05-19 16:04:43 +02:00
Soham Chatterjee	fa6113f9a0	Fixed spelling of training (#4416 )	2020-05-18 11:23:29 -04:00
Lysandre	7cb203fae4	Release: v2.9.1	2020-05-13 17:38:50 -04:00
Sam Shleifer	9a687ebb77	[Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models (#4290 )	2020-05-13 17:29:41 -04:00
Patrick von Platen	839bfaedb2	[Docs, Notebook] Include generation pipeline (#4295 ) * add first text for generation * add generation pipeline to usage * Created using Colaboratory * correct docstring * finish	2020-05-13 14:24:08 -04:00
Guo, Quan	39994051e4	Add migrating from `pytorch-transformers` (#4273 ) "Migrating from pytorch-transformers to transformers" is missing in the main document. It is available in the main `readme` though. Just move it to the document.	2020-05-11 13:35:13 -04:00
fgaim	41e8291217	Add ALBERT to the Tensorflow to Pytorch model conversion cli (#3933 ) * Add ALBERT to convert command of transformers-cli * Document ALBERT tf to pytorch model conversion	2020-05-11 13:10:00 -04:00
Stefan Schweter	3f42eb979f	Documentation: fix links to NER examples (#4279 ) * docs: fix link to token classification (NER) example * examples: fix links to NER scripts	2020-05-11 12:48:21 -04:00
Patrick von Platen	ac7d5f67a2	[Reformer] Add Enwiki8 Reformer Model - Adapt convert script (#4282 ) * adapt convert script * update convert script * finish * fix marian pretrained docs	2020-05-11 16:38:07 +02:00
Sam Shleifer	3487be75ef	[Marian] documentation and AutoModel support (#4152 ) - MarianSentencepieceTokenizer - > MarianTokenizer - Start using unk token. - add docs page - add better generation params to MarianConfig - more conversion utilities	2020-05-10 13:54:57 -04:00
Girishkumar	9d2f467bfb	[README] Corrected some grammatical mistakes (#4199 )	2020-05-10 09:02:36 -04:00
Julien Chaumond	c99fe0386b	[doc] Fix broken links + remove crazy big notebook	2020-05-07 18:44:18 -04:00
Julien Chaumond	612fa1b10b	Examples readme.md (#4215 ) * README * Update README.md	2020-05-07 15:00:06 -04:00
Lysandre	e7cfc1a313	Release: v2.9.0	2020-05-07 14:15:20 -04:00
Julien Chaumond	0ae96ff8a7	BIG Reorganize examples (#4213 ) * Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around	2020-05-07 13:48:44 -04:00
Patrick von Platen	dca34695d0	Reformer (#3351 ) * first copy & past commit from Bert and morgans LSH code * add easy way to compare to trax original code * translate most of function * make trax lsh self attention deterministic with numpy seed + copy paste code * add same config * add same config * make layer init work * implemented hash_vectors function for lsh attention * continue reformer translation * hf LSHSelfAttentionLayer gives same output as trax layer * refactor code * refactor code * refactor code * refactor * refactor + add reformer config * delete bogus file * split reformer attention layer into two layers * save intermediate step * save intermediate step * make test work * add complete reformer block layer * finish reformer layer * implement causal and self mask * clean reformer test and refactor code * fix merge conflicts * fix merge conflicts * update init * fix device for GPU * fix chunk length init for tests * include morgans optimization * improve memory a bit * improve comment * factorize num_buckets * better testing parameters * make whole model work * make lm model work * add t5 copy paste tokenizer * add chunking feed forward * clean config * add improved assert statements * make tokenizer work * improve test * correct typo * extend config * add complexer test * add new axial position embeddings * add local block attention layer * clean tests * refactor * better testing * save intermediate progress * clean test file * make shorter input length work for model * allow variable input length * refactor * make forward pass for pretrained model work * add generation possibility * finish dropout and init * make style * refactor * add first version of RevNet Layers * make forward pass work and add convert file * make uploaded model forward pass work * make uploaded model forward pass work * refactor code * add namedtuples and cache buckets * correct head masks * refactor * made reformer more flexible * make style * remove set max length * add 
attention masks * fix up tests * fix lsh attention mask * make random seed optional for the moment * improve memory in reformer * add tests * make style * make sure masks work correctly * detach gradients * save intermediate * correct backprob through gather * make style * change back num hashes * rename to labels * fix rotation shape * fix detach * update * fix trainer * fix backward dropout * make reformer more flexible * fix conflict * fix * fix * add tests for fixed seed in reformer layer * fix trainer typo * fix typo in activations * add fp16 tests * add fp16 training * support fp16 * correct gradient bug in reformer * add fast gelu * re-add dropout for embedding dropout * better naming * better naming * renaming * finalize test branch * finalize tests * add more tests * finish tests * fix * fix type trainer * fix fp16 tests * fix tests * fix tests * fix tests * fix issue with dropout * fix dropout seeds * correct random seed on gpu * finalize random seed for dropout * finalize random seed for dropout * remove duplicate line * correct half precision bug * make style * refactor * refactor * docstring * remove sinusoidal position encodings for reformer * move chunking to modeling_utils * make style * clean config * make style * fix tests * fix auto tests * pretrained models * fix docstring * update conversion file * Update pretrained_models.rst * fix rst * fix rst * update copyright * fix test path * fix test path * fix small issue in test * include reformer in generation tests * add docs for axial position encoding * finish docs * Update convert_reformer_trax_checkpoint_to_pytorch.py * remove isort * include sams comments * remove wrong comment in utils * correct typos * fix typo * Update reformer.rst * applied morgans optimization * make style * make gpu compatible * remove bogus file * big test refactor * add example for chunking * fix typo * add to README	2020-05-07 10:17:01 +02:00
Stefan Schweter	e80be7f1d0	docs: add xlm-roberta section to multi-lingual section (#4101 )	2020-05-01 11:06:58 -04:00
Patrick von Platen	fa49b9afea	Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383 ) * change encoder decoder style to bart & t5 style * make encoder decoder generation dummy work for bert * make style * clean init config in encoder decoder * add tests for encoder decoder models * refactor and add last tests * refactor and add last tests * fix attn masks for bert encoder decoder * make style * refactor prepare inputs for Bert * refactor * finish encoder decoder * correct typo * add docstring to config * finish * add tests * better naming * make style * fix flake8 * clean docstring * make style * rename	2020-04-28 15:11:09 +02:00
Patrick von Platen	52679fbc2e	add dialogpt training tips (#3996 )	2020-04-28 14:32:31 +02:00
Lorenzo Ampil	12bb7fe770	Fix t5 doc typos (#3978 ) * Fix tpo in into and add line under * Add missing blank line under * Correct types under	2020-04-27 18:27:15 +02:00
Lorenzo Ampil	f16540fcba	Pipeline for Text Generation: GenerationPipeline (#3758 ) * Add GenerationPipeline * Fix parameter names * Correct parameter __call__ parameters * Add model type attribute and correct function calls for prepare_input * Take out trailing commas from init attributes * Remove unnecessary tokenization line * Implement support for multiple text inputs * Apply generation support for multiple input text prompts * Take out tensor coersion * Take out batch index * Add text prompt to return sequence * Squeeze token tensore before decoding * Return only a single list of sequences if only one prompt was used * Correct results variable name * Add GenerationPipeline to SUPPORTED_TASKS with the alias , initalized w GPT2 * Registedred AutoModelWithLMHead for both pt and t * Update docstring for GenerationPipeline * Add kwargs parameter to mode.generate * Take out kwargs parameter after all * Add generation pipeline example in pipeline docstring * Fix max length by squeezing tokens tensor * Apply ensure_tensor_on_device to pytorch tensor * Include generation step in torch.no_grad * Take out input from prepare_xlm_input and set 'en' as default xlm_language * Apply framework specific encoding during prepare_input * Format w make style * Move GenerationPipeline import to follow proper import sorting * Take out training comma from generation dict * Apply requested changes * Change name to TextGenerationPipeline * Apply TextGenerationPipeline rename to __init___ * Changing alias to * Set input mapping as input to ensure_tensor_on_device * Fix assertion placement * Add test_text_generation * Add TextGenerationPipeline to PipelineCommonTests * Take out whitespace * Format __init__ w black * Fix __init__ style * Forman __init___ * Add line to end of __init__ * Correct model tokenizer set for test_text_generation * Ensure to return list of list, not list of string (to pass test) * Limit test models to only 3 to limit runtime to address circleCI timeout error * 
Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Remove argument docstring, __init__, add additional __call__ arguments, and reformat results to list of dict * Fix blank result list * Add TextGenerationPipeline to pipelines.rst * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Fix typos from adding PADDING_TEXT_TOKEN_LENGTH * Fix incorrectly moved result list * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py * Update src/transformers/pipelines.py Co-Authored-By: Patrick von Platen 
<patrick.v.platen@gmail.com> * Add back generation line and make style * Take out blank whitespace * Apply new alis, text-generation, to test_pipelines * Fix text generation alias in test * Update src/transformers/pipelines.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-22 09:37:03 -04:00
Julien Chaumond	1dc9b3c784	Fixes #3877	2020-04-22 01:15:10 +00:00
Thomas Wolf	827d6d6ef0	Cleanup fast tokenizers integration (#3706 ) * First pass on utility classes and python tokenizers * finishing cleanup pass * style and quality * Fix tests * Updating following @mfuntowicz comment * style and quality * Fix Roberta * fix batch_size/seq_length inBatchEncoding * add alignement methods + tests * Fix OpenAI and Transfo-XL tokenizers * adding trim_offsets=True default for GPT2 et RoBERTa * style and quality * fix tests * add_prefix_space in roberta * bump up tokenizers to rc7 * style * unfortunately tensorfow does like these - removing shape/seq_len for now * Update src/transformers/tokenization_utils.py Co-Authored-By: Stefan Schweter <stefan@schweter.it> * Adding doc and docstrings * making flake8 happy Co-authored-by: Stefan Schweter <stefan@schweter.it>	2020-04-18 13:43:57 +02:00
Patrick von Platen	d22894dfd4	[Docs] Add DialoGPT (#3755 ) * add dialoGPT * update README.md * fix conflict * update readme * add code links to docs * Update README.md * Update dialo_gpt2.rst * Update pretrained_models.rst * Update docs/source/model_doc/dialo_gpt2.rst Co-Authored-By: Julien Chaumond <chaumond@gmail.com> * change filename of dialogpt Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-16 09:04:32 +02:00
Julien Chaumond	cbad305ce6	[docs] The use of `do_lower_case` in scripts is on its way to deprecation (#3738 )	2020-04-10 12:34:04 -04:00
Sam Shleifer	7a7fdf71f8	Multilingual BART - (#3602 ) - support mbart-en-ro weights - add MBartTokenizer	2020-04-10 11:25:39 -04:00
Lysandre Debut	261c4ff4e2	Update notebooks (#3620 ) * Update notebooks * From local to global link * from local links to actual global links	2020-04-06 14:32:39 -04:00
LysandreJik	36bffc81b3	Release: v2.8.0	2020-04-06 10:03:53 -04:00
Julien Chaumond	94eb68d742	weigths*weights	2020-04-04 15:03:26 -04:00
Lysandre Debut	d5d7d88612	ELECTRA (#3257 ) * Electra wip * helpers * Electra wip * Electra v1 * ELECTRA may be saved/loaded * Generator & Discriminator * Embedding size instead of halving the hidden size * ELECTRA Tokenizer * Revert BERT helpers * ELECTRA Conversion script * Archive maps * PyTorch tests * Start fixing tests * Tests pass * Same configuration for both models * Compatible with base + large * Simplification + weight tying * Archives * Auto + Renaming to standard names * ELECTRA is uncased * Tests * Slight API changes * Update tests * wip * ElectraForTokenClassification * temp * Simpler arch + tests Removed ElectraForPreTraining which will be in a script * Conversion script * Auto model * Update links to S3 * Split ElectraForPreTraining and ElectraForTokenClassification * Actually test PreTraining model * Remove num_labels from configuration * wip * wip * From discriminator and generator to electra * Slight API changes * Better naming * TensorFlow ELECTRA tests * Accurate conversion script * Added to conversion script * Fast ELECTRA tokenizer * Style * Add ELECTRA to README * Modeling Pytorch Doc + Real style * TF Docs * Docs * Correct links * Correct model intialized * random fixes * style * Addressing Patrick's and Sam's comments * Correct links in docs	2020-04-03 14:10:54 -04:00
Patrick von Platen	83d1fbcff6	[Docs] Add usage examples for translation and summarization (#3538 )	2020-03-31 09:36:03 -04:00
Patrick von Platen	42e1e3c67f	Update usage doc regarding generate fn (#3504 )	2020-03-31 09:31:46 -04:00
LysandreJik	6f5a12a583	Release: v2.7.0	2020-03-30 08:49:24 -04:00
Patrick von Platen	5b44e0a31b	[T5] Add training documentation (#3507 ) * Add clear description of how to train T5 * correct docstring in T5 * correct typo * correct docstring format * update t5 model docs * implement collins feedback * fix typo and add more explanation for sentinel tokens * delete unnecessary todos	2020-03-30 13:35:53 +02:00
Patrick von Platen	fa9af2468a	Add T5 to docs (#3461 ) * add t5 docs basis * improve docs * add t5 docs * improve t5 docstring * add t5 tokenizer docstring * finish docstring * make style * add pretrained models * correct typo * make examples work * finalize docs	2020-03-27 10:57:16 -04:00
LysandreJik	471cce24b3	Release: v2.6.0	2020-03-24 10:37:32 -04:00
Sam Shleifer	38a555a83c	Add Summarization to Pipelines (#3128 ) * passing * Undo stupid chg * docs * undo rename * delete-cruft * only import if you have torch * Dont rely on dict ordering * Fix dict ordering upstream * docstring link * docstring link * remove trailing comma for 3.5 compat * new name * delegate kwarging * Update kwargs	2020-03-17 18:04:21 -04:00
Thomas Wolf	2187c49f5c	CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186 ) * memory benchmark rss * have both forward pass and line-by-line mem tracing * cleaned up tracing * refactored and cleaning up API * no f-strings yet... * add GPU mem logging * fix GPU memory monitoring * style and quality * clean up and doc * update with comments * Switching to python 3.6+ * fix quality	2020-03-17 10:17:11 -04:00
Julien Chaumond	d6de6423ba	[doc] --organization tweak Co-Authored-By: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-03-10 16:52:44 -04:00
Julien Chaumond	0e56dc3078	[doc] Document the new --organization flag of CLI	2020-03-10 16:42:01 -04:00
Sam Shleifer	857e0a0d3b	Rename BartForMaskedLM -> BartForConditionalGeneration (#3114 ) * improved documentation	2020-03-05 17:41:18 -05:00
Lysandre	07a79db505	Fix failing doc samples	2020-03-04 19:11:31 -05:00
Lysandre Debut	d3eb7d23a4	Pipeline doc (#3055 ) * Pipeline doc initial commit * pipeline abstraction * Remove modelcard argument from pipeline * Task-specific pipelines can be instantiated with no model or tokenizer * All pipelines doc	2020-03-02 14:07:10 -05:00
Sam Shleifer	b54ef78d0c	Bart-CNN (#3059 ) `generate` code that produces 99% identical summarizations to fairseq on CNN test data, with caching.	2020-03-02 10:35:53 -05:00
Sam Shleifer	9df74b8bc4	Delete all mentions of Model2Model (#3019 )	2020-02-26 11:36:27 -05:00
Lysandre Debut	bb7c468520	Documentation (#2989 ) * All Tokenizers BertTokenizer + few fixes RobertaTokenizer OpenAIGPTTokenizer + Fixes GPT2Tokenizer + fixes TransfoXLTokenizer Correct rst for TransformerXL XLMTokenizer + fixes XLNet Tokenizer + Style DistilBERT + Fix XLNet RST CTRLTokenizer CamemBERT Tokenizer FlaubertTokenizer XLMRobertaTokenizer cleanup * cleanup	2020-02-25 18:43:36 -05:00
Lysandre Debut	65e7c90a77	Adding usage examples for common tasks (#2850 ) * Usage: Sequence Classification & Question Answering * Pipeline example * Language modeling * TensorFlow code for Sequence classification * Custom TF/PT toggler in docs * QA + LM for TensorFlow * Finish Usage for both PyTorch and TensorFlow * Addressing Julien's comments * More assertive * cleanup * Favicon - added favicon option in conf.py along with the favicon image - updated 🤗 logo. slightly smaller and should appear more consistent across editing programs (no more tongue on the outside of the mouth) Co-authored-by: joshchagani <joshua@joshuachagani.com>	2020-02-25 13:48:24 -05:00
Lysandre	f9ec5ca90b	Release: v2.5.1	2020-02-24 18:22:54 -05:00
Sam Shleifer	53ce3854a1	New BartModel (#2745 ) * Results same as fairseq * Wrote a ton of tests * Struggled with api signatures * added some docs	2020-02-20 18:11:13 -05:00
Lysandre	fb560dcb07	Release: v2.5.0 Welcome Rust Tokenizers	2020-02-19 11:46:19 -05:00
Lysandre	fd639e5be3	Correct quickstart example when using the past	2020-02-10 11:25:56 -05:00
Lysandre	dd28830327	Update RoBERTa tips	2020-02-07 16:42:35 -05:00
Lysandre	db97930122	Update XLM-R tips	2020-02-07 16:42:35 -05:00
VictorSanh	ee5a6856ca	distilbert-base-cased weights + Readmes + omissions	2020-02-07 15:28:13 -05:00
Julien Chaumond	42f08e596f	[examples] rename run_lm_finetuning to run_language_modeling	2020-02-07 09:15:28 -05:00
Julien Chaumond	7748cbbe7d	Oopsie	2020-02-06 15:30:02 -05:00
Julien Chaumond	432c12521e	[docs] Add menu w/ links to other pages on hf.co	2020-02-06 15:30:02 -05:00
Julien Chaumond	eae8ee0389	[doc] model sharing: mention README.md + tweaks cc @lysandrejik @thomwolf	2020-02-05 14:20:03 -05:00
Lysandre	9c67196b83	Update quickstart	2020-02-04 11:11:37 -05:00
Lysandre	d426b58b9e	Patch: v2.4.1	2020-01-31 14:55:33 -05:00
Lysandre	6664ea943d	Release: v2.4.0	2020-01-31 09:40:32 -05:00
Hang Le	b43cb09aaa	Add layerdrop	2020-01-30 12:05:01 -05:00
Lysandre	93dccf527b	Pretrained models	2020-01-30 10:04:18 -05:00
Lysandre	73306d028b	FlauBERT documentation	2020-01-30 10:04:18 -05:00
Lysandre	c69b082601	Update documentation	2020-01-29 12:06:13 -05:00
Lysandre	44a5b4bbe7	Update documentation	2020-01-29 11:47:49 -05:00
Wietse de Vries	f5a236c3ca	Add Dutch pre-trained BERT model	2020-01-27 21:00:34 -05:00
thomwolf	e0849a66ac	adding in the doc	2020-01-27 14:27:07 -05:00
Lysandre	983fef469c	AutoModels doc	2020-01-24 16:37:30 -05:00
Lysandre	24d5ad1dcc	Run the examples in slow	2020-01-23 09:38:45 -05:00
Lysandre	9ddf60b694	Tips + whitespaces	2020-01-23 09:38:45 -05:00
Lysandre	0e9899f451	Fixes	2020-01-23 09:38:45 -05:00
Lysandre	7511f3dd89	PyTorch CTRL + Style	2020-01-23 09:38:45 -05:00
Lysandre	980211a63a	XLM-RoBERTa	2020-01-23 09:38:45 -05:00
Lysandre	db1a7f27a1	PyTorch DistilBERT	2020-01-23 09:38:45 -05:00
Lysandre	b28020f590	TF RoBERTa	2020-01-23 09:38:45 -05:00
Lysandre	3e1bc27e1b	Pytorch RoBERTa	2020-01-23 09:38:45 -05:00
Lysandre	f44ff574d3	Camembert	2020-01-23 09:38:45 -05:00
Lysandre	ccebcae75f	PyTorch XLM	2020-01-23 09:38:45 -05:00
Lysandre	cd656fb21a	PyTorch XLNet	2020-01-23 09:38:45 -05:00
Lysandre	98edad418e	PyTorch Transformer-XL	2020-01-23 09:38:45 -05:00
Lysandre	850795c487	Pytorch GPT	2020-01-23 09:38:45 -05:00
Lysandre	1487b840d3	TF GPT2	2020-01-23 09:38:45 -05:00
Lysandre	bd0d3fd76e	GPT-2 PyTorch models + better tips for BERT	2020-01-23 09:38:45 -05:00
Lysandre	cd77c750c5	BERT PyTorch models	2020-01-23 09:38:45 -05:00
Lysandre	3922a2497e	TF ALBERT + TF Utilities + Fix warnings	2020-01-23 09:38:45 -05:00
Lysandre	00df3d4de0	ALBERT Modeling + required changes to utilities	2020-01-23 09:38:45 -05:00
Lysandre	632675ea88	Can test examples spread over multiple blocks	2020-01-23 09:38:45 -05:00
Lysandre	9bab9b83d2	Glossary	2020-01-23 09:38:45 -05:00
Julien Chaumond	119dc50e2a	Doc tweak on model sharing	2020-01-22 22:40:38 -05:00
Lysandre	387217bd3e	Added example usage	2020-01-14 14:09:09 +01:00
Lysandre	7d1bb7f256	Add missing XLNet and XLM models	2020-01-14 14:09:09 +01:00
Lysandre Debut	632682726f	Updated Configurations	2020-01-14 14:09:09 +01:00
alberduris	81d6841b4b	GPU text generation: Moved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
alberduris	dd4df80f0b	Moved the encoded_prompts to correct device	2020-01-06 15:11:12 +01:00
Morgan Funtowicz	80faf22b4a	Updating documentation for converting tensorflow model to reflect the new cli convert format. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-01-04 13:41:18 +01:00
Julien Chaumond	9b2badf3c9	[cli] Update doc	2019-12-27 22:54:29 -05:00
Aymeric Augustin	a8d34e534e	Remove [--editable] in install instructions. Use -e only in docs targeted at contributors. If a user copy-pastes command line with [--editable], they will hit an error. If they don't know the --editable option, we're giving them a choice to make before they can move forwards, but this isn't a choice they need to make right now.	2019-12-24 08:46:08 +01:00
Aymeric Augustin	70373a5f7c	Update contribution instructions. Also provide shortcuts in a Makefile.	2019-12-23 21:05:30 +01:00
Aymeric Augustin	d8e33dbd67	Fix path to source code in docs config. This should fix API docs, which went AWOL with yesterday's changes.	2019-12-23 16:49:35 +01:00
Aymeric Augustin	45841eaf7b	Remove references to Python 2 in documentation.	2019-12-22 18:38:56 +01:00
Aymeric Augustin	ced0a94204	Switch test files to the standard test_*.py scheme.	2019-12-22 14:15:13 +01:00
Aymeric Augustin	067395d5c5	Move tests outside of library.	2019-12-22 13:47:17 +01:00
Julien Chaumond	ac1b449cc9	[doc] move distilroberta to more appropriate place cc @lysandrejik	2019-12-21 00:09:01 -05:00
Lysandre	a436574bfd	Release: v2.3.0	2019-12-20 16:22:20 -05:00
Rémi Louf	4e3f745ba4	add example for Model2Model in quickstart	2019-12-20 09:12:31 -05:00
Stefan Schweter	f09d999641	docs: fix numbering 😅	2019-12-18 19:49:33 +01:00
Stefan Schweter	dd7a958fd6	docs: add XLM-RoBERTa to pretrained model list (incl. all parameters)	2019-12-18 19:45:46 +01:00
Stefan Schweter	d35405b7a3	docs: add XLM-RoBERTa to index page	2019-12-18 19:45:10 +01:00
Antti Virtanen	abc43ffbff	Add pretrained model documentation for FinBERT.	2019-12-17 20:35:25 -05:00
Julien Chaumond	3f5ccb183e	[doc] Clarify uploads cf `855ff0e91d (commitcomment-36452545)`	2019-12-16 18:20:29 -05:00
Julien Chaumond	855ff0e91d	[doc] Model upload and sharing ping @lysandrejik @thomwolf Is this clear enough? Anything we should add?	2019-12-16 12:42:22 -05:00
Thomas Wolf	e92bcb7eb6	Merge pull request #1739 from huggingface/t5 [WIP] Adding Google T5 model	2019-12-14 09:40:43 +01:00
Lysandre	7bd11dda6f	Release: v2.2.2	2019-12-13 16:45:30 -05:00
thomwolf	5c00e344c1	update model doc - switch 3B/11B to 3b/11b	2019-12-13 16:33:29 +01:00
Thomas Wolf	110394b2ba	Merge branch 'master' into t5	2019-12-13 16:03:32 +01:00
Julien Chaumond	1748fdf657	[doc] Fix rst table	2019-12-11 18:32:27 -05:00
Masatoshi Suzuki	c03c0dfd23	Add support for Japanese BERT models by cl-tohoku	2019-12-11 18:32:27 -05:00
Stefan Schweter	030faccb8d	doc: fix pretrained models table	2019-12-11 12:19:21 -05:00
thomwolf	0558c9cb9b	Merge branch 'master' into t5	2019-12-10 12:58:48 +01:00
Thomas Wolf	e57d00ee10	Merge pull request #1984 from huggingface/squad-refactor [WIP] Squad refactor	2019-12-10 11:07:26 +01:00
Pierric Cistac	5c877fe94a	fix albert links	2019-12-09 18:53:00 -05:00
Lysandre Debut	00c4e39581	Merge branch 'master' into squad-refactor	2019-12-09 10:41:15 -05:00
Aymeric Augustin	35401fe50f	Remove dependency on pytest for running tests (#2055 ) * Switch to plain unittest for skipping slow tests. Add a RUN_SLOW environment variable for running them. * Switch to plain unittest for PyTorch dependency. * Switch to plain unittest for TensorFlow dependency. * Avoid leaking open files in the test suite. This prevents spurious warnings when running tests. * Fix unicode warning on Python 2 when running tests. The warning was: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal * Support running PyTorch tests on a GPU. Reverts `27e015bd`. * Tests no longer require pytest. * Make tests pass on cuda	2019-12-06 13:57:38 -05:00
Thomas Wolf	5482822a2b	Merge pull request #2046 from jplu/tf2-ner-example Add NER TF2 example.	2019-12-06 12:12:22 +01:00
LysandreJik	9ecd83dace	Patch evaluation for impossible values + cleanup	2019-12-05 14:44:57 -05:00
VictorSanh	552c44a9b1	release distilm-bert	2019-12-05 10:14:58 -05:00
Julien Plu	9200a759d7	Add few tests on the TF optimization file with some info in the documentation. Complete the README.	2019-12-05 12:56:43 +01:00
Thomas Wolf	1f179f095f	Merge pull request #2011 from AdityaSoni19031997/patch-1 typo fix on the docs as per Pytorch v1.1+	2019-12-05 12:39:04 +01:00
LysandreJik	7a03519975	Documentation	2019-12-04 17:24:35 -05:00
LysandreJik	8101924a68	Patch: v2.2.1	2019-12-03 11:20:26 -05:00
Aditya Soni	c356290c8d	typo fix as per Pytorch v1.1+	2019-12-01 14:08:14 +05:30
Stefan Schweter	8c276b9c92	Merge branch 'master' into distilbert-german	2019-11-27 18:11:49 +01:00
VictorSanh	d75d49a51d	add XnliProcessor to doc	2019-11-27 11:07:22 -05:00
Lysandre	361620954a	Remove TFBertForPreTraining from ALBERT doc	2019-11-27 10:11:37 -05:00
Lysandre	ce02550d50	Fix pretrained models table	2019-11-26 15:47:02 -05:00
Lysandre	cf26a0c85e	Fix pretrained models table	2019-11-26 15:40:03 -05:00
Lysandre	ee4647bd5c	CamemBERT & ALBERT doc	2019-11-26 15:10:51 -05:00
Lysandre	668aac45d2	Pretrained models	2019-11-26 14:52:42 -05:00
Lysandre	ae98d45991	Release: v2.2.0	2019-11-26 14:12:44 -05:00
Julien Chaumond	afaa335851	[doc] Fix assets urls	2019-11-23 11:34:45 -05:00
Stefan Schweter	e631383d4f	docs: add new German distilbert model to pretrained models	2019-11-19 19:52:40 +01:00
Louis MARTIN	035fea5315	Add CamemBERT to auto files and docs	2019-11-16 00:11:07 -05:00
Thomas Wolf	df99f8c5a1	Merge pull request #1832 from huggingface/memory-leak-schedulers replace LambdaLR scheduler wrappers by function	2019-11-14 22:10:31 +01:00
Rémi Louf	2276bf69b7	update the examples, docs and template	2019-11-14 20:38:02 +01:00
Lysandre	e18f786cd5	Quickstart example showcasing past	2019-11-14 10:06:00 -05:00
thomwolf	f03c0c1423	adding models in readme and auto classes	2019-11-08 11:49:46 +01:00
Julien Chaumond	1c542df7e5	Add RoBERTa-based GPT-2 Output Detector from OpenAI converted from https://github.com/openai/gpt-2-output-dataset/tree/master/detector Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-Authored-By: Jong Wook Kim <jongwook@nyu.edu> Co-Authored-By: Jeff Wu <wuthefwasthat@gmail.com>	2019-11-06 16:26:31 -05:00
Julien Chaumond	30968d70af	misc doc	2019-11-05 19:06:12 -05:00
Lysandre	d7d36181fd	GPT-2 XL	2019-11-05 13:31:58 -05:00
Julien Chaumond	93d2fff071	Close #1654	2019-11-01 09:47:38 -04:00
VictorSanh	8ad5c591cd	[RELEASE] DistilRoBERTa	2019-10-23 10:29:47 -04:00
LysandreJik	82f6abd98a	Benchmark section added to the documentation	2019-10-18 17:27:10 -04:00
Lysandre	3ddce1d74c	Release: 2.1.1	2019-10-11 06:37:49 -04:00
Stefan Schweter	5f25a5f367	model: add support for new German BERT models (cased and uncased) from @dbmdz	2019-10-11 10:20:33 +02:00
LysandreJik	9c2e0a4acf	Release: 2.1.0	2019-10-09 12:14:03 -04:00
LysandreJik	7fe98d8c18	Update CTRL documentation	2019-10-09 12:12:36 -04:00
LysandreJik	89f86f9661	CTRL added to the documentation	2019-10-09 12:04:06 -04:00
thomwolf	d9e60f4f0d	Merge branch 'master' into pr/1383	2019-10-09 17:25:08 +02:00
thomwolf	48b438ff2a	doc and conversion	2019-10-09 17:06:30 +02:00
Julien Chaumond	d688af19e5	Update link to swift-coreml-transformers cc @lysandrejik	2019-10-08 16:37:52 -04:00
LysandreJik	8fcc6507ce	Multilingual	2019-10-07 15:02:42 -04:00
Thomas Wolf	b3cfd97946	Merge pull request #1373 from TimYagan/fix-css Fixed critical css font-family issues	2019-10-03 19:04:02 -04:00
VictorSanh	e2ae9c0b73	fix links in doc index	2019-10-03 11:42:21 -04:00
VictorSanh	c1689ac301	fix name	2019-10-03 10:56:39 -04:00
VictorSanh	4a790c40b1	update doc for distil*	2019-10-03 10:54:02 -04:00
LysandreJik	ebb32261b1	fix #1401	2019-10-02 17:52:56 -04:00
Tim Yagan	0a4ed7192e	Fixed critical css font-family issues Fixed critical css font-family issues to ensure compatibility with multiple webbrowsers	2019-09-29 13:51:01 +02:00
Julien Chaumond	d8b641c839	6 -> 8 models	2019-09-27 17:22:01 -04:00
pj	4f2b6579bf	Fix some typos	2019-09-27 22:55:43 +08:00
Gabriel Luiz Freitas Almeida	d2de5b9d8c	Just some typos	2019-09-27 07:08:36 -03:00
Julien Chaumond	fc9faa8a47	[docs] Doc tweaks Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2019-09-26 18:19:51 -04:00
LysandreJik	93f0c5fc72	Repository link in the documentation	2019-09-26 11:45:00 -04:00
thomwolf	6c3b131516	typo in readme/doc	2019-09-26 16:23:28 +02:00
LysandreJik	7e957237e4	[Doc] XLM + Torch in documentation	2019-09-26 10:08:56 -04:00
LysandreJik	927904bc91	[doc] pytorch_transformers -> transformers	2019-09-26 08:47:15 -04:00
LysandreJik	294edfd83d	Release version in documentation	2019-09-26 08:16:12 -04:00
LysandreJik	de5e4864cb	Documentation	2019-09-26 08:04:54 -04:00
LysandreJik	8349d75773	Various small doc fixes	2019-09-26 07:45:40 -04:00
LysandreJik	fb056494e5	Example usage	2019-09-26 07:45:40 -04:00
LysandreJik	36f592cc82	Updated doc for `InputExample` and `InputFeatures`	2019-09-26 07:45:40 -04:00
LysandreJik	ad4a393e2e	Changed processor documentation architecture. Added documentation for GLUE	2019-09-26 07:45:40 -04:00
LysandreJik	c4ac7a76db	GLUE processors	2019-09-26 07:45:40 -04:00
LysandreJik	4acd87ff4e	TF models added to documentation	2019-09-26 07:45:40 -04:00
LysandreJik	cf5c5c9e1c	Documentation	2019-09-26 07:43:13 -04:00
thomwolf	f47f7f4611	add logo	2019-09-26 11:28:44 +02:00
thomwolf	31c23bd5ee	[BIG] pytorch-transformers => transformers	2019-09-26 10:15:53 +02:00
thomwolf	c88f05163d	fix typo in XLM models	2019-09-16 13:42:20 +02:00
LysandreJik	593c070435	Better examples	2019-09-06 12:00:12 -04:00
LysandreJik	0b52642d37	1.2.0 in docs	2019-09-04 11:03:32 -04:00
LysandreJik	7f522437bc	Updated documentation for LM finetuning script	2019-09-02 13:40:25 -04:00
Julien Chaumond	2dcc5a1629	[doc] Add blurb about large-scale model downloads cc @n1t0 @lysandrejik @thomwolf	2019-09-02 12:27:11 -04:00
LysandreJik	09363f2a8b	Fix documentation index	2019-08-30 19:48:32 -04:00
LysandreJik	e0caab0cf0	fix link	2019-08-30 10:09:17 -04:00
LysandreJik	a600b30cc3	Fix index number in documentation	2019-08-30 10:08:14 -04:00
LysandreJik	20c06fa37d	Added DistilBERT to documentation index	2019-08-30 10:06:51 -04:00
LysandreJik	9ce42dc540	Pretrained models table fix	2019-08-28 13:56:28 -04:00
Thomas Wolf	0ecfd17f49	Merge pull request #987 from huggingface/generative-finetuning Generative finetuning	2019-08-28 16:51:50 +02:00
Thomas Wolf	50792dbdcc	Merge pull request #1127 from huggingface/dilbert DilBERT	2019-08-28 16:43:09 +02:00
LysandreJik	75bc2a03cc	Updated article link	2019-08-28 10:05:15 -04:00
LysandreJik	1dc43e56c9	Documentation additions	2019-08-28 09:37:27 -04:00
thomwolf	0d288727b8	fix #1106	2019-08-27 14:50:22 +02:00
Thomas Wolf	90dcd8c05d	Merge branch 'master' into generative-finetuning	2019-08-22 10:43:30 +02:00
Lysandre	2f9397139d	Added GPT-2 LARGE to Pre-trained Models documentation	2019-08-21 11:29:37 -04:00
Lysandre	d6bbcbc4cf	Added finetuning example to documentation	2019-08-21 11:22:05 -04:00
VictorSanh	6f877d9daf	Update dev results on GLUE (bert-base-uncased) w/ median on 5 runs	2019-08-21 03:43:29 +00:00
thomwolf	e239a4a20f	close #984	2019-08-20 11:02:00 +02:00
LysandreJik	572dcfd1db	Doc	2019-08-14 14:56:14 -04:00
Thomas Wolf	d43dc48b34	Merge branch 'master' into auto_models	2019-08-05 19:17:35 +02:00
thomwolf	0b524b0848	remove derived classes for now	2019-08-05 19:08:19 +02:00
thomwolf	13936a9621	update doc and tests	2019-08-05 18:48:16 +02:00
Lysandre Debut	6f05ad72b4	Merge pull request #791 from huggingface/doc RestructuredText table for pretrained models.	2019-08-05 10:18:00 -04:00
thomwolf	b90e29d52c	working on automodels	2019-08-05 16:06:34 +02:00
thomwolf	328afb7097	cleaning up tokenizer tests structure (at last) - last remaining ppb refs	2019-08-05 14:08:56 +02:00
thomwolf	00132b7a7a	updating docs - adding few tests to tokenizers	2019-08-04 22:42:55 +02:00
thomwolf	009273dbdd	big doc update [WIP]	2019-08-04 12:14:57 +02:00
Julien Chaumond	44dd941efb	link to `swift-coreml-transformers`	2019-08-01 09:50:30 -04:00
thomwolf	c717d38573	dictionnary => dictionary	2019-07-26 23:30:48 +02:00
Sukuya	35c52f2f3c	Update torchscript.rst Import fixed to pytorch_transformers else torchscript flag can't be used.	2019-07-25 16:51:11 +08:00
LysandreJik	9d381e7be9	Fixed incorrect links in the PretrainedModel	2019-07-17 09:25:38 -04:00
Stefan Schweter	e6cc6d237f	docs: fix link to various notebooks	2019-07-16 23:42:28 +02:00
Stefan Schweter	5b78400e21	docs: fix link to modeling example source (bert)	2019-07-16 23:41:57 +02:00
Stefan Schweter	61cc3ee350	docs: fix link to tf checkpoint to pytorch script	2019-07-16 23:41:04 +02:00
Stefan Schweter	dbbd94cb7a	docs: fix link to bertology example and update dataset description	2019-07-16 23:40:04 +02:00
LysandreJik	117ed92992	RestructuredText table for pretrained models.	2019-07-16 11:58:47 -04:00
thomwolf	5c82d3488f	indicate default evaluation in breaking changes	2019-07-16 15:45:58 +02:00
thomwolf	43e0e8fa04	updates to readme and doc	2019-07-16 13:56:47 +02:00
thomwolf	3b8b0e01bb	update readme	2019-07-16 00:12:55 +02:00
thomwolf	2397f958f9	updating examples and doc	2019-07-14 23:20:10 +02:00
LysandreJik	6491575fd5	Added TorchScript disclaimer. CSS modifications.	2019-07-11 12:38:21 -04:00
LysandreJik	c82b74b996	Fixed Sphinx errors and warnings	2019-07-10 15:30:19 -04:00
LysandreJik	f773faa258	Fixed all links. Removed TPU. Changed CLI to Converting TF models. Many minor formatting adjustments. Added "TODO Lysandre filled" where necessary.	2019-07-10 14:45:56 -04:00
LysandreJik	c4bab2dc85	Added footer with social links.	2019-07-09 18:03:01 -04:00
LysandreJik	331db8cc02	Added viewcode plugin for source code visualization within the static website.	2019-07-09 17:01:56 -04:00
LysandreJik	83fb311ef7	Patched warnings + Refactored XLNet's Docstrings	2019-07-09 16:38:30 -04:00
LysandreJik	8fe2c9d98e	Refactored Docstrings of BERT, GPT2, GPT, TransfoXL, XLM and XLNet.	2019-07-09 15:55:31 -04:00
LysandreJik	269e73b601	Adding example detailing how to add a new file to the documentation + adding fonts.	2019-07-09 10:11:29 -04:00
LysandreJik	6847e30e1c	New page detailing the use of TorchScript.	2019-07-08 17:34:24 -04:00
LysandreJik	ab30651802	Hugging Face theme.	2019-07-08 16:05:26 -04:00
LysandreJik	64fd986376	Tokenizers and Config classes are referenced.	2019-07-05 17:44:59 -04:00
LysandreJik	df759114c9	Single file documentation for each model, accompanied by the Documentation overview.	2019-07-05 17:35:26 -04:00
LysandreJik	03de9686a7	Initial folder structure for the documentation. A draft of documentation change has been made in the BertModel class.	2019-07-05 17:11:13 -04:00

3359 Commits