Commit Graph

290 Commits

Author SHA1 Message Date
Sylvain Gugger
5f3c57fc84
Check the repo consistency in model templates test (#15141)
* Check the repo consistency in model templates test

* Fix doc template

* Fix docstrings

* Fix last docstring
2022-01-14 04:52:38 -05:00
Sylvain Gugger
1a00863e95 Fix typo in doc template 2022-01-11 15:22:15 -05:00
NielsRogge
6ea6266625
Fix cookiecutter (#15100) 2022-01-11 05:57:26 -05:00
Suraj Patil
3e9fdcf019
[DOC] fix doc examples for bart-like models (#15093)
* fix doc examples

* remove double colons
2022-01-10 18:13:28 +01:00
Sylvain Gugger
61d18ae035
Happy New Year! (#15094) 2022-01-10 12:05:57 -05:00
Sylvain Gugger
207594be81
Convert rst files (#14888)
* Convert all tutorials and guides

* Convert all remaining rst to mdx

* Track and fix bad links
2021-12-22 16:14:35 -05:00
Sylvain Gugger
27b3031de2
Mass conversion of documentation from rst to Markdown (#14866)
* Convert docstrings of all configurations and tokenizers

* Processors and fixes

* Last modeling files and fixes to models

* Pipeline modules

* Utils files

* Data submodule

* All the other files

* Style

* Missing examples

* Style again

* Fix copies

* Say bye bye to rst docstrings forever
2021-12-21 15:06:33 -05:00
Sylvain Gugger
7af80f6618
Convert docstrings of modeling files (#14850)
* Convert file_utils docstrings to Markdown

* Test on BERT

* Return block indent

* Temporarily disable doc styler

* Remove from quality checks as well

* Remove doc styler mess

* Remove check from circleCI

* Fix typo

* Convert file_utils docstrings to Markdown

* Test on BERT

* Return block indent

* Temporarily disable doc styler

* Remove from quality checks as well

* Remove doc styler mess

* Remove check from circleCI

* Fix typo

* Let's go on all other model files

* Add templates too

* Styling and quality
2021-12-21 05:37:32 -05:00
Daniel Stancl
ff066119ca
Implement head_mask for Flax BERT and other models copied from BERT (#14620)
* Implement head_mask for Flax BERT and other models copied from BERT

* Remove `from jax._src.nn.functions import sigmoid`

Remove `from jax._src.nn.functions import sigmoid` unintentionally added by IDE

* Remove no more valid copy statement

* Apply patil-suraj's suggestions from code review

* Apply suggestions from the code review

* Update Flax template

* Fix a typo

* Also update template for CausalLM modules
2021-12-17 17:06:59 +01:00
Lysandre Debut
8010fda9bf
Removes images to put them in a dataset (#14781)
* First try

* Update instructions
2021-12-16 04:42:02 -05:00
Yih-Dar
15a9d01519
Avoid using tf.tile in embeddings for TF models (#14735)
* avoid tf.tile in embeddings

* remove more tf.tile in embeddings

* clean

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-13 17:30:46 +00:00
Yih-Dar
32eb29fef9
Fix doc examples: modify config before super().__init__ (#14697)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-13 12:50:02 +01:00
Yih-Dar
59d684fa92
Fix examples: 'CausalLMOutputWithCrossAttentions' object has no attribute 'last_hidden_state' (#14678)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-10 14:55:54 +01:00
Thomas Viehmann
6ed9882ddb
use functional interface for softmax in attention (#14198)
* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!
2021-11-30 11:47:33 -05:00
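
A minimal sketch of what this change amounts to (tensor shapes are illustrative):

```python
import torch
from torch import nn

scores = torch.randn(2, 8, 16, 16)  # (batch, heads, query_len, key_len)

# before: instantiate a Softmax module only to call it immediately
probs = nn.Softmax(dim=-1)(scores)

# after: use the functional interface directly
probs = nn.functional.softmax(scores, dim=-1)
```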
Sylvain Gugger
d83b0e0c07
Add a post init method to all models (#14431)
* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save
2021-11-18 08:38:09 -05:00
Suraj Patil
e92190c0f8
Fix Flax params dtype (#13098)
* fix inits

* fix embed dtype

* fix embed dtype

* add test to check default dtype

* quality

* add type conversion methods for flax models

* more robust casting

* cast sinusoidal positions

* update pegasus

* update albert

* update test

* make sure dtype is passed to every module

* style

* fix electra dense

* fix t5

* quality

* add more tests

* better name

* use the dtype for lm head computation

* fix albert

* style

* fix albert embed dtype

* more tests

* fix vision enc-dec

* cleanup

* fix embed dtype pegasus

* fix default param test

* doc

* update template

* fix final_logits_bias dtype

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix doc

* fix doc

* add detailed docstring for dtype parameter

* remove un-necessary import

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-11-11 14:45:20 +05:30
Patrick von Platen
e81d8d7fa9
[Bert2Bert] allow bert2bert + relative embeddings (#14324)
* [Bert2Bert] allow bert2bert + relative embeddings

* up

* Update README_ko.md

* up

* up
2021-11-09 14:26:58 -05:00
Sylvain Gugger
c28bc80bbb
Generalize problem_type to all sequence classification models (#14180)
* Generalize problem_type to all classification models

* Missing import

* Deberta BC and fix tests

* Fix template

* Missing imports

* Revert change to reformer test

* Fix style
2021-10-29 10:32:56 -04:00
Patrick von Platen
f5af873617
[Docs] More general docstrings (#14028)
* up

* finish

* up

* up

* finish
2021-10-16 00:48:37 +02:00
Yih-Dar
8b240a0661
Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)
* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-10-13 00:10:34 +02:00
Tommy Chiang
a2ef9c5446
Use torch.unique_consecutive to check for identical elements (#13637)
We use `torch.unique` here only to check whether all elements have
the same value, so `torch.unique_consecutive` can be used instead.

This function eliminates all but the first element from every consecutive
group of equivalent elements. For example, applying it to `[1, 2, 2, 1]`
results in `[1, 2, 1]`.

As you can see, this is enough to check whether all elements
have the same value.

Since `torch.unique_consecutive` does less work, it is much faster.
On my computer, it is 25x faster on GPU and 15x faster on CPU.
2021-09-24 10:31:23 +02:00
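
A short illustration of the check described above (values are illustrative):

```python
import torch

x = torch.tensor([7, 7, 7, 7])

# torch.unique deduplicates (and by default sorts) the whole tensor, while
# torch.unique_consecutive only collapses adjacent duplicates; that is all
# we need to test that every element has the same value
all_same = torch.unique_consecutive(x).shape[0] == 1
```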
Sylvain Gugger
27d4639779
Make gradient_checkpointing a training argument (#13657)
* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2021-09-22 07:51:38 -04:00
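
A usage sketch of the new API (method names as introduced around this change; the model name is just an example):

```python
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

# gradient checkpointing is now toggled on the model (or via a
# TrainingArguments flag) instead of being stored in the config
model.gradient_checkpointing_enable()
model.gradient_checkpointing_disable()
```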
Bhadresh Savani
3fbb55c757
[Flax] Fixes typo in Bart based Flax Models (#13565) 2021-09-15 11:03:52 +05:30
Nils Reimers
c8be8a9adb
Update model configs - Allow setters for common properties (#13026)
* refactor GPT Config to allow dyn. properties

* make attribute_map a class attribute

* remove old code

* update unit test to test config: Add test for common properties setter

* update unit test to test config: Add test for common properties passed as parameters to __init__

* update to black code format

* Allow that setters are not defined for certain config classes

* update config classes to implement attribute_map

* bugfix lxmert config - id2labels was not defined when num_labels was set

* update broken configs - add attribute_maps

* update bart config

* update black codestyle

* update documentation on common config attributes

* update GPTJ config to new attribute map

* update docs on common attributes

* gptj config: add max_position_embeddings

* gptj config: format with black

* update speech to text 2 config

* format doc file to max_len 119

* update config template
2021-09-06 16:30:13 +02:00
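
A minimal sketch of the attribute_map mechanism this PR describes (class and attribute names here are illustrative, not the library's exact code):

```python
class SketchConfig:
    # common property names aliased onto model-specific attributes
    attribute_map = {"hidden_size": "n_embd", "num_attention_heads": "n_head"}

    def __init__(self, n_embd=768, n_head=12):
        self.n_embd = n_embd
        self.n_head = n_head

    def __setattr__(self, key, value):
        if key in self.attribute_map:
            key = self.attribute_map[key]
        super().__setattr__(key, value)

    def __getattr__(self, key):
        # only reached when normal attribute lookup fails
        if key != "attribute_map" and key in self.attribute_map:
            return getattr(self, self.attribute_map[key])
        raise AttributeError(key)

cfg = SketchConfig()
cfg.hidden_size = 1024  # setter routed to n_embd
assert cfg.n_embd == 1024 and cfg.hidden_size == 1024
```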
Patrick von Platen
02039352b2
Update README.md 2021-09-01 09:50:21 +02:00
Jonathan Chang
d160782a53
Add template for adding flax models (#12441)
* Add option to add flax

* Add flax template for __init__.py

* Add flax template for .rst

* Copy TF modeling template

* Add a missing line in modeling_tf_... template

* Update first half of modeling_flax_..

* Update encoder flax template

* Copy test_modeling_tf... as test_modeling_flax...

* Replace some TF to Flax in test_modeling_flax_...

* Replace tf to np

some functions might not work, like _assert_tensors_equal

* Replace remaining tf to np (might not work)

* Fix cookiecutter

* Add Flax in to_replace_... template

* Update transformers-cli add-new-model

* Save generate_flax in configuration.json

This will be read by transformers-cli

* Fix to_replace_... and cli

* Fix replace cli

* Fix cookiecutter name

* Move docstring earlier to avoid not defined error

* Fix a missing Module

* Add encoder-decoder flax template from bart

* Fix flax test

* Make style

* Fix endif

* Fix replace all "utf-8 -> unp-8"

* Update comment

* Fix flax template (add missing ..._DOCSTRING)

* Use flax_bart imports in template (was t5)

* Fix unp

* Update templates/adding_a_new_model/tests

* Revert "Fix unp"

This reverts commit dc9002a41d.

* Remove one line of copied from to suppress CI error

* Use generate_tensorflow_pytorch_and_flax

* Add a missing part

* fix typo

* fix flax config

* add examples for flax

* small rename

* correct modeling imports

* correct auto loading

* corrects some flax tests

* correct small typo

* correct as type

* finish modif

* correct more templates

* final fixes

* add file testers

* up

* make sure tests match template regex

* correct pytorch

* correct tf

* correct more tf

* correct imports

* minor error

* minor error

* correct init

* more fixes

* correct more flax tests

* correct flax test

* more fixes

* correct docs

* update

* fix

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-09-01 09:49:03 +02:00
Jongheon Kim
ef8d6f2b4a
Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152)
* Set seq_length variable when using inputs_embeds

* remove code duplication
2021-08-31 06:51:25 -04:00
Sylvain Gugger
9870093f7b
[WIP] Disentangle auto modules from other modeling files (#13023)
* Initial work

* All auto models

* All tf auto models

* All flax auto models

* Tokenizers

* Add feature extractors

* Fix typos

* Fix other typo

* Use the right config

* Remove old mapping names and update logic in AutoTokenizer

* Update check_table

* Fix copies and check_repo script

* Fix last test

* Add back name

* clean up

* Update template

* Update template

* Forgot a )

* Use alternative to fixup

* Fix TF model template

* Address review comments

* Address review comments

* Style
2021-08-06 13:12:30 +02:00
Sylvain Gugger
790f1c9545
Fix template for inputs docstrings (#12976) 2021-08-03 08:28:25 +02:00
Lysandre Debut
c3d9ac7607
Expose get_config() on ModelTesters (#12812)
* Expose get_config() on ModelTesters

* Typo
2021-07-21 04:13:11 -04:00
Sylvain Gugger
0a6b9048d1
Init pickle (#12567)
* Try to pickle transformers

* Deal with special objs better

* Make picklable
2021-07-08 07:20:46 -04:00
Michal Szutenberg
0d2bffad31
Remove tf.roll wherever not needed (#12512)
It was used in shift_right.
After this change the TF code is more similar to the PyTorch implementations.
Also, TF graphs are optimized (one node fewer).
2021-07-07 16:17:30 +01:00
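
A hedged sketch of a shift_right without tf.roll (function and argument names assumed for illustration; the real helper also handles label padding):

```python
import tensorflow as tf

def shift_tokens_right(input_ids: tf.Tensor, decoder_start_token_id: int) -> tf.Tensor:
    # prepend the start token and drop the last position, instead of rolling
    # the tensor and then overwriting column 0 (one graph node fewer)
    start_tokens = tf.ones_like(input_ids[:, :1]) * decoder_start_token_id
    return tf.concat([start_tokens, input_ids[:, :-1]], axis=-1)
```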
Sylvain Gugger
9eda6b52e2
Add all XxxPreTrainedModel to the main init (#12314)
* Add all XxxPreTrainedModel to the main init

* Add to template

* Add to template bis

* Add FlaxT5
2021-06-23 10:40:54 -04:00
Hamid Shojanazeri
af6e01c5bc
Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252)
* registering a buffer for token_type_ids, to fix the error of the device id getting hardcoded when tracing

* style format

* adding a persistent flag to the registered buffers to prevent them from being added to the state_dict, which addresses the backward-compatibility issue

* adding a try/except to the fix, as the persistent flag is only available from PT > 1.6

* adding version check

* added a condition to only use the token_type_ids buffer when it is autogenerated, not passed by the user

* adding comments and making the condition where token_type_ids is None use the registered buffer

* taking the position embedding out of the if block

* adding comments

* handling the case if buffer for position_ids was not registered

* reverted the changes on position_ids, fixed the issue with the size of the token_type_ids buffer, and moved the modification for generated token_type_ids to BertModel instead of Embeddings

* reverting the token_type_ids in case of None to the previous version

* reverting changes on position_ids adding back the if block

* changes added by running make fix-copies

* changes added by running make fix-copies and added the import version as it was getting used

* changes added by running make fix-copies

* changes added by running make fix-copies

* fixing the import format

* fixing the import format

* modified to use a temp tensor for the trimmed and expanded token_type_ids buffer

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* changes made by fix-copies after temp tensor modifications

* clean up

* clean up

* clean up

* clean up

* Nit

* Nit

* Nit

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* modified according to support device conversion on traced models

* changes based on latest in master

* Adapt templates

* Add version import

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-06-22 05:21:30 -04:00
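
A minimal sketch of the buffer registration described above (module and sizes are illustrative):

```python
import torch
from torch import nn

class EmbeddingsSketch(nn.Module):
    def __init__(self, max_position_embeddings: int = 512):
        super().__init__()
        # a buffer moves with the module on .to(device), so traced models no
        # longer bake in a device id; persistent=False keeps it out of the
        # state_dict (the real code guards this for older PyTorch versions)
        self.register_buffer(
            "token_type_ids",
            torch.zeros(1, max_position_embeddings, dtype=torch.long),
            persistent=False,
        )
```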
Xa9aX ツ
f3558bbcfd
Deprecate pythonic Mish and support PyTorch 1.9 version of Mish (#12240)
* Moved Mish to Torch 1.9 version

* Run black formatting
2021-06-18 09:13:45 -04:00
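
Roughly, the change amounts to swapping a hand-written activation for the native one (a sketch, assuming PyTorch >= 1.9):

```python
import torch
import torch.nn.functional as F

def mish_python(x: torch.Tensor) -> torch.Tensor:
    # the hand-written ("pythonic") Mish being deprecated
    return x * torch.tanh(F.softplus(x))

x = torch.randn(4)
assert torch.allclose(mish_python(x), F.mish(x))  # native since PyTorch 1.9
```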
Stas Bekman
a156da9a23
consistent nn. and nn.functional: p2 templates (#12153) 2021-06-14 11:41:24 -07:00
François Lagunas
f8bd8c6c7e
Fixes bug that appears when using QA BERT and distillation. (#12026)
* Fixing a bug that appears when using distillation (and potentially other uses).
During the backward pass PyTorch complains with:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
This happens because the QA model code modifies the start_positions and end_positions input tensors in place with the clamp_ function: as a consequence the teacher and the student both modify the inputs, and the backward pass fails.

* Fixing all models QA clamp_ bug.
2021-06-07 11:21:59 -04:00
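
A minimal sketch of the bug and its fix (values are illustrative):

```python
import torch

start_positions = torch.tensor([0, 5, 700])
ignored_index = 512

# the buggy version mutated the caller's tensor in place, so under
# distillation the teacher and the student both clamp the same inputs:
# start_positions.clamp_(0, ignored_index)

# the fix clamps out of place, leaving the shared input tensor untouched
start_positions = start_positions.clamp(0, ignored_index)
```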
Suraj Patil
185122ef22
fix docs of past_key_values (#12049) 2021-06-07 15:24:03 +05:30
Sylvain Gugger
7eee950ac3
Re-styling in seq2seq attention (#11613) 2021-05-06 14:24:19 -04:00
Daniel Stancl
38a716cd41
TF BART models - Add cross_attentions to model output and fix cross-attention head masking (#10699)
* Add cross_attn_head_mask to BART

* Fix cross_attentions in TFBart-like models

* This commit enables returning of `cross_attentions`
for TFBart-like models

* It also fixes attention head masking in the cross-attention module

* Update TF model templates

* Fix missing , in TF model templates

* Fix typo: congig -> config
2021-04-26 14:16:21 +02:00
Daniel Stancl
e3ff165aa5
Fix cross-attention head mask for Torch encoder-decoder models (#10605)
* Fix cross-attention head mask for Torch BART models

* Fix head masking for cross-attention module for the following
models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
Pegasus

* Enable test_headmasking for M2M_100 model

* Fix cross_head_mask for FSMT, LED and T5

* This commit fixes `head_mask` for cross-attention modules
in the following models: FSMT, LED, T5

* It also contains some smaller doc changes so that
it is perfectly clear that the shape of `cross_head_mask`
is the same as that of `decoder_head_mask`

* Update template

* Fix template for BartForCausalLM

* Fix cross_head_mask for Speech2Text models

* Fix cross_head_mask in templates

* Fix args order in BartForCausalLM template

* Fix doc in BART templates

* Make more explicit naming

* `cross_head_mask` -> `cross_attn_head_mask`

* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`

* Fix doc

* make style quality

* Fix speech2text docstring
2021-04-23 18:58:06 +02:00
Sylvain Gugger
74712e22f3
Honor contributors to models (#11329)
* Honor contributors to models

* Fix typo

* Address review comments

* Add more authors
2021-04-21 09:47:27 -04:00
Sylvain Gugger
45fc8c7951
Make get_special_tokens_mask consider all tokens (#11163) 2021-04-09 11:57:44 -04:00
Stas Bekman
c9035e4537
fix: The 'warn' method is deprecated (#11105)
* The 'warn' method is deprecated

* fix test
2021-04-07 09:20:06 -04:00
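
The change in a nutshell (a sketch):

```python
import logging

logger = logging.getLogger(__name__)

# logging.Logger.warn is a deprecated alias of warning()
logger.warning("prefer warning() over warn()")
```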
Sylvain Gugger
acc3bd9d2a
Enforce string-formatting with f-strings (#10980)
* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-31 10:00:27 -04:00
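
A before/after sketch of the enforced style (strings are illustrative):

```python
name, layers = "bert-base-uncased", 12

# before: percent-style and .format-style interpolation
msg = "model %s has %d layers" % (name, layers)
msg = "model {} has {} layers".format(name, layers)

# after: f-strings throughout
msg = f"model {name} has {layers} layers"
```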
Sylvain Gugger
700229f8a4
Fixes in the templates (#10951)
* Fixes in the templates

* Define in all cases

* Dimensionality -> Dimension

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-03-29 17:36:13 -04:00
Sylvain Gugger
2295d783d5
Copy tokenizer files in each of their repo (#10624)
* Move tokenizer files in each repo

* Fix mBART50 tests

* Fix mBART tests

* Fix Marian tests

* Update templates
2021-03-10 11:26:23 -05:00
Sylvain Gugger
7da995c00c
Fix embeddings for PyTorch 1.8 (#10549)
* Fix embeddings for PyTorch 1.8

* Try with PyTorch 1.8.0

* Fix embeddings init

* Fix copies

* Typo

* More typos
2021-03-05 16:18:48 -05:00
Patrick von Platen
2d2ed2cc18
[T5] Fix speed degradation bug t5 (#10496)
* fix speed degradation bug t5

* fix for all models

* fix code quality
2021-03-03 12:42:41 +03:00
mingruimingrui
894db6701e
Bugfix: Removal of padding_idx in BartLearnedPositionalEmbedding (#10200)
* The assumption of padding_idx < 2 might not hold

* Use offset instead of 2

* Fix with black

* Change behavior to warning instead for backward compatibility.

* Fix with black

* Remove warning

* Make padding_idx non-required

* padding_idx fix for blenderbot

* padding_idx fix for blenderbot_small

* padding_idx fix for led

* padding_idx fix for mbart

* Remove extra whitespaces

* padding_idx fix for template

* Fix padding_idx passed to nn.Embedding mistake

* Fixed padding_idx passed to positional embedding in template

* Remove padding_idx from pytorch learned positional embeddings

* Remove accidentally added quotes

* Remove padding_idx from tf learned positional embeddings

* Remove zeroing of weights in __init__

Co-authored-by: Wang Ming Rui <mingrui.wang@C02CJTUYMD6M.local>
2021-02-25 14:33:13 +03:00
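
A sketch of the offset-based positional embedding this PR describes (the class name here is illustrative):

```python
import torch
from torch import nn

class LearnedPositionalEmbeddingSketch(nn.Embedding):
    # instead of assuming padding_idx < 2, reserve the first `offset` ids
    # and shift every position by that offset
    def __init__(self, num_embeddings: int, embedding_dim: int, offset: int = 2):
        self.offset = offset
        super().__init__(num_embeddings + offset, embedding_dim)

    def forward(self, positions: torch.Tensor) -> torch.Tensor:
        return super().forward(positions + self.offset)
```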
Julien Plu
83d803ba02
Making TF BART-like models XLA and AMP compliant (#10191)
* Update BART

* Update Blenderbot

* Update BlenderbotSmall

* Update Marian

* Update MBart

* Update MBart

* Update Pegasus

* Update template

* Fix Marian and Pegasus

* Apply style

* Default initializer

* Default initializer

* Default initializer

* Remove int32 casts

* Fix template

* Remove more cast
2021-02-17 17:48:56 +01:00
Julien Plu
31b0560ab4
Add AMP for Albert (#10141) 2021-02-15 17:18:33 +01:00
Julien Plu
570218878a
Fix TF template (#10189)
* Fix template

* Update Seq2Seq tests
2021-02-15 09:21:57 -05:00
Patrick von Platen
8e13b73593
Update README.md 2021-02-11 18:35:27 +03:00
Patrick von Platen
d6b4f48ecb
Update ADD_BIG_BIRD.md 2021-02-11 18:34:17 +03:00
Patrick von Platen
4cda2d73ef
Update ADD_BIG_BIRD.md 2021-02-09 19:58:35 +03:00
Julien Plu
b82fe7d258
Replace strided slice with tf.expand_dims (#10078)
* Replace tf.newaxis -> tf.expand_dims

* Fix tests

* Fix tests

* Use reshape when a tensor needs a double expand

* Fix GPT2

* Fix GPT2
2021-02-09 11:48:28 -05:00
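
A minimal before/after (the tensor is illustrative):

```python
import tensorflow as tf

x = tf.range(8)

# before: strided slice with tf.newaxis
y = x[:, tf.newaxis]

# after: an explicit op; reshape is used when a tensor needs a double expand
y = tf.expand_dims(x, axis=1)
```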
Lysandre Debut
c9df1b1d53
Model templates (#10072) 2021-02-08 09:07:02 -05:00
Julien Plu
cdd8659231
Fix TF template (#10069)
* Fix template

* Fix template
2021-02-08 08:10:50 -05:00
Julien Plu
31563e056d
Restore TF embeddings and attention layers to their previous version (#9890)
* Refactor BERT

* Restore all the concerned models

* Remove print

* Update template

* Apply Sylvain's and Morgan's comments

* Fix cast

* Put the cast inside call

* Remove cond in embeddings

* Fix funnel

* Restore previous dot product (attention_scores) computation

* Add ConvBERT and BART

* Make all the S2S models ONNX compliant

* Fix test

* Fix check copies
2021-02-08 14:36:30 +03:00
Lysandre Debut
ae37ceacbd
Fix typo (#10064) 2021-02-08 06:02:05 -05:00
Patrick von Platen
89be094e29
[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921)
* add big bird

* change teacher to mentor

* add proposal template

* adapt template

* delete old template

* correct some links

* finish template

* create big bird from template

* add big bird

* improve boxes

* finish boxes

* add pointers for BigBird

* finish big bird

* up

* up

* up

* up

* apply lysandres and sylvains suggestions

* delete bogus file

* correct markdown

* try different style

* try different style

* finalize
2021-02-05 15:47:54 +03:00
Lysandre Debut
e89c959af9
Fix model templates (#9999) 2021-02-04 07:47:26 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM (#9128)
* initialize bart4causalLM

* create BartDecoderWrapper, setters/getters

* delete spaces

* forward and additional methods

* update cache function, loss function, remove ngram* params in data class.

* add bartcausallm, bartdecoder testing

* correct bart for causal lm

* remove at

* add mbart as well

* up

* fix typo

* up

* correct

* add pegasusforcausallm

* add blenderbotforcausallm

* add blenderbotsmallforcausallm

* add marianforcausallm

* add test for MarianForCausalLM

* add Pegasus test

* add BlenderbotSmall test

* add blenderbot test

* fix a fail

* fix an import fail

* a fix

* fix

* Update modeling_pegasus.py

* fix models

* fix inputs_embeds setting getter

* adapt tests

* correct repo utils check

* finish test improvement

* fix tf models as well

* make style

* make fix-copies

* fix copies

* run all tests

* last changes

* fix all tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-04 11:56:12 +03:00
Patrick von Platen
0f4dc5d864
fix typo in naming (#9944) 2021-02-02 12:22:42 +03:00
Patrick von Platen
538b3b4607
[Tokenizer Utils Base] Make pad function more flexible (#9928)
* change tokenizer requirement

* split line

* Correct typo from list to str

* improve style

* make other function pretty as well

* add comment

* correct typo

* add new test

* pass tests for tok without padding token

* Apply suggestions from code review
2021-02-02 10:35:27 +03:00
Patrick von Platen
0e3be1ac8f
Add new model docs (#9667)
* add new model logic

* fix docs

* change structure

* improve add_new_model

* push new changes

* up

* up

* correct spelling

* improve docstring

* correct line length

* update readme

* correct links

* correct typos

* only add rst file for now

* Apply suggestions from code review 1

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>

* Apply suggestions from code review

Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

* finish adding all suggestions

* make style

* apply Niels feedback

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply sylvains suggestions

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-01 17:55:10 +03:00
Sylvain Gugger
d85691ac75
Doc title in the template (#9910) 2021-02-01 03:05:31 -05:00
Funtowicz Morgan
2ee9f9b69e
Fix computation of attention_probs when head_mask is provided. (#9853)
* Fix computation of attention_probs when head_mask is provided.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Apply changes to the template

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-01-28 06:11:52 -05:00
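
A sketch of the intended behavior (not the literal diff): the probability tensor that receives the head mask must be the one actually used to weight the values.

```python
import torch
from torch import nn

def attention_probs_sketch(scores, value, head_mask=None, p_drop=0.1):
    probs = nn.functional.softmax(scores, dim=-1)
    probs = nn.functional.dropout(probs, p=p_drop)
    if head_mask is not None:
        probs = probs * head_mask  # mask applied to the tensor that is used
    return torch.matmul(probs, value)
```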
Lysandre Debut
763ece2fea
Fix model templates (#9842) 2021-01-27 08:20:58 -05:00
Julien Plu
bd701ab1a0
Fix template (#9840) 2021-01-27 07:40:30 -05:00
Julien Plu
4adbdce5ee
Clean TF Bert (#9788)
* Start cleaning BERT

* Clean BERT and all those depends of it

* Fix attribute name

* Apply style

* Apply Sylvain's comments

* Apply Lysandre's comments

* remove unused import
2021-01-27 11:28:11 +01:00
Julien Plu
a1720694a5
Remove a TF usage warning and rework the documentation (#9756)
* Rework documentation

* Update the template

* Trigger CI

* Restore the warning but with the TF logger

* Update convbert doc
2021-01-27 10:45:42 +01:00
Lysandre
897a24c869 Fix head_mask for model templates 2021-01-26 11:02:48 +01:00
Julien Plu
a7dabfb3d1
Fix TF s2s models (#9478)
* Fix Seq2Seq models for serving

* Apply style

* Fix Longformer

* Fix mBart/Pegasus/Blenderbot

* Apply style

* Add a main intermediate layer

* Apply style

* Remove import

* Apply tf.function to Longformer

* Fix utils check_copy

* Update S2S template

* Fix BART + Blenderbot

* Fix BlenderbotSmall

* Fix BlenderbotSmall

* Fix BlenderbotSmall

* Fix MBart

* Fix Marian

* Fix Pegasus + template

* Apply style

* Fix common attributes test

* Forgot to fix the LED test

* Apply Patrick's comment on LED Decoder
2021-01-21 17:03:29 +01:00
Julien Plu
3f290e6c84
Fix mixed precision in TF models (#9163)
* Fix Gelu precision

* Fix gelu_fast

* Naming

* Fix usage and apply style

* add TF gelu approximate version

* add TF gelu approximate version

* add TF gelu approximate version

* Apply style

* Fix albert

* Remove the usage of the Activation layer
2021-01-21 07:00:11 -05:00
Julien Plu
7251a4736d
Fix template (#9697) 2021-01-20 09:04:53 -05:00
Julien Plu
14042d560f
New TF embeddings (cleaner and faster) (#9418)
* Create new embeddings + add to BERT

* Add Albert

* Add DistilBert

* Add Albert + Electra + Funnel

* Add Longformer + Lxmert

* Add last models

* Apply style

* Update the template

* Remove unused imports

* Rename attribute

* Import embeddings in their own model file

* Replace word_embeddings per weight

* fix naming

* Fix Albert

* Fix Albert

* Fix Longformer

* Fix Lxmert Mobilebert and MPNet

* Fix copy

* Fix template

* Update the get weights function

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/electra/modeling_tf_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-20 12:08:12 +01:00
Sylvain Gugger
7e662e6a3b
Fix model templates and use less than 119 chars (#9684)
* Fix model templates and use less than 119 chars

* Missing new line
2021-01-19 17:11:22 -05:00
Yusuke Mori
b020a736c3
Update past_key_values in GPT-2 (#9596)
* Update past_key_values in gpt2 (#9391)

* Update generation_utils, and rename some items

* Update modeling_gpt2 to avoid an error in gradient_checkpointing

* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2

* Change the location of '_reorder_cache' in modeling files

* Add '_reorder_cache' in modeling_ctrl

* Fix a bug of my last commit in CTRL

* Add '_reorder_cache' to GPT2DoubleHeadsModel

* Manage 'use_cache' in config of test_modeling_gpt2

* Clean up the doc string

* Update src/transformers/models/gpt2/modeling_gpt2.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix the doc string (GPT-2, CTRL)

* improve gradient_checkpointing_behavior

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-19 16:00:15 +01:00
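
A sketch of the per-model `_reorder_cache` hook this PR moves into the modeling files; the cache layout (a tuple of per-layer key/value tuples) follows the GPT-2-style convention:

```python
import torch

def _reorder_cache(past_key_values, beam_idx):
    # reorder every layer's cached tensors along the batch axis so the cache
    # follows the beams selected at each generation step
    return tuple(
        tuple(state.index_select(0, beam_idx) for state in layer_past)
        for layer_past in past_key_values
    )
```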
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart (#9497)
* make templates ready

* make add_new_model_command_ready

* finish tf bart

* prepare tf mbart

* finish tf bart

* add tf mbart

* add marian

* prep pegasus

* add tf pegasus

* push blenderbot tf

* add blenderbot

* add blenderbot small

* clean-up

* make fix copy

* define blend bot tok

* fix

* up

* make style

* add to docs

* add copy statements

* overwrite changes

* improve

* fix docs

* finish

* fix last slow test

* fix missing git conflict line

* fix blenderbot

* up

* fix blenderbot small

* load changes

* finish copied from

* upload fix
2021-01-12 02:06:32 +01:00
Julien Plu
1e3c362235
Fix template (#9512) 2021-01-11 08:03:28 -05:00
Julien Plu
1243ee7d0c
Full rework of the TF input/output embeddings and bias resizing (#9193)
* Start rework resizing

* Rework bias/decoder resizing

* Full resizing rework

* Full resizing rework

* Start to update the models with the new approach

* Finish to update the models

* Update all the tests

* Update the template

* Fix tests

* Fix tests

* Test a new approach

* Refactoring

* Refactoring

* Refactoring

* New rework

* Rework BART

* Rework bert+blenderbot

* Rework CTRL

* Rework Distilbert

* Rework DPR

* Rework Electra

* Rework Flaubert

* Rework Funnel

* Rework GPT2

* Rework Longformer

* Rework Lxmert

* Rework marian+mbart

* Rework mobilebert

* Rework mpnet

* Rework openai

* Rework pegasus

* Rework Roberta

* Rework T5

* Rework xlm+xlnet

* Rework template

* Fix TFT5EncoderOnly + DPRs

* Restore previous methods

* Fix Funnel

* Fix CTRL and TransforXL

* Apply style

* Apply Sylvain's comments

* Restore a test in DPR

* Address the comments

* Fix bug

* Apply style

* remove unused import

* Fix test

* Forgot a method

* missing test

* Trigger CI

* naming update

* Rebase

* Trigger CI
2021-01-11 06:27:28 -05:00
Julien Plu
cf416764f4
Fix template (#9504) 2021-01-11 05:21:25 -05:00
Julien Plu
4f7022d68d
Reformat (#9482) 2021-01-10 15:10:15 +01:00
Sylvain Gugger
1bdf42409c
Fast imports part 3 (#9474)
* New intermediate inits

* Update template

* Avoid importing torch/tf/flax in tokenization unless necessary

* Styling

* Shutup flake8

* Better python version check
2021-01-08 07:40:59 -05:00
Sylvain Gugger
758ed3332b
Transformers fast import part 2 (#9446)
* Main init work

* Add version

* Change from absolute to relative imports

* Fix imports

* One more typo

* More typos

* Styling

* Make quality script pass

* Add necessary replace in template

* Fix typos

* Spaces are ignored in replace for some reason

* Forgot one models.

* Fixes for import

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Add documentation

* Styling

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2021-01-07 09:36:14 -05:00
Julien Plu
812045adcc
New serving (#9419)
* Add a serving method

* Add albert

* Add serving for BERT and BART

* Add more models

* Finish the serving addition

* Temp fix

* Restore DPR

* Fix funnel attribute

* Fix attributes GPT2

* Fix OpenAIGPT attribute

* Fix T5 attributes

* Fix Bart attributes

* Fix TransfoXL attributes

* Add versioning

* better test

* Update template

* Fix Flaubert

* Fix T5

* Apply style

* Remove unused imports

* Deactivate extra parameters

* Remove too long test + saved_model default to False

* Ignore the saved model test for some models

* Fix some inputs

* Fix mpnet serving

* Trigger CI

* Address all comments
2021-01-07 11:48:49 +01:00
Patrick von Platen
b8462b5b2a
[GenerationOutputs] Fix GenerationOutputs Tests (#9443)
* fix generation models

* fix led

* fix docs

* add is_decoder

* fix last docstrings

* make style

* fix t5 cross attentions

* correct t5
2021-01-06 19:37:02 +01:00
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models (#9343)
* first try

* remove old template

* finish bart

* finish mbart

* delete unnecessary line

* init pegasus

* save intermediate

* correct pegasus

* finish pegasus

* remove cookie cutter leftover

* add marian

* finish blenderbot

* replace in file

* correctly split blenderbot

* delete "old" folder

* correct "add statement"

* adapt config for tf comp

* correct configs for tf

* remove ipdb

* fix more stuff

* fix mbart

* push pegasus fix

* fix mbart

* more fixes

* fix research projects code

* finish docs for bart, mbart, and marian

* delete unnecessary file

* correct attn typo

* correct configs

* remove pegasus for seq class

* correct peg docs

* correct peg docs

* finish configs

* further improve docs

* add copied from statements to mbart

* fix copied from in mbart

* add copy statements to marian

* add copied from to marian

* add pegasus copied from

* finish pegasus

* finish copied from

* Apply suggestions from code review

* make style

* backward comp blenderbot

* apply lysandres and sylvains suggestions

* apply suggestions

* push last fixes

* fix docs

* fix tok tests

* fix imports code style

* fix doc
2021-01-05 22:00:05 +01:00
Patrick von Platen
912f6881d2
add import math (#9346) 2020-12-29 19:35:06 +01:00
Patrick von Platen
785e52cd30
improve templates (#9342) 2020-12-29 16:48:44 +01:00
Patrick von Platen
83fdd252f6
[Seq2Seq Templates] Correct some TF-serving errors and add gradient checkpointing to PT by default. (#9334)
* correct tests

* correct shape and get_tf_activation

* more correction tf

* add gradient checkpointing to templates

* correct typo
2020-12-28 17:51:04 +01:00
Patrick von Platen
6c091abef2
[Templates] Adapt Bert (#9284)
* adapt templates

* adapt config

* add test as well

* fix output type

* fix cache false naming

* finish tests

* last fix
2020-12-24 01:44:33 +01:00
Patrick von Platen
d5db6c37d4
[Seq2Seq Templates] Fix check_repo.py templates file (#9277)
* add enc dec pt model to check repo

* fix indent
2020-12-23 11:40:20 +01:00
Patrick von Platen
cbe63949d7
Model Templates for Seq2Seq (#9251)
* adapt cookie cutter

* fix copy past statement

* delete copy statements for now

* remove unused import from template

* make doc rst

* correct config docstring

* correct training

* correct inputs processing tf enc dec

* make style

* adapt templates

* clean tabs

* correct tensor -> Tensor naming

* correct indent

* correct templates

* fix the test

* break lines to avoid > 119

* Apply suggestions from code review
2020-12-22 23:41:20 +01:00
Julien Plu
161a6461db
Fix TF template (#9234) 2020-12-21 13:52:16 +01:00
Julien Plu
5a8a4eb187
Improve BERT-like models performance with better self attention (#9124)
* Improve BERT-like models attention layers

* Apply style

* Put back error raising instead of assert

* Update template

* Fix copies

* Apply raising valueerror in MPNet

* Restore the copy check for the Intermediate layer in Longformer

* Update longformer
2020-12-21 13:10:15 +01:00
Lysandre Debut
6587cf9f84
Patch *ForCausalLM model (#9092) 2020-12-14 00:39:55 -05:00
Julien Plu
51d9c569fa
Fix embeddings resizing in TF models (#8657)
* Resize the biases in same time than the embeddings

* Trigger CI

* Biases are not reset anymore

* Remove get_output_embeddings + better LM model detection in generation utils

* Apply style

* First test on BERT

* Update docstring + new name

* Apply the new resizing logic to all the models

* fix tests

* Apply style

* Update the template

* Fix naming

* Fix naming

* Apply style

* Apply style

* Remove unused import

* Revert get_output_embeddings

* Trigger CI

* Update num parameters

* Restore get_output_embeddings in TFPretrainedModel and add comments

* Style

* Add decoder resizing

* Style

* Fix tests

* Separate bias and decoder resize

* Fix tests

* Fix tests

* Apply style

* Add bias resizing in MPNet

* Trigger CI

* Apply style
2020-12-13 23:05:24 -05:00
Lysandre Debut
67ff1c314a
Templates overhaul 1 (#8993) 2020-12-08 18:00:07 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Julien Plu
dcd3046f98
Better booleans handling in the TF models (#8777)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add deprecated arguments

* Add input processing to TF XLM

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input processing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add boolean processing for the inputs

* Apply style

* Missing optional

* Fix missing some input proc

* Update the template

* Fix missing inputs

* Missing input

* Fix args parameter

* Trigger CI

* Trigger CI

* Trigger CI

* Address Patrick's and Sylvain's comments

* Replace warn by warning

* Trigger CI

* Fix XLNET

* Fix detection
2020-12-04 09:08:29 -05:00
Patrick von Platen
443f67e887
[PyTorch] Refactor Resize Token Embeddings (#8880)
* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add news tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
2020-12-02 19:19:50 +01:00
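
A usage sketch of the refactored entry point (the model name is just an example):

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# grow the vocabulary, then resize the input (and tied output) embeddings
tokenizer.add_tokens(["<new_token>"])
model.resize_token_embeddings(len(tokenizer))
```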
Julien Plu
29d4992453
New TF model inputs (#8602)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add input processing for TF Flaubert

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input processing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add the new inputs in new Longformer models

* Update the template with the new input processing

* Remove useless assert

* Apply style

* Trigger CI
2020-11-24 13:55:00 -05:00
Stas Bekman
e84786aaa6
consistent ignore keys + make private (#8737)
* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment
2020-11-23 12:33:13 -08:00
Sylvain Gugger
dd52804f5f
Remove deprecated (#8604)
* Remove old deprecated arguments

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Sylvain Gugger
36a19915ea
Fix model templates (#8595)
* First fixes

* Fix imports and add init

* Fix typo

* Move init to final dest

* Fix tokenization import

* More fixes

* Styling
2020-11-17 10:35:38 -05:00
Sylvain Gugger
c89bdfbe72
Reorganize repo (#8580)
* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in convert command
2020-11-16 21:43:42 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
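
What the new default looks like in use (a sketch; the model name is an example):

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("hello world", return_tensors="pt")
outputs = model(**inputs)  # return_dict=True is now the default

hidden = outputs.last_hidden_state  # attribute access on the output object
hidden = outputs[0]                 # tuple-style indexing still works
```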
Lysandre Debut
826f04576f
Model templates encoder only (#8509)
* Model templates

* TensorFlow

* Remove pooler

* CI

* Tokenizer + Refactoring

* Encoder-Decoder

* Let's go testing

* Encoder-Decoder in TF

* Let's go testing in TF

* Documentation

* README

* Fixes

* Better names

* Style

* Update docs

* Choose to skip either TF or PT

* Code quality fixes

* Add to testing suite

* Update file path

* Cookiecutter path

* Update `transformers` path

* Handle rebasing

* Remove seq2seq from model templates

* Remove s2s config

* Apply Sylvain and Patrick comments

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Last fixes from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-13 11:59:30 -05:00
Santiago Castro
969859d5f6
Fix doc errors and typos across the board (#8139)
* Fix doc errors and typos across the board

* Fix a typo

* Fix the CI

* Fix more typos

* Fix CI

* More fixes

* Fix CI

* More fixes

* More fixes
2020-10-29 10:33:33 -04:00
Sylvain Gugger
378142afdf
Rename add_start_docstrings_to_callable (#8120) 2020-10-28 13:42:31 -04:00
Stas Bekman
3e31e7f956
[testing] rename skip targets + docs (#7863)
* rename skip targets + docs

* fix quotes

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* small improvements

* fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-20 04:39:13 -04:00
Thomas Wolf
ba8c4d0ac0
[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659)
* splitting fast and slow tokenizers [WIP]

* [WIP] splitting sentencepiece and tokenizers dependencies

* update dummy objects

* add name_or_path to models and tokenizers

* prefix added to file names

* prefix

* styling + quality

* spliting all the tokenizer files - sorting sentencepiece based ones

* update tokenizer version up to 0.9.0

* remove hard dependency on sentencepiece 🎉

* and removed hard dependency on tokenizers 🎉

* update conversion script

* update missing models

* fixing tests

* move test_tokenization_fast to main tokenization tests - fix bugs

* bump up tokenizers

* fix bert_generation

* update ad fix several tokenizers

* keep sentencepiece in deps for now

* fix funnel and deberta tests

* fix fsmt

* fix marian tests

* fix layoutlm

* fix squeezebert and gpt2

* fix T5 tokenization

* fix xlnet tests

* style

* fix mbart

* bump up tokenizers to 0.9.2

* fix model tests

* fix tf models

* fix seq2seq examples

* fix tests without sentencepiece

* fix slow => fast  conversion without sentencepiece

* update auto and bert generation tests

* fix mbart tests

* fix auto and common test without tokenizers

* fix tests without tokenizers

* clean up and lighten tests when tokenizers + sentencepiece are both off

* style quality and tests fixing

* add sentencepiece to doc/examples reqs

* leave sentencepiece on for now

* style quality split hebert and fix pegasus

* WIP Herbert fast

* add sample_text_no_unicode and fix hebert tokenization

* skip FSMT example test for now

* fix style

* fix fsmt in example tests

* update following Lysandre and Sylvain's comments

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-18 20:51:24 +02:00
Sylvain Gugger
13c1857718
Fix typo in all model docs (#7714) 2020-10-12 04:06:59 -04:00
Sylvain Gugger
b2b7fc7814
Check and update model list in index.rst automatically (#7527)
* Check and update model list in index.rst automatically

* Check and update model list in index.rst automatically

* Adapt template
2020-10-05 09:40:45 -04:00
Forrest Iandola
02ef825be2
SqueezeBERT architecture (#7083)
* configuration_squeezebert.py

thin wrapper around bert tokenizer

fix typos

wip sb model code

wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working

set up squeezebert to use BertModelOutput when returning results.

squeezebert documentation

formatting

allow head mask that is an array of [None, ..., None]

docs

docs cont'd

path to vocab

docs and pointers to cloud files (WIP)

line length and indentation

squeezebert model cards

formatting of model cards

untrack modeling_squeezebert_scratchpad.py

update aws paths to vocab and config files

get rid of stub of NSP code, and advise users to pretrain with mlm only

fix rebase issues

redo rebase of modeling_auto.py

fix issues with code formatting

more code format auto-fixes

move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert

tests for squeezebert modeling and tokenization

fix typo

move squeezebert before bert in modeling_auto.py to fix inheritance problem

disable test_head_masking, since squeezebert doesn't yet implement head masking

fix issues exposed by the test_modeling_squeezebert.py

fix an issue exposed by test_tokenization_squeezebert.py

fix issue exposed by test_modeling_squeezebert.py

auto generated code style improvement

issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()

update copyright

resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask

docs

add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli

autogenerated formatting tweaks

integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings

* tiny change to order of imports
2020-10-05 04:25:43 -04:00
Sylvain Gugger
0ccb6f5c6d
Clean RAG docs and template docs (#7348)
* Clean RAG docs and template docs

* Fix typo

* Better doc
2020-09-24 09:24:41 -04:00
Sylvain Gugger
d1691d90e5
Small fixes in TF template (#7044) 2020-09-10 10:36:02 -04:00
Lysandre Debut
b482ad474a
Fix template (#7040) 2020-09-10 08:45:52 -04:00
Stas Bekman
48ff6d5109
[doc] remove the implied defaults to :obj:None, s/True/ :obj:`True/, etc. (#6956)
* remove the implied defaults to :obj:`None`

* fix bug in the original

* replace to :obj:`True`, :obj:`False`
2020-09-04 18:22:25 -04:00
Sylvain Gugger
722b5807d8
Template updates (#6914) 2020-09-03 04:14:58 -04:00
Stas Bekman
7351ef83c1
[doc] typos (#6867)
* [doc] typos

fixed typos

* Update README.md
2020-09-02 06:51:51 -04:00
Lysandre
a75c64d80c Black 20 release 2020-08-26 17:20:22 +02:00
Sylvain Gugger
a573777901
Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Stas Bekman
e983da0e7d
cleanup tf unittests: part 2 (#6260)
* cleanup torch unittests: part 2

* remove trailing comma added by isort, which breaks flake8

* one more comma

* revert odd balls

* part 3: odd cases

* more ["key"] -> .key refactoring

* .numpy() is not needed

* more unnecessary .numpy() calls removed

* more simplification
2020-08-13 04:29:06 -04:00
Sylvain Gugger
c67d1a0259
Tf model outputs (#6247)
* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* Add new models and fix issues

* Quality improvements

* Add T5

* A bit of cleanup

* Fix for slow tests

* Style
2020-08-05 11:34:39 -04:00
Stas Bekman
5deed37f9f
cleanup torch unittests (#6196)
* improve unit tests

this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973
before I apply it to the rest

* batch 1

* batch 2

* batch 3

* batch 4

* batch 5

* style

* non-tf template

* last deletion of check_loss_output
2020-08-04 02:42:56 -04:00
Teven
5a0dac53bf
Empty assert hunt (#6056)
* Fixed empty asserts

* black-reformatted stragglers in templates

* More code quality checks

* Update src/transformers/convert_marian_to_pytorch.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/convert_marian_to_pytorch.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* removed unused line as per @sshleifer

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-03 10:19:03 +02:00
Sylvain Gugger
d951c14ae4
Model output test (#6155)
* Use return_dict=True in all tests

* Formatting
2020-07-31 09:44:37 -04:00
Sylvain Gugger
91cb95461e
Switch from return_tuple to return_dict (#6138)
* Switch from return_tuple to return_dict

* Fix test

* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)

* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests

* AutoModels


Tiny tweaks

* Style

* Final changes before merge

* Re-order for simpler review

* Final fixes

* Addressing @sgugger's comments

* Test MultipleChoice

* Rework TF trainer (#6038)

* Fully rework training/prediction loops

* fix method name

* Fix variable name

* Fix property name

* Fix scope

* Fix method name

* Fix tuple index

* Fix tuple index

* Fix indentation

* Fix variable name

* fix eval before log

* Add drop remainder for test dataset

* Fix step number + fix logging datetime

* fix eval loss value

* use global step instead of step + fix logging at step 0

* Fix logging datetime

* Fix global_step usage

* Fix breaking loop + logging datetime

* Fix step in prediction loop

* Fix step breaking

* Fix train/test loops

* Force TF at least 2.2 for the trainer

* Use assert_cardinality to facilitate the dataset size computation

* Log steps per epoch

* Make tfds compliant with TPU

* Make tfds compliant with TPU

* Use TF dataset enumerate instead of the Python one

* revert previous commit

* Fix data_dir

* Apply style

* rebase on master

* Address Sylvain's comments

* Address Sylvain's and Lysandre comments

* Trigger CI

* Remove unused import

* Switch from return_tuple to return_dict

* Fix test

* Add recent model

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
2020-07-30 09:17:00 -04:00
Sylvain Gugger
a884b7fa38
Update the new model template (#6019) 2020-07-24 14:15:37 -04:00
Sam Shleifer
feeb956a19
[docs] Add integration test example to copy pasta template (#5961)
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-07-22 12:48:38 -04:00
Thomas Wolf
601d4d699c
[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)
* remove references to old API in docstring - update data processors

* style

* fix tests - better type checking error messages

* better type checking

* include awesome fix by @LysandreJik for #5310

* updated doc and examples
2020-06-26 19:48:14 +02:00
Anthony MOI
36434220fc
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510)
* Use tokenizers pre-tokenized pipeline

* failing pretokenized test

* Fix is_pretokenized in python

* add pretokenized tests

* style and quality

* better tests for batched pretokenized inputs

* tokenizers clean up - new padding_strategy - split the files

* [HUGE] refactoring tokenizers - padding - truncation - tests

* style and quality

* bump up required tokenizers version to 0.8.0-rc1

* switched padding/truncation API - simpler better backward compat

* updating tests for custom tokenizers

* style and quality - tests on pad

* fix QA pipeline

* fix backward compatibility for max_length only

* style and quality

* Various cleans up - add verbose

* fix tests

* update docstrings

* Fix tests

* Docs reformatted

* __call__ method documented

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-06-15 17:12:51 -04:00
Bharat Raghunathan
6e603cb789
[All models] Extend config.output_attentions with output_attentions function arguments (#4538)
* DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* Fix further regressions in tests relating to `output_attentions`

Ensure proper propagation of `output_attentions` as a function parameter
to all model subclasses

* Fix more regressions in `test_output_attentions`

* Fix issues with BertEncoder

* Rename related variables to `output_attentions`

* fix pytorch tests

* fix bert and gpt2 tf

* Fix most TF tests for `test_output_attentions`

* Fix linter errors and more TF tests

* fix conflicts

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* fix pytorch tests

* fix conflicts

* fix conflicts

* Fix linter errors and more TF tests

* fix tf tests

* make style

* fix isort

* improve output_attentions

* improve tensorflow

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-09 23:39:06 +02:00
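
A usage sketch of the per-call override this PR introduces (the model name is an example):

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("hello world", return_tensors="pt")
# the flag can now be passed per call instead of via config.output_attentions
outputs = model(**inputs, output_attentions=True)
attentions = outputs[-1]  # one attention tensor per layer
```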
Julien Chaumond
d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Julien Chaumond
455c639093
CDN urls (#4030)
* [file_utils] use_cdn + documentation

* Move to cdn. urls for weights

* [urls] Hotfix for bert-base-japanese
2020-04-28 20:27:14 -04:00
Thomas Wolf
827d6d6ef0
Cleanup fast tokenizers integration (#3706)
* First pass on utility classes and python tokenizers

* finishing cleanup pass

* style and quality

* Fix tests

* Updating following @mfuntowicz comment

* style and quality

* Fix Roberta

* fix batch_size/seq_length in BatchEncoding

* add alignement methods + tests

* Fix OpenAI and Transfo-XL tokenizers

* adding trim_offsets=True default for GPT2 and RoBERTa

* style and quality

* fix tests

* add_prefix_space in roberta

* bump up tokenizers to rc7

* style

* unfortunately TensorFlow does not like these - removing shape/seq_len for now

* Update src/transformers/tokenization_utils.py

Co-Authored-By: Stefan Schweter <stefan@schweter.it>

* Adding doc and docstrings

* making flake8 happy

Co-authored-by: Stefan Schweter <stefan@schweter.it>
2020-04-18 13:43:57 +02:00
Sam Shleifer
dbd041243d
[cleanup] factor out get_head_mask, invert_attn_mask, get_exten… (#3806)
* Delete some copy pasted code
2020-04-16 09:55:25 -04:00
Julien Chaumond
a594ee9c84
More doc for model cards (#3698)
see https://github.com/huggingface/transformers/pull/3679#pullrequestreview-389368270
2020-04-08 12:12:52 -04:00
Julien Chaumond
94eb68d742 weigths -> weights 2020-04-04 15:03:26 -04:00
monologg
73368963b2 Fix importing unofficial TF models with extra optimizer weights 2020-02-07 10:25:31 -05:00
Julien Chaumond
83a41d39b3 💄 super 2020-01-15 18:33:50 -05:00
Julien Chaumond
b803b067bf Config to Model mapping 2020-01-13 20:05:20 +00:00
Genta Indra Winata
d6a677b14b Fix typographical errors (#2438) 2020-01-07 17:21:23 +01:00
alberduris
81d6841b4b GPU text generation: moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to the correct device 2020-01-06 15:11:12 +01:00
Aymeric Augustin
0ffc8eaf53 Enforce target version for black.
This should stabilize formatting.
2020-01-05 12:52:14 -05:00