transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-29 09:12:21 +06:00

Author	SHA1	Message	Date
Joao Gante	dad5ca83b2	TF: Finalize `unpack_inputs`-related changes (#16499 ) * Add unpack_inputs to remaining models * removed kwargs to `call()` in TF models * fix TF T5 tests	2022-04-04 16:37:33 +01:00
Yih-Dar	2199382dfd	Use random_attention_mask for TF tests (#16517 ) * use random_attention_mask for TF tests * Fix for TFCLIP test (for now). Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-01 16:53:07 +02:00
Joao Gante	c2f8eaf6bc	TF: unpack inputs on Convbert, GPTJ, LED, and templates (#16491 ) * Add unpack_inputs to remaining models * remove stray use of inputs in the templates; fix tf.debugging of attn masks	2022-03-30 17:12:27 +01:00
Sylvain Gugger	088c1880b7	Big file_utils cleanup (#16396 ) * Big file_utils cleanup * This one still needs to be treated separately	2022-03-25 07:25:20 -04:00
Sylvain Gugger	4975002df5	Reorganize file utils (#16264 ) * Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit	2022-03-23 10:26:33 -04:00
Lysandre Debut	eca77f4719	Updates the default branch from master to main (#16326 ) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-23 03:46:59 -04:00
Jacob Dineen	ec3aace0ae	Add type annotations for Rembert/Splinter and copies (#16338 ) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com>	2022-03-22 20:07:48 +00:00
Robot Jelly	d50f62f2de	added type hints for BART model (#16270 ) * added type hints for BART model * make fixup, adding imports to copied files * Adding some missing types to cookiecutter * Adding some missing types to cookiecutter * Adding some missing types to cookiecutter Co-authored-by: matt <rocketknight1@gmail.com>	2022-03-21 15:18:01 +00:00
Sanchit Gandhi	ee27b3d7df	Replace all deprecated `jax.ops` operations with jnp's `at` (#16078 ) * Replace all deprecated `jax.ops` operations with jnp's `at` * np to jnp scores * suggested changes	2022-03-16 09:08:55 +00:00
Joao Gante	70203b5937	TF generate refactor - past without encoder outputs (#15944 ) * Remove packed past from generation_tf_utils * update models with the new past format * update template accordingly	2022-03-08 14:46:44 +00:00
Yih-Dar	f0aacc140b	Do not change the output from tuple to list - to match PT's version (#15918 ) * Do not change the output from tuple to list - to match PT's version * Fix the same issues for 5 other models and the template Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-04 17:50:24 +01:00
Yih-Dar	8635407bc7	Fix tf.concatenate + test past_key_values for TF models (#15774 ) * fix wrong method name tf.concatenate * add tests related to causal LM / decoder * make style and quality * clean-up * Fix TFBertModel's extended_attention_mask when past_key_values is provided * Fix tests * fix copies * More tf.int8 -> tf.int32 in TF test template * clean-up * Update TF test template * revert the previous commit + update the TF test template * Fix TF template extended_attention_mask when past_key_values is provided * Fix some styles manually * clean-up * Fix ValueError: too many values to unpack in the test * Fix more: too many values to unpack in the test * Add a comment for extended_attention_mask when there is past_key_values * Fix TFElectra extended_attention_mask when past_key_values is provided * Add tests to other TF models * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder * Fix not passing training arg to lm_head in TFRobertaForCausalLM * Fix tests (with past) for TF Roberta * add testing for pask_key_values for TFElectra model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-25 17:11:46 +01:00
Lysandre Debut	bb7949b35a	Fix model templates (#15806 ) * Fix model templates * Update paths	2022-02-23 18:27:29 -05:00
Lysandre Debut	29c10a41d0	[Test refactor 1/5] Per-folder tests reorganization (#15725 ) * Per-folder tests reorganization Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-02-23 15:46:28 -05:00
Patrick von Platen	2e12b907ae	TF generate refactor - Greedy Search (#15562 ) * TF generate start refactor * Add tf tests for sample generate * re-organize * boom boom * Apply suggestions from code review * re-add * add all code * make random greedy pass * make encoder-decoder random work * further improvements * delete bogus file * make gpt2 and t5 tests work * finish logits tests * correct logits processors * correct past / encoder_outputs drama * refactor some methods * another fix * refactor shape_list * fix more shape list * import shape _list * finish docs * fix imports * make style * correct tf utils * Fix TFRag as well * Apply Lysandre's and Sylvais suggestions * Update tests/test_generation_tf_logits_process.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/tf_utils.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * remove cpu according to gante * correct logit processor Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-02-15 17:54:43 +01:00
Yih-Dar	6a5472a8e1	Force use_cache to be False in PyTorch (#15385 ) * use_cache = False for PT models if labels is passed * Fix for BigBirdPegasusForConditionalGeneration * add warning if users specify use_cache=True * Use logger.warning instead of warnings.warn Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-08 16:20:53 +01:00
SaulLu	7b8bdd8601	fix the `tokenizer_config.json` file for the slow tokenizer when a fast version is available (#15319 ) * add new test * update test * remove `tokenizer_file` from `additional_files_names` in `tokenization_utils_base.py` * add `tokenizer_file` for the fast only tokenizer * change global variables layoutxml * remove `"tokenizer_file"` from DPR tokenizer's Global variables * remove `tokenizer_file` from herbert slow tokenizer init * `"tokenizer_file"` from LED tokenizer's Global variables * remove `tokenizer_file` from mbart slow tokenizer init * remove `tokenizer_file` from slow tokenizer template * adapt to versioning * adapt the `test_tokenizer_mismatch_warning` test * clean test * clarify `VOCAB_FILES_NAMES` in tokenization_utils_fast.py * Revert "remove `tokenizer_file` from mbart slow tokenizer init" This reverts commit `0dbb723fa9`. * Revert "`"tokenizer_file"` from LED tokenizer's Global variables" This reverts commit `5a3f879bdd`. * Revert "remove `tokenizer_file` from herbert slow tokenizer init" This reverts commit `f5e10007b7`. * Revert "remove `"tokenizer_file"` from DPR tokenizer's Global variables" This reverts commit `da0895330b`. * set `tokenizer_file` in super `__init__` of mbart	2022-02-01 16:48:25 +01:00
Yih-Dar	dc05dd539f	Fix TF Causal LM models' returned logits (#15256 ) * Fix TF Causal LM models' returned logits * Fix expected shape in the tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-01 11:04:07 +00:00
Yih-Dar	554d333ece	Fix loss calculation in TFXXXForTokenClassification models (#15294 ) * Fix loss calculation in TFFunnelForTokenClassification * revert the change in TFFunnelForTokenClassification * fix FunnelForTokenClassification loss * fix other TokenClassification loss * fix more * fix more * add num_labels to ElectraForTokenClassification * revert the change to research projects Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-31 11:43:08 -05:00
Sylvain Gugger	7fc6f41d91	Add doc for add-new-model-like command (#15433 )	2022-01-31 11:10:45 -05:00
Yih-Dar	c15bb3fe19	[Fix doc example] fix missing import jnp (#15291 ) * fix missing import jnp * Fix missing jax and k=1 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-24 14:54:23 +01:00
Jonas Kuball	c962c2adbf	Adds missing module_specs for usages of _LazyModule (#15230 ) * Add missing __spec__ for transformers.models.auto * Moves the __spec__-test to the UnitTest class * Adds module_spec to all instances of _LazyModule * Refactors an old test from pytest to unittest	2022-01-21 07:30:12 -05:00
Matt	2708bfa127	Rename compute_loss in TF models (#15207 ) * Rename compute_loss to hf_compute_loss to avoid conflicts with the new Keras method * make style * Adding deprecation warning to `compute_loss` * Fix sneaky reference to compute_loss * Replace logger.warning with warnings.warn * Clarifying warning and deprecation timeline	2022-01-19 13:29:07 +00:00
Sylvain Gugger	5f3c57fc84	Check the repo consistency in model templates test (#15141 ) * Check the repo consistency in model templates test * Fix doc template * Fix docstrings * Fix last docstring	2022-01-14 04:52:38 -05:00
Sylvain Gugger	1a00863e95	Fix typo in doc template	2022-01-11 15:22:15 -05:00
NielsRogge	6ea6266625	Fix cookiecutter (#15100 )	2022-01-11 05:57:26 -05:00
Suraj Patil	3e9fdcf019	[DOC] fix doc examples for bart-like models (#15093 ) * fix doc examples * remove double colons	2022-01-10 18:13:28 +01:00
Sylvain Gugger	61d18ae035	Happy New Year! (#15094 )	2022-01-10 12:05:57 -05:00
Sylvain Gugger	207594be81	Convert rst files (#14888 ) * Convert all tutorials and guides * Convert all remaining rst to mdx * Track and fix bad links	2021-12-22 16:14:35 -05:00
Sylvain Gugger	27b3031de2	Mass conversion of documentation from rst to Markdown (#14866 ) * Convert docstrings of all configurations and tokenizers * Processors and fixes * Last modeling files and fixes to models * Pipeline modules * Utils files * Data submodule * All the other files * Style * Missing examples * Style again * Fix copies * Say bye bye to rst docstrings forever	2021-12-21 15:06:33 -05:00
Sylvain Gugger	7af80f6618	Convert docstrings of modeling files (#14850 ) * Convert file_utils docstrings to Markdown * Test on BERT * Return block indent * Temporarily disable doc styler * Remove from quality checks as well * Remove doc styler mess * Remove check from circleCI * Fix typo * Convert file_utils docstrings to Markdown * Test on BERT * Return block indent * Temporarily disable doc styler * Remove from quality checks as well * Remove doc styler mess * Remove check from circleCI * Fix typo * Let's go on all other model files * Add templates too * Styling and quality	2021-12-21 05:37:32 -05:00
Daniel Stancl	ff066119ca	Implement head_mask for Flax BERT and other models copied from BERT (#14620 ) * Implement head_mask for Flax BERT and other models copied from BERT * Remove `from jax._src.nn.functions import sigmoid` Remove `from jax._src.nn.functions import sigmoid` unintentionally added by IDE * Remove no more valid copy statement * Apply patil-suraj's suggestions from code review * Apply suggestions from the code review * Update Flax template * Fix a typo * Also update template for CausalLM modules	2021-12-17 17:06:59 +01:00
Lysandre Debut	8010fda9bf	Removes images to put them in a dataset (#14781 ) * First try * Update instructions	2021-12-16 04:42:02 -05:00
Yih-Dar	15a9d01519	Avoid using tf.tile in embeddings for TF models (#14735 ) * avoid tf.tile in embeddings * remove more tf.tile in embeddings * clean Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-13 17:30:46 +00:00
Yih-Dar	32eb29fef9	Fix doc examples: modify config before super().__init__ (#14697 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-13 12:50:02 +01:00
Yih-Dar	59d684fa92	Fix examples: 'CausalLMOutputWithCrossAttentions' object has no attribute 'last_hidden_state' (#14678 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-10 14:55:54 +01:00
Thomas Viehmann	6ed9882ddb	use functional interface for softmax in attention (#14198 ) * use functional interface instead of instantiating module and immediately calling it * fix torch.nn.functional to nn.functional. Thank you Stas!	2021-11-30 11:47:33 -05:00
Sylvain Gugger	d83b0e0c07	Add a post init method to all models (#14431 ) * Add a post init method to all models * Fix tests * Fix last tests * Fix templates * Add comment * Forgot to save	2021-11-18 08:38:09 -05:00
Suraj Patil	e92190c0f8	Fix Flax params dtype (#13098 ) * fix inits * fix embed dtype * fix embed dtype * add test to check default dtype * quality * add type conversion methods for flax models * more robust casting * cast sinusoidal positions * update pegasus * update albert * update test * make sure dtype is passed to every module * style * fix electra dense * fix t5 * quality * add more tests * better name * use the dtype for lm head computation * fix albert * style * fix albert embed dtype * more tests * fix vision enc-dec * cleanup * fix embed dtype pegasus * fix default param test * doc * update template * fix final_logits_bias dtype * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix doc * fix doc * add detailed docstring for dtype parameter * remove un-necessary import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-11 14:45:20 +05:30
Patrick von Platen	e81d8d7fa9	[Bert2Bert] allow bert2bert + relative embeddings (#14324 ) * [Bert2Bert] allow bert2bert + relative embeddings * up * Update README_ko.md * up * up	2021-11-09 14:26:58 -05:00
Sylvain Gugger	c28bc80bbb	Generalize problem_type to all sequence classification models (#14180 ) * Generalize problem_type to all classification models * Missing import * Deberta BC and fix tests * Fix template * Missing imports * Revert change to reformer test * Fix style	2021-10-29 10:32:56 -04:00
Patrick von Platen	f5af873617	[Docs] More general docstrings (#14028 ) * up * finish * up * up * finish	2021-10-16 00:48:37 +02:00
Yih-Dar	8b240a0661	Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222 ) * Add cross attentions to TFGPT2Model * Add TFEncoderDecoderModel * Add TFBaseModelOutputWithPoolingAndCrossAttentions * Add cross attentions to TFBertModel * Fix past or past_key_values argument issue * Fix generation * Fix save and load * Add some checks and comments * Clean the code that deals with past keys/values * Add kwargs to processing_inputs * Add serving_output to TFEncoderDecoderModel * Some cleaning + fix use_cache value issue * Fix tests + add bert2bert/bert2gpt2 tests * Fix more tests * Ignore crossattention.bias when loading GPT2 weights into TFGPT2 * Fix return_dict_in_generate in tf generation * Fix is_token_logit_eos_token bug in tf generation * Finalize the tests after fixing some bugs * Fix another is_token_logit_eos_token bug in tf generation * Add/Update docs * Add TFBertEncoderDecoderModelTest * Clean test script * Add TFEncoderDecoderModel to the library * Add cross attentions to TFRobertaModel * Add TFRobertaEncoderDecoderModelTest * make style * Change the way of position_ids computation * bug fix * Fix copies in tf_albert * Remove some copied from and apply some fix-copies * Remove some copied * Add cross attentions to some other TF models * Remove encoder_hidden_states from TFLayoutLMModel.call for now * Make style * Fix TFRemBertForCausalLM * Revert the change to longformer + Remove copies * Revert the change to albert and convbert + Remove copies * make quality * make style * Add TFRembertEncoderDecoderModelTest * make quality and fix-copies * test TFRobertaForCausalLM * Fixes for failed tests * Fixes for failed tests * fix more tests * Fixes for failed tests * Fix Auto mapping order * Fix TFRemBertEncoder return value * fix tf_rembert * Check copies are OK * Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined * Add TFEncoderDecoderModelSaveLoadTests * fix tf weight loading * check the change of use_cache * Revert the change * Add missing test_for_causal_lm for TFRobertaModelTest * Try cleaning past * fix _reorder_cache * Revert some files to original versions * Keep as many copies as possible * Apply suggested changes - Use raise ValueError instead of assert * Move import to top * Fix wrong require_torch * Replace more assert by raise ValueError * Add test_pt_tf_model_equivalence (the test won't pass for now) * add test for loading/saving * finish * finish * Remove test_pt_tf_model_equivalence * Update tf modeling template * Remove pooling, added in the prev. commit, from MainLayer * Update tf modeling test template * Move inputs["use_cache"] = False to modeling_tf_utils.py * Fix torch.Tensor in the comment * fix use_cache * Fix missing use_cache in ElectraConfig * Add a note to from_pretrained * Fix style * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt * Fix TFMLP (in TFGPT2) activation issue * Fix None past_key_values value in serving_output * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub * Apply review suggestions - style for cross_attns in serving_output * Apply review suggestions - change assert + docstrings * break the error message to respect the char limit * deprecate the argument past * fix docstring style * Update the encoder-decoder rst file * fix Unknown interpreted text role "method" * fix typo Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-10-13 00:10:34 +02:00
Tommy Chiang	a2ef9c5446	Use torch.unique_consecutive to check same element (#13637 ) We use `torch.unique` here only to check whether every elements have the same value. Therefore, we can use `torch.unique_consecutive` here. This function eliminates all but the first element from every consecutive group of equivalent elements. Like, if we apply this function to `[1, 2, 2, 1]`, it will result in `[1, 2, 1]`. As you could see, this is enough for checking whether every elements have the same value. Since `torch.unique_consecutive` do less thing, it is much more faster. On my computer, it is 25x faster on GPU and 15x faster on CPU.	2021-09-24 10:31:23 +02:00
Sylvain Gugger	27d4639779	Make gradient_checkpointing a training argument (#13657 ) * Make gradient_checkpointing a training argument * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/configuration_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Fix tests * Style * document Gradient Checkpointing as a performance feature * Small rename * PoC for not using the config * Adapt BC to new PoC * Forgot to save * Rollout changes to all other models * Fix typo Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org>	2021-09-22 07:51:38 -04:00
Bhadresh Savani	3fbb55c757	[Flax] Fixes typo in Bart based Flax Models (#13565 )	2021-09-15 11:03:52 +05:30
Nils Reimers	c8be8a9adb	Update model configs - Allow setters for common properties (#13026 ) * refactor GPT Config to allow dyn. properties * make attribute_map a class attribute * remove old code * update unit test to test config: Add test for common properties setter * update unit test to test config: Add test for common properties passed as parameters to __init__ * update to black code format * Allow that setters are not defined for certain config classes * update config classes to implement attribute_map * bugfix lxmert config - id2labels was not defined when num_labels was set * update broken configs - add attribute_maps * update bart config * update black codestyle * update documentation on common config attributes * update GPTJ config to new attribute map * update docs on common attributes * gptj config: add max_position_embeddings * gptj config: format with black * update speech to text 2 config * format doc file to max_len 119 * update config template	2021-09-06 16:30:13 +02:00
Patrick von Platen	02039352b2	Update README.md	2021-09-01 09:50:21 +02:00
Jonathan Chang	d160782a53	Add template for adding flax models (#12441 ) * Add option to add flax * Add flax template for __init__.py * Add flax template for .rst * Copy TF modeling template * Add a missing line in modeling_tf_... template * Update first half of modeling_flax_.. * Update encoder flax template * Copy test_modeling_tf... as test_modeling_flax... * Replace some TF to Flax in test_modeling_flax_... * Replace tf to np some function might not work, like _assert_tensors_equal * Replace remaining tf to np (might not work) * Fix cookiecutter * Add Flax in to_replace_... template * Update transformers-cli add-new-model * Save generate_flax in configuration.json This will be read by transformers-cli * Fix to_replace_... and cli * Fix replace cli * Fix cookiecutter name * Move docstring earlier to avoid not defined error * Fix a missing Module * Add encoder-decoder flax template from bart * Fix flax test * Make style * Fix endif * Fix replace all "utf-8 -> unp-8" * Update comment * Fix flax template (add missing ..._DOCSTRING) * Use flax_bart imports in template (was t5) * Fix unp * Update templates/adding_a_new_model/tests * Revert "Fix unp" This reverts commit `dc9002a41d`. * Remove one line of copied from to suppress CI error * Use generate_tensorflow_pytorch_and_flax * Add a missing part * fix typo * fix flax config * add examples for flax * small rename * correct modeling imports * correct auto loading * corrects some flax tests * correct small typo * correct as type * finish modif * correct more templates * final fixes * add file testers * up * make sure tests match template regex * correct pytorch * correct tf * correct more tf * correct imports * minor error * minor error * correct init * more fixes * correct more flax tests * correct flax test * more fixes * correct docs * update * fix Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-09-01 09:49:03 +02:00
Jongheon Kim	ef8d6f2b4a	Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152 ) * Set seq_length variable when using inputs_embeds * remove code duplication	2021-08-31 06:51:25 -04:00

213 Commits