transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-23 22:38:58 +06:00

Author	SHA1	Message	Date
cloudhan	e86faecfd4	Fix obvious typos in flax decoder impl (#17279 ) Change config.encoder_ffn_dim -> config.decoder_ffn_dim for decoder.	2022-05-16 13:08:04 +02:00
Suraj Patil	9bd67ac7bb	update BART docs (#17212 )	2022-05-12 19:25:16 +01:00
Sylvain Gugger	4ad2f68e34	Fix template init (#17163 )	2022-05-10 15:24:23 -04:00
Dom Miketa	df735d1317	[WIP] Fix Pyright static type checking by replacing if-else imports with try-except (#16578 ) * rebase and isort * modify cookiecutter init * fix cookiecutter auto imports * fix clean_frameworks_in_init * fix add_model_to_main_init * blackify * replace unnecessary f-strings * update yolos imports * fix roberta import bug * fix yolos missing dependency * fix add_model_like and cookiecutter bug * fix repository consistency error * modify cookiecutter, fix add_new_model_like * remove stale line Co-authored-by: Dom Miketa <dmiketa@exscientia.co.uk>	2022-05-09 11:28:53 -04:00
Pavel Belevich	39f8eafc1b	Remove device parameter from create_extended_attention_mask_for_decoder (#16894 )	2022-05-03 11:06:11 -04:00
Yih-Dar	19420fd99e	Move test model folders (#17034 ) * move test model folders (TODO: fix imports and others) * fix (potentially partially) imports (in model test modules) * fix (potentially partially) imports (in tokenization test modules) * fix (potentially partially) imports (in feature extraction test modules) * fix import utils.test_modeling_tf_core * fix path ../fixtures/ * fix imports about generation.test_generation_flax_utils * fix more imports * fix fixture path * fix get_test_dir * update module_to_test_file * fix get_tests_dir from wrong transformers.utils * update config.yml (CircleCI) * fix style * remove missing imports * update new model script * update check_repo * update SPECIAL_MODULE_TO_TEST_MAP * fix style * add __init__ * update self-scheduled * fix add_new_model scripts * check one way to get location back * python setup.py build install * fix import in test auto * update self-scheduled.yml * update slack notification script * Add comments about artifact names * fix for yolos Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-05-03 14:42:02 +02:00
Sanchit Gandhi	cd9274d010	[FlaxBert] Add ForCausalLM (#16995 ) * [FlaxBert] Add ForCausalLM * make style * fix output attentions * Add RobertaForCausalLM * remove comment * fix fx-to-pt model loading * remove comment * add modeling tests * add enc-dec model tests * add big_bird * add electra * make style * make repo-consitency * add to docs * remove roberta test * quality * amend cookiecutter * fix attention_mask bug in flax bert model tester * tighten pt-fx thresholds to 1e-5 * add 'copied from' statements * amend 'copied from' statements * amend 'copied from' statements * quality	2022-05-03 11:26:19 +02:00
Joao Gante	e03966e404	TF: XLA stable softmax (#16892 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-25 20:10:51 +01:00
Suraj Patil	d3bd9ac728	[Flax] improve large model init and loading (#16148 ) * begin do_init * add params_shape_tree * raise error if params are accessed when do_init is False * don't allow do_init=False when keys are missing * make shape tree a property * assign self._params at the end * add test for do_init * add do_init arg to all flax models * fix param setting * disbale do_init for composite models * update test * add do_init in FlaxBigBirdForMultipleChoice * better names and errors * improve test * style * add a warning when do_init=False * remove extra if * set params after _required_params * add test for from_pretrained * do_init => _do_init * chage warning to info * fix typo * add params in init_weights * add params to gpt neo init * add params to init_weights * update do_init test * Trigger CI * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update template * trigger CI * style * style * fix template Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:19:55 +02:00
Anmol Joshi	a315988bae	Moved functions to pytorch_utils.py (#16625 ) * Moved functions to pytorch_utils.py * isort formatting * Reverted tf changes * isort, make fix-copies * documentation fix * Fixed Conv1D import * Reverted research examples file * backward compatibility for pytorch_utils * missing import * isort fix	2022-04-12 12:38:50 -04:00
Matt	4354005291	Adding new train_step logic to make things less confusing for users (#15994 ) * Adding new train_step logic to make things less confusing for users * DO NOT ASK WHY WE NEED THAT SUBCLASS * Metrics now working, at least for single-output models with type annotations! * Updates and TODOs for the new train_step * Make fixup * Temporary test workaround until T5 has types * Temporary test workaround until T5 has types * I think this actually works! Needs a lot of tests though * MAke style/quality * Revert changes to T5 tests * Deleting the aforementioned unmentionable subclass * Deleting the aforementioned unmentionable subclass * Adding a Keras API test * Style fixes * Removing unneeded TODO and comments * Update test_step too * Stop trying to compute metrics with the dummy_loss, patch up test * Make style * make fixup * Docstring cleanup * make fixup * make fixup * Stop expanding 1D input tensors when using dummy loss * Adjust T5 test given the new compile() * make fixup * Skipping test for convnext * Removing old T5-specific Keras test now that we have a common one * make fixup * make fixup * Only skip convnext test on CPU * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Avoiding TF import issues * make fixup * Update compile() to support TF 2.3 * Skipping model.fit() on template classes for now * Skipping model.fit() on template class tests for now * Replace ad-hoc solution with find_labels * make fixup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-05 14:23:27 +01:00
Joao Gante	dad5ca83b2	TF: Finalize `unpack_inputs`-related changes (#16499 ) * Add unpack_inputs to remaining models * removed kwargs to `call()` in TF models * fix TF T5 tests	2022-04-04 16:37:33 +01:00
Yih-Dar	2199382dfd	Use random_attention_mask for TF tests (#16517 ) * use random_attention_mask for TF tests * Fix for TFCLIP test (for now). Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-01 16:53:07 +02:00
Joao Gante	c2f8eaf6bc	TF: unpack inputs on Convbert, GPTJ, LED, and templates (#16491 ) * Add unpack_inputs to remaining models * remove stray use of inputs in the templates; fix tf.debugging of attn masks	2022-03-30 17:12:27 +01:00
Sylvain Gugger	088c1880b7	Big file_utils cleanup (#16396 ) * Big file_utils cleanup * This one still needs to be treated separately	2022-03-25 07:25:20 -04:00
Sylvain Gugger	4975002df5	Reorganize file utils (#16264 ) * Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit	2022-03-23 10:26:33 -04:00
Lysandre Debut	eca77f4719	Updates the default branch from master to main (#16326 ) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-23 03:46:59 -04:00
Jacob Dineen	ec3aace0ae	Add type annotations for Rembert/Splinter and copies (#16338 ) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com>	2022-03-22 20:07:48 +00:00
Robot Jelly	d50f62f2de	added type hints for BART model (#16270 ) * added type hints for BART model * make fixup, adding imports to copied files * Adding some missing types to cookiecutter * Adding some missing types to cookiecutter * Adding some missing types to cookiecutter Co-authored-by: matt <rocketknight1@gmail.com>	2022-03-21 15:18:01 +00:00
Sanchit Gandhi	ee27b3d7df	Replace all deprecated `jax.ops` operations with jnp's `at` (#16078 ) * Replace all deprecated `jax.ops` operations with jnp's `at` * np to jnp scores * suggested changes	2022-03-16 09:08:55 +00:00
Joao Gante	70203b5937	TF generate refactor - past without encoder outputs (#15944 ) * Remove packed past from generation_tf_utils * update models with the new past format * update template accordingly	2022-03-08 14:46:44 +00:00
Yih-Dar	f0aacc140b	Do not change the output from tuple to list - to match PT's version (#15918 ) * Do not change the output from tuple to list - to match PT's version * Fix the same issues for 5 other models and the template Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-04 17:50:24 +01:00
Yih-Dar	8635407bc7	Fix tf.concatenate + test past_key_values for TF models (#15774 ) * fix wrong method name tf.concatenate * add tests related to causal LM / decoder * make style and quality * clean-up * Fix TFBertModel's extended_attention_mask when past_key_values is provided * Fix tests * fix copies * More tf.int8 -> tf.int32 in TF test template * clean-up * Update TF test template * revert the previous commit + update the TF test template * Fix TF template extended_attention_mask when past_key_values is provided * Fix some styles manually * clean-up * Fix ValueError: too many values to unpack in the test * Fix more: too many values to unpack in the test * Add a comment for extended_attention_mask when there is past_key_values * Fix TFElectra extended_attention_mask when past_key_values is provided * Add tests to other TF models * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder * Fix not passing training arg to lm_head in TFRobertaForCausalLM * Fix tests (with past) for TF Roberta * add testing for pask_key_values for TFElectra model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-25 17:11:46 +01:00
Lysandre Debut	bb7949b35a	Fix model templates (#15806 ) * Fix model templates * Update paths	2022-02-23 18:27:29 -05:00
Lysandre Debut	29c10a41d0	[Test refactor 1/5] Per-folder tests reorganization (#15725 ) * Per-folder tests reorganization Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-02-23 15:46:28 -05:00
Patrick von Platen	2e12b907ae	TF generate refactor - Greedy Search (#15562 ) * TF generate start refactor * Add tf tests for sample generate * re-organize * boom boom * Apply suggestions from code review * re-add * add all code * make random greedy pass * make encoder-decoder random work * further improvements * delete bogus file * make gpt2 and t5 tests work * finish logits tests * correct logits processors * correct past / encoder_outputs drama * refactor some methods * another fix * refactor shape_list * fix more shape list * import shape _list * finish docs * fix imports * make style * correct tf utils * Fix TFRag as well * Apply Lysandre's and Sylvais suggestions * Update tests/test_generation_tf_logits_process.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/tf_utils.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * remove cpu according to gante * correct logit processor Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-02-15 17:54:43 +01:00
Yih-Dar	6a5472a8e1	Force use_cache to be False in PyTorch (#15385 ) * use_cache = False for PT models if labels is passed * Fix for BigBirdPegasusForConditionalGeneration * add warning if users specify use_cache=True * Use logger.warning instead of warnings.warn Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-08 16:20:53 +01:00
SaulLu	7b8bdd8601	fix the `tokenizer_config.json` file for the slow tokenizer when a fast version is available (#15319 ) * add new test * update test * remove `tokenizer_file` from `additional_files_names` in `tokenization_utils_base.py` * add `tokenizer_file` for the fast only tokenizer * change global variables layoutxml * remove `"tokenizer_file"` from DPR tokenizer's Global variables * remove `tokenizer_file` from herbert slow tokenizer init * `"tokenizer_file"` from LED tokenizer's Global variables * remove `tokenizer_file` from mbart slow tokenizer init * remove `tokenizer_file` from slow tokenizer template * adapt to versioning * adapt the `test_tokenizer_mismatch_warning` test * clean test * clarify `VOCAB_FILES_NAMES` in tokenization_utils_fast.py * Revert "remove `tokenizer_file` from mbart slow tokenizer init" This reverts commit `0dbb723fa9`. * Revert "`"tokenizer_file"` from LED tokenizer's Global variables" This reverts commit `5a3f879bdd`. * Revert "remove `tokenizer_file` from herbert slow tokenizer init" This reverts commit `f5e10007b7`. * Revert "remove `"tokenizer_file"` from DPR tokenizer's Global variables" This reverts commit `da0895330b`. * set `tokenizer_file` in super `__init__` of mbart	2022-02-01 16:48:25 +01:00
Yih-Dar	dc05dd539f	Fix TF Causal LM models' returned logits (#15256 ) * Fix TF Causal LM models' returned logits * Fix expected shape in the tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-01 11:04:07 +00:00
Yih-Dar	554d333ece	Fix loss calculation in TFXXXForTokenClassification models (#15294 ) * Fix loss calculation in TFFunnelForTokenClassification * revert the change in TFFunnelForTokenClassification * fix FunnelForTokenClassification loss * fix other TokenClassification loss * fix more * fix more * add num_labels to ElectraForTokenClassification * revert the change to research projects Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-31 11:43:08 -05:00
Sylvain Gugger	7fc6f41d91	Add doc for add-new-model-like command (#15433 )	2022-01-31 11:10:45 -05:00
Yih-Dar	c15bb3fe19	[Fix doc example] fix missing import jnp (#15291 ) * fix missing import jnp * Fix missing jax and k=1 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-24 14:54:23 +01:00
Jonas Kuball	c962c2adbf	Adds missing module_specs for usages of _LazyModule (#15230 ) * Add missing __spec__ for transformers.models.auto * Moves the __spec__-test to the UnitTest class * Adds module_spec to all instances of _LazyModule * Refactors an old test from pytest to unittest	2022-01-21 07:30:12 -05:00
Matt	2708bfa127	Rename compute_loss in TF models (#15207 ) * Rename compute_loss to hf_compute_loss to avoid conflicts with the new Keras method * make style * Adding deprecation warning to `compute_loss` * Fix sneaky reference to compute_loss * Replace logger.warning with warnings.warn * Clarifying warning and deprecation timeline	2022-01-19 13:29:07 +00:00
Sylvain Gugger	5f3c57fc84	Check the repo consistency in model templates test (#15141 ) * Check the repo consistency in model templates test * Fix doc template * Fix docstrings * Fix last docstring	2022-01-14 04:52:38 -05:00
Sylvain Gugger	1a00863e95	Fix typo in doc template	2022-01-11 15:22:15 -05:00
NielsRogge	6ea6266625	Fix cookiecutter (#15100 )	2022-01-11 05:57:26 -05:00
Suraj Patil	3e9fdcf019	[DOC] fix doc examples for bart-like models (#15093 ) * fix doc examples * remove double colons	2022-01-10 18:13:28 +01:00
Sylvain Gugger	61d18ae035	Happy New Year! (#15094 )	2022-01-10 12:05:57 -05:00
Sylvain Gugger	207594be81	Convert rst files (#14888 ) * Convert all tutorials and guides * Convert all remaining rst to mdx * Track and fix bad links	2021-12-22 16:14:35 -05:00
Sylvain Gugger	27b3031de2	Mass conversion of documentation from rst to Markdown (#14866 ) * Convert docstrings of all configurations and tokenizers * Processors and fixes * Last modeling files and fixes to models * Pipeline modules * Utils files * Data submodule * All the other files * Style * Missing examples * Style again * Fix copies * Say bye bye to rst docstrings forever	2021-12-21 15:06:33 -05:00
Sylvain Gugger	7af80f6618	Convert docstrings of modeling files (#14850 ) * Convert file_utils docstrings to Markdown * Test on BERT * Return block indent * Temporarily disable doc styler * Remove from quality checks as well * Remove doc styler mess * Remove check from circleCI * Fix typo * Convert file_utils docstrings to Markdown * Test on BERT * Return block indent * Temporarily disable doc styler * Remove from quality checks as well * Remove doc styler mess * Remove check from circleCI * Fix typo * Let's go on all other model files * Add templates too * Styling and quality	2021-12-21 05:37:32 -05:00
Daniel Stancl	ff066119ca	Implement head_mask for Flax BERT and other models copied from BERT (#14620 ) * Implement head_mask for Flax BERT and other models copied from BERT * Remove `from jax._src.nn.functions import sigmoid` Remove `from jax._src.nn.functions import sigmoid` unintentionally added by IDE * Remove no more valid copy statement * Apply patil-suraj's suggestions from code review * Apply suggestions from the code review * Update Flax template * Fix a typo * Also update template for CausalLM modules	2021-12-17 17:06:59 +01:00
Lysandre Debut	8010fda9bf	Removes images to put them in a dataset (#14781 ) * First try * Update instructions	2021-12-16 04:42:02 -05:00
Yih-Dar	15a9d01519	Avoid using tf.tile in embeddings for TF models (#14735 ) * avoid tf.tile in embeddings * remove more tf.tile in embeddings * clean Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-13 17:30:46 +00:00
Yih-Dar	32eb29fef9	Fix doc examples: modify config before super().__init__ (#14697 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-13 12:50:02 +01:00
Yih-Dar	59d684fa92	Fix examples: 'CausalLMOutputWithCrossAttentions' object has no attribute 'last_hidden_state' (#14678 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-10 14:55:54 +01:00
Thomas Viehmann	6ed9882ddb	use functional interface for softmax in attention (#14198 ) * use functional interface instead of instantiating module and immediately calling it * fix torch.nn.functional to nn.functional. Thank you Stas!	2021-11-30 11:47:33 -05:00
Sylvain Gugger	d83b0e0c07	Add a post init method to all models (#14431 ) * Add a post init method to all models * Fix tests * Fix last tests * Fix templates * Add comment * Forgot to save	2021-11-18 08:38:09 -05:00
Suraj Patil	e92190c0f8	Fix Flax params dtype (#13098 ) * fix inits * fix embed dtype * fix embed dtype * add test to check default dtype * quality * add type conversion methods for flax models * more robust casting * cast sinusoidal positions * update pegasus * update albert * update test * make sure dtype is passed to every module * style * fix electra dense * fix t5 * quality * add more tests * better name * use the dtype for lm head computation * fix albert * style * fix albert embed dtype * more tests * fix vision enc-dec * cleanup * fix embed dtype pegasus * fix default param test * doc * update template * fix final_logits_bias dtype * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix doc * fix doc * add detailed docstring for dtype parameter * remove un-necessary import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-11 14:45:20 +05:30

224 Commits