* Initial commit
* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Cleanup config docstring
* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Convert to relative imports
* Remove torch < 1.8 warning
* Restructure cos_sin header
* qkv -> query, key, value
* Refactor attention calculation
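For context, a generic sketch of what a fused-QKV split looks like (dimensions and slicing are illustrative, not necessarily Falcon's exact layout):
```python
import torch
from torch import nn

hidden_size, num_heads = 64, 4
head_dim = hidden_size // num_heads
qkv_proj = nn.Linear(hidden_size, 3 * hidden_size, bias=False)  # fused projection

x = torch.randn(2, 10, hidden_size)                      # [batch, seq, hidden]
fused = qkv_proj(x).view(2, 10, num_heads, 3, head_dim)
query, key, value = fused.unbind(dim=3)                  # each [2, 10, 4, 16]
```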
* Add a couple of config variables to account for the different checkpoints
* Successful merging of the code paths!
* Fix misplaced line in the non-parallel attention path
* Update config and tests
* Add a pad_token_id when testing
* Support output_attentions when alibi is None
* make fixup
* Skip KV cache shape test
* No more _keys_to_ignore_on_load_missing
* Simplify self attention a bit
* Simplify self attention a bit
* make fixup
* stash commit
* Some more attention mask updates
* Should pass all tests except assisted generation!
* Add big model generation test
* make fixup
* Add temporary workaround for test
* Test overrides for assisted generation
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Test overrides for assisted generation
* Add generation demo
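A sketch of what such a demo can look like with the public checkpoints (checkpoint name and sampling arguments are illustrative):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", device_map="auto"  # device_map needs `accelerate`
)

inputs = tokenizer("The Falcon models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```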
* Update copyright
* Make the docstring model actually small
* Add module-level docstring
* Remove all assertions
* Add copied from bloom
* Reformat the QKV layer
* Add copied from bloom
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove unused line and reformat
* No single letter variables
* Cleanup return names
* Add copied from line
* Remove the deprecated arguments blocks
* Change the embeddings test to an alibi on/off test
* Remove position_ids from FalconForQA
* Remove old check for token type IDs
* Fix the alibi path when multi_query is False
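For reference, the standard ALiBi slope construction from the ALiBi paper (Press et al.); the Falcon code may differ in details:
```python
import math
import torch

def build_alibi_slopes(num_heads: int) -> torch.Tensor:
    # Head-specific slopes form a geometric sequence; heads beyond the
    # nearest power of two interleave a second sequence (per the paper).
    closest_pow2 = 2 ** math.floor(math.log2(num_heads))
    base = 2.0 ** (-(2.0 ** -(math.log2(closest_pow2) - 3)))
    slopes = [base ** (i + 1) for i in range(closest_pow2)]
    if closest_pow2 != num_heads:
        extra_base = 2.0 ** (-(2.0 ** -(math.log2(2 * closest_pow2) - 3)))
        num_extra = num_heads - closest_pow2
        slopes += [extra_base ** (i + 1) for i in range(0, 2 * num_extra, 2)]
    return torch.tensor(slopes)
```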
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update config naming
* Fix typo for new_decoder_architecture
* Add some comments
* Fix docstring
* Fix docstring
* Create range in the right dtype from the start
* Review comment cleanup
* n_head_kv -> num_kv_heads
* self.alibi -> self.use_alibi
* self.num_kv -> self.num_kv_heads
* Reorder config args
* Made alibi arguments Optional
* Add all model docstrings
* Add extra checkpoints
* Add author info for Falcon
* Stop removing token_type_ids because our checkpoints shouldn't return it anymore
* Add one hopeful comment for the future
* Fix typo
* Update tests, fix cache issue for generation
* Use -1e9 instead of -inf to avoid float overflow
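A minimal sketch of the failure mode (illustrative, not the Falcon code): with -inf, a fully masked row softmaxes to NaN, while a large finite negative value degrades gracefully.
```python
import torch

scores = torch.randn(2, 4, 4)                        # [batch, q_len, kv_len]
fully_masked = torch.ones(2, 4, 4, dtype=torch.bool)

print(torch.softmax(scores.masked_fill(fully_masked, float("-inf")), dim=-1))  # all NaN
print(torch.softmax(scores.masked_fill(fully_masked, -1e9), dim=-1))           # uniform rows
```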
* Recompute the rotary embeddings much less often
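A generic sketch of the caching pattern (not the exact Falcon module): recompute cos/sin only when the sequence length grows past what is cached.
```python
import torch

class RotaryEmbedding(torch.nn.Module):
    def __init__(self, head_dim: int, base: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
        self.register_buffer("inv_freq", inv_freq)
        self.seq_len_cached = -1

    def forward(self, seq_len: int, device, dtype):
        if seq_len > self.seq_len_cached:  # only recompute on growth
            self.seq_len_cached = seq_len
            t = torch.arange(seq_len, device=device, dtype=self.inv_freq.dtype)
            freqs = torch.outer(t, self.inv_freq.to(device))
            emb = torch.cat([freqs, freqs], dim=-1)
            self.cos_cached = emb.cos().to(dtype)
            self.sin_cached = emb.sin().to(dtype)
        return self.cos_cached[:seq_len], self.sin_cached[:seq_len]
```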
* Re-enable disabled tests
* One final fix to attention mask calculation, and update tests
* Cleanup targeting falcon-40b equivalency
* Post-rebase docs update
* Update docstrings, especially in the config
* More descriptive variable names, and comments where we can't rename them
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* feat: Add `_build_conversation_input_ids` to GPT-SW3 tokenizer, adjust line length
* feat: Merge in PR https://github.com/huggingface/transformers/pull/24504.
This allows the GPT-SW3 models (and other GPT-2 based models) to be 4-bit quantised
using `load_in_4bit` with `bitsandbytes`.
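A sketch of the resulting usage (requires `bitsandbytes` and a CUDA device; the checkpoint name is illustrative):
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "AI-Sweden-Models/gpt-sw3-126m", load_in_4bit=True, device_map="auto"
)
```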
* fix: F-string
* fix: F-string
* fix: Remove EOS token from all responses
* fix: Remove redundant newlines
* feat: Add `load_in_4bit` to `Pipeline`
* fix: Separate turns with `\n<s>\n` rather than `<s>`
* fix: Add missing newline in prompt
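A hypothetical sketch of the turn separation described above (labels and helper names are illustrative assumptions, not the actual tokenizer code):
```python
def build_chat_prompt(turns, bos="<s>"):
    # turns: list of (speaker, text) pairs, speaker in {"User", "Bot"}
    lines = [f"{speaker}:\n{text.strip()}" for speaker, text in turns]
    # Turns are separated with "\n<s>\n" rather than a bare "<s>",
    # and the prompt ends with the model's turn label.
    return f"\n{bos}\n".join(lines) + f"\n{bos}\nBot:"

print(build_chat_prompt([("User", "Hej!"), ("Bot", "Hej hej!"), ("User", "Vad heter du?")]))
```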
* tests: Add unit tests for the new `_build_conversation_input_ids` method
* style: Automatic style correction
* tests: Compare encodings rather than decodings
* fix: Remove `load_in_4bit` from pipeline arguments
* docs: Add description and references of the GPT-SW3 chat format
* style: Line breaks
* Apply suggestions from code review
Fix Conversation type hints
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix: Import TYPE_CHECKING
* style: Run automatic fixes
* tests: Remove `_build_conversation_input_ids` unit tests
* tests: Remove import of `Conversation` in GPT-SW3 unit test
* style: Revert formatting
* style: Move TYPE_CHECKING line after all imports
* style: Imports order
* fix: Change prompt to ensure that `sp_model.encode` and `encode` yield the same result
* docs: Add TODO comment related to the addition of whitespace during decoding
* style: Automatic style checks
* fix: Remove final whitespace in prompt, as prefix whitespace is used by sentencepiece
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add attention dropout, post attention dropout, post mlp dropout to gpt-neox
* fix typo
* add documentation
* fix too long line
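A hedged sketch of configuring the new probabilities (assuming the config fields are named `attention_dropout` and `hidden_dropout`):
```python
from transformers import GPTNeoXConfig, GPTNeoXForCausalLM

config = GPTNeoXConfig(
    num_hidden_layers=2, hidden_size=64, num_attention_heads=4,  # tiny, for illustration
    intermediate_size=256,
    attention_dropout=0.1,  # dropout on the attention probabilities
    hidden_dropout=0.1,     # dropout after attention and after the MLP
)
model = GPTNeoXForCausalLM(config)
```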
* Ran the repo style and consistency scripts on src/transformers/models/gpt_neox/configuration_gpt_neox.py and src/transformers/models/gpt_neox/modeling_gpt_neox.py: python utils/custom_init_isort.py, python utils/sort_auto_mappings.py, doc-builder style src/transformers docs/source --max_len 119 --path_to_docs docs/source, python utils/check_doc_toc.py --fix_and_overwrite, deps_table_update (updating src/transformers/dependency_versions_table.py), plus the repo checks (check_copies.py, check_table.py, check_dummies.py, check_repo.py, check_inits.py, check_config_docstrings.py, check_config_attributes.py, check_doctest_list.py, update_metadata.py --check-only, check_task_guides.py)
* Check for precompiled_charsmap before adding it to the normalizers list.
* Check for precompiled_charsmap in all SentencePiece tokenizer models
* Check for precompiled_charsmap in SPM tokenizer models - correct formatting
* Limit Pydantic to V1 in dependencies
Pydantic is about to release V2, which will break a lot of things. This change prevents `transformers` from being used with Pydantic V2, to avoid breakage.
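A sketch of the kind of pin involved (exact location in setup.py assumed):
```python
# In setup.py's dependency list: stay on Pydantic V1 until V2 support lands
_deps = [
    "pydantic<2",
]
```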
* more
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* hidden layers, huh, what are they good for (absolutely nothing)
* Some tests break with 1 hidden layer, use 2
* Use 1 hidden layer in a few slow models
* Use num_hidden_layers=2 everywhere
* Slightly higher tol for groupvit
* Slightly higher tol for groupvit
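A sketch of the convention (config values are illustrative): two layers keep test models tiny while still exercising layer-to-layer wiring that a single layer cannot.
```python
from transformers import BertConfig, BertModel

tiny = BertConfig(
    vocab_size=99, hidden_size=32, num_attention_heads=4,
    intermediate_size=37, num_hidden_layers=2,  # 2, not 1: covers inter-layer paths
)
model = BertModel(tiny)
```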
* Adding warning messages to BERT for missing attention masks
These warning messages are shown when there are pad tokens within the input_ids and
no attention mask is given. The warning message should only show up once.
* Adding warning messages to BERT for missing attention masks
These warning messages are shown when the pad_token_id is not None
and no attention masks are given. The warning message should only
show up once.
* Ran fix copies to copy over the changes to some of the other models
* Add logger.warning_once.cache_clear() to the test
* Show a warning when there is no attention mask and input_ids start/end with pad tokens
* Using warning_once() instead and fix indexing in input_ids check
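A hedged sketch of the check (helper name and message wording assumed):
```python
import torch
from transformers.utils import logging

logger = logging.get_logger(__name__)

def warn_if_padding_and_no_attention_mask(input_ids, attention_mask, pad_token_id):
    if attention_mask is not None or pad_token_id is None:
        return
    # Only the first/last positions are inspected, so the check stays cheap
    if pad_token_id in input_ids[:, [0, -1]]:
        # warning_once() ensures the message is emitted a single time
        logger.warning_once(
            "We strongly recommend passing an `attention_mask` since your "
            "input_ids may contain padding tokens."
        )
```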
---------
Co-authored-by: JB Lau <hckyn@voyager2.local>
* don't add space before single letter chars that don't have a merge
* fix the fix
* fixup
* add a test
* more testing
* fixup
* hack to make sure fast is also fixed
* update switch transformers test
* revert convert slow
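A hedged sketch of the regression being tested (checkpoint name illustrative):
```python
from transformers import T5Tokenizer  # slow tokenizer; requires sentencepiece

tok = T5Tokenizer.from_pretrained("t5-small")
ids = tok.encode("Hey <extra_id_0>I", add_special_tokens=False)
# The pieces around the special token should match how the plain text
# tokenizes on its own, with no extra space injected before "I".
print(tok.convert_ids_to_tokens(ids))
```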
* Update src/transformers/models/t5/tokenization_t5.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add typechecking
* quality
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>