transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Author	SHA1	Message	Date
Samin Yasar	4f08887053	Add Multimodal heading and Document question answering in task_summary.mdx (#23318 ) * add multimodal heading and docqa * fix sentence * task_summary data type = modality clarification * change the multimodal example to a smaller model	2023-07-17 13:51:19 +01:00
dependabot[bot]	38dfb86958	Bump cryptography from 41.0.0 to 41.0.2 in /examples/research_projects/decision_transformer (#24833 ) Bump cryptography in /examples/research_projects/decision_transformer Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.0 to 41.0.2. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/41.0.0...41.0.2) --- updated-dependencies: - dependency-name: cryptography dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-07-17 07:17:17 -04:00
namespace-Pt	18d42bfd23	Remove unused code in GPT-Neo (#24826 ) 1	2023-07-17 07:07:47 -04:00
Sohyun Sim	9771ad33be	🌐 [i18n-KO] Translated `custom_tools.mdx` to Korean (#24580 ) * docs: ko: custom_tools.mdx * feat: deepl draft * fix: change .mdx to .md * fix: resolve suggestions * fix: resolve suggestions	2023-07-17 07:04:10 -04:00
statelesshz	8ba26c18cf	deprecate `sharded_ddp` training argument (#24825 ) * deprecate fairscale's ShardedDDP * fix code style * roll back * deprecate the `sharded_ddp` training argument --------- Co-authored-by: jihuazhong <jihuazhong1@huawei.com>	2023-07-17 06:57:42 -04:00
Kadir Nar	5bb4430edc	[🔗 Docs] Fixed Incorrect Migration Link (#24793 ) * [🔗 Docs] Fixed Incorrect Migration Link * Update README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-14 17:47:50 -04:00
Sylvain Gugger	1023705440	Check models used for common tests are small (#24824 ) * First models * Conditional DETR * Treat DETR models, skip others * Skip LayoutLMv2 as well * Fix last tests	2023-07-14 14:43:19 -04:00
Dario Sučić	a865b62e07	set correct model input names for gptsw3tokenizer (#24788 )	2023-07-14 18:13:45 +01:00
Nicolas Patry	50726f9ea7	Fixing double `use_auth_token.pop` (preventing private models from being visible). (#24812 ) Fixing double `use_auth_token.pop` (preventing private models from being visible). Should fix: https://github.com/huggingface/transformers/issues/14334#issuecomment-1634527833 Repro: Have a private repo, with `vocab.json` (spread out files for the tokenizer) and use `AutoTokenizer.from_pretrained(..., use_auth_token="token")`.	2023-07-14 15:20:02 +02:00
Sylvain Gugger	91d7df58b6	Copy code when using local trust remote code (#24785 ) * Copy code when using local trust remote code * Remote upgrade strategy * Revert "Remote upgrade strategy" This reverts commit `4f0392f5d7`.	2023-07-13 16:57:20 -04:00
Sylvain Gugger	f32303d519	Run hub tests (#24807 ) * Run hub tests * [all-test] Run tests please! * [all-test] Add vision dep for hub tests * Fix tests	2023-07-13 15:25:45 -04:00
Fady Nakhla	9d7a0871e2	Use _BaseAutoModelClass's register method (#24810 ) Switching _BaseAutoModelClass from_pretrained and from_config to use the register classmethod that it defines rather than using the _LazyAutoMapping register method directly. This makes use of the additional consistency check within the base model's register.	2023-07-13 15:24:51 -04:00
Georgie Mathews	0866705022	Update setup.py to be compatible with pipenv (#24789 )	2023-07-13 12:56:43 -04:00
Matt	c0ca73dc98	Remove Falcon docs for the release until TGI is ready (#24808 ) * Remove Falcon docs for the release until TGI is ready * Update toctree	2023-07-13 17:27:58 +01:00
dymil	f9a711df4a	Fix typo 'submosules' (#24809 )	2023-07-13 16:56:53 +01:00
amyeroberts	eebce4470c	Add accelerate version in transformers-cli env (#24806 ) * Add accelerate version in transformers-cli env * Add accelerate config	2023-07-13 16:50:19 +01:00
Joao Gante	34d9409427	Llama/GPTNeoX: add RoPE scaling (#24653 ) * add rope_scaling * tmp commit * add gptneox * add tests * GPTNeoX can now handle long inputs, so the pipeline test was wrong * Update src/transformers/models/open_llama/configuration_open_llama.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove ntk * remove redundant validation --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-13 16:47:30 +01:00
Sylvain Gugger	9342c8fb82	Deprecate models (#24787 ) * Deprecate some models * Fix imports * Fix inits too * Remove tests * Add deprecated banner to documentation * Remove from init * Fix auto classes * Style * Remote upgrade strategy 1 * Remove site package cache * Revert this part * Fix typo... * Update utils * Update docs/source/en/model_doc/bort.md Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * With all files saved --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-07-13 11:46:54 -04:00
Yih-Dar	717dadc6f3	Skip torchscript tests for `MusicgenForConditionalGeneration` (#24782 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-13 15:54:18 +02:00
amyeroberts	e367a9770f	Fix MobileVitV2 doctest checkpoint (#24805 ) * Fix doctest checkpoint * Add import torch for mobilevit	2023-07-13 14:47:59 +01:00
Yih-Dar	e538189931	Upgrade jax/jaxlib/flax pin versions (#24791 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-13 13:57:30 +02:00
Bram Vanroy	6ba4d5de3a	[DOC] Clarify relationship load_best_model_at_end and save_total_limit (#24614 ) * Update training_args.py Clarify the relationship between `load_best_model_at_end` and `save_total_limit`. * fix: faulty quotes * make quality * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * DOCS: add explicit `True` * DOCS: make style/quality --------- Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-13 07:36:16 -04:00
SeongBeomLEE	21946a8cf4	[fix] Change the condition of ValueError in "convert_checkpoint_from_transformers_to_megatron" (#24769 ) * fix: half inference error norm_factor is still torch.float32 after using model.half So I changed it to register_buffer so I can change it to torch.float16 after using model.half * fix: Added a variable "persistent=False" * run make style * [fix] Change the condition of ValueError convert_checkpoint_from_transformers_to_megatron * [fix] error wording layers -> attention heads	2023-07-13 11:57:56 +01:00
Liyang90	1f6f32c243	Removing unnecessary `device=device` in modeling_llama.py (#24696 ) * Update modeling_llama.py Removing unnecessary `device=device` * fix in all occurrences of _make_causal_mask	2023-07-13 10:30:22 +01:00
Yih-Dar	906afa1d5c	Revert "Unpin protobuf in docker file (for daily CI)" (#24800 ) Revert "Unpin protobuf in docker file (for daily CI) (#24761)" This reverts commit `45025d92f8`.	2023-07-13 04:19:45 +02:00
Zach Mueller	f1732e1374	Rm duplicate pad_across_processes (#24780 ) Rm duplicate	2023-07-12 11:47:21 -04:00
Lysandre Debut	cfc8a05305	Remove WWT from README (#24672 )	2023-07-12 10:58:08 -04:00
Pedro Cuenca	395e566a42	gpt-bigcode: avoid `zero_` to support Core ML (#24755 ) gpt-bigcode: avoid `zeros_` to support Core ML. In-place `zeros_` is not supported by the Core ML conversion process. This PR replaces it with `zeros_like` so conversion can proceed. The change only affects a workaround for a PyTorch bug on the `cpu` device.	2023-07-12 16:38:25 +02:00
Zach Mueller	0284285501	Fix pad across processes dim in trainer and not being able to set the timeout (#24775 ) * dim, and rm copy * Don't rm copy for now * Oops * pad index * Should be a working test * Tickle down ddp timeout * Put fix back in now that testing locally is done * Better comment specifying timeout Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-12 10:01:51 -04:00
Yih-Dar	4f85aaa6c9	Update default values of bos/eos token ids in `CLIPTextConfig` (#24773 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-12 13:50:26 +02:00
Bauke Brenninkmeijer	fc9e387dc0	Replacement of 20 asserts with exceptions (#24757 ) * initial replacements of asserts with errors/exceptions * replace assert with exception in generation, align and bart * reset formatting change * reset another formatting issue * Apply suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * don't touch this file * change to 'is not False' * fix type --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-12 07:45:09 -04:00
Joao Gante	430a04a75a	Docs: Update logit processors __call__ docs (#24729 ) * tmp commit * __call__ docs * kwargs documented; shorter input_ids doc * nit * Update src/transformers/generation/logits_process.py	2023-07-12 12:21:30 +01:00
amyeroberts	6e2f069650	Add MobileVitV2 to doctests (#24771 ) * Add to doctests * Alphabetical order	2023-07-12 12:06:17 +01:00
Zach Mueller	7edc33ac7a	Fix eval_accumulation_steps leading to incorrect metrics (#24756 ) Fix eval steps	2023-07-12 05:49:12 -04:00
Yih-Dar	45025d92f8	Unpin protobuf in docker file (for daily CI) (#24761 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-11 23:55:55 +02:00
Sylvain Gugger	6aadb8d016	Allow existing configs to be registered (#24760 )	2023-07-11 16:52:34 -04:00
Gaurav Kumbhat	4c0e251dc7	🐛 Handle empty gen_kwargs for seq2seq trainer prediction_step function (#24759 ) * 🐛 Handle empty gen_kwargs for seq2seq trainer prediction_step fn Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com> * Update src/transformers/trainer_seq2seq.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-11 16:48:06 -04:00
Zach Mueller	253d43d46d	Fix lr scheduler not being reset on reruns (#24758 ) * Try this * Solved! * Rm extranious * Rm extranious * self * Args' * Check for if we created the lr scheduler * Move comment * Clean	2023-07-11 16:37:04 -04:00
Yih-Dar	1be0145d6a	Skip some slow tests for doctesting in PRs (Circle)CI (#24753 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-11 22:08:14 +02:00
NielsRogge	bb13a92859	[InstructBLIP] Fix bos token of LLaMa checkpoints (#24492 ) * Add fix * Fix doctest	2023-07-11 20:43:01 +01:00
janEbert	aac4c79968	Fix non-deterministic Megatron-LM checkpoint name (#24674 ) Fix non-deterministic checkpoint name `os.listdir`'s order is not deterministic, which is a problem when querying the first listed file as in the code (`os.listdir(...)[0]`). This can return a checkpoint name such as `distrib_optim.pt`, which does not include desired information such as the saved arguments originally given to Megatron-LM.	2023-07-11 19:55:04 +01:00
Sylvain Gugger	33aafc26ee	Skip keys not in the state dict when finding mismatched weights (#24749 )	2023-07-11 12:40:21 -04:00
Zehan Li	3d8697261e	add gradient checkpointing for distilbert (#24719 ) * add gradient checkpointing for distilbert * reformatted	2023-07-11 11:29:47 -04:00
Joao Gante	2642d8d04b	Docs: add `kwargs` type to fix formatting (#24733 )	2023-07-11 16:21:29 +01:00
Connor Henderson	5739726fcc	fix: Text splitting in the BasicTokenizer (#22280 ) * fix: Apostraphe splitting in the BasicTokenizer for CLIPTokenizer * account for apostrophe at start of new word * remove _run_split_on_punc, use re.findall instead * remove debugging, make style and quality * use pattern and punc splitting, repo-consistency will fail * remove commented out debugging * adds bool args to BasicTokenizer, remove pattern * do_split_on_punc default True * clean stray comments and line breaks * rebase, repo-consistency * update to just do punctuation split * add unicode normalizing back * remove redundant line	2023-07-11 11:07:58 -04:00
Justin Martin	2489e380e4	Fix typo in LocalAgent (#24736 )	2023-07-11 09:04:50 -04:00
Jegor Kitškerkin	8a5e8a9c2a	Add ViViT (#22518 ) * Add model * Add ability to get classification head weights * Add docs * Add imports to __init__.py * Run style * Fix imports and add mdx doc * Run style * Fix copyright * Fix config docstring * Remove imports of ViViTLayer and load_tf_weights_in_vivit * Remove FeatureExtractor and replace with ImageProcessor everywhere * Remove ViViTForPreTraining from vivit.mdx * Change ViViT -> Vivit everywhere * Add model_doc to _toctree.yml * Replace tuples with lists in arguments of VivitConfig * Rename patch_size to tubelet_size in TubeletEmbeddings * Fix checkpoint names * Add tests * Remove unused num_frames * Fix imports for VivitImageProcessor * Minor fixes * Decrease number of frames in VivitModelTester from 32 to 16 * Decrease number of frames in VivitModelTester from 16 to 8 * Add initialization for pos embeddings * Rename Vivit -> ViViT in some places * Fix docstring and formatting * Rename TubeletEmbeddings -> VivitTubeletEmbeddings * Remove load_tf_weights_in_vivit * Change checkpoint name * Remove Vivit _TOKENIZER_FOR_DOC * Fix * Fix VivitTubeletEmbeddings and pass config object as parameter * Use image_size and num_frames instead of video_size * Change conversion script and fix differences with the orig implementation * Fix docstrings * Add attention head pruning * Run style and fixup * Fix tests * Add ViViT to video_classification.mdx * Save processor in conversion script * Fix * Add image processor test * Run fixup and style * Run fix-copies * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use PyAV instead of decord * Add unittest.skip * Run style * Remove unneeded test * Update docs/source/en/model_doc/vivit.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/configuration_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add model * Add docs * Run style * Fix imports and add mdx doc * Remove FeatureExtractor and replace with ImageProcessor everywhere * Change ViViT -> Vivit everywhere * Rename Vivit -> ViViT in some places * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Run make style * Remove inputs save * Fix image processor * Fix * Run `make style` * Decrease parameters of VivitModelTester * Decrease tubelet size * Rename vivit.mdx * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix default values in image_processing_vivit.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-11 14:04:04 +01:00
Arthur	b15343de6f	[Patch-t5-tokenizer] Patches the changes on T5 to make sure previous behaviour is still valide for beginning of words (#24622 ) * patch `_tokenize` function * more tests * properly fix * fixup * Update src/transformers/models/t5/tokenization_t5.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix without ifs * update * protect import * add python processing * is first needed * add doc and update with lefacy * updaate * fix T5 SPM converter * styling * fix T5 warning * add is_seqio_available * remove is_first * revert some changes * more tests and update * update llama test batterie * fixup * refactor T5 spm common tests * draft the llama tests * update * uopdate test * nits * refine * name nit * fix t5 tests * fix T5 * update * revert convert slow to fast changes that fail lots of tests * legacy support * fixup * nits is first not defined * don't use legacy behaviour for switch transformers * style * My attempt to check. * nits * fixes * update * fixup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updates * fixup * add legacy warning * fixup * warning_once nit * update t5 documentation test * update llama tok documentation * add space to warning * nits * nit * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * last nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-07-11 15:02:18 +02:00
Matt	b3ab3fac1d	Falcon port (#24523 ) * Initial commit * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Cleanup config docstring * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert to relative imports * Remove torch < 1.8 warning * Restructure cos_sin header * qkv -> query, key, value * Refactor attention calculation * Add a couple of config variables to account for the different checkpoints * Successful merging of the code paths! * Fix misplaced line in the non-parallel attention path * Update config and tests * Add a pad_token_id when testing * Support output_attentions when alibi is None * make fixup * Skip KV cache shape test * No more _keys_to_ignore_on_load_missing * Simplify self attention a bit * Simplify self attention a bit * make fixup * stash commit * Some more attention mask updates * Should pass all tests except assisted generation! * Add big model generation test * make fixup * Add temporary workaround for test * Test overrides for assisted generation * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Test overrides for assisted generation * Add generation demo * Update copyright * Make the docstring model actually small * Add module-level docstring * Remove all assertions * Add copied from bloom * Reformat the QKV layer * Add copied from bloom * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove unused line and reformat * No single letter variables * Cleanup return names * Add copied from line * Remove the deprecated arguments blocks * Change the embeddings test to an alibi on/off test * Remove position_ids from FalconForQA * Remove old check for token type IDs * Fix the alibi path when multi_query is False * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update config naming * Fix typo for new_decoder_architecture * Add some comments * Fix docstring * Fix docstring * Create range in the right dtype from the start * Review comment cleanup * n_head_kv -> num_kv_heads * self.alibi -> self.use_alibi * self.num_kv -> self.num_kv_heads * Reorder config args * Made alibi arguments Optional * Add all model docstrings * Add extra checkpoints * Add author info for Falcon * Stop removing token_type_ids because our checkpoints shouldn't return it anymore * Add one hopeful comment for the future * Fix typo * Update tests, fix cache issue for generation * Use -1e9 instead of -inf to avoid float overflow * Recompute the rotary embeddings much less often * Re-enable disabled tests * One final fix to attention mask calculation, and update tests * Cleanup targeting falcon-40b equivalency * Post-rebase docs update * Update docstrings, especially in the config * More descriptive variable names, and comments where we can't rename them --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-07-11 13:36:31 +01:00
Marc Sun	35eac0df75	add link to accelerate doc (#24601 )	2023-07-10 17:49:30 -04:00
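
Note on commit aac4c79968 (Fix non-deterministic Megatron-LM checkpoint name): the message explains that `os.listdir` returns entries in no guaranteed order, so `os.listdir(...)[0]` can pick up a file such as `distrib_optim.pt` instead of the one that carries the saved arguments. The snippet below is only a minimal sketch of that idea, not the code from the commit. The helper `pick_checkpoint_file`, its `unwanted` parameter, and the file name `model_optim_rng.pt` are hypothetical names chosen for illustration.

```python
import os
import tempfile


def pick_checkpoint_file(directory, unwanted=("distrib_optim.pt",)):
    """Return one checkpoint file name, chosen the same way on every run.

    Hypothetical helper: it filters out names listed in `unwanted` and
    sorts the rest, so the result no longer depends on the arbitrary
    order in which os.listdir reports directory entries.
    """
    candidates = sorted(
        name for name in os.listdir(directory) if name not in unwanted
    )
    if not candidates:
        raise FileNotFoundError(f"no usable checkpoint file in {directory!r}")
    return candidates[0]


def demo():
    """Create two empty files in a temporary directory and pick one."""
    with tempfile.TemporaryDirectory() as tmp:
        # File names are illustrative; only distrib_optim.pt appears in the log.
        for name in ("distrib_optim.pt", "model_optim_rng.pt"):
            open(os.path.join(tmp, name), "w").close()
        return pick_checkpoint_file(tmp)


if __name__ == "__main__":
    print(demo())
```

Filtering before sorting matters here: sorting alone would make the choice repeatable but could still repeatably select the unwanted file.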

13418 Commits