transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 10:38:23 +06:00

Author	SHA1	Message	Date
Karim Foda	d9e28d91a8	Fix gradient checkpointing bug marian (#21842 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-02 15:41:15 +00:00
Karim Foda	b405b62f4a	Fix gradient checkpointing bug M2M 100 (#21841 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-02 15:40:56 +00:00
Karim Foda	7e6dd664e8	Fix gradient checkpointing bug LED (#21840 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-02 15:40:35 +00:00
Sourab Mangrulkar	b6f47b5393	fsdp bf16 enable autocast (#21847 )	2023-03-02 20:18:07 +05:30
Arthur	fb76994c41	[GPT-J] add deprecation warning (#21869 ) * add deprecation warning * remove pos ids from args docstring * fix failing test	2023-03-02 14:51:59 +01:00
Kashif Rasul	648d0deb1d	fix typo in Bart's attention (#21898 )	2023-03-02 08:49:26 -05:00
Arthur	c87654dca1	[Whisper] Add rescaling function with `do_normalize` (#21263 ) * add `zero_mean_unit_var_norm` function * normalize before MEL computation * fixup * add simple test * quality * Update tests/models/whisper/test_feature_extraction_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixup * use attention masks if padding was applied * Update based on review Co-authored-by: bofeng huang <bofenghuang7@gmail.com> --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: bofeng huang <bofenghuang7@gmail.com>	2023-03-02 14:17:21 +01:00
Arthur	b48c7f7b3f	[T5 doc] Fix confusing documentation about `d_kv` (#21896 ) * Confusing documentation in T5 * Fix confusing documentation in T5 configuration file	2023-03-02 14:07:25 +01:00
Sid Kiblawi	edbb37f736	Add `inputs_embeds` functionality when generating with BioGPT (#21889 ) * initial commit to add inputs_embeds to generation * formatting	2023-03-02 07:43:19 -05:00
amyeroberts	3412f5979d	Use PyAV instead of Decord in examples (#21572 ) * Use PyAV instead of Decord * Get frame indices * Fix number of frames * Update src/transformers/models/videomae/image_processing_videomae.py * Fix up * Fix copies * Update timesformer doctests * Update docstrings	2023-03-02 12:30:38 +00:00
Arthur	c256bc6d10	[ZAC] fix ci daily (#21893 ) add correct revision after model was overwritten	2023-03-02 10:46:03 +01:00
Arthur	633e5e89f7	[Refactor] Relative imports wherever we can (#21880 ) * initial commit * update * second batch * style * fix imports * fix relative import on pipeline	2023-03-02 09:45:42 +01:00
Arthur	43299c63ca	fix checkpoint (#21874 )	2023-03-02 08:47:20 +01:00
Yih-Dar	89359e4c63	Fix `test_load_default_pipelines_pt` for `ClapModel` (#21886 ) * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 21:52:26 +01:00
Yih-Dar	36ee128375	Fix `WhisperModelTest` (#21883 ) * force on the same device * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 20:41:27 +01:00
saswatmeher	4edfd2d4d2	Fix Gradient checkpointing bug BigBird (#21882 ) Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>	2023-03-01 19:10:03 +00:00
Alara Dirik	269b054939	Add ALIGN to transformers (#21741 ) Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.	2023-03-01 21:23:31 +03:00
Matt	f7c618e3b0	Add TFVisionTextDualEncoder (#21873 ) * Temporary commit to stash everything so far * Temporary commit to stash everything so far * stash commit * Refactor from_pretrained * Fix final test, make fixup * Update dummies * Add model to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Add TFVisionTextDualEncoder to utils/documentation_tests.txt * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-03-01 18:00:48 +00:00
twaka	45e11091e5	Make loading of pretrained gpt2 faster by avoiding initialization of Conv1D's weights (#21879 ) apply normal_ after assigning weight as nn.Parameter to avoid unnecessary initialization computation	2023-03-01 11:59:21 -05:00
Matt	1d3a1cc44b	Add check for different embedding types in examples (#21881 ) * Add check for different embedding types in examples * Correctly update summarization example	2023-03-01 16:57:06 +00:00
Yih-Dar	53735d7c3b	Add a utility file to get information from test files (#21856 ) * Add a utility file to get information from test files --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 17:53:29 +01:00
Stas Bekman	3eba1dd27e	[doc] deepspeed tests (#21859 )	2023-03-01 08:52:49 -08:00
Sourab Mangrulkar	571dd693b5	update FSDP and add XLA-FSDP documentation (#21812 ) * update FSDP and add XLA-FSDP documentation * resolving comments * minor update * fix xla-fsdp docs	2023-03-01 19:51:07 +05:30
Maria Khalusova	9c1d59882b	Removed BLIP mention from the troubleshooting guide (#21872 ) removed BLIP mention from the troubleshooting guide	2023-03-01 08:26:25 -05:00
Younes Belkada	72787c5b68	[`Blip`] Fix blip doctest (#21868 ) fix blip doctest	2023-03-01 14:05:53 +01:00
Lorenzo Balzani	619d831848	Italian translation of community.mdx (#21871 ) Italian translation of community.mdx gh-17459	2023-03-01 07:49:56 -05:00
raghavanone	ebd5258975	Change the way tensor is reshaped in BartAttention (from .view to .reshape) (#21860 ) * Change the .view call to .reshape * Change the .view call to .reshape to all the copies from bart attention * Fix copies and style * Fix copies and style * Fix copies and style * Fix copies and style * Fix copies and style * Revert unnecessary changes * Revert unnecessary changes * Revert unnecessary changes * Revert unnecessary changes	2023-03-01 07:47:17 -05:00
Eugene Zapolsky	f71873c5fc	[deepspeed] check whether model is NLP one instead of counting on input type (#21800 ) * trying to figure out whether model is NLP * drop my changes and apply easier fix * trying to handle all int input types * fix logic --------- Co-authored-by: Stas Bekman <stas@stason.org>	2023-03-01 07:41:35 -05:00
saswatmeher	72e9ca7519	Fix gradient checkpointing bug Bart (#21866 ) Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>	2023-03-01 11:41:58 +00:00
Andy Ehrenberg	5e6cd51bec	Flax beam search fix (#21857 )	2023-03-01 10:25:33 +00:00
Arthur	b599b19289	[ConvBert] Fix #21523 (#21849 ) * fix reshaping Fixes #21523 * add test * styling * last fixes * Update src/transformers/models/convbert/modeling_convbert.py * code quality	2023-03-01 11:11:04 +01:00
Arthur	44e3e3fb49	prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" (#20211 ) * rounding_mode = "floor" instead of // to prevent behavioral change * add other TODO * use `torch_int_div` from pytorch_utils * same for tests * fix copies * style * use relative imports when needed * Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-03-01 10:49:21 +01:00
Sylvain Gugger	b29e2dcaff	Fix flaky test for log level (#21776 ) * Fix flaky test for log level * Fix other flaky test	2023-02-28 16:24:14 -05:00
Matt	acfb714bdf	Improve TF weight loading, especially PT crossloading (#21792 ) * First commit for the improved PT-TF weight loading * Remove workarounds from TFEncoderDecoder tests * Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder * make fixup * First attempt at visionencoderdecoder * Disable tensorfloat32 in tests to get consistent outputs * Quick fix to tf_vision_encoder_decoder tests * make fixup * Update Blenderbot tests * Remove unused arg in modeling_tf_opt * load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False. * Support prefixes when loading sharded TF checkpoints * make fixup * Add test to load sharded models with a weight prefix * Fix sharded weight loading test * Add a test for transfer from a sharded checkpoint * make fixup * Add test to check that crossloading from PT with a prefix works * Refactor from_pretrained in the encoderdecoder classes * Refactor from_pretrained in the encoderdecoder classes * missmatched -> mismatched * Explicitly check for None * No comments showing my very impressive and attractive knowledge of Py3.9+ * Disable TF32 across all TF tests	2023-02-28 18:41:34 +00:00
Yih-Dar	871c31a6f1	🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516 ) * Add PipelineTesterMixin * remove class PipelineTestCaseMeta * move validate_test_components * Add for ViT * Add to SPECIAL_MODULE_TO_TEST_MAP * style and quality * Add feature-extraction * update * raise instead of skip * add tiny_model_summary.json * more explicit * skip tasks not in mapping * add availability check * Add Copyright * A way to disable irrelevant tests * update with main * remove disable_irrelevant_tests * skip tests * better skip message * better skip message * Add all pipeline task tests * revert * Import PipelineTesterMixin * subclass test classes with PipelineTesterMixin * Add pipeline_model_mapping * Fix import after adding pipeline_model_mapping * Fix style and quality after adding pipeline_model_mapping * Fix one more import after adding pipeline_model_mapping * Fix style and quality after adding pipeline_model_mapping * Fix test issues * Fix import requirements * Fix mapping for MobileViTModelTest * Update * Better skip message * pipeline_model_mapping could not be None * Remove some PipelineTesterMixin * Fix typo * revert tests_fetcher.py * update * rename * revert * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests * style and quality * test fetcher for all pipeline/model tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 19:40:57 +01:00
Anahita Bhiwandiwalla	4cb5ffa93d	Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval (#21684 ) * Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval * minor fix return_dict * implement test for loss computation --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>	2023-02-28 12:21:48 -05:00
Younes Belkada	7f4f8b97d0	[`Blip2`] Fix Blip-2 multi gpu (#21707 ) * fix blip multi gpu * fix * final changes * adapt suggestions * fix failing slow test * forward contrib credits from testing and suggestions * reformat --------- Co-authored-by: akkikiki <akkikiki@users.noreply.github.com>	2023-02-28 17:28:58 +01:00
Yih-Dar	aab895c396	Make Slack CI reporting stronger (#21823 ) * Use token * Avoid failure * better error * Fix * fix style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 17:12:44 +01:00
Maria Khalusova	6ca844582c	Add: task guide for zero shot object detection (#21829 ) * zero shot object detection part 1 * added batch prediction section * added image guided object detection section * make style * added the task guide to the TOC * minor polishing * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * added embedded owlvit demo * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor fix * make style --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 10:23:08 -05:00
Herumb Shandilya	31fa2b6c68	[GPTJ] Fix gradient checkpointing bug (#21794 ) * If applied, this commit fixes generate bug in gptj * Remove extra same code block * formatting and test fix * Conflict fix and declaration error fix --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 10:12:42 -05:00
raghavanone	eec76042f4	Fix the issue of blip model returning loss even when the label is not provided. (#21811 ) * Fix the issue of blip model returning loss even when the label is not provided * Fix ruff failure * Incorporate PR feedbacks * Incorporate PR feedbacks * Incorporate PR feedbacks * Incorporate PR feedbacks	2023-02-28 09:54:08 -05:00
Younes Belkada	b8de7e448e	[`Blip2`] Add `Blip2Model` (#21817 ) * add v1 * add `Blip2Model` - add relevant functions - add tests - add on automapping * fix docs * fix doctest	2023-02-28 15:42:55 +01:00
Younes Belkada	ae9230af40	[`T5`] Fix torchquant issue (#21843 ) * fix torchquant issue * add tests	2023-02-28 15:09:44 +01:00
anruijian	2d506ea4c4	Fix tf random token masking probability in data collator (#21834 ) * fix tf random mask tokens probability * fix tf random mask tokens probability in collator for language modelling	2023-02-28 07:55:47 -05:00
Karim Foda	4fe744f528	Fix gradient checkpointing imagegpt (#21816 ) * Fix gradient checkpointing bug in gptneox * Fix gradient checkpointing bug in modeling_imagegpt.py * Revert gpt neox changes --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 07:47:04 -05:00
Karim Foda	e07a3d95f8	Fix gradient checkpointing bug in git (#21818 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-28 07:46:33 -05:00
Andy Ehrenberg	50db741417	check for None forced tokens (#21793 )	2023-02-28 13:24:43 +01:00
saswatmeher	50644cf624	Fix gradient checkpointing bug BioGpt (#21844 ) Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>	2023-02-28 11:56:25 +00:00
Yih-Dar	a9dd124346	Rename `MobileViTModelTest` to `TFMobileViTModelTest` (#21825 ) Let's give TF a bit more love ❤️ 🙏 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 08:10:29 +01:00
Stas Bekman	c7f3abc257	introduce `logger.warning_once` and use it for grad checkpointing code (#21804 ) * logger.warning_once * style	2023-02-27 13:25:06 -08:00

12196 Commits