transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-10 08:10:05 +06:00

Author	SHA1	Message	Date
Yih-Dar	d788d37d24	Fix daily CI image build (#27307 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-11-06 11:27:22 +01:00
Mayank Mishra	b026b5ca6d	Fix tokenizer export for LLamaTokenizerFast (#27222 ) * fix tokenizer * fix tokenizer	2023-11-06 10:26:18 +01:00
jiaqiw09	cc3e478185	translate run_scripts.md to chinese (#27246 ) * translate run_scripts.md to chinese * translate run_scripts.md to chinese * translate run_scripts.md to chinese	2023-11-03 10:19:41 -07:00
jiaqiw09	bf7cfac20a	translate autoclass_tutorial to chinese (#27269 ) * translate autoclass_tutorial.md to chinese * translate update	2023-11-03 09:16:55 -07:00
Susnato Dhar	1ac2463dfe	[`FA2`] Add flash attention for for `DistilBert` (#26489 ) * flash attention added for DistilBert * fixes * removed padding_masks * Update modeling_distilbert.py * Update test_modeling_distilbert.py * style fix	2023-11-03 16:07:54 +00:00
Maria Khalusova	5964f820db	[Docs] Model_doc structure/clarity improvements (#26876 ) * first batch of structure improvements for model_docs * second batch of structure improvements for model_docs * more structure improvements for model_docs * more structure improvements for model_docs * structure improvements for cv model_docs * more structural refactoring * addressed feedback about image processors	2023-11-03 10:57:03 -04:00
Younes Belkada	ad8ff96224	[`Docs` / `SAM` ] Reflect correct changes to run inference without OOM (#27268 ) Update sam.md	2023-11-03 15:23:13 +01:00
Shiyu Li	f13f544ad9	Fix switch transformer mixed precision issue (#27220 ) * Fix mixed precision error for switch transformer * Fixup	2023-11-03 14:00:33 +00:00
Matt	db69bd88fb	Update the ConversationalPipeline docstring for chat templates (#27250 ) * Update the ConversationalPipeline docstring now that we're using chat templates * Direct access to conversation.messages * Explain the string init	2023-11-03 13:17:46 +00:00
Maria Khalusova	011b15c1c7	[docs] Custom model doc update (#27213 ) doc update	2023-11-03 08:03:13 -04:00
Yih-Dar	af8d1dc309	Avoid many failing tests in doctesting (#27262 ) * fix * update * update * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-11-03 12:47:07 +01:00
Younes Belkada	8f1a43cd91	[`PEFT` / `Tests` ] Fix peft integration failing tests (#27258 ) fix peft integration issues	2023-11-03 12:23:02 +01:00
Tom Aarsen	05ea7b79e6	Refactor: Use Llama RoPE implementation for Falcon (#26933 ) * Use Llama RoPE implementation for Falcon + Add copy functionalities * Use standard cache format for Falcon * Simplify apply_rotary_pos_emb, copy from Llama * Remove unnecessary cache conversion test We don't need to convert any caches anymore! * Resolve copy complaint	2023-11-03 11:05:55 +00:00
Lysandre Debut	e9a6c72b5e	Fuyu protection (#27248 )	2023-11-03 08:45:05 +01:00
Komal Kumar	552ff24488	Fixed base model class name extraction from PeftModels (#27162 ) * Fixed base model class name extraction from PeftModels * Changes to first unwrap the model then extract the base model name * Changed base_model to base_model.model to stay consistent with peft model abstractions	2023-11-02 20:08:03 +00:00
Chi	4991216841	Removed the redundant SiLUActivation class. (#27136 ) * Removed the redundant SiLUActivation class and now use nn.functional.silu directly. * I apologize for adding torch.functional.silu. I have replaced it with nn.SiLU.	2023-11-02 18:13:57 +00:00
jiaqiw09	00d8502b7a	translate peft.md to chinese (#27215 ) * tranlsate peft.md to chinese * translate peft.md to chinese * fix missing link	2023-11-02 10:42:29 -07:00
Lysandre	bc78fd1274	Dev version	2023-11-02 18:15:36 +01:00
Yoach Lacombe	0ed6729bb1	Enrich TTS pipeline parameters naming (#26473 ) * enrich TTS pipeline docstring for clearer forward_params use * change token leghts * update Pipeline parameters * correct docstring and make style * fix tests * make style * change music prompt Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * raise errors if generate_kwargs with forward-only models * make style --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-11-02 17:06:56 +00:00
Pietro Lesci	147e8ce4ae	Remove redundant code from T5 encoder mask creation (#27216 ) * remove redundant code * update * add typecasting * make `attention_mask` float again	2023-11-02 16:01:41 +00:00
Joao Gante	a6c82d4567	Generate: return `past_key_values` (#25086 )	2023-11-02 15:39:21 +00:00
Marc Sun	441c3e0dd2	fix-deprecated-exllama-arg (#27243 ) fix-exllama	2023-11-02 11:23:31 -04:00
Nicolas Patry	8801861d2d	Fixing m4t. (#27240 ) * Fixing m4t. * Trying to remove comparison ? Odd test failure. * Adding shared. But why on earth does it hang ???? * Putting back the model weights checks the test is silently failing on cuda. * Fix style + unremoved comment.	2023-11-02 15:32:17 +01:00
Lysandre Debut	443bf5e9e2	Fix safetensors failing tests (#27231 ) * Fix Kosmos2 * Fix ProphetNet * Fix MarianMT * Fix M4T * XLM ProphetNet * ProphetNet fix * XLM ProphetNet * Final M4T fixes * Tied weights keys * Revert M4T changes * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-02 15:03:09 +01:00
Michael Benayoun	4557a0dede	Wrap `_prepare_4d_causal_attention_mask` as a leaf function (#27236 ) Wrap _prepare_4d_causal_attention_mask as a leaf function	2023-11-02 12:03:30 +00:00
Pablo Montalvo	8a312956fd	Fuyu: improve image processing (#27007 ) * Fix Fuyu image scaling bug It could produce negative padding and hence inference errors for certain image sizes. * initial rework commit * add batching capabilities, refactor image processing * add functional batching for a list of images and texts * make args explicit * Fuyu processing update (#27133) * Add file headers * Add file headers * First pass - preprocess method with standard args * First pass image processor rework * Small tweaks * More args and docstrings * Tidying iterating over batch * Tidying up * Modify to have quick tests (for now) * Fix up * BatchFeature * Passing tests * Add tests for processor * Sense check when patchifying * Add some tests * FuyuBatchFeature * Post-process box coordinates * Update to `size` in processor * Remove unused and duplicate constants * Store unpadded dims after resize * Fix up * Return FuyuBatchFeature * Get unpadded sizes after resize * Update exception * Fix return * Convert input `<box>` coordinates to model format. * Post-process point coords, support multiple boxes/points in a single sequence * Replace constants * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Preprocess List[List[image]] * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update to Amy's latest state. * post-processing returns a list of tensors * Fix error when target_sizes is None Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Review comments * Update src/transformers/models/fuyu/image_processing_fuyu.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Fix up * Fix up --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com> * Fix conflicts in fuyu_follow_up_image_processing (#27228) fixing conflicts and updating on main * Revert "Fix conflicts in fuyu_follow_up_image_processing" (#27232) Revert "Fix conflicts in fuyu_follow_up_image_processing (#27228)" This reverts commit `acce10b6c6`. --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal>	2023-11-02 12:25:41 +01:00
Younes Belkada	9b25c164bd	[`core` / `Quantization`] Fix for 8bit serialization tests (#27234 ) * fix for 8bit serialization * added regression tests. * fixup	2023-11-02 12:03:51 +01:00
Hz, Ji	c52e429b1c	Reproducible checkpoint for npu (#27208 ) * save NPU's RNG states when saving a checkpoint and set after all the data skip phase when resuming training. * re-trigger ci * re-trigger ci	2023-11-02 10:27:13 +00:00
Roohollah Etemadi	7adaefe2bc	support bf16 (#25879 ) * added bf16 support * added cuda availability check * applied make style, quality	2023-11-02 11:05:20 +01:00
Patrick von Platen	af3de8d87c	[Whisper, Bart, MBart] Add Flash Attention 2 (#27203 ) * add whisper fa2 * correct * change all * correct * correct * fix more * fix more * fix more * fix more * fix more * fix more * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix more * fix more * fix more * fix more * fix more --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 21:03:01 +01:00
Zach Mueller	3520e37e86	Enable split_batches through TrainingArguments (#26798 ) * Enable split_batches through TrainingArguments * Extra dispatch_batches * Keep as default false * Add to docstring * Add to docstring * Remove the capturewarnings change * Comma	2023-11-01 14:42:38 -04:00
Lysandre Debut	95020f208e	Fix CPU offload + disk offload tests (#27204 ) Fix disk offload tests + weight sharing issues	2023-11-01 19:25:23 +01:00
Marc Sun	c9e72f55b2	Add exllamav2 better (#27111 ) * add_ xllamav2 arg * add test * style * add check * add doc * replace by use_exllama_v2 * fix tests * fix doc * style * better condition * fix logic * add deprecate msg * deprecate exllama * remove disable_exllama from the linter * remove * fix warning * Revert the commits deprecating exllama * deprecate disable_exllama for use_exllama * fix * fix loading attribute * better handling of args * remove disable_exllama from init and linter * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * better arg * fix warning * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * switch to dict * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * style * nits * style * better tests * style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 13:09:21 -04:00
jiaqiw09	239cd0eaa2	Translate task summary to chinese (#27180 ) * translate task_summary.md to chinese * update translation * update translation * fix _toctree.yml	2023-11-01 09:28:34 -07:00
Rafael Padilla	1e32b05e06	improving TimmBackbone to support FrozenBatchNorm2d (#27160 ) * supporting freeze_batch_norm_2d * supporting freeze_batch_norm_2d * including unfreeze + separate into methods * fix typo * calling unfreeze * lint * Update src/transformers/models/timm_backbone/modeling_timm_backbone.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Rafael Padilla <rafael.padilla@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 12:58:35 -03:00
Wesley L Passos	21a2fbaf48	Fix docstring in get_oneformer_resize_output_image_size func (#27207 )	2023-11-01 15:31:13 +00:00
Andi Powers Holmes	f8afb2b2ec	Add TensorFlow implementation of ConvNeXTv2 (#25558 ) * Add type annotations to TFConvNextDropPath * Use tf.debugging.assert_equal for TFConvNextEmbeddings shape check * Add TensorFlow implementation of ConvNeXTV2 * check_docstrings: add TFConvNextV2Model to exclusions TFConvNextV2Model and TFConvNextV2ForImageClassification have docstrings which are equivalent to their PyTorch cousins, but a parsing issue prevents them from passing the test. Adding exclusions for these two classes as discussed in #25558.	2023-11-01 15:09:55 +00:00
Patrick von Platen	391d14e810	[WhisperForCausalLM] Add WhisperForCausalLM for speculative decoding (#27195 ) * finish * add tests * fix all tests * [Assistant Decoding] Add test * fix more * better * finish * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * finish --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 16:01:53 +01:00
Alexander Kozlov	f9b4bea0a6	Added cache_block_outputs option to enable GPTQ for non-regular models (#27032 ) * Added cache_block_outputs option to enable GPTQ for non-regular models * Update src/transformers/utils/quantization_config.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Fixed style * Update src/transformers/utils/quantization_config.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 14:37:19 +00:00
Shashank Rajput	037fb7d0e1	added unsqueeze_dim to apply_rotary_pos_emb (#27117 ) * added unsqueeze_dim to apply_rotary_pos_emb * Added docstring * Modified docstring * Modified docstring * Modified docstring * Modified docstring * Modified docstring * ran make fix-copies and make fixup * Update src/transformers/models/llama/modeling_llama.py Accepting the proposed changes in formatting. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * incorporating PR suggestions * incorporating PR suggestions * incorporating PR suggestions * incorporating PR suggestions * .. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 14:16:57 +00:00
Wesley L Passos	f3c1a172bb	Fixing docstring in get_resize_output_image_size function (#27191 )	2023-11-01 12:42:41 +00:00
MD FAIZAN KHAN	636f704d0b	Fix the typos and grammar mistakes in CONTRIBUTING.md. (#27193 ) Fix the typos and grammar mistakes in CONTRIBUTING.md	2023-11-01 12:42:22 +00:00
Wesley L Passos	71025520bc	Fix docstring get maskformer resize output image size (#27196 ) * fix docstring in get_maskformer_resize_output_image_size * fix functions docstring * fix 'copied from' functions docstring * fix docstring * fix return type * fix docstring resize	2023-11-01 12:26:14 +00:00
Younes Belkada	ae093eef01	[`core` / `Quantization` ] AWQ integration (#27045 ) * working v1 * oops * Update src/transformers/modeling_utils.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fixup * oops * push * more changes * add docs * some fixes * fix copies * add v1 doc * added installation guide * relax constraints * revert * attempt llm-awq * oops * oops * fixup * raise error when incorrect cuda compute capability * nit * add instructions for llm-awq * fixup * fix copies * fixup and docs * change * few changes + add demo * add v1 tests * add autoawq in dockerfile * finalize * Update tests/quantization/autoawq/test_awq.py * fix test * fix * fix issue * Update src/transformers/integrations/awq.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add link to example script * Update docs/source/en/main_classes/quantization.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add more content * add more details * add link to quantization docs * camel case + change backend class name * change to string * fixup * raise errors if libs not installed * change to `bits` and `group_size` * nit * nit * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * disable training * address some comments and fix nits * fix * final nits and fix tests * adapt to our new runners * make fix-copies * Update src/transformers/utils/quantization_config.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * move to top * add conversion test * final nit * add more elaborated test --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-01 09:06:31 +01:00
Hz, Ji	82c7e87987	device agnostic fsdp testing (#27120 ) * make fsdp test cases device agnostic * make style	2023-11-01 07:17:06 +01:00
Yeyang	7d8ff3629b	🌐 [i18n-ZH] Translate tflite.md into Chinese (#27134 ) * docs(zh): translate tflite.md * docs(zh): add space around links * Update docs/source/zh/tflite.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-31 12:50:48 -07:00
Lysandre Debut	113ebf80ac	Safetensors serialization by default (#27064 ) * Safetensors serialization by default * First pass on the tests * Second pass on the tests * Third pass on the tests * Fix TF weight loading from TF-format safetensors * Specific encoder-decoder fixes for weight crossloading * Add VisionEncoderDecoder fixes for TF too * Change filename test for pt-to-tf * One missing fix for TFVisionEncoderDecoder * Fix the other crossload test * Support for flax + updated tests * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Sanchit's comments * Sanchit's comments 2 * Nico's comments * Fix tests * cleanup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 19:16:49 +01:00
Dong-geon Lee	25e6e9418c	Unify warning styles for better readability (#27184 )	2023-10-31 18:12:14 +00:00
Hz, Ji	50378cbf6c	device agnostic models testing (#27146 ) * device agnostic models testing * add decorator `require_torch_fp16` * make style * apply review suggestion * Oops, the fp16 decorator was misused	2023-10-31 18:12:14 +01:00
Steven Liu	77930f8a01	[docs] Update CPU/GPU inference docs (#26881 ) * first draft * remove non-existent paths * edits * feedback * feedback and optimum * Apply suggestions from code review Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com> * redirect to correct doc * _redirects.yml --------- Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>	2023-10-31 09:44:51 -07:00

15053 Commits