transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-19 12:38:23 +06:00

Author	SHA1	Message	Date
Arthur Zucker	9643069465	v4.47.0.dev0	2024-10-24 11:23:29 +02:00
Yih-Dar	f0e640adfa	Drop support for Python 3.8 (#34314 ) * drop python 3.8 * update docker files --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-24 11:16:55 +02:00
Arthur	05863817d6	Better defaults (#34026 ) * be nice to our usres * nit * fixup * default to -1 * oups * turbo nit * auto infer framework	2024-10-24 11:11:55 +02:00
Abhishek Maurya	65753d6065	Remove graph breaks for torch.compile() in flash_attention_forward when Lllama Model is padding free tuned (#33932 ) * fix: fixes for graph breaks Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix: formatting Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix: import error Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix: Add Fa2Kwargs Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix: PR Changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * Revert "PR changes" This reverts commit `39d2868e5c`. * PR changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix: FlashAttentionKwarg Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix: FlashAttentionKwarg Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR Changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR Changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR Changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR Changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * PR Changes Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * addition of documentation Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * change in _flash_attention_forward Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * make fix-copies Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * revert make fix-copies Signed-off-by: Abhishek <maurya.abhishek@ibm.com> * fix copies * style * loss kwargs typing * style and pull latest changes --------- Signed-off-by: Abhishek <maurya.abhishek@ibm.com> Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-10-24 11:02:54 +02:00
Joao Gante	b0f0c61899	Add SynthID (watermerking by Google DeepMind) (#34350 ) * Add SynthIDTextWatermarkLogitsProcessor * esolving comments. * Resolving comments. * esolving commits, * Improving SynthIDWatermark tests. * switch to PT version * detector as pretrained model + style * update training + style * rebase * Update logits_process.py * Improving SynthIDWatermark tests. * Shift detector training to wikitext negatives and stabilize with lower learning rate. * Clean up. * in for 7B * cleanup * upport python 3.8. * README and final cleanup. * HF Hub upload and initiaze. * Update requirements for synthid_text. * Adding SynthIDTextWatermarkDetector. * Detector testing. * Documentation changes. * Copyrights fix. * Fix detector api. * ironing out errors * ironing out errors * training checks * make fixup and make fix-copies * docstrings and add to docs * copyright * BC * test docstrings * move import * protect type hints * top level imports * watermarking example * direct imports * tpr fpr meaning * process_kwargs * SynthIDTextWatermarkingConfig docstring * assert -> exception * example updates * no immutable dict (cant be serialized) * pack fn * einsum equivalent * import order * fix test on gpu * add detector example --------- Co-authored-by: Sumedh Ghaisas <sumedhg@google.com> Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com> Co-authored-by: raushan <raushan@huggingface.co>	2024-10-23 21:18:52 +01:00
Arthur	e50bf61dec	Fix red CI: benchmark script (#34351 ) * dont'trigger always * fux * oups * update * ?? * ? * aie	2024-10-23 18:33:52 +02:00
Yih-Dar	c42b3223db	skip `test_pipeline_depth_estimation` temporarily (#34316 ) skip Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-23 17:27:51 +02:00
Zach Mueller	d9f733625c	Enable Gradient Accumulation fix across all models + trainer fully in forward() (#34283 ) * Enable grad accum fix across all models + trainer fully in forward() * handle peft case * Account for DDP: need to run scale tests * Use accelerator state * Quality * Guard * Experiment w/ only fairseq fix * Fairseq only * Revert multiply_grads fix * Mult by grad accum to fully bring back solution * Style * Good to go now * Skip fx tests for now * Bookmark * Working now	2024-10-23 11:24:57 -04:00
Aymeric Roucher	1fb575fcf0	Support boolean tool args (#34208 ) Support boolean tool arguments	2024-10-23 16:48:21 +02:00
Filippos Ventirozos	343c8cb86f	Added Deberta model type support (#34308 ) * Added Deberta model type for 'add_prefix_space' functionality * housekeeping --------- Co-authored-by: Filippos Ventirozos <filippos.ventirozos@autotrader.co.uk>	2024-10-23 11:15:36 +02:00
Steven Liu	5ba85de7a4	[docs] Fix Korean toctree (#34324 ) fix	2024-10-23 10:52:51 +02:00
Vijay	049682a5a6	Example doc for token classification of Llama and Dependent/Copied Models (#34139 ) * Added Example Doc for token classification on all tokenClassificationModels copied from llama * Refactor code to add code sample docstrings for Gemma and Gemma2 models (including modular Gemma) * Refactor code to update model checkpoint names for Qwen2 models	2024-10-22 10:26:16 -07:00
wony617	644d5287b2	🌐 [i18n-KO] Translated `model_doc/bartpho.md` to Korean (#33981 ) * docs: ko: model_doc/bartpho.md * feat: nmt draft * Update docs/source/ko/model_doc/bartpho.md * Update docs/source/ko/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-10-22 09:46:52 -07:00
Ahnjj_DEV	b03dc0a87e	🌐 [i18n-KO] Translated `bert japanese.md` to Korean (#33890 ) * docs: ko: bert-japanese.md * Update _toctree.yml * fix: manual edits * Update docs/source/ko/_toctree.yml Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/_toctree.yml Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> --------- Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-10-22 09:46:31 -07:00
Ahnjj_DEV	4b14aa1bcd	🌐 [i18n-KO] Translated `executorch.md` to Korean (#33888 ) * docs: ko: executorch.md * Update _toctree.yml * fix: manual edits * Update docs/source/ko/main_classes/executorch.md Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com> * Update docs/source/ko/_toctree.yml Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> * Update docs/source/ko/_toctree.yml * Update docs/source/ko/_toctree.yml * Update docs/source/ko/_toctree.yml --------- Co-authored-by: HyeokJun SHIN <96534680+jun048098@users.noreply.github.com> Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-10-22 09:46:20 -07:00
Fanli Lin	688eeac81e	[docs] fix typo (#34235 ) fix typo	2024-10-22 09:46:07 -07:00
Mansu Kim	a65a6ce7fe	fix error in _get_eval_sampler when group_by_length enabled (#34237 ) * remove self in _get_eval_sampler * remove self in front of _get_eval_sampler	2024-10-22 18:02:42 +02:00
Yoni Gozlan	e7c3fa7f57	Fix continue_final_message for image-text-to-text chat templates (#34236 ) * fix continue_final_message for vlms * Add one test for vlms continue_final_message chat template	2024-10-22 11:57:44 -04:00
Chinedum Echeta	96f67c068b	Feature: Add `MLFLOW_MAX_LOG_PARAMS` to `MLflowCallback` (#34279 )	2024-10-22 16:34:17 +02:00
Michael Kamerath	eef6b0ba42	Add option for running ffmpeg_microphone_live as a background process (#32838 ) * Add option for running ffmpeg_microphone_live as a background process * Code quality checks for audio_utils * Code clean up for audio_utils * Fixing logic in ffmpeg_microphone calls in audio_utils * Allowing any arbitrary arguments to be passed to ffmpeg_microphone_live * Formatting * Fixing last problems with adding ffmpeg_additional_args * Fixing default arguments and formatting issues * Fixing comments for ffmpeg_additional_args * Adding two shorts tests for ffmpeg_microphone_live * Fixing test bug	2024-10-22 15:56:41 +02:00
Guang Yang	c14ccbcd64	Olmo is ExecuTorch Compatible (#34181 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-22 15:53:01 +02:00
Guang Yang	7a08a772cc	Qwen2.5 is ExecuTorch Compatible (#34102 ) Qwen2 is ExecuTorch Compatible Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-22 15:52:23 +02:00
Alexandros Benetatos	c31a6ff474	Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550 ) * add colorize_depth and matplotlib availability check * add post_process_depth_estimation for zoedepth + tests * add post_process_depth_estimation for DPT + tests * add post_process_depth_estimation in DepthEstimationPipeline & special case for zoedepth * run `make fixup` * fix import related error on tests * fix more import related errors on test * forgot some `torch` calls in declerations * remove `torch` call in zoedepth tests that caused error * updated docs for depth estimation * small fix for `colorize` input/output types * remove `colorize_depth`, fix various names, remove matplotlib dependency * fix formatting * run fixup * different images for test * update examples in `forward` functions * fixed broken links * fix output types for docs * possible format fix inside `<Tip>` * Readability related updates Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Readability related update * cleanup after merge * refactor `post_process_depth_estimation` to return dict; simplify ZoeDepth's `post_process_depth_estimation` * rewrite dict merging to support python 3.8 --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-10-22 15:50:54 +02:00
pbelcak	104599d7a8	Fix: tensor of examples of the same length triggers invalid stacking (#34166 ) * Fix issue where tensor of examples of the same length triggers invalid stacking * Update data_collator.py	2024-10-22 15:49:21 +02:00
Cyril Vallez	51e395d13e	Fix FA2 attention for models supporting sliding window (#34093 ) Fix FA2	2024-10-22 15:37:21 +02:00
HALLOUARD	eb6a734995	[RT-DETR] Fix onnx inference bug for Optype (Where) (#33877 ) * feat: [RT-DETR] Add onnx runtime config and fix onnx inference bug Optype (Where) * fix lint * use dtype istead of torch.float32 * add doc * remove onnx config * use dtype info * use tensor to fix lint	2024-10-22 15:14:07 +02:00
Marc Sun	84b17e03f1	Update PR templates (#34065 ) update PR template	2024-10-22 15:11:54 +02:00
Matt	681fc43713	Sync video classification pipeline with huggingface_hub spec (#34288 ) * Sync video classification pipeline * Add disclaimer	2024-10-22 13:33:49 +01:00
regisss	93352e81f5	Fix Korean doc _toctree.yml (#34293 ) Fix korean doc _toctree.yml	2024-10-22 11:05:56 +02:00
Steven Liu	b644178ed4	[docs] Fix GenerationConfig params (#34299 ) fix generationconfigs	2024-10-22 11:03:25 +02:00
Raushan Turganbay	73d65e637b	T5 compile compatibilty (#34089 ) * this worked in normal generation, needs more tests * fix almost all tests in t5 * nit * longt5, umt5, mt5 * style * udop, pix2struct * more models * fix some tests * fix onnx tests * tracing tests fixed * compile enabled and tested for t5 models * fix small bug in slow tests * [run-slow] t5 * uncomment * style * update with new generation refactoring * nit * fix copies * this is the fix, had to change t5 to fix copies * update * [run-slow] t5 * [run-slow] t5 * update * add test for encoder only T5 * clean up after rebase * fix pop2piano * add comment * style * fix copies after rebase * fix copies missed this one	2024-10-22 08:23:53 +02:00
Raushan Turganbay	5077bc034f	VLM: add more modularity (#34175 ) * update * fix tests + fix copies * fix tests once more	2024-10-22 07:56:35 +02:00
Raushan Turganbay	21d5025826	Attn implementation for composite models (#32238 ) * first try * codestyle * idefics2 is happy * [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma * fix-copies * [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo * blip-2 needs to init vision from config * when was this removed O_o * minor fix * tests * this way? * tests * model-agnostic code * codestyle * add tests for idefics * modify general test for VLMs * no generation test for vlm yet! * no generation test here also * wanr in VIT-SDPA if output attn * add more tests * user can pass dict as attn impl * repo consistency * update * muicgen * no prints * forgot speech enc-dec and clip * how many composite models we have? * musicgen meelody is same as mudicgen * +siglip * fix tests + add some more * remove idefics custom overriden code * make idefics2 automappable * nits * skip tests * doctests * Update src/transformers/models/idefics2/configuration_idefics2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/clip/test_modeling_clip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/idefics2/test_modeling_idefics2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/idefics2/test_modeling_idefics2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/configuration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * major update, no need for automap * clean up * add FA2 test * more tests * style * skip tests * why did these started failing now? * no attributes for FA2 needed * one tiny test * address comment about FA2 false warning * style * add new models and resolve conflicts * fix copies * let it be this way for now, come back tomorrow to review * some more fixes * update * more updates * update * fix copies * style and tests * another big update * fix tests * fix tests * update * another update * fix tests * fix copies * fix tests --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-10-22 06:54:44 +02:00
Andrés Marafioti	32590b5ecb	Fix method name which changes in tutorial (#34252 ) The method `model_download_tool` was called `model_download_counter` earlier in the tutorial, this raises an error when following the code.	2024-10-21 14:21:52 -03:00
Matt	f701b98e4a	Add a doc section on writing generation prompts (#34248 ) Add a section on writing generation prompts	2024-10-21 14:35:57 +01:00
Yoni Gozlan	a4122813d1	Add DetrImageProcessorFast (#34063 ) * add fully functionning image_processing_detr_fast * Create tensors on the correct device * fix copies * fix doc * add tests equivalence cpu gpu * fix doc en * add relative imports and copied from * Fix copies and nit	2024-10-21 09:05:05 -04:00
Yoni Gozlan	24bdc94da5	Change Paligemma import logging to work with modular (#34211 ) * change import logging * fix CI	2024-10-21 08:55:27 -04:00
Raushan Turganbay	ca541bd4f4	Generation tests: don't rely on main input name (#34228 ) * don't rely on main input name * update	2024-10-21 10:00:14 +02:00
Matthew Hoffman	816f442496	Only cast logits to float when computing loss (#34147 ) * Only cast logits to float when computing loss Some misses from #31292 and #33902 * Move logits.float() into existing if labels is not None branch	2024-10-18 18:15:26 +02:00
Matt	e46e3bc173	Fix UDOP dtype issue (#34180 ) * Trigger UDOP tests * Try forcing dtype in LayoutLMV3 * Do checks to see where uint8 is getting in * Do checks to see where uint8 is getting in * Found it! * Add .astype(np.float32) * Remove forced check, make fixup * Checking where exactly the uint8 creeps in * More checking on the uint8 issues * Manually upcast in rescale() * Remove UDOP trigger	2024-10-18 16:54:58 +01:00
Cyril Vallez	6604764007	add Glm (#33823 ) * Create modular_glm.py * Update modular_glm.py * Finalize architecture without all attentions * Add all attentions modules * Finalize modular * Update given last version * Last update * Finalize model * Finalize converter * Update convert_glm_weights_to_hf.py * style * style * Create __init__.py * Aff all inits * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Correct the rotary embeddings * Remove apply_residual_connection_post_layernorm (always false) * remove use_rms_norm (always true) * remove past_layer_norm (always true) * Update __init__.py * Update config and license * start adding tests and doc * Add doc + style * Update test_modeling_glm.py * Add dummies * Apply correct modeling * Refactor attention to follow llama * Update __init__.py * Update convert_glm_weights_to_hf.py * Correct bias * remove linear_bias and pdrop (never used) * apply modular * Simplify converter * remove dummies + style * add model_input_names * Add pretraining_tp to config for when eager attention is used * Update modular to remove all pretraining_tp * Update test_modeling_glm.py * Update the __all__ * Update __all__ * Update __init__.py * Update test_modeling_glm.py * add revisions * Add the correct repos and revisions * style * Update __init__.py * update exports * remove import of modular files * style * Apply Llama changes + refine converter * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * Update convert_glm_weights_to_hf.py * style * Use new modular converter * add pretrainedmodel to init * style * Update test_modeling_glm.py * Move config outside modular to please CI about docstrings * Add dummies to please CI * Update glm.md * Update glm.md	2024-10-18 17:41:12 +02:00
Lysandre Debut	e95ea479ee	Informative 2 (#34154 ) * Informative * style * Informative 2 * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-10-18 14:12:15 +02:00
byi8220	0437d6cd03	Fix broken test decorator `require_torch_up_to_2_accelerators` (#34201 ) * fix broken require_torch_up_to_2_accelerators * make style	2024-10-18 13:54:55 +02:00
Raushan Turganbay	5a5b590d06	BLIP: fix input expansion logic (#34225 ) fix	2024-10-18 12:17:30 +02:00
Arthur	b54109c746	Fix-red-ci (#34230 ) * fix copies, skip fx for llama * styke * re-fix copies * last? * style	2024-10-17 23:38:35 +02:00
Zach Mueller	6ba31a8a94	Enable users to use their own loss functions + deal with prefetching for grad accum (#34198 ) * bookmark * Bookmark * Bookmark * Actually implement * Pass in kwarg explicitly * Adjust for if we do or don't have labels * Bookmark fix for od * bookmark * Fin * closer * Negate accelerate grad accum div * Fixup not training long enough * Add in compute_loss to take full model output * Document * compute_loss -> compute_loss_fn * Add a test * Refactor * Refactor * Uncomment tests * Update tests/trainer/test_trainer.py Co-authored-by: Daniel Han <danielhanchen@gmail.com> --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>	2024-10-17 17:01:56 -04:00
Pedro Cuenca	7a06d07e14	Support Llama 3.2 conversion (text models) (#33778 ) * Support Llama 3.2 conversion (text models) Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Fix rope factor * Update chat template Initialize from a well-known template. The guidance is that the changes should be applied to 3.1 models as well. * Remove import * Support Llama Guard 3 conversion * Tokenizer details * Fix eos added token in base models * Fix generation config for base models * Specify revision for known tokenizers * Style * Reuse chat templates for older models * Improve error when converting tokenizer < Llama 3 --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>	2024-10-17 22:37:37 +02:00
Arthur	c1c7e89620	Fix Gradient Accumulation issue (#34191 ) * quick fix * 3 losses * oups * fix * nits * check how it scales for special models * propagate for conditiona detr * propagate * propagate * propagate * fixes * propagate changes * update * fixup * nits * f string * fixes * more fixes * ? * nit * arg annoying f string * nits * grumble * update * nit * refactor * fix fetch tests * nit * nit * Update src/transformers/loss/loss_utils.py Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> * update * nit * fixup * make pass * nits * port code to more models * fixup * ntis * arf * update * update * nits * update * fix * update * nits * fine * agjkfslga.jsdlkgjklas * nits * fix fx? * update * update * styel * fix imports * update * update * fixup to fix the torch fx? --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2024-10-17 22:34:40 +02:00
Joao Gante	f51ac9e059	Generate: visit non-llm `prepare_inputs_for_generation` (#34199 ) * tmp * all visited * test all * Update src/transformers/models/moshi/modeling_moshi.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * delete another one :D --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-17 16:53:48 +01:00
David Chanin	1d2c29f0b3	Fix bus error when using GPT2 on M1 macs (#34031 ) There's a bug on M1 macs with transformer >= 4.43.0 and torch >= 2.1.0, where if a model has tied embeddings, then the fast loading from #31771 causes a bus error when the model is actually run. This can be solved by disabling `_supports_param_buffer_assignment` for these models. More info in comments in #33357	2024-10-17 17:39:04 +02:00

19383 Commits