* Gemma: only display act. warning when necessary
This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.
Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".
* Change message, and set `config.hidden_activation`
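The gating this PR asks for — warn only when the legacy field is actually used, and carry the value over — can be sketched as follows. This is an illustrative helper, not the actual Gemma configuration code; the function name and `warn` hook are assumptions:

```python
def resolve_gemma_activation(hidden_act=None, hidden_activation=None, warn=print):
    """Sketch of the warning gate: warn only when the legacy `hidden_act`
    field is set while `hidden_activation` is not, then apply the default."""
    if hidden_activation is None:
        if hidden_act is not None:
            warn(
                "`hidden_act` is ignored, please use `hidden_activation` instead. "
                "Setting Gemma's activation function to `gelu_pytorch_tanh`."
            )
        hidden_activation = "gelu_pytorch_tanh"
    return hidden_activation
```

With this gate, a user who already set `hidden_activation` never sees the warning, which is the behavior the PR description asks for.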
* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates
* make fixup
* Move the default chat template warning into apply_chat_template itself
* make fixup
* move scaling to nn.Module
* let the test be here for now (need to fix)
* failing tests
* last failing models
* Revert commit 4c14817f38
* clean-up
* oops forgot
* codestyle
* raise NotImplementedError when possible
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* skip tests in respective modeling files
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
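A minimal sketch of the refactor direction in this PR — a scaling factor owned by the module rather than recomputed inline, and a base class that raises `NotImplementedError` instead of silently doing nothing. Class and method names are illustrative stand-ins, not the actual transformers attention code:

```python
import math


class ScaledAttentionStub:
    """Toy stand-in for an attention module that owns its scaling factor."""

    def __init__(self, head_dim):
        # The scaling lives on the module (as after the "move scaling to
        # nn.Module" commit) instead of being recomputed in the forward pass.
        self.scaling = head_dim ** -0.5

    def scale_scores(self, raw_score):
        return raw_score * self.scaling

    def forward(self, *args, **kwargs):
        # Mirrors the "raise NotImplementedError when possible" commit:
        # refuse to run rather than silently return nothing.
        raise NotImplementedError("Subclasses must implement forward().")
```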
* fix seq2seq data collator to respect the given padding strategy
Further adds tests for the seq2seq data collator, in the style of the `data_collator_for_token_classification` tests (pt, tf, np)
* formatting and change bool equals "==" to "is"
* add missed return types in tests
* update numpy test as it can handle unequal shapes, unlike pt or tf
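The fix above boils down to forwarding the user's padding strategy instead of hard-coding one. A pure-Python sketch of that resolution logic for the label side (illustrative only; the real `DataCollatorForSeq2Seq` works on tokenizer outputs):

```python
def pad_labels(batch_labels, padding="longest", max_length=None, pad_id=-100):
    """Pad label sequences according to the requested padding strategy."""
    if padding in (True, "longest"):
        target = max(len(labels) for labels in batch_labels)
    elif padding == "max_length":
        if max_length is None:
            raise ValueError("padding='max_length' requires max_length")
        target = max_length
    elif padding in (False, "do_not_pad"):
        # Respect the caller's choice: leave sequences untouched.
        return [list(labels) for labels in batch_labels]
    else:
        raise ValueError(f"Unknown padding strategy: {padding!r}")
    return [list(labels) + [pad_id] * (target - len(labels)) for labels in batch_labels]
```

The point of the fix is the third branch: before it, a collator that always padded to the longest sequence ignored an explicit `do_not_pad`/`max_length` request.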
* feat: support for dinov2
* feat: support for depth_anything
* feat: support for efficientformer
* feat: support for bert (is this right?)
* update: embedding split
* remove: empty string
* feat: support for align
* fix: copies
* fix: QAQBertEmbeddings
* fix: more consistency issues
* revert: support for efficientformer
* feat: support for altclip
* feat: support for blip_text
* support for ChineseCLIP
* feat: support for depth anything
* feat: support for dpt
* feat: support for dpt
* feat: support for git
* feat: support for groupvit
* update: format
* fix: support for clip
* fix: consistency
* feat: support for pvt
* feat: support for vit_msn
* fix: consistency
* fix: other copies
* remove: device transfer
* revert: in-place add
* update: support for align
* update: support for bert
* update: support for Chinese CLIP
* revert: changes to efficientformer
* update: support for dpt
* update: support for efficientformer
* revert: changes to git
* revert: changes to groupvit
* revert: changes to roc_bert
* update: support for vit_msn
* revert: changes to dpt
* remove: extra space
* style: extra space
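These commits roll the same change across many vision models; the recurring "embedding split" step separates prefix-token embeddings (e.g. `[CLS]`) from the patch embeddings, since only the patch part should be interpolated to a new grid size. A simplified, list-based sketch of that split and rejoin (function names are illustrative, not the models' actual code):

```python
def split_pos_embed(pos_embed, num_prefix_tokens=1):
    """Separate prefix-token embeddings from patch position embeddings.

    Interpolation to a new grid size is applied only to the patch part;
    the prefix embeddings are reattached unchanged afterwards.
    """
    prefix = pos_embed[:num_prefix_tokens]
    patches = pos_embed[num_prefix_tokens:]
    return prefix, patches


def join_pos_embed(prefix, patches):
    """Reattach the untouched prefix embeddings after interpolation."""
    return prefix + patches
```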
* Reenable SDPA's FA2 during training with torch.compile
* fix Olmo's SDPA FA2 dispatching too
* update formatting
* improved SDPA comment
* formatting and explanatory comment
* is_causal if statement to one-liner
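The "is_causal one-liner" collapses the dispatch condition into a single expression. A sketch of that condition as a pure function (the real models compute it from the module's causal flag, the attention mask, and the query length):

```python
def sdpa_is_causal(module_is_causal, attention_mask, q_len):
    """Decide whether SDPA may take its fused causal path.

    The fused path is only valid when no explicit mask is supplied and the
    query spans more than one position (decoding a single new token needs
    no causal masking).
    """
    return module_is_causal and attention_mask is None and q_len > 1
```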
* Enable instantiating model with pretrained backbone weights
* Clarify pretrained import
* Use load_backbone instead
* Add backbone_kwargs to config
* Fix up
* Add tests
* Tidy up
* Enable instantiating model with pretrained backbone weights
* Update tests so backbone checkpoint isn't passed in
* Clarify pretrained import
* Update configs - docs and validation check
* Update src/transformers/utils/backbone_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Clarify exception message
* Update config init in tests
* Add test for when use_timm_backbone=True
* Use load_backbone instead
* Add use_timm_backbone to the model configs
* Add backbone_kwargs to config
* Pass kwargs to constructors
* Draft
* Fix tests
* Add back timm - weight naming
* More tidying up
* Whoops
* Tidy up
* Handle when kwargs are none
* Update tests
* Revert test changes
* Deformable detr test - don't use default
* Don't mutate; correct model attributes
* Add some clarifying comments
* nit - grammar is hard
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
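The "docs and validation check" commits above guard against inconsistent backbone settings. A rough sketch of that validation (the rules and function name here are illustrative assumptions; the real logic lives in `transformers.utils.backbone_utils`):

```python
def check_backbone_args(
    use_timm_backbone=False,
    use_pretrained_backbone=False,
    backbone=None,
    backbone_config=None,
    backbone_kwargs=None,
):
    """Reject mutually inconsistent backbone configuration (illustrative rules)."""
    if backbone is not None and backbone_config is not None:
        raise ValueError("Specify either `backbone` or `backbone_config`, not both.")
    if use_timm_backbone and backbone is None:
        raise ValueError("`use_timm_backbone=True` requires a `backbone` name.")
    if backbone_kwargs is not None and backbone_config is not None:
        raise ValueError("`backbone_kwargs` cannot be combined with `backbone_config`.")
```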
* Adding SDPA support for BERT
* Using the proper input name for testing model input in inference()
* Adding documentation for SDPA in BERT model page
* Use the stable link for the documentation
* Adding a gate to only call .contiguous() for torch < 2.2.0
* Additions and fixes to the documentation
* Minor updates to documentation
* Adding extra requirements needed for the contiguous() bug
* Adding "Adapted from" in place of the "Copied from"
* Add benchmark speedup tables to the documentation
* Minor fixes to the documentation
* Use ClapText as a replacement for Bert in the Copied-From
* Some more fixes for the fix-copies references
* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
[test all]
* Undo changes to separate test
* Refactored SDPA self attention code for KV projections
* Change use_sdpa to attn_implementation
* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
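The `.contiguous()` gate mentioned above works around an SDPA issue on torch < 2.2.0 with non-contiguous inputs. A sketch of the version check, using plain string parsing for illustration (the real code compares against `torch.__version__`):

```python
def requires_contiguous_inputs(torch_version):
    """Return True when tensors must be made contiguous before calling SDPA
    (i.e. the installed torch predates the 2.2.0 fix)."""
    # Drop any local build suffix like "+cu121" before comparing.
    parts = tuple(int(p) for p in torch_version.split("+")[0].split(".")[:2])
    return parts < (2, 2)
```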
* Draft tutorial for talking to chat models
* Reformat lists and text snippets
* Cleanups and clarifications
* Finish up remaining TODOs
* Correct section link
* Small fix
* Add proper quantization examples
* Add proper quantization examples
* Add proper quantization examples
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/conversations.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix Text Generation Pipeline link and add a ref to the LLM inference guide
* intelligent -> capable
* Small intro cleanup
* Small text cleanup
* Small text cleanup
* Clarification about system message
* Clarification about system message
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
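The quantization examples added to this tutorial trade weight precision for memory. The back-of-envelope arithmetic behind them is just parameters times bytes per parameter; a hypothetical helper (not part of the tutorial's actual code):

```python
def weight_memory_gb(num_params_billions, bits_per_param):
    """Rough weight-memory footprint in GB: parameters x bytes per parameter."""
    return num_params_billions * bits_per_param / 8
```

For example, an 8B-parameter model needs roughly 16 GB of weight memory in bf16 (16-bit) but only about 4 GB at 4-bit, which is why the quantized examples fit on much smaller GPUs.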
* Introduce saveable callbacks
* Add note
* Test for non-present and flag
* Support early stopping and refusing to train further
* Update docstring
* More saving
* Import oopsie
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Make it go through TrainerArguments
* Document
* Fix test
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Rework to allow for duplicates
* Clean
* Fix failing tests
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
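A hedged sketch of what a "saveable" callback looks like: it exposes its mutable state so the Trainer can persist and restore it across checkpoint resumes, which is what lets early stopping survive a restart. Class and method names here are illustrative, modeled on the early-stopping pattern rather than copied from the library:

```python
class EarlyStoppingStub:
    """Toy early-stopping callback whose state round-trips through a dict."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_metric = None
        self.counter = 0

    def state(self):
        # What a Trainer would persist alongside its own checkpoint.
        return {"best_metric": self.best_metric, "counter": self.counter}

    def load_state(self, state):
        # Restore after resuming from a checkpoint.
        self.best_metric = state["best_metric"]
        self.counter = state["counter"]

    def on_evaluate(self, metric):
        """Track the best metric seen; return True when training should stop."""
        if self.best_metric is None or metric > self.best_metric:
            self.best_metric = metric
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience
```

Without the round-trip through `state()`/`load_state()`, a resumed run would forget how many evaluations had already gone without improvement and could "refuse to train further" incorrectly, or not at all.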