transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-27 00:09:00 +06:00

Author	SHA1	Message	Date
Aviv Shamsian	7f79a97399	fix prompt strip to support tensors and np arrays (#27818 ) * fix prompt strip to support tensors and np arrays * framework agnostic * change logic check before converting prompt into list Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adding _convert_to_list to tokenization_whisper_fast * adding tests for prompt decoding * adding comment Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adding comment Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * revert minor * make style formatting * style formatting after update * Update src/transformers/models/whisper/tokenization_whisper_fast.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixing _strip_prompt to handle _decode_with_timestamps * fix copies --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2024-07-12 20:07:10 +01:00
Joao Gante	d1a1bcf56a	Docker: TF pin on the consistency job (#31928 ) * pin * dev-ci * dev-ci * dev-ci * test pushed image	2024-07-12 14:28:46 +02:00
jiqing-feng	aec1ca3a58	[Bug Fix] fix qa pipeline tensor to numpy (#31585 ) * fix qa pipeline * fix tensor to numpy	2024-07-11 22:22:26 +01:00
Naman Garg	c1e139c2b0	Adding hiera (#30356 ) * initialized Structure * Updated variable names * Added Config class, basic HF setup, convert_to_hf * Fixed Convert function, added hiera to HF files, Initilized test files * better naming for x in forward pass * Moved utils to hiera * Change hiera -> hiera_model * Fixed integration into tranformers * Fix: Convert Checkpoint * added documentation for hiera * added documentation for hiera * added Docstings to models, Transformers based changes * make style and quality * make style and quality * Integration & Block tests running * Fixed bugs * initialized Structure * Updated variable names * Added Config class, basic HF setup, convert_to_hf * Fixed Convert function, added hiera to HF files, Initilized test files * better naming for x in forward pass * Moved utils to hiera * Change hiera -> hiera_model * Fixed integration into tranformers * Fix: Convert Checkpoint * added documentation for hiera * added documentation for hiera * added Docstings to models, Transformers based changes * make style and quality * make style and quality * Integration & Block tests running * Fixed bugs * Removed tim dependency * added HieraBlock * fixed: Model name * added tests for HieraModel, HieraBlock * fixed imports * fixed quality & copies * Fixes * Update docs/source/en/model_doc/hiera.md Fix name Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fixed formatting * Code quality & Import differences * quality and repo-consistency fix * fixed no torch error * Docstring fix * Docstring fix * doc string fix * fixed example usage * Resolved issues in modeling_hiera * Removed Hiera MAE * Added test and resolved bug * fixed doc string * First commit * Finished conversion script and model forward working * Resolved all issues * nits * Improving tests * Nits * More nits * Improving HieraForMaskedImageModeling * More improvements and nits * Fixed docstrings of outputs * More fixes * More imrpovments * Updated conversion script * Fixed docstrings * Improved tests * Fixed attentou outputs test * All tests green * Removed unnecessary file * contribution attribution * Resolved a few issues * Resolved Comments * Updated model repo id and fixed bugs * Removed loss print * Make tests green * Updated docstrings * Fix style * Fixed num_heads in config * Removed unnecessary video checkpoint related code in the conversion script * Fix style * Changed atol in conversion script * HieraConfig * Fix copies * Fixed typo * Resolved few issues * make * converted conv_nd -> nn.Module * Removed video complexities * Removed video complexities * fix style * Addressing comments * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix style * Fixed tests * Fixed typo * Fixed interpolate test * Made torch fx compatible * Made sure imageprocesor is correct * Addressed comments * Noise directly as torch * Remove unnecesary attr * Added return_dit * Update src/transformers/models/hiera/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated checkpoints * [run_slow] hiera * Fixed device mismatch * [run_slow] hiera * Fixed GPU tests * [run_slow] hiera --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com> Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-11 22:13:56 +01:00
Apoorv Khandelwal	574e68d554	Allow `Trainer.get_optimizer_cls_and_kwargs` to be overridden (#31875 ) * Change `Trainer.get_optimizer_cls_and_kwargs` to `self.` * Make `get_optimizer_cls_and_kwargs` an instance method * Fixing typo * Revert `get_optimizer_cls_and_kwargs` to staticmethod * restore newline to trainer.py eof	2024-07-11 22:13:06 +01:00
t11s	52585019a1	🚨 fix(SigLip): remove spurious exclusion of first vision output token (#30952 ) fix(SigLip): remove spurious exclusion of first vision output token in classifier	2024-07-11 19:40:57 +01:00
Joao Gante	6a05f68f51	Generate: fix `SlidingWindowCache.reset()` (#31917 ) fix sliding cache	2024-07-11 19:35:46 +01:00
Arthur	e314395277	Refactor flash attention implementation in transformers (#31446 ) * dumb commit * nit * update * something like this * unpack in modeling utils * safe import * oups * update * nits * diff convert gemma * update * start propagating * udpate other modeling code as well * update for sliding window models * nits * more init cleanups * styling * fixup * noice * pass fixup * typo typing_extension -> typing_extensions * torch.nn.functionnal -> torch.nn.functional * add to import structure * unpack * simplify a bit more for this first version * nut * update * update * nit * ease the import of `Unpack` * remove useless `use_sliding_window` * no qua please * protect import? * style * [run-slow] * [run slow] llama,gemma,mistral,mixtral * remove extra kwargs * fix llama * address review comments * apply diff_model_converter to modeling_gemma.py * remove cache_position 1 * remove cache_position 2 * some cleaning * refactor gemma2 as well * apply review comments * rename file to modeling_flash_attention_utils.py * siglip refactor * remove dead code * is the hub down? * still down? * fix siglip * fix gemma2 * fatal: Could not read from remote repository. * fix typo in softcap implem * flacky * Failed: Timeout >120.0s --------- Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>	2024-07-11 20:37:31 +08:00
fxmarty	ad4ef3a290	Fix fx tests with inputs_embeds (#31862 ) * fix tests * [test_all] check * address review comments	2024-07-11 20:14:03 +08:00
Omar Salman	1499a55008	Add warning message for beta and gamma parameters (#31654 ) * Add warning message for and parameters * Fix when the warning is raised * Formatting changes * Improve testing and remove duplicated warning from _fix_key	2024-07-11 13:01:47 +01:00
Sangbum Daniel Choi	23d6d0cc06	add gather_use_object arguments II (#31799 ) * add gather_use_object arguments * fix name and pass the CI test for Seq2SeqTrainer * make style * make it to functools * fix typo * add accelerate version: * adding warning * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * make style * Update src/transformers/training_args.py * check function move to initial part * add test for eval_use_gather_object * fix minor --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-07-11 12:23:02 +01:00
Sai-Suraj-27	2e48b3e872	fix: Fixed the `1st argument` name in classmethods (#31907 ) Fixed the first argument name in few classmethods.	2024-07-11 12:11:50 +01:00
Isotr0py	48c20700e1	Fix missing methods for Fuyu (#31880 ) * add missing methods for FuyuForCausalLM * fix a typo * format code * add missing tie_weights * format code	2024-07-11 11:01:46 +01:00
Arthur	f4ec7a286a	[`Gemma2`] Support FA2 softcapping (#31887 ) * Support softcapping * strictly greater than * update	2024-07-11 11:57:35 +02:00
Arthur	f67e0f7fb7	[`ConvertSlow`] make sure the order is preserved for addedtokens (#31902 ) * preserve the order * oups * oups * nit * trick * fix issues	2024-07-11 11:56:41 +02:00
Raushan Turganbay	14d3b3f0f0	Processor accepts any kwargs (#31889 ) * accept kwargs in processors * return unused kwargs * fix tests * typo * update the other way	2024-07-11 13:20:30 +05:00
turboderp	a695c18649	Fixes to alternating SWA layers in Gemma2 (#31775 ) * HybridCache: Flip order of alternating global-attn/sliding-attn layers * HybridCache: Read sliding_window argument from cache_kwargs * Gemma2Model: Flip order of alternating global-attn/sliding-attn layers * Code formatting	2024-07-11 10:03:46 +02:00
Raushan Turganbay	d625294d79	InstructBlipVideo: Update docstring (#31886 ) * update docs * one more change	2024-07-11 10:13:29 +05:00
haikuoxin	c54af4c77e	Add a condition for nested_detach (#31855 ) fix bug: https://github.com/huggingface/transformers/issues/31852	2024-07-10 21:37:22 +01:00
Yih-Dar	080e14b24c	Modify `warnings` in a `with` block to avoid flaky tests (#31893 ) * fix * [test_all] check before merge --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-10 17:56:12 +02:00
NielsRogge	ec03d97b27	[RT-DETR] Add resources (#31815 ) * Add resources * Address comments	2024-07-10 16:34:53 +01:00
Marc Sun	8df28bb308	Push sharded checkpoint to hub when `push_to_hub=True` in `TrainingArguments` (#31808 ) Save sharded checkpoint in Trainer	2024-07-10 15:14:20 +02:00
Sai-Suraj-27	da79b18087	fix: Removed `duplicate` field definitions in some classes (#31888 ) Removed duplicate field definitions in classes.	2024-07-10 13:46:31 +01:00
Yih-Dar	9d98706b3f	Fix failed tests in #31851 (#31879 ) * Revert "Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868)" This reverts commit `b45dd5de9c`. * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-10 14:25:24 +02:00
Noah Young	a0a3e2f469	Fix file type checks in data splits for contrastive training example script (#31720 ) fix data split file type checks	2024-07-10 10:17:03 +01:00
yukionfire	e9eeedaf3b	remove duplicate words in msg (#31876 )	2024-07-10 09:54:45 +01:00
Raushan Turganbay	97aa3e2905	Add conversion for interleave llava (#31858 ) * add conversion for interleave llava * remove debug lines * remove unused imports * Update src/transformers/models/llava/convert_llava_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * small changes + docs --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-10 12:12:21 +05:00
Yun Dai	ad35309a62	add warning when using gradient_checkpointing with FSDP full shard (#31578 ) * add warning when using with FSDP full shard * fix style * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add hybrid shard warn * fix style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-09 23:55:57 +01:00
dependabot[bot]	6176d8f5ee	Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/visual_bert (#31872 ) Bump certifi in /examples/research_projects/visual_bert Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4. - [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04) --- updated-dependencies: - dependency-name: certifi dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-09 22:20:39 +01:00
Yih-Dar	b45dd5de9c	Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868 ) Revert "Fix `_init_weights` for `ResNetPreTrainedModel` (#31851)" This reverts commit `4c8149d643`.	2024-07-09 23:00:56 +02:00
Mauricio Villegas	c5bc2d5fd5	Add return type annotation to PreTrainedModel.from_pretrained (#31869 ) Update modeling_utils.py Add return type annotation to PreTrainedModel.from_pretrained	2024-07-09 21:49:29 +01:00
dependabot[bot]	6e59b30841	Bump zipp from 3.7.0 to 3.19.1 in /examples/research_projects/decision_transformer (#31871 ) Bump zipp in /examples/research_projects/decision_transformer Bumps [zipp](https://github.com/jaraco/zipp) from 3.7.0 to 3.19.1. - [Release notes](https://github.com/jaraco/zipp/releases) - [Changelog](https://github.com/jaraco/zipp/blob/main/NEWS.rst) - [Commits](https://github.com/jaraco/zipp/compare/v3.7.0...v3.19.1) --- updated-dependencies: - dependency-name: zipp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-09 21:44:48 +01:00
Merve Noyan	e3a7d9bd47	Update depth estimation task guide (#31860 ) --------- Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-07-09 22:13:30 +03:00
Yih-Dar	4c8149d643	Fix `_init_weights` for `ResNetPreTrainedModel` (#31851 ) * init * test --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-09 20:09:08 +02:00
Yung-Sung Chuang	d094d8d9ec	Generate: Add new decoding strategy "DoLa" in `.generate()` (#29619 ) Co-authored-by: Joao Gante <joao@huggingface.co>	2024-07-09 17:37:38 +01:00
chenk	99c0e55335	docs: typo in tf qa example (#31864 ) Signed-off-by: chenk <hen.keinan@gmail.com>	2024-07-09 16:30:06 +01:00
Joao Gante	4c2538b863	Test loading generation config with safetensor weights (#31550 ) fix test	2024-07-09 16:22:43 +02:00
kallewoof	cffa2b9c1d	save_pretrained: use tqdm when saving checkpoint shards from offloaded params (#31856 )	2024-07-09 12:55:57 +01:00
hatti	350aed7076	chore: remove duplicate words (#31853 ) remove duplicate words	2024-07-09 10:38:29 +01:00
NielsRogge	bd760cd13d	[Grounding DINO] Add processor to auto mapping (#31845 ) Add model	2024-07-09 11:28:53 +02:00
fxmarty	0abf5e8eae	FX symbolic_trace: do not test decoder_inputs_embeds (#31840 ) only test input_embeds, not decoder_input_embeds	2024-07-09 08:07:46 +02:00
Raushan Turganbay	952dfd4867	Deprecate `vocab_size` in other two VLMs (#31681 ) * deprrecate `vocab_size` in other two VLMs * Update src/transformers/models/fuyu/configuration_fuyu.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * depracate until 4.44 --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-09 10:40:06 +05:00
Joao Gante	594c1610fa	Mamba & RecurrentGemma: enable strict signature (#31549 ) * enable strict signature * this should not have been deleted * recurrent_gemma too	2024-07-08 15:48:32 +01:00
André Storhaug	ae9dd02ee1	Fix incorrect accelerator device handling for MPS in `TrainingArguments` (#31812 ) * Fix wrong acclerator device setup when using MPS * More robust TrainingArguments MPS handling * Update training_args.py * Cleanup	2024-07-08 12:49:30 +01:00
Yih-Dar	4879ac2b33	Avoid failure `TFBlipModelTest::test_pipeline_image_to_text` (#31827 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-07-08 13:49:21 +02:00
fxmarty	ba743700f4	transformers.fx.symbolic_trace supports inputs_embeds (#31574 ) * symbolic trace supports inputs_embeds * fix test? * Update tests/test_modeling_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-07-08 19:17:28 +08:00
omahs	e5ca9b057c	Fix typos (#31819 ) * fix typo * fix typo * fix typos * fix typo * fix typos	2024-07-08 11:52:47 +01:00
dependabot[bot]	f4711844a3	Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/lxmert (#31838 ) Bump certifi in /examples/research_projects/lxmert Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4. - [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04) --- updated-dependencies: - dependency-name: certifi dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-08 11:17:49 +01:00
dependabot[bot]	9f3f58c905	Bump transformers from 4.26.1 to 4.38.0 in /examples/tensorflow/language-modeling-tpu (#31837 ) Bump transformers in /examples/tensorflow/language-modeling-tpu Bumps [transformers](https://github.com/huggingface/transformers) from 4.26.1 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.26.1...v4.38.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-08 11:12:33 +01:00
Pavel Iakubovskii	a177821b24	Add FA2 and `sdpa` support for SigLIP (#31499 ) * Rebase to main * Fix attention implementation autoset for tex and vision configs * Fixup * Minor fixes * Fix copies * Fix attention_mask for FA2 * Add eqvivalence tests for siglip * Remove right padding test * Uncomment flaky * Fix import * Add to docs * Fix test message * Add sdpa * Add sdpa equivalence test * Add siglip sdpa to docs * Fix typing for attention output * Add sdpa tests * Fix signature of FA2 * Autoset attn_implementation in config * Rename bsz -> batch_size * Move back autoset attn method * Mark as flaky * Correct attention mask padding * [run-slow] siglip * Add FA2 and sdpa docs * Style fix * Remove flaky for FA2 test * Change attention implementation set * Change attn_implementaiton propogation * Fix typos * Add modality to assert message * Add more sdpa backends in test * [run slow] siglip * Add math sdpa backend for all options * [run slow] siglip	2024-07-08 11:10:02 +01:00

19383 Commits