Yih-Dar
dde6c427a1
Fix AMD push CI not triggered ( #28029 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-14 12:44:00 +01:00
Younes Belkada
73de5108e1
[core / modeling] Fix training bug with PEFT + GC ( #28031 )
...
fix training bug
2023-12-14 12:19:45 +01:00
Arthur
2788f8d8d5
[SeamlessM4TTokenizer] Safe import ( #28026 )
...
safe import
2023-12-14 08:46:10 +01:00
Arthur
131a528be0
well well well ( #28011 )
2023-12-14 06:51:04 +01:00
Marc Sun
17506d1256
add modules_in_block_to_quantize arg in GPTQconfig ( #27956 )
...
* add inside_layer_modules arg
* fix
* change to modules_to_quantize_inside_block
* fix
* rename again
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* better docstring
* fix again with less explanation
* Update src/transformers/utils/quantization_config.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* style
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-13 14:13:44 -05:00
Rockerz
fe44b1f1a9
Add model_docs from cpmant.md to deformable_detr.md ( #27884 )
...
* upfaste
* Update
* Update docs/source/ja/model_doc/deformable_detr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/model_doc/data2vec.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/model_doc/cvt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add suggestions
* Toctree update
* remove git references
* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/model_doc/decision_transformer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-12-13 10:02:29 -08:00
Lysandre
3ed3e3190c
Dev version
2023-12-13 18:29:31 +01:00
Aaron Jimenez
815ea8e8a2
[Doc] Spanish translation of glossary.md ( #27958 )
...
* Add glossary to es/_toctree.yml
* Add glossary.md to es/
* A section translated
* B and C section translated
* Fix typo in en/glossary.md C section
* D section translated | Add an extra line in en/glossary.md
* E and F section translated | Fix typo in en/glossary.md
* Fix words preentrenado
* H and I section translated | Fix typo in en/glossary.md
* L section translated
* M and N section translated
* P section translated
* R section translated
* S section translated
* T section translated
* U and Z section translated | Fix TensorParallel link in both files
* Fix word
2023-12-13 09:21:59 -08:00
Zach Mueller
93766251cb
Fix bug with rotating checkpoints ( #28009 )
...
* Fix bug
* Write test
* Keep back old modification for grad accum steps
* Whitespace...
* Whitespace again
* Race condition
* Wait for everyone
2023-12-13 12:17:30 -05:00
Arthur
ec43d6870a
[CI slow] Fix expected values ( #27999 )
...
* fix expected values
* style
* test is slow
2023-12-13 13:37:10 +01:00
Arindam Jati
749f94e460
Fix PatchTSMixer slow tests ( #27997 )
...
* fix slow tests
* revert formatting
---------
Co-authored-by: Arindam Jati <arindam.jati@ibm.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2023-12-13 13:34:25 +01:00
Younes Belkada
c7f076a00e
Adds VIP-llava to transformers ( #27932 )
...
* v1
* add-new-model-like
* revert
* fix forward and conversion script
* revert
* fix copies
* fixup
* fix
* Update docs/source/en/index.md
* Apply suggestions from code review
* push
* fix
* fixes here and there
* up
* fixup and fix tests
* Apply suggestions from code review
* add docs
* fixup
* fixes
* docstring
* add docstring
* fixup
* docstring
* fixup
* nit
* docs
* more copies
* fix copies
* nit
* update test
2023-12-13 10:42:24 +01:00
Arthur
371fb0b7dc
[Whisper] raise better errors ( #27971 )
...
* [`Whisper`] raise better errors
fixes #27893
* update torch as well
2023-12-13 09:13:01 +01:00
Arthur
230ac352d8
[Tokenizer Serialization] Fix the broken serialisation ( #27099 )
...
* nits
* nits
* actual fix
* style
* ze fix
* fix fix fix style
2023-12-13 09:11:34 +01:00
Dave Berenbaum
f4db565b69
fix typo in dvclive callback ( #27983 )
2023-12-12 16:29:58 -05:00
Stas Bekman
9936143014
[doc] fix typo ( #27981 )
2023-12-12 20:32:42 +00:00
fxmarty
78172dcdb7
Fix SDPA correctness following torch==2.1.2 regression ( #27973 )
...
* fix sdpa with non-contiguous inputs for gpt_bigcode
* fix other archs
* add currently comment
* format
2023-12-13 00:33:46 +09:00
Matt
5e4ef0a0f6
Better key error for AutoConfig ( #27976 )
...
* Improve the error printed when loading an unrecognized architecture
* Improve the error printed when loading an unrecognized architecture
* Raise a ValueError instead because KeyError prints weirdly
* make fixup
2023-12-12 14:41:55 +00:00
saswatmeher
a49f4acab3
Fix link in README.md of Image Captioning ( #27969 )
...
Update the link for vision encoder decoder doc used by
FlaxVisionEncoderDecoderModel link.
2023-12-12 08:07:15 -05:00
Arthur
680c610f97
Hot-fix-mixtral-loss ( #27948 )
...
* fix loss computation
* compute on GPU if possible
2023-12-12 12:20:28 +01:00
Joao Gante
4b759da8be
Generate: assisted_decoding now accepts arbitrary candidate generators ( #27750 )
...
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-12-12 09:25:57 +00:00
Anthony Susevski
e660424717
fixed typos (issue 27919) ( #27920 )
...
* fixed typos (issue 27919)
* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-11 18:44:23 -05:00
dancingpipi
e5079b0b2a
Support PeftModel signature inspect ( #27865 )
...
* Support PeftModel signature inspect
* Use get_base_model() to get the base model
---------
Co-authored-by: shujunhua1 <shujunhua1@jd.com>
2023-12-11 19:30:11 +00:00
Steven Liu
35478182ce
[docs] Fused AWQ modules ( #27896 )
...
streamline
2023-12-11 10:41:33 -08:00
NielsRogge
67b1335cb9
Update bounding box format everywhere ( #27944 )
...
Update formats
2023-12-11 18:03:42 +00:00
Younes Belkada
54d0b1c278
[Mixtral] Change mistral op order ( #27955 )
...
up
2023-12-11 19:03:18 +01:00
Adam Louly
4850aaba6f
fix no sequence length models error ( #27522 )
...
* fix no sequence length models error
* block size check
---------
Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-12-11 18:01:26 +00:00
Ashish Tawari
4b4b864224
Fix for stochastic depth decay rule in the TimeSformer implementation ( #27875 )
...
Update modeling_timesformer.py
Fixing typo to correct the stochastic depth decay rule
2023-12-11 16:20:31 +00:00
Chenhao Xu
c0a354d8d7
fix bug in mask2former: cost matrix is infeasible ( #27897 )
...
fix bug: cost matrix is infeasible
2023-12-11 16:19:16 +00:00
rjenc29
7e35f37071
Fix a couple of typos and add an illustrative test ( #26941 )
...
* fix a typo and add an illustrative test
* appease black
* reduce code duplication and add Annotion type back with a pending deprecation warning
* remove unused code
* change warning type
* black formatting fix
* change enum deprecation approach to support 3.8 and earlier
* add stacklevel
* fix black issue
* fix ruff issues
* fix ruff issues
* move tests to own mixin
* include yolos
* fix black formatting issue
* fix black formatting issue
* use logger instead of warnings and include target version for deprecation
2023-12-11 15:51:51 +00:00
Ella Charlaix
39acfe84ba
Add deepspeed test to amd scheduled CI ( #27633 )
...
* add deepspeed scheduled test for amd
* fix image
* add dockerfile
* add comment
* enable tests
* trigger
* remove trigger for this branch
* trigger
* change runner env to trigger the docker build image test
* use new docker image
* remove test suffix from docker image tag
* replace test docker image with original image
* push new image
* Trigger
* add back amd tests
* fix typo
* add amd tests back
* fix
* comment until docker image build scheduled test fix
* remove deprecated deepspeed build option
* upgrade torch
* update docker & make tests pass
* Update docker/transformers-pytorch-deepspeed-amd-gpu/Dockerfile
* fix
* tmp disable test
* precompile deepspeed to avoid timeout during tests
* fix comment
* trigger deepspeed tests with new image
* comment tests
* trigger
* add sklearn dependency to fix slow tests
* enable back other tests
* final update
---------
Co-authored-by: Felix Marty <felix@hf.co>
Co-authored-by: Félix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-11 16:33:36 +01:00
Yih-Dar
0f59d2f173
Fix AMD scheduled CI not triggered ( #27951 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-11 16:22:10 +01:00
Peter Götz
417bb91484
In PreTrainedTokenizerBase add missing word in error message ( #27949 )
...
"text input must of type" -> "text input must be of type"
2023-12-11 15:12:40 +00:00
Timon Käch
5cec306cdc
Fix parameter count in readme for mixtral 45b ( #27945 )
...
fix parameter count in readme
2023-12-11 14:58:48 +00:00
NielsRogge
921a6bf26e
Update import message ( #27946 )
...
* Update import message
* Update message
2023-12-11 14:58:06 +00:00
Zach Mueller
44127ec667
Fix test for auto_find_batch_size on multi-GPU ( #27947 )
...
* Fix test for multi-GPU
* With CPU handle
2023-12-11 09:57:41 -05:00
Merve Noyan
b911c1f10f
Docs for AutoBackbone & Backbone ( #27456 )
...
* Initial commit for AutoBackbone & Backbone
* Added timm and clarified out_indices
* Swapped the example to out_indices
* fix toctree
* Update autoclass_tutorial.md
* Update backbones.md
* Update autoclass_tutorial.md
* Add dummy torch input instead
* Add dummy torch input
* Update autoclass_tutorial.md
* Update backbones.md
* minor fix
* Update docs/source/en/main_classes/backbones.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
* Update docs/source/en/autoclass_tutorial.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
* Added illustrations and explained backbone & neck
* Update docs/source/en/main_classes/backbones.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
* Update backbones.md
---------
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
2023-12-11 08:22:17 -05:00
YQ
e49c385266
use logger.warning_once to avoid massive outputs ( #27428 )
...
* use logger.warning_once to avoid massive outputs when training/finetuning longformer
* update more
2023-12-11 11:59:29 +00:00
vijaye12
6ff109227b
Fix PatchTSMixer Docstrings ( #27943 )
...
* docstring corrections
* style make
---------
Co-authored-by: vijaye12 <vijaye12@in.ibm.com>
2023-12-11 11:56:57 +00:00
Arthur
accccdd008
[Add Mixtral] Adds support for the Mixtral MoE ( #27942 )
...
* up
* up
* test
* logits ok
* up
* up
* few fixes
* conversion script
* up
* nits
* nits
* update
* nuke
* more updates
* nites
* fix many issues
* nit
* scatter
* nit
* nuke megablocks
* nits
* fix conversion script
* nit
* remove
* nits
* nit
* update
* oupsssss
* change
* nits device
* nits
* fixup
* update
* merge
* add copied from
* fix the copy mentions
* update tests
* more fixes
* nits
* conversion script
* add parts of the readme
* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* new test + conversion script
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
* fix
* fix copies
* fix copies
* ooops
* fix config
* Apply suggestions from code review
* fix nits
* nit
* add copies
* add batched tests
* docs
* fix flash attention
* let's add more verbose
* add correct outputs
* support router outputs
* ignore copies where needed
* fix
* cat list if list is given for now
* nits
* Update docs/source/en/model_doc/mixtral.md
* finish router refactoring
* fix forward
* fix expected values
* nits
* fixup
* fix
* fix bug
* fix
* fix dtype mismatch
* fix
* grrr grrr I support item assignment
* fix CI
* docs
* fixup
* remove some copied from
* fix weird diff
* skip doctest fast on the config and modeling
* mark that it supports flash attention in the doc
* update
* Update src/transformers/models/mixtral/modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update docs/source/en/model_doc/mixtral.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* revert router logits config issue
* update doc accordingly
* Update src/transformers/models/mixtral/convert_mixtral_weights_to_hf.py
* nits
* use torch testing assert close
* fixup
* doc nits
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-12-11 12:50:27 +01:00
Arthur
0676d992a5
[from_pretrained] Make from_pretrained fast again ( #27709 )
...
* Skip nn.Module.reset_parameters
* Actually skip
* Check quality
* Maybe change all inits
* Fix init issues: only modify public functions
* Add a small test for now
* Style
* test updates
* style
* nice tes
* style
* make it even faster
* one more second
* remove fx incompatible
* Update tests/test_modeling_common.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update tests/test_modeling_common.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* skip
* fix quality
* protect the import
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-12-11 12:38:17 +01:00
fxmarty
9f18cc6df0
Fix SDPA dispatch & make SDPA CI compatible with torch<2.1.1 ( #27940 )
...
fix sdpa dispatch
2023-12-11 18:56:38 +09:00
NielsRogge
7ea21f1f03
[LLaVa] Some improvements ( #27895 )
...
* More improvements
* Improve variable names
* Update READMEs, improve docs
2023-12-11 10:22:26 +01:00
Yoach Lacombe
5e620a92cf
Fix SeamlessM4Tv2ModelIntegrationTest ( #27911 )
...
change dtype of some integration tests
2023-12-11 09:18:41 +01:00
Yih-Dar
e96c1de191
Skip UnivNetModelTest::test_multi_gpu_data_parallel_forward ( #27912 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-11 09:17:37 +01:00
NielsRogge
8d8970efdd
[BEiT] Fix test ( #27934 )
...
Fix test
2023-12-11 09:17:02 +01:00
Sangbum Daniel Choi
235be08569
[DETA] fix backbone freeze/unfreeze function ( #27843 )
...
* [DETA] fix freeze/unfreeze function
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add freeze/unfreeze test case in DETA
* fix type
* fix typo 2
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-12-11 07:57:30 +01:00
Brendan Fahy
df5c5c62ae
Fix typo ( #27918 )
2023-12-09 11:59:24 +01:00
Justin Yu
5fa66df3f3
[integration] Update Ray Tune integration for Ray 2.7 ( #26499 )
...
* fix tune integration for ray 2.7+
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* add version check for ray tune backend availability
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* missing import
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* pin min version instead
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* address comments
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* some fixes
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* fix unnecessary final checkpoint
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* fix lint
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* dep table fix
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
* fix lint
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
---------
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
2023-12-09 11:04:13 +01:00
Joshua Lochner
ffd426eef8
[CLAP] Replace hard-coded batch size to enable dynamic ONNX export ( #27790 )
...
* [CLAP] Replace hard-coded batch size to enable dynamic ONNX export
* Add back docstring
2023-12-09 10:39:39 +01:00