Yasmin Moslem
6d3d5b1039
Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py ( #29834 )
...
* Fix typo in tokenization_nllb.py
Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Fix typo in tokenization_nllb_fast.py
Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Remove deprecated attributes in tokenization_nllb.py
Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`
* Remove deprecated attribute in tokenization_nllb_fast.py
Remove deprecated attribute `lang_code_to_id`
* Remove deprecated properties in tokenization_nllb.py
Remove deprecated properties - fix format
* Remove deprecated properties in tokenization_nllb_fast.py
Remove deprecated properties - fix format
* Update test_tokenization_nllb.py
* update test_tokenization_nllb.py
* Update tokenization_nllb.py
* Update test_tokenization_seamless_m4t.py
* Update test_tokenization_seamless_m4t.py
2024-05-23 18:53:26 +02:00
Aritra Roy Gosthipaty
965e98dc54
[Port] TensorFlow implementation of Mistral ( #29708 )
...
* chore: initial commit
* chore: adding imports and inits
* chore: adding the causal and classification code
* chore: adding names to the layers
* chore: using single self attn layer
* chore: built the model and layers
* chore: start with testing
* chore: docstring change, transpose fix
* fix: rotary embedding
* chore: adding cache implementation
* remove unused torch
* chore: fixing the indexing issue
* make fix-copies
* Use modeling_tf_utils.keras
* make fixup
* chore: fixing tests
* chore: adding past key value logic
* chore: adding multi label classification test
* fix: switching on the built parameters in the layers
* fixing repo consistency
* ruff formats
* style changes
* fix: tf and pt equivalence
* removing returns from docstrings
* fix docstrings
* fix docstrings
* removing todos
* fix copies
* fix docstring
* fix docstring
* chore: using easier rotate_half
* adding integration tests
* chore: addressing review related to rotary embedding layer
* review changes
* [run-slow] mistral
* skip: test save load after resize token embedding
* style
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-05-23 17:48:49 +01:00
Yih-Dar
2a89673fe5
Update 4 MptIntegrationTests expected outputs ( #30989 )
...
* fix
* fix
* fix
* fix
* fix
* [run-slow] mpt
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-23 18:27:54 +02:00
Yasmin Moslem
892b13d3cf
Add a check that warmup_steps is either 0 or >= 1 ( #30764 )
...
* Add a check that warmup_steps is either 0 or >= 1
Update training_args.py to add a check that warmup_steps is either 0 or >= 1. Otherwise, raise an error.
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-23 17:23:59 +01:00
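The check described in the commit above (#30764) can be sketched roughly as follows. This is a minimal illustration only, not the code in training_args.py: the function name `check_warmup_steps` and the error message are hypothetical, and it assumes the intent is to reject fractional values between 0 and 1 (as well as negative values).

```python
def check_warmup_steps(warmup_steps: float) -> None:
    """Hypothetical validator: accept 0 or any value >= 1, reject everything else."""
    # A value strictly between 0 and 1 (or a negative value) is neither
    # "no warmup" nor a usable step count, so it is treated as an error here.
    if warmup_steps != 0 and warmup_steps < 1:
        raise ValueError(
            f"warmup_steps must be either 0 or >= 1, got {warmup_steps}"
        )


check_warmup_steps(0)      # accepted: no warmup
check_warmup_steps(500)    # accepted: 500 warmup steps

try:
    check_warmup_steps(0.1)
except ValueError as err:
    print(f"rejected: {err}")
```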
Fanli Lin
21339a5213
[tests] add torch.use_deterministic_algorithms for XPU ( #30774 )
...
* add xpu check
* add marker
* add documentation
* update doc
* fix ci
* remove from global init
* fix
2024-05-23 16:53:07 +01:00
Marc Sun
8366b57241
Fix accelerate failing tests ( #30836 )
...
* Fix accelerate tests
* fix clip
* skip dbrx tests
* fix GPTSan
* fix M2M100Model
* same fix as jamba
* fix mt5
* Fix T5Model
* Fix umt5 model
* fix switch_transformers
* fix whisper
* fix gptsan again
* fix siglip recent test
* skip siglip tests
* wrong place fixed
2024-05-23 17:18:58 +02:00
Younes Belkada
5a74ae6dbe
FIX / Docs: Minor changes in quantization docs ( #30985 )
...
* Change in quantization docs
* Update overview.md
* Update docs/source/en/quantization/overview.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-05-23 16:36:49 +02:00
Benjamin Warner
046c2ad792
Finish adding support for torch.compile dynamic shapes ( #30919 )
...
add torch.compile dynamic support
2024-05-23 16:01:29 +02:00
Poedator
6739e1d261
test_custom_4d_attention_mask skip with sliding window attn ( #30833 )
2024-05-23 15:22:10 +02:00
Younes Belkada
87a351818e
Docs / Quantization: refactor quantization documentation ( #30942 )
...
* refactor quant docs
* delete file
* rename to overview
* fix
* fix table
* fix
* add content
* fix library versions
* fix table
* fix table
* fix table
* fix table
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* replace to quantization_config
* fix aqlm snippet
* add DLAI courses
* fix
* fix table
* fix bullet points
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-05-23 14:31:52 +02:00
Raushan Turganbay
d583f1317b
Quantized KV Cache ( #30483 )
...
* clean-up
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* more suggestions
* mapping if torch available
* run tests & add 'support_quantized' flag
* fix jamba test
* revert, will be fixed by another PR
* codestyle
* HQQ and versatile cache classes
* final update
* typo
* make tests happy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-05-23 17:25:20 +05:00
dependabot[bot]
e05baad861
Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/visual_bert ( #30983 )
...
Bump requests in /examples/research_projects/visual_bert
Bumps [requests](https://github.com/psf/requests) from 2.31.0 to 2.32.2.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.31.0...v2.32.2)
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:38:00 +01:00
Arthur
4ef85fee71
Push ci image ( #30982 )
...
* [build-ci-image]
* correct branch
* push ci image
* [build-ci-image]
* update scheduled as well
* [push-ci-image]
* [build-ci-image]
* [push-ci-image]
* update deps
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* oups [build-ci-image]
* [push-ci-image]
* fix
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* updated
* [build-ci-image] update tag
* [build-ci-image]
* [build-ci-image]
* fix tag
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* github name
* commit_title?
* fetch
* update
* it not found
* dev
* dev
* [push-ci-image]
* dev
* dev
* update
* dev
* dev print dev commit message dev
* dev ? dev
* dev
* dev
* dev
* dev
* [build-ci-image]
* [build-ci-image]
* [push-ci-image]
* revert unwanted
* revert convert as well
* no you are not important
* [build-ci-image]
* Update .circleci/config.yml
* pin tf probability dev
* [push-ci-image] skip
* [push-ci-image] test
* [push-ci-image]
* fix
* device
2024-05-23 11:45:31 +02:00
Kamil Akesbi
eb1a77bbb0
Using assistant in AutomaticSpeechRecognitionPipeline with different encoder size ( #30637 )
...
* fix input to generate in pipeline
* fixup
* pass input_features to generate with assistant
* error if model and assistant with different enc size
* fix
* apply review suggestions
* use self.config.is_encoder_decoder
* pass inputs to generate directly
* add slow tests
* Update src/transformers/generation/utils.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* apply review
* Update src/transformers/generation/utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* apply code review
* update attributes encoder_xyz to check
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add slow test
* solve conflicts
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-05-23 09:59:38 +01:00
Pavel Iakubovskii
15585b81a5
Update object detection with latest resize and pad strategies ( #30955 )
...
* Update with new resizing and pad strategy
* Return pixel mask param
* Update inference in guide
* Fix empty compose
* Update guide
2024-05-23 00:13:56 +01:00
Pablo Montalvo
a25f7d3c12
Paligemma causal attention mask ( #30967 )
...
* PaliGemma working causal attention
* Formatting
* Style
* Docstrings + remove commented code
* Update docstring for PaliGemma Config
* PaliGemma - add separator ind to model/labels
* Refactor + docstring paligemma processor method
* Style
* return token type ids when tokenizing labels
* use token type ids when building causal mask
* add token type ids to tester
* remove separator from config
* fix style
* don't ignore separator
* add processor documentation
* simplify tokenization
* fix causal mask
* style
* fix label propagation, revert suffix naming
* fix style
* fix labels tokenization
* [run-slow]paligemma
* add eos if suffixes are present
* [run-slow]paligemma
* [run-slow]paligemma
* add missing tokens to fast version
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix style
* [run-slow]paligemma
---------
Co-authored-by: Peter Robicheaux <peter@roboflow.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-05-22 19:37:15 +02:00
Jun
d44e1ae036
Fix link in Pipeline documentation ( #30948 )
...
fix documentation as suggested by stevhliu
Co-authored-by: Jun <jun@reliant.ai>
2024-05-22 09:39:46 -07:00
Sanchit Gandhi
0948c827de
[Whisper] Strip prompt before finding common subsequence ( #27836 )
2024-05-22 17:25:47 +01:00
Raushan Turganbay
b1065aa08a
Generation: get special tokens from model config ( #30899 )
...
* fix
* let's do this way?
* codestyle
* update
* add tests
2024-05-22 18:15:41 +02:00
Arthur
1d568dfab2
legacy to init the slow tokenizer when converting from slow was wrong ( #30972 )
2024-05-22 18:06:50 +02:00
Yih-Dar
1432f641b8
Finally fix the missing new model failure CI report ( #30968 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-22 17:48:26 +02:00
amyeroberts
dff54ad2d9
🚨 out_indices always a list ( #30941 )
...
* out_indices always a list
* Update src/transformers/utils/backbone_utils.py
* Update src/transformers/utils/backbone_utils.py
* Move type casting
* nit
2024-05-22 15:23:04 +01:00
Pablo Montalvo
250ae9f746
Paligemma - fix slow tests, add bf16 and f16 slow tests ( #30851 )
...
* fix slow tests, add bf16 and f16 slow tests
* few fixes
* [run-slow]paligemma
* add gate decorator
* [run-slow]paligemma
* add missing gating
* [run-slow]paligemma
* [run-slow]paligemma
2024-05-22 16:20:07 +02:00
Sanchit Gandhi
ada86f973c
[whisper] only trigger forced ids warning once ( #30966 )
2024-05-22 15:06:51 +01:00
Jonatan Kłosko
1518508467
Avoid extra chunk in speech recognition ( #29539 )
2024-05-22 14:07:51 +01:00
Vaibhav Srivastav
24d2a5e1a3
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper. ( #30938 )
...
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper doc.
2024-05-22 14:06:09 +01:00
Marc Sun
5c186003b8
Fix low cpu mem usage tests ( #30808 )
...
* Fix tests
* fix udop failing test
* remove skip
* style
2024-05-22 14:09:01 +02:00
Raushan Turganbay
934e1b84e9
Update video-llava docs ( #30935 )
...
* update video-llava
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-22 16:56:41 +05:00
dependabot[bot]
edb14eba64
Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/lxmert ( #30956 )
...
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-22 11:27:41 +01:00
Arthur
8e8786e5f0
Update build ci image [push-ci-image] ( #30933 )
...
* [build-ci-image]
* correct branch
* push ci image
* [build-ci-image]
* update scheduled as well
* [push-ci-image]
* [build-ci-image]
* [push-ci-image]
* update deps
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* oups [build-ci-image]
* [push-ci-image]
* fix
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* updated
* [build-ci-image] update tag
* [build-ci-image]
* [build-ci-image]
* fix tag
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* github name
* commit_title?
* fetch
* update
* it not found
* dev
* dev
* [push-ci-image]
* dev
* dev
* update
* dev
* dev print dev commit message dev
* dev ? dev
* dev
* dev
* dev
* dev
* [build-ci-image]
* [build-ci-image]
* [push-ci-image]
* revert unwanted
* revert convert as well
* no you are not important
* [build-ci-image]
* Update .circleci/config.yml
* pin tf probability dev
2024-05-22 10:52:59 +02:00
Arthur
673440d073
update ruff version ( #30932 )
...
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-05-22 06:40:15 +02:00
NielsRogge
60bb571e99
🚨 [Idefics2] Update ignore index ( #30898 )
...
* Update ignore index
* Update docs
* Update docs
2024-05-21 19:38:02 +02:00
Lu Teng
5bf9caa06d
Fix inhomogeneous shape error in example ( #30434 )
...
Fix inhomogeneous shape error in example.
2024-05-21 18:14:11 +01:00
amyeroberts
d24097e022
Fix swin embeddings interpolation ( #30936 )
2024-05-21 15:40:19 +01:00
Younes Belkada
eae2b6b89e
TST / Workflows: Get slack notifications for docker image build ( #30891 )
...
* Get slack notifications for docker image build
* Apply suggestions from code review
* Apply suggestions from code review
2024-05-21 15:54:41 +02:00
Yih-Dar
64e0573a81
[Benchmark] Reuse optimum-benchmark ( #30615 )
...
* benchmark
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-21 15:15:19 +02:00
Matthew Beckers
3b09d3f05f
fix: center_crop occasionally outputs off-by-one dimension matrix ( #30934 )
...
If required padding for a crop larger than input image is odd-numbered,
the padding would be rounded down instead of rounded up, causing the
output dimension to be one smaller than it should be.
2024-05-21 13:56:52 +01:00
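The off-by-one described in the commit above (#30934) can be reproduced along a single axis. The sketch below is a simplified illustration, not the library's `center_crop` code: it assumes a pad-then-slice approach in which the image is first placed in a zero canvas and the crop window is then shifted by the leading padding. The function name `output_length` and the `round_up` flag are hypothetical.

```python
import math


def output_length(orig: int, crop: int, round_up: bool) -> int:
    """Length along one axis after a pad-then-slice centre crop (illustrative)."""
    new_len = max(orig, crop)      # canvas is at least as large as the crop
    start = (orig - crop) // 2     # negative when crop > orig (floor division)
    end = start + crop
    diff = new_len - orig          # total padding needed along this axis
    # Leading padding: rounded down (old behaviour) or rounded up (fixed).
    pad = math.ceil(diff / 2) if round_up else diff // 2
    start += pad
    end += pad
    # The crop window is clamped to the canvas before slicing.
    return min(new_len, end) - max(0, start)


# Odd total padding (7 - 4 = 3): rounding down loses one element.
print(output_length(4, 7, round_up=False))  # 6
print(output_length(4, 7, round_up=True))   # 7
```

With an even amount of padding (for example `orig=4, crop=8`), both rounding modes give the requested length, which is consistent with the failure being only occasional.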
Zach Mueller
daf281f44f
Enforce saving at end of training if saving option chosen ( #30160 )
...
* Enforce saving at end of training
* Fix test
* Rework test
* Fixup tests
* Update comment based on sourab feedback
* Clean
2024-05-21 07:50:11 -04:00
Mohit Sharma
7a4792e6b3
CI: AMD MI300 tests fix ( #30797 )
...
* add fix
* update import
* updated dicts and comments
* remove prints
* Update testing_utils.py
2024-05-21 12:46:07 +01:00
hoshi-hiyouga
a755745546
PaliGemma - fix processor with no input text ( #30916 )
...
Update processing_paligemma.py
2024-05-21 10:43:22 +01:00
dependabot[bot]
d502bd6475
Bump requests from 2.31.0 to 2.32.0 in /examples/research_projects/decision_transformer ( #30925 )
...
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 09:41:29 +01:00
Younes Belkada
8871b26150
FEAT / Trainer: LOMO optimizer support ( #30178 )
...
* add V1 - adalomo not working yet
* add todo docs + refactor from comments
* adjust LR
* add docs
* add more elaborated test
* Apply suggestions from code review
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* fix
* push
* add accelerate check
* fix DDP case
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* init kwargs
* safely add attribute
* revert to enum logic
* Update src/transformers/trainer.py
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-21 10:16:37 +02:00
Younes Belkada
c876d12127
FIX / TST: Fix expected results on Mistral slow test (A10) ( #30909 )
...
Update test_modeling_mistral.py
2024-05-21 09:14:14 +02:00
Aaron Jimenez
0df888ffb7
[docs] Spanish translation of model_memory_anatomy.md ( #30885 )
...
* add model_memory_anatomy to es/_toctree.yml
* copy model_memory_anatomy.md to es/
* translate first section
* translate doc
* change forward activations
* fix sentence and link to Trainer
* fix Trainer link
2024-05-20 16:48:52 -07:00
Longjie Zheng
616bb11d48
Add torch.compile for Mistral ( #30642 )
...
* first version
* fix sliding window
* fix style
* add sliding window cache
* fix style
* address comments
* fix test
* fix style
* move sliding window check inside cache init
* revert changes on irrelevant files & add comment on SlidingWindowCache
* address comments & fix style
fix style
* update causal mask
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] llama
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* revert CI from a10 to t4
* wrap up
2024-05-20 16:27:24 +02:00
Zach Mueller
92d1d97c05
Introduce configured_state arg for accelerator_config ( #29781 )
...
* Introduce configured_state
* Include note on tuning
* Allow for users to have defined a state already
* Include tests
* Add note on hpam tune
* Guard a bit better
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Finish rebase
* Finish rebase
* Guard carefully
* Fixup test
* Refactor
* Fin refactor
* Comment
* Update wrt feedback
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-20 09:21:40 -04:00
Arthur
bb48e92186
tokenizer_class = "AutoTokenizer" Llava Family ( #30912 )
...
propagate changes to more models
2024-05-20 13:56:11 +02:00
Anton Vlasjuk
76e05301c3
Fix a shape annotation and typos in mamba slow forward ( #30691 )
...
* fix typos and one shape comment
* fix `intermediade` typo in jamba
2024-05-20 13:55:57 +02:00
Yoach Lacombe
e6708709cb
Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM ( #28706 )
...
* Add AutoFeatureExtractor support to Wav2Vec2ProcessorWithLM
* update with a type filter
* add raises error test
* fix added test
2024-05-20 13:40:42 +02:00
Hafedh
c11ac7857b
fix for custom pipeline configuration ( #29004 )
...
* fix for custom pipeline configuration
* fix for custom pipelines
* remove extra exception
* added test for custom pipelines extra tag
* format with ruff
* limit extra tag for first time only
* format with ruff
* improve tests for custom pipelines
2024-05-20 11:38:32 +02:00