dependabot[bot]
076e66e479
Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/decision_transformer ( #31813 )
...
Bump certifi in /examples/research_projects/decision_transformer
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 10:52:10 +01:00
Dingli Yang
c1cda0ee2c
Fix Seq2SeqTrainer crash when BatchEncoding data is None ( #31418 )
...
avoiding crash when BatchEncoding data is None
2024-07-08 10:51:23 +01:00
NielsRogge
06fd7972ac
Add ZoeDepth ( #30136 )
...
* First draft
* Add docs
* Clean up code
* Convert model
* Add image processor
* Convert Zoe_K
* More improvements
* Improve variable names and docstrings
* Improve variable names
* Improve variable names
* Replace nn.Sequential
* More improvements
* Convert ZoeD_NK
* Fix most tests
* Verify pixel values
* Verify pixel values
* Add squeeze
* Update beit to support arbitrary window sizes
* Improve image processor
* Improve docstring
* Improve beit
* Improve model outputs
* Add figure
* Fix beit
* Update checkpoint
* Fix repo id
* Add _keys_to_ignore_on_load_unexpected
* More improvements
* Address comments
* Address comments
* Address comments
* Address comments
* Rename variable name
* Add backbone_hidden_size
* Vectorize
* Vectorize more
* Address comments
* Clarify docstring
* Remove backbone_hidden_size
* Fix image processor
* Remove print statements
* Remove print statement
* Add integration test
* Address comments
* Address comments
* Address comments
* Address comments
* Add requires_backends
* Clean up
* Simplify conversion script
* Simplify more
* Simplify more
* Simplify more
* Clean up
* Make sure beit is loaded correctly
* Address comment
* Address bin_configurations
* Use bin_configurations
* Convert models, add integration tests
* Fix doc test
* Address comments
* Unify regressor classes
* Clarify arguments
* Improve resize_image
* Add num_relative_features
* Address comment
* [run-slow]beit,data2vec,zoedepth
* [run-slow]beit,data2vec,zoedepth
* Address comments
* Address comment
* Address comment
* Replace nn.TransformerEncoderLayer and nn.TransformerEncoder
* Replace nn.MultiheadAttention
* Add attributes for patch transformer to config
* Add tests for ensure_multiple_of
* Update organization
* Add tests
* [run-slow] beit data2vec
* Update ruff
* [run-slow] beit data2vec
* Add comment
* Improve docstrings, add test
* Fix interpolate_pos_encoding
* Fix slow tests
* Add docstring
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Improve tests and docstrings
* Use run_common_tests
* Improve docstrings
* Improve docstrings
* Improve tests
* Improve tests
* Remove print statements
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-08 11:43:33 +02:00
Pedro Cuenca
1082361a19
Depth Anything: update conversion script for V2 ( #31522 )
...
* Depth Anything: update conversion script for V2
* Update docs
* Style
* Revert "Update docs"
This reverts commit be0ca47ea1.
* Add docs for depth anything v2
* Add depth_anything_v2 to MODEL_NAMES_MAPPING
Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files
* Add tip in original docs
2024-07-05 19:28:41 +01:00
Thien Tran
a8fa6fbbec
Fix Wav2Vec2 Fairseq conversion (weight norm state dict keys) ( #31714 )
...
* handle new weight norm
* fix
* fix trailing space
2024-07-05 19:26:21 +01:00
Anton Vlasjuk
a01b033cb4
Fix galore lr display with schedulers ( #31710 )
...
* fix galore lr display with lr schedulers
* style
* add some tests to check for displayed lrs
* copy-paste err for warmup steps
* standardize the default lr to be only in the optimizer
* trying out my luck with the reads
2024-07-05 18:59:09 +01:00
Billy Cao
ac26260436
Allow FP16 or other precision inference for Pipelines ( #31342 )
...
* cast image features to model.dtype where needed to support FP16 or other precision in pipelines
* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Use .to instead
* Add FP16 pipeline support for zeroshot audio classification
* Remove unused torch imports
* Add docs on FP16 pipeline
* Remove unused import
* Add FP16 tests to pipeline mixin
* Add fp16 placeholder for mask_generation pipeline test
* Add FP16 tests for all pipelines
* Fix formatting
* Remove torch_dtype arg from is_pipeline_test_to_skip*
* Fix format
* trigger ci
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-05 17:21:50 +01:00
Matt
e786844425
Repeating an important warning in the chat template docs ( #31796 )
...
* Repeating an important warning in the chat template docs
* Update docs/source/en/chat_templating.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Reword for clarity
* Reword for clarity
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-07-05 15:30:24 +01:00
Billy Cao
1d3eaa6f7e
Add training support for SigLIP ( #31495 )
...
* Add siglip loss function
* Update docs
* Enable training tests
[experimental] enable GC training tests as it has worked for my own data
* Remove test_training* overrides to enable training tests
[run_slow] siglip
* Skip training tests for Siglip text model and ImageClassificationModel
[run_slow] siglip
* Skip GC training tests for SiglipForImageClassification
* Explicitly skip training tests for SiglipVisionModel
Add skip reason for training tests for SiglipTextModel
* Remove copied from to fix CI
2024-07-05 14:50:39 +01:00
Aymeric Roucher
1556025271
Code agent: allow function persistence between steps ( #31769 )
...
* Code agent: allow function persistence between steps
2024-07-05 11:09:11 +02:00
Yih-Dar
eef0507f3d
Fix gemma tests ( #31794 )
...
* skip 3 7b tests
* fix
* fix
* fix
* [run-slow] gemma
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-05 10:17:59 +02:00
Boris Feld
9e599d1d94
Update CometCallback to allow reusing of the running experiment ( #31366 )
...
* Update CometCallback to allow reusing of the running experiment
* Fixups
* Remove useless TODO
* Add checks for minimum version of the Comet SDK
* Fix documentation and links.
Also simplify how the Comet Experiment name is passed
2024-07-05 08:13:46 +02:00
xiangdong
d19b5a90c2
Exclude torch.compile time from metrics computation ( #31443 )
...
* exclude compile time from metrics computation
* fix the quality issue
2024-07-05 08:11:55 +02:00
Kazuaki Ishizaki
2aa2a14481
Make tensor device correct when ACCELERATE_TORCH_DEVICE is defined ( #31751 )
...
return correct device when ACCELERATE_TORCH_DEVICE is defined
2024-07-05 08:09:04 +02:00
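The idea behind this fix can be sketched in a few lines (the helper name below is hypothetical; the real logic lives in transformers' device-selection utilities): when the `ACCELERATE_TORCH_DEVICE` environment variable is set, it should win over the default device.

```python
import os

def resolve_device(default: str = "cpu") -> str:
    """Return the torch device string, honoring ACCELERATE_TORCH_DEVICE when set.

    A hypothetical helper mirroring the fix, not transformers' actual code.
    """
    return os.environ.get("ACCELERATE_TORCH_DEVICE", default)

os.environ["ACCELERATE_TORCH_DEVICE"] = "npu:0"
print(resolve_device())  # → npu:0
```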
Marc Sun
8c5c180de0
Fix serialization for offloaded model ( #31727 )
...
* Fix serialization
* style
* add test
2024-07-05 08:07:07 +02:00
mxkopy
eaa5f41439
Fix ClapProcessor to merge feature_extractor output into the returned BatchEncoding ( #31767 )
...
* fixed ClapProcessor to merge all values output from the feature extractor into the returned BatchEncoding.
* fixed trailing whitespace
2024-07-05 07:55:47 +02:00
Billy Cao
43ffb785c0
Add torch_empty_cache_steps to TrainingArguments ( #31546 )
...
* Add torch_empty_cache_steps to TrainingArguments
* Fix formatting
* Add torch_empty_cache_steps to docs on single gpu training
* Remove check for torch_empty_cache_steps <= max_steps
* Capitalize Tip
* Be device agnostic
* Fix linting
2024-07-04 13:20:49 -04:00
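The mechanism this argument enables can be sketched as follows (a minimal illustration, assuming the feature works as described: every N optimizer steps, cached accelerator memory is released; the function name and wiring here are illustrative, not the Trainer's actual code):

```python
from typing import Optional

def should_empty_cache(step: int, every: Optional[int]) -> bool:
    """Sketch of the torch_empty_cache_steps check: True every `every` steps,
    where the training loop would then call e.g. torch.cuda.empty_cache().
    None disables the behavior entirely."""
    return every is not None and step > 0 and step % every == 0

# e.g. with torch_empty_cache_steps=100, the cache is cleared at steps 100, 200, 300
print([s for s in range(0, 301, 50) if should_empty_cache(s, 100)])  # → [100, 200, 300]
```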
hoshi-hiyouga
cee768d97e
Fix Gemma2 types ( #31779 )
...
Update __init__.py
2024-07-04 15:37:32 +02:00
Yih-Dar
87726a08ed
pytest_num_workers=4 for some CircleCI jobs ( #31764 )
...
pytest_num_workers=4
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-04 14:44:58 +02:00
Pavel Iakubovskii
048f599f35
Fix RT-DETR weights initialization ( #31724 )
...
* Fix init for rt-detr heads
* Fixup
* Add separate prior_prob value to config for initialization
* Add bbox init
* Change to 1 / num_labels init
* Adjust weights init test
* Fix style for test
2024-07-03 14:29:02 +01:00
Pavel Iakubovskii
b97521614a
Fix RT-DETR cache for generate_anchors ( #31671 )
...
* Fix cache and type conversion
* Add test
* Fixup
* nit
* [run slow] rt_detr
* Fix test
* Fixup
* [run slow] rt_detr
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
2024-07-03 14:19:57 +01:00
Willard Sheen
534cbf8a5d
[fix bug] logits' shape different from labels' shape in preprocess_logits_for_metrics ( #31447 )
...
* [fix BUG] pad labels before using them in preprocess_logits_for_metrics
* a more readable fix
labels can't be `gather`ed before being passed to `preprocess_logits_for_metrics`, so the logic must be split into two if-blocks
* add a comment
* oh code quality check
2024-07-03 06:58:27 -04:00
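A framework-free sketch of the shape mismatch this fixes (names here are illustrative, not the Trainer's actual code): logits gathered across devices may be padded to a common sequence length, so labels must be padded the same way before the user's `preprocess_logits_for_metrics` hook sees both.

```python
PAD_LABEL_ID = -100  # the usual ignored-index value for label padding

def pad_labels(labels, target_len, pad_id=PAD_LABEL_ID):
    """Right-pad each label row to target_len so it matches the logits' shape."""
    return [row + [pad_id] * (target_len - len(row)) for row in labels]

logits = [[0.1, 0.9, 0.3, 0.2], [0.5, 0.4, 0.2, 0.8]]  # seq_len 4 after gathering
labels = pad_labels([[1, 2], [3]], target_len=len(logits[0]))
assert all(len(lab) == len(lg) for lab, lg in zip(labels, logits))
print(labels)  # → [[1, 2, -100, -100], [3, -100, -100, -100]]
```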
Nate Brake
65a02cd27d
Add ignore_errors=True to trainer.py rmtree in _inner_training_loop ( #31668 )
...
Update trainer.py
2024-07-03 06:54:49 -04:00
Joao Gante
ddfaf11926
Gemma 2: Update slow tests ( #31759 )
...
gemma 2 slow tests
2024-07-03 11:43:44 +02:00
Pablo Montalvo
c1fe12595e
handle (processor_class, None) returned by ModelPatterns ( #31753 )
2024-07-03 11:42:30 +02:00
Aymeric Roucher
0fd885b91c
Adds final answer tool for all agents ( #31703 )
...
* Adds final answer tool for all agents
* Typo
* Add clarification in doc
* Put final_answer tool addition in agent for clarity
2024-07-03 11:36:09 +02:00
Ella Charlaix
dc72fd7edd
Require torch.tensor before casting ( #31755 )
2024-07-03 11:12:51 +02:00
jiqing-feng
7f91f168a1
fix assisted decoding ( #31401 )
...
* fix assisted decoding
* check None
* fix typo
* fix _prepare_special_tokens
* fix style
* fix lint
* add tests for assisted decoding
* fix style
* fix tests check
2024-07-03 09:22:56 +01:00
Jörg Bornschein
f91c16d270
Fix documentation for Gemma2. ( #31682 )
...
* Fix documentation for Gemma2.
Model sizes and Blog post URL are wrong in the documentation.
* Update docs/source/en/model_doc/gemma2.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-02 23:04:53 +01:00
Matt
cd0935dd55
Make tool JSON schemas consistent ( #31756 )
...
Make the order of array items consistent using sorted()
2024-07-02 20:00:42 +01:00
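The fix can be illustrated with a hypothetical tool schema (not the library's actual code): enum values collected in a set iterate in arbitrary order across runs, so the serialized JSON schema differs run to run; `sorted()` pins the array order and makes the output deterministic.

```python
import json

# Set iteration order is not stable across interpreter runs.
allowed_units = {"fahrenheit", "celsius", "kelvin"}

schema = {
    "name": "get_temperature",
    "parameters": {
        # sorted() fixes the array item order, so the schema is reproducible
        "unit": {"type": "string", "enum": sorted(allowed_units)},
    },
}

print(json.dumps(schema["parameters"]["unit"]["enum"]))  # → ["celsius", "fahrenheit", "kelvin"]
```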
Joao Gante
82486e5995
🚨 🚨 TextGenerationPipeline: rely on the tokenizer default kwargs ( #31747 )
...
* rely on the tokenizer default kwargs
* fix a few tests
2024-07-02 16:17:42 +02:00
Sanchit Gandhi
a9701953ff
[whisper] static kv cache ( #31166 )
...
* make work with cache abstraction
* correct for static cache
* hacks for compile
* make fast
* fix
* fix pos ids
* generate
* fix sdpa
* fix sdpa cache pos
* fix fa2
* clean fa2
* integrate cache into generate
* make style
* copies
* more copies
* update eager
* update sdpa
* update fa2
* simplify
* use cache pos
* always compute cross-cache for debug
* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
* fix fix
* fix fix fix
* more fix
* try encoder-decoder cache (too messy)
* revert encoder-decoder cache
* check cross-attn cache
* use enc-dec dataclass
* use richer enc-dec dataclass
* clean-up
* revert static cache changes
* small fixes
* revert to cpu flag
* fix copies
* add static slow test
* past k/v docstring
* more docstrings
* cache_position docstrings
* add to docs
* add enc-dec cache to docs
* make style
* fix after rebase
* fix beam
* style
* fix generation strategies
* fix most decoder-only tests
* style
* skip test
* more clean up
* small docstrings
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add todo
* only crop self-attn
* check cache in mixin
* style
* fix re-compile after rebase
* move `is_updated` logic to enc-dec wrapper
* revert back
* revert cache back
* finalise design
* fix
* fix fix
* style
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* deprecate
* updates
* final updates
* style
* style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-07-02 13:24:15 +01:00
fxmarty
57d7594a79
Fix mistral ONNX export ( #31696 )
...
* use bitwise or
* why is the CI not triggered?
2024-07-02 19:54:10 +08:00
Yih-Dar
93cd94b79d
Move some test files (tests/test_xxx_utils.py) to tests/utils ( #31730 )
...
* move
* move
* move
* move
* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-02 13:46:03 +02:00
Krisztián Boros
cf85e86e9a
remove incorrect urls pointing to the llava repository ( #31107 )
...
* remove incorrect urls pointing to the llava repository
* remove incorrect urls pointing to the llava repository; removing entire comments
* remove incorrect urls pointing to the llava repository; removing entire comments; ran fix-copies
* ran fixup
2024-07-02 12:24:55 +01:00
Joao Gante
3345ae733b
dependencies: keras-nlp<0.14 pin ( #31684 )
...
* keras nlp pin
* this should use the new docker images:dev
* dev-ci
2024-07-01 17:39:33 +01:00
Jade Choghari
e655029515
Add French version of run scripts tutorial ( #31483 )
...
* Add French translation of run scripts tutorial
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Jade Choghari <chogharijade@icloud.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-06-28 18:02:30 +02:00
Arthur
bbf1e61864
Gemma capping is a must for big models ( #31698 )
...
* softcapping
* soft cap before the mask
* style
* ...
* super nit
2024-06-28 17:16:17 +02:00
Sangbum Daniel Choi
cb298978ad
add gather_use_object arguments ( #31514 )
...
* add gather_use_object arguments
* fix name and pass the CI test for Seq2SeqTrainer
* make style
* make it to functools
* fix typo
* add accelerate version:
* adding warning
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* make style
* Update src/transformers/training_args.py
* check function move to initial part
* add test for eval_use_gather_object
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-06-28 13:50:27 +01:00
Jacky Lee
82a1fc7256
Fix return_dict in encodec ( #31646 )
...
* fix: use return_dict parameter
* fix: type checks
* fix: unused imports
* update: one-line if else
* remove: recursive check
2024-06-28 12:18:01 +01:00
hoshi-hiyouga
5e89b335ab
Fix Gemma2 4d attention mask ( #31674 )
...
Update modeling_gemma2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-06-28 08:20:30 +02:00
Wing Lian
0142aab7f8
don't zero out the attention_mask when using sliding window with flash attention ( #31670 )
...
* don't zero out the attention_mask when using sliding window with flash attention
* chore: lint
2024-06-28 07:59:54 +02:00
Sanchit Gandhi
1c68f2cafb
[HybridCache] Fix get_seq_length method ( #31661 )
...
* fix gemma2
* handle in generate
2024-06-27 19:40:40 +02:00
Steven Liu
464aa74659
[docs] Llama3 ( #31662 )
...
quick usage to top
2024-06-27 10:32:51 -07:00
Billy Cao
e44b878c02
Fix float out of range in owlvit and owlv2 when using FP16 or lower precision ( #31657 )
2024-06-27 18:07:33 +01:00
Arthur
75a6319864
Fix post gemma merge ( #31660 )
...
* nit
* toctree issue
* protect gemma2 tests as well
* sdpa supported
2024-06-27 17:51:42 +02:00
Lysandre
727eea4ab0
v4.43.0.dev0
2024-06-27 17:40:07 +02:00
Arthur
0cf60f13ab
Add gemma 2 ( #31659 )
...
* initial commit
* Add doc
* protect?
* fixup stuffs
* update tests
* fix build documentation
* mmmmmmm config attributes
* style
* nit
* update
* nit
* Fix docs
* protect some stuff
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-06-27 17:36:19 +02:00
Raushan Turganbay
4aa17d0069
Remove deprecated config attribute in VLMs ( #31655 )
...
remove
2024-06-27 16:54:41 +05:00
Sangbum Daniel Choi
be50a0338b
change anchor_image_size None for compatibility ( #31640 )
...
* change anchor_image_size None for compatibility
* make fix-copies
2024-06-27 12:36:55 +01:00