Arthur
115ac94d06
[`Core generation`] Adds support for static KV cache ( #27931 )
...
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-02-08 11:50:34 +01:00
Javier
4b236aed76
Fix utf-8 yaml load for marian conversion to pytorch in Windows ( #28618 )
...
Fix utf-8 yaml in marian conversion
2024-02-08 08:23:15 +01:00
Klaus Hipp
33df036917
[Docs] Revert translation of '@slow' decorator ( #28912 )
2024-02-08 03:31:47 +01:00
Klaus Hipp
328ade855b
[Docs] Fix placement of tilde character ( #28913 )
...
Fix placement of tilde character
2024-02-07 17:19:39 -08:00
Huazhong Ji
5f96855761
Add npu device for pipeline ( #28885 )
...
add npu device for pipeline
Co-authored-by: unit_test <test@unit.com>
2024-02-07 17:27:01 +00:00
Yih-Dar
308d2b9004
Update the cache number ( #28905 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-07 16:37:09 +01:00
Daniel Korat
abf8f54a01
⚠️ Raise `Exception` when trying to generate 0 tokens ⚠️ ( #28621 )
...
* change warning to exception
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* validate `max_new_tokens` > 0 in `GenerationConfig`
* fix truncation test parameterization in `TextGenerationPipelineTests`
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-02-07 13:42:01 +01:00
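The change above turns a zero-token generation request from a warning into an error. A minimal sketch of that kind of up-front check follows; `validate_max_new_tokens` is a hypothetical helper written for illustration, not the function used in `GenerationConfig`.

```python
def validate_max_new_tokens(max_new_tokens):
    """Hypothetical helper: reject a non-positive token budget before generating.

    `None` is passed through, on the assumption that it means "not set".
    """
    if max_new_tokens is not None and max_new_tokens <= 0:
        raise ValueError(
            f"`max_new_tokens` must be greater than 0, but is {max_new_tokens}."
        )
    return max_new_tokens
```

Failing early like this surfaces a misconfigured call at validation time instead of returning an empty generation.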
Matt
349a6e8542
Fix Keras scheduler import so it works for older versions of Keras ( #28895 )
...
Fix our schedule import so it works for older versions of Keras
2024-02-07 12:28:24 +00:00
Sourab Mangrulkar
d9deddb4c1
fix Starcoder FA2 implementation ( #28891 )
2024-02-07 14:10:10 +05:30
Sai-Suraj-27
64d1518cbf
fix: Fixed the documentation for `logging_first_step` by removing "evaluate" ( #28884 )
...
Fixed the documentation for logging_first_step by removing evaluate.
2024-02-07 08:46:36 +01:00
Klaus Hipp
1c31b7aa3b
[Docs] Add missing language options and fix broken links ( #28852 )
...
* Add missing entries to the language selector
* Add links to the Colab and AWS Studio notebooks for ONNX
* Use anchor links in CONTRIBUTING.md
* Fix broken hyperlinks due to spaces
* Fix links to OpenAI research articles
* Remove confusing footnote symbols from author names, as they are also considered invalid markup
2024-02-06 12:01:01 -08:00
Yih-Dar
40658be461
Hotfix - make `torchaudio` get the correct version in `torch_and_flax_job` ( #28899 )
...
* check
* check
* check
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-06 21:00:42 +01:00
Klaus Hipp
4830f26965
[Docs] Fix backticks in inline code and documentation links ( #28875 )
...
Fix backticks in code blocks and documentation links
2024-02-06 11:15:44 -08:00
Lucain
a1afec9e17
Explicit server error on gated model ( #28894 )
2024-02-06 17:45:20 +00:00
Yih-Dar
89439fea64
unpin torch ( #28892 )
...
* unpin torch
* check
* check
* check
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-06 17:21:05 +01:00
Yih-Dar
76b4f666f5
Revert "[WIP] Hard error when ignoring tensors." ( #28898 )
...
Revert "[WIP] Hard error when ignoring tensors. (#27484 )"
This reverts commit 2da28c4b41.
2024-02-06 17:18:30 +01:00
Yih-Dar
6529a5b5c1
Fix `FastSpeech2ConformerModelTest` and skip it on CPU ( #28888 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-06 11:05:23 +01:00
Sourab Mangrulkar
5346db1684
Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP ( #28866 )
...
* Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP
* Update trainer.py
2024-02-06 11:25:44 +05:30
Eran Hirsch
ee2a3400f2
Fix LongT5ForConditionalGeneration initialization of lm_head ( #28873 )
2024-02-06 04:24:20 +01:00
Klaus Hipp
1ea0bbd73c
[Docs] Update project names and links in awesome-transformers ( #28878 )
...
Update project names and repository links in awesome-transformers
2024-02-06 04:06:29 +01:00
dependabot[bot]
e83227d76e
Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_projects/decision_transformer ( #28879 )
...
Bump cryptography in /examples/research_projects/decision_transformer
Bumps [cryptography](https://github.com/pyca/cryptography ) from 41.0.2 to 42.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst )
- [Commits](https://github.com/pyca/cryptography/compare/41.0.2...42.0.0 )
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 03:53:08 +01:00
nakranivaibhav
2e7c942c81
Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support ( #28777 )
...
* This is a test commit
* testing commit
* final commit with some changes
* Removed copy statement
* Fixed formatting issues
* Fixed error added past_key_values in the forward method
* Fixed a trailing whitespace. Damn the formatting rules are strict
* Added the copy statement
2024-02-06 03:41:42 +01:00
xkszltl
ac51e59e47
Do not use mtime for checkpoint rotation. ( #28862 )
...
Resolve https://github.com/huggingface/transformers/issues/26961
2024-02-06 03:21:50 +01:00
eajechiloae
06901162b5
ClearMLCallback enhancements: support multiple runs and handle logging better ( #28559 )
...
* add clearml tracker
* support multiple train runs
* remove bad code
* add UI entries for config/hparams overrides
* handle models in different tasks
* run ruff format
* tidy code based on code review
---------
Co-authored-by: Eugen Ajechiloae <eugenajechiloae@gmail.com>
2024-02-05 20:04:17 +00:00
amyeroberts
ba3264b4e8
Image Feature Extraction pipeline ( #28216 )
...
* Draft pipeline
* Fixup
* Fix docstrings
* Update doctest
* Update pipeline_model_mapping
* Update docstring
* Update tests
* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
* Fix docstrings - review comments
* Remove pipeline mapping for composite vision models
* Add to pipeline tests
* Remove for flava (multimodal)
* safe pil import
* Add requirements for pipeline run
* Account for super slow efficientnet
* Review comments
* Fix tests
* Swap order of kwargs
* Use build_pipeline_init_args
* Add back FE pipeline for Vilt
* Include image_processor_kwargs in docstring
* Mark test as flaky
* Update TODO
* Update tests/pipelines/test_pipelines_image_feature_extraction.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add license header
---------
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-05 14:50:07 +00:00
Yoach Lacombe
7addc9346c
Correct wav2vec2-bert inputs_to_logits_ratio ( #28821 )
...
* Correct wav2vec2-bert inputs_to_logits_ratio
* correct ratio
* correct ratio, clean asr pipeline
* refactor on one line
2024-02-05 13:14:47 +00:00
Arthur
3f9f749325
[`Doc`] update contribution guidelines ( #28858 )
...
update guidelines
2024-02-05 21:19:21 +09:00
Nicolas Patry
2da28c4b41
[WIP] Hard error when ignoring tensors. ( #27484 )
...
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
- Find all names we should normally drop (those are in the transformers config)
- Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
- Clone those disjoint tensors, getting rid of the issue
- Find all identical names (those should be declared in the config, but we try to find them all anyway)
- For all identical names:
  - If they are in the config, just ignore them; everything is fine
  - If they are not, warn about them
- For all remaining tensors which are shared yet neither identical NOR disjoint, raise a hard error
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-05 09:17:24 +01:00
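The checkpoint-saving steps listed in the commit above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: each tensor is modelled as a hypothetical `(storage_id, start, end)` triple with a half-open byte range, and a group that mixes identical and disjoint views is treated as conflicting.

```python
from collections import defaultdict


def classify_shared(tensors):
    """Split shared tensors into identical, disjoint and conflicting groups.

    `tensors` maps a name to (storage_id, start, end); this layout is an
    assumption made for the sketch.
    """
    by_storage = defaultdict(list)
    for name, (storage_id, start, end) in tensors.items():
        by_storage[storage_id].append((start, end, name))

    identical, disjoint, conflicting = [], [], []
    for group in by_storage.values():
        if len(group) < 2:
            continue  # only one tensor uses this storage: nothing is shared
        group.sort()
        names = [name for _, _, name in group]
        ranges = [(start, end) for start, end, _ in group]
        if len(set(ranges)) == 1:
            identical.append(names)  # same bytes under several names
        elif all(ranges[i][1] <= ranges[i + 1][0] for i in range(len(ranges) - 1)):
            disjoint.append(names)  # safe to clone before saving
        else:
            conflicting.append(names)  # partial overlap: hard error
    return identical, disjoint, conflicting
```

Under this model, disjoint groups would be cloned, identical groups checked against the config (ignored if declared, warned about otherwise), and any conflicting group would raise.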
w4ffl35
0466fd5ca2
Ability to override clean_code_for_run ( #28783 )
...
* Add clean_code_for_run function
* Call clean_code_for_run from agent method
2024-02-05 03:48:41 +01:00
Zizhao Chen
c430d6eaee
[Docs] Fix bad doc: replace save with logging ( #28855 )
...
Fix bad doc: replace save with logging
2024-02-05 03:38:08 +01:00
Ziyang
7b702836af
Support custom scheduler in deepspeed training ( #26831 )
...
Reuse trainer.create_scheduler to create scheduler for deepspeed
2024-02-05 03:33:55 +01:00
dependabot[bot]
ca8944c4e3
Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decision_transformer ( #28845 )
...
Bump dash in /examples/research_projects/decision_transformer
Bumps [dash](https://github.com/plotly/dash ) from 2.3.0 to 2.15.0.
- [Release notes](https://github.com/plotly/dash/releases )
- [Changelog](https://github.com/plotly/dash/blob/dev/CHANGELOG.md )
- [Commits](https://github.com/plotly/dash/compare/v2.3.0...v2.15.0 )
---
updated-dependencies:
- dependency-name: dash
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-05 03:12:30 +01:00
amyeroberts
3d2900e829
Mark `test_encoder_decoder_model_generate` for `vision_encoder_deocder` as flaky ( #28842 )
...
Mark test as flaky
2024-02-02 16:57:08 +00:00
Sourab Mangrulkar
80d50076c8
Reduce GPU memory usage when using FSDP+PEFT ( #28830 )
...
support FSDP+PEFT
2024-02-02 21:18:01 +05:30
Yih-Dar
f497795948
Use `-v` for `pytest` on CircleCI ( #28840 )
...
use -v in pytest
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-02 16:44:13 +01:00
Yih-Dar
a7cb92aa03
fix / skip (for now) some tests before switch to torch 2.2 ( #28838 )
...
* fix / skip some tests before we can switch to torch 2.2
* style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-02 14:11:50 +01:00
Yih-Dar
0e75aeefaf
Fix issues caused by natten ( #28834 )
...
try
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-02 21:11:48 +09:00
Juri Ganitkevitch
ec29d25d9f
Add missing None check for hf_quantizer ( #28804 )
...
* Add missing None check for hf_quantizer
* Add test, fix logic.
* make style
* Switch test model to Mistral
* Comment
* Update tests/test_modeling_utils.py
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-02 09:34:12 +01:00
skumar951
1efb21c764
Explicitly check if token IDs are None in TFBertTokenizer constructor ( #28824 )
...
Add an explicit none-check, since token ids can be 0
2024-02-02 09:13:36 +01:00
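The reason for the explicit check above is that `0` is falsy in Python, so a truthiness test treats a valid token ID of 0 as missing. A minimal sketch of the difference follows; both function names and the default value are hypothetical.

```python
def resolve_token_id_buggy(token_id, default):
    # Truthiness check: a legitimate ID of 0 is silently replaced.
    return token_id or default


def resolve_token_id(token_id, default):
    # Explicit None check: only a missing ID falls back to the default.
    return default if token_id is None else token_id
```

With a default of 101, the buggy version maps an ID of 0 to 101, while the explicit check keeps 0.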
Klaus Hipp
721ee783ca
[Docs] Fix spelling and grammar mistakes ( #28825 )
...
* Fix typos and grammar mistakes in docs and examples
* Fix typos in docstrings and comments
* Fix spelling of `tokenizer` in model tests
* Remove erroneous spaces in decorators
* Remove extra spaces in Markdown link texts
2024-02-02 08:45:00 +01:00
Steven Liu
2418c64a1c
[docs] HfQuantizer ( #28820 )
...
* tidy
* fix path
2024-02-02 08:22:18 +01:00
Steven Liu
abbffc4525
[docs] Backbone ( #28739 )
...
* backbones
* fix path
* fix paths
* fix code snippet
* fix links
2024-02-01 09:16:16 -08:00
Rockerz
23ea6743f2
Add models from deit ( #28302 )
...
* Add models
* Add 2 more models
* add models to toctree
* Add models
* Update docs/source/ja/model_doc/detr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/model_doc/deit.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/model_doc/deplot.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix bugs
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-02-01 09:15:55 -08:00
zspo
d98591a12b
[docs] fix some bugs about parameter description ( #28806 )
...
Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
2024-02-01 16:59:29 +00:00
Sangbum Daniel Choi
e19c12e094
enable gradient checkpointing in DetaObjectDetection and add tests in Swin/Donut_Swin ( #28615 )
...
* enable gradient checkpointing in DetaObjectDetection
* fix missing part in original DETA
* make style
* make fix-copies
* Revert "make fix-copies"
This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358.
* remove fix-copies of DetaDecoder
* enable swin gradient checkpointing
* fix gradient checkpointing in donut_swin
* add tests for deta/swin/donut
* Revert "fix gradient checkpointing in donut_swin"
This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d.
* change supports_gradient_checkpointing pipeline to PreTrainedModel
* Revert "add tests for deta/swin/donut"
This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b.
* Revert "Revert "fix gradient checkpointing in donut_swin""
This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f.
* Simple revert
* enable deformable detr gradient checkpointing
* add gradient in encoder
2024-02-01 15:07:44 +00:00
Matt
7bc6d76396
Add tip on setting tokenizer attributes ( #28764 )
...
* Add tip on setting tokenizer attributes
* Grammar
* Remove the bit that was causing doc builds to fail
2024-02-01 14:44:58 +00:00
fxmarty
709dc43239
Fix symbolic_trace with kv cache ( #28724 )
...
* fix symbolic_trace with kv cache
* comment & better test
2024-02-01 09:45:02 +01:00
Yih-Dar
eb8e7a005f
Make `is_torch_bf16_available_on_device` more strict ( #28796 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-01 09:03:53 +01:00
JB (Don)
0d26abdd3a
Adding [T5/MT5/UMT5]ForTokenClassification ( #28443 )
...
* Adding [T5/MT5/UMT5]ForTokenClassification
* Add auto mappings for T5ForTokenClassification and variants
* Adding ForTokenClassification to the list of models
* Adding attention_mask param to the T5ForTokenClassification test
* Remove outdated comment in test
* Adding EncoderOnly and Token Classification tests for MT5 and UMT5
* Fix typo in umt5 string
* Add tests for all the existing MT5 models
* Fix wrong comment in dependency_versions_table
* Reverting change to common test for _keys_to_ignore_on_load_missing
The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.
* Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
* Add fix-copies to MT5ModelTest
2024-02-01 03:53:49 +01:00
Shichao Song
7b2bd1fbbd
[docs] Correct the statement in the docstring of compute_transition_scores in generation/utils.py ( #28786 )
2024-01-31 17:07:30 +00:00