14415 Commits

Author SHA1 Message Date
Xabier de Zuazo
606d90845f
Fix Whisper Conversion Script: Correct decoder_attention_heads and _download function (#26834)
* Fix error in convert_openai_to_hf.py: "_download() missing 1 required positional argument: root"

* Fix error in convert_openai_to_hf.py: "TypeError: byte indices must be integers or slices, not str"

* Fix decoder_attention_heads value in convert_openai_to_hf.py.

Correct the assignment for `decoder_attention_heads` in the conversion script for the Whisper model.

* Black reformat convert_openai_to_hf.py file.

* Fix Whisper model configuration defaults (for Tiny).

- Correct encoder/decoder layers and attention heads count.
- Update model width (`d_model`) to 384.

* Add docstring to the convert_openai_to_hf.py script with a doctest

* Add shebang and +x permission to the convert_openai_to_hf.py

* convert_openai_to_hf.py: reuse the read model_bytes in the _download() function

* Move convert_openai_to_hf.py doctest example to whisper.md

* whisper.md: Add an inference example to the Conversion section.

* whisper.md: remove `model.config.forced_decoder_ids` from examples (deprecated)

* whisper.md: Remove "## Format Conversion" section; not used by users

* whisper.md: Use librispeech_asr_dummy dataset and load_dataset()
2023-11-07 13:39:42 +01:00
Joao Gante
90b4adc1f1
Generate: skip tests on unsupported models instead of passing (#27265) 2023-11-07 12:08:28 +00:00
Younes Belkada
26d8d5f211
Fix autoawq docker image (#27339)
* Update Dockerfile

* Update docker/transformers-all-latest-gpu/Dockerfile
2023-11-07 11:21:04 +01:00
Sanchit Gandhi
da7ea9a4e3
[Whisper] Block language/task args for English-only (#27322)
* [Whisper] Block language/task args for English-only

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-11-07 10:04:23 +00:00
Maria Khalusova
9beb2737d7
[docs] fixed links with 404 (#27327)
* fixed links with 404

* make style
2023-11-06 19:45:03 +00:00
Yih-Dar
1b20e2bb42
Fix Kosmos2Processor batch mode (#27323)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-11-06 19:05:50 +01:00
Iker García-Ferrero
a6e0d5a219
Fix VideoMAEforPretrained dtype error (#27296)
* Fix dtype error

* Fix mean and std dtype

* make style
2023-11-06 17:20:06 +00:00
Akshay Chintalapati
e9dbd39263
Update sequence_classification.md (#27281)
I'm adding accelerate as one of the libraries to install because otherwise, when running the Trainer, the model errors out with the following error:

ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`

Further context:
1. I've tried this across different environments, so I believe the environment is not the issue.
2. I had the latest version of the transformers library running.
3. Typically, even after installing accelerate and importing it, the issue was not resolved until I restarted the notebook and tried again.
2023-11-06 14:21:48 +00:00
Arthur
147f774671
[PretrainedTokenizer] add some of the most important functions to the doc (#27313) 2023-11-06 15:11:00 +01:00
Hz, Ji
1ffc4dee5b
enable memory tracker metrics for npu (#27280) 2023-11-06 13:44:21 +00:00
Pingzhi Li
d7dcfa8917
Remove an unexpected argument for FlaxResNetBasicLayerCollection (#27272)
Remove unexpected argument for FlaxResNetBasicLayerCollection
2023-11-06 12:16:03 +00:00
Yih-Dar
eef7ea98c3
Update doctest workflow file (#27306)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-11-06 11:27:48 +01:00
Yih-Dar
d788d37d24
Fix daily CI image build (#27307)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-11-06 11:27:22 +01:00
Mayank Mishra
b026b5ca6d
Fix tokenizer export for LLamaTokenizerFast (#27222)
* fix tokenizer

* fix tokenizer
2023-11-06 10:26:18 +01:00
jiaqiw09
cc3e478185
translate run_scripts.md to chinese (#27246)
* translate run_scripts.md to chinese

* translate run_scripts.md to chinese

* translate run_scripts.md to chinese
2023-11-03 10:19:41 -07:00
jiaqiw09
bf7cfac20a
translate autoclass_tutorial to chinese (#27269)
* translate autoclass_tutorial.md to chinese

* translate update
2023-11-03 09:16:55 -07:00
Susnato Dhar
1ac2463dfe
[FA2] Add flash attention for DistilBert (#26489)
* flash attention added for DistilBert

* fixes

* removed padding_masks

* Update modeling_distilbert.py

* Update test_modeling_distilbert.py

* style fix
2023-11-03 16:07:54 +00:00
Maria Khalusova
5964f820db
[Docs] Model_doc structure/clarity improvements (#26876)
* first batch of structure improvements for model_docs

* second batch of structure improvements for model_docs

* more structure improvements for model_docs

* more structure improvements for model_docs

* structure improvements for cv model_docs

* more structural refactoring

* addressed feedback about image processors
2023-11-03 10:57:03 -04:00
Younes Belkada
ad8ff96224
[Docs / SAM ] Reflect correct changes to run inference without OOM (#27268)
Update sam.md
2023-11-03 15:23:13 +01:00
Shiyu Li
f13f544ad9
Fix switch transformer mixed precision issue (#27220)
* Fix mixed precision error for switch transformer

* Fixup
2023-11-03 14:00:33 +00:00
Matt
db69bd88fb
Update the ConversationalPipeline docstring for chat templates (#27250)
* Update the ConversationalPipeline docstring now that we're using chat templates

* Direct access to conversation.messages

* Explain the string init
2023-11-03 13:17:46 +00:00
Maria Khalusova
011b15c1c7
[docs] Custom model doc update (#27213)
doc update
2023-11-03 08:03:13 -04:00
Yih-Dar
af8d1dc309
Avoid many failing tests in doctesting (#27262)
* fix

* update

* update

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-11-03 12:47:07 +01:00
Younes Belkada
8f1a43cd91
[PEFT / Tests ] Fix peft integration failing tests (#27258)
fix peft integration issues
2023-11-03 12:23:02 +01:00
Tom Aarsen
05ea7b79e6
Refactor: Use Llama RoPE implementation for Falcon (#26933)
* Use Llama RoPE implementation for Falcon

+ Add copy functionalities

* Use standard cache format for Falcon

* Simplify apply_rotary_pos_emb, copy from Llama

* Remove unnecessary cache conversion test

We don't need to convert any caches anymore!

* Resolve copy complaint
2023-11-03 11:05:55 +00:00
Lysandre Debut
e9a6c72b5e
Fuyu protection (#27248) 2023-11-03 08:45:05 +01:00
Komal Kumar
552ff24488
Fixed base model class name extraction from PeftModels (#27162)
* Fixed base model class name extraction from PeftModels

* Changes to first unwrap the model then extract the base model name

* Changed base_model to base_model.model to stay consistent with peft model abstractions
2023-11-02 20:08:03 +00:00
Chi
4991216841
Removed the redundant SiLUActivation class. (#27136)
* Removed the redundant SiLUActivation class and now use nn.functional.silu directly.

* I apologize for adding torch.functional.silu. I have replaced it with nn.SiLU.
2023-11-02 18:13:57 +00:00
jiaqiw09
00d8502b7a
translate peft.md to chinese (#27215)
* translate peft.md to chinese

* translate peft.md to chinese

* fix missing link
2023-11-02 10:42:29 -07:00
Lysandre
bc78fd1274
Dev version 2023-11-02 18:15:36 +01:00
Yoach Lacombe
0ed6729bb1
Enrich TTS pipeline parameters naming (#26473)
* enrich TTS pipeline docstring for clearer forward_params use

* change token lengths

* update Pipeline parameters

* correct docstring and make style

* fix tests

* make style

* change music prompt

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* raise errors if generate_kwargs with forward-only models

* make style

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-11-02 17:06:56 +00:00
Pietro Lesci
147e8ce4ae
Remove redundant code from T5 encoder mask creation (#27216)
* remove redundant code

* update

* add typecasting

* make `attention_mask` float again
2023-11-02 16:01:41 +00:00
Joao Gante
a6c82d4567
Generate: return past_key_values (#25086) 2023-11-02 15:39:21 +00:00
Marc Sun
441c3e0dd2
fix-deprecated-exllama-arg (#27243)
fix-exllama
2023-11-02 11:23:31 -04:00
Nicolas Patry
8801861d2d
Fixing m4t. (#27240)
* Fixing m4t.

* Trying to remove comparison ? Odd test failure.

* Adding shared. But why on earth does it hang ????

* Putting back the model weights checks the test is silently failing on
cuda.

* Fix style + unremoved comment.
2023-11-02 15:32:17 +01:00
Lysandre Debut
443bf5e9e2
Fix safetensors failing tests (#27231)
* Fix Kosmos2

* Fix ProphetNet

* Fix MarianMT

* Fix M4T

* XLM ProphetNet

* ProphetNet fix

* XLM ProphetNet

* Final M4T fixes

* Tied weights keys

* Revert M4T changes

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-11-02 15:03:09 +01:00
Michael Benayoun
4557a0dede
Wrap _prepare_4d_causal_attention_mask as a leaf function (#27236)
Wrap _prepare_4d_causal_attention_mask as a leaf function
2023-11-02 12:03:30 +00:00
Pablo Montalvo
8a312956fd
Fuyu: improve image processing (#27007)
* Fix Fuyu image scaling bug

It could produce negative padding and hence inference errors for certain
image sizes.

* initial rework commit

* add batching capabilities, refactor image processing

* add functional batching for a list of images and texts

* make args explicit

* Fuyu processing update (#27133)

* Add file headers

* Add file headers

* First pass - preprocess method with standard args

* First pass image processor rework

* Small tweaks

* More args and docstrings

* Tidying iterating over batch

* Tidying up

* Modify to have quick tests (for now)

* Fix up

* BatchFeature

* Passing tests

* Add tests for processor

* Sense check when patchifying

* Add some tests

* FuyuBatchFeature

* Post-process box coordinates

* Update to `size` in processor

* Remove unused and duplicate constants

* Store unpadded dims after resize

* Fix up

* Return FuyuBatchFeature

* Get unpadded sizes after resize

* Update exception

* Fix return

* Convert input `<box>` coordinates to model format.

* Post-process point coords, support multiple boxes/points in a single
sequence

* Replace constants

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Preprocess List[List[image]]

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update to Amy's latest state.

* post-processing returns a list of tensors

* Fix error when target_sizes is None

Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Review comments

* Update src/transformers/models/fuyu/image_processing_fuyu.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix up

* Fix up

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>

* Fix conflicts in fuyu_follow_up_image_processing (#27228)

fixing conflicts and updating on main

* Revert "Fix conflicts in fuyu_follow_up_image_processing" (#27232)

Revert "Fix conflicts in fuyu_follow_up_image_processing (#27228)"

This reverts commit acce10b6c6.

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-72-126.ec2.internal>
2023-11-02 12:25:41 +01:00
Younes Belkada
9b25c164bd
[core / Quantization] Fix for 8bit serialization tests (#27234)
* fix for 8bit serialization

* added regression tests.

* fixup
2023-11-02 12:03:51 +01:00
Hz, Ji
c52e429b1c
Reproducible checkpoint for npu (#27208)
* save NPU's RNG states when saving a checkpoint and set after all the
data skip phase when resuming training.

* re-trigger ci

* re-trigger ci
2023-11-02 10:27:13 +00:00
Roohollah Etemadi
7adaefe2bc
support bf16 (#25879)
* added bf16 support

* added cuda availability check

* applied make style, quality
2023-11-02 11:05:20 +01:00
Patrick von Platen
af3de8d87c
[Whisper, Bart, MBart] Add Flash Attention 2 (#27203)
* add whisper fa2

* correct

* change all

* correct

* correct

* fix more

* fix more

* fix more

* fix more

* fix more

* fix more

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix more

* fix more

* fix more

* fix more

* fix more

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-11-01 21:03:01 +01:00
Zach Mueller
3520e37e86
Enable split_batches through TrainingArguments (#26798)
* Enable split_batches through TrainingArguments

* Extra dispatch_batches

* Keep as default false

* Add to docstring

* Add to docstring

* Remove the capturewarnings change

* Comma
2023-11-01 14:42:38 -04:00
Lysandre Debut
95020f208e
Fix CPU offload + disk offload tests (#27204)
Fix disk offload tests + weight sharing issues
2023-11-01 19:25:23 +01:00
Marc Sun
c9e72f55b2
Add exllamav2 better (#27111)
* add exllamav2 arg

* add test

* style

* add check

* add doc

* replace by use_exllama_v2

* fix tests

* fix doc

* style

* better condition

* fix logic

* add deprecate msg

* deprecate exllama

* remove disable_exllama from the linter

* remove

* fix warning

* Revert the commits deprecating exllama

* deprecate disable_exllama for use_exllama

* fix

* fix loading attribute

* better handling of args

* remove disable_exllama from init and linter

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* better arg

* fix warning

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* switch to dict

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* style

* nits

* style

* better tests

* style

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-11-01 13:09:21 -04:00
jiaqiw09
239cd0eaa2
Translate task summary to chinese (#27180)
* translate task_summary.md to chinese

* update translation

* update translation

* fix _toctree.yml
2023-11-01 09:28:34 -07:00
Rafael Padilla
1e32b05e06
improving TimmBackbone to support FrozenBatchNorm2d (#27160)
* supporting freeze_batch_norm_2d

* supporting freeze_batch_norm_2d

* including unfreeze + separate into methods

* fix typo

* calling unfreeze

* lint

* Update src/transformers/models/timm_backbone/modeling_timm_backbone.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: Rafael Padilla <rafael.padilla@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-11-01 12:58:35 -03:00
Wesley L Passos
21a2fbaf48
Fix docstring in get_oneformer_resize_output_image_size func (#27207) 2023-11-01 15:31:13 +00:00
Andi Powers Holmes
f8afb2b2ec
Add TensorFlow implementation of ConvNeXTv2 (#25558)
* Add type annotations to TFConvNextDropPath

* Use tf.debugging.assert_equal for TFConvNextEmbeddings shape check

* Add TensorFlow implementation of ConvNeXTV2

* check_docstrings: add TFConvNextV2Model to exclusions

TFConvNextV2Model and TFConvNextV2ForImageClassification have docstrings
which are equivalent to their PyTorch cousins, but a parsing issue prevents them
from passing the test.

Adding exclusions for these two classes as discussed in #25558.
2023-11-01 15:09:55 +00:00
Patrick von Platen
391d14e810
[WhisperForCausalLM] Add WhisperForCausalLM for speculative decoding (#27195)
* finish

* add tests

* fix all tests

* [Assistant Decoding] Add test

* fix more

* better

* finish

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* finish

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-11-01 16:01:53 +01:00