Commit Graph

12731 Commits

Author SHA1 Message Date
Younes Belkada
09a9888fe9
[bnb] 8bit models should not be converted to DDP (#22628)
add safety checker
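A hedged sketch of the kind of safety check this adds (function name and message are illustrative, not the actual diff):

```python
from torch.nn.parallel import DistributedDataParallel as DDP

def maybe_wrap_ddp(model, device_ids):
    # transformers marks bitsandbytes-quantized models with `is_loaded_in_8bit`;
    # their int8 weights cannot be synchronized across DDP replicas.
    if getattr(model, "is_loaded_in_8bit", False):
        raise ValueError("8-bit (bnb) models cannot be wrapped in DistributedDataParallel.")
    return DDP(model, device_ids=device_ids)
```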
2023-04-06 18:09:24 +02:00
Yih-Dar
d0b83fe2e1
A script to add/update pipeline_model_mapping systematically (#22180)
* Auto. add and update pipeline_model_mapping

* Fix style and quality

* Finalize (comments)

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-06 18:08:14 +02:00
Yih-Dar
fa01127a67
update_pip_test_mapping (#22606)
* Add TFBlipForConditionalGeneration

* update pipeline_model_mapping

* Add import

* Revert changes in GPTSanJapaneseTest

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-06 17:56:06 +02:00
Connor Henderson
321b0908dd
docs: Fix broken link to generation strategies (#22623)
fix broken link
2023-04-06 11:48:50 -04:00
Yih-Dar
2c22bc79c2
Make tiny model creation + pipeline testing more robust (#22500)
* Final Tiny things

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-06 17:45:55 +02:00
amyeroberts
12d51db243
Backbone add mixin tests (#22542)
* Add out_indices to backbones, deprecate out_features

* Update - can specify both out_features and out_indices but not both

* Add backbone mixin tests

* Test tidy up

* Add test_backbone for convnext

* Remove redefinition of method

* Update for Dinat and Nat backbones

* Update tests

* Smarter indexing

* Add checks on config creation for backbone

* PR comments
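A brief sketch of the interface these changes exercise (checkpoint name and exact semantics are assumptions, not taken from the PR):

```python
import torch
from transformers import AutoBackbone

# out_indices selects stages by position, out_features by name; per the PR,
# you may pass either, but conflicting values of both raise an error.
backbone = AutoBackbone.from_pretrained("microsoft/resnet-50", out_indices=[2, 3, 4])
pixel_values = torch.randn(1, 3, 224, 224)
feature_maps = backbone(pixel_values).feature_maps
print([fm.shape for fm in feature_maps])  # one feature map per requested stage
```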
2023-04-06 13:50:15 +01:00
Joao Gante
48706c7178
Seq2SeqTrainer: use unwrapped model to retrieve the generation config (#22584) 2023-04-06 13:29:58 +01:00
Nicolas Patry
0aa1153ffb
Revert error back into warning for byte fallback conversion. (#22607) 2023-04-06 14:00:29 +02:00
Nicolas Patry
1670be4bde
Adding Llama FastTokenizer support. (#22264)
* Adding Llama FastTokenizer support.

- Requires https://github.com/huggingface/tokenizers/pull/1183 version
- Only support byte_fallback for llama, raise otherwise (safety net).
- Lots of open questions around special tokens

How to test:

```python
from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

# Flip to True to reuse a previously converted tokenizer instead of reconverting.
use_cached = False
if use_cached:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    # The slow (sentencepiece) and converted fast tokenizers must agree on ids...
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    # ...and on the round-tripped text (up to leading whitespace on the slow side).
    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```

The converter + some test script.

The test script.

Tmp save.

Adding Fast tokenizer + tests.

Adding the tokenization tests.

Correct combination.

Small fix.

Fixing tests.

Fixing with latest update.

Rebased.

fix copies + normalized added tokens + copies.

Adding doc.

TMP.

Doc + split files.

Doc.

Versions + try import.

Fix Camembert + warnings -> Error.

Fix by ArthurZucker.

Not a decorator.

* Fixing comments.

* Adding more to docstring.

* Doc rewriting.
2023-04-06 09:53:03 +02:00
Kaustubh
1564189298
feat(model parallelism): moving the labels to the same device as the logits for gpt2 and bart (#22591) 2023-04-05 14:37:17 -04:00
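The pattern this (and the T5 change further down) applies, sketched rather than quoted from the diff:

```python
import torch.nn as nn

def compute_lm_loss(lm_logits, labels):
    # Under naive model parallelism the lm_head may sit on a different GPU
    # than the labels, so move labels to the logits' device before the loss.
    labels = labels.to(lm_logits.device)
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    loss_fct = nn.CrossEntropyLoss()
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
```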
Matt
e577bd0f13
Use native TF checkpoints for the BLIP TF tests (#22593)
* Use native TF checkpoints for the TF tests

* Remove unneeded exceptions
2023-04-05 18:43:14 +01:00
Younes Belkada
176ceff91f
Add DePlot + MatCha to transformers (#22528)
* add deplot + matcha to `transformers`

* more docs

* correct path

* Update docs/source/en/model_doc/deplot.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* use auto processor

* Update docs/source/en/model_doc/matcha.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Update docs/source/en/model_doc/deplot.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add correct names

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2023-04-05 17:43:48 +02:00
Nicolas Patry
126eafe396
Adding support for BPE merge creation from scores instead of ids. (#22582)
* Adding support for BPE merge creation from scores instead of ids.

* Revert warn -> raise.

* Update src/transformers/convert_slow_tokenizer.py

* Quality.
2023-04-05 16:03:06 +02:00
Matt
12f1a3bb3c
Fix a typo in one of the BLIP pretrained checkpoint names (#22588)
Fixes a typo in one of the BLIP pretrained checkpoint names
2023-04-05 14:56:20 +01:00
Mikel Penagarikano
d5239bab5b
Sync processes before loading the processor in run_speech_recognition_ctc.py (#21926)
* Update run_speech_recognition_ctc.py

Make sure all processes wait until data is saved before loading the processor from the output_dir

* Make sure all processes wait until data is saved before loading the processor from the output_dir

* Update run_speech_recognition_ctc.py

* Update run_speech_recognition_seq2seq.py
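A sketch of the synchronization pattern (using torch.distributed directly; the script itself goes through the Trainer's utilities):

```python
import torch.distributed as dist

def save_then_load_processor(processor, output_dir):
    # Only the main process writes; every rank waits at the barrier so none
    # reads from output_dir before the files exist.
    if dist.is_initialized():
        if dist.get_rank() == 0:
            processor.save_pretrained(output_dir)
        dist.barrier()
    else:
        processor.save_pretrained(output_dir)
    return type(processor).from_pretrained(output_dir)
```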
2023-04-05 09:36:04 -04:00
Wonhyeong Seo
f49b0762a1
docs: ko: complete _toctree.yml (#22581)
Co-authored-by: gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com>
2023-04-05 09:32:17 -04:00
Quentin Meeus
4861c25817
Add thousands separator in training summary (#22583)
The logger prints a summary at the beginning of training with info such as the number of examples, number of parameters, and total number of steps. Those numbers can be quite large and hard to read, so this adds a thousands separator to the following (a short formatting sketch follows the list):
- num_examples
- num_train_epochs
- per_device_train_batch_size
- total_train_batch_size
- max_steps
- num_trainable_params
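The change boils down to Python's `,` format specifier; a minimal illustration:

```python
num_examples = 1_281_167
print(f"  Num examples = {num_examples:,}")          # Num examples = 1,281,167
print(f"  Total optimization steps = {120_000:,}")   # Total optimization steps = 120,000
```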
2023-04-05 09:28:38 -04:00
Matt
2a91a9ef66
Fix PT-TF equivalence test for GPT1 (#22586)
* Re-enable skipped test and fix the hidden state shape issue

* Actually fix the bug instead of just doing something wrong
2023-04-05 13:16:00 +01:00
Joao Gante
0684284911
Tests: disable accelerate_tests mark warnings (#22585) 2023-04-05 13:13:26 +01:00
Sylvain Gugger
6c640f098a
Move back doctest instructions to setup.cfg (#22587) 2023-04-05 07:53:19 -04:00
Joao Gante
861ff890d6
Generate: TextIteratorStreamer timeout (#22576) 2023-04-05 09:57:46 +01:00
Sylvain Gugger
11fd2c773b
Skip failing test 2023-04-04 21:26:17 -04:00
Matt
edb704b26e
Fix inverted conditional in TF common test! (#22540)
* Fix inverted conditional in TF common test!

* Make the same change in the PT tests file

* Make sure hidden states for GPT2 have the same output shape in PT/TF

* Minor fix to PT implementation of token classification loss

* Skip loss equivalence test for TFHubert because it keeps overflowing to inf

* Compute LM loss for TF the (weird) way it's computed in PT

* Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert

* Fix - don't try to access the hidden states property when output is a tuple
2023-04-04 21:59:54 +01:00
Sourab Mangrulkar
48fbd8fa2e
fix _no_split_modules for Whisper model (#22486) 2023-04-04 13:01:32 -04:00
Shubhamai
900677487d
Flax Regnet (#21867)
* initial commit

* review changes

* post model PR merge

* updating doc
2023-04-04 12:41:12 -04:00
Sun Haozhe
fc5b7419d4
corrected the code comment for the output of find_pruneable_heads_and_indices (#22557)
* corrected/clarified the code comment of find_pruneable_heads_and_indices

* ran make style
2023-04-04 11:29:42 -04:00
Matt
5f3ea66bc0
Add TF port of BLIP (#22090)
* Initial commit

* more stash commit

* Yet another stash commit

* yet more stash commit

* Mostly working except for docs / repo consistency

* Stop importing model list from torch file

* Add TF BLIP models to docs

* Add auto classes

* Move get_text_features and get_image_features

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/blip/test_modeling_tf_blip_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use channels_last convolutions in TF (better performance + compatibility)

* Remove _shape function

* Move multi-line statement to one line in PT + TF

* Specify tf.keras.layers instead of importing from it

* Remove test_gradient_checkpointing and empty test_training methods

* move some multi-line statements to one line

* Update docstring for generate

* Remove pruned heads set

* Remove self.seq_len_dim

* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states

* ensure original model follows config in more cases

* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!

* Add training args throughout the models and layers

* make fixup

* Fix docstring for inputs_embeds

* Add docstring for is_decoder

* Add docstrings to text models

* Remove redundant computation

* Add unpack_inputs / keras_serializable

* Add modeling_tf_blip to doctests

* Add config classes for keras serialization

* Changes to allow model porting with pt-to-tf

* Quick fix to decoder head and test tweaks

* Revert an issue with masking the embeddings outputs

* Allow missing keys in some equivalence tests (for unused layers)

* Add tf-pt equivalence tests back in

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Refactor invert_attention_mask out into tf_utils

* Re-enable cross-tests on the PT side too

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-04 16:05:22 +01:00
Nicolas Patry
a515d0a77c
Soft error whisper. (#22475)
* Soft error whisper.

* Fix format.

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>
2023-04-04 16:21:57 +02:00
Maziyar Panahi
98268b2e76
Add id2label and label2id to model's config in run_xnli (#22558)
Add id2label and label2id to config in run_xnli
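A sketch of the resulting config setup (checkpoint name is a placeholder; the label set is XNLI's):

```python
from transformers import AutoConfig

label_list = ["entailment", "neutral", "contradiction"]  # XNLI labels
config = AutoConfig.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=len(label_list),
    id2label={i: label for i, label in enumerate(label_list)},
    label2id={label: i for i, label in enumerate(label_list)},
)
# The saved checkpoint now decodes predictions without the original dataset.
```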
2023-04-04 09:28:57 -04:00
Younes Belkada
fa2bdffc5d
[bnb] Fix typo (#22556)
Update modeling_utils.py
2023-04-04 15:26:45 +02:00
Sylvain Gugger
28fcf00607
Remove hack for dynamic modules and use Python functions instead (#22537) 2023-04-04 09:20:13 -04:00
Viktor Scherbakov
871598be55
Implemented safetensors checkpoints save/load for Trainer (#22498)
* implemented safetensors save/load

* remove duplicated file

* added tests

* more tests

* style fix

* fix tf tests

* change to list comprehension

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* review fixes + safe load for sharded checkpoint

* style fix

* remove rogue import

* remove partial to avoid undefined exception

* use naming alias instead of safetensors.torch

* fix safe sharding in tests

* grammar

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update docs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update docs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor corrections

* style

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
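The underlying mechanism, as a minimal sketch (not the Trainer's actual code path):

```python
import torch
import safetensors.torch

model = torch.nn.Linear(4, 2)  # stand-in for the trained model

# safetensors stores raw tensors without pickle, so loading a checkpoint
# cannot execute arbitrary code, unlike torch.load on an untrusted file.
safetensors.torch.save_file(model.state_dict(), "model.safetensors")

state_dict = safetensors.torch.load_file("model.safetensors", device="cpu")
model.load_state_dict(state_dict)
```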
2023-04-04 09:05:04 -04:00
Arthur
00b5887b94
🚨🚨🚨 [NLLB Tokenizer] Fix the prefix tokens 🚨🚨🚨 (#22313)
* fix the prefix tokens

* update fast and test values

* add legacy behaviour

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* update disclaimer, link issue in PR and behavioral changes

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <hi@lysand.re>

* styling

* make a quote

* quote this time

---------

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-04-04 14:53:06 +02:00
TheWall9
ad5e9b6c6a
[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of past_key_values when generating as a decoder (#22416)
* fix RoFormerEncoder position embedding when generating as a decoder

* make fixup

* add test case to check generation with past key values

* remove duplicated code
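The class of bug being fixed, sketched (the cache layout below is an assumption about the usual `(batch, heads, past_len, head_dim)` convention):

```python
import torch

def build_position_ids(seq_len, past_key_values=None):
    past_length = 0
    if past_key_values is not None:
        # past_key_values[layer][0] is the cached key tensor.
        past_length = past_key_values[0][0].shape[2]
    # Start at past_length: during cached generation seq_len is 1, and without
    # the offset every new token would be embedded as position 0.
    return torch.arange(past_length, past_length + seq_len)
```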
2023-04-04 12:50:33 +02:00
Joao Gante
1905384fd5
Generate: Add text streamer decoding options (#22544) 2023-04-04 09:03:13 +01:00
Pavel T
41a2f3529c
Fix OPTForQuestionAnswering docstring (#22481)
* Fix OPTForQuestionAnswering docstring

for more accurate model answer decoding

* black style fix

* doc-builder style
2023-04-03 21:05:31 -04:00
Younes Belkada
159ff3342c
Update test_image_processing_pix2struct.py (#22543) 2023-04-03 15:26:35 -04:00
Sylvain Gugger
c14d31294e
Skip failing test 2023-04-03 14:07:40 -04:00
Xuehai Pan
4169dc84bf
[setup] migrate setup script to pyproject.toml (#22539)
* [setup] migrate setup script to `pyproject.toml`

* [setup] cleanup configurations

* remove unused imports
2023-04-03 14:03:41 -04:00
Vladimir Blagojevic
a17841ac49
Generate: Enable easier TextStreamer customization (#22516) 2023-04-03 18:49:38 +01:00
Xuehai Pan
80d1319e1b
[setup] drop deprecated distutils usage (#22531)
* [setup] drop deprecated `distutils` usage

* drop deprecated `distutils.util.strtobool` usage

* fix import order

* reformat docstring by `doc-builder`
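For reference, the deprecated helper is small enough to replace inline; a drop-in sketch matching `distutils.util.strtobool` semantics:

```python
def strtobool(val: str) -> int:
    # Same truth table as the deprecated distutils.util.strtobool (PEP 632).
    val = val.lower()
    if val in ("y", "yes", "t", "true", "on", "1"):
        return 1
    if val in ("n", "no", "f", "false", "off", "0"):
        return 0
    raise ValueError(f"invalid truth value {val!r}")
```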
2023-04-03 12:04:24 -04:00
Ilya
4c33a0c4fc
Fix missing metrics with multiple eval datasets (#22536) 2023-04-03 12:03:57 -04:00
Younes Belkada
d7a4f5becc
[T5] Enable naive Pipeline Parallelism training for T5 (#22535)
* enable PP for T5

* make fixup

* fix failing tests
2023-04-03 17:55:37 +02:00
Younes Belkada
cab048fb35
[Trainer] Force is_model_parallel when model is loaded in multiple GPUs using accelerate (#22532)
* add `is_model_parallel` arg on Trainer

* add warning

* adapt from suggestions

* revert t5 changes

* remove commas

* adapt from suggestions
2023-04-03 17:10:50 +02:00
zhbh01
aecbcb3680
[BLIP] fix cross attentions for BlipTextEncoder (#22515) 2023-04-03 11:00:26 -04:00
Thibault Douzon
4e441e529c
fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695)
LayoutLMv3TokenizerFast produces an empty 'Ġ' token with `offset_mapping = (0, 0)`.
The next token was then wrongly assumed to also be the beginning of a word and was
not correctly assigned `pad_token_label`.
Modify the test with text that produces a 'Ġ' token.
Remove the copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`.

solves issue: #19978
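The labeling rule the fix restores, sketched (here `token_labels` is already expanded to one entry per token, an assumption made to keep the example short):

```python
def align_subword_labels(offset_mapping, token_labels, pad_token_label=-100):
    aligned = []
    for (start, end), label in zip(offset_mapping, token_labels):
        if start == 0 and end > 0:  # a genuine word beginning
            aligned.append(label)
        else:  # subword continuation, or the empty 'Ġ' token with offsets (0, 0)
            aligned.append(pad_token_label)
    return aligned
```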
2023-04-03 10:32:36 -04:00
Kirill
a60010566a
llama docs: fix conversion script url (#22514) 2023-04-03 10:28:40 -04:00
larekrow
9419f144ad
Fix convert_opt_original_pytorch_checkpoint_to_pytorch.py typo (#22526)
`load_checkpoint()` silently fails because `".qkj_proj." in key` is always `False`, which eventually causes an error at `model.load_state_dict(state_dict)`.
2023-04-03 10:06:52 -04:00
Joao Gante
a55a822adf
Generate: TextIteratorStreamer (streamer for gradio) (#22501)
* haha text go brrr (but in gradio)
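The intended usage pattern (checkpoint is a placeholder; the `timeout` kwarg comes from #22576, listed above):

```python
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Streaming text feels", return_tensors="pt")
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, timeout=30.0)

# generate() blocks, so it runs in a thread; the streamer acts as a generator
# that the main thread (e.g. a gradio callback) consumes chunk by chunk.
thread = Thread(target=model.generate, kwargs={**inputs, "streamer": streamer, "max_new_tokens": 20})
thread.start()
for chunk in streamer:
    print(chunk, end="", flush=True)
thread.join()
```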
2023-04-03 15:04:37 +01:00
Mohammed Jabir
7d25c9c81e
added biogpt token classifier (#22447)
* added biogpt token classifier

* fix reviews

* Updated modeling_biogpt.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-04-03 09:20:02 -04:00