Commit Graph

255 Commits

Author SHA1 Message Date
Lysandre Debut
95020f208e
Fix CPU offload + disk offload tests (#27204)
Fix disk offload tests + weight sharing issues
2023-11-01 19:25:23 +01:00
Lysandre Debut
113ebf80ac
Safetensors serialization by default (#27064)
* Safetensors serialization by default

* First pass on the tests

* Second pass on the tests

* Third pass on the tests

* Fix TF weight loading from TF-format safetensors

* Specific encoder-decoder fixes for weight crossloading

* Add VisionEncoderDecoder fixes for TF too

* Change filename test for pt-to-tf

* One missing fix for TFVisionEncoderDecoder

* Fix the other crossload test

* Support for flax + updated tests

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Sanchit's comments

* Sanchit's comments 2

* Nico's comments

* Fix tests

* cleanup

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-31 19:16:49 +01:00
Younes Belkada
f7ea959b96
[core/ GC / tests] Stronger GC tests (#27124)
* stronger GC tests

* better tests and skip failing tests

* break down into 3 sub-tests

* break down into 3 sub-tests

* refactor a bit

* more refactor

* fix

* last nit

* credits contrib and suggestions

* credits contrib and suggestions

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-10-30 19:53:46 +01:00
Younes Belkada
06e782da4e
[core] Refactor of gradient_checkpointing (#27020)
* v1

* fix

* remove `create_custom_forward`

* fixup

* fixup

* add test and fix all failing GC tests

* remove all remaining `create_custom_forward` methods

* fix idefics bug

* fixup

* replace with `__call__`

* add comment

* quality
2023-10-25 12:16:15 +02:00
Younes Belkada
5a73316bed
[FA-2] Final fix for FA2 dtype (#26846)
* final fix for FA2 dtype

* try

* oops

* Update src/transformers/models/falcon/modeling_falcon.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* apply fix everywhere

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-18 19:48:55 +02:00
Younes Belkada
41c42f85f6
[FA2] Fix flash attention 2 fine-tuning with Falcon (#26852)
fix fa2 + dropout issue
2023-10-17 15:38:03 +02:00
Younes Belkada
e6d250e4cd
[core] fix silent bug keep_in_fp32 modules (#26589)
* fix silent bug `keep_in_fp32` modules

* final fix

* added a common test.

* Trigger CI

* revert
2023-10-05 14:44:31 +02:00
Younes Belkada
ae9a344cce
[Mistral] Add Flash Attention-2 support for mistral (#26464)
* add FA-2 support for mistral

* fixup

* add sliding windows

* fixing few nits

* v1 slicing cache - logits do not match

* add comment

* fix bugs

* more mem efficient

* add warning once

* add warning once

* oops

* fixup

* more comments

* copy

* add safety checker

* fixup

* Update src/transformers/models/mistral/modeling_mistral.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* copied from

* up

* raise when padding side is right

* fixup

* add doc + few minor changes

* fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-03 13:44:46 +02:00
Yih-Dar
391177441b
Avoid all-zeor attnetion mask used in testing (#26469)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-09-29 11:06:06 +02:00
Younes Belkada
153755ee38
[FA / tests] Add use_cache tests for FA models (#26415)
* add use_cache tests for FA

* fixup
2023-09-27 12:21:54 +02:00
Younes Belkada
368a58e61c
[core ] Integrate Flash attention 2 in most used models (#25598)
* v1

* oops

* working v1

* fixup

* add some TODOs

* fixup

* padding support + try with module replacement

* nit

* alternative design

* oops

* add `use_cache` support for llama

* v1 falcon

* nit

* a bit of refactor

* nit

* nits nits

* add v1 padding support falcon (even though it seemed to work before)

* nit

* falcon works

* fixup

* v1 tests

* nit

* fix generation llama flash

* update tests

* fix tests + nits

* fix copies

* fix nit

* test- padding mask

* stype

* add more mem efficient support

* Update src/transformers/modeling_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fixup

* nit

* fixup

* remove it from config when saving

* fixup

* revert docstring

* add more checks

* use values

* oops

* new version

* fixup

* add same trick for falcon

* nit

* add another test

* change tests

* fix issues with GC and also falcon

* fixup

* oops

* Update src/transformers/models/falcon/modeling_falcon.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add init_rope

* updates

* fix copies

* fixup

* fixup

* more clarification

* fixup

* right padding tests

* add docs

* add FA in docker image

* more clarifications

* add some figures

* add todo

* rectify comment

* Change to FA2

* Update docs/source/en/perf_infer_gpu_one.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* split in two lines

* change test name

* add more tests

* some clean up

* remove `rearrange` deps

* add more docs

* revert changes on dockerfile

* Revert "revert changes on dockerfile"

This reverts commit 8d72a66b4b.

* revert changes on dockerfile

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <hi@lysand.re>

* address some comments

* docs

* use inheritance

* Update src/transformers/testing_utils.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* fixup

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

* final comments

* clean up

* style

* add cast + warning for PEFT models

* fixup

---------

Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-09-22 17:42:10 +02:00
Sam Passaglia
8e3980a290
[FIX] resize_token_embeddings (#26102)
* fix roundup command

* add test for resize_token_embeddings

* Update tests/test_modeling_common.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-09-19 21:44:41 +02:00
Arthur
eb644980eb
Fix pad to multiple of (#25732)
* nits

* update the test

* nits

* update

* fix bark

* fix bark tests and allow padding to multiple of without new tokens
2023-09-15 11:53:39 -04:00
Jinho Park
17fdd35481
Add BROS (#23190)
* add Bros boilerplate

* copy and pasted modeling_bros.py from official Bros repo

* update copyright of bros files

* copy tokenization_bros.py from official repo and update import path

* copy tokenization_bros_fast.py from official repo and update import path

* copy configuration_bros.py from official repo and update import path

* remove trailing period in copyright line

* copy and paste bros/__init__.py from official repo

* save formatting

* remove unused unnecessary pe_type argument - using only crel type

* resolve import issue

* remove unused model classes

* remove unnecessary tests

* remove unused classes

* fix original code's bug - layer_module's argument order

* clean up modeling auto

* add bbox to prepare_config_and_inputs

* set temporary value to hidden_size (32 is too low because of the of the
Bros' positional embedding)

* remove decoder test, update create_and_check* input arguemnts

* add missing variable to model tests

* do make fixup

* update bros.mdx

* add boilerate plate for no_head inference test

* update BROS_PRETRAINED_MODEL_ARCHIVE_LIST (add naver-clova-ocr prefix)

* add prepare_bros_batch_inputs function

* update modeling_common to add bbox inputs in Bros Model Test

* remove unnecessary model inference

* add test case

* add model_doc

* add test case for token_classification

* apply fixup

* update modeling code

* update BrosForTokenClassification loss calculation logic

* revert logits preprocessing logic to make sure logits have original shape

* - update class name

* - add BrosSpadeOutput
- update BrosConfig arguments

* add boilerate plate for no_head inference test

* add prepare_bros_batch_inputs function

* add test case

* add test case for token_classification

* update modeling code

* update BrosForTokenClassification loss calculation logic

* revert logits preprocessing logic to make sure logits have original shape

* apply masking on the fly

* add BrosSpadeForTokenLinking

* update class name
put docstring to the beginning of the file

* separate the logits calculation logic and loss calculation logic

* update logic for loss calculation so that logits shape doesn't change
when return

* update typo

* update prepare_config_and_inputs

* update dummy node initialization

* update last_hidden_states getting logic to consider when return_dict is False

* update box first token mask param

* bugfix: remove random attention mask generation

* update keys to ignore on load missing

* run make style and quality

* apply make style and quality of other codes

* update box_first_token_mask to bool type

* update index.md

* apply make style and quality

* apply make fix-copies

* pass check_repo

* update bros model doc

* docstring bugfix fix

* add checkpoint for doc, tokenizer for doc

* Update README.md

* Update docs/source/en/model_doc/bros.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update bros.md

* Update src/transformers/__init__.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/bros.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* apply suggestions from code review

* apply suggestions from code review

* revert test_processor_markuplm.py

* Update test_processor_markuplm.py

* apply suggestions from code review

* apply suggestions from code review

* apply suggestions from code review

* update BrosSpadeELForTokenClassification head name to entity linker

* add doc string for config params

* update class, var names to more explicit and apply suggestions from code review

* remove unnecessary keys to ignore

* update relation extractor to be initialized with config

* add bros processor

* apply make style and quality

* update bros.md

* remove bros tokenizer, add bros processor that wraps bert tokenizer

* revert change

* apply make fix-copies

* update processor code, update itc -> initial token, stc -> subsequent token

* add type hint

* remove unnecessary condition branches in embedding forward

* fix auto tokenizer fail

* update docstring for each classes

* update bbox input dimension as standard 2 points and convert them to 4
points in forward pass

* update bros docs

* apply suggestions from code review : update Bros -> BROS in bros.md

* 1. box prefix var -> bbox
2. update variable names to be more explicit

* replace einsum with torch matmul

* apply style and quality

* remove unused argument

* remove unused arguments

* update docstrings

* apply suggestions from code review: add BrosBboxEmbeddings, replace
einsum with classical matrix operations

* revert einsum update

* update bros processor

* apply suggestions from code review

* add conversion script for bros

* Apply suggestions from code review

* fix readme

* apply fix-copies

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-14 18:02:37 +01:00
Dong-Yong Lee
8881f38a4f
Fix beam search when using model parallel (#24969)
* Fix GPTNeoX beam search when using parallelize

* Fix beam search idx device when using model parallel

* remove onnx related stuff

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix: move test_beam_search_on_multi_gpu to GenerationTesterMixin

* fix: add right item to _no_split_modules of MegaPreTrainedModel

* fix: add num_beams within parallelized beam_search test

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-09-14 11:00:52 -04:00
Arthur
d6bf08f7f6
[resize_embedding] Introduce pad_to_multiple_of and guidance (#25088)
* fix

* revert cahnges and update resizing of embedding layer

* use wraning

* fixup

* more styling nits

* fix all tests that overload the embedding tests

* 👀👀 remove breakpoint

* remove useless overload + overload correctly where needed

* resize lm head with new vocab size

* reverse not necessary changes

* style

* fix CIs!

* fix last CI tests, adapt bark and Marian

* fixup
2023-08-17 17:00:32 +02:00
Yih-Dar
6ea3ee3cd2
Fix test_model_parallelism (#25359)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-08 10:48:45 +02:00
Yih-Dar
c177606fb4
Fix more offload edge cases (#25342)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-07 17:45:41 +02:00
Sylvain Gugger
fab1a0aa82
Give more memory in test_disk_offload (#25315) 2023-08-04 14:10:31 +02:00
Yih-Dar
bd90cda9a6
CI with num_hidden_layers=2 🚀🚀🚀 (#25266)
* CI with layers=2

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 20:22:36 +02:00
Connor Henderson
0779fc8eb8
Edit err message and comment in test_model_is_small (#25087)
* Edit err message and comment in

* put back 80M comment
2023-07-25 12:24:36 -04:00
Sylvain Gugger
f295fc8a16
Fix last models for common tests that are too big. (#25058)
* Fix last models for common tests that are too big.

* Remove print statement
2023-07-25 07:56:04 -04:00
Sylvain Gugger
afe8bfc075
Comment again print statement 2023-07-24 10:12:20 -04:00
Sylvain Gugger
42571f6eb8
Make more test models smaller (#25005)
* Make more test models tiny

* Make more test models tiny

* More models

* More models
2023-07-24 10:08:47 -04:00
Sylvain Gugger
1023705440
Check models used for common tests are small (#24824)
* First models

* Conditional DETR

* Treat DETR models, skip others

* Skip LayoutLMv2 as well

* Fix last tests
2023-07-14 14:43:19 -04:00
Yih-Dar
fd6735102a
Make PT/Flax tests could be run on GPU (#24557)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-28 20:11:01 +02:00
Sylvain Gugger
8e5d1619b3
Clean load keys (#24505)
* Preliminary work on some models

* Fix test load missing and make sure nonpersistent buffers are tested

* Always ignore nonpersistent buffers if in state_dict

* Treat models

* More models

* Treat remaining models

* Fix quality

* Fix tests

* Remove draft

* This test is not needed anymore

* Fix copies

* Fix last test

* Newly added models

* Fix last tests

* Address review comments
2023-06-27 14:45:40 -04:00
Younes Belkada
3ce3385c47
Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420)
Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)"

This reverts commit 285a48011d.
2023-06-22 16:11:27 +02:00
Younes Belkada
285a48011d
Fix gradient checkpointing + fp16 autocast for most models (#24247)
* fix gc bug

* continue PoC on OPT

* fixes

* 🤯

* fix tests

* remove pytest.mark

* fixup

* forward contrib credits from discussions

* forward contrib credits from discussions

* reverting changes on untouched files.

---------

Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>
2023-06-21 17:04:59 +02:00
Sylvain Gugger
372f50030b
Split common test from core tests (#24284) 2023-06-15 07:30:24 -04:00
Sylvain Gugger
695928e1e5
Tied params cleanup (#24211)
* First test

* Add info for all models

* style

* Repo consistency

* Fix last model and cleanup prints

* Repo consistency

* Use consistent function for detecting tied weights
2023-06-13 11:38:39 -04:00
Stas Bekman
bbbc5c15d4
[AutoModel] fix torch_dtype=auto in from_pretrained (#23379)
* [automodel] fix torch_dtype=auto in from_pretrained

* add test

* fix logic

* Update src/transformers/models/auto/auto_factory.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-16 10:21:42 -07:00
Lucain
74c55ab9e5
Prepare tests for hfh 0.14 (#22958)
* Test hf_hub 0.14.0rc1

* fix mocked tests

* package version

---------

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: testbot <lucainp@hf.co>
2023-04-24 09:31:50 -04:00
Matt
edb704b26e
Fix inverted conditional in TF common test! (#22540)
* Fix inverted conditional in TF common test!

* Make the same change in the PT tests file

* Make sure hidden states for GPT2 have the same output shape in PT/TF

* Minor fix to PT implementation of token classification loss

* Skip loss equivalence test for TFHubert because it keeps overflowing to inf

* Compute LM loss for TF the (weird) way it's computed in PT

* Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert

* Fix - don't try to access the hidden states property when output is a tuple
2023-04-04 21:59:54 +01:00
Matt
5f3ea66bc0
Add TF port of BLIP (#22090)
* Initial commit

* more stash commit

* Yet another stash commit

* yet more stash commit

* Mostly working except for docs / repo consistency

* Stop importing model list from torch file

* Add TF BLIP models to docs

* Add auto classes

* Move get_text_features and get_image_features

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/blip/test_modeling_tf_blip_text.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use channels_last convolutions in TF (better performance + compatibility)

* Remove _shape function

* Move multi-line statement to one line in PT + TF

* Specify tf.keras.layers instead of importing from it

* Remove test_gradient_checkpointing and empty test_training methods

* move some multi-line statements to one line

* Update docstring for generate

* Remove pruned heads set

* Remove self.seq_len_dim

* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states

* ensure original model follows config in more cases

* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!

* Add training args throughout the models and layers

* make fixup

* Fix docstring for inputs_embeds

* Add docstring for is_decoder

* Add docstrings to text models

* Remove redundant computation

* Add unpack_inputs / keras_serializable

* Add modeling_tf_blip to doctests

* Add config classes for keras serialization

* Changes to allow model porting with pt-to-tf

* Quick fix to decoder head and test tweaks

* Revert an issue with masking the embeddings outputs

* Allow missing keys in some equivalence tests (for unused layers)

* Add tf-pt equivalence tests back in

* Update src/transformers/models/blip/modeling_tf_blip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Refactor invert_attention_mask out into tf_utils

* Re-enable cross-tests on the PT side too

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-04 16:05:22 +01:00
Nicolas Patry
d143087d18
Making sure we can use safetensors to serialize all the time. (#22437)
* Making sure we can use safetensors to serialize all the time.

* Expanding the tests for increased coverage.

* Update the test.

* Getting current state of affairs.

* Tentative fix.

* Fixing black version.

* Fixing the worst offenders.

* Try to modify less files.

* Fixing blip_2 (Weird solution right now).

* Fixing deta.

* Fix blip ?

* Missing extra newline.

* No deta modification.

* Adding some comments.

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Addressing comments.

* Addressing comments.

* creating warn_once.

* Warning_once !

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-31 16:07:35 +02:00
Patrick von Platen
f780557a34
[Safetensors] Add explicit flag to from pretrained (#22083)
* [Safetensors] Add explicit  flag to from pretrained

* add test

* remove @

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-13 21:39:06 +01:00
Lucain
923110b74f
Remove set_access_token usage + fail tests if FutureWarning (#22051)
* Remove set_access_token usage + fail tests if FutureWarning

* do not fail on FutureWarning in CI

---------

Co-authored-by: testbot <lucainp@hf.co>
2023-03-09 09:23:48 -05:00
Yih-Dar
9474abdf47
Use larger atol in torch.allclose for some tests (#21966)
Use larger atol

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-06 17:41:00 +01:00
Younes Belkada
831f3144a6
[tests] add accelerate marker (#21743)
* add `accelerate` marker

* add to docs

* Update docs/source/en/testing.mdx
2023-02-27 12:33:34 +01:00
Arthur
c51dc4f927
[torch] remove deprecated uint8 in favor of bool (#21384)
* uint8 -> bool

* fix copies

* style

* update test modeling commen when checking attention buffers

* style

* use logical not on random mask instead of subtraction with 1

* remove torch uint8

* quality

* remove modified modeling utils

* Update based on review

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

---------

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2023-02-27 11:46:02 +01:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
Sylvain Gugger
d4ba6e1a0e
Fix generation config for empty state dict (#21630) 2023-02-14 10:57:28 -05:00
Stas Bekman
2f5507580b
[from_pretrained] extend torch_dtype="auto" to look up config.torch_dtype first, expand docs (#21524)
* [from_pretrained] expand on torch_dtype entry

* fold 4 into 1

* style

* support torch_dtype='config' plus tests

* style

* oops

* fold config into auto, fix bug

* fix check

* better log

* better log

* clean up
2023-02-10 09:09:21 -08:00
Patrick von Platen
b20147a3c8
[Variant] Make sure variant files are not incorrectly deleted (#21562)
* [Variant] Make sure variant files are not incorrectly deleted

* Apply suggestions from code review

* fix
2023-02-10 15:44:51 +01:00
Sylvain Gugger
04b2f13c37
🚨🚨🚨 Enforce single model initialization (#21431)
* Enforce single model initialization

* Add OneFormer example for problem 3

* Do it the Stas way

* Actually rename the uses...

* Rewrite test

* Try to change the test this way

* Fix all init slow/fast tests

* Break connection

* Fix more tests

* Fix test for initialization

* Remove custom test

* Quality

* Fix last failing tests

* The end?
2023-02-09 15:46:26 -05:00
Sylvain Gugger
2020ac4bd6
Fix from_pretrained API with config and state_dict (#21542) 2023-02-09 15:44:02 -05:00
Joao Gante
0d33381fad
Tag tests as slow (#21537)
begone slow tests
2023-02-09 14:46:15 +00:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies
2023-02-06 18:10:56 -05:00
Patrick von Platen
90cddfa824
Add variant to transformers (#21332)
* Bump onnx in /examples/research_projects/decision_transformer

Bumps [onnx](https://github.com/onnx/onnx) from 1.11.0 to 1.13.0.
- [Release notes](https://github.com/onnx/onnx/releases)
- [Changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog.md)
- [Commits](https://github.com/onnx/onnx/compare/v1.11.0...v1.13.0)

---
updated-dependencies:
- dependency-name: onnx
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* adapt

* finish

* Update examples/research_projects/decision_transformer/requirements.txt

* up

* add tests

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* fix test

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-02-01 09:21:52 +01:00