Commit Graph

10093 Commits

Author SHA1 Message Date
Yih-Dar
9a3453846b
fix (#17890)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-27 14:36:11 +02:00
Younes Belkada
3ec7d4cfe4
fix mask (#17837)
2022-06-27 14:08:18 +02:00
Matt
ee0d001de7
Add a TF in-graph tokenizer for BERT (#17701)
* Add a TF in-graph tokenizer for BERT

* Add from_pretrained

* Add proper truncation, option handling to match other tokenizers

* Add proper imports and guards

* Add test, fix all the bugs exposed by said test

* Fix truncation of paired texts in graph mode, more test updates

* Small fixes, add a (very careful) test for savedmodel

* Add tensorflow-text dependency, make fixup

* Update documentation

* Update documentation

* make fixup

* Slight changes to tests

* Add some docstring examples

* Update tests

* Update tests and add proper lowercasing/normalization

* make fixup

* Add docstring for padding!

* Mark slow tests

* make fixup

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* make fixup

* Properly handle tensorflow-text dummies
2022-06-27 12:06:21 +01:00
Yih-Dar
401fcca6c5
Fix TF GPT2 test_onnx_runtime_optimize (#17874)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-27 09:27:30 +02:00
Joao Gante
cc5c061e34
CLI: handle multimodal inputs (#17839)
2022-06-25 16:17:11 +01:00
Sylvain Gugger
e8eb699ee8
Properly get tests deps in test_fetcher (#17870)
* Properly get tests deps in test_fetcher

* Remove print
2022-06-24 16:56:46 -04:00
Yih-Dar
b03be78a4b
Fix test_inference_instance_segmentation_head (#17872)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 19:36:45 +02:00
Yih-Dar
494aac65a7
Skip test_multi_gpu_data_parallel_forward for MaskFormer (#17864)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 19:35:00 +02:00
Yih-Dar
0e0f1f4692
Use higher value for hidden_size in Flax BigBird test (#17822)
* Use higher value for hidden_size in Flax BigBird test

* remove 5e-5

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 19:31:30 +02:00
kumapo
2ef94ee039
Fix: torch.utils.checkpoint import error. (#17849)
2022-06-24 13:23:29 -04:00
willtai
ef28a402a9
Add type hints for gptneox models (#17858)
* feat: Add type hints for GPTNeoXForCausalLM and GPTNeoXModel

* fix: removed imported Dict type

* fix: Removed unused List import
2022-06-24 17:12:36 +01:00
Suraj Patil
061a73d16f
[CodeGen] support device_map="auto" for sharded checkpoints (#17871)
2022-06-24 18:06:30 +02:00
rooa
d6b6fb9963
Add CodeGen model (#17443)
* Add CodeGen model

* Add missing key and switch order of super()

* Fix torch.ones init with uint8 instead of bool

* Address comments: copy statements and doc

* update tests

* remove old model parallel

* fix batch gen tests

* fix batch gen test

* update test_gpt2_sample_max_time

* fix codegen test and revert gpt2 test change

* Fix incorrect tie_word_embedding value, typo, URL

* Fix model order in README and styling

* Reorder model list alphabetically

* Set tie_word_embedding to False by default

* Apply suggestions from code review

* Better attn mask name & remove attn masked_bias

* add tokenizer for codegen

* quality

* doc tokenizer

* fix-copies

* add CodeGenTokenizer in converter

* make truncation optional

* add test for truncation

* add copyright

* fix-copies

* fix fast tokenizer decode

* Update src/transformers/models/codegen/tokenization_codegen.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* increase vocab_size in tests

Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-24 17:10:38 +02:00
Yih-Dar
447490015a
Fix Splinter test (#17854)
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-24 16:26:14 +02:00
Suraj Patil
73a0496c2f
[tests/VisionEncoderDecoder] import to_2tuple from test utils (#17865)
2022-06-24 15:23:30 +02:00
NaN
bc7a6fdc02
Fix Constrained beam search duplication and weird output issue (#17814)
* fix(ConstrainedBeamSearchScorer.step_sentence_constraint): avoid hypothesis duplication between topk and advance

* fix(GenerationMixin.constrained_beam_search): appropriately assign beam scores instead of token scores
2022-06-24 14:56:08 +02:00
Vishwas
c2c0d9db5f
Improve encoder decoder model docs (#17815)
* Copied all the changes from the last PR

* added in documentation_tests.txt

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: vishwaspai <vishwas.pai@emplay.net>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-06-24 14:48:19 +02:00
NielsRogge
0917870510
Improve vision models (#17731)
* Improve vision models

* Add a lot of improvements

* Remove to_2tuple from swin tests

* Fix TF Swin

* Fix more tests

* Fix copies

* Improve more models

* Fix ViTMAE test

* Add channel check for TF models

* Add proper channel check for TF models

* Apply suggestion from code review

* Apply suggestions from code review

* Add channel check for Flax models, apply suggestion

* Fix bug

* Add tests for greyscale images

* Add test for interpolation of pos encodings

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-24 11:34:51 +02:00
Zachary Mueller
893ab12452
Auto-build Docker images before on-merge if setup.py was changed (#17573)
* Auto-build on setup modification

* Modify push-caller

* Make adjustments based on code review
2022-06-23 16:51:33 -04:00
Zachary Mueller
75259b44bf
Properly calculate the total train iterations and recalculate num epochs in no_trainer scripts (#17856)
2022-06-23 15:46:01 -04:00
Sylvain Gugger
7c1b91281f
Index RNG states by global rank in saves (#17852)
2022-06-23 12:53:50 -04:00
Sijun He
7cf52a49de
Nezha Pytorch implementation (#17776)
* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* merge

* wip

* wip

* wip

* most basic tests pass

* all tests pass now

* relative embedding

* wip

* running make fixup

* remove bert changes

* fix doc

* fix doc

* fix issues

* fix doc

* address comments

* fix CI

* remove redundant copied from

* address comments

* fix broken test

Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-06-23 12:36:22 -04:00
Zachary Mueller
acb709d551
Change no trainer image_classification test (#17635)
* Adjust test arguments and use a new example test
2022-06-23 11:11:16 -04:00
Fx039482
e70abdad1b
Update modeling_cvt.py (#17846)
As shown in the Colab notebook, I added the missing type hints for CvtForImageClassification and CvtModel.
2022-06-23 16:08:36 +01:00
Matt
1a7ef3349f
Fix broken test for models with batchnorm (#17841)
* Fix tests that broke when models used batchnorm

* Initializing the model twice does not actually...
...give you the same weights each time.
I am good at machine learning.

* Fix speed regression
2022-06-23 15:59:53 +01:00
Younes Belkada
18c263c4b6
BLOOM minor changes on tokenizer (#17823)
* few fixes:

- hardcode tokenizer padding side
- remove unused args

* few fixes:

- added new attribute on TokenizerTesterMixin
- added new slow test
- remove unused arg on tokenizer class

* make style

* Update src/transformers/models/bloom/tokenization_bloom_fast.py

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* make quality

* apply changes

- remove new attribute
- redefine test on the class

* add comments

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2022-06-23 15:57:12 +02:00
Leandro von Werra
6f29029b05
Improve performance docs (#17750)
* add skeleton files

* fix cpu inference link

* add hint to make clear that single gpu section contains general info

* add new files to ToC

* update toctree to have subsection for performance

* add "coming soon" to the still empty sections

* fix missing title

* fix typo

* add reference to empty documents

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-06-23 14:51:54 +02:00
Yih-Dar
5bc779ae28
Fix an error message in BigBird (#17840)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-23 14:43:53 +02:00
Guillaume Klein
3eed5530ec
Fix properties of unset special tokens in non verbose mode (#17797)
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2022-06-23 14:40:13 +02:00
SaulLu
b2fdbaccdd
change message (#17836)
2022-06-23 14:39:48 +02:00
willtai
d37a68e685
Add missing type hints for QDQBertModel (#17783)
* Feat: add missing type hints for QDQBertModel

* fix: ran black and isort

* feat: Add missing output type for QDQBertModel

* feat: Add type hints for QDQBertLMHeadModel and models starting with QDQBertFor

* fix: add missing return type for QDQBertModel

* fix: remove wrong return type for QDQBertEmbeddings

* fix: readded config argument to load_tf_weights_in_qdqbert

* fix: add BertConfig type to BertEmbeddings config due to check error in CI

* fix: removed config type hints to avoid copy checks
2022-06-23 12:58:43 +01:00
Fx039482
4297f44b63
Update type hints modeling_yoso.py (#17827)
* Update modeling_yoso.py

* make fixup

* Update modeling_yoso.py

That should be it copied from previous PR
2022-06-23 12:37:29 +01:00
Joao Gante
5cce3076c4
TF: generate without tf.TensorArray (#17801)
2022-06-23 12:28:08 +01:00
Quentin
ab223fc148
add doctests for DETR (#17786)
* add: check labels for detr object detection doctests

* add: check shapes

* add: add detr to documentation_tests.py

* fix: make fixup output

* fix: add a comment
2022-06-23 13:26:14 +02:00
Yih-Dar
8d634b70e0
Fix push CI artifact path (#17788)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-23 12:31:22 +02:00
Sylvain Gugger
df8e6804c0
Offload fixes (#17810)
* Offload fixes

* Add a test
2022-06-22 12:23:07 -04:00
Joao Gante
0d0c392c45
CLI: use hub's create_commit (#17755)
* use create_commit

* better commit message and description

* touch setup.py to trigger cache update

* add hub version gating
2022-06-22 16:50:21 +01:00
dependabot[bot]
c366ce1011
Bump numpy from 1.21.0 to 1.22.0 in /examples/research_projects/lxmert (#17817)
Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-22 09:29:40 -04:00
dependabot[bot]
af0d21e741
Bump numpy in /examples/research_projects/visual_bert (#17816)
Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-22 09:29:28 -04:00
Arthur
56b83cf049
initial commit (#17818)
2022-06-22 14:26:03 +02:00
Eran Hirsch
1357038164
Add logits_processor parameter, used by generate, to Seq2SeqTrainer methods evaluate and predict (#17805)
* Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`

* Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it

* Remove `self._num_beams` from trainer classes

* - Run fixup
- Fix "Constraint" not exposed
- Fix synced_gpus to actually read from param

* Use kwargs

* Copy kwargs before making changes to it

* Fix style issues unused imports
2022-06-22 08:11:39 -04:00
Arthur
16c6eb7ca1
Flax sharded (#17760)
2022-06-22 07:04:35 +02:00
unifyh
3b00b623b7
Fix top_k_top_p_filtering having unexpected behavior (#17744)
- Fix `top_k_top_p_filtering` not passing `filter_value` to
   `TopPLogitsWarper` causing any top-p filtered logits to be -inf
   instead of specified value

 - Add corresponding test
2022-06-21 21:35:55 +02:00
Kyungmin Lee
3ccff0d400
Remove duplicate code (#17708)
2022-06-21 21:30:40 +02:00
Bram Vanroy
26a6a42608
Improve error message Union not allowed (#17769)
* Improve error message Union not allowed

* make style

* Update src/transformers/hf_argparser.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 14:27:01 -04:00
Thomas Wang
abc400b06a
Add final_layer_norm to OPT model (#17785)
* Add final_layer_norm to OPT model

* Add JAX and TF version

* Fix Keras name

* Woops

* Allow for non breaking change

* Apply suggestions from code review

* add tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-21 20:26:36 +02:00
Zachary Mueller
52404cbad4
Properly check for a TPU device (#17802)
2022-06-21 13:39:55 -04:00
Sylvain Gugger
ef23fae596
Fix test for BF16 detection (#17803)
2022-06-21 18:31:15 +02:00
Arthur
7cced021fa
TF Sharded (#17713)
* initial commit

* update modeling tf utils

* quality

* clean and update args

* update

* remove potential bug

* code quality

* update

* update max shard

* update tests for sharding from pretrained

* fix remaining test

* make style

* h5py if tf available

* update and fix test

* fix test

* style

* modified push to hub to support shard for TF

* quick fix

* update code

* merge branch main and style

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update based on reviews

* update doc

* update and style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* fix typo

* style

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 18:01:08 +02:00
Yih-Dar
f47afefb21
Use 5e-5 For BigBird PT/Flax equivalence tests (#17780)
* rename to check_pt_flax_outputs

* update check_pt_flax_outputs

* use 5e-5 for BigBird PT/Flax test

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-21 17:55:26 +02:00