Commit Graph

10074 Commits

Author SHA1 Message Date
Zachary Mueller
75259b44bf
Properly calculate the total train iterations and recalculate num epochs in no_trainer scripts (#17856) 2022-06-23 15:46:01 -04:00
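A minimal sketch of the kind of recalculation the no_trainer scripts perform (names and numbers here are illustrative, not the scripts' exact code): the number of optimizer updates per epoch is derived from the dataloader length and gradient accumulation, and the epoch count is recomputed from the step budget so the two stay consistent.

```python
import math
from typing import Optional, Tuple

def training_schedule(
    num_batches: int,
    gradient_accumulation_steps: int,
    num_train_epochs: int,
    max_train_steps: Optional[int] = None,
) -> Tuple[int, int]:
    # Optimizer updates per epoch, accounting for gradient accumulation.
    updates_per_epoch = math.ceil(num_batches / gradient_accumulation_steps)
    if max_train_steps is None:
        max_train_steps = num_train_epochs * updates_per_epoch
    # Recompute the epoch count so it matches the (possibly user-supplied) step budget.
    num_train_epochs = math.ceil(max_train_steps / updates_per_epoch)
    return max_train_steps, num_train_epochs

print(training_schedule(num_batches=1000, gradient_accumulation_steps=8, num_train_epochs=3))
# -> (375, 3)
```
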
Sylvain Gugger
7c1b91281f
Index RNG states by global rank in saves (#17852) 2022-06-23 12:53:50 -04:00
Sijun He
7cf52a49de
Nezha Pytorch implementation (#17776)
* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* merge

* wip

* wip

* wip

* most basic tests passes

* all tests pass now

* relative embedding

* wip

* running make fixup

* remove bert changes

* fix doc

* fix doc

* fix issues

* fix doc

* address comments

* fix CI

* remove redundant copied from

* address comments

* fix broken test

Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-06-23 12:36:22 -04:00
Zachary Mueller
acb709d551
Change no trainer image_classification test (#17635)
* Adjust test arguments and use a new example test
2022-06-23 11:11:16 -04:00
Fx039482
e70abdad1b
Update modeling_cvt.py (#17846)
As shown in the Colab notebook, I added the missing type hints for `CvtForImageClassification` and `CvtModel`.
2022-06-23 16:08:36 +01:00
Matt
1a7ef3349f
Fix broken test for models with batchnorm (#17841)
* Fix tests that broke when models used batchnorm

* Initializing the model twice does not actually give you the same weights each time. I am good at machine learning.

* Fix speed regression
2022-06-23 15:59:53 +01:00
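A standalone illustration of the pitfall the commit message jokes about (plain PyTorch, not the repository's test): two independent initializations draw different random weights, so a test must copy the state dict (or reuse one instance) before comparing model behaviour.

```python
import torch
from torch import nn

def make_model() -> nn.Module:
    return nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))

a, b = make_model(), make_model()
print(torch.allclose(a[0].weight, b[0].weight))  # False: independent random inits differ

b.load_state_dict(a.state_dict())                # sync weights before comparing outputs
print(torch.allclose(a[0].weight, b[0].weight))  # True
```
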
Younes Belkada
18c263c4b6
BLOOM minor changes on tokenizer (#17823)
* few fixes:

- hardcode tokenizer padding side
- remove unused args

* few fixes:

- added new attribute on TokenizerTesterMixin
- added new slow test
- remove unused arg on tokenizer class

* make style

* Update src/transformers/models/bloom/tokenization_bloom_fast.py

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* make quality

* apply changes

- remove new attribute
- redefine test on the class

* add comments

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2022-06-23 15:57:12 +02:00
Leandro von Werra
6f29029b05
Improve performance docs (#17750)
* add skeleton files

* fix cpu inference link

* add hint to make clear that single gpu section contains general info

* add new files to ToC

* update toctree to have subsection for performance

* add "coming soon" to the still empty sections

* fix missing title

* fix typo

* add reference to empty documents

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-06-23 14:51:54 +02:00
Yih-Dar
5bc779ae28
Fix an error message in BigBird (#17840)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-23 14:43:53 +02:00
Guillaume Klein
3eed5530ec
Fix properties of unset special tokens in non verbose mode (#17797)
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2022-06-23 14:40:13 +02:00
SaulLu
b2fdbaccdd
change message (#17836) 2022-06-23 14:39:48 +02:00
willtai
d37a68e685
Add missing type hints for QDQBertModel (#17783)
* Feat: add missing type hints for QDQBertModel

* fix: ran black and isort

* feat: Add missing output type for QDQBertModel

* feat: Add type hints for QDQBertLMHeadModel and models starting with QDQBertFor

* fix: add missing return type for QDQBertModel

* fix: remove wrong return type for QDQBertEmbeddings

* fix: readded config argument to load_tf_weights_in_qdqbert

* fix: add BertConfig type to BertEmbeddings config due to check error in CI

* fix: removed config type hints to avoid copy checks
2022-06-23 12:58:43 +01:00
Fx039482
4297f44b63
Update type hints modeling_yoso.py (#17827)
* Update modeling_yoso.py

* make fixup

* Update modeling_yoso.py

That should be it, copied from the previous PR.
2022-06-23 12:37:29 +01:00
Joao Gante
5cce3076c4
TF: generate without tf.TensorArray (#17801) 2022-06-23 12:28:08 +01:00
Quentin
ab223fc148
add doctests for DETR (#17786)
* add: check labels for detr object detection doctests

* add: check shapes

* add: add detr to documentation_tests.py

* fix: make fixup output

* fix: add a comment
2022-06-23 13:26:14 +02:00
Yih-Dar
8d634b70e0
Fix push CI artifact path (#17788)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-23 12:31:22 +02:00
Sylvain Gugger
df8e6804c0
Offload fixes (#17810)
* Offload fixes

* Add a test
2022-06-22 12:23:07 -04:00
Joao Gante
0d0c392c45
CLI: use hub's create_commit (#17755)
* use create_commit

* better commit message and description

* touch setup.py to trigger cache update

* add hub version gating
2022-06-22 16:50:21 +01:00
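A hedged usage sketch of the huggingface_hub API the CLI switches to (repo id and file names are illustrative; this requires huggingface_hub >= 0.8.0, hence the version gating mentioned above):

```python
from huggingface_hub import HfApi, CommitOperationAdd

api = HfApi()
api.create_commit(
    repo_id="username/my-model",  # illustrative repo
    operations=[
        CommitOperationAdd(path_in_repo="config.json", path_or_fileobj="./config.json"),
        CommitOperationAdd(path_in_repo="pytorch_model.bin", path_or_fileobj="./pytorch_model.bin"),
    ],
    commit_message="Upload model",
    commit_description="Files uploaded in a single atomic commit via create_commit.",
)
```
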
dependabot[bot]
c366ce1011
Bump numpy from 1.21.0 to 1.22.0 in /examples/research_projects/lxmert (#17817)
Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-22 09:29:40 -04:00
dependabot[bot]
af0d21e741
Bump numpy in /examples/research_projects/visual_bert (#17816)
Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
- [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-22 09:29:28 -04:00
Arthur
56b83cf049
initial commit (#17818) 2022-06-22 14:26:03 +02:00
Eran Hirsch
1357038164
Add logits_processor parameter, used by generate, to Seq2SeqTrainer methods evaluate and predict (#17805)
* Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`

* Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it

* Remove `self._num_beams` from trainer classes

* - Run fixup
- Fix "Constraint" not exposed
- Fix synced_gpus to actually read from param

* Use kwargs

* Copy kwargs before making changes to it

* Fix style issues and unused imports
2022-06-22 08:11:39 -04:00
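A hedged usage sketch of the new surface (it assumes an already-constructed `Seq2SeqTrainer` named `trainer` and its `tokenizer`; after this PR, generation kwargs passed to `evaluate`/`predict` are forwarded to `generate`):

```python
from transformers import LogitsProcessorList, MinLengthLogitsProcessor

processors = LogitsProcessorList(
    [MinLengthLogitsProcessor(min_length=5, eos_token_id=tokenizer.eos_token_id)]
)

# Generation kwargs such as logits_processor, max_length and num_beams are now
# forwarded to model.generate() during evaluation.
metrics = trainer.evaluate(logits_processor=processors, max_length=64, num_beams=4)
```
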
Arthur
16c6eb7ca1
Flax sharded (#17760) 2022-06-22 07:04:35 +02:00
unifyh
3b00b623b7
Fix top_k_top_p_filtering having unexpected behavior (#17744)
- Fix `top_k_top_p_filtering` not passing `filter_value` to
   `TopPLogitsWarper`, causing any top-p-filtered logits to be set to -inf
   instead of the specified value

 - Add corresponding test
2022-06-21 21:35:55 +02:00
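A minimal sketch of the behaviour the fix targets (assuming `top_k_top_p_filtering` is importable from the top-level `transformers` namespace, as it was at the time): tokens removed by top-p filtering should take the caller-supplied `filter_value` rather than always `-inf`.

```python
import torch
from transformers import top_k_top_p_filtering

logits = torch.tensor([[10.0, 9.0, 0.1, 0.05]])

# With the fix, positions filtered out by nucleus (top-p) sampling are set to the
# requested filter_value (-1e4 here) instead of unconditionally to -inf.
filtered = top_k_top_p_filtering(logits.clone(), top_p=0.9, filter_value=-1e4)
print(filtered)
```
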
Kyungmin Lee
3ccff0d400
Remove duplicate code (#17708) 2022-06-21 21:30:40 +02:00
Bram Vanroy
26a6a42608
Improve error message Union not allowed (#17769)
* Improve error message Union not allowed

* make style

* Update src/transformers/hf_argparser.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 14:27:01 -04:00
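A small sketch of the situation the clearer message covers (the field name and the exact exception type are illustrative): `HfArgumentParser` only understands `Union` when it is `Optional[...]`, i.e. `Union[T, None]`; other unions are rejected when the parser is built.

```python
from dataclasses import dataclass
from typing import Union

from transformers import HfArgumentParser

@dataclass
class TrainingArgs:
    # Unsupported: a Union that is not Optional[...] cannot map to a single argparse type.
    block_size: Union[int, float] = 128

try:
    HfArgumentParser(TrainingArgs)
except Exception as err:  # the parser now raises with a more explicit message
    print(err)
```
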
Thomas Wang
abc400b06a
Add final_layer_norm to OPT model (#17785)
* Add final_layer_norm to OPT model

* Add JAX and TF version

* Fix Keras name

* Woops

* Allow for non-breaking change

* Apply suggestions from code review

* add tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-21 20:26:36 +02:00
Zachary Mueller
52404cbad4
Properly check for a TPU device (#17802) 2022-06-21 13:39:55 -04:00
Sylvain Gugger
ef23fae596
Fix test for BF16 detection (#17803) 2022-06-21 18:31:15 +02:00
Arthur
7cced021fa
TF Sharded (#17713)
* initial commit

* update modeling tf utils

* quality

* clean and update args

* update

* remove potential bug

* code quality

* update

* update max shard

* update tests for sharding from pretrained

* fix remaining test

* make style

* h5py if tf available

* update and fix test

* fix test

* style

* modified push to hub to support shard for TF

* quick fix

* update code

* merge branch main and style

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update based on reviews

* update doc

* update and style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* fix typo

* style

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 18:01:08 +02:00
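A hedged usage sketch of the sharding support added here (model name and shard size are illustrative): `save_pretrained` splits TF weights across multiple `.h5` shards plus an index once they exceed `max_shard_size`, and `from_pretrained` reassembles them.

```python
from transformers import TFAutoModel

model = TFAutoModel.from_pretrained("gpt2")

# Weights larger than max_shard_size are split into several .h5 files plus an index file.
model.save_pretrained("./tf-gpt2-sharded", max_shard_size="200MB")

# from_pretrained stitches the shards back together transparently.
reloaded = TFAutoModel.from_pretrained("./tf-gpt2-sharded")
```
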
Yih-Dar
f47afefb21
Use 5e-5 For BigBird PT/Flax equivalence tests (#17780)
* rename to check_pt_flax_outputs

* update check_pt_flax_outputs

* use 5e-5 for BigBird PT/Flax test

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-21 17:55:26 +02:00
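A hedged sketch of what such an equivalence check boils down to (this is not `check_pt_flax_outputs` itself): compare the two frameworks' outputs element-wise and require the maximum absolute difference to stay below the tolerance, relaxed to 5e-5 for BigBird here.

```python
import numpy as np

def assert_outputs_close(pt_output, flax_output, tol: float = 5e-5) -> None:
    pt, fx = np.asarray(pt_output), np.asarray(flax_output)
    max_diff = float(np.abs(pt - fx).max())
    assert max_diff <= tol, f"PT/Flax outputs differ by {max_diff:.2e} (tolerance {tol:.0e})"

# Passes: a 2e-5 discrepancy is within the relaxed 5e-5 tolerance.
assert_outputs_close(np.array([0.10001, 0.20000]), np.array([0.10003, 0.20000]))
```
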
Lysandre Debut
6a5272b205
Prepare transformers for v0.8.0 huggingface-hub release (#17716)
* Prepare CI for v0.8.0

* pin hfh (revert before merge)

* Revert "pin hfh (revert before merge)"

This reverts commit a0103140e1.

* Test rc3

* Test latest rc

* Unpin to the RC

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-06-21 11:51:18 -04:00
Sylvain Gugger
7bc88c0511
Fix forward reference imports in DeBERTa configs (#17800) 2022-06-21 11:21:06 -04:00
Anugunj Naman
27e907386a
Fix Automatic Download of Pretrained Weights in DETR (#17712)
* added use_backbone_pretrained

* style fixes

* update

* Update detr.mdx

* Update detr.mdx

* Update detr.mdx

* update using doc py

* Update detr.mdx

* Update src/transformers/models/detr/configuration_detr.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 16:45:35 +02:00
NielsRogge
b681e12d59
[ViTMAE] Fix docstrings and variable names (#17710)
* Fix docstrings and variable names

* Rename x to something better

* Improve messages

* Fix docstrings and add test for greyscale images

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-21 15:56:00 +02:00
NielsRogge
3fab17fce8
Add link to notebook (#17791)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-21 14:53:08 +02:00
Jia LI
da2bd2ae96
[CodeParrot] Near-deduplication with jaccard similarity (#17054)
* deduplication draft

* update style

* update style test

* dummy test main

* rename modules

* rename functions

* return extremes in deduplicate_clusters

* update style

* cast str for gzip

* update doc string

* time processing

* use dataset map to compute minhash

* fill value for short token

* remove da map method

* update style

* use share object to multiprocess

* update style

* use f-string and minor fix

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* update style

* use module parameters

* change ds_dedup to ds_filter

* save ds_dedup

* mv test to script tests

* make jaccard threshold a parameter of deduplicate_dataset

* update style

* add doc strings

* update style

* add doc string for DuplicationIndex

* save files into data dir

* update readme

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* make near deduplication optional

* move near deduplication in README

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* use f string

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
2022-06-21 14:23:36 +02:00
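A minimal sketch of the Jaccard-similarity idea behind the near-deduplication pipeline (the actual script computes MinHash signatures over a `datasets` dataset and clusters them; this only shows the underlying set metric on two toy snippets):

```python
def jaccard_similarity(code_a: str, code_b: str) -> float:
    """Jaccard similarity between the whitespace-token sets of two code snippets."""
    tokens_a, tokens_b = set(code_a.split()), set(code_b.split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

a = "def add(a, b):\n    return a + b"
b = "def add(x, y):\n    return x + y"

# Snippets whose similarity exceeds the configurable jaccard threshold are clustered
# as near-duplicates and filtered down to representative extremes.
print(jaccard_similarity(a, b))
```
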
mrbean
eb16be415a
add onnx support for deberta and debertav2 (#17617)
* add onnx support for debertav2

* debertav2 -> deberta-v2 in onnx features file

* remove causal lm

* add deberta-v2-xlarge to onnx tests

* use self.type().dtype() in xsoftmax

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* remove hack for deberta

* remove unused imports

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* use generate dummy inputs

* linter

* add imports

* add support for deberta v1 as well

* deberta does not support multiple choice

* Update src/transformers/models/deberta/configuration_deberta.py

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* one line ordered dict

* fire build

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>
2022-06-21 11:04:15 +02:00
Patrick von Platen
8fcbe275c3
Add UL2 (just docs) (#17740)
* Add UL2
Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com>

* Correct naming

* sort better

* up

* apply Sylvain's suggestion
2022-06-21 10:24:50 +02:00
Brad Jascob
da27c4b398
Update modeling_longt5.py (#17777)
On line 180, `torch.tensor(-1.0, xxx)` gives the error "TypeError: 'float' object cannot be interpreted as an integer" because the dtype here is `int64`. For `dtype=int64`, this simply needs to be `-1`.
This impacts the long-t5-tglobal-x model. It does not impact the long-t5-local-x version, which does not appear to call this line.
2022-06-20 18:49:08 +02:00
Yih-Dar
d3cb28886a
Not use -1e4 as attn mask (#17306)
* Use torch.finfo(self.dtype).min

* for GPTNeoX

* for Albert

* For Splinter

* Update src/transformers/models/data2vec/modeling_data2vec_audio.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix -inf used in Bart-like models

* Fix a few remaining -inf

* more fix

* clean up

* For CLIP

* For FSMT

* clean up

* fix test

* Add dtype argument and use it for LayoutLMv3

* update FlaxLongT5Attention

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-20 16:16:16 +02:00
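A minimal sketch of the masking pattern this PR moves the models to (not the repository's exact helper): instead of a hardcoded -1e4, which may not be negative enough to fully suppress masked positions at large logit scales, the additive attention mask uses the most negative value representable in the active dtype.

```python
import torch

def build_additive_mask(attention_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    """attention_mask holds 1 for tokens to attend to and 0 for tokens to mask out."""
    mask = torch.zeros(attention_mask.shape, dtype=dtype)
    # Masked positions receive the dtype's minimum instead of the old hardcoded -1e4.
    return mask.masked_fill(attention_mask == 0, torch.finfo(dtype).min)

# Masked position -> torch.finfo(torch.float16).min (-65504.0); kept positions stay 0.
print(build_additive_mask(torch.tensor([[1, 1, 0]]), torch.float16))
```
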
Sylvain Gugger
fdb120805c
Fix cache for GPT-Neo-X (#17764)
* Fix cache for GPT-Neo-X

* Add more tests
2022-06-20 08:43:36 -04:00
Stas Bekman
a2d34b7c04
deprecate is_torch_bf16_available (#17738)
* deprecate is_torch_bf16_available

* address suggestions
2022-06-20 08:40:11 -04:00
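A generic sketch of the deprecation pattern (not the repository's exact code; the replacement helper here follows the GPU/CPU split this PR points to, but treat the names as illustrative): the old symbol keeps working while steering callers to the finer-grained checks.

```python
import warnings

import torch

def is_torch_bf16_gpu_available() -> bool:
    # Illustrative stand-in for the finer-grained helper.
    return torch.cuda.is_available() and torch.cuda.is_bf16_supported()

def is_torch_bf16_available() -> bool:
    warnings.warn(
        "is_torch_bf16_available is deprecated; use the GPU/CPU-specific helpers instead.",
        FutureWarning,
    )
    return is_torch_bf16_gpu_available()

print(is_torch_bf16_available())
```
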
Joao Gante
132402d752
TF: BART compatible with XLA generation (#17479)
* Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus
2022-06-20 11:07:46 +01:00
Yih-Dar
6589e510fa
Attempt to change Push CI to workflow_run (#17753)
* Use workflow_run event for push CI

* change to workflow_run

* Add comments

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-18 08:35:03 +02:00
Rafael Zimmer
0d92798b45
Added translation of index.mdx to Portuguese Issue #16824 (#17565)
* Added translation of installation.mdx to Portuguese, as well
as default templates of _toctree.yml and _config.py

* [ build_documentation.yml ] - Updated doc_builder to build
documentation in Portuguese.
[ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx.

* [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder.

[ pipeline_tutorial.mdx ] - Grammar changes.

* [ accelerate.mdx ] - Translated the acceleration tutorial to Portuguese.

* [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial.

[ training.mdx ] - Added portuguese translation for training tutorial.

* [ preprocessing.mdx ] - WIP

* Update _toctree.yml

* Adding Pré-processamento to _toctree.yml

* Update accelerate.mdx

* Nits; remove the preprocessing file until it is ready

* [ index.mdx ] - Translated the index presentation page to Portuguese.

* [ docs/source/pt ] - Updated _toctree.yml to match newest translations.

* Fix build_pr_documentation.yml

* Fix index nits

* nits in _toctree

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-06-17 20:06:05 -04:00
Swetha Mandava
522a9ece4b
Save huggingface checkpoint as artifact in mlflow callback (#17686)
* Fix eval to compute rouge correctly for rouge_score

* styling

* moving sentence tokenization to utils from run_eval

* saving ckpt in mlflow

* use existing format of args

* fix documentation

Co-authored-by: Swetha Mandava <smandava@nvidia.com>
2022-06-17 14:14:03 -04:00
Sourab Mangrulkar
21a772426d
Migrate HFDeepSpeedConfig from trfrs to accelerate (#17623)
* Migrate HFDeepSpeedConfig from trfrs to accelerate

* add `accelerate` to testing dep

* addressing comments

* addressing comments

Using `_shared_state` and avoiding object creation. This is necessary because `notebook_launcher` in `launchers.py` checks `len(AcceleratorState._shared_state) > 0` to raise an error.

* resolving comments

1. Use simple API from accelerate to manage the deepspeed config integration
2. Update the related documentation

* reverting changes and addressing comments

* docstring correction

* addressing nits

* addressing nits

* addressing nits 3

* bumping up the accelerate version to 0.10.0

* resolving import

* update setup.py to include deepspeed dependencies

* Update dependency_versions_table.py

* fixing imports

* reverting changes to CI dependencies for "run_tests_pipelines_tf*" tests

These changes didn't help with resolving the failures and I believe this needs to be addressed in another PR.

* removing `accelerate` as hard dependency

Resolves issues related to CI Tests

* adding `accelerate` as dependency for building docs

resolves failure in Build PR Documentation test

* adding `accelerate` as dependency in "dev" to resolve doc build issue

* resolving comments

1. adding `accelerate` to extras["all"]
2. Including check for accelerate too before import HFDeepSpeedConfig from there

Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolving comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-17 23:29:35 +05:30
dependabot[bot]
e44a569fef
Bump notebook in /examples/research_projects/lxmert (#17743)
Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 12:10:33 -04:00
dependabot[bot]
5089a2d412
Bump notebook in /examples/research_projects/visual_bert (#17742)
Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 12:10:17 -04:00