Commit Graph

544 Commits

Author SHA1 Message Date
Matt
3b3024da70
TF port of ESM (#19587)
* Partial TF port for ESM model

* Add ESM-TF tests

* Add the various imports for TF-ESM

* TF weight conversion almost ready

* Stop ignoring the decoder weights in PT

* Add tests and lots of fixes

* fix-copies

* Fix imports, add model docs

* Add get_vocab() to tokenizer

* Fix vocab links for pretrained files

* Allow multiple inputs with a sep

* Use EOS as SEP token because ESM vocab lacks SEP

* Correctly return special tokens mask from ESM tokenizer

* make fixup

* Stop testing unsupported embedding resizing

* Handle TF bias correctly

* Skip all models with slow tokenizers in the token classification test

* Fixing the batch/unbatcher of pipelines to accomodate the `None` being

passed around.

* Fixing pipeline bug caused by slow tokenizer  being different.

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update set_input_embeddings and the copyright notices

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-10-17 14:16:16 +01:00
Arthur
d2e5b19b82
Add doctest info in testingmdx (#19623) 2022-10-17 11:23:20 +02:00
0xflotus
5ef2186692
fix: small error (#19612)
* fix: small error

* fix: another typo error
2022-10-14 11:10:33 -04:00
Akash Mahajan
504cd71a6b
add a note to whisper docs clarifying support of long-form decoding (#19497) 2022-10-13 10:39:03 +02:00
amyeroberts
1973b7716b
Image transforms library (#18520)
* Adapt FE methods to transforms library

* Mixin for saving the image processor

* Base processor skeleton

* BatchFeature for packaging image processor outputs

* Initial image processor for GLPN

* REmove accidental import

* Fixup and docs

* Mixin for saving the image processor

* Fixup and docs

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Fixup and docs

* Fixup and docs

* Fixup and docs

* BatchFeature for packaging image processor outputs

* Import BatchFeature from feature_extraction_utils

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Fixup and docs

* BatchFeature for packaging image processor outputs

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Mixin for saving the image processor

* Fixup and docs

* Add rescale back and remove ImageType

* fix import mistake

* Fix enum var reference

* Can transform and specify image data format

* Remove redundant function

* Update reference

* Data format flag for rescale

* Fix typo

* Fix dimension check

* Fixes to make IP and FE outputs match

* Add tests for transforms

* Add test for utils

* Update some docstrings

* Make sure in channels last before converting to PIL

* Remove default to numpy batching

* Fix up

* Add docstring and model_input_types

* Use feature processor config from hub

* Alias GLPN feature extractor to image processor

* Alias feature extractor mixin

* Add return_numpy=False flag for resize

* Fix up

* Fix up

* Use different frameworks safely

* Safely import PIL

* Call function checking if PIL available

* Only import if vision available

* Address Sylvain PR comments
Co-authored-by: Sylvain.gugger@gmail.com

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/image_transforms.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Update src/transformers/models/glpn/feature_extraction_glpn.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add in docstrings

* Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226)

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Refactor `TFSwinLayer` to increase serving compatibility (#18352)

* Refactor `TFSwinLayer` to increase serving compatibility

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix missed parameters while refactoring

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix window_reverse to calculate batch size

Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add TF prefix to TF-Res test class (#18481)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove py.typed (#18485)

* Fix pipeline tests (#18487)

* Fix pipeline tests

* Make sure all pipelines tests run with init changes

* Use new huggingface_hub tools for download models (#18438)

* Draft new cached_file

* Initial draft for config and model

* Small fixes

* Fix first batch of tests

* Look in cache when internet is down

* Fix last tests

* Bad black, not fixing all quality errors

* Make diff less

* Implement change for TF and Flax models

* Add tokenizer and feature extractor

* For compatibility with main

* Add utils to move the cache and auto-do it at first use.

* Quality

* Deal with empty commit shas

* Deal with empty etag

* Address review comments

* Fix `test_dbmdz_english` by updating expected values (#18482)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Move cache folder to huggingface/hub for consistency with hf_hub (#18492)

* Move cache folder to just huggingface

* Thank you VsCode for this needless import

* Move to hub

* Forgot one

* Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Forgot one new_ for cache migration

* disable Onnx test for google/long-t5-tglobal-base (#18454)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Typo reported by Joel Grus on TWTR (#18493)

* Just re-reading the whole doc every couple of months 😬 (#18489)

* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task

* `transformers-cli login` => `huggingface-cli login` (#18490)

* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`

* Add seed setting to image classification example (#18519)

* [DX fix] Fixing QA pipeline streaming a dataset. (#18516)

* [DX fix] Fixing QA pipeline streaming a dataset.

QuestionAnsweringArgumentHandler would iterate over the whole dataset
effectively killing all properties of the pipeline.
This restores nice properties when using `Dataset` or `Generator` since
those are meant to be consumed lazily.

* Handling TF better.

* Clean up hub (#18497)

* Clean up utils.hub

* Remove imports

* More fixes

* Last fix

* update fsdp docs (#18521)

* updating fsdp documentation

* typo fix

* Fix compatibility with 1.12 (#17925)

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* fix torch.onnx.symbolic_opset12 import

* Reject bad version

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove debug statement

* Specify en in doc-builder README example (#18526)

Co-authored-by: Ankur Goyal <ankur@impira.com>

* New cache fixes: add safeguard before looking in folders (#18522)

* unpin resampy (#18527)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

*  update to use interlibrary links instead of Markdown (#18500)

* Add example of multimodal usage to pipeline tutorial (#18498)

* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedbacks

* 🖍 apply niels feedback

* [VideoMAE] Add model to doc tests (#18523)

* Add videomae to doc tests

* Add pip install decord

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Update perf_train_gpu_one.mdx (#18532)

* Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473)

* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script

* make fixup changes

* PR comments

* changed input to Acceletor based on PR comment, ran make fixup

* Added comment explaining the sync_gradients statement

* Fixed lr scheduler max steps

* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper

* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper

* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script

* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py

* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script

* Add Spanish translation of converting_tensorflow_models.mdx (#18512)

* Add file in spanish docs to be translated

* Finish translation to Spanish

* Improve Spanish  wording

* Add suggested changes from review

* Spanish translation of summarization.mdx (#15947) (#18477)

* Add Spanish translation of summarization.mdx

* Apply suggestions from code review

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Let's not cast them all (#18471)

* add correct dtypes when checking for params dtype

* forward contrib credits

* Update src/transformers/modeling_utils.py

Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* more comments

- added more comments on why we cast only floating point parameters

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* fix: data2vec-vision Onnx ready-made configuration. (#18427)

* feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization

* fix: wrong config

* Add mt5 onnx config (#18394)

* update features

* MT5OnnxConfig added with updated with tests and docs

* fix imports

* fix onnc_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>

* Minor update of `run_call_with_unpacked_inputs` (#18541)

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* BART - Fix attention mask device issue on copied models (#18540)

* attempt to fix attn mask device

* fix bart `_prepare_decoder_attention_mask`

- add correct device
- run `make fix-copies` to propagate the fix

* Adding a new `align_to_words` param to qa pipeline. (#18010)

* Adding a new `align_to_words` param to qa pipeline.

* Update src/transformers/pipelines/question_answering.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Import protection.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* 📝 update metric with evaluate (#18535)

* Restore _init_weights value in no_init_weights (#18504)

* Recover _init_weights value in no_init_weights

For potential nested use. 
In addition, users might modify private no_init_weights as well.

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove private variable change check

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean up comment

* 📝 update documentation build section (#18548)

* `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901)

* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* revmoe the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

 - add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* male style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more doctring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* odify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n`by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* TF: XLA-trainable DeBERTa v2 (#18546)

* fix deberta issues

* add different code paths for gpu and tpu

* shorter gpu take along axis

* Stable Dropout without tf cond

* variable must be float

* Preserve hub-related kwargs in AutoModel.from_pretrained (#18545)

* Preserve hub-related kwargs in AutoModel.from_pretrained

* Fix tests

* Remove debug statement

* TF Examples Rewrite (#18451)

* Finished QA example

* Dodge a merge conflict

* Update text classification and LM examples

* Update NER example

* New Keras metrics WIP, fix NER example

* Update NER example

* Update MC, summarization and translation examples

* Add XLA warnings when shapes are variable

* Make sure batch_size is consistently scaled by num_replicas

* Add PushToHubCallback to all models

* Add docs links for KerasMetricCallback

* Add docs links for prepare_tf_dataset and jit_compile

* Correct inferred model names

* Don't assume the dataset has 'lang'

* Don't assume the dataset has 'lang'

* Write metrics in text classification

* Add 'framework' to TrainingArguments and TFTrainingArguments

* Export metrics in all examples and add tests

* Fix training args for Flax

* Update command line args for translation test

* make fixup

* Fix accidentally running other tests in fp16

* Remove do_train/do_eval from run_clm.py

* Remove do_train/do_eval from run_mlm.py

* Add tensorflow tests to circleci

* Fix circleci

* Update examples/tensorflow/language-modeling/run_mlm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/test_tensorflow_examples.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/translation/run_translation.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/token-classification/run_ner.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix save path for tests

* Fix some model card kwargs

* Explain the magical -1000

* Actually enable tests this time

* Skip text classification PR until we fix shape inference

* make fixup

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use commit hash to look in cache instead of calling head (#18534)

* Use commit hash to look in cache instead of calling head

* Add tests

* Add attr for local configs too

* Stupid typos

* Fix tests

* Update src/transformers/utils/hub.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address Julien's comments

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* `pipeline` support for `device="mps"` (or any other string) (#18494)

* `pipeline` support for `device="mps"` (or any other string)

* Simplify `if` nesting

* Update src/transformers/pipelines/base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix? @sgugger

* passing `attr=None` is not the same as not passing `attr` 🤯

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update philosophy to include other preprocessing classes (#18550)

* 📝 update philosophy to include other preprocessing classes

* 🖍 apply feedbacks

* Properly move cache when it is not in default path (#18563)

* Adds CLIP to models exportable with ONNX (#18515)

* onnx config for clip

* default opset as 14

* changes from the original repo

* input values order fix

* outputs fix

* remove unused import

* ran make fix-copies

* black format

* review comments: forward ref, import fix, model change revert, .to cleanup

* make style

* formatting fixes

* revert groupvit

* comment for cast to int32

* comment fix

* make .T as .t() for onnx conversion

* ran make fix-copies

* remove unneeded comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* remove comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* raise atol for MT5OnnxConfig (#18560)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* fix string (#18568)

* Segformer TF: fix output size in documentation (#18572)

* Segformer TF: fix output size in doc

* Segformer pytorch: fix output size in doc

Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>

* Fix resizing bug in OWL-ViT (#18573)

* Fixes resizing bug in OWL-ViT
* Defaults to square resize if size is set to an int
* Sets do_center_crop default value to False

* Fix LayoutLMv3 documentation (#17932)

* fix typos

* fix sequence_length docs of LayoutLMv3Model

* delete trailing white spaces

* fix layoutlmv3 docs more

* apply make fixup & quality

* change to two versions of input docstring

* apply make fixup & quality

* Skip broken tests

* Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)

* changing BartLearnedPositionalEmbedding forward signature and references to it

* removing debugging dead code (thanks style checker)

* blackened modeling_bart file

* removing copy inconsistencies via make fix-copies

* changing references to copied signatures in Bart variants

* make fix-copies once more

* using expand over repeat (thanks @michaelbenayoun)

* expand instead of repeat for all model copies

Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>

* german docs translation (#18544)

* Create _config.py

* Create _toctree.yml

* Create index.mdx

not sure about "du / ihr" oder "sie"

* Create quicktour.mdx

* Update _toctree.yml

* Update build_documentation.yml

* Update build_pr_documentation.yml

* fix build

* Update index.mdx

* Update quicktour.mdx

* Create installation.mdx

* Update _toctree.yml

* Deberta V2: Fix critical trace warnings to allow ONNX export (#18272)

* Fix critical trace warnings to allow ONNX export

* Force input to `sqrt` to be float type

* Cleanup code

* Remove unused import statement

* Update model sew

* Small refactor

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Use broadcasting instead of repeat

* Implement suggestion

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Match deberta v2 changes in sew_d

* Improve code quality

* Update code quality

* Consistency of small refactor

* Match changes in sew_d

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* [FX] _generate_dummy_input supports audio-classification models for labels (#18580)

* Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not

* Use ENV_VARS_TRUE_VALUES

* Fix docstrings with last version of hf-doc-builder styler (#18581)

* Fix docstrings with last version of hf-doc-builder styler

* Remove empty Parameter block

* Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert in /examples/research_projects/visual_bert (#18566)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix owlvit tests, update docstring examples (#18586)

* Return the permuted hidden states if return_dict=True (#18578)

* Load sharded pt to flax (#18419)

* initial commit

* add small test

* add cross pt tf flag to test

* fix quality

* style

* update test with new repo

* fix failing test

* update

* fix wrong param ordering

* style

* update based on review

* update related to recent new caching mechanism

* quality

* Update based on review

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* quality and style

* Update src/transformers/modeling_flax_utils.py
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type hints for ViLT models (#18577)

* Add type hints for Vilt models

* Add missing return type for TokenClassification class

* update doc for perf_train_cpu_many, add intel mpi introduction (#18576)

* update doc for perf_train_cpu_many, add mpi introduction

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* typos (#18594)

* FSDP bug fix for `load_state_dict` (#18596)

* Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261)

* validate generate model_kwargs

* generate tests -- not all models have an attn mask

* Supporting seq2seq models for `bitsandbytes` integration (#18579)

* Supporting seq2seq models for `bitsandbytes` integration

- `bitsandbytes` integration supports now seq2seq models
- check if a model has tied weights as an additional check

* small modification

- tie the weights before looking at tied weights!

* Add Donut (#18488)

* First draft

* Improve script

* Update script

* Make conversion work

* Add final_layer_norm attribute to Swin's config

* Add DonutProcessor

* Convert more models

* Improve feature extractor and convert base models

* Fix bug

* Improve integration tests

* Improve integration tests and add model to README

* Add doc test

* Add feature extractor to docs

* Fix integration tests

* Remove register_buffer

* Fix toctree and add missing attribute

* Add DonutSwin

* Make conversion script work

* Improve conversion script

* Address comment

* Fix bug

* Fix another bug

* Remove deprecated method from docs

* Make Swin and Swinv2 untouched

* Fix code examples

* Fix processor

* Update model_type to donut-swin

* Add feature extractor tests, add token2json method, improve feature extractor

* Fix failing tests, remove integration test

* Add do_thumbnail for consistency

* Improve code examples

* Add code example for document parsing

* Add DonutSwin to MODEL_NAMES_MAPPING

* Add model to appropriate place in toctree

* Update namespace to appropriate organization

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Fix URLs (#18604)

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Update BLOOM parameter counts (#18531)

* Update BLOOM parameter counts

* Update BLOOM parameter counts

* [doc] fix anchors (#18591)

the manual anchors end up being duplicated with automatically added anchors and no longer work.

* [fsmt] deal with -100 indices in decoder ids (#18592)

* [fsmt] deal with -100 indices in decoder ids

Fixes: https://github.com/huggingface/transformers/issues/17945

decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. 

For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems.

Any suggestions to how to add a simple test here? or perhaps we have something similar already? user's script is quite massive.

* style

* small change (#18584)

* Flax Remat for LongT5 (#17994)

* [Flax] Add remat (gradient checkpointing)

* fix variable naming in test

* flip: checkpoint using a method

* fix naming

* fix class naming

* apply PVP's suggestions from code review

* add gradient_checkpointing to examples

* Add gradient_checkpointing to run_mlm_flax

* Add remat to longt5

* Add gradient checkpointing test longt5

* Fix args errors

* Fix remaining tests

* Make fixup & quality fixes

* replace kwargs

* remove unecessary kwargs

* Make fixup changes

* revert long_t5_flax changes

* Remove return_dict and copy to LongT5

* Remove test_gradient_checkpointing

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>

* mac m1 `mps` integration (#18598)

* mac m1 `mps` integration

* Update docs/source/en/main_classes/trainer.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* Apply suggestions from code review

Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* Change scheduled CIs to use torch 1.12.1 (#18644)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Add checks for some workflow jobs (#18583)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* TF: Fix generation repetition penalty with XLA (#18648)

* Update longt5.mdx (#18634)

* Update run_translation_no_trainer.py (#18637)

* Update run_translation_no_trainer.py

found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint

* fixs `no_decay` and `resume_step` issue

1. change `no_decay` list
2. if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1`

* [bnb] Minor modifications (#18631)

* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPi library on DockerFile

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor a bit hardware requirements

* change text a bit

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions Turing & Amperer

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Examples: add Bloom support for token classification (#18632)

* examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow)

* examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it)

* Fix Yolos ONNX export test (#18606)

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fixup

* Fix up

* Move PIL default arguments inside function for safe imports

* Add image utils to toctree

* Update `rescale` method to reflect changes in #18677

* Update docs/source/en/internal/image_processing_utils.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Address Niels PR comments

* Apply suggestions from code review - remove defaults to None

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix docstrings and revert to PIL.Image.XXX resampling

Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated

* Some more docstrings and PIL.Image tidy up

* Reorganise arguments so flags by modifiers

* Few last docstring fixes

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Ankur Goyal <ankrgyl@gmail.com>
Co-authored-by: Ankur Goyal <ankur@impira.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com>
Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com>
Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com>
Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com>
Co-authored-by: YouJiacheng <1503679330@qq.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Maxime G <joihn@users.noreply.github.com>
Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr>
Co-authored-by: Dan Jones <dan.j.jones2@gmail.com>
Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
Co-authored-by: flozi00 <flozi00.fz@gmail.com>
Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-12 18:32:02 +01:00
Daniel van Strien
af539d6f0a
fix MarkupLMProcessor option flag (#19526) 2022-10-12 15:08:48 +02:00
Ritik Nandwal
e94384e4d8
Add depth estimation pipeline (#18618)
* Add initial files for depth estimation pipelines

* Add test file for depth estimation pipeline

* Update model mapping names

* Add updates for depth estimation output

* Add generic test

* Hopefully fixing the tests.

* Check if test passes

* Add make fixup and make fix-copies changes after rebase with main

* Rebase with main

* Fixing up depth pipeline.

* This is not used anymore.

* Fixing the test. `Image` is a module `Image.Image` is the type.

* Update docs/source/en/main_classes/pipelines.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-12 08:54:20 -04:00
NielsRogge
4d367a3c81
Add LiLT (#19450)
* First draft

* Fix more things

* Improve more things

* Remove some head models

* Fix more things

* Add missing layers

* Remove tokenizer

* Fix more things

* Fix copied from statements

* Make all tests pass

* Remove print statements

* Remove files

* Fix README and docs

* Add integration test and fix organization

* Add tips

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Make tests faster, improve docs

* Fix doc tests

* Add model to toctree

* Add docs

* Add note about creating new checkpoint

* Remove is_decoder

* Make tests smaller, add docs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-12 10:11:20 +02:00
Wang, Yi
7543e275d4
update doc for perf_train_cpu_many (#19506)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-10-11 22:54:19 -04:00
Mathieu Jouffroy
5ca131f3d4
[CvT] Tensorflow implementation (#18597)
* implemented TFCvtModel and TFCvtForImageClassification and modified relevant files, added an exception in convert_tf_weight_name_to_pt_weight_name, added quick testing file to compare with pytorch model

* added docstring + testing file in transformers testing suite

* added test in testing file, modified docs to pass repo-consistency, passed formatting test

* refactoring + passing all test

* small refacto, removing unwanted comments

* improved testing config

* corrected import error

* modified acces to pretrained model archive list, to pass tf_test

* corrected import structure in init files

* modified testing for keras_fit with cpu

* correcting PR issues + Refactoring

* Refactoring : improving readability and reducing the number of permutations

* corrected momentum value + cls_token initialization

* removed from_pt as weights were added to the hub

* Update tests/models/cvt/test_modeling_tf_cvt.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-10-11 18:16:52 +01:00
amyeroberts
e3f028f3af
Add TF whisper (#19378)
* simplify loop

* add featur extractor

* add model

* start conversion

* add dropout

* initial commit of test files

* copnversion for all models

* update processor for correct padding

* update feature extraction

* update integration test logits match

* fmnt: off for the logits

* on the fly mel bank

* small nit

* update test

* update tokenizer

* nit feature extraction

* update

* update tokenizer test

* adds logit processor and update tokenizer to get supress tokens

* style

* clean convert

* revert to original modeling tf utils

* Update

* update

* nit

* clean convert file

* update tests and nits

* quality

* slow generation test

* ffn_dim to allow customization

* update readme

* add to toctreee

* start fixing integration tests

* update tests and code

* fix feature extractor

* fix config tests common

* update code to fix tests

* fix feature exctractor

* nit feature extraction

* update test for new feature extractor

* style

* add absrtact

* large logits wioth custom decoder input ids

* wraap around is otrch available

* fix feature extractor

* correct logits for whisper small.en

* nit

* fix encoder_attentino_mask

* some fixes

* remove unnecessary inputs

* nits

* add normalizer file

* update etst tokenization

* fix attention mask not defined

* fix generate

* remove uncoder attention mask useless

* update test modeling whisper

* update condfig to add second non supress tokens

* nits on feature exrtactor

* nit for test tokenizers

* update etsts

* update tests

* update tokenization test

* fixup

* invalidated hf token. Clean convert openai to whisper

* fix logit tests

* fixup

* Add model to README

* Fix doc tests

* clean merge

* revert toc_tree changes

* remove useless LogitProcessor

* Update whisper .mdx

* update config file doc

* update configuration docstring

* update test tokenization

* update test tokenization

* update tokenization whisper
Added copied from where needed

* update feature extraction

* nit test name

* style

* quality

* remove get suppress tokens and update non_speech tokens global variables

* Update src/transformers/models/whisper/feature_extraction_whisper.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* clean modeling whisper and test
Removed the attention mask arguments that are deprecated

* fix large test

* Add multilingual audio test, and translate test

* style

* fix larg multilingual test

* nits

* add copied from for attention layer

* remove attention masks in doc

* add english normalizer

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update tokenization test

* remove copied from in whisper attention : no bias in k_proj only

* wrap around dependencies in english normalizer

* style

* correct import generation logits

* for now, wrap feature extractor with torch

* remove torch depencies for feature extraction and style

* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixup

* nit

* update logitds

* style

* nit

* nits and fix final tests

* add `is_more_itertools_available` to utils

* quality

* add begin supress tokens, supress tokens to generate args and config

* clean supressTokensLogitProcessor in generation logits

* Nit naming

* add supressTokensAtBegin

* udpate tests, supress tokens to None or correct values

* nit and style

* update RAG to fit test and generate_logit

* add copy pasted statment on english normalizer

* add arguments to config_common_kwargs

* Update src/transformers/generation_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/generation_logits_process.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* revert changes based on reviews

* update doc and nits

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* more nits

* last nits

* update test configuration common

* add BART name in decoder attention mask documentation

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* style

* nit

* nit

* add english.json file to git

* nits on documentation

* nit

* nits

* last styling

* add main toctree file

* remove sentence piece dependency

* clean init file

* fix tokenizer that has no dependencies on sentencepiece

* update whisper init file, nit

* remove english.json file

* add get decoder prompt id

* All weights loading

* Remove hanging pdb

* Fixup and tidy up

* Use same copied from as PT model

* Remove whitespace changes

* Remove torch references

* Tie embeddings

* Remove logits processor input to generate

* Update logit values

* revert changes and add forced logit processor

* nit

* clean normalizer

* remove protected

* Add logit processors and update generation code & tests

* Some tidy up

* Update docstring

* update

* update based on review

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update to reflect changes on the PT model branch

* Tidy up

* Remove extra whitespace

* Fix test - make input ids small enough we can append

* Include upstream changes on main

* PR comments - add batch tests, remove comments & defaults

* Fix model output imports

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/whisper/test_modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update docstring example

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove changes to adjust_logits_during_generation function

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Tidy up imports that don't require TF

* Update tests - skip and no more skip

* Update tests/generation/test_generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Add training flags

* Add (skipped) XLA generation tests

* Add embedding correctness test

* Add constant ids for generation tests

* Make logits finding a bit tidier

* Remove unused args

* xla generation enabled

* Don't skip XLA tests anymore

* Fix tests - add position ids to expected signature and update rag generation

* Undo method reorder

* Remove added whitespace

* Remove copy-paste gradient checkopint ref

* Remove

* Trigger CI - (issue with refs when pulling)

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2022-10-10 14:48:17 +01:00
APAVOU Clément
af69360bf9
Add OPTForQuestionAnswering (#19402)
* Add `OPTForQuestionAnswering`

- added `OPTForQuestionAnswering` class based on `BloomForQuestionAnswering`
- added `OPTForQuestionAnswering` in common tests
- all common tests pass
- make fixup done

* added docstrings for OPTForQuestionAnswering

* Fix docstrings for OPTForQuestionAnswering
2022-10-10 09:30:59 -04:00
Mohit Sharma
3080bb4754
Add onnx support for VisionEncoderDecoder (#19254)
* Add onnx support for VisionEncoderDecoder

* Add onnx support for VisionEncoderDecoder

* Removed unused import

* Rename encoder hidden state

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update docstrings and removed redundant code

* Added test function for enc-dec models

* Update doc string text

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* fixed code style

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-10 09:20:19 -04:00
Amrit Sahu
e9a49babee
[WIP] Add ZeroShotObjectDetectionPipeline (#18445) (#18930)
* Add ZeroShotObjectDetectionPipeline (#18445)

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add batching for ZeroShotObjectDetectionPipeline

* Fix doc-string ZeroShotObjectDetectionPipeline

* Fix output format: ZeroShotObjectDetectionPipeline
2022-10-07 10:00:19 -04:00
Bibhabasu Mohapatra
e162cebfa3
add ONNX support for swin transformer (#19390)
* swin transformer onnx support

* Updated image dimensions as dynamic

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-07 09:23:24 -04:00
Alara Dirik
ae3e3bc60a
fix docs example, add object_detection to DETR docs (#19377) 2022-10-07 00:02:26 +02:00
Arthur
45e14038f2
Add WhisperModel to transformers (#19166)
* simplify loop

* add featur extractor

* add model

* start conversion

* add dropout

* initial commit of test files

* copnversion for all models

* update processor for correct padding

* update feature extraction

* update integration test logits match

* fmnt: off for the logits

* on the fly mel bank

* small nit

* update test

* update tokenizer

* nit feature extraction

* update

* update tokenizer test

* adds logit processor and update tokenizer to get supress tokens

* style

* clean convert

* revert to original modeling tf utils

* Update

* update

* nit

* clean convert file

* update tests and nits

* quality

* slow generation test

* ffn_dim to allow customization

* update readme

* add to toctreee

* start fixing integration tests

* update tests and code

* fix feature extractor

* fix config tests common

* update code to fix tests

* fix feature exctractor

* nit feature extraction

* update test for new feature extractor

* style

* add absrtact

* large logits wioth custom decoder input ids

* wraap around is otrch available

* fix feature extractor

* correct logits for whisper small.en

* nit

* fix encoder_attentino_mask

* some fixes

* remove unnecessary inputs

* nits

* add normalizer file

* update etst tokenization

* fix attention mask not defined

* Add model to README

* Fix doc tests

* fix generate

* remove uncoder attention mask useless

* update test modeling whisper

* update condfig to add second non supress tokens

* nits on feature exrtactor

* nit for test tokenizers

* update etsts

* update tests

* update tokenization test

* fixup

* invalidated hf token. Clean convert openai to whisper

* fix logit tests

* fixup

* clean merge

* revert toc_tree changes

* remove useless LogitProcessor

* Update whisper .mdx

* update config file doc

* update configuration docstring

* update test tokenization

* update test tokenization

* update tokenization whisper
Added copied from where needed

* update feature extraction

* nit test name

* style

* quality

* remove get suppress tokens and update non_speech tokens global variables

* Update src/transformers/models/whisper/feature_extraction_whisper.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* clean modeling whisper and test
Removed the attention mask arguments that are deprecated

* fix large test

* Add multilingual audio test, and translate test

* style

* fix larg multilingual test

* nits

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add copied from for attention layer

* remove attention masks in doc

* add english normalizer

* update tokenization test

* remove copied from in whisper attention : no bias in k_proj only

* wrap around dependencies in english normalizer

* style

* correct import generation logits

* for now, wrap feature extractor with torch

* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* remove torch depencies for feature extraction and style

* fixup

* nit

* update logitds

* style

* nit

* nits and fix final tests

* add `is_more_itertools_available` to utils

* quality

* add begin supress tokens, supress tokens to generate args and config

* clean supressTokensLogitProcessor in generation logits

* Nit naming

* add supressTokensAtBegin

* udpate tests, supress tokens to None or correct values

* nit and style

* update RAG to fit test and generate_logit

* add copy pasted statment on english normalizer

* add arguments to config_common_kwargs

* Update src/transformers/generation_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/generation_logits_process.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* revert changes based on reviews

* update doc and nits

* more nits

* last nits

* update test configuration common

* add BART name in decoder attention mask documentation

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* style

* nit

* nit

* add english.json file to git

* nits on documentation

* nit

* nits

* last styling

* add main toctree file

* remove sentence piece dependency

* clean init file

* fix tokenizer that has no dependencies on sentencepiece

* update whisper init file, nit

* remove english.json file

* add get decoder prompt id

* revert changes and add forced logit processor

* nit

* clean normalizer

* remove protected

* update

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update based on review

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add batched tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-05 22:28:31 +02:00
Alara Dirik
07e94bf159
Maskformer post-processing fixes and improvements (#19172)
- Improves MaskFormer docs, corrects minor typos
- Restructures MaskFormerFeatureExtractor.post_process_panoptic_segmentation for better readability, adds target_sizes argument for optional resizing
- Adds post_process_semantic_segmentation and post_process_instance_segmentation methods.
- Adds a deprecation warning to post_process_segmentation method in favour of post_process_instance_segmentation
2022-10-05 15:27:15 +03:00
Younes Belkada
587d84b178
Add BloomForQuestionAnswering (#19310)
* add bloom for question answering

- attempt to add Bloom for question answering
- adapted from `GPTJForQuestionAnswering`
- Fixed `num_labels` to `2` for common tests
- Added a bit of docstring
- All common tests pass

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert changes related to `num_labels`

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-04 17:52:13 +02:00
Steven Liu
68f50f3453
Breakup export guide (#19271)
* split onnx and torchscript docs

* make style

* apply reviews
2022-10-03 13:18:29 -07:00
Alara Dirik
36f52e9593
Restructure DETR post-processing, return prediction scores (#19262)
* Restructure DetrFeatureExtractor post-processing methods
* Update post_process_instance_segmentation and post_process_panoptic_segmentation methods to return prediction scores
* Update DETR models docs
2022-10-03 12:02:51 +03:00
Kashif Rasul
5cd16f01db
time series forecasting model (#17965)
* initial files

* initial model via cli

* typos

* make a start on the model config

* ready with configuation

* remove tokenizer ref.

* init the transformer

* added initial model forward to return dec_output

* require gluonts

* update dep. ver table and add as extra

* fixed typo

* add type for prediction_length

* use num_time_features

* use config

* more config

* typos

* opps another typo

* freq can be none

* default via transformation is 1

* initial transformations

* fix imports

* added transform_start_field

* add helper to create pytorch dataloader

* added inital val and test data loader

* added initial distr head and loss

* training working

* remove TimeSeriesTransformerTokenizer

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed copyright

* removed docs

* remove time series tokenizer

* fixed docs

* fix text

* fix second

* fix default

* fix order

* use config directly

* undo change

* fix comment

* fix year

* fix import

* add additional arguments for training vs. test

* initial greedy inference loop

* fix inference

* comment out token inputs to enc dec

* Use HF encoder/decoder

* fix inference

* Use Seq2SeqTSModelOutput output

* return Seq2SeqTSPredictionOutput

* added default arguments

* fix return_dict true

* scale is a tensor

* output static_features for inference

* clean up some unused bits

* fixed typo

* set return_dict if none

* call model once for both train/predict

* use cache if future_target is none

* initial generate func

* generate arguments

* future_time_feat is required

* return SampleTSPredictionOutput

* removed unneeded classes

* fix when params is none

* fix return dict

* fix num_attention_heads

* fix arguments

* remove unused shift_tokens_right

* add different dropout configs

* implement FeatureEmbedder, Scaler and weighted_average

* remove gluonts dependency

* fix class names

* avoid _variable names

* remove gluonts dependency

* fix imports

* remove gluonts from configuration

* fix docs

* fixed typo

* move utils to examples

* add example requirements

* config has no freq

* initial run_ts_no_trainer

* remove from ignore

* fix output_attentions and removed unsued getters/setters

* removed unsed tests

* add dec seq len

* add test_attention_outputs

* set has_text_modality=False

* add config attribute_map

* make style

* make fix-copies

* add encoder_outputs to TimeSeriesTransformerForPrediction forward

* Improve docs, add model to README

* added test_forward_signature

* More improvements

* Add more copied from

* Fix README

* Fix remaining quality issues

* updated encoder and decoder

* fix generate

* output_hidden_states and use_cache are optional

* past key_values returned too

* initialize weights of distribution_output module

* fixed more tests

* update test_forward_signature

* fix return_dict outputs

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* removed commented out tests

* added neg. bin and normal output

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* move to one line

* Add docstrings

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add try except for assert and raise

* try and raise exception

* fix the documentation formatting

* fix assert call

* fix docstring formatting

* removed input_ids from DOCSTRING

* Update input docstring

* Improve variable names

* Update order of inputs

* Improve configuration

* Improve variable names

* Improve docs

* Remove key_length from tests

* Add extra docs

* initial unittests

* added test_inference_no_head test

* added test_inference_head

* add test_seq_to_seq_generation

* make style

* one line

* assert mean prediction

* removed comments

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix order of args

* make past_observed_mask optional as well

* added Amazon license header

* updated utils with new fieldnames

* make style

* cleanup

* undo position of past_observed_mask

* fix import

* typo

* more typo

* rename example files

* remove example for now

* Update docs/source/en/_toctree.yml

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update modeling_time_series_transformer.py

fix style

* fixed typo

* fix typo and grammer

* fix style

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-30 15:32:59 -04:00
Joao Gante
cfb777f27c
Docs - Guide to add a new TensorFlow model (#19256)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-09-30 20:30:38 +01:00
Matt
368b649af6
Rebase ESM PR and update all file formats (#19055)
* Rebase ESM PR and update all file formats

* Fix test relative imports

* Add __init__.py to the test dir

* Disable gradient checkpointing

* Remove references to TFESM... FOR NOW >:|

* Remove completed TODOs from tests

* Convert docstrings to mdx, fix-copies from BERT

* fix-copies for the README and index

* Update ESM's __init__.py to the modern format

* Add to _toctree.yml

* Ensure we correctly copy the pad_token_id from the original ESM model

* Ensure we correctly copy the pad_token_id from the original ESM model

* Tiny grammar nitpicks

* Make the layer norm after embeddings an optional flag

* Make the layer norm after embeddings an optional flag

* Update the conversion script to handle other model classes

* Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py

* Break the copied from link from BertModel.forward to remove token_type_ids

* Remove debug array saves

* Begin ESM-2 porting

* Add a hacky workaround for the precision issue in original repo

* Code cleanup

* Remove unused checkpoint conversion code

* Remove unused checkpoint conversion code

* Fix copyright notices

* Get rid of all references to the TF weights conversion

* Remove token_type_ids from the tests

* Fix test code

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add credit

* Remove _ args and __ kwargs in rotary embedding

* Assertively remove asserts

* Replace einsum with torch.outer()

* Fix docstring formatting

* Remove assertions in tokenization

* Add paper citation to ESMModel docstring

* Move vocab list to single line

* Remove ESMLayer from init

* Add Facebook copyrights

* Clean up RotaryEmbedding docstring

* Fix docstring formatting

* Fix docstring for config object

* Add explanation for new config methods

* make fix-copies

* Rename all the ESM- classes to Esm-

* Update conversion script to allow pushing to hub

* Update tests to point at my repo for now

* Set config properly for tests

* Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted

* make fixup

* Update expected values for slow tests

* make fixup

* Remove EsmForCausalLM for now

* Remove EsmForCausalLM for now

* Fix padding idx test

* Updated README and docs with ESM-1b and ESM-2 separately (#19221)

* Updated README and docs with ESM-1b and ESM-2 separately

* Update READMEs, longer entry with 3 citations

* make fix-copies

Co-authored-by: Your Name <you@example.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Tom Sercu <tsercu@fb.com>
Co-authored-by: Your Name <you@example.com>
2022-09-30 14:16:25 +01:00
NielsRogge
f3d2f7a6e0
Add MarkupLM (#19198)
* First draft

* Make basic test work

* Fix most tokenizer tests

* More improvements

* Make more tests pass

* Fix more tests

* Fix some code quality

* Improve truncation

* Implement feature extractor

* Improve feature extractor and add tests

* Improve feature extractor tests

* Fix pair_input test partly

* Add fast tokenizer

* Improve implementation

* Fix rebase

* Fix rebase

* Fix most of the tokenizer tests.

* propose solution for fast

* add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer

* add: modify markuplmconverter

* add: some modify on converter and tokenizerfast

* Fix style, copies

* Make fixup

* Update tokenization_markuplm.py

* Update test_tokenization_markuplm.py

* Update markuplm related

* Improve processor, add integration test

* Add processor test file

* Improve processor

* Improve processor tests

* Fix more processor tests

* Fix processor tests

* Update docstrings

* Add Copied from statements

* Add more Copied from statements

* Add code examples

* Improve code examples

* Add model to doc tests

* Adding dependency check

* Add dummy file

* Add requires_backends

* Add model to toctree

* Fix more things, disable dependency check for now

* Apply more suggestions

* Add soft dependency

* Add annotators to tests

* Fix style

* Remove from_slow=True

* Remove print statements

* Add sanity check

* Fix processor test

* Fix processor tests, add more docs

* Add doc tests for mdx file

* Add more tips

* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: lockon-n <dd098309@126.com>
2022-09-30 08:25:43 +02:00
mustapha ajeghrir
ba9e336fa3
Fix m2m_100.mdx doc example missing labels (#19149)
The `labels` variable is not defined, the `model_inputs` already contain this information.
2022-09-29 13:27:58 +02:00
Aritra Roy Gosthipaty
0dc7b3a785
[TensorFlow] Adding GroupViT (#18020)
* chore: initial commit

* chore: adding util methods

yet to work on the nn.functional.interpolate port with align_corener=True

* chore: refactor the utils

* used tf.compat.v1.image.resize to align the F.interpolate function
* added type hints to the method signatures
* added references to the gists where one 2 one alignment of torch and tf has been shown

* chore: adding the layers

* chore: porting all the layers from torch to tf

This is the initial draft, nothing is tested yet.

* chore: aligning the layers with reference to tf clip

* chore: aligning the modules

* added demaraction comments
* added copied and adapted from comments

* chore: aligning with CLIP

* chore: wrangling the layers to keep it tf compatible

* chore: aligning the names of the layers for porting

* chore: style changes

* chore: adding docs and inits

* chore: adding tfp dependencis

the code is taken from TAPAS

* chore: initial commit for testing

* chore: aligning the vision embeddings with the vit implementatino

* chore: changing model prefix

* chore: fixing the name of the model and the layer normalization test case

* chore: every test passes but the slow ones

* chore: fix style and integration test

* chore: moving comments below decorators

* chore: make fixup and fix-copies changes

* chore: adding the Vision and Text Model to check_repo

* chore: modifying the prefix name to align it with the torch implementation

* chore: fix typo in configuration

* choer: changing the name of the model variable

* chore: adding segmentation flag

* chore: gante's review

* chore: style refactor

* chore: amy review

* chore: adding shape_list to parts that have been copied from other snippets

* chore: init batchnorm with torch defaults

* chore: adding shape_list to pass the tests

* test fix: adding seed as 0

* set seed

* chore: changing the straight through trick to fix -ve dimensinos

* chore: adding a dimension to the loss

* chore: adding reviewers and contributors names to the docs

* chore: added changes after review

* chore: code quality fixup

* chore: fixing the segmentation snippet

* chore: adding  to the layer calls

* chore: changing int32 to int64 for inputs of serving

* chore: review changes

* chore: style changes

* chore: remove from_pt=True

* fix: repo consistency

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 10:48:04 +01:00
Steven Liu
6957350c2b
Focus doc around preprocessing classes (#18768)
* 📝 reframe docs around preprocessing classes

* small edits

* edits and review

* fix typo

* apply review

* clarify processor
2022-09-28 17:09:44 -07:00
Steven Liu
990936a868
Move AutoClasses under Main Classes (#19163)
* move autoclasses to main classes

* keep auto.mdx in model_doc
2022-09-28 17:09:29 -07:00
Wang, Yi
88f597ba6a
add doc for hyperparameter search (#19192)
* add doc for hyperparameter search

* update doc
2022-09-27 07:51:51 -04:00
Sylvain Gugger
c20b2c7e18
Use repo_type instead of deprecated datasets repo IDs (#19202)
* Use repo_type instead of deprecated datasets repo IDs

* Add missing one in doc
2022-09-26 09:50:48 -04:00
Alara Dirik
7e84723fe4
Add semantic segmentation post-processing method to MobileViT (#19105)
* add post-processing method for semantic segmentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-23 16:24:28 +03:00
Wang, Yi
e5b7cff5fe
update perf_train_cpu_many doc (#19151)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-09-22 09:20:15 -04:00
NielsRogge
cf6308ef9b
Improve conditional detr docs (#19154)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-22 13:21:05 +02:00
Sayak Paul
2d9853b226
MSN (Masked Siamese Networks) for ViT (#18815)
* feat: modeling and conversion scripts for msn.

* chore: change license year.

* chore: remove unneeded modules.

* feat: direct loading of state_dict from remote url.

* fix: import paths.

* add: rest of the files.

* add and fix rest of the files.

Co-authored-by: Niels <niels.rogge1@gmail.com>

* chore: formatting.

* code quality fix.

* chore: remove pooler.

* feat: add classification top.

* fix: configuration object.

* add: initial test cases (one failing).

* fix: basemodeloutput.

* add: caution on using the classification head.

* add: rest of the model related files.

* add: vit msn readme.

* fix: copied from statement.

* fix: dummy objects.

* add: ViTMSNPreTrainedModel to inits.

* fix: repo consistency.

* minor change in the model doc.

* fix: tests.

* Empty-Commit

* Update src/transformers/models/vit_msn/configuration_vit_msn.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address PR comments.

* Update src/transformers/models/vit_msn/modeling_vit_msn.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* chore: put model in no_grad() and formatting.

Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2022-09-22 07:15:03 -04:00
NielsRogge
9393f966bc
[fix] Add DeformableDetrFeatureExtractor (#19140)
* Add DeformableDetrFeatureExtractor

* Fix post_process

* Fix name

* Add tests for feature extractor

* Fix doc tests

* Fix name

* Address comments

* Apply same fix to DETR and YOLOS as well

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-22 09:45:24 +02:00
DepuMeng
126a739058
Add support for conditional detr (#18948)
* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* add more fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some variable names & added more fix-copies

* fixed some variable names & added more fix-copies

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added more copied from

* fixed quality

* changed pretrained config

* added more copied-from and fixed the issue in feature_extraction_auto

* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* added conditional_detr files

* checked copies

* fixed style and copies

* fixed some doc issue

* changed prefix to ConditionalDetr

* fixed docs

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* add more fix-copies

* fixed some variable names & added more fix-copies

* fixed some variable names & added more fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added more copied from

* fixed quality

* changed pretrained config

* added more copied-from and fixed the issue in feature_extraction_auto

* fixed style

* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* added conditional_detr files

* checked copies

* fixed style and copies

* fixed some doc issue

* changed prefix to ConditionalDetr

* fixed docs

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* add more fix-copies

* fixed some variable names & added more fix-copies

* fixed some variable names & added more fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added more copied from

* fixed quality

* changed pretrained config

* added more copied-from and fixed the issue in feature_extraction_auto

* rebased

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Depu Meng <depumeng@Depus-MacBook-Pro.local>
2022-09-22 09:45:04 +02:00
Alara Dirik
e7fdfc720a
Add post_process_semantic_segmentation method to DPTFeatureExtractor (#19107)
* add post-processing method for semantic segmentation

* add test for post-processing
2022-09-21 15:15:26 +03:00
Alara Dirik
9e95706648
Add post_process_semantic_segmentation method to SegFormer (#19072)
* add post_process_semantic_segmentation method to SegformerFeatureExtractor
* add test for semantic segmentation post-processing
2022-09-21 11:40:35 +03:00
Alara Dirik
c81ebd1c39
Beit postprocessing (#19099)
* add post_process_semantic_segmentation method to BeiTFeatureExtractor
2022-09-20 10:41:56 +03:00
NielsRogge
e7206ceab9
Improve vision models docs (#19103)
* Add tips

* Add BEiT figure

* Fix URL

* Move tip to start

* Add tip to TF model as well

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-19 19:22:34 +02:00
Stas Bekman
8edf196310
[doc] debug: fix import (#19042)
correct the import statement
2022-09-14 16:29:58 -07:00
Hakjin Lee
abca1741cf
Fix a broken link for deepspeed ZeRO inference in the docs (#19001)
* Fix a broken link for deepspeed ZeRO inference

* fix link

Co-authored-by: Stas Bekman <stas@stason.org>
2022-09-14 16:21:06 -07:00
Shinya Otani
f5f430e5c8
Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814)
* add gpt-neox-japanese model and tokenizer as new model

* Correction to PR's comment for GPT NeoX Japanese
- Fix to be able to use gpu
- Add comment # Copied... at the top of RotaryEmbedding
- Implement nn.Linear instead of original linear class
- Add generation test under @slow

* fix bias treatment for gpt-neox-japanese

* Modidy gpt-neox-japanese following PR
- add doc for bias_dropout_add
- style change following a PR comment

* add document for gpt-neox-japanese

* remove unused import from gpt-neox-japanese

* fix README for gpt-neox-japanese
2022-09-14 10:17:40 -04:00
NielsRogge
59407bbeb3
Add Deformable DETR (#17281)
* First draft

* More improvements

* Improve model, add custom CUDA code

* Import torch before

* Add script that imports custom layer

* Add everything in new ops directory

* Import custom layer in modeling file

* Fix ARCHIVE_MAP typo

* Creating the custom kernel on the fly.

* Import custom layer in modeling file

* More improvements

* Fix CUDA loading

* More improvements

* Improve conversion script

* Improve conversion script

* Make it work until encoder_outputs

* Make forward pass work

* More improvements

* Make logits match original implementation

* Make implementation also support single_scale model

* Add support for single_scale and dilation checkpoint

* Add support for with_box_refine model

* Support also two stage model

* Improve tests

* Fix more tests

* Make more tests pass

* Upload all models to the hub

* Clean up some code

* Improve decoder outputs

* Rename intermediate hidden states and reference points

* Improve model outputs

* Move tests to dedicated folder

* Improve model outputs

* Fix retain_grad test

* Improve docs

* Clean up and make test_initialization pass

* Improve variable names

* Add copied from statements

* Improve docs

* Fix style

* Improve docs

* Improve docs, move tests to model folder

* Fix rebase

* Remove DetrForSegmentation from auto mapping

* Apply suggestions from code review

* Improve variable names and docstrings

* Apply some more suggestions from code review

* Apply suggestion from code review

* better docs and variables names

* hint to num_queries and two_stage confusion

* remove asserts and code refactor

* add exception if two_stage is True and with_box_refine is False

* use f-strings

* Improve docs and variable names

* Fix code quality

* Fix rebase

* Add require_torch_gpu decorator

* Add pip install ninja to CI jobs

* Apply suggestion of @sgugger

* Remove DeformableDetrForObjectDetection from auto mapping

* Remove DeformableDetrModel from auto mapping

* Add model to toctree

* Add model back to mappings, skip model in pipeline tests

* Apply @sgugger's suggestion

* Fix imports in the init

* Fix copies

* Add CPU implementation

* Comment out GPU function

* Undo previous change

* Apply more suggestions

* Remove require_torch_gpu annotator

* Fix quality

* Add logger.info

* Fix logger

* Fix variable names

* Fix initializaztion

* Add missing initialization

* Update checkpoint name

* Add model to doc tests

* Add CPU/GPU equivalence test

* Add Deformable DETR to pipeline tests

* Skip model for object detection pipeline

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-09-14 11:45:21 +02:00
Chris Emezue
470799b3a6
Removed issue in wav2vec link (#18945)
Fix connected to [this issue](https://github.com/huggingface/transformers/issues/18944)
2022-09-12 21:59:19 +02:00
Tobias Nusser
4c2e983f44
Fixed typo (#18921)
Fixed typo itmes --> items
2022-09-12 21:03:48 +02:00
Rafał Jankowski
85125fcffd
Neptune.ai integration improvements (#18934)
* NeptuneCallback improvements

* After review suggestions and deduplication of initial run

* Added volatile checkpoints support due to missing post-rebase commit

* Update README per review comments

- Remove list formatting
- Correct Neptune docs link

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>
2022-09-09 11:37:34 -04:00
HuYong
22f7218560
add task_type_id to BERT to support ERNIE-2.0 and ERNIE-3.0 models (#18686)
* add_ernie

* remove Tokenizer in ernie

* polish code

* format code style

* polish code

* fix style

* update doc

* make fix-copies

* change model name

* change model name

* fix dependency

* add more copied from

* rename ErnieLMHeadModel to ErnieForCausalLM
do not expose ErnieLayer
update doc

* fix

* make style

* polish code

* polish code

* fix

* fix

* fix

* fix

* fix

* final fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-09 07:36:46 -04:00
NielsRogge
bb6f6d5338
Add X-CLIP (#18852)
* First draft

* Improve conversion script

* Make vision encoder work

* More improvements

* Improve conversion script

* Fix quality

* Add MultiframeIntegrationTransformer

* More improvements

* Make MiT output work

* Fix quality

* Add prompts generator

* Add tests

* Fix some tests

* Fix some more tests

* Fix more tests

* Improve conversion script

* Fix model outputs

* Fix more tests

* Add XClipProcessor

* Use processor in conversion script

* Fix integration test

* Update README, fix docs

* Fix all tests

* Add MIT output to XClipOutput

* Create better variable names

* Rename XClip to XCLIP

* Extend conversion script

* Add support for large models

* Add support for 16 frame models

* Add another model'

* Fix module issue

* Apply suggestions from code review

* Add figure to docs

* Fix CLIPProcessor issue

* Apply suggestions from code review

* Delete file

* Convert more checkpoints

* Convert last checkpoint

* Update nielsr to microsoft
2022-09-08 14:50:30 +02:00
Devlee247
9832ac7c73
Fix LayoutXLM wrong link in README (#18932)
* fix LayoutXLM wrong link in README

* fix LayoutXLM worng link in index.mdx
2022-09-08 07:32:41 -04:00
Steven Liu
90f6fe9155
Skip some doctests in quicktour (#18927)
* skip some code examples for doctests

* make style

* fix code snippet formatting

* separate code snippet into two blocks
2022-09-07 14:45:22 -07:00
Ankur Goyal
2ef7742117
Add DocumentQuestionAnswering pipeline (#18414)
* [WIP] Skeleton of VisualQuestionAnweringPipeline extended to support LayoutLM-like models

* Fixup

* Use the full encoding

* Basic refactoring to DocumentQuestionAnsweringPipeline

* Cleanup

* Improve args, docs, and implement preprocessing

* Integrate OCR

* Refactor question_answering pipeline

* Use refactored QA code in the document qa pipeline

* Fix tests

* Some small cleanups

* Use a string type annotation for Image.Image

* Update encoding with image features

* Wire through the basic docs

* Handle invalid response

* Handle empty word_boxes properly

* Docstring fix

* Integrate Donut model

* Fixup

* Incorporate comments

* Address comments

* Initial incorporation of tests

* Address Comments

* Change assert to ValueError

* Comments

* Wrap `score` in float to make it JSON serializable

* Incorporate AutoModeLForDocumentQuestionAnswering changes

* Fixup

* Rename postprocess function

* Fix auto import

* Applying comments

* Improve docs

* Remove extra assets and add copyright

* Address comments

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-07 13:38:49 -04:00
Matt
2b9513fdab
Update TF fine-tuning docs (#18654)
* Update TF fine-tuning docs

* Fix formatting

* Add some section headers so the right sidebar works better

* Squiggly it

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Explain things in the text, not the comments

* Make the two dataset creation methods into a list

* Move the advice about collation out of a <Tip>

* Edits for clarity

* Edits for clarity

* Edits for clarity

* Replace `to_tf_dataset` with `prepare_tf_dataset` in the fine-tuning pages

* Restructure the page a little bit

* Restructure the page a little bit

* Restructure the page a little bit

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-07 13:30:07 +01:00
Ekagra Ranjan
0a632f076d
Fix incorrect size of input for 1st strided window length in Perplexity of fixed-length models (#18906)
* update the PPL for stride 512

* fix 1st strided window size

* linting

* fix typo

* styling
2022-09-06 15:20:12 -04:00
Ekagra Ranjan
f85acb4d73
Fix decode_input_ids to bare T5Model and improve doc (#18791)
* use tokenizer to output tensor

* add preprocessing for decoder_input_ids for bare T5Model

* add preprocessing to tf and flax

* linting

* linting

* Update src/transformers/models/t5/modeling_flax_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-09-06 14:12:26 +02:00
Surya Prakash Sahu
17c634fd5b
Update perf_train_gpu_one.mdx (#18442) 2022-09-05 14:06:36 +02:00
Lysandre Debut
591cfc6c90
Mention TF and Flax checkpoints (#18894) 2022-09-05 11:09:39 +02:00
Steven Liu
65fb71bc76
Add Trainer to quicktour (#18723)
* 📝 update quicktour

* 📝 add trainer section

* 🖍 markdown table, apply feedbacks

*  make style

* add tf training section

* make style
2022-09-02 15:05:31 -05:00
Steven Liu
ae32f3afef
Finetune guide for semantic segmentation (#18640)
* 📝 first draft

* oops add to toctree

* make style

* 📝 add inference section

* 🖍 make style

* 📝 add images

* 🖍 apply feedbacks

* remove num_labels and pytorch block

* apply feedbacks, add colab notebook

Co-authored-by: Steven <stevhliu@gmail.com>
2022-09-02 14:29:51 -05:00
Steven Liu
bf9d506137
Update docs landing page (#18590)
* 📝 update docs landing page

* 🖍 apply feedbacks

* apply feedbacks

* apply feedbacks, use <br> for list
2022-09-02 14:29:06 -05:00
Jason Phang
53e33e6f1b
PEGASUS-X (#18551)
* PegasusX Initial commit

* rename

* pegasus X implementation

* pegx update

* pegx fix

* pegasus-x fixes

* pegx updates

* cleanup

* cleanup

* cleanup

* tests

* stylefixes

* Documentation update

* Model hub fix

* cleanup

* update

* update

* testfix

* Check fix

* tweaks for merging

* style

* style

* updates for pr

* style

* change pegasus-x repo
2022-09-02 19:54:02 +02:00
NielsRogge
17981faf67
Add OWL-ViT to the appropriate section (#18867)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-02 15:59:25 +02:00
NielsRogge
c60dd98e87
[LayoutLM] Add clarification to docs (#18716)
* Add clarification

* Add another clarification

* Apply suggestion

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-02 14:48:19 +02:00
OlivierDehaene
129d73294e
Fix naming issue with ImageToText pipeline (#18864)
Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-02 07:55:30 -04:00
Steven Liu
142e12afb4
Split docs on modality (#18205)
* update

* 🖍 add missing files

* 📝 add nested sections

* 🖍 align titles with tasks

* oops

* remove quotes from titles
2022-09-01 15:19:11 -05:00
OlivierDehaene
ddb69e5af8
Add Image To Text Generation pipeline (#18821)
* Add Image2TextGenerationPipeline to supported pipelines

* Add Flax and Tensorflow support

* Add Flax and Tensorflow small tests

* Add default model for Tensorflow

* Add docstring

* Fix doc style

* Add tiny models for pytorch and flax

* Remove flax from pipeline.
Fix tests

* Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and Tensorflow

* Fix Tensorflow support

Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-01 12:07:14 -04:00
Sayak Paul
954e18ab97
TensorFlow MobileViT (#18555)
* initial implementation.

* add: working model till image classification.

* add: initial implementation that passes intg tests.

Co-authored-by: Amy <aeroberts4444@gmail.com>

* chore: formatting.

* add: tests (still breaking because of config mismatch).

Coo-authored-by: Yih <2521628+ydshieh@users.noreply.github.com>

* add: corrected tests and remaning changes.

* fix code style and repo consistency.

* address PR comments.

* address Amy's comments.

* chore: remove from_pt argument.

* chore: add full-stop.

* fix: TFLite model conversion in the doc.

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply formatting.

* chore: remove comments from the example block.

* remove identation in the example.

Co-authored-by: Amy <aeroberts4444@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-01 10:35:15 -04:00
Pedro Cuenca
f719c0377f
Minor typo in prose of model outputs documentation. (#18848) 2022-09-01 12:05:40 +02:00
lewtun
80367cd1fb
Add security warning about the from_pretrained() method (#18801)
* Add security warning about from_pretrained() method

* Add sentence about malware scanner

Co-authored-by: Julien Chaumond <julien@huggingface.co>
2022-08-31 21:48:40 +02:00
NielsRogge
7e7f743481
Add SegFormer ONNX support (#18006)
* Add ONNX support

* Make height and width dynamic axes

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-31 20:58:44 +02:00
Ankur Goyal
5c4c869014
Add LayoutLMForQuestionAnswering model (#18407)
* Add LayoutLMForQuestionAnswering model

* Fix output

* Remove TF TODOs

* Add test cases

* Add docs

* TF implementation

* Fix PT/TF equivalence

* Fix loss

* make fixup

* Fix up documentation code examples

* Fix up documentation examples + test them

* Remove LayoutLMForQuestionAnswering from the auto mapping

* Docstrings

* Add better docstrings

* Undo whitespace changes

* Update tokenizers in comments

* Fixup code and remove `from_pt=True`

* Fix tests

* Revert some unexpected docstring changes

* Fix tests by overriding _prepare_for_class

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-08-31 10:05:33 +02:00
Dhruv Karan
220da3b8a1
Adds GroupViT to models exportable with ONNX (#18628)
* groupvit to onnx

* dynamic shape for pixel values dim
2022-08-30 14:31:35 +02:00
Dhruv Karan
46d0e26a27
Adds OWLViT to models exportable with ONNX (#18588)
* onnx conversion for owlvit

* .T to .t()

* dynamic shapes for pixel values
2022-08-30 14:30:59 +02:00
Christoffer Koo Øhrstrøm
de8548ebf3
[LayoutLMv3] Add TensorFlow implementation (#18678)
Co-authored-by: Esben Toke Christensen <esben.christensen@visma.com>
Co-authored-by: Lasse Reedtz <lasse.reedtz@visma.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-08-30 11:48:11 +01:00
Philipp Schmid
f2fbe44753
Fix broken link DeepSpeed documentation link (#18783)
* Fix broken link

* Trigger CI

Co-authored-by: Stas Bekman <stas@stason.org>
2022-08-28 19:32:19 -07:00
Patrick Deutschmann
3223d49354
Add ONNX support for Longformer (#17176)
* Implement ONNX support for Longformer

Fix repo consistency check complaints

Fix value mismatches

Add pooler output for default model

Increase validation atol to accommodate multiple-choice error

Fix copies

Fix chunking for longer sequence lengths

Add future comment

* Fix issue in mask_invalid_locations

* Remove torch imports in configuration_longformer

* Change config access to fix LED

* Push opset version to support tril

* Work in review comments (mostly style)

* Add Longformer to ONNX tests
2022-08-25 08:34:42 +02:00
Daniel Stancl
c72d7d91bf
Add TF implementation of XGLMModel (#16543)
* Add TFXGLM models 

* Add todo: self.supports_xla_generation = False

Co-authored-by: Daniel Stancl <stancld@Daniels-MacBook-Pro.local>
Co-authored-by: Daniel Stancl <stancld@daniels-mbp.home>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Daniel <daniel.stancl@rossum.ai>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-08-24 10:51:05 +01:00
Yih-Dar
cecf9f9b27
fix pipeline_tutorial.mdx doctest (#18717)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-24 05:38:03 -04:00
Mishig Davaadorj
c12dbdc246
Update perf_infer_gpu_many.mdx (#18744) 2022-08-24 10:37:52 +02:00
Younes Belkada
a123eee9df
[bnb] Move documentation (#18671)
* fix bnb documentation

- move bnb documentation to `infer_gpu_many`

* small refactoring

- added text on infer_gpu_one
- added a small note on infer_gpu_many
- added customized multi gpu example on infer_gpu_many

* Update docs/source/en/perf_infer_gpu_many.mdx

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply suggestions

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-08-18 17:34:48 +02:00
Younes Belkada
6d175c1129
[bnb] Minor modifications (#18631)
* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPi library on DockerFile

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor a bit hardware requirements

* change text a bit

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions Turing & Amperer

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-08-17 00:48:10 +02:00
flozi00
a27195b1de
Update longt5.mdx (#18634) 2022-08-16 10:20:46 -05:00
Sourab Mangrulkar
9cf274685a
mac m1 mps integration (#18598)
* mac m1 `mps` integration

* Update docs/source/en/main_classes/trainer.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* Apply suggestions from code review

Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
2022-08-16 16:34:51 +05:30
Stas Bekman
37c5991843
[doc] fix anchors (#18591)
the manual anchors end up being duplicated with automatically added anchors and no longer work.
2022-08-12 10:49:59 -07:00
Niklas Muennighoff
56ef0ba447
Update BLOOM parameter counts (#18531)
* Update BLOOM parameter counts

* Update BLOOM parameter counts
2022-08-12 19:36:18 +02:00
NielsRogge
153d1361c7
Fix URLs (#18604)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-12 18:52:49 +02:00
NielsRogge
2ab790e82d
Add Donut (#18488)
* First draft

* Improve script

* Update script

* Make conversion work

* Add final_layer_norm attribute to Swin's config

* Add DonutProcessor

* Convert more models

* Improve feature extractor and convert base models

* Fix bug

* Improve integration tests

* Improve integration tests and add model to README

* Add doc test

* Add feature extractor to docs

* Fix integration tests

* Remove register_buffer

* Fix toctree and add missing attribute

* Add DonutSwin

* Make conversion script work

* Improve conversion script

* Address comment

* Fix bug

* Fix another bug

* Remove deprecated method from docs

* Make Swin and Swinv2 untouched

* Fix code examples

* Fix processor

* Update model_type to donut-swin

* Add feature extractor tests, add token2json method, improve feature extractor

* Fix failing tests, remove integration test

* Add do_thumbnail for consistency

* Improve code examples

* Add code example for document parsing

* Add DonutSwin to MODEL_NAMES_MAPPING

* Add model to appropriate place in toctree

* Update namespace to appropriate organization

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-12 16:40:58 +02:00
Yih-Dar
2156619f10
Add TFAutoModelForSemanticSegmentation to the main __init__.py (#18600)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-12 15:10:00 +02:00
Wang, Yi
3cdaea47ec
update doc for perf_train_cpu_many, add intel mpi introduction (#18576)
* update doc for perf_train_cpu_many, add mpi introduction

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-08-12 08:36:27 -04:00
Alara Dirik
f28f240828
fix owlvit tests, update docstring examples (#18586) 2022-08-11 19:10:25 +03:00
Dhruv Karan
f62cb8313c
Adds CLIP to models exportable with ONNX (#18515)
* onnx config for clip

* default opset as 14

* changes from the original repo

* input values order fix

* outputs fix

* remove unused import

* ran make fix-copies

* black format

* review comments: forward ref, import fix, model change revert, .to cleanup

* make style

* formatting fixes

* revert groupvit

* comment for cast to int32

* comment fix

* make .T as .t() for onnx conversion

* ran make fix-copies

* remove unneeded comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* remove comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-08-10 15:47:31 -04:00
Steven Liu
6936e7c487
Update philosophy to include other preprocessing classes (#18550)
* 📝 update philosophy to include other preprocessing classes

* 🖍 apply feedbacks
2022-08-10 13:20:39 -05:00
Younes Belkada
4a51075a96
bitsandbytes - Linear8bitLt integration into transformers models (#17901)
* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* revmoe the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

 - add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* male style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more doctring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* odify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n`by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2022-08-10 09:13:36 +02:00
Steven Liu
8cf4a6f0a6
📝 update documentation build section (#18548) 2022-08-09 18:22:55 -05:00
Steven Liu
0c183cc2f4
📝 update metric with evaluate (#18535) 2022-08-09 11:58:11 -05:00
Thomas Chaigneau
8cb5ecd912
Add mt5 onnx config (#18394)
* update features

* MT5OnnxConfig added with updated with tests and docs

* fix imports

* fix onnc_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
2022-08-09 03:46:53 -04:00
Mishig Davaadorj
f1f5de31ed
Update perf_train_gpu_one.mdx (#18532) 2022-08-08 20:33:34 +02:00
Steven Liu
3632531ec6
Add example of multimodal usage to pipeline tutorial (#18498)
* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedbacks

* 🖍 apply niels feedback
2022-08-08 11:31:31 -05:00
Steven Liu
36b37990af
update to use interlibrary links instead of Markdown (#18500) 2022-08-08 10:53:52 -05:00
Sourab Mangrulkar
2fecde742d
update fsdp docs (#18521)
* updating fsdp documentation

* typo fix
2022-08-08 18:56:51 +05:30
Julien Chaumond
8d1f9039d0
Just re-reading the whole doc every couple of months 😬 (#18489)
* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task
2022-08-06 09:38:55 +02:00
Yih-Dar
9d64f7f00c
Update some expected values in quicktour.mdx for resampy 0.3.0 (#18484)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 19:17:51 +02:00
Sylvain Gugger
faacdf007b
Move cache folder to huggingface/hub for consistency with hf_hub (#18492)
* Move cache folder to just huggingface

* Thank you VsCode for this needless import

* Move to hub

* Forgot one
2022-08-05 13:14:00 -04:00
NielsRogge
f9a0008d2d
Add VideoMAE (#17821)
* First draft

* Add VideoMAEForVideoClassification

* Improve conversion script

* Add VideoMAEForPreTraining

* Add VideoMAEFeatureExtractor

* Improve VideoMAEFeatureExtractor

* Improve docs

* Add first draft of model tests

* Improve VideoMAEForPreTraining

* Fix base_model_prefix

* Make model take pixel_values of shape (B, T, C, H, W)

* Add loss computation of VideoMAEForPreTraining

* Improve tests

* Improve model testsé

* Make all tests pass

* Add VideoMAE to main README

* Add tests for VideoMAEFeatureExtractor

* Add integration test

* Improve conversion script

* Rename patch embedding class

* Remove VideoMAELayer from init

* Update design of patch embeddings

* Improve comments

* Improve conversion script

* Improve conversion script

* Add conversion of pretrained model

* Add loss verification of pretrained model

* Add loss verification of unnormalized targets

* Add integration test for pretraining model

* Apply suggestions from code review

* Fix bug to make feature extractor resize only shorter edge

* Address more comments

* Improve normalization of videos

* Add doc examples

* Move constants to dedicated script

* Remove scripts

* Transfer checkpoints, fix docs

* Update script

* Update image mean and std

* Fix doc tests

* Set return_tensors to NumPy by default

* Revert the previous change

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-04 18:02:55 +02:00
Steven Liu
92915ebec2
Update _toctree.yml (#18440)
This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.
2022-08-03 12:26:01 +02:00
Christopher Akiki
5096a654b7
Add programming languages (#18434)
The current wording makes it sound as if the programming languages are part of the 46 natural languages.
2022-08-02 16:02:25 -04:00
Alara Dirik
8ae7784256
update maskformer docs (#18423)
* update maskformer docs

* fix typo
2022-08-02 18:43:58 +03:00
Steven Liu
151a2aaa4e
Split model list on modality (#18328)
* 📝 split up model list

* Adapt script to reorg

* apply niels feedback

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-08-01 11:10:20 -05:00
Sylvain Gugger
01db72abd4
Rewrite push_to_hub to use upload_files (#18366)
* Rewrite push_to_hub to use upload_files

* Adapt the doc a bit

* Address review comments and clean doc
2022-08-01 12:07:30 -04:00
Ikuya Yamada
62098b9348
Adding fine-tuning models to LUKE (#18353)
* add LUKE models for downstream tasks

* add new LUKE models to docs

* fix typos

* remove commented lines

* exclude None items from tuple return values
2022-08-01 11:09:47 -04:00
NielsRogge
7b9e995b70
Fix docs (#18399)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-01 17:02:51 +02:00
Sylvain Gugger
986526a0e4
Replace as_target context managers by direct calls (#18325)
* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py

Co-authored-by: amyeroberts <amy@huggingface.co>

* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
2022-07-29 08:09:09 -04:00
Sanchit Gandhi
a4ee463d95
[Docs] Fix Speech Encoder Decoder doc sample (#18346)
* [Docs] Fix Speech Encoder Decoder doc sample

* improve pre-processing comment

* make style
2022-07-29 09:11:28 +01:00
Steven Liu
96be1b7f49
Update feature extractor docs (#18324)
As pointed out by @NielsRogge, a feature extractor is used to prepare inputs for a model with a single modality rather than multimodal models.
2022-07-27 15:32:57 -05:00
Wang, Yi
2b81f72be9
start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch … (#18229)
* start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch and should import it before use

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add doc for perf_train_cpu_many

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-07-27 11:15:41 -04:00
Ritik Nandwal
e87ac9d18b
Add swin transformer v2 (#17469)
* Add files generated using transformer-cli add-new-model-like command

* Add changes for swinv2 attention and forward method

* Add fixes

* Add modifications for weight conversion and remaining args in swin model

* Add changes for patchmerging

* Add changes for SwinV2selfattention

* Update conversion script

* Add final fixes for the swin_v2 model

* Add changes for conversion script for pretrained window size case

* Add pretrained window size value from config in SwinV2Encoder class

* Make fixup

* Add swinv2 to models_not_in_readme to utils/check_copies.py

* Modify Swinv2v2 to Swin Transformer V2

* Remove copied from, to run make fixup command

* Add updates to swinv2tf from main branch

* Add pretrained_window_size to config, to make tests pass

* Add modified weights from nandwalritik profile for swinv2

* Update model weights from swinv2 from nandwalritik profile

* Add fix for build_pr_documentation CI fix

* Add fixes for weight conversion

* Add change to make input with padding work

* Add fixes for test cases

* Add few changes from swin to swinv2 to pass test cases

* Remove tests for tensorflow as swinv2 for TF is not added yet

* Overide test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet

* Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.

* Update docs url for swinv2 in README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Undo changes for check_repo

* Update url in readme.md

* Remove overrided function to test pt_tf_model_equivalence

* Remove TF model imports for Swinv2 as its not implemented in this PR

* Add changes for index.mdx

* Add swinv2 papers link,abstract and contributors details

* Rename cpb_mlp to continous_position_bias_mlp

* Add tips for swinv2 model

* Update src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update import order in src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add copyright statements in weights conversion script.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove Swinv2 from models_not_in_readme

* Reformat code

* Remove TF implementation file for swinv2

* Update start docstring.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add changes for docstring

* Update orgname for weights to microsoft

* Remove to_2tuple function

* Add copied from statements wherever applicable

* Add copied from to Swinv2ForMaskedImageModelling class

* Reformat code.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add unittest.skip(with reason.) for test_inputs_embeds test case.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add updates for test_modeling_swinv2.py

* Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function

* Add continuous_position_bias_mlp parameter to conversion script

* Add test for testing masked_image_modelling for swinv2

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add suggested changes

* Add copied from to forward methods of Swinv2Stage and Swinv2Encoder

* Add push_to_hub flag to weight conversion script

* Change order or Swinv2DropPath class

* Add id2label mapping for imagenet 21k

* Add updated url for SwinV2 functions and classes used in implementation

* Update input_feature dimensions format, mentioned in comments.

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Add suggested changes for modeling_swin2.py

* Update docs

* Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.

* Fix indentation.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes for making Nit objects in code style

* Add suggested changes

* Add suggested changes for test_modelling_swinv2

* make fix-copies

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-27 11:14:47 -04:00
NielsRogge
ccd4180f8a
[EncoderDecoder] Improve docs (#18271)
* Improve docs

* Improve docs of speech one as well

* Apply suggestions from code review

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-07-27 10:08:59 +02:00
Gorkem Ozkaya
f58b9c0522
Update translation.mdx (#18169)
* Update translation.mdx

* update translation.mdx by running make style
2022-07-26 07:56:40 -04:00
gilad19
2b09650885
Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924)
* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* provide classifier only text hidden states

* add test_for_token_classification

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add test_for_token_classification

Co-authored-by: gfuchs <gfuchs@ebay.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-07-26 10:11:32 +02:00
Alara Dirik
002915aa2a
Owlvit docs test (#18257)
* fix docs and add owlvit docs test

* fix minor bug in post_process, add to processor

* improve owlvit code examples

* fix hardcoded image size
2022-07-26 10:55:14 +03:00
Muhammad Ahmed
7cb4da13fe
change bloom parameters to 176B (#18235) 2022-07-22 10:17:48 -04:00
Alara Dirik
12d66b4701
Add OWL-ViT model for zero-shot object detection (#17938)
* add owlvit model skeleton

* add class and box predictor heads

* convert modified flax clip to pytorch

* fix box and class predictors

* add OwlViTImageTextEmbedder

* convert class and box head checkpoints

* convert image text embedder checkpoints

* add object detection head

* fix bugs

* update conversion script

* update conversion script

* fix q,v,k,out weight conversion conversion

* add owlvit object detection output

* fix bug in image embedder

* fix bugs in text embedder

* fix positional embeddings

* fix bug in inference mode vision pooling

* update docs, init tokenizer and processor files

* support batch processing

* add OwlViTProcessor

* remove merge conflicts

* readd owlvit imports

* fix bug in OwlViTProcessor imports

* fix bugs in processor

* update docs

* fix bugs in processor

* update owlvit docs

* add OwlViTFeatureExtractor

* style changes, add postprocess method to feature extractor

* add feature extractor and processor tests

* add object detection tests

* update conversion script

* update config paths

* update config paths

* fix configuration paths and bugs

* fix bugs in OwlViT tests

* add import checks to processor

* fix docs and minor issues

* fix docs and minor issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* update docs and examples

* fix bugs and issues

* update conversion script, fix positional embeddings

* process 2D input ids, update tests

* fix style and quality issues

* update docs

* update docs and imports

* update OWL-ViT index.md

* fix bug in OwlViT feature ext tests

* fix code examples, return_dict by default

* return_dict by default

* minor fixes, add tests to processor

* small fixes

* add output_attentions arg to main model

* fix bugs

* remove output_hidden_states arg from main model

* update self.config variables

* add option to return last_hidden_states

* fix bug in config variables

* fix copied from statements

* fix small issues and bugs

* fix bugs

* fix bugs, support greyscale images

* run fixup

* update repo name

* merge OwlViTImageTextEmbedder with obj detection head

* fix merge conflict

* fix merge conflict

* make fixup

* fix bugs

* fix bugs

* add additional processor test
2022-07-22 13:35:32 +03:00
Sayak Paul
561b9a8c00
[SegFormer] TensorFlow port (#17910)
* add: segformer utils and img. classification.

* add: segmentation layer.

* feat: working implementation of segformer.

* chore: remove unused variable.

* add test, remaining modifications.

* remove: unnecessary files.

* add: rest of the files.

Co-authored-by: matt <rocketknight1@gmail.com>

* chore: remove ModuleList comment.

* chore: apply make style.

* chore: apply make fixup-copies.

* add  to check_repo.py

* add decode head to IGNORE_NON_TESTED

* chore: run make style.

* chore: PR comments.

* chore: minor changes to model doc.

* tests: reduction across samples.

* add a note on the space.

* sort importats.

* fix: reduction in loss computation.

* chore: align loss function with that of NER.

* chore: correct utils/documentation_tests.txt

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* chore: simplify the interpolation of logits in loss computation.

* chore: return transposed logits when return_dict=False.

* chore: add link to the tf fine-tuning repo.

* address pr comments.

* address niels's comments.

* remove from_pt=True since tf weights are in.

* remove comment from pt model.

* address niels's comments.

Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-07-21 18:22:37 +01:00
Zhi Zheng
dbfeffd7c9
Update add_new_pipeline.mdx (#18224)
fix typo
2022-07-21 07:55:30 +02:00
Steven Liu
ff56b8fbff
Add custom config to quicktour (#18115)
* 📝 first draft of new quicktour

* make style

* 🖍 edit and review

* 🖍 small fixes

* 🖍 only add custom config section

* 🖍 use autoclass instead
2022-07-20 12:23:03 -05:00
Raghavan
dcec4c4387
Adding OPTForSeqClassification class (#18123)
* Adding OPTForSeqClassification class

* Fix import issues

* Add documentation for optforseqclassification

* Remove checkout

* fix failing tests

* fix typo

* Fix code formatting

* Incorporating the PR feedbacks

* Incorporate PR Feedbacks

* Fix failing test and add new test for multi label setup

* Fix formatting issue

* Fix failing tests

* Fix formatting issues

* Fix failing tests

* Fix failing tests

* Fix failing tests

* Fix failing tests

* PR feedback
2022-07-20 10:14:21 +02:00
Sylvain Gugger
dc9147ff36
Custom pipeline (#18079)
* Initial work

* More work

* Add tests for custom pipelines on the Hub

* Protect import

* Make the test work for TF as well

* Last PyTorch specific bit

* Add documentation

* Style

* Title in toc

* Bad names!

* Update docs/source/en/add_new_pipeline.mdx

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Auto stash before merge of "custom_pipeline" and "origin/custom_pipeline"

* Address review comments

* Address more review comments

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-07-19 12:02:35 +02:00
gcheron
8c14b342aa
add ONNX support for LeVit (#18154)
Co-authored-by: Guilhem Chéron <guilhemc@authentifier.com>
2022-07-18 15:17:07 +02:00
Lysandre Debut
c1c79b0655
NLLB tokenizer (#18126)
* NLLB tokenizer

* Apply suggestions from code review - Thanks Stefan!

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* Final touches

* Style :)

* Update docs/source/en/model_doc/nllb.mdx

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR reviews

* Auto models

Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-18 08:12:34 -04:00
amyeroberts
8581a798c0
Add TF DeiT implementation (#17806)
* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Fixup

* Fix import

* Fix import

* Fix import

* Fix weight loading for tests whilst not on hub

* Add doc tests and remove to_2tuple

* Add back to_2tuple
Removing to_2tuple results in many downstream changes needed because of the copies checks

* Incorporate updates in Improve vision models #17731 PR

* Don't hard code num_channels

* Copy PyTorch DeiT embeddings and remove pytorch operations with mask

* Fix patch embeddings & tidy up

* Update PixelShuffle to move logic into class layer

* Update doc strings - remove PT references

* Use NHWC format in internal layers

* Fix up

* Use linear activation layer

* Remove unused import

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Move dataclass to top of file

* Remove from_pt now weights on hub

* Fixup

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-13 18:04:08 +01:00
Wei
7ea6ccc2b3
Enable torchdynamo with torch_tensorrt(fx path) (#17765)
* enable fx2trt

* Update perf_train_gpu_one.mdx

* Update perf_train_gpu_one.mdx

* add lib check

* update

* format

* update

* fix import check

* fix isort

* improve doc

* refactor ctx manager

* fix isort

* black format

* isort fix

* fix format

* update args

* update black

* cleanups

* Update perf_train_gpu_one.mdx

* code refactor

* code refactor to init

* remove redundancy

* isort

* replace self.args with args

Co-authored-by: Stas Bekman <stas@stason.org>
2022-07-13 12:43:28 -04:00
Yulv-git
95113d1365
Fix some typos. (#17560)
* Fix some typos.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* Fix typo.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* make fixup.
2022-07-11 05:00:13 -04:00
varshith
91c4a3ab1a
Added Command for windows VENV activation in installation docs (#18008)
* Added command for windows VENV activation

* changed linux and macos  specification
2022-07-07 08:18:44 -04:00
Sylvain Gugger
1b749a7f8d
Sort doc toc (#18034)
* Add script to sort doc ToC

* Style and fixes

* Add check to quality job
2022-07-07 08:17:58 -04:00
Sylvain Gugger
2e90c3df8f
Doc to dataset (#18037)
* Link to the Datasets doc

* Remove unwanted file
2022-07-06 12:10:06 -04:00
NielsRogge
22edb68d49
Squash commits (#17981)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-07-06 08:11:48 -04:00
Matthijs Hollemans
6cb19540c9
sort list of models (#18011) 2022-07-04 09:20:55 -04:00
amyeroberts
77ea5130a1
Add TF ResNet model (#17427)
* Rought TF conversion outline

* Tidy up

* Fix padding differences between layers

* Add back embedder - whoops

* Match test file to main

* Match upstream test file

* Correctly pass and assign image_size parameter

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Add in MainLayer

* Correctly name layer

* Tidy up AdaptivePooler

* Small tidy-up

More accurate type hints and remove whitespaces

* Change AdaptiveAvgPool

Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools if the output shape does not evenly divide by input shape c.f. 9e26607e22 (r900109509)

Co-authored-by: From: matt <rocketknight1@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Use updated AdaptiveAvgPool

Co-authored-by: matt <rocketknight1@gmail.com>

* Make AdaptiveAvgPool compatible with CPU

* Remove image_size from configuration

* Fixup

* Tensorflow -> TensorFlow

* Fix pt references in tests

* Apply suggestions from code review - grammar and wording

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add TFResNet to doc tests

* PR comments - GlobalAveragePooling and clearer comments

* Remove unused import

* Add in keepdims argument

* Add num_channels check

* grammar fix: by -> of

Co-authored-by: matt <rocketknight1@gmail.com>

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove transposes - keep NHWC throughout forward pass

* Fixup look sharp

* Add missing layer names

* Final tidy up - remove from_pt now weights on hub

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-07-04 10:59:15 +01:00
Lysandre Debut
7b18702ca7
Add link to existing documentation (#17931) 2022-07-04 04:13:05 -04:00
Nouamane Tazi
b68d408f1b
add ONNX support for BLOOM (#17961)
* add onnx support for BLOOM

* use TYPE_CHECKING for type annotations

* fix past_shape for bloom (different from gpt2)

* use logical_or instead of `+` for onnx support

* bigger `atol_for_validation` for larger bloom models

* copied -> taken because it's no longer an exact copy

* remove "copied from" comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-01 10:44:42 -04:00
Billy Cao
cb42502410
Fix typo in perf_train_gpu_one.mdx (#17983) 2022-07-01 09:19:13 -04:00
Aaron Pham
49cd736a28
feat: add pipeline registry abstraction (#17905)
* feat: add pipeline registry abstraction

- added `PipelineRegistry` abstraction
- updates `add_new_pipeline.mdx` (english docs) to reflect the api addition
- migrate `check_task` and `get_supported_tasks` from
  transformers/pipelines/__init__.py to
  transformers/pipelines/base.py#PipelineRegistry.{check_task,get_supported_tasks}

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix: update with upstream/main

chore: Apply suggestions from sgugger's code review

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* chore: PR updates

- revert src/transformers/dependency_versions_table.py from upstream/main
- updates pipeline registry to use global variables

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* tests: add tests for pipeline registry

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* tests: add test for output warning.

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: fmt and cleanup unused imports

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix: change imports to top of the file and address comments

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-30 12:11:08 -04:00
regisss
9cb7cef285
Add ONNX support for LayoutLMv3 (#17953)
* Add ONNX support for LayoutLMv3

* Update docstrings

* Update empty description in docstring

* Fix imports and type hints
2022-06-30 12:09:52 -04:00
Crystina
692e61e91a
Flax t5 Encoder (#17784)
* first draft adding Flax-t5-encoder and Flax-mt5-encoder

* imports

* after make fixup

* flax t5 encoder test

* black on test

* make fix-copies

* clean

* all_model_classes -> tuple

* clean test

* is_encoder_decoder=False in t5-enc tester

* remove file docstring before FlaxT5Encoder

* black

* isort

* commit suggestions on src/transformers/models/t5/modeling_flax_t5.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* commit suggestions on src/transformers/models/t5/modeling_flax_t5.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* remove _get_encoder_module

* self.decoder_seq_length -> self.encoder_seq_length as t5-enc does not have decoder

* bugfix - self.module_class is class itself, not instance;

* docs for mt5 and t5

* call -> __call__ in t5 doc

* FlaxMT5EncoderModel to TYPE_HINT

* run doc-builder to allow change the files

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-06-30 00:49:02 +02:00
Matthijs Hollemans
fbc7598bab
add MobileViT model (#17354)
* add MobileViT

* fixup

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* remove empty line

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* use clearer variable names

* rename to MobileViTTransformerLayer

* no longer inherit from nn.Sequential

* fixup

* fixup

* not sure why this got added twice

* rename organization for checkpoints

* fix it up

* Update src/transformers/models/mobilevit/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/mobilevit/test_modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* code style improvements

* fixup

* Update docs/source/en/model_doc/mobilevit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* download labels from hub

* rename layers

* rename more layers

* don't compute loss in separate function

* remove some nn.Sequential

* replace nn.Sequential with new MobileViTTransformer class

* replace nn.Sequential with MobileViTMobileNetLayer

* fix pruning since model structure changed

* fixup

* fix doc comment

* remove custom resize from feature extractor

* fix ONNX import

* add to doc tests

* use center_crop from image_utils

* move RGB->BGR flipping into image_utils

* fix broken tests

* wrong type hint

* small tweaks

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-29 16:07:51 -04:00
StevenTang1998
3cff4cc587
Add MVP model (#17787)
* Add MVP model

* Update README

* Remove useless module

* Update docs

* Fix bugs in tokenizer

* Remove useless test

* Remove useless module

* Update vocab

* Remove specifying

* Remove specifying

* Add #Copied ... statement

* Update paper link

* Remove useless TFMvp

* Add #Copied ... statement

* Fix style in test mvp model

* Fix some typos

* Fix properties of unset special tokens in non verbose mode

* Update paper link

* Update MVP doc

* Update MVP doc

* Fix README

* Fix typos in docs

* Update docs
2022-06-29 09:30:55 -04:00
Aritra Roy Gosthipaty
a7eba83161
TF implementation of RegNets (#17554)
* chore: initial commit

Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets.

* chore: porting the rest of the modules to tensorflow

did not change the documentation yet, yet to try the playground on the model

* Fix initilizations (#1)

* fix: code structure in few cases.

* fix: code structure to align tf models.

* fix: layer naming, bn layer still remains.

* chore: change default epsilon and momentum in bn.

* chore: styling nits.

* fix: cross-loading bn params.

* fix: regnet tf model, integration passing.

* add: tests for TF regnet.

* fix: code quality related issues.

* chore: added rest of the files.

* minor additions..

* fix: repo consistency.

* fix: regnet tf tests.

* chore: reorganize dummy_tf_objects for regnet.

* chore: remove checkpoint var.

* chore: remov unnecessary files.

* chore: run make style.

* Update docs/source/en/model_doc/regnet.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* chore: PR feedback I.

* fix: pt test. thanks to @ydshieh.

* New adaptive pooler (#3)

* feat: new adaptive pooler

Co-authored-by: @Rocketknight1

* chore: remove image_size argument.

Co-authored-by: matt <rocketknight1@gmail.com>

Co-authored-by: matt <rocketknight1@gmail.com>

* Empty-Commit

* chore: remove image_size comment.

* chore: remove playground_tf.py

* chore: minor changes related to spacing.

* chore: make style.

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: amyeroberts <aeroberts4444@gmail.com>

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: amyeroberts <aeroberts4444@gmail.com>

* chore: refactored __init__.

* chore: copied from -> taken from./g

* adaptive pool -> global avg pool, channel check.

* chore: move channel check to stem.

* pr comments - minor refactor and add regnets to doc tests.

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* minor fix in the xlayer.

* Empty-Commit

* chore: removed from_pt=True.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-06-29 13:45:14 +01:00
Jerry Jiarui XU
6c8f4c9a93
Adding GroupViT Models (#17313)
* add group vit and fixed test (except slow)

* passing slow test

* addressed some comments

* fixed test

* fixed style

* fixed copy

* fixed segmentation output

* fixed test

* fixed relative path

* fixed copy

* add ignore non auto configured

* fixed docstring, add doc

* fixed copies

* Apply suggestions from code review

merge suggestions

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolve comment, renaming model

* delete unused attr

* use fix copies

* resolve comments

* fixed attn

* remove unused vars

* refactor tests

* resolve final comments

* add demo notebook

* fixed inconsitent default

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* rename stage->stages

* Create single GroupViTEncoderLayer class

* Update conversion script

* Simplify conversion script

* Remove cross-attention class in favor of GroupViTAttention

* Convert other model as well, add processor to conversion script

* addressing final comment

* fixed args

* Update src/transformers/models/groupvit/modeling_groupvit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-28 20:51:47 +02:00
regisss
76d13de5ae
Add ONNX support for DETR (#17904) 2022-06-28 14:48:43 +02:00
Bill Ray
bfcd5743ee
In group_texts function, drop last block if smaller than block_size (#17908) 2022-06-28 08:34:55 -04:00
Matt
ee0d001de7
Add a TF in-graph tokenizer for BERT (#17701)
* Add a TF in-graph tokenizer for BERT

* Add from_pretrained

* Add proper truncation, option handling to match other tokenizers

* Add proper imports and guards

* Add test, fix all the bugs exposed by said test

* Fix truncation of paired texts in graph mode, more test updates

* Small fixes, add a (very careful) test for savedmodel

* Add tensorflow-text dependency, make fixup

* Update documentation

* Update documentation

* make fixup

* Slight changes to tests

* Add some docstring examples

* Update tests

* Update tests and add proper lowercasing/normalization

* make fixup

* Add docstring for padding!

* Mark slow tests

* make fixup

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* make fixup

* Properly handle tensorflow-text dummies
2022-06-27 12:06:21 +01:00
rooa
d6b6fb9963
Add CodeGen model (#17443)
* Add CodeGen model

* Add missing key and switch order of super()

* Fix torch.ones init with uint8 instead of bool

* Address comments: copy statements and doc

* update tests

* remove old model parallel

* fix batch gen tests

* fix batch gen test

* update test_gpt2_sample_max_time

* fix codgen test and revert gpt2 test change

* Fix incorrect tie_word_embedding value, typo, URL

* Fix model order in README and styling

* Reorder model list alphabetically

* Set tie_word_embedding to False by default

* Apply suggestions from code review

* Better attn mask name & remove attn masked_bias

* add tokenizer for codegen

* quality

* doc tokenizer

* fix-copies

* add CodeGenTokenizer in converter

* make truncation optional

* add test for truncation

* add copyright

* fix-copies

* fix fast tokenizer decode

* Update src/transformers/models/codegen/tokenization_codegen.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* increase vocab_size in tests

Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-24 17:10:38 +02:00
Vishwas
c2c0d9db5f
Improve encoder decoder model docs (#17815)
* Copied all the changes from the last PR

* added in documentation_tests.txt

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: vishwaspai <vishwas.pai@emplay.net>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-06-24 14:48:19 +02:00
Sijun He
7cf52a49de
Nezha Pytorch implementation (#17776)
* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* merge

* wip

* wip

* wip

* most basic tests passes

* all tests pass now

* relative embedding

* wip

* running make fixup

* remove bert changes

* fix doc

* fix doc

* fix issues

* fix doc

* address comments

* fix CI

* remove redundant copied from

* address comments

* fix broken test

Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-06-23 12:36:22 -04:00
Leandro von Werra
6f29029b05
Improve performance docs (#17750)
* add skeleton files

* fix cpu inference link

* add hint to make clear that single gpu section contains general info

* add new files to ToC

* update toctree to have subsection for performance

* add "coming soon" to the still empty sections

* fix missing title

* fix typo

* add reference to empty documents

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-06-23 14:51:54 +02:00
Anugunj Naman
27e907386a
Fix Automatic Download of Pretrained Weights in DETR (#17712)
* added use_backbone_pretrained

* style fixes

* update

* Update detr.mdx

* Update detr.mdx

* Update detr.mdx

* update using doc py

* Update detr.mdx

* Update src/transformers/models/detr/configuration_detr.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-21 16:45:35 +02:00
mrbean
eb16be415a
add onnx support for deberta and debertav2 (#17617)
* add onnx support for debertav2

* debertav2 -> deberta-v2 in onnx features file

* remove causal lm

* add deberta-v2-xlarge to onnx tests

* use self.type().dtype() in xsoftmax

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* remove hack for deberta

* remove unused imports

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* use generate dummy inputs

* linter

* add imports

* add support for deberta v1 as well

* deberta does not support multiple choice

* Update src/transformers/models/deberta/configuration_deberta.py

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* one line ordered dict

* fire build

Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>
2022-06-21 11:04:15 +02:00
Patrick von Platen
8fcbe275c3
Add UL2 (just docs) (#17740)
* Add UL2
Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com>

* Correct naming

* sort better

* up

* apply sylvains suggestion
2022-06-21 10:24:50 +02:00
Sylvain Gugger
3981ee8650
Sort the model doc Toc Alphabetically (#17723) 2022-06-15 16:11:56 -04:00
Patrick von Platen
7f14839f55
[Wav2Vec2Conformer] Official release (#17709)
* [Wav2Vec2Conformer] Official release

* remove from not-in-readme
2022-06-15 18:34:15 +02:00
Hailey Schoelkopf
edb672ac5e
Add BloomForSequenceClassification and BloomForTokenClassification classes (#17639)
* add new bloom classes

* (feat) add bloom classification tests; make style

* style: change import in test

* add some typehints to bloom classes

* merge main into branch

* fix: input checking in bloom seq classification

* fix tests

* change model class tests

* fix few tests

- more tests should pass
- one test left

* make token classifier return hidden states

* style: make BLOOM typehints consistent

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2022-06-14 17:10:12 +02:00
jianan-gu
3b29c9fdb7
Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153)
* add jit mode option and model wrap

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refine code

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add ut and refine code

* code refine

* refine code

* add inference doc

* Update src/transformers/trainer.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add cpu inference performance doc

* Update perf_infer_cpu.mdx

* Update perf_infer_cpu.mdx

* Update performance.mdx

* Update _toctree.yml

* refine jit func naming

* Update _toctree.yml

* Delete perf_infer_gpu_one.mdx

* Update perf_infer_cpu.mdx

* Update docs/source/en/perf_infer_cpu.mdx

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add none check before jit

* Update docs/source/en/perf_infer_cpu.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_infer_cpu.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-06-14 07:56:47 -04:00
Daniel Stancl
a72f1c9f5b
Add LongT5 model (#16792)
* Initial commit

* Make some fixes

* Make PT model full forward pass

* Drop TF & Flax implementation, fix copies etc

* Add Flax model and update some corresponding stuff

* Drop some TF things

* Update config and flax local attn

* Add encoder_attention_type to config

* .

* Update docs

* Do some cleansing

* Fix some issues -> make style; add some docs

* Fix position_bias + mask addition + Update tests

* Fix repo consistency

* Fix model consistency by removing flax operation over attn_mask

* [WIP] Add PT TGlobal LongT5

* .

* [WIP] Add flax tglobal model

* [WIP] Update flax model to use the right attention type in the encoder

* Fix flax tglobal model forward pass

* Make the use of global_relative_attention_bias

* Add test suites for TGlobal model

* Fix minor bugs, clean code

* Fix pt-flax equivalence though not convinced with correctness

* Fix LocalAttn implementation to match the original impl. + update READMEs

* Few updates

* Update: [Flax] improve large model init and loading #16148

* Add ckpt conversion script accoring to #16853 + handle torch device placement

* Minor updates to conversion script.

* Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM

* gpu support + dtype fix

* Apply some suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* * Remove (de)parallelize stuff
* Edit shape comments
* Update README.md
* make fix-copies

* Remove caching logic for local & tglobal attention

* Apply another batch of suggestions from code review

* Add missing checkpoints
* Format converting scripts
* Drop (de)parallelize links from longT5 mdx

* Fix converting script + revert config file change

* Revert "Remove caching logic for local & tglobal attention"

This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.

* Stash caching logic in Flax model

* Make side relative bias used always

* Drop caching logic in PT model

* Return side bias as it was

* Drop all remaining model parallel logic

* Remove clamp statements

* Move test files to the proper place

* Update docs with new version of hf-doc-builder

* Fix test imports

* Make some minor improvements

* Add missing checkpoints to docs
* Make TGlobal model compatible with torch.onnx.export
* Replace some np.ndarray with jnp.ndarray

* Fix TGlobal for ONNX conversion + update docs

* fix _make_global_fixed_block_ids and masked neg  value

* update flax model

* style and quality

* fix imports

* remove load_tf_weights_in_longt5 from init and fix copies

* add slow test for TGlobal model

* typo fix

* Drop obsolete is_parallelizable and one warning

* Update __init__ files to fix repo-consistency

* fix pipeline test

* Fix some device placements

* [wip]: Update tests -- need to generate summaries to update expected_summary

* Fix quality

* Update LongT5 model card

* Update (slow) summarization tests

* make style

* rename checkpoitns

* finish

* fix flax tests

Co-authored-by: phungvanduy <pvduy23@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patil-suraj <surajp815@gmail.com>
2022-06-13 22:36:58 +02:00
Sijun He
66336dc183
Add Visual Question Answering (VQA) pipeline (#17286)
* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* merge

* Update src/transformers/models/auto/modeling_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* merge

Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-13 07:49:44 -04:00
Sylvain Gugger
29080643eb
Mention in the doc we drop support for fairscale (#17610) 2022-06-09 12:20:39 -04:00
regisss
e0be053e43
Add ONNX support for ConvNeXT (#17627) 2022-06-09 09:31:02 -04:00
regisss
5323094a22
Add ONNX support for ResNet (#17585)
* Add ONNX support for ResNet

* Add ONNX test

* make fix-copies
2022-06-09 08:44:27 -04:00
Younes Belkada
ca2a55e9df
BLOOM (#17474)
* adding template

* update model

* model update

* update conf for debug model

* update conversion

* update conversion script

* update conversion script

* fix missing keys check

* add tests to test the tokenizer in the local machine

* Change variable name

* add tests on xnli dataset

* add more description

* add descriptions + clearer code

* clearer code

* adding new tests + skipping few tests because of env problems

* change comment

* add dtype on the configuration

* add test embeddings

* add hardcoded test

* fix dtype issue

* adding torch.float16 to config

* adding more metrics (min, max, mean)

* add sum

* now the test passes with almost equal

* add files for conversion - test passes on cpu  gpu

* add final changes

* cleaning code

* add new args in the docstring

* fix one liner function

* remove macros

* remove forward attention

* clean up init funtion

* add comments on the issue

* rm scale mask softmax

* do make style

* fix dtype in init

* fixing for loop on att probs

* fix style with black

* fix style + doc error

* fix and debug CI errors (docs + style)

* some updates

- change new operations
- finally add scaled softmax
- added new args in the config

* make use cache working

* add changes

- save sharded models
- final changes on the modeling script

* add changes

- comment on alibi
- add TODO on seq length

* test commit

- added a text to test the commit

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* final changes

- attention mask change
- generation works on BS176b

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* changes - model + conversion

* move to correct dir

* put ,

* fex fixes

* fix tokenizer autodoc

* fix minor CI issues

* fix minor CI issues

* fix minor CI issues

* fix style issue

* fix minor import issues

* fix few issues

* remove def main on the test

* add require torch

* replace decorator with 'with'

* fix style

* change to bloom

* add quick fix tokenizer

* fix tokenizer file

* fix tokenizer

- merge tests
- small fixes

* fix import issue

* add bloom to readme

* fix consistency

* Update docs/source/en/model_doc/bloom.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

fix comment issues on file headers

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix doc issue

* small fix - modeling test

* some changes

- refactor some code
- taking into account reviews
- more tests should pass
- removed pruning tests

* remove useless division

* more tests should pass

* more tests should pass

* more tests should pass

* let's try this one

-add alibi offset
- remove all permutes to make the grad operations work
- finger crossed

* refactor

- refactor code
- style changes
- add new threshold for test

* major changes

- change BLOOM to Bloom
- add quick doc on bloom.mdx
- move embeddings test on modeling test

* modify readme

* small fixes

* small fix

- better threshold for a test

* remove old test file from fetcher

* fix small typo

* major change

- change BloomLMHead to BloomForCausalLM

* remove onnx config

* major changes

- refactor the code
- remove asserts
- change tol for test

* make style

* small change

* adding a slow test + commenting old ones for now

* make style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

* fix duplicates

* cleaning comments on config

* clean a bit conversion file

* refacor a bit modeling file

* refactor tokenizer file

* fix tokenization test issue

* fix tokenization issue #2

* fix tokenization issue second try

* fix test issue

* make style + add suggestions

* change test fetcher

* try this one

- slow tests should pass
- finger crossed

* possible final changes

* make style

* try fix padding side issue

* fix side

* fix padding issue

* fix ko-readme

* fix config auto

* cleaning modeling file

* keep bloom in caps in ko

* update config docs

* remove pretraining_pp

* remove model parallel

* update config

- add correct config files

* fix duplicates

* fix fetcher

* fix refactor issue

- remove divide function

* try to remove alibi

* small fixes

- fix alibi
- remove seq length
- refactor a bit the code

* put correct values

- fix bos and eos token ids

* fix attention mask loop

Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* small fixes:

- remove skip bias add

* small fixes

- fix typo in readme
- fix typos in config

* small changes

- remove a test
- add reconstruction test
- change config

* small changes

- change Scaled Softmax to BloomScaledSoftmax

* small fixes

- fix alibi dtype

* major changes

- removing explicit dtype when loading modules
- fixing test args (torch_dtype=auto)
- add dosctring

* fix readmes

* major changes

- now bloom supports alibi shifting
- refactor a bit the code
- better test tolerance now

* refactor a bit

* refactor a bit

* put correct name on test

* change docstring

* small changes

- fix docstring modeling
- fix test tolerance

* fix small nit

- take dtype from tensors in the conversion script

* minor fix

- fix mdx issue

* minor fix

- change config docstring

* forward contrib credits from PR14084

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply modifications

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* resolve softmax upcast

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

* final changes modeling

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Merge commit 'd156898f3b9b2c990e5963f5030a7143d57921a2'

* merge commit

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply suggestions

Apply suggestions from Stas comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix gradient checkpointing

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add slow but exact

* add accelerate compatibility

Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com>

* forward contrib credits

Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
Co-authored-by: sgugger <sgugger@users.noreply.github.com>
Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix torch device on tests

* make style

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix nits

Co-authored-by: patrickvonplaten<patrickvonplaten@users.noreply.github.com>

* remove final nits

* fix doc

- add more details on the doc
- add links to checkpoints

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

Co-authored-by: sgugger <sgugger@users.noreply.github.com>

* put test torchscript to false

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: justheuristic <justheuristic@gmail.com>

* fix alibi

- create alibi only once

* add small doc

* make quality

* replace torch.nn

* remove token type emb

* fix fused op + output bias

* add fused op

- now can control fused operation from config

* remove fused op

* make quality

* small changes

- remove unsed args on config
- removed bias gelu file
- make the model torchscriptable
- add torchscript slow tests

* Update src/transformers/models/bloom/modeling_bloom.py

* fix slow

* make style

* add accelerate support

* add bloom to deepspeed tests

* minor changes

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* minor change

* slow tests pass

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/bloom.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor changes:

- change docstring
- add link to paper

Co-authored-by: Thomwolf <thomwolf@gmail.com>
Co-authored-by: Thomas Wolf <thomas@huggingface.co>
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sIncerass <sheng.s@berkeley.edu>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com>
Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
Co-authored-by: sgugger <sgugger@users.noreply.github.com>
Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>
Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: justheuristic <justheuristic@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-06-09 12:00:40 +02:00
jianan-gu
34097b3304
Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138)
* init PR

* fix import ipex

* minor fix on bf16

* refine optimizer

* refine args notes

* refine code

* refine ipex optimize args

* refine half_precision_backend

* black format

* isort format

* isort format files

* flake8 format

* doc builder format

* refine codes

* remove jit and optim bits

* black preview format

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refine code

* refine notes

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* code refine

* add ipex ut

* add performance cpu doc

* link to the cpu doc from main perf doc

* install ipex into CI's docker

* Update perf_train_cpu.mdx

* Update docs/source/en/perf_train_cpu.mdx

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update perf_train_cpu.mdx

* Update perf_train_cpu.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-06-08 09:41:57 -04:00
Sayak Paul
9d99489f2f
Add TFData2VecVision for semantic segmentation (#17271)
* feat: initial implementation of data2vec segmentation model in TF.

* chore: minor corrections to make the segmenter work.

* chore: removed unncessary files.

* chore: add tests and other modifications.

* fix: loss computation for segmentation.

* chore: remove unused variable.

* chore: formatting.

* added a dummy adaptive pooling layer.

* removed unnecessary file.

* potentially add identifiers to layer names.

* fix: layer naming.

* chore: removed unnecessary print.

* Skipping unneeded test

* chore: add logging to debug tolerance.

* fix: segmentation tests for tfdata2vecvision

* chore: make style.

* fix: layer names, assertion to be resolved.

* Bumping test tolerance a bit

* chore: bump the tol in PT test.

Co-authored-by: matt <rocketknight1@gmail.com>
2022-06-08 14:03:18 +01:00
Chan Woo Kim
119e3c0fc8
M-CTC-T Model (#16402)
* added cbs to notebooks, made copy-paste error fix in generation_utils

* initial push for mctc model

* mctc feature extractor done

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* passing attention, now struggling to figure out how attention masks make sense here

* works when excluding attention masks. ask later how one would integrate attention maskshere

* bizarre configuration error (model prefix comes first in config dict json and messes up the order)

* all passing but bizzarre config dict ordering issue when to_dict

* passing all major tests

* feature extraction, processor, tokenizer added & tests passing

* style & consistency & other logistical fixes

* copy paste fix

* model after feature extraction working

* commiting final feature extraction results; need to fix normalization

* feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?

* delete print ; format code a bit

* fixing tests

* passing major tests

* fixing styles

* completed tokenization test with real example; not sure if these values are entirely correct.

* last test fixes from local

* reverting accidentally included custom setup configs

* remove load tf weights; fix config error

* testing couldnt import featureextractor

* fix docs

* fix docs

* resolving comments

* style fixes

* style fixes

* Update to MCTCConv1dSubSampler

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* relposemb fixes

* conv1d name issue; expecting config fail with paraentheses

* fix config issue

* fix config issue

* fix config issue

* change everything to MCTCT

* fixing naming change errors

* archive list

* copyrights and docs

* copyrights and docs

* copyrights and docs

* merge resolution

* move tests, fix to changed optionaldependency structure

* test directories changed

* fixing tests

* how to avoid tf tests?

* how to avoid tf tests?

* tests passing locally

* allow mctctprocessor imported any env

* allow mctctprocessor imported any env

* fixed second round of feedback, need to fix docs

* doc changes not being applied

* all fixed

* style fix

* feedback fixes

* fix copies and feature extraction style fix

* Update tests/models/visual_bert/test_modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* copy paste huggingface:main visual bert

* added eof newline to visual bert; all tests are passing otherwise

* fix slow tests by adding attention mask

* change model id to speechbrain

* make fix-copies

* fix readme unwanted deletes

* fixing readmes, make fix-copies

* consistent M-CTC-T naming

* Update src/transformers/models/mctct/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* all fixed but variable naming

* adjust double quotes

* fixed variable names

* copyright and mr quilter

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct slow tests

* make fix-copies

* Update src/transformers/models/mctct/configuration_mctct.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mctct/configuration_mctct.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* m-ctc-t not mctct

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-08 00:33:07 +02:00
Britney Muller
72f5b94984
Update index.mdx (#17547)
This PR updates our Expert Acceleration Program image with a new image featuring our experts.

This is similar to our Transformers/README.md image update that has proven to be successful.
2022-06-03 12:56:37 -05:00
Patrick Deutschmann
babeff5524
Add support for Perceiver ONNX export (#17213)
* Start adding perceiver support for ONNX

* Fix pad token bug for fast tokenizers

* Fix formatting

* Make get_preprocesor more opinionated (processor priority, otherwise tokenizer/feature extractor)

* Clean docs format

* Minor cleanup following @sgugger's comments

* Fix typo in docs

* Fix another docs typo

* Fix one more typo in docs

* Update src/transformers/onnx/utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/onnx/utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/onnx/utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-03 07:40:22 -04:00
Robert Dargavel Smith
5c17918fe4
Allow from transformers import TypicalLogitsWarper (#17477)
* Allow from transformers import TypicalLogitsWarper

* Added TypicalLogitsWarper

* Allow from transformers import TypicalLogitsWarper

* Allow from transformers import TypicalLogitsWarper

* Allow from transformers import TypicalLogitsWarper

* Allow from transformers import TypicalLogitsWarper

Added TypicalLogitsWarper

Allow from transformers import TypicalLogitsWarper

Allow from transformers import TypicalLogitsWarper

Allow from transformers import TypicalLogitsWarper
2022-06-03 11:08:35 +02:00
Sylvain Gugger
048dd73bba
Check list of models in the main README and sort it (#17517)
* Script for README

* Fix copies

* Complete error message
2022-06-02 08:10:08 -04:00
Anugunj Naman
84aaadd8c5
Adding LeViT Model by Facebook (#17466)
* levit files

* levit tests

* weights script

* weights script

* update

* style fixes

* few minor corrections

* Added teacher model

* edit docs

* fix-copies

* style fixes

* pr error resolved

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/configuration_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/configuration_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* suggested pr changes

* style fixes

* minor bug

* update

* minor doc edit

* style

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/levit/test_modeling_levit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* residual layer readable

* style

* Update docs/source/en/model_doc/levit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update tests/models/levit/test_feature_extraction_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* change checkpoints and style

* update

* minor changes

* Update src/transformers/models/levit/modeling_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-01 17:06:20 +02:00
Ruihua Fang
4f38808e9e
Add OnnxConfig for SqueezeBert iss17314 (#17315)
* add onnx config for SqueezeBert

* add test for onnx config for SqueezeBert

* add automatically updated doc for onnx config for SqueezeBert

* Update src/transformers/onnx/features.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update src/transformers/models/squeezebert/configuration_squeezebert.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-06-01 06:16:15 -04:00
Arthur
7822a9b7a7
Opt in flax and tf (#17388)
* initial commit

* add init file

* update globakl init

* update index and dummy objects

* style

* update modelling auto

* fix initi typo in src/transformers

* fix typo in modeling tf auto, opt was in wrong mapping name

* fixed a slow test : saved_model

* style

* fix positionnal embedding if no position id is provided

* update tf test

* update test flax requirements

* fixed serialization

* update

* update tf name to allow smooth convertion

* update flax tests

* style

* fix test typo

* fix tf typo test

* add xla for generate support in causal LM

* fixed bug

* cleaned tf tests

* style

* removed from PT for slow tests

* fix typp

* opt test as slow

* trying to fix GPT2 undefined

* correct documentation and add to test doc

* update tf doc

* fix doc

* fake commit

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* update test based on review

* merged main layer for functionning test

* fixup + quality

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update long comment

* make fix copies

Co-authored-by: Arthur <arthur@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-31 18:41:22 +02:00
Ritik Nandwal
5af38953bb
Added XLM onnx config (#17030)
* Add onnx configuration for xlm

* Add supported features for xlm

* Add xlm to models exportable with onnx

* Add xlm architecture to test file

* Modify docs

* Make code quality fixes
2022-05-31 09:26:06 -04:00
Leandro von Werra
740a1574f1
fix link in performance docs (#17419) 2022-05-25 20:54:43 +02:00
Jason Phang
71e602725b
[WIP] Adding GPT-NeoX-20B (#16659)
* initial

* first try

* working 20B

* 20B tokenizers

* Docs

* Import fixes for missing classes

* Update docs, fixup

* black formatting

* isort

* flake

* dummy objects

* documentation

* Documentation yml

* more docs

* tweaks for tests

* tokenization auto

* fix neox tests

* test

* test

* einsum

* address PR feedback

* Documentation

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_neox/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_neox/configuration_gpt_neox.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove undefined LaTeX syntax

* Update to full url to avoid confusion about if that's supposed to refer to the Hub

* fix auto

* move tests

* documentation fix

* more doc fixes

* test refactor

* fix import

* fix import

* fix import

* fix import

* fix import

* style fixes

* More modeling fixes

Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu>
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-24 09:31:10 -04:00
NielsRogge
31ee80d556
Add LayoutLMv3 (#17060)
* Make forward pass work

* More improvements

* Remove unused imports

* Remove timm dependency

* Improve loss calculation of token classifier

* Fix most tests

* Add docs

* Add model integration test

* Make all tests pass

* Add LayoutLMv3FeatureExtractor

* Improve integration test + make fixup

* Add example script

* Fix style

* Add LayoutLMv3Processor

* Fix style

* Add option to add visual labels

* Make more tokenizer tests pass

* Fix more tests

* Make more tests pass

* Fix bug and improve docs

* Fix import of processors

* Improve docstrings

* Fix toctree and improve docs

* Fix auto tokenizer

* Move tests to model folder

* Move tests to model folder

* change default behavior add_prefix_space

* add prefix space for fast

* add_prefix_spcae set to True for Fast

* no space before `unique_no_split` token

* add test to hightligh special treatment of added tokens

* fix `test_batch_encode_dynamic_overflowing` by building a long enough example

* fix `test_full_tokenizer` with add_prefix_token

* Fix tokenizer integration test

* Make the code more readable

* Add tests for LayoutLMv3Processor

* Fix style

* Add model to README and update init

* Apply suggestions from code review

* Replace asserts by value errors

* Add suggestion by @ducviet00

* Add model to doc tests

* Simplify script

* Improve README

* a step ahead to fix

* Update pair_input_test

* Make all tokenizer tests pass - phew

* Make style

* Add LayoutLMv3 to CI job

* Fix auto mapping

* Fix CI job name

* Make all processor tests pass

* Make tests of LayoutLMv2 and LayoutXLM consistent

* Add copied from statements to fast tokenizer

* Add copied from statements to slow tokenizer

* Remove add_visual_labels attribute

* Fix tests

* Add link to notebooks

* Improve docs of LayoutLMv3Processor

* Fix reference to section

Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-05-24 09:53:45 +02:00
Sylvain Gugger
56f50590d5
Use Accelerate in from_pretrained for big model inference (#17341)
* Initial work

* More or less finished with first draft

* Update src/transformers/modeling_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix randomly initialized weights

* Update src/transformers/modeling_utils.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Address review comments

* Rename DeepSpeed folder to temporarily fix the test issue?

* Revert to try if Accelerate fix works

* Use latest Accelerate release

* Quality and fixes

* Style

* Quality

* Add doc

* Test + fix

* More blocks

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-05-23 14:32:21 -04:00
Anugunj Naman
c86aad6110
Fix cvt docstrings (#17367) 2022-05-23 16:11:09 +02:00
ghlai9665
7b8cb26953
Correct & Improve Doctests for LayoutLMv2 (#17168)
* add inference example to LayoutLMv2ForQuestionAnswering, passing doctest

* add loss example to LayoutLMv2ForQuestionAnswering, passing doctest

* Add correct doctest for LayoutLMv2ForTokenClassification, passing doctest

* add correct doctest for LayoutLMv2ForSequenceClassification, passing test

* add correct doctest for LayoutLMv2Model, passing test

* make fixup

* fix to address review comments

* make style

* fix doctest line break issue, add to documentaiton_tests.txt, address review comments

* move comment about layoutlmv2 dependencies to the doc page

* format doc page as suggested

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* delete extraneous backtick

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-23 08:02:31 -04:00
NielsRogge
adc0ff2502
Add CvT (#17299)
* Adding cvt files

* Adding cvt files

* changes in init file

* Adding cvt files

* changes in init file

* Style fixes

* Address comments from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Format lists in docstring

* Fix copies

* Apply suggestion from code review

Co-authored-by: AnugunjNaman <anugunjjha@gmail.com>
Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-18 17:47:18 +02:00
Carl
d6b8e9cec7
Add trajectory transformer (#17141)
* Add trajectory transformer


Fix model init


Fix end of lines for .mdx files

Add trajectory transformer model to toctree

Add forward input docs

Fix docs, remove prints, simplify prediction test

Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update docs, more descriptive comments

Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update readme

Small comment update and add conversion script

Rebase and reformat

Fix copies

Fix rebase, remove duplicates

Fix rebase, remove duplicates

* Remove tapex

* Remove tapex

* Remove tapex
2022-05-17 19:07:43 -04:00
Cesare Campagnano
d9050dc768
[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112)
* [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing

* LED docs clarification

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] gradient_checkpointing=True should be passed to TrainingArguments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] docs: remove wrong word

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [LED] docs fix typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:44:37 +02:00
Jean Vancoppenolle
bad358398a
Add support for pretraining recurring span selection to Splinter (#17247)
* Add SplinterForSpanSelection for pre-training recurring span selection.

* Formatting.

* Rename SplinterForSpanSelection to SplinterForPreTraining.

* Ensure repo consistency

* Fixup changes

* Address SplinterForPreTraining PR comments

* Incorporate feedback and derive multiple question tokens per example.

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/splinter/modeling_splinter.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Jean Vancoppenole <jean.vancoppenolle@retresco.de>
Co-authored-by: Tobias Günther <tobias.guenther@retresco.de>
Co-authored-by: Tobias Günther <github@tobigue.de>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-05-17 23:42:14 +02:00
Patrick von Platen
5a9957358c
Add Wav2Vec2Conformer (#16812)
* save intermediate

* add wav2vec2 conformer

* add more code

* more

* first test passes

* make all checkpoints work

* update

* up

* more clean ups

* save clean-up

* save clean-up

* save more

* remove bogus

* finalize design conformer

* remove vision

* finish all tests

* more changes

* finish code

* add doc tests

* add slow tests

* fix autoconfig test

* up

* correct docstring

* up

* update

* fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Update docs/source/en/model_doc/wav2vec2-conformer.mdx

* upload

* save copied from

* correct configs

* fix model outputs

* add to docs

* fix imports

* finish

* finish code

* correct copied from

* correct again

* correct make fix

* improve make fix copies

* save

* correct fix copy from

* correct init structure

* correct

* fix import

* apply suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2022-05-17 00:43:16 +02:00
amyeroberts
f6a6388972
Add Tensorflow Swin model (#16988)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 22:19:53 +01:00
Kevin Zehnder
6cb7187324
docs(transformers): fix typo (#17263) 2022-05-16 17:04:30 -04:00
Sander Land
053a80c606
logging documentation update (#17174)
* logging documentation

* style

Co-authored-by: Sander Land <sander@chatdesk.com>
2022-05-16 16:47:28 -04:00
Sylvain Gugger
ddb1a47ec8
Automatically sort auto mappings (#17250)
* Automatically sort auto mappings

* Better class extraction

* Some auto class magic

* Adapt test and underlying behavior

* Remove re-used config

* Quality
2022-05-16 13:24:20 -04:00
Stas Bekman
71abd3ade1
[WIP] [doc] performance/scalability revamp (#15723)
* [doc] performance/scalability revamp

* link the new docs

* no :

* mixed precision

* work on the first doc

* expand the main doc

* Trigger CI

* style

* revamp single GPU training section

* work on training performance

* remove files not used anymore or will be added later

* final touches

* fix rebase

* Add hardware section to toctree

* fix toctree again

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove `fast_tokenizers` entry that was copied in rebase

* add warning about DP vs DDP

* remove todo

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix missing closure of codeblock

* Update docs/source/en/perf_train_gpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* sync with #16860

* update toc

Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-16 13:36:41 +02:00
Sayak Paul
9f16a1cc13
Update data2vec.mdx to include a Colab Notebook link (that shows fine-tuning) (#17194)
* Update data2vec.mdx

* Update data2vec.mdx

* Update docs/source/en/model_doc/data2vec.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-12 10:22:00 -04:00
Younes Belkada
b971c769e8
Add OPT (#17088)
* First version - OPT model

* Final changes

- putting use cache to False

* few changes

- remove commented block

* few changes

- remove unecessary files

* fix style issues

* few changes

- remove a test file
- added the logits test

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add gen tests

* few changes

- rm mask filling example on docstring

* few changes

- remove useless args

* some changes

- more tests should pass now
- needs to clean more
- documentation still needs to be done

* fix code quality

* major changes

- change attention architecture to BART-like
- modify some tests
- style fix

* rm useless classes

- remove opt for:
- QA
- cond generation
- seq classif

* Removed autodoc calls to non-existant classes

TOkenizers are not implemented

* Update src/transformers/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/auto/modeling_tf_auto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Replaced OPTTokeniser with GPT2 tokenizer

* added GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer")

* Removed OPTTokenizer

* make style

* Make style replaces

``` ...).unsqueeze(```
by
``` >>>).unsqueeze(```

* make repo consistency

* Removed PretrainedOPTModel

* fix opt.mdx removed other heads

* fix init, removed 3 heads

* removed heads

* finished cleaning head

* removed seauence classif and question answering

* removed unused imports

* removed useless dummy object for QA, SC and CG

* removed tests for removed useless dummy object for QA, SC and CG

* Removed head_mask using encoder layers which don't exist

* fixed test

* fix line

* added OPT to toctree

* Updated model path with pushed weigths

* fix model path

* fixed code quality

* fixed embeddings and generation tests

* update paths

* clean comments

* removed OPTClassificationHead for sentence classification

* renamed hidden layer

* renamed num layers to standard num_hidden_layers

* num_attention_heads fix

* changes for 125m

* add first version for 125m

* add first version - flax

* add new version

* causal LM output

* replace output type with BaseModelOutputWithPastAndCrossAttentions

* revert working config from 150m to 350m

* clean

* removed decoder input ids

* fixed embed dim

* more embed_dim issues

* make style + removed enc_dec test

* update falx model

* removed troublesome copy

* added is_encoder_decoder=False to config

* added set_input emb fuinction to model class

* requires torch on embed test

* use head mask instead of decoder head mask input param solves a test

* 8 test remaining, update

* Updated create_and_check_decoder_model_past_large_inputs

* Make style

* update op tokenizer with condition

* make style

* See if I can push

* some clean up

* remove linear head hack

* save intermediate

* save correct attention

* add copied from from bart

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix part of the reviewss
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* same changes in naming / conversion

* correct mask

* more fixes

* delete FlaxOPT and TfOPT

* clean traces of Flax and Tf

* fix mask

* fixed positionnal embedding length when past key value is provoded

* get 125m, 6.7b to work

* Added do_layer_norm

* solved mismatch in load dictionnary

* clean up preapre opt input dict

* fixed past key value as bool

* fix previus

* fixed return dict False tuple issue

* All tests are passing

* Make style

* Ignore OPTDecoder non tested

* make fix-copies

* make repo consistency

* small fix

* removed uselss @torch.no_grad decorator

* make styl;e

* fix previous opt test

* style

* make style

* added opt documentation

* update OPT_PRETRAINED_MODEL_ARCHIVE_LIST

* up

* more fixes

* model & config work

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* added comment on padding hack (+2)

* cleaup

* review update

* docstring for missing arg

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/opt/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update pretrained map

* update path and tests

* make style

* styling

* make consistency

* add gpt2 tok new

* more tok fixes

* Update src/transformers/models/auto/tokenization_auto.py

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/opt.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/opt/test_modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style

* make tokenizer auto tests pass

* apply Lysandre suggestion

* finish tests

* add some good tokenizer tests

* improve docs slighly

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2022-05-12 12:24:35 +02:00
Amanpreet Singh
a10f61834d
[feat] Add FLAVA model (#16654)
* [WIP] Add FLAVA model

This PR aims to add [FLAVA](ihttps://arxiv.org/abs/2112.04482) model to the transformers repo.

Following checklist delineates the list of things to be done for this PR
to be complete:

[x] Flava init
[x] Flava base models
[x] Flava layers
[x] Flava Configs
[x] Flava encoders
[x] Flava pretraining models
[ ] Flava classification/retrieval models (To be added in a separate PR)
[x] Documentation updates 
[x] Imports updates 
[x] Argstring updates
[x] Flava pretrained checkpoints 
[x] Flava tests
[x] Flava processors 
[x] Sanity check
[x] Lint
2022-05-11 14:56:48 -07:00
hasan salim kanmaz
c33f6046c3
[WIP] Enable reproducibility for distributed trainings (#16907)
* add seed worker and set_deterministic_seed_for_cuda function to enforce reproducability

* change function name to enable determinism, add docstrings, reproducability support for tf

* change function name to enable_determinism_for_distributed_training

* revert changes in set_seed and call set_seed within enable_full_determinism

* add one position argument for seed_worker function

* add full_determinism flag in training args and call enable_full_determinism when it is true

* add enable_full_determinism to documentation

* apply make fixup after the last commit

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-11 09:37:13 -04:00
Jason Phang
48a8f3daa1
Add DebertaV2ForMultipleChoice (#17135) 2022-05-10 16:21:44 -04:00
Patrick Haller
259eeb6dab
Fixing the output of code examples in the preprocessing chapter (#17162) 2022-05-10 12:16:28 -04:00
Zachary Mueller
d719bcd46a
Fix all docs for accelerate install directions (#17145) 2022-05-09 15:45:18 -04:00
Sylvain Gugger
7783fa6bb3 Fix quality and repo consistency 2022-05-09 11:14:36 -04:00
Sourab Mangrulkar
05fc1766ff
PyTorch FSDP integration in Trainer (#17136)
* PyTorch FSDP integration in Trainer

* reformatting

make style and make quality are now compliant.

* Updating dependency check

* Trigger CI

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-05-09 20:40:56 +05:30
Manan Dey
dc3645dc9c
add mobilebert onnx configs (#17029)
* update docs of length_penalty

* Revert "update docs of length_penalty"

This reverts commit 466bf4800b.

* add mobilebert onnx config

* address suggestions

* Update auto.mdx

* Update __init__.py

* Update features.py
2022-05-09 10:36:53 -04:00
Ritik Nandwal
215e0681e4
Added BigBirdPegasus onnx config (#17104)
* Add onnx configuration for bigbird-pegasus

* Modify docs
2022-05-06 17:31:00 +02:00
Steven Liu
cad61b6839
Fix link to example scripts (#17103) 2022-05-05 15:20:27 -05:00
Steven Liu
23619ef6b7
📝 open fresh PR for pipeline doctests (#17073) 2022-05-04 11:30:34 -05:00
Sayak Paul
049e791758
Add Data2Vec for Vision in TF (#17008)
* add utilities till TFData2VecVisionLayer.

* chore: pass window_size to attention layer.

* feat: add TFData2VecVisionRelativePositionBias.

* feat: initial implementation ready for tf data2vec.

* fix: relative position bias index, table to be fixed.

* chore: implementation added, tests remaining.

* add: tests, other PR files.

* fix: code quality.

* fix: import structure in init.

* chore: run make fix-copies.

* chore: address PR feedback (round I).

* chore: styling nit.

* fix: tests due to removal of to_2tuple().

* chore: rebase with upstream main and move the test.

* Update src/transformers/models/auto/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix: layer call.

* chore: remove from_pt=True and rerun test.

* chore: remove cast and tf.divide.

* chore: minor edits to the test script.

* Update src/transformers/models/data2vec/modeling_tf_data2vec_vision.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* fix: expand() on TF tensors with broadcast_to().

* fix: test import.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-05-04 08:08:25 -04:00
Sylvain Gugger
a8fa2f91f4
Make Trainer compatible with sharded checkpoints (#17053)
* Make Trainer compatible with sharded checkpoints

* Add doc
2022-05-03 09:55:10 -04:00
Sanchit Gandhi
cd9274d010
[FlaxBert] Add ForCausalLM (#16995)
* [FlaxBert] Add ForCausalLM

* make style

* fix output attentions

* Add RobertaForCausalLM

* remove comment

* fix fx-to-pt model loading

* remove comment

* add modeling tests

* add enc-dec model tests

* add big_bird

* add electra

* make style

* make repo-consitency

* add to docs

* remove roberta test

* quality

* amend cookiecutter

* fix attention_mask bug in flax bert model tester

* tighten pt-fx thresholds to 1e-5

* add 'copied from' statements

* amend 'copied from' statements

* amend 'copied from' statements

* quality
2022-05-03 11:26:19 +02:00
Lysandre Debut
bb2e088be7
Allow all imports from transformers (#17050) 2022-05-02 12:47:39 -04:00
NielsRogge
1ac698744c
Add YOLOS (#16848)
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-05-02 18:30:55 +02:00
Yih-Dar
ede5e04191
Add a check on config classes docstring checkpoints (#17012)
* Add the check

* add missing ckpts

* add a list to ignore

* call the added check script

* better regex pattern

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-30 10:40:46 +02:00
Sylvain Gugger
7152ed2bae
Result of new doc style with fixes (#17015)
* Result of new doc style with fixes

* Add last two files

* Bump hf-doc-builder
2022-04-29 17:42:15 -04:00
Mishig Davaadorj
cf8a7c2490
Update custom_models.mdx (#16964)
BertModelForSequenceClassification -> BertForSequenceClassification
2022-04-27 16:46:55 +02:00
Yang Ming
10dfa126b7
documentation: some minor clean up (#16850) 2022-04-26 16:56:08 -04:00
Krishna Sirumalla
aaee4038c3
Add onnx config for RoFormer (#16861)
* add roformer onnx config
2022-04-26 16:51:15 +02:00
Rushi Chaudhari
8246caf3eb
added deit onnx config (#16887)
* added deit onnx config
2022-04-25 20:50:45 +02:00
Patrick von Platen
3a71e94a92
Fix doc test quicktour dataset (#16929)
* fix doc test

* fix doc test

Co-authored-by: Patrick <patrick@pop-os.localdomain>
2022-04-25 16:26:59 +02:00
Patrick von Platen
72728be3db
[DocTests] Fix some doc tests (#16889)
* [DocTests] Fix some doc tests

* hacky fix

* correct
2022-04-23 08:40:14 +02:00
Thomas Chaigneau
ec81c11a18
Add OnnxConfig for ConvBERT (#16859)
* add OnnxConfig for ConvBert

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
2022-04-22 18:19:15 +02:00
Nicolas Patry
e789418ebe
Adding support for array key in raw dictionnaries in ASR pipeline. (#16827)
* Adding support for `array` key in raw dictionnaries in ASR pipeline.

* ES .

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Making it work by not popping `array` first.

* Black 22.3

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-21 14:39:10 +02:00
Stas Bekman
67ed0e43dc
[docs] fix url (#16860) 2022-04-20 11:01:24 -07:00
Yang Ming
ff06b17791
add DebertaV2 fast tokenizer (#15529)
Co-authored-by: alcinos <carion.nicolas@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-20 10:26:51 +02:00
Patrick von Platen
8d3f952adb
[Data2Vec] Add data2vec vision (#16760)
* save intermediate

* add vision

* add vision

* save

* finish models

* finish models

* continue

* finish

* up

* up

* up

* tests all pass

* clean up

* up

* up

* fix bugs in beit

* correct docs

* finish

* finish docs

* make style

* up

* more fixes

* fix type hint

* make style

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/data2vec/test_modeling_data2vec_vision.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix test

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-18 17:52:13 +02:00
Patrick von Platen
9a2995ee39
[Quicktour Audio] Improve && remove ffmpeg dependency (#16723)
* [Quicktour Audio] Improve && remove ffmpeg dependency

* final fix

* final touches
2022-04-18 16:50:13 +02:00
Patrick von Platen
b24201fa44
[Doctests] Fix all T5 doc tests (#16646)
* [Doctests] Fix all T5 doc tests

* make style

* Update docs/source/en/model_doc/t5.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply Sylvains comments

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-13 11:36:54 +02:00
Minh Chien Vu
9c9db751e2
add Bigbird ONNX config (#16427)
* add Bigbird ONNX config
2022-04-12 20:46:06 +02:00
Anmol Joshi
a315988bae
Moved functions to pytorch_utils.py (#16625)
* Moved functions to pytorch_utils.py

* isort formatting

* Reverted tf changes

* isort, make fix-copies

* documentation fix

* Fixed Conv1D import

* Reverted research examples file

* backward compatibility for pytorch_utils

* missing import

* isort fix
2022-04-12 12:38:50 -04:00
Sylvain Gugger
0711c45eae
Remove duplicate header (#16732) 2022-04-12 12:37:13 -04:00
Patrick von Platen
098b002644
[Doctests] Correct task summary (#16644) 2022-04-11 14:59:35 +02:00
Yih-Dar
8e93dc7eaf
Fix some doc examples in task summary (#16666)
* Fix some doc examples

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 11:20:03 +02:00
Steven Liu
7c5d79912a
Update audio examples with MInDS-14 (#16633)
*  update audio examples with minds dataset

* 🖍 make style

* 🖍 minor fixes for doctests
2022-04-08 15:55:42 -05:00
NielsRogge
4ef0abb738
Add TAPEX (#16473)
* Add TapexTokenizer

* Improve docstrings and provide option to provide answer

* Remove option for pretokenized inputs

* Add TAPEX to README

* Fix copies

* Remove option for pretokenized inputs

* Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.

* - Draft a README file for running the script and introducing some background.
- Remove unused code lines in tabfact script.
- Disable the deafult `pad_to_max_length` option which is memory-consuming.

* * Support `as_target_tokenizer` function for TapexTokenizer.
* Fix the do_lower_case behaviour of TapexTokenizer.
* Add unit tests for target scenarios and cased/uncased scenarios for both source and target.

* * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
* Fix typos in tapex example README.

* * fix the evaluation script - remove the property `task_name`

* * Make the label space more clear for tabfact tasks

* * Using a new fine-tuning script for tapex-base on tabfact.

* * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case
* Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql

* * Remove the default tokenizer_name option.
* Provide evaluation command.

* * Support for WikiTableQuestion dataset.

* Fix a typo in README.

* * Fix the datasets's key name in WikiTableQuestions

* Run make fixup and move test to folder

* Fix quality

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply some more suggestions from code review

* Improve docstrings

* Overwrite failing test

* Improve comment in example scripts

* Fix rebase

* Add TAPEX to Auto mapping

* Add TAPEX to auto config mappings

* Put TAPEX higher than BART in auto mapping

* Add TAPEX to doc tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: SivilTaram <qianlxc@outlook.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-08 10:57:51 +02:00
Francesco Saverio Zuppichini
af14c61973
RegNet (#16188)
* base model done

* make style

* done

* added files

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Trigger doc build

* resolved conversations

* resolved conversations

* seer models

* minor changes

* minor changes

* make fixup

* glob variables

* minor changes

* fix copies

* config when possibile

* resolved conflicts

* resolved conflicts

* resolved conflicts

* CI

* conversion script for 10b param

* fixed for 10b model

* minor updates in the doc + make style

* removed unused code

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* removed unused code

* removed unused code

* updated modeling_utils from main

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-04-07 21:58:00 +02:00
Joao Gante
3f43d824b9
TF generate refactor - Beam Search (#16374)
* refactor TF beam search

* refactored generate can now properly use attention masks

* add force bos/eos logit processors
2022-04-06 18:19:34 +01:00
Patrick von Platen
c65633156b
[Speech2Text Doc] Fix docs (#16611)
* [Speech2Text Doc] Fix docs

* apply ydshiehs suggestions
2022-04-06 14:19:00 +02:00
Patrick von Platen
0bf18643f4
[Minds14] Correct quicktour (#16626) 2022-04-06 11:27:11 +02:00
Sylvain Gugger
208f4c109a Quality 2022-04-05 14:12:01 -04:00
Steven Liu
f553c3ce4c
Update summary of the tasks (#16528)
* 📝 add image/vision classification and asr

* 🖍 minor formatting fixes

* Fixed a typo in legacy seq2seq_trainer.py (#16531)

* Add ONNX export for BeiT (#16498)

* Add beit onnx conversion support

* Updated docs

* Added cross reference to ViT ONNX config

* call on_train_end when trial is pruned (#16536)

* Type hints added (#16529)

* Fix Bart type hints (#16297)

* Add type hints to PLBart PyTorch

* Remove pending merge conflicts

* Fix PLBart Type Hints

* Add changes from review

* Add VisualBert type hints (#16544)

* Adding missing type hints for mBART model (PyTorch) (#16429)

* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

* type hints update

* making dependent modesls coherent

Co-authored-by: matt <rocketknight1@gmail.com>

* Remove MBart subclass of XLMRoberta in tokenzier docs (#16546)

* Remove MBart subclass of XLMRoberta in tokenzier

* Fix style

* Copy docs from MBart50 tokenizer

* Use random_attention_mask for TF tests (#16517)

* use random_attention_mask for TF tests

* Fix for TFCLIP test (for now).

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Improve code example (#16450)

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>

* Pin tokenizers version <0.13 (#16539)

* Pin tokenizers version <0.13

* Style

* Add code samples for TF speech models (#16494)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* [FlaxSpeechEncoderDecoder] Fix dtype bug (#16581)

* [FlaxSpeechEncoderDecoder] Fix dtype bug

* more fixes

* Making the impossible to connect error actually report the right URL. (#16446)

* Fix flax import in __init__.py: modeling_xglm -> modeling_flax_xglm (#16556)

* Add utility to find model labels (#16526)

* Add utility to find model labels

* Use it in the Trainer

* Update src/transformers/utils/generic.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Quality

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Enable doc in Spanish (#16518)

* Reorganize doc for multilingual support

* Fix style

* Style

* Toc trees

* Adapt templates

* Add use_auth to load_datasets for private datasets to PT and TF examples (#16521)

* fix formatting and remove use_auth

* Add use_auth_token to Flax examples

* add a test checking the format of `convert_tokens_to_string`'s output (#16540)

* add new tests

* add comment to overridden tests

* TF: Finalize `unpack_inputs`-related changes (#16499)

* Add unpack_inputs to remaining models

* removed kwargs to `call()` in TF models

* fix TF T5 tests

* [SpeechEncoderDecoderModel] Correct Encoder Last Hidden State Output (#16586)

* initialize the default rank set on TrainerState (#16530)

* initialize the default rank set on TrainerState

* fix style

* Trigger doc build

* Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* add a template to add missing tokenization test (#16553)

* add a template to add missing tokenization test

* add cookiecutter setting

* improve doc

* Update templates/adding_a_missing_tokenization_test/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* made _load_pretrained_model_low_mem static + bug fix (#16548)

* handle torch_dtype in low cpu mem usage (#16580)

* [Doctests] Correct filenaming (#16599)

* [Doctests] Correct filenaming

* improve quicktour

* make style

* Adding new train_step logic to make things less confusing for users (#15994)

* Adding new train_step logic to make things less confusing for users

* DO NOT ASK WHY WE NEED THAT SUBCLASS

* Metrics now working, at least for single-output models with type annotations!

* Updates and TODOs for the new train_step

* Make fixup

* Temporary test workaround until T5 has types

* Temporary test workaround until T5 has types

* I think this actually works! Needs a lot of tests though

* MAke style/quality

* Revert changes to T5 tests

* Deleting the aforementioned unmentionable subclass

* Deleting the aforementioned unmentionable subclass

* Adding a Keras API test

* Style fixes

* Removing unneeded TODO and comments

* Update test_step too

* Stop trying to compute metrics with the dummy_loss, patch up test

* Make style

* make fixup

* Docstring cleanup

* make fixup

* make fixup

* Stop expanding 1D input tensors when using dummy loss

* Adjust T5 test given the new compile()

* make fixup

* Skipping test for convnext

* Removing old T5-specific Keras test now that we have a common one

* make fixup

* make fixup

* Only skip convnext test on CPU

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Avoiding TF import issues

* make fixup

* Update compile() to support TF 2.3

* Skipping model.fit() on template classes for now

* Skipping model.fit() on template class tests for now

* Replace ad-hoc solution with find_labels

* make fixup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding missing type hints for BigBird model   (#16555)

* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

* type hints update

* making dependent modesls coherent

* Type hints for BigBird

* removing typos

Co-authored-by: matt <rocketknight1@gmail.com>

* [deepspeed] fix typo, adjust config name (#16597)

* 🖍 apply feedback

Co-authored-by: Cathy <815244047@qq.com>
Co-authored-by: Jim Rohrer <jrohrer1@gmail.com>
Co-authored-by: Ferdinand Schlatt <fschlatt@gmail.com>
Co-authored-by: Dahlbomii <101373053+Dahlbomii@users.noreply.github.com>
Co-authored-by: Gunjan Chhablani <chhablani.gunjan@gmail.com>
Co-authored-by: Rishav Chandra Varma <rishavchandra.v16@iiits.in>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Daniel Stancl <46073029+stancld@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Andres Codas <andrescodas@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-04-05 12:48:42 -05:00
Patrick von Platen
7ccacdf10f
[Doctests] Correct filenaming (#16599)
* [Doctests] Correct filenaming

* improve quicktour

* make style
2022-04-05 14:15:02 +02:00
Sylvain Gugger
b9a768b3ff
Enable doc in Spanish (#16518)
* Reorganize doc for multilingual support

* Fix style

* Style

* Toc trees

* Adapt templates
2022-04-04 10:25:46 -04:00