Commit Graph

352 Commits

Author SHA1 Message Date
Yih-Dar
81ab11124f
Add checks for some workflow jobs (#18583)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-16 13:53:47 +02:00
NielsRogge
2ab790e82d
Add Donut (#18488)
* First draft

* Improve script

* Update script

* Make conversion work

* Add final_layer_norm attribute to Swin's config

* Add DonutProcessor

* Convert more models

* Improve feature extractor and convert base models

* Fix bug

* Improve integration tests

* Improve integration tests and add model to README

* Add doc test

* Add feature extractor to docs

* Fix integration tests

* Remove register_buffer

* Fix toctree and add missing attribute

* Add DonutSwin

* Make conversion script work

* Improve conversion script

* Address comment

* Fix bug

* Fix another bug

* Remove deprecated method from docs

* Make Swin and Swinv2 untouched

* Fix code examples

* Fix processor

* Update model_type to donut-swin

* Add feature extractor tests, add token2json method, improve feature extractor

* Fix failing tests, remove integration test

* Add do_thumbnail for consistency

* Improve code examples

* Add code example for document parsing

* Add DonutSwin to MODEL_NAMES_MAPPING

* Add model to appropriate place in toctree

* Update namespace to appropriate organization

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-12 16:40:58 +02:00
Younes Belkada
4a51075a96
bitsandbytes - Linear8bitLt integration into transformers models (#17901)
* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* revmoe the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

 - add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* male style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more doctring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* odify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n`by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2022-08-10 09:13:36 +02:00
NielsRogge
82bb682643
[VideoMAE] Add model to doc tests (#18523)
* Add videomae to doc tests

* Add pip install decord

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-08 19:28:51 +02:00
Sylvain Gugger
377cdded7a
Clean up hub (#18497)
* Clean up utils.hub

* Remove imports

* More fixes

* Last fix
2022-08-08 08:48:10 -04:00
Sylvain Gugger
70fa1a8d26
Fix pipeline tests (#18487)
* Fix pipeline tests

* Make sure all pipelines tests run with init changes
2022-08-05 09:14:51 -04:00
Yih-Dar
d2704c4143
Add machine type in the artifact of Examples directory job (#18459)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-04 18:52:01 +02:00
Steven Liu
151a2aaa4e
Split model list on modality (#18328)
* 📝 split up model list

* Adapt script to reorg

* apply niels feedback

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-08-01 11:10:20 -05:00
Yih-Dar
bd6d1b4300
Add a check regarding the number of occurrences of ``` (#18389)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-01 14:23:02 +02:00
Yih-Dar
b53dab601c
Remove Flax OPT from doctest for now (#18338)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-28 11:50:44 -04:00
Yih-Dar
c4c6b4dbda
Add PyTorch 1.11 to past CI (#18302)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-26 15:47:23 +02:00
gilad19
2b09650885
Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924)
* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* provide classifier only text hidden states

* add test_for_token_classification

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add test_for_token_classification

Co-authored-by: gfuchs <gfuchs@ebay.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-07-26 10:11:32 +02:00
Alara Dirik
002915aa2a
Owlvit docs test (#18257)
* fix docs and add owlvit docs test

* fix minor bug in post_process, add to processor

* improve owlvit code examples

* fix hardcoded image size
2022-07-26 10:55:14 +03:00
Raghavan
bd87480d20
Fix command of doc tests for local testing (#18236)
* Fix command of doc tests for local testing

* Fix command for after running doc tests locally
2022-07-26 03:07:11 -04:00
Alara Dirik
12d66b4701
Add OWL-ViT model for zero-shot object detection (#17938)
* add owlvit model skeleton

* add class and box predictor heads

* convert modified flax clip to pytorch

* fix box and class predictors

* add OwlViTImageTextEmbedder

* convert class and box head checkpoints

* convert image text embedder checkpoints

* add object detection head

* fix bugs

* update conversion script

* update conversion script

* fix q,v,k,out weight conversion conversion

* add owlvit object detection output

* fix bug in image embedder

* fix bugs in text embedder

* fix positional embeddings

* fix bug in inference mode vision pooling

* update docs, init tokenizer and processor files

* support batch processing

* add OwlViTProcessor

* remove merge conflicts

* readd owlvit imports

* fix bug in OwlViTProcessor imports

* fix bugs in processor

* update docs

* fix bugs in processor

* update owlvit docs

* add OwlViTFeatureExtractor

* style changes, add postprocess method to feature extractor

* add feature extractor and processor tests

* add object detection tests

* update conversion script

* update config paths

* update config paths

* fix configuration paths and bugs

* fix bugs in OwlViT tests

* add import checks to processor

* fix docs and minor issues

* fix docs and minor issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* update docs and examples

* fix bugs and issues

* update conversion script, fix positional embeddings

* process 2D input ids, update tests

* fix style and quality issues

* update docs

* update docs and imports

* update OWL-ViT index.md

* fix bug in OwlViT feature ext tests

* fix code examples, return_dict by default

* return_dict by default

* minor fixes, add tests to processor

* small fixes

* add output_attentions arg to main model

* fix bugs

* remove output_hidden_states arg from main model

* update self.config variables

* add option to return last_hidden_states

* fix bug in config variables

* fix copied from statements

* fix small issues and bugs

* fix bugs

* fix bugs, support greyscale images

* run fixup

* update repo name

* merge OwlViTImageTextEmbedder with obj detection head

* fix merge conflict

* fix merge conflict

* make fixup

* fix bugs

* fix bugs

* add additional processor test
2022-07-22 13:35:32 +03:00
Sayak Paul
561b9a8c00
[SegFormer] TensorFlow port (#17910)
* add: segformer utils and img. classification.

* add: segmentation layer.

* feat: working implementation of segformer.

* chore: remove unused variable.

* add test, remaining modifications.

* remove: unnecessary files.

* add: rest of the files.

Co-authored-by: matt <rocketknight1@gmail.com>

* chore: remove ModuleList comment.

* chore: apply make style.

* chore: apply make fixup-copies.

* add  to check_repo.py

* add decode head to IGNORE_NON_TESTED

* chore: run make style.

* chore: PR comments.

* chore: minor changes to model doc.

* tests: reduction across samples.

* add a note on the space.

* sort importats.

* fix: reduction in loss computation.

* chore: align loss function with that of NER.

* chore: correct utils/documentation_tests.txt

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* chore: simplify the interpolation of logits in loss computation.

* chore: return transposed logits when return_dict=False.

* chore: add link to the tf fine-tuning repo.

* address pr comments.

* address niels's comments.

* remove from_pt=True since tf weights are in.

* remove comment from pt model.

* address niels's comments.

Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-07-21 18:22:37 +01:00
Yih-Dar
2c5747edfe
Update notification service (#17921)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-21 15:03:50 +02:00
Sylvain Gugger
dc9147ff36
Custom pipeline (#18079)
* Initial work

* More work

* Add tests for custom pipelines on the Hub

* Protect import

* Make the test work for TF as well

* Last PyTorch specific bit

* Add documentation

* Style

* Title in toc

* Bad names!

* Update docs/source/en/add_new_pipeline.mdx

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Auto stash before merge of "custom_pipeline" and "origin/custom_pipeline"

* Address review comments

* Address more review comments

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-07-19 12:02:35 +02:00
Sylvain Gugger
aeeab1ffd0
Fix template for new models in README (#18182) 2022-07-18 16:01:51 +02:00
amyeroberts
8581a798c0
Add TF DeiT implementation (#17806)
* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Initial TF DeiT implementation

* Fix copies naming issues

* Fix up + docs

* Properly same main layer

* Name layers properly

* Fixup

* Fix import

* Fix import

* Fix import

* Fix weight loading for tests whilst not on hub

* Add doc tests and remove to_2tuple

* Add back to_2tuple
Removing to_2tuple results in many downstream changes needed because of the copies checks

* Incorporate updates in Improve vision models #17731 PR

* Don't hard code num_channels

* Copy PyTorch DeiT embeddings and remove pytorch operations with mask

* Fix patch embeddings & tidy up

* Update PixelShuffle to move logic into class layer

* Update doc strings - remove PT references

* Use NHWC format in internal layers

* Fix up

* Use linear activation layer

* Remove unused import

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Move dataclass to top of file

* Remove from_pt now weights on hub

* Fixup

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-13 18:04:08 +01:00
Yulv-git
95113d1365
Fix some typos. (#17560)
* Fix some typos.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* Fix typo.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* make fixup.
2022-07-11 05:00:13 -04:00
Sylvain Gugger
7c046c5c22
Update localized READMES when template is filled. (#18062) 2022-07-08 11:08:52 -04:00
Sylvain Gugger
1b749a7f8d
Sort doc toc (#18034)
* Add script to sort doc ToC

* Style and fixes

* Add check to quality job
2022-07-07 08:17:58 -04:00
Yih-Dar
f681437203
Enable Past CI (#17919)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-05 18:08:36 +02:00
amyeroberts
77ea5130a1
Add TF ResNet model (#17427)
* Rought TF conversion outline

* Tidy up

* Fix padding differences between layers

* Add back embedder - whoops

* Match test file to main

* Match upstream test file

* Correctly pass and assign image_size parameter

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Add in MainLayer

* Correctly name layer

* Tidy up AdaptivePooler

* Small tidy-up

More accurate type hints and remove whitespaces

* Change AdaptiveAvgPool

Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools if the output shape does not evenly divide by input shape c.f. 9e26607e22 (r900109509)

Co-authored-by: From: matt <rocketknight1@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Use updated AdaptiveAvgPool

Co-authored-by: matt <rocketknight1@gmail.com>

* Make AdaptiveAvgPool compatible with CPU

* Remove image_size from configuration

* Fixup

* Tensorflow -> TensorFlow

* Fix pt references in tests

* Apply suggestions from code review - grammar and wording

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add TFResNet to doc tests

* PR comments - GlobalAveragePooling and clearer comments

* Remove unused import

* Add in keepdims argument

* Add num_channels check

* grammar fix: by -> of

Co-authored-by: matt <rocketknight1@gmail.com>

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove transposes - keep NHWC throughout forward pass

* Fixup look sharp

* Add missing layer names

* Final tidy up - remove from_pt now weights on hub

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-07-04 10:59:15 +01:00
Matthijs Hollemans
fbc7598bab
add MobileViT model (#17354)
* add MobileViT

* fixup

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* remove empty line

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* use clearer variable names

* rename to MobileViTTransformerLayer

* no longer inherit from nn.Sequential

* fixup

* fixup

* not sure why this got added twice

* rename organization for checkpoints

* fix it up

* Update src/transformers/models/mobilevit/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/mobilevit/test_modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* code style improvements

* fixup

* Update docs/source/en/model_doc/mobilevit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/mobilevit.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/mobilevit/configuration_mobilevit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* download labels from hub

* rename layers

* rename more layers

* don't compute loss in separate function

* remove some nn.Sequential

* replace nn.Sequential with new MobileViTTransformer class

* replace nn.Sequential with MobileViTMobileNetLayer

* fix pruning since model structure changed

* fixup

* fix doc comment

* remove custom resize from feature extractor

* fix ONNX import

* add to doc tests

* use center_crop from image_utils

* move RGB->BGR flipping into image_utils

* fix broken tests

* wrong type hint

* small tweaks

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-29 16:07:51 -04:00
StevenTang1998
3cff4cc587
Add MVP model (#17787)
* Add MVP model

* Update README

* Remove useless module

* Update docs

* Fix bugs in tokenizer

* Remove useless test

* Remove useless module

* Update vocab

* Remove specifying

* Remove specifying

* Add #Copied ... statement

* Update paper link

* Remove useless TFMvp

* Add #Copied ... statement

* Fix style in test mvp model

* Fix some typos

* Fix properties of unset special tokens in non verbose mode

* Update paper link

* Update MVP doc

* Update MVP doc

* Fix README

* Fix typos in docs

* Update docs
2022-06-29 09:30:55 -04:00
Yih-Dar
5cdfff5df3
Fix job links in Slack report (#17892)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-29 14:53:13 +02:00
Aritra Roy Gosthipaty
a7eba83161
TF implementation of RegNets (#17554)
* chore: initial commit

Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets.

* chore: porting the rest of the modules to tensorflow

did not change the documentation yet, yet to try the playground on the model

* Fix initilizations (#1)

* fix: code structure in few cases.

* fix: code structure to align tf models.

* fix: layer naming, bn layer still remains.

* chore: change default epsilon and momentum in bn.

* chore: styling nits.

* fix: cross-loading bn params.

* fix: regnet tf model, integration passing.

* add: tests for TF regnet.

* fix: code quality related issues.

* chore: added rest of the files.

* minor additions..

* fix: repo consistency.

* fix: regnet tf tests.

* chore: reorganize dummy_tf_objects for regnet.

* chore: remove checkpoint var.

* chore: remov unnecessary files.

* chore: run make style.

* Update docs/source/en/model_doc/regnet.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* chore: PR feedback I.

* fix: pt test. thanks to @ydshieh.

* New adaptive pooler (#3)

* feat: new adaptive pooler

Co-authored-by: @Rocketknight1

* chore: remove image_size argument.

Co-authored-by: matt <rocketknight1@gmail.com>

Co-authored-by: matt <rocketknight1@gmail.com>

* Empty-Commit

* chore: remove image_size comment.

* chore: remove playground_tf.py

* chore: minor changes related to spacing.

* chore: make style.

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: amyeroberts <aeroberts4444@gmail.com>

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: amyeroberts <aeroberts4444@gmail.com>

* chore: refactored __init__.

* chore: copied from -> taken from./g

* adaptive pool -> global avg pool, channel check.

* chore: move channel check to stem.

* pr comments - minor refactor and add regnets to doc tests.

* Update src/transformers/models/regnet/modeling_tf_regnet.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* minor fix in the xlayer.

* Empty-Commit

* chore: removed from_pt=True.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-06-29 13:45:14 +01:00
Jerry Jiarui XU
6c8f4c9a93
Adding GroupViT Models (#17313)
* add group vit and fixed test (except slow)

* passing slow test

* addressed some comments

* fixed test

* fixed style

* fixed copy

* fixed segmentation output

* fixed test

* fixed relative path

* fixed copy

* add ignore non auto configured

* fixed docstring, add doc

* fixed copies

* Apply suggestions from code review

merge suggestions

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolve comment, renaming model

* delete unused attr

* use fix copies

* resolve comments

* fixed attn

* remove unused vars

* refactor tests

* resolve final comments

* add demo notebook

* fixed inconsitent default

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* rename stage->stages

* Create single GroupViTEncoderLayer class

* Update conversion script

* Simplify conversion script

* Remove cross-attention class in favor of GroupViTAttention

* Convert other model as well, add processor to conversion script

* addressing final comment

* fixed args

* Update src/transformers/models/groupvit/modeling_groupvit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-28 20:51:47 +02:00
Matt
ee0d001de7
Add a TF in-graph tokenizer for BERT (#17701)
* Add a TF in-graph tokenizer for BERT

* Add from_pretrained

* Add proper truncation, option handling to match other tokenizers

* Add proper imports and guards

* Add test, fix all the bugs exposed by said test

* Fix truncation of paired texts in graph mode, more test updates

* Small fixes, add a (very careful) test for savedmodel

* Add tensorflow-text dependency, make fixup

* Update documentation

* Update documentation

* make fixup

* Slight changes to tests

* Add some docstring examples

* Update tests

* Update tests and add proper lowercasing/normalization

* make fixup

* Add docstring for padding!

* Mark slow tests

* make fixup

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* make fixup

* Properly handle tensorflow-text dummies
2022-06-27 12:06:21 +01:00
Sylvain Gugger
e8eb699ee8
Properly get tests deps in test_fetcher (#17870)
* Properly get tests deps in test_fetcher

* Remove print
2022-06-24 16:56:46 -04:00
Vishwas
c2c0d9db5f
Improve encoder decoder model docs (#17815)
* Copied all the changes from the last PR

* added in documentation_tests.txt

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/encoder-decoder.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: vishwaspai <vishwas.pai@emplay.net>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-06-24 14:48:19 +02:00
Quentin
ab223fc148
add doctests for DETR (#17786)
* add: check labels for detr object detection doctests

* add: check shapes

* add: add detr to documentation_tests.py

* fix: make fixup output

* fix: add a comment
2022-06-23 13:26:14 +02:00
Yih-Dar
6589e510fa
Attempt to change Push CI to workflow_run (#17753)
* Use workflow_run event for push CI

* change to workflow_run

* Add comments

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-18 08:35:03 +02:00
Patrick von Platen
7f14839f55
[Wav2Vec2Conformer] Official release (#17709)
* [Wav2Vec2Conformer] Official release

* remove from not-in-readme
2022-06-15 18:34:15 +02:00
Daniel Stancl
a72f1c9f5b
Add LongT5 model (#16792)
* Initial commit

* Make some fixes

* Make PT model full forward pass

* Drop TF & Flax implementation, fix copies etc

* Add Flax model and update some corresponding stuff

* Drop some TF things

* Update config and flax local attn

* Add encoder_attention_type to config

* .

* Update docs

* Do some cleansing

* Fix some issues -> make style; add some docs

* Fix position_bias + mask addition + Update tests

* Fix repo consistency

* Fix model consistency by removing flax operation over attn_mask

* [WIP] Add PT TGlobal LongT5

* .

* [WIP] Add flax tglobal model

* [WIP] Update flax model to use the right attention type in the encoder

* Fix flax tglobal model forward pass

* Make the use of global_relative_attention_bias

* Add test suites for TGlobal model

* Fix minor bugs, clean code

* Fix pt-flax equivalence though not convinced with correctness

* Fix LocalAttn implementation to match the original impl. + update READMEs

* Few updates

* Update: [Flax] improve large model init and loading #16148

* Add ckpt conversion script accoring to #16853 + handle torch device placement

* Minor updates to conversion script.

* Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM

* gpu support + dtype fix

* Apply some suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* * Remove (de)parallelize stuff
* Edit shape comments
* Update README.md
* make fix-copies

* Remove caching logic for local & tglobal attention

* Apply another batch of suggestions from code review

* Add missing checkpoints
* Format converting scripts
* Drop (de)parallelize links from longT5 mdx

* Fix converting script + revert config file change

* Revert "Remove caching logic for local & tglobal attention"

This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.

* Stash caching logic in Flax model

* Make side relative bias used always

* Drop caching logic in PT model

* Return side bias as it was

* Drop all remaining model parallel logic

* Remove clamp statements

* Move test files to the proper place

* Update docs with new version of hf-doc-builder

* Fix test imports

* Make some minor improvements

* Add missing checkpoints to docs
* Make TGlobal model compatible with torch.onnx.export
* Replace some np.ndarray with jnp.ndarray

* Fix TGlobal for ONNX conversion + update docs

* fix _make_global_fixed_block_ids and masked neg  value

* update flax model

* style and quality

* fix imports

* remove load_tf_weights_in_longt5 from init and fix copies

* add slow test for TGlobal model

* typo fix

* Drop obsolete is_parallelizable and one warning

* Update __init__ files to fix repo-consistency

* fix pipeline test

* Fix some device placements

* [wip]: Update tests -- need to generate summaries to update expected_summary

* Fix quality

* Update LongT5 model card

* Update (slow) summarization tests

* make style

* rename checkpoitns

* finish

* fix flax tests

Co-authored-by: phungvanduy <pvduy23@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patil-suraj <surajp815@gmail.com>
2022-06-13 22:36:58 +02:00
Bram Vanroy
73083581a4
explicitly set utf8 for Windows (#17664) 2022-06-13 08:05:45 -04:00
Yih-Dar
c70dacde94
Fix very long job failure text in Slack report (#17630)
* Fix very long job failure text in Slack report

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-09 18:37:48 +02:00
Chan Woo Kim
119e3c0fc8
M-CTC-T Model (#16402)
* added cbs to notebooks, made copy-paste error fix in generation_utils

* initial push for mctc model

* mctc feature extractor done

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* passing attention, now struggling to figure out how attention masks make sense here

* works when excluding attention masks. ask later how one would integrate attention maskshere

* bizarre configuration error (model prefix comes first in config dict json and messes up the order)

* all passing but bizzarre config dict ordering issue when to_dict

* passing all major tests

* feature extraction, processor, tokenizer added & tests passing

* style & consistency & other logistical fixes

* copy paste fix

* model after feature extraction working

* commiting final feature extraction results; need to fix normalization

* feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?

* delete print ; format code a bit

* fixing tests

* passing major tests

* fixing styles

* completed tokenization test with real example; not sure if these values are entirely correct.

* last test fixes from local

* reverting accidentally included custom setup configs

* remove load tf weights; fix config error

* testing couldnt import featureextractor

* fix docs

* fix docs

* resolving comments

* style fixes

* style fixes

* Update to MCTCConv1dSubSampler

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* relposemb fixes

* conv1d name issue; expecting config fail with paraentheses

* fix config issue

* fix config issue

* fix config issue

* change everything to MCTCT

* fixing naming change errors

* archive list

* copyrights and docs

* copyrights and docs

* copyrights and docs

* merge resolution

* move tests, fix to changed optionaldependency structure

* test directories changed

* fixing tests

* how to avoid tf tests?

* how to avoid tf tests?

* tests passing locally

* allow mctctprocessor imported any env

* allow mctctprocessor imported any env

* fixed second round of feedback, need to fix docs

* doc changes not being applied

* all fixed

* style fix

* feedback fixes

* fix copies and feature extraction style fix

* Update tests/models/visual_bert/test_modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* copy paste huggingface:main visual bert

* added eof newline to visual bert; all tests are passing otherwise

* fix slow tests by adding attention mask

* change model id to speechbrain

* make fix-copies

* fix readme unwanted deletes

* fixing readmes, make fix-copies

* consistent M-CTC-T naming

* Update src/transformers/models/mctct/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* all fixed but variable naming

* adjust double quotes

* fixed variable names

* copyright and mr quilter

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct slow tests

* make fix-copies

* Update src/transformers/models/mctct/configuration_mctct.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mctct/configuration_mctct.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* m-ctc-t not mctct

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-08 00:33:07 +02:00
Sylvain Gugger
c4e58cd8ba
Clean imports to fix test_fetcher (#17531)
* Clean imports to fix test_fetcher

* Add dependencies printer

* Update utils/tests_fetcher.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Fix Perceiver import

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-06-03 12:34:41 -04:00
Sylvain Gugger
048dd73bba
Check list of models in the main README and sort it (#17517)
* Script for README

* Fix copies

* Complete error message
2022-06-02 08:10:08 -04:00
Sylvain Gugger
f128ccb997
Clean README in post release job as well. (#17519) 2022-06-02 07:44:03 -04:00
Yih-Dar
659b27fd26
Print more library versions in CI (#17384)
* print more lib. versions and just befor test runs

* update print_env_pt.py

* rename to print_env

* Disable warning + better job name

* print python version

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-02 10:24:16 +02:00
Arthur
7822a9b7a7
Opt in flax and tf (#17388)
* initial commit

* add init file

* update globakl init

* update index and dummy objects

* style

* update modelling auto

* fix initi typo in src/transformers

* fix typo in modeling tf auto, opt was in wrong mapping name

* fixed a slow test : saved_model

* style

* fix positionnal embedding if no position id is provided

* update tf test

* update test flax requirements

* fixed serialization

* update

* update tf name to allow smooth convertion

* update flax tests

* style

* fix test typo

* fix tf typo test

* add xla for generate support in causal LM

* fixed bug

* cleaned tf tests

* style

* removed from PT for slow tests

* fix typp

* opt test as slow

* trying to fix GPT2 undefined

* correct documentation and add to test doc

* update tf doc

* fix doc

* fake commit

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* update test based on review

* merged main layer for functionning test

* fixup + quality

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update long comment

* make fix copies

Co-authored-by: Arthur <arthur@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-31 18:41:22 +02:00
Sylvain Gugger
56b35ce3eb
Make check_init script more robust and clean inits (#17408) 2022-05-25 07:23:56 -04:00
Sylvain Gugger
bd908e9bb1
Fix README localizer script (#17407) 2022-05-25 07:23:40 -04:00
NielsRogge
31ee80d556
Add LayoutLMv3 (#17060)
* Make forward pass work

* More improvements

* Remove unused imports

* Remove timm dependency

* Improve loss calculation of token classifier

* Fix most tests

* Add docs

* Add model integration test

* Make all tests pass

* Add LayoutLMv3FeatureExtractor

* Improve integration test + make fixup

* Add example script

* Fix style

* Add LayoutLMv3Processor

* Fix style

* Add option to add visual labels

* Make more tokenizer tests pass

* Fix more tests

* Make more tests pass

* Fix bug and improve docs

* Fix import of processors

* Improve docstrings

* Fix toctree and improve docs

* Fix auto tokenizer

* Move tests to model folder

* Move tests to model folder

* change default behavior add_prefix_space

* add prefix space for fast

* add_prefix_spcae set to True for Fast

* no space before `unique_no_split` token

* add test to hightligh special treatment of added tokens

* fix `test_batch_encode_dynamic_overflowing` by building a long enough example

* fix `test_full_tokenizer` with add_prefix_token

* Fix tokenizer integration test

* Make the code more readable

* Add tests for LayoutLMv3Processor

* Fix style

* Add model to README and update init

* Apply suggestions from code review

* Replace asserts by value errors

* Add suggestion by @ducviet00

* Add model to doc tests

* Simplify script

* Improve README

* a step ahead to fix

* Update pair_input_test

* Make all tokenizer tests pass - phew

* Make style

* Add LayoutLMv3 to CI job

* Fix auto mapping

* Fix CI job name

* Make all processor tests pass

* Make tests of LayoutLMv2 and LayoutXLM consistent

* Add copied from statements to fast tokenizer

* Add copied from statements to slow tokenizer

* Remove add_visual_labels attribute

* Fix tests

* Add link to notebooks

* Improve docs of LayoutLMv3Processor

* Fix reference to section

Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-05-24 09:53:45 +02:00
ghlai9665
7b8cb26953
Correct & Improve Doctests for LayoutLMv2 (#17168)
* add inference example to LayoutLMv2ForQuestionAnswering, passing doctest

* add loss example to LayoutLMv2ForQuestionAnswering, passing doctest

* Add correct doctest for LayoutLMv2ForTokenClassification, passing doctest

* add correct doctest for LayoutLMv2ForSequenceClassification, passing test

* add correct doctest for LayoutLMv2Model, passing test

* make fixup

* fix to address review comments

* make style

* fix doctest line break issue, add to documentaiton_tests.txt, address review comments

* move comment about layoutlmv2 dependencies to the doc page

* format doc page as suggested

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* delete extraneous backtick

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-05-23 08:02:31 -04:00
Yih-Dar
1b20c970a2
Fix ci_url might be None (#17332)
* fix

* Update utils/notification_service.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-05-18 21:49:08 +02:00