Commit Graph

1845 Commits

Author SHA1 Message Date
Maria Khalusova
0248810300
Refactoring of the text generate API docs (#21112)
* initial commit, refactoring the text generation api reference

* removed repetitive code examples

* Refactoring the text generation docs to reduce repetition

* make style
2023-01-17 12:23:48 -05:00
Maria Khalusova
d386fd646a
Add: An introductory guide for text generation (#21090)
* Part of the "text generation" rework: adding a high-level overview of the text generation strategies

* code samples update via make style

* fixed a few formatting issues

* Apply suggestions from review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fixed spaces, and switched two links to markdown

* Apply Steven's suggestions from review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* new lines after headers to fix link rendering

* review feedback addressed. added links to image captioning and audio transcription examples

* minor capitalization fix

* addressed the review feedback

* Apply suggestions from review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Applied review suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-01-17 12:23:22 -05:00
Maria Khalusova
868d37165f
Add: tensorflow example for image classification task guide (#21038)
* Added TF example for image classification

* Code style polishing

* code style polishing

* minor polishing

* fixed a link in a tip, and a typo in the inference TF content

* Apply Amy's suggestions from review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/tasks/image_classification.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* review feedback addressed

* make style

* added PushToHubCallback with save_strategy="no"

* minor polishing

* added PushToHubCallback with save_strategy=no

* minor polishing

* Update docs/source/en/tasks/image_classification.mdx

* added data augmentation

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* make style

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-01-17 12:20:08 -05:00
NielsRogge
3a9bd972e2
Add resources (#20872)
* Add resources

* Add more resources

* Remove pipeline tag

* Add more resources

* Add more resources

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2023-01-17 17:42:33 +01:00
Sayak Paul
f3feaf7f22
Change variable name to prevent shadowing (#21153)
fix: input -> input_string.
2023-01-17 11:29:23 -05:00
NielsRogge
cf028d0c3d
Add batch of resources (#20647)
* Add resources

* Add more resources

* Add more resources

* Add TAPAS

* Fix pipeline tag

* Fix pipeline tags

* Remove pipeline tag

* Remove depth-estimation tag

* Update docs/source/en/model_doc/segformer.mdx

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Apply suggestion

* Fix segformer

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
2023-01-17 17:18:56 +01:00
Sayak Paul
f30bcd5357
feat: add standalone guide on XLA support. (#21141)
* feat: add standalone guide on XLA support.

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Empty commit to trigger CI

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address PR comments.

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-17 15:07:59 +01:00
Alara Dirik
2411f0e465
Add Mask2Former (#20792)
* Adds Mask2Former to transformers

Co-authored-by: Shivalika Singh <shivalikasingh95@gmail.com>
Co-authored-by: Shivalika Singh <73357305+shivalikasingh95@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-16 20:37:07 +03:00
NielsRogge
4ed89d48ab
Add UperNet (#20648)
* First draft

* More improvements

* Add convnext backbone

* Add conversion script

* Add more improvements

* Comment out to_dict

* Add to_dict method

* Add default config

* Fix config

* Fix backbone

* Fix backbone some more

* Add docs, auto mapping, tests

* Fix some tests

* Fix more tests

* Fix more tests

* Add conversion script

* Improve conversion script

* Add support for getting reshaped undownsampled hidden states

* Fix forward pass

* Add print statements

* Comment out set_shift_and_window_size

* More improvements

* Correct downsampling layers conversion

* Fix style

* First draft

* Fix conversion script

* Remove config attribute

* Fix more tests

* Update READMEs

* Update ConvNextBackbone

* Fix ConvNext tests

* Align ConvNext with Swin

* Remove files

* Fix index

* Improve docs

* Add output_attentions to model forward

* Add backbone mixin, improve tests

* More improvements

* Update init_weights

* Fix interpolation of logits

* Add UperNetImageProcessor

* Improve image processor

* Fix image processor

* Remove print statements

* Remove script

* Update import

* Add image processor tests

* Remove print statements

* Fix test

* Add integration test

* Add convnext integration test

* Update docstring

* Fix README

* Simplify config

* Apply suggestions

* Improve docs

* Rename class

* Fix test_initialization

* Fix import

* Address review

* Fix confg

* Convert all checkpoints

* Fix default backbone

* Usage same processor as segformer

* Apply suggestions

* Fix init_weights, update conversion scripts

* Improve config

* Use Auto API instead of creating a new image processor

* Fix docs

* Add doctests

* Remove ResNetConfig dependency

* Add always_partition argument

* Fix rebaseé

* Improve docs

* Convert checkpoints

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2023-01-16 09:39:13 +01:00
Shogo Hida
7f65d2366a
Add Spanish translation to community.mdx (#21055)
* Add community to toctree

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Copy English content

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add some translations

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add some translations

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add some translations

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Fix position of community

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Fix translation

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add translation

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add translation

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add translation

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

* Add translation

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>

Signed-off-by: Shogo Hida <shogo.hida@gmail.com>
2023-01-14 09:25:05 +01:00
Steven Liu
f58248b824
Update task summary part 1 (#21014)
* first draft of new task summary

* make style

* review

* apply feedback

* apply feedbacks

* final touches
2023-01-13 11:01:53 -08:00
Steven Liu
8f796960f6
Fix header level (#21072)
fix header level
2023-01-10 10:24:10 -08:00
Sayak Paul
263fd3c4c7
add: task guide on video classification model fine-tuning. (#20827)
* add: task guide on video classification model fine-tuning.

* apply make style from hf-formatting.

* add: toc entry.

* chore: address PR comments.

Co-authored-by Maria Khalusova

* Reflect Maria's contributions.

Co-authored-by: Maria Khalusova <1065417+MKhalusova@users.noreply.github.com>

* chore: minor correction.

* Apply suggestions from code review

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>

* PyTorch Video -> PyTorchVideo.

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change licensing year.

* minor rewording.

* apply make style.

* address Sylvain's comments.

* replace links.

Co-authored-by: Maria Khalusova <1065417+MKhalusova@users.noreply.github.com>
Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-01-05 00:43:40 +05:30
Maria Khalusova
b493fee958
Add: doc page for the object detection task (#20925)
* Added Object Detection task guide (new branch)

* Polished code examples after running make style

* Update docs/source/en/tasks/object_detection.mdx

Rephrasing suggestion from Sayak

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

A rephrasing suggestion from Sayak

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

typo

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Applied reviewers suggestions
>
>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* polished code examples

* Added a visualization of the inference result. Slightly changed hyperparameters, and updated the results.

* polished code examples

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/object_detection.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Applying Steven's review suggestions

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* minor punctuation fix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-01-04 08:36:37 -05:00
Jongjyh
ce85686a1f
Add AltCLIP (#20446)
* add altclip

* update

* fix wrong title

* fix the copyright in readme

* add altclip model

* add altclip

* fix test_gradient_checkpointing_enable_disable

* code

* add return class

* add projection_state

* "fix pretrained model bug"

* delete print and fix 2 test instances.

* delete token

* rm xlmr

* one model one file.

* empty commit to trigger CI

* Fix modeling_outputs.py

* Fix __init__

* Fix quality

* Fix modeling file docstring

* Fix README.md

* Fix test file

* add vision model

* empty commit to trigger CI

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* del token in mdx file

* fix

* fix

* fix

* remove altrob from test list

* add vision test

* fix fx

* fix

* fix

* fix

* trigger CI

* fix copies

* fix tests

* fix style

* fix quality

* update

* recover import

* recover

* add ,

* recover

* fix copies

* trigger CI

* fix

* some of review

* update

* remove import

* last 2

* fix

* fix style

* fix style

* fix bug

* fix uncomment

* fix

* update

* fix

* second review

* empty commit to trigger CI

* empty commit to trigger CI

* fix position

* fix

* empty commit to trigger CI

* empty commit to trigger CI

* third comment

* Update docs/source/en/model_doc/altclip.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update docs/source/en/model_doc/altclip.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/configuration_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/modeling_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/processing_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/modeling_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix merge

* fix copies

* update

* update

* empty commit to trigger CI

* fix code example

* empty commit to trigger CI

* fix

* empty commit to trigger CI

* empty commit to trigger CI

Co-authored-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: shunxing1234 <33774367+shunxing1234@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2023-01-04 09:18:57 +01:00
Alara Dirik
cd2457809f
Improve OWL-ViT postprocessing (#20980)
* add post_process_object_detection method

* style changes
2023-01-03 19:25:09 +03:00
NielsRogge
9c6f7485a6
Add GIT (GenerativeImage2Text) (#20295)
* First draft

* Make model instantiation work

* Fix copied from statement

* More fixes

* Add correct output head

* Improve configuration

* Add conversion script

* Improve conversion script

* Remove token_type_ids

* Fix conversion of projection layers

* Convert all weights

* Use cats image

* Make logits match

* Generate caption on cats image

* Add GITProcessor

* Update conversion script

* Add support for more checkpoints

* Fix conversion script

* Add initial tests

* Remove cross-attention

* More improvements

* Remove is_decoder

* Improve model tests

* Improve tests

* Improve model outputs

* Fix model outputs equivalence

* Fix more tests

* Remove unused code

* Use generate to generate text, no use of cache for now

* Use generate more appropriately

* Fix config tests

* Fix style

* Add support for use_cache

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix style

* Fix GIT vision encoder

* Update README

* Fix integration test

* Set bos and eos token ids

* Improve docs

* Improve code

* Add support for provided attention_mask

* Add copied from statement

* Fix gradient checkpointing test

* Set model_input_names

* Investigate model_input_names

* Remove script

* Fix model inputs

* Fix docstring

* Rename GIT to Git

* Support more models

* Add support for textvqa model

* Add video support

* Extend conversion script for video

* Add support for large variant

* Add support for more models

* Fix config archive map

* Update integration test

* Fix README

* Fix CLIP mean and std

* Update processor

* Fix use_cache for video, thanks @gante

* Remove print statements

* Remove assertion

* Add processor tests

* Fix model_input_names

* Use Auto API for processor

* Fix processor tests

* Fix integration test

* Fix pipeline test

* Make tests faster

* Update conversion script

* Update conversion script

* Convert more checkpoints

* Update conversion script

* Fix typo

* Update docstrings

* Improve code snippets

* Fix doc tests

* Add more code examplesé

* Fix doc tests

* Add integration tests

* Fix unused variable

* revert

* Add GIT to Japanese README

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-03 14:17:18 +01:00
Konstantin Kotik
367fdf3330
MinNewTokensLengthLogitsProcessor for .generate method #20814 (#20892)
* feat: add min new length logit processor

* test: add min new length logit processor

* docs: add MinNewTokensLengthLogitsProcessor

* feat: import MinNewTokensLengthLogitsProcessor

* fix: update pytorch dummy objects

* refactor & fix: rename attributes and var and get rid of dynamic attribute

* tests: align test with new interface

* docs: fix typo

* docs: minor clarification

* Empty-Commit

* empty commit

* run automated quality edits

Co-authored-by: Joao Gante <joao@huggingface.co>
2023-01-03 06:29:02 -05:00
Alex Hedges
0b686a8a1e
Remove non-breaking spaces (#20929)
* Remove non-breaking space in comment

It was likely added unintionally.

* Remove remaining non-breaking spaces
2022-12-29 02:12:40 -05:00
Yih-Dar
5fa0b17c3d
[Past CI] 🔥 Leave Past CI failures in the past 🔥 (#20861)
* torch.jit._state

* Fix past CI

* Fix for perceiver

* Fix REALM

* Fix for Bloom

* Fix for SwinMode

* Fix for TrajectoryTransformerModel

* Fix for test_wav2vec2_with_lm

* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-27 18:37:25 +01:00
Eli Simhayev
e35bc46af6
fix docs typos in "add_new_model" (#20900)
fix Jupyter typos
2022-12-27 02:49:15 -05:00
Kamal Raj Kanakarajan
d1b3011292
Update flan-t5 original model link (#20897)
Update flan-t5.mdx
2022-12-27 02:26:14 -05:00
Nathan Barry
47146721b8
typo fix (#20891) 2022-12-26 02:06:23 -05:00
Syed Abdul Gaffar Shakhadri
15bc776fec
Add Onnx Config for PoolFormer (#20868)
poolformer onnx

Co-authored-by: syed <syed.abdul@sandlogic.com>
2022-12-23 01:30:57 -05:00
Maria Khalusova
04c560225b
Adding evaluate to the list of libraries required in generated notebooks (#20850)
Adding `evaluate` to the list of libraries to be installed for every generated notebook in transformers
2022-12-21 14:04:08 +01:00
Younes Belkada
0d284bd574
Add BLIP (#20716)
* add new model like

* add v1

* v1

* v1

* vision encoder logits match

* v2

* fix

* add docstring

* CI tests pass

* fix tests

* make fixup

* add to `toctree`

* fix processors

* fix processors

* fix doc

* fill title

* add content doc

* remove from tokenization auto

* fix config

* change order

* add `# Copied from`

* few fixes

- add correct license on modeling text
- remove dummy argument

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* replace name

* refactor a bit

* more refactor

* remove unused arg

* make fixup + remove some `# Adapted from ...`

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* more `# Copied from`

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* now `generate` supports no prefix

* remove `FeatureExtractor`

* fix path

* correct dependency

* fix tests

* few fixes

* add integration tests

* add correct conversion script

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add `blip` to tokenization auto

* fix docstrings

* fix test + add image

* remove processor from uncorrect place

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* clean up a bit

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* clean pixel mask

* clean pixel mask

* fix `F`

* Update src/transformers/models/blip/modeling_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix output

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix pad token id

* remove `token_type_ids`

* make fixup

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* make fixup

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add comments

* Update src/transformers/models/blip/modeling_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* remove `token_type_ids`

* make fixup

* better name

* replace with `image_attention_mask`

* refactor

* make fixup

* better docstring

* replace `answer_xx`

* remove ununsed args

* add `labels`

* add `labels`

* fix processing tests

* make fixup

* make fixup

* put correct repo

* remove `pad`

* remove `crop` and `center_crop`

* Update src/transformers/models/blip/image_processing_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix

* remove `size_divisor`

* fix weights `init`

* remove unneeded functions

* add suggestions

* minor changes

- change slow test output for PT 1.13
- docstring order

* replace `feature_extractor` by `image_processor`

* fix doctests

* fix weight init order + add fp16 slow test

* add `blip` to doctest

* add correct repo name and fix test

* Update src/transformers/models/blip/processing_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix tests

* use `convert_to_rgb` from `image_transforms`

* make fixup

* fix large loading issue

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-21 09:39:10 +01:00
Steven Liu
3be028bc9d
Embed circle packing chart for model summary (#20791)
* embed circle packing chart

* trim whitespace from bottom

* explain bubble sizes
2022-12-20 10:26:52 -08:00
stanleycai95
bdb84e2bad
Add model resources for ViT (#20723)
* Set up overall resources documentation structure

* Update vit.mdx

* Removing irrelevant sections on text models

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx

* Update vit.mdx
2022-12-19 10:59:34 -08:00
Andreas Madsen
b4b613b102
Implement Roberta PreLayerNorm (#20305)
* Copy RoBERTa

* formatting

* implement RoBERTa with prelayer normalization

* update test expectations

* add documentation

* add convertion script for DinkyTrain weights

* update checkpoint repo

Unfortunately the original checkpoints assumes a hacked roberta model

* add to RoBERTa-PreLayerNorm docs to toc

* run utils/check_copies.py

* lint files

* remove unused import

* fix check_repo reporting wrongly a test is missing

* fix import error, caused by rebase

* run make fix-copies

* add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS

* Fix documentation <Facebook> -> Facebook

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup: Fix documentation <Facebook> -> Facebook

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing Flax header

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* expected_slice -> EXPECTED_SLICE

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update copies after rebase

* add missing copied from statements

* make fix-copies

* make prelayernorm explicit in code

* fix checkpoint path for the original implementation

* add flax integration tests

* improve docs

* update utils/documentation_tests.txt

* lint files

* Remove Copyright notice

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fix-copies

* Remove EXPECTED_SLICE calculation comments

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-19 09:30:17 +01:00
NielsRogge
26dd041c6e
Add Swin2SR (#19784)
* First draft

* Add more improvements

* Improve forward pass

* Fix layernorm

* Add upscaler

* More improvements

* More improvements

* More improvements

* Improve conversion script

* Add preprocessing

* Make output match original implementation

* Add additional attributes

* Add support for more models

* Support more models

* Add support for real world sr

* Add initial Swin2SRFeatureExtractor

* Add ImageSuperResolutionOutput

* Make more tests pass

* Use BaseModelOutput

* Fix one more test

* Fix more tests

* Fix another test

* Fix all tests

* Rename to Swin2SRImageProcessor

* Fix toctree

* Fix toctree

* Fix rebase

* Improve Swin2SRImageProcessor

* Remove feature extractor file

* Improve model

* Improve conversion script

* Fix integration test

* Fix init

* Fix conversion script

* Address comments

* Improve upsampler

* Add NearestConvUpsampler

* Improve pixel shuffle upsampler

* Improve auxiliary upsampler

* Improve conversion script

* Rename conv_last to final_convolution

* Fix rebase

* Improve upsample module

* Add padding to image processor

* Fix bug

* Update padding

* Remove print statement and fix integration test

* Improve docs

* Add image processor tests

* Convert all checkpoints, fix testsé

* Remove print statements

* Fix import

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-16 16:24:01 +01:00
NielsRogge
7f99861218
Add Universal Segmentation class + mapping (#20766)
* Add mapping

* Add mapping to pipeline

* Apply suggestions

* Fix feature extractor tests

* Use ForInstance, add model to universal mapping

* More fixes

* Remove model from deprecated objectsé

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-16 14:22:46 +01:00
Joao Gante
4bc723f87d
Generate: use GenerationConfig as the basis for .generate() parametrization (#20388)
* generate from config mvp

* fix failing tests

* max_time test

* Load default gen config at model load time; Update docs

* further documentation; add tests

* adapt rag to the new structure

* handle models not instantiated with from_pretained (like in tests)

* better default generation config

* add can_generate fn

* handle legacy use case of ad hoc model config changes

* initialize gen config from config in individual methods, if gen config is none

* fix _get_decoder_start_token_id when called outside GenerationMixin

* correct model config load order (set attr > model config > decoder config)

* update rag to match latest changes

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* load gen config from model config in model.from_pretrained

* fix can_generate fn

* handle generate calls without a previous from_pretrained (e.g. tests)

* add legacy behavior (and a warning)

* lower logger severity

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-15 18:27:20 +00:00
Nicolas Patry
ba9da49aa2
Fixing the pipeline tutorial test (#20746)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-13 19:08:30 +01:00
Hazrul Akmal
f28c918c7e
Add docs xlm roberta (#20742)
* added model resources for xlm-roberta

* added model resources for xlm-roberta

* resolve suggested changes

* add resources to xlm-roberta
2022-12-13 09:25:55 -08:00
Ariel Ekgren
5f94855dc3
Add gpt-sw3 model to transformers (#20209)
* Add templates for gpt-sw3

* Add templates for gpt-sw3

* Added sentencepiece tokenizer

* intermediate commit with many changes

* fixed conflicts

* Init commit for tokenization port

* Tokenization progress

* Remove fast tokenizer

* Clean up and rename spm.model -> spiece.model

* Remove TF -> PT conversion script template, Clean up Megatron -> PT script

* Optimize encode & decode performance

* added new attention

* added new attention

* attention for gpt-sw3 working

* attention good

* Cache is now working

* fixed attention mask so that it works with causal attention

* fixed badbmm bug for cpu and caching

* updated config with correct parameters

* Refactor and leave optimizations as separate functions to avoid breaking expected functionality

* Fix special tokens mapping for both tokenizers

* cleaning up of code and comments

* HF compatible attention outputs

* Tokenizer now passing tests, add documentation

* Update documentation

* reverted back to base implementation after checking that it is identical to pretrained model

* updated gpt-sw3 config

* updated conversion script

* aligned parameters with gpt-sw3 config

* changed default scale_attn_by_inverse_layer_idx to true

* removed flag from conversion script

* added temporary model path

* reverted back to functioning convert script

* small changes to default config

* updated tests for gpt-sw3

* make style, make quality, minor cleanup

* Change local paths to testing online repository

* Change name: GptSw3 -> GPTSw3

* Remove GPTSw3TokenizerFast references

* Use official model repository and add more model sizes

* Added reference to 6.7b model

* Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel

* Remove pointers to non-existing TFGPTSw3

* Add GPTSw3 to docs/_toctree.yml

* Remove TF artifacts from GPTSw3 in __init__ files

* Update README:s with 'make fix-copies'

* Add 20b model to archive list

* Add documentation for GPT-Sw3

* Fix typo in documentation for GPT-Sw3

* Do 'make fix-copies' again after having updated docs

* Fix some typos in docs

* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Resolve comments from PR feedback

* Resolve more comments from PR feedback, also set use_cache=True in convert script

* Add '# Copied from' comments for GPTSw3 modeling

* Set 'is_parallelizable = False'

* Remove '# Copied from' where code was modified and add 'with x->y' when appropriate

* Remove parallelize in mdx

* make style, make quality

* Update GPTSw3Config default values and corresponding documentation

* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available

* Make style, make quality

* Add dummy object for GPTSw3Tokenizer via 'make fix-copies'

* make fix-copies

* Remove GPTSw3 modeling classes

* make style, make quality

* Add GPTSw3 auto-mappings for other GPT2 heads

* Update docs/source/en/model_doc/gpt-sw3.mdx

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove old TODO-comment

* Add example usage to GPTSw3Tokenizer docstring

* make style, make quality

* Add implementation details and example usage to gpt-sw3.mdx

Co-authored-by: JoeyOhman <joeyoh@kth.se>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-12 13:12:13 -05:00
Matt
c1b9a11dd4
Convert tokenizer outputs for Keras in doc example (#20732)
* Convert tokenizer outputs for Keras in doc example

* Das deutsche Beispiel auch korrigieren
2022-12-12 16:14:04 +00:00
Juanjo do Olmo
0ba94aceb6
Spanish translation of the file debugging.mdx (#20566)
* Create and translate to Spanish debugging.mdx

* solved typo error in a header

* Update debugging.mdx

* Update debugging.mdx

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update _toctree.yml

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-12 10:38:56 -05:00
stanleycai95
17c742bbf5
Very small edit to change name to OpenAI GPT (#20722) 2022-12-12 09:43:43 -05:00
Alberto Mario Ceballos-Arroyo
8286af6f54
Spanish translation of asr.mdx and add_new_pipeline.mdx (#20569)
* Fix minor typo in question_answering.mdx

* Fixes minor typo in the english version of tasks/asr.mdx

* Update _toctree.yml

* Translate add_new_pipeline.mdx into Spanish

* Fixes some typos in the English version of add_new_pipeline.mdx

* Translate asr.mdx into Spanish

* Fixes small typos in add_new_pipeline.mdx

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero: use "biblioteca" instead of "librería."

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Suggestion by @osanseviero.

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero.

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero.

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2022-12-12 09:23:23 -05:00
Sylvain Gugger
799cea64ac
Fix rendering issue in quicktour (#20708)
* Fix rendering issue in quicktour

* Separate in two blocks
2022-12-09 13:51:35 -05:00
Michael Benayoun
6a062a3ed9
Change transformers.onnx to use optimum.exporters.onnx (#20529)
* Change transformers.onnx to use optimum.exporters.onnx

* Update doc

* Remove print

* Fix transformers.onnx cli

* Update documentation

* Update documentation

* Small fixes

* Fix log message

* Apply suggestions

* Update src/transformers/onnx/__main__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions

* Add missing line break

* Ran make fix-copies

* Update src/transformers/onnx/__main__.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update src/transformers/onnx/__main__.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: Michael Benayoun <michael@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-12-09 10:42:02 +01:00
Nathan Raw
9e56aff58a
Add video classification pipeline (#20151)
* 🚧 wip video classification pipeline

* 🚧 wip - add is_decord_available check

* 🐛 add missing import

*  add tests

* 🔧 add decord to setup extras

* 🚧 add is_decord_available

*  add video-classification pipeline

* 📝 add video classification pipe to docs

* 🐛 add missing VideoClassificationPipeline import

* 📌 add decord install in test runner

*  fix url inputs to video-classification pipeline

*  updates from review

* 📝 add video cls pipeline to docs

* 📝 add docstring

* 🔥 remove unused import

* 🔥 remove some code

* 📝 docfix
2022-12-08 16:22:43 -05:00
Sylvain Gugger
9cc65f8701
Migrate torchdynamo to torch.compile (#20634)
* Migrate torchdynamo to torch.compile

* Add docstring and generic option

* Properly use the function...

* Reorg args
2022-12-08 11:18:52 -05:00
Cole Howard
fc95386ea1
Add TFBartForSequenceClassification (#20570)
* read to load

* base functionality

* revert init

* fix dummy data

* moving right along

* moving right along

* finally

* cleanup

* pull out comment

* add test

* update docstring for main class

* flake comments and rewriting copies from make repo-consistency`

* remove irrelevant differences/accidental spaces

* put copies back after space removals

* mid

* final test pass

* stray comment

* update test file

* update test file

* fixup

* black

* missed

* black missed one more

* sytle

* add doc update

* fix order of output class

* comment

* Revert "comment"

This reverts commit 03f86b6948.

* remove redundant function, and redundant reshape

* move change out of common

* style

* put common spaces back

* reorder kwargs in output

* doc style
2022-12-07 18:05:39 +01:00
NielsRogge
d151a8c550
Add BiT + ViT hybrid (#20550)
* First draft

* More improvements

* Add backbone, first draft of ViT hybrid

* Add AutoBackbone

* More improvements

* Fix bug

* More improvements

* More improvements

* Convert ViT-hybrid

* More improvements

* add patch bit

* Fix style

* Improve code

* cleaned v1

* more cleaning

* more refactoring

* Improve models, add tests

* Add docs and tests

* Make more tests pass

* Improve default backbone config

* Update model_type

* Fix more tests

* Add more copied from statements

* More improvements

* Add push to hub to conversion scripts

* clean

* more cleanup

* clean

* replace to

* fix

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix base model prefix

* more cleaning

* get rid of stem

* clean

* replace flag

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add check

* another check

* fix for hybrid vit

* final fix

* update config

* fix class name

* fix `make fix-copies`

* remove `use_activation`

* Update src/transformers/models/bit/configuration_bit.py

* rm unneeded file

* Add BiT image processor

* rm unneeded file

* add doc

* Add image processor to conversion script

* Add ViTHybrid image processor

* Add resources

* Move bit to correct position

* Fix auto mapping

* Rename hybrid to Hybrid

* Fix name in toctree

* Fix READMEs'

* Improve config

* Simplify GroupNormActivation layer

* fix test + make style

* Improve config

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove comment

* remove comment

* replace

* replace

* remove all conv_layer

* refactor norm_layer

* revert x

* add copied from

* last changes + integration tests

* make fixup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix name

* fix message

* remove assert and refactor

* refactor + make fixup

* refactor - add  + sfety checker

* fix docstring + checkpoint names

* fix merge issues

* fix function name

* fix copies

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix model checkpoint

* fix doctest output

* vit name on doc

* fix name on doc

* fix small nits

* fixed integration tests

* final changes - slow tests pass

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-07 11:03:39 +01:00
Samuel Xu
e842e181df
Documentation fixes (#20607) 2022-12-06 07:32:46 -05:00
Nicolas Patry
28f3d431d4
Rework the pipeline tutorial (#20437)
* [WIP] Rework the pipeline tutorial

- Switch to `asr` instead of another NLP task.
- It also has simpler to understand results.
- Added a section with interaction with `datasets`.
- Added a section with writing a simple webserver.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Addressing comments.

* Links.

* Fixing docs format.

* Adding pipeline_webserver to _toctree.

* Warnig -> Tip warnings={true}.

* Fix link ?

* Links ?

* Fixing link, adding chunk batching.

* Oops.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/pipeline_tutorial.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2022-12-06 10:47:31 +01:00
Steven Liu
720e9599c1
Split autoclasses on modality (#20559)
* split autoclasses on modality

* apply review

* auto classes
2022-12-05 12:28:44 -08:00
Steven Liu
7d1c1c5b21
Fix code sample in preprocess (#20561)
* change to image_processor

* apply review
2022-12-05 11:49:43 -08:00
Francisco Kurucz
ac3bccdc74
Fix link to Swin Model contributor novice03 (#20557) 2022-12-05 11:42:29 -05:00
Erin
87282cb73c
Add RemBERT ONNX config (#20520)
* rembert onnx config

* formatting

Co-authored-by: Ho <erincho@bcd0745f972b.ant.amazon.com>
2022-12-05 11:39:09 -05:00
Kamal Raj Kanakarajan
13e736685a
Add BioGPT (#20420)
* biogpt initial commit

* updated init

* fix faster decoding with use_cache

* 1. fix input_ids and input_embeds with correct device
2. added _keys_to_ignore_on_load_missing
3. updated prepare_inputs_for_generation

* add activation_dropout and scale_embedding

* replace fsmt attention with bart attention

* added test

* run make fix-copies

* doc init and fix build

* updated README with proper information

* 1. added tips to docs
2. updated BioGptTokenizer func

* 1. added tokenizer test
2. refactor tokenizer

* make fixup

* add biogpt fairseq to hf converter

* updated layer names more
similar to original checkpoints

* config update doc string and set defaults

* added "#copied" from bart model and
updated doc strings

* enable model_input_names in tokenizer

* 1.  positionalembedding depending on attention_mask
2. added attention mask to prepare for generation

* added test to verify past and generation

* BioGptLMHeadModel -> BioGptForCausalLM

* fix typo

* tokenization and test
Copyright and updated assertion

* updated Copyright and
one func at time in line

* Copyright updates and
minor doc fix

* replace assertion with ValueError

* rm extra space

* added code syntax

* revert cmnt position change

* add tokenizer to auto

* updated doc string

* tokenizer doc string update

* biogpt hub model update to microsoft/biogpt

* make fixup

* rm cmnt to fix flake8 5.0.4 vs 6 error
2022-12-05 10:12:03 -05:00
szhublox
699e90437f
flan-t5.mdx: fix link to large model (#20555) 2022-12-02 19:27:46 +01:00
fatih
cc3d0e1b01
[New Model] Add TimeSformer model (#18908)
* init timesformer

* apply fix-copies

* reformat style

* revert back some incoorect style updates

* init timesformer

* apply fix-copies

* reformat style

* revert back some incoorect style updates

* update timseformer doc

* add some functions and classes

* add new config params

* implement multiple classes

* update TimeSformerLayer

* update TimeSformerModel, TimeSformerPreTrainedModel, TimeSformerEncoder

* several fixes

* reformat

* temporary update

* fix some typos

* fix weight converter

* more fixes

* fix a typo

* fix typo

* remove redundant params

* fix for latest hf-hub

* merge fix

* fix some checks

* video classification works with einops

* add paper info to docs

* merge fix

* remove redundant line

* remove redundant docstring

* update config

* fix some typos

* fix converter

* update some test constants

* refactor einops functions

* reformat

* fix a comment

* remove redundat imports

* reformat

* fix a typo

* remove comment

* remove unused imports

* remove redundant doc line

* reformat

* add missing line

* fix docs

* fix timesformer auto feat ext

* add unittests

* reformat

* fix docs

* some fixes and updates

* fix readme

* fix modeling

* fix readme

* update index

* revert _toctree.yml changes

* update timseformer.mdx

* update drop_path_prob to drop_path_rate

* add dosctring for drop_path_rate

* update TimeSformerPatchEmbed naming

* remove to_2tuple

* explicit use of nn.functional

* reformat

* many updates from review comments

* fix a typo

* reformat

* remove assert, better variable name

* make variable names more explicit

* add some adapted from

* more explicit variable names

* remove redundant docstring

* fix initilaization

* move permute inside embedding

* update class names

* remove unused imports

* add test for video classification

* update PretrainedModel with PreTrainedModel

* remove double permute

* update based on sylvain's review

* aply auto fix

* update image_processing_auto for timesformer

* update hub urls

* reformat

* remove duplicate import

* update doc link
2022-12-02 09:13:25 +01:00
Younes Belkada
8b486c0310
add doc for (#20525) 2022-12-01 16:52:13 +01:00
Yang An
721764028e
Add Chinese-CLIP implementation (#20368)
* init chinese-clip model from clip

* init model tests and docs

* implement chinese-clip into hf

* implement chinese-clip into hf

* implement chinese-clip into hf

* implement chinese-clip into hf

* implement chinese-clip into hf

* update usecase example in model implementation

* fix codestyle

* fix model_type typo in readme

* add placeholder in doc

* add placeholder in doc

* update the init script

* update usecase

* fix codestyle

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* forward the convert_rgb

* update testcase

* update testcase

* update testcase

* merge the recent update from clip about model_input_name property

* update the doc

* update the doc

* update the doc

* update the doc

* remove unused imports

* reformat code style

* update the doc

* fix isort style

* bypass a weird failed unit test which is unrelated with my PR

* update the doc

* implement independent vision config class

* implement independent vision model class

* fix refactor bug

* fix refactor bug

* fix refactor bug

* make style

* fix refactor bug

* make style

* fix refactor bug

* fix refactor bug

* make style

* fix refactor bug

* fix refactor bug

* doc-build restyle

* implement independent text config class

* implement independent text model class

* implement independent text model class

* make style

* make fix-copies

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* make style

* update doc

* black and isort

* update doc

* Update src/transformers/models/chinese_clip/configuration_chinese_clip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* modify the model type from chinese-clip to chinese_clip

* format the example comment of ChineseCLIPVisionConfig

* correct the copyright comment

* fix the tokenizer specification

* add copied from for loss function

* remove unused class

* update CHINESE_CLIP_TEXT_INPUTS_DOCSTRING

* update CHINESE_CLIP_INPUTS_DOCSTRING

* update doc

* update doc

* update code comment in config

* update copied from statement

* make style

* rename the doc file

* add copied statement

* remove unused attention_mask, causal_attention_mask in ChineseCLIPVisionEncoder

* remove ChineseCLIPTextPreTrainedModel

* fix bug

* fix bug

* fix bug

* update doc

* make style

* Update src/transformers/models/chinese_clip/configuration_chinese_clip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/chinese_clip/configuration_chinese_clip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update ChineseCLIPImageProcessor in image_processing_auto

* fix config_class of chinesecliptextmodel

* fix the test case

* update the docs

* remove the copied from comment for ChineseCLIPTextModel, since it has diverged from BertModel with customed config_class

* update the testcase

* final fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-30 19:22:23 +01:00
Sylvain Gugger
08b4621899
Repurpose torchdynamo training args towards torch._dynamo (#20498)
* Repurpose torchdynamo training args towards torch._dynamo

* Add doc
2022-11-30 11:10:45 -05:00
Julian Pollmann
829374e4fc
Fix Typo in Docs for GPU (#20509) 2022-11-30 10:41:18 -05:00
amyeroberts
17a7b49bda
Update doc examples feature extractor -> image processor (#20501)
* Update doc example feature extractor -> image processor

* Apply suggestions from code review
2022-11-30 14:50:55 +00:00
amyeroberts
de6d19ea92
Add segmentation + object detection image processors (#20160)
* Add transforms for object detection

* DETR models + Yolos

* Scrappy additions

* Maskformer image processor

* Fix up; MaskFormer tests

* Update owlvit processor

* Add to docs

* OwlViT tests

* Update pad logic

* Remove changes to transforms

* Import fn directly

* Update to include pad transformation

* Remove uninstended changes

* Add new owlvit post processing function

* Tidy up

* Fix copies

* Fix some copies

* Include device fix

* Fix scipy imports

* Update _pad_image

* Update padding functionality

* Fix bug

* Properly handle ignore index

* Fix up

* Remove defaults to None in docstrings

* Fix docstrings & docs

* Fix sizes bug

* Resolve conflicts in init

* Cast to float after resizing

* Tidy & add size if missing

* Allow kwards when processing for owlvit

* Update test values
2022-11-30 10:24:03 +00:00
Francisco Kurucz
4aa630eeab
Fix documentation code to import facebook/detr-resnet-50 model (#20491) 2022-11-29 13:30:26 -05:00
Pi Esposito
fb2b45e562
add in layer gpt2 tokenizer (#20421)
* add minimal working gpt2 tokenizer

* graph mode and output equivalence tests working

* not today tensorflow. serialization test passing!

* fix style, documentation, docstrings and all that jazz

* passing consistency checks

* move keras nlp to tf dependencies

* fix tf modeling utils and gpt2 attention to enable compiling

* fix (I hope) keras nlp dependencies

* rever changes on generation

* remove debug prints

* remove redundant tf dummy objects

* add from config, get config and max length settings to address review

* let flake ignore the error on distillation you are welcome

* test from config

* add padding test

* address sgugger review
2022-11-29 10:02:40 -05:00
amyeroberts
ae1cffaf3c
Add Donut image processor (#20425)
* Add Donut image processor

* Update src/transformers/image_transforms.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Fix docstrings

* Full var names in docstring

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
2022-11-29 10:38:01 +00:00
NielsRogge
6dc884abc8
[Maskformer] Add MaskFormerSwin backbone (#20344)
* First draft

* Fix backwards compatibility

* More fixes

* More fixes

* Make backbone more general

* Improve backbone

* Improve test

* Fix config checkpoint

* Address comments

* Use model_type

* Address more comments

* Fix special model names

* Remove MaskFormerSwinModel and MaskFormerSwinPreTrainedModel from main init

* Fix typo

* Update backbone

* Apply suggestion

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-28 20:33:49 +01:00
Sayak Paul
d59d5a618b
chore: add link to the video cls notebook. (#20386)
* chore: add link to the video cls notebook.

* chore: segregate as resources.
2022-11-28 12:10:24 -05:00
Wang, Yi
ca3b652bbd
update cpu related doc (#20444) 2022-11-28 08:54:35 -05:00
Ian C
bc00c29d11
Add Spanish translation of pr_checks.mdx (#20339)
* Update _toctree and clone original doc

* Forgot to translate (lol)

* Translate documentation and update toctree

* Add suggested changes from review
2022-11-23 15:06:29 -05:00
Ian C
c3eb01013b
Fix toctree for Section 3 in Spanish Documentation (#20360)
* Order and group topics in the right section

* Translate "Computer Vision"
2022-11-21 16:44:34 -05:00
Steven Liu
d896029e27
Add inference section to task guides (#18781)
* 📝 start adding inference section to task guides

*  make style

* 📝 add multiple choice

* add rest of inference sections

* make style

* add compute_metric, push_to_hub, pipeline

* make style

* add updated sequence and token classification

* make style

* make edits in token classification

* add audio classification

* make style

* add asr

* make style

* add image classification

* make style

* add summarization

* make style

* add translation

* make style

* add multiple choice

* add language modeling

* add qa

* make style

* review and edits

* apply reviews

* make style

* fix call to processor

* apply audio reviews

* update to better asr model

* make style
2022-11-21 10:06:21 -08:00
NielsRogge
4973d2a04c
Add Audio Spectogram Transformer (#19981)
* First draft

* Make conversion script work

* Add id2label mapping, run code quality

* Fix copies

* Add first draft of feature extractor

* Update conversion script to use feature extractor

* Make more tests pass

* Add docs

* update input_features to input_values + pad by default to max length

* Fix doc tests

* Add feature extractor tests

* Add proper padding/truncation to feature extractor

* Add support for conversion of all audioset checkpoints

* Improve docs and extend conversion script

* Fix README

* Rename spectogram to spectrogram

* Fix copies

* Add integration test

* Remove dummy conv

* Update to ast

* Update organization

* Fix init

* Rename model to AST

* Add require_torchaudio annotator

* Move import of ASTFeatureExtractor under a is_speech_available

* Fix rebase

* Add pipeline config

* Update name of classifier head

* Rename time_dimension and frequency_dimension for clarity

* Remove print statement

* Fix pipeline test

* Fix pipeline test

* Fix index table

* Fix init

* Fix conversion script

* Rename to ForAudioClassification

* Fix index table

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-21 18:58:54 +01:00
NielsRogge
96783e53b4
Add resources (#20296)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-21 18:24:32 +01:00
Matthijs Hollemans
d21c97cc0f
add MobileNetV1 model (#17799)
* add model files etc for MobileNetV2

rename files for MobileNetV1

initial implementation of MobileNetV1

fix conversion script

cleanup

write docs

tweaks

fix conversion script

extract hidden states

fix test cases

make fixup

fixup it all

remove main from doc link

fixes

fix tests

fix up

use google org

fix weird assert

* fixup

* use google organization for checkpoints
2022-11-21 10:21:28 -05:00
Raj Rajhans
22d7161a52
fix: "BigSicence" typo in docs (#20331) 2022-11-21 09:44:54 -05:00
Ian C
d28448c5cd
Add Spanish translation of serialization.mdx (#20245)
* Update _toctree and clone original content

* Translate first three sections

* Add more translated chapters. Only 3 more left.

* Finish translation

* Run style from doc-builder

* Address recommended changes from reviewer
2022-11-21 08:46:54 -05:00
BFSS
05d80d856c
translate zh quicktour(#20095) (#20181)
* zh quicktour(#20095)

* add zh to doc workflow

* remove untranslation from toctree

Co-authored-by: BeifangSusu <BeifangSusu@bfss.com>
2022-11-21 08:44:18 -05:00
Joao Gante
3de07473da
Generate: add generation config class (#20218)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-11-21 13:30:15 +00:00
Steven Liu
d316037ad7
organize pipelines by modality (#20306) 2022-11-18 12:06:25 -08:00
Ali Hassani
fc4a993e1b
Add Neighborhood Attention Transformer (NAT) and Dilated NAT (DiNAT) models (#20219)
* Add DiNAT

* Adds DiNAT + tests

* Minor fixes

* Added HF model

* Add natten to dependencies.

* Cleanup

* Minor fixup

* Reformat

* Optional NATTEN import.

* Reformat & add doc to _toctree

* Reformat (finally)

* Dummy objects for DiNAT

* Add NAT + minor changes

Adds NAT as its own independent model + docs, tests
Adds NATTEN to ext deps to ensure ci picks it up.

* Remove natten from `all` and `dev-torch` deps, add manual pip install to ci tests

* Minor fixes.

* Fix READMEs.

* Requested changes to docs + minor fixes.

* Requested changes.

* Add NAT/DiNAT tests to layoutlm_job

* Correction to Dinat doc.

* Requested changes.
2022-11-18 13:08:26 -05:00
amyeroberts
b98269425e
Add padding image transformation (#19838)
* Add padding transformation

* Add in upstream changes

* Update tests & docs

* Code formatting tuples in docstring
2022-11-18 11:27:21 +00:00
Shogo Hida
389702242d
[Docs] Add resources of OpenAI GPT (#20084)
* Add resources of OpenAI GPT

* Delete Deploy section and add .

* Add scripts

* Update docs/source/en/model_doc/openai-gpt.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Delete causal-language-modeling section

* Add TFOpenAIGPTLMHeadModel

* Add resources from community

* Delete a link

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2022-11-16 11:17:32 -05:00
Alara Dirik
a00b7e85ea
Adds image-guided object detection support to OWL-ViT (#20136)
Adds image-guided object detection method to OwlViTForObjectDetection class as described in the original paper. One-shot/ image-guided object detection enables users to use a query image to search for similar objects in the input image.

Co-Authored-By: Dhruv Karan k4r4n.dhruv@gmail.com
2022-11-16 09:07:46 +03:00
Ambuj Pawar
c19aa7acce
Add clip resources to the transformers documentation (#20190)
* WIP: Added CLIP resources from HuggingFace blog

* ADD: Notebooks documentation to clip

* Add link straight to notebook

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Change notebook links to colab

Co-authored-by: Ambuj Pawar <your_email@abc.example>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2022-11-15 13:26:46 -05:00
Saad Mahmud
5b62f8ea2b
Add to DeBERTa resources (#20155)
* Add to DeBERTa resources

* Fix mistakes with chapter number

* Add fill-mask pipeline

* Add sequence, token and QA pipeline

* Change token classification pipeline order

* Remove flax script and notebook links
2022-11-15 13:26:07 -05:00
Suraj Patil
7f74433814
[CLIP] allow loading projection layer in vision and text model (#18962)
* allow loading projection in text and vision model

* begin tests

* finish test for CLIPTextModelTest

* style

* add slow tests

* add new classes for projection heads

* remove with_projection

* add in init

* add in doc

* fix tests

* fix some more tests

* fix copies

* fix docs

* remove leftover from fix-copies

* add the head models in IGNORE_NON_AUTO_CONFIGURED

* fix docstr

* fix tests

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstr for models

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-11-15 17:50:07 +01:00
Muhammad Sakib Khan Inan
777b1bfe62
New logging support to "Trainer" Class (ClearML Logger) (#20184)
* Init Update

* ClearML Callbacks integration

* update corrections

* args reporting updated

* {'tensorboard': False, 'pytorch': False}

* ClearML Tests added

* add clearml

* output_uri=True in Task.init

* reformatted integrations.py

* reformatted and fixed

* IF-ELSE statement issue on "has_clearml" resolved

* Add clearml in main callback docs

* Add additional clearml documentation

* Update src/transformers/integrations.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Accept suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Accept suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Small change in comments

* Make style clearml

* Accept suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Victor Sonck <victor.sonck@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-11-15 10:08:59 -05:00
Kendall
683cbc4c34
fixed spelling error in testing.mdx (#20220) 2022-11-15 09:40:06 -05:00
amyeroberts
4c7e8d0900
Add object detection + segmentation transforms (#20003)
* Add transforms for object detection

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Better var names & docstring

* Remove unused var desc in docstring

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-11-15 12:50:03 +00:00
Younes Belkada
163ac3d3ee
Add Switch transformers (#19323)
* first commit

* add more comments

* add router v1

* clean up

- remove `tf` modeling files

* clean up

- remove `tf` modeling files

* clean up

* v0 routers

* added more router

- Implemented `ExpertsChooseMaskedRouter`

- added tests
- 2 more routers to implement

* last router

* improved docstring

- completed the docstring in `router.py`
- added more args in the config

* v0 sparse mlp

* replace wrong naming

* forward pass run

* update MOE layer

* small router update

* fixup

* consistency

* remove scatter router

* remove abstract layer

* update test and model for integration testing

* v1 conversion

* update

* hardcode hack

* all keys match

* add gin conversion, without additional libraries

* update conversion sctipy

* delete router file

* update tests wrt router deletion

* fix router issues

* update expert code

* update, logits match, code needsREFACTORING

* Refactor code

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* add generate tests

Co-authored-by: younesbelkada <younesbelkada@gmail.com>

* add support for router loss

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix forward error

* refactor a bit

* remove `FlaxSwitchTransformers` modules

* more tests pass

* Update code

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fixup

* fix tests

* fix doc

* fix doc + tokenization

* fix tokenizer test

* fix test

* fix loss output

* update code for backward pass

* add loss support

* update documentation

* fix documentation, clean tokenizer

* more doc fix, cleanup example_switch

* fix failing test

* fix test

* fix test

* fix loss issue

* move layer

* update doc and fix router capacity usage

* fixup

* add sparse mlp index for documentation on hub

* fixup

* test sparse mix architecture

* Apply suggestions from code review

* Update docs/source/en/model_doc/switch_transformers.mdx

* fixup on update

* fix tests

* fix another test

* attempt fix

* Update src/transformers/models/switch_transformers/configuration_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* try

* all tests pass

* fix jitter noise

* Apply suggestions from code review

* doc tests pass

* Update src/transformers/models/switch_transformers/modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/switch_transformers/modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove assert

* change config order

* fix readme japanese

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove parallelizable tests + add one liners

* remove ONNX config

* fix nits

- add `T5Tokenizer` in auto mapping
- remove `Switch Transformers` from ONNX supported models

* remove `_get_router`

* remove asserts

* add check in test for `router_dtype`

* add `SwitchTransformersConfig` in `run_pipeline_test`

* Update tests/pipelines/test_pipelines_summarization.py

* add huge model conversion script

* fix slow tests

- add better casting for `Linear8bitLt`
- remove `torchscript` tests

* add make dir

* style on new script

* fix nits

- doctest
- remove `_keys_to_ignore_on_load_unexpected`

* Update src/transformers/models/switch_transformers/configuration_switch_transformers.py

* add google as authors

* fix year

* remove last `assert` statements

* standardize vertical spaces

* fix failing import

* fix another failing test

* Remove strange àuthorized_keys`

* removing todo and padding that is never used

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: ybelkada <younes@huggingface.co>
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
2022-11-15 13:06:45 +01:00
bofeng huang
9625924c60
Update tokenizer_summary.mdx (#20135) 2022-11-15 01:18:13 +01:00
Wonhyeong Seo
8fadfd5035
[docs] set overflowing image width to auto-scale (#20197)
* docs: fix: set overflowing image width to auto-scale

* docs: fix: new language Korean is also affected

* docs: fix: unnecessary line break in index page
2022-11-15 01:13:40 +01:00
Wonhyeong Seo
07d8d6e2f7
docs: translated index page to korean (#20180)
docs: i18n: first draft of index page
docs: fix: first revision of index page
docs: i18n: missed section - supported frameworks
docs: fix: second revision of index page
review by @ArthurZucker

refactor: remove untranslated files from korean
docs: fix: remove untranslated references from toctree.yml
feat: enable korean docs in gh actions
docs: feat: add in_translation page as placeholder
docs: bug: testing if internal toc need alphabet chars
docs: fix: custom english anchor for non-alphanumeric headings
review by @sgugger

docs: i18n: translate comments on install methods in _config.py
docs: refactor: more concise wording for translations
2022-11-14 12:09:21 -05:00
Bartosz Szmelczynski
78a471ff71
Fix tapas scatter (#20149)
* First draft

* Remove scatter dependency

* Add require_torch

* update vectorized sum test, add clone call

* remove artifacts

* fix style

* fix style v2

* remove "scatter" mentions from the code base

* fix isort error

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-14 01:04:26 -05:00
Matthijs Hollemans
f711d683b5
add MobileNetV2 model (#17845)
* add model files etc for MobileNetV2

* rename files for MobileNetV1

* initial implementation of MobileNetV1

* fix conversion script

* cleanup

* write docs

* tweaks

* fix conversion script

* extract hidden states

* fix test cases

* make fixup

* fixup it all

* rename V1 to V2

* fix checkpoints

* fixup

* implement first block + weight conversion

* add remaining layers

* add output stride and dilation

* fixup

* add tests

* add deeplabv3+ head

* a bit of fixup

* finish deeplab conversion

* add link to doc

* fix issue with JIT trace

in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. By making them ints, the result of the padding calculation becomes a constant value.

* cleanup

* fix order of models

* fix rebase error

* remove main from doc link

* add image processor

* remove old feature extractor

* fix converter + other issues

* fixup

* fix unit test

* add to onnx tests (but these appear broken now)

* add post_process_semantic_segmentation

* use google org

* remove unused imports

* move args

* replace weird assert
2022-11-14 01:00:10 -05:00
Arthur
61a51f5f23
Add Jukebox model (replaces #16875) (#17826) 2022-11-10 21:05:27 +01:00
NielsRogge
9f0c72f93b
Add doc tests (#20158)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2022-11-10 15:25:30 +01:00
NielsRogge
93e14486d6
[CLIPSeg] Add resources (#20118)
* Add resource

* Add tag

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-09 18:31:22 +01:00
Steven Liu
a44985b41c
add cv + audio labels (#20114) 2022-11-09 07:40:15 -08:00
Joao Gante
f270b960d6
Generate: move generation_*.py src files into generation/*.py (#20096)
* move generation_*.py src files into generation/*.py

* populate generation.__init__ with lazy loading

* move imports and references from generation.xxx.object to generation.object
2022-11-09 15:34:08 +00:00
amyeroberts
4eb918e656
AutoImageProcessor (#20111)
* AutoImageProcessor skeleton

* Update references

* Add mapping in init

* Add model image processors to __init__ for importing

* Add AutoImageProcessor tests

* Fix up

* Image Processor documentation

* Remove pdb

* Update docs/source/en/model_doc/mobilevit.mdx

* Update docs

* Don't add whitespace on json files

* Remove fixtures

* Move checking model config down

* Fix up

* Add check for image processor

* Remove FeatureExtractorMixin in docstrings

* Rename model_tmpfile to config_tmpfile

* Don't make None if not in image processor map
2022-11-08 19:54:41 +00:00
Weiwe Shi
efa889d2e4
Add RocBert (#20013)
* add roc_bert

* update roc_bert readme

* code style

* change name and delete unuse file

* udpate model file

* delete unuse log file

* delete tokenizer fast

* reformat code and change model file path

* add RocBertForPreTraining

* update docs

* delete wrong notes

* fix copies

* fix make repo-consistency error

* fix files are not present in the table of contents error

* change RocBert -> RoCBert

* add doc, add detail test

Co-authored-by: weiweishi <weiweishi@tencent.com>
2022-11-08 10:03:43 -05:00
NielsRogge
258963062b
Add CLIPSeg (#20066)
* Add first draft

* Update conversion script

* Improve conversion script

* Improve conversion script some more

* Add conditional embeddings

* Add initial decoder

* Fix activation function of decoder

* Make decoder outputs match original implementation

* Make decoder outputs match original implementation

* Add more copied from statements

* Improve model outputs

* Fix auto tokenizer file

* Fix more tests

* Add test

* Improve README and docs, improve conditional embeddings

* Fix more tests

* Remove print statements

* Remove initial embeddings

* Improve conversion script

* Add interpolation of position embeddings

* Finish addition of interpolation of position embeddings

* Add support for refined checkpoint

* Fix refined checkpoint

* Remove unused parameter

* Improve conversion script

* Add support for training

* Fix conversion script

* Add CLIPSegFeatureExtractor

* Fix processor

* Fix CLIPSegProcessor

* Fix conversion script

* Fix most tests

* Fix equivalence test

* Fix README

* Add model to doc tests

* Use better variable name

* Convert other checkpoint as well

* Update config, add link to paper

* Add docs

* Update organization

* Replace base_model_prefix with clip

* Fix base_model_prefix

* Fix checkpoint of config

* Fix config checkpoint

* Remove file

* Use logits for output

* Fix tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-08 10:55:47 +01:00
Tom Aarsen
6156bffa2b
Replace awkward timm link with the expected one (#20109) 2022-11-07 13:57:39 -05:00
Steven Liu
71f772ebd0
Add new terms to the glossary (#20051)
* add new terms

* apply review
2022-11-07 10:45:27 -08:00
Tom Aarsen
d44ac47bac
docs: Fixed variables in f-strings (#20087)
* docs: Fixed variables in f-strings

* Replace unknown `block` with known `block_type` in ValueError

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add missing torch import in docs code block

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-11-07 13:18:09 -05:00
Tom Aarsen
3222fc645b
docs: Resolve many typos in the English docs (#20088)
* docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance'

* docs: Resolve many typos in the English docs

Typos found via 'codespell ./docs/source/en'
2022-11-07 09:19:04 -05:00
Tom Aarsen
b8112eddec
Replace unsupported facebookresearch/bitsandbytes (#20093)
With https://github.com/TimDettmers/bitsandbytes, which is by the same author and is still being updated
2022-11-07 08:52:03 -05:00
Jordan Clive
3bd0007e87
Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068)
Co-authored-by: jordiclive <jordiclive19@imperial.ac.uk>
2022-11-04 11:32:44 -04:00
Matt
6e1c5786dc
Update READMEs for ESMFold and add notebooks (#20067)
* Update READMEs for ESMFold and add notebooks

* Fix PyCharm formatting

* make fix-copies
2022-11-04 15:10:13 +00:00
Wang, Yi
2564f0c21d fix jit trace error for model forward sequence is not aligned with jit.trace tuple input sequence, update related doc (#19891)
* fix jit trace error for classification usecase, update related doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add implementation in torch 1.14.0

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update_doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update_doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-11-03 10:50:03 -04:00
Sanchit Gandhi
06d488061f
[Whisper Tokenizer] Make more user-friendly (#19921)
* [Whisper Tokenizer] Make more user-friendly

* use property

* make indexing rigorous

* small clean-up

* tests

* skip seq2seq tests

* remove multilingual arg

* reorder args

* collapse to one function

Co-authored-by: ArthurZucker <arthur@huggingface.co>

* option to override attributes

Co-authored-by: ArthurZucker <arthur@huggingface.co>

* add to docs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make comment more clear

Co-authored-by: sgugger <sylvain@huggingface.co>

* don't add special tokens in get_decoder_prompt_ids

* add test for set_prefix_tokens

Co-authored-by: ArthurZucker <arthur@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sgugger <sylvain@huggingface.co>
2022-11-03 14:22:40 +00:00
Yih-Dar
9ccea7acb1
Fix some doctests after PR 15775 (#20036)
* Add skip_special_tokens=True in some doctest

* For T5

* Fix for speech_to_text.mdx

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-03 14:18:45 +01:00
Steven Liu
aa39967b28
reorganize glossary (#20010) 2022-11-02 16:58:17 -07:00
Yih-Dar
fb7cbe236b
Fix doctest (#20023)
* Fix doctest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-02 19:37:25 +01:00
amyeroberts
a6b7759880
Add Image Processors (#19796)
* Add CLIP image processor

* Crop size as dict too

* Update warning

* Actually use logger this time

* Normalize doesn't change dtype of input

* Add perceiver image processor

* Tidy up

* Add DPT image processor

* Add Vilt image processor

* Tidy up

* Add poolformer image processor

* Tidy up

* Add LayoutLM v2 and v3 imsge processors

* Tidy up

* Add Flava image processor

* Tidy up

* Add deit image processor

* Tidy up

* Add ConvNext image processor

* Tidy up

* Add levit image processor

* Add segformer image processor

* Add in post processing

* Fix up

* Add ImageGPT image processor

* Fixup

* Add mobilevit image processor

* Tidy up

* Add postprocessing

* Fixup

* Add VideoMAE image processor

* Tidy up

* Add ImageGPT image processor

* Fixup

* Add ViT image processor

* Tidy up

* Add beit image processor

* Add mobilevit image processor

* Tidy up

* Add postprocessing

* Fixup

* Fix up

* Fix flava and remove tree module

* Fix image classification pipeline failing tests

* Update feature extractor in trainer scripts

* Update pad_if_smaller to accept tuple and int size

* Update for image segmentation pipeline

* Update src/transformers/models/perceiver/image_processing_perceiver.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Update src/transformers/image_processing_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/beit/image_processing_beit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* PR comments - docstrings; remove accidentally added resize; var names

* Update docstrings

* Add exception if size is not in the right format

* Fix exception check

* Fix up

* Use shortest_edge in tuple in script

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-11-02 11:57:36 +00:00
Steven Liu
79c720c062
fix typo (#20006) 2022-11-01 11:30:36 -07:00
Steven Liu
ab74ac11e4
Add LayoutLMv3 resource (#19932)
* add layoutlmv3 resource

* add layoutlmv2 resources

* fix button
2022-11-01 11:10:46 -07:00
Steven Liu
dec8578e70
Add BERT resources (#19852)
* add resources for bert

* add course chapters

* apply reviews

* add pipeline icons and community resource

* fix buttons
2022-11-01 11:09:53 -07:00
Steven Liu
1f6885bad0
add dataset (#20005) 2022-11-01 10:37:20 -07:00
Sayak Paul
c87ae86a8f
Update image_classification.mdx (#19996) 2022-11-01 07:54:41 -04:00
Mohit Sharma
c796b6dea6
Added onnx config whisper (#19525)
* Added onnx config whisper

* added whisper support onnx

* add audio input data

* added whisper support onnx

* fixed the seqlength value

* Updated the whisper onnx ocnfig

* restore files to old version

* removed attention mask from inputs

* Updated get_dummy_input_onnxruntime docstring

* Updated relative imports and token generation

* update docstring
2022-11-01 07:50:42 -04:00
Matt
7f9b7b3f0e
Add ESMFold (#19977)
* initial commit

* First draft that gets outputs without crashing!

* Add all the ported openfold dependencies

* testing

* Restructure config files for ESMFold

* Debugging to find output discrepancies

* Mainly style

* Make model runnable without extra deps

* Remove utils and merge them to the modeling file

* Use correct gelu and remove some debug prints

* More cleanup

* Update esm docs

* Update conversion script to support ESMFold properly

* Port some top-level changes from ESMFold repo

* Expand EsmFold docstrings

* Make attention_mask optional (default to all 1s)

* Add inference test for ESMFold

* Use config and not n kwargs

* Add modeling output class

* Remove einops

* Remove chunking in ESM FFN

* Update tests for ESMFold

* Quality

* REpo consistency

* Remove tree dependency from ESMFold

* make fixup

* Add an error in case my structure map function breaks later

* Remove needless code

* Stop auto-casting the LM to float16 so CPU tests pass

* Stop auto-casting the LM to float16 so CPU tests pass

* Final test updates

* Split test file

* Copyright and quality

* Unpin PyTorch to see built doc

* Fix config file to_dict() method

* Add some docstrings to the output

* Skip TF checkpoint tests for ESM until we reupload those

* make fixup

* More docstrings

* Unpin to get even with main

* Flag example to write

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-10-31 21:32:58 -04:00
Jean Charles Kouame
6aede2d602
Tranformers documentation translation to Italian #17459 (#19988) 2022-10-31 13:19:15 -04:00
NielsRogge
0b294c2334
[Conditional, Deformable DETR] Add postprocessing methods (#19709)
* Add postprocessing methods

* Update docs

* Add fix

* Add test

* Add test for deformable detr postprocessing

* Add post processing methods for segmentation

* Update code examples

* Add post_process to make the pipeline work

* Apply updates

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-10-31 08:28:44 +01:00
Steven Liu
2e35bac4e7
Add wav2vec2 resources (#19931)
* add wav2vec2 resources

* apply review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2022-10-28 13:28:18 -07:00
Steven Liu
9d2788b46b
add resources for distilbert (#19930) 2022-10-28 13:16:07 -07:00
Steven Liu
b0a2c3a2d6
add resources for bart (#19928) 2022-10-28 13:15:43 -07:00
Raghav Prabhakar
0d4c45c585
Add Onnx Config for ImageGPT (#19868)
* add Onnx Config for ImageGPT

* add generate_dummy_inputs for onnx config

* add TYPE_CHECKING clause

* Update doc for generate_dummy_inputs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-28 09:39:53 -04:00
Steven Liu
e4132952a1
Add GPT2 resources (#19879)
* add resources for gpt2

* add pipeline icons and community resources
2022-10-27 11:34:00 -07:00
Steven Liu
d818dd3a41
Add BLOOM resources (#19881)
* add bloom resources

* add pipeline icon
2022-10-27 11:33:52 -07:00
Steven Liu
50f5266b2c
Add T5 resources (#19878)
* add resources for t5

* add pipeline icons and community resources
2022-10-27 11:33:37 -07:00
Steven Liu
536a8ae6ad
Add RoBERTa resources (#19911)
* add roberta resources

* fix typo
2022-10-27 11:33:15 -07:00
Younes Belkada
7a1c68a845
Add flan-t5 documentation page (#19892)
* add `flan-t5` documentation page

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add more content

* revert `_toctree` modif

* revert `toctree` modif - 2

* Update README.md

* Revert "Update README.md"

This reverts commit 5660714429.

* Update README_es.md

* Update README_zh-hans.md

* Update README_zh-hant.md

* Update README_ko.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-26 17:22:57 +02:00
Lysandre Debut
eedaba682f
[Past CI] Vilt only supports PT >= v1.10 (#19851)
* Support for Vilt in v1.9

* Skip if not higher or equal than 1.10

* Move test :)

* I am bad at python
2022-10-25 15:59:35 +02:00
Davi Alves
0a77249178
Added translation of serialization.mdx to Portuguese Issue #16824 (#19869)
* [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial.

* [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.

* [ serialization.mdx ] - Translated to Portuguese the serialization tutorial.
2022-10-25 09:34:28 -04:00
Alberto Mario Ceballos-Arroyo
371337a95b
Spanish translation of multiple_choice.mdx, question_answering.mdx. (#19821)
* Translated multiple_choice.mdx, question_answering.mdx. Added them to _toctree.yml

* Added translation for a missed line.

* Update _toctree.yml as per Omar's suggestions

* Update multiple_choice.mdx as per Omar's comments

* Updt question_answering.mdx as per Omar's comments
2022-10-24 20:11:34 -04:00
Steven Liu
9ecb13d63a
add small updates only (#19847) 2022-10-24 10:18:20 -07:00
Yih-Dar
072ed01c38
Fix doctest for MarkupLM (#19845)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-24 17:54:23 +02:00
Dhruv Singal
5cbf1fa8ca
fixed typo in fp16 training section for perf_train_gpu_one (#19736) 2022-10-24 10:04:28 -04:00
Davi Alves
743995e0e6
Added translation of converting_tensorflow_models.mdx to Portuguese Issue #16824 (#19824)
* [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial.

* [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.
2022-10-24 09:50:16 -04:00
zhou fan
3b419cfc6f
fix broken links in testing.mdx (#19820) 2022-10-24 09:48:02 -04:00
Davi Alves
74b3eb3dea
Added translation of run_scripts.mdx to Portuguese Issue #16824 (#19800)
* [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial.

* [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.
2022-10-21 17:38:35 -04:00
Davi Alves
2ebf4e6a7b
[ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. (#19779) 2022-10-21 09:48:19 -04:00
ftorres16
c1f009ad9a
Update training.mdx (#19791) 2022-10-21 09:46:44 -04:00
Rohit Gupta
2dd1b8f0c5
adding key pair dataset (#19765) 2022-10-20 09:05:49 -04:00
amyeroberts
5041bc3511
Image transforms add center crop (#19718)
* Add center crop to transforms library

* Return PIL images if PIL image input by default

* Fixup and add docstring

* Trigger CI

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR comments - move comments; unindent

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-19 16:15:01 +01:00
GMFTBY
71786b10c5
Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py (#19477)
* add: the contrastive search for generaton_utils

* add: testing scripts for contrastive search under examples/text-generation

* update the quality of codes

* revise the docstring; make the generation_contrastive_search.py scripts;

* revise the examples/pytorch/text-generation/run_generation_contrastive_search.py to the auto-APIs format

* revise the necessary documents

* fix: revise the docstring of generation_contrastive_search.py

* Fix the code indentation

* fix: revise the nits and examples in contrastive_search docstring.

* fix the copyright

* delete generation_contrastive_search.py

* revise the logic in contrastive_search

* update the intergration test and the docstring

* run the tests over

* add the slow decorate to the contrastive_search intergrate test

* add more test

* do the style, quality, consistency checks
2022-10-19 10:17:46 +01:00
NielsRogge
14fe3e0410
Add docs (#19729)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-10-18 17:42:46 +02:00
NielsRogge
dd523da577
Add table transformer [v2] (#19614)
* First draft

* Add conversion script

* Make conversion work

* Upload checkpoints

* Add final fixes

* Revert changes of conditional and deformable detr

* Fix toctree, add and remove copied from

* Use model type

* Improve docs

* Improve code example

* Update copies

* Add copied formt

* Don't update conditional detr

* Don't update deformable detr
2022-10-18 15:20:09 +02:00
Antonio Carlos Falcão Petri
af150e4a1c
Allow user-managed Pool in Wav2Vec2ProcessorWithLM.batch_decode (#18351)
* [Wav2Vec2] Allow user-managed Pool in Wav2Vec2ProcessorWithLM.batch_decode

* [Wav2Vec2] Add user-managed LM's pool tests and usage examples

* Improve styling

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [Wav2Vec2] Fix hyperlink references

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-18 08:48:03 -04:00
Christopher Akiki
71ca79448c
Fix typo in perf docs (#19705) 2022-10-18 12:18:19 +02:00
NielsRogge
90071fe42b
Improve DETR models (#19644)
* Improve DETR models

* Fix Deformable DETR loss and matcher

* Fixup

* Fix integration tests

* Improve variable names

* Apply suggestion

* Fix copies

* Fix DeformableDetrLoss

* Make Conditional DETR copy from Deformable DETR

* Copy from deformable detr's hungarian matcher

* Fix bug
2022-10-18 10:29:14 +02:00
NielsRogge
fd9a027aca
Fix docs (#19687)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-10-18 09:52:51 +02:00
amyeroberts
4181320b8c
Add normalize to image transforms module (#19544)
* Adapt FE methods to transforms library

* Mixin for saving the image processor

* Base processor skeleton

* BatchFeature for packaging image processor outputs

* Initial image processor for GLPN

* REmove accidental import

* Fixup and docs

* Mixin for saving the image processor

* Fixup and docs

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Fixup and docs

* Fixup and docs

* Fixup and docs

* BatchFeature for packaging image processor outputs

* Import BatchFeature from feature_extraction_utils

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Fixup and docs

* BatchFeature for packaging image processor outputs

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Mixin for saving the image processor

* Fixup and docs

* Add rescale back and remove ImageType

* fix import mistake

* Fix enum var reference

* Can transform and specify image data format

* Remove redundant function

* Update reference

* Data format flag for rescale

* Fix typo

* Fix dimension check

* Fixes to make IP and FE outputs match

* Add tests for transforms

* Add test for utils

* Update some docstrings

* Make sure in channels last before converting to PIL

* Remove default to numpy batching

* Fix up

* Add docstring and model_input_types

* Use feature processor config from hub

* Alias GLPN feature extractor to image processor

* Alias feature extractor mixin

* Add return_numpy=False flag for resize

* Fix up

* Fix up

* Use different frameworks safely

* Safely import PIL

* Call function checking if PIL available

* Only import if vision available

* Address Sylvain PR comments
Co-authored-by: Sylvain.gugger@gmail.com

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/image_transforms.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Update src/transformers/models/glpn/feature_extraction_glpn.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add in docstrings

* Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226)

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Refactor `TFSwinLayer` to increase serving compatibility (#18352)

* Refactor `TFSwinLayer` to increase serving compatibility

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix missed parameters while refactoring

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix window_reverse to calculate batch size

Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add TF prefix to TF-Res test class (#18481)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove py.typed (#18485)

* Fix pipeline tests (#18487)

* Fix pipeline tests

* Make sure all pipelines tests run with init changes

* Use new huggingface_hub tools for download models (#18438)

* Draft new cached_file

* Initial draft for config and model

* Small fixes

* Fix first batch of tests

* Look in cache when internet is down

* Fix last tests

* Bad black, not fixing all quality errors

* Make diff less

* Implement change for TF and Flax models

* Add tokenizer and feature extractor

* For compatibility with main

* Add utils to move the cache and auto-do it at first use.

* Quality

* Deal with empty commit shas

* Deal with empty etag

* Address review comments

* Fix `test_dbmdz_english` by updating expected values (#18482)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Move cache folder to huggingface/hub for consistency with hf_hub (#18492)

* Move cache folder to just huggingface

* Thank you VsCode for this needless import

* Move to hub

* Forgot one

* Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Forgot one new_ for cache migration

* disable Onnx test for google/long-t5-tglobal-base (#18454)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Typo reported by Joel Grus on TWTR (#18493)

* Just re-reading the whole doc every couple of months 😬 (#18489)

* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task

* `transformers-cli login` => `huggingface-cli login` (#18490)

* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`

* Add seed setting to image classification example (#18519)

* [DX fix] Fixing QA pipeline streaming a dataset. (#18516)

* [DX fix] Fixing QA pipeline streaming a dataset.

QuestionAnsweringArgumentHandler would iterate over the whole dataset
effectively killing all properties of the pipeline.
This restores nice properties when using `Dataset` or `Generator` since
those are meant to be consumed lazily.

* Handling TF better.

* Clean up hub (#18497)

* Clean up utils.hub

* Remove imports

* More fixes

* Last fix

* update fsdp docs (#18521)

* updating fsdp documentation

* typo fix

* Fix compatibility with 1.12 (#17925)

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* fix torch.onnx.symbolic_opset12 import

* Reject bad version

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove debug statement

* Specify en in doc-builder README example (#18526)

Co-authored-by: Ankur Goyal <ankur@impira.com>

* New cache fixes: add safeguard before looking in folders (#18522)

* unpin resampy (#18527)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

*  update to use interlibrary links instead of Markdown (#18500)

* Add example of multimodal usage to pipeline tutorial (#18498)

* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedbacks

* 🖍 apply niels feedback

* [VideoMAE] Add model to doc tests (#18523)

* Add videomae to doc tests

* Add pip install decord

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Update perf_train_gpu_one.mdx (#18532)

* Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473)

* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script

* make fixup changes

* PR comments

* changed input to Acceletor based on PR comment, ran make fixup

* Added comment explaining the sync_gradients statement

* Fixed lr scheduler max steps

* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper

* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper

* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script

* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py

* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script

* Add Spanish translation of converting_tensorflow_models.mdx (#18512)

* Add file in spanish docs to be translated

* Finish translation to Spanish

* Improve Spanish  wording

* Add suggested changes from review

* Spanish translation of summarization.mdx (#15947) (#18477)

* Add Spanish translation of summarization.mdx

* Apply suggestions from code review

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Let's not cast them all (#18471)

* add correct dtypes when checking for params dtype

* forward contrib credits

* Update src/transformers/modeling_utils.py

Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* more comments

- added more comments on why we cast only floating point parameters

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* fix: data2vec-vision Onnx ready-made configuration. (#18427)

* feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization

* fix: wrong config

* Add mt5 onnx config (#18394)

* update features

* MT5OnnxConfig added with updated with tests and docs

* fix imports

* fix onnc_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>

* Minor update of `run_call_with_unpacked_inputs` (#18541)

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* BART - Fix attention mask device issue on copied models (#18540)

* attempt to fix attn mask device

* fix bart `_prepare_decoder_attention_mask`

- add correct device
- run `make fix-copies` to propagate the fix

* Adding a new `align_to_words` param to qa pipeline. (#18010)

* Adding a new `align_to_words` param to qa pipeline.

* Update src/transformers/pipelines/question_answering.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Import protection.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* 📝 update metric with evaluate (#18535)

* Restore _init_weights value in no_init_weights (#18504)

* Recover _init_weights value in no_init_weights

For potential nested use. 
In addition, users might modify private no_init_weights as well.

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove private variable change check

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean up comment

* 📝 update documentation build section (#18548)

* `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901)

* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* revmoe the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

 - add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* male style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more doctring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* odify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n`by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* TF: XLA-trainable DeBERTa v2 (#18546)

* fix deberta issues

* add different code paths for gpu and tpu

* shorter gpu take along axis

* Stable Dropout without tf cond

* variable must be float

* Preserve hub-related kwargs in AutoModel.from_pretrained (#18545)

* Preserve hub-related kwargs in AutoModel.from_pretrained

* Fix tests

* Remove debug statement

* TF Examples Rewrite (#18451)

* Finished QA example

* Dodge a merge conflict

* Update text classification and LM examples

* Update NER example

* New Keras metrics WIP, fix NER example

* Update NER example

* Update MC, summarization and translation examples

* Add XLA warnings when shapes are variable

* Make sure batch_size is consistently scaled by num_replicas

* Add PushToHubCallback to all models

* Add docs links for KerasMetricCallback

* Add docs links for prepare_tf_dataset and jit_compile

* Correct inferred model names

* Don't assume the dataset has 'lang'

* Don't assume the dataset has 'lang'

* Write metrics in text classification

* Add 'framework' to TrainingArguments and TFTrainingArguments

* Export metrics in all examples and add tests

* Fix training args for Flax

* Update command line args for translation test

* make fixup

* Fix accidentally running other tests in fp16

* Remove do_train/do_eval from run_clm.py

* Remove do_train/do_eval from run_mlm.py

* Add tensorflow tests to circleci

* Fix circleci

* Update examples/tensorflow/language-modeling/run_mlm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/test_tensorflow_examples.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/translation/run_translation.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/token-classification/run_ner.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix save path for tests

* Fix some model card kwargs

* Explain the magical -1000

* Actually enable tests this time

* Skip text classification PR until we fix shape inference

* make fixup

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use commit hash to look in cache instead of calling head (#18534)

* Use commit hash to look in cache instead of calling head

* Add tests

* Add attr for local configs too

* Stupid typos

* Fix tests

* Update src/transformers/utils/hub.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address Julien's comments

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* `pipeline` support for `device="mps"` (or any other string) (#18494)

* `pipeline` support for `device="mps"` (or any other string)

* Simplify `if` nesting

* Update src/transformers/pipelines/base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix? @sgugger

* passing `attr=None` is not the same as not passing `attr` 🤯

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update philosophy to include other preprocessing classes (#18550)

* 📝 update philosophy to include other preprocessing classes

* 🖍 apply feedbacks

* Properly move cache when it is not in default path (#18563)

* Adds CLIP to models exportable with ONNX (#18515)

* onnx config for clip

* default opset as 14

* changes from the original repo

* input values order fix

* outputs fix

* remove unused import

* ran make fix-copies

* black format

* review comments: forward ref, import fix, model change revert, .to cleanup

* make style

* formatting fixes

* revert groupvit

* comment for cast to int32

* comment fix

* make .T as .t() for onnx conversion

* ran make fix-copies

* remove unneeded comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* remove comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* raise atol for MT5OnnxConfig (#18560)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* fix string (#18568)

* Segformer TF: fix output size in documentation (#18572)

* Segformer TF: fix output size in doc

* Segformer pytorch: fix output size in doc

Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>

* Fix resizing bug in OWL-ViT (#18573)

* Fixes resizing bug in OWL-ViT
* Defaults to square resize if size is set to an int
* Sets do_center_crop default value to False

* Fix LayoutLMv3 documentation (#17932)

* fix typos

* fix sequence_length docs of LayoutLMv3Model

* delete trailing white spaces

* fix layoutlmv3 docs more

* apply make fixup & quality

* change to two versions of input docstring

* apply make fixup & quality

* Skip broken tests

* Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)

* changing BartLearnedPositionalEmbedding forward signature and references to it

* removing debugging dead code (thanks style checker)

* blackened modeling_bart file

* removing copy inconsistencies via make fix-copies

* changing references to copied signatures in Bart variants

* make fix-copies once more

* using expand over repeat (thanks @michaelbenayoun)

* expand instead of repeat for all model copies

Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>

* german docs translation (#18544)

* Create _config.py

* Create _toctree.yml

* Create index.mdx

not sure about "du / ihr" oder "sie"

* Create quicktour.mdx

* Update _toctree.yml

* Update build_documentation.yml

* Update build_pr_documentation.yml

* fix build

* Update index.mdx

* Update quicktour.mdx

* Create installation.mdx

* Update _toctree.yml

* Deberta V2: Fix critical trace warnings to allow ONNX export (#18272)

* Fix critical trace warnings to allow ONNX export

* Force input to `sqrt` to be float type

* Cleanup code

* Remove unused import statement

* Update model sew

* Small refactor

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Use broadcasting instead of repeat

* Implement suggestion

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Match deberta v2 changes in sew_d

* Improve code quality

* Update code quality

* Consistency of small refactor

* Match changes in sew_d

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* [FX] _generate_dummy_input supports audio-classification models for labels (#18580)

* Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not

* Use ENV_VARS_TRUE_VALUES

* Fix docstrings with last version of hf-doc-builder styler (#18581)

* Fix docstrings with last version of hf-doc-builder styler

* Remove empty Parameter block

* Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert in /examples/research_projects/visual_bert (#18566)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix owlvit tests, update docstring examples (#18586)

* Return the permuted hidden states if return_dict=True (#18578)

* Load sharded pt to flax (#18419)

* initial commit

* add small test

* add cross pt tf flag to test

* fix quality

* style

* update test with new repo

* fix failing test

* update

* fix wrong param ordering

* style

* update based on review

* update related to recent new caching mechanism

* quality

* Update based on review

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* quality and style

* Update src/transformers/modeling_flax_utils.py
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type hints for ViLT models (#18577)

* Add type hints for Vilt models

* Add missing return type for TokenClassification class

* update doc for perf_train_cpu_many, add intel mpi introduction (#18576)

* update doc for perf_train_cpu_many, add mpi introduction

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* typos (#18594)

* FSDP bug fix for `load_state_dict` (#18596)

* Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261)

* validate generate model_kwargs

* generate tests -- not all models have an attn mask

* Supporting seq2seq models for `bitsandbytes` integration (#18579)

* Supporting seq2seq models for `bitsandbytes` integration

- `bitsandbytes` integration supports now seq2seq models
- check if a model has tied weights as an additional check

* small modification

- tie the weights before looking at tied weights!

* Add Donut (#18488)

* First draft

* Improve script

* Update script

* Make conversion work

* Add final_layer_norm attribute to Swin's config

* Add DonutProcessor

* Convert more models

* Improve feature extractor and convert base models

* Fix bug

* Improve integration tests

* Improve integration tests and add model to README

* Add doc test

* Add feature extractor to docs

* Fix integration tests

* Remove register_buffer

* Fix toctree and add missing attribute

* Add DonutSwin

* Make conversion script work

* Improve conversion script

* Address comment

* Fix bug

* Fix another bug

* Remove deprecated method from docs

* Make Swin and Swinv2 untouched

* Fix code examples

* Fix processor

* Update model_type to donut-swin

* Add feature extractor tests, add token2json method, improve feature extractor

* Fix failing tests, remove integration test

* Add do_thumbnail for consistency

* Improve code examples

* Add code example for document parsing

* Add DonutSwin to MODEL_NAMES_MAPPING

* Add model to appropriate place in toctree

* Update namespace to appropriate organization

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Fix URLs (#18604)

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Update BLOOM parameter counts (#18531)

* Update BLOOM parameter counts

* Update BLOOM parameter counts

* [doc] fix anchors (#18591)

the manual anchors end up being duplicated with automatically added anchors and no longer work.

* [fsmt] deal with -100 indices in decoder ids (#18592)

* [fsmt] deal with -100 indices in decoder ids

Fixes: https://github.com/huggingface/transformers/issues/17945

decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. 

For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems.

Any suggestions to how to add a simple test here? or perhaps we have something similar already? user's script is quite massive.

* style

* small change (#18584)

* Flax Remat for LongT5 (#17994)

* [Flax] Add remat (gradient checkpointing)

* fix variable naming in test

* flip: checkpoint using a method

* fix naming

* fix class naming

* apply PVP's suggestions from code review

* add gradient_checkpointing to examples

* Add gradient_checkpointing to run_mlm_flax

* Add remat to longt5

* Add gradient checkpointing test longt5

* Fix args errors

* Fix remaining tests

* Make fixup & quality fixes

* replace kwargs

* remove unecessary kwargs

* Make fixup changes

* revert long_t5_flax changes

* Remove return_dict and copy to LongT5

* Remove test_gradient_checkpointing

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>

* mac m1 `mps` integration (#18598)

* mac m1 `mps` integration

* Update docs/source/en/main_classes/trainer.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* Apply suggestions from code review

Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* Change scheduled CIs to use torch 1.12.1 (#18644)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Add checks for some workflow jobs (#18583)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* TF: Fix generation repetition penalty with XLA (#18648)

* Update longt5.mdx (#18634)

* Update run_translation_no_trainer.py (#18637)

* Update run_translation_no_trainer.py

found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint

* fixs `no_decay` and `resume_step` issue

1. change `no_decay` list
2. if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1`

* [bnb] Minor modifications (#18631)

* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPi library on DockerFile

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor a bit hardware requirements

* change text a bit

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions Turing & Amperer

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Examples: add Bloom support for token classification (#18632)

* examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow)

* examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it)

* Fix Yolos ONNX export test (#18606)

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fixup

* Fix up

* Move PIL default arguments inside function for safe imports

* Add image utils to toctree

* Update `rescale` method to reflect changes in #18677

* Update docs/source/en/internal/image_processing_utils.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Address Niels PR comments

* Add normalize method to transforms library

* Apply suggestions from code review - remove defaults to None

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix docstrings and revert to PIL.Image.XXX resampling

Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated

* Some more docstrings and PIL.Image tidy up

* Reorganise arguments so flags by modifiers

* Few last docstring fixes

* Add normalize to docs

* Accept PIL.Image inputs with deprecation warning

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update warning to include version

* Trigger CI - hash clash on doc build

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Ankur Goyal <ankrgyl@gmail.com>
Co-authored-by: Ankur Goyal <ankur@impira.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com>
Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com>
Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com>
Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com>
Co-authored-by: YouJiacheng <1503679330@qq.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Maxime G <joihn@users.noreply.github.com>
Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr>
Co-authored-by: Dan Jones <dan.j.jones2@gmail.com>
Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
Co-authored-by: flozi00 <flozi00.fz@gmail.com>
Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-17 17:02:14 +01:00
Christopher Akiki
aa629e7a7c
Update perf_train_gpu_one.mdx (#19676) 2022-10-17 16:54:35 +02:00
Matt
3b3024da70
TF port of ESM (#19587)
* Partial TF port for ESM model

* Add ESM-TF tests

* Add the various imports for TF-ESM

* TF weight conversion almost ready

* Stop ignoring the decoder weights in PT

* Add tests and lots of fixes

* fix-copies

* Fix imports, add model docs

* Add get_vocab() to tokenizer

* Fix vocab links for pretrained files

* Allow multiple inputs with a sep

* Use EOS as SEP token because ESM vocab lacks SEP

* Correctly return special tokens mask from ESM tokenizer

* make fixup

* Stop testing unsupported embedding resizing

* Handle TF bias correctly

* Skip all models with slow tokenizers in the token classification test

* Fixing the batch/unbatcher of pipelines to accomodate the `None` being

passed around.

* Fixing pipeline bug caused by slow tokenizer  being different.

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update set_input_embeddings and the copyright notices

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-10-17 14:16:16 +01:00
Arthur
d2e5b19b82
Add doctest info in testingmdx (#19623) 2022-10-17 11:23:20 +02:00
0xflotus
5ef2186692
fix: small error (#19612)
* fix: small error

* fix: another typo error
2022-10-14 11:10:33 -04:00
Akash Mahajan
504cd71a6b
add a note to whisper docs clarifying support of long-form decoding (#19497) 2022-10-13 10:39:03 +02:00
amyeroberts
1973b7716b
Image transforms library (#18520)
* Adapt FE methods to transforms library

* Mixin for saving the image processor

* Base processor skeleton

* BatchFeature for packaging image processor outputs

* Initial image processor for GLPN

* REmove accidental import

* Fixup and docs

* Mixin for saving the image processor

* Fixup and docs

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Fixup and docs

* Fixup and docs

* Fixup and docs

* BatchFeature for packaging image processor outputs

* Import BatchFeature from feature_extraction_utils

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Fixup and docs

* BatchFeature for packaging image processor outputs

* Import BatchFeature from feature_extraction_utils

* Fixup and docs

* Mixin for saving the image processor

* Fixup and docs

* Add rescale back and remove ImageType

* fix import mistake

* Fix enum var reference

* Can transform and specify image data format

* Remove redundant function

* Update reference

* Data format flag for rescale

* Fix typo

* Fix dimension check

* Fixes to make IP and FE outputs match

* Add tests for transforms

* Add test for utils

* Update some docstrings

* Make sure in channels last before converting to PIL

* Remove default to numpy batching

* Fix up

* Add docstring and model_input_types

* Use feature processor config from hub

* Alias GLPN feature extractor to image processor

* Alias feature extractor mixin

* Add return_numpy=False flag for resize

* Fix up

* Fix up

* Use different frameworks safely

* Safely import PIL

* Call function checking if PIL available

* Only import if vision available

* Address Sylvain PR comments
Co-authored-by: Sylvain.gugger@gmail.com

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/image_transforms.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Update src/transformers/models/glpn/feature_extraction_glpn.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add in docstrings

* Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226)

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Refactor `TFSwinLayer` to increase serving compatibility (#18352)

* Refactor `TFSwinLayer` to increase serving compatibility

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix missed parameters while refactoring

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix window_reverse to calculate batch size

Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add TF prefix to TF-Res test class (#18481)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove py.typed (#18485)

* Fix pipeline tests (#18487)

* Fix pipeline tests

* Make sure all pipelines tests run with init changes

* Use new huggingface_hub tools for download models (#18438)

* Draft new cached_file

* Initial draft for config and model

* Small fixes

* Fix first batch of tests

* Look in cache when internet is down

* Fix last tests

* Bad black, not fixing all quality errors

* Make diff less

* Implement change for TF and Flax models

* Add tokenizer and feature extractor

* For compatibility with main

* Add utils to move the cache and auto-do it at first use.

* Quality

* Deal with empty commit shas

* Deal with empty etag

* Address review comments

* Fix `test_dbmdz_english` by updating expected values (#18482)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Move cache folder to huggingface/hub for consistency with hf_hub (#18492)

* Move cache folder to just huggingface

* Thank you VsCode for this needless import

* Move to hub

* Forgot one

* Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Forgot one new_ for cache migration

* disable Onnx test for google/long-t5-tglobal-base (#18454)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Typo reported by Joel Grus on TWTR (#18493)

* Just re-reading the whole doc every couple of months 😬 (#18489)

* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task

* `transformers-cli login` => `huggingface-cli login` (#18490)

* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`

* Add seed setting to image classification example (#18519)

* [DX fix] Fixing QA pipeline streaming a dataset. (#18516)

* [DX fix] Fixing QA pipeline streaming a dataset.

QuestionAnsweringArgumentHandler would iterate over the whole dataset
effectively killing all properties of the pipeline.
This restores nice properties when using `Dataset` or `Generator` since
those are meant to be consumed lazily.

* Handling TF better.

* Clean up hub (#18497)

* Clean up utils.hub

* Remove imports

* More fixes

* Last fix

* update fsdp docs (#18521)

* updating fsdp documentation

* typo fix

* Fix compatibility with 1.12 (#17925)

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* fix torch.onnx.symbolic_opset12 import

* Reject bad version

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Remove debug statement

* Specify en in doc-builder README example (#18526)

Co-authored-by: Ankur Goyal <ankur@impira.com>

* New cache fixes: add safeguard before looking in folders (#18522)

* unpin resampy (#18527)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

*  update to use interlibrary links instead of Markdown (#18500)

* Add example of multimodal usage to pipeline tutorial (#18498)

* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedbacks

* 🖍 apply niels feedback

* [VideoMAE] Add model to doc tests (#18523)

* Add videomae to doc tests

* Add pip install decord

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Update perf_train_gpu_one.mdx (#18532)

* Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473)

* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script

* make fixup changes

* PR comments

* changed input to Acceletor based on PR comment, ran make fixup

* Added comment explaining the sync_gradients statement

* Fixed lr scheduler max steps

* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper

* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper

* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script

* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py

* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script

* Add Spanish translation of converting_tensorflow_models.mdx (#18512)

* Add file in spanish docs to be translated

* Finish translation to Spanish

* Improve Spanish  wording

* Add suggested changes from review

* Spanish translation of summarization.mdx (#15947) (#18477)

* Add Spanish translation of summarization.mdx

* Apply suggestions from code review

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

* Let's not cast them all (#18471)

* add correct dtypes when checking for params dtype

* forward contrib credits

* Update src/transformers/modeling_utils.py

Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* more comments

- added more comments on why we cast only floating point parameters

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* fix: data2vec-vision Onnx ready-made configuration. (#18427)

* feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization

* fix: wrong config

* Add mt5 onnx config (#18394)

* update features

* MT5OnnxConfig added with updated with tests and docs

* fix imports

* fix onnc_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>

* Minor update of `run_call_with_unpacked_inputs` (#18541)

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* BART - Fix attention mask device issue on copied models (#18540)

* attempt to fix attn mask device

* fix bart `_prepare_decoder_attention_mask`

- add correct device
- run `make fix-copies` to propagate the fix

* Adding a new `align_to_words` param to qa pipeline. (#18010)

* Adding a new `align_to_words` param to qa pipeline.

* Update src/transformers/pipelines/question_answering.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Import protection.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* 📝 update metric with evaluate (#18535)

* Restore _init_weights value in no_init_weights (#18504)

* Recover _init_weights value in no_init_weights

For potential nested use. 
In addition, users might modify private no_init_weights as well.

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove private variable change check

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean up comment

* 📝 update documentation build section (#18548)

* `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901)

* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* revmoe the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

 - add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* male style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more doctring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* odify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n`by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* TF: XLA-trainable DeBERTa v2 (#18546)

* fix deberta issues

* add different code paths for gpu and tpu

* shorter gpu take along axis

* Stable Dropout without tf cond

* variable must be float

* Preserve hub-related kwargs in AutoModel.from_pretrained (#18545)

* Preserve hub-related kwargs in AutoModel.from_pretrained

* Fix tests

* Remove debug statement

* TF Examples Rewrite (#18451)

* Finished QA example

* Dodge a merge conflict

* Update text classification and LM examples

* Update NER example

* New Keras metrics WIP, fix NER example

* Update NER example

* Update MC, summarization and translation examples

* Add XLA warnings when shapes are variable

* Make sure batch_size is consistently scaled by num_replicas

* Add PushToHubCallback to all models

* Add docs links for KerasMetricCallback

* Add docs links for prepare_tf_dataset and jit_compile

* Correct inferred model names

* Don't assume the dataset has 'lang'

* Don't assume the dataset has 'lang'

* Write metrics in text classification

* Add 'framework' to TrainingArguments and TFTrainingArguments

* Export metrics in all examples and add tests

* Fix training args for Flax

* Update command line args for translation test

* make fixup

* Fix accidentally running other tests in fp16

* Remove do_train/do_eval from run_clm.py

* Remove do_train/do_eval from run_mlm.py

* Add tensorflow tests to circleci

* Fix circleci

* Update examples/tensorflow/language-modeling/run_mlm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/test_tensorflow_examples.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/translation/run_translation.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/token-classification/run_ner.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix save path for tests

* Fix some model card kwargs

* Explain the magical -1000

* Actually enable tests this time

* Skip text classification PR until we fix shape inference

* make fixup

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Use commit hash to look in cache instead of calling head (#18534)

* Use commit hash to look in cache instead of calling head

* Add tests

* Add attr for local configs too

* Stupid typos

* Fix tests

* Update src/transformers/utils/hub.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address Julien's comments

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* `pipeline` support for `device="mps"` (or any other string) (#18494)

* `pipeline` support for `device="mps"` (or any other string)

* Simplify `if` nesting

* Update src/transformers/pipelines/base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix? @sgugger

* passing `attr=None` is not the same as not passing `attr` 🤯

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update philosophy to include other preprocessing classes (#18550)

* 📝 update philosophy to include other preprocessing classes

* 🖍 apply feedbacks

* Properly move cache when it is not in default path (#18563)

* Adds CLIP to models exportable with ONNX (#18515)

* onnx config for clip

* default opset as 14

* changes from the original repo

* input values order fix

* outputs fix

* remove unused import

* ran make fix-copies

* black format

* review comments: forward ref, import fix, model change revert, .to cleanup

* make style

* formatting fixes

* revert groupvit

* comment for cast to int32

* comment fix

* make .T as .t() for onnx conversion

* ran make fix-copies

* remove unneeded comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* remove comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* raise atol for MT5OnnxConfig (#18560)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* fix string (#18568)

* Segformer TF: fix output size in documentation (#18572)

* Segformer TF: fix output size in doc

* Segformer pytorch: fix output size in doc

Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>

* Fix resizing bug in OWL-ViT (#18573)

* Fixes resizing bug in OWL-ViT
* Defaults to square resize if size is set to an int
* Sets do_center_crop default value to False

* Fix LayoutLMv3 documentation (#17932)

* fix typos

* fix sequence_length docs of LayoutLMv3Model

* delete trailing white spaces

* fix layoutlmv3 docs more

* apply make fixup & quality

* change to two versions of input docstring

* apply make fixup & quality

* Skip broken tests

* Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)

* changing BartLearnedPositionalEmbedding forward signature and references to it

* removing debugging dead code (thanks style checker)

* blackened modeling_bart file

* removing copy inconsistencies via make fix-copies

* changing references to copied signatures in Bart variants

* make fix-copies once more

* using expand over repeat (thanks @michaelbenayoun)

* expand instead of repeat for all model copies

Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>

* german docs translation (#18544)

* Create _config.py

* Create _toctree.yml

* Create index.mdx

not sure about "du / ihr" oder "sie"

* Create quicktour.mdx

* Update _toctree.yml

* Update build_documentation.yml

* Update build_pr_documentation.yml

* fix build

* Update index.mdx

* Update quicktour.mdx

* Create installation.mdx

* Update _toctree.yml

* Deberta V2: Fix critical trace warnings to allow ONNX export (#18272)

* Fix critical trace warnings to allow ONNX export

* Force input to `sqrt` to be float type

* Cleanup code

* Remove unused import statement

* Update model sew

* Small refactor

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Use broadcasting instead of repeat

* Implement suggestion

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* Match deberta v2 changes in sew_d

* Improve code quality

* Update code quality

* Consistency of small refactor

* Match changes in sew_d

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* [FX] _generate_dummy_input supports audio-classification models for labels (#18580)

* Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not

* Use ENV_VARS_TRUE_VALUES

* Fix docstrings with last version of hf-doc-builder styler (#18581)

* Fix docstrings with last version of hf-doc-builder styler

* Remove empty Parameter block

* Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump nbconvert in /examples/research_projects/visual_bert (#18566)

Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0.
- [Release notes](https://github.com/jupyter/nbconvert/releases)
- [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0)

---
updated-dependencies:
- dependency-name: nbconvert
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix owlvit tests, update docstring examples (#18586)

* Return the permuted hidden states if return_dict=True (#18578)

* Load sharded pt to flax (#18419)

* initial commit

* add small test

* add cross pt tf flag to test

* fix quality

* style

* update test with new repo

* fix failing test

* update

* fix wrong param ordering

* style

* update based on review

* update related to recent new caching mechanism

* quality

* Update based on review

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* quality and style

* Update src/transformers/modeling_flax_utils.py
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type hints for ViLT models (#18577)

* Add type hints for Vilt models

* Add missing return type for TokenClassification class

* update doc for perf_train_cpu_many, add intel mpi introduction (#18576)

* update doc for perf_train_cpu_many, add mpi introduction

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu_many.mdx

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* typos (#18594)

* FSDP bug fix for `load_state_dict` (#18596)

* Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261)

* validate generate model_kwargs

* generate tests -- not all models have an attn mask

* Supporting seq2seq models for `bitsandbytes` integration (#18579)

* Supporting seq2seq models for `bitsandbytes` integration

- `bitsandbytes` integration supports now seq2seq models
- check if a model has tied weights as an additional check

* small modification

- tie the weights before looking at tied weights!

* Add Donut (#18488)

* First draft

* Improve script

* Update script

* Make conversion work

* Add final_layer_norm attribute to Swin's config

* Add DonutProcessor

* Convert more models

* Improve feature extractor and convert base models

* Fix bug

* Improve integration tests

* Improve integration tests and add model to README

* Add doc test

* Add feature extractor to docs

* Fix integration tests

* Remove register_buffer

* Fix toctree and add missing attribute

* Add DonutSwin

* Make conversion script work

* Improve conversion script

* Address comment

* Fix bug

* Fix another bug

* Remove deprecated method from docs

* Make Swin and Swinv2 untouched

* Fix code examples

* Fix processor

* Update model_type to donut-swin

* Add feature extractor tests, add token2json method, improve feature extractor

* Fix failing tests, remove integration test

* Add do_thumbnail for consistency

* Improve code examples

* Add code example for document parsing

* Add DonutSwin to MODEL_NAMES_MAPPING

* Add model to appropriate place in toctree

* Update namespace to appropriate organization

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Fix URLs (#18604)

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Update BLOOM parameter counts (#18531)

* Update BLOOM parameter counts

* Update BLOOM parameter counts

* [doc] fix anchors (#18591)

the manual anchors end up being duplicated with automatically added anchors and no longer work.

* [fsmt] deal with -100 indices in decoder ids (#18592)

* [fsmt] deal with -100 indices in decoder ids

Fixes: https://github.com/huggingface/transformers/issues/17945

decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. 

For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems.

Any suggestions to how to add a simple test here? or perhaps we have something similar already? user's script is quite massive.

* style

* small change (#18584)

* Flax Remat for LongT5 (#17994)

* [Flax] Add remat (gradient checkpointing)

* fix variable naming in test

* flip: checkpoint using a method

* fix naming

* fix class naming

* apply PVP's suggestions from code review

* add gradient_checkpointing to examples

* Add gradient_checkpointing to run_mlm_flax

* Add remat to longt5

* Add gradient checkpointing test longt5

* Fix args errors

* Fix remaining tests

* Make fixup & quality fixes

* replace kwargs

* remove unecessary kwargs

* Make fixup changes

* revert long_t5_flax changes

* Remove return_dict and copy to LongT5

* Remove test_gradient_checkpointing

Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>

* mac m1 `mps` integration (#18598)

* mac m1 `mps` integration

* Update docs/source/en/main_classes/trainer.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* Apply suggestions from code review

Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* Change scheduled CIs to use torch 1.12.1 (#18644)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Add checks for some workflow jobs (#18583)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* TF: Fix generation repetition penalty with XLA (#18648)

* Update longt5.mdx (#18634)

* Update run_translation_no_trainer.py (#18637)

* Update run_translation_no_trainer.py

found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint

* fixs `no_decay` and `resume_step` issue

1. change `no_decay` list
2. if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1`

* [bnb] Minor modifications (#18631)

* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPi library on DockerFile

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor a bit hardware requirements

* change text a bit

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions Turing & Amperer

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Examples: add Bloom support for token classification (#18632)

* examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow)

* examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it)

* Fix Yolos ONNX export test (#18606)

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Fixup

* Fix up

* Move PIL default arguments inside function for safe imports

* Add image utils to toctree

* Update `rescale` method to reflect changes in #18677

* Update docs/source/en/internal/image_processing_utils.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Address Niels PR comments

* Apply suggestions from code review - remove defaults to None

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix docstrings and revert to PIL.Image.XXX resampling

Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated

* Some more docstrings and PIL.Image tidy up

* Reorganise arguments so flags by modifiers

* Few last docstring fixes

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Ankur Goyal <ankrgyl@gmail.com>
Co-authored-by: Ankur Goyal <ankur@impira.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com>
Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com>
Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com>
Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com>
Co-authored-by: YouJiacheng <1503679330@qq.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Maxime G <joihn@users.noreply.github.com>
Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr>
Co-authored-by: Dan Jones <dan.j.jones2@gmail.com>
Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
Co-authored-by: flozi00 <flozi00.fz@gmail.com>
Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-12 18:32:02 +01:00
Darío Hereñú
5760a8fcf6
Syntax issues (paragraphs 122, 130, 147, 155) Documentation: @sgugger (#19437)
* Syntax issues (paragraphs 122, 130, 147, 155)

`preentramiento` > `preentrenamiento`
* semantic issue (paragraph 220 & 232 & 252)

* Update docs/source/es/create_a_model.mdx

with approval of @ignacioct and scrutiny of @sgugger

Co-authored-by: Ignacio Talavera <ignaciotalaveracepeda@gmail.com>

Co-authored-by: Ignacio Talavera <ignaciotalaveracepeda@gmail.com>
2022-10-12 13:18:11 -04:00
Daniel van Strien
af539d6f0a
fix MarkupLMProcessor option flag (#19526) 2022-10-12 15:08:48 +02:00
Ritik Nandwal
e94384e4d8
Add depth estimation pipeline (#18618)
* Add initial files for depth estimation pipelines

* Add test file for depth estimation pipeline

* Update model mapping names

* Add updates for depth estimation output

* Add generic test

* Hopefully fixing the tests.

* Check if test passes

* Add make fixup and make fix-copies changes after rebase with main

* Rebase with main

* Fixing up depth pipeline.

* This is not used anymore.

* Fixing the test. `Image` is a module `Image.Image` is the type.

* Update docs/source/en/main_classes/pipelines.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-12 08:54:20 -04:00
Darío Hereñú
c60381e90d
Syntax issue (line 497, 526) Documentation @ssuggen (#19442) 2022-10-12 08:28:54 -04:00
NielsRogge
4d367a3c81
Add LiLT (#19450)
* First draft

* Fix more things

* Improve more things

* Remove some head models

* Fix more things

* Add missing layers

* Remove tokenizer

* Fix more things

* Fix copied from statements

* Make all tests pass

* Remove print statements

* Remove files

* Fix README and docs

* Add integration test and fix organization

* Add tips

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Make tests faster, improve docs

* Fix doc tests

* Add model to toctree

* Add docs

* Add note about creating new checkpoint

* Remove is_decoder

* Make tests smaller, add docs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-12 10:11:20 +02:00
Wang, Yi
7543e275d4
update doc for perf_train_cpu_many (#19506)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-10-11 22:54:19 -04:00
Mathieu Jouffroy
5ca131f3d4
[CvT] Tensorflow implementation (#18597)
* implemented TFCvtModel and TFCvtForImageClassification and modified relevant files, added an exception in convert_tf_weight_name_to_pt_weight_name, added quick testing file to compare with pytorch model

* added docstring + testing file in transformers testing suite

* added test in testing file, modified docs to pass repo-consistency, passed formatting test

* refactoring + passing all test

* small refacto, removing unwanted comments

* improved testing config

* corrected import error

* modified acces to pretrained model archive list, to pass tf_test

* corrected import structure in init files

* modified testing for keras_fit with cpu

* correcting PR issues + Refactoring

* Refactoring : improving readability and reducing the number of permutations

* corrected momentum value + cls_token initialization

* removed from_pt as weights were added to the hub

* Update tests/models/cvt/test_modeling_tf_cvt.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-10-11 18:16:52 +01:00
Darío Hereñú
ae710425d2
Syntax issues (lines 126, 203) (#19444) 2022-10-11 08:14:21 -04:00
Stefano Bosisio
b0b962ccca
Add Italian translation for add_new_model.mdx (#18713)
* fix conflicts

* start translating

* proof check

* add toc

* fix errors and typos
2022-10-10 10:12:40 -04:00
amyeroberts
e3f028f3af
Add TF whisper (#19378)
* simplify loop

* add featur extractor

* add model

* start conversion

* add dropout

* initial commit of test files

* copnversion for all models

* update processor for correct padding

* update feature extraction

* update integration test logits match

* fmnt: off for the logits

* on the fly mel bank

* small nit

* update test

* update tokenizer

* nit feature extraction

* update

* update tokenizer test

* adds logit processor and update tokenizer to get supress tokens

* style

* clean convert

* revert to original modeling tf utils

* Update

* update

* nit

* clean convert file

* update tests and nits

* quality

* slow generation test

* ffn_dim to allow customization

* update readme

* add to toctreee

* start fixing integration tests

* update tests and code

* fix feature extractor

* fix config tests common

* update code to fix tests

* fix feature exctractor

* nit feature extraction

* update test for new feature extractor

* style

* add absrtact

* large logits wioth custom decoder input ids

* wraap around is otrch available

* fix feature extractor

* correct logits for whisper small.en

* nit

* fix encoder_attentino_mask

* some fixes

* remove unnecessary inputs

* nits

* add normalizer file

* update etst tokenization

* fix attention mask not defined

* fix generate

* remove uncoder attention mask useless

* update test modeling whisper

* update condfig to add second non supress tokens

* nits on feature exrtactor

* nit for test tokenizers

* update etsts

* update tests

* update tokenization test

* fixup

* invalidated hf token. Clean convert openai to whisper

* fix logit tests

* fixup

* Add model to README

* Fix doc tests

* clean merge

* revert toc_tree changes

* remove useless LogitProcessor

* Update whisper .mdx

* update config file doc

* update configuration docstring

* update test tokenization

* update test tokenization

* update tokenization whisper
Added copied from where needed

* update feature extraction

* nit test name

* style

* quality

* remove get suppress tokens and update non_speech tokens global variables

* Update src/transformers/models/whisper/feature_extraction_whisper.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* clean modeling whisper and test
Removed the attention mask arguments that are deprecated

* fix large test

* Add multilingual audio test, and translate test

* style

* fix larg multilingual test

* nits

* add copied from for attention layer

* remove attention masks in doc

* add english normalizer

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update tokenization test

* remove copied from in whisper attention : no bias in k_proj only

* wrap around dependencies in english normalizer

* style

* correct import generation logits

* for now, wrap feature extractor with torch

* remove torch depencies for feature extraction and style

* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixup

* nit

* update logitds

* style

* nit

* nits and fix final tests

* add `is_more_itertools_available` to utils

* quality

* add begin supress tokens, supress tokens to generate args and config

* clean supressTokensLogitProcessor in generation logits

* Nit naming

* add supressTokensAtBegin

* udpate tests, supress tokens to None or correct values

* nit and style

* update RAG to fit test and generate_logit

* add copy pasted statment on english normalizer

* add arguments to config_common_kwargs

* Update src/transformers/generation_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/generation_logits_process.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* revert changes based on reviews

* update doc and nits

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* more nits

* last nits

* update test configuration common

* add BART name in decoder attention mask documentation

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* style

* nit

* nit

* add english.json file to git

* nits on documentation

* nit

* nits

* last styling

* add main toctree file

* remove sentence piece dependency

* clean init file

* fix tokenizer that has no dependencies on sentencepiece

* update whisper init file, nit

* remove english.json file

* add get decoder prompt id

* All weights loading

* Remove hanging pdb

* Fixup and tidy up

* Use same copied from as PT model

* Remove whitespace changes

* Remove torch references

* Tie embeddings

* Remove logits processor input to generate

* Update logit values

* revert changes and add forced logit processor

* nit

* clean normalizer

* remove protected

* Add logit processors and update generation code & tests

* Some tidy up

* Update docstring

* update

* update based on review

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update to reflect changes on the PT model branch

* Tidy up

* Remove extra whitespace

* Fix test - make input ids small enough we can append

* Include upstream changes on main

* PR comments - add batch tests, remove comments & defaults

* Fix model output imports

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/whisper/test_modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update docstring example

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove changes to adjust_logits_during_generation function

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Tidy up imports that don't require TF

* Update tests - skip and no more skip

* Update tests/generation/test_generation_tf_logits_process.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/whisper/modeling_tf_whisper.py

* Update src/transformers/models/whisper/modeling_tf_whisper.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Add training flags

* Add (skipped) XLA generation tests

* Add embedding correctness test

* Add constant ids for generation tests

* Make logits finding a bit tidier

* Remove unused args

* xla generation enabled

* Don't skip XLA tests anymore

* Fix tests - add position ids to expected signature and update rag generation

* Undo method reorder

* Remove added whitespace

* Remove copy-paste gradient checkopint ref

* Remove

* Trigger CI - (issue with refs when pulling)

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2022-10-10 14:48:17 +01:00
APAVOU Clément
af69360bf9
Add OPTForQuestionAnswering (#19402)
* Add `OPTForQuestionAnswering`

- added `OPTForQuestionAnswering` class based on `BloomForQuestionAnswering`
- added `OPTForQuestionAnswering` in common tests
- all common tests pass
- make fixup done

* added docstrings for OPTForQuestionAnswering

* Fix docstrings for OPTForQuestionAnswering
2022-10-10 09:30:59 -04:00
Mohit Sharma
3080bb4754
Add onnx support for VisionEncoderDecoder (#19254)
* Add onnx support for VisionEncoderDecoder

* Add onnx support for VisionEncoderDecoder

* Removed unused import

* Rename encoder hidden state

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update docstrings and removed redundant code

* Added test function for enc-dec models

* Update doc string text

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* fixed code style

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-10 09:20:19 -04:00
Darío Hereñú
3410705730
Fixed duplicated line (paragraph #83) Documentation: @sgugger (#19436)
* Fixed duplicated line (paragraph #83) @omarespejel @sgugger

* Datasets map denomination fixed (paragraph 42)
2022-10-10 09:08:34 -04:00
Darío Hereñú
83dc49b69b
Backtick fixed (paragraph 68) (#19440) 2022-10-10 08:47:14 -04:00
Amrit Sahu
e9a49babee
[WIP] Add ZeroShotObjectDetectionPipeline (#18445) (#18930)
* Add ZeroShotObjectDetectionPipeline (#18445)

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add batching for ZeroShotObjectDetectionPipeline

* Fix doc-string ZeroShotObjectDetectionPipeline

* Fix output format: ZeroShotObjectDetectionPipeline
2022-10-07 10:00:19 -04:00
Bibhabasu Mohapatra
e162cebfa3
add ONNX support for swin transformer (#19390)
* swin transformer onnx support

* Updated image dimensions as dynamic

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-10-07 09:23:24 -04:00
Alara Dirik
ae3e3bc60a
fix docs example, add object_detection to DETR docs (#19377) 2022-10-07 00:02:26 +02:00
Arthur
45e14038f2
Add WhisperModel to transformers (#19166)
* simplify loop

* add featur extractor

* add model

* start conversion

* add dropout

* initial commit of test files

* copnversion for all models

* update processor for correct padding

* update feature extraction

* update integration test logits match

* fmnt: off for the logits

* on the fly mel bank

* small nit

* update test

* update tokenizer

* nit feature extraction

* update

* update tokenizer test

* adds logit processor and update tokenizer to get supress tokens

* style

* clean convert

* revert to original modeling tf utils

* Update

* update

* nit

* clean convert file

* update tests and nits

* quality

* slow generation test

* ffn_dim to allow customization

* update readme

* add to toctreee

* start fixing integration tests

* update tests and code

* fix feature extractor

* fix config tests common

* update code to fix tests

* fix feature exctractor

* nit feature extraction

* update test for new feature extractor

* style

* add absrtact

* large logits wioth custom decoder input ids

* wraap around is otrch available

* fix feature extractor

* correct logits for whisper small.en

* nit

* fix encoder_attentino_mask

* some fixes

* remove unnecessary inputs

* nits

* add normalizer file

* update etst tokenization

* fix attention mask not defined

* Add model to README

* Fix doc tests

* fix generate

* remove uncoder attention mask useless

* update test modeling whisper

* update condfig to add second non supress tokens

* nits on feature exrtactor

* nit for test tokenizers

* update etsts

* update tests

* update tokenization test

* fixup

* invalidated hf token. Clean convert openai to whisper

* fix logit tests

* fixup

* clean merge

* revert toc_tree changes

* remove useless LogitProcessor

* Update whisper .mdx

* update config file doc

* update configuration docstring

* update test tokenization

* update test tokenization

* update tokenization whisper
Added copied from where needed

* update feature extraction

* nit test name

* style

* quality

* remove get suppress tokens and update non_speech tokens global variables

* Update src/transformers/models/whisper/feature_extraction_whisper.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* clean modeling whisper and test
Removed the attention mask arguments that are deprecated

* fix large test

* Add multilingual audio test, and translate test

* style

* fix larg multilingual test

* nits

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add copied from for attention layer

* remove attention masks in doc

* add english normalizer

* update tokenization test

* remove copied from in whisper attention : no bias in k_proj only

* wrap around dependencies in english normalizer

* style

* correct import generation logits

* for now, wrap feature extractor with torch

* Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/whisper.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* remove torch depencies for feature extraction and style

* fixup

* nit

* update logitds

* style

* nit

* nits and fix final tests

* add `is_more_itertools_available` to utils

* quality

* add begin supress tokens, supress tokens to generate args and config

* clean supressTokensLogitProcessor in generation logits

* Nit naming

* add supressTokensAtBegin

* udpate tests, supress tokens to None or correct values

* nit and style

* update RAG to fit test and generate_logit

* add copy pasted statment on english normalizer

* add arguments to config_common_kwargs

* Update src/transformers/generation_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/generation_logits_process.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* revert changes based on reviews

* update doc and nits

* more nits

* last nits

* update test configuration common

* add BART name in decoder attention mask documentation

* Update src/transformers/models/whisper/modeling_whisper.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* style

* nit

* nit

* add english.json file to git

* nits on documentation

* nit

* nits

* last styling

* add main toctree file

* remove sentence piece dependency

* clean init file

* fix tokenizer that has no dependencies on sentencepiece

* update whisper init file, nit

* remove english.json file

* add get decoder prompt id

* revert changes and add forced logit processor

* nit

* clean normalizer

* remove protected

* update

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update based on review

* Update src/transformers/models/whisper/configuration_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add batched tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-05 22:28:31 +02:00
Alara Dirik
07e94bf159
Maskformer post-processing fixes and improvements (#19172)
- Improves MaskFormer docs, corrects minor typos
- Restructures MaskFormerFeatureExtractor.post_process_panoptic_segmentation for better readability, adds target_sizes argument for optional resizing
- Adds post_process_semantic_segmentation and post_process_instance_segmentation methods.
- Adds a deprecation warning to post_process_segmentation method in favour of post_process_instance_segmentation
2022-10-05 15:27:15 +03:00
Younes Belkada
587d84b178
Add BloomForQuestionAnswering (#19310)
* add bloom for question answering

- attempt to add Bloom for question answering
- adapted from `GPTJForQuestionAnswering`
- Fixed `num_labels` to `2` for common tests
- Added a bit of docstring
- All common tests pass

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert changes related to `num_labels`

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-04 17:52:13 +02:00
Steven Liu
68f50f3453
Breakup export guide (#19271)
* split onnx and torchscript docs

* make style

* apply reviews
2022-10-03 13:18:29 -07:00
Alara Dirik
36f52e9593
Restructure DETR post-processing, return prediction scores (#19262)
* Restructure DetrFeatureExtractor post-processing methods
* Update post_process_instance_segmentation and post_process_panoptic_segmentation methods to return prediction scores
* Update DETR models docs
2022-10-03 12:02:51 +03:00
Kashif Rasul
5cd16f01db
time series forecasting model (#17965)
* initial files

* initial model via cli

* typos

* make a start on the model config

* ready with configuation

* remove tokenizer ref.

* init the transformer

* added initial model forward to return dec_output

* require gluonts

* update dep. ver table and add as extra

* fixed typo

* add type for prediction_length

* use num_time_features

* use config

* more config

* typos

* opps another typo

* freq can be none

* default via transformation is 1

* initial transformations

* fix imports

* added transform_start_field

* add helper to create pytorch dataloader

* added inital val and test data loader

* added initial distr head and loss

* training working

* remove TimeSeriesTransformerTokenizer

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed copyright

* removed docs

* remove time series tokenizer

* fixed docs

* fix text

* fix second

* fix default

* fix order

* use config directly

* undo change

* fix comment

* fix year

* fix import

* add additional arguments for training vs. test

* initial greedy inference loop

* fix inference

* comment out token inputs to enc dec

* Use HF encoder/decoder

* fix inference

* Use Seq2SeqTSModelOutput output

* return Seq2SeqTSPredictionOutput

* added default arguments

* fix return_dict true

* scale is a tensor

* output static_features for inference

* clean up some unused bits

* fixed typo

* set return_dict if none

* call model once for both train/predict

* use cache if future_target is none

* initial generate func

* generate arguments

* future_time_feat is required

* return SampleTSPredictionOutput

* removed unneeded classes

* fix when params is none

* fix return dict

* fix num_attention_heads

* fix arguments

* remove unused shift_tokens_right

* add different dropout configs

* implement FeatureEmbedder, Scaler and weighted_average

* remove gluonts dependency

* fix class names

* avoid _variable names

* remove gluonts dependency

* fix imports

* remove gluonts from configuration

* fix docs

* fixed typo

* move utils to examples

* add example requirements

* config has no freq

* initial run_ts_no_trainer

* remove from ignore

* fix output_attentions and removed unsued getters/setters

* removed unsed tests

* add dec seq len

* add test_attention_outputs

* set has_text_modality=False

* add config attribute_map

* make style

* make fix-copies

* add encoder_outputs to TimeSeriesTransformerForPrediction forward

* Improve docs, add model to README

* added test_forward_signature

* More improvements

* Add more copied from

* Fix README

* Fix remaining quality issues

* updated encoder and decoder

* fix generate

* output_hidden_states and use_cache are optional

* past key_values returned too

* initialize weights of distribution_output module

* fixed more tests

* update test_forward_signature

* fix return_dict outputs

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* removed commented out tests

* added neg. bin and normal output

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* move to one line

* Add docstrings

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add try except for assert and raise

* try and raise exception

* fix the documentation formatting

* fix assert call

* fix docstring formatting

* removed input_ids from DOCSTRING

* Update input docstring

* Improve variable names

* Update order of inputs

* Improve configuration

* Improve variable names

* Improve docs

* Remove key_length from tests

* Add extra docs

* initial unittests

* added test_inference_no_head test

* added test_inference_head

* add test_seq_to_seq_generation

* make style

* one line

* assert mean prediction

* removed comments

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix order of args

* make past_observed_mask optional as well

* added Amazon license header

* updated utils with new fieldnames

* make style

* cleanup

* undo position of past_observed_mask

* fix import

* typo

* more typo

* rename example files

* remove example for now

* Update docs/source/en/_toctree.yml

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update modeling_time_series_transformer.py

fix style

* fixed typo

* fix typo and grammer

* fix style

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <niels.rogge1@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-30 15:32:59 -04:00
Joao Gante
cfb777f27c
Docs - Guide to add a new TensorFlow model (#19256)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-09-30 20:30:38 +01:00
Matt
368b649af6
Rebase ESM PR and update all file formats (#19055)
* Rebase ESM PR and update all file formats

* Fix test relative imports

* Add __init__.py to the test dir

* Disable gradient checkpointing

* Remove references to TFESM... FOR NOW >:|

* Remove completed TODOs from tests

* Convert docstrings to mdx, fix-copies from BERT

* fix-copies for the README and index

* Update ESM's __init__.py to the modern format

* Add to _toctree.yml

* Ensure we correctly copy the pad_token_id from the original ESM model

* Ensure we correctly copy the pad_token_id from the original ESM model

* Tiny grammar nitpicks

* Make the layer norm after embeddings an optional flag

* Make the layer norm after embeddings an optional flag

* Update the conversion script to handle other model classes

* Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py

* Break the copied from link from BertModel.forward to remove token_type_ids

* Remove debug array saves

* Begin ESM-2 porting

* Add a hacky workaround for the precision issue in original repo

* Code cleanup

* Remove unused checkpoint conversion code

* Remove unused checkpoint conversion code

* Fix copyright notices

* Get rid of all references to the TF weights conversion

* Remove token_type_ids from the tests

* Fix test code

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add credit

* Remove _ args and __ kwargs in rotary embedding

* Assertively remove asserts

* Replace einsum with torch.outer()

* Fix docstring formatting

* Remove assertions in tokenization

* Add paper citation to ESMModel docstring

* Move vocab list to single line

* Remove ESMLayer from init

* Add Facebook copyrights

* Clean up RotaryEmbedding docstring

* Fix docstring formatting

* Fix docstring for config object

* Add explanation for new config methods

* make fix-copies

* Rename all the ESM- classes to Esm-

* Update conversion script to allow pushing to hub

* Update tests to point at my repo for now

* Set config properly for tests

* Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted

* make fixup

* Update expected values for slow tests

* make fixup

* Remove EsmForCausalLM for now

* Remove EsmForCausalLM for now

* Fix padding idx test

* Updated README and docs with ESM-1b and ESM-2 separately (#19221)

* Updated README and docs with ESM-1b and ESM-2 separately

* Update READMEs, longer entry with 3 citations

* make fix-copies

Co-authored-by: Your Name <you@example.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Tom Sercu <tsercu@fb.com>
Co-authored-by: Your Name <you@example.com>
2022-09-30 14:16:25 +01:00
NielsRogge
f3d2f7a6e0
Add MarkupLM (#19198)
* First draft

* Make basic test work

* Fix most tokenizer tests

* More improvements

* Make more tests pass

* Fix more tests

* Fix some code quality

* Improve truncation

* Implement feature extractor

* Improve feature extractor and add tests

* Improve feature extractor tests

* Fix pair_input test partly

* Add fast tokenizer

* Improve implementation

* Fix rebase

* Fix rebase

* Fix most of the tokenizer tests.

* propose solution for fast

* add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer

* add: modify markuplmconverter

* add: some modify on converter and tokenizerfast

* Fix style, copies

* Make fixup

* Update tokenization_markuplm.py

* Update test_tokenization_markuplm.py

* Update markuplm related

* Improve processor, add integration test

* Add processor test file

* Improve processor

* Improve processor tests

* Fix more processor tests

* Fix processor tests

* Update docstrings

* Add Copied from statements

* Add more Copied from statements

* Add code examples

* Improve code examples

* Add model to doc tests

* Adding dependency check

* Add dummy file

* Add requires_backends

* Add model to toctree

* Fix more things, disable dependency check for now

* Apply more suggestions

* Add soft dependency

* Add annotators to tests

* Fix style

* Remove from_slow=True

* Remove print statements

* Add sanity check

* Fix processor test

* Fix processor tests, add more docs

* Add doc tests for mdx file

* Add more tips

* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: lockon-n <dd098309@126.com>
2022-09-30 08:25:43 +02:00
mustapha ajeghrir
ba9e336fa3
Fix m2m_100.mdx doc example missing labels (#19149)
The `labels` variable is not defined, the `model_inputs` already contain this information.
2022-09-29 13:27:58 +02:00
Aritra Roy Gosthipaty
0dc7b3a785
[TensorFlow] Adding GroupViT (#18020)
* chore: initial commit

* chore: adding util methods

yet to work on the nn.functional.interpolate port with align_corener=True

* chore: refactor the utils

* used tf.compat.v1.image.resize to align the F.interpolate function
* added type hints to the method signatures
* added references to the gists where one 2 one alignment of torch and tf has been shown

* chore: adding the layers

* chore: porting all the layers from torch to tf

This is the initial draft, nothing is tested yet.

* chore: aligning the layers with reference to tf clip

* chore: aligning the modules

* added demaraction comments
* added copied and adapted from comments

* chore: aligning with CLIP

* chore: wrangling the layers to keep it tf compatible

* chore: aligning the names of the layers for porting

* chore: style changes

* chore: adding docs and inits

* chore: adding tfp dependencis

the code is taken from TAPAS

* chore: initial commit for testing

* chore: aligning the vision embeddings with the vit implementatino

* chore: changing model prefix

* chore: fixing the name of the model and the layer normalization test case

* chore: every test passes but the slow ones

* chore: fix style and integration test

* chore: moving comments below decorators

* chore: make fixup and fix-copies changes

* chore: adding the Vision and Text Model to check_repo

* chore: modifying the prefix name to align it with the torch implementation

* chore: fix typo in configuration

* choer: changing the name of the model variable

* chore: adding segmentation flag

* chore: gante's review

* chore: style refactor

* chore: amy review

* chore: adding shape_list to parts that have been copied from other snippets

* chore: init batchnorm with torch defaults

* chore: adding shape_list to pass the tests

* test fix: adding seed as 0

* set seed

* chore: changing the straight through trick to fix -ve dimensinos

* chore: adding a dimension to the loss

* chore: adding reviewers and contributors names to the docs

* chore: added changes after review

* chore: code quality fixup

* chore: fixing the segmentation snippet

* chore: adding  to the layer calls

* chore: changing int32 to int64 for inputs of serving

* chore: review changes

* chore: style changes

* chore: remove from_pt=True

* fix: repo consistency

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 10:48:04 +01:00
Steven Liu
6957350c2b
Focus doc around preprocessing classes (#18768)
* 📝 reframe docs around preprocessing classes

* small edits

* edits and review

* fix typo

* apply review

* clarify processor
2022-09-28 17:09:44 -07:00
Steven Liu
990936a868
Move AutoClasses under Main Classes (#19163)
* move autoclasses to main classes

* keep auto.mdx in model_doc
2022-09-28 17:09:29 -07:00
Nicola Procopio
e3a30e2b99
translated add_new_pipeline (#19215) 2022-09-27 08:55:41 -04:00
Wang, Yi
88f597ba6a
add doc for hyperparameter search (#19192)
* add doc for hyperparameter search

* update doc
2022-09-27 07:51:51 -04:00
Sylvain Gugger
c20b2c7e18
Use repo_type instead of deprecated datasets repo IDs (#19202)
* Use repo_type instead of deprecated datasets repo IDs

* Add missing one in doc
2022-09-26 09:50:48 -04:00
flozi00
fa4eeb4fd3
german training, accelerate and model sharing (#19171)
* correct spelling in README

* processing

* german training

* accelerate

* german model sharing

* build doc

* ttf links

* casing
2022-09-23 14:52:09 -04:00
Alara Dirik
7e84723fe4
Add semantic segmentation post-processing method to MobileViT (#19105)
* add post-processing method for semantic segmentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-23 16:24:28 +03:00
Wang, Yi
e5b7cff5fe
update perf_train_cpu_many doc (#19151)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-09-22 09:20:15 -04:00
NielsRogge
cf6308ef9b
Improve conditional detr docs (#19154)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-22 13:21:05 +02:00
Sayak Paul
2d9853b226
MSN (Masked Siamese Networks) for ViT (#18815)
* feat: modeling and conversion scripts for msn.

* chore: change license year.

* chore: remove unneeded modules.

* feat: direct loading of state_dict from remote url.

* fix: import paths.

* add: rest of the files.

* add and fix rest of the files.

Co-authored-by: Niels <niels.rogge1@gmail.com>

* chore: formatting.

* code quality fix.

* chore: remove pooler.

* feat: add classification top.

* fix: configuration object.

* add: initial test cases (one failing).

* fix: basemodeloutput.

* add: caution on using the classification head.

* add: rest of the model related files.

* add: vit msn readme.

* fix: copied from statement.

* fix: dummy objects.

* add: ViTMSNPreTrainedModel to inits.

* fix: repo consistency.

* minor change in the model doc.

* fix: tests.

* Empty-Commit

* Update src/transformers/models/vit_msn/configuration_vit_msn.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address PR comments.

* Update src/transformers/models/vit_msn/modeling_vit_msn.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* chore: put model in no_grad() and formatting.

Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2022-09-22 07:15:03 -04:00
NielsRogge
9393f966bc
[fix] Add DeformableDetrFeatureExtractor (#19140)
* Add DeformableDetrFeatureExtractor

* Fix post_process

* Fix name

* Add tests for feature extractor

* Fix doc tests

* Fix name

* Address comments

* Apply same fix to DETR and YOLOS as well

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-22 09:45:24 +02:00
DepuMeng
126a739058
Add support for conditional detr (#18948)
* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* add more fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some variable names & added more fix-copies

* fixed some variable names & added more fix-copies

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added more copied from

* fixed quality

* changed pretrained config

* added more copied-from and fixed the issue in feature_extraction_auto

* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* added conditional_detr files

* checked copies

* fixed style and copies

* fixed some doc issue

* changed prefix to ConditionalDetr

* fixed docs

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* add more fix-copies

* fixed some variable names & added more fix-copies

* fixed some variable names & added more fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added more copied from

* fixed quality

* changed pretrained config

* added more copied-from and fixed the issue in feature_extraction_auto

* fixed style

* added conditional_detr files

* checked copies

* checked copies

* fixed style and copies

* fixed style and copies

* fixed hub

* fixed style

* Update README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/_toctree.yml

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixed some doc issue

* Update docs/source/en/model_doc/conditional_detr.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* changed prefix to ConditionalDetr

* fixed docs

* Update README_ko.md

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fixed use_pretrained issue

* changed post-process

* added conditional_detr files

* checked copies

* fixed style and copies

* fixed some doc issue

* changed prefix to ConditionalDetr

* fixed docs

* added spatial_model_name

* fixed fix-copies

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added some copied from

* added some copied from

* added some copied from

* added some copied from

* fix style quality and copies

* fix style quality and copies

* fix style quality and copies

* add more fix-copies

* fixed some variable names & added more fix-copies

* fixed some variable names & added more fix-copies

* Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/conditional_detr/configuration_conditional_detr.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* added more copied from

* fixed quality

* changed pretrained config

* added more copied-from and fixed the issue in feature_extraction_auto

* rebased

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Depu Meng <depumeng@Depus-MacBook-Pro.local>
2022-09-22 09:45:04 +02:00
Alara Dirik
e7fdfc720a
Add post_process_semantic_segmentation method to DPTFeatureExtractor (#19107)
* add post-processing method for semantic segmentation

* add test for post-processing
2022-09-21 15:15:26 +03:00
Alara Dirik
9e95706648
Add post_process_semantic_segmentation method to SegFormer (#19072)
* add post_process_semantic_segmentation method to SegformerFeatureExtractor
* add test for semantic segmentation post-processing
2022-09-21 11:40:35 +03:00
flozi00
de26241645
german processing (#19121)
* correct spelling in README

* processing
2022-09-20 09:18:21 -04:00
Alara Dirik
c81ebd1c39
Beit postprocessing (#19099)
* add post_process_semantic_segmentation method to BeiTFeatureExtractor
2022-09-20 10:41:56 +03:00
NielsRogge
e7206ceab9
Improve vision models docs (#19103)
* Add tips

* Add BEiT figure

* Fix URL

* Move tip to start

* Add tip to TF model as well

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-19 19:22:34 +02:00
flozi00
ae219532e3
german autoclass (#19049)
* german autoclass

* Update _toctree.yml
2022-09-16 16:16:00 -04:00
Stas Bekman
8edf196310
[doc] debug: fix import (#19042)
correct the import statement
2022-09-14 16:29:58 -07:00
Hakjin Lee
abca1741cf
Fix a broken link for deepspeed ZeRO inference in the docs (#19001)
* Fix a broken link for deepspeed ZeRO inference

* fix link

Co-authored-by: Stas Bekman <stas@stason.org>
2022-09-14 16:21:06 -07:00
Shinya Otani
f5f430e5c8
Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814)
* add gpt-neox-japanese model and tokenizer as new model

* Correction to PR's comment for GPT NeoX Japanese
- Fix to be able to use gpu
- Add comment # Copied... at the top of RotaryEmbedding
- Implement nn.Linear instead of original linear class
- Add generation test under @slow

* fix bias treatment for gpt-neox-japanese

* Modidy gpt-neox-japanese following PR
- add doc for bias_dropout_add
- style change following a PR comment

* add document for gpt-neox-japanese

* remove unused import from gpt-neox-japanese

* fix README for gpt-neox-japanese
2022-09-14 10:17:40 -04:00
NielsRogge
59407bbeb3
Add Deformable DETR (#17281)
* First draft

* More improvements

* Improve model, add custom CUDA code

* Import torch before

* Add script that imports custom layer

* Add everything in new ops directory

* Import custom layer in modeling file

* Fix ARCHIVE_MAP typo

* Creating the custom kernel on the fly.

* Import custom layer in modeling file

* More improvements

* Fix CUDA loading

* More improvements

* Improve conversion script

* Improve conversion script

* Make it work until encoder_outputs

* Make forward pass work

* More improvements

* Make logits match original implementation

* Make implementation also support single_scale model

* Add support for single_scale and dilation checkpoint

* Add support for with_box_refine model

* Support also two stage model

* Improve tests

* Fix more tests

* Make more tests pass

* Upload all models to the hub

* Clean up some code

* Improve decoder outputs

* Rename intermediate hidden states and reference points

* Improve model outputs

* Move tests to dedicated folder

* Improve model outputs

* Fix retain_grad test

* Improve docs

* Clean up and make test_initialization pass

* Improve variable names

* Add copied from statements

* Improve docs

* Fix style

* Improve docs

* Improve docs, move tests to model folder

* Fix rebase

* Remove DetrForSegmentation from auto mapping

* Apply suggestions from code review

* Improve variable names and docstrings

* Apply some more suggestions from code review

* Apply suggestion from code review

* better docs and variables names

* hint to num_queries and two_stage confusion

* remove asserts and code refactor

* add exception if two_stage is True and with_box_refine is False

* use f-strings

* Improve docs and variable names

* Fix code quality

* Fix rebase

* Add require_torch_gpu decorator

* Add pip install ninja to CI jobs

* Apply suggestion of @sgugger

* Remove DeformableDetrForObjectDetection from auto mapping

* Remove DeformableDetrModel from auto mapping

* Add model to toctree

* Add model back to mappings, skip model in pipeline tests

* Apply @sgugger's suggestion

* Fix imports in the init

* Fix copies

* Add CPU implementation

* Comment out GPU function

* Undo previous change

* Apply more suggestions

* Remove require_torch_gpu annotator

* Fix quality

* Add logger.info

* Fix logger

* Fix variable names

* Fix initializaztion

* Add missing initialization

* Update checkpoint name

* Add model to doc tests

* Add CPU/GPU equivalence test

* Add Deformable DETR to pipeline tests

* Skip model for object detection pipeline

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-09-14 11:45:21 +02:00
Chris Emezue
470799b3a6
Removed issue in wav2vec link (#18945)
Fix connected to [this issue](https://github.com/huggingface/transformers/issues/18944)
2022-09-12 21:59:19 +02:00
Tobias Nusser
4c2e983f44
Fixed typo (#18921)
Fixed typo itmes --> items
2022-09-12 21:03:48 +02:00
Rafał Jankowski
85125fcffd
Neptune.ai integration improvements (#18934)
* NeptuneCallback improvements

* After review suggestions and deduplication of initial run

* Added volatile checkpoints support due to missing post-rebase commit

* Update README per review comments

- Remove list formatting
- Correct Neptune docs link

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>
2022-09-09 11:37:34 -04:00
HuYong
22f7218560
add task_type_id to BERT to support ERNIE-2.0 and ERNIE-3.0 models (#18686)
* add_ernie

* remove Tokenizer in ernie

* polish code

* format code style

* polish code

* fix style

* update doc

* make fix-copies

* change model name

* change model name

* fix dependency

* add more copied from

* rename ErnieLMHeadModel to ErnieForCausalLM
do not expose ErnieLayer
update doc

* fix

* make style

* polish code

* polish code

* fix

* fix

* fix

* fix

* fix

* final fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-09 07:36:46 -04:00
NielsRogge
bb6f6d5338
Add X-CLIP (#18852)
* First draft

* Improve conversion script

* Make vision encoder work

* More improvements

* Improve conversion script

* Fix quality

* Add MultiframeIntegrationTransformer

* More improvements

* Make MiT output work

* Fix quality

* Add prompts generator

* Add tests

* Fix some tests

* Fix some more tests

* Fix more tests

* Improve conversion script

* Fix model outputs

* Fix more tests

* Add XClipProcessor

* Use processor in conversion script

* Fix integration test

* Update README, fix docs

* Fix all tests

* Add MIT output to XClipOutput

* Create better variable names

* Rename XClip to XCLIP

* Extend conversion script

* Add support for large models

* Add support for 16 frame models

* Add another model'

* Fix module issue

* Apply suggestions from code review

* Add figure to docs

* Fix CLIPProcessor issue

* Apply suggestions from code review

* Delete file

* Convert more checkpoints

* Convert last checkpoint

* Update nielsr to microsoft
2022-09-08 14:50:30 +02:00
Devlee247
9832ac7c73
Fix LayoutXLM wrong link in README (#18932)
* fix LayoutXLM wrong link in README

* fix LayoutXLM worng link in index.mdx
2022-09-08 07:32:41 -04:00
Steven Liu
90f6fe9155
Skip some doctests in quicktour (#18927)
* skip some code examples for doctests

* make style

* fix code snippet formatting

* separate code snippet into two blocks
2022-09-07 14:45:22 -07:00
Ankur Goyal
2ef7742117
Add DocumentQuestionAnswering pipeline (#18414)
* [WIP] Skeleton of VisualQuestionAnweringPipeline extended to support LayoutLM-like models

* Fixup

* Use the full encoding

* Basic refactoring to DocumentQuestionAnsweringPipeline

* Cleanup

* Improve args, docs, and implement preprocessing

* Integrate OCR

* Refactor question_answering pipeline

* Use refactored QA code in the document qa pipeline

* Fix tests

* Some small cleanups

* Use a string type annotation for Image.Image

* Update encoding with image features

* Wire through the basic docs

* Handle invalid response

* Handle empty word_boxes properly

* Docstring fix

* Integrate Donut model

* Fixup

* Incorporate comments

* Address comments

* Initial incorporation of tests

* Address Comments

* Change assert to ValueError

* Comments

* Wrap `score` in float to make it JSON serializable

* Incorporate AutoModeLForDocumentQuestionAnswering changes

* Fixup

* Rename postprocess function

* Fix auto import

* Applying comments

* Improve docs

* Remove extra assets and add copyright

* Address comments

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-07 13:38:49 -04:00
Matt
2b9513fdab
Update TF fine-tuning docs (#18654)
* Update TF fine-tuning docs

* Fix formatting

* Add some section headers so the right sidebar works better

* Squiggly it

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Explain things in the text, not the comments

* Make the two dataset creation methods into a list

* Move the advice about collation out of a <Tip>

* Edits for clarity

* Edits for clarity

* Edits for clarity

* Replace `to_tf_dataset` with `prepare_tf_dataset` in the fine-tuning pages

* Restructure the page a little bit

* Restructure the page a little bit

* Restructure the page a little bit

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-07 13:30:07 +01:00
Ekagra Ranjan
0a632f076d
Fix incorrect size of input for 1st strided window length in Perplexity of fixed-length models (#18906)
* update the PPL for stride 512

* fix 1st strided window size

* linting

* fix typo

* styling
2022-09-06 15:20:12 -04:00
Ekagra Ranjan
f85acb4d73
Fix decode_input_ids to bare T5Model and improve doc (#18791)
* use tokenizer to output tensor

* add preprocessing for decoder_input_ids for bare T5Model

* add preprocessing to tf and flax

* linting

* linting

* Update src/transformers/models/t5/modeling_flax_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-09-06 14:12:26 +02:00
Surya Prakash Sahu
17c634fd5b
Update perf_train_gpu_one.mdx (#18442) 2022-09-05 14:06:36 +02:00
Lysandre Debut
591cfc6c90
Mention TF and Flax checkpoints (#18894) 2022-09-05 11:09:39 +02:00
Steven Liu
65fb71bc76
Add Trainer to quicktour (#18723)
* 📝 update quicktour

* 📝 add trainer section

* 🖍 markdown table, apply feedbacks

*  make style

* add tf training section

* make style
2022-09-02 15:05:31 -05:00
Steven Liu
ae32f3afef
Finetune guide for semantic segmentation (#18640)
* 📝 first draft

* oops add to toctree

* make style

* 📝 add inference section

* 🖍 make style

* 📝 add images

* 🖍 apply feedbacks

* remove num_labels and pytorch block

* apply feedbacks, add colab notebook

Co-authored-by: Steven <stevhliu@gmail.com>
2022-09-02 14:29:51 -05:00
Steven Liu
bf9d506137
Update docs landing page (#18590)
* 📝 update docs landing page

* 🖍 apply feedbacks

* apply feedbacks

* apply feedbacks, use <br> for list
2022-09-02 14:29:06 -05:00
Jason Phang
53e33e6f1b
PEGASUS-X (#18551)
* PegasusX Initial commit

* rename

* pegasus X implementation

* pegx update

* pegx fix

* pegasus-x fixes

* pegx updates

* cleanup

* cleanup

* cleanup

* tests

* stylefixes

* Documentation update

* Model hub fix

* cleanup

* update

* update

* testfix

* Check fix

* tweaks for merging

* style

* style

* updates for pr

* style

* change pegasus-x repo
2022-09-02 19:54:02 +02:00
NielsRogge
17981faf67
Add OWL-ViT to the appropriate section (#18867)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-02 15:59:25 +02:00
NielsRogge
c60dd98e87
[LayoutLM] Add clarification to docs (#18716)
* Add clarification

* Add another clarification

* Apply suggestion

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-02 14:48:19 +02:00
OlivierDehaene
129d73294e
Fix naming issue with ImageToText pipeline (#18864)
Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-02 07:55:30 -04:00
Steven Liu
142e12afb4
Split docs on modality (#18205)
* update

* 🖍 add missing files

* 📝 add nested sections

* 🖍 align titles with tasks

* oops

* remove quotes from titles
2022-09-01 15:19:11 -05:00
OlivierDehaene
ddb69e5af8
Add Image To Text Generation pipeline (#18821)
* Add Image2TextGenerationPipeline to supported pipelines

* Add Flax and Tensorflow support

* Add Flax and Tensorflow small tests

* Add default model for Tensorflow

* Add docstring

* Fix doc style

* Add tiny models for pytorch and flax

* Remove flax from pipeline.
Fix tests

* Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and Tensorflow

* Fix Tensorflow support

Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-01 12:07:14 -04:00
Sayak Paul
954e18ab97
TensorFlow MobileViT (#18555)
* initial implementation.

* add: working model till image classification.

* add: initial implementation that passes intg tests.

Co-authored-by: Amy <aeroberts4444@gmail.com>

* chore: formatting.

* add: tests (still breaking because of config mismatch).

Coo-authored-by: Yih <2521628+ydshieh@users.noreply.github.com>

* add: corrected tests and remaning changes.

* fix code style and repo consistency.

* address PR comments.

* address Amy's comments.

* chore: remove from_pt argument.

* chore: add full-stop.

* fix: TFLite model conversion in the doc.

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply formatting.

* chore: remove comments from the example block.

* remove identation in the example.

Co-authored-by: Amy <aeroberts4444@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-01 10:35:15 -04:00
Pedro Cuenca
f719c0377f
Minor typo in prose of model outputs documentation. (#18848) 2022-09-01 12:05:40 +02:00
flozi00
359f7b4b8d
Create pipeline_tutorial.mdx german docs (#18625)
* Create pipeline_tutorial.mdx

* Update _toctree.yml
2022-09-01 09:57:59 +02:00
lewtun
80367cd1fb
Add security warning about the from_pretrained() method (#18801)
* Add security warning about from_pretrained() method

* Add sentence about malware scanner

Co-authored-by: Julien Chaumond <julien@huggingface.co>
2022-08-31 21:48:40 +02:00
NielsRogge
7e7f743481
Add SegFormer ONNX support (#18006)
* Add ONNX support

* Make height and width dynamic axes

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-31 20:58:44 +02:00
Ankur Goyal
5c4c869014
Add LayoutLMForQuestionAnswering model (#18407)
* Add LayoutLMForQuestionAnswering model

* Fix output

* Remove TF TODOs

* Add test cases

* Add docs

* TF implementation

* Fix PT/TF equivalence

* Fix loss

* make fixup

* Fix up documentation code examples

* Fix up documentation examples + test them

* Remove LayoutLMForQuestionAnswering from the auto mapping

* Docstrings

* Add better docstrings

* Undo whitespace changes

* Update tokenizers in comments

* Fixup code and remove `from_pt=True`

* Fix tests

* Revert some unexpected docstring changes

* Fix tests by overriding _prepare_for_class

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-08-31 10:05:33 +02:00
Dhruv Karan
220da3b8a1
Adds GroupViT to models exportable with ONNX (#18628)
* groupvit to onnx

* dynamic shape for pixel values dim
2022-08-30 14:31:35 +02:00
Dhruv Karan
46d0e26a27
Adds OWLViT to models exportable with ONNX (#18588)
* onnx conversion for owlvit

* .T to .t()

* dynamic shapes for pixel values
2022-08-30 14:30:59 +02:00
Christoffer Koo Øhrstrøm
de8548ebf3
[LayoutLMv3] Add TensorFlow implementation (#18678)
Co-authored-by: Esben Toke Christensen <esben.christensen@visma.com>
Co-authored-by: Lasse Reedtz <lasse.reedtz@visma.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-08-30 11:48:11 +01:00
Philipp Schmid
f2fbe44753
Fix broken link DeepSpeed documentation link (#18783)
* Fix broken link

* Trigger CI

Co-authored-by: Stas Bekman <stas@stason.org>
2022-08-28 19:32:19 -07:00
Patrick Deutschmann
3223d49354
Add ONNX support for Longformer (#17176)
* Implement ONNX support for Longformer

Fix repo consistency check complaints

Fix value mismatches

Add pooler output for default model

Increase validation atol to accommodate multiple-choice error

Fix copies

Fix chunking for longer sequence lengths

Add future comment

* Fix issue in mask_invalid_locations

* Remove torch imports in configuration_longformer

* Change config access to fix LED

* Push opset version to support tril

* Work in review comments (mostly style)

* Add Longformer to ONNX tests
2022-08-25 08:34:42 +02:00
Daniel Stancl
c72d7d91bf
Add TF implementation of XGLMModel (#16543)
* Add TFXGLM models 

* Add todo: self.supports_xla_generation = False

Co-authored-by: Daniel Stancl <stancld@Daniels-MacBook-Pro.local>
Co-authored-by: Daniel Stancl <stancld@daniels-mbp.home>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Daniel <daniel.stancl@rossum.ai>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-08-24 10:51:05 +01:00
Yih-Dar
cecf9f9b27
fix pipeline_tutorial.mdx doctest (#18717)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-24 05:38:03 -04:00
Mishig Davaadorj
c12dbdc246
Update perf_infer_gpu_many.mdx (#18744) 2022-08-24 10:37:52 +02:00
Younes Belkada
a123eee9df
[bnb] Move documentation (#18671)
* fix bnb documentation

- move bnb documentation to `infer_gpu_many`

* small refactoring

- added text on infer_gpu_one
- added a small note on infer_gpu_many
- added customized multi gpu example on infer_gpu_many

* Update docs/source/en/perf_infer_gpu_many.mdx

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply suggestions

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-08-18 17:34:48 +02:00
Younes Belkada
6d175c1129
[bnb] Minor modifications (#18631)
* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPi library on DockerFile

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor a bit hardware requirements

* change text a bit

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions Turing & Amperer

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-08-17 00:48:10 +02:00
flozi00
a27195b1de
Update longt5.mdx (#18634) 2022-08-16 10:20:46 -05:00
Sourab Mangrulkar
9cf274685a
mac m1 mps integration (#18598)
* mac m1 `mps` integration

* Update docs/source/en/main_classes/trainer.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* Apply suggestions from code review

Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>

* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
2022-08-16 16:34:51 +05:30
Stas Bekman
37c5991843
[doc] fix anchors (#18591)
the manual anchors end up being duplicated with automatically added anchors and no longer work.
2022-08-12 10:49:59 -07:00