Commit Graph

1549 Commits

Author SHA1 Message Date
Steven Liu
a44985b41c
add cv + audio labels (#20114) 2022-11-09 07:40:15 -08:00
Joao Gante
f270b960d6
Generate: move generation_*.py src files into generation/*.py (#20096)
* move generation_*.py src files into generation/*.py

* populate generation.__init__ with lazy loading

* move imports and references from generation.xxx.object to generation.object
2022-11-09 15:34:08 +00:00
amyeroberts
4eb918e656
AutoImageProcessor (#20111)
* AutoImageProcessor skeleton

* Update references

* Add mapping in init

* Add model image processors to __init__ for importing

* Add AutoImageProcessor tests

* Fix up

* Image Processor documentation

* Remove pdb

* Update docs/source/en/model_doc/mobilevit.mdx

* Update docs

* Don't add whitespace on json files

* Remove fixtures

* Move checking model config down

* Fix up

* Add check for image processor

* Remove FeatureExtractorMixin in docstrings

* Rename model_tmpfile to config_tmpfile

* Don't make None if not in image processor map
2022-11-08 19:54:41 +00:00
Weiwe Shi
efa889d2e4
Add RocBert (#20013)
* add roc_bert

* update roc_bert readme

* code style

* change name and delete unuse file

* udpate model file

* delete unuse log file

* delete tokenizer fast

* reformat code and change model file path

* add RocBertForPreTraining

* update docs

* delete wrong notes

* fix copies

* fix make repo-consistency error

* fix files are not present in the table of contents error

* change RocBert -> RoCBert

* add doc, add detail test

Co-authored-by: weiweishi <weiweishi@tencent.com>
2022-11-08 10:03:43 -05:00
NielsRogge
258963062b
Add CLIPSeg (#20066)
* Add first draft

* Update conversion script

* Improve conversion script

* Improve conversion script some more

* Add conditional embeddings

* Add initial decoder

* Fix activation function of decoder

* Make decoder outputs match original implementation

* Make decoder outputs match original implementation

* Add more copied from statements

* Improve model outputs

* Fix auto tokenizer file

* Fix more tests

* Add test

* Improve README and docs, improve conditional embeddings

* Fix more tests

* Remove print statements

* Remove initial embeddings

* Improve conversion script

* Add interpolation of position embeddings

* Finish addition of interpolation of position embeddings

* Add support for refined checkpoint

* Fix refined checkpoint

* Remove unused parameter

* Improve conversion script

* Add support for training

* Fix conversion script

* Add CLIPSegFeatureExtractor

* Fix processor

* Fix CLIPSegProcessor

* Fix conversion script

* Fix most tests

* Fix equivalence test

* Fix README

* Add model to doc tests

* Use better variable name

* Convert other checkpoint as well

* Update config, add link to paper

* Add docs

* Update organization

* Replace base_model_prefix with clip

* Fix base_model_prefix

* Fix checkpoint of config

* Fix config checkpoint

* Remove file

* Use logits for output

* Fix tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-08 10:55:47 +01:00
Tom Aarsen
6156bffa2b
Replace awkward timm link with the expected one (#20109) 2022-11-07 13:57:39 -05:00
Steven Liu
71f772ebd0
Add new terms to the glossary (#20051)
* add new terms

* apply review
2022-11-07 10:45:27 -08:00
Tom Aarsen
d44ac47bac
docs: Fixed variables in f-strings (#20087)
* docs: Fixed variables in f-strings

* Replace unknown `block` with known `block_type` in ValueError

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add missing torch import in docs code block

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-11-07 13:18:09 -05:00
Tom Aarsen
3222fc645b
docs: Resolve many typos in the English docs (#20088)
* docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance'

* docs: Resolve many typos in the English docs

Typos found via 'codespell ./docs/source/en'
2022-11-07 09:19:04 -05:00
Tom Aarsen
b8112eddec
Replace unsupported facebookresearch/bitsandbytes (#20093)
With https://github.com/TimDettmers/bitsandbytes, which is by the same author and is still being updated
2022-11-07 08:52:03 -05:00
Jordan Clive
3bd0007e87
Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068)
Co-authored-by: jordiclive <jordiclive19@imperial.ac.uk>
2022-11-04 11:32:44 -04:00
Matt
6e1c5786dc
Update READMEs for ESMFold and add notebooks (#20067)
* Update READMEs for ESMFold and add notebooks

* Fix PyCharm formatting

* make fix-copies
2022-11-04 15:10:13 +00:00
Wang, Yi
2564f0c21d fix jit trace error for model forward sequence is not aligned with jit.trace tuple input sequence, update related doc (#19891)
* fix jit trace error for classification usecase, update related doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add implementation in torch 1.14.0

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update_doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update_doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-11-03 10:50:03 -04:00
Sanchit Gandhi
06d488061f
[Whisper Tokenizer] Make more user-friendly (#19921)
* [Whisper Tokenizer] Make more user-friendly

* use property

* make indexing rigorous

* small clean-up

* tests

* skip seq2seq tests

* remove multilingual arg

* reorder args

* collapse to one function

Co-authored-by: ArthurZucker <arthur@huggingface.co>

* option to override attributes

Co-authored-by: ArthurZucker <arthur@huggingface.co>

* add to docs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make comment more clear

Co-authored-by: sgugger <sylvain@huggingface.co>

* don't add special tokens in get_decoder_prompt_ids

* add test for set_prefix_tokens

Co-authored-by: ArthurZucker <arthur@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sgugger <sylvain@huggingface.co>
2022-11-03 14:22:40 +00:00
Yih-Dar
9ccea7acb1
Fix some doctests after PR 15775 (#20036)
* Add skip_special_tokens=True in some doctest

* For T5

* Fix for speech_to_text.mdx

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-03 14:18:45 +01:00
Steven Liu
aa39967b28
reorganize glossary (#20010) 2022-11-02 16:58:17 -07:00
Yih-Dar
fb7cbe236b
Fix doctest (#20023)
* Fix doctest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-02 19:37:25 +01:00
amyeroberts
a6b7759880
Add Image Processors (#19796)
* Add CLIP image processor

* Crop size as dict too

* Update warning

* Actually use logger this time

* Normalize doesn't change dtype of input

* Add perceiver image processor

* Tidy up

* Add DPT image processor

* Add Vilt image processor

* Tidy up

* Add poolformer image processor

* Tidy up

* Add LayoutLM v2 and v3 imsge processors

* Tidy up

* Add Flava image processor

* Tidy up

* Add deit image processor

* Tidy up

* Add ConvNext image processor

* Tidy up

* Add levit image processor

* Add segformer image processor

* Add in post processing

* Fix up

* Add ImageGPT image processor

* Fixup

* Add mobilevit image processor

* Tidy up

* Add postprocessing

* Fixup

* Add VideoMAE image processor

* Tidy up

* Add ImageGPT image processor

* Fixup

* Add ViT image processor

* Tidy up

* Add beit image processor

* Add mobilevit image processor

* Tidy up

* Add postprocessing

* Fixup

* Fix up

* Fix flava and remove tree module

* Fix image classification pipeline failing tests

* Update feature extractor in trainer scripts

* Update pad_if_smaller to accept tuple and int size

* Update for image segmentation pipeline

* Update src/transformers/models/perceiver/image_processing_perceiver.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Update src/transformers/image_processing_utils.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/beit/image_processing_beit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* PR comments - docstrings; remove accidentally added resize; var names

* Update docstrings

* Add exception if size is not in the right format

* Fix exception check

* Fix up

* Use shortest_edge in tuple in script

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-11-02 11:57:36 +00:00
Steven Liu
79c720c062
fix typo (#20006) 2022-11-01 11:30:36 -07:00
Steven Liu
ab74ac11e4
Add LayoutLMv3 resource (#19932)
* add layoutlmv3 resource

* add layoutlmv2 resources

* fix button
2022-11-01 11:10:46 -07:00
Steven Liu
dec8578e70
Add BERT resources (#19852)
* add resources for bert

* add course chapters

* apply reviews

* add pipeline icons and community resource

* fix buttons
2022-11-01 11:09:53 -07:00
Steven Liu
1f6885bad0
add dataset (#20005) 2022-11-01 10:37:20 -07:00
Sayak Paul
c87ae86a8f
Update image_classification.mdx (#19996) 2022-11-01 07:54:41 -04:00
Mohit Sharma
c796b6dea6
Added onnx config whisper (#19525)
* Added onnx config whisper

* added whisper support onnx

* add audio input data

* added whisper support onnx

* fixed the seqlength value

* Updated the whisper onnx ocnfig

* restore files to old version

* removed attention mask from inputs

* Updated get_dummy_input_onnxruntime docstring

* Updated relative imports and token generation

* update docstring
2022-11-01 07:50:42 -04:00
Matt
7f9b7b3f0e
Add ESMFold (#19977)
* initial commit

* First draft that gets outputs without crashing!

* Add all the ported openfold dependencies

* testing

* Restructure config files for ESMFold

* Debugging to find output discrepancies

* Mainly style

* Make model runnable without extra deps

* Remove utils and merge them to the modeling file

* Use correct gelu and remove some debug prints

* More cleanup

* Update esm docs

* Update conversion script to support ESMFold properly

* Port some top-level changes from ESMFold repo

* Expand EsmFold docstrings

* Make attention_mask optional (default to all 1s)

* Add inference test for ESMFold

* Use config and not n kwargs

* Add modeling output class

* Remove einops

* Remove chunking in ESM FFN

* Update tests for ESMFold

* Quality

* REpo consistency

* Remove tree dependency from ESMFold

* make fixup

* Add an error in case my structure map function breaks later

* Remove needless code

* Stop auto-casting the LM to float16 so CPU tests pass

* Stop auto-casting the LM to float16 so CPU tests pass

* Final test updates

* Split test file

* Copyright and quality

* Unpin PyTorch to see built doc

* Fix config file to_dict() method

* Add some docstrings to the output

* Skip TF checkpoint tests for ESM until we reupload those

* make fixup

* More docstrings

* Unpin to get even with main

* Flag example to write

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-10-31 21:32:58 -04:00
Jean Charles Kouame
6aede2d602
Tranformers documentation translation to Italian #17459 (#19988) 2022-10-31 13:19:15 -04:00
NielsRogge
0b294c2334
[Conditional, Deformable DETR] Add postprocessing methods (#19709)
* Add postprocessing methods

* Update docs

* Add fix

* Add test

* Add test for deformable detr postprocessing

* Add post processing methods for segmentation

* Update code examples

* Add post_process to make the pipeline work

* Apply updates

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-10-31 08:28:44 +01:00
Steven Liu
2e35bac4e7
Add wav2vec2 resources (#19931)
* add wav2vec2 resources

* apply review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2022-10-28 13:28:18 -07:00
Steven Liu
9d2788b46b
add resources for distilbert (#19930) 2022-10-28 13:16:07 -07:00
Steven Liu
b0a2c3a2d6
add resources for bart (#19928) 2022-10-28 13:15:43 -07:00
Raghav Prabhakar
0d4c45c585
Add Onnx Config for ImageGPT (#19868)
* add Onnx Config for ImageGPT

* add generate_dummy_inputs for onnx config

* add TYPE_CHECKING clause

* Update doc for generate_dummy_inputs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-28 09:39:53 -04:00
Steven Liu
e4132952a1
Add GPT2 resources (#19879)
* add resources for gpt2

* add pipeline icons and community resources
2022-10-27 11:34:00 -07:00
Steven Liu
d818dd3a41
Add BLOOM resources (#19881)
* add bloom resources

* add pipeline icon
2022-10-27 11:33:52 -07:00
Steven Liu
50f5266b2c
Add T5 resources (#19878)
* add resources for t5

* add pipeline icons and community resources
2022-10-27 11:33:37 -07:00
Steven Liu
536a8ae6ad
Add RoBERTa resources (#19911)
* add roberta resources

* fix typo
2022-10-27 11:33:15 -07:00
Younes Belkada
7a1c68a845
Add flan-t5 documentation page (#19892)
* add `flan-t5` documentation page

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add more content

* revert `_toctree` modif

* revert `toctree` modif - 2

* Update README.md

* Revert "Update README.md"

This reverts commit 5660714429.

* Update README_es.md

* Update README_zh-hans.md

* Update README_zh-hant.md

* Update README_ko.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-26 17:22:57 +02:00
Lysandre Debut
eedaba682f
[Past CI] Vilt only supports PT >= v1.10 (#19851)
* Support for Vilt in v1.9

* Skip if not higher or equal than 1.10

* Move test :)

* I am bad at python
2022-10-25 15:59:35 +02:00
Davi Alves
0a77249178
Added translation of serialization.mdx to Portuguese Issue #16824 (#19869)
* [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial.

* [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.

* [ serialization.mdx ] - Translated to Portuguese the serialization tutorial.
2022-10-25 09:34:28 -04:00
Alberto Mario Ceballos-Arroyo
371337a95b
Spanish translation of multiple_choice.mdx, question_answering.mdx. (#19821)
* Translated multiple_choice.mdx, question_answering.mdx. Added them to _toctree.yml

* Added translation for a missed line.

* Update _toctree.yml as per Omar's suggestions

* Update multiple_choice.mdx as per Omar's comments

* Updt question_answering.mdx as per Omar's comments
2022-10-24 20:11:34 -04:00
Steven Liu
9ecb13d63a
add small updates only (#19847) 2022-10-24 10:18:20 -07:00
Yih-Dar
072ed01c38
Fix doctest for MarkupLM (#19845)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-24 17:54:23 +02:00
Dhruv Singal
5cbf1fa8ca
fixed typo in fp16 training section for perf_train_gpu_one (#19736) 2022-10-24 10:04:28 -04:00
Davi Alves
743995e0e6
Added translation of converting_tensorflow_models.mdx to Portuguese Issue #16824 (#19824)
* [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial.

* [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.

* [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.
2022-10-24 09:50:16 -04:00
zhou fan
3b419cfc6f
fix broken links in testing.mdx (#19820) 2022-10-24 09:48:02 -04:00
Davi Alves
74b3eb3dea
Added translation of run_scripts.mdx to Portuguese Issue #16824 (#19800)
* [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial.

* [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.
2022-10-21 17:38:35 -04:00
Davi Alves
2ebf4e6a7b
[ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. (#19779) 2022-10-21 09:48:19 -04:00
ftorres16
c1f009ad9a
Update training.mdx (#19791) 2022-10-21 09:46:44 -04:00
Rohit Gupta
2dd1b8f0c5
adding key pair dataset (#19765) 2022-10-20 09:05:49 -04:00
amyeroberts
5041bc3511
Image transforms add center crop (#19718)
* Add center crop to transforms library

* Return PIL images if PIL image input by default

* Fixup and add docstring

* Trigger CI

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/image_transforms.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR comments - move comments; unindent

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-19 16:15:01 +01:00
GMFTBY
71786b10c5
Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py (#19477)
* add: the contrastive search for generaton_utils

* add: testing scripts for contrastive search under examples/text-generation

* update the quality of codes

* revise the docstring; make the generation_contrastive_search.py scripts;

* revise the examples/pytorch/text-generation/run_generation_contrastive_search.py to the auto-APIs format

* revise the necessary documents

* fix: revise the docstring of generation_contrastive_search.py

* Fix the code indentation

* fix: revise the nits and examples in contrastive_search docstring.

* fix the copyright

* delete generation_contrastive_search.py

* revise the logic in contrastive_search

* update the intergration test and the docstring

* run the tests over

* add the slow decorate to the contrastive_search intergrate test

* add more test

* do the style, quality, consistency checks
2022-10-19 10:17:46 +01:00