Commit Graph

1102 Commits

Author SHA1 Message Date
Stas Bekman
f15c99fabf
[deepspeed docs] misc additions (#15585)
* [deepspeed docs] round_robin_gradients

* training and/or eval/predict loss is

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-11 10:54:04 -08:00
Steven Liu
85aee09e9a
🖍 remove broken link (#15615) 2022-02-11 12:33:55 -06:00
Sylvain Gugger
6cf06d198c
Mark "code in the Hub" API as experimental (#15624) 2022-02-11 09:55:31 -05:00
Ngo Quang Huy
c0864d98ba
Correct JSON format (#15600) 2022-02-10 09:02:03 -08:00
lewtun
2e8b85f72e
Add local and TensorFlow ONNX export examples to docs (#15604)
* Add local and TensorFlow ONNX export examples to docs

* Use PyTorch - TensorFlow split
2022-02-10 16:31:00 +01:00
Alberto Bégué
cb7ed6e083
Add Tensorflow handling of ONNX conversion (#13831)
* Add TensorFlow support for ONNX export

* Change documentation to mention conversion with Tensorflow

* Refactor export into export_pytorch and export_tensorflow

* Check model's type instead of framework installation to choose between TF and Pytorch

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Alberto Bégué <alberto.begue@della.ai>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-02-10 11:18:41 +01:00
Sylvain Gugger
c722753afd
Expand tutorial for custom models (#15587)
* Expand tutorial for custom models

* Style

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-02-09 17:44:28 -05:00
NielsRogge
a86ee2261e
Add link (#15588)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2022-02-09 23:33:39 +01:00
Stas Bekman
dee17d5676
[trainer docs] document how to select specific gpus (#15551)
* [trainer docs] document how to select specific gpus

* expand

* add urls

* add accelerate launcher
2022-02-09 10:12:29 -08:00
Chan Woo Kim
2b5603f6ac
Constrained Beam Search [without disjunctive decoding] (#15416)
* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undersirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically ran locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests having failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should but work fails without the beam search moditification ; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrsalConstraint docstring

* docstrings improvements

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-09 16:59:26 +01:00
Leandro von Werra
d923f76203
add model scaling section (#15119)
* add model scaling section

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* integrate reviewer feedback

* initialize GPU properly

* add note about BnB optimizer

* move doc from `scaling.mdx` to `performance.mdx`

* integrate reviewer feedback

* revert section levels

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-09 15:27:30 +01:00
Sylvain Gugger
b5c6fdecf0
PoC for a ProcessorMixin class (#15549)
* PoC for a ProcessorMixin class

* Documentation

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Roll out to other processors

* Add base feature extractor class in init

* Use args and kwargs

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-09 09:24:49 -05:00
Nathan Raw
fcb4f11c92
📝 Add codecarbon callback to docs (#15563) 2022-02-08 14:10:53 -05:00
Joao Gante
8406fa6dd5
Add TFSpeech2Text (#15113)
* Add wrapper classes

* convert inner layers to tf

* Add TF Encoder and Decoder layers

* TFSpeech2Text models

* Loadable model

* TF model with same outputs as PT model

* test skeleton

* correct tests and run the fixup

* correct attention expansion

* TFSpeech2Text pask_key_values with TF format
2022-02-08 16:27:23 +00:00
aaron
87d08afb16
electra is added to onnx supported model (#15084)
* electra is added to onnx supported model

* add google/electra-base-generator for test onnx module

Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
2022-02-08 15:47:49 +01:00
Steven Liu
552f8d3091
Create a custom model guide (#15489)
* 📝 add config section

* 📝 finish first draft

* 📝 add feature extractor and processor

* 🖍 apply feedback from review

* 📝 minor edits

* last review
2022-02-07 12:34:56 -06:00
lewtun
6775b211b6
Remove Longformers from ONNX-supported models (#15273) 2022-02-07 17:32:13 +01:00
NielsRogge
84eec9e6ba
Add ConvNeXT (#15277)
* First draft

* Add conversion script

* Improve conversion script

* Improve docs and implement tests

* Define model output class

* Fix tests

* Fix more tests

* Add model to README

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply more suggestions from code review

* Apply suggestions from code review

* Rename dims to hidden_sizes

* Fix equivalence test

* Rename gamma to gamma_parameter

* Clean up conversion script

* Add ConvNextFeatureExtractor

* Add corresponding tests

* Implement feature extractor correctly

* Make implementation cleaner

* Add ConvNextStem class

* Improve design

* Update design to also include encoder

* Fix gamma parameter

* Use sample docstrings

* Finish conversion, add center cropping

* Replace nielsr by facebook, make feature extractor tests smaller

* Fix integration test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-07 16:11:37 +01:00
Stas Bekman
8ce1330631
[deepspeed docs] DeepSpeed ZeRO Inference (#15486)
* [deepspeed docs] DeepSpeed ZeRO Inference

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweak

* deal with black

* extra cleanup, better comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-04 13:51:02 -08:00
Sylvain Gugger
ac6aa10f23
Standardize semantic segmentation models outputs (#15469)
* Standardize instance segmentation models outputs

* Rename output

* Update src/transformers/modeling_outputs.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add legacy argument to the config and model forward

* Update src/transformers/models/beit/modeling_beit.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Copy fix in Segformer

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2022-02-04 14:52:07 -05:00
Stas Bekman
31be2f45a9
[deepspeed docs] Megatron-Deepspeed info (#15488) 2022-02-04 11:15:13 -08:00
Stas Bekman
21dcaec5d5
[deepspeed docs] memory requirements (#15506) 2022-02-03 10:55:14 -08:00
Sylvain Gugger
44b21f117b
Save code of registered custom models (#15379)
* Allow dynamic modules to use relative imports

* Work for configs

* Fix last merge conflict

* Save code of registered custom objects

* Map strings to strings

* Fix test

* Add tokenizer

* Rework tests

* Tests

* Ignore fixtures py files for tests

* Tokenizer test + fix collection

* With full path

* Rework integration

* Fix typo

* Remove changes in conftest

* Test for tokenizers

* Add documentation

* Update docs/source/custom_models.mdx

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add file structure and file content

* Add more doc

* Style

* Update docs/source/custom_models.mdx

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Address review comments

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-02-02 10:44:37 -05:00
Steven Liu
b9418a1d97
Update tutorial docs (#15165)
* first draft of pipeline, autoclass, preprocess tutorials

* apply review feedback

* 🖍 apply feedback from patrick/niels

* 📝add output image to preprocessed image

* 🖍 apply feedback from patrick
2022-02-01 18:31:35 -06:00
Steven Liu
c157c7e3fd
Update fine-tune docs (#15259)
* add fine-tune tutorial

* make edits, fix style

* 📝 make edits

* 🖍 fix code format links to external libraries

* 🔄revert code formatting

* 🖍 use DefaultDataCollator instead of DataCollatorWithPadding
2022-02-01 18:28:12 -06:00
Stas Bekman
44c7857b87
[deepspeed doc] fix import, extra notes (#15400)
* [deepspeed doc] fix import, extra notes

* typo
2022-01-31 08:28:10 -08:00
NielsRogge
47df0f2234
Add header (#15434) 2022-01-31 11:15:54 -05:00
Ogundepo Odunayo
282ae123e2
add t5 ner finetuning (#15432) 2022-01-31 17:03:06 +01:00
Soonhwan-Kwon
e09473a817
Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py (#13727)
* add xlm roberta xl

* add convert xlm xl fairseq checkpoint to pytorch

* fix init and documents for xlm-roberta-xl

* fix indention

* add test for XLM-R xl,xxl

* fix model hub name

* fix some stuff

* up

* correct init

* fix more

* fix as suggestions

* add torch_device

* fix default values of doc strings

* fix leftovers

* merge to master

* up

* correct hub names

* fix docs

* fix model

* up

* finalize

* last fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add copied from

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-29 13:42:37 +01:00
Steven Liu
16d4acbfdb
Get started docs (#15098)
* clean commit of changes

* apply review feedback, make edits

* fix backticks, minor formatting

* 🖍 make fixup and minor edits

* 🖍 fix # in header

* 📝 update code sample without from_pt

* 📝 final review
2022-01-28 19:01:37 -06:00
Steven Liu
cabd6d26a2
Update model share tutorial (#15288)
* add model sharing tutorial

* 🖍 apply feedback from review

* 📝 make edits

* 🖍 fix formatting

* 📝 convert from pt checkpoint to flax

* 📝 final review
2022-01-28 18:49:26 -06:00
Suraj Patil
d25e25ee2b
Add XGLM models (#14876)
* add xglm

* update vocab size

* fix model name

* style and tokenizer

* typo

* no mask token

* fix pos embed compute

* fix args

* fix tokenizer

* fix positions

* fix tokenization

* style and dic fixes

* fix imports

* add fast tokenizer

* update names

* add pt tests

* fix tokenizer

* fix typo

* fix tokenizer import

* fix fast tokenizer

* fix tokenizer

* fix converter

* add tokenizer test

* update checkpoint names

* fix tokenizer tests

* fix slow tests

* add copied from comments

* rst -> mdx

* flax model

* update flax tests

* quality

* style

* doc

* update index and readme

* fix copies

* fix doc

* update toctrr

* fix indent

* minor fixes

* fix config doc

* don't save embed_pos weights

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address Sylvains commnets, few doc fixes

* fix check_repo

* align order of arguments

* fix copies

* fix labels

* remove unnecessary mapping

* fix saving tokenizer

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-28 18:55:23 +01:00
Ngo Quang Huy
4996922b6d
[docs] fix wrong file name in pr_check (#15380) 2022-01-28 07:52:01 -05:00
Steven Liu
f5db6ce76a
Fix code format for Accelerate doc (#15335)
* 🖍 fix code syntax to external libraries and replace image

* 🔄revert code formatting, replace image with code block

* 🖍 apply feedback
2022-01-27 13:49:04 -06:00
Lysandre
f87db5e412 Release: v4.16.0 2022-01-27 13:06:33 -05:00
Sylvain Gugger
8f6454bfac
Add proper documentation for Keras callbacks (#15374)
* Add proper documentation for Keras callbacks

* Add dummies
2022-01-27 10:51:38 -05:00
Stas Bekman
fc8fc400e3
[docs] post-PR merge fix (#15355)
* [docs] post-PR merge fix

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-26 11:23:32 -08:00
novice
99a2771189
Add YOSO (#15091)
* Add cookiecutter files

* Add cuda kernels and cpp files

* Update modeling_yoso.py

* Add .h files

* Update configuration_yoso.py

* Updates

* Remove tokenizer

* Code quality

* Update modeling_yoso.py

* Update modeling_yoso.py

* Fix failing test

* Update modeling_yoso.py

* Fix code quality

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review and fix integration tests

* Update src/transformers/models/yoso/modeling_yoso.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Apply suggestions from code review

* Fix copied from statement

* Fix docstring

* Fix code quality

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions and fix mask

* Apply suggestions from code review

* Fix code quality

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix docstrings

* Fix code quality

* Remove trailing whitespace

* Update yoso.mdx

* Move kernel loading to YosoEncoder

* make style

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/yoso/modeling_yoso.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add short summary to docs

* Update docs/source/model_doc/yoso.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update yoso.mdx

* Update docs/source/model_doc/yoso.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove CausalLM model and add copied from

* Remove autoregressive code

* Remove unused imports

* add copied from for embeddings

* Fix code quality

* Update docs/source/model_doc/yoso.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestion from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-26 19:18:29 +01:00
Ngo Quang Huy
5d8b98608c
Fix deepspeed docs (#15346) 2022-01-26 07:24:33 -05:00
Jacob Deppen
96161ac408
make table into valid Markdown table syntax (#15337) 2022-01-26 07:10:00 -05:00
Maciej Pawłowski
e79a0faeae
Added missing code in exemplary notebook - custom datasets fine-tuning (#15300)
* Added missing code in exemplary notebook - custom datasets fine-tuning

Added missing code in tokenize_and_align_labels function in the exemplary notebook on custom datasets - token classification.
The missing code concerns adding labels for all but first token in a single word.
The added code was taken directly from huggingface official example - this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb).

* Changes requested in the review - keep the code as simple as possible
2022-01-25 17:26:17 -05:00
Steven Liu
0501beb846
Add 🤗 Accelerate tutorial (#15263)
* add accelerate tutorial

* 🖍 apply feedback from review

* 📝 make edits
2022-01-25 13:46:11 -06:00
novice
d43e308e7f
Add Swin Transformer (#15085)
* Add all files

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updates

* Apply suggestions from review

* Fix failing tests

* Update __init__.py

* Update configuration_swin.py

* Update auto_factory.py

* Fix pytests

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix tests and default checkpoint

* Fix Recursion error

* Code quality

* Remove copied from

* Update modeling_swin.py

* Code quality

* Update modeling_swin.py

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

* Fix feature extractor

* Fix code quality

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

* Update configuration_swin.py

* Update default checkpoint

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/model_doc/swin.mdx

Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>

* Update conversion script

* Reformat conversion script

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
2022-01-21 12:10:41 +01:00
NielsRogge
515ed3ad2a
Fix doc examples (#15257) 2022-01-20 21:51:51 +01:00
Kamal Raj
08b41b413a
Update pipelines.mdx (#15243)
fix few spelling mistakes
2022-01-20 08:46:48 -05:00
NielsRogge
80f7296091
Update Trainer code example (#15070)
* Update code example

* Fix code quality

* Add comment
2022-01-19 20:15:12 +01:00
NielsRogge
ac227093e4
Add ViLT (#14895)
* First commit

* Add conversion script

* Make conversion script work for base model

* More improvements

* Update conversion script, works for vqa

* Add indexing argument to meshgrid

* Make conversion script work for ViltForPreTraining

* Add ViltForPreTraining to docs

* Fix device issue

* Add processor

* Add MinMaxResize to feature extractor

* Implement call method of ViltProcessor

* Fix tests

* Add integration test

* Add loss calculation for VQA

* Improve tests

* Improve some more tests

* Debug tests

* Small improvements

* Add support for attention_mask

* Remove mask_it

* Add pixel_mask

* Add tests for ViltFeatureExtractor

* Improve tests

* Add ViltForNaturalLanguageVisualReasoning

* Add ViltForNaturalLanguageVisualReasoning to conversion script

* Minor fixes

* Add support for image_embeds, update docstrings to markdown

* Update docs to markdown

* Improve conversion script

* Rename ViltForPreTraining to ViltForMaskedLM

* Improve conversion script

* Convert docstrings to markdown

* Fix code example of retrieval model

* Properly convert masked language model

* Add integration test for nlvr

* Fix code quality

* Apply suggestions from code review

* Add copied from statements

* Fix pretrained_config_archive_map

* Fix docs

* Add model to README

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply more suggestions from code review

* Make code more readable

* Add ViltForNaturalLanguageVisualReasoning to the tests

* Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering

* Replace pixel_values_2 by single tensor

* Add hidden_states and attentions

* Fix one more test

* Fix all tests

* Update year

* Fix rebase issues

* Fix another rebase issue

* Remove ViltForPreTraining from auto mapping

* Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval

* Make it possible to use BertTokenizerFast in the processor

* Use BertTokenizerFast by default

* Rename ViltForNaturalLanguageVisualReasoning, define custom model output

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-19 19:51:59 +01:00
NielsRogge
842298f84f
[ViTMAE] Various fixes (#15221)
* Add MAE to AutoFeatureExtractor

* Add link to notebook

* Fix relative paths
2022-01-19 15:27:57 +01:00
Li-Huai (Allan) Lin
841d979190
Add FastTokenizer to REALM (#15211)
* Remove BertTokenizer abstraction

* Add FastTokenizer to REALM

* Fix config archive map

* Fix copies

* Update realm.mdx

* Apply suggestions from code review
2022-01-19 15:19:36 +01:00
Sylvain Gugger
db3503949d Finish conversion of REALM doc to MDX 2022-01-18 18:00:30 -05:00