Commit Graph

1111 Commits

Author SHA1 Message Date
Stas Bekman
bee361c6f1
[t5/t0/mt5 models] faster/leaner custom layer norm (#14656)
* [t5] faster/leaner custom layer norm

* wip

* apex.normalization.FusedRMSNorm

* cleanup

* cleanup

* add doc

* add catch all

* Trigger CI

* expand
2022-02-15 16:49:57 -08:00
Patrick von Platen
2e12b907ae
TF generate refactor - Greedy Search (#15562)
* TF generate start refactor

* Add tf tests for sample generate

* re-organize

* boom boom

* Apply suggestions from code review

* re-add

* add all code

* make random greedy pass

* make encoder-decoder random work

* further improvements

* delete bogus file

* make gpt2 and t5 tests work

* finish logits tests

* correct logits processors

* correct past / encoder_outputs drama

* refactor some methods

* another fix

* refactor shape_list

* fix more shape list

* import shape
_list

* finish docs

* fix imports

* make style

* correct tf utils

* Fix TFRag as well

* Apply Lysandre's and Sylvais suggestions

* Update tests/test_generation_tf_logits_process.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/tf_utils.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* remove cpu according to gante

* correct logit processor

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-02-15 17:54:43 +01:00
Nicolas Patry
cdf19c501d
Re-export KeyDataset. (#15645)
* Re-export `KeyDataset`.

* Update the docs locations.
2022-02-15 17:49:38 +01:00
Stas Bekman
28e6155d8a
add a network debug script and document it (#15652)
* add a network debug script and document it

* doc
2022-02-15 08:48:00 -08:00
jonrbates
86a7845c0c
Fix typo in speech2text2 doc (#15617)
Forward looks for inputs, not input_ids
2022-02-15 13:54:34 +01:00
fra
05a8580964 Revert "logger doc"
This reverts commit 41168a49ce.
2022-02-15 10:46:45 +01:00
fra
41168a49ce logger doc 2022-02-15 10:03:28 +01:00
NielsRogge
b090b79022
Make Swin work with VisionEncoderDecoderModel (#15527)
* Add attribute_map

* Add mention in docs

* Set hidden_size attribute correctly

* Add note about Transformer-based models only

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2022-02-14 17:33:35 +01:00
Daniel Erenrich
4f403ea899
Fix grammar in tokenizer_summary (#15614)
"to make ensure" is redundant.
2022-02-11 16:51:30 -05:00
Stas Bekman
f15c99fabf
[deepspeed docs] misc additions (#15585)
* [deepspeed docs] round_robin_gradients

* training and/or eval/predict loss is

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-11 10:54:04 -08:00
Steven Liu
85aee09e9a
🖍 remove broken link (#15615) 2022-02-11 12:33:55 -06:00
Sylvain Gugger
6cf06d198c
Mark "code in the Hub" API as experimental (#15624) 2022-02-11 09:55:31 -05:00
Ngo Quang Huy
c0864d98ba
Correct JSON format (#15600) 2022-02-10 09:02:03 -08:00
lewtun
2e8b85f72e
Add local and TensorFlow ONNX export examples to docs (#15604)
* Add local and TensorFlow ONNX export examples to docs

* Use PyTorch - TensorFlow split
2022-02-10 16:31:00 +01:00
Alberto Bégué
cb7ed6e083
Add Tensorflow handling of ONNX conversion (#13831)
* Add TensorFlow support for ONNX export

* Change documentation to mention conversion with Tensorflow

* Refactor export into export_pytorch and export_tensorflow

* Check model's type instead of framework installation to choose between TF and Pytorch

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Alberto Bégué <alberto.begue@della.ai>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-02-10 11:18:41 +01:00
Sylvain Gugger
c722753afd
Expand tutorial for custom models (#15587)
* Expand tutorial for custom models

* Style

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-02-09 17:44:28 -05:00
NielsRogge
a86ee2261e
Add link (#15588)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2022-02-09 23:33:39 +01:00
Stas Bekman
dee17d5676
[trainer docs] document how to select specific gpus (#15551)
* [trainer docs] document how to select specific gpus

* expand

* add urls

* add accelerate launcher
2022-02-09 10:12:29 -08:00
Chan Woo Kim
2b5603f6ac
Constrained Beam Search [without disjunctive decoding] (#15416)
* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undersirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically ran locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests having failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should but work fails without the beam search moditification ; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrsalConstraint docstring

* docstrings improvements

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-09 16:59:26 +01:00
Leandro von Werra
d923f76203
add model scaling section (#15119)
* add model scaling section

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* integrate reviewer feedback

* initialize GPU properly

* add note about BnB optimizer

* move doc from `scaling.mdx` to `performance.mdx`

* integrate reviewer feedback

* revert section levels

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-09 15:27:30 +01:00
Sylvain Gugger
b5c6fdecf0
PoC for a ProcessorMixin class (#15549)
* PoC for a ProcessorMixin class

* Documentation

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Roll out to other processors

* Add base feature extractor class in init

* Use args and kwargs

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-02-09 09:24:49 -05:00
Nathan Raw
fcb4f11c92
📝 Add codecarbon callback to docs (#15563) 2022-02-08 14:10:53 -05:00
Joao Gante
8406fa6dd5
Add TFSpeech2Text (#15113)
* Add wrapper classes

* convert inner layers to tf

* Add TF Encoder and Decoder layers

* TFSpeech2Text models

* Loadable model

* TF model with same outputs as PT model

* test skeleton

* correct tests and run the fixup

* correct attention expansion

* TFSpeech2Text pask_key_values with TF format
2022-02-08 16:27:23 +00:00
aaron
87d08afb16
electra is added to onnx supported model (#15084)
* electra is added to onnx supported model

* add google/electra-base-generator for test onnx module

Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
2022-02-08 15:47:49 +01:00
Steven Liu
552f8d3091
Create a custom model guide (#15489)
* 📝 add config section

* 📝 finish first draft

* 📝 add feature extractor and processor

* 🖍 apply feedback from review

* 📝 minor edits

* last review
2022-02-07 12:34:56 -06:00
lewtun
6775b211b6
Remove Longformers from ONNX-supported models (#15273) 2022-02-07 17:32:13 +01:00
NielsRogge
84eec9e6ba
Add ConvNeXT (#15277)
* First draft

* Add conversion script

* Improve conversion script

* Improve docs and implement tests

* Define model output class

* Fix tests

* Fix more tests

* Add model to README

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply more suggestions from code review

* Apply suggestions from code review

* Rename dims to hidden_sizes

* Fix equivalence test

* Rename gamma to gamma_parameter

* Clean up conversion script

* Add ConvNextFeatureExtractor

* Add corresponding tests

* Implement feature extractor correctly

* Make implementation cleaner

* Add ConvNextStem class

* Improve design

* Update design to also include encoder

* Fix gamma parameter

* Use sample docstrings

* Finish conversion, add center cropping

* Replace nielsr by facebook, make feature extractor tests smaller

* Fix integration test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-07 16:11:37 +01:00
Stas Bekman
8ce1330631
[deepspeed docs] DeepSpeed ZeRO Inference (#15486)
* [deepspeed docs] DeepSpeed ZeRO Inference

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweak

* deal with black

* extra cleanup, better comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-02-04 13:51:02 -08:00
Sylvain Gugger
ac6aa10f23
Standardize semantic segmentation models outputs (#15469)
* Standardize instance segmentation models outputs

* Rename output

* Update src/transformers/modeling_outputs.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add legacy argument to the config and model forward

* Update src/transformers/models/beit/modeling_beit.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Copy fix in Segformer

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2022-02-04 14:52:07 -05:00
Stas Bekman
31be2f45a9
[deepspeed docs] Megatron-Deepspeed info (#15488) 2022-02-04 11:15:13 -08:00
Stas Bekman
21dcaec5d5
[deepspeed docs] memory requirements (#15506) 2022-02-03 10:55:14 -08:00
Sylvain Gugger
44b21f117b
Save code of registered custom models (#15379)
* Allow dynamic modules to use relative imports

* Work for configs

* Fix last merge conflict

* Save code of registered custom objects

* Map strings to strings

* Fix test

* Add tokenizer

* Rework tests

* Tests

* Ignore fixtures py files for tests

* Tokenizer test + fix collection

* With full path

* Rework integration

* Fix typo

* Remove changes in conftest

* Test for tokenizers

* Add documentation

* Update docs/source/custom_models.mdx

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add file structure and file content

* Add more doc

* Style

* Update docs/source/custom_models.mdx

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Address review comments

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-02-02 10:44:37 -05:00
Steven Liu
b9418a1d97
Update tutorial docs (#15165)
* first draft of pipeline, autoclass, preprocess tutorials

* apply review feedback

* 🖍 apply feedback from patrick/niels

* 📝add output image to preprocessed image

* 🖍 apply feedback from patrick
2022-02-01 18:31:35 -06:00
Steven Liu
c157c7e3fd
Update fine-tune docs (#15259)
* add fine-tune tutorial

* make edits, fix style

* 📝 make edits

* 🖍 fix code format links to external libraries

* 🔄revert code formatting

* 🖍 use DefaultDataCollator instead of DataCollatorWithPadding
2022-02-01 18:28:12 -06:00
Stas Bekman
44c7857b87
[deepspeed doc] fix import, extra notes (#15400)
* [deepspeed doc] fix import, extra notes

* typo
2022-01-31 08:28:10 -08:00
NielsRogge
47df0f2234
Add header (#15434) 2022-01-31 11:15:54 -05:00
Ogundepo Odunayo
282ae123e2
add t5 ner finetuning (#15432) 2022-01-31 17:03:06 +01:00
Soonhwan-Kwon
e09473a817
Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py (#13727)
* add xlm roberta xl

* add convert xlm xl fairseq checkpoint to pytorch

* fix init and documents for xlm-roberta-xl

* fix indention

* add test for XLM-R xl,xxl

* fix model hub name

* fix some stuff

* up

* correct init

* fix more

* fix as suggestions

* add torch_device

* fix default values of doc strings

* fix leftovers

* merge to master

* up

* correct hub names

* fix docs

* fix model

* up

* finalize

* last fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add copied from

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-29 13:42:37 +01:00
Steven Liu
16d4acbfdb
Get started docs (#15098)
* clean commit of changes

* apply review feedback, make edits

* fix backticks, minor formatting

* 🖍 make fixup and minor edits

* 🖍 fix # in header

* 📝 update code sample without from_pt

* 📝 final review
2022-01-28 19:01:37 -06:00
Steven Liu
cabd6d26a2
Update model share tutorial (#15288)
* add model sharing tutorial

* 🖍 apply feedback from review

* 📝 make edits

* 🖍 fix formatting

* 📝 convert from pt checkpoint to flax

* 📝 final review
2022-01-28 18:49:26 -06:00
Suraj Patil
d25e25ee2b
Add XGLM models (#14876)
* add xglm

* update vocab size

* fix model name

* style and tokenizer

* typo

* no mask token

* fix pos embed compute

* fix args

* fix tokenizer

* fix positions

* fix tokenization

* style and dic fixes

* fix imports

* add fast tokenizer

* update names

* add pt tests

* fix tokenizer

* fix typo

* fix tokenizer import

* fix fast tokenizer

* fix tokenizer

* fix converter

* add tokenizer test

* update checkpoint names

* fix tokenizer tests

* fix slow tests

* add copied from comments

* rst -> mdx

* flax model

* update flax tests

* quality

* style

* doc

* update index and readme

* fix copies

* fix doc

* update toctrr

* fix indent

* minor fixes

* fix config doc

* don't save embed_pos weights

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address Sylvains commnets, few doc fixes

* fix check_repo

* align order of arguments

* fix copies

* fix labels

* remove unnecessary mapping

* fix saving tokenizer

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-28 18:55:23 +01:00
Ngo Quang Huy
4996922b6d
[docs] fix wrong file name in pr_check (#15380) 2022-01-28 07:52:01 -05:00
Steven Liu
f5db6ce76a
Fix code format for Accelerate doc (#15335)
* 🖍 fix code syntax to external libraries and replace image

* 🔄revert code formatting, replace image with code block

* 🖍 apply feedback
2022-01-27 13:49:04 -06:00
Lysandre
f87db5e412 Release: v4.16.0 2022-01-27 13:06:33 -05:00
Sylvain Gugger
8f6454bfac
Add proper documentation for Keras callbacks (#15374)
* Add proper documentation for Keras callbacks

* Add dummies
2022-01-27 10:51:38 -05:00
Stas Bekman
fc8fc400e3
[docs] post-PR merge fix (#15355)
* [docs] post-PR merge fix

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-01-26 11:23:32 -08:00
novice
99a2771189
Add YOSO (#15091)
* Add cookiecutter files

* Add cuda kernels and cpp files

* Update modeling_yoso.py

* Add .h files

* Update configuration_yoso.py

* Updates

* Remove tokenizer

* Code quality

* Update modeling_yoso.py

* Update modeling_yoso.py

* Fix failing test

* Update modeling_yoso.py

* Fix code quality

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review and fix integration tests

* Update src/transformers/models/yoso/modeling_yoso.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Apply suggestions from code review

* Fix copied from statement

* Fix docstring

* Fix code quality

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions and fix mask

* Apply suggestions from code review

* Fix code quality

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix docstrings

* Fix code quality

* Remove trailing whitespace

* Update yoso.mdx

* Move kernel loading to YosoEncoder

* make style

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/yoso/modeling_yoso.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add short summary to docs

* Update docs/source/model_doc/yoso.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update yoso.mdx

* Update docs/source/model_doc/yoso.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove CausalLM model and add copied from

* Remove autoregressive code

* Remove unused imports

* add copied from for embeddings

* Fix code quality

* Update docs/source/model_doc/yoso.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestion from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-01-26 19:18:29 +01:00
Ngo Quang Huy
5d8b98608c
Fix deepspeed docs (#15346) 2022-01-26 07:24:33 -05:00
Jacob Deppen
96161ac408
make table into valid Markdown table syntax (#15337) 2022-01-26 07:10:00 -05:00
Maciej Pawłowski
e79a0faeae
Added missing code in exemplary notebook - custom datasets fine-tuning (#15300)
* Added missing code in exemplary notebook - custom datasets fine-tuning

Added missing code in tokenize_and_align_labels function in the exemplary notebook on custom datasets - token classification.
The missing code concerns adding labels for all but first token in a single word.
The added code was taken directly from huggingface official example - this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb).

* Changes requested in the review - keep the code as simple as possible
2022-01-25 17:26:17 -05:00