transformers/utils/documentation_tests.txt
Younes Belkada 163ac3d3ee
Add Switch transformers (#19323)
* first commit

* add more comments

* add router v1

* clean up

- remove `tf` modeling files

* clean up

- remove `tf` modeling files

* clean up

* v0 routers

* added more router

- Implemented `ExpertsChooseMaskedRouter`

- added tests
- 2 more routers to implement

* last router

* improved docstring

- completed the docstring in `router.py`
- added more args in the config

* v0 sparse mlp

* replace wrong naming

* forward pass run

* update MOE layer

* small router update

* fixup

* consistency

* remove scatter router

* remove abstract layer

* update test and model for integration testing

* v1 conversion

* update

* hardcode hack

* all keys match

* add gin conversion, without additional libraries

* update conversion sctipy

* delete router file

* update tests wrt router deletion

* fix router issues

* update expert code

* update, logits match, code needsREFACTORING

* Refactor code

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* add generate tests

Co-authored-by: younesbelkada <younesbelkada@gmail.com>

* add support for router loss

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fix forward error

* refactor a bit

* remove `FlaxSwitchTransformers` modules

* more tests pass

* Update code

Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>

* fixup

* fix tests

* fix doc

* fix doc + tokenization

* fix tokenizer test

* fix test

* fix loss output

* update code for backward pass

* add loss support

* update documentation

* fix documentation, clean tokenizer

* more doc fix, cleanup example_switch

* fix failing test

* fix test

* fix test

* fix loss issue

* move layer

* update doc and fix router capacity usage

* fixup

* add sparse mlp index for documentation on hub

* fixup

* test sparse mix architecture

* Apply suggestions from code review

* Update docs/source/en/model_doc/switch_transformers.mdx

* fixup on update

* fix tests

* fix another test

* attempt fix

* Update src/transformers/models/switch_transformers/configuration_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* try

* all tests pass

* fix jitter noise

* Apply suggestions from code review

* doc tests pass

* Update src/transformers/models/switch_transformers/modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/switch_transformers/modeling_switch_transformers.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove assert

* change config order

* fix readme japanese

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove parallelizable tests + add one liners

* remove ONNX config

* fix nits

- add `T5Tokenizer` in auto mapping
- remove `Switch Transformers` from ONNX supported models

* remove `_get_router`

* remove asserts

* add check in test for `router_dtype`

* add `SwitchTransformersConfig` in `run_pipeline_test`

* Update tests/pipelines/test_pipelines_summarization.py

* add huge model conversion script

* fix slow tests

- add better casting for `Linear8bitLt`
- remove `torchscript` tests

* add make dir

* style on new script

* fix nits

- doctest
- remove `_keys_to_ignore_on_load_unexpected`

* Update src/transformers/models/switch_transformers/configuration_switch_transformers.py

* add google as authors

* fix year

* remove last `assert` statements

* standardize vertical spaces

* fix failing import

* fix another failing test

* Remove strange àuthorized_keys`

* removing todo and padding that is never used

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: ybelkada <younes@huggingface.co>
Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
2022-11-15 13:06:45 +01:00

197 lines
11 KiB
Plaintext

docs/source/en/quicktour.mdx
docs/source/es/quicktour.mdx
docs/source/en/pipeline_tutorial.mdx
docs/source/en/autoclass_tutorial.mdx
docs/source/en/task_summary.mdx
docs/source/en/model_doc/markuplm.mdx
docs/source/en/model_doc/speech_to_text.mdx
docs/source/en/model_doc/switch_transformers.mdx
docs/source/en/model_doc/t5.mdx
docs/source/en/model_doc/t5v1.1.mdx
docs/source/en/model_doc/byt5.mdx
docs/source/en/model_doc/tapex.mdx
docs/source/en/model_doc/donut.mdx
docs/source/en/model_doc/encoder-decoder.mdx
src/transformers/generation/utils.py
src/transformers/generation/tf_utils.py
src/transformers/models/albert/configuration_albert.py
src/transformers/models/albert/modeling_albert.py
src/transformers/models/albert/modeling_tf_albert.py
src/transformers/models/bart/configuration_bart.py
src/transformers/models/bart/modeling_bart.py
src/transformers/models/beit/configuration_beit.py
src/transformers/models/beit/modeling_beit.py
src/transformers/models/bert/configuration_bert.py
src/transformers/models/bert/modeling_bert.py
src/transformers/models/bert/modeling_tf_bert.py
src/transformers/models/bert_generation/configuration_bert_generation.py
src/transformers/models/bigbird_pegasus/configuration_bigbird_pegasus.py
src/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py
src/transformers/models/big_bird/configuration_big_bird.py
src/transformers/models/big_bird/modeling_big_bird.py
src/transformers/models/blenderbot/configuration_blenderbot.py
src/transformers/models/blenderbot/modeling_blenderbot.py
src/transformers/models/blenderbot_small/configuration_blenderbot_small.py
src/transformers/models/blenderbot_small/modeling_blenderbot_small.py
src/transformers/models/bloom/configuration_bloom.py
src/transformers/models/camembert/configuration_camembert.py
src/transformers/models/canine/configuration_canine.py
src/transformers/models/clip/configuration_clip.py
src/transformers/models/clipseg/modeling_clipseg.py
src/transformers/models/codegen/configuration_codegen.py
src/transformers/models/conditional_detr/configuration_conditional_detr.py
src/transformers/models/conditional_detr/modeling_conditional_detr.py
src/transformers/models/convbert/configuration_convbert.py
src/transformers/models/convnext/configuration_convnext.py
src/transformers/models/convnext/modeling_convnext.py
src/transformers/models/ctrl/configuration_ctrl.py
src/transformers/models/ctrl/modeling_ctrl.py
src/transformers/models/cvt/configuration_cvt.py
src/transformers/models/cvt/modeling_cvt.py
src/transformers/models/data2vec/configuration_data2vec_audio.py
src/transformers/models/data2vec/configuration_data2vec_text.py
src/transformers/models/data2vec/configuration_data2vec_vision.py
src/transformers/models/data2vec/modeling_data2vec_audio.py
src/transformers/models/data2vec/modeling_data2vec_vision.py
src/transformers/models/deberta/configuration_deberta.py
src/transformers/models/deberta/modeling_deberta.py
src/transformers/models/deberta_v2/configuration_deberta_v2.py
src/transformers/models/deberta_v2/modeling_deberta_v2.py
src/transformers/models/decision_transformer/configuration_decision_transformer.py
src/transformers/models/deformable_detr/modeling_deformable_detr.py
src/transformers/models/deit/configuration_deit.py
src/transformers/models/deit/modeling_deit.py
src/transformers/models/deit/modeling_tf_deit.py
src/transformers/models/detr/configuration_detr.py
src/transformers/models/detr/modeling_detr.py
src/transformers/models/distilbert/configuration_distilbert.py
src/transformers/models/dpr/configuration_dpr.py
src/transformers/models/dpt/modeling_dpt.py
src/transformers/models/electra/configuration_electra.py
src/transformers/models/electra/modeling_electra.py
src/transformers/models/electra/modeling_tf_electra.py
src/transformers/models/ernie/configuration_ernie.py
src/transformers/models/flava/configuration_flava.py
src/transformers/models/fnet/configuration_fnet.py
src/transformers/models/glpn/modeling_glpn.py
src/transformers/models/gpt2/configuration_gpt2.py
src/transformers/models/gpt2/modeling_gpt2.py
src/transformers/models/gptj/modeling_gptj.py
src/transformers/models/gpt_neo/configuration_gpt_neo.py
src/transformers/models/gpt_neox/configuration_gpt_neox.py
src/transformers/models/gpt_neox_japanese/configuration_gpt_neox_japanese.py
src/transformers/models/groupvit/modeling_groupvit.py
src/transformers/models/groupvit/modeling_tf_groupvit.py
src/transformers/models/hubert/modeling_hubert.py
src/transformers/models/imagegpt/configuration_imagegpt.py
src/transformers/models/layoutlm/configuration_layoutlm.py
src/transformers/models/layoutlm/modeling_layoutlm.py
src/transformers/models/layoutlm/modeling_tf_layoutlm.py
src/transformers/models/layoutlmv2/configuration_layoutlmv2.py
src/transformers/models/layoutlmv2/modeling_layoutlmv2.py
src/transformers/models/layoutlmv3/configuration_layoutlmv3.py
src/transformers/models/layoutlmv3/modeling_layoutlmv3.py
src/transformers/models/layoutlmv3/modeling_tf_layoutlmv3.py
src/transformers/models/levit/configuration_levit.py
src/transformers/models/lilt/modeling_lilt.py
src/transformers/models/longformer/modeling_longformer.py
src/transformers/models/longformer/modeling_tf_longformer.py
src/transformers/models/longt5/modeling_longt5.py
src/transformers/models/marian/modeling_marian.py
src/transformers/models/markuplm/modeling_markuplm.py
src/transformers/models/maskformer/configuration_maskformer.py
src/transformers/models/maskformer/modeling_maskformer.py
src/transformers/models/mbart/configuration_mbart.py
src/transformers/models/mbart/modeling_mbart.py
src/transformers/models/mctct/configuration_mctct.py
src/transformers/models/megatron_bert/configuration_megatron_bert.py
src/transformers/models/mobilebert/configuration_mobilebert.py
src/transformers/models/mobilebert/modeling_mobilebert.py
src/transformers/models/mobilebert/modeling_tf_mobilebert.py
src/transformers/models/mobilenet_v2/modeling_mobilenet_v2.py
src/transformers/models/mobilevit/modeling_mobilevit.py
src/transformers/models/mobilevit/modeling_tf_mobilevit.py
src/transformers/models/nezha/configuration_nezha.py
src/transformers/models/openai/configuration_openai.py
src/transformers/models/opt/configuration_opt.py
src/transformers/models/opt/modeling_opt.py
src/transformers/models/opt/modeling_tf_opt.py
src/transformers/models/owlvit/modeling_owlvit.py
src/transformers/models/pegasus/configuration_pegasus.py
src/transformers/models/pegasus/modeling_pegasus.py
src/transformers/models/pegasus_x/configuration_pegasus_x.py
src/transformers/models/perceiver/modeling_perceiver.py
src/transformers/models/plbart/configuration_plbart.py
src/transformers/models/plbart/modeling_plbart.py
src/transformers/models/poolformer/configuration_poolformer.py
src/transformers/models/poolformer/modeling_poolformer.py
src/transformers/models/realm/configuration_realm.py
src/transformers/models/reformer/configuration_reformer.py
src/transformers/models/reformer/modeling_reformer.py
src/transformers/models/regnet/modeling_regnet.py
src/transformers/models/regnet/modeling_tf_regnet.py
src/transformers/models/resnet/configuration_resnet.py
src/transformers/models/resnet/modeling_resnet.py
src/transformers/models/resnet/modeling_tf_resnet.py
src/transformers/models/roberta/configuration_roberta.py
src/transformers/models/roberta/modeling_roberta.py
src/transformers/models/roberta/modeling_tf_roberta.py
src/transformers/models/roc_bert/modeling_roc_bert.py
src/transformers/models/roc_bert/tokenization_roc_bert.py
src/transformers/models/segformer/modeling_segformer.py
src/transformers/models/sew/configuration_sew.py
src/transformers/models/sew/modeling_sew.py
src/transformers/models/sew_d/configuration_sew_d.py
src/transformers/models/sew_d/modeling_sew_d.py
src/transformers/models/speech_encoder_decoder/modeling_speech_encoder_decoder.py
src/transformers/models/speech_to_text/configuration_speech_to_text.py
src/transformers/models/speech_to_text/modeling_speech_to_text.py
src/transformers/models/speech_to_text_2/configuration_speech_to_text_2.py
src/transformers/models/speech_to_text_2/modeling_speech_to_text_2.py
src/transformers/models/segformer/modeling_tf_segformer.py
src/transformers/models/squeezebert/configuration_squeezebert.py
src/transformers/models/swin/configuration_swin.py
src/transformers/models/swin/modeling_swin.py
src/transformers/models/swinv2/configuration_swinv2.py
src/transformers/models/table_transformer/modeling_table_transformer.py
src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
src/transformers/models/trajectory_transformer/configuration_trajectory_transformer.py
src/transformers/models/transfo_xl/configuration_transfo_xl.py
src/transformers/models/trocr/configuration_trocr.py
src/transformers/models/trocr/modeling_trocr.py
src/transformers/models/unispeech/configuration_unispeech.py
src/transformers/models/unispeech/modeling_unispeech.py
src/transformers/models/unispeech_sat/modeling_unispeech_sat.py
src/transformers/models/van/modeling_van.py
src/transformers/models/videomae/modeling_videomae.py
src/transformers/models/vilt/modeling_vilt.py
src/transformers/models/vision_encoder_decoder/configuration_vision_encoder_decoder.py
src/transformers/models/vision_encoder_decoder/modeling_vision_encoder_decoder.py
src/transformers/models/vision_text_dual_encoder/configuration_vision_text_dual_encoder.py
src/transformers/models/vit/configuration_vit.py
src/transformers/models/vit/modeling_vit.py
src/transformers/models/vit/modeling_tf_vit.py
src/transformers/models/vit_mae/modeling_vit_mae.py
src/transformers/models/vit_mae/configuration_vit_mae.py
src/transformers/models/vit_msn/modeling_vit_msn.py
src/transformers/models/visual_bert/configuration_visual_bert.py
src/transformers/models/wav2vec2/configuration_wav2vec2.py
src/transformers/models/wav2vec2/modeling_wav2vec2.py
src/transformers/models/wav2vec2/tokenization_wav2vec2.py
src/transformers/models/wav2vec2_conformer/configuration_wav2vec2_conformer.py
src/transformers/models/wav2vec2_conformer/modeling_wav2vec2_conformer.py
src/transformers/models/wav2vec2_with_lm/processing_wav2vec2_with_lm.py
src/transformers/models/wavlm/configuration_wavlm.py
src/transformers/models/wavlm/modeling_wavlm.py
src/transformers/models/whisper/configuration_whisper.py
src/transformers/models/whisper/modeling_whisper.py
src/transformers/models/whisper/modeling_tf_whisper.py
src/transformers/models/xlm/configuration_xlm.py
src/transformers/models/xlm_roberta/configuration_xlm_roberta.py
src/transformers/models/xlm_roberta_xl/configuration_xlm_roberta_xl.py
src/transformers/models/xlnet/configuration_xlnet.py
src/transformers/models/yolos/configuration_yolos.py
src/transformers/models/yolos/modeling_yolos.py
src/transformers/models/x_clip/modeling_x_clip.py
src/transformers/models/yoso/configuration_yoso.py