Commit Graph

14 Commits

Author SHA1 Message Date
Arthur
19ade2426a
[WIP]NLLB-MoE Adds the moe model (#22024)
* Initial commit

* update modeling code

* update doc

* add functions necessary

* fix impotrs

* revert changes

* fixup

* more styling to get going

* remove standalone encoder

* update code

* styling

* fix config and model

* update code and some refactoring

* make more tests pass

* Adding NLLB-200 - MoE - 54.5B for no language left behind
Fixes #21300

* fix mor common tests

* styke

* update testing file

* update

* update

* Router2 doc

* update check config with sparse layer

* add dummy router

* update current conversion script

* create on the fly conversion script

* Fixup

* style

* style 2

* fix empty return

* fix return

* Update default config sparse layers

* easier to create sparse layers

* update

* update conversion script

* update modeling

* add to toctree

* styling

* make ruff happy

* update docstring

* update conversion script

* update, will break tests but impelemting top2

* update

* local groups are supported here

* ⚠️ Support for local groups is now removed ⚠️

This is because it has to work with model parallelism that we do not support

* finish simplificaiton

* Fix forward

* style

* fixup

* Update modelling and test, refactoring

* update tests

* remove final layer)norm as it is done in the FF

* routing works! Logits test added

* nit in test

* remove top1router

* style

* make sure sparse are tested. Had to change route_tokens a liottle bit

* add support for unslip models when converting

* fixup

* style

* update test s

* update test

* REFACTOR

* encoder outputs match!

* style

* update testing

* 🎉encoder and decoder logits match 🎉

* styleing

* update tests

* cleanup tests

* fix router test and CIs

* cleanup

* cleanup test styling

* fix tests

* Finally the generation tests match!

* cleanup

* update test

* style testing file

* remove script

* cleanup

* more cleanup

* nits

* update

* NLLB tokenizer is wrong and will be fixed soon

* use LongTensors

* update tests

* revert some small changes

* fix second expert sampling and batch prioritized routing

* update tests

* finish last tests

* make ruff happy

* update

* ruff again

* style

* Update docs/source/en/model_doc/nllb-moe.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updates based on review

* style and fix import issue

* nit

* more nits

* cleanup

* styling

* update test_seconde_expert_policy

* fix name

* last nit on the markdown examples

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-27 19:42:00 +02:00
amyeroberts
8ac29fe090
Fix doc links (#22274) 2023-03-20 17:07:31 +00:00
Thomas Paviot
ba2a5f13f7
Fix en documentation typos (#21799)
* fix wrong url

* typos in english documentation
2023-02-27 08:36:36 +01:00
Maria Khalusova
619d51e01f
Added "Open in Colab" to task guides (#21729)
added Open in Colab to task guides
2023-02-22 08:32:35 -05:00
tanreinama
f56174ac5b
add GPTSAN model (reopen) (#21291)
* add GPTSAN-Japanese

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN (update for review)

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* fix typo in comment text

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* fix document and comments

* fix class name GPTSAN->GPTSan

* fix import and test for tokenizer
2023-02-20 11:25:27 +01:00
Steven Liu
0a75717602
Fix task guide formatting (#21409)
fix formatting
2023-02-02 10:06:26 -08:00
Maria Khalusova
73a2ff6974
Automated compatible models list for task guides (#21338)
* initial commit. added tip placeholders and a script

* removed unused imports, fixed paths

* fixed generated links

* make style

* split language modeling doc into two: causal language modeling and masked language modeling

* added check_task_guides.py to make fix-copies

* review feedback addressed
2023-01-27 13:19:28 -05:00
Maria Khalusova
238449414f
Documentation code sample fixes (#21302)
* Fixed the following:
pipe -> pipeline
out in pipe(data()) is a list of dict, not a dict

* Fixed the TypeError: __init__() missing 1 required positional argument: 'key'

* Added a tip: code sample requires additional libraries to run

* Fixed custom config's name

* added seqeval to the required libraries

* fixed a missing dependency,
fixed metric naming,
added checkpoint to fix the datacollator

* added checkpoint to fix the datacollator,
added missing dependency
2023-01-25 11:33:39 -05:00
Steven Liu
d896029e27
Add inference section to task guides (#18781)
* 📝 start adding inference section to task guides

*  make style

* 📝 add multiple choice

* add rest of inference sections

* make style

* add compute_metric, push_to_hub, pipeline

* make style

* add updated sequence and token classification

* make style

* make edits in token classification

* add audio classification

* make style

* add asr

* make style

* add image classification

* make style

* add summarization

* make style

* add translation

* make style

* add multiple choice

* add language modeling

* add qa

* make style

* review and edits

* apply reviews

* make style

* fix call to processor

* apply audio reviews

* update to better asr model

* make style
2022-11-21 10:06:21 -08:00
Matt
2b9513fdab
Update TF fine-tuning docs (#18654)
* Update TF fine-tuning docs

* Fix formatting

* Add some section headers so the right sidebar works better

* Squiggly it

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Explain things in the text, not the comments

* Make the two dataset creation methods into a list

* Move the advice about collation out of a <Tip>

* Edits for clarity

* Edits for clarity

* Edits for clarity

* Replace `to_tf_dataset` with `prepare_tf_dataset` in the fine-tuning pages

* Restructure the page a little bit

* Restructure the page a little bit

* Restructure the page a little bit

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-07 13:30:07 +01:00
Sylvain Gugger
986526a0e4
Replace as_target context managers by direct calls (#18325)
* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py

Co-authored-by: amyeroberts <amy@huggingface.co>

* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
2022-07-29 08:09:09 -04:00
Gorkem Ozkaya
f58b9c0522
Update translation.mdx (#18169)
* Update translation.mdx

* update translation.mdx by running make style
2022-07-26 07:56:40 -04:00
Sylvain Gugger
2e90c3df8f
Doc to dataset (#18037)
* Link to the Datasets doc

* Remove unwanted file
2022-07-06 12:10:06 -04:00
Sylvain Gugger
b9a768b3ff
Enable doc in Spanish (#16518)
* Reorganize doc for multilingual support

* Fix style

* Style

* Toc trees

* Adapt templates
2022-04-04 10:25:46 -04:00