transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Arthur 19ade2426a [WIP]`NLLB-MoE` Adds the moe model (#22024)	2023-03-27 19:42:00 +02:00
Adds NLLB-200 MoE 54.5B for No Language Left Behind. Fixes #21300. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
test_module	AutoImageProcessor (#20111)	2022-11-08 19:54:41 +00:00
tf_ops	Check TF ops for ONNX compliance (#10025)	2021-02-15 07:55:10 -05:00
check_config_attributes.py	[Time-Series] informer model (#21099)	2023-03-07 21:36:38 +01:00
check_config_docstrings.py	LLaMA Implementation (#21955)	2023-03-16 09:00:53 -04:00
check_copies.py	Apply ruff flake8-comprehensions (#21694)	2023-02-22 09:14:54 +01:00
check_doc_toc.py	Apply ruff flake8-comprehensions (#21694)	2023-02-22 09:14:54 +01:00
check_doctest_list.py	Update quality tooling for formatting (#21480)	2023-02-06 18:10:56 -05:00
check_dummies.py	Cleanup quality (#21493)	2023-02-07 12:27:31 -05:00
check_inits.py	refactor: Make direct_transformers_import util (#21652)	2023-02-16 11:32:32 -05:00
check_model_tester.py	Add a new script to check model testers' config (#22063)	2023-03-13 19:11:19 +01:00
check_repo.py	[WIP]`NLLB-MoE` Adds the moe model (#22024)	2023-03-27 19:42:00 +02:00
check_self_hosted_runner.py	Update quality tooling for formatting (#21480)	2023-02-06 18:10:56 -05:00
check_table.py	refactor: Make direct_transformers_import util (#21652)	2023-02-16 11:32:32 -05:00
check_task_guides.py	Depth estimation task guide (#22205)	2023-03-17 08:36:23 -04:00
check_tf_ops.py	Check TF ops for ONNX compliance (#10025)	2021-02-15 07:55:10 -05:00
create_dummy_models.py	Automatically create/update tiny models (#22275)	2023-03-23 19:14:17 +01:00
custom_init_isort.py	Update quality tooling for formatting (#21480)	2023-02-06 18:10:56 -05:00
documentation_tests.txt	Final update of doctest (#22299)	2023-03-22 01:00:33 +01:00
download_glue_data.py	Raise exceptions instead of asserts (#13907)	2021-10-07 12:44:23 +05:30
extract_warnings.py	Make Slack CI reporting stronger (#21823)	2023-02-28 17:12:44 +01:00
get_ci_error_statistics.py	Make Slack CI reporting stronger (#21823)	2023-02-28 17:12:44 +01:00
get_github_job_time.py	Make Slack CI reporting stronger (#21823)	2023-02-28 17:12:44 +01:00
get_modified_files.py	exclude deleted files in the fixup script (#21436)	2023-02-03 12:57:02 -05:00
get_test_info.py	Add an utility file to get information from test files (#21856)	2023-03-01 17:53:29 +01:00
notification_service_doc_tests.py	Update quality tooling for formatting (#21480)	2023-02-06 18:10:56 -05:00
notification_service.py	Show the number of `huggingface_hub` warnings in CI report (#22054)	2023-03-09 15:39:05 +01:00
past_ci_versions.py	Make Slack CI reporting stronger (#21823)	2023-02-28 17:12:44 +01:00
prepare_for_doc_test.py	Add a check regarding the number of occurrences of ``` (#18389)	2022-08-01 14:23:02 +02:00
print_env.py	Print more library versions in CI (#17384)	2022-06-02 10:24:16 +02:00
release.py	Clean README in post release job as well. (#17519)	2022-06-02 07:44:03 -04:00
sort_auto_mappings.py	Automatically sort auto mappings (#17250)	2022-05-16 13:24:20 -04:00
tests_fetcher.py	🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516)	2023-02-28 19:40:57 +01:00
update_metadata.py	Add AutoModelForZeroShotImageClassification (#22087)	2023-03-13 12:46:14 +03:00
update_tiny_models.py	Automatically create/update tiny models (#22275)	2023-03-23 19:14:17 +01:00