transformers/docs/source/en
Arthur dcb183f4bd
[MPT] Add MosaicML's MPT model to transformers (#24629)
* draft add new model like

* some cleaning of the config

* nits

* add nested configs

* nits

* update

* update

* added layer norms + triton kernels

* consider only LPLayerNorm for now.

* update

* all keys match.

* Update

* fixing nits here and there

* working forward pass.

* removed einops dependency

* nits

* format

* add alibi

* byebye head mask

* refactor attention

* nits.

* format

* fix nits.

* nuke ande updates

* nuke tokenizer test

* don't reshape query with kv heads

* added a bit of documentation.

* remove unneeded things

* nuke more stuff

* nit

* logits match - same generations

* rm unneeded methods

* 1 remaining failing CI test

* nit

* fix nits

* fix docs

* fix docs

* rm tokenizer

* fixup

* fixup

* fixup and fix tests

* fixed configuration object.

* use correct activation

* few minor fixes

* clarify docs a bit

* logits match to 1e-12

* skip and unskip a test

* added some slow tests.

* fix readme

* add more details

* Update docs/source/en/model_doc/mpt.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix configuration issues

* more fixes in config

* added more models

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove unneeded position ids

* fix some comments

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* revert suggestion

* mpt alibi + added batched generation

* Update src/transformers/models/mpt/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove init config

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix nit

* add another slow test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fits in one line

* some refactor because make fixup doesn't pass

* add ft notebook

* update md

* correct doc path

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-25 14:32:40 +02:00
internal Generate: add SequenceBiasLogitsProcessor (#24334) 2023-06-21 11:14:41 +01:00
main_classes fsdp fixes and enhancements (#24980) 2023-07-21 17:52:48 +05:30
model_doc [MPT] Add MosaicML's MPT model to transformers (#24629) 2023-07-25 14:32:40 +02:00
tasks [MPT] Add MosaicML's MPT model to transformers (#24629) 2023-07-25 14:32:40 +02:00
_config.py Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
_toctree.yml [MPT] Add MosaicML's MPT model to transformers (#24629) 2023-07-25 14:32:40 +02:00
accelerate.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_new_model.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_new_pipeline.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_tensorflow_model.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
attention.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
autoclass_tutorial.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
benchmarks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
bertology.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
big_models.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
community.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
create_a_model.md Update old existing feature extractor references (#24552) 2023-06-29 10:17:36 +01:00
custom_models.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
custom_tools.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
debugging.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
generation_strategies.md Generate: group_beam_search requires diversity_penalty>0.0 (#24456) 2023-06-27 10:46:39 +01:00
glossary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
hpo_train.md Update RayTune doc link for Hyperparameter tuning (#24422) 2023-06-22 10:38:01 -04:00
index.md [MPT] Add MosaicML's MPT model to transformers (#24629) 2023-07-25 14:32:40 +02:00
installation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
model_memory_anatomy.md [docs] Performance docs tidy up, part 1 (#23963) 2023-07-24 08:57:24 -04:00
model_sharing.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
model_summary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
multilingual.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_hardware.md 🌐 [i18n-KO] Translated perf_hardware.md to Korean (#24966) 2023-07-25 07:44:24 -04:00
perf_infer_cpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_infer_gpu_many.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_infer_gpu_one.md fix: add TOC anchor link (#25066) 2023-07-25 08:02:33 -04:00
perf_infer_special.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_cpu_many.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_cpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_gpu_many.md deprecate sharded_ddp training argument (#24825) 2023-07-17 06:57:42 -04:00
perf_train_gpu_one.md Set TF32 flag for PyTorch cuDNN backend (#25075) 2023-07-25 08:04:48 -04:00
perf_train_special.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_tpu_tf.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_tpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
performance.md [docs] Performance docs tidy up, part 1 (#23963) 2023-07-24 08:57:24 -04:00
perplexity.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
philosophy.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pipeline_tutorial.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pipeline_webserver.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pr_checks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
preprocessing.md Removal of deprecated vision methods and specify deprecation versions (#24570) 2023-06-29 15:09:51 +01:00
quicktour.md 🌐 [i18n-KO] Fixed Korean and English quicktour.md (#24664) 2023-07-21 08:19:28 -04:00
run_scripts.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
sagemaker.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
serialization.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
task_summary.md Add Multimodal heading and Document question answering in task_summary.mdx (#23318) 2023-07-17 13:51:19 +01:00
tasks_explained.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
testing.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tf_xla.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tflite.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tokenizer_summary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
torchscript.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
training.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
transformers_agents.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
troubleshooting.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00