transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-14 10:08:29 +06:00

Arthur dcb183f4bd [`MPT`] Add MosaicML's `MPT` model to transformers (#24629)	2023-07-25 14:32:40 +02:00

* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke ande updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match à 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix nit
* add another slow test
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

..
internal	Generate: add SequenceBiasLogitsProcessor (#24334)	2023-06-21 11:14:41 +01:00
main_classes	fsdp fixes and enhancements (#24980)	2023-07-21 17:52:48 +05:30
model_doc	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629)	2023-07-25 14:32:40 +02:00
tasks	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629)	2023-07-25 14:32:40 +02:00
_config.py	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
_toctree.yml	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629)	2023-07-25 14:32:40 +02:00
accelerate.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
add_new_model.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
add_new_pipeline.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
add_tensorflow_model.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
attention.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
autoclass_tutorial.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
benchmarks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
bertology.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
big_models.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
community.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
contributing.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
create_a_model.md	Update old existing feature extractor references (#24552)	2023-06-29 10:17:36 +01:00
custom_models.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
custom_tools.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
debugging.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
generation_strategies.md	Generate: `group_beam_search` requires `diversity_penalty>0.0` (#24456)	2023-06-27 10:46:39 +01:00
glossary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
hpo_train.md	Update RayTune doc link for Hyperparameter tuning (#24422)	2023-06-22 10:38:01 -04:00
index.md	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629)	2023-07-25 14:32:40 +02:00
installation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
model_memory_anatomy.md	[docs] Performance docs tidy up, part 1 (#23963)	2023-07-24 08:57:24 -04:00
model_sharing.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
model_summary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
multilingual.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
notebooks.md	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
pad_truncation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_hardware.md	🌐 [i18n-KO] Translated `perf_hardware.md` to Korean (#24966)	2023-07-25 07:44:24 -04:00
perf_infer_cpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_infer_gpu_many.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_infer_gpu_one.md	fix: add TOC anchor link (#25066)	2023-07-25 08:02:33 -04:00
perf_infer_special.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_cpu_many.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_cpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_gpu_many.md	deprecate `sharded_ddp` training argument (#24825)	2023-07-17 06:57:42 -04:00
perf_train_gpu_one.md	Set `TF32` flag for PyTorch cuDNN backend (#25075)	2023-07-25 08:04:48 -04:00
perf_train_special.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_tpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
performance.md	[docs] Performance docs tidy up, part 1 (#23963)	2023-07-24 08:57:24 -04:00
perplexity.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
philosophy.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
pipeline_tutorial.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
pipeline_webserver.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
pr_checks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
preprocessing.md	Removal of deprecated vision methods and specify deprecation versions (#24570)	2023-06-29 15:09:51 +01:00
quicktour.md	🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664)	2023-07-21 08:19:28 -04:00
run_scripts.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
sagemaker.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
serialization.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
task_summary.md	Add Multimodal heading and Document question answering in task_summary.mdx (#23318)	2023-07-17 13:51:19 +01:00
tasks_explained.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
testing.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tf_xla.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tflite.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tokenizer_summary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
torchscript.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
training.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
transformers_agents.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
troubleshooting.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00