transformers/docs/source/en
Mayank Mishra e472e077c2
Granitemoe (#33207)
* first commit

* drop tokenizer

* drop tokenizer

* drop tokenizer

* drop convert

* granite

* drop tokenization test

* mup

* fix

* reformat

* reformat

* reformat

* fix docs

* stop checking for checkpoint

* update support

* attention multiplier

* update model

* tiny drop

* saibo drop

* skip test

* fix test

* fix test

* drop

* drop useless imports

* update docs

* drop flash function

* copied from

* drop pretraining tp

* drop pretraining tp

* drop pretraining tp

* drop unused import

* drop code path

* change name

* softmax scale

* head dim

* drop legacy cache

* rename params

* cleanup

* fix copies

* comments

* add back legacy cache

* multipliers

* multipliers

* multipliers

* text fix

* fix copies

* merge

* multipliers

* attention multiplier

* drop unused imports

* add granitemoe

* add decoration

* remove moe from sequenceclassification

* fix test

* fix

* fix

* fix

* move rope?

* merge

* drop bias

* drop bias

* Update src/transformers/models/granite/configuration_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* Update src/transformers/models/granite/modeling_granite.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

* fix

* fix

* fix

* drop

* drop

* fix

* fix

* cleanup

* cleanup

* fix

* fix granite tests

* fp32 test

* fix

* drop jitter

* fix

* rename

* rename

* fix config

* add gen test

---------

Co-authored-by: Yikang Shen <yikang.shn@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-09-21 01:43:50 +02:00
..
internal Add a static cache that offloads to the CPU or other device (#32161) 2024-08-29 11:51:09 +02:00
main_classes Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
model_doc Granitemoe (#33207) 2024-09-21 01:43:50 +02:00
quantization Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
tasks Add keypoint-detection task guide (#33274) 2024-09-16 13:08:31 +02:00
_config.py [#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) 2024-04-08 14:21:16 +01:00
_redirects.yml Docs / Quantization: Redirect deleted page (#31063) 2024-05-28 18:29:22 +02:00
_toctree.yml Granitemoe (#33207) 2024-09-21 01:43:50 +02:00
accelerate.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
add_new_model.md Remove add-new-model in favor of add-new-model-like (#30424) 2024-04-24 09:38:18 +02:00
add_new_pipeline.md add push_to_hub to pipeline (#29172) 2024-04-16 15:34:04 +01:00
agents_advanced.md Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
agents.md Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
attention.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
autoclass_tutorial.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
benchmarks.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
bertology.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
big_models.md [docs] Big model loading (#29920) 2024-04-01 18:47:32 -07:00
chat_templating.md Add explicit example for RAG chat templating (#33503) 2024-09-17 16:08:05 +01:00
community.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
conversations.md [docs] change temperature to a positive value (#32077) 2024-07-23 17:47:51 +01:00
create_a_model.md Enable HF pretrained backbones (#31145) 2024-06-06 22:02:38 +01:00
custom_models.md Updated the custom_models.md changed cross_entropy code (#33118) 2024-08-26 13:15:43 +02:00
debugging.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
deepspeed.md Fix typos (#31819) 2024-07-08 11:52:47 +01:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fsdp.md [docs] Trainer docs (#28145) 2023-12-20 10:37:23 -08:00
generation_strategies.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
gguf.md Add support for GGUF Phi-3 (#31844) 2024-09-10 13:32:38 +02:00
glossary.md Fix typos (#31819) 2024-07-08 11:52:47 +01:00
hpo_train.md Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
index.md Granitemoe (#33207) 2024-09-21 01:43:50 +02:00
installation.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
kv_cache.md Cache: don't show warning in forward passes when past_key_values is None (#33541) 2024-09-19 12:02:46 +01:00
llm_optims.md Docs: add more cross-references to the KV cache docs (#33323) 2024-09-06 10:22:00 +01:00
llm_tutorial_optimization.md Docs: add more cross-references to the KV cache docs (#33323) 2024-09-06 10:22:00 +01:00
llm_tutorial.md Add SynCode to llm_tutorial (#32884) 2024-08-22 15:30:22 +02:00
model_memory_anatomy.md Very small change to one of the function parameters (#32548) 2024-08-27 09:29:05 -07:00
model_sharing.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
model_summary.md model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) 2024-03-23 18:29:39 -07:00
multilingual.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
peft.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perf_hardware.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perf_infer_cpu.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md Granitemoe (#33207) 2024-09-21 01:43:50 +02:00
perf_torch_compile.md fix(docs): Fixed a link in docs (#32274) 2024-07-29 10:50:43 +01:00
perf_train_cpu_many.md Update the distributed CPU training on Kubernetes documentation (#32669) 2024-08-14 09:36:43 -07:00
perf_train_cpu.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md Update perf_train_gpu_many.md (#31451) 2024-06-18 11:00:26 -07:00
perf_train_gpu_one.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perf_train_special.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
performance.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
perplexity.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_tutorial.md Docs: Fixed whisper-large-v2 model link in docs (#32871) 2024-08-19 09:50:35 -07:00
pipeline_webserver.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
pr_checks.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
preprocessing.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
quicktour.md docs: fix broken link (#31370) 2024-06-12 11:33:00 +01:00
run_scripts.md [docs] refine the doc for train with a script (#33423) 2024-09-12 10:16:12 -07:00
sagemaker.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
serialization.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
task_summary.md More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
tasks_explained.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
testing.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
tf_xla.md fix(docs): Fixed a link in docs (#32274) 2024-07-29 10:50:43 +01:00
tflite.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tiktoken.md Support reading tiktoken tokenizer.model file (#31656) 2024-09-06 14:24:02 +02:00
tokenizer_summary.md [docs] Spanish translation of tokenizer_summary.md (#31154) 2024-06-03 16:52:23 -07:00
torchscript.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
trainer.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
training.md Added the necessay import of module (#30804) 2024-05-14 18:45:06 +01:00
troubleshooting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00