transformers/docs/source/en
Sanchit Gandhi e93103632b
Add bloom flax (#25094)
* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding alibi test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* per-weight check

* style

* line formats

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00
internal Generate: add SequenceBiasLogitsProcessor (#24334) 2023-06-21 11:14:41 +01:00
main_classes fsdp fixes and enhancements (#24980) 2023-07-21 17:52:48 +05:30
model_doc Add bloom flax (#25094) 2023-07-27 18:24:56 +01:00
tasks [T5, MT5, UMT5] Add [T5, MT5, UMT5]ForSequenceClassification (#24726) 2023-07-25 21:02:49 +02:00
_config.py Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
_toctree.yml [MPT] Add MosaicML's MPT model to transformers (#24629) 2023-07-25 14:32:40 +02:00
accelerate.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_new_model.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_new_pipeline.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
add_tensorflow_model.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
attention.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
autoclass_tutorial.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
benchmarks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
bertology.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
big_models.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
community.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
create_a_model.md Update old existing feature extractor references (#24552) 2023-06-29 10:17:36 +01:00
custom_models.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
custom_tools.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
debugging.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
generation_strategies.md Generate: group_beam_search requires diversity_penalty>0.0 (#24456) 2023-06-27 10:46:39 +01:00
glossary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
hpo_train.md Update RayTune doc link for Hyperparameter tuning (#24422) 2023-06-22 10:38:01 -04:00
index.md Add bloom flax (#25094) 2023-07-27 18:24:56 +01:00
installation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
model_memory_anatomy.md [docs] Performance docs tidy up, part 1 (#23963) 2023-07-24 08:57:24 -04:00
model_sharing.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
model_summary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
multilingual.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_hardware.md 🌐 [i18n-KO] Translated perf_hardware.md to Korean (#24966) 2023-07-25 07:44:24 -04:00
perf_infer_cpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_infer_gpu_many.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_infer_gpu_one.md fix: add TOC anchor link (#25066) 2023-07-25 08:02:33 -04:00
perf_infer_special.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_cpu_many.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_cpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_gpu_many.md deprecate sharded_ddp training argument (#24825) 2023-07-17 06:57:42 -04:00
perf_train_gpu_one.md Set TF32 flag for PyTorch cuDNN backend (#25075) 2023-07-25 08:04:48 -04:00
perf_train_special.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_tpu_tf.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
perf_train_tpu.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
performance.md [docs] Performance docs tidy up, part 1 (#23963) 2023-07-24 08:57:24 -04:00
perplexity.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
philosophy.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pipeline_tutorial.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pipeline_webserver.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
pr_checks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
preprocessing.md Removal of deprecated vision methods and specify deprecation versions (#24570) 2023-06-29 15:09:51 +01:00
quicktour.md 🌐 [i18n-KO] Fixed Korean and English quicktour.md (#24664) 2023-07-21 08:19:28 -04:00
run_scripts.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
sagemaker.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
serialization.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
task_summary.md Fix doctest (#25031) 2023-07-25 22:10:06 +02:00
tasks_explained.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
testing.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tf_xla.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tflite.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
tokenizer_summary.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
torchscript.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
training.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
transformers_agents.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
troubleshooting.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00