transformers/docs/source/en/tasks
Arthur 0fe44059ae
Add recurrent gemma (#30143)
* Fork.

* RecurrentGemma initial commit.

* Updating __init__.py.

* Minor modification to how we initialize the cache.
Changing how the config specifies the architecture.

* Reformat code to 4 spaces.
Fixed a few typos.

* Fixed the forward pass.
Still unclear on the cache?

* Fixed the RecurrentGemmaForCausalLM

* Minor comment that we might not need attention_mask and output_attention arguments.

* Now cache should work as well.

* Adding a temporary example to check whether the model generation works.

* Adding the tests and updating imports.

* Adding the example file missing in the previous commit.

* First working example.

* Removing .gitignore and reverting parts of __init__.

* Re-add .gitignore.

* Addressing comments for configuration.

* Move mask creation to `_prepare_inputs_for_generation`.

* First try at integration tests:
1. AttributeError: 'GriffinCausalLMOutput' object has no attribute 'attentions'.
2. `cache_position` not passed

* Transferring between machines.

* Running normal tests.

* Minor fix.

* More fixes.

* Addressing more comments.

* Minor fixes.

* first stab at cleanup

* more refactoring

* fix copies and else

* renaming and get init to work

* fix causal mask creation

* update

* nit

* fix a hell lot of things

* updates

* update conversion script

* make all keys importable

* nits

* add auto mappings

* properly convert ffw_up and down

* add scaling

* fix generations

* for recurrent dtype

* update

* fix going beyond window

* fixup

* add missing files

* current updates to remove last einops

* finish modeling refactor

* TADA

* fix compile

* fix most failing tests?

* update tests

* refactor and update

* update

* nits, fixup and update tests

* more fixup

* nits

* fix imports

* test format

* fixups

* nits

* tuple typing

* fix code quality

* add model card

* fix doc

* skip most generation tests

* nits

* style

* doc fixes

* fix pr and check_copies?

* last nit

* oupsy

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <hi@lysand.re>

* update

* Update src/transformers/models/recurrent_gemma/convert_recurrent_gemma_to_hf.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update based on review

* doc nit

* fix quality

* quality

* fix slow test model path

* update default dtype

* ignore attributes that can be safely ignored in check config attributes

* 0lallalala come on

* save nit

* style

* remove to dict update

* make sure we can also run in float16

* style

---------

Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Aleksandar Botev <botev@google.com>
Co-authored-by: Leonard Berrada <lberrada@users.noreply.github.com>
Co-authored-by: anushanf <anushanf@google.com>
Co-authored-by: botev <botevmg@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-10 16:59:13 +02:00
asr.md Add new meta w2v2-conformer BERT-like model (#28165) 2024-01-18 13:37:34 +00:00
audio_classification.md Add new meta w2v2-conformer BERT-like model (#28165) 2024-01-18 13:37:34 +00:00
document_question_answering.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
idefics.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
image_captioning.md [Docs] Fix backticks in inline code and documentation links (#28875) 2024-02-06 11:15:44 -08:00
image_classification.md [Trainer] Undo #29896 (#30129) 2024-04-09 12:55:42 +02:00
image_feature_extraction.md Fix header in IFE task guide (#29859) 2024-03-26 12:32:37 +01:00
image_to_image.md Image-to-Image Task Guide (#26595) 2023-10-16 15:12:03 +02:00
knowledge_distillation_for_image_classification.md fixed typos (issue 27919) (#27920) 2023-12-11 18:44:23 -05:00
language_modeling.md Add recurrent gemma (#30143) 2024-04-10 16:59:13 +02:00
mask_generation.md Mask Generation Task Guide (#28897) 2024-02-14 18:29:49 +00:00
masked_language_modeling.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
monocular_depth_estimation.md Add Depth Anything (#28654) 2024-01-25 09:34:50 +01:00
multiple_choice.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
object_detection.md [Trainer] Undo #29896 (#30129) 2024-04-09 12:55:42 +02:00
prompting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
question_answering.md fix the post-processing link (#29091) 2024-02-19 10:15:58 +00:00
semantic_segmentation.md [docs] Fix image segmentation guide (#30132) 2024-04-09 09:08:37 -07:00
sequence_classification.md Add Qwen2MoE (#29377) 2024-03-27 02:11:55 +01:00
summarization.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
text-to-speech.md Add FastSpeech2Conformer (#23439) 2024-01-03 18:01:06 +00:00
token_classification.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
translation.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
video_classification.md [Trainer] Undo #29896 (#30129) 2024-04-09 12:55:42 +02:00
visual_question_answering.md VQA task guide (#25244) 2023-08-09 08:29:06 -04:00
zero_shot_image_classification.md [docs] Fix model reference in zero shot image classification example (#26206) 2023-09-19 00:45:12 +02:00
zero_shot_object_detection.md [Docs] Update README and default pipelines (#28864) 2024-02-12 10:21:36 +01:00