Commit Graph

1 Commit

Author SHA1 Message Date
Biao Zhang
3ef8896906
Encoder-Decoder Gemma (#38332)
* Initial submit

* Fix bugs:
  1. add __init__ file
  2. tied word embedding
  3. support flash/flex attention
  4. model saving and loading

* Code refactor:
  * Rename encdecgemma to t5gemma.
  * Split attention into self- and cross-attention
  * Split stack into encoder and decoder
  * Add test cases
  * Add auto configuration

* Update configurations.

* Fix bugs related to copy and attribute checks

* Fix type union

* Fix merge errors

* run ruff format

* Run make style and update tests.

* Add t5gemma model doc.

* ruff and style formatting.

* Add missing module config.

* Add dummy checkpoint link to pass tests (needs updating when real checkpoints are uploaded).

* Update model doc.

* Minor updates following Arthur's comments:
  * replace docstrings with auto_docstrings
  * remove checkpoint layers
  * remove deprecate_kwargs

* fix rebase errors

* Fix docstring issues.

* fix t5gemma doc issue.

* run ruff format

* Updates:
  * split encoder-only model out
  * make t5gemmamodel encoder-decoder only
  * update token and sequence classification
  * update tests
2025-06-25 09:05:10 +00:00