* Updated albert.md doc for ALBERT model
* Update docs/source/en/model_doc/albert.md
Fixed Resources heading
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update the ALBERT model doc resources
Fixed the resource example for fine-tuning ALBERT on sentence-pair classification.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/albert.md
Removed resource duplicate
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated albert.md doc with reviewed changes
* Updated albert.md doc for ALBERT
* Update docs/source/en/model_doc/albert.md
Removed duplicates from updated docs/source/en/model_doc/albert.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/albert.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* try to stylify using ruff
* might need to remove these changes?
* use ruff format and ruff check
* use isinstance instead of type comparison
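A minimal sketch of this style change (illustrative, not taken from the diff); ruff flags direct type comparisons under rule E721:

```python
def normalize(value):
    # Before: direct type comparison; fails for dict subclasses
    # (e.g. OrderedDict) and is flagged by ruff's E721 rule.
    # if type(value) == dict:

    # After: isinstance also accepts subclasses and is idiomatic Python.
    if isinstance(value, dict):
        return dict(value)
    return value
```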
* use # fmt: skip
* use # fmt: skip
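For context, `# fmt: skip` is the pragma that black and ruff format honor to leave code untouched; a small illustrative sketch:

```python
# `# fmt: skip` keeps the formatter from reflowing this one statement,
# which is useful for hand-aligned literals.
LAYER_MAP = {"encoder.block.0": "encoder.layer.0", "decoder.block.0": "decoder.layer.0"}  # fmt: skip

# `# fmt: off` / `# fmt: on` do the same for a whole region.
# fmt: off
MATRIX = [
    [1, 0],
    [0, 1],
]
# fmt: on
```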
* nits
* some styling changes
* update ci job
* nits isinstance
* more files update
* nits
* more nits
* small nits
* check and format
* revert wrong changes
* actually use formatter instead of checker
* nits
* well docbuilder is overwriting this commit
* revert notebook changes
* try to nuke docbuilder
* style
* fix feature extraction test
* remove `indent-width = 4`
* fixup
* more nits
* update the ruff version that we use
* style
* nuke docbuilder styling
* leave the print for detected changes
* nits
* Remove file I/O
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
* style
* nits
* revert notebook changes
* Add `# fmt: skip` when possible
* Add `# fmt: skip` when possible
* Fix
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* Nits
* more fixes
* fix tapas
* Another way to skip
* Recommended way
* Fix two more files
* Remove asynch
---------
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
* translate model.md to Chinese
* apply review suggestion
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix bug in handling varying encoder and decoder layers
This commit resolves an issue where the script failed to convert T5x models to PyTorch when the number of decoder layers differed from the number of encoder layers. The fix passes an additional `num_decoder_layers` parameter to the relevant function.
* Fix bug in handling varying encoder and decoder layers
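A rough sketch of the shape of such a fix; the function and key names below are hypothetical stand-ins, not the actual conversion script:

```python
def convert_t5x_to_pytorch(t5x_params: dict, num_layers: int, num_decoder_layers: int) -> dict:
    """Illustrative only: thread num_decoder_layers through instead of
    assuming it equals the encoder depth."""
    state_dict = {}
    # Encoder and decoder depths are iterated independently.
    for i in range(num_layers):
        state_dict[f"encoder.block.{i}"] = t5x_params.get(f"encoder/layers_{i}")
    for i in range(num_decoder_layers):
        state_dict[f"decoder.block.{i}"] = t5x_params.get(f"decoder/layers_{i}")
    return state_dict
```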
* Remove the torch main_process_first context manager from TF examples
* Correctly set num_beams=1 in our examples, and add a guard in GenerationConfig.validate()
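The guard presumably resembles the simplified sketch below; the exact set of checked options is an assumption, and the real `GenerationConfig.validate()` covers many more cases:

```python
import warnings
from typing import Optional

def warn_unused_beam_kwargs(
    num_beams: int,
    early_stopping: Optional[bool] = None,
    length_penalty: Optional[float] = None,
) -> None:
    # Simplified sketch: these options only affect beam-based generation,
    # so they are silently ignored when num_beams == 1. Warn instead.
    if num_beams == 1:
        for name, value in (("early_stopping", early_stopping), ("length_penalty", length_penalty)):
            if value is not None:
                warnings.warn(
                    f"`num_beams` is set to 1, so `{name}={value}` will be ignored. "
                    "Set `num_beams>1` to use beam search.",
                    UserWarning,
                )
```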
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* import hf error
* nits
* fixup
* catch the error at the correct place
* style
* improve message a tiny bit
* Update src/transformers/utils/hub.py
Co-authored-by: Lucain <lucainp@gmail.com>
* add a test
---------
Co-authored-by: Lucain <lucainp@gmail.com>
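The error-handling pattern from the PR above, sketched with an assumed entry point (`download_config` is hypothetical; the real change lives in `src/transformers/utils/hub.py`):

```python
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import HFValidationError  # raised for malformed repo ids

def download_config(path_or_repo_id: str) -> str:
    # Sketch: catch the hub's error where the lookup happens and re-raise
    # with a message that names the offending argument.
    try:
        return hf_hub_download(repo_id=path_or_repo_id, filename="config.json")
    except HFValidationError as e:
        raise EnvironmentError(
            f"'{path_or_repo_id}' is not a valid model identifier or local path."
        ) from e
```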
* skip 4 tests
* nits
* style
* wow it's not my day
* skip new failing tests
* style
* skip for NLLB MoE as well
* skip `test_assisted_decoding_sample` for everyone
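For reference, the standard way to skip a test suite-wide looks like this (generic illustration, not the actual test file):

```python
import unittest

class AssistedDecodingTests(unittest.TestCase):
    @unittest.skip("Failing after the recent refactor; re-enable once fixed.")
    def test_assisted_decoding_sample(self):
        ...
```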
* Have seq2seq just use gather
* Change
* Reset after
* Make slow
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Clean
* Simplify and just use gather
* Update tests/trainer/test_trainer_seq2seq.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* gather always for seq2seq
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
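The gather-based evaluation path in miniature, assuming an Accelerate setup; `gather_generated_tokens` is an illustrative helper, not the Trainer's actual code:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

def gather_generated_tokens(generated: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # Sketch: generated sequences may have different lengths per process,
    # so pad to a common length first, then all-gather across processes.
    generated = accelerator.pad_across_processes(generated, dim=1, pad_index=pad_token_id)
    return accelerator.gather(generated)
```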
* Update and reorder docs for chat templates
* Fix Mistral docstring
* Add section link and small fixes
* Remove unneeded line in Mistral example
* Add comment on saving memory
* Fix generation prompts link
* Fix code block languages
* fix speecht5's wrong attention mask when padding
* enable batch generation and add an attention_mask parameter
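The underlying idea as a generic sketch (not the actual SpeechT5 implementation): derive the attention mask from per-example lengths so padded positions are ignored:

```python
import torch

def make_attention_mask(lengths: list[int]) -> torch.Tensor:
    # Sketch: mark real positions with 1 and padded positions with 0 so
    # attention (and downstream postnet steps) can skip the padded tail.
    max_len = max(lengths)
    mask = torch.zeros(len(lengths), max_len, dtype=torch.long)
    for i, n in enumerate(lengths):
        mask[i, :n] = 1
    return mask
```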
* fix doc
* fix format
* batch postnet inputs, return batched lengths, and keep consistency with the old API
* fix format
* fix format
* fix the format
* fix doc-builder error
* add test, cross attention and docstring
* optimize code based on reviews
* docbuild
* refine
* do not skip slow test
* add consistent dropout for batching
* loosen atol
* add another test regarding the consistency of the vocoder
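One plausible shape for such a consistency check, with the deliberately loosened tolerances mentioned above (`check_batched_matches_single` is a hypothetical helper):

```python
import torch

def check_batched_matches_single(single_out: torch.Tensor, batched_row: torch.Tensor, valid_len: int) -> None:
    # Sketch: padded-batch kernels introduce small numeric drift, so compare
    # only the valid region and loosen the tolerances slightly.
    torch.testing.assert_close(batched_row[:valid_len], single_out[:valid_len], atol=1e-4, rtol=1e-4)
```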
* fix format
* refactor
* add `return_concrete_lengths` as a parameter for consistency with/without batching
* fix review issues
* fix cross_attention issue