* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing
* Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling
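As a quick illustration of the updated collator, a minimal sketch follows. The parameter names `mask_replace_prob` and `random_replace_prob` are assumptions inferred from the commit description (finer control over how selected tokens are masked vs. replaced); only `mlm` and `mlm_probability` are long-standing arguments.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,     # fraction of tokens selected for the MLM objective
    mask_replace_prob=0.8,    # assumed new parameter: share of selected tokens replaced by [MASK]
    random_replace_prob=0.1,  # assumed new parameter: share replaced by a random token; the rest stay unchanged
)

# The collator pads the features and builds masked inputs plus MLM labels.
features = [tokenizer("Hello, how are you?"), tokenizer("Fine, thanks!")]
batch = collator(features)
print(batch["input_ids"].shape, batch["labels"].shape)
```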
* Update README.md
Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.
* Update installation.md
Updated installation.md to include virtual environment and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting, and GPU setup instructions.
* Update README.md
Removed numbering from README.md.
* Update README.md
Removed unnecessary "a)" formatting as per maintainer feedback.
* Update README.md
Added blank lines around code snippets for better readability.
* Update README.md
Removed the line "b) Install a backend framework:" from README.md as per feedback.
* Update README.md
Simplified "For Windows:" to "Windows" in README.md as per feedback as well as "For macOS/Linux:" to "macOS/Linux"
* Update README.md
Removed unnecessary heading and retained valid code snippet.
* Update README.md
Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback.
* Update README.md
Removed "GPU Setup (Optional)" section to align with minimal design feedback.
* Update installation.md
Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback.
* Update installation.md
Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback.
* Update installation.md
Updated troubleshooting section with simplified headings and formatted code blocks as per feedback.
* Update installation.md
Integrated GPU setup instructions into the "Install with pip" section for better content flow.
* Update README.md
Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.
* Update torchao.md: use auto-compilation
* Update torchao.md: note that transformers should be updated to the latest version
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
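For reference, a rough sketch of the torchao flow touched by the commits above: quantize with `TorchAoConfig` and let generation compile automatically when a static cache is requested. The checkpoint id is a placeholder, and the exact quantization type strings depend on the installed torchao version.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TorchAoConfig

# int4 weight-only quantization via torchao (sketch; verify quant_type against your torchao version).
quant_config = TorchAoConfig("int4_weight_only", group_size=128)

model_id = "meta-llama/Llama-3.2-1B"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quant_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What are we having for dinner?", return_tensors="pt").to(model.device)
# With a static cache, generate can trigger compilation automatically.
output = model.generate(**inputs, max_new_tokens=32, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```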
* Add the helium model.
* Add a missing helium.
* And add another missing helium.
* Use float for the rmsnorm mul.
* Add the Helium tokenizer converter.
* Add the pad token as suggested by Arthur.
* Update the RMSNorm + some other tweaks.
* Fix more rebase issues.
* fix copies and style
* fixes and add helium.md
* add missing tests
* update the backlink
* oops
* style
* update init, and expected results
* small fixes
* match test outputs
* style fixup, fix doc builder
* add dummies and we should be good to go!
* update sdpa and fa2 documentation
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
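A minimal usage sketch for the newly added Helium model; the checkpoint id below is an assumption and may differ from the checkpoint actually published on the Hub.

```python
from transformers import pipeline

# "kyutai/helium-1-preview-2b" is assumed; substitute the real Helium checkpoint id.
generator = pipeline(
    "text-generation",
    model="kyutai/helium-1-preview-2b",
    torch_dtype="auto",
    device_map="auto",
)
print(generator("Helium is a lightweight language model that", max_new_tokens=32)[0]["generated_text"])
```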
* Create token_classification.md
* Update token_classification.md
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>
* Removed duplicate class field definition.
* Removed duplicate code in try-except block.
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update codeowners with individual model owners
* rip yoach
* add comment
* Replace - with _
* Add @qubvel for zero-shot object-detection
* Update CODEOWNERS
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add yoni for omdet-turbo
* Update CODEOWNERS
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* Refactor / comment the CODEOWNERS file
* Capture modular files as well
* Add dummies without owner
* More cleanup
* Set Niels on a few more models that he added
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* model can convert to HF and be loaded back
* nit
* works in single batch generation but hallucinates
* use the image tokens
* add image generation
* now it works
* add tests
* update
* add modular but it doesn't work for porting docstrings :(
* skip some tests
* add slow tests
* modular removed the import?
* guess this works
* update
* fix copies
* fix test
* fix copies
* update
* docs
* fix tests
* last fix tests?
* pls
* repo consistency
* more style
* style
* remove file
* address comments
* tiny bits
* update after the new modular
* fix tests
* add one more cond in check attributes
* decompose down/up/mid blocks
* allow static cache generation in VLMs
* nit
* fix copies
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix VAE upsampling
* Update src/transformers/models/emu3/modular_emu3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments
* state overwritten stuff explicitly
* fix copies
* add the flag for flex attn
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
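A rough sketch of running the newly ported Emu3 model for image-text-to-text generation. The auto classes, checkpoint id, and prompt format are assumptions based on the commits above; the PR also enables static-cache generation in VLMs, hence `cache_implementation="static"`.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText  # assumed auto classes for Emu3

model_id = "BAAI/Emu3-Chat-hf"  # placeholder checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, text="Describe this image.", return_tensors="pt").to(model.device)

# Static-cache generation (added for VLMs in this PR) keeps shapes fixed so the model can be compiled.
out = model.generate(**inputs, max_new_tokens=32, cache_implementation="static")
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```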
* Introduce 5 integration tests for the 4 model classes + torch export
* ModernBert: reuse GemmaRotaryEmbedding via modular
* Revert #35589, keep rope_kwargs; rely on them in modular_modernbert
* Revert "Revert #35589, keep rope_kwargs; rely on them in modular_modernbert"
This reverts commit 11b44b9ee8.
* Don't set rope_kwargs; override 'self.rope_init_fn' call instead
* bug fixes
* organize imports
* wrap cpu warning in reference_compile
* Avoid needing repad_logits_with_grad, always repad with grads when training
I'm not 100% sure that the conditional with "or labels is None" makes sense though - not sure what the intention is there. Perhaps we can remove that?
* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training"
This reverts commit cedcb4e89b.
* Fix grammar: keep -> keeps
* Propagate grammar fix with modular_model_converter
---------
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
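For reference, a small masked-LM sanity check of the kind the new integration tests exercise; the checkpoint id is assumed to be the public ModernBERT-base release.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"  # assumed public checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring token at the [MASK] position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```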
* universal checkpoint
* Update docs/source/en/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
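A heavily hedged sketch of what enabling universal checkpoint loading in a DeepSpeed config fragment might look like; the key names here are assumptions and should be verified against the DeepSpeed documentation.

```python
# Assumed config fragment: universal checkpoint loading is toggled in the "checkpoint"
# section of the config dict passed to the Trainer via TrainingArguments(deepspeed=...).
ds_config = {
    "zero_optimization": {"stage": 2},
    "checkpoint": {
        "load_universal": True,  # assumed key name; check the DeepSpeed docs before relying on it
    },
    "train_micro_batch_size_per_gpu": "auto",
}
```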
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer
in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this.
* Simplify setting self.add_prefix_space, ensure pre_tok exists
* Wrap in try-except to catch 'Custom PreTokenizer cannot be serialized'
862d1a346a/bindings/python/src/pre_tokenizers.rs (L672) produces the Exception. It is triggered by the roformer tests, as the RoFormerTokenizerFast uses a custom PreTokenizer.
* Propagate add_prefix_space in T5TokenizerFast to superclass
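A quick way to check the propagation described above, using gpt2 as an arbitrary fast tokenizer: after the fix, the flag should be reflected on the Rust-level pre-tokenizer rather than only stored on the Python wrapper.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)

print(tok.add_prefix_space)                 # flag stored on the Python wrapper
print(tok.backend_tokenizer.pre_tokenizer)  # ByteLevel pre-tokenizer; should now reflect add_prefix_space=True
print(tok.tokenize("Hello world"))          # the first token is encoded as if preceded by a space
```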
* look-ahead negation
* re-add examples by default
* Fix the bug in topological sort
* Update create_dependency_mapping.py
* start adding test
* finalize test
* more tests
* style
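As an aside, a minimal illustration of the dependency-ordered topological sort that create_dependency_mapping.py needs to get right (file names are purely illustrative; this is not the repository's actual implementation):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Map each modular file to the modular files it depends on (illustrative names).
dependencies = {
    "modular_emu3.py": {"modular_llama.py", "modular_chameleon.py"},
    "modular_chameleon.py": {"modular_llama.py"},
    "modular_llama.py": set(),
}

# static_order() yields a node only after all of its dependencies have been yielded,
# which is the ordering property the bug fix above is concerned with.
print(list(TopologicalSorter(dependencies).static_order()))
```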