* Add attention mask and pad token warning to many of the models
* Remove changes under examples/research_projects
These files are not maintained by HF.
* Skip the warning check during torch.fx or JIT tracing
* Switch ordering for the warning and input shape assignment
This ordering is a little cleaner for some of the cases.
* Add missing line break in one of the files
* Register ModelOutput subclasses as supported torch.utils._pytree nodes
Fixes #25357, where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses.
* Add test for torch pytree ModelOutput serialization and deserialization
* Add descriptive docstring to MinNewTokensLength
It addresses https://github.com/huggingface/transformers/issues/24783
* Refine the differences between `min_length` and `min_new_tokens`
* Remove extra line
* Remove extra arguments in generate
* Add a missing space
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Run the linter
* Add clarification comments
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
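The difference between `min_length` and `min_new_tokens` that the docstring commits above refine can be sketched in plain Python. This is a minimal illustration, assuming a decoder-only model where the prompt counts toward `min_length` but not toward `min_new_tokens`. The helper `eos_allowed` is hypothetical and is not part of the transformers API.

```python
from typing import Optional


def eos_allowed(
    prompt_len: int,
    generated: int,
    min_length: int = 0,
    min_new_tokens: Optional[int] = None,
) -> bool:
    """Return True if the end-of-sequence token may be emitted.

    Hypothetical helper for illustration only.
    prompt_len: number of tokens in the prompt.
    generated: number of new tokens produced so far.
    """
    # `min_length` is measured against prompt + generated tokens.
    if prompt_len + generated < min_length:
        return False
    # `min_new_tokens` is measured against generated tokens only.
    if min_new_tokens is not None and generated < min_new_tokens:
        return False
    return True
```

With a 5-token prompt, `min_length=8` is satisfied after 3 new tokens, while `min_new_tokens=8` requires 8 new tokens regardless of prompt length.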
* added benchmarks for compile
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* added more models
* added more models fr
* added visualizations
* minor fix
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/perf_torch_compile.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Added links to models and put charts side by side
* Added batch comparisons
* Added more comparisons
* Fix table
* Added link to wheel
* Update perf_torch_compile.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add Description And Example to Docstring
* make style corrections
* make style
* Doc Style Consistent With HF
* Apply make style
* Modify Docstring
* Edit Type in Docstring
* Feedback Incorporated
* Edit Docstring
* make style
* Post Review Changes
* Review Feedback Incorporated
* Styling
* Formatting
* make style
* pep8
* Loosen output shape restrictions on GPT-style models
* Use more self-explanatory variables
* Revert "Use more self-explanatory variables"
This reverts commit 5fd9ab3911.
* Remove jnp.DeviceArray since it is deprecated.
* Replace all instances of jnp.DeviceArray with jax.Array
* Update src/transformers/models/bert/modeling_flax_bert.py
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Deal better with nested configs
* Fixes
* More fixes
* Fix last test
* Clean up existing configs
* Remove hack in MPT Config
* Update src/transformers/configuration_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Fix setting a nested config via dict in the kwargs
* Adapt common test
* Add test for nested config load with dict
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
The former spelling is deprecated and has been discouraged for a
while. The latter spelling seems to be more common in this project
anyway, so this change ought to be safe.
Fixes https://github.com/huggingface/transformers/issues/25283
* Update InstructBLIP values
Note: the tests are not independent. Running a test independently produces different logits compared to running all the integration tests.
* Update test values after rescale update
* Remove left over commented out code
* Revert to previous rescaling logic
* Update rescale tests
* Update list of logging integrations in docstring
Also update type hint
* Also add 'flyte' to report_to callback list
* Revert 'report_to' type hint update
Due to CLI breaking
Fix bug in InstructBlip generate function
Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`).
This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object.
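
The check described above can be sketched as follows. This is not the actual InstructBlip code: plain Python lists stand in for tensors, `GenerateResult` is a hypothetical stand-in for the results object, and the token remapping is only a placeholder for whatever postprocessing is applied.

```python
from dataclasses import dataclass
from typing import List, Union

Sequences = List[List[int]]  # plain lists stand in for tensors in this sketch


@dataclass
class GenerateResult:
    """Hypothetical stand-in for the object returned when return_dict_in_generate=True."""

    sequences: Sequences


def remap_tokens(sequences: Sequences, old_id: int = 0, new_id: int = 2) -> Sequences:
    """Placeholder postprocessing step: replace one token id with another."""
    return [[new_id if tok == old_id else tok for tok in seq] for seq in sequences]


def postprocess_generate_output(
    outputs: Union[Sequences, GenerateResult],
) -> Union[Sequences, GenerateResult]:
    """Postprocess bare sequences, or the `sequences` attribute of a results object."""
    if isinstance(outputs, list):
        # Bare sequences: the only case the previous code handled.
        return remap_tokens(outputs)
    # Results object: postprocess its `sequences` attribute and keep the wrapper.
    outputs.sequences = remap_tokens(outputs.sequences)
    return outputs
```

Both return types go through the same postprocessing, so callers get consistent sequences whether or not `return_dict_in_generate` is set.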