* docs: replace torch.distributed.run by torchrun
`transformers` now officially supports PyTorch >= 1.10.
The entrypoint `torchrun` is available from 1.10 onwards.
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
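A minimal sketch of the equivalence behind this change (script name and flag values are placeholders, not from the diff): `torchrun` is the console-script alias for `python -m torch.distributed.run`, so a version guard is all that is needed to rely on it.

```python
import torch
from packaging import version

# torchrun ships with PyTorch >= 1.10, so guard on the version (sketch):
#
#   python -m torch.distributed.run --nproc_per_node=8 train.py  # old spelling
#   torchrun --nproc_per_node=8 train.py                         # new spelling
#
assert version.parse(torch.__version__) >= version.parse("1.10")
```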
* Update src/transformers/trainer.py
with @ArthurZucker's suggestion
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Renamed variable `extension` to `builder_name`
* If the builder name is jsonl, change it to json to align with load_dataset
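A rough sketch of the rename and the jsonl special case (variable names are assumptions, not the actual diff): `datasets` serves JSON Lines files through its `json` builder, so the file extension cannot be used as the builder name verbatim.

```python
from datasets import load_dataset

data_file = "train.jsonl"

# The file extension doubles as the datasets builder name, except that
# JSON Lines is handled by the "json" builder (hence the rename from
# `extension` to `builder_name`).
builder_name = data_file.rsplit(".", 1)[-1]
if builder_name == "jsonl":
    builder_name = "json"

dataset = load_dataset(builder_name, data_files=data_file)
```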
* Apply suggestions from code review
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
---------
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* try to apply styling using ruff
* might need to remove these changes?
* use ruff format and ruff check
* use isinstance instead of type comparison
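A toy illustration of the rule being applied (not actual repo code): `isinstance` accepts subclasses and tuples of types, which direct `type(...) ==` comparisons do not.

```python
from collections import OrderedDict

config = OrderedDict(hidden_size=768)

print(type(config) == dict)      # False: OrderedDict is not exactly dict
print(isinstance(config, dict))  # True: subclasses are accepted
```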
* use # fmt: skip
* use # fmt: skip
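A small example of what the marker does (toy values): `# fmt: skip` tells the formatter to leave that statement exactly as written, preserving manual alignment.

```python
# Without the marker the formatter would collapse this manual alignment.
POWERS_OF_TWO = [   1,    2,    4,    8]  # fmt: skip

# Regular lines are still reformatted as usual.
doubled = [x * 2 for x in POWERS_OF_TWO]
```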
* nits
* some styling changes
* update ci job
* nits isinstance
* more files update
* nits
* more nits
* small nits
* check and format
* revert wrong changes
* actually use formatter instead of checker
* nits
* well docbuilder is overwriting this commit
* revert notebook changes
* try to nuke docbuilder
* style
* fix feature extraction test
* remove `indent-width = 4`
* fixup
* more nits
* update the ruff version that we use
* style
* nuke docbuilder styling
* leave the print for detected changes
* nits
* Remove file I/O
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
* style
* nits
* revert notebook changes
* Add # fmt: skip when possible
* Add # fmt: skip when possible
* Fix
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* More ` # fmt: skip` usage
* Nits
* more fixes
* fix tapas
* Another way to skip
* Recommended way
* Fix two more files
* Remove asynch
---------
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
* Remove the torch main_process_first context manager from TF examples
* Correctly set num_beams=1 in our examples, and add a guard in GenerationConfig.validate()
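A hedged sketch of the kind of guard this adds (the class below is a trimmed stand-in, not the real `GenerationConfig`): flags that only affect beam search should trigger a warning when `num_beams=1` instead of being silently ignored.

```python
import warnings
from dataclasses import dataclass


@dataclass
class GenerationConfig:  # trimmed stand-in for illustration
    num_beams: int = 1
    early_stopping: bool = False

    def validate(self):
        # Beam-search-only options do nothing when num_beams == 1;
        # surface that to the user rather than ignoring them.
        if self.num_beams == 1 and self.early_stopping:
            warnings.warn("`early_stopping` has no effect when `num_beams=1`.")


GenerationConfig(early_stopping=True).validate()  # emits the warning
```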
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Normalize only if needed
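A sketch of the conditional normalization described here (the checkpoint is an example; attribute names follow the usual image-processor convention but are assumptions for this snippet):

```python
from torchvision.transforms import Compose, Normalize, ToTensor
from transformers import AutoImageProcessor

image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

transforms = [ToTensor()]

# Only normalize when the image processor actually defines statistics;
# otherwise leave pixel values as ToTensor() produced them.
has_stats = (
    getattr(image_processor, "image_mean", None) is not None
    and getattr(image_processor, "image_std", None) is not None
)
if has_stats:
    transforms.append(
        Normalize(mean=image_processor.image_mean, std=image_processor.image_std)
    )

transform = Compose(transforms)
```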
* Update examples/pytorch/image-classification/run_image_classification.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* if else in one line
* within block
* one more place, sorry for mess
* import order
* Update examples/pytorch/image-classification/run_image_classification.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/pytorch/image-classification/run_image_classification_no_trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
`jnp.array` is a function, not a type:
https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.array.html
so it never makes sense to use `jnp.array` in a type annotation. Presumably the intent was to write `jnp.ndarray` aka `jax.Array`.
Co-authored-by: Peter Hawkins <phawkins@google.com>
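For example (a minimal illustration):

```python
import jax
import jax.numpy as jnp

# jnp.array is a constructor function; the array *type* is jnp.ndarray,
# an alias of jax.Array. Annotate with the type, not the function.
def scale(x: jnp.ndarray, factor: float) -> jax.Array:
    return x * factor

print(scale(jnp.array([1.0, 2.0]), 3.0))
```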
* remove ShardedDDP as it was deprecated
* apply review suggestion
* make style
* Oops, forgot to remove the compute_loss context manager in Seq2SeqTrainer.
* remove the unnecessary conditional statement
* keep the logic of IPEX
* clean code
* mixed precision setup & make fixup
---------
Co-authored-by: statelesshz <jihuazhong1@huawei.com>
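A hedged sketch of the IPEX path the commits above preserve (the model and optimizer are toy stand-ins; `ipex.optimize` is IPEX's documented entrypoint, everything else is an assumption):

```python
import torch
import intel_extension_for_pytorch as ipex  # requires the IPEX package

model = torch.nn.Linear(16, 16)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# ipex.optimize rewrites the model (and optimizer) for CPU execution;
# dtype=torch.bfloat16 sets up the mixed-precision path.
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)
```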
* refactor: change default block_size
* fix: revert tf script to original
* fix: revert files to original
* rebase
* refactor: add min block_size to files
* reformat: add min block_size for run_clm tf
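A sketch of the clamping these commits describe (names follow run_clm's conventions but are assumptions here; the default value is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example checkpoint
block_size = None  # e.g. not set on the command line

DEFAULT_BLOCK_SIZE = 1024  # assumed default for illustration

# Cap the sequence length at what the tokenizer/model supports, and fall
# back to the default when the user passes nothing.
if block_size is None:
    block_size = min(DEFAULT_BLOCK_SIZE, tokenizer.model_max_length)
else:
    block_size = min(block_size, tokenizer.model_max_length)
```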
* from seq2seq speech
* [Flax] Example script for speech seq2seq
* tests and fixes
* make style
* fix: label padding tokens
* fix: label padding tokens over list
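A sketch of the label-padding fix, applied over a list of label sequences (-100 is the conventional ignore index for the cross-entropy loss; the values are toy data):

```python
pad_token_id = 50256  # example value; normally tokenizer.pad_token_id

batch_labels = [
    [12, 99, pad_token_id, pad_token_id],
    [7, pad_token_id, pad_token_id, pad_token_id],
]

# Replace pad positions with -100 so the loss ignores them, handling a
# list of label sequences rather than a single one.
batch_labels = [
    [tok if tok != pad_token_id else -100 for tok in labels]
    for labels in batch_labels
]
print(batch_labels)
```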
* update layer-norm names for Whisper
* try datasets iter loader
* create readme and append results
* style
* make style
* adjust lr
* use pt dataloader
* make fast
* pin gen max len
* finish
* add pt to requirements for test
* fix pt -> torch
* add accelerate