* Add XLA torchrun support
* Clarify that DDP doesn't yet work with the torch.distributed XLA backend
* Enable DDP with torchrun and XLA (now available in PT-XLA 1.13)
* Add check for AWS Neuron availability and AWS Neuron specific compiler flag
* Change the new test's name to TestTrainerDistributedNeuronCore
* Remove "assert" and raise an exception instead
* Remove compiler flag as it is optional. If needed, it will be added in another PR.
* Use TORCHELASTIC_RUN_ID to determine whether torchrun is used
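The torchrun check described in the last item can be sketched roughly as follows. This is only an illustration of the idea, not the code from the commit: the helper name `launched_with_torchrun` and its `environ` parameter are hypothetical, and the only fact relied on is that torchrun exports `TORCHELASTIC_RUN_ID` to the worker processes.

```python
import os


def launched_with_torchrun(environ=None):
    """Return True if TORCHELASTIC_RUN_ID is present in the environment.

    `environ` is a hypothetical parameter added for testability; by default
    the real process environment is inspected.
    """
    env = os.environ if environ is None else environ
    return "TORCHELASTIC_RUN_ID" in env


if __name__ == "__main__":
    # Under `torchrun script.py` this prints True; under plain `python script.py`
    # it prints False unless the variable was exported manually.
    print(launched_with_torchrun())
```

Checking for the variable's presence, rather than its value, keeps the sketch independent of whatever run ID the launcher assigns.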
* `blip` support for training
* remove labels creation
* remove unneeded `decoder_input_ids` creation
* final changes
- add colab link to documentation
- reduction = mean for loss
* fix nits
* update link
* clearer error message
* Add epsilon- and eta-sampling.
Add epsilon- and eta-sampling, following the official code from https://github.com/john-hewitt/truncation-sampling and adapting it to be more configurable, as required by Hugging Face Transformers.
* Add unit tests for epsilon- and eta-sampling.
* Black: fix code formatting.
* Fix docstring spacing.
* Clean up newlines.
* Fix implementation bugs and their associated tests.
* Remove epsilon- and eta-sampling parameters from PretrainedConfig.
* Clarify and clean up the documentation.
* Remove parameters for PretrainedConfig test.
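For readers unfamiliar with the two truncation rules added above, here is a minimal pure-Python sketch. It is an assumption-laden illustration, not the implementation merged in this change: the function names `epsilon_filter` and `eta_filter` are hypothetical, logits are plain lists rather than tensors, and the `min_tokens_to_keep` handling is a simplification. Epsilon-sampling masks tokens whose probability falls below a fixed `epsilon`; eta-sampling uses the entropy-dependent cutoff `min(epsilon, sqrt(epsilon) * exp(-entropy))`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def epsilon_filter(logits, epsilon, min_tokens_to_keep=1):
    """Mask (set to -inf) every token whose probability is below `epsilon`.

    The `min_tokens_to_keep` most probable tokens are always retained.
    """
    probs = softmax(logits)
    ranked = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:min_tokens_to_keep])
    keep.update(i for i, p in enumerate(probs) if p >= epsilon)
    return [x if i in keep else float("-inf") for i, x in enumerate(logits)]


def eta_filter(logits, epsilon, min_tokens_to_keep=1):
    """Like epsilon_filter, but with an entropy-dependent cutoff eta."""
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    eta = min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))
    return epsilon_filter(logits, eta, min_tokens_to_keep)


if __name__ == "__main__":
    peaked = [math.log(p) for p in (0.7, 0.2, 0.09, 0.01)]
    print(epsilon_filter(peaked, epsilon=0.05))
    flat = [0.0] * 100
    print(sum(x != float("-inf") for x in eta_filter(flat, epsilon=0.05)))
```

On a flat, high-entropy distribution the eta cutoff shrinks well below `epsilon`, so far fewer tokens are masked than under the fixed threshold.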
* initial commit, refactoring the text generation api reference
* removed repetitive code examples
* Refactoring the text generation docs to reduce repetition
* make style
* Part of the "text generation" rework: adding a high-level overview of the text generation strategies
* code samples update via make style
* fixed a few formatting issues
* Apply suggestions from review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fixed spaces, and switched two links to markdown
* Apply Steven's suggestions from review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* new lines after headers to fix link rendering
* review feedback addressed. added links to image captioning and audio transcription examples
* minor capitalization fix
* addressed the review feedback
* Apply suggestions from review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Applied review suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Added TF example for image classification
* Code style polishing
* code style polishing
* minor polishing
* fixed a link in a tip, and a typo in the inference TF content
* Apply Amy's suggestions from review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/tasks/image_classification.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* review feedback addressed
* make style
* added PushToHubCallback with save_strategy="no"
* minor polishing
* added PushToHubCallback with save_strategy=no
* minor polishing
* Update docs/source/en/tasks/image_classification.mdx
* added data augmentation
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* make style
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* add draft logit processor
* add template functions
* update timestamp processor parameters
* draft script
* simplify code
* cleanup
* fixup and clean
* update pipeline
* style
* clean up previous idea
* add tokenization utils
* update tokenizer and asr output
* fit whisper type
* style and update test
* clean test
* style test
* update tests
* update error test
* update code (not based on review yet)
* update tokenization
* update asr pipeline
* update code
* cleanup and update test
* fmt
* remove text verification
* cleanup
* cleanup
* add model test
* update tests
* update code add docstring
* update code and add docstring
* fix pipeline tests
* Small update.
* Fixup.
* Tmp.
* More support.
* Making `forced_decoder_ids` non mandatory for users to set.
* update and fix first bug
* properly process sequence right after merge if last
* todo
* allow list inputs + compute begin index better
* start adding tests
* add the 3 edge cases
* style
* format sequences
* fixup
* update
* update
* style
* test passes, edge cases should be good
* update last value
* remove Trie
* update tests and expected values
* handle bigger chunk_length
* clean tests a bit
* refactor chunk iter and clean pipeline
* update tests
* style
* refactor chunk iter and clean pipeline
* update
* resolve comments
* Apply suggestions from code review
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* take stride right into account
* update test expected values
* Update code based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* Clarify and add missing typical_p docstring.
* Make the docstring easier to understand.
* Clarify typical_p docstring
Accept the suggestion by @stevhliu for paraphrasing the docstring.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Use the same docstring as in GenerationConfig
Follow the suggestion by @stevhliu in the pull request conversation.
* Fix docstring spacing.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add num_workers for prepare_tf_dataset
* Bugfix in the default collator and change default tensor type
* Remove the "num_workers" arg and move it to a new PR
* Fixing #20783
* Update src/transformers/pipelines/base.py
* Fixing some tests.
* Fixup.
* Remove ffmpeg dep + a bit more relaxed for bigbird QA precision.
* Better dataset.
* Prevent failing on TF.
* Better condition. We can't use `can_use_iterator` since we cannot use it
directly.