* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script
* make fixup changes
* PR comments
* changed input to Acceletor based on PR comment, ran make fixup
* Added comment explaining the sync_gradients statement
* Fixed lr scheduler max steps
* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper
* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper
* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script
* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py
* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script
* [DX fix] Fixing QA pipeline streaming a dataset.
QuestionAnsweringArgumentHandler would iterate over the whole dataset
effectively killing all properties of the pipeline.
This restores nice properties when using `Dataset` or `Generator` since
those are meant to be consumed lazily.
* Handling TF better.
* Delete valohai.yaml
* NLP => ML
* typo
* website supports https
* datasets
* 60k + modalities
* unrelated link fixing for accelerate
* Ok those links were actually broken
* Fix link
* Make `AutoTokenizer` auto-link
* wording tweak
* add at least one non-nlp task
* Draft new cached_file
* Initial draft for config and model
* Small fixes
* Fix first batch of tests
* Look in cache when internet is down
* Fix last tests
* Bad black, not fixing all quality errors
* Make diff less
* Implement change for TF and Flax models
* Add tokenizer and feature extractor
* For compatibility with main
* Add utils to move the cache and auto-do it at first use.
* Quality
* Deal with empty commit shas
* Deal with empty etag
* Address review comments
* Adding a better error message when the model is improperly configured
within transformers.
* Update src/transformers/pipelines/__init__.py
* Black version.
* Overriding task aliases so that tokenizer+feature_extractor
values are correct.
* Fixing task aliases by overriding their names early
* X.
* Fixing feature-extraction.
* black again.
* Normalizing `translation` too.
* Fixing last few corner cases.
translation need to use its non normalized name (translation_XX_to_YY,
so that the task_specific_params are correctly overloaded).
This can be removed and cleaned up in a later PR.
`speech-encode-decoder` actually REQUIRES to pass a `tokenizer` manually
so the error needs to be discarded when the `tokenizer` is already
there.
* doc-builder fix.
* Fixing the real issue.
* Removing dead code.
* Do not import the actual config classes.
* First draft
* Add VideoMAEForVideoClassification
* Improve conversion script
* Add VideoMAEForPreTraining
* Add VideoMAEFeatureExtractor
* Improve VideoMAEFeatureExtractor
* Improve docs
* Add first draft of model tests
* Improve VideoMAEForPreTraining
* Fix base_model_prefix
* Make model take pixel_values of shape (B, T, C, H, W)
* Add loss computation of VideoMAEForPreTraining
* Improve tests
* Improve model testsé
* Make all tests pass
* Add VideoMAE to main README
* Add tests for VideoMAEFeatureExtractor
* Add integration test
* Improve conversion script
* Rename patch embedding class
* Remove VideoMAELayer from init
* Update design of patch embeddings
* Improve comments
* Improve conversion script
* Improve conversion script
* Add conversion of pretrained model
* Add loss verification of pretrained model
* Add loss verification of unnormalized targets
* Add integration test for pretraining model
* Apply suggestions from code review
* Fix bug to make feature extractor resize only shorter edge
* Address more comments
* Improve normalization of videos
* Add doc examples
* Move constants to dedicated script
* Remove scripts
* Transfer checkpoints, fix docs
* Update script
* Update image mean and std
* Fix doc tests
* Set return_tensors to NumPy by default
* Revert the previous change
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones
* Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general
* Remove pdb comment
* Apply suggestions
* Cleanup some code
* Improve signatures
* Try to reduce the number of reshape/copies
* I don't think we actually need the layer_num scaling trick
* No need for duplication
* Try to fix beam_search
* Fix beam search
* Removing layer num normalization seems to be breaking
* Not sure self.layer_number normalization actually matters
* Try and be backward compatible
* Try to fix beam_search
* Revert attempt to be backward compatible
* Improve documentation on past_key_values format
* Optimize the device allocation in case of hidden_states in multiple devices
* No need to manually cast the values to a specific device
* Rename with long version of variables
* Improve type hinting
* Add comment that explains that some methods return views
* Actually i think the attention casting only makes sense when we use torch.float16
* We don't actually need layer_number to be passed anymore
* Fix FX test
* Bypass torch.baddbmm
* Apply suggestions from code review
* Add comment about support for torchScript v1.11
* fix ONNX support for bloom (#18456)
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu
version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py
* fix: keras fit tests for segformer tf and minor refactors.
* refactor: test_keras_fit to make it simpler using the existing one.
* fix: styling issues.
* Add file in spanish docs to be translated
* Translate first two sections to Spanish
* Translate four additional sections to Spanish
* Finish translation to Spanish
* Improve writing style in Spanish
* Add suggested changes from reviewer
This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.