* Add Image2TextGenerationPipeline to supported pipelines
* Add Flax and TensorFlow support
* Add Flax and TensorFlow small tests
* Add default model for TensorFlow
* Add docstring
* Fix doc style
* Add tiny models for pytorch and flax
* Remove flax from pipeline.
Fix tests
* Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and TensorFlow
* Fix TensorFlow support
Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
* Implement ONNX support for Longformer
Fix repo consistency check complaints
Fix value mismatches
Add pooler output for default model
Increase validation atol to accommodate multiple-choice error
Fix copies
Fix chunking for longer sequence lengths
Add future comment
* Fix issue in mask_invalid_locations
* Remove torch imports in configuration_longformer
* Change config access to fix LED
* Push opset version to support tril
* Work in review comments (mostly style)
* Add Longformer to ONNX tests
* bnb minor modifications
- refactor documentation
- add troubleshooting README
- add PyPI library to Dockerfile
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* put in one block
- put bash instructions in one block
* update readme
- refactor a bit hardware requirements
* change text a bit
* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* apply suggestions
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* add link to paper
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update tests/mixed_int8/README.md
* Apply suggestions from code review
* refactor a bit
* add instructions for Turing & Ampere
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* add A6000
* clarify a bit
* remove small part
* Update tests/mixed_int8/README.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* onnx config for clip
* default opset as 14
* changes from the original repo
* input values order fix
* outputs fix
* remove unused import
* ran make fix-copies
* black format
* review comments: forward ref, import fix, model change revert, .to cleanup
* make style
* formatting fixes
* revert groupvit
* comment for cast to int32
* comment fix
* make .T as .t() for onnx conversion
* ran make fix-copies
* remove unneeded comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies
* remove comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* first commit
* correct replace function
* add final changes
- works like a charm!
- cannot implement tests yet
- tested
* clean up a bit
* add bitsandbytes dependencies
* working version
- added import function
- added bitsandbytes utils file
* small fix
* small fix
- fix import issue
* fix import issues
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor a bit
- move bitsandbytes utils to utils
- change comments on functions
* reformat docstring
- reformat docstring on init_empty_weights_8bit
* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* revert bad formatting
* change to bitsandbytes
* refactor a bit
- remove init8bit since it is useless
* more refactoring
- fixed init empty weights issue
- added threshold param
* small hack to make it work
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* remove the small hack
* modify utils file
* make style + refactor a bit
* create correctly device map
* add correct dtype for device map creation
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions
- remove with torch.grad
- do not rely on Python bool magic!
* add docstring
- add docstring for new kwargs
* add docstring
- comment `replace_8bit_linear` function
- fix weird formatting
* added more documentation
- added new utility function for memory footprint tracking
- colab demo to add
* few modifs
- typo doc
- force cast into float16 when load_in_8bit is enabled
* added colab link
* add test architecture + docstring a bit
* refactor a bit testing class
* make style + refactor a bit
* enhance checks
- add more checks
- start writing saving test
* clean up a bit
* make style
* add more details on doc
* add more tests
- still needs to fix 2 tests
* replace by "or"
- could not fix it from GitHub GUI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor a bit testing code + add readme
* make style
* fix import issue
* Update src/transformers/modeling_utils.py
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* add few comments
* add more docstring + make style
* more docstring
* raise error when loaded in 8bit
* make style
* add warning if loaded on CPU
* add small sanity check
* fix small comment
* add bitsandbytes on dockerfile
* Improve documentation
- improve documentation from comments
* add few comments
* slow tests pass on the VM but not on the CI VM
* Fix merge conflict
* make style
* another test should pass on a multi gpu setup
* fix bad import in testing file
* Fix slow tests
- remove dummy batches
- no more CUDA illegal memory errors
* modify dockerfile
* Update docs/source/en/main_classes/model.mdx
* Update Dockerfile
* Update model.mdx
* Update Dockerfile
* Apply suggestions from code review
* few modifications
- lm head can stay on disk/cpu
- change model name so that test pass
* change test value
- change test value to the correct output
- torch bmm changed to baddbmm in bloom modeling when merging
* modify installation guidelines
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* replace `n` by `name`
* merge `load_in_8bit` and `low_cpu_mem_usage`
* first try - keep the lm head in full precision
* better check
- check the attribute `base_model_prefix` instead of computing the number of parameters
* added more tests
* Update src/transformers/utils/bitsandbytes.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit
* improve documentation
- fix typos for installation
- change title in the documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* update features
* Add MT5OnnxConfig with updated tests and docs
* fix imports
* fix onnx_config_cls for mt5
Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
* Delete valohai.yaml
* NLP => ML
* typo
* website supports https
* datasets
* 60k + modalities
* unrelated link fixing for accelerate
* Ok those links were actually broken
* Fix link
* Make `AutoTokenizer` auto-link
* wording tweak
* add at least one non-nlp task
* First draft
* Add VideoMAEForVideoClassification
* Improve conversion script
* Add VideoMAEForPreTraining
* Add VideoMAEFeatureExtractor
* Improve VideoMAEFeatureExtractor
* Improve docs
* Add first draft of model tests
* Improve VideoMAEForPreTraining
* Fix base_model_prefix
* Make model take pixel_values of shape (B, T, C, H, W)
* Add loss computation of VideoMAEForPreTraining
* Improve tests
* Improve model tests
* Make all tests pass
* Add VideoMAE to main README
* Add tests for VideoMAEFeatureExtractor
* Add integration test
* Improve conversion script
* Rename patch embedding class
* Remove VideoMAELayer from init
* Update design of patch embeddings
* Improve comments
* Improve conversion script
* Improve conversion script
* Add conversion of pretrained model
* Add loss verification of pretrained model
* Add loss verification of unnormalized targets
* Add integration test for pretraining model
* Apply suggestions from code review
* Fix bug to make feature extractor resize only shorter edge
* Address more comments
* Improve normalization of videos
* Add doc examples
* Move constants to dedicated script
* Remove scripts
* Transfer checkpoints, fix docs
* Update script
* Update image mean and std
* Fix doc tests
* Set return_tensors to NumPy by default
* Revert the previous change
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add file in Spanish docs to be translated
* Translate first two sections to Spanish
* Translate four additional sections to Spanish
* Finish translation to Spanish
* Improve writing style in Spanish
* Add suggested changes from reviewer
This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.
* add LUKE models for downstream tasks
* add new LUKE models to docs
* fix typos
* remove commented lines
* exclude None items from tuple return values