* Properly use test_fetcher for examples
* Fake example modification
* Fake modeling file modification
* Clean fake modifications
* Run example tests for any modification.
* Correct outdated function signatures on website.
* Upgrade sphinx to 3.5.4 (latest 3.x)
* Test
* Test
* Test
* Test
* Test
* Test
* Revert unnecessary changes.
* Change sphinx version to 3.5.4"
* Test python 3.7.11
* First commit
* Make style
* Fix dummy objects
* Add Detectron2 config
* Add LayoutLMv2 pooler
* More improvements, add documentation
* More improvements
* Add model tests
* Add clarification regarding image input
* Improve integration test
* Fix bug
* Fix another bug
* Fix another bug
* Fix another bug
* More improvements
* Make more tests pass
* Make more tests pass
* Improve integration test
* Remove gradient checkpointing and add head masking
* Add integration test
* Add LayoutLMv2ForSequenceClassification to the tests
* Add LayoutLMv2ForQuestionAnswering
* More improvements
* More improvements
* Small improvements
* Fix _LazyModule
* Fix fast tokenizer
* Move sync_batch_norm to a separate method
* Replace dummies by requires_backends
* Move calculation of visual bounding boxes to separate method + update README
* Add models to main init
* First draft
* More improvements
* More improvements
* More improvements
* More improvements
* More improvements
* Remove is_split_into_words
* More improvements
* Simply tesseract - no use of pandas anymore
* Add LayoutLMv2Processor
* Update is_pytesseract_available
* Fix bugs
* Improve feature extractor
* Fix bug
* Add print statement
* Add truncation of bounding boxes
* Add tests for LayoutLMv2FeatureExtractor and LayoutLMv2Tokenizer
* Improve tokenizer tests
* Make more tokenizer tests pass
* Make more tests pass, add integration tests
* Finish integration tests
* More improvements
* More improvements - update API of the tokenizer
* More improvements
* Remove support for VQA training
* Remove some files
* Improve feature extractor
* Improve documentation and one more tokenizer test
* Make quality and small docs improvements
* Add batched tests for LayoutLMv2Processor, remove fast tokenizer
* Add truncation of labels
* Apply suggestions from code review
* Improve processor tests
* Fix failing tests and add suggestion from code review
* Fix tokenizer test
* Add detectron2 CI job
* Simplify CI job
* Comment out non-detectron2 jobs and specify number of processes
* Add pip install torchvision
* Add durations to see which tests are slow
* Fix tokenizer test and make model tests smaller
* Frist draft
* Use setattr
* Possible fix
* Proposal with configuration
* First draft of fast tokenizer
* More improvements
* Enable fast tokenizer tests
* Make more tests pass
* Make more tests pass
* More improvements
* Addd padding to fast tokenizer
* Mkae more tests pass
* Make more tests pass
* Make all tests pass for fast tokenizer
* Make fast tokenizer support overflowing boxes and labels
* Add support for overflowing_labels to slow tokenizer
* Add support for fast tokenizer to the processor
* Update processor tests for both slow and fast tokenizers
* Add head models to model mappings
* Make style & quality
* Remove Detectron2 config file
* Add configurable option to label all subwords
* Fix test
* Skip visual segment embeddings in test
* Use ResNet-18 backbone in tests instead of ResNet-101
* Proposal
* Re-enable all jobs on CI
* Fix installation of tesseract
* Fix failing test
* Fix index table
* Add LayoutXLM doc page, first draft of code examples
* Improve documentation a lot
* Update expected boxes for Tesseract 4.0.0 beta
* Use offsets to create labels instead of checking if they start with ##
* Update expected boxes for Tesseract 4.1.1
* Fix conflict
* Make variable names cleaner, add docstring, add link to notebooks
* Revert "Fix conflict"
This reverts commit a9b46ce9afe47ebfcfe7b45e6a121d49e74ef2c5.
* Revert to make integration test pass
* Apply suggestions from @LysandreJik's review
* Address @patrickvonplaten's comments
* Remove fixtures DocVQA in favor of dataset on the hub
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Base test
* More test
* Fix mistake
* Add a docstring change
* Add doc ignore
* Add changes
* Add recursive dep search
* Add recursive dep search
* save
* Finalize test mapping
* Fix bug
* Print prettier
* Ignore comments and empty lines
* Make script runnable from anywhere
* Need dev install
* Like that
* Adapt
* Add as artifact
* Try on torch tests
* Fix yaml error
* Install GitPython
* Apply everywhere
* Be more defensive
* Revert to all tests if something is wrong
* Install GitPython
* Test if there are tests before launching.
* Fixes
* Fixes
* Fixes
* Fixes
* Bash syntax is horrible
* Be less stupid
* Try differently
* Typo
* Typo
* Typo
* Style
* Better name
* Escape quotes
* Ignore black unhelpful re-formatting
* Not a docstring
* Deal with inits in dependency map
* Run all tests once PR is merged.
* Add last job
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Stronger dependencies gather
* Ignore empty lines too!
* Clean up
* Fix quality
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Laying down building stone for more flexible ONNX export capabilities
* Ability to provide a map of config key to override before exporting.
* Makes it possible to export BART with/without past keys.
* Supports simple mathematical syntax for OnnxVariable.repeated
* Effectively apply value override from onnx config for model
* Supports export with additional features such as with-past for seq2seq
* Store the output path directly in the args for uniform usage across.
* Make BART_ONNX_CONFIG_* constants and fix imports.
* Support BERT model.
* Use tokenizer for more flexibility in defining the inputs of a model.
* Add TODO as remainder to provide the batch/sequence_length as CLI args
* Enable optimizations to be done on the model.
* Enable GPT2 + past
* Improve model validation with outputs containing nested structures
* Enable Roberta
* Enable Albert
* Albert requires opset >= 12
* BERT-like models requires opset >= 12
* Remove double printing.
* Enable XLM-Roberta
* Enable DistilBERT
* Disable optimization by default
* Fix missing setattr when applying optimizer_features
* Add value field to OnnxVariable to define constant input (not from tokenizers)
* Add T5 support.
* Simplify model type retrieval
* Example exporting token_classification pipeline for DistilBERT.
* Refactoring to package `transformers.onnx`
* Solve circular dependency & __main__
* Remove unnecessary imports in `__init__`
* Licences
* Use @Narsil's suggestion to forward the model's configuration to the ONNXConfig to avoid interpolation.
* Onnx export v2 fixes (#12388)
* Tiny fixes
Remove `convert_pytorch` from onnxruntime-less runtimes
Correct reference to model
* Style
* Fix Copied from
* LongFormer ONNX config.
* Removed optimizations
* Remvoe bad merge relicas.
* Remove unused constants.
* Remove some deleted constants from imports.
* Fix unittest to remove usage of PyTorch model for onnx.utils.
* Fix distilbert export
* Enable ONNX export test for supported model.
* Style.
* Fix lint.
* Enable all supported default models.
* GPT2 only has one output
* Fix bad property name when overriding config.
* Added unittests and docstrings.
* Disable with_past tests for now.
* Enable outputs validation for default export.
* Remove graph opt lvls.
* Last commit with on-going past commented.
* Style.
* Disabled `with_past` for now
* Remove unused imports.
* Remove framework argument
* Remove TFPreTrainedModel reference
* Add documentation
* Add onnxruntime tests to CircleCI
* Add test
* Rename `convert_pytorch` to `export`
* Use OrderedDict for dummy inputs
* WIP Wav2Vec2
* Revert "WIP Wav2Vec2"
This reverts commit f665efb04c92525c3530e589029f0ae7afdf603e.
* Style
* Use OrderedDict for I/O
* Style.
* Specify OrderedDict documentation.
* Style :)
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Temporarily deactivate torch-scatter while we wait for new release
* torch-1.8.1 binary for scatter
* Revert to 1.8.0
* Pin torch dependency
* torchaudio and torchvision
* Squash all commits of modeling_detr_v7 branch into one
* Improve docs
* Fix tests
* Style
* Improve docs some more and fix most tests
* Fix slow tests of ViT, DeiT and DETR
* Improve replacement of batch norm
* Restructure timm backbone forward
* Make DetrForSegmentation support any timm backbone
* Fix name of output
* Address most comments by @LysandreJik
* Give better names for variables
* Conditional imports + timm in setup.py
* Address additional comments by @sgugger
* Make style, add require_timm and require_vision to testsé
* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone
* Add png files to fixtures
* Fix type hint
* Add timm to workflows
* Add `BatchNorm2d` to the weight initialization
* Fix retain_grad test
* Replace model checkpoints by Facebook namespace
* Fix name of checkpoint in test
* Add user-friendly message when scipy is not available
* Address most comments by @patrickvonplaten
* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner
* Better initialization
* Scipy is necessary to get sklearn metrics
* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel
* Make style
* Improve docs and add 2 community notebooks
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Make quality scripts work when one backend is missing.
* Check env variable is properly set
* Add default
* With print statements
* Fix typo
* Set env variable
* Remove debug code
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address review comments
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add a special tokenizer for CPM model
* make style
* fix
* Add docs
* styles
* cpm doc
* fix ci
* fix the overview
* add test
* make style
* typo
* Custom tokenizer flag
* Add REAMDE.md
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Squash all commits into one
* Update ViTFeatureExtractor to use image_utils instead of torchvision
* Remove torchvision and add Pillow
* Small docs improvement
* Address most comments by @sgugger
* Fix tests
* Clean up conversion script
* Pooler first draft
* Fix quality
* Improve conversion script
* Make style and quality
* Make fix-copies
* Minor docs improvements
* Should use fix-copies instead of manual handling
* Revert "Should use fix-copies instead of manual handling"
This reverts commit fd4e591bce.
* Place ViT in alphabetical order
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Initial script
* Add script to properly sort imports in init.
* Add to the CI
* Update utils/custom_init_isort.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Separate scripts that change content from quality
* Move class_mapping_update to style_checks
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>