* Fixed test_saved_model_extended
* Fix TFGPT2 tests
* make fixup
* Make sure keras-nlp utils are available for type hinting too
* Update src/transformers/testing_utils.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* make fixup
The error reads: “Currently the auto_kernel_selection does not support the grad mode! Please add torch.no_grad() before the inference runtime..”
Since jit mode only works in inference mode, it is safe to add such logic.
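As a rough illustration of the note above, the sketch below traces a module and runs it with gradients disabled. It is not the Trainer code from this change: `trace_for_inference`, `predict`, and the toy `nn.Linear` model are hypothetical names used only for the example, and plain `torch.jit.trace` stands in for whatever backend raises the error.

```python
import torch
from torch import nn


def trace_for_inference(model: nn.Module, example_input: torch.Tensor):
    """Hypothetical helper: trace `model` for inference with grad mode off."""
    model.eval()
    # Tracing runs a forward pass, so it is wrapped in no_grad as well.
    with torch.no_grad():
        return torch.jit.trace(model, example_input)


def predict(traced_model, batch: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: run the traced model without tracking gradients."""
    with torch.no_grad():
        return traced_model(batch)


# Toy model and inputs, assumed only for this illustration.
model = nn.Linear(4, 2)
traced = trace_for_inference(model, torch.randn(1, 4))
output = predict(traced, torch.randn(3, 4))
```

Because both the trace and the forward pass happen under `torch.no_grad()`, the output carries no autograd history.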
Neuron supports extraction of XLA graphs for compilation.
However, when both the do_train and do_eval options are enabled,
the sizes returned by the tensor operator can be 0. To avoid an
INVALID_ARGUMENT error, we use an inequality when checking whether
a tensor needs padding.
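The padding check described above might look roughly like the following. This is a plain-Python sketch on nested lists rather than the actual tensor code: `pad_to_max` and the `pad_index` default are assumptions made for the example, and all rows are assumed to have the same length.

```python
from typing import List


def pad_to_max(rows: List[List[int]], max_size: int, pad_index: int = -100) -> List[List[int]]:
    """Hypothetical helper: pad every row to `max_size` when padding is needed."""
    # Using `>=` rather than `==`: if the gathered max_size comes back as 0,
    # the rows are returned unchanged instead of being "padded" to a smaller
    # size, which is the case that would otherwise trigger an error.
    if not rows or len(rows[0]) >= max_size:
        return rows
    return [row + [pad_index] * (max_size - len(row)) for row in rows]


padded = pad_to_max([[1, 2], [3, 4]], max_size=4)
unchanged = pad_to_max([[1, 2]], max_size=0)
```

With an equality check, the `max_size=0` call would fall through to the padding branch; the inequality treats it as "no padding needed".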
* [modelcard] Set model name if empty
* no magic
Co-authored-by: Sylvain Gugger <sylvain@huggingface.co>
* add minimal working gpt2 tokenizer
* graph mode and output equivalence tests working
* not today tensorflow. serialization test passing!
* fix style, documentation, docstrings and all that jazz
* passing consistency checks
* move keras nlp to tf dependencies
* fix tf modeling utils and gpt2 attention to enable compiling
* fix (I hope) keras nlp dependencies
* revert changes on generation
* remove debug prints
* remove redundant tf dummy objects
* add from config, get config and max length settings to address review
* let flake ignore the error on distillation you are welcome
* test from config
* add padding test
* address sgugger review
* Add Donut image processor
* Update src/transformers/image_transforms.py
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* Fix docstrings
* Full var names in docstring
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* First draft
* Fix backwards compatibility
* More fixes
* More fixes
* Make backbone more general
* Improve backbone
* Improve test
* Fix config checkpoint
* Address comments
* Use model_type
* Address more comments
* Fix special model names
* Remove MaskFormerSwinModel and MaskFormerSwinPreTrainedModel from main init
* Fix typo
* Update backbone
* Apply suggestion
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add hidden states and attentions to backbone outputs
* Update ResNet
* Fix more tests
* Debug test
* Fix test_determinism
* Fix test_save_load
* Remove file
* Disable fx tests
* Test
* Add fx support for backbones
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Integrate safetensors in weight offloading
* Use safetensors checkpoint for offload when available
* Make naming consistent
* Make load faster
* Quality
* Add default
* Changed assert into 7-8 exceptions
* updated syntax error
* updated error
* updated file (Co-author: Batese2001)
* Successful test on test_modeling_distilbert.py
Successfully raised errors and exceptions in the revised code in test_modeling_distilbert.py.
Co-credit: @batese2001
* Delete test_modeling_distilbert.ipynb
* Update modeling_distilbert.py
* Exceptions are raised successfully when the conditions formerly checked by the assert statements are violated (Co-author: Batese2001)
* committing the reformatted distilbert model
* reformatted distilbert model
* reformatted distilbert model with black
* Changed comments to better explain the exceptions raised for not having an even number of multi heads
* changed based on the feedback
* Changed line 833 based on the suggestion made by @younesbelkada
* Changed line 833 based on the suggestion made by @younesbelkada (draft 2)
* reformatted file
* Update src/transformers/models/distilbert/modeling_distilbert.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
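
For readers unfamiliar with the assert-to-exception change listed above, a minimal sketch of the pattern follows. It is not the DistilBERT source: `head_size` is a hypothetical function, and reading the head-count condition as "the number of heads must divide the hidden dimension evenly" is an assumption made for the example.

```python
def head_size(dim: int, n_heads: int) -> int:
    """Hypothetical helper: per-head size, or ValueError if the split is uneven."""
    # Before: `assert dim % n_heads == 0`, which is skipped under `python -O`.
    # After: an explicit exception with a message that names both values.
    if dim % n_heads != 0:
        raise ValueError(f"n_heads: {n_heads} must divide dim: {dim} evenly")
    return dim // n_heads


size = head_size(768, 12)

try:
    head_size(768, 5)
    message = None
except ValueError as err:
    message = str(err)
```

An explicit `raise` keeps the check active regardless of interpreter flags and lets callers catch a specific exception type.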