* first draft adding Flax-t5-encoder and Flax-mt5-encoder
* imports
* after make fixup
* flax t5 encoder test
* black on test
* make fix-copies
* clean
* all_model_classes -> tuple
* clean test
* is_encoder_decoder=False in t5-enc tester
* remove file docstring before FlaxT5Encoder
* black
* isort
* commit suggestions on src/transformers/models/t5/modeling_flax_t5.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* commit suggestions on src/transformers/models/t5/modeling_flax_t5.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* remove _get_encoder_module
* self.decoder_seq_length -> self.encoder_seq_length as t5-enc does not have decoder
* bugfix - self.module_class is class itself, not instance;
* docs for mt5 and t5
* call -> __call__ in t5 doc
* FlaxMT5EncoderModel to TYPE_HINT
* run doc-builder to allow change the files
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* chore: initial commit
Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets.
* chore: porting the rest of the modules to tensorflow
did not change the documentation yet, yet to try the playground on the model
* Fix initilizations (#1)
* fix: code structure in few cases.
* fix: code structure to align tf models.
* fix: layer naming, bn layer still remains.
* chore: change default epsilon and momentum in bn.
* chore: styling nits.
* fix: cross-loading bn params.
* fix: regnet tf model, integration passing.
* add: tests for TF regnet.
* fix: code quality related issues.
* chore: added rest of the files.
* minor additions..
* fix: repo consistency.
* fix: regnet tf tests.
* chore: reorganize dummy_tf_objects for regnet.
* chore: remove checkpoint var.
* chore: remov unnecessary files.
* chore: run make style.
* Update docs/source/en/model_doc/regnet.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* chore: PR feedback I.
* fix: pt test. thanks to @ydshieh.
* New adaptive pooler (#3)
* feat: new adaptive pooler
Co-authored-by: @Rocketknight1
* chore: remove image_size argument.
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: matt <rocketknight1@gmail.com>
* Empty-Commit
* chore: remove image_size comment.
* chore: remove playground_tf.py
* chore: minor changes related to spacing.
* chore: make style.
* Update src/transformers/models/regnet/modeling_tf_regnet.py
Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
* Update src/transformers/models/regnet/modeling_tf_regnet.py
Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
* chore: refactored __init__.
* chore: copied from -> taken from./g
* adaptive pool -> global avg pool, channel check.
* chore: move channel check to stem.
* pr comments - minor refactor and add regnets to doc tests.
* Update src/transformers/models/regnet/modeling_tf_regnet.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* minor fix in the xlayer.
* Empty-Commit
* chore: removed from_pt=True.
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Fixing a regression with `return_all_scores` introduced in #17606
- The legacy test actually tested `return_all_scores=False` (the actual
default) instead of `return_all_scores=True` (the actual weird case).
This commit adds the correct legacy test and fixes it.
Tmp legacy tests.
Actually fix the regression (also contains lists)
Less diffed code.
* Add a TF in-graph tokenizer for BERT
* Add from_pretrained
* Add proper truncation, option handling to match other tokenizers
* Add proper imports and guards
* Add test, fix all the bugs exposed by said test
* Fix truncation of paired texts in graph mode, more test updates
* Small fixes, add a (very careful) test for savedmodel
* Add tensorflow-text dependency, make fixup
* Update documentation
* Update documentation
* make fixup
* Slight changes to tests
* Add some docstring examples
* Update tests
* Update tests and add proper lowercasing/normalization
* make fixup
* Add docstring for padding!
* Mark slow tests
* make fixup
* Fall back to BertTokenizerFast if BertTokenizer is unavailable
* Fall back to BertTokenizerFast if BertTokenizer is unavailable
* make fixup
* Properly handle tensorflow-text dummies
* Add CodeGen model
* Add missing key and switch order of super()
* Fix torch.ones init with uint8 instead of bool
* Address comments: copy statements and doc
* update tests
* remove old model parallel
* fix batch gen tests
* fix batch gen test
* update test_gpt2_sample_max_time
* fix codgen test and revert gpt2 test change
* Fix incorrect tie_word_embedding value, typo, URL
* Fix model order in README and styling
* Reorder model list alphabetically
* Set tie_word_embedding to False by default
* Apply suggestions from code review
* Better attn mask name & remove attn masked_bias
* add tokenizer for codegen
* quality
* doc tokenizer
* fix-copies
* add CodeGenTokenizer in converter
* make truncation optional
* add test for truncation
* add copyright
* fix-copies
* fix fast tokenizer decode
* Update src/transformers/models/codegen/tokenization_codegen.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* increase vocab_size in tests
Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix tests that broke when models used batchnorm
* Initializing the model twice does not actually...
...give you the same weights each time.
I am good at machine learning.
* Fix speed regression
* few fixes:
- hardcode tokenizer padding side
- remove unused args
* few fixes:
- added new attribute on TokenizerTesterMixin
- added new slow test
- remove unused arg on tokenizer class
* make style
* Update src/transformers/models/bloom/tokenization_bloom_fast.py
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* make quality
* apply changes
- remove new attribute
- redefine test on the class
* add comments
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
- Fix `top_k_top_p_filtering` not passing `filter_value` to
`TopPLogitsWarper` causing any top-p filtered logits to be -inf
instead of specified value
- Add corresponding test
* Add final_layer_norm to OPT model
* Add JAX and TF version
* Fix Keras name
* Woops
* Allow for non breaking change
* Apply suggestions from code review
* add tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* rename to check_pt_flax_outputs
* update check_pt_flax_outputs
* use 5e-5 for BigBird PT/Flax test
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Prepare CI for v0.8.0
* pin hfh (revert before merge)
* Revert "pin hfh (revert before merge)"
This reverts commit a0103140e1.
* Test rc3
* Test latest rc
* Unpin to the RC
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
* Fix docstrings and variable names
* Rename x to something better
* Improve messages
* Fix docstrings and add test for greyscale images
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Use torch.finfo(self.dtype).min
* for GPTNeoX
* for Albert
* For Splinter
* Update src/transformers/models/data2vec/modeling_data2vec_audio.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix -inf used in Bart-like models
* Fix a few remaining -inf
* more fix
* clean up
* For CLIP
* For FSMT
* clean up
* fix test
* Add dtype argument and use it for LayoutLMv3
* update FlaxLongT5Attention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add new bloom classes
* (feat) add bloom classification tests; make style
* style: change import in test
* add some typehints to bloom classes
* merge main into branch
* fix: input checking in bloom seq classification
* fix tests
* change model class tests
* fix few tests
- more tests should pass
- one test left
* make token classifier return hidden states
* style: make BLOOM typehints consistent
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>