Kamal Raj
d329b63369
Deberta tf ( #12972 )
* TFDeberta
moved weights to build and fixed name scope
added missing ,
bug fixes to enable graph mode execution
updated setup.py
fixing typo
fix imports
embedding mask fix
added layer names to avoid automatic incremental names
+XSoftmax
cleanup
added names to layer
disable keras_serializable
Disentangled attention output shape hidden_size==None
using symbolic inputs
test for Deberta tf
make style
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
removed tensorflow-probability
removed blank line
* removed tf experimental api
+torch_gather tf implementation from @Rocketknight1
* layername DeBERTa --> deberta
* copyright fix
* added docs for TFDeberta & make style
* layer_name change to fix load from pt model
* layer_name change as pt model
* SequenceClassification layername change, to same as pt model
* switched to keras built-in LayerNormalization
* added `TFDeberta` prefix most layer classes
* updated to tf.Tensor in the docstring
2021-08-12 05:01:26 -04:00
Shubham Sanghavi
30ede8994e
Implement Fast Tokenization for Deberta ( #11387 )
2021-04-30 08:08:15 -04:00
Sylvain Gugger
74712e22f3
Honor contributors to models ( #11329 )
* Honor contributors to models
* Fix typo
* Address review comments
* Add more authors
2021-04-21 09:47:27 -04:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… ( #10018 )
* Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models;
* DeBERTa-v2
* Fix v2 model loading issue (#10129 )
* Doc members
* Update src/transformers/models/deberta/modeling_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Address Sylvain's comments
* Address Patrick's comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
NielsRogge
d1370d29b1
Add DeBERTa head models ( #9691 )
* Add DebertaForMaskedLM, DebertaForTokenClassification, DebertaForQuestionAnswering
* Add docs and fix quality
* Fix Deberta not having pooler
2021-01-20 10:18:50 -05:00
Sylvain Gugger
00aa9dbca2
Copyright ( #8970 )
* Add copyright everywhere missing
* Style
2020-12-07 18:36:34 -05:00
Patrick von Platen
2a6fbe6a40
[XLNet] Fix mems behavior ( #8567 )
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
2020-11-25 16:54:59 -05:00
Sylvain Gugger
08f534d2da
Doc styling ( #8067 )
* Important files
* Styling them all
* Revert "Styling them all"
This reverts commit 7d029395fd.
* Styling them for realsies
* Fix syntax error
* Fix benchmark_utils
* More fixes
* Fix modeling auto and script
* Remove new line
* Fixes
* More fixes
* Fix more files
* Style
* Add FSMT
* More fixes
* More fixes
* More fixes
* More fixes
* Fixes
* More fixes
* More fixes
* Last fixes
* Make sphinx happy
2020-10-26 18:26:02 -04:00
Pengcheng He
7a0cf0ec93
Add DeBERTa model ( #5929 )
* Add DeBERTa model
* Remove dependency of deberta
* Address comments
* Patch DeBERTa
Documentation
Style
* Add final tests
* Style
* Enable tests + nitpicks
* position IDs
* BERT -> DeBERTa
* Quality
* Style
* Tokenization
* Last updates.
* @patrickvonplaten's comments
* Not everything can be a copy
* Apply most of @sgugger's review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Last reviews
* DeBERTa -> Deberta
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-30 07:07:30 -04:00