Kamal Raj
|
d329b63369
|
Deberta tf (#12972)
* TFDeberta
moved weights to build and fixed name scope
added missing ,
bug fixes to enable graph mode execution
updated setup.py
fixing typo
fix imports
embedding mask fix
added layer names to avoid automatic incremental names
+XSoftmax
cleanup
added names to layer
disable keras_serializable
Disentangled attention output shape hidden_size==None
using symbolic inputs
test for Deberta tf
make style
Update src/transformers/models/deberta/modeling_tf_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
removed tensorflow-probability
removed blank line
* removed tf experimental api
+torch_gather tf implementation from @Rocketknight1
* layername DeBERTa --> deberta
* copyright fix
* added docs for TFDeberta & make style
* layer_name change to fix load from pt model
* layer_name change as pt model
* SequenceClassification layername change,
to same as pt model
* switched to keras built-in LayerNormalization
* added `TFDeberta` prefix to most layer classes
* updated to tf.Tensor in the docstring
|
2021-08-12 05:01:26 -04:00 |
|