Sylvain Gugger
|
c28bc80bbb
|
Generalize problem_type to all sequence classification models (#14180)
* Generalize problem_type to all classification models
* Missing import
* Deberta BC and fix tests
* Fix template
* Missing imports
* Revert change to reformer test
* Fix style
|
2021-10-29 10:32:56 -04:00 |
|
Patrick von Platen
|
0c3174c758
|
Add TF<>PT and Flax<>PT everywhere (#14047)
* up
* up
* up
* up
* up
* up
* up
* add clip
* fix clip PyTorch
* fix clip PyTorch
* up
* up
* up
* up
* up
* up
* up
|
2021-10-25 23:55:08 +02:00 |
|
Lysandre Debut
|
c3d9ac7607
|
Expose get_config() on ModelTesters (#12812)
* Expose get_config() on ModelTesters
* Typo
|
2021-07-21 04:13:11 -04:00 |
|
abhishek thakur
|
c40c7e213b
|
Add multi-class, multi-label and regression to transformers (#11012)
* add to bert
* review comments
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* self.config.problem_type
* fix style
* fix
* fin
* fix
* update doc
* fix
* test
* Test more problem types
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix
* remove
* fix
* quality
* make fix-copies
* remove test
Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
|
2021-05-04 02:23:40 -04:00 |
|
Sylvain Gugger
|
ba8b1f4754
|
Add support for multiple models for one config in auto classes (#11150)
* Add support for multiple models for one config in auto classes
* Use get_values everywhere
* Prettier doc
|
2021-04-08 18:41:36 -04:00 |
|
Vasudev Gupta
|
7442801df5
|
fix tests (#11109)
|
2021-04-07 10:07:26 -04:00 |
|
Patrick von Platen
|
7772ddb473
|
fix big bird gpu test (#10967)
|
2021-03-30 17:03:48 +03:00 |
|
Vasudev Gupta
|
6dfd027279
|
BigBird (#10183)
* init bigbird
* model.__init__ working, conversion script ready, config updated
* add conversion script
* BigBirdEmbeddings working :)
* slightly update conversion script
* BigBirdAttention working :) ; some bug in layer.output.dense
* add debugger-notebook
* forward() working for BigBirdModel :) ; replaced gelu with gelu_fast
* tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)
* BigBirdModel working in block-sparse attention mode :)
* add BigBirdForPreTraining
* small fix
* add tokenizer for BigBirdModel
* fix config & hence modeling
* fix base prefix
* init testing
* init tokenizer test
* pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements
* remove position_embedding_type arg
* complete normal tests
* add comments to block sparse attention
* add attn_probs for sliding & global tokens
* create fn for block sparse attn mask creation
* add special tests
* restore pos embed arg
* minor fix
* attn probs update
* make big bird fully gpu friendly
* fix tests
* remove pruning
* correct tokenizer & minor fixes
* update conversion script , remove norm_type
* tokenizer-inference test add
* remove extra comments
* add docs
* save intermediate
* finish trivia_qa conversion
* small update to forward
* correct qa and layer
* better error message
* BigBird QA ready
* fix rebased
* add trivia-qa debugger notebook
* qa setup
* fixed till embeddings
* some issue in q/k/v_layer
* fix bug in conversion-script
* fixed till self-attn
* qa fixed except layer norm
* add qa end2end test
* fix gradient ckpting ; other qa test
* speed-up big bird a bit
* hub_id=google
* clean up
* make quality
* speed up einsum with bmm
* finish perf improvements for big bird
* remove wav2vec2 tok
* fix tokenizer
* include docs
* correct docs
* add helper to auto pad block size
* make style
* remove fast tokenizer for now
* fix some
* add pad test
* finish
* fix some bugs
* fix another bug
* fix buffer tokens
* fix comment and merge from master
* add comments
* make style
* commit some suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix typos
* fix some more suggestions
* add another patch
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies
* another path
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* update
* update nit suggestions
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
|
2021-03-30 08:51:34 +03:00 |
|