57 Commits

Author SHA1 Message Date
Patrick von Platen
8c9b5fcbaf
[Flax] Big FlaxBert Refactor (#11364)
* improve flax

* refactor

* typos

* Update src/transformers/modeling_flax_utils.py

* Apply suggestions from code review

* Update src/transformers/modeling_flax_utils.py

* fix typo

* improve error tolerance

* typo

* correct nasty saving bug

* fix from pretrained

* correct tree map

* add note

* correct weight tying
2021-04-23 09:53:09 +02:00
Patrick von Platen
e87505f3a1
[Flax] Add other BERT classes (#10977)
* add first code structures

* add all bert models

* add to init and docs

* correct docs

* make style
2021-03-31 09:45:58 +03:00
Patrick von Platen
8780caa388
[WIP][Flax] Add general conversion script (#10809)
* save intermediate

* finish first version

* delete some more

* improve import

* fix roberta

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* small corrections

* apply all comments

* fix deterministic

* make fix-copies

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-30 12:13:59 +03:00
Patrick von Platen
0b98ca368f
[Flax] Adapt Flax models to new structure (#9484)
* Create modeling_flax_eletra with code copied from modeling_flax_bert

* Add ElectraForMaskedLM and ElectraForPretraining

* Add modeling test for Flax electra and fix naming and arg in Flax Electra model

* Add documentation

* Fix code style

* Fix code quality

* Adjust tol in assert_almost_equal due to very small difference between model output, ranging 0.0010 - 0.0016

* Remove redundant ElectraPooler

* save intermediate

* adapt

* correct bert flax design

* adapt roberta as well

* finish roberta flax

* finish

* apply suggestions

* apply suggestions

Co-authored-by: Chris Nguyen <anhtu2687@gmail.com>
2021-03-18 09:44:17 +03:00
Patrick von Platen
9f8619c6aa
Flax testing should not run the full torch test suite (#10725)
* make flax tests pytorch independent

* fix typo

* finish

* improve circle ci

* fix return tensors

* correct flax test

* re-add sentencepiece

* last tokenizer fixes

* finish maybe now
2021-03-16 08:05:37 +03:00
Patrick von Platen
640e6fe190
[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054)
* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements

Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
2020-12-16 13:03:32 +01:00
Sylvain Gugger
8d4bb02056
Refactor FLAX tests (#9034)
2020-12-10 15:57:39 -05:00