transformers/templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}
Suraj Patil d3bd9ac728
[Flax] improve large model init and loading (#16148)
* begin do_init

* add params_shape_tree

* raise error if params are accessed when do_init is False

* don't allow do_init=False when keys are missing

* make shape tree a property

* assign self._params at the end

* add test for do_init

* add do_init arg to all flax models

* fix param setting

* disable do_init for composite models

* update test

* add do_init in FlaxBigBirdForMultipleChoice

* better names and errors

* improve test

* style

* add a warning when do_init=False

* remove extra if

* set params after _required_params

* add test for from_pretrained

* do_init => _do_init

* change warning to info

* fix typo

* add params in init_weights

* add params to gpt neo init

* add params to init_weights

* update do_init test

* Trigger CI

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update template

* trigger CI

* style

* style

* fix template

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:19:55 +02:00
Name | Last commit | Date
__init__.py | Reorganize file utils (#16264) | 2022-03-23 10:26:33 -04:00
{{cookiecutter.lowercase_modelname}}.mdx | Check the repo consistency in model templates test (#15141) | 2022-01-14 04:52:38 -05:00
configuration_{{cookiecutter.lowercase_modelname}}.py | Happy New Year! (#15094) | 2022-01-10 12:05:57 -05:00
configuration.json | Add template for adding flax models (#12441) | 2021-09-01 09:49:03 +02:00
modeling_{{cookiecutter.lowercase_modelname}}.py | Moved functions to pytorch_utils.py (#16625) | 2022-04-12 12:38:50 -04:00
modeling_flax_{{cookiecutter.lowercase_modelname}}.py | [Flax] improve large model init and loading (#16148) | 2022-04-19 14:19:55 +02:00
modeling_tf_{{cookiecutter.lowercase_modelname}}.py | TF: Finalize unpack_inputs-related changes (#16499) | 2022-04-04 16:37:33 +01:00
test_modeling_{{cookiecutter.lowercase_modelname}}.py | Reorganize file utils (#16264) | 2022-03-23 10:26:33 -04:00
test_modeling_flax_{{cookiecutter.lowercase_modelname}}.py | [Test refactor 1/5] Per-folder tests reorganization (#15725) | 2022-02-23 15:46:28 -05:00
test_modeling_tf_{{cookiecutter.lowercase_modelname}}.py | Adding new train_step logic to make things less confusing for users (#15994) | 2022-04-05 14:23:27 +01:00
to_replace_{{cookiecutter.lowercase_modelname}}.py | Happy New Year! (#15094) | 2022-01-10 12:05:57 -05:00
tokenization_{{cookiecutter.lowercase_modelname}}.py | fix the tokenizer_config.json file for the slow tokenizer when a fast version is available (#15319) | 2022-02-01 16:48:25 +01:00
tokenization_fast_{{cookiecutter.lowercase_modelname}}.py | Happy New Year! (#15094) | 2022-01-10 12:05:57 -05:00