
# Using `cookiecutter` to generate models

This folder contains templates to generate new models that fit the current API and pass all tests. It generates models in both PyTorch and TensorFlow, completes the `__init__.py` and auto-modeling files, and creates the documentation.
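To make "completes the auto-modeling files" concrete: the template adds an entry mapping the model's short name to its classes, so the `Auto*` classes can resolve it. The sketch below is hypothetical — the class names are illustrative, and the real mappings live under `src/transformers/models/auto/`:

```python
# Hypothetical sketch of an auto-mapping entry; the real mapping lives in
# src/transformers/models/auto/ and contains many more models.
CONFIG_MAPPING_NAMES = {
    "bert": "BertConfig",                     # existing entry
    "brand_new_bert": "BrandNewBertConfig",   # entry added by the template
}

def config_class_for(model_type: str) -> str:
    # AutoConfig-style lookup: resolve a model type string to the name
    # of its configuration class.
    return CONFIG_MAPPING_NAMES[model_type]
```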

## Usage

Using the `cookiecutter` utility requires having all of the dev dependencies installed. Let's first clone the repository and install it in our environment:

```shell
git clone https://github.com/huggingface/transformers
cd transformers
pip install -e ".[dev]"
```

Once the installation is done, you can use the CLI command `add-new-model` to generate your models:

```shell
transformers-cli add-new-model
```

This should launch the `cookiecutter` package, which will prompt you to fill in the configuration.

The `modelname` should be cased according to the plain-text casing, i.e., BERT, RoBERTa, DeBERTa.

```
modelname [<ModelNAME>]:
uppercase_modelname [<MODEL_NAME>]:
lowercase_modelname [<model_name>]:
camelcase_modelname [<ModelName>]:
```
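The four casings are related mechanically. A hypothetical helper (not part of the CLI, which asks for each value explicitly) makes the relationship concrete:

```python
def casings(modelname: str) -> dict:
    # Derive the other three casings from the plain-text model name.
    # Hypothetical helper for illustration only; the cookiecutter prompts
    # let you override each value if your model's casing is irregular.
    lowercase = modelname.lower().replace(" ", "_")
    return {
        "uppercase_modelname": lowercase.upper(),
        "lowercase_modelname": lowercase,
        "camelcase_modelname": "".join(part.capitalize() for part in lowercase.split("_")),
    }
```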

Fill in the `authors` with your team members:

```
authors [The HuggingFace Team]:
```

The `checkpoint_identifier` is the checkpoint that will be used in the examples across the files. Put the name you wish, as it will appear on the model hub. Do not forget to include the organisation.

```
checkpoint_identifier [organisation/<model_name>-base-cased]:
```

The tokenizer should either be based on BERT, if it behaves exactly like the BERT tokenizer, or be a standalone implementation otherwise.

```
Select tokenizer_type:
1 - Based on BERT
2 - Standalone
Choose from 1, 2 [1]:
```
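"Based on BERT" means the generated tokenizer subclasses the BERT tokenizer and only redeclares model-specific metadata, rather than reimplementing the tokenization algorithm. A minimal self-contained sketch, using a stand-in class instead of the real `transformers.BertTokenizer`:

```python
class BertTokenizer:
    # Stand-in for transformers.BertTokenizer, for illustration only.
    vocab_files_names = {"vocab_file": "vocab.txt"}

    def tokenize(self, text: str) -> list:
        # The real class implements WordPiece tokenization; lowercasing
        # and whitespace-splitting is just a placeholder here.
        return text.lower().split()

class BrandNewBertTokenizer(BertTokenizer):
    # "Based on BERT": inherit the whole tokenization algorithm and only
    # redeclare the vocabulary metadata for the new model.
    vocab_files_names = {"vocab_file": "vocab.txt"}
```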

Once the command has finished, you should have a total of 7 new files spread across the repository:

```
docs/source/model_doc/<model_name>.rst
src/transformers/models/<model_name>/configuration_<model_name>.py
src/transformers/models/<model_name>/modeling_<model_name>.py
src/transformers/models/<model_name>/modeling_tf_<model_name>.py
src/transformers/models/<model_name>/tokenization_<model_name>.py
tests/test_modeling_<model_name>.py
tests/test_modeling_tf_<model_name>.py
```

You can run the tests to ensure that they all pass:

```shell
python -m pytest ./tests/test_*<model_name>*.py
```

Feel free to modify each file to mimic the behavior of your model.

⚠ You should be careful about the classes preceded by the following line:

```python
# Copied from transformers.[...]
```

This line ensures that the copy does not diverge from the source. If your implementation must diverge because it is genuinely different, delete this line. If you don't delete it and run `make fix-copies`, your changes will be overwritten.
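For illustration, the marker sits directly above the class definition and can carry a rename pattern that `make fix-copies` applies when syncing. The class name below is hypothetical:

```python
# Copied from transformers.models.bert.modeling_bert.BertSelfOutput with Bert->BrandNewBert
class BrandNewBertSelfOutput:
    """Kept in sync with the BERT original by `make fix-copies`.

    Delete the marker above only if this class must diverge from BERT;
    otherwise `make fix-copies` will restore the copied implementation,
    applying the Bert->BrandNewBert rename as it goes.
    """
```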

Once you have edited the files to fit your architecture, simply re-run the tests (and edit them if a change is needed!) to make sure everything works as expected.

Once the files are generated and you are happy with your changes, here's a checklist to ensure that your contribution will be merged quickly:

- Run the `make fixup` utility to fix the style of the files and ensure the code quality meets the library's standards.
- Complete the documentation file (`docs/source/model_doc/<model_name>.rst`) so that your model is properly documented and usable.