Stas Bekman
4a872caef4
remove extra white space from log format ( #12360 )
2021-06-25 13:20:14 -07:00
Vasudev Gupta
332a245861
Add FlaxBigBird QuestionAnswering script ( #12233 )
...
* port bigbird script
* adapt script a bit
* change location
* adapt more
* save progress
* init commit
* style
* dataset script tested
* readme add
2021-06-25 18:05:48 +01:00
michal pitr
d4ce31e839
fixed typo ( #12356 )
2021-06-25 07:49:29 -04:00
Patrick von Platen
aa550c4a11
Update README.md
2021-06-25 11:55:51 +01:00
Marc van Zee
f2c4ce7e33
Add flax/jax quickstart ( #12342 )
2021-06-24 17:04:18 +01:00
Suraj Patil
aef3823e1a
[examples/Flax] move the examples table up ( #12341 )
2021-06-24 16:03:37 +05:30
Sylvain Gugger
2150dfed31
v4.9.0.dev0
2021-06-23 13:31:19 -04:00
Sylvain Gugger
9252a5127f
Release: v4.8.0
2021-06-23 13:25:56 -04:00
Patrick von Platen
44739c8180
[Flax/JAX] Add how to propose projects markdown ( #12311 )
...
* fix_torch_device_generate_test
* remove @
* finish
* make style
2021-06-23 14:50:35 +01:00
Suraj Patil
c0fe3c9a7a
Flax summarization script ( #12230 )
...
* add summarization script
* fix arguments, preprocessing, metrics
* add generation and metrics
* auto model, prediction loop
* prettify
* label smoothing
* address Sylvain's and Patrick's suggestions
* dynamically import shift_tokens_right
* fix shift_tokens_right_fn call
2021-06-23 15:49:30 +05:30
Stas Bekman
ebe5413589
[trainer] 2 bug fixes and a rename ( #12309 )
...
* bug fixes and a rename
* add extended DDP test
2021-06-22 11:13:23 -07:00
Patrick von Platen
64029abe4c
[Flax] Main doc for event orga ( #12305 )
...
* fix_torch_device_generate_test
* remove @
* push
* finish
* some typos
* add more info on communication
* add suggestions
2021-06-22 18:02:52 +01:00
Stas Bekman
dad414d5f9
[trainer + examples] set log level from CLI ( #12276 )
...
* set log level from CLI
* add log_level_replica + test + extended docs
* cleanup
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rename datasets objects to allow datasets module
* improve the doc
* style
* doc improve
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-21 19:30:50 -07:00
Matt
e3cb7a0b60
Tensorflow QA example ( #12252 )
...
* New Tensorflow QA example!
* Style pass
* Updating README.md for the new example
* flake8 fixes
* Update examples/tensorflow/question-answering/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-21 16:37:28 +01:00
Vishal Burman
b53bc55ba9
Fix for making student ProphetNet for Seq2Seq Distillation ( #12130 )
...
* make_student.py: fix to make student ProphetNet
* reformat
2021-06-21 09:36:44 -04:00
Bhavitvya Malik
e43e11260f
update desc for map in all examples ( #12226 )
...
* update desc for map in all examples
* added plm
* suggestions
2021-06-17 15:37:31 -04:00
Lysandre
0daadc1919
Docs for v4.8.0
2021-06-17 18:17:42 +02:00
Lysandre
7a6c9fab8e
Release: v4.7.0
2021-06-17 17:57:42 +02:00
Sylvain Gugger
7d7ceca396
Model card defaults ( #12122 )
...
* [WIP] Model card defaults
* finetuned_from default value
* Add all mappings to the mapping file
* Be more defensive on finetuned_from arg
* Add default task tag
* Separate tags from tasks
* Edge case for dataset
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-06-15 16:01:37 -04:00
kumapo
955b2b97a6
Enable add_prefix_space if model_type is roberta or gpt2 ( #12116 )
2021-06-15 09:33:21 -04:00
Avital Oliver
9b393240a2
Use a released version of optax rather than installing from Git. ( #12173 )
...
Use a released version of optax rather than installing from Git
2021-06-15 16:42:51 +05:30
Stas Bekman
88e84186e5
[style] consistent nn. and nn.functional: part 4 examples ( #12156 )
...
* consistent nn. and nn.functional: p4 examples
* restore
2021-06-14 12:28:24 -07:00
Kumar Abhishek
9de62cfbce
[lm examples] Replicate --config_overrides addition to other LM examples ( #12135 )
...
* [lm examples] Replicate --config_overrides addition to other LM examples
* Removing no trainer files changes
* Update README
Co-authored-by: Kumar Abhishek <kabhishek@expedia.com>
2021-06-14 08:12:22 -04:00
Nicholas Broad
cd7961b632
Use text_column_name variable instead of "text" ( #12132 )
...
* Use text_column_name variable instead of "text"
`text_column_name` was already defined above where I made the changes and it was also used below where I made changes.
This is a very minor change. If a dataset does not use "text" as the column name, then the `tokenize_function` will now use whatever column is assigned to `text_column_name`. `text_column_name` is just the first column name if "text" is not a column name. It makes the function a little more robust, though I would assume that 90% + of datasets use "text" anyway.
* black formatting
* make style
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
2021-06-14 08:11:13 -04:00
Sylvain Gugger
b8ab541340
Don't log anything before logging is setup in examples ( #12121 )
...
* Don't log anything before logging is setup in examples
* Last example
2021-06-14 08:03:33 -04:00
Patrick von Platen
7566fefa69
[Flax] Add links to google colabs ( #12146 )
...
* fix_torch_device_generate_test
* remove @
* add colab links
2021-06-14 11:00:29 +01:00
Suraj Patil
d36fce8237
add readme for flax clm ( #12111 )
...
* add readme for flax clm
* use section link for tokenizer
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update metrics
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-14 15:03:55 +05:30
Patrick von Platen
16c0efca2c
Add mlm pretraining xla torch readme ( #12011 )
...
* fix_torch_device_generate_test
* remove @
* upload
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Update examples/flax/language-modeling/README.md
* add more info
* finish
* fix
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-14 10:31:21 +01:00
Suraj Patil
15b498f3b8
Flax CLM script ( #12023 )
...
* first draft
* max_seq_length => block_size
* fix arg names
* fix typos
* fix loss calculation
* add max examples, fix train eval steps, metrics
* optimizer mask
* fix perplexity, metric logging
* fix logging
* data_collator => data_loader
* refactor loss_fn
* support single GPU
* pass distributed to write_metric
* fix jitting
* fix single device training
* fix single device metrics
* close inner progress bars once finished
* add overwrite_cache arg
* fix dataset caching issue
* add more logs
* few small fixes
* address Nicholas's suggestions
* fix docstr
* address Patrick's suggestions
* make flake happy
* pass new new_dropout_rng to apply_gradients
* reset train metrics after every epoch
* remove distributed logic, small fixes
2021-06-11 15:16:20 +05:30
Bhavitvya Malik
d2753dcbec
add relevant description to tqdm in examples ( #11927 )
...
* add relevant `desc` in examples
* require_version datasets>=1.8.0
2021-06-10 15:59:55 -04:00
Matt
bebbdd0fc9
Appending label2id and id2label to models to ensure inference works properly ( #12102 )
2021-06-10 15:25:04 +01:00
Matt
4cda08decb
Minor style edits
2021-06-10 15:10:57 +01:00
Matt
7f08dbd10a
Update README.md to cover the TF GLUE example.
2021-06-10 14:33:42 +01:00
Sylvain Gugger
d72e5a3a6d
Fix quality
2021-06-10 09:27:11 -04:00
Matt
73a532651a
New TF GLUE example ( #12028 )
...
* Pushing partially-complete new GLUE example
* First draft of the new TF GLUE example! Needs a little more testing to be sure but it's almost ready.
* Fix to the fit() call
* Bugfixes, making sure TPU and multi-GPU support is ready
* Remove logger line that depends on Pytorch
* Style pass
* Deleting old TF GLUE example
* Include label2id and id2label in the saved model config
* Don't clobber the existing model.config.label2id
* Style fixes
* Update examples/tensorflow/text-classification/run_glue.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-10 14:14:37 +01:00
kumapo
472a867626
Add text_column_name and label_column_name to run_ner and run_ner_no_trainer args ( #12083 )
...
* Add text_column_name and label_column_name to run_ner args
* Minor fix: grouping for text and label column name
2021-06-10 08:03:20 -04:00
Stas Bekman
61e191987d
rm require_version_examples ( #12088 )
2021-06-09 11:02:52 -07:00
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer ( #12087 )
2021-06-09 18:49:27 +01:00
Anton Lozhkov
d472bd7b18
Wav2Vec2 Pretraining ( #11306 )
...
* Working quantizer forward
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Remove custom outputs from the shared ones
* correct conversion
* correct bug
* add first pretrain script
* save intermediate
* static shapes
* save intermediate
* finish first pretrain script version
* more refactor
* remove wanddb
* refactor more
* improve test
* correct perplexity compute bug
* finish model implementation
* add to docs
* finish docs
* finish pretraining script
* finish pretraining script
* remove wandb
* finish PR for merge
* finish config
* finish
* make deepspeed work
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions
* fix flaky test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-09 18:40:56 +01:00
Stas Bekman
d14e0af274
sync LayerDrop for Wav2Vec2Encoder + tests ( #12076 )
2021-06-09 13:21:03 +01:00
Koichi Yasuoka
82a2b76c95
Update run_ner.py with id2label config ( #12001 )
2021-06-09 07:27:05 -04:00
Stas Bekman
11d86d3de4
[Deepspeed Wav2vec2] integration ( #11638 )
...
* wip
* wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044
* cleanup
* workaround
* working 5/8 modes
* solve fp32 distributed zero3
* style
* sync
* sync
* rework
* deprecation
* cleanup
* https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged
* clean up
* add a guide
* more prose
* more prose
* fix
* more prose
* sub_group_size was too big
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor
* bug fix
* make the true check explicit
* new deepspeed release
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 12:32:03 -07:00
Sylvain Gugger
fd6902838a
Properly indent block_size ( #12070 )
2021-06-08 10:27:02 -04:00
cdleong
49bee0aea4
Add torch to requirements.txt in language-modeling ( #12040 )
...
* Add torch to requirements.txt in language-modeling
* Update examples/pytorch/language-modeling/requirements.txt
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 09:02:35 -04:00
Mario Šaško
f5eec0d8e9
Replace legacy tensor.Tensor with torch.tensor/torch.empty ( #12027 )
...
* Replace legacy torch.Tensor constructor with torch.{tensor, empty}
* Remove torch.Tensor in examples
2021-06-08 13:58:38 +01:00
Shamane Siri
e33085d648
updated the original RAG implementation to be compatible with latest Pytorch-Lightning ( #11806 )
...
* updated the original RAG implementation to be compatible with the latest PL version
* updated the requirements.txt file
* execute make style
* code quality test
* code quality
* conflict resolved in requirements.txt
* code quality
* changed the MyDDP class name to CustomDDP
2021-06-08 13:42:49 +01:00
Russell Klopfer
e363e1d936
adds metric prefix. ( #12057 )
...
* adds metric prefix.
* update tests to include prefix
2021-06-07 22:34:10 -04:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM ( #12013 )
...
* fix_torch_device_generate_test
* remove @
* finish refactor
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Nicholas Vadivelu
4674061b2a
Fix weight decay masking in run_flax_glue.py ( #11964 )
...
* Fix weight decay masking in `run_flax_glue.py`
Issues with the previous implementation:
- The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
- `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
- Flax's LayerNorm calls the scale parameter `scale` not `weight`
* Fix formatting with black
* adapt results
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 11:35:26 +01:00
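The fix above rests on three API details: `flax.traverse_util.flatten_dict` keys are tuples, the Optax mask marks the parameters that *do* receive weight decay, and Flax's LayerNorm parameter is named `scale`. A minimal sketch (the parameter tree and hyperparameters are illustrative, not the exact code of `run_flax_glue.py`):

```python
import jax.numpy as jnp
import optax
from flax import traverse_util

def decay_mask_fn(params):
    # flatten_dict keys are tuples of path components, e.g. ("layer_norm", "scale"),
    # not period-joined strings
    flat_params = traverse_util.flatten_dict(params)
    # optax applies weight decay wherever the mask is True, so biases and
    # LayerNorm scales (Flax names the parameter "scale", not "weight") map to False
    flat_mask = {path: path[-1] not in ("bias", "scale") for path in flat_params}
    return traverse_util.unflatten_dict(flat_mask)

# illustrative parameter tree; a real Flax model supplies this structure
params = {
    "dense": {"kernel": jnp.ones((4, 4)), "bias": jnp.zeros(4)},
    "layer_norm": {"scale": jnp.ones(4), "bias": jnp.zeros(4)},
}
tx = optax.adamw(learning_rate=2e-5, weight_decay=0.01, mask=decay_mask_fn)
opt_state = tx.init(params)  # only the dense kernel receives weight decay
```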
dependabot[bot]
6db3a87de2
Bump urllib3 from 1.25.8 to 1.26.5 in /examples/research_projects/lxmert ( #11983 )
...
Bumps [urllib3](https://github.com/urllib3/urllib3 ) from 1.25.8 to 1.26.5.
- [Release notes](https://github.com/urllib3/urllib3/releases )
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst )
- [Commits](https://github.com/urllib3/urllib3/compare/1.25.8...1.26.5 )
---
updated-dependencies:
- dependency-name: urllib3
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-02 03:40:20 -04:00
Fan Zhang
7e73601f32
modify qa-trainer ( #11872 )
...
* modify qa-trainer
* fix flax model
2021-06-01 08:28:41 -04:00
Shamane Siri
9ec0f01b6c
RAG-2nd2end-revamp ( #11893 )
...
* initial
* code quality test
* code quality
* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test the end2end retriever
* minor change in test_modeling_rag
* fixed tests
* Update examples/research_projects/rag-end2end-retriever/README.md
typo corrected as suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py
type change suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update src/transformers/models/rag/retrieval_rag.py
Adding this change as mentioned by lhoestq.
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* completed the minor changes suggested by the reviewers
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
2021-06-01 07:32:26 +01:00
Philip May
cfca638acb
Add MT5ForConditionalGeneration as supported arch. to summarization README ( #11961 )
...
* Add MT5ForConditionalGeneration as supported arch.
* Update README.md
2021-05-31 21:24:33 +05:30
Nicholas Vadivelu
1ab147d648
Remove redundant nn.log_softmax in run_flax_glue.py ( #11920 )
...
* Remove redundant `nn.log_softmax` in `run_flax_glue.py`
`optax.softmax_cross_entropy` expects unnormalized logits and already applies `nn.log_softmax` internally, so the extra call is not needed here. `nn.log_softmax` is idempotent, so mathematically it shouldn't have made a difference.
* Remove unused 'flax.linen' import
2021-05-31 15:29:04 +01:00
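To illustrate the redundancy removed above (a hedged sketch with made-up values, using `jax.nn.log_softmax` in place of the `flax.linen` re-export):

```python
import jax
import jax.numpy as jnp
import optax

logits = jnp.array([[2.0, -1.0, 0.5]])                 # raw, unnormalized model outputs
labels = jax.nn.one_hot(jnp.array([0]), num_classes=3)

# softmax_cross_entropy applies log_softmax internally, so no explicit call is needed
loss = optax.softmax_cross_entropy(logits, labels).mean()

# log_softmax is idempotent, so normalizing first yields the same loss value
loss_redundant = optax.softmax_cross_entropy(jax.nn.log_softmax(logits), labels).mean()
assert jnp.allclose(loss, loss_redundant)
```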
Avital Oliver
2df546918e
Link official Cloud TPU JAX docs ( #11892 )
2021-05-26 15:44:40 -04:00
Stas Bekman
1b6530104d
[Examples] create model with custom config on the fly ( #11798 )
...
* create custom model on the fly
* better wording
* add update_from_string
* cleanup
* cleanup
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more bool options
* style
* fix logger
* add test
* add the doc
* assert on conflict of options
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-25 10:40:49 -07:00
Stas Bekman
6287c929c1
[lm examples] fix overflow in perplexity calc ( #11855 )
...
* fix overflow in perplexity calc
* use inf
* fix
2021-05-25 08:11:26 -07:00
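The guard described above ("use inf") amounts to something like the following sketch; the exact variable names in the LM example scripts may differ:

```python
import math

def safe_perplexity(eval_loss: float) -> float:
    # math.exp raises OverflowError for large losses; report infinity instead of crashing
    try:
        return math.exp(eval_loss)
    except OverflowError:
        return float("inf")

print(safe_perplexity(2.3))     # ~9.97
print(safe_perplexity(1000.0))  # inf
```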
Sylvain Gugger
f086652b16
Add option to log only once in multinode training ( #11819 )
...
* Add option to log only once in multinode training
* Use an alternate property
2021-05-25 08:03:43 -04:00
Wang Ran (汪然)
b8344a274f
typo ( #11858 )
2021-05-25 04:23:46 -04:00
Patrick von Platen
f580604157
[Flax] Fix PyTorch import error ( #11839 )
...
* fix_torch_device_generate_test
* remove @
* change pytorch import to flax import
2021-05-24 10:41:10 +01:00
Patrick von Platen
da22245ed9
Add flax text class colab ( #11824 )
...
* fix_torch_device_generate_test
* remove @
* add flax glue link
2021-05-21 23:11:58 +01:00
Patrick von Platen
82335185fe
[Flax] Small fixes in run_flax_glue.py ( #11820 )
...
* fix_torch_device_generate_test
* remove @
* correct best seed for flax fine-tuning
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 16:52:23 +01:00
Patrick von Platen
bd9871657b
[Flax] Align GLUE training script with mlm training script ( #11778 )
...
* speed up flax glue
* remove unnecessary line
* remove folder
* remove run in loop
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-21 09:36:56 +01:00
Keren Fuentes
223943872e
Fix failing test on Windows Platform ( #11589 )
...
* add separator for windows
* fixes test_is_copy_consistent on Windows
* fixing writing encoding issue on extended test (for Windows)
* resolving comments
2021-05-20 19:54:23 -04:00
Patrick von Platen
00440e350f
[Flax MLM] Refactor run mlm with optax ( #11745 )
...
* refactor
* update
* update
* update
* refactor run mlm
* finalize
* refactor more
* fix typo
* update
* finish refactor
* modify run mlm
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* small fixes
* upload
* upload
* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-19 12:00:58 +01:00
Tomy Hsieh
eb3e072a3b
Fix a small error in summarization example ( #11762 )
2021-05-18 14:38:36 -04:00
Avital Oliver
77f9bd18af
Add Flax Examples and Cloud TPU README ( #11753 )
...
* Add Flax Examples README
* Apply suggestions from code review
* Update examples/flax/README.md
* add nice table
* fix
* fix
* apply suggestions
* upload
* finish flax readme.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-18 17:45:16 +01:00
Philipp Schmid
04e25c6286
add dataset_name to data_args and added accuracy metric ( #11760 )
...
* add `dataset_name` to data_args and added accuracy metric
* added documentation for dataset_name
* spelling correction
2021-05-18 16:27:29 +02:00
Patrick von Platen
cebb96f53a
Add more subsections to main doc ( #11758 )
...
* add headers to main doc
* Apply suggestions from code review
* update
* upload
2021-05-18 14:38:56 +01:00
Tommy Chiang
da7e73b721
Fix incorrect newline in #11650 ( #11757 )
2021-05-18 15:28:13 +02:00
Sylvain Gugger
936b57158a
Use new evaluation loop in TrainerQA ( #11746 )
2021-05-17 10:10:13 -04:00
Marc van Zee
726e953d44
Improvements to Flax finetuning script ( #11727 )
...
* Add Cloud details to README
* Flax script and readme updates
* Some simplifications of Flax script
2021-05-17 09:26:33 +01:00
Marc van Zee
94a2348706
Add Cloud details to README ( #11706 )
...
* Add Cloud details to README
* Flax script and readme updates
2021-05-14 14:51:25 +01:00
Patrick von Platen
113eaa7575
correct example script ( #11726 )
2021-05-14 12:02:57 +01:00
Lysandre
d77eb0cf92
Docs for v4.7.0.dev0
2021-05-12 17:08:35 +02:00
Lysandre
64e78564a5
Release: v4.6.0
2021-05-12 17:03:03 +02:00
Philip May
77f4c46b50
remove defaults to None if optional ( #11703 )
2021-05-12 09:11:10 -04:00
Marc van Zee
6797cdc077
Updates README and fixes bug ( #11701 )
2021-05-12 13:52:52 +01:00
Marc van Zee
4ce6bcc310
Adds Flax BERT finetuning example on GLUE ( #11564 )
...
* Adds Flax BERT finetuning example
* fix traced jax tensor type
* Use Optax losses and learning schedulers
* Add 1GPU training results
* merge into master & make style
* fix input
* del file
* Fix bug in loss and add torch runs
* finish bert flax fine-tune
* Update examples/flax/text-classification/README.md
* Update examples/flax/text-classification/run_flax_glue.py
* add requirements
* finalize
* finalize
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-11 19:02:59 +01:00
Sylvain Gugger
a135f59536
Auto modelcard ( #11599 )
...
* Autogenerate model cards from the Trainer
* ModelCard deprecated
* Fix test
* Style
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
* Quality
* With all metadata
* Metadata
* Post-merge conflict mess
* Data args and all examples
* Default license and languages when possible
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-11 11:30:34 -04:00
Jonathan Chang
64232bc0df
Add --text_column to run_summarization_no_trainer ( #11673 )
2021-05-11 07:58:38 -04:00
Matt
ef8d32c5ea
Fix suggested by @bhadreshpsavani ( #11660 )
2021-05-10 14:28:04 +01:00
Quentin Lhoest
1a0b41781d
Update requirements.txt ( #11634 )
2021-05-10 11:19:52 +05:30
Tommy Chiang
7e406f4a65
[Examples] Fix invalid links after reorg ( #11650 )
2021-05-10 11:16:48 +05:30
Tommy Chiang
f2ffcaf49f
[Examples] Check key exists in datasets first ( #11503 )
2021-05-09 15:42:38 -04:00
Stas Bekman
ba0d50f214
[examples] fix sys.path in conftest.py ( #11636 )
...
* restore conftest.py
* fix conftest and make copies
* remove unneeded parts
* remove unwanted files
2021-05-07 14:44:22 -07:00
Jonathan Chang
6f40e31766
Fix comment in run_clm_no_trainer.py ( #11624 )
2021-05-07 12:32:30 +05:30
Vipul Raheja
f594090a93
fix typo in command ( #11605 )
2021-05-06 12:32:54 +05:30
Patrick von Platen
3e3e41ae20
Pytorch - Lazy initialization of models ( #11471 )
...
* lazy_init_weights
* remove ipdb
* save int
* add necessary code
* remove unnecessary utils
* Update src/transformers/models/t5/modeling_t5.py
* clean
* add tests
* correct
* finish tests
* finish tests
* fix some more tests
* fix xlnet & transfo-xl
* fix more tests
* make sure tests are independent
* fix tests more
* finish tests
* final touches
* Update src/transformers/modeling_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* clean tests
* give arg positive name
* add more mock weights to xlnet
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-05 17:22:20 +02:00
Sylvain Gugger
6b241e0e3b
Reproducible checkpoint ( #11582 )
...
* Set generator in dataloader
* Use generator in all random samplers
* Checkpoint all RNG states
* Final version
* Quality
* Test
* Address review comments
* Quality
* Remove debug util
* Add python and numpy RNGs
* Split states in different files in distributed
* Quality
* local_rank for TPUs
* Only use generator when accepted
* Add test
* Set seed to avoid flakiness
* Make test less flaky
* Quality
2021-05-04 16:20:56 -04:00
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py ( #11470 )
...
* add flax roberta
* make style
* correct initialization
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Sylvain Gugger
87dd1a00ef
Fix metric computation in run_glue_no_trainer ( #11569 )
2021-05-03 11:42:55 -04:00
Bhadresh Savani
84326a28f8
[Examples] Added support for test-file in QA examples with no trainer ( #11510 )
...
* added support for test-file
* fixed typo
* added suggested changes
* reformatted code
* modified files
* fix post processing error
* Trigger CI
* removed extra lines
2021-04-30 09:02:50 -04:00
Suraj Patil
57c8e822f7
resize token embeds ( #11524 )
2021-04-30 08:47:01 -04:00
Matt
20d6931e32
Update TF text classification example ( #11496 )
...
Big refactor, fixes and multi-GPU/TPU support
2021-04-30 13:45:33 +01:00
Manuel Romero
58c789e3d2
Update README.md ( #11489 )
...
Add link to code
2021-04-30 04:29:59 -04:00
Sylvain Gugger
b29eb247d3
Split checkpoint from model_name_or_path in examples ( #11492 )
...
* Split checkpoint from model_name_or_path in examples
* Address review comments
* Address review comments
2021-04-29 18:33:47 -04:00
Jaimeen Ahn
0661abc545
Variable Correction for Consistency in Distillation Example ( #11444 )
...
The error comes from an inconsistency between the variable holding the number of GPUs in the parser ('gpus') and the one actually used in the train.py script ('n_gpu'); correcting this makes the example work
2021-04-26 13:30:48 -04:00
Bhadresh Savani
1d30ec95c7
[Examples] Fixes inconsistency around eval vs val and predict vs test ( #11380 )
...
* added changes for uniformity
* modified files
* corrected typo
* fixed qa scripts
* fix typos
* fixed predict typo in qa no trainer
* fixed test file
* reverted trainer changes
* reverted trainer changes in custom examples
* updated readme
* added changes in deepspeed test
* added changes for predict and eval
2021-04-26 09:24:31 -07:00
Amine Abdaoui
e3e70f9551
docs(examples): fix link to TPU launcher script ( #11427 )
2021-04-26 09:08:43 -04:00
Patrick von Platen
32dbb2d954
make style ( #11442 )
2021-04-26 13:50:34 +02:00
Sylvain Gugger
1ef152eb48
Default to accuracy metric ( #11405 )
2021-04-23 14:49:59 -04:00
Sylvain Gugger
bf2e0cf70b
Trainer push to hub ( #11328 )
...
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address review comments
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-23 09:17:37 -04:00
Yoshitomo Matsubara
c3d6f33918
fixed typos ( #11391 )
2021-04-23 07:48:42 -04:00
Max Del
a90d3f1862
Fix typo in text ( #11396 )
2021-04-23 07:37:19 -04:00
Patrick von Platen
b48cf7124c
correct typo ( #11393 )
2021-04-23 11:34:59 +02:00
Matt
2617396094
Correctly cast num_train_epochs to int ( #11379 )
2021-04-22 13:49:59 +01:00
johnson7788
5b5e4ca366
[run_translation.py] fix typo ( #11372 )
...
fix typo
Co-authored-by: johnson <johnson@github.com>
2021-04-22 17:47:11 +05:30
Matt
6fe79e57d7
Move old TF text classification script to legacy ( #11361 )
...
And update README to explain the work-in-progress!
2021-04-21 17:36:18 +01:00
Matt
ac588594e2
Merge new TF example script ( #11360 )
...
First of the new and more idiomatic TF examples!
2021-04-21 17:04:55 +01:00
Sylvain Gugger
dabeb15292
Examples reorg ( #11350 )
...
* Base move
* Examples reorganization
* Update references
* Put back test data
* Move conftest
* More fixes
* Move test data to test fixtures
* Update path
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00
Sylvain Gugger
f1b938fda8
Update to use datasets remove_columns method ( #11343 )
...
* Update to use datasets remove_columns method
* Quality
2021-04-20 14:12:01 -04:00
rajvi-k
bfd83c17a7
Added translation example script ( #11196 )
...
* initial changes
* modified evaluation
* updated evaluation
* updated evaluation on text translation example script
* added translation example script
* Formatted translation example script
* Reformatted translation example
* Fixed evaluation bug and added support for other tokenisers
* Fixed evaluation bug and added support for other tokenisers
* Added translation example script
* Formatted summarization example script
* Removed typos from summarization example script
2021-04-20 07:18:47 -04:00
Sudharsan S T
f25444cb22
Close open files to suppress ResourceWarning ( #11240 )
...
Co-authored-by: Sudharsan Thirumalai <sudharsan.t@sprinklr.com>
2021-04-14 10:31:04 -04:00
Nithin Holla
653076ca30
Save the Wav2Vec2 processor before training starts ( #10910 )
...
Co-authored-by: nithin19 <nithin@amberscript.com>
2021-04-14 14:52:06 +03:00
Philipp Schmid
9fa2995993
added cache_dir=model_args.cache_dir to all example with cache_dir arg ( #11220 )
2021-04-13 18:35:18 +02:00
Takuya Makino
cb251ba619
Fix typo ( #11188 )
2021-04-12 17:35:32 -04:00
Masatoshi TSUCHIYA
ef102c4886
model_path should be ignored as the checkpoint path ( #11157 )
...
* model_path is referred to as the path of the trainer, and should be ignored as the checkpoint path.
* Improved according to Sgugger's comment.
2021-04-12 09:06:41 -04:00
Stas Bekman
07f0bb691d
[examples run_clm] fix _LazyModule hasher error ( #11168 )
...
* fix _LazyModule hasher error
* reword
2021-04-09 11:39:12 -07:00
Suraj Patil
c161dd56df
[examples/translation] support mBART-50 and M2M100 fine-tuning ( #11170 )
...
* keep a list of multilingual tokenizers
* add forced_bos_token argument
2021-04-09 23:58:42 +05:30
Saviour Owolabi
6060746570
Update README.md ( #11161 )
...
Corrected a typo ('Downlowd' to 'Download')
2021-04-09 11:52:21 -04:00
Stas Bekman
66446909b2
[tests] relocate core integration tests ( #11146 )
...
* relocate core integration tests
* add sys.path context manager
* cleanup
* try
* try2
* fix path
* doc
* style
* add dep
* add 2 more deps
2021-04-08 13:13:17 -07:00
Andrea Cappelli
6c40e49712
Run mlm pad to multiple for fp16 ( #11128 )
...
* Add mlm collator pad to multiple option (#10627 )
* Use padding to 8x in run mlm (#10627 )
2021-04-08 16:12:49 -04:00
Stas Bekman
c6d664849b
[DeepSpeed] ZeRO Stage 3 ( #10753 )
...
* synced gpus
* fix
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* wip
* style
* fix tests
* split the tests
* refactor tests
* wip
* parameterized
* fix
* workout the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* style
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-08 09:53:01 -07:00
Stas Bekman
acc851e1ff
[run_clm] clarify why we get the tokenizer warning on long input ( #11145 )
...
* clarify why we get the warning here
* Update examples/language-modeling/run_clm.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* wording
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-08 09:46:28 -07:00
Stas Bekman
424419f549
[examples] fix white space ( #11099 )
...
these get concatenated without whitespace, so fix it
2021-04-07 09:20:58 -04:00
Stas Bekman
c9035e4537
fix: The 'warn' method is deprecated ( #11105 )
...
* The 'warn' method is deprecated
* fix test
2021-04-07 09:20:06 -04:00
Sylvain Gugger
fd338abdeb
Style
2021-04-06 19:54:13 -04:00
SHYAM SUNDER KUMAR
aef4cf8c52
accelerate question answering examples with no trainer ( #11091 )
...
* accelerate question answering examples with no trainer
* removed train and eval flags also fixed fill np array function
* Update examples/question-answering/run_qa_beam_search_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-06 19:35:21 -04:00
Lysandre
9853c5dd58
Development on v4.6.0dev0
2021-04-06 12:53:25 -04:00
Lysandre
4906a29f7f
Release v4.5.0
2021-04-06 12:37:47 -04:00
Hemil Desai
6ab7d1a429
Add Readme for language modeling scripts with accelerate ( #11073 )
2021-04-05 20:56:12 -04:00
Hemil Desai
b51b87c41d
Add examples/language_modeling/run_clm_no_trainer.py ( #11026 )
...
* Initial draft for clm no trainer
* Remove unwanted args
* Fix bug
* Update examples/language-modeling/run_clm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 12:27:52 -04:00
Stas Bekman
3d39226a51
s|Pretrained|PreTrained| ( #11048 )
2021-04-04 18:08:42 -07:00
versis
335c0ca35c
fixed typo: logging instead of logger ( #11025 )
2021-04-02 09:22:22 -04:00
Hemil Desai
838f83d84c
Add examples/language_modeling/run_mlm_no_trainer.py ( #11001 )
...
* Add initial script for finetuning MLM models with accelerate
* Add evaluation metric calculation
* Fix bugs
* Use no_grad on evaluation
* update script docstring
* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* PR feedback
* Fix CI failure
* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-31 18:49:45 -04:00
Sylvain Gugger
acc3bd9d2a
Enforce string-formatting with f-strings ( #10980 )
...
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* typo
* Missing )
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-31 10:00:27 -04:00
WybeKoper
645f45c462
Fixed some typos and removed legacy url ( #10989 )
...
* Fixed typos
* Removed legacy colab notebook from readme
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-31 16:53:15 +05:30
Yih-Dar
e031162a6b
fix md file to avoid evaluation crash ( #10962 )
2021-03-30 21:26:22 +03:00
Philipp Schmid
3e09d813aa
[examples/s2s] added py7zr dep ( #10971 )
...
* added py7zr
* comment out check_min for sagemaker test
* added min version again
2021-03-30 23:17:12 +05:30
Stas Bekman
05c966f24b
[vulnerability] dep fix ( #10954 )
...
Fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pygments/open
@LysandreJik
2021-03-29 17:25:47 -04:00
Daniel Stancl
5057213bcc
Add examples/multiple-choice/run_swag_no_trainer.py ( #10934 )
...
* Initial commit
* Another bunch of updates
* make style quality + delete debug arg from bash script
* Use compute_metrics func
* Do a few fixes
* Add copyright
* Fix typos
2021-03-29 16:41:09 -04:00
Sylvain Gugger
4002f95eb6
Remove duplicate code
2021-03-29 15:27:12 -04:00
Daniel Stancl
d7b50ce469
Add examples/run_ner_no_trainer.py ( #10902 )
...
* Add NER example with accelerate library
* This commit contains the first (yet really unfinished) version of a script showing how to train a HuggingFace model with their new accelerate library.
* Fix metric calculation
* make style quality
* mv ner_no_trainer to token-classification dir
* Delete --debug flag from running script
* hf_datasets -> raw_datasets
* Make a few slight adjustments
* Add an informative comment + rewrite a help comment
* Change header
* Fix a few things
* Enforce to use fast tokenizers only
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Change bash script: python3 -> accelerate launch
* make style
* Add a few missing things (see below)
* Add max-length padding to predictions and labels to enable accelerate gather functionality
* Add PyTorch no trainer example to the example README.md
* Remove --do-train from args as being redundant for now
* DataCollatorWithPadding -> DataCollatorForTokenClassification
* Remove some obsolete args.do_train conditions from the script
* Delete --do_train from bash running script
* Delete use_slow_tokenizer from args
* Add unintentionally removed flag --label_all_tokens
* Delete --debug flag from running script
2021-03-29 15:11:23 -04:00
WybeKoper
ddea8771c6
Updated colab links in readme of examples ( #10932 )
...
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-29 08:47:09 -04:00
Bhadresh Savani
4f21e1ddd6
fixed filename ( #10939 )
2021-03-28 09:48:12 -07:00
Stas Bekman
3c27d246e5
[vulnerability] fix dependency ( #10914 )
...
this PR fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/PyYAML/open
2021-03-26 09:06:11 -04:00
Jethro Kuan
5f1491d3b3
run_glue_no_trainer: datasets -> raw_datasets ( #10898 )
...
Use the correct variable (raw_datasets) instead of the module (datasets)
where appropriate.
2021-03-25 08:28:17 -04:00
Bhadresh Savani
7ef40120a0
[Examples] Added predict stage and Updated Example Template ( #10868 )
...
* added predict stage
* added test keyword in exception message
* removed example specific saving predictions
* fixed f-string error
* removed extra line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-23 10:37:59 -07:00
Eliza Szczechla
9f8fa4e973
Use DataCollatorForSeq2Seq in run_summarization in all cases ( #10856 )
...
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
2021-03-22 15:05:39 -04:00
Boris Dayma
125ccead71
feat(wandb): logging and configuration improvements ( #10826 )
...
* feat: ensure unique artifact id
* feat: allow manual init
* fix: simplify reinit logic
* fix: no dropped value + immediate commits
* fix: wandb use in sagemaker
* docs: improve documentation and formatting
* fix: typos
* docs: improve formatting
2021-03-22 10:45:17 -04:00
Stas Bekman
8fb4671811
[vulnerability] in example deps fix ( #10817 )
...
Takes care of:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/jinja2/open
@LysandreJik
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-22 09:05:24 -04:00
dependabot[bot]
dbfe379514
Bump jinja2 from 2.11.2 to 2.11.3 in /examples/research_projects/lxmert ( #10818 )
...
Bumps [jinja2](https://github.com/pallets/jinja ) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases )
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst )
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-03-22 08:54:50 -04:00
Qiushi Pan
29904a967b
Update FINE_TUNE_XLSR_WAV2VEC2.md ( #10849 )
...
Fix typo.
2021-03-22 07:58:59 -04:00
Patrick von Platen
0f226f78ce
push ( #10846 )
2021-03-22 10:32:21 +03:00
Suraj Patil
82b8d8c7b0
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-21 22:47:09 +05:30
Patrick von Platen
af6125ffdb
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-21 12:31:33 +03:00
Patrick von Platen
5aaf6e1460
small improvements for wav2vec2 info script ( #10829 )
2021-03-21 11:41:44 +03:00
Suraj Patil
68b55885ed
add doc for Local machine ( #10828 )
2021-03-21 13:25:34 +05:30
Julien Chaumond
1438c487df
wav2vec doc tweaks ( #10808 )
...
* wording/typos tweaks
* Make model upload instructions simpler
2021-03-19 12:48:54 -04:00
Patrick von Platen
b9570a813c
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 19:45:28 +03:00
Sylvain Gugger
946400fb68
Expand a bit the presentation of examples ( #10799 )
...
* Expand a bit the presentation of examples
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-19 10:06:08 -04:00
Bhadresh Savani
fd1d9f1ab8
[Example] Updating Question Answering examples for Predict Stage ( #10792 )
...
* added prediction stage and eval fix
* style correction
* removed extra lines
2021-03-19 09:42:17 -04:00
Patrick von Platen
e8968bd03a
[XLSR-Wav2Vec2 Info doc] Add a couple of lines ( #10806 )
...
* finish
* fix
* fix
* fix
* fix
2021-03-19 12:52:54 +03:00
Stas Bekman
427ea3fecb
addressing vulnerability report in research project deps ( #10802 )
...
Following up on a security alert:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pillow/open
2021-03-18 22:02:10 -04:00
Patrick von Platen
2ae678229f
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:29:20 +03:00
Patrick von Platen
68a3215949
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:27:40 +03:00
Patrick von Platen
03df3fbcb4
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:26:49 +03:00
Patrick von Platen
e84adbed40
Add XLSR-Wav2Vec2 Fine-Tuning README.md ( #10786 )
...
* upload
* upload fine-tuning script
* improve
* adapt
* Apply suggestions from code review
* correct
* upload
* finalize
* remove @
* correct typos
2021-03-19 00:22:43 +03:00
Stas Bekman
9352b5151a
[examples/seq2seq/README.md] fix t5 examples ( #10734 )
...
* [examples/seq2seq] fix t5 examples
This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get 10x worse BLEU scores w/o it: w/ `27.6849`, w/o `2.374`
* added a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly
summarization seems to be broken for t5 score-wise: https://github.com/huggingface/transformers/issues/10733
@sgugger
* specify explicitly the t5 models requiring the special handling
* one more
* update the t5 summarization example to use cnn_dailymail
* move max*samples into the top level README.md
* better wording
* better wording
2021-03-18 09:55:39 -07:00
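For background on why `--source_prefix` is required for T5 (a minimal sketch, not the example script itself): T5 was pre-trained with task prefixes, so preprocessing must prepend one before tokenization.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

prefix = "translate English to Romanian: "  # what --source_prefix supplies
texts = ["UN Chief Says There Is No Military Solution in Syria"]
model_inputs = tokenizer([prefix + t for t in texts], truncation=True)
print(model_inputs["input_ids"][0][:5])
```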
Julien Chaumond
4f3e93cfaf
[file_utils] do not gobble certain kinds of requests.ConnectionError ( #10235 )
...
* do not gobble certain kinds of requests.ConnectionError
* Apply review comments
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-03-18 12:37:45 -04:00
Suraj Patil
5f19c07a70
add run_common_voice script ( #10767 )
...
* add initial script
* finish script
* add shell script example
* accept chars_to_ignore as cl arg
* align the script with other example scripts
* add torchaudio dep
2021-03-18 17:21:16 +05:30
Mohamed El-Geish
af8afdc88d
wav2vec2: support datasets other than LibriSpeech ( #10581 )
...
* wav2vec2: support datasets other than LibriSpeech
* Formatting run_asr.py to pass code quality test
* bundled orthography options and added verbose logs
* fixing a typo in timit fine-tuning script
* update comment for clarity
* resize_lm_head and load custom vocab from file
* adding a max_duration_in_seconds filter
* do not assign `duration_filter` lambda, use a def
* log untransliterated text as well
* fix base model for arabic
* fix duration filter when target_sr is not set
* drop duration_in_seconds when unneeded
* script for wav2vec2-large-lv60-timit-asr
* fix for "tha" in arabic corpus (huggingface#10581)
* adding more options to work with common_voice
* PR feedback (huggingface#10581)
* small README change
2021-03-18 10:20:26 +03:00
Stas Bekman
393739194e
[examples] document resuming ( #10776 )
...
* document resuming in examples
* fix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* put trainer code last, adjust notes
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-17 12:48:35 -07:00
Stas Bekman
cd8c93f701
[DeepSpeed] improve checkpoint loading code plus tests ( #10760 )
...
* deepspeed checkpoint loading code plus tests
* style
* style
2021-03-17 10:22:58 -07:00
Cheng Li
c83fbc5f2d
[Deepspeed] Allow HF optimizer and scheduler to be passed to deepspeed ( #10464 )
...
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composit argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* style
* correct level
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdated information
* refactor duplicated code
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-16 15:51:09 -07:00
Lysandre
1b5ce1e63b
Development on v4.5.0dev0
2021-03-16 11:41:15 -04:00
Lysandre
c988db5af2
Release v4.4.0
2021-03-16 11:33:35 -04:00
Russell Klopfer
87d685b8a9
independent training / eval with local files ( #10710 )
...
* independent training / eval with local files
* remove redundant assert
2021-03-15 19:35:26 -04:00
Sylvain Gugger
4c379daf64
Add minimum version check in examples ( #10724 )
...
* Add minimum version check in examples
* Style
* No need for new line maybe?
* Add helpful comment
2021-03-15 19:29:54 -04:00
Joe Davison
966ba081c9
zero-shot pipeline multi_class -> multi_label ( #10727 )
2021-03-15 16:02:46 -06:00
Théo Matussière
6f840990a7
split seq2seq script into summarization & translation ( #10611 )
...
* split seq2seq script, update docs
* needless diff
* fix readme
* remove test diff
* s/summarization/translation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* cr
* fix arguments & better mbart/t5 refs
* copyright
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* reword readme
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* s/summarization/translation
* short script names
* fix tests
* fix isort, include mbart doc
* delete old script, update tests
* automate source prefix
* automate source prefix for translation
* s/translation/trans
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* fix script name (short version)
* typos
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* exact parameter
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* remove superfluous source_prefix calls in docs
* rename scripts & warn for source prefix
* black
* flake8
Co-authored-by: theo <theo@matussie.re>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-15 09:11:42 -04:00
Stas Bekman
4c32f9f26e
AdamW is now supported by default ( #9624 )
2021-03-12 13:40:07 -08:00
Lysandre Debut
9fbb4cdc80
Specify minimum version for sacrebleu ( #10662 )
2021-03-11 13:45:06 -05:00
ArvidYin
27d9e05ce2
Update README.md ( #10647 )
...
correct spelling error: 'nether'
2021-03-11 08:58:04 -05:00
Sylvain Gugger
efb5c0a453
Add new GLUE example with no Trainer. ( #10555 )
...
* Add new GLUE example with no Trainer.
* Style
* Address review comments
2021-03-10 09:29:19 -05:00
Allen Wang
6f52fce673
Fixes an issue in text-classification where MNLI eval/test datasets are not being preprocessed. ( #10621 )
...
* Fix MNLI tests
* Linter fix
2021-03-09 22:13:45 -05:00
Sylvain Gugger
0d909f6bd8
Fairscale FSDP fix model save ( #10596 )
...
* Hotfix fairscale FSDP
* Evaluation works
* Save on process zero
2021-03-09 14:42:07 -05:00
Stas Bekman
f284089ec4
[examples tests on multigpu] resolving require_torch_non_multi_gpu_but_fix_me ( #10561 )
...
* batch 1
* this is tpu
* deebert attempt
* the rest
2021-03-08 11:11:40 -08:00
Bhadresh Savani
dfd16af832
Added max_sample_ arguments ( #10551 )
...
* reverted changes of logging and saving metrics
* added max_sample arguments
* fixed code
* white space diff
* reformatting code
* reformatted code
2021-03-08 13:57:10 -05:00
Stas Bekman
917f104502
[examples tests] various fixes ( #10584 )
...
* fix sharded ddp enum
* test fixes
* stronger validation + apex breaks other tests
2021-03-08 10:28:44 -08:00
Stas Bekman
e6ce636e02
fix nltk lookup ( #10585 )
2021-03-07 22:09:58 -08:00
Stas Bekman
88a951e3cc
offline mode for firewalled envs ( #10407 )
...
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
Patrick von Platen
395ffcd757
fix run seq2seq ( #10547 )
2021-03-05 18:17:12 +03:00
Sylvain Gugger
a5bd40b75c
Not always consider a local model a checkpoint in run_glue ( #10517 )
2021-03-04 11:11:39 -05:00
Sylvain Gugger
745ea78dcc
Revert "Not always consider a local model a checkpoint in run_glue"
...
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger
f3660613bc
Not always consider a local model a checkpoint in run_glue
2021-03-04 09:44:02 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push confg for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Bhadresh Savani
aca6288ff4
updated logging and saving metrics ( #10436 )
...
* updated logging and saving metrics
* space removal
2021-02-27 09:53:44 -08:00
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt ( #10428 )
...
This PR restores the original functionality that for some reason was modified.
Fixes: https://github.com/huggingface/transformers/issues/10381
@sgugger
2021-02-27 08:21:50 -08:00
Stas Bekman
ee04b69822
[examples] better model example ( #10427 )
...
* refactors
* typo
2021-02-26 17:01:01 -08:00
Sylvain Gugger
17b6e0d474
Fix run_glue evaluation when model has a label correspondence ( #10401 )
2021-02-25 15:30:38 -05:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale ( #10354 )
...
* Add support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply Suraj's recommendations
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
Stas Bekman
3437d12134
[Trainer/Deepspeed] handle get_last_lr() before first step() ( #10362 )
...
* handle get_last_lr() before first step()
* abstract away the lr getting logic
* cleanup
* add test
* move to utils
2021-02-23 17:42:25 -08:00
Akmal
23e87c27be
Fix broken examples/seq2seq/README.md markdown ( #10344 )
2021-02-23 10:49:25 -05:00
Stas Bekman
622a8c5995
[trainer] add Trainer methods for metrics logging and saving ( #10266 )
...
* make logging and saving trainer built-in
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-22 13:02:53 -08:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration ( #10310 )
...
* implement gradient_accumulation_steps support in DeepSpeed integration
* typo
* cleanup
* cleanup
2021-02-22 11:15:59 -08:00
Stas Bekman
f991daed18
defensive programming + expand/correct README ( #10295 )
2021-02-22 10:58:50 -08:00
Julien Plu
536aee99bb
Move the TF NER example ( #10276 )
2021-02-19 16:06:13 -05:00
Joe Davison
cbadb5243c
Zero shot distillation script cuda patch ( #10284 )
2021-02-19 14:06:57 -05:00
Joe Davison
c6fe17557e
Script for distilling zero-shot classifier to more efficient student ( #10244 )
...
* add zero-shot distillation script
* readme wordsmithing
* clean up code
* add multi-gpu teacher inference
plus tidying up more code
* add use_fast_tokenizer arg
* update results in readme
* more readme wordsmithing
* style
* Add handle to readme
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix code block
* add error+docs about distributed & tpu
* add @sgugger format requests
* xla -> tpu
* support fp16 for teacher preds
* no checkpoint by default
* add demo colab link
* add model sharing prompt + model link
* correct resulting acc of example
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-18 17:08:45 -05:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics ( #10225 )
...
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rename method
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
Stas Bekman
d1eb88f42d
[CI] 2 fixes ( #10248 )
...
* fix invalid port
* missing requirements
2021-02-17 14:12:39 -08:00
Zhang Cheng
df1b0fb54d
set tgt_lang of MBart Tokenizer for summarization ( #10205 )
2021-02-16 09:39:37 -05:00
Suraj Patil
1c8c2d9ab3
[WIP][examples/seq2seq] move old s2s scripts to legacy ( #10136 )
...
* move old s2s scripts to legacy
* add the tests back
* proper rename
* restore
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-15 10:48:02 -08:00
Stas Bekman
0b1f552a24
fix run_seq2seq.py; porting trainer tests to it ( #10162 )
...
* fix run_seq2seq.py; porting DeepSpeed tests to it
* unrefactor
* defensive programming
* defensive programming 2
* port the rest of the trainer tests
* style
* a cleaner scripts dir finder
* cleanup
2021-02-15 09:12:17 -08:00
Suraj Patil
f51188cbe7
[examples/run_s2s] remove task_specific_params and update rouge computation ( #10133 )
...
* fix rouge metrics and task specific params
* fix typo
* round metrics
* typo
* remove task_specific_params
2021-02-12 17:18:21 +05:30
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab ( #10130 )
...
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
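Sketched below is the "init devices/setup explicitly" part as a notebook cell: without a distributed launcher, the environment variables it would normally set have to be provided by hand before DeepSpeed initializes. The values shown are the usual single-process defaults; the port is arbitrary.

```python
# Notebook sketch: emulate the env a distributed launcher would provide, so
# DeepSpeed can initialize inside Jupyter/Colab with a single process.
import os

os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "9994"   # any free port
os.environ["RANK"] = "0"
os.environ["LOCAL_RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"
```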
Qbiwan
8dcfaea08d
Update run_xnli.py to use Datasets library ( #9829 )
...
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* fix
* fix
* fix
* push
* fix
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* fix doc
* fix doc again
* fix doc again
* Apply suggestions from code review
* make style
* Proposal that should work
* Remove needless code
* Fix test
* Apply suggestions from code review
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* amend README
* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.
* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"] and examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to the accuracy metric, add a condition for train_language being None when using dataset.load_dataset()
* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
2021-02-11 10:27:23 +05:30
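A sketch of the reworked data and metric flow described above, with illustrative argument values: the XNLI train and validation splits are loaded separately via the datasets library, and compute_metrics reports plain accuracy.

```python
# Sketch (values illustrative): load XNLI splits with the datasets library and
# compute accuracy directly in compute_metrics.
import numpy as np
from datasets import load_dataset

train_dataset = load_dataset("xnli", "en", split="train")      # e.g. train_language
eval_dataset = load_dataset("xnli", "en", split="validation")  # e.g. language

def compute_metrics(eval_pred):
    preds = np.argmax(eval_pred.predictions, axis=1)
    return {"accuracy": float((preds == eval_pred.label_ids).mean())}
```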
Stas Bekman
77b862847b
[DeepSpeed] restore memory for evaluation ( #10114 )
...
* free up memory at the end of train
* rework tests
* consistent formatting
* correction
2021-02-10 09:09:48 -08:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF ( #10119 )
2021-02-10 10:50:00 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
...
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
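The `report_to` mention in the doc update boils down to an explicit opt-in along these lines (a sketch; other training arguments omitted, and the run name is a placeholder).

```python
# Sketch: opt in to Weights & Biases logging explicitly via report_to.
from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", report_to=["wandb"], run_name="my-experiment")
```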
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions ( #10085 )
...
* add do_predict, pass eval_beams during eval
* update help
* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Stas Bekman
781220acab
transition to new tests dir ( #10080 )
2021-02-08 12:41:52 -08:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests ( #10039 )
...
* deepspeed bug fixes and tests
* manual wrap?
2021-02-08 09:44:02 -08:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics ( #10046 )
...
* replace -100 token ids with the tokenizer pad_id for compute_metrics
* fixed typo for label_ids
2021-02-08 10:08:16 -05:00
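A minimal sketch of the pattern behind this fix: labels are padded with -100 so the loss ignores them, which means they must be mapped back to the tokenizer's pad token id before decoding inside compute_metrics. The helper below is illustrative, not the example's exact code.

```python
# Sketch: map the -100 loss-padding back to pad_token_id before decoding.
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

def decode_labels(label_ids):
    label_ids = np.where(label_ids != -100, label_ids, tokenizer.pad_token_id)
    return tokenizer.batch_decode(label_ids, skip_special_tokens=True)
```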
Sylvain Gugger
b01483faa0
Truncate max length if needed in all examples ( #10034 )
2021-02-08 05:03:55 -05:00
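The recurring one-liner behind this fix, sketched with an example tokenizer; the requested length stands in for the script's own argument.

```python
# Sketch: cap the requested sequence length at what the tokenizer/model supports.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
requested_max_seq_length = 1024                       # e.g. data_args.max_seq_length
max_seq_length = min(requested_max_seq_length, tokenizer.model_max_length)  # -> 512
```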
Stas Bekman
24db8cc329
Can't mix --fp16 and --device cpu ( #10041 )
2021-02-07 17:54:20 -08:00
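The gist of the guard, as a sketch; the function and argument names are illustrative rather than the script's own.

```python
# Sketch: half-precision (AMP) training needs a CUDA device, so fail early.
import torch

def check_fp16(fp16: bool, device: torch.device) -> None:
    if fp16 and device.type != "cuda":
        raise ValueError("--fp16 requires a CUDA device and can't be combined with --device cpu")
```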
Stas Bekman
769948fad2
Convert JSON to JSON Lines, plus doc and typo fixes ( #10043 )
2021-02-07 17:51:34 -08:00
Stas Bekman
8ea412a86f
[examples] make run scripts executable ( #10037 )
...
* make executable
* make executable
* same for the template
* cleanup
2021-02-05 15:51:18 -08:00
Suraj Patil
1cd16512dc
[examples/seq2seq] support label smoothing ( #9844 )
...
* add prepare_decoder_input_ids_from_labels in s2s models
* support lbl smoothing and enc/emb freezing
* fix freezing
* use pad_token_id from config
* remove embed freezing and add warning
* prepare decoder_input_ids inside DataCollatorForSeq2Seq
2021-02-05 23:21:57 +05:30
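A rough sketch of the pieces this change wires together, assuming a small seq2seq checkpoint: the data collator prepares decoder inputs from the labels via the model, and label smoothing is switched on through a training argument.

```python
# Sketch: collator builds decoder_input_ids from labels; smoothing is a training arg.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

collator = DataCollatorForSeq2Seq(tokenizer, model=model)  # uses prepare_decoder_input_ids_from_labels
args = Seq2SeqTrainingArguments(output_dir="out", label_smoothing_factor=0.1)
```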
Suraj Patil
bca0dd5ee3
[run_clm.py] fix getting extension
2021-02-03 20:14:42 +05:30
Stas Bekman
d55e10beab
[research proj] [lxmert] rm bleach dependency ( #9970 )
...
It looks like a vulnerability risk and isn't really used anywhere in the code, so we might as well remove it completely from the dependencies.
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/bleach/open
2021-02-03 05:24:40 -05:00
Patrick von Platen
538b3b4607
[Tokenizer Utils Base] Make pad function more flexible ( #9928 )
...
* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tok without padding token
* Apply suggestions from code review
2021-02-02 10:35:27 +03:00
Sylvain Gugger
115d97dd2f
Remove subclass for sortish sampler ( #9907 )
...
* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling
2021-02-01 08:06:32 -05:00
wlhgtc
1682804ebd
Fit chinese wwm to new datasets ( #9887 )
...
* MOD: fit chinese wwm to new datasets
* MOD: move wwm to new folder
* MOD: format code
* Styling
* MOD: add param and recover trainer
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-02-01 03:37:59 -05:00
Stas Bekman
6bab83683b
fix logger format for non-main process ( #9911 )
2021-02-01 03:08:12 -05:00
Stas Bekman
6bf94bc0b6
correctly handle mt5 ( #9879 )
2021-01-29 08:11:22 -08:00
Sylvain Gugger
b4e559cfa1
Deprecate model_path in Trainer.train ( #9854 )
2021-01-28 08:32:46 -05:00
Sylvain Gugger
f2fabedbab
Setup logging with a stdout handler ( #9816 )
2021-01-27 03:39:11 -05:00
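A sketch of the setup this commit describes: route example-script logging to stdout explicitly instead of relying on Python's default stderr handler. The format string is illustrative.

```python
# Sketch: send logging output to stdout with an explicit handler.
import logging
import sys

logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
    datefmt="%m/%d/%Y %H:%M:%S",
    handlers=[logging.StreamHandler(sys.stdout)],
)
```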
Yusuke Mori
059bb25817
Fix a bug in run_glue.py ( #9812 ) ( #9815 )
2021-01-26 14:32:19 -05:00
Magdalena Biesialska
8f6c12d306
Fix fine-tuning translation scripts ( #9809 )
2021-01-26 11:30:31 -05:00
Andrea Cappelli
10e5f28212
Improve pytorch examples for fp16 ( #9796 )
...
* Pad to 8x for fp16 multiple choice example (#9752 )
* Pad to 8x for fp16 squad trainer example (#9752 )
* Pad to 8x for fp16 ner example (#9752 )
* Pad to 8x for fp16 swag example (#9752 )
* Pad to 8x for fp16 qa beam search example (#9752 )
* Pad to 8x for fp16 qa example (#9752 )
* Pad to 8x for fp16 seq2seq example (#9752 )
* Pad to 8x for fp16 glue example (#9752 )
* Pad to 8x for fp16 new ner example (#9752 )
* update script template #9752
* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-26 04:47:07 -05:00
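The recurring change across these examples, sketched below: when training with --fp16, pad batch sequence lengths to a multiple of 8 so NVIDIA tensor cores are used efficiently. The checkpoint is just an example.

```python
# Sketch: pad to a multiple of 8 only when fp16 is enabled.
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
fp16 = True
data_collator = DataCollatorWithPadding(tokenizer, pad_to_multiple_of=8 if fp16 else None)
```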
Sylvain Gugger
caf4abf768
Auto-resume training from checkpoint ( #9776 )
...
* Auto-resume training from checkpoint
* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-25 12:03:51 -05:00
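A sketch of the behaviour the examples gained: if output_dir already contains a checkpoint, pick it up and resume rather than training from scratch. It assumes `trainer` and `training_args` are already built, and shows the train() argument under its later name, resume_from_checkpoint (the original may have used the older model_path argument deprecated in #9854).

```python
# Sketch: resume from the latest checkpoint in output_dir if one exists.
from transformers.trainer_utils import get_last_checkpoint

last_checkpoint = get_last_checkpoint(training_args.output_dir)  # None if no checkpoint
train_result = trainer.train(resume_from_checkpoint=last_checkpoint)
```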
Wilfried L. Bounsi
9152f16023
Fix broken [Open in Colab] links ( #9761 )
2021-01-23 15:11:46 +05:30
Sylvain Gugger
411c582109
Fixes to run_seq2seq and instructions ( #9734 )
...
* Fixes to run_seq2seq and instructions
* Add more defaults for summarization
2021-01-22 10:03:57 -05:00
Stefan Schweter
08b22722c7
examples: fix XNLI url ( #9741 )
2021-01-22 18:13:52 +05:30
Sylvain Gugger
5f80c15ef5
Fix memory regression in Seq2Seq example ( #9713 )
...
* Fix memory regression in Seq2Seq example
* Fix test and properly deal with -100
* Easier condition with device safety
* Patch for MBartTokenizerFast
2021-01-21 12:05:46 -05:00
Sylvain Gugger
582f516adb
Use datasets squad_v2 metric in run_qa ( #9677 )
2021-01-20 04:52:13 -05:00
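As a sketch of the change above: the QA example picks the matching metric from the datasets library depending on whether unanswerable questions (SQuAD v2) are in play; the flag value stands in for the script's own argument.

```python
# Sketch: choose the squad_v2 or squad metric based on the dataset version.
from datasets import load_metric

version_2_with_negative = True  # e.g. data_args.version_2_with_negative
metric = load_metric("squad_v2" if version_2_with_negative else "squad")
# later: metric.compute(predictions=formatted_predictions, references=references)
```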