Tommy Chiang
7e406f4a65
[Examples] Fix invalid links after reorg ( #11650 )
2021-05-10 11:16:48 +05:30
Tommy Chiang
f2ffcaf49f
[Examples] Check key exists in datasets first ( #11503 )
2021-05-09 15:42:38 -04:00
Stas Bekman
ba0d50f214
[examples] fix sys.path in conftest.py ( #11636 )
...
* restore conftest.py
* fix conftest and make copies
* remove unneeded parts
* remove unwanted files
2021-05-07 14:44:22 -07:00
Stas Bekman
cd9b8d7efe
[self-push CI] sync with self-scheduled ( #11637 )
...
forgot to add the missing `libaio-dev` to this workflow
2021-05-07 14:06:33 -07:00
Lysandre Debut
da37eb8e43
Reduce to 1 worker and set timeout for GPU TF tests ( #11633 )
2021-05-07 11:55:20 -04:00
Lysandre Debut
39084ca663
Add the ImageClassificationPipeline ( #11598 )
...
* Add the ImageClassificationPipeline
* Code review
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
* Have `load_image` at the module level
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-05-07 08:08:40 -04:00
Patrick von Platen
e7bff0aabe
make fix copy ( #11627 )
2021-05-07 07:48:51 -04:00
Vasudev Gupta
dc3f6758cf
Add BigBirdPegasus ( #10991 )
...
* init bigbird pegasus
* add debugging nb ; update config
* init conversion
* update conversion script
* complete conversion script
* init forward()
* complete forward()
* add tokenizer
* add some slow tests
* commit current
* fix copies
* add docs
* add conversion script for bigbird-roberta-summarization
* remove TODO
* small fixups
* correct tokenizer
* add bigbird core for now
* fix config
* fix more
* revert pegasus-tokenizer back
* make style
* everything working for pubmed; yayygit status
* complete tests finally
* remove bigbird pegasus tok
* correct tokenizer
* correct tests
* add tokenizer files
* finish make style
* fix test
* update
* make style
* fix tok utils base file
* make fix-copies
* clean a bit
* small update
* fix some suggestions
* add to readme
* fix a bit, clean tests
* fix more tests
* Update src/transformers/__init__.py
* Update src/transformers/__init__.py
* make fix-copies
* complete attn switching, auto-padding left
* make style
* fix auto-padding test
* make style
* fix batched attention tests
* put tolerance at 1e-1 for stand-alone decoder test
* fix docs
* fix tests
* correct slow tokenizer conversion
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* complete remaining suggestions
* fix test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-07 09:27:43 +02:00
Jonathan Chang
6f40e31766
Fix comment in run_clm_no_trainer.py ( #11624 )
2021-05-07 12:32:30 +05:30
Sylvain Gugger
33fd83bc01
Fix RNG saves in distributed mode. ( #11620 )
...
* Fix RNG saves in distributed mode.
* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-06 17:14:12 -04:00
Stas Bekman
619200cc42
[cuda ext tests] fixing tests ( #11619 )
...
* fixing tests
* cleanup
2021-05-06 13:35:28 -07:00
Patrick von Platen
44c5621db0
fix tests ( #11615 )
2021-05-06 20:42:51 +02:00
Sylvain Gugger
7eee950ac3
Re-styling in seq2seq attention ( #11613 )
2021-05-06 14:24:19 -04:00
Eldar Kurtic
cf409e5594
Fix docstring typo ( #11611 )
2021-05-06 17:09:28 +05:30
Vipul Raheja
f594090a93
fix typo in command ( #11605 )
2021-05-06 12:32:54 +05:30
Lysandre Debut
079557c1c5
Fix Python version ( #11607 )
2021-05-06 02:50:11 -04:00
baeseongsu
c1780ce7a4
fix head_mask for albert encoder part(AlbertTransformer) ( #11596 )
...
* fix head mask for albert encoder part
* fix head_mask for albert encoder part
2021-05-06 02:18:02 -04:00
Mats Sjöberg
864c1dfe34
Accept tensorflow-rocm package when checking TF availability ( #11595 )
2021-05-05 14:44:29 -04:00
Patrick von Platen
3e3e41ae20
Pytorch - Lazy initialization of models ( #11471 )
...
* lazy_init_weights
* remove ipdb
* save int
* add necessary code
* remove unnecessary utils
* Update src/transformers/models/t5/modeling_t5.py
* clean
* add tests
* correct
* finish tests
* finish tests
* fix some more tests
* fix xlnet & transfo-xl
* fix more tests
* make sure tests are independent
* fix tests more
* finist tests
* final touches
* Update src/transformers/modeling_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* clean tests
* give arg positive name
* add more mock weights to xlnet
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-05 17:22:20 +02:00
Lysandre
8fa8e19429
Skip Funnel test
2021-05-05 12:38:01 +02:00
Deepali
83e59d8e0b
add importlib_metadata and huggingface_hub as dependency in the conda recipe ( #11591 )
...
* add importlib_metadata as dependency (#11490 )
Co-authored-by: Deepali Chourasia <deepch23@us.ibm.com>
* add huggingface_hub dependency
Co-authored-by: Deepali Chourasia <deepch23@us.ibm.com>
2021-05-05 03:36:18 -04:00
Stas Bekman
bf0dfa98d3
copies need to be fixed too ( #11585 )
2021-05-05 03:35:15 -04:00
Stas Bekman
c065025c47
[trainer] document resume randomness ( #11588 )
...
* document resume randomness
* fix link
* reword
* fix
* reword
* style
2021-05-04 14:17:11 -07:00
Sylvain Gugger
6b241e0e3b
Reproducible checkpoint ( #11582 )
...
* Set generator in dataloader
* Use generator in all random samplers
* Checkpoint all RNG states
* Final version
* Quality
* Test
* Address review comments
* Quality
* Remove debug util
* Add python and numpy RNGs
* Split states in different files in distributed
* Quality
* local_rank for TPUs
* Only use generator when accepted
* Add test
* Set seed to avoid flakiness
* Make test less flaky
* Quality
2021-05-04 16:20:56 -04:00
Patrick Fernandes
0afe4a90f9
[Flax] Add Electra models ( #11426 )
...
* add electra model to flax
* Remove Electra Next Sentence Prediction model added by mistake
* fix parameter sharing and loosen equality threshold
* fix styling issues
* add mistaken removen imports
* fix electra table
* Add FlaxElectra to automodels and fixe docs
* fix issues pointed out the PR
* fix flax electra to comply with latest changes
* remove stale class
* add copied from
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-04 20:56:09 +02:00
Philipp Schmid
226e74b610
Removes SageMakerTrainer code but keeps class as wrapper ( #11587 )
...
* removed all old code
* make quality
2021-05-04 14:31:18 -04:00
Patrick von Platen
084a187da3
[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py ( #11470 )
...
* add flax roberta
* make style
* correct initialiazation
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-04 19:57:59 +02:00
Sylvain Gugger
2ce0fb84cc
Make quality scripts work when one backend is missing. ( #11573 )
...
* Make quality scripts work when one backend is missing.
* Check env variable is properly set
* Add default
* With print statements
* Fix typo
* Set env variable
* Remove debug code
2021-05-04 09:53:44 -04:00
Lysandre Debut
09b0bcfea9
Enable added tokens ( #11325 )
...
* Fix tests
* Reorganize
* Update tests/test_modeling_mobilebert.py
* Remove unnecessary addition
2021-05-04 08:13:57 -04:00
abhishek thakur
c40c7e213b
Add multi-class, multi-label and regression to transformers ( #11012 )
...
* add to bert
* review comments
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* self.config.problem_type
* fix style
* fix
* fin
* fix
* update doc
* fix
* test
* Test more problem types
* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix
* remove
* fix
* quality
* make fix-copies
* remove test
Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-05-04 02:23:40 -04:00
Stas Bekman
7c622482e8
fix resize_token_embeddings ( #11572 )
2021-05-03 13:12:06 -07:00
Sylvain Gugger
fe82b1bfa0
Update training tutorial ( #11533 )
...
* Update training tutorial
* Apply suggestions from code review
Co-authored-by: Hamel Husain <hamelsmu@github.com>
* Address review comments
* Update docs/source/training.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* More review comments
* Last review comments
Co-authored-by: Hamel Husain <hamelsmu@github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-05-03 13:18:46 -04:00
Sylvain Gugger
f4c9a7e62e
Accumulate opt state dict on do_rank 0 ( #11481 )
2021-05-03 13:18:27 -04:00
Nicolas Patry
1e8e06862f
Fixes a useless warning. ( #11566 )
...
Fixes #11525
2021-05-03 18:48:13 +02:00
Sylvain Gugger
87dd1a00ef
Fix metric computation in run_glue_no_trainer ( #11569 )
2021-05-03 11:42:55 -04:00
Muktan
a721a5eefd
[Wav2vec2] Fixed tokenization mistakes while adding single-char tokens to tokenizer ( #11538 )
...
* Fixed tokenization mistakes while adding single-char tokens to tokenizer
* Added tests and Removed unnecessary comments.
* finalize wav2vec2 tok
* add more aggressive tests
* Apply suggestions from code review
* fix useless import
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-03 17:19:12 +02:00
NielsRogge
f3cf8ae7b3
Add LUKE ( #11223 )
...
* Rebase with master
* Minor bug fix in docs
* Copy files from adding_luke_v2 and improve docs
* change the default value of use_entity_aware_attention to True
* remove word_hidden_states
* fix head models
* fix tests
* fix the conversion script
* add integration tests for the pretrained large model
* improve docstring
* Improve docs, make style
* fix _init_weights for pytorch 1.8
* improve docs
* fix tokenizer to construct entity sequence with [MASK] entity when entities=None
* Make fix-copies
* Make style & quality
* Bug fixes
* Add LukeTokenizer to init
* Address most comments by @patil-suraj and @LysandreJik
* rename _compute_extended_attention_mask to get_extended_attention_mask
* add comments to LukeSelfAttention
* fix the documentation of the tokenizer
* address comments by @patil-suraj, @LysandreJik, and @sgugger
* improve docs
* Make style, quality and fix-copies
* Improve docs
* fix docs
* add "entity_span_classification" task
* update example code for LukeForEntitySpanClassification
* improve docs
* improve docs
* improve the code example in luke.rst
* rename the classification layer in LukeForEntityClassification from typing to classifier
* add bias to the classifier in LukeForEntitySpanClassification
* update docs to use fine-tuned hub models in code examples of the head models
* update the example sentences
* Make style & quality
* Add require_torch to tokenizer tests
* Add require_torch to tokenizer tests
* Address comments by @sgugger and add community notebooks
* Make fix-copies
Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
2021-05-03 09:07:29 -04:00
Frederik Bode
6a11e4c2ad
fix the mlm longformer example by changing [MASK] to <mask> ( #11559 )
2021-05-03 12:43:30 +01:00
Lysandre Debut
1c86157d9d
Remove datasets submodule. ( #11563 )
2021-05-03 06:02:33 -04:00
Patrick von Platen
c448c01f25
[Wav2Vec2] Fix convert ( #11562 )
...
* push
* small change
* correct other typo
2021-05-03 11:53:30 +02:00
Suraj Patil
623281aa12
[Flax BERT/Roberta] few small fixes ( #11558 )
...
* small fixes
* style
2021-05-03 10:35:06 +02:00
lewtun
a5d2967bd8
Fix examples in M2M100 docstrings ( #11540 )
...
Replaces `tok` with `tokenizer` so examples can run with copy-paste
2021-05-03 10:56:31 +05:30
jingyihe
980208650a
Fixed docs for the shape of scores in generate() ( #10057 )
...
* Fixed the doc for the shape of return scores tuples in generation_utils.py.
* Fix the output shape of `scores` for `DecoderOnlyOutput`.
* style fix
2021-05-02 10:10:47 +02:00
Stas Bekman
4e7bf94e72
[DeepSpeed] fp32 support ( #11499 )
...
* prep for deepspeed==0.3.16
* new version
* too soon
* support and test fp32 mode
* troubleshooting doc start
* workaround no longer needed
* add fp32 doc
* style
* cleanup, add tf32 note
* clarify
* release was made
2021-04-30 12:51:48 -07:00
Stas Bekman
282f3ac3ef
[debug utils] activation/weights underflow/overflow detector ( #11274 )
...
* sync
* add activation overflow debug utility
* cleanup
* document detect_overflow
* import torch
* add deprecation warning
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* convert to rst, add note
* add class
* fix docs
* improve the doc
* rework to dump a lot more info about each frame
* complete expansion
* cleanup
* format
* cleanup
* doesn't have to be transformers
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* wrap long line
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-30 11:15:46 -07:00
Hamel Husain
804c2974d5
Improve task summary docs ( #11513 )
...
* fix task summary docs
* refactor to use model.config.id2label instead of list
* fix nit
* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-30 09:06:47 -04:00
Sylvain Gugger
bc80f8bc37
Add Stas and Suraj as authors ( #11526 )
2021-04-30 09:03:13 -04:00
Bhadresh Savani
84326a28f8
[Examples] Added support for test-file in QA examples with no trainer ( #11510 )
...
* added support for test-file
* fixed typo
* added suggested changes
* reformatted code
* modifed files
* fix post processing error
* Trigger CI
* removed extra lines
2021-04-30 09:02:50 -04:00
Lysandre Debut
af0692a2ca
Run model templates on master ( #11527 )
2021-04-30 08:47:12 -04:00
Suraj Patil
57c8e822f7
reszie token embeds ( #11524 )
2021-04-30 08:47:01 -04:00