Mohamed Al Salti
1321356bdf
Fix typo in GPT2DoubleHeadsModel docs ( #10148 )
...
* Fix typo
* apply suggestion
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-12 22:48:39 +05:30
Suraj Patil
f51188cbe7
[examples/run_s2s] remove task_specific_params and update rouge computation ( #10133 )
...
* fix rouge metrics and task specific params
* fix typo
* round metrics
* typo
* remove task_specific_params
2021-02-12 17:18:21 +05:30
Sylvain Gugger
31245775e5
Add SageMakerTrainer for model parallelism ( #10122 )
...
* Refactor things out of main train
* Store signature
* Add SageMakerTrainer
* Init + Copyright
* Address review comments
2021-02-11 18:44:18 -05:00
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab ( #10130 )
...
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
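Note on #10130: the last bullet derives local_rank from the LOCAL_RANK environment variable. A minimal sketch of that idea, assuming the launcher exports LOCAL_RANK for each process; the helper name `get_local_rank` and the -1 fallback are illustrative and not taken from the commit.

```python
import os


def get_local_rank(default: int = -1) -> int:
    """Hypothetical helper: read the per-node process rank from LOCAL_RANK.

    Falls back to `default` when the variable is not set, e.g. in a
    notebook that was not started by a distributed launcher.
    """
    return int(os.environ.get("LOCAL_RANK", default))
```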
Sylvain Gugger
6710d1d5ef
Typo fix
2021-02-11 15:12:35 -05:00
Patrick von Platen
8e13b73593
Update README.md
2021-02-11 18:35:27 +03:00
Patrick von Platen
d6b4f48ecb
Update ADD_BIG_BIRD.md
2021-02-11 18:34:17 +03:00
Patrick von Platen
495c157d6f
[Wav2Vec2] Improve Tokenizer & Model for batched inference ( #10117 )
...
* save intermediate
* finish batch the same as fairseq
* add normalization
* fix batched input
* add better comment
* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py
* add nice docstring
* add tokenizer tests
* make all slow tests pass
* finish PR
* correct import
2021-02-11 15:40:54 +03:00
Tanmay Thakur
2f3b5f4dcc
Add new community notebook - Blenderbot ( #10126 )
...
* Update:community.md, new nb add
* feat: updated grammar on nb description
* Update: Train summarizer for BlenderBotSmall
2021-02-11 12:53:40 +03:00
Qbiwan
8dcfaea08d
Update run_xnli.py to use Datasets library ( #9829 )
...
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* fix
* fix
* fix
* push
* fix
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* fix doc
* fix doc again
* fix doc again
* Apply suggestions from code review
* make style
* Proposal that should work
* Remove needless code
* Fix test
* Apply suggestions from code review
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* amend README
* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.
* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"], examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to accuracy metric, add condition for train_language is None when using dataset.load_dataset()
* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
2021-02-11 10:27:23 +05:30
Stas Bekman
77b862847b
[DeepSpeed] restore memory for evaluation ( #10114 )
...
* free up memory at the end of train
* rework tests
* consistent formatting
* correction
2021-02-10 09:09:48 -08:00
Suraj Patil
c130e67dce
remove adjust_logits_during_generation method ( #10087 )
...
* add forced logits processors
* delete adjust_logits method
* add forced_eos_token_id argument in config
* add tests for forced logits processors
* update gen utils tests
* add forced option to tf generate
* remove adjust_logits method from tf models
* update adjust_logits for marian
* delete _force_token_id_to_be_generated method
* style
* import warnings
* pass max_length to _get_logits_processor
* set forced_eos_token_id to None
* set forced attributes in conf utils
* typo
* fix rag generate
* add forced_eos_token_id in rag config
* remove force_bos_token_to_be_generated from BartConfig
* remove _force_token_ids_generation from FSMT
* nit
* fix negative constant
* apply suggestions from code review
2021-02-10 22:39:09 +05:30
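Note on #10087: the bullets replace a per-model adjust_logits method with "forced" logits processors driven by forced_eos_token_id. A rough sketch of what forcing EOS on the final step can look like, written in plain Python over a list of scores; the function name and signature are hypothetical and this is not the library's implementation.

```python
import math


def force_eos_at_max_length(scores, cur_len, max_length, eos_token_id):
    """Hypothetical sketch: on the last generation step, allow only EOS.

    `scores` is a list of per-token scores for one sequence. Before the
    final step the scores are returned unchanged (as a copy).
    """
    if cur_len != max_length - 1:
        return list(scores)
    # Mask every token, then make EOS the only finite-scoring choice.
    forced = [-math.inf] * len(scores)
    forced[eos_token_id] = 0.0
    return forced
```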
Julien Plu
22a32cf485
Fix TF LED/Longformer attentions computation ( #10007 )
...
* Fix test
* Remove commented test
* Fix name
* Apply style
* Fix check copies
* Remove prints
* Restore boolean
* Fix reshape
2021-02-10 10:58:37 -05:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF ( #10119 )
2021-02-10 10:50:00 -05:00
Stas Bekman
937f67074d
add deepspeed fairscale ( #10116 )
2021-02-10 03:12:27 -05:00
Stas Bekman
d478257d9b
[CI] build docs faster ( #10115 )
...
I assume the CI machine should have at least 4 cores, so let's build docs faster
2021-02-10 03:02:39 -05:00
Stas Bekman
7c07a47dfb
[DeepSpeed docs] new information ( #9610 )
...
* how to specify a specific gpu
* new paper
* expand on buffer sizes
* style
* where to find config examples
* specific example
* small updates
2021-02-09 22:16:20 -08:00
Anthony MOI
1fbaa3c117
Fix tokenizers training in notebook ( #10110 )
2021-02-09 21:48:22 -05:00
Shiva Zamani
85395e4901
Remove speed metrics from default compute objective ( #10107 )
2021-02-09 19:03:02 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
...
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
abhishek thakur
480a9d6ba0
Fix TFConvBertModelIntegrationTest::test_inference_masked_lm Test ( #10104 )
2021-02-09 20:22:54 +01:00
Sylvain Gugger
0c3d23dff7
Add patch releases to the doc
2021-02-09 14:17:09 -05:00
Suraj Patil
3e0c62b611
[RAG] fix generate ( #10094 )
...
* fix rag generate and tests
* put back adjust_logits_during_generation
* tests are okay
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-09 21:57:38 +03:00
Patrick von Platen
226973a9c5
fix import ( #10103 )
2021-02-09 21:43:41 +03:00
Patrick von Platen
4cda2d73ef
Update ADD_BIG_BIRD.md
2021-02-09 19:58:35 +03:00
Julien Plu
b82fe7d258
Replace strided slice with tf.expand_dims ( #10078 )
...
* Replace tf.newaxis -> tf.expand_dims
* Fix tests
* Fix tests
* Use reshape when a tensor needs a double expand
* Fix GPT2
* Fix GPT2
2021-02-09 11:48:28 -05:00
Daniel Stancl
e7381c4596
Add head_mask and decoder_head_mask to TF LED ( #9988 )
...
* Add head masking to TF LED
* Add head_mask to Longformer + one doc piece to LED
* Fix integration tests
2021-02-09 11:45:18 -05:00
Sylvain Gugger
77c0ce8c0c
Fix some edge cases in report_to and add deprecation warnings ( #10100 )
2021-02-09 10:38:12 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation ( #10092 )
...
* Enable propagation by default
* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
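Note on #10092: "propagation" here refers to whether a library logger hands its records up to the root logger's handlers. The sketch below shows the mechanism with the standard library only; the logger name "mylib" and the `ListHandler` class are made up for illustration and do not come from the commit.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Handler attached to the root logger, as an application might do.
root_handler = ListHandler()
logging.getLogger().addHandler(root_handler)

lib_logger = logging.getLogger("mylib")  # hypothetical library logger
lib_logger.setLevel(logging.INFO)

lib_logger.propagate = True   # records travel up to the root handlers
lib_logger.info("seen by root")

lib_logger.propagate = False  # records stop at the library logger
lib_logger.info("not seen by root")
```

With propagation enabled, the application's own root handlers receive the library's records; with it disabled, only handlers attached to the library logger do.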
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions ( #10085 )
...
* add do_predict, pass eval_beams during eval
* update help
* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Julien Plu
c6d5e56595
Fix naming ( #10095 )
2021-02-09 06:10:31 -05:00
abhishek thakur
4ed763779e
Fix example in Wav2Vec2 documentation ( #10096 )
...
* Fix example in Wav2Vec2 documentation
* fix style
2021-02-09 06:07:56 -05:00
Lysandre
bf1a06a437
Docs for v4.3.1 release
2021-02-09 10:02:50 +01:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC ( #10089 )
...
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
2021-02-09 03:49:02 -05:00
Lysandre
ba542ffb49
Fix deployment script
2021-02-09 08:43:00 +01:00
sandip
263fac71a2
Integration test for electra model ( #10073 )
2021-02-08 15:42:25 -05:00
Stas Bekman
781220acab
transition to new tests dir ( #10080 )
2021-02-08 12:41:52 -08:00
demSd
84acf0c7bb
remove token_type_ids from TokenizerBertGeneration output ( #10070 )
2021-02-08 13:05:32 -05:00
Juan Cruz-Benito
e4bf9910dc
Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py ( #10066 )
...
* Removing run_pl_glue.py from seq classification docs
* Adding run_tf_text_classification.py
* Using :prefix_link: to refer local files
* Applying "make style" to the branch
* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removing last underscores
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-08 13:04:21 -05:00
Lysandre
0dd579c9cf
Docs for v4.3.0
2021-02-08 18:53:24 +01:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests ( #10039 )
...
* deepspeed bug fixes and tests
* manual wrap?
2021-02-08 09:44:02 -08:00
Anthony MOI
f285e4c3ad
Update tokenizers requirement ( #10077 )
2021-02-08 12:27:26 -05:00
noise-field
ddaafd78fb
Fix mlflow param overflow clean ( #10071 )
...
* Unify logging with f-strings
* Get limits from MLflow rather than hardcode
* Add a check for parameter length overflow
Also constants are marked as internal
* Don't stop run in on_train_end
This causes bad behaviour when there is a separate validation step:
validation gets recorded as a separate run.
* Fix style
2021-02-08 11:58:02 -05:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics ( #10046 )
...
* replace -100 token ids with the tokenizer pad_id for compute_metrics
* fixed typo for label_ids
2021-02-08 10:08:16 -05:00
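Note on #10046: -100 is the label value that the loss ignores, but it is not a valid token id, so label ids need it swapped for the pad id before they are decoded for metrics. A minimal sketch in plain Python, assuming labels arrive as nested lists of ints; the function name `replace_ignore_index` is illustrative and not from the commit.

```python
IGNORE_INDEX = -100  # label value skipped by the loss


def replace_ignore_index(label_ids, pad_token_id):
    """Hypothetical helper: swap every -100 for the tokenizer's pad id.

    Returns a new nested list; the input is left untouched.
    """
    return [
        [pad_token_id if tok == IGNORE_INDEX else tok for tok in row]
        for row in label_ids
    ]
```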
Lysandre Debut
c9df1b1d53
Model templates ( #10072 )
2021-02-08 09:07:02 -05:00
demSd
3b7e612a5e
Implementing the test integration of BertGeneration ( #9990 )
...
* claiming this issue
* Integration test for BertGeneration(Encoder and Decoder)
* fix code quality
2021-02-08 08:22:19 -05:00
Julien Plu
cdd8659231
Fix TF template ( #10069 )
...
* Fix template
* Fix template
2021-02-08 08:10:50 -05:00
Patrick von Platen
9e795eac88
fix bert2bert test ( #10063 )
2021-02-08 16:04:28 +03:00
Julien Plu
31563e056d
Restore TF embeddings and attention layers to their previous version ( #9890 )
...
* Refactor BERT
* Restore all the concerned models
* Remove print
* Update template
* Apply Sylvain's and Morgan's comments
* Fix cast
* Put the cast inside call
* Remove cond in embeddings
* Fix funnel
* Restore previous dot product (attention_scores) computation
* Add ConvBERT and BART
* Make all the S2S models ONNX compliant
* Fix test
* Fix check copies
2021-02-08 14:36:30 +03:00
Julien Plu
8bb52bd240
Disable temporarily too slow tests (Longformer/LED) ( #10062 )
...
* Disable temporarily too slow tests
* Fix style
* Fix template
2021-02-08 12:32:31 +01:00