Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF ( #10119 )
2021-02-10 10:50:00 -05:00
Stas Bekman
937f67074d
add deepspeed fairscale ( #10116 )
2021-02-10 03:12:27 -05:00
Stas Bekman
d478257d9b
[CI] build docs faster ( #10115 )
I assume the CI machine should have at least 4 cores, so let's build docs faster
2021-02-10 03:02:39 -05:00
Stas Bekman
7c07a47dfb
[DeepSpeed docs] new information ( #9610 )
* how to specify a specific gpu
* new paper
* expand on buffer sizes
* style
* where to find config examples
* specific example
* small updates
2021-02-09 22:16:20 -08:00
Anthony MOI
1fbaa3c117
Fix tokenizers training in notebook ( #10110 )
2021-02-09 21:48:22 -05:00
Shiva Zamani
85395e4901
Remove speed metrics from default compute objective ( #10107 )
2021-02-09 19:03:02 -05:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
abhishek thakur
480a9d6ba0
Fix TFConvBertModelIntegrationTest::test_inference_masked_lm Test ( #10104 )
2021-02-09 20:22:54 +01:00
Sylvain Gugger
0c3d23dff7
Add patch releases to the doc
2021-02-09 14:17:09 -05:00
Suraj Patil
3e0c62b611
[RAG] fix generate ( #10094 )
* fix rag generate and tests
* put back adjust_logits_during_generation
* tests are okay
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-09 21:57:38 +03:00
Patrick von Platen
226973a9c5
fix import ( #10103 )
2021-02-09 21:43:41 +03:00
Patrick von Platen
4cda2d73ef
Update ADD_BIG_BIRD.md
2021-02-09 19:58:35 +03:00
Julien Plu
b82fe7d258
Replace strided slice with tf.expand_dims ( #10078 )
* Replace tf.newaxis -> tf.expand_dims
* Fix tests
* Fix tests
* Use reshape when a tensor needs a double expand
* Fix GPT2
* Fix GPT2
2021-02-09 11:48:28 -05:00
Daniel Stancl
e7381c4596
Add head_mask and decoder_head_mask to TF LED ( #9988 )
* Add head masking to TF LED
* Add head_mask to Longformer + one doc piece to LED
* Fix integration tests
2021-02-09 11:45:18 -05:00
Sylvain Gugger
77c0ce8c0c
Fix some edge cases in report_to and add deprecation warnings ( #10100 )
2021-02-09 10:38:12 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation ( #10092 )
* Enable propagation by default
* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions ( #10085 )
* add do_predict, pass eval_beams during eval
* update help
* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Julien Plu
c6d5e56595
Fix naming ( #10095 )
2021-02-09 06:10:31 -05:00
abhishek thakur
4ed763779e
Fix example in Wav2Vec2 documentation ( #10096 )
* Fix example in Wav2Vec2 documentation
* fix style
2021-02-09 06:07:56 -05:00
Lysandre
bf1a06a437
Docs for v4.3.1 release
2021-02-09 10:02:50 +01:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC ( #10089 )
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
2021-02-09 03:49:02 -05:00
Lysandre
ba542ffb49
Fix deployment script
2021-02-09 08:43:00 +01:00
sandip
263fac71a2
Integration test for electra model ( #10073 )
2021-02-08 15:42:25 -05:00
Stas Bekman
781220acab
transition to new tests dir ( #10080 )
2021-02-08 12:41:52 -08:00
demSd
84acf0c7bb
remove token_type_ids from TokenizerBertGeneration output ( #10070 )
2021-02-08 13:05:32 -05:00
Juan Cruz-Benito
e4bf9910dc
Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py ( #10066 )
* Removing run_pl_glue.py from seq classification docs
* Adding run_tf_text_classification.py
* Using :prefix_link: to refer local files
* Applying "make style" to the branch
* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removing last underscores
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-08 13:04:21 -05:00
Lysandre
0dd579c9cf
Docs for v4.3.0
2021-02-08 18:53:24 +01:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests ( #10039 )
* deepspeed bug fixes and tests
* manual wrap?
2021-02-08 09:44:02 -08:00
Anthony MOI
f285e4c3ad
Update tokenizers requirement ( #10077 )
2021-02-08 12:27:26 -05:00
noise-field
ddaafd78fb
Fix mlflow param overflow clean ( #10071 )
* Unify logging with f-strings
* Get limits from MLflow rather than hardcode
* Add a check for parameter length overflow
Also constants are marked as internal
* Don't stop run in on_train_end
This causes bad behaviour when there is a separate validation step:
validation gets recorded as a separate run.
* Fix style
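The overflow check above can be sketched as follows. This is an illustrative sketch, not the actual `MlflowCallback` code; `MAX_PARAM_VAL_LENGTH` is an assumed constant (the real limit comes from MLflow's own internals, per the "Get limits from MLflow" bullet):

```python
# Hypothetical sketch: MLflow rejects param values longer than a fixed
# limit, so overly long values (e.g. a dumped model config) are set aside
# instead of crashing the run when logged.

MAX_PARAM_VAL_LENGTH = 250  # assumed limit for illustration only

def filter_loggable_params(params, max_len=MAX_PARAM_VAL_LENGTH):
    """Split params into those safe to log and those that overflow."""
    loggable, skipped = {}, {}
    for name, value in params.items():
        if len(str(value)) > max_len:
            skipped[name] = value  # too long for an MLflow param value
        else:
            loggable[name] = value
    return loggable, skipped
```

The callback would log only `loggable` and emit a warning listing the keys in `skipped`.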
2021-02-08 11:58:02 -05:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics ( #10046 )
* replace -100 token ids with the tokenizer pad_id for compute_metrics
* fixed typo for label_ids
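The replacement described above can be sketched in a few lines (a minimal stand-in for the example script's code, which operates on arrays; the helper name is hypothetical). Padded label positions are set to -100 so the loss ignores them, but `tokenizer.decode` cannot handle -100, so they are mapped back to `pad_token_id` before decoding in `compute_metrics`:

```python
# Sketch: swap the loss-ignore index back to the tokenizer's pad id
# so the label sequences can be decoded for metric computation.

def replace_ignore_index(label_ids, pad_token_id, ignore_index=-100):
    """Replace ignore-index entries with the tokenizer's pad id."""
    return [
        [pad_token_id if tok == ignore_index else tok for tok in seq]
        for seq in label_ids
    ]

labels = [[42, 7, -100, -100], [13, -100, -100, -100]]
print(replace_ignore_index(labels, pad_token_id=0))
# [[42, 7, 0, 0], [13, 0, 0, 0]]
```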
2021-02-08 10:08:16 -05:00
Lysandre Debut
c9df1b1d53
Model templates ( #10072 )
2021-02-08 09:07:02 -05:00
demSd
3b7e612a5e
Implementing the test integration of BertGeneration ( #9990 )
* claiming this issue
* Integration test for BertGeneration(Encoder and Decoder)
* fix code quality
2021-02-08 08:22:19 -05:00
Julien Plu
cdd8659231
Fix TF template ( #10069 )
* Fix template
* Fix template
2021-02-08 08:10:50 -05:00
Patrick von Platen
9e795eac88
fix bert2bert test ( #10063 )
2021-02-08 16:04:28 +03:00
Julien Plu
31563e056d
Restore TF embeddings and attention layers to their previous version ( #9890 )
* Refacto BERT
* Restore all the concerned models
* Remove print
* Update template
* Apply Sylvain's and Morgan's comments
* Fix cast
* Put the cast inside call
* Remove cond in ebds
* Fix funnel
* Restore previous dot product (attention_scores) computation
* Add ConvBERT and BART
* Make all the S2S models ONNX compliant
* Fix test
* Fix check copies
2021-02-08 14:36:30 +03:00
Julien Plu
8bb52bd240
Disable temporarily too slow tests (Longformer/LED) ( #10062 )
* Disable temporarily too slow tests
* Fix style
* Fix template
2021-02-08 12:32:31 +01:00
Nicolas Patry
b1aa4982cd
Cleaning up ConversationalPipeline to support more than DialoGPT. ( #10002 )
* Cleaning up `ConversationalPipeline` to support more than DialoGPT.
Currently ConversationalPipeline was heavily biased towards DialoGPT,
which is the default model for this pipeline.
This PR proposes changes to put back the modifications specific to
DialoGPT into tokenizer-specific behavior wherever possible, by
creating a `_build_conversation_input_ids` function that takes a
conversation as input and returns a list of ints corresponding
to the tokens. It feels natural to put it here because all models
probably have different strategies to build input_ids from the
full conversation, and it's the tokenizer's job to transform strings
into tokens (and vice versa).
If `_build_conversation_input_ids` is missing, previous behavior is
used so we don't break anything so far (except for blenderbot where it's a fix).
This PR also contains a fix for too long inputs. There used
to be dead code for trying to limit the size of incoming input.
The introduced fix is that we truncate
within `_build_conversation_input_ids` to `tokenizer.model_max_length`.
It corresponds to the intent of the removed dead code and is actually
better because it corresponds to `model_max_length` which is different
from `max_length` (which is a default parameter for `generate`).
- Removed `history` logic from the Conversation as it's not relevant
anymore because tokenization logic has been moved to tokenizer.
And tokenizer cannot save any cache, and conversation cannot know
what is relevant or not.
Also it's not usable from `blenderbot` because the input_ids are
not append-only (the EOS token is always at the end).
- Added `iter_texts` method on `Conversation` because all
the code was littered with some form of this iteration of
past/generated_responses.
* Removing torch mention in types.
* Adding type checking to `_build_conversation_input_ids`.
* Fixing import in strings.
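The hook described above can be illustrated with a toy tokenizer. This is a sketch of the pattern only, not the transformers implementation: the tokenizer class, its `encode` stand-in, and the (speaker, text) pair format are all hypothetical; the truncation to `model_max_length` keeping the most recent tokens mirrors the DialoGPT-style strategy the PR describes:

```python
# Toy sketch of _build_conversation_input_ids: the tokenizer itself turns
# the full conversation into ids, appending EOS after each turn, then
# truncates to model_max_length, keeping only the most recent tokens.

class ToyConversationTokenizer:
    eos_token_id = 0
    model_max_length = 8

    def encode(self, text):
        # stand-in for real tokenization: one fake token per word,
        # using word length so ids stay deterministic and nonzero
        return [len(w) for w in text.split()]

    def _build_conversation_input_ids(self, conversation):
        input_ids = []
        for is_user, text in conversation:  # (speaker, text) pairs
            input_ids.extend(self.encode(text) + [self.eos_token_id])
        # keep only the most recent tokens if the history is too long
        return input_ids[-self.model_max_length:]
```

A real tokenizer would implement the same method with its own separator and truncation strategy; the pipeline falls back to the previous behavior when the method is absent.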
2021-02-08 14:29:07 +03:00
Lysandre Debut
ae37ceacbd
Fix typo ( #10064 )
2021-02-08 06:02:05 -05:00
Patrick von Platen
9a0399e18d
fix bart tests ( #10060 )
2021-02-08 13:25:09 +03:00
Sylvain Gugger
b01483faa0
Truncate max length if needed in all examples ( #10034 )
2021-02-08 05:03:55 -05:00
Sylvain Gugger
45aaf5f7ab
A few fixes in the documentation ( #10033 )
2021-02-08 05:02:01 -05:00
Sylvain Gugger
04fd783cc5
Check copies match full class/function names ( #10030 )
2021-02-08 04:58:25 -05:00
Lysandre Debut
d51302cca0
Fix slow dpr test ( #10059 )
* Correct cast to device
* Comment back the slow test
2021-02-08 04:43:25 -05:00
sandip
12e44af5d3
Integration test for FlauBert ( #10022 )
2021-02-08 04:36:50 -05:00
Stas Bekman
24db8cc329
Can't mix --fp16 and --device cpu ( #10041 )
2021-02-07 17:54:20 -08:00
Stas Bekman
769948fad2
json to jsonlines, and doc, and typo ( #10043 )
2021-02-07 17:51:34 -08:00
Stas Bekman
8ea412a86f
[examples] make run scripts executable ( #10037 )
* make executable
* make executable
* same for the template
* cleanup
2021-02-05 15:51:18 -08:00
Suraj Patil
1cd16512dc
[examples/seq2seq] support label smoothing ( #9844 )
* add prepare_decoder_input_ids_from_labels in s2s models
* support lbl smoothing and enc/emb freezing
* fix freezing
* use pad_token_id from config
* remove embed freezing and add warning
* prepare decoder_input_ids inside DataCollatorForSeq2Seq
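The shift-right idea behind `prepare_decoder_input_ids_from_labels` can be sketched on plain lists (the real models do this on tensors; the function body here is illustrative, not the repo's code). Decoder inputs are the labels shifted one step right, prefixed with `decoder_start_token_id`, with the loss-ignore index mapped back to `pad_token_id` so only valid ids are fed to the decoder:

```python
# Sketch: build decoder_input_ids from labels for teacher forcing.
# Labels keep -100 at padded positions for the loss; the shifted copy
# used as decoder input must contain real token ids instead.

def shift_tokens_right(labels, pad_token_id, decoder_start_token_id,
                       ignore_index=-100):
    shifted = [decoder_start_token_id] + labels[:-1]
    return [pad_token_id if tok == ignore_index else tok for tok in shifted]

print(shift_tokens_right([5, 6, 7, -100], pad_token_id=0,
                         decoder_start_token_id=2))
# [2, 5, 6, 7]
```

Doing this inside `DataCollatorForSeq2Seq` (as the last bullet describes) keeps label smoothing compatible with padded batches.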
2021-02-05 23:21:57 +05:30
Patrick von Platen
b9720dd6f2
Bump minimum Jax requirement to 2.8.0 ( #10027 )
* Bump minimum Jax requirement to 2.8.0
* update table
2021-02-05 16:20:26 +03:00