Commit Graph

6746 Commits

Author SHA1 Message Date
Patrick von Platen
bd8f6cafd4
make rag tests smaller (#10679) 2021-03-15 10:07:12 +03:00
Stas Bekman
4c32f9f26e
AdamW is now supported by default (#9624) 2021-03-12 13:40:07 -08:00
ymfa
fa35cda91e
Pass encoder outputs into GenerationMixin (#10599)
* Pass encoder_outputs into generate()

* Remove an if-statement

* Reformat

* Minimize changes to generate()

* Comment on input_ids
2021-03-12 21:43:11 +05:30
PaulLerner
00cad2e5c1
fix: #10628 expanduser path in TrainingArguments (#10660)
* fix: #10628 expanduser path in TrainingArguments

* docs: explain why we expand paths in TrainingArguments

* Style

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-03-12 09:18:19 -05:00
Sylvain Gugger
e8246f78f9
Add auto_wrap option in fairscale integration (#10673)
* Add auto_wrap option in fairscale integration

* Style
2021-03-12 07:50:20 -05:00
Lysandre Debut
184ef8ecd0
TensorFlow tests: having from_pt set to True requires torch to be installed. (#10664)
* TF model exists for Blenderbot 400M

* Marian

* RAG
2021-03-12 14:16:40 +03:00
Nicolas Patry
543d0549f8
Adding new parameter to generate: max_time. (#9846)
* [WIP] Adding new parameter to `generate`:  `max_time`.

Generation by tokens number is sometimes a bit clunky because we don't
know how many tokens are good enough or even how many tokens are in
the payload (for pipelines users for instance). This leads to hard
to understand behavior.

This PR proposes a new argument `max_time` which is a float of seconds
for the allowed time for `generate` to run on.
Ideally combinations of `max_tokens=None`, `max_time=2` could be used to
generate as many tokens as possible within time budget.

NB: Another possible approach consists of passing a callback to `generate`
  putting the caller in charge of the actual decision of when to stop
  generating tokens. It opens the door to 'which args should we pass'
  to this callback. It's hard to imagine other use-cases for this
  early stopping behavior than time (that are not already covered by
  parameters of generate)

* Revamp with StoppingCriteria

* Removing deprecated mentions.

* Forgot arguments to stopping criteria.

* Readding max_length it's not just used as a stopping criteria.

* Default value for `stopping_criteria`.

* Address @patrickvonplaten comments.

- More docstrings
- Actual doc
- Include in global namespace
- Remove TF work.

* Put back `max_length` (deprecation different PR).

* Doc quality.

* Fixing old behavior without `stopping_criteria` but with `max_length`.

Making sure we don't break that in the future.

* Adding more tests for possible inconsistencies between

`max_length` and `stopping_criteria`.

* Fixing the torch imports.
2021-03-12 10:11:50 +01:00
Lysandre Debut
ea46e3fa9c
Adjust loss difference (#10669) 2021-03-12 09:09:46 +03:00
Benjamin Fineran
c526bde319
fix typing error for HfArgumentParser for Optional[bool] (#10672)
* fix typing error for TrainingArguments Optional[bool]

* updating equality check for Optional[bool]
2021-03-11 17:42:54 -05:00
Sylvain Gugger
fa1a8d102f Tentative fix for HFArgumentParser in Python 3.8 2021-03-11 14:44:29 -05:00
WybeKoper
2f8485199c
Fix broken link (#10656)
* Fixed broken link

* fixed max length violation

Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-11 14:29:02 -05:00
jeswan
a01ea31b5c
Add DeBERTa to MODEL_FOR_PRETRAINING_MAPPING (#10668)
* add deberta to pretraining mapping

* add deberta_v2 to PRETRAINING_MAPPING
2021-03-11 13:56:47 -05:00
Lysandre Debut
9fbb4cdc80
Specify minimum version for sacrebleu (#10662) 2021-03-11 13:45:06 -05:00
Sylvain Gugger
fda703a553
Fix integration slow tests (#10670)
* PoC

* Fix slow tests for the PT1.8 Embedding problem
2021-03-11 13:43:53 -05:00
Funtowicz Morgan
3ab6820370
Onnx fix test (#10663)
* Allow to pass kwargs to model's from_pretrained when using pipeline.

* Disable the use of past_keys_values for GPT2 when exporting to ONNX.

* style

* Remove comment.

* Appease the documentation gods

* Fix style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-03-11 13:38:29 -05:00
Lysandre Debut
a637ae00c4
Fixes Pegasus tokenization tests (#10671) 2021-03-11 13:35:50 -05:00
Lysandre Debut
7e4428749c
Conversion to tensors requires padding (#10661) 2021-03-11 12:58:15 -05:00
Lysandre Debut
2adc8c926a
W2v2 test require torch (#10665)
* Adds a @require_torch to a test that requires it

* Tokenizer too

* Style
2021-03-11 12:56:12 -05:00
Suraj Patil
055ed78f52
[S2T] fix example in docs (#10667) 2021-03-11 22:43:37 +05:30
Sylvain Gugger
89693e170d
Remove special treatment for custom vocab files (#10637)
* Remove special path for custom vocab files

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Expand error message

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-03-11 11:11:56 -05:00
Lysandre Debut
6d9e11a193
S2S + M2M100 should be available in tokenization_auto (#10657)
* S2S + M2M100 should be available in tokenization_auto

* Requires sentencepiece

* SentencePiece for S2T as well :)
2021-03-11 09:53:36 -05:00
Patrick von Platen
602d63f05c
[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648)
* add conversion script

* add wav2vec2 xslr models

* finish

* Update docs/source/model_doc/xlsr_wav2vec2.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-11 17:44:18 +03:00
Sylvain Gugger
63c295ac05
Ensure metric results are JSON-serializable (#10632) 2021-03-11 09:00:23 -05:00
ArvidYin
27d9e05ce2
Update README.md (#10647)
correct spell error: 'nether'
2021-03-11 08:58:04 -05:00
Lysandre Debut
053f0197b8
merge_file -> merges_file (#10653) 2021-03-11 08:34:08 -05:00
Sylvain Gugger
26a33cfd8c
Document Trainer limitation on custom models (#10635) 2021-03-10 14:58:22 -05:00
Philipp Schmid
49c61a4ae7
Extend trainer logging for sm (#10633)
* renamed logging to hf_logging

* changed logging from hf_logging to logging and loggin to native_logging

* removed everything trying to fix import Trainer error

* adding imports again

* added custom add_handler function to logging.py

* make style

* added remove_handler

* added another conditional to assert
2021-03-10 20:53:49 +01:00
Sylvain Gugger
1aa9c13f70 Fix GPU tests with speech 2021-03-10 12:51:06 -05:00
Sylvain Gugger
2295d783d5
Copy tokenizer files in each of their repo (#10624)
* Move tokenizer files in each repo

* Fix mBART50 tests

* Fix mBART tests

* Fix Marian tests

* Update templates
2021-03-10 11:26:23 -05:00
Suraj Patil
d26b37e744
Speech2TextTransformer (#10175)
* s2t

* fix config

* conversion script

* fix import

* add tokenizer

* fix tok init

* fix tokenizer

* first version working

* fix embeds

* fix lm head

* remove extra heads

* fix convert script

* handle encoder attn mask

* style

* better enc attn mask

* override _prepare_attention_mask_for_generation

* handle attn_maks in encoder and decoder

* input_ids => input_features

* enable use_cache

* remove old code

* expand embeddings if needed

* remove logits bias

* masked_lm_loss => loss

* hack tokenizer to support feature processing

* fix model_input_names

* style

* fix error message

* doc

* remove inputs_embeds

* remove input_embeds

* remove unnecessary docstring

* quality

* SpeechToText => Speech2Text

* style

* remove shared_embeds

* subsample => conv

* remove Speech2TextTransformerDecoderWrapper

* update output_lengths formula

* fix table

* remove max_position_embeddings

* update conversion scripts

* add possibility to do upper case for now

* add FeatureExtractor and Processor

* add tests for extractor

* require_torch_audio => require_torchaudio

* add processor test

* update import

* remove classification head

* attention mask is now 1D

* update docstrings

* attention mask should be of type long

* handle attention mask from generate

* alwyas return attention_mask

* fix test

* style

* doc

* Speech2TextTransformer => Speech2Text

* Speech2TextTransformerConfig => Speech2TextConfig

* remove dummy_inputs

* nit

* style

* multilinguial tok

* fix tokenizer

* add tgt_lang setter

* save lang_codes

* fix tokenizer

* add forced_bos_token_id to tokenizer

* apply review suggestions

* add torchaudio to extra deps

* add speech deps to CI

* fix dep

* add libsndfile to ci

* libsndfile1

* add speech to extras all

* libsndfile1 -> libsndfile1

* libsndfile

* libsndfile1-dev

* apt update

* add sudo to install

* update deps table

* install libsndfile1-dev on CI

* tuple to list

* init conv layer

* add model tests

* quality

* add integration tests

* skip_special_tokens

* add speech_to_text_transformer in toctree

* fix tokenizer

* fix fp16 tests

* add tokenizer tests

* fix copyright

* input_values => input_features

* doc

* add model in readme

* doc

* change checkpoint names

* fix copyright

* fix code example

* add max_model_input_sizes in tokenizer

* fix integration tests

* add do_lower_case to tokenizer

* remove clamp trick

* fix "Add modeling imports here"

* fix copyrights

* fix tests

* SpeechToTextTransformer => SpeechToText

* fix naming

* fix table formatting

* fix typo

* style

* fix typos

* remove speech dep from extras[testing]

* fix copies

* rename doc file,

* put imports under is_torch_available

* run feat extract tests when torch is available

* dummy objects for processor and extractor

* fix imports in tests

* fix import in modeling test

* fxi imports

* fix torch import

* fix imports again

* fix positional embeddings

* fix typo in import

* adapt new extractor refactor

* style

* fix torchscript test

* doc

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix docs, copied from, style

* fix docstring

* handle imports

* remove speech from all extra deps

* remove s2t from seq2seq lm mapping

* better names

* skip training tests

* add install instructions

* List => Tuple

* doc

* fix conversion script

* fix urls

* add instruction for libsndfile

* fix fp16 test

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Sylvain Gugger
efb5c0a453
Add new GLUE example with no Trainer. (#10555)
* Add new GLUE example with no Trainer.

* Style

* Address review comments
2021-03-10 09:29:19 -05:00
Suraj Patil
44f64132a5
remove final_logits_bias (#10606) 2021-03-10 09:52:31 +05:30
Allen Wang
6f52fce673
Fixes an issue in text-classification where MNLI eval/test datasets are not being preprocessed. (#10621)
* Fix MNLI tests

* Linter fix
2021-03-09 22:13:45 -05:00
Sylvain Gugger
72d9e039f9
Fix tests of TrainerCallback (#10615)
* Fix tests of TrainerCallback

* Update tests/test_trainer_callback.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-09 16:25:32 -05:00
Sylvain Gugger
0d909f6bd8
Fairscale FSDP fix model save (#10596)
* Hotfix fairscale FSDP

* Evaluation works

* Save on process zero
2021-03-09 14:42:07 -05:00
Bhadresh Savani
ac17f71159
added max_sample args and metrics changes (#10602) 2021-03-09 12:06:56 -05:00
Philipp Schmid
c19c811a2d
Trigger add sm information (#10610)
* added sm to ua

* update id

* removed id

* removed comments

* added env variable

* changed variable name

* make quality happy

* added sguggers feedback

* make styling happy and remove brackets

* added sm to ua

* update id

* removed id

* removed comments

* added env variable

* changed variable name

* make quality happy

* added sguggers feedback

* make styling happy and remove brackets
2021-03-09 17:31:45 +01:00
Suraj Patil
20c10258a4
layerdrop 0 (#10604) 2021-03-09 17:35:07 +03:00
Lysandre
95ab06778c Update cache version for github actions 2021-03-09 07:10:58 -05:00
Patrick von Platen
9a06b6b11b
[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor (#10594)
* save first version

* finish refactor

* finish refactor

* correct naming

* correct naming

* shorter names

* Update src/transformers/feature_extraction_common_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change name

* finish

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-09 12:16:59 +03:00
Stas Bekman
b6a28e9ac9
[docs] How to solve "Title level inconsistent" sphinx error (#10600)
* How to solve: Title level inconsistent

* list chars
2021-03-08 20:16:33 -08:00
Lysandre Debut
546cbe7e9e
Speedup tf tests (#10601)
* Pipeline tests should be slow

* Temporarily mark some tests as slow

* Temporarily mark Barthez tests as slow
2021-03-08 21:44:07 -05:00
Ratthachat (Jung)
696e8a4365
Add TFRag (#9002)
* Create modeling_tf_dpr.py

* Add TFDPR

* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot

last commit accidentally deleted these 4 lines, so I recover them back

* Add TFDPR

* Add TFDPR

* clean up some comments, add TF input-style doc string

* Add TFDPR

* Make return_dict=False as default

* Fix return_dict bug (in .from_pretrained)

* Add get_input_embeddings()

* Create test_modeling_tf_dpr.py

The current version is already passed all 27 tests!
Please see the test run at : 
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing

* fix quality

* delete init weights

* run fix copies

* fix repo consis

* del config_class, load_tf_weights

They shoud be 'pytorch only'

* add config_class back

after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion

* newline after .. note::

* import tf, np (Necessary for ModelIntegrationTest)

* slow_test from_pretrained with from_pt=True

At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug

* Add simple TFDPRModelIntegrationTest

Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet

* upload correct tf model

* remove position_ids as missing keys

* create modeling_tf_rag

* add tests for tf

* add tf tests

* revert wrong pt commit

* further refactor

* further refactor

* refactor

* Update modeling_tf_rag.py

- input_processing
- fix prepare_input_for_generation (mostly fix generate bug)
- bring back from_pretrained hack in order to test generate

* delete colab pieces of code

* Show case of greedy "generate"

Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.

* cosmetic update

* correct typos

* update

* push some progress

* make easy check

* fix rag save from pretrained

* Update src/transformers/modeling_tf_utils.py

* remove commented out lines

* delete unnecessary lines

* add simple test case for nq_checkpoint

Add nq_checkpoint test to show that current version without hack still fails

* temporarily put ugly hack back again

* Add TFRagSequenceForGeneration!!

* __init__.py , import TFRagSequenceForGeneration

* Add TFRagSequence tests!

* rag init.py - add TFRagSequenceForGeneration

* fix from_pretrained

* fix prepare_inputs_for_generation

* Beam search for RagToken!

* minor clean up

* add tf.cast in TFRagModel

* More tf.cast

* Add all remaining tests (still have issues)

* delete all T5 related

* make style

* fix load weight prefix

* fix bart

* fix return_dict for tf_rag

make all tests pass .. Hooray

* fix some tests

* fix code quality

* fix qualtiy check

* finish tests tf rag

* add tf rag to docs

* remove TFT5 from docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove TFT5 from docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Delete outdated comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* improve doc strings

* add generative model classes

* fix adjust token logic

* refactor generate for TFRag

* using shape_list, not _get_shape

Co-authored-by: Julien Plu <plu.julien@gmail.com>

* axis=[1]->axis=1

* delete NEED_HELP comment

* improve readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Indicating model is in a developing state in docstrings

As suggested by Julien

* small last changes

* apply sylvains suggestions

* finish tf rag

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-09 00:49:51 +03:00
Sylvain Gugger
3ced9b3eb9
Check layer types for Optimizer construction (#10598)
* Check layer types for Optimizer construction

* Duplicate class
2021-03-08 16:40:11 -05:00
Sylvain Gugger
821d518e03 Revert "Tests"
This reverts commit b35e7b68ca.
2021-03-08 16:05:55 -05:00
Sylvain Gugger
4196bfeda0 Revert "Style"
This reverts commit a8ec52efc2.
2021-03-08 16:05:52 -05:00
Sylvain Gugger
a8ec52efc2 Style 2021-03-08 16:04:46 -05:00
Sylvain Gugger
b35e7b68ca Tests 2021-03-08 16:04:30 -05:00
Stas Bekman
f284089ec4
[examples tests on multigpu] resolving require_torch_non_multi_gpu_but_fix_me (#10561)
* batch 1

* this is tpu

* deebert attempt

* the rest
2021-03-08 11:11:40 -08:00
Bhadresh Savani
dfd16af832
Added max_sample_ arguments (#10551)
* reverted changes of logging and saving metrics

* added max_sample arguments

* fixed code

* white space diff

* reformetting code

* reformatted code
2021-03-08 13:57:10 -05:00