Commit Graph

7165 Commits

Author SHA1 Message Date
Nicolas Patry
aad95c7cde
Removed max_length from being mandatory within generate. (#11314)
* Removed `max_length` from being mandatory within `generate`.

- Moving on to fully using `StoppingCriteria` for `greedy` and `sample`
modes.
- `max_length` still used for `beam_search` and `group_beam_search`
(Follow up PR)
- Fixes a bug with MaxLengthStoppingCriteria (we should stop as soon a
we hit the max_length, the comparison needs to be or equal, that affects
the tests).
- Added options to use `logits_processor` and `stopping_criteria`
directly within `generate` function (so some users can define their own
`logits_processor` and `stopping_criteria`).
- Modified the backward compat tests to make sure we issue a warning.

* Fix `max_length` argument in `generate`.

* Moving validate to being functional.

- Renamed `smax_length` to `stoppping_max_length`.

* Removing `logits_processor` and `stopping_criteria` from `generate`
arguments.

* Deepcopy.

* Fix global variable name.
2021-04-21 11:56:45 +02:00
Yusuke Mori
95dab34d55
Add an error message that fires when Reformer is not in training mode, but one runs .backward() (#11117) 2021-04-21 00:23:37 +02:00
Sylvain Gugger
f1b938fda8
Update to use datasets remove_cloumns method (#11343)
* Update to use datasets remove_cloumns method

* Quality
2021-04-20 14:12:01 -04:00
Suraj Patil
cfd2eaa8cf
[GPTNeo] create local attention mask ones (#11335)
* create local attention mask ones

* remove old method, address patricks comment
2021-04-20 18:37:44 +05:30
Patrick von Platen
f464f10a2c
[Generate] Remove outdated code (#11331)
* remove update function

* update

* refactor more

* refactor
2021-04-20 15:16:02 +03:00
rajvi-k
bfd83c17a7
Added translation example script (#11196)
* initial changes

* modified evaluation

* updated evaluation

* updated evaluation on text translation example script

* added translation example script

* Formatted translation example script

* Reformatted translation example

* Fixed evaluation bug and added support for other tokenisers

* Fixed evaluation bug and added support for other tokenisers

* Added translation example script

* Formatted summarization example script

* Removed typos from summarization example script
2021-04-20 07:18:47 -04:00
Sylvain Gugger
c0328a6c26
Load checkpoint without re-creating the model (#11318) 2021-04-19 20:31:29 -04:00
Sylvain Gugger
95037a169f
[Trainer] Add a progress bar for batches skipped (#11324) 2021-04-19 19:04:52 -04:00
Stas Bekman
95ffbe1686
[Trainer] fix the placement on device with fp16_full_eval (#11322)
* fix the placement on device with fp16_full_eval

* deepspeed never goes on device
2021-04-19 11:55:33 -07:00
TAE YOUNGDON
3981ce3dd2
modify double considering special tokens in language_modeling.py (#11275)
* Update language_modeling.py

in "class TextDatasetForNextSentencePrediction(Dataset)", double considering "self.tokenizer.num_special_tokens_to_add(pair=True)" 

so, i remove self.block_size, and add parameter for "def create_examples_from_document". like "class LineByLineWithSOPTextDataset" do

* Update language_modeling.py
2021-04-19 11:24:43 -04:00
e
5a34d8d982
move device statements outside if statements (#11292) 2021-04-19 08:25:40 -04:00
Sylvain Gugger
d9c62047a8
Trainer support for IterableDataset for evaluation and predict (#11286)
* Bulk of the work

* Polish and tests

* Update QA Trainer

* Avoid breaking the predict method

* Deprecation warnings

* Store real eval dataloder

* Get eval dataset reference before wrap
2021-04-16 16:01:58 -04:00
Lysandre
e783ea7304 Fix failing workflows 2021-04-16 08:09:51 -04:00
Nicolas Patry
92970c0cb9
Enabling multilingual models for translation pipelines. (#10536)
* [WIP] Enabling multilingual models for translation pipelines.

* decoder_input_ids -> forced_bos_token_id

* Improve docstring.

* Rebase

* Fixing 2 bugs

- Type token_ids coming from `_parse_and_tokenize`
- Wrong index from tgt_lang.

* Fixing black version.

* Adding tests for _build_translation_inputs and add them for all
tokenizers.

* Mbart actually puts the lang code at the end.

* Fixing m2m100.

* Adding TF support to `deep_round`.

* Update src/transformers/pipelines/text2text_generation.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding one line comment.

* Fixing M2M100 `_build_translation_input_ids`, and fix the call site.

* Fixing tests + deep_round -> nested_simplify

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-16 11:31:35 +02:00
Lysandre Debut
5254220e7f
Workflow fixes (#11270) 2021-04-15 23:21:17 -04:00
Stas Bekman
dfc6dd8584
update dependency_versions_table (#11273)
missed this updating when bumped the version.
2021-04-15 19:10:29 -07:00
Sylvain Gugger
2550b41aa2
Tokenizer fast save (#11234)
* Save fast tokenizers in both formats

* Fix for HerBERT

* Proper fix

* Properly test new behavior
2021-04-15 09:32:32 -04:00
Sylvain Gugger
6e1ee47b36
Support for set_epoch (#11258) 2021-04-15 07:36:32 -04:00
Nicolas Patry
c3fcba3219
Adding pipeline task aliases. (#11247)
* Adding task aliases and adding `token-classification` and
`text-classification` tasks.

* Cleaning docstring.
2021-04-15 09:51:24 +02:00
Sylvain Gugger
aaaed56ffc
Trainer iterable dataset (#11254)
* IterableDatasetShard

* Test and integration in Trainer

* Update src/transformers/trainer_pt_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-14 17:02:26 -04:00
Stas Bekman
83206ca6a8
[deepspeed] test on one node 2 gpus max (#11237)
* test on one node 2 gpus max

* fix the other place

* refactor

* fix

* cleanup

* more exact version
2021-04-14 11:06:59 -07:00
Sylvain Gugger
25e1af36e0
Fix #10128 (#11248) 2021-04-14 11:47:54 -04:00
Stas Bekman
63ca402380
[troubleshooting] add 2 points of reference to the offline mode (#11236)
* add 2 points of reference to the offline mode

* link the new doc

* add error message

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* rename

* Trigger CI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-14 08:39:23 -07:00
Yusuke Mori
075e821d1d
Add prefix to examples in model_doc rst (#11226)
* Add prefix to examples in model_doc rst

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-14 10:58:55 -04:00
Thomas Wood
4670b57ce9
Fix dimention misspellings. (#11238)
* Update modeling_gpt_neo.py

dimention -> dimension

* Update configuration_speech_to_text.py

dimention -> dimension
2021-04-14 10:39:37 -04:00
Sudharsan S T
f25444cb22
Close open files to suppress ResourceWarning (#11240)
Co-authored-by: Sudharsan Thirumalai <sudharsan.t@sprinklr.com>
2021-04-14 10:31:04 -04:00
Lysandre Debut
7fe5aaa8b0
Stale bot updated (#10562)
* Updated stale bot

* Specify issue number

* Remove particular handling of assignees

* Unleash the stalebot

* Remove debug branch
2021-04-14 10:24:31 -04:00
Joel Stremmel
9337c6c668
make embeddings plural in warning message (#11228) 2021-04-14 10:13:25 -04:00
Nithin Holla
653076ca30
Save the Wav2Vec2 processor before training starts (#10910)
Co-authored-by: nithin19 <nithin@amberscript.com>
2021-04-14 14:52:06 +03:00
Stas Bekman
3d339ee659
[Deepspeed] zero3 tests band aid (#11235)
* temp band-aid

* style
2021-04-13 17:58:09 -04:00
Lysandre Debut
1ad7b0398c
Run CI on deepspeed and fairscale (#11172)
* Run CI on deepspeed and fairscale

* Test it on this branch :)

* Rename

* Update the CI image
2021-04-13 15:47:06 -04:00
Sylvain Gugger
f38cd4373f
Indent code block in the documentation (#11233)
* Indent code block

* Indent code blocks version 2

* Quality
2021-04-13 15:36:36 -04:00
Sylvain Gugger
9d8e8a8703
Avoid using no_sync on SageMaker DP (#11229) 2021-04-13 15:34:00 -04:00
Philipp Schmid
9fa2995993
added cache_dir=model_args.cache_dir to all example with cache_dir arg (#11220) 2021-04-13 18:35:18 +02:00
Sylvain Gugger
3312e96bfb
Doc check: a bit of clean up (#11224) 2021-04-13 12:14:25 -04:00
Suraj Patil
edca520d0f
Refactor GPT2 (#11225)
* refactor GPT2

* fix mlp and head pruning

* address Sylvains comments

* apply suggestion from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-13 21:15:24 +05:30
Sylvain Gugger
893e51a53f Document v4.5.1 2021-04-13 11:28:17 -04:00
Sylvain Gugger
81009b7a5c
Replace error by warning when loading an architecture in another (#11207)
* Replace error by warning when loading an architecture in another

* Style

* Style again

* Add a test

* Adapt old test
2021-04-13 10:33:52 -04:00
Yusuke Mori
22fa0a6004
Add documentation for BertJapanese (#11219)
* Start writing BERT-Japanese doc

* Fix typo, Update toctree

* Modify model file to use comment for document, Add examples

* Clean bert_japanese by make style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Split a big code block into two

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add prefix >>> to all lines in code blocks

* Clean bert_japanese by make fixup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-13 09:49:15 -04:00
Suraj Patil
896d7be974
fix docstrings (#11221) 2021-04-13 08:58:08 -04:00
Lysandre Debut
823df93955
Fix GPT-2 warnings (#11213)
* Fix GPT-2 warnings

* Update src/transformers/models/gpt2/modeling_gpt2.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-04-13 08:53:03 -04:00
Lysandre Debut
0cd89d8c83
Add Matt as the TensorFlow reference (#11212) 2021-04-13 08:52:30 -04:00
Ceyda Cinarel
7c205bf40c
wav2vec2 converter: create the proper vocab.json while converting fairseq wav2vec2 finetuned model (#11041)
* add vocab while converting wav2vec2 original finetuned model

* check save directory exists

* return_attention_mask fix

* quality
2021-04-13 15:54:33 +05:30
calpt
d49d3cf6d6
Use MSELoss in (M)BartForSequenceClassification (#11178) 2021-04-13 15:24:46 +05:30
Philipp Schmid
f243a5ec0d
Sagemaker test docs update for framework upgrade (#11206)
* increased train_runtime for model parallelism

* added documentation for framework upgrade
2021-04-12 19:08:33 -04:00
Lysandre Debut
74d7c24d8d
Import torch.utils.checkpoint in ProphetNet (#11214) 2021-04-12 18:56:17 -04:00
cronoik
38a10c6b52
Replaced which with who (#11183) 2021-04-12 18:08:28 -04:00
NielsRogge
9f1260971f
Add DeiT (PyTorch) (#11056)
* First draft of deit

* More improvements

* Remove DeiTTokenizerFast from init

* Conversion script works

* Add DeiT to ViT conversion script

* Add tests, add head model, add support for deit in vit conversion script

* Update model checkpoint names

* Update image_mean and image_std, set resample to bicubic

* Improve docs

* Docs improvements

* Add DeiTForImageClassificationWithTeacher to init

* Address comments by @sgugger

* Improve feature extractors

* Make fix-copies

* Minor fixes

* Address comments by @patil-suraj

* All models uploaded

* Fix tests

* Remove labels argument from DeiTForImageClassificationWithTeacher

* Fix-copies, style and quality

* Fix tests

* Fix typo

* Multiple docs improvements

* More docs fixes
2021-04-12 18:07:10 -04:00
Takuya Makino
cb251ba619
Fix typo (#11188) 2021-04-12 17:35:32 -04:00
fghuman
0c6fcd3034
Added documentation for data collator. (#10941)
* Added documentation for data collator.

* Update docs/source/data_collator.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Added documentation for data collator.

* Added documentation for the data collator.

* Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts.

* Update documentation for the data collator.

* Update documentation for the data collator.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>
2021-04-12 11:59:46 -04:00