Commit Graph

8442 Commits

Author SHA1 Message Date
NielsRogge
7375758bee
Fix tests (#14703) 2021-12-09 08:32:35 -05:00
Sylvain Gugger
68e53e6fcd
Add a job to test doc building (for realsies this time) (#14662) 2021-12-09 07:01:03 -05:00
Sylvain Gugger
e9800122a6 Add kenlm dep to missing tests 2021-12-08 19:59:44 -05:00
Yih-Dar
ee6674d450
Fix doc examples: name '...' is not defined (#14687)
* Fix doc examples: name '...' is not defined

* remove >>> and ... in some docstrings in visual_bert

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-08 16:39:35 -05:00
Sylvain Gugger
e6219320b9
Make MLuke tokenizer tests slow (#14690) 2021-12-08 15:59:57 -05:00
Sylvain Gugger
13186d7152
Move pyctcdecode (#14686)
* Move pyctcdecode dep

* Fix doc and last objects

* Quality

* Style

* Ignore this black
2021-12-08 15:41:58 -05:00
Stas Bekman
d104dd46d9
[trainer] support UserDict inputs (torch-nightly) (#14688) 2021-12-08 12:21:43 -08:00
Stas Bekman
1228661285
[bf16 support] tweaks (#14580)
* [bf16 support] tweaks

* corrections

Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
2021-12-08 11:33:24 -08:00
Yih-Dar
16870d114b
Fix wrong checkpoint paths in doc examples (#14685)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-08 14:25:48 -05:00
Sylvain Gugger
01b8cd5932
Revert open-in-colab and add perceiver (#14683) 2021-12-08 13:52:31 -05:00
Sylvain Gugger
f6b87c5f30
Fixes in init (#14681)
* Fixes in init

* Style
2021-12-08 13:42:22 -05:00
Dhruv Nair
fe06f8dcac
Improvements to Comet Integration (#14680)
* change args to address overwriting issue

* remove project name from args

* remove passing args as kwargs to experiment object

* remove passing args as kwargs to offline experiment

* fix offline directory assignment in experiment kwargs

* log checkpoint folder on training end

* log entire output_dir as asset folder

* log asset folder  recursively

* end experiment at the end of training

* clean up

* clean up

* Default to always log training assets to Comet when using CometCallback

* change logging training assets to be true when running callback setup

* fix so that experiment always ends when training ends

* styling and quality fixes

* update docstring for COMET_LOG_ASSETS environment variable

* run styling and quality checks

* clean up to docstring

* remove merge markers

* change asset logging to false to avoid hitting max assets per experiment limit

* update training asset description

* fix styling
2021-12-08 13:39:10 -05:00
Gaurang Tandon
4ea19de80c
fix: verify jsonlines file in run_translation (#14660) (#14661)
* fix: verify jsonl in run_translation (#14660)

* fix(run_translation.py): json/jsonl validation

Both json and jsonl are to be accepted as valid jsonlines file extension

* fix(run_translation.py): make black happy

* Ran make style
2021-12-08 13:25:30 -05:00
Sylvain Gugger
cf36f4d7a8
Convert tutorials (#14665)
* Convert a few docs

* And another

* Last tutorials

* New syntax for colab links

* Convert a few docs

* And another

* Last tutorials

* New syntax for colab links
2021-12-08 13:19:46 -05:00
lewtun
0f4e39c559
Revert "Added support for other features for already supported models (#14358)" (#14679)
This reverts commit 0c70f145d1.
2021-12-08 13:04:40 -05:00
Michael Benayoun
0c70f145d1
Added support for other features for already supported models (#14358)
* Added support for other features for already supported models

* Partial support for causal and seq2seq models

* Partial support for causal and seq2seq models

* OnnxSeq2SeqConfigWithPast to support seq2seq models

* Parameterized the onnx tests

* Restored run_mlm.py

* Restored run_mlm.py

* [WIP] BART update

* BART and MBART

* Added comments

* Another sequence length of the past_key_values
2021-12-08 18:39:56 +01:00
Patrick von Platen
ee4fa2e465
[AutoProcessor] Add Wav2Vec2WithLM & small fix (#14675)
* [AutoProcessor] Add Wav2Vec2WithLM & small fix

* revert line removal

* Update src/transformers/__init__.py

* add test

* up

* up

* small fix
2021-12-08 15:51:28 +01:00
Lysandre Debut
2294071a0c
Fix doc builder (#14676) 2021-12-08 09:14:36 -05:00
ZOHETH
fab3b518ef
fix deprecated tf method (#14671)
tf.matrix_band_part -> tf.linalg.band_part
2021-12-08 13:43:21 +00:00
NielsRogge
65b20b739b
Add Perceiver IO (#14487)
* First draft

* Style and remove mlm

* Make forward pass work

* More improvements

* More improvements

* Fix bug

* More improvements

* More improvements

* Add PerceiverTokenizer first draft

* Improve conversion script

* More improvements

* Make conversion script work for the encoder

* Make conversion script work with local pickle files

* Style & quality, fix-copies

* Add dummy input to conversion script

* Add absolute position embeddings to TextPreProcessor

* Make forward pass of encoder work

* More improvements

* Move text preprocessor to separate script

* More improvements

* More improvements

* Add post processor

* Make MLM model work

* Style

* Add PerceiverForMaskedLM

* Add PerceiverImagePreprocessor

* Make style

* Make PerceiverForImageClassification work

* More improvements

* More improvements

* Use tokenizer in conversion script

* Use PerceiverForMaskedLM in conversion script

* Define custom PerceiverModelOutput

* Improve PerceiverAttention to make it work for both MLM and image classification

* More improvements

* More improvements

* More improvements to the conversion script

* Make conversion script work for both MLM and image classification

* Add PerceiverFeatureExtractor

* More improvements

* Style and quality

* Add center cropping

* Fix bug

* Small fix

* Add print statement

* Fix bug in image preprocessor

* Fix bug with conversion script

* Make output position embeddings an nn.Parameter layer instead of nn.Embedding

* Comment out print statements

* Add position encoding classes

* More improvements

* Use position_encoding_kwargs

* Add PerceiverForImageClassificationFourier

* Make style & quality

* Add PerceiverForImageClassificationConvProcessing

* Style & quality

* Add flow model

* Move processors to modeling file

* Make position encodings modular

* Make basic decoder use modular position encodings

* Add PerceiverForOpticalFlow to conversion script

* Add AudioPreprocessor

* Make it possible for the basic decoder to use Fourier position embeddings

* Add PerceiverForMultimodalAutoencoding

* Improve model for optical flow

* Improve _build_network_inputs method

* Add print statement

* Fix device issue

* Fix device of Fourier embeddings

* Add print statements for debugging

* Add another print statement

* Add another print statement

* Add another print statement

* Add another print statement

* Improve PerceiverAudioPreprocessor

* Improve conversion script for multimodal modal

* More improvements

* More improvements

* Improve multimodal model

* Make forward pass multimodal model work

* More improvements

* Improve tests

* Fix some more tests

* Add output dataclasses

* Make more tests pass

* Add print statements for debuggin

* Add tests for image classification

* Add PerceiverClassifierOutput

* More improvements

* Make more tests pass for the optical flow model

* Make style & quality

* Small improvements

* Don't support training for optical flow model for now

* Fix _prepare_for_class for tests

* Make more tests pass, add some docs

* Add multimodal model to tests

* Minor fixes

* Fix tests

* Improve conversion script

* Make fixup

* Remove pos_dim argument

* Fix device issue

* Potential fix for OOM

* Revert previous commit

* Fix test_initialization

* Add print statements for debugging

* Fix print statement

* Add print statement

* Add print statement

* Add print statement

* Add print statement

* Add print statement

* Add print statement

* Remove need for output_shape

* Comment out output_shape

* Remove unnecessary code

* Improve docs

* Fix make fixup

* Remove PerceiverTextProcessor from init

* Improve docs

* Small improvement

* Apply first batch of suggestions from code review

* Apply more suggestions from code review

* Update docstrings

* Define dicts beforehand for readability

* Rename task to architecture in conversion script, include PerceiverModel in tests

* Add print statements for debugging

* Fix tests on GPU

* Remove preprocessors, postprocessors and decoders from main init

* Add integration test

* Fix docs

* Replace einops by torch

* Update for new docs frontend

* Rename PerceiverForImageClassification

* Improve docs

* Improve docs

* Improve docs of PerceiverModel

* Fix some more tests

* Improve center_crop

* Add PerceiverForSequenceClassification

* Small improvements

* Fix tests

* Add integration test for optical flow model

* Clean up

* Add tests for tokenizer

* Fix tokenizer by adding special tokens properly

* Fix CI
2021-12-08 14:20:34 +01:00
Patrick von Platen
961732c276
[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339)
* up

* up

* up

* make it cleaner

* correct

* make styhahalal

* add more tests

* finish

* small fix

* make style

* up

* tryout to solve cicrle ci

* up

* fix more tests

* fix more tests

* apply sylvains suggestions

* fix import

* correct docs

* add pyctcdecode only to speech tests

* fix more tests

* add tf, flax and pt tests

* add pt

* fix last tests

* fix more tests

* Apply suggestions from code review

* change lines

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* correct tests

* correct tests

* add doc string

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2021-12-08 12:07:54 +01:00
Nicolas Patry
2e12d90b9e
Fixing Dataset for TQA + token-classification. (#14658)
* Fixing Dataset for TQA + token-classification.

* Fixing the tests.

* Making sure `offset_mappings` is a valid argument.
2021-12-08 09:54:24 +01:00
Stas Bekman
fae0b9faef
[trainer] conditional ctx managers into one wrapper (#14663)
* [trainer] conditional ctx managers into one wrapper

* workaround for contextlib.nullcontext for py<3.7

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* one more autocast

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-07 13:04:18 -08:00
TranSirius
39f1dff5a0
Fix a Bug, trainer_seq2seq.py, in the else branch at Line 172, generation_inputs should be a dict (#14546)
* fix bug, trainer_seq2seq.py, Line 172, generation_inputs must be a dict before feeding into self.model.generation()

* fix bug, trainer_seq2seq.py, Line 172, generation_inputs must be a dict before feeding into self.model.generation()
2021-12-07 12:09:18 -05:00
Nouamane Tazi
2171695cc2
quick fix SummarizationPipeline error messages (#14618)
* quick fix SummarizationPipeline error messages

Fix error messages to avoid spam errors, and errors of type:
`Your max_length is set to 50, but you input_length is only 46. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)`

* correcto SummarizationPipeline error messages fixes
2021-12-07 16:44:28 +01:00
Stas Bekman
b66c5ab20c
[deepspeed] fix --load_best_model_at_end (#14652)
* [deepspeed] fix load_best_model_at_end

* try with pull_request_target

* revert: try with pull_request_target

* style

* add test

* cleanup
2021-12-06 21:57:47 -08:00
Ryokan RI
30646a0a3c
Add mLUKE (#14640)
* implement MLukeTokenizer and LukeForMaskedLM

* update tests

* update docs

* add LukeForMaskedLM to check_repo.py

* update README

* fix test and specify the entity pad id in tokenization_(m)luke

* fix EntityPredictionHeadTransform
2021-12-07 00:25:28 -05:00
Yih-Dar
4cdb67caba
Use cross_attention_hidden_size in Encoder-Decoder models (#14378)
* add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)

* for TFEncoderDecoderModel

* add equivalence test for TFEncoderDecoderModel

* fix

* fix failed equivalence tests

* remove unused import

* add detailed comment

* Fix check_equivalence_tf_to_pt by using encoder/decoder

* cleaning

* Use cross_attention_hidden_size in speech-to-text

* clean fast init logging msg in encoder decoder models

* increase tol from 1e-5 to 1e-3 for tf test

* style

* style

* make sure projection layer can run

* remove type conversion + add check

* fix conflict (config.output_hidden_size)

* Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-07 00:27:32 +01:00
Sylvain Gugger
381b05a3f5 Remove nonworking workflow for now 2021-12-06 17:25:28 -05:00
Suraj Patil
75ae287aec
fix flax examples tests (#14646)
* make tensorboard optional

* update test_fetcher for flax examples

* make the tests slow
2021-12-07 00:34:27 +05:30
Sylvain Gugger
03fda7b743
Add a job to test the documentation build (#14645)
* Add a job to the documentation build

* Add caching

* Test cache
2021-12-06 13:55:59 -05:00
Sylvain Gugger
e513c16e82
Fix syntax for class references (#14644) 2021-12-06 13:31:27 -05:00
Lysandre Debut
e9688875bf
Auto processor fix (#14623)
* Add AutoProcessor class
Init and tests
Add doc
Fix init
Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Reverts to tokenizer or feature extractor when available
Adapt test

* Revert "Adapt test"

This reverts commit bbdde5fab0.

* Revert "Reverts to tokenizer or feature extractor when available"

This reverts commit 77659ff5d2.

* Don't revert everything Lysandre!

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-12-06 12:49:50 -05:00
Suraj Patil
cbe6026536
fix flax example tests (#14643) 2021-12-06 23:14:37 +05:30
guhur
df085d8ea8
doc: mismatch between pooler/d_output (#14641)
The model outputs a pooler_output whereas the doctype examples were using a pooled_output.
2021-12-06 11:51:53 -05:00
tucan9389
0f3f045ebd
Add GPTJForQuestionAnswering (#14503)
* Add GPTJForQuestionAnswering

* Reformat for GPTJForQuestionAnswering

* Fix isort error

* make style for GPTJForQA

* Add _keys_to_ignore_on_load_missing

* Change the sequence of qa and classification

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-12-06 11:44:10 -05:00
Jay Zhang
1ccc033c56
Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. (#14310)
* Update code to resolve comments left in previous PR.

* Add README.md file for this example.

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update README.md file to resolve comments.

* Add a section name.

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: Gary Miguel <garymm@garymm.org>

* Add more comments for _convert_past_list_to_tuple().

* Change the default file name to a consistent one.

* Fix a format issue.

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: Gary Miguel <garymm@garymm.org>

* Update examples/onnx/pytorch/translation/run_onnx_exporter.py

Co-authored-by: Gary Miguel <garymm@garymm.org>

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Change the folder to summarization and address some other coments.

* Update the torch version.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Gary Miguel <garymm@garymm.org>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2021-12-06 14:01:51 +01:00
Julien Chaumond
6cdc3a7844
[urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617)
* Replace outdated model tags with their now-canonical pipeline types

* spam the CI till it's green
2021-12-06 04:35:01 -05:00
Suraj Patil
c824d7ed48
add flax example tests in CI workflow (#14637) 2021-12-06 14:50:43 +05:30
Suraj Patil
bc8a9f415b
fix typo (#14635) 2021-12-06 10:52:43 +05:30
Suraj Patil
c5bd732ac6
Add Flax example tests (#14599)
* add test for glue

* add tests for clm

* fix clm test

* add summrization tests

* more tests

* fix few tests

* add test for t5 mlm

* fix t5 mlm test

* fix tests for multi device

* cleanup

* ci job

* fix metric file name

* make t5 more robust
2021-12-06 10:48:58 +05:30
Kamal Raj
803a8cd18f
updated readme with proper arguments (#14624) 2021-12-05 22:12:51 -05:00
(Bill) Yuchen Lin
3977b58437
fix a typo (#14626) 2021-12-05 11:31:23 +05:30
Matt
73ec4340ec
Make DefaultDataCollator importable from root (#14588)
* Make DefaultDataCollator importable from root

* Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings

* make style

* Add DefaultDataCollator to data_collator.rst

* Add DefaultDataCollator to data_collator.rst
2021-12-03 15:15:09 -05:00
Stas Bekman
71b1bf7ea8
[trainer] add tf32-mode control (#14606)
* [trainer] add --tf32 support

* it's pt>=.17

* it's pt>=.17

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-03 10:08:58 -08:00
Lysandre Debut
aada989ad5
Fix doc builder (#14616)
* Fix doc builder

* Fix doc builder

* Fix doc builder
2021-12-03 12:09:25 -05:00
Lysandre Debut
ec47baeba2
2022 is the year of multi-modality (#14610)
* 2022 is the year of multi-modality

* Small fix

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review

* Apply to documentation index

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2021-12-03 11:35:44 -05:00
Stas Bekman
e62091d5a7
[CI] move env print to util, add pt, nccl versions (#14607)
* move env print to util, add pt, nccl versions

* style

* version

* align
2021-12-03 08:18:36 -05:00
Li-Huai (Allan) Lin
66ea739168
Improve tokenizer tests (#13594)
* Use new method to acquire tokenizers

* Resolve TODOs.

* Style

* Fix

* Enable do_lower_case in test_tokenize_special_tokens

* Apply suggestion from code review

* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291.

* Fix FNet mask token tokenization

* Complete everything

* Apply suggestions from code review
2021-12-03 08:39:10 +01:00
Nik
6645eb61fa
fix #14524 (IndexError when mask prob is too low) (#14525)
* fix #14524 (IndexError when mask prob is too low)

* fix formatting

* correct documentation, add option for setting min_num_masks

* change the semantic meaning of `mask_prob` in _compute_mask_indices

With this commit the meaing of `mask_prob` actually adhered to the probability for each
vector to be the start of a masked span of length.

* fix check_copies test

* fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices

* fix typo
2021-12-02 17:05:31 +03:00