NielsRogge
7db7aab439
Add semantic script no trainer, v2 ( #16788 )
...
* Add first draft from previous PR
* First draft
* Improve README and remove num_labels
* Make script more aligned with other scripts
* Improve README and apply suggestion from code review
2022-04-19 09:07:29 +02:00
NielsRogge
494c2a8c4d
Clean up semantic segmentation tests ( #16801 )
...
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-19 09:02:19 +02:00
David Hall
989a15d173
fix _setup_devices in case where there is no torch.distributed package in build ( #16821 )
...
* fix _setup_devices in case where there is not torch.distributed
* in training_args_sm.py as well
2022-04-18 18:36:46 -04:00
Lysandre Debut
c11a49573f
Refactor issues with yaml ( #16772 )
...
* Refactor issues with yaml
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update .github/ISSUE_TEMPLATE/feature-request.yml
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-18 16:43:21 -04:00
jsnfly
51e0ebedcb
Allow passing encoder_outputs as tuple to EncoderDecoder Models ( #16814 )
...
* Add passing encoder_outputs as tuple to existing test
* Add check for tuple
* Add check for tuple also for speech and vision
Co-authored-by: jsnfly <jsnfly@gmx.de>
2022-04-18 19:49:58 +02:00
Nicholas Broad
51fa7191b1
use base_version to check torch version in torch_less_than_1_11 ( #16806 )
...
* use base_version
* make is_torch_less_than_1_8 match 1_11
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
2022-04-18 13:02:00 -04:00
Patrick von Platen
8d3f952adb
[Data2Vec] Add data2vec vision ( #16760 )
...
* save intermediate
* add vision
* add vision
* save
* finish models
* finish models
* continue
* finish
* up
* up
* up
* tests all pass
* clean up
* up
* up
* fix bugs in beit
* correct docs
* finish
* finish docs
* make style
* up
* more fixes
* fix type hint
* make style
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/data2vec/test_modeling_data2vec_vision.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-18 17:52:13 +02:00
Zhengqiang Yin
33cd4be576
fix megatron bert convert state dict naming ( #15820 )
2022-04-18 11:34:36 -04:00
Patrick von Platen
9a2995ee39
[Quicktour Audio] Improve && remove ffmpeg dependency ( #16723 )
...
* [Quicktour Audio] Improve && remove ffmpeg dependency
* final fix
* final touches
2022-04-18 16:50:13 +02:00
NielsRogge
d3c9d0e55f
[ViT, BEiT, DeiT, DPT] Improve code ( #16799 )
...
* Improve code
* Fix bugs
* Fix another bug
* Clean up DPT as well
* Update DPT model outputs
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-18 09:25:08 -04:00
Sylvain Gugger
3785f4665a
Fix syntax error in TorchHub workflow
2022-04-18 07:54:00 -04:00
Joao Gante
6984848ed0
Create empty venv on cache miss ( #16816 )
2022-04-18 07:49:31 -04:00
Allan Jie
438144832e
Raise error and suggestion when using custom optimizer with Fairscale or Deepspeed ( #16786 )
...
* optimizer issues related to saving
* remove the "optimizer saving" option
* reformat using make style
2022-04-18 07:47:21 -04:00
Joao Gante
b4ddd2677c
TF generate refactor - XLA sample ( #16713 )
2022-04-18 10:58:24 +01:00
Joao Gante
02de7a8e7f
CI: non-remote GH Actions now use a python venv ( #16789 )
2022-04-18 09:47:38 +01:00
Sylvain Gugger
dee6f01636
Pin Jax to last working release ( #16808 )
...
* Pin Jax to last working release
* Try lower
* Try lower
2022-04-16 21:15:19 -04:00
NielsRogge
78f346c2b5
Update README.md ( #16797 )
2022-04-15 14:10:16 +02:00
Yih-Dar
ee209d4d01
Fix PT TF ViTMAE ( #16766 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-15 06:37:10 +02:00
Stas Bekman
5da33f8729
[modeling utils] revamp from_pretrained(..., low_cpu_mem_usage=True) + tests ( #16657 )
...
* add low_cpu_mem_usage tests
* wip: revamping
* wip
* install /usr/bin/time
* wip
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* fix assert
* put the wrapper back
* cleanup; switch to bert-base-cased
* Trigger CI
* Trigger CI
2022-04-14 18:10:05 -07:00
Stas Bekman
ce2fef2ad2
[trainer / deepspeed] fix hyperparameter_search ( #16740 )
...
* [trainer / deepspeed] fix hyperparameter_search
* require optuna
* style
* oops
* add dep in the right place
* create deepspeed-testing dep group
* Trigger CI
2022-04-14 17:24:38 -07:00
code-review-doctor
1b7de41a07
Fix issue avoid-missing-comma found at https://codereview.doctor ( #16768 )
2022-04-14 16:42:27 -04:00
Sanchit Gandhi
de8b06f9bf
[SpeechEncoderDecoderModel] Fix bug in reshaping labels ( #16748 )
2022-04-14 19:02:40 +01:00
NielsRogge
048443db86
Improve image classification example ( #16585 )
...
* Improve README
* Make dataset_name argument optional
* Improve local data
* Fix bug
* Improve README some more
* Apply suggestions from code review
* Improve README
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-14 18:10:52 +02:00
Sylvain Gugger
3e4eec47f5
Kill async pushes when calling push_to_hub with blocking=True ( #16755 )
2022-04-14 10:02:29 -04:00
Stas Bekman
c21e1071a7
[deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop ( #16717 )
...
* [deepspeed / m2m_100] make deepspeed 3 work with layerdrop
* fix
* revert last
2022-04-14 06:51:55 -07:00
Zachary Mueller
89293a0f6b
Make nightly install dev accelerate ( #16783 )
2022-04-14 09:41:02 -04:00
Sylvain Gugger
b151ddb9b9
Fix batch size in evaluation loop ( #16763 )
...
* Fix batch size in evaluation loop
* remove debug statement
2022-04-14 09:22:54 -04:00
Sanchit Gandhi
d8269eb4d5
[Flax .from_pretrained] Raise a warning if model weights are not in float32 ( #16762 )
...
* [Flax] Raise a warning if model weights are not in float32
* apply suggestions and few small changes
* reorder wording for better readability
2022-04-14 11:52:15 +02:00
Nicolas Patry
195fbbb6cf
Enabling `Tapex` in table question answering pipeline. ( #16663 )
...
* Enabling `Tapex` in table question answering pipeline.
* Questions are independent for Tapex, making the test respect that.
* Missing extra space.
2022-04-14 09:06:14 +02:00
Bhadresh Savani
442dc45645
[Doctest] added doctest changes for electra ( #16675 )
...
* added doctest changes for electra
* fixed doctest tests
* updated changes
2022-04-13 22:39:00 +02:00
Zachary Mueller
be752d12f8
Fixup no_trainer examples scripts and add more tests ( #16765 )
...
* Change tracking to store_true
* Remove step param and use it in the log dictionary directly
* use vars(args) when passing args to init_trackers
* Include tracking tests since tensorboard is already a dep
2022-04-13 14:40:48 -04:00
Stas Bekman
3a16ab25c8
[self-scheduled ci] explain where dependencies are ( #16757 )
2022-04-13 12:28:02 -04:00
Tu Vu
34ef029dc0
Add self training code for text classification ( #16738 )
...
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Delete strata
2022-04-13 12:03:24 -04:00
Sylvain Gugger
8e0d3b427f
Add defensive check for config num_labels and id2label ( #16709 )
...
* Add defensive check for config num_labels and id2label
* Actually check value...
* Only warning inside init plus better error message
2022-04-13 11:28:19 -04:00
Yih-Dar
6bed0647fe
Reduce Funnel PT/TF diff ( #16744 )
...
* Make Funnel Test less flaky
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-13 17:19:52 +02:00
Joao Gante
0b8f697219
CI: setup-dependent pip cache ( #16751 )
...
* Setup-dependent pip cache
* Do not restore from old versions
2022-04-13 16:19:14 +01:00
Stas Bekman
ac43a40e6a
[modeling_utils] better explanation of ignore keys ( #16741 )
2022-04-13 08:03:20 -07:00
Jeremy Fisher
0235bc57ab
Fix and improve CTRL doctests ( #16573 )
...
* Improve CTRL doctests
* Fix `CTRLForSequenceClassification` flakiness with inconsistent losses
* Remove unused
* Fixup
* Add CTRL to documentation_tests.txt
* Fix control code not being first
* Add output assertions
* Change from sshleifer/tiny-ctrl -> ctrl
* Run `make fixup`
* apply `list` to output logits shape for clarity
* Reduce output loss precision to make assertion more robust
* Add assertion of control code being first
* Fix docstyle
* upper case sentence following control code
* Weird bug fixes
* Add a better generation example
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-04-13 15:44:31 +02:00
Michael Chung
06b4aac9eb
Add Doc Test for GPT-J ( #16507 )
...
* Required the values GPTJ unfortunately cannot run the model =)
* Added the file to the doc tests
* Run Fixup and Style
* Fixed with the test versions of gptj. Ran Style and Fixup.
* Trigger ci
* A Minor Change to License
* Fixed spacing added to the benchmark_utils. Then refactored tests to const variables.
* Removed strings that were included as default parameters anyways.
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
2022-04-13 15:04:47 +02:00
Stas Bekman
12bfa97a43
[from_pretrained] refactor find_mismatched_keys ( #16706 )
2022-04-13 07:50:15 -04:00
davidleonfdez
9f8bfe703c
Fix #16660 (tokenizers setters of ids of special tokens) ( #16661 )
...
* Fix setters of *_token_id properties of SpecialTokensMixin
* Test setters of common tokens ids
* Move to a separate test checks of setters of tokens ids
* Add independent test for ByT5
* Add Canine test
* Test speech to text
2022-04-13 07:49:06 -04:00
Patrick von Platen
b24201fa44
[Doctests] Fix all T5 doc tests ( #16646 )
...
* [Doctests] Fix all T5 doc tests
* make style
* Update docs/source/en/model_doc/t5.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply Sylvains comments
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-13 11:36:54 +02:00
Santiago Castro
f7196f2e63
Fix decoding score comparison when using logits processors or warpers ( #10638 )
...
* Normalize using a logits warper
* Add a flag in `generate` to support the logit renormalization
* Add in RAG
2022-04-13 09:37:33 +01:00
Joao Gante
eb5bdcdfa5
TF generate: handle case without cache in beam search ( #16704 )
2022-04-12 20:46:10 +01:00
Minh Chien Vu
9c9db751e2
add Bigbird ONNX config ( #16427 )
...
* add Bigbird ONNX config
2022-04-12 20:46:06 +02:00
Sanchit Gandhi
a960406722
[FlaxWav2Vec2Model] Fix bug in attention mask ( #16725 )
...
* [FlaxWav2Vec2Model] Fix bug in attention mask
* more fixes
* add (Flax)SpeechEncoderDecoderModel PT-FX cross-test
2022-04-12 19:48:24 +02:00
Sanchit Gandhi
6adefba3f0
[FlaxSpeechEncoderDecoder] Fix input shape bug in weights init ( #16728 )
...
* [FlaxSpeechEncoderDecoder] Fix input shape bug in weights init
* make style
2022-04-12 19:33:57 +02:00
hiromu
1bac40db8a
Add Doc Tests for Reformer PyTorch ( #16565 )
...
* start working
* fix: ReformerForQA doctest
* fix: ReformerModelWithLMHead doctest
* fix: ReformerModelForSC doctest
* fix: ReformerModelForMLM doctest
* add: documentation_tests.txt
* make fixup
* change: ReformerModelForSC doctest
* change: checkpoint
2022-04-12 18:52:31 +02:00
Joao Gante
d7f7f29f29
TF: remove set_tensor_by_indices_to_value ( #16729 )
2022-04-12 17:51:47 +01:00
Anmol Joshi
a315988bae
Moved functions to pytorch_utils.py ( #16625 )
...
* Moved functions to pytorch_utils.py
* isort formatting
* Reverted tf changes
* isort, make fix-copies
* documentation fix
* Fixed Conv1D import
* Reverted research examples file
* backward compatibility for pytorch_utils
* missing import
* isort fix
2022-04-12 12:38:50 -04:00