Commit Graph

8821 Commits

Author SHA1 Message Date
Jonathan Chang
783b0dd589
Fix t5 error message (#12136)
* Fix t5 error message

* Fix again
2021-06-13 12:02:57 +01:00
Lysandre Debut
3b1f5caff2
Add from_pretrained to dummy timm objects (#12097)
* Add from_pretrained to dummy timm

* Fix at the source

* Update utils/check_dummies.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Missing pretrained dummies

* Style

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-11 12:27:10 -04:00
Suraj Patil
15b498f3b8
Flax CLM script (#12023)
* first draft

* max_seq_length => block_size

* fix arg names

* fix typos

* fix loss calculation

* add max examples, fix train eval steps, metrics

* optimizer mask

* fix perplexity, metric logging

* fix logging

* data_collator => data_loader

* refactor loss_fn

* support single GPU

* pass distributed to write_metric

* fix jitting

* fix single device training

* fix single device metrics

* close inner progress bars once finished

* add overwrite_cache arg

* fix dataset caching issue

* add more logs

* few small fixes,

* address nicholas suggestions

* fix docstr

* address patricks suggestions

* make flake happy

* pass new new_dropout_rng to apply_gradients

* reset train metrics after every epoch

* remove distributed logic, small fixes
2021-06-11 15:16:20 +05:30
Patrick von Platen
e47765d884
Fix head masking generate tests (#12110)
* fix_torch_device_generate_test

* remove @

* fix tests
2021-06-11 04:04:07 -04:00
Bhavitvya Malik
d2753dcbec
add relevant description to tqdm in examples (#11927)
* add relevant `desc` in examples

* require_version datasets>=1.8.0
2021-06-10 15:59:55 -04:00
Jayendra
9a9314f6d9
Flax VisionTransformer (#11951)
* adding vit for flax

* added test for Flax-vit and some bug-fixes

* overrode methods where variable changes were necessary for flax_vit test

* added FlaxViTForImageClassification for test

* Update src/transformers/models/vit/modeling_flax_vit.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* made changes suggested in PR

* Adding jax-vit models for autoimport

* swapping num_channels and height,width dimension

* fixing the docstring for torch-like inputs for VIT

* add model to main init

* add docs

* doc, fix-copies

* docstrings

* small test fixes

* fix docs

* fix docstr

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* style

Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-10 21:17:13 +05:30
Daniel Stancl
0eaeae2e36
Fix a condition in test_generate_with_head_masking (#11911)
* Fix a condition in test_generate_with_head_masking

* Fix usage of head_mask in bigbird_pegasus

* Fix head masking for speech2text

* Resolve copy mismatch + drop unwanted print statement

* Fix the condition
2021-06-10 15:28:07 +01:00
Matt
bebbdd0fc9
Appending label2id and id2label to models to ensure inference works properly (#12102) 2021-06-10 15:25:04 +01:00
Matt
4cda08decb
Minor style edits 2021-06-10 15:10:57 +01:00
Matt
7f08dbd10a
Update README.md to cover the TF GLUE example. 2021-06-10 14:33:42 +01:00
Sylvain Gugger
d72e5a3a6d
Fix quality 2021-06-10 09:27:11 -04:00
Matt
73a532651a
New TF GLUE example (#12028)
* Pushing partially-complete new GLUE example

* First draft of the new TF GLUE example! Needs a little more testing to be sure but it's almost ready.

* Fix to the fit() call

* Bugfixes, making sure TPU and multi-GPU support is ready

* Remove logger line that depends on Pytorch

* Style pass

* Deleting old TF GLUE example

* Include label2id and id2label in the saved model config

* Don't clobber the existing model.config.label2id

* Style fixes

* Update examples/tensorflow/text-classification/run_glue.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-10 14:14:37 +01:00
Tobias Norlund
9d2cee8b48
CLIPFeatureExtractor should resize images with kept aspect ratio (#11994)
* Resize with kept aspect ratio

* Fixed failed test

* Overload center_crop and resize methods instead

* resize should handle non-PIL images

* update slow test

* Tensor => tensor

Co-authored-by: patil-suraj <surajp815@gmail.com>
2021-06-10 18:40:41 +05:30
kumapo
472a867626
Add text_column_name and label_column_name to run_ner and run_ner_no_trainer args (#12083)
* Add text_column_name and label_column_name to run_ner args

* Minor fix: grouping for text and label column name
2021-06-10 08:03:20 -04:00
Patrick von Platen
bc6f51e539
[Wav2Vec2ForPretraining] Correct checkpoints wav2vec2 & fix tests (#12089)
* fix_torch_device_generate_test

* remove @

* fix tests
2021-06-09 20:41:59 +01:00
Stas Bekman
61e191987d
rm require_version_examples (#12088) 2021-06-09 11:02:52 -07:00
Suraj Patil
d1500d9151
pass decay_mask fn to optimizer (#12087) 2021-06-09 18:49:27 +01:00
Anton Lozhkov
d472bd7b18
Wav2Vec2 Pretraining (#11306)
* Working quantizer forward

* Working quantizer forward

* Clean up unused model parts, test reproducibility

* Working quantizer forward

* Clean up unused model parts, test reproducibility

* Remove custom outputs from the shared ones

* correct conversion

* correct bug

* add first pretrain script

* save intermediate

* static shapes

* save intermediate

* finish first pretrain script version

* more refactor

* remove wandb

* refactor more

* improve test

* correct perplexity compute bug

* finish model implementation

* add to docs

* finish docs

* finish pretraining script

* finish pretraining script

* remove wandb

* finish PR for merge

* finish config

* finish

* make deepspeed work

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

* fix flaky test

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-09 18:40:56 +01:00
Stas Bekman
b1a8aa94f0
[test] support more than 2 gpus (#12074)
* support more than 2 gpus

* style
2021-06-09 09:23:47 -07:00
NielsRogge
d3eacbb829
Add DETR (#11653)
* Squash all commits of modeling_detr_v7 branch into one

* Improve docs

* Fix tests

* Style

* Improve docs some more and fix most tests

* Fix slow tests of ViT, DeiT and DETR

* Improve replacement of batch norm

* Restructure timm backbone forward

* Make DetrForSegmentation support any timm backbone

* Fix name of output

* Address most comments by @LysandreJik

* Give better names for variables

* Conditional imports + timm in setup.py

* Address additional comments by @sgugger

* Make style, add require_timm and require_vision to tests

* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone

* Add png files to fixtures

* Fix type hint

* Add timm to workflows

* Add `BatchNorm2d` to the weight initialization

* Fix retain_grad test

* Replace model checkpoints by Facebook namespace

* Fix name of checkpoint in test

* Add user-friendly message when scipy is not available

* Address most comments by @patrickvonplaten

* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner

* Better initialization

* Scipy is necessary to get sklearn metrics

* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel

* Make style

* Improve docs and add 2 community notebooks

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-06-09 11:51:13 -04:00
Stas Bekman
d14e0af274
sync LayerDrop for Wav2Vec2Encoder + tests (#12076) 2021-06-09 13:21:03 +01:00
Koichi Yasuoka
82a2b76c95
Update run_ner.py with id2label config (#12001) 2021-06-09 07:27:05 -04:00
Stas Bekman
0e82f0cbc2
typo 2021-06-08 12:55:17 -07:00
Stas Bekman
11d86d3de4
[Deepspeed Wav2vec2] integration (#11638)
* wip

* wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044

* cleanup

* workaround

* working 5/8 modes

* solve fp32 distributed zero3

* style

* sync

* sync

* rework

* deprecation

* cleanup

* https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged

* clean up

* add a guide

* more prose

* more prose

* fix

* more prose

* sub_group_size was too big

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor

* bug fix

* make the true check explicit

* new deepspeed release

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 12:32:03 -07:00
Stas Bekman
32290d87f6
[Deepspeed] various fixes (#12058)
* replace deprecated config

* sub_group_size was too big

* complete deprecation removal
2021-06-08 08:36:15 -07:00
Sylvain Gugger
fd6902838a
Properly indent block_size (#12070) 2021-06-08 10:27:02 -04:00
cdleong
49bee0aea4
Add torch to requirements.txt in language-modeling (#12040)
* Add torch to requirements.txt in language-modeling

* Update examples/pytorch/language-modeling/requirements.txt

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 09:02:35 -04:00
Mario Šaško
f5eec0d8e9
Replace legacy torch.Tensor with torch.tensor/torch.empty (#12027)
* Replace legacy torch.Tensor constructor with torch.{tensor, empty}

* Remove torch.Tensor in examples
2021-06-08 13:58:38 +01:00
Shamane Siri
e33085d648
updated the original RAG implementation to be compatible with latest Pytorch-Lightning (#11806)
* updated the original RAG implementation to be compatible with the latest PL version

* updated the requirements.txt file

* execute make style

* code quality test

* code quality

* conflict resolved in requirements.txt

* code quality

* changed the MyDDP class name to CustomDDP
2021-06-08 13:42:49 +01:00
NielsRogge
70f88eeccc
Fix tapas issue (#12063)
* Fix scatter function to be compatible with torch-scatter 2.7.0

* Allow test again
2021-06-08 05:22:31 -04:00
NielsRogge
e56e3140dd
Fix integration tests (#12066) 2021-06-08 05:21:38 -04:00
Stas Bekman
4abc6dd690
skip failing test (#12059) 2021-06-07 20:48:41 -07:00
Russell Klopfer
e363e1d936
adds metric prefix. (#12057)
* adds metric prefix.

* update tests to include prefix
2021-06-07 22:34:10 -04:00
Peter Izsak
8994c1e472
Add optional grouped parsers description to HfArgumentParser (#12042)
* Adding optional argument group to HfArgumentParser

* Minor

* remove whitespace

* Minor styling
2021-06-07 11:47:12 -04:00
Nicolas Patry
2056f26e85
Extend pipelines for automodel tuples (#12025)
* fix_torch_device_generate_test

* remove @

* finish

* refactor

* add test

* fix test

* Attempt at simplification.

* Small fix.

* Fixing non existing AutoModel for TF.

* Naming.

* Remove extra condition.

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-06-07 17:41:27 +02:00
François Lagunas
f8bd8c6c7e
Fixes bug that appears when using QA BERT and distillation. (#12026)
* Fixing bug that appears when using distillation (and potentially other uses).
During the backward pass, PyTorch complains with:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
This happens because the QA model code modifies the start_positions and end_positions input tensors in place, using the clamp_ function. As a consequence, the teacher and the student both modify the same inputs, and the backward pass fails.

* Fixing all models QA clamp_ bug.
2021-06-07 11:21:59 -04:00
Patrick von Platen
59f75d538b
[JAX] Bump jax lib (#12053)
* fix_torch_device_generate_test

* remove @

* bump up jax lib
2021-06-07 13:04:18 +01:00
Suraj Patil
185122ef22
fix docs of past_key_values (#12049) 2021-06-07 15:24:03 +05:30
Philip May
3857f2b4e3
fix deberta 2 tokenizer integration test (#12017) 2021-06-07 04:55:55 -04:00
Shiva Pundir
20b6f3b80c
Fixed Typo in modeling_bart.py (#12035)
* Fixed Typo in modeling_bart.py - Issue #11895

* Fixed Typo in modeling_bart.py
2021-06-07 11:44:25 +05:30
Stas Bekman
1f335aef3b
[TrainerArguments] format and sort __repr__, add __str__ (#12018)
* format and sort __repr__, add __str__

* typo

* use __str__ directly

* alias __repr__ = __str__
2021-06-04 09:39:38 -07:00
Stas Bekman
2c73b93099
[Deepspeed] Assert on mismatches between ds and hf args (#12021)
* wip

* add mismatch validation + test

* renames

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* renames

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-04 08:58:23 -07:00
Patrick von Platen
242ec31aa5
[Flax] Refactor MLM (#12013)
* fix_torch_device_generate_test

* remove @

* finish refactor

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 16:31:32 +01:00
Nicholas Vadivelu
4674061b2a
Fix weight decay masking in run_flax_glue.py (#11964)
* Fix weight decay masking in `run_flax_glue.py`

Issues with the previous implementation:
- The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
- `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
- Flax's LayerNorm calls the scale parameter `scale` not `weight`

* Fix formatting with black

* adapt results

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-06-03 11:35:26 +01:00
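The three issues listed in the weight decay masking commit above can be illustrated with a minimal sketch. It is not the code from `run_flax_glue.py`: the `flatten_dict` helper below is a pure-Python stand-in for `traverse_util.flatten_dict`, and the parameter tree, the module name `LayerNorm`, and the function name `decay_mask` are assumptions made for illustration. The sketch only shows that flattened keys are tuples of strings, that the mask is True where decay should be applied (the convention the commit describes for `optax.masked`), and that the LayerNorm parameter is matched by the name `scale`.

```python
def flatten_dict(tree, prefix=()):
    """Stand-in for flax's traverse_util.flatten_dict (assumption: same key shape).

    Nested dict keys become tuples of strings, e.g. ("encoder", "dense", "kernel"),
    not a single period-separated string.
    """
    flat = {}
    for name, value in tree.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            flat.update(flatten_dict(value, path))
        else:
            flat[path] = value
    return flat


def decay_mask(params):
    """Return a flat mask that is True where weight decay SHOULD be applied.

    Biases and LayerNorm scales are excluded. "LayerNorm" is a hypothetical
    module name; "scale" (not "weight") is the parameter name the commit cites.
    """
    flat = flatten_dict(params)
    return {
        path: path[-1] != "bias" and path[-2:] != ("LayerNorm", "scale")
        for path in flat
    }


# Hypothetical parameter tree with placeholder scalar values.
params = {
    "encoder": {
        "dense": {"kernel": 0.1, "bias": 0.0},
        "LayerNorm": {"scale": 1.0, "bias": 0.0},
    }
}

mask = decay_mask(params)
for path, apply_decay in mask.items():
    print(path, apply_decay)
```

A real training script would presumably unflatten the mask back into the parameter tree's structure before handing it to the optimizer; that step is omitted here to keep the sketch free of external dependencies.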
Stas Bekman
61c5063491
[deepspeed] add nvme test skip rule (#11997)
* add nvme skip rule

* fix
2021-06-02 12:06:37 -07:00
Stas Bekman
640318befa
[deepspeed] Move code and doc into standalone files (#11984)
* move code and docs

* style

* moved

* restore
2021-06-02 09:56:00 -07:00
Kou Yong Kang
d6d747cb28
Update return introduction (#11976)
Make it clear that the `forward` method now returns a dict instead of a tuple.

Fix style
2021-06-02 12:53:09 -04:00
Stas Bekman
d406a2729a
[docs] fix xref to PreTrainedModel.generate (#11049)
* fix xref to generate

* do the same for search methods

* style

* style
2021-06-02 09:21:05 -07:00
Gunjan Chhablani
123b597f5d
Fix examples (#11990) 2021-06-02 10:12:52 -04:00
Gunjan Chhablani
88ca6a231d
VisualBERT (#10534)
* Init VisualBERT

* Add cookie-cutter, Config, and Embeddings

* Add preliminary Model

* Add Bert analogous classes

* Add basic code for NLVR, VQA, Flickr

* Update Init

* Fix VisualBert Downstream Models

* Rename classifier to cls

* Comment position_ids buffer

* Remove sentence image predictor output

* Update output dicts

* Remove unnecessary files

* Fix Auto Modeling

* Fix transformers init

* Add conversion script

* Add conversion script

* Fix docs

* Update visualbert modelling

* Update configuration

* Style fixes

* Add model and integration tests

* Add all tests

* Update model mapping

* Add simple detector from original repository

* Update docs and configs

* Fix style

* Fix style

* Update docs

* Fix style

* Fix import issues in style

* Fix style

* Add changes from review

* Fix style

* Fix style

* Update docs

* Fix style

* Fix style

* Update docs/source/model_doc/visual_bert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes from review

* Remove convert run script

* Add changes from review

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/visual_bert/modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes from review

* Add changes from review

* Add visual embedding example in docs

* Fix "copied from" comments

* Add changes from review

* Fix error, style, checkpoints

* Update docs

* Fix integration tests

* Fix style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-02 18:13:08 +05:30