Commit Graph

153 Commits

Author SHA1 Message Date
Stas Bekman
d25ad34c82
[CI] add dependency table sync verification (#12364)
* add dependency table sync verification

* improve the message

* improve the message

* revert

* ready to merge
2021-06-28 08:55:59 -07:00
Lysandre Debut
3a960c4857
Support for torch 1.9.0 (#12224)
* Support for torch 1.9.0

* Torch scatter for 1.9.0

* Github Actions run on 1.9.0
2021-06-17 11:29:01 -04:00
Lysandre Debut
52c7ca0488
Temporarily deactivate torch-scatter while we wait for new release (#12181)
* Temporarily deactivate torch-scatter while we wait for new release

* torch-1.8.1 binary for scatter

* Revert to 1.8.0

* Pin torch dependency

* torchaudio and torchvision
2021-06-15 16:03:58 -04:00
NielsRogge
d3eacbb829
Add DETR (#11653)
* Squash all commits of modeling_detr_v7 branch into one

* Improve docs

* Fix tests

* Style

* Improve docs some more and fix most tests

* Fix slow tests of ViT, DeiT and DETR

* Improve replacement of batch norm

* Restructure timm backbone forward

* Make DetrForSegmentation support any timm backbone

* Fix name of output

* Address most comments by @LysandreJik

* Give better names for variables

* Conditional imports + timm in setup.py

* Address additional comments by @sgugger

* Make style, add require_timm and require_vision to tests

* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone

* Add png files to fixtures

* Fix type hint

* Add timm to workflows

* Add `BatchNorm2d` to the weight initialization

* Fix retain_grad test

* Replace model checkpoints by Facebook namespace

* Fix name of checkpoint in test

* Add user-friendly message when scipy is not available

* Address most comments by @patrickvonplaten

* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner

* Better initialization

* Scipy is necessary to get sklearn metrics

* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel

* Make style

* Improve docs and add 2 community notebooks

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-06-09 11:51:13 -04:00
Sylvain Gugger
cbbf49f644 Fix doc deployment 2021-05-13 10:34:14 -04:00
Sylvain Gugger
2ce0fb84cc
Make quality scripts work when one backend is missing. (#11573)
* Make quality scripts work when one backend is missing.

* Check env variable is properly set

* Add default

* With print statements

* Fix typo

* Set env variable

* Remove debug code
2021-05-04 09:53:44 -04:00
Sylvain Gugger
81a6c7cd39 Use 3 workers for torch tests 2021-04-23 18:47:46 -04:00
Sylvain Gugger
ca6b80cadb Wrong branch Sylvain... 2021-04-23 12:46:54 -04:00
Sylvain Gugger
3951fc55ee Try to trigger failure more 2021-04-23 12:44:54 -04:00
Sylvain Gugger
bf2e0cf70b
Trainer push to hub (#11328)
* Initial support for upload to hub

* push -> upload

* Fixes + examples

* Fix torchhub test

* Torchhub test I hate you

* push_model_to_hub -> push_to_hub

* Apply mixin to other pretrained models

* Remove ABC inheritance

* Add tests

* Typo

* Run tests

* Install git-lfs

* Change approach

* Add push_to_hub to all

* Staging test suite

* Typo

* Maybe like this?

* More deps

* Cache

* Adapt name

* Quality

* MOAR tests

* Put it in testing_utils

* Docs + torchhub last hope

* Styling

* Wrong method

* Typos

* Update src/transformers/file_utils.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address review comments

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-23 09:17:37 -04:00
Sylvain Gugger
dabeb15292
Examples reorg (#11350)
* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00
Sylvain Gugger
26212c14e5 Reactivate Megatron tests and use fewer workers 2021-04-09 18:09:53 -04:00
Kevin Canwen Xu
fb41f9f50c
Add a special tokenizer for CPM model (#11068)
* Add a special tokenizer for CPM model

* make style

* fix

* Add docs

* styles

* cpm doc

* fix ci

* fix the overview

* add test

* make style

* typo

* Custom tokenizer flag

* Add README.md

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-04-10 02:07:47 +08:00
Stas Bekman
97ccf67bb3
[setup] extras[docs] must include 'all' (#11148)
* extras[doc] must include 'all'

* fix

* better

* regroup
2021-04-08 18:10:44 -04:00
Sylvain Gugger
b0d49fd536
Add a script to check inits are consistent (#11024) 2021-04-04 20:41:34 -04:00
NielsRogge
30677dc743
Add Vision Transformer and ViTFeatureExtractor (#10950)
* Squash all commits into one

* Update ViTFeatureExtractor to use image_utils instead of torchvision

* Remove torchvision and add Pillow

* Small docs improvement

* Address most comments by @sgugger

* Fix tests

* Clean up conversion script

* Pooler first draft

* Fix quality

* Improve conversion script

* Make style and quality

* Make fix-copies

* Minor docs improvements

* Should use fix-copies instead of manual handling

* Revert "Should use fix-copies instead of manual handling"

This reverts commit fd4e591bce.

* Place ViT in alphabetical order

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-01 11:16:05 -04:00
Sylvain Gugger
d0b3797a3b
Add more metadata to the user agent (#10972)
* Add more metadata to the user agent

* Fix typo

* Use DISABLE_TELEMETRY

* Address review comments

* Use global env

* Add clean envs on circle CI
2021-03-31 09:36:07 -04:00
Sylvain Gugger
21e86f99e6
Sort init import (#10801)
* Initial script

* Add script to properly sort imports in init.

* Add to the CI

* Update utils/custom_init_isort.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Separate scripts that change content from quality

* Move class_mapping_update to style_checks

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-19 16:17:13 -04:00
Patrick von Platen
9f8619c6aa
Flax testing should not run the full torch test suite (#10725)
* make flax tests pytorch independent

* fix typo

* finish

* improve circle ci

* fix return tensors

* correct flax test

* re-add sentencepiece

* last tokenizer fixes

* finish maybe now
2021-03-16 08:05:37 +03:00
Suraj Patil
d26b37e744
Speech2TextTransformer (#10175)
* s2t

* fix config

* conversion script

* fix import

* add tokenizer

* fix tok init

* fix tokenizer

* first version working

* fix embeds

* fix lm head

* remove extra heads

* fix convert script

* handle encoder attn mask

* style

* better enc attn mask

* override _prepare_attention_mask_for_generation

* handle attn_mask in encoder and decoder

* input_ids => input_features

* enable use_cache

* remove old code

* expand embeddings if needed

* remove logits bias

* masked_lm_loss => loss

* hack tokenizer to support feature processing

* fix model_input_names

* style

* fix error message

* doc

* remove inputs_embeds

* remove input_embeds

* remove unnecessary docstring

* quality

* SpeechToText => Speech2Text

* style

* remove shared_embeds

* subsample => conv

* remove Speech2TextTransformerDecoderWrapper

* update output_lengths formula

* fix table

* remove max_position_embeddings

* update conversion scripts

* add possibility to do upper case for now

* add FeatureExtractor and Processor

* add tests for extractor

* require_torch_audio => require_torchaudio

* add processor test

* update import

* remove classification head

* attention mask is now 1D

* update docstrings

* attention mask should be of type long

* handle attention mask from generate

* always return attention_mask

* fix test

* style

* doc

* Speech2TextTransformer => Speech2Text

* Speech2TextTransformerConfig => Speech2TextConfig

* remove dummy_inputs

* nit

* style

* multilingual tok

* fix tokenizer

* add tgt_lang setter

* save lang_codes

* fix tokenizer

* add forced_bos_token_id to tokenizer

* apply review suggestions

* add torchaudio to extra deps

* add speech deps to CI

* fix dep

* add libsndfile to ci

* libsndfile1

* add speech to extras all

* libsndfile1 -> libsndfile1

* libsndfile

* libsndfile1-dev

* apt update

* add sudo to install

* update deps table

* install libsndfile1-dev on CI

* tuple to list

* init conv layer

* add model tests

* quality

* add integration tests

* skip_special_tokens

* add speech_to_text_transformer in toctree

* fix tokenizer

* fix fp16 tests

* add tokenizer tests

* fix copyright

* input_values => input_features

* doc

* add model in readme

* doc

* change checkpoint names

* fix copyright

* fix code example

* add max_model_input_sizes in tokenizer

* fix integration tests

* add do_lower_case to tokenizer

* remove clamp trick

* fix "Add modeling imports here"

* fix copyrights

* fix tests

* SpeechToTextTransformer => SpeechToText

* fix naming

* fix table formatting

* fix typo

* style

* fix typos

* remove speech dep from extras[testing]

* fix copies

* rename doc file,

* put imports under is_torch_available

* run feat extract tests when torch is available

* dummy objects for processor and extractor

* fix imports in tests

* fix import in modeling test

* fix imports

* fix torch import

* fix imports again

* fix positional embeddings

* fix typo in import

* adapt new extractor refactor

* style

* fix torchscript test

* doc

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix docs, copied from, style

* fix docstring

* handle imports

* remove speech from all extra deps

* remove s2t from seq2seq lm mapping

* better names

* skip training tests

* add install instructions

* List => Tuple

* doc

* fix conversion script

* fix urls

* add instruction for libsndfile

* fix fp16 test

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Sylvain Gugger
7da995c00c
Fix embeddings for PyTorch 1.8 (#10549)
* Fix embeddings for PyTorch 1.8

* Try with PyTorch 1.8.0

* Fix embeddings init

* Fix copies

* Typo

* More typos
2021-03-05 16:18:48 -05:00
Lysandre
dc9aaa3848 Pin torch to 1.7.1 in tests while we resolve issues 2021-03-05 07:57:35 -05:00
Lysandre
093b88f4e9 Update scatter to use torch 1.8.0 2021-03-05 07:31:51 -05:00
Stas Bekman
d478257d9b
[CI] build docs faster (#10115)
I assume the CI machine should have at least 4 cores, so let's build docs faster
2021-02-10 03:02:39 -05:00
Lysandre Debut
910aa89671
Temporarily deactivate TPU tests while we work on fixing them (#9720) 2021-01-21 04:17:39 -05:00
NielsRogge
1551e2dc6d
[WIP] Tapas v4 (tres) (#9117)
* First commit: adding all files from tapas_v3

* Fix multiple bugs including soft dependency and new structure of the library

* Improve testing by adding torch_device to inputs and adding dependency on scatter

* Use Python 3 inheritance rather than Python 2

* First draft model cards of base sized models

* Remove model cards as they are already on the hub

* Fix multiple bugs with integration tests

* All model integration tests pass

* Remove print statement

* Add test for convert_logits_to_predictions method of TapasTokenizer

* Incorporate suggestions by Google authors

* Fix remaining tests

* Change position embeddings sizes to 512 instead of 1024

* Comment out positional embedding sizes

* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

* Added more model names

* Fix truncation when no max length is specified

* Disable torchscript test

* Make style & make quality

* Quality

* Address CI needs

* Test the Masked LM model

* Fix the masked LM model

* Truncate when overflowing

* More much needed docs improvements

* Fix some URLs

* Some more docs improvements

* Test PyTorch scatter

* Set to slow + minify

* Calm flake8 down

* Add add_pooling_layer argument to TapasModel

Fix comments by @sgugger and @patrickvonplaten

* Fix issue in docs + fix style and quality

* Clean up conversion script and add task parameter to TapasConfig

* Revert the task parameter of TapasConfig

Some minor fixes

* Improve conversion script and add test for absolute position embeddings

* Improve conversion script and add test for absolute position embeddings

* Fix bug with reset_position_index_per_cell arg of the conversion cli

* Add notebooks to the examples directory and fix style and quality

* Apply suggestions from code review

* Move from `nielsr/` to `google/` namespace

* Apply Sylvain's comments

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-12-15 17:08:49 -05:00
Lysandre Debut
91fa707217
Remove docs only check (#9065) 2020-12-11 10:27:31 -05:00
Sylvain Gugger
783d7d2629
Reorganize examples (#9010)
* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-12-11 10:07:02 -05:00
Stas Bekman
5e637e6c69
[wip] [ci] doc-job-skip take #4 dry-run (#8980)
* ci-doc-job-skip-take-4

* wip

* wip

* wip

* wip

* skip yaml

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* ready to test

* yet another way

* trying with HEAD

* trying with head.sha

* trying with head.sha fix

* trying with head.sha fix wip

* undo

* try to switch to sha

* current branch

* current branch

* PR number check

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride
2020-12-09 15:36:36 -05:00
Lysandre Debut
2ae7388eee
Check table as independent script (#8976) 2020-12-07 19:55:12 -05:00
Julien Chaumond
28fa014a1f
transformers-cli: LFS multipart uploads (> 5GB) (#8663)
* initial commit

* [cli] lfs commands

* Fix FileSlice

* Tweak to FileSlice

* [hf_api] Backport filetype arg from `datasets`

cc @lhoestq

* Slim down the CI while I'm working

* Ok let's try this in CI

* Update config.yml

* Do not try this at home

* one more try

* Update lfs.py

* Revert "Tweak to FileSlice"

This reverts commit d7e32c4b35.

* Update test_hf_api.py

* Update test_hf_api.py

* Update test_hf_api.py

* CI still green?

* make CI green again?

* Update test_hf_api.py

* make CI red again?

* Update test_hf_api.py

* add CI style back

* Fix CI?

* oh my

* doc + switch back to real staging endpoint

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

* Fix docblock + f-strings

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
2020-12-07 16:38:39 -05:00
Stas Bekman
37f4c24f10
> 30 files leads to hanging on --More--
Cancel debug printing for now, as it can be seen to lead to a failing test here:
https://app.circleci.com/pipelines/github/huggingface/transformers/16894/workflows/cc86f7a9-4020-45af-8ab3-c22f79b427cf/jobs/131924
2020-12-07 12:18:05 -08:00
Stas Bekman
73c51f7fcd
[ci] skip doc jobs - circleCI is not reliable - disable skip for now (#8926)
* disable skipping, but leave logging for the future
2020-12-04 10:13:42 -08:00
Stas Bekman
24f0c2fe33
[ci] skip doc jobs take #3 (#8885)
* check that we get any match first

* docs only

* 2 docs only

* add code

* restore
2020-12-02 10:06:45 -05:00
Stas Bekman
693ac3594b
disable job skip - need more work
reference: https://github.com/huggingface/transformers/pull/8853#issuecomment-736779863
2020-12-01 12:03:29 -08:00
Stas Bekman
21db560df3
[CI] skip docs-only jobs take #2 (#8853)
* restore skip

* Revert "Remove deprecated `evalutate_during_training` (#8852)"

This reverts commit 5530299096.

* check that pipeline.git.base_revision is defined before proceeding

* Revert "Revert "Remove deprecated `evalutate_during_training` (#8852)""

This reverts commit dfec84db3f.

* check that pipeline.git.base_revision is defined before proceeding

* doc only

* doc + code

* restore

* restore

* typo
2020-12-01 13:15:25 -05:00
Sylvain Gugger
08e707633c Comment the skip job on doc line 2020-11-30 10:51:25 -05:00
Stas Bekman
c239dcda83
[CI] implement job skipping for doc-only PRs (#8826)
* implement job skipping for doc-only PRs

* silent grep is crucial

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* let's add doc

* let's add code

* revert test commits

* restore

* Better name

* Better name

* Better name

* some more testing

* some more testing

* some more testing

* finish testing
2020-11-29 11:31:30 -05:00
Julien Chaumond
0cc5ab1333
Improve bert-japanese tokenizer handling (#8659)
* Make ci fail

* Try to make tests actually run?

* CI finally failing?

* Fix CI

* Revert "Fix CI"

This reverts commit ca7923be73.

* Ooops wrong one

* one more try

* Ok ok let's move this elsewhere

* Alternative to globals() (#8667)

* Alternative to globals()

* Error is raised later so return None

* Sentencepiece not installed make some tokenizers None

* Apply Lysandre wisdom

* Slightly clearer comment?

cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-23 11:15:02 -05:00
Sylvain Gugger
6494910f27
Add sentencepiece to the CI and fix tests (#8672)
* Fix the CI and tests

* Fix quality

* Remove that m from nowhere
2020-11-19 16:44:20 -05:00
Funtowicz Morgan
121c24efa4
Update deploy-docs dependencies on CI to enable Flax (#8475)
* Update deploy-docs dependencies on CI to enable Flax

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added pair of ""

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-11-11 18:31:41 -05:00
Funtowicz Morgan
a5b682329c
Flax/Jax documentation (#8331)
* First addition of Flax/Jax documentation

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* make style

* Ensure input order match between Bert & Roberta

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Install dependencies "all" when building doc

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* wraps build_doc deps with ""

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing @sgugger comments.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use list to highlight JAX features.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Make style.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Let's not look too much into the future for now.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-11 14:53:36 -05:00
Sylvain Gugger
854b44aa38 Revert size change as it doesn't change anything 2020-11-04 11:13:24 -05:00
Sylvain Gugger
414985c427 Upgrade resource for doc building 2020-11-04 10:44:19 -05:00
Stas Bekman
1bb4bba53c
[CIs] Better reports everywhere (#8275)
* make it possible to invoke testconf.py in both test suites without crashing on having the same option added

* perl -pi -e 's|--make_reports|--make-reports|' to be consistent with other opts

* add `pytest --make-reports` to all CIs (and artifacts)

* fix
2020-11-03 16:57:12 -05:00
Sylvain Gugger
4c19f3baab
Clean Trainer tests and datasets dep (#8268) 2020-11-03 15:50:55 -05:00
Sylvain Gugger
691176283d
Add a template for examples and apply it for mlm and plm examples (#8153)
* Add a template for example scripts and apply it to mlm

* Formatting

* Fix test

* Add plm script

* Add a template for example scripts and apply it to mlm

* Formatting

* Fix test

* Add plm script

* Add a template for example scripts and apply it to mlm

* Formatting

* Fix test

* Add plm script

* Styling
2020-10-29 13:38:11 -04:00
Lysandre Debut
1b6c8d4811
Update CI cache (#8126) 2020-10-28 13:59:43 -04:00
Lysandre Debut
a0906068cf
Fully remove codecov (#8093) 2020-10-27 14:14:13 -04:00
Stas Bekman
bfd5e370a7
[CI] generate separate report files as artifacts (#7995)
* better reports

* a whole bunch of reports in their own files

* clean up

* improvements

* github artifacts experiment

* style

* complete the report generator with multiple improvements/fixes

* fix

* save all reports under one dir to easy upload

* can remove temp failing tests

* doc fix

* some cleanup
2020-10-27 09:25:07 -04:00