Commit Graph

8392 Commits

Author SHA1 Message Date
yis11178
96cc02b51b
change tf.math.divide with int(/) to remove dim_per_head from the TF graph (#14600)
Co-authored-by: yis <yis@graphcore.ai>
2021-12-02 13:13:42 +00:00
Leandro von Werra
43f953cc2e
Add CodeParrot 🦜 codebase (#14536)
* add readme skeleton

* update readme

* add initialization script

* add deduplication script

* add codeparrot training script

* add code generation evaluation

* add validation loss script

* add requirements

* update readme

* tweak readme

* make style

* add highlights to readme

* add CLIs to scripts

* add tokenizer training script

* add docstring to constant length dataset

* fix defaults in arguments

* update readme with cli

* move image to hub

* tweaks of readme

* fix cli commands

* add author

* explain env variables

* fix formatting

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* replace generic with gpt2 tokenizer

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2021-12-02 10:41:35 +01:00
Lysandre Debut
e4c67d60ec
Python 3.6 -> Python 3.7 for TF runs (#14598) 2021-12-02 04:09:17 -05:00
Daniel Stancl
50d909be28
[Flax] Add FlaxBlenderbotSmall (#14576)
* [WIP] Add FlaxBlenderbotSmall

* Revert some unintentionally changed files

Revert some unintentionally files changed by improperly filled cookiecutter instructions.

* Fix repo consistency

* Fix Flax-PT equivalence

* Apply suggestions from code review

* Update index.mdx

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-12-02 14:21:48 +05:30
Lysandre Debut
77d87e732e
Adds a git pull instruction to the documentation builder (#14597)
* Adds a git pull instruction

* master -> main
2021-12-02 03:32:38 -05:00
Mishig Davaadorj
275402bf2b
Update doc img links (#14593)
* Update doc img links

* Rename toctree.yml -> _toctree.yml (#14594)

* Update doc img links

* Update performance.md img link
2021-12-02 09:01:35 +01:00
Mishig Davaadorj
4f68de625c
Rename toctree.yml -> _toctree.yml (#14594) 2021-12-02 08:58:39 +01:00
Stas Bekman
fbe278c76c
[doc] bf16/tf32 guide (#14579)
* [doc] bf16/tf32 guide

* expand

* expand

* Update docs/source/performance.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-01 14:18:58 -08:00
Li-Huai (Allan) Lin
934e2799da
Fix mask token handling (#14364)
* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291.

* Fix FNet mask token tokenization
2021-12-01 20:16:52 +01:00
Sylvain Gugger
4df7d05a87
Doc new front (#14590)
* Convert PretrainedConfig doc to Markdown

* Use syntax

* Add necessary doc files (#14496)

* Doc fixes (#14499)

* Fixes for the new front

* Convert DETR file for table

* Title is needed

* Simplify a bit

* Even simpler

* Remove imports

* Fix typo in toctree (#14516)

* Fix checkpoints badge

* Update versions.yml format (#14517)

* Doc new front github actions (#14512)

* Doc new front github actions

* Fix docstring

* Fix feature extraction utils import (#14515)

* Address Julien's comments

* Push to doc-builder

* Ready for merge

* Remove old build and deploy

* Doc misc fixes (#14583)

* Rm versions.yml from doc

* Fix converting.rst

* Rm pretrained_models from toctree

* Fix index links (#14567)

* Fix links in README

* Localized READMEs

* Fix copy script

* Fix find doc script

* Update README_ko.md

Co-authored-by: Julien Chaumond <julien@huggingface.co>

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Adapt build command to new CLI tools (#14578)

* Fix typo

* Fix doc interlinks (#14589)

* Convert PretrainedConfig doc to Markdown

* Use syntax

* Rm pattern <[a-z]+(.html).*>

* Rm huggingface.co/transformers/master

* Rm .html

* Rm .html from index.mdx

* Rm .html from model_summary.rst

* Update index.mdx rm html

* Update remove .html

* Fix inner doc links

* Fix interlink in preprocssing.rst

* Update pr_checks

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Convert PretrainedConfig doc to Markdown

* Use syntax

* Add necessary doc files (#14496)

* Doc fixes (#14499)

* Fixes for the new front

* Convert DETR file for table

* Title is needed

* Simplify a bit

* Even simpler

* Remove imports

* Fix checkpoints badge

* Fix typo in toctree (#14516)

* Update versions.yml format (#14517)

* Doc new front github actions (#14512)

* Doc new front github actions

* Fix docstring

* Fix feature extraction utils import (#14515)

* Address Julien's comments

* Push to doc-builder

* Ready for merge

* Remove old build and deploy

* Doc misc fixes (#14583)

* Rm versions.yml from doc

* Fix converting.rst

* Rm pretrained_models from toctree

* Fix index links (#14567)

* Fix links in README

* Localized READMEs

* Fix copy script

* Fix find doc script

* Update README_ko.md

Co-authored-by: Julien Chaumond <julien@huggingface.co>

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Adapt build command to new CLI tools (#14578)

* Fix typo

* Fix doc interlinks (#14589)

* Convert PretrainedConfig doc to Markdown

* Use syntax

* Rm pattern <[a-z]+(.html).*>

* Rm huggingface.co/transformers/master

* Rm .html

* Rm .html from index.mdx

* Rm .html from model_summary.rst

* Update index.mdx rm html

* Update remove .html

* Fix inner doc links

* Fix interlink in preprocssing.rst

* Update pr_checks

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Styling

Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
2021-12-01 14:13:02 -05:00
Stas Bekman
14cc50d081 fix autocast for older pytorch 2021-12-01 09:32:52 -08:00
Suraj Patil
4c0dd199c8
FlaxGPTJ (#14396)
* add flax gptj

* no bias in attention dense

* no wpe

* fix rotary embeddings

* fix rotary embeds

* fix rotray embeds

* quality

* doc and quality

* fix equivalence tests
2021-12-01 10:57:39 +05:30
Jamie DeAntonis
70996a5420
WIP: Support for Training with BF16 (#13207)
* started bf16 integration

* minor changes

* code now runs

* style

* lay foundation for bf16 testing

* lay foundation for bf16 testing

* start the tests

* better bf16 check

* style

* 2 separate checkers - one for bf16 support, another for bf16+autocast

* Update src/transformers/training_args.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* a couple of comment resolutions

* more comment resolutions

* resolved a small bug

* just some print statemtns

* added todo marking

* added a todo

* adjust for API change s/fast_dtype/dtype/

* fix style

* merge 2 bf16 util functions

* bf16 now does scaling too

* Add support for bfloat16

* Revert T5 layernorm to float32

This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920 .

* Add comment about conversion to float32 before returning the numpy data

* Add comment about AMP-bfloat16 incompatibility

* Fix formatting

* typo

* reformer / bf16

* cleanup

* require at least pt-1.10

* fix

* will deal with deepspeed separately

* cleanup

* revert

* cleanup

* fp16_full_eval and bf16_full_eval are separate modes

* proper deprecation

* cleanup

* test and fixes

* spelling

* cleanup

* add a note that this API is experimental

Co-authored-by: jamie <jamie@cortx.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: suriya <suriya@cortx.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
2021-11-30 18:00:47 -08:00
Suraj Patil
fc1d97f29d
VisionTextDualEncoder (#13511)
* init vision_text_dual_encoder

* fix merge

* remove extra heads

* fix tests

* remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP

* remove archive map

* fix imports

* fix more imports

* fix init

* delete tokenizers

* fix imports

* clean

* support clip's vision model

* handle None config

* begin tests

* more test and few fixes

* warn about newly init weights

* more tests

* add loss to model

* remove extra classes from doc

* add processor

* doc and small fixes

* add start docstr

* update flax model

* flax tests

* more flax tests

* doc

* quality

* doc and quality

* fix doc

* doc

* remove comments

* update warning

* quality

* fix docs

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* replace asserts, fix imports

* update imports

* fix import

* address some review comments

* fix check

* reduce tolerance

* fix test

* add flax integration test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

* fix style

* add pt_flax_equivalence test in PT tests

* add pt integration test

* update test

* use pre-trained checkpoint in examples

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-30 22:21:48 +05:30
Thomas Viehmann
6ed9882ddb
use functional interface for softmax in attention (#14198)
* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!
2021-11-30 11:47:33 -05:00
giacomo snidero
4176bc161c
Add documentation for multi-label classification (#14168)
* "update example docstring multilabel example

* update example docstring multilabel example
2021-11-30 11:34:41 -05:00
Daniel Stancl
faacd74729
[Flax] Add FlaxBlenderbot (#13633)
* Init Flax implementation for Blenderbot

* Add a majority of stuff except for tests

* make style quality

* Add tests and fix some bugs

* Add tests

* Clean source code and fix some bugs

* Fix copies and docs

* Fix jax device condition for tests

* Fix layer norm in the encoder

* Fix a few typos in the test file

* make fix-copies

* make fix-copies

* fix layer norm

* Fix Flax params dtype (#13090)

* Fix PR reference (#13098)

* make fix-copies

* Update tests/test_modeling_flax_blenderbot.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-11-30 17:36:54 +05:30
Sylvain Gugger
254fef67cf
Fix backend regex (#14566) 2021-11-30 05:32:20 -05:00
Kamal Raj
c468a87a69
Tapas tf (#13393)
* TF Tapas first commit

* updated docs

* updated logger message

* updated pytorch weight conversion
script to support scalar array

* added use_cache to tapas model config to
work properly with tf input_processing

* 1. rm embeddings_sum
2. added # Copied
3. + TFTapasMLMHead
4. and lot other small fixes

* updated docs

* + test for tapas

* updated testing_utils to check
is_tensorflow_probability_available

* converted model logits post processing using
numpy to work with both PT and TF models

* + TFAutoModelForTableQuestionAnswering

* added TF support

* added test for
TFAutoModelForTableQuestionAnswering

* added test for
TFAutoModelForTableQuestionAnswering pipeline

* updated auto model docs

* fixed typo in import

* added tensorflow_probability to run tests

* updated MLM head

* updated tapas.rst with TF  model docs

* fixed optimizer import in docs

* updated convert to np
data from pt model is not
`transformers.tokenization_utils_base.BatchEncoding`
after pipeline upgrade

* updated pipeline:
1. with torch.no_gard removed, pipeline forward handles
2. token_type_ids converted to numpy

* updated docs.

* removed `use_cache` from config

* removed floats_tensor

* updated code comment

* updated Copyright Year and
logits_aggregation Optional

* updated docs and comments

* updated docstring

* fixed model weight loading

* make fixup

* fix indentation

* added tf slow pipeline test

* pip upgrade

* upgrade python to 3.7

* removed from_pt from tests

* revert commit f18cfa9
2021-11-30 11:07:55 +01:00
Matt
6fc38adff2
Add model checkpointing to push_to_hub and PushToHubCallback (#14492)
* Add checkpointing to push_to_hub and PushToHubCallback

* Add checkpoint loading

* Add missing default value

* Correct method name

* make style

* Moving everything to the right location

* make style

* Revert changes to file_utils.py

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding docstrings and comments to clarify code

* make style

* Fix organization positional arg

* Fix load_repo_checkpoint to no longer accidentally create empty repos

* make style

* Remove unnecessary 'organization' argument in load_repo_checkpoint

* Avoid private `_create_or_get_repo` method

* make style

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-29 17:36:19 +00:00
Rahul Nadkarni
8332327dca
Fix sentinel token IDs in data collator for Flax T5 pretraining script (#14477) 2021-11-29 17:30:17 +01:00
Kamal Raj
2bd950ca47
[Flax] token-classification model steps enumerate start from 1 (#14547)
* step start from 1

* Updated cur_step calcualtion
2021-11-29 21:55:59 +05:30
Patrick von Platen
cea17acd8c
[Generate] Fix generate with inputs_embeds on GPU (#14564) 2021-11-29 16:10:19 +01:00
NielsRogge
25156eb296
Rename ImageGPT (#14526)
* Rename

* Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING
2021-11-29 10:19:11 +01:00
Štěpán Műller
4ee0b755bd
LayoutLMv2FeatureExtractor now supports non-English languages when applying Tesseract OCR. (#14514)
* Added the lang argument to apply_tesseract in feature_extraction_layoutlmv2.py, which is used in pytesseract.image_to_data.

* Added ocr_lang argument to LayoutLMv2FeatureExtractor.__init__, which is used when calling apply_tesseract

* Updated the documentation of the LayoutLMv2FeatureExtractor

* Specified in the documentation of the LayoutLMv2FeatureExtractor that the ocr_lang argument should be a language code.

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Split comment into two lines to adhere to the max line size limit.

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2021-11-29 04:15:08 -05:00
Xing Han Lu
ebbe8cc3fe
Tokenizers docs: Specify which class contains __call__ method (#14379)
* Update tokenizer.rst

* Apply `make fixup`
2021-11-28 18:55:38 -05:00
Suraj Patil
69511cdcae
unfreeze initial cache in gpt models (#14535) 2021-11-26 18:21:47 +05:30
Lysandre Debut
2318bf77eb
Fixes (#14534) 2021-11-26 04:35:08 -05:00
Lysandre Debut
c15f4f203f
Quicktour updates (#14533) 2021-11-26 04:09:31 -05:00
Chris Fregly
1bbd6fcdeb
added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error (#14529)
* added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error

* Update quicktour.rst

* added >>>

* dependencies

* added space
2021-11-26 03:46:07 -05:00
Nicolas Patry
04683c0659
Fix a slow test. (#14527) 2021-11-25 12:59:33 -05:00
Stas Bekman
d1fd64e7aa
clear ~/.cache/torch_extensions between builds (#14520) 2021-11-25 03:15:35 -05:00
NielsRogge
3772af49ce
[Tests] Improve vision tests (#14458)
* Improve tests

* Install vision for tf tests
2021-11-24 15:22:20 +01:00
Lysandre Debut
f2e90bcb8f
Fix feature extraction utils import (#14515) 2021-11-24 09:03:21 -05:00
Vladimir Maryasin
6c4d688ffa
add cache_dir for tokenizer verification loading (#14508)
When loading a pretrained tokenizer, a verification is done to ensure
that the actual tokenizer class matches the class it was called from.
If the tokenizer is absent, its config file is loaded from the repo.

However, the cache_dir for downloading is not provided, which leads to
ignoring of the user-specified cache_dir, storing files in several
places and and may result in incorrect warnings when the default
cache_dir is unreachsble.

This commit fixes that.
2021-11-24 06:22:03 -05:00
Stas Bekman
956a483173
[deepspeed] zero inference (#14253)
* [deepspeed] zero inference

* only z3 makes sense for inference

* fix and style

* docs

* rework

* fix test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* responding to suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-23 14:09:15 -08:00
Nicholas Broad
69e16abf98
Switch from using sum for flattening lists of lists in group_texts (#14472)
* remove sum for list flattening

* change to chain(*)

* make chain object a list

* delete empty lines

per sgugger's suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-22 16:17:26 -05:00
Valentin
0b7d053c13
fixes some key names for in LayoutLMv2 / LayoutXLM tokenizers (#14493)
in case of left padding_side there was a copy/paste error
assigning the bbox data to the labels
2021-11-22 16:00:43 -05:00
Sylvain Gugger
204d251310
Auto processor (#14465)
* Add AutoProcessor class

* Init and tests

* Add doc

* Fix init

* Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverts to tokenizer or feature extractor when available

* Adapt test

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-11-22 12:17:38 -05:00
Stas Bekman
11f65d4158
[test] add test for --config_overrides (#14466)
* add test for --config_overrides

* remove unneeded parts of the test
2021-11-22 11:33:43 -05:00
Daniel Stancl
e0e2da1194
Improve a add-new-pipeline docs a bit (#14485) 2021-11-22 10:35:49 -05:00
Nicolas Patry
a4553e6c64
Moving pipeline tests from Narsil to hf-internal-testing. (#14463)
* Moving everything to `hf-internal-testing`.

* Fixing test values.

* Moving to other repo.

* Last touch?
2021-11-22 04:40:45 -05:00
Sylvain Gugger
1a92bc5788
Fix dummy objects for quantization (#14478)
* Fix dummy objects for quantization

* Add more models
2021-11-21 17:39:20 -05:00
Alexander Measure
c9d2cf855a
add Tuple as possible type hint for EvalPredictions label_ids (#14473)
* Update trainer_utils.py

* add Tuple type hints to all label_ids outputs

affects EvalLoopOutput and PredicctionOutput
2021-11-21 10:31:09 -05:00
Shang Zhang
a59e7c1ed4
Add QDQBert model and quantization examples of SQUAD task (#14066)
* clean up branch for add-qdqbert-model

* README update for QAT example; update docstrings in modeling_qdqbert.py

* Update qdqbert.rst

* Update README.md

* Update README.md

* calibration data using traning set; QAT example runs in fp32

* re-use BERTtokenizer for qdqbert

* Update docs/source/model_doc/qdqbert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove qdqbert tokenizer

* Update qdqbert.rst

* update evaluate-hf-trt-qa.py

* update configuration_qdqbert.py

* update modeling_qdqbert.py: add copied statement; replace assert with ValueError

* update copied from statement

* add is_quantization_available; run make fix-copies

* unittest add require_quantization

* add backend dependency to qdqbert model

* update README; update evaluate script; make style

* lint

* docs qdqbert update

* circleci build_doc add pytorch-quantization for qdqbert

* update README

* update example readme with instructions to upgrade TensorRT to 8.2

* Update src/transformers/models/qdqbert/configuration_qdqbert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change quantization to pytorch_quantization for backend requirement

* feed_forward_chunking not supported in QDQBert

* make style

* update model docstrings and comments in testing scripts

* rename example to quantization-qdqbert; rename example scripts from qat to quant

* Update src/transformers/models/qdqbert/modeling_qdqbert.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* rm experimental functions in quant_trainer

* qa cleanup

* make fix-copies for docs index.rst

* fix doctree; use post_init() for qdqbert

* fix early device assignment for qdqbert

* fix CI:Model templates runner

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-11-19 13:33:39 -05:00
Nicolas Patry
81fe8afaac
Adding support for hidden_states and attentions in unbatching (#14420)
support.
2021-11-19 15:37:52 +01:00
Patrick von Platen
f25a9332e8
[Generation] Allow inputs_embeds as an input (#14443)
* up

* finalize

* finalize

* finish

* Update src/transformers/generation_utils.py

* apply feedback
2021-11-19 15:35:06 +01:00
NielsRogge
0490b98877
[ImageGPT] Small fixes (#14460)
* Add integration test

* Fix typo
2021-11-19 15:15:02 +01:00
Lysandre Debut
331c3d2aa0
Add GitPython to quality tools (#14459)
* Update setup.py

* Update setup.py

* Update setup.py

* Remove GitPython install
2021-11-19 08:43:48 -05:00
Patrick von Platen
efea0f868b
[Speech Recognition] More examples
Add more XLS-R training runs to the official examples
2021-11-18 23:42:02 +01:00