transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-13 01:30:04 +06:00

Author	SHA1	Message	Date
Nicola Procopio	0a5b61d004	Added preprocessing.mdx italian translation (#17600 ) * updated _toctree.yml * added preprocessing * updated preprocessing.mdx * updated preprocessing.mdx updated after review	2022-07-18 19:06:10 +02:00
gcheron	8c14b342aa	add ONNX support for LeVit (#18154 ) Co-authored-by: Guilhem Chéron <guilhemc@authentifier.com>	2022-07-18 15:17:07 +02:00
Lysandre Debut	c1c79b0655	NLLB tokenizer (#18126 ) * NLLB tokenizer * Apply suggestions from code review - Thanks Stefan! Co-authored-by: Stefan Schweter <stefan@schweter.it> * Final touches * Style :) * Update docs/source/en/model_doc/nllb.mdx Co-authored-by: Stefan Schweter <stefan@schweter.it> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * PR reviews * Auto models Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-18 08:12:34 -04:00
amyeroberts	8581a798c0	Add TF DeiT implementation (#17806 ) * Initial TF DeiT implementation * Fix copies naming issues * Fix up + docs * Properly same main layer * Name layers properly * Initial TF DeiT implementation * Fix copies naming issues * Fix up + docs * Properly same main layer * Name layers properly * Fixup * Fix import * Fix import * Fix import * Fix weight loading for tests whilst not on hub * Add doc tests and remove to_2tuple * Add back to_2tuple Removing to_2tuple results in many downstream changes needed because of the copies checks * Incorporate updates in Improve vision models #17731 PR * Don't hard code num_channels * Copy PyTorch DeiT embeddings and remove pytorch operations with mask * Fix patch embeddings & tidy up * Update PixelShuffle to move logic into class layer * Update doc strings - remove PT references * Use NHWC format in internal layers * Fix up * Use linear activation layer * Remove unused import * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Move dataclass to top of file * Remove from_pt now weights on hub * Fixup Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>	2022-07-13 18:04:08 +01:00
Wei	7ea6ccc2b3	Enable torchdynamo with torch_tensorrt(fx path) (#17765 ) * enable fx2trt * Update perf_train_gpu_one.mdx * Update perf_train_gpu_one.mdx * add lib check * update * format * update * fix import check * fix isort * improve doc * refactor ctx manager * fix isort * black format * isort fix * fix format * update args * update black * cleanups * Update perf_train_gpu_one.mdx * code refactor * code refactor to init * remove redundancy * isort * replace self.args with args Co-authored-by: Stas Bekman <stas@stason.org>	2022-07-13 12:43:28 -04:00
Yulv-git	95113d1365	Fix some typos. (#17560 ) * Fix some typos. Signed-off-by: Yulv-git <yulvchi@qq.com> * Fix typo. Signed-off-by: Yulv-git <yulvchi@qq.com> * make fixup.	2022-07-11 05:00:13 -04:00
varshith	91c4a3ab1a	Added Command for windows VENV activation in installation docs (#18008 ) * Added command for windows VENV activation * changed linux and macos specification	2022-07-07 08:18:44 -04:00
Sylvain Gugger	1b749a7f8d	Sort doc toc (#18034 ) * Add script to sort doc ToC * Style and fixes * Add check to quality job	2022-07-07 08:17:58 -04:00
Sylvain Gugger	2e90c3df8f	Doc to dataset (#18037 ) * Link to the Datasets doc * Remove unwanted file	2022-07-06 12:10:06 -04:00
NielsRogge	22edb68d49	Squash commits (#17981 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-07-06 08:11:48 -04:00
Matthijs Hollemans	6cb19540c9	sort list of models (#18011 )	2022-07-04 09:20:55 -04:00
amyeroberts	77ea5130a1	Add TF ResNet model (#17427 ) * Rought TF conversion outline * Tidy up * Fix padding differences between layers * Add back embedder - whoops * Match test file to main * Match upstream test file * Correctly pass and assign image_size parameter Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Add in MainLayer * Correctly name layer * Tidy up AdaptivePooler * Small tidy-up More accurate type hints and remove whitespaces * Change AdaptiveAvgPool Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools if the output shape does not evenly divide by input shape c.f. `9e26607e22 (r900109509)` Co-authored-by: From: matt <rocketknight1@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Use updated AdaptiveAvgPool Co-authored-by: matt <rocketknight1@gmail.com> * Make AdaptiveAvgPool compatible with CPU * Remove image_size from configuration * Fixup * Tensorflow -> TensorFlow * Fix pt references in tests * Apply suggestions from code review - grammar and wording Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add TFResNet to doc tests * PR comments - GlobalAveragePooling and clearer comments * Remove unused import * Add in keepdims argument * Add num_channels check * grammar fix: by -> of Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Remove transposes - keep NHWC throughout forward pass * Fixup look sharp * Add missing layer names * Final tidy up - remove from_pt now weights on hub Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-07-04 10:59:15 +01:00
Lysandre Debut	7b18702ca7	Add link to existing documentation (#17931 )	2022-07-04 04:13:05 -04:00
Nouamane Tazi	b68d408f1b	add ONNX support for BLOOM (#17961 ) * add onnx support for BLOOM * use TYPE_CHECKING for type annotations * fix past_shape for bloom (different from gpt2) * use logical_or instead of `+` for onnx support * bigger `atol_for_validation` for larger bloom models * copied -> taken because it's no longer an exact copy * remove "copied from" comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-01 10:44:42 -04:00
Billy Cao	cb42502410	Fix typo in perf_train_gpu_one.mdx (#17983 )	2022-07-01 09:19:13 -04:00
Aaron Pham	49cd736a28	feat: add pipeline registry abstraction (#17905 ) * feat: add pipeline registry abstraction - added `PipelineRegistry` abstraction - updates `add_new_pipeline.mdx` (english docs) to reflect the api addition - migrate `check_task` and `get_supported_tasks` from transformers/pipelines/__init__.py to transformers/pipelines/base.py#PipelineRegistry.{check_task,get_supported_tasks} Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * fix: update with upstream/main chore: Apply suggestions from sgugger's code review Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * chore: PR updates - revert src/transformers/dependency_versions_table.py from upstream/main - updates pipeline registry to use global variables Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * tests: add tests for pipeline registry Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * tests: add test for output warning. Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * chore: fmt and cleanup unused imports Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * fix: change imports to top of the file and address comments Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-30 12:11:08 -04:00
regisss	9cb7cef285	Add ONNX support for LayoutLMv3 (#17953 ) * Add ONNX support for LayoutLMv3 * Update docstrings * Update empty description in docstring * Fix imports and type hints	2022-06-30 12:09:52 -04:00
Crystina	692e61e91a	Flax t5 Encoder (#17784 ) * first draft adding Flax-t5-encoder and Flax-mt5-encoder * imports * after make fixup * flax t5 encoder test * black on test * make fix-copies * clean * all_model_classes -> tuple * clean test * is_encoder_decoder=False in t5-enc tester * remove file docstring before FlaxT5Encoder * black * isort * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * remove _get_encoder_module * self.decoder_seq_length -> self.encoder_seq_length as t5-enc does not have decoder * bugfix - self.module_class is class itself, not instance; * docs for mt5 and t5 * call -> __call__ in t5 doc * FlaxMT5EncoderModel to TYPE_HINT * run doc-builder to allow change the files Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-06-30 00:49:02 +02:00
Matthijs Hollemans	fbc7598bab	add MobileViT model (#17354 ) * add MobileViT * fixup * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove empty line Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * use clearer variable names * rename to MobileViTTransformerLayer * no longer inherit from nn.Sequential * fixup * fixup * not sure why this got added twice * rename organization for checkpoints * fix it up * Update src/transformers/models/mobilevit/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/mobilevit/test_modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * code style improvements * fixup * Update docs/source/en/model_doc/mobilevit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/mobilevit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * download labels from hub * rename layers * rename more layers * don't compute loss in separate function * remove some nn.Sequential * replace nn.Sequential with new MobileViTTransformer class * replace nn.Sequential with MobileViTMobileNetLayer * fix pruning since model structure changed * fixup * fix doc comment * remove custom resize from feature extractor * fix ONNX import * add to doc tests * use center_crop from image_utils * move RGB->BGR flipping into image_utils * fix broken tests * wrong type hint * small tweaks Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-29 16:07:51 -04:00
StevenTang1998	3cff4cc587	Add MVP model (#17787 ) * Add MVP model * Update README * Remove useless module * Update docs * Fix bugs in tokenizer * Remove useless test * Remove useless module * Update vocab * Remove specifying * Remove specifying * Add #Copied ... statement * Update paper link * Remove useless TFMvp * Add #Copied ... statement * Fix style in test mvp model * Fix some typos * Fix properties of unset special tokens in non verbose mode * Update paper link * Update MVP doc * Update MVP doc * Fix README * Fix typos in docs * Update docs	2022-06-29 09:30:55 -04:00
Aritra Roy Gosthipaty	a7eba83161	TF implementation of RegNets (#17554 ) * chore: initial commit Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets. * chore: porting the rest of the modules to tensorflow did not change the documentation yet, yet to try the playground on the model * Fix initilizations (#1) * fix: code structure in few cases. * fix: code structure to align tf models. * fix: layer naming, bn layer still remains. * chore: change default epsilon and momentum in bn. * chore: styling nits. * fix: cross-loading bn params. * fix: regnet tf model, integration passing. * add: tests for TF regnet. * fix: code quality related issues. * chore: added rest of the files. * minor additions.. * fix: repo consistency. * fix: regnet tf tests. * chore: reorganize dummy_tf_objects for regnet. * chore: remove checkpoint var. * chore: remov unnecessary files. * chore: run make style. * Update docs/source/en/model_doc/regnet.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * chore: PR feedback I. * fix: pt test. thanks to @ydshieh. * New adaptive pooler (#3) * feat: new adaptive pooler Co-authored-by: @Rocketknight1 * chore: remove image_size argument. Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: matt <rocketknight1@gmail.com> * Empty-Commit * chore: remove image_size comment. * chore: remove playground_tf.py * chore: minor changes related to spacing. * chore: make style. * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: amyeroberts <aeroberts4444@gmail.com> * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: amyeroberts <aeroberts4444@gmail.com> * chore: refactored __init__. * chore: copied from -> taken from./g * adaptive pool -> global avg pool, channel check. * chore: move channel check to stem. * pr comments - minor refactor and add regnets to doc tests. * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * minor fix in the xlayer. * Empty-Commit * chore: removed from_pt=True. Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-29 13:45:14 +01:00
Jerry Jiarui XU	6c8f4c9a93	Adding GroupViT Models (#17313 ) * add group vit and fixed test (except slow) * passing slow test * addressed some comments * fixed test * fixed style * fixed copy * fixed segmentation output * fixed test * fixed relative path * fixed copy * add ignore non auto configured * fixed docstring, add doc * fixed copies * Apply suggestions from code review merge suggestions Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolve comment, renaming model * delete unused attr * use fix copies * resolve comments * fixed attn * remove unused vars * refactor tests * resolve final comments * add demo notebook * fixed inconsitent default * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename stage->stages * Create single GroupViTEncoderLayer class * Update conversion script * Simplify conversion script * Remove cross-attention class in favor of GroupViTAttention * Convert other model as well, add processor to conversion script * addressing final comment * fixed args * Update src/transformers/models/groupvit/modeling_groupvit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-28 20:51:47 +02:00
regisss	76d13de5ae	Add ONNX support for DETR (#17904 )	2022-06-28 14:48:43 +02:00
Bill Ray	bfcd5743ee	In `group_texts` function, drop last block if smaller than `block_size` (#17908 )	2022-06-28 08:34:55 -04:00
Matt	ee0d001de7	Add a TF in-graph tokenizer for BERT (#17701 ) * Add a TF in-graph tokenizer for BERT * Add from_pretrained * Add proper truncation, option handling to match other tokenizers * Add proper imports and guards * Add test, fix all the bugs exposed by said test * Fix truncation of paired texts in graph mode, more test updates * Small fixes, add a (very careful) test for savedmodel * Add tensorflow-text dependency, make fixup * Update documentation * Update documentation * make fixup * Slight changes to tests * Add some docstring examples * Update tests * Update tests and add proper lowercasing/normalization * make fixup * Add docstring for padding! * Mark slow tests * make fixup * Fall back to BertTokenizerFast if BertTokenizer is unavailable * Fall back to BertTokenizerFast if BertTokenizer is unavailable * make fixup * Properly handle tensorflow-text dummies	2022-06-27 12:06:21 +01:00
rooa	d6b6fb9963	Add CodeGen model (#17443 ) * Add CodeGen model * Add missing key and switch order of super() * Fix torch.ones init with uint8 instead of bool * Address comments: copy statements and doc * update tests * remove old model parallel * fix batch gen tests * fix batch gen test * update test_gpt2_sample_max_time * fix codgen test and revert gpt2 test change * Fix incorrect tie_word_embedding value, typo, URL * Fix model order in README and styling * Reorder model list alphabetically * Set tie_word_embedding to False by default * Apply suggestions from code review * Better attn mask name & remove attn masked_bias * add tokenizer for codegen * quality * doc tokenizer * fix-copies * add CodeGenTokenizer in converter * make truncation optional * add test for truncation * add copyright * fix-copies * fix fast tokenizer decode * Update src/transformers/models/codegen/tokenization_codegen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * increase vocab_size in tests Co-authored-by: patil-suraj <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-06-24 17:10:38 +02:00
Vishwas	c2c0d9db5f	Improve encoder decoder model docs (#17815 ) * Copied all the changes from the last PR * added in documentation_tests.txt * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: vishwaspai <vishwas.pai@emplay.net> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2022-06-24 14:48:19 +02:00
Sijun He	7cf52a49de	Nezha Pytorch implementation (#17776 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * wip * wip * wip * most basic tests passes * all tests pass now * relative embedding * wip * running make fixup * remove bert changes * fix doc * fix doc * fix issues * fix doc * address comments * fix CI * remove redundant copied from * address comments * fix broken test Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-23 12:36:22 -04:00
Leandro von Werra	6f29029b05	Improve performance docs (#17750 ) * add skeleton files * fix cpu inference link * add hint to make clear that single gpu section contains general info * add new files to ToC * update toctree to have subsection for performance * add "coming soon" to the still empty sections * fix missing title * fix typo * add reference to empty documents * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-23 14:51:54 +02:00
Anugunj Naman	27e907386a	Fix Automatic Download of Pretrained Weights in DETR (#17712 ) * added use_backbone_pretrained * style fixes * update * Update detr.mdx * Update detr.mdx * Update detr.mdx * update using doc py * Update detr.mdx * Update src/transformers/models/detr/configuration_detr.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-21 16:45:35 +02:00
mrbean	eb16be415a	add onnx support for deberta and debertav2 (#17617 ) * add onnx support for debertav2 * debertav2 -> deberta-v2 in onnx features file * remove causal lm * add deberta-v2-xlarge to onnx tests * use self.type().dtype() in xsoftmax Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * remove hack for deberta * remove unused imports * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * use generate dummy inputs * linter * add imports * add support for deberta v1 as well * deberta does not support multiple choice * Update src/transformers/models/deberta/configuration_deberta.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * one line ordered dict * fire build Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>	2022-06-21 11:04:15 +02:00
Patrick von Platen	8fcbe275c3	Add UL2 (just docs) (#17740 ) * Add UL2 Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com> * Correct naming * sort better * up * apply sylvains suggestion	2022-06-21 10:24:50 +02:00
Rafael Zimmer	0d92798b45	Added translation of index.mdx to Portuguese Issue #16824 (#17565 ) * Added translation of installation.mdx to Portuguese, as well as default templates of _toctree.yml and _config.py * [ build_documentation.yml ] - Updated doc_builder to build documentation in Portuguese. [ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx. * [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder. [ pipeline_tutorial.mdx ] - Grammar changes. * [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial. * [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial. [ training.mdx ] - Added portuguese translation for training tutorial. * [ preprocessing.mdx ] - WIP * Update _toctree.yml * Adding Pré-processamento to _toctree.yml * Update accelerate.mdx * Nits and eliminate preprocessing file while it is ready * [ index.mdx ] - Translated to Portuguese the index apresentation page. * [ docs/source/pt ] - Updated _toctree.yml to match newest translations. * Fix build_pr_documentation.yml * Fix index nits * nits in _toctree Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-06-17 20:06:05 -04:00
Sylvain Gugger	3981ee8650	Sort the model doc Toc Alphabetically (#17723 )	2022-06-15 16:11:56 -04:00
Patrick von Platen	7f14839f55	[Wav2Vec2Conformer] Official release (#17709 ) * [Wav2Vec2Conformer] Official release * remove from not-in-readme	2022-06-15 18:34:15 +02:00
Hailey Schoelkopf	edb672ac5e	Add `BloomForSequenceClassification` and `BloomForTokenClassification` classes (#17639 ) * add new bloom classes * (feat) add bloom classification tests; make style * style: change import in test * add some typehints to bloom classes * merge main into branch * fix: input checking in bloom seq classification * fix tests * change model class tests * fix few tests - more tests should pass - one test left * make token classifier return hidden states * style: make BLOOM typehints consistent Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2022-06-14 17:10:12 +02:00
jianan-gu	3b29c9fdb7	Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153 ) * add jit mode option and model wrap * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add ut and refine code * code refine * refine code * add inference doc * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add cpu inference performance doc * Update perf_infer_cpu.mdx * Update perf_infer_cpu.mdx * Update performance.mdx * Update _toctree.yml * refine jit func naming * Update _toctree.yml * Delete perf_infer_gpu_one.mdx * Update perf_infer_cpu.mdx * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add none check before jit * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-14 07:56:47 -04:00
Daniel Stancl	a72f1c9f5b	Add `LongT5` model (#16792 ) * Initial commit * Make some fixes * Make PT model full forward pass * Drop TF & Flax implementation, fix copies etc * Add Flax model and update some corresponding stuff * Drop some TF things * Update config and flax local attn * Add encoder_attention_type to config * . * Update docs * Do some cleansing * Fix some issues -> make style; add some docs * Fix position_bias + mask addition + Update tests * Fix repo consistency * Fix model consistency by removing flax operation over attn_mask * [WIP] Add PT TGlobal LongT5 * . * [WIP] Add flax tglobal model * [WIP] Update flax model to use the right attention type in the encoder * Fix flax tglobal model forward pass * Make the use of global_relative_attention_bias * Add test suites for TGlobal model * Fix minor bugs, clean code * Fix pt-flax equivalence though not convinced with correctness * Fix LocalAttn implementation to match the original impl. + update READMEs * Few updates * Update: [Flax] improve large model init and loading #16148 * Add ckpt conversion script accoring to #16853 + handle torch device placement * Minor updates to conversion script. * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM * gpu support + dtype fix * Apply some suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * * Remove (de)parallelize stuff * Edit shape comments * Update README.md * make fix-copies * Remove caching logic for local & tglobal attention * Apply another batch of suggestions from code review * Add missing checkpoints * Format converting scripts * Drop (de)parallelize links from longT5 mdx * Fix converting script + revert config file change * Revert "Remove caching logic for local & tglobal attention" This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46. * Stash caching logic in Flax model * Make side relative bias used always * Drop caching logic in PT model * Return side bias as it was * Drop all remaining model parallel logic * Remove clamp statements * Move test files to the proper place * Update docs with new version of hf-doc-builder * Fix test imports * Make some minor improvements * Add missing checkpoints to docs * Make TGlobal model compatible with torch.onnx.export * Replace some np.ndarray with jnp.ndarray * Fix TGlobal for ONNX conversion + update docs * fix _make_global_fixed_block_ids and masked neg value * update flax model * style and quality * fix imports * remove load_tf_weights_in_longt5 from init and fix copies * add slow test for TGlobal model * typo fix * Drop obsolete is_parallelizable and one warning * Update __init__ files to fix repo-consistency * fix pipeline test * Fix some device placements * [wip]: Update tests -- need to generate summaries to update expected_summary * Fix quality * Update LongT5 model card * Update (slow) summarization tests * make style * rename checkpoitns * finish * fix flax tests Co-authored-by: phungvanduy <pvduy23@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: patil-suraj <surajp815@gmail.com>	2022-06-13 22:36:58 +02:00
Sijun He	66336dc183	Add Visual Question Answering (VQA) pipeline (#17286 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * Update src/transformers/models/auto/modeling_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * merge Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-13 07:49:44 -04:00
Martina Fumanelli	e0b58fb5ba	Translation/autoclass (#17615 ) * Add Italian translation for autoclass_tutorial.mdx * Fix synthesis Co-authored-by: martina.fumanelli <martina.fumanelli@MBP-di-martinafumanelli.local>	2022-06-09 20:56:44 -04:00
Sylvain Gugger	29080643eb	Mention in the doc we drop support for fairscale (#17610 )	2022-06-09 12:20:39 -04:00
regisss	e0be053e43	Add ONNX support for ConvNeXT (#17627 )	2022-06-09 09:31:02 -04:00
regisss	5323094a22	Add ONNX support for ResNet (#17585 ) * Add ONNX support for ResNet * Add ONNX test * make fix-copies	2022-06-09 08:44:27 -04:00
Younes Belkada	ca2a55e9df	BLOOM (#17474 ) * adding template * update model * model update * update conf for debug model * update conversion * update conversion script * update conversion script * fix missing keys check * add tests to test the tokenizer in the local machine * Change variable name * add tests on xnli dataset * add more description * add descriptions + clearer code * clearer code * adding new tests + skipping few tests because of env problems * change comment * add dtype on the configuration * add test embeddings * add hardcoded test * fix dtype issue * adding torch.float16 to config * adding more metrics (min, max, mean) * add sum * now the test passes with almost equal * add files for conversion - test passes on cpu gpu * add final changes * cleaning code * add new args in the docstring * fix one liner function * remove macros * remove forward attention * clean up init funtion * add comments on the issue * rm scale mask softmax * do make style * fix dtype in init * fixing for loop on att probs * fix style with black * fix style + doc error * fix and debug CI errors (docs + style) * some updates - change new operations - finally add scaled softmax - added new args in the config * make use cache working * add changes - save sharded models - final changes on the modeling script * add changes - comment on alibi - add TODO on seq length * test commit - added a text to test the commit Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * final changes - attention mask change - generation works on BS176b Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * changes - model + conversion * move to correct dir * put , * fex fixes * fix tokenizer autodoc * fix minor CI issues * fix minor CI issues * fix minor CI issues * fix style issue * fix minor import issues * fix few issues * remove def main on the test * add require torch * replace decorator with 'with' * fix style * change to bloom * add quick fix tokenizer * fix tokenizer file * fix tokenizer - merge tests - small fixes * fix import issue * add bloom to readme * fix consistency * Update docs/source/en/model_doc/bloom.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review fix comment issues on file headers Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix doc issue * small fix - modeling test * some changes - refactor some code - taking into account reviews - more tests should pass - removed pruning tests * remove useless division * more tests should pass * more tests should pass * more tests should pass * let's try this one -add alibi offset - remove all permutes to make the grad operations work - finger crossed * refactor - refactor code - style changes - add new threshold for test * major changes - change BLOOM to Bloom - add quick doc on bloom.mdx - move embeddings test on modeling test * modify readme * small fixes * small fix - better threshold for a test * remove old test file from fetcher * fix small typo * major change - change BloomLMHead to BloomForCausalLM * remove onnx config * major changes - refactor the code - remove asserts - change tol for test * make style * small change * adding a slow test + commenting old ones for now * make style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style * fix duplicates * cleaning comments on config * clean a bit conversion file * refacor a bit modeling file * refactor tokenizer file * fix tokenization test issue * fix tokenization issue #2 * fix tokenization issue second try * fix test issue * make style + add suggestions * change test fetcher * try this one - slow tests should pass - finger crossed * possible final changes * make style * try fix padding side issue * fix side * fix padding issue * fix ko-readme * fix config auto * cleaning modeling file * keep bloom in caps in ko * update config docs * remove pretraining_pp * remove model parallel * update config - add correct config files * fix duplicates * fix fetcher * fix refactor issue - remove divide function * try to remove alibi * small fixes - fix alibi - remove seq length - refactor a bit the code * put correct values - fix bos and eos token ids * fix attention mask loop Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * small fixes: - remove skip bias add * small fixes - fix typo in readme - fix typos in config * small changes - remove a test - add reconstruction test - change config * small changes - change Scaled Softmax to BloomScaledSoftmax * small fixes - fix alibi dtype * major changes - removing explicit dtype when loading modules - fixing test args (torch_dtype=auto) - add dosctring * fix readmes * major changes - now bloom supports alibi shifting - refactor a bit the code - better test tolerance now * refactor a bit * refactor a bit * put correct name on test * change docstring * small changes - fix docstring modeling - fix test tolerance * fix small nit - take dtype from tensors in the conversion script * minor fix - fix mdx issue * minor fix - change config docstring * forward contrib credits from PR14084 * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply modifications Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * resolve softmax upcast * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * final changes modeling Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Merge commit 'd156898f3b9b2c990e5963f5030a7143d57921a2' * merge commit * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply suggestions Apply suggestions from Stas comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Fix gradient checkpointing Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add slow but exact * add accelerate compatibility Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com> * forward contrib credits Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com> Co-authored-by: sgugger <sgugger@users.noreply.github.com> Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix torch device on tests * make style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix nits Co-authored-by: patrickvonplaten<patrickvonplaten@users.noreply.github.com> * remove final nits * fix doc - add more details on the doc - add links to checkpoints * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions Co-authored-by: sgugger <sgugger@users.noreply.github.com> * put test torchscript to false * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: justheuristic <justheuristic@gmail.com> * fix alibi - create alibi only once * add small doc * make quality * replace torch.nn * remove token type emb * fix fused op + output bias * add fused op - now can control fused operation from config * remove fused op * make quality * small changes - remove unsed args on config - removed bias gelu file - make the model torchscriptable - add torchscript slow tests * Update src/transformers/models/bloom/modeling_bloom.py * fix slow * make style * add accelerate support * add bloom to deepspeed tests * minor changes * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * minor change * slow tests pass * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/bloom.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor changes: - change docstring - add link to paper Co-authored-by: Thomwolf <thomwolf@gmail.com> Co-authored-by: Thomas Wolf <thomas@huggingface.co> Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sIncerass <sheng.s@berkeley.edu> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com> Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com> Co-authored-by: sgugger <sgugger@users.noreply.github.com> Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com> Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: justheuristic <justheuristic@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-06-09 12:00:40 +02:00
jianan-gu	34097b3304	Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138 ) * init PR * fix import ipex * minor fix on bf16 * refine optimizer * refine args notes * refine code * refine ipex optimize args * refine half_precision_backend * black format * isort format * isort format files * flake8 format * doc builder format * refine codes * remove jit and optim bits * black preview format * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * refine notes * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * code refine * add ipex ut * add performance cpu doc * link to the cpu doc from main perf doc * install ipex into CI's docker * Update perf_train_cpu.mdx * Update docs/source/en/perf_train_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update perf_train_cpu.mdx * Update perf_train_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-08 09:41:57 -04:00
Sayak Paul	9d99489f2f	Add TFData2VecVision for semantic segmentation (#17271 ) * feat: initial implementation of data2vec segmentation model in TF. * chore: minor corrections to make the segmenter work. * chore: removed unncessary files. * chore: add tests and other modifications. * fix: loss computation for segmentation. * chore: remove unused variable. * chore: formatting. * added a dummy adaptive pooling layer. * removed unnecessary file. * potentially add identifiers to layer names. * fix: layer naming. * chore: removed unnecessary print. * Skipping unneeded test * chore: add logging to debug tolerance. * fix: segmentation tests for tfdata2vecvision * chore: make style. * fix: layer names, assertion to be resolved. * Bumping test tolerance a bit * chore: bump the tol in PT test. Co-authored-by: matt <rocketknight1@gmail.com>	2022-06-08 14:03:18 +01:00
Chan Woo Kim	119e3c0fc8	M-CTC-T Model (#16402 ) * added cbs to notebooks, made copy-paste error fix in generation_utils * initial push for mctc model * mctc feature extractor done * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * passing attention, now struggling to figure out how attention masks make sense here * works when excluding attention masks. ask later how one would integrate attention maskshere * bizarre configuration error (model prefix comes first in config dict json and messes up the order) * all passing but bizzarre config dict ordering issue when to_dict * passing all major tests * feature extraction, processor, tokenizer added & tests passing * style & consistency & other logistical fixes * copy paste fix * model after feature extraction working * commiting final feature extraction results; need to fix normalization * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions? * delete print ; format code a bit * fixing tests * passing major tests * fixing styles * completed tokenization test with real example; not sure if these values are entirely correct. * last test fixes from local * reverting accidentally included custom setup configs * remove load tf weights; fix config error * testing couldnt import featureextractor * fix docs * fix docs * resolving comments * style fixes * style fixes * Update to MCTCConv1dSubSampler Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * relposemb fixes * conv1d name issue; expecting config fail with paraentheses * fix config issue * fix config issue * fix config issue * change everything to MCTCT * fixing naming change errors * archive list * copyrights and docs * copyrights and docs * copyrights and docs * merge resolution * move tests, fix to changed optionaldependency structure * test directories changed * fixing tests * how to avoid tf tests? * how to avoid tf tests? * tests passing locally * allow mctctprocessor imported any env * allow mctctprocessor imported any env * fixed second round of feedback, need to fix docs * doc changes not being applied * all fixed * style fix * feedback fixes * fix copies and feature extraction style fix * Update tests/models/visual_bert/test_modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * copy paste huggingface:main visual bert * added eof newline to visual bert; all tests are passing otherwise * fix slow tests by adding attention mask * change model id to speechbrain * make fix-copies * fix readme unwanted deletes * fixing readmes, make fix-copies * consistent M-CTC-T naming * Update src/transformers/models/mctct/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * all fixed but variable naming * adjust double quotes * fixed variable names * copyright and mr quilter * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct slow tests * make fix-copies * Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * m-ctc-t not mctct Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-08 00:33:07 +02:00
Vítor Fróis	706bb8364d	quicktour.mdx en -> pt translation (#17074 ) * Quicktour Portuguese Translation Translated quicktour.mdx until line 161 * Finished translating quicktour.mdx Ready to upload and adjust eventual .mdx or translation mistakes. * Add _toctree.yml and fix nits * Fixed pt-br mdx syntax problem Closed <frameworkcontent> instance * Changed </frameworkcontent> line * Copied missing block from english version of quicktour.mdx * Reviwed the entire file once again. It should be working now. Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-06-07 17:35:05 -04:00
Omar U. Espejel	b118730745	Fix gendered sentence in Spanish translation(#17558 )	2022-06-07 14:09:39 +02:00
Nicola Procopio	34a886fce3	Translation/italian: added pipeline_tutorial.mdx [Issue: #17459 ] (#17507 ) * added toctree.yml file * first translation * added pipeline_tutorial.mdx translation added pipeline_tutorial.mdx updated _toctree.yml * updated pipeline_tutorial.mdx * updated _toctree.yml Updated preprocessing and training * updated preprocessing.mdx start translation * Update _toctree.yml * Delete preprocessing.mdx * Update _toctree.yml * updated _toctree.yml * added preprocessing * Update _toctree.yml * updated _toctree.yml * undo * Revert "undo" This reverts commit `5d38d76875`. * Revert "Revert "undo"" This reverts commit `8aa0830b58`.	2022-06-06 10:35:20 -04:00
Martina Fumanelli	f6ad0e0556	Add installation.mdx Italian translation (#17530 ) * Add the Italian translation of the file installation.mdx and edit _toctree * Add the Italian translation of the file installation.mdx and edit _toctree	2022-06-06 07:48:08 -04:00
Jonatas Grosman	4aed1dc81b	Adding the Portuguese version of the tasks/token_classification.mdx documentation (#17492 ) * add tasks/token_classification pt doc structure * add tasks/token_classification pt doc translation * add tasks/token_classification pt doc translation	2022-06-06 07:47:34 -04:00
Britney Muller	72f5b94984	Update index.mdx (#17547 ) This PR updates our Expert Acceleration Program image with a new image featuring our experts. This is similar to our Transformers/README.md image update that has proven to be successful.	2022-06-03 12:56:37 -05:00
Patrick Deutschmann	babeff5524	Add support for Perceiver ONNX export (#17213 ) * Start adding perceiver support for ONNX * Fix pad token bug for fast tokenizers * Fix formatting * Make get_preprocesor more opinionated (processor priority, otherwise tokenizer/feature extractor) * Clean docs format * Minor cleanup following @sgugger's comments * Fix typo in docs * Fix another docs typo * Fix one more typo in docs * Update src/transformers/onnx/utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/onnx/utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/onnx/utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-03 07:40:22 -04:00
Robert Dargavel Smith	5c17918fe4	Allow from transformers import TypicalLogitsWarper (#17477 ) * Allow from transformers import TypicalLogitsWarper * Added TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper Added TypicalLogitsWarper Allow from transformers import TypicalLogitsWarper Allow from transformers import TypicalLogitsWarper Allow from transformers import TypicalLogitsWarper	2022-06-03 11:08:35 +02:00
Sylvain Gugger	048dd73bba	Check list of models in the main README and sort it (#17517 ) * Script for README * Fix copies * Complete error message	2022-06-02 08:10:08 -04:00
Anugunj Naman	84aaadd8c5	Adding LeViT Model by Facebook (#17466 ) * levit files * levit tests * weights script * weights script * update * style fixes * few minor corrections * Added teacher model * edit docs * fix-copies * style fixes * pr error resolved * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/configuration_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/configuration_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * suggested pr changes * style fixes * minor bug * update * minor doc edit * style * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/levit/test_modeling_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * residual layer readable * style * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update tests/models/levit/test_feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * change checkpoints and style * update * minor changes * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-01 17:06:20 +02:00
Ruihua Fang	4f38808e9e	Add OnnxConfig for SqueezeBert iss17314 (#17315 ) * add onnx config for SqueezeBert * add test for onnx config for SqueezeBert * add automatically updated doc for onnx config for SqueezeBert * Update src/transformers/onnx/features.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update src/transformers/models/squeezebert/configuration_squeezebert.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-06-01 06:16:15 -04:00
Arthur	7822a9b7a7	Opt in flax and tf (#17388 ) * initial commit * add init file * update globakl init * update index and dummy objects * style * update modelling auto * fix initi typo in src/transformers * fix typo in modeling tf auto, opt was in wrong mapping name * fixed a slow test : saved_model * style * fix positionnal embedding if no position id is provided * update tf test * update test flax requirements * fixed serialization * update * update tf name to allow smooth convertion * update flax tests * style * fix test typo * fix tf typo test * add xla for generate support in causal LM * fixed bug * cleaned tf tests * style * removed from PT for slow tests * fix typp * opt test as slow * trying to fix GPT2 undefined * correct documentation and add to test doc * update tf doc * fix doc * fake commit * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * update test based on review * merged main layer for functionning test * fixup + quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update long comment * make fix copies Co-authored-by: Arthur <arthur@huggingface.co> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-31 18:41:22 +02:00
Martina Fumanelli	dfc38463b8	Setup for Italian translation and add quicktour.mdx translation (#17472 ) * Setup for Italian translation and add first document - Add 'it' folder for files translated into Italian - Add _config.py and _toctree.yml files - Add translation of quicktour.mdx * Fix style issue of italian documentation files * Add 'it' to the languages section in the .github/workflows * Remove - installation from _toctree for Italian * Translation for index file - Add index to _toctree.yml - Add translation of index.mdx * Fix typo in docs/source/it/index.mdx * Translate code comments in docs/source/it/_config.py Co-authored-by: Martina Fumanelli <martinafumanelli@Martinas-MBP.homenet.telecomitalia.it>	2022-05-31 09:57:43 -04:00
Ritik Nandwal	5af38953bb	Added XLM onnx config (#17030 ) * Add onnx configuration for xlm * Add supported features for xlm * Add xlm to models exportable with onnx * Add xlm architecture to test file * Modify docs * Make code quality fixes	2022-05-31 09:26:06 -04:00
Omar U. Espejel	2ef09ecfb8	Fix nits (#17349 )	2022-05-31 08:41:54 -04:00
Yhary Arias	2295bcaea8	Spanish translation of the file preprocessing.mdx (#16299 ) * Spanish translation of the file training.mdx * Settings - Spanish translation of the file training.mdx * Latest changes to the Spanish translation of the training.mdx file * Delete Hugging.mdx * Last changes to the training fil Espanish version * Latest modifications * Latest changes, document ready for PR * Nits * Spanish translation of the preprocessing file * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Nits and add preprocessing to _toctree.yml Co-authored-by: Yhary Arias <yharystefa@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-26 07:28:14 -04:00
Juanjo do Olmo	8f46ac9849	Spanish translation of the files sagemaker.mdx and image_classification.mdx (#17262 ) * Duplication of the source eng file * Spanish translation of the file multilingual.mdx * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Fix nits and finish translation * Spanish translation of sagemaker.mdx * Was deleted in main * Security saving * Complete translation of image_classification.mdx * Nits * nits * Update docs/source/es/image_classification.mdx * Add files to _toctree.yml * Fix toctree and add tasks folder Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-25 19:10:16 -04:00
Joaq	5e7f085fcc	Added es version of bertology.mdx doc (#17255 ) * added bertology es doc * toctree fix * Update docs/source/es/bertology.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/bertology.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/bertology.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * change position of bertology in _toctree.yml Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-25 18:46:53 -04:00
Jonatas Grosman	70484a8d74	Adding the Portuguese version of the tasks/sequence_classification.mdx documentation (#17352 ) * add sequence_classification pt doc structure * add Portuguese tasks/sequence_classification.mdx	2022-05-25 16:21:27 -04:00
Leandro von Werra	740a1574f1	fix link in performance docs (#17419 )	2022-05-25 20:54:43 +02:00
Jason Phang	71e602725b	[WIP] Adding GPT-NeoX-20B (#16659 ) * initial * first try * working 20B * 20B tokenizers * Docs * Import fixes for missing classes * Update docs, fixup * black formatting * isort * flake * dummy objects * documentation * Documentation yml * more docs * tweaks for tests * tokenization auto * fix neox tests * test * test * einsum * address PR feedback * Documentation * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_neox/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_neox/configuration_gpt_neox.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove undefined LaTeX syntax * Update to full url to avoid confusion about if that's supposed to refer to the Hub * fix auto * move tests * documentation fix * more doc fixes * test refactor * fix import * fix import * fix import * fix import * fix import * style fixes * More modeling fixes Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu> Co-authored-by: Stella Biderman <stellabiderman@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-24 09:31:10 -04:00
NielsRogge	31ee80d556	Add LayoutLMv3 (#17060 ) * Make forward pass work * More improvements * Remove unused imports * Remove timm dependency * Improve loss calculation of token classifier * Fix most tests * Add docs * Add model integration test * Make all tests pass * Add LayoutLMv3FeatureExtractor * Improve integration test + make fixup * Add example script * Fix style * Add LayoutLMv3Processor * Fix style * Add option to add visual labels * Make more tokenizer tests pass * Fix more tests * Make more tests pass * Fix bug and improve docs * Fix import of processors * Improve docstrings * Fix toctree and improve docs * Fix auto tokenizer * Move tests to model folder * Move tests to model folder * change default behavior add_prefix_space * add prefix space for fast * add_prefix_spcae set to True for Fast * no space before `unique_no_split` token * add test to hightligh special treatment of added tokens * fix `test_batch_encode_dynamic_overflowing` by building a long enough example * fix `test_full_tokenizer` with add_prefix_token * Fix tokenizer integration test * Make the code more readable * Add tests for LayoutLMv3Processor * Fix style * Add model to README and update init * Apply suggestions from code review * Replace asserts by value errors * Add suggestion by @ducviet00 * Add model to doc tests * Simplify script * Improve README * a step ahead to fix * Update pair_input_test * Make all tokenizer tests pass - phew * Make style * Add LayoutLMv3 to CI job * Fix auto mapping * Fix CI job name * Make all processor tests pass * Make tests of LayoutLMv2 and LayoutXLM consistent * Add copied from statements to fast tokenizer * Add copied from statements to slow tokenizer * Remove add_visual_labels attribute * Fix tests * Add link to notebooks * Improve docs of LayoutLMv3Processor * Fix reference to section Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-05-24 09:53:45 +02:00
Sylvain Gugger	56f50590d5	Use Accelerate in `from_pretrained` for big model inference (#17341 ) * Initial work * More or less finished with first draft * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Fix randomly initialized weights * Update src/transformers/modeling_utils.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * Rename DeepSpeed folder to temporarily fix the test issue? * Revert to try if Accelerate fix works * Use latest Accelerate release * Quality and fixes * Style * Quality * Add doc * Test + fix * More blocks Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-05-23 14:32:21 -04:00
Anugunj Naman	c86aad6110	Fix cvt docstrings (#17367 )	2022-05-23 16:11:09 +02:00
ghlai9665	7b8cb26953	Correct & Improve Doctests for LayoutLMv2 (#17168 ) * add inference example to LayoutLMv2ForQuestionAnswering, passing doctest * add loss example to LayoutLMv2ForQuestionAnswering, passing doctest * Add correct doctest for LayoutLMv2ForTokenClassification, passing doctest * add correct doctest for LayoutLMv2ForSequenceClassification, passing test * add correct doctest for LayoutLMv2Model, passing test * make fixup * fix to address review comments * make style * fix doctest line break issue, add to documentaiton_tests.txt, address review comments * move comment about layoutlmv2 dependencies to the doc page * format doc page as suggested Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * delete extraneous backtick Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-23 08:02:31 -04:00
NielsRogge	adc0ff2502	Add CvT (#17299 ) * Adding cvt files * Adding cvt files * changes in init file * Adding cvt files * changes in init file * Style fixes * Address comments from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Format lists in docstring * Fix copies * Apply suggestion from code review Co-authored-by: AnugunjNaman <anugunjjha@gmail.com> Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-18 17:47:18 +02:00
Carl	d6b8e9cec7	Add trajectory transformer (#17141 ) * Add trajectory transformer Fix model init Fix end of lines for .mdx files Add trajectory transformer model to toctree Add forward input docs Fix docs, remove prints, simplify prediction test Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Update docs, more descriptive comments Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Update readme Small comment update and add conversion script Rebase and reformat Fix copies Fix rebase, remove duplicates Fix rebase, remove duplicates * Remove tapex * Remove tapex * Remove tapex	2022-05-17 19:07:43 -04:00
Cesare Campagnano	d9050dc768	[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112 ) * [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing * LED docs clarification Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * [LED] gradient_checkpointing=True should be passed to TrainingArguments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * [LED] docs: remove wrong word Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * [LED] docs fix typo Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-05-17 23:44:37 +02:00
Jean Vancoppenolle	bad358398a	Add support for pretraining recurring span selection to Splinter (#17247 ) * Add SplinterForSpanSelection for pre-training recurring span selection. * Formatting. * Rename SplinterForSpanSelection to SplinterForPreTraining. * Ensure repo consistency * Fixup changes * Address SplinterForPreTraining PR comments * Incorporate feedback and derive multiple question tokens per example. * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Jean Vancoppenole <jean.vancoppenolle@retresco.de> Co-authored-by: Tobias Günther <tobias.guenther@retresco.de> Co-authored-by: Tobias Günther <github@tobigue.de> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-05-17 23:42:14 +02:00
Patrick von Platen	5a9957358c	Add Wav2Vec2Conformer (#16812 ) * save intermediate * add wav2vec2 conformer * add more code * more * first test passes * make all checkpoints work * update * up * more clean ups * save clean-up * save clean-up * save more * remove bogus * finalize design conformer * remove vision * finish all tests * more changes * finish code * add doc tests * add slow tests * fix autoconfig test * up * correct docstring * up * update * fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update docs/source/en/model_doc/wav2vec2-conformer.mdx * upload * save copied from * correct configs * fix model outputs * add to docs * fix imports * finish * finish code * correct copied from * correct again * correct make fix * improve make fix copies * save * correct fix copy from * correct init structure * correct * fix import * apply suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2022-05-17 00:43:16 +02:00
amyeroberts	f6a6388972	Add Tensorflow Swin model (#16988 ) Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-16 22:19:53 +01:00
Kevin Zehnder	6cb7187324	docs(transformers): fix typo (#17263 )	2022-05-16 17:04:30 -04:00
Sander Land	053a80c606	logging documentation update (#17174 ) * logging documentation * style Co-authored-by: Sander Land <sander@chatdesk.com>	2022-05-16 16:47:28 -04:00
Sylvain Gugger	ddb1a47ec8	Automatically sort auto mappings (#17250 ) * Automatically sort auto mappings * Better class extraction * Some auto class magic * Adapt test and underlying behavior * Remove re-used config * Quality	2022-05-16 13:24:20 -04:00
Stas Bekman	71abd3ade1	[WIP] [doc] performance/scalability revamp (#15723 ) * [doc] performance/scalability revamp * link the new docs * no : * mixed precision * work on the first doc * expand the main doc * Trigger CI * style * revamp single GPU training section * work on training performance * remove files not used anymore or will be added later * final touches * fix rebase * Add hardware section to toctree * fix toctree again * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove `fast_tokenizers` entry that was copied in rebase * add warning about DP vs DDP * remove todo * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix missing closure of codeblock * Update docs/source/en/perf_train_gpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * sync with #16860 * update toc Co-authored-by: leandro <leandro.vonwerra@spoud.io> Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-16 13:36:41 +02:00
Ignacio Talavera	ee393c009a	Guide to create custom models in Spanish (#17158 ) * file copied and toctree updated * Intro and configuration translated * model section translated * enter hotfix * Translation over, correction pending * Typos and corrections * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 16:19:29 -04:00
Gerardo Huerta Robles	16be422912	Translated version of model_sharing.mdx doc to spanish (#16184 ) * Translated version of model_sharing to spanish * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Addind model sharing to _toctree.yml Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 16:18:46 -04:00
Fellip Silva Alves	f9024814e1	[ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial (#17076 ) * [ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial * Delete docs/source/pt-br directory * [ fast_tokenizers.mdx ] - Continuing work on file * [ fast_tokenizers.mdx ] - Continuing work on file * Add fast tokenizers to _toctree.yml * Eliminated config and toctree.yml * Nits in fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 16:18:14 -04:00
Rafael Zimmer	85fc455972	Added translation of installation.mdx to Portuguese Issue #16824 (#16979 ) * Added translation of installation.mdx to Portuguese, as well as default templates of _toctree.yml and _config.py * [ build_documentation.yml ] - Updated doc_builder to build documentation in Portuguese. [ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx. * [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder. [ pipeline_tutorial.mdx ] - Grammar changes. * [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial. * [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial. [ training.mdx ] - Added portuguese translation for training tutorial. * [ preprocessing.mdx ] - WIP * Update _toctree.yml * Adding Pré-processamento to _toctree.yml * Update accelerate.mdx * Nits and eliminate preprocessing file while it is ready Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 07:55:44 -04:00
Sayak Paul	9f16a1cc13	Update data2vec.mdx to include a Colab Notebook link (that shows fine-tuning) (#17194 ) * Update data2vec.mdx * Update data2vec.mdx * Update docs/source/en/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-12 10:22:00 -04:00
Younes Belkada	b971c769e8	Add OPT (#17088 ) * First version - OPT model * Final changes - putting use cache to False * few changes - remove commented block * few changes - remove unecessary files * fix style issues * few changes - remove a test file - added the logits test * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add gen tests * few changes - rm mask filling example on docstring * few changes - remove useless args * some changes - more tests should pass now - needs to clean more - documentation still needs to be done * fix code quality * major changes - change attention architecture to BART-like - modify some tests - style fix * rm useless classes - remove opt for: - QA - cond generation - seq classif * Removed autodoc calls to non-existant classes TOkenizers are not implemented * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Replaced OPTTokeniser with GPT2 tokenizer * added GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer") * Removed OPTTokenizer * make style * Make style replaces ``` ...).unsqueeze(``` by ``` >>>).unsqueeze(``` * make repo consistency * Removed PretrainedOPTModel * fix opt.mdx removed other heads * fix init, removed 3 heads * removed heads * finished cleaning head * removed seauence classif and question answering * removed unused imports * removed useless dummy object for QA, SC and CG * removed tests for removed useless dummy object for QA, SC and CG * Removed head_mask using encoder layers which don't exist * fixed test * fix line * added OPT to toctree * Updated model path with pushed weigths * fix model path * fixed code quality * fixed embeddings and generation tests * update paths * clean comments * removed OPTClassificationHead for sentence classification * renamed hidden layer * renamed num layers to standard num_hidden_layers * num_attention_heads fix * changes for 125m * add first version for 125m * add first version - flax * add new version * causal LM output * replace output type with BaseModelOutputWithPastAndCrossAttentions * revert working config from 150m to 350m * clean * removed decoder input ids * fixed embed dim * more embed_dim issues * make style + removed enc_dec test * update falx model * removed troublesome copy * added is_encoder_decoder=False to config * added set_input emb fuinction to model class * requires torch on embed test * use head mask instead of decoder head mask input param solves a test * 8 test remaining, update * Updated create_and_check_decoder_model_past_large_inputs * Make style * update op tokenizer with condition * make style * See if I can push * some clean up * remove linear head hack * save intermediate * save correct attention * add copied from from bart * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix part of the reviewss Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * same changes in naming / conversion * correct mask * more fixes * delete FlaxOPT and TfOPT * clean traces of Flax and Tf * fix mask * fixed positionnal embedding length when past key value is provoded * get 125m, 6.7b to work * Added do_layer_norm * solved mismatch in load dictionnary * clean up preapre opt input dict * fixed past key value as bool * fix previus * fixed return dict False tuple issue * All tests are passing * Make style * Ignore OPTDecoder non tested * make fix-copies * make repo consistency * small fix * removed uselss @torch.no_grad decorator * make styl;e * fix previous opt test * style * make style * added opt documentation * update OPT_PRETRAINED_MODEL_ARCHIVE_LIST * up * more fixes * model & config work * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added comment on padding hack (+2) * cleaup * review update * docstring for missing arg * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update pretrained map * update path and tests * make style * styling * make consistency * add gpt2 tok new * more tok fixes * Update src/transformers/models/auto/tokenization_auto.py * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/opt/test_modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update based on reviews * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * make style * make tokenizer auto tests pass * apply Lysandre suggestion * finish tests * add some good tokenizer tests * improve docs slighly Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: ArthurZucker <arthur.zucker@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-05-12 12:24:35 +02:00
Omar U. Espejel	1a688709b3	Fix contents in index.mdx to match docs' sidebar (#17198 ) * Fix contents in index.mdx to match docs' sidebar * Eliminates api section from contents	2022-05-12 02:37:13 -05:00
Omar Sanseviero	b17b78897b	Fix style error in Spanish docs (#17197 )	2022-05-12 08:51:46 +02:00
Omar U. Espejel	1a66a6c677	Translate index.mdx (to ES) and add Spanish models to quicktour.mdx examples (#16685 ) * Change nits in Spanish for quicktour.mdx - Add tasks names in English too. - Fix small nits in Spanish * Translate index.mdx to Spanish * Translate body of index. * Translated the compatible models list (not the papers´ names). Since this should not be updated manually, I can come back to the original text. * Add models and a dataset for Spanish in the code exmaples * Replaced the English models to Spanish versions. * Add index to _toctree.yml and fix Spanish * Fix double ““ error * Change negative example in ASR example * make style * Debug style in quicktour.mdx	2022-05-11 23:35:07 -05:00
Jorge Loayza R	e2d678b71c	Documentation: Spanish translation of fast_tokenizers.mdx (#16882 ) * Spanish translation of fast_tokenizers.mdx * add fast_tokenizers to the spanish _toctree.yml * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-11 22:25:44 -05:00
Joaq	ae82da2181	Added es version of language_modeling.mdx doc (#17021 ) * Spanish version of language_modeling.mdx doc file * modification to toctree.yml file * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Correct position of Guías conceptuales Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-11 22:04:56 -05:00
jkmg	36ddcc0d35	Spanish translation of philosophy.mdx #15947 (#16922 ) * adding philosophy.mdx translation to Spanish * adding philosophy.mdx translation to Spanish * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * philosophy translation to Spanish * Update _toctree.yml * Update _toctree.yml * nits Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-11 20:47:50 -05:00
Amanpreet Singh	a10f61834d	[feat] Add FLAVA model (#16654 ) * [WIP] Add FLAVA model This PR aims to add [FLAVA](ihttps://arxiv.org/abs/2112.04482) model to the transformers repo. Following checklist delineates the list of things to be done for this PR to be complete: [x] Flava init [x] Flava base models [x] Flava layers [x] Flava Configs [x] Flava encoders [x] Flava pretraining models [ ] Flava classification/retrieval models (To be added in a separate PR) [x] Documentation updates [x] Imports updates [x] Argstring updates [x] Flava pretrained checkpoints [x] Flava tests [x] Flava processors [x] Sanity check [x] Lint	2022-05-11 14:56:48 -07:00
hasan salim kanmaz	c33f6046c3	[WIP] Enable reproducibility for distributed trainings (#16907 ) * add seed worker and set_deterministic_seed_for_cuda function to enforce reproducability * change function name to enable determinism, add docstrings, reproducability support for tf * change function name to enable_determinism_for_distributed_training * revert changes in set_seed and call set_seed within enable_full_determinism * add one position argument for seed_worker function * add full_determinism flag in training args and call enable_full_determinism when it is true * add enable_full_determinism to documentation * apply make fixup after the last commit * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-11 09:37:13 -04:00
Jason Phang	48a8f3daa1	Add DebertaV2ForMultipleChoice (#17135 )	2022-05-10 16:21:44 -04:00
Patrick Haller	259eeb6dab	Fixing the output of code examples in the preprocessing chapter (#17162 )	2022-05-10 12:16:28 -04:00
Zachary Mueller	d719bcd46a	Fix all docs for accelerate install directions (#17145 )	2022-05-09 15:45:18 -04:00
Sylvain Gugger	7783fa6bb3	Fix quality and repo consistency	2022-05-09 11:14:36 -04:00
Sourab Mangrulkar	05fc1766ff	PyTorch FSDP integration in Trainer (#17136 ) * PyTorch FSDP integration in Trainer * reformatting make style and make quality are now compliant. * Updating dependency check * Trigger CI Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-05-09 20:40:56 +05:30
Manan Dey	dc3645dc9c	add `mobilebert` onnx configs (#17029 ) * update docs of length_penalty * Revert "update docs of length_penalty" This reverts commit `466bf4800b`. * add mobilebert onnx config * address suggestions * Update auto.mdx * Update __init__.py * Update features.py	2022-05-09 10:36:53 -04:00
Ritik Nandwal	215e0681e4	Added BigBirdPegasus onnx config (#17104 ) * Add onnx configuration for bigbird-pegasus * Modify docs	2022-05-06 17:31:00 +02:00
Steven Liu	cad61b6839	Fix link to example scripts (#17103 )	2022-05-05 15:20:27 -05:00
Daniel Espejel	db377a0b37	Added spanish translation of autoclass_tutorial. (#17069 ) * Added spanish translation of autoclass_tutorial. Added 'local' and 'title' fields for autoclass_tutorial. * Fixed autoclass_tutorial title in _toctree.yml and autoclass_tutorial.mdx	2022-05-04 14:18:24 -05:00
Steven Liu	23619ef6b7	📝 open fresh PR for pipeline doctests (#17073 )	2022-05-04 11:30:34 -05:00
Sayak Paul	049e791758	Add Data2Vec for Vision in TF (#17008 ) * add utilities till TFData2VecVisionLayer. * chore: pass window_size to attention layer. * feat: add TFData2VecVisionRelativePositionBias. * feat: initial implementation ready for tf data2vec. * fix: relative position bias index, table to be fixed. * chore: implementation added, tests remaining. * add: tests, other PR files. * fix: code quality. * fix: import structure in init. * chore: run make fix-copies. * chore: address PR feedback (round I). * chore: styling nit. * fix: tests due to removal of to_2tuple(). * chore: rebase with upstream main and move the test. * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix: layer call. * chore: remove from_pt=True and rerun test. * chore: remove cast and tf.divide. * chore: minor edits to the test script. * Update src/transformers/models/data2vec/modeling_tf_data2vec_vision.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * fix: expand() on TF tensors with broadcast_to(). * fix: test import. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-05-04 08:08:25 -04:00
Sylvain Gugger	a8fa2f91f4	Make Trainer compatible with sharded checkpoints (#17053 ) * Make Trainer compatible with sharded checkpoints * Add doc	2022-05-03 09:55:10 -04:00
Sanchit Gandhi	cd9274d010	[FlaxBert] Add ForCausalLM (#16995 ) * [FlaxBert] Add ForCausalLM * make style * fix output attentions * Add RobertaForCausalLM * remove comment * fix fx-to-pt model loading * remove comment * add modeling tests * add enc-dec model tests * add big_bird * add electra * make style * make repo-consitency * add to docs * remove roberta test * quality * amend cookiecutter * fix attention_mask bug in flax bert model tester * tighten pt-fx thresholds to 1e-5 * add 'copied from' statements * amend 'copied from' statements * amend 'copied from' statements * quality	2022-05-03 11:26:19 +02:00
Lysandre Debut	bb2e088be7	Allow all imports from transformers (#17050 )	2022-05-02 12:47:39 -04:00
NielsRogge	1ac698744c	Add YOLOS (#16848 ) * First draft * Add YolosForObjectDetection * Make forward pass work * Add mid position embeddings * Add interpolation of position encodings * Add expected values * Add YOLOS to tests * Add integration test * Support tiny model as well * Support all models in conversion script * Remove mid_pe_size attribute * Make more tests pass * Add model to README and fix config * Add copied from statements * Rename base_model_prefix to vit * Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP * Apply suggestions from code review * Apply more suggestions from code review * Convert remaining checkpoints * Improve docstrings * Add YolosFeatureExtractor * Add feature extractor to docs * Add corresponding tests * Fix style * Fix docs * Apply suggestion from code review * Fix bad rebase * Fix some more bad rebase * Fix missing character * Improve docs and variable names Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-05-02 18:30:55 +02:00
Yih-Dar	ede5e04191	Add a check on config classes docstring checkpoints (#17012 ) * Add the check * add missing ckpts * add a list to ignore * call the added check script * better regex pattern Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-30 10:40:46 +02:00
Sylvain Gugger	7152ed2bae	Result of new doc style with fixes (#17015 ) * Result of new doc style with fixes * Add last two files * Bump hf-doc-builder	2022-04-29 17:42:15 -04:00
Mishig Davaadorj	cf8a7c2490	Update custom_models.mdx (#16964 ) BertModelForSequenceClassification -> BertForSequenceClassification	2022-04-27 16:46:55 +02:00
Yang Ming	10dfa126b7	documentation: some minor clean up (#16850 )	2022-04-26 16:56:08 -04:00
Krishna Sirumalla	aaee4038c3	Add onnx config for RoFormer (#16861 ) * add roformer onnx config	2022-04-26 16:51:15 +02:00
Rushi Chaudhari	8246caf3eb	added deit onnx config (#16887 ) * added deit onnx config	2022-04-25 20:50:45 +02:00
Patrick von Platen	3a71e94a92	Fix doc test quicktour dataset (#16929 ) * fix doc test * fix doc test Co-authored-by: Patrick <patrick@pop-os.localdomain>	2022-04-25 16:26:59 +02:00
Patrick von Platen	72728be3db	[DocTests] Fix some doc tests (#16889 ) * [DocTests] Fix some doc tests * hacky fix * correct	2022-04-23 08:40:14 +02:00
Thomas Chaigneau	ec81c11a18	Add OnnxConfig for ConvBERT (#16859 ) * add OnnxConfig for ConvBert Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>	2022-04-22 18:19:15 +02:00
Nicolas Patry	e789418ebe	Adding support for `array` key in raw dictionnaries in ASR pipeline. (#16827 ) * Adding support for `array` key in raw dictionnaries in ASR pipeline. * ES . * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Making it work by not popping `array` first. * Black 22.3 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-21 14:39:10 +02:00
Stas Bekman	67ed0e43dc	[docs] fix url (#16860 )	2022-04-20 11:01:24 -07:00
Yang Ming	ff06b17791	add DebertaV2 fast tokenizer (#15529 ) Co-authored-by: alcinos <carion.nicolas@gmail.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-20 10:26:51 +02:00
Patrick von Platen	8d3f952adb	[Data2Vec] Add data2vec vision (#16760 ) * save intermediate * add vision * add vision * save * finish models * finish models * continue * finish * up * up * up * tests all pass * clean up * up * up * fix bugs in beit * correct docs * finish * finish docs * make style * up * more fixes * fix type hint * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/data2vec/test_modeling_data2vec_vision.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-18 17:52:13 +02:00
Patrick von Platen	9a2995ee39	[Quicktour Audio] Improve && remove ffmpeg dependency (#16723 ) * [Quicktour Audio] Improve && remove ffmpeg dependency * final fix * final touches	2022-04-18 16:50:13 +02:00
Patrick von Platen	b24201fa44	[Doctests] Fix all T5 doc tests (#16646 ) * [Doctests] Fix all T5 doc tests * make style * Update docs/source/en/model_doc/t5.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply Sylvains comments * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-13 11:36:54 +02:00
Minh Chien Vu	9c9db751e2	add Bigbird ONNX config (#16427 ) * add Bigbird ONNX config	2022-04-12 20:46:06 +02:00
Anmol Joshi	a315988bae	Moved functions to pytorch_utils.py (#16625 ) * Moved functions to pytorch_utils.py * isort formatting * Reverted tf changes * isort, make fix-copies * documentation fix * Fixed Conv1D import * Reverted research examples file * backward compatibility for pytorch_utils * missing import * isort fix	2022-04-12 12:38:50 -04:00
Sylvain Gugger	0711c45eae	Remove duplicate header (#16732 )	2022-04-12 12:37:13 -04:00
Patrick von Platen	098b002644	[Doctests] Correct task summary (#16644 )	2022-04-11 14:59:35 +02:00
Yih-Dar	8e93dc7eaf	Fix some doc examples in task summary (#16666 ) * Fix some doc examples Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-11 11:20:03 +02:00
Steven Liu	7c5d79912a	Update audio examples with MInDS-14 (#16633 ) * ✨ update audio examples with minds dataset * 🖍 make style * 🖍 minor fixes for doctests	2022-04-08 15:55:42 -05:00
NielsRogge	4ef0abb738	Add TAPEX (#16473 ) * Add TapexTokenizer * Improve docstrings and provide option to provide answer * Remove option for pretokenized inputs * Add TAPEX to README * Fix copies * Remove option for pretokenized inputs * Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification. * - Draft a README file for running the script and introducing some background. - Remove unused code lines in tabfact script. - Disable the deafult `pad_to_max_length` option which is memory-consuming. * * Support `as_target_tokenizer` function for TapexTokenizer. * Fix the do_lower_case behaviour of TapexTokenizer. * Add unit tests for target scenarios and cased/uncased scenarios for both source and target. * * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function. * Fix typos in tapex example README. * * fix the evaluation script - remove the property `task_name` * * Make the label space more clear for tabfact tasks * * Using a new fine-tuning script for tapex-base on tabfact. * * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case * Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql * * Remove the default tokenizer_name option. * Provide evaluation command. * * Support for WikiTableQuestion dataset. * Fix a typo in README. * * Fix the datasets's key name in WikiTableQuestions * Run make fixup and move test to folder * Fix quality * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review * Improve docstrings * Overwrite failing test * Improve comment in example scripts * Fix rebase * Add TAPEX to Auto mapping * Add TAPEX to auto config mappings * Put TAPEX higher than BART in auto mapping * Add TAPEX to doc tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: SivilTaram <qianlxc@outlook.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-08 10:57:51 +02:00
Francesco Saverio Zuppichini	af14c61973	RegNet (#16188 ) * base model done * make style * done * added files * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Trigger doc build * resolved conversations * resolved conversations * seer models * minor changes * minor changes * make fixup * glob variables * minor changes * fix copies * config when possibile * resolved conflicts * resolved conflicts * resolved conflicts * CI * conversion script for 10b param * fixed for 10b model * minor updates in the doc + make style * removed unused code * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * removed unused code * removed unused code * updated modeling_utils from main Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-04-07 21:58:00 +02:00
Joao Gante	3f43d824b9	TF generate refactor - Beam Search (#16374 ) * refactor TF beam search * refactored generate can now properly use attention masks * add force bos/eos logit processors	2022-04-06 18:19:34 +01:00
Patrick von Platen	c65633156b	[Speech2Text Doc] Fix docs (#16611 ) * [Speech2Text Doc] Fix docs * apply ydshiehs suggestions	2022-04-06 14:19:00 +02:00
Patrick von Platen	0bf18643f4	[Minds14] Correct quicktour (#16626 )	2022-04-06 11:27:11 +02:00
Sylvain Gugger	208f4c109a	Quality	2022-04-05 14:12:01 -04:00
Steven Liu	f553c3ce4c	Update summary of the tasks (#16528 ) * 📝 add image/vision classification and asr * 🖍 minor formatting fixes * Fixed a typo in legacy seq2seq_trainer.py (#16531) * Add ONNX export for BeiT (#16498) * Add beit onnx conversion support * Updated docs * Added cross reference to ViT ONNX config * call on_train_end when trial is pruned (#16536) * Type hints added (#16529) * Fix Bart type hints (#16297) * Add type hints to PLBart PyTorch * Remove pending merge conflicts * Fix PLBart Type Hints * Add changes from review * Add VisualBert type hints (#16544) * Adding missing type hints for mBART model (PyTorch) (#16429) * added type hints for mbart tensorflow tf implementation * Adding missing type hints for mBART model Tensorflow Implementation model added with missing type hints * Missing Type hints - correction For TF model * Code fixup using make quality tests * Hint types - typo error * make fix-copies and make fixup * type hints * updated files * type hints update * making dependent modesls coherent Co-authored-by: matt <rocketknight1@gmail.com> * Remove MBart subclass of XLMRoberta in tokenzier docs (#16546) * Remove MBart subclass of XLMRoberta in tokenzier * Fix style * Copy docs from MBart50 tokenizer * Use random_attention_mask for TF tests (#16517) * use random_attention_mask for TF tests * Fix for TFCLIP test (for now). Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Improve code example (#16450) Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> * Pin tokenizers version <0.13 (#16539) * Pin tokenizers version <0.13 * Style * Add code samples for TF speech models (#16494) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [FlaxSpeechEncoderDecoder] Fix dtype bug (#16581) * [FlaxSpeechEncoderDecoder] Fix dtype bug * more fixes * Making the impossible to connect error actually report the right URL. (#16446) * Fix flax import in __init__.py: modeling_xglm -> modeling_flax_xglm (#16556) * Add utility to find model labels (#16526) * Add utility to find model labels * Use it in the Trainer * Update src/transformers/utils/generic.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Quality Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Enable doc in Spanish (#16518) * Reorganize doc for multilingual support * Fix style * Style * Toc trees * Adapt templates * Add use_auth to load_datasets for private datasets to PT and TF examples (#16521) * fix formatting and remove use_auth * Add use_auth_token to Flax examples * add a test checking the format of `convert_tokens_to_string`'s output (#16540) * add new tests * add comment to overridden tests * TF: Finalize `unpack_inputs`-related changes (#16499) * Add unpack_inputs to remaining models * removed kwargs to `call()` in TF models * fix TF T5 tests * [SpeechEncoderDecoderModel] Correct Encoder Last Hidden State Output (#16586) * initialize the default rank set on TrainerState (#16530) * initialize the default rank set on TrainerState * fix style * Trigger doc build * Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * add a template to add missing tokenization test (#16553) * add a template to add missing tokenization test * add cookiecutter setting * improve doc * Update templates/adding_a_missing_tokenization_test/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * made _load_pretrained_model_low_mem static + bug fix (#16548) * handle torch_dtype in low cpu mem usage (#16580) * [Doctests] Correct filenaming (#16599) * [Doctests] Correct filenaming * improve quicktour * make style * Adding new train_step logic to make things less confusing for users (#15994) * Adding new train_step logic to make things less confusing for users * DO NOT ASK WHY WE NEED THAT SUBCLASS * Metrics now working, at least for single-output models with type annotations! * Updates and TODOs for the new train_step * Make fixup * Temporary test workaround until T5 has types * Temporary test workaround until T5 has types * I think this actually works! Needs a lot of tests though * MAke style/quality * Revert changes to T5 tests * Deleting the aforementioned unmentionable subclass * Deleting the aforementioned unmentionable subclass * Adding a Keras API test * Style fixes * Removing unneeded TODO and comments * Update test_step too * Stop trying to compute metrics with the dummy_loss, patch up test * Make style * make fixup * Docstring cleanup * make fixup * make fixup * Stop expanding 1D input tensors when using dummy loss * Adjust T5 test given the new compile() * make fixup * Skipping test for convnext * Removing old T5-specific Keras test now that we have a common one * make fixup * make fixup * Only skip convnext test on CPU * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Avoiding TF import issues * make fixup * Update compile() to support TF 2.3 * Skipping model.fit() on template classes for now * Skipping model.fit() on template class tests for now * Replace ad-hoc solution with find_labels * make fixup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Adding missing type hints for BigBird model (#16555) * added type hints for mbart tensorflow tf implementation * Adding missing type hints for mBART model Tensorflow Implementation model added with missing type hints * Missing Type hints - correction For TF model * Code fixup using make quality tests * Hint types - typo error * make fix-copies and make fixup * type hints * updated files * type hints update * making dependent modesls coherent * Type hints for BigBird * removing typos Co-authored-by: matt <rocketknight1@gmail.com> * [deepspeed] fix typo, adjust config name (#16597) * 🖍 apply feedback Co-authored-by: Cathy <815244047@qq.com> Co-authored-by: Jim Rohrer <jrohrer1@gmail.com> Co-authored-by: Ferdinand Schlatt <fschlatt@gmail.com> Co-authored-by: Dahlbomii <101373053+Dahlbomii@users.noreply.github.com> Co-authored-by: Gunjan Chhablani <chhablani.gunjan@gmail.com> Co-authored-by: Rishav Chandra Varma <rishavchandra.v16@iiits.in> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Daniel Stancl <46073029+stancld@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Andres Codas <andrescodas@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-04-05 12:48:42 -05:00
Patrick von Platen	7ccacdf10f	[Doctests] Correct filenaming (#16599 ) * [Doctests] Correct filenaming * improve quicktour * make style	2022-04-05 14:15:02 +02:00
Sylvain Gugger	b9a768b3ff	Enable doc in Spanish (#16518 ) * Reorganize doc for multilingual support * Fix style * Style * Toc trees * Adapt templates	2022-04-04 10:25:46 -04:00
Jim Rohrer	9de70f213e	Add ONNX export for BeiT (#16498 ) * Add beit onnx conversion support * Updated docs * Added cross reference to ViT ONNX config	2022-04-01 10:52:42 +02:00
Francesco Saverio Zuppichini	a8b6443e06	Refactor Modeling Outputs (#16341 ) * first proposal * replace model outputs in various models * conflicts * docstring * update poolformer * minor change in docstring * CI * removed poolformer specific outputs from doc * removed convnext specific outputs from doc * CI * weird char in segformer * conversations * reverted docstring for BaseModelOutputWithPooling * update outputs * changed docstring in BaseModelOutput * updated docstring in modeling outputs * typos :) * fixed typo after copy & paste it all around * CI * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-31 09:32:33 +02:00
Sayak Paul	5b40a37bc4	Add TF ViT MAE (#16255 ) * ported TFViTMAEIntermediate and TFViTMAEOutput. * added TFViTMAEModel and TFViTMAEDecoder. * feat: added a noise argument in the implementation for reproducibility. * feat: vit mae models with an additional noise argument for reproducibility. Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-29 18:24:15 +01:00
Wesley A. Cheng	875e07a9e3	[doc] Fix missing trainer import (#16469 )	2022-03-29 18:57:43 +02:00
Wesley A. Cheng	3015d12bfb	fix wrong variable name (#16467 )	2022-03-29 18:55:40 +02:00
Steven Liu	45abb37ac9	Remove duplicate mLuke (#16460 ) * Remove duplicate mLuke * 🖍 apply feedback	2022-03-29 10:34:30 -05:00
NielsRogge	979b039c89	Add DPT (#15991 ) * First draft * More improvements * Add fusion blocks * Make conversion script work for dpt_large * Make conversion script work * Improve implementation * Improve conversion script * Add DPTForSemanticSegmentation * Make conversion work for semantic segmentation * Add tests * Remove print statements * First draft * Redesign neck * Improve tests * Improve implementation some more * Make neck output list of tensors * Improve neck and feature extractor * Fix integration tests * Make more tests pass * Make all tests pass * Add missing config archive map * Add in_index attribute to make heads accept list of tensors * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions * Add copied from statements * Remove assert * Apply suggestions from code review * Apply suggestions from code review * Remove DPTInterpolate in favor of nn.Upsample * Add comments * Apply suggestions from code review * Apply suggestions from code review * Add proposed design * Update design * Add DPTReassembleLayer * Add DPTFeatureFusionStage * Apply more suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Fix rebase * Update in_index and out_indices * Fix conversion script * Fix code quality * Add model to toctree and use DepthEstimatorOutput * Fix rebase * Fix code examples * Improve code * Fix copied from statements * Apply suggestions from code review * Remove compute_loss method * Apply suggestions from code review * Fix documentation tests file * Remove test.py file * Improve doc example Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>	2022-03-28 16:28:10 +02:00
Sylvain Gugger	473709fc76	Use doc builder styler (#16412 ) * Config update * Use doc-builder styler * Cleanup * Adapt import * We need it there too!	2022-03-28 07:45:18 -04:00
Kurian Benoy	c88ff66cc8	Fix broken links (#16113 ) * Update marian.mdx * Update marian.mdx * Update docs/source/model_doc/marian.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update marian.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-03-28 05:38:17 -04:00
Nathan Glenn	e02f95b229	remove references to PDF reading via PIL (#15293 ) * fix confusing PIL instructions As stated in the documentation [here](https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html?highlight=pdf#write-only-formats), PIL can only write PDF's, not read them. Remove references to reading PDF's via PIL from this page to avoid confusion. * mention PDF in doc examples using PIL Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Be explicit: PDFs must be converted to images * fix formatting Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-28 05:00:29 -04:00
Steven Liu	b320d87ece	Create concept guide section (#16369 ) * ✨ create concept guide section * 🖍 make fixup * 🖍 apply feedback Co-authored-by: Steven <stevhliu@gmail.com>	2022-03-25 14:51:43 -05:00
Daniel Stancl	ed2ee373d0	Add TF implementation of GPT-J (#15623 ) * Initial commit * Add TFGPTJModel * Fix a forward pass * Add TFGPTJCausalLM * Add TFGPTJForSequenceClassification * Add TFGPTJForQuestionAnswering * Fix docs * Deal with TF dynamic shapes * Add Loss parents to models * Adjust split and merge heads to handle 4 and 5-dim tensors * Update outputs for @tooslow tests	2022-03-25 19:27:19 +00:00
lewtun	a97f3150c4	Add ONNX support for Blenderbot and BlenderbotSmall (#15875 ) * Add ONNX support for Blenderbot * Add BlenderbotSmall ONNX configuration * Update serialization table	2022-03-25 17:04:43 +01:00
Sylvain Gugger	867f3950fa	Rename master to main for notebooks links and leftovers (#16397 )	2022-03-25 09:12:23 -04:00
Sylvain Gugger	088c1880b7	Big file_utils cleanup (#16396 ) * Big file_utils cleanup * This one still needs to be treated separately	2022-03-25 07:25:20 -04:00
Sylvain Gugger	3a0f1684c3	Fix readme links and add CI check (#16392 ) * Fix doc links in README * Fix name * Fix links in READMEs and doc index * Error if there is something wrong so the CI knows	2022-03-24 11:59:09 -04:00
Thomas Chaigneau	029b0d95ed	add GPT-J ONNX config to Transformers (#16274 ) * add GPT-J ONNX config to Transformers * remove token-classification features mapping Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * add question-answering features mapping Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * add GPT2 config init to GPT2 config + copie shebang for fix-copies Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-03-23 16:36:11 -04:00
Edward Beeching	aff9bc405a	Decision transformer gym (#15845 ) * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Add type hints for Pegasus (#16324) * Funnel type hints (#16323) * add pt funnel type hints * add tf funnel type hints * Add type hints for ProphetNet PyTorch (#16272) * [GLPN] Improve docs (#16331) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Added type hints for Pytorch Marian calls (#16200) * Added type hinting for forward functions in pytorch marian * typo correction * Removed type hints on functions from BART per Suraj Patil request * fix import pb * fix typo * corrected tuple call * ran black * after fix-copies Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List * Fixing copies to roformer and pegasus Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * done (#16340) * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type annotations for Rembert/Splinter and copies (#16338) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com> * [Bug template] Shift responsibilities for long-range (#16344) * Fix code repetition in serialization guide (#16346) * Adopt framework-specific blocks for content (#16342) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback * Updates the default branch from master to main (#16326) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Updated copies, config auto, and readme files. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com> Co-authored-by: Adam Montgomerie <adam@avanssion.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com> Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-03-23 16:18:43 -04:00
Lysandre Debut	eca77f4719	Updates the default branch from master to main (#16326 ) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-23 03:46:59 -04:00
Steven Liu	7732148124	Adopt framework-specific blocks for content (#16342 ) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback	2022-03-22 16:14:58 -05:00
Omar Sanseviero	62cbd8423b	Fix code repetition in serialization guide (#16346 )	2022-03-22 16:57:19 -04:00
NielsRogge	a2379b9257	[GLPN] Improve docs (#16331 ) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-22 15:45:29 +01:00
NielsRogge	0c55d47cde	Add GLPN (#16199 ) * First draft * Fix logits calculation * Improve tests * Add copied from statements * Fix base_model_prefix * Improve implementation, upload new models * Update design * Fix integration test * Add model to README and toctree * Add document image * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add decoder_hidden_size attribute * Update design of decoder * Add DepthEstimatorOutput class * Rename in_index to head_in_index and add feature extractor tests * Apply suggestions from code review * Apply suggestions from code review * Update pretrained model name and add to doc tests * Remove test.py script * Update copied from statements and clean up Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-22 08:51:13 +01:00
Thomas Chaigneau	0aac9ba2da	Add Flaubert OnnxConfig to Transformers (#16279 ) * Add Flaubert to ONNX to make it available for conversion. * Fixed features for FlauBERT. fixup command remove flaubert to docs list. Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>	2022-03-21 21:46:31 +01:00
Steven Liu	5a42bb431e	Update troubleshoot with more content (#16243 ) * 📝 first draft * 🖍 apply feedback	2022-03-21 11:37:18 -05:00
NielsRogge	fbb454307d	[SegFormer] Remove unused attributes (#16285 ) * Remove unused attributes * Add link to blog and add clarification about input size * Improve readability of the code Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-21 17:34:10 +01:00
Gunjan Chhablani	3f0f75e497	Remove disclaimer from Longformer docs (#16296 )	2022-03-21 10:05:47 -04:00
PolarisRisingWar	abf3cc7064	Fix a typo (add a coma) (#16291 ) As mentioned: https://github.com/huggingface/transformers/issues/16277	2022-03-21 12:10:24 +00:00
Aflah	f393868073	Fixed Error Raised Due to Wrongly Accessing Training Sample (#16115 ) * Update training.mdx Fixed Error Raised Due to Wrongly Accessing Training Sample * Ran make style * Revert to Old Commit * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-03-21 12:54:54 +01:00
Sylvain Gugger	4ecb022eb1	Draft a guide with our code quirks for new models (#16237 ) * Draft a guide with our code quirks for new models * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-03-21 07:44:03 -04:00
Patrick von Platen	c1af180dfe	Add Slack notification support for doc tests (#16253 ) * up * up * up * fix * yeh * ups * Empty test commit * correct quicktour * correct * correct * up * up * uP * uP * up * up * uP * up * up * up * up * up * up * up * up * up * up * Update src/transformers/models/van/modeling_van.py * finish * apply suggestions * remove folder * revert to daily testing	2022-03-21 11:33:18 +01:00
Steven Liu	ffc319e7b8	Fix links in guides (#16182 ) * 🖍 fix links in guides * 🖍 apply feedback	2022-03-18 16:16:16 -05:00
Stas Bekman	47cccb5318	[Deepspeed] non-HF Trainer doc update (#16238 )	2022-03-17 13:33:55 -07:00
Patrick von Platen	8a96b0f10a	[Generate Docs] Correct docs (#16133 ) * [Generate Docs] Correct docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-03-17 20:05:28 +01:00
Francesco Saverio Zuppichini	667b823b89	Swin support for any input size (#15986 ) * padding done * correctly return one attention per layer * almost correct, attentions are not flatten one tuple per stage * tests green * doc * conversations * reshaping hidden_states * view in the test * reshape_hidden_states in Encoder and Model * new outputs with reshaped_hidden_states * conversations * doc * Update docs/source/model_doc/swin.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversations * fix tests * minor changes * resolved conversations * attentions one per stage * typo * typos * typos * function signature * CI * clean up tests Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-16 18:38:25 +01:00
Sylvain Gugger	4f4e5ddbcb	Framework split (#16030 ) * First files * More files * Last files * Style	2022-03-15 10:13:34 -04:00
Markus Sagen	bcaf566038	[Fix doc example] Fix first example for the custom_datasets tutorial (#16087 ) * Fix inconsistent example variable naming - Example code for a sequence classification in Tensorflow had spelling mistakes and incorrect and inconsistent naming - Changed variable naming to be consistent with the two other TF examples * Fix incorrect incorrect training examples	2022-03-15 08:17:51 -04:00
Francesco Saverio Zuppichini	0a057201a9	Visual Attention Network (VAN) (#16027 ) * encoder works * addded files * norm in stage * convertion script * tests * fix copies * make fix-copies * fixed __init__ * make fix-copies * fix * shapiro test needed * make fix-copie * minor changes * make style + quality * minor refactor conversion script * rebase + tests * removed unused variables * updated doc * toctree * CI * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolved conversations * make fixup * config passed to modules * config passed to modules * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversations * conversations * copyrights * normal test * tests Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-15 08:47:12 +01:00
Francesco Saverio Zuppichini	e3008c679f	[WIP] Resnet (#15770 ) * first commit * ResNet model correctly implemented. basic modeling + weights conversion is done removed unused doc mdx file doc and conversion script added feature_extractor to auto test minor changes + style + quality doc test Delete process.yml A left over from my attempt of running circleci locally * minor changes * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * new test format * minor changes from conversations * minor changes from conversations * make style + quality * readded the tests * test + README * minor changes from conversations * error in README * make fix-copies * removed regression for classification head * make quality * fixed loss control flow * fixed loss control flow * resolved conversations * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * READMEs * index.mdx * minor changes * updated tests and models * unused import * outputs * Update docs/source/model_doc/resnet.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added embeddings_size * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversation * added push to hub * test * embedding_size * make fix-copies * resolved conversations * CI * changed organization * minor changes * CI * minor changes * conversations * conversation * doc * tests * removed unused docstring * conversation * removed unused outputs * CI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-14 19:57:55 +01:00
Omar Sanseviero	802984ad42	Fix and document Zero Shot Image Classification (#16079 )	2022-03-14 08:50:36 +01:00
lewtun	6e1e88fd38	Add TFCamembertForCausalLM and ONNX integration test (#16073 ) * Make Camembert great again! * Add Camembert to TensorFlow ONNX tests	2022-03-14 08:40:42 +01:00
Stas Bekman	580dd87c55	[Deepspeed] add support for bf16 mode (#14569 ) * [WIP] add support for bf16 mode * prep for bf16 * prep for bf16 * fix; zero2/bf16 is ok * check bf16 is available * test fixes * enable zero3_bf16 * config files * docs * split stage_dtype; merge back to non-dtype-specific config file * fix doc * cleanup * cleanup * bfloat16 => bf16 to match the PR changes * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/ * test fixes/skipping * move * fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * backticks * cleanup * cleanup * cleanup * new version * add note about grad accum in bf16 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-11 17:53:53 -08:00
Steven Liu	ae2dd42be5	Audio/vision task guides (#15808 ) * 📝 first draft of audio/vision guides * ✨ make fixup * 🖍 fix typo * 🖍 close parentheses * 🖍 apply feedback * 🖍 apply feedback, make fixup * 🖍 more fixup for perceiver * 🖍 apply feedback * ✨ make fixup * 🖍 fix data collator	2022-03-11 16:43:49 -06:00
Steven Liu	5b4c97d09d	Update troubleshoot guide (#16001 ) * 📝 first draft * 🖍 apply feedback * 🖍 apply feedback	2022-03-11 13:05:44 -06:00
Sylvain Gugger	f7708e1bed	Force default brnahc name via the config	2022-03-11 10:09:15 -05:00
David S. Batista	96ac7549cb	updating fine-tune classifier documentation (#16063 )	2022-03-10 16:21:56 -05:00
Patrick von Platen	6ce11c2c0f	[Docs] Improve PyTorch, Flax generate API (#15988 ) * Move generate docs * up * Update docs/source/_toctree.yml * correct * correct some stuff * correct tests * more fixes * finish generate * add to doc stest * finish * finalize * add warning to generate method	2022-03-10 11:54:45 +01:00
NielsRogge	0835119bf3	Add Document Image Transformer (DiT) (#15984 ) * Add conversion script * Improve script * Fix bug * Add option to push to hub * Add support for classification models * Update model name * Upload feature extractor files first * Remove hash checking * Fix config * Add id2label * Add import * Fix id2label file name * Fix expected shape * Add model to README * Improve docs * Add integration test and fix CI * Fix code style * Add missing init * Add model to SPECIAL_MODULE_TO_TEST_MAP Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-10 11:34:44 +01:00
Sanchit Gandhi	b256f3518d	Add FlaxBartForCausalLM (#15995 ) * add causal lm * add CausalLM tests * Add FlaxBartForCausalLM * Add EncoderDecoder model tests * change docstring * make repo-consistency * suggested changes * remove jax ops * correction * rename pre-trained decoder model	2022-03-09 19:53:01 +01:00
lewtun	50dd314d93	Add ONNX export for ViT (#15658 ) * Add ONNX support for ViT * Refactor to use generic preprocessor * Add vision dep to tests * Extend ONNX slow tests to ViT * Add dummy image generator * Use model_type to determine modality * Add deprecation warnings for tokenizer argument * Add warning when overwriting the preprocessor * Add optional args to docstrings * Add minimum PyTorch version to OnnxConfig * Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case * Add reasonable value for default atol Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-09 17:36:59 +01:00
Patrick von Platen	c1aaa43935	[Doctests] Move doctests to new GPU & Fix bugs (#15969 ) * test * up * up * Empty test commit * up * update tests * up * fix some vision models * correct * correct docs * Trigger notification * finalize * check * correct quicktour * Apply suggestions from code review * improve doctests * Trigger Build * next try * next try * and again * Output current clone information * Output current clone information * Correct path * add tf round again * revert to daily job Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2022-03-09 13:09:56 +01:00
Steven Liu	38cc35069c	Update training scripts docs (#15931 ) * 📝 first draft * 🖍 apply feedback * 🖍 remove examples from toctree * 🗑 remove examples from docs/source	2022-03-07 13:29:14 -06:00
Chan Woo Kim	5c6f57ee75	Constrained Beam Search [With Disjunctive Decoding] (#15761 ) * added classes to get started with constrained beam search * in progress, think i can directly force tokens now but not yet with the round robin * think now i have total control, now need to code the bank selection * technically works as desired, need to optimize and fix design choices leading to undersirable outputs * complete PR #1 without disjunctive decoding * removed incorrect tests * Delete k.txt * Delete test.py * Delete test.sh * revert changes to test scripts * genutils * full implementation with testing, no disjunctive yet * shifted docs * passing all tests realistically ran locally * removing accidentally included print statements * fixed source of error in initial PR test * fixing the get_device() vs device trap * fixed documentation docstrings about constrained_beam_search * fixed tests having failing for Speech2TextModel's floating point inputs * fix cuda long tensor * added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search * deleted accidentally added test halting code with assert False * code reformat * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py * fixing based on comments on PR * took out the testing code that should but work fails without the beam search moditification ; style changes * fixing comments issues * docstrings for ConstraintListState * typo in PhrsalConstraint docstring * docstrings improvements * finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed errors in inner beam search logic during testing. * fixed bug found in constrained beam search that used beam_idx that were not global across all the batches * disjunctive constraint working 100% correctly * passing all tests * Accidentally included mlruns * Update src/transformers/generation_beam_constraints.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/generation_beam_constraints.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * complete overhaul of type complexities and other nits * strict type checks in generate() * fixing second round of feedback by narsil * fixed failing generation test because of type check overhaul * generation test fail fix * fixing test fails Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-03-04 18:18:34 +01:00
Javier de la Rosa	01485ceec3	Add missing support for Flax XLM-RoBERTa (#15900 ) * Adding Flax XLM-RoBERTa * Add Flax to __init__ * Adding doc and dummy objects * Add tests * Add Flax XLM-R models autodoc * Fix tests * Add Flask XLM-RoBERTa to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Remove test on large Flask XLM-RoBERTa * Add tokenizer to the test Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-03-04 14:36:28 +01:00
Nicolas Patry	89c7d9cfba	Making MaskFormerForInstanceSegmentation. (#15934 ) Small adjustments. Adding in type hint. Last fix ? Only include the default dict thing, not the pipelines.	2022-03-04 13:56:15 +01:00
Li-Huai (Allan) Lin	7b3bd1f21a	Fix and improve REALM fine-tuning (#15297 ) * Draft * Add test * Update src/transformers/models/realm/modeling_realm.py * Apply suggestion * Add block_mask * Update * Update * Add block_embedding_to * Remove no_grad * Use AutoTokenizer * Remove model.to overridding	2022-03-03 14:10:15 +01:00
Patrick von Platen	439de3f7f9	[Fix link in pipeline doc] (#15906 )	2022-03-03 07:43:13 -05:00
Joao Gante	baab5e7cdf	TF generate refactor - Sample (#15793 ) * Add TF logits wrappers * Add sample method * add tests for TF logit wrappers * TF generate sample tests now run on CPU Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-03-02 16:13:54 +00:00
Francesco Saverio Zuppichini	d83d22f578	Maskformer (#15682 ) * maskformer * conflicts * conflicts * minor fixes * feature extractor test fix refactor MaskFormerLoss following conversation MaskFormer related types should not trigger a module time import error missed one removed all the types that are not used update config mapping minor updates in the doc resolved conversation that doesn't need a discussion minor changes resolved conversations fixed DetrDecoder * minor changes minor changes fixed mdx file test feature_extractor return types functional losses -> classes removed the return type test for the feature extractor minor changes + style + quality * conflicts? * rebase master * readme * added missing files * deleded poolformers test that where in the wrong palce * CI * minor changes * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * resolved conversations * minor changes * conversations [Unispeech] Fix slow tests (#15818) * remove soundfile old way of loading audio * Adapt slow test [Barthez Tokenizer] Fix saving (#15815) [TFXLNet] Correct tf xlnet generate (#15822) * [TFXLNet] Correct tf xlnet * adapt test comment Fix the push run (#15807) Fix semantic segmentation pipeline test (#15826) Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776) Add model specific output classes to PoolFormer model docs (#15746) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs Adding the option to return_timestamps on pure CTC ASR models. (#15792) * Adding the option to return_timestamps on pure CTC ASR models. * Remove `math.prod` which was introduced in Python 3.8 * int are not floats. * Reworking the PR to support "char" vs "word" output. * Fixup! * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Quality. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824) Fix tf.concatenate + test past_key_values for TF models (#15774) * fix wrong method name tf.concatenate * add tests related to causal LM / decoder * make style and quality * clean-up * Fix TFBertModel's extended_attention_mask when past_key_values is provided * Fix tests * fix copies * More tf.int8 -> tf.int32 in TF test template * clean-up * Update TF test template * revert the previous commit + update the TF test template * Fix TF template extended_attention_mask when past_key_values is provided * Fix some styles manually * clean-up * Fix ValueError: too many values to unpack in the test * Fix more: too many values to unpack in the test * Add a comment for extended_attention_mask when there is past_key_values * Fix TFElectra extended_attention_mask when past_key_values is provided * Add tests to other TF models * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder * Fix not passing training arg to lm_head in TFRobertaForCausalLM * Fix tests (with past) for TF Roberta * add testing for pask_key_values for TFElectra model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> [examples/summarization and translation] fix readme (#15833) Add ONNX Runtime quantization for text classification notebook (#15817) Re-enable doctests for the quicktour (#15828) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove & Framework split model report (#15825) Add TFConvNextModel (#15750) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed ununsed imports * feat: enabled argument during using drop_path. * chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propoagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality enviornment style. * chore: applied formatting with quality enviornment. * chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> * minor changes * doc fix in feature extractor * doc * typose * removed detr logic from config * removed detr logic from config * removed num_labels * small fix in the config * auxilary -> auxiliary * make style * some test is failing * fix a weird char in config prevending doc-builder * retry to fix the doc-builder issue * make style * new try to fix the doc builder * CI * change weights to facebook Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-03-02 15:48:20 +01:00
Patrick von Platen	40040727ab	[Bart] Fix implementation note doc (#15879 )	2022-03-02 10:24:32 +01:00
Michael Benayoun	4bfe75bd08	M2M100 support for ONNX export (#15193 ) * Add M2M100 support for ONNX export * Delete useless imports * Add M2M100 to tests * Fix protobuf issue	2022-03-02 10:03:14 +01:00
Steven Liu	6ccfa2170c	Inference for multilingual models (#15836 ) * 📝 first draft for multilingual models * 🖍 make style	2022-03-01 15:10:31 -06:00
NielsRogge	c008afea3c	Add link to notebooks (#15791 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-01 17:44:20 +01:00
Patrick von Platen	9863f7d228	[Benchmark tools] Deprecate all (#15848 ) * [Benchmark tools] Deprecate all * up	2022-03-01 11:26:20 +01:00
Eduardo Gonzalez Ponferrada	df5a4094a6	Add Data2Vec (#15507 ) * Add data2vec model cloned from roberta * Add checkpoint conversion script * Fix copies * Update docs * Add checkpoint conversion script * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update documentation * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. * add inputs to logits to data2vec' * correct autio models * correct config auto * correct tok auto * Update utils/tests_fetcher.py * delete unnecessary files * delete unnecessary files * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Fix copies * Update docs * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update documentation * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * add inputs to logits to data2vec' * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. * correct autio models * correct config auto * correct tok auto * delete unnecessary files * delete unnecessary files * Update utils/tests_fetcher.py * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Move data2vec tests to new structure * Fix test imports for text tests * Remove fairseq files * Change paper link to arxiv * Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation. * Update text model checkpoint to be facebook/data2vec-text-base * Add 'Copy from' statements and update paper links and docs * fix copy from statements * improve copied from * correct more copied from statements * finish copied from stuff * make style * add model to README * add to master Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-01 11:09:20 +01:00
Sanchit Gandhi	e3342edc4e	Flax Speech-Encoder-Decoder Model (#15613 ) * rebase * Delete shift tokens func * downsample decoder input seq len for init * correct attention mask * add tests * pt flax cross test * make fixup * init file for import * change pt-flax cross test threshold * pt-flax test logits only * move tests * make repo-consistency * consistent indentation Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-28 12:22:36 +01:00
Sayak Paul	84eaa6acf5	Add TFConvNextModel (#15750 ) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed ununsed imports * feat: enabled argument during using drop_path. * chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propoagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality enviornment style. * chore: applied formatting with quality enviornment. * chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-02-25 18:19:16 +01:00
Sylvain Gugger	0118c4f6a8	Re-enable doctests for the quicktour (#15828 ) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove &	2022-02-25 17:46:38 +01:00
Tanay Mehta	7566734d6f	Add model specific output classes to PoolFormer model docs (#15746 ) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs	2022-02-25 13:43:56 +01:00
Steven Liu	fecb08c2b8	🧼 NLP task guides (#15731 ) * clean commit of changes to NLP tasks * 🖍 apply feedback * 📝 move tf data collator in multiple choice Co-authored-by: Steven <stevhliu@gmail.com>	2022-02-23 13:58:33 -06:00
Julien Chaumond	32f5de10a0	[doc] custom_models: mention security features of the Hub (#15768 ) * custom_models: tiny doc addition * mention security feature earlier in the section Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-02-23 11:40:06 -05:00
Nicolas Patry	f9582c205a	Adding ZeroShotImageClassificationPipeline (#12119 ) * [Proposal] Adding ZeroShotImageClassificationPipeline - Based on CLIP * WIP, Resurection in progress. * Resurrection... achieved. * Reword handling different `padding_value` for `feature_extractor` and `tokenizer`. * Thanks doc-builder ! * Adding docs + global namespace `ZeroShotImageClassificationPipeline`. * Fixing templates. * Make the test pass and be robust to floating error. * Adressing suraj's comments on docs mostly. * Tf support start. * TF support. * Update src/transformers/pipelines/zero_shot_image_classification.py Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-23 09:41:42 +01:00
Patrick von Platen	c44d3675c2	Time stamps for CTC models (#15687 ) * [Wav2Vec2 Time Stamps] * Add first version * add word time stamps * Fix * save intermediate space * improve * [Finish CTC Tokenizer] * remove @ * remove @ * push * continue with phonemes * up * finish PR * up * add example * rename * finish * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct split * finalize Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-22 19:26:44 +01:00
Francesco Saverio Zuppichini	38bed912e3	added link to our writing-doc document (#15756 )	2022-02-22 09:57:28 +01:00
Joao Gante	3956b133b6	TF text classification examples (#15704 ) * Working example with to_tf_dataset * updated text_classification * more comments	2022-02-21 17:17:59 +00:00
Gunjan Chhablani	2c2a31ffbc	Add missing PLBart entry in README (#15721 ) * Add missing PLBart entry in index * Fix README * Fix README * Fix style * Change to master model doc	2022-02-18 21:11:42 +01:00
Gunjan Chhablani	ae1f835028	Add PLBart (#13269 ) * Init PLBART * Add missing configuration file * Add conversion script and configurationf ile * Fix style * Update modeling and conversion scripts * Fix scale embedding in config * Add comment * Fix conversion script * Add classification option to conversion script * Fix vocab size in config doc * Add tokenizer files from MBart50 * Allow no lang code in regular tokenizer * Add PLBart Tokenizer Converters * Remove mask from multi tokenizer * Remove mask from multi tokenizer * Change from MBart-50 to MBart tokenizer * Fix names and modify src/tgt behavior * Fix imports for tokenizer * Remove <mask> from multi tokenizer * Fix style * Change tokenizer_class to processor_class * Add attribute map to config class * Update modeling file to modified MBart code * Update configuration file to MBart style configuration * Fix tokenizer * Separate tokenizers * Fix error in tokenization auto * Copy MBart tests * Replace with MBart tokenization tests * Fix style * Fix language code in multi tokenizer * Fix configuration docs * Add entry for plbart_multi in transformers init * Add dummy objects and fix imports * Fix modeling tests * Add TODO in config * Fix copyright year * Fix modeling docs and test * Fix some tokenization tests and style * Add changes from review * Fix copies * Fix docs * Fix docs * Fix style * Fix year * Add changes from review * Remove extra changes * Fix base tokenizer and doc * Fix style * Fix modeling and slow tokenizer tests * Remove Multi-tokenizer Converter and Tests * Delete QA model and Multi Tokenizer dummy objects * Fix repo consistency and code quality issues * Fix example documentation * Fix style * Remove PLBartTokenizer from type checking in init * Fix consistency issue * Add changes from review * Fix style * Remove PLBartTokenizerFast * Remove FastTokenizer converter * Fix AutoTokenzier mapping * Add plbart to toctree and fix consistency issues * Add language codes tokenizer test * Fix styling and doc issues * Add fixes for failing tests * Fix copies * Fix failing modeling test * Change assert to assertTrue in modeling tests	2022-02-18 14:17:09 +01:00
Francesco Saverio Zuppichini	240cc6cbdc	Adding a model, more doc for pushing to the hub (#15690 ) * doc for adding a model to the hub * run make style * resolved conversation * removed a line * removed ) * Update docs/source/add_new_model.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/add_new_model.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-18 09:11:18 +01:00
NielsRogge	57882177be	Add SimMIM (#15586 ) * Add first draft * Make model importable * Make SwinForMaskedImageModeling importable * Fix imports * Add missing inits * Add support for Swin * Fix bug * Fix bug * Fix another bug * Fix Swin MIM implementation * Fix default encoder stride * Fix Swin * Add print statements for debugging * Add image_size data argument * Fix Swin * Fix image_size * Add print statements for debugging * Fix print statement * Remove print statements * Improve reshaping of bool_masked_pos * Add support for DeiT, fix tests * Improve docstrings * Apply new black version * Improve script * Fix bug * Improve README * Apply suggestions from code review * Remove DS_Store and add to gitignore * Apply suggestions from code review + fix BEiT Flax * Revert BEiT changes * Improve README * Fix code quality * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-02-17 19:44:55 +01:00
Yih-Dar	92a537d938	Minor fix on README.md (#15688 ) * fix README * fix more arxiv links * make fix-copies Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-17 08:38:32 -05:00
Tanay Mehta	f84e0dbd2a	Add PoolFormer (#15531 ) * Added all files, PoolFormerFeatureExtractor still failing tests * Fixed PoolFormerFeatureExtractor not being able to import * Completed Poolformer doc * Applied Suggested fixes * Fixed errors in modeling_auto.py * Fix feature extractor, convert docs to Markdown, styling of code * Remove PoolFormer from check_repo and fix integration test * Remove Poolformer from check_repo * Fixed configuration_poolformer.py docs and removed inference.py from poolformer * Ran with black v22 * Added PoolFormer to _toctree.yml * Updated poolformer doc * Applied suggested fixes and added on README.md * Did make fixup and make fix-copies, tests should pass now * Changed PoolFormer weights conversion script name and fixed README * Applied fixes in test_modeling_poolformer.py and modeling_poolformer.py * Added PoolFormerFeatureExtractor to AutoFeatureExtractor API Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-17 13:16:37 +01:00
Francesco Saverio Zuppichini	b87c044c79	Usage examples for logger (#15657 ) * logger * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-02-16 10:15:13 +01:00
Stas Bekman	bee361c6f1	[t5/t0/mt5 models] faster/leaner custom layer norm (#14656 ) * [t5] faster/leaner custom layer norm * wip * apex.normalization.FusedRMSNorm * cleanup * cleanup * add doc * add catch all * Trigger CI * expand	2022-02-15 16:49:57 -08:00
Patrick von Platen	2e12b907ae	TF generate refactor - Greedy Search (#15562 ) * TF generate start refactor * Add tf tests for sample generate * re-organize * boom boom * Apply suggestions from code review * re-add * add all code * make random greedy pass * make encoder-decoder random work * further improvements * delete bogus file * make gpt2 and t5 tests work * finish logits tests * correct logits processors * correct past / encoder_outputs drama * refactor some methods * another fix * refactor shape_list * fix more shape list * import shape _list * finish docs * fix imports * make style * correct tf utils * Fix TFRag as well * Apply Lysandre's and Sylvais suggestions * Update tests/test_generation_tf_logits_process.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/tf_utils.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * remove cpu according to gante * correct logit processor Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-02-15 17:54:43 +01:00
Nicolas Patry	cdf19c501d	Re-export `KeyDataset`. (#15645 ) * Re-export `KeyDataset`. * Update the docs locations.	2022-02-15 17:49:38 +01:00
Stas Bekman	28e6155d8a	add a network debug script and document it (#15652 ) * add a network debug script and document it * doc	2022-02-15 08:48:00 -08:00
jonrbates	86a7845c0c	Fix typo in speech2text2 doc (#15617 ) Forward looks for inputs, not input_ids	2022-02-15 13:54:34 +01:00
fra	05a8580964	Revert "logger doc" This reverts commit `41168a49ce`.	2022-02-15 10:46:45 +01:00
fra	41168a49ce	logger doc	2022-02-15 10:03:28 +01:00
NielsRogge	b090b79022	Make Swin work with VisionEncoderDecoderModel (#15527 ) * Add attribute_map * Add mention in docs * Set hidden_size attribute correctly * Add note about Transformer-based models only Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-14 17:33:35 +01:00
Daniel Erenrich	4f403ea899	Fix grammar in tokenizer_summary (#15614 ) "to make ensure" is redundant.	2022-02-11 16:51:30 -05:00
Stas Bekman	f15c99fabf	[deepspeed docs] misc additions (#15585 ) * [deepspeed docs] round_robin_gradients * training and/or eval/predict loss is * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-11 10:54:04 -08:00
Steven Liu	85aee09e9a	🖍 remove broken link (#15615 )	2022-02-11 12:33:55 -06:00
Sylvain Gugger	6cf06d198c	Mark "code in the Hub" API as experimental (#15624 )	2022-02-11 09:55:31 -05:00
Ngo Quang Huy	c0864d98ba	Correct JSON format (#15600 )	2022-02-10 09:02:03 -08:00
lewtun	2e8b85f72e	Add local and TensorFlow ONNX export examples to docs (#15604 ) * Add local and TensorFlow ONNX export examples to docs * Use PyTorch - TensorFlow split	2022-02-10 16:31:00 +01:00
Alberto Bégué	cb7ed6e083	Add Tensorflow handling of ONNX conversion (#13831 ) * Add TensorFlow support for ONNX export * Change documentation to mention conversion with Tensorflow * Refactor export into export_pytorch and export_tensorflow * Check model's type instead of framework installation to choose between TF and Pytorch Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Alberto Bégué <alberto.begue@della.ai> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-02-10 11:18:41 +01:00
Sylvain Gugger	c722753afd	Expand tutorial for custom models (#15587 ) * Expand tutorial for custom models * Style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-02-09 17:44:28 -05:00
NielsRogge	a86ee2261e	Add link (#15588 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-09 23:33:39 +01:00
Stas Bekman	dee17d5676	[trainer docs] document how to select specific gpus (#15551 ) * [trainer docs] document how to select specific gpus * expand * add urls * add accelerate launcher	2022-02-09 10:12:29 -08:00
Chan Woo Kim	2b5603f6ac	Constrained Beam Search [without disjunctive decoding] (#15416 ) * added classes to get started with constrained beam search * in progress, think i can directly force tokens now but not yet with the round robin * think now i have total control, now need to code the bank selection * technically works as desired, need to optimize and fix design choices leading to undersirable outputs * complete PR #1 without disjunctive decoding * removed incorrect tests * Delete k.txt * Delete test.py * Delete test.sh * revert changes to test scripts * genutils * full implementation with testing, no disjunctive yet * shifted docs * passing all tests realistically ran locally * removing accidentally included print statements * fixed source of error in initial PR test * fixing the get_device() vs device trap * fixed documentation docstrings about constrained_beam_search * fixed tests having failing for Speech2TextModel's floating point inputs * fix cuda long tensor * added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search * deleted accidentally added test halting code with assert False * code reformat * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py * fixing based on comments on PR * took out the testing code that should but work fails without the beam search moditification ; style changes * fixing comments issues * docstrings for ConstraintListState * typo in PhrsalConstraint docstring * docstrings improvements Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 16:59:26 +01:00
Leandro von Werra	d923f76203	add model scaling section (#15119 ) * add model scaling section * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * integrate reviewer feedback * initialize GPU properly * add note about BnB optimizer * move doc from `scaling.mdx` to `performance.mdx` * integrate reviewer feedback * revert section levels Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-09 15:27:30 +01:00
Sylvain Gugger	b5c6fdecf0	PoC for a ProcessorMixin class (#15549 ) * PoC for a ProcessorMixin class * Documentation * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Roll out to other processors * Add base feature extractor class in init * Use args and kwargs Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 09:24:49 -05:00
Nathan Raw	fcb4f11c92	📝 Add codecarbon callback to docs (#15563 )	2022-02-08 14:10:53 -05:00
Joao Gante	8406fa6dd5	Add TFSpeech2Text (#15113 ) * Add wrapper classes * convert inner layers to tf * Add TF Encoder and Decoder layers * TFSpeech2Text models * Loadable model * TF model with same outputs as PT model * test skeleton * correct tests and run the fixup * correct attention expansion * TFSpeech2Text pask_key_values with TF format	2022-02-08 16:27:23 +00:00
aaron	87d08afb16	electra is added to onnx supported model (#15084 ) * electra is added to onnx supported model * add google/electra-base-generator for test onnx module Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>	2022-02-08 15:47:49 +01:00
Steven Liu	552f8d3091	Create a custom model guide (#15489 ) * 📝 add config section * 📝 finish first draft * 📝 add feature extractor and processor * 🖍 apply feedback from review * 📝 minor edits * last review	2022-02-07 12:34:56 -06:00
lewtun	6775b211b6	Remove Longformers from ONNX-supported models (#15273 )	2022-02-07 17:32:13 +01:00
NielsRogge	84eec9e6ba	Add ConvNeXT (#15277 ) * First draft * Add conversion script * Improve conversion script * Improve docs and implement tests * Define model output class * Fix tests * Fix more tests * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Apply suggestions from code review * Rename dims to hidden_sizes * Fix equivalence test * Rename gamma to gamma_parameter * Clean up conversion script * Add ConvNextFeatureExtractor * Add corresponding tests * Implement feature extractor correctly * Make implementation cleaner * Add ConvNextStem class * Improve design * Update design to also include encoder * Fix gamma parameter * Use sample docstrings * Finish conversion, add center cropping * Replace nielsr by facebook, make feature extractor tests smaller * Fix integration test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-07 16:11:37 +01:00
Stas Bekman	8ce1330631	[deepspeed docs] DeepSpeed ZeRO Inference (#15486 ) * [deepspeed docs] DeepSpeed ZeRO Inference * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweak * deal with black * extra cleanup, better comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-04 13:51:02 -08:00
Sylvain Gugger	ac6aa10f23	Standardize semantic segmentation models outputs (#15469 ) * Standardize instance segmentation models outputs * Rename output * Update src/transformers/modeling_outputs.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add legacy argument to the config and model forward * Update src/transformers/models/beit/modeling_beit.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Copy fix in Segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-02-04 14:52:07 -05:00
Stas Bekman	31be2f45a9	[deepspeed docs] Megatron-Deepspeed info (#15488 )	2022-02-04 11:15:13 -08:00
Stas Bekman	21dcaec5d5	[deepspeed docs] memory requirements (#15506 )	2022-02-03 10:55:14 -08:00
Sylvain Gugger	44b21f117b	Save code of registered custom models (#15379 ) * Allow dynamic modules to use relative imports * Work for configs * Fix last merge conflict * Save code of registered custom objects * Map strings to strings * Fix test * Add tokenizer * Rework tests * Tests * Ignore fixtures py files for tests * Tokenizer test + fix collection * With full path * Rework integration * Fix typo * Remove changes in conftest * Test for tokenizers * Add documentation * Update docs/source/custom_models.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add file structure and file content * Add more doc * Style * Update docs/source/custom_models.mdx Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-02 10:44:37 -05:00
Steven Liu	b9418a1d97	Update tutorial docs (#15165 ) * first draft of pipeline, autoclass, preprocess tutorials * apply review feedback * 🖍 apply feedback from patrick/niels * 📝add output image to preprocessed image * 🖍 apply feedback from patrick	2022-02-01 18:31:35 -06:00
Steven Liu	c157c7e3fd	Update fine-tune docs (#15259 ) * add fine-tune tutorial * make edits, fix style * 📝 make edits * 🖍 fix code format links to external libraries * 🔄revert code formatting * 🖍 use DefaultDataCollator instead of DataCollatorWithPadding	2022-02-01 18:28:12 -06:00
Stas Bekman	44c7857b87	[deepspeed doc] fix import, extra notes (#15400 ) * [deepspeed doc] fix import, extra notes * typo	2022-01-31 08:28:10 -08:00
NielsRogge	47df0f2234	Add header (#15434 )	2022-01-31 11:15:54 -05:00
Ogundepo Odunayo	282ae123e2	add t5 ner finetuning (#15432 )	2022-01-31 17:03:06 +01:00
Soonhwan-Kwon	e09473a817	Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py (#13727 ) * add xlm roberta xl * add convert xlm xl fairseq checkpoint to pytorch * fix init and documents for xlm-roberta-xl * fix indention * add test for XLM-R xl,xxl * fix model hub name * fix some stuff * up * correct init * fix more * fix as suggestions * add torch_device * fix default values of doc strings * fix leftovers * merge to master * up * correct hub names * fix docs * fix model * up * finalize * last fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add copied from * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-29 13:42:37 +01:00
Steven Liu	16d4acbfdb	Get started docs (#15098 ) * clean commit of changes * apply review feedback, make edits * fix backticks, minor formatting * 🖍 make fixup and minor edits * 🖍 fix # in header * 📝 update code sample without from_pt * 📝 final review	2022-01-28 19:01:37 -06:00
Steven Liu	cabd6d26a2	Update model share tutorial (#15288 ) * add model sharing tutorial * 🖍 apply feedback from review * 📝 make edits * 🖍 fix formatting * 📝 convert from pt checkpoint to flax * 📝 final review	2022-01-28 18:49:26 -06:00
Suraj Patil	d25e25ee2b	Add XGLM models (#14876 ) * add xglm * update vocab size * fix model name * style and tokenizer * typo * no mask token * fix pos embed compute * fix args * fix tokenizer * fix positions * fix tokenization * style and dic fixes * fix imports * add fast tokenizer * update names * add pt tests * fix tokenizer * fix typo * fix tokenizer import * fix fast tokenizer * fix tokenizer * fix converter * add tokenizer test * update checkpoint names * fix tokenizer tests * fix slow tests * add copied from comments * rst -> mdx * flax model * update flax tests * quality * style * doc * update index and readme * fix copies * fix doc * update toctrr * fix indent * minor fixes * fix config doc * don't save embed_pos weights * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * address Sylvains commnets, few doc fixes * fix check_repo * align order of arguments * fix copies * fix labels * remove unnecessary mapping * fix saving tokenizer Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-28 18:55:23 +01:00
Ngo Quang Huy	4996922b6d	[docs] fix wrong file name in `pr_check` (#15380 )	2022-01-28 07:52:01 -05:00
Steven Liu	f5db6ce76a	Fix code format for Accelerate doc (#15335 ) * 🖍 fix code syntax to external libraries and replace image * 🔄revert code formatting, replace image with code block * 🖍 apply feedback	2022-01-27 13:49:04 -06:00
Lysandre	f87db5e412	Release: v4.16.0	2022-01-27 13:06:33 -05:00
Sylvain Gugger	8f6454bfac	Add proper documentation for Keras callbacks (#15374 ) * Add proper documentation for Keras callbacks * Add dummies	2022-01-27 10:51:38 -05:00
Stas Bekman	fc8fc400e3	[docs] post-PR merge fix (#15355 ) * [docs] post-PR merge fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-26 11:23:32 -08:00
novice	99a2771189	Add YOSO (#15091 ) * Add cookiecutter files * Add cuda kernels and cpp files * Update modeling_yoso.py * Add .h files * Update configuration_yoso.py * Updates * Remove tokenizer * Code quality * Update modeling_yoso.py * Update modeling_yoso.py * Fix failing test * Update modeling_yoso.py * Fix code quality * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review and fix integration tests * Update src/transformers/models/yoso/modeling_yoso.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Apply suggestions from code review * Fix copied from statement * Fix docstring * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions and fix mask * Apply suggestions from code review * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings * Fix code quality * Remove trailing whitespace * Update yoso.mdx * Move kernel loading to YosoEncoder * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/yoso/modeling_yoso.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add short summary to docs * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update yoso.mdx * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove CausalLM model and add copied from * Remove autoregressive code * Remove unused imports * add copied from for embeddings * Fix code quality * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestion from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-26 19:18:29 +01:00
Ngo Quang Huy	5d8b98608c	Fix deepspeed docs (#15346 )	2022-01-26 07:24:33 -05:00
Jacob Deppen	96161ac408	make table into valid Markdown table syntax (#15337 )	2022-01-26 07:10:00 -05:00
Maciej Pawłowski	e79a0faeae	Added missing code in exemplary notebook - custom datasets fine-tuning (#15300 ) * Added missing code in exemplary notebook - custom datasets fine-tuning Added missing code in tokenize_and_align_labels function in the exemplary notebook on custom datasets - token classification. The missing code concerns adding labels for all but first token in a single word. The added code was taken directly from huggingface official example - this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb). * Changes requested in the review - keep the code as simple as possible	2022-01-25 17:26:17 -05:00
Steven Liu	0501beb846	Add 🤗 Accelerate tutorial (#15263 ) * add accelerate tutorial * 🖍 apply feedback from review * 📝 make edits	2022-01-25 13:46:11 -06:00
novice	d43e308e7f	Add Swin Transformer (#15085 ) * Add all files * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updates * Apply suggestions from review * Fix failing tests * Update __init__.py * Update configuration_swin.py * Update auto_factory.py * Fix pytests * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fix tests and default checkpoint * Fix Recursion error * Code quality * Remove copied from * Update modeling_swin.py * Code quality * Update modeling_swin.py * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Fix feature extractor * Fix code quality * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Update configuration_swin.py * Update default checkpoint * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/swin.mdx Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> * Update conversion script * Reformat conversion script Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>	2022-01-21 12:10:41 +01:00
NielsRogge	515ed3ad2a	Fix doc examples (#15257 )	2022-01-20 21:51:51 +01:00
Kamal Raj	08b41b413a	Update pipelines.mdx (#15243 ) fix few spelling mistakes	2022-01-20 08:46:48 -05:00
NielsRogge	80f7296091	Update Trainer code example (#15070 ) * Update code example * Fix code quality * Add comment	2022-01-19 20:15:12 +01:00
NielsRogge	ac227093e4	Add ViLT (#14895 ) * First commit * Add conversion script * Make conversion script work for base model * More improvements * Update conversion script, works for vqa * Add indexing argument to meshgrid * Make conversion script work for ViltForPreTraining * Add ViltForPreTraining to docs * Fix device issue * Add processor * Add MinMaxResize to feature extractor * Implement call method of ViltProcessor * Fix tests * Add integration test * Add loss calculation for VQA * Improve tests * Improve some more tests * Debug tests * Small improvements * Add support for attention_mask * Remove mask_it * Add pixel_mask * Add tests for ViltFeatureExtractor * Improve tests * Add ViltForNaturalLanguageVisualReasoning * Add ViltForNaturalLanguageVisualReasoning to conversion script * Minor fixes * Add support for image_embeds, update docstrings to markdown * Update docs to markdown * Improve conversion script * Rename ViltForPreTraining to ViltForMaskedLM * Improve conversion script * Convert docstrings to markdown * Fix code example of retrieval model * Properly convert masked language model * Add integration test for nlvr * Fix code quality * Apply suggestions from code review * Add copied from statements * Fix pretrained_config_archive_map * Fix docs * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Make code more readable * Add ViltForNaturalLanguageVisualReasoning to the tests * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering * Replace pixel_values_2 by single tensor * Add hidden_states and attentions * Fix one more test * Fix all tests * Update year * Fix rebase issues * Fix another rebase issue * Remove ViltForPreTraining from auto mapping * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval * Make it possible to use BertTokenizerFast in the processor * Use BertTokenizerFast by default * Rename ViltForNaturalLanguageVisualReasoning, define custom model output Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-19 19:51:59 +01:00
NielsRogge	842298f84f	[ViTMAE] Various fixes (#15221 ) * Add MAE to AutoFeatureExtractor * Add link to notebook * Fix relative paths	2022-01-19 15:27:57 +01:00
Li-Huai (Allan) Lin	841d979190	Add FastTokenizer to REALM (#15211 ) * Remove BertTokenizer abstraction * Add FastTokenizer to REALM * Fix config archive map * Fix copies * Update realm.mdx * Apply suggestions from code review	2022-01-19 15:19:36 +01:00
Sylvain Gugger	db3503949d	Finish conversion of REALM doc to MDX	2022-01-18 18:00:30 -05:00
Jake Tae	fe78fe98ca	Enable tqdm toggling (#15167 ) * feature: enable tqdm toggle * test: add tqdm unit test * style: run linter * Update tests/test_tqdm_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * refactor: use tiny model, run linter * docs: add tqdm to logging * docs: add tqdm reference to `http_get` * style: run linter * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * fix: use `AutoConfig` for framework agnostic testing * chore: mv tqdm test to `test_logging.py` * feature: implement enable/disable functions * docs: mv docstring to comment * chore: mv tqdm functions to `logging.py` * docs: update docs to reference `enable/disable` funcs * test: update test to use `enable/disable` func * chore: update function reference in comment Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-01-18 17:52:35 -05:00
NielsRogge	74bec9865c	Add MAE (#15120 ) * First draft * More improvements * More improvements * More improvements * Fix embeddings * Add conversion script * Finish conversion script * More improvements * Fix forward pass * Remove print statements * Add weights initialization * Add initialization of decoder weights * Add support for other models in the conversion script * Fix patch_size for huge model * Fix most of the tests * Fix integration test * Fix docs * Fix archive_list * Apply suggestions from code review * Improve documentation * Apply more suggestions * Skip some tests due to non-deterministic behaviour * Fix test_initialization * Remove unneccessary initialization of nn.Embedding * Improve docs * Fix dummies * Remove ViTMAEFeatureExtractor from docs * Add model to README and table of contents * Delete inference file	2022-01-18 16:21:32 +01:00
Li-Huai (Allan) Lin	22454ae492	Add REALM (#13292 ) * REALM initial commit * Retriever OK (Update new_gelu). * Encoder prediction score OK * Encoder pretrained model OK * Update retriever comments * Update docs, tests, and imports * Prune unused models * Make embedder as a module `RealmEmbedder` * Add RealmRetrieverOutput * Update tokenization * Pass all tests in test_modeling_realm.py * Prune RealmModel * Update docs * Add training test. * Remove completed TODO * Style & Quality * Prune `RealmModel` * Fixup * Changes: 1. Remove RealmTokenizerFast 2. Update docstrings 3. Add a method to RealmTokenizer to handle candidates tokenization. * Fix up * Style * Add tokenization tests * Update `from_pretrained` tests * Apply suggestions * Style & Quality * Copy BERT model * Fix comment to avoid docstring copying * Make RealmBertModel private * Fix bug * Style * Basic QA * Save * Complete reader logits * Add searcher * Complete searcher & reader * Move block records init to constructor * Fix training bug * Add some outputs to RealmReader * Add finetuned checkpoint variable names parsing * Fix bug * Update REALM config * Add RealmForOpenQA * Update convert_tfrecord logits * Fix bugs * Complete imports * Update docs * Update naming * Add brute-force searcher * Pass realm model tests * Style * Exclude RealmReader from common tests * Fix * Fix * convert docs * up * up * more make style * up * upload * up * Fix * Update src/transformers/__init__.py * adapt testing * change modeling code * fix test * up * up * up * correct more * make retriever work * update * make style * finish main structure * Resolve merge conflict * Make everything work * Style * Fixup * Fixup * Update training test * fix retriever * remove hardcoded path * Fix * Fix modeling test * Update model links * Initial retrieval test * Fix modeling test * Complete retrieval tests * Fix * style * Fix tests * Fix docstring example * Minor fix of retrieval test * Update license headers and docs * Apply suggestions from code review * Style * Apply suggestions from code review * Add an example to RealmEmbedder * Fix Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-18 07:24:13 -05:00
Stas Bekman	edd3fce2f7	[doc] new MoE paper (#15184 ) add new paper	2022-01-17 09:10:51 -08:00
Stas Bekman	669e3c50c9	[doc] performance: Efficient Software Prebuilds (#15147 ) * Efficient Software Prebuilds * improve	2022-01-14 18:25:20 -08:00
AK391	4663c609b9	Add "open in hf spaces" gradio button issue #73 (#15106 ) * update XLMProphetNet link * update DPR link * change prophetnet link * change link MBART * change link GPT * update gpt2 link * ctrl update link * update Transformer-XL link * Update Reformer link * update xlnet link * bert update link * udpate albert link * roberta update link * update distilbert link * update convbert link * update XLM link * xlm roberta update link * update Flaubert link * update electra link * update funnel transformer and longformer * bart update link * pegasus update link * udpate marianmt link * t5 update link * mt5 update link	2022-01-14 10:12:30 -05:00
Carlos Aguayo	3fc221d077	Update model_sharing.mdx (#15142 ) Fix typo	2022-01-13 12:26:02 -05:00
lewtun	021f2ea987	Add ONNX configuration classes to docs (#15121 ) * Add ONNX classes to main package * Remove permalinks from ONNX guide * Fix ToC entry * Revert "Add ONNX classes to main package" This reverts commit `eb794a5b00`. * Add ONNX classes to main doc * Fix syntax highlighting in doc * Fix text * Add FeaturesManager to doc * Use paths to reference ONNX classes * Add FeaturesManager to init * Add missing ONNX paths	2022-01-12 16:33:32 +01:00
Sylvain Gugger	c425d60bb9	Fix link to deepspeed config	2022-01-12 09:32:53 -05:00
lewtun	16f0b7d72c	Update ONNX docs (#14904 ) * Remove docs for deprecated ONNX export * Tidy up the CLI help messages * Revamp ONNX docs * Update auto-config table * Use DistilBERT as example for consistency * Wrap up first pass at ONNX docs * Fix table check * Add tweaks and introduction * Add cross-ref * Fix missing import * Fix style * Add permalinks to ONNX configs * Clarify role of OrderedDict * Update docs/source/serialization.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add doctest syntax to code blocks * Remove permalinks * Revert "Remove permalinks" This reverts commit `099701daf0`. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-11 18:06:05 +01:00
AK391	68d925195e	Merge branch 'master' into master	2022-01-11 11:11:29 -05:00
novice	28e091430e	Add Nystromformer (#14659 ) * Initial commit * Config and modelling changes Added Nystromformer-specific attributes to config and removed all decoder functionality from modelling. * Modelling and test changes Added Nystrom approximation and removed decoder tests. * Code quality fixes * Modeling changes and conversion script Initial commits to conversion script, modeling changes. * Minor modeling changes and conversion script * Modeling changes * Correct modeling, add tests and documentation * Code refactor * Remove tokenizers * Code refactor * Update __init__.py * Fix bugs * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/nystromformer.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/convert_nystromformer_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update modeling and test_modeling * Code refactor * .rst to .mdx * doc changes * Doc changes * Update modeling_nystromformer.py * Doc changes * Fix copies * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update configuration_nystromformer.py * Fix copies * Update tests/test_modeling_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update test_modeling_nystromformer.py * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Fix code style * Update modeling_nystromformer.py * Update modeling_nystromformer.py * Fix code style * Reformat modeling file * Update modeling_nystromformer.py * Modify NystromformerForMultipleChoice * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Code style changes and torch.no_grad() * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-11 14:25:49 +01:00
Virus	c4fa908fa9	Adds IBERT to models exportable with ONNX (#14868 ) * Add IBertOnnxConfig and tests * add all the supported features for IBERT and remove outputs in IbertOnnxConfig * use OnnxConfig * fix codestyle * remove serialization.rst * codestyle	2022-01-11 12:17:08 +01:00
AK391	5cd7086fdb	XLM-ProphetNet Spaces badge	2022-01-11 00:11:31 -05:00
AK391	4e3208662e	DPR Spaces badge	2022-01-10 13:50:40 -05:00
AK391	ac2c06d492	ProphetNet spaces badge	2022-01-10 13:43:34 -05:00
AK391	bf0201e184	MBART spaces badge	2022-01-10 13:37:17 -05:00
Yih-Dar	b67fd797be	Add TFVisionEncoderDecoderModel (#14148 ) * Start the work on TFVisionEncoderDecoderModel * Expose TFVisionEncoderDecoderModel * fix import * Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules() * reorder * Apply the fix for checkpoint loading as in #14016 * remove attention_mask + fix VISION_DUMMY_INPUTS * A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting * fix wrong condition: shape_list(input_ids) == 2 * add tests * use personal TFViTModel checkpoint (for now) * Add equivalence tests + projection layer * style * make sure projection layer can run * Add examples * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean comments (need to work on TODOs for PyTorch models) * Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel * fixes * Revert changes in PT code. * Update tests/test_modeling_tf_vision_encoder_decoder.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Add test_inference_coco_en for TF test * fix quality * fix name * build doc * add main_input_name * Fix ckpt name in test * fix diff between master and this PR * fix doc * fix style and quality * fix missing doc * fix labels handling * Delete auto.rst * Add the changes done in #14016 * fix prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-10 13:30:14 -05:00
AK391	c9504b2f50	MT5 Spaces badge	2022-01-10 12:57:08 -05:00
AK391	daec528ca9	T5 Spaces badge	2022-01-10 12:51:39 -05:00
AK391	0554e4d5c5	MarianMT Spaces badge	2022-01-10 12:47:12 -05:00
AK391	7ec6aad23d	Pegasus Spaces badge	2022-01-10 12:39:22 -05:00
AK391	03f8b9c9e0	BART Spaces badge	2022-01-10 12:33:59 -05:00
Stas Bekman	37bc0b4e53	[performance doc] Power and Cooling (#14935 ) * [performance doc] Power and Cooling * more docs * Update docs/source/performance.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * reword Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-10 09:21:04 -08:00
AK391	20f169b523	Longformer Spaces badge	2022-01-10 12:14:18 -05:00
AK391	4fbc924d0a	Funnel Transformer spaces badge	2022-01-10 12:06:05 -05:00
AK391	222c09a635	ELECTRA Spaces badge	2022-01-10 11:53:23 -05:00
Stas Bekman	31838d3e11	[doc] normalize HF Transformers string (#15023 )	2022-01-10 08:44:33 -08:00
AK391	84f360e862	FlauBERT spaces badge	2022-01-10 11:41:10 -05:00
AK391	9f33116898	XLM-Roberta Spaces badge	2022-01-10 10:54:18 -05:00
AK391	20fa9eb035	XLM Spaces badge	2022-01-10 10:48:06 -05:00
AK391	16b6df6fca	ConvBERT spaces badge	2022-01-10 10:33:03 -05:00
Santiago Castro	f21bc4215a	Use tqdm.auto in Pipeline docs (#14920 ) It's better for e.g. notebook.	2022-01-10 10:28:34 -05:00
Mishig Davaadorj	f012c00ada	Model summary horizontal banners (#15058 )	2022-01-10 10:06:14 -05:00
Minghao Li	b2c477fc6d	support the trocr small models (#14893 ) * support the trocr small models * resolve conflict * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix unexpected indent in processing_trocr.py * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * update the docstring of processing_trocr * remove extra space Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-01-10 09:28:03 -05:00
Yih-Dar	0a03a86813	fix model table cell text alignment (#14999 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-10 06:44:11 -05:00
AK391	5be1242ac0	Merge branch 'huggingface:master' into master	2022-01-07 11:48:22 -05:00
AK391	484e7a441f	Distilbert spaces badge	2022-01-07 11:47:56 -05:00
K.C. Tung	f18c6fa94c	Resubmit changes after rebase to master (#14982 )	2022-01-07 08:34:12 +01:00
AK391	1d71227295	Roberta spaces badge	2022-01-06 18:50:19 -05:00
AK391	cac877425c	ALBERT spaces badge	2022-01-06 13:01:23 -05:00
AK391	794441c379	BERT spaces badge	2022-01-06 12:22:09 -05:00
AK391	f872f18dca	XLNet spaces badge	2022-01-06 12:09:50 -05:00
AK391	8d187e7feb	Reformer Spaces badge	2022-01-06 11:59:21 -05:00
AK391	59fb636948	Transformer-XL badge	2022-01-06 11:47:41 -05:00
AK391	2380136722	add spaces badges	2022-01-04 16:13:57 -05:00
Kevin Ko	857ab55c01	[doc] Update parallelism.mdx (#15018 ) * Update parallelism.mdx * Update parallelism.mdx	2022-01-04 09:58:27 -08:00
Daniel Stancl	21aecc0971	Add Flax RoFormer (#15005 ) * Add FlaxRoFormer * Clean code + make quality * Fix output pooling for FlaxRoFormerForMultipleChoiceModule * Apply suggestions from code review * add flax model to repos Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-04 13:23:10 +01:00
Kevin Ko	f2ab21833f	Update parallelism.mdx (#15013 ) * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx	2022-01-03 11:49:27 -08:00
Sylvain Gugger	8f6373c61c	Map model_type and doc pages names (#14944 ) * Map model_type and doc pages names * Add script * Fix typo * Quality * Manual check for Auto Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2022-01-03 05:08:55 -05:00
Sylvain Gugger	2c5597f6c7	Style	2021-12-27 19:18:08 -05:00
Sylvain Gugger	b5e2b183af	Doc styler examples (#14953 ) * Fix bad examples * Add black formatting to style_doc * Use first nonempty line * Put it at the right place * Don't add spaces to empty lines * Better templates * Deal with triple quotes in docstrings * Result of style_doc * Enable mdx treatment and fix code examples in MDXs * Result of doc styler on doc source files * Last fixes * Break copy from	2021-12-27 19:07:46 -05:00
Stas Bekman	e13f72fbff	[doc] :obj: hunt (#14954 ) * redo sans examples * style	2021-12-27 15:49:48 -08:00
Stas Bekman	133c5e40c4	[doc] consistent True/False/None default format (#14951 ) * [doc] consistent True/False/None default format * Update src/transformers/models/xlnet/modeling_xlnet.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-27 14:31:40 -08:00
Sylvain Gugger	b2f500256e	Convert last rst file (#14952 )	2021-12-27 17:09:37 -05:00
Daniel Stancl	501307b58b	Add `ElectraForCausalLM` -> Enable Electra encoder-decoder model (#14729 ) * Add ElectraForCausalLM and cover some basic tests & need to fix a few tests * Fix bugs * make style * make fix-copies * Update doc * Change docstring to markdown format * Remove redundant update_keys_to_ignore	2021-12-27 12:37:52 +01:00
Nicolas Patry	b058490ceb	ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines. (#14225 ) * Pipeline chunks. * Batching for Chunking pipelines ? * Batching for `question-answering` and `zero-shot-cls`. * Fixing for FNet. * Making ASR a chunk pipeline. * Chunking ASR API. * doc style. * Fixing ASR test. * Fixing QA eror (p_mask, padding is 1, not 0). * Enable both vad and simple chunking. * Max length for vad. * remove inference mode, crashing on s2t. * Revert ChunkPipeline for ASRpipeline. Too many knobs for simple integration within the pipeline, better stick to external convenience functions instead, more control to be had, simpler pipeline and also easier to replace with other things later. * Drop necessity for PT for these. * Enabling generators. * Add mic + cleanup. * Typo. * Typo2. * Remove ASR work, it does not belong in this PR anymore. * Update src/transformers/pipelines/pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding many comments. * Doc quality. * `hidden_states` handling. * Adding doc. * Bad rebase. * Autofixing docs. * Fixing CRITICAL bug in the new Zerocls pipeline. Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-12-27 11:26:20 +01:00
Yih-Dar	8f2cc1c3ab	Add TFCLIPModel (#13967 ) * Start the work for TFCLIPModel * Convert to TF code (TODO: loss + doc) * Clean up * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd * assert -> raise error * Expose TFCLIPModel * Deal with dummy_inputs * Add tests * Fix all tests. TODO: manual check weight loading + add more comments * Fix pt tf equivalence test * fixes * update TFCLIPVisionEmbeddings's Conv2D * Fix loss + overwrite test_pt_tf_model_equivalence from common * Add a comment about the change about MainLayer in test_keras_save_load * Set return_loss=True in TFCLIPModelTester + make tests pass * overwrite test_pt_tf_model_equivalence from tf common * fix base_model_prefix * Fix examples * remove unused * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply review suggestions * change self.pre_layrnorm to self.pre_layernorm * apply more review suggestions * return attention probs before dropout (to align with PT) * fix weight init * fix * build doc * fix missing doc * fix for test Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-23 11:19:44 -05:00
lewtun	6b655cc63f	Add ONNX support for MarianMT models (#14586 ) * First commit to add MarianMT to ONNX * Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward() * Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature * Style fix * Added support for other features for already supported models * Partial support for causal and seq2seq models * Partial support for causal and seq2seq models * Add default task for MarianMT ONNX * Remove automatic creation of decoder_input_ids * Extend inputs and outputs for MarianMT ONNX config * Add MarianMT to ONNX unit tests * Refactor * OnnxSeq2SeqConfigWithPast to support seq2seq models * Parameterized the onnx tests * Restored run_mlm.py * Restored run_mlm.py * [WIP] BART update * BART and MBART * Add past_key_values and fix dummy decoder inputs Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations. * Refactor MarianOnnxConfig to remove custom past_key_values logic * Fix quality * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Refactor Marian export to account for base changes * Fix copies * Implemented suggestions * Extend support for causal LM * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Remove commented import * Remove ONNX model * Remove redundant class method * Tidy up imports * Fix quality * Refactor dummy input function * Add copied from statements to Marian config functions * Remove false copied from comments * Fix copy from comment Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2021-12-23 13:35:56 +01:00
Sylvain Gugger	207594be81	Convert rst files (#14888 ) * Convert all tutorials and guides * Convert all remaining rst to mdx * Track and fix bad links	2021-12-22 16:14:35 -05:00
NielsRogge	7df4b90c76	Fix Perceiver docs (#14879 )	2021-12-22 14:18:03 +01:00
Ryokan RI	824fd44fc3	Feature/fix slow test in mluke (#14749 ) * make MLukeTokenizerTest fast * make LukeTokenizerTest fast * add entry to _toctree.yaml	2021-12-22 06:35:59 -05:00
Lysandre Debut	ec3567fe20	Convert model files from rst to mdx (#14865 ) * First pass * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-22 03:27:30 -05:00
Stas Bekman	185876392c	[doc porting] several docs (#14858 ) * [doc porting] 2 docs * [doc porting] 2 docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/main_classes/deepspeed.mdx * cleanup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-21 09:55:25 -08:00
Stas Bekman	b6ec956976	[logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS (#14669 ) * [logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS * reword	2021-12-20 20:48:38 -08:00
Stas Bekman	c1125dc2ba	[doc] typo (#14849 ) fix small typo	2021-12-20 12:20:21 -05:00
Patrick von Platen	952a77b05d	[Perceiver] Skip multi-gpu tests for now (#14813 ) * [Perceiver] Skip multi-gpu tests for now * Update tests/test_modeling_perceiver.py * up * up	2021-12-20 15:22:50 +01:00
Derek Chia	8a818c26cb	Fix dead link to benchmarks.ipynb (#14842 ) Notebook has been updated here https://github.com/huggingface/notebooks/tree/master/examples/benchmark.ipynb	2021-12-20 09:08:05 -05:00
Anton Lozhkov	3883e3a75e	Add SD and SV heads for WavLM (#14847 ) * Add converted heads * Add dummies	2021-12-20 16:40:56 +03:00
Patrick von Platen	c4a96cecbc	Wav2Vec2 meets phonemes (#14353 ) * up * add tokenizer * improve more * finish tokenizer * finish * adapt speech recognition script * adapt convert * more fixes * more fixes * update phonemizer wav2vec2 * better naming * fix more tests * more fixes swedish * correct tests * finish * improve script * remove file * up * lets get those 100 model architectures until the end of the month * make fix-copies * correct more * correct script * more fixes * more fixes * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace assert * fix copies * fix docs * new try docs * boom boom * update * add phonemizer to audio tests * make fix-copies * up * upload models * some changes * Update tests/test_tokenization_wav2vec2_phoneme.py Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * more fixes * remove @ Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-17 19:56:44 +01:00
Lysandre Debut	77d6c826d8	Convert rst to mdx bert (#14806 ) * BERT to mdx mdx :) c * Update docs/source/model_doc/bert.mdx Co-authored-by: Julien Chaumond <julien@huggingface.co> * Remove all Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-17 11:13:34 -05:00
Patrick von Platen	bef1e3e4a0	Add WavLM (#14354 ) * first commit * fix some stuff * fix more readme * Apply suggestions from code review * update * correct * up * attn layer works * push code * make modedls work * Small change * more refactor * finish * up * fix convertsion * fix position bias * Fix style * fix conversion * make fix-copies * add * clean * fix docs * fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply final changes * make fix-copies Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-16 18:57:05 +01:00
Anton Lozhkov	48463ebb33	Add Speaker Diarization and Verification heads (#14723 ) * Models * Squashed commit of the following: commit 72278e1e931a16d0879acc77f65762f3364833d0 Author: anton-l <aglozhkov@gmail.com> Date: Fri Dec 10 21:45:08 2021 +0300 * Add unispeech heads * Add sd/sv automodels * Docs cleanup * Fix docstrings * rename xvector classes * examples * Tests cleanup * Style * Better checkpoints for tests * leftover docs * apply review suggestions * Style + init tests * Update unispeech-sat tdnn downsampling	2021-12-16 19:22:14 +03:00
Lysandre Debut	8010fda9bf	Removes images to put them in a dataset (#14781 ) * First try * Update instructions	2021-12-16 04:42:02 -05:00
Sylvain Gugger	459677aebe	PoC for conserving old links (#14754 ) * PoC for conserving old links * Do the same for other links * remap the redirects section * add instructions on how to move sections * improve Co-authored-by: Stas Bekman <stas@stason.org>	2021-12-15 11:40:47 -08:00
NielsRogge	50bc57cef8	Update Perceiver code examples (#14783 ) * Fix code examples * Fix code example	2021-12-15 11:06:38 -05:00
Xing Han Lu	72c6e8b8bf	Update t5.rst (#14776 )	2021-12-15 14:59:11 +01:00
Stas Bekman	fdf3ce2827	[doc] performance: groups of operations by compute-intensity (#14757 )	2021-12-14 19:01:23 -08:00
Amit Chaudhary	851a78978a	Fix broken links to distillation on index page of documentation (#14722 ) * Fix broken links to distillation on index page of documentation * Fix broken link for distillation in main README * Run make fixup	2021-12-14 21:55:33 -05:00
Sylvain Gugger	322d416916	Update Table of Contents (#14755 )	2021-12-13 17:15:19 -05:00
Sylvain Gugger	7533d30acd	Convert Trainer doc page to MarkDown (#14753 ) * Convert Trainer doc page to MarkDown * Fix repo consistency * Fix the doc build test job	2021-12-13 13:09:50 -05:00
Sylvain Gugger	c3cd88a9ba	Small fixes for the doc (#14751 )	2021-12-13 11:17:01 -05:00
Lucien	fc74c84537	Swap TF and PT code inside two blocks (#14742 )	2021-12-13 10:31:11 -05:00
Lysandre Debut	6e05bb1c96	Fix the perceiver docs (#14748 )	2021-12-13 09:29:47 -05:00
NielsRogge	4c99e553c1	Improve documentation of some models (#14695 ) * Migrate docs to mdx * Update TAPAS docs * Remove lines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add pt/tf switch to code examples * More improvements * Improve docstrings * More improvements Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-13 13:24:36 +01:00
Stas Bekman	027074f4d0	[doc] document MoE model approach and current solutions (#14725 ) * document MoE model approach * additional info from Samyam * fix	2021-12-10 18:24:38 -08:00
Sylvain Gugger	5eca742f6c	Fix special character in MDX (#14721 )	2021-12-10 16:02:48 -05:00
Sylvain Gugger	63c284c2d4	Prevent style_doc from tempering the config file	2021-12-10 15:31:43 -05:00
Sylvain Gugger	1b75d7238c	Automatically build doc notebooks (#14718 ) * Test workflow * Build doc * Make a clean build * Add doc config * Restore other workflows * Final job * Print something in else statements * Pull before making changes	2021-12-10 14:20:56 -05:00
Sylvain Gugger	bab1556456	Put back open in colab markers (#14684 )	2021-12-09 12:00:06 -05:00
Tikeng Notsawo Pascal Junior	3bc7d70e9c	Fix : wrong link in the documentation (ConvBERT vs DistilBERT) (#14705 )	2021-12-09 11:35:22 -05:00
Mishig Davaadorj	60be4bf8ac	Fix typo in toctree (#14704 )	2021-12-09 09:25:31 -05:00
Sylvain Gugger	13186d7152	Move pyctcdecode (#14686 ) * Move pyctcdecode dep * Fix doc and last objects * Quality * Style * Ignore this black	2021-12-08 15:41:58 -05:00
Stas Bekman	1228661285	[bf16 support] tweaks (#14580 ) * [bf16 support] tweaks * corrections Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>	2021-12-08 11:33:24 -08:00
Sylvain Gugger	01b8cd5932	Revert open-in-colab and add perceiver (#14683 )	2021-12-08 13:52:31 -05:00
Sylvain Gugger	cf36f4d7a8	Convert tutorials (#14665 ) * Convert a few docs * And another * Last tutorials * New syntax for colab links * Convert a few docs * And another * Last tutorials * New syntax for colab links	2021-12-08 13:19:46 -05:00
NielsRogge	65b20b739b	Add Perceiver IO (#14487 ) * First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal modal * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debuggin * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init * Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI	2021-12-08 14:20:34 +01:00
Patrick von Platen	961732c276	[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339 ) * up * up * up * make it cleaner * correct * make styhahalal * add more tests * finish * small fix * make style * up * tryout to solve cicrle ci * up * fix more tests * fix more tests * apply sylvains suggestions * fix import * correct docs * add pyctcdecode only to speech tests * fix more tests * add tf, flax and pt tests * add pt * fix last tests * fix more tests * Apply suggestions from code review * change lines * Apply suggestions from code review Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * correct tests * correct tests * add doc string Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-08 12:07:54 +01:00
Ryokan RI	30646a0a3c	Add mLUKE (#14640 ) * implement MLukeTokenizer and LukeForMaskedLM * update tests * update docs * add LukeForMaskedLM to check_repo.py * update README * fix test and specify the entity pad id in tokenization_(m)luke * fix EntityPredictionHeadTransform	2021-12-07 00:25:28 -05:00
tucan9389	0f3f045ebd	Add GPTJForQuestionAnswering (#14503 ) * Add GPTJForQuestionAnswering * Reformat for GPTJForQuestionAnswering * Fix isort error * make style for GPTJForQA * Add _keys_to_ignore_on_load_missing * Change the sequence of qa and classification Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-06 11:44:10 -05:00
Matt	73ec4340ec	Make DefaultDataCollator importable from root (#14588 ) * Make DefaultDataCollator importable from root * Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings * make style * Add DefaultDataCollator to data_collator.rst * Add DefaultDataCollator to data_collator.rst	2021-12-03 15:15:09 -05:00
Stas Bekman	71b1bf7ea8	[trainer] add tf32-mode control (#14606 ) * [trainer] add --tf32 support * it's pt>=.17 * it's pt>=.17 * flip the default to True * add experimental note * simplify logic * style * switch to 3-state logic * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * re-style code Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-03 10:08:58 -08:00
Lysandre Debut	ec47baeba2	2022 is the year of multi-modality (#14610 ) * 2022 is the year of multi-modality * Small fix * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Apply suggestions from code review * Apply to documentation index * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-03 11:35:44 -05:00
Daniel Stancl	50d909be28	[Flax] Add FlaxBlenderbotSmall (#14576 ) * [WIP] Add FlaxBlenderbotSmall * Revert some unintentionally changed files Revert some unintentionally files changed by improperly filled cookiecutter instructions. * Fix repo consistency * Fix Flax-PT equivalence * Apply suggestions from code review * Update index.mdx * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-02 14:21:48 +05:30
Mishig Davaadorj	275402bf2b	Update doc img links (#14593 ) * Update doc img links * Rename toctree.yml -> _toctree.yml (#14594) * Update doc img links * Update performance.md img link	2021-12-02 09:01:35 +01:00
Mishig Davaadorj	4f68de625c	Rename toctree.yml -> _toctree.yml (#14594 )	2021-12-02 08:58:39 +01:00
Stas Bekman	fbe278c76c	[doc] bf16/tf32 guide (#14579 ) * [doc] bf16/tf32 guide * expand * expand * Update docs/source/performance.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-01 14:18:58 -08:00
Sylvain Gugger	4df7d05a87	Doc new front (#14590 ) * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix typo in toctree (#14516) * Fix checkpoints badge * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix checkpoints badge * Fix typo in toctree (#14516) * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Styling Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-01 14:13:02 -05:00
Suraj Patil	4c0dd199c8	FlaxGPTJ (#14396 ) * add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotray embeds * quality * doc and quality * fix equivalence tests	2021-12-01 10:57:39 +05:30
Suraj Patil	fc1d97f29d	VisionTextDualEncoder (#13511 ) * init vision_text_dual_encoder * fix merge * remove extra heads * fix tests * remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP * remove archive map * fix imports * fix more imports * fix init * delete tokenizers * fix imports * clean * support clip's vision model * handle None config * begin tests * more test and few fixes * warn about newly init weights * more tests * add loss to model * remove extra classes from doc * add processor * doc and small fixes * add start docstr * update flax model * flax tests * more flax tests * doc * quality * doc and quality * fix doc * doc * remove comments * update warning * quality * fix docs * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * replace asserts, fix imports * update imports * fix import * address some review comments * fix check * reduce tolerance * fix test * add flax integration test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address Sylvain's comments * fix style * add pt_flax_equivalence test in PT tests * add pt integration test * update test * use pre-trained checkpoint in examples Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-30 22:21:48 +05:30
Daniel Stancl	faacd74729	[Flax] Add FlaxBlenderbot (#13633 ) * Init Flax implementation for Blenderbot * Add a majority of stuff except for tests * make style quality * Add tests and fix some bugs * Add tests * Clean source code and fix some bugs * Fix copies and docs * Fix jax device condition for tests * Fix layer norm in the encoder * Fix a few typos in the test file * make fix-copies * make fix-copies * fix layer norm * Fix Flax params dtype (#13090) * Fix PR reference (#13098) * make fix-copies * Update tests/test_modeling_flax_blenderbot.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-11-30 17:36:54 +05:30
Kamal Raj	c468a87a69	Tapas tf (#13393 ) * TF Tapas first commit * updated docs * updated logger message * updated pytorch weight conversion script to support scalar array * added use_cache to tapas model config to work properly with tf input_processing * 1. rm embeddings_sum 2. added # Copied 3. + TFTapasMLMHead 4. and lot other small fixes * updated docs * + test for tapas * updated testing_utils to check is_tensorflow_probability_available * converted model logits post processing using numpy to work with both PT and TF models * + TFAutoModelForTableQuestionAnswering * added TF support * added test for TFAutoModelForTableQuestionAnswering * added test for TFAutoModelForTableQuestionAnswering pipeline * updated auto model docs * fixed typo in import * added tensorflow_probability to run tests * updated MLM head * updated tapas.rst with TF model docs * fixed optimizer import in docs * updated convert to np data from pt model is not `transformers.tokenization_utils_base.BatchEncoding` after pipeline upgrade * updated pipeline: 1. with torch.no_gard removed, pipeline forward handles 2. token_type_ids converted to numpy * updated docs. * removed `use_cache` from config * removed floats_tensor * updated code comment * updated Copyright Year and logits_aggregation Optional * updated docs and comments * updated docstring * fixed model weight loading * make fixup * fix indentation * added tf slow pipeline test * pip upgrade * upgrade python to 3.7 * removed from_pt from tests * revert commit `f18cfa9`	2021-11-30 11:07:55 +01:00
NielsRogge	25156eb296	Rename ImageGPT (#14526 ) * Rename * Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING	2021-11-29 10:19:11 +01:00
Xing Han Lu	ebbe8cc3fe	Tokenizers docs: Specify which class contains `__call__` method (#14379 ) * Update tokenizer.rst * Apply `make fixup`	2021-11-28 18:55:38 -05:00
Lysandre Debut	2318bf77eb	Fixes (#14534 )	2021-11-26 04:35:08 -05:00
Lysandre Debut	c15f4f203f	Quicktour updates (#14533 )	2021-11-26 04:09:31 -05:00
Chris Fregly	1bbd6fcdeb	added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error (#14529 ) * added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error * Update quicktour.rst * added >>> * dependencies * added space	2021-11-26 03:46:07 -05:00
Stas Bekman	956a483173	[deepspeed] zero inference (#14253 ) * [deepspeed] zero inference * only z3 makes sense for inference * fix and style * docs * rework * fix test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * responding to suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-23 14:09:15 -08:00
Sylvain Gugger	204d251310	Auto processor (#14465 ) * Add AutoProcessor class * Init and tests * Add doc * Fix init * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverts to tokenizer or feature extractor when available * Adapt test Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-11-22 12:17:38 -05:00
Daniel Stancl	e0e2da1194	Improve a add-new-pipeline docs a bit (#14485 )	2021-11-22 10:35:49 -05:00
Shang Zhang	a59e7c1ed4	Add QDQBert model and quantization examples of SQUAD task (#14066 ) * clean up branch for add-qdqbert-model * README update for QAT example; update docstrings in modeling_qdqbert.py * Update qdqbert.rst * Update README.md * Update README.md * calibration data using traning set; QAT example runs in fp32 * re-use BERTtokenizer for qdqbert * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove qdqbert tokenizer * Update qdqbert.rst * update evaluate-hf-trt-qa.py * update configuration_qdqbert.py * update modeling_qdqbert.py: add copied statement; replace assert with ValueError * update copied from statement * add is_quantization_available; run make fix-copies * unittest add require_quantization * add backend dependency to qdqbert model * update README; update evaluate script; make style * lint * docs qdqbert update * circleci build_doc add pytorch-quantization for qdqbert * update README * update example readme with instructions to upgrade TensorRT to 8.2 * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * change quantization to pytorch_quantization for backend requirement * feed_forward_chunking not supported in QDQBert * make style * update model docstrings and comments in testing scripts * rename example to quantization-qdqbert; rename example scripts from qat to quant * Update src/transformers/models/qdqbert/modeling_qdqbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * rm experimental functions in quant_trainer * qa cleanup * make fix-copies for docs index.rst * fix doctree; use post_init() for qdqbert * fix early device assignment for qdqbert * fix CI:Model templates runner Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-19 13:33:39 -05:00
NielsRogge	0490b98877	[ImageGPT] Small fixes (#14460 ) * Add integration test * Fix typo	2021-11-19 15:15:02 +01:00
NielsRogge	da36c557f7	Add ImageGPT (#14240 ) * First draft * More improvements * Improve conversion script * Fix init weights for layer norm * Fix correct model for conversion script * Don't tie input and output embeddings * Add print statements for debugging * Add print statements for debugging * Fix vocab size of model * Improve documentation, remove fast tokenizer * Add ImageGPTForImageClassification, improve docs * Fix docs issue * Set verbosity level back to info * Improve tests * Fix tests and add figure * Delete tokenizer file * Remove ImageGPTTokenizer from init files * Remove ImageGPTLayer from init files * Remove ImageGPT tokenizer from docs * First draft of ImageGPTFeatureExtractor * Fix typo * Fix bug * More improvements * Apply suggestions from code review, add tests for feature extractor * Fix layernorm * Update save_pretrained method * Fix issue * Make all tests of ImageGPTFeatureExtractor pass * Update code examples * Rename model inputs to pixel_values * Improve code examples * Update init_weights to post_init * Fix post_init	2021-11-18 16:24:34 +01:00
Patrick von Platen	754202de4f	[Bart] Fix docs (#14434 )	2021-11-17 19:02:33 +01:00
Lysandre	c6c075544d	Docs for version v4.12.5	2021-11-17 11:39:12 -05:00
NielsRogge	a2864a50e7	Improve semantic segmentation models (#14355 ) * Improve tests * Improve documentation * Add ignore_index attribute * Add semantic_ignore_index to BEiT model * Add segmentation maps argument to BEiTFeatureExtractor * Simplify SegformerFeatureExtractor and corresponding tests * Improve tests * Apply suggestions from code review * Minor docs improvements * Streamline segmentation map tests of SegFormer and BEiT * Improve reduce_labels docs and test * Fix code quality * Fix code quality again	2021-11-17 15:29:58 +01:00
Lysandre	888fb21159	Docs for v4.12.4	2021-11-16 17:40:58 -05:00
Patrick von Platen	4ce74edf51	[Speech2Text2] Enable tokenizers (#14390 ) * [Speech2Text2] Enable tokenizers * minor fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-15 16:34:11 +01:00
Stas Bekman	29dfb2dbb1	[doc] performance and parallelism updates (#14391 ) * [doc] performance and parallelism doc update * improve * improve	2021-11-14 17:19:15 -08:00
Nicolas Patry	5c153079e2	Adding some quality of life for `pipeline` function. (#14322 ) * Adding some quality of life for `pipeline` function. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improve the tests. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-10 10:18:35 +01:00
Steven Liu	e4d8f517b9	Rewrite guides for fine-tuning with Datasets (#13923 ) * rewrite guides for fine-tuning with datasets * simple qa code example * use anonymous rST links * style	2021-11-09 14:12:50 -05:00
Yih-Dar	be4a6c64dc	Add TFViTModel (#13778 ) * Start the work for TFViTModel * Convert to TF code - need to check in the follow up commits * Clean up model code * Expose TFViTModel * make style * make quality * Add test * make style & quality * Fix some imports * fix wrong usage - kwargs => * kwargs * Fix Conv2D weight loading (PT->TF) issue * Add tests for images with different sizes + fix model * Fix some common tests for TFViTModel * Use inputs instead of input_ids in test_compile_tf_model * Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name * Avoid transpose in TFViT call * Fix Conv2D issue in load_tf2_weights_in_pytorch_model * Use tf.keras.layers.Conv2D instead of tf.nn.conv2d * Using simpler heuristic to detect Conv2D layer * Change convert_tf_weight_name_to_pt_weight_name to return TransposeType * Check tf_weight_shape is not None before using it * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix missing comma * fix input dtype Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-09 07:54:37 -05:00
Yih-Dar	95b3ec3bc9	Add FlaxVisionEncoderDecoderModel (#13359 ) * Start the work on FlaxVisionEncoderDecoderModel * Add FlaxVisionEncoderDecoderModel * Add VisionEncoderDecoderConfig * Make FlaxVisionEncoderDecoderModel visible to transformers * Add test * Fix wrong getattr usage * Fix tests * Add FlaxAutoModelForVision2Seq * Expose FLAX_MODEL_FOR_VISION_2_SEQ_MAPPING * clean-up * add integration test * update expected logits * update expected scores * Add ViT2GPT2ModelIntegrationTest + some cleaning * Add projection layer + PT/Flax equivalence tests * Fix import * minor changes * make test slow again * Apply suggestions * Add modeling_flax_vision_encoder_decoder to _ignore_modules in get_model_modules() * fix copies * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * split long strings in multiple lines * decoder_input_ids can't be None * Add back test_configuration_tie * Remove attention_mask parameter * fix test - encoder_last_hidden_state should be encoder_outputs.last_hidden_state instead of the projected vector * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Remove more encoder_attention_mask * remove encoder_attention_mask when calling self.decode (in FlaxVisionEncoderDecoderModule) * Fix style + pass 1s instead of None as encoder_attention_mask * fix init_weights * pass None for encoder_attention_mask * pass 1s instead of None as encoder_attention_mask * Fix doc style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-09 15:14:28 +05:30
Xing Han Lu	843c326ee1	Update dpr.rst (#14300 )	2021-11-06 09:41:02 -04:00
Sylvain Gugger	f0d6e952c0	Quality explain (#14264 ) * Start PR doc * Cleanup the quality checks and document them * Add reference in the contributing guide * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename file as per review suggestion Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-11-03 17:43:19 -04:00
NielsRogge	5f789a687a	Add LayoutXLMProcessor (and LayoutXLMTokenizer, LayoutXLMTokenizerFast) (#14115 ) * Add LayoutXLMTokenizer and LayoutXLMTokenizerFast * Fix styling issues * Fix more styling issues * Fix more styling issues * Fix docstring * Fix unit tests * Fix docs * Fix unit tests * Fix typos and styling issues * Fix styling issues * Fix docstring * Make all tests of test_tokenization_layoutxlm pass * Add LayoutXLMProcessor * Make fixup * Make all LayoutXLMProcessor tests pass * Minor fixes * Leave LayoutLMv2Processor tests unchanged * Fix code quality * Move LayoutXLM tokenizers and processor to separate folder * Fix code quality * Apply suggestions from code review * Replace assertions by value errors * Remove methods from fast tokenizer Co-authored-by: King Yiu Suen <kingyiusuen@gmail.com>	2021-11-03 08:59:44 +01:00
Sylvain Gugger	558f8543ba	Update Transformers to huggingface_hub >= 0.1.0 (#14251 ) * Update Transformers to huggingface_hub >= 0.1.0 * Forgot to save... * Style * Fix test	2021-11-02 18:58:42 -04:00
lumliolum	519a677e87	Added Beit model output class (#14133 ) * add Beit model ouput class * inherting from BaseModelOuputWithPooling * updated docs if use_mean_pooling is False * added beit specific outputs in model docs * changed the import path * Fix docs Co-authored-by: Niels Rogge <niels.rogge1@gmail.com>	2021-11-02 18:29:14 +01:00
NielsRogge	e20faa6f03	Add BeitForSemanticSegmentation (#14096 ) * Add first draft * Make forward pass work * Improve conversion script * Add notebook that checks if it works * Add BeitForSemanticSegmentation to the tests * More improvements * Make BeitForSemanticSegmentation consistent with Segformer * Small bug fix * Add BeitForSemanticSegmentation to docs * Make sure model doesn't output hidden states when the user doesn't want to * Make it possible to convert the large model * Fix issue * Fix conversion script for large model * Add auxiliary_head option to semantic segmentation model * Apply suggestions from @sgugger's review * Apply suggestions from code review * Fix failing test Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-11-01 19:55:45 +01:00
Lysandre	9fc1951711	Docs for v4.12.2	2021-10-29 14:51:05 -04:00
Lysandre	513fa30a63	Docs for v4.12.1	2021-10-29 13:49:50 -04:00
Daniel Stancl	d37f1fb8ba	Add `BlenderbotTokenizerFast` (#13720 ) * Add the support for the fast (rust) implementation of BlenbderbotTokenizer * Fix a converter and a typo in a doc * Apply the patil-suraj's suggestion * (Nitpick) Fast tokenization -> Fast Tokenization in doc * Apply the SaulLu's suggestion * Apply Narsil's suggestion to fix test pipelines * Add encoder_no_repeat_ngram_size according to the Narsil's suggestion * Revert the last (unnecessary) commit * Override pipeline config for Blenderbot to allow for larger pos. emb. * make fix-copies	2021-10-29 09:19:01 -04:00
Nicolas Patry	be236361f1	Adding `batch_size` support for (almost) all pipelines (#13724 ) * Tentative enabling of `batch_size` for pipelines. * Add systematic test for pipeline batching. * Enabling batch_size on almost all pipelines - Not `zero-shot` (it's already passing stuff as batched so trickier) - Not `QA` (preprocess uses squad features, we need to switch to real tensors at this boundary. * Adding `min_length_for_response` for conversational. * Making CTC, speech mappings avaiable regardless of framework. * Attempt at fixing automatic tests (ffmpeg not enabled for fast tests) * Removing ffmpeg dependency in tests. * Small fixes. * Slight cleanup. * Adding docs and adressing comments. * Quality. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improving docs. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com> * N -> oberved_batch_size softmax trick. * Follow `padding_side`. * Supporting image pipeline batching (and padding). * Rename `unbatch` -> `loader_batch`. * unbatch_size forgot. * Custom padding for offset mappings. * Attempt to remove librosa. * Adding require_audio. * torchaudio. * Back to using datasets librosa. * Adding help to set a pad_token on the tokenizer. * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Quality. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2021-10-29 11:34:18 +02:00
Lysandre	b8fad022a0	v4.13.0.dev0	2021-10-28 12:56:46 -04:00
Lysandre	62bf536631	Release v4.12.0	2021-10-28 12:09:49 -04:00
NielsRogge	1dc96a760d	Add SegFormer (#14019 ) * First draft * Make style & quality * Improve conversion script * Add print statement to see actual slice * Make absolute tolerance smaller * Fix image classification models * Add post_process_semantic method * Disable padding * Improve conversion script * Rename to ForSemanticSegmentation, add integration test, remove post_process methods * Improve docs * Fix code quality * Fix feature extractor tests * Fix tests for image classification model * Delete file * Add is_torch_available to feature extractor * Improve documentation of feature extractor methods * Apply suggestions from @sgugger's code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions of code review * Rebase with master * Fix rebase issues * Make sure model only outputs hidden states when the user wants to * Apply suggestions from code review * Add pad method * Support padding of 2d images * Add print statement * Add print statement * Move padding method to SegformerFeatureExtractor * Fix issue * Add casting of segmentation maps * Add test for padding * Add small note about padding Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-10-28 08:23:52 -04:00
Patrick von Platen	9f3aa46f45	Add Unispeech & Unispeech-SAT (#13963 ) * unispeech * add copy from * remove hubert copy from * finish for today * add unispeech-sat * adapt more * up * up * up * up * add modeling * add tests * up * up * finish * up * Apply suggestions from code review * up * up * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * up * up Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-10-26 18:59:58 +02:00
Thomas Chaigneau	1f60df81b2	Add Camembert to models exportable with ONNX (#14059 ) Add Camembert to models exportable with ONNX Co-authored-by: Thomas.Chaigneau <thomas.chaigneau@arkea.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2021-10-26 11:22:22 +02:00
Matt	84b9579da7	Remove unneeded `to_tensor()` in TF inline example (#14140 )	2021-10-25 15:04:36 +01:00
Reza Gharibi	3e04a41a9b	Fix some writing issues in the docs (#14136 ) * Fix some writing issues in the docs * Run code quality check	2021-10-25 07:48:02 -04:00
Reza Gharibi	6b83090e80	Fix some typos in the docs (#14126 ) * Fix some typos in the docs * Fix a styling issue * Fix code quality check error	2021-10-25 07:40:44 -04:00
Kevin Ko	95bab53868	Update TP parallel GEMM image (#14112 ) * Update TP parallel GEMM image * Delete parallelism-tp-parallel_gemm.png * Update parallelism-tp-parallel_gemm.png	2021-10-22 12:57:48 -07:00
Reza Gharibi	7888914edd	Fix a typo in preprocessing docs (#14108 )	2021-10-21 17:00:26 -04:00
Reza Gharibi	49155d2431	Fix broken link in translation section (#14087 )	2021-10-20 15:10:57 -04:00
Ihor Omelchenko	9eda0d156d	Fix typo (#14056 )	2021-10-18 18:03:39 -04:00
Patrick von Platen	d5ff69fce9	[Speech] Refactor Examples (#14040 ) * adapt_examples * up * up * up * up * add auto models * finish	2021-10-18 17:43:35 +02:00
Sylvain Gugger	2c60ff2fe2	Add an API to register objects to Auto classes (#13989 ) * Add API to register a new object in auto classes * Fix test * Documentation * Add to tokenizers and test * Add cleanup after tests * Be more careful * Move import * Move import * Cleanup in TF test too * Add consistency check * Add documentation * Style * Update docs/source/model_doc/auto.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/auto/auto_factory.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-10-18 10:22:46 -04:00
Dat Quoc Nguyen	3d587c5343	Add BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese (#13788 ) * Add the pre-trained BARTpho model * Add the pre-trained BARTpho model * Add the pre-trained BARTpho model * Fix incorrectly sorted and/or formatted imports * Fix incorrectly sorted and/or formatted style * Fix check_dummies * Fix check_dummies * Fix check_dummies * Update docs/source/model_doc/bartpho.rst Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/bartpho/__init__.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/bartpho/tokenization_bartpho.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/test_tokenization_bartpho.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/bartpho/tokenization_bartpho.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/test_tokenization_bartpho.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/bartpho.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/bartpho.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/bartpho/__init__.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Add the pre-trained BARTpho model * Add Tips section in doc and details of monolingual_vocab_file * Fix conflicts * Add another tip related to monolingual_vocab_file * Readd dependency_versions_table.py * Handle failing checks * Remove test_list.txt * Remove md5sum.saved * Revise Readme.md Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-10-18 10:16:46 -04:00
Anton Lozhkov	cd3166a8ed	Add the SEW and SEW-D speech models (#13962 ) * Working encoder * SEW-D and tests * Further conv fixes * Automodels and conv inits * Update integration tests, add docs * Docs cleanup, resolve todos * Conf fix * Fix docs * Fix tests, apply suggestions * Update src/transformers/models/sew/modeling_sew.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Model conversion and updated no-mask tests * Remove copy of feature_proj * Style * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Move orgs Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-10-15 18:26:26 +03:00
Patrick von Platen	7fb2a8b3d9	up (#14008 )	2021-10-14 15:46:22 +02:00
NielsRogge	408b2d2bd0	Add TrOCR + VisionEncoderDecoderModel (#13874 ) * First draft * Update self-attention of RoBERTa as proposition * Improve conversion script * Add TrOCR decoder-only model * More improvements * Make forward pass with pretrained weights work * More improvements * Some more improvements * More improvements * Make conversion work * Clean up print statements * Add documentation, processor * Add test files * Small improvements * Some more improvements * Make fix-copies, improve docs * Make all vision encoder decoder model tests pass * Make conversion script support other models * Update URL for OCR image * Update conversion script * Fix style & quality * Add support for the large-printed model * Fix some issues * Add print statement for debugging * Add print statements for debugging * Make possible fix for sinusoidal embedding * Further debugging * Potential fix v2 * Add more print statements for debugging * Add more print statements for debugging * Deubg more * Comment out print statements * Make conversion of large printed model possible, address review comments * Make it possible to convert the stage1 checkpoints * Clean up code, apply suggestions from code review * Apply suggestions from code review, use Microsoft models in tests * Rename encoder_hidden_size to cross_attention_hidden_size * Improve docs	2021-10-13 10:28:56 +02:00
Stas Bekman	61f6426269	[parallel doc] dealing with layers larger than one gpu (#13980 )	2021-10-12 15:37:55 -07:00
Yih-Dar	8b240a0661	Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222 ) * Add cross attentions to TFGPT2Model * Add TFEncoderDecoderModel * Add TFBaseModelOutputWithPoolingAndCrossAttentions * Add cross attentions to TFBertModel * Fix past or past_key_values argument issue * Fix generation * Fix save and load * Add some checks and comments * Clean the code that deals with past keys/values * Add kwargs to processing_inputs * Add serving_output to TFEncoderDecoderModel * Some cleaning + fix use_cache value issue * Fix tests + add bert2bert/bert2gpt2 tests * Fix more tests * Ignore crossattention.bias when loading GPT2 weights into TFGPT2 * Fix return_dict_in_generate in tf generation * Fix is_token_logit_eos_token bug in tf generation * Finalize the tests after fixing some bugs * Fix another is_token_logit_eos_token bug in tf generation * Add/Update docs * Add TFBertEncoderDecoderModelTest * Clean test script * Add TFEncoderDecoderModel to the library * Add cross attentions to TFRobertaModel * Add TFRobertaEncoderDecoderModelTest * make style * Change the way of position_ids computation * bug fix * Fix copies in tf_albert * Remove some copied from and apply some fix-copies * Remove some copied * Add cross attentions to some other TF models * Remove encoder_hidden_states from TFLayoutLMModel.call for now * Make style * Fix TFRemBertForCausalLM * Revert the change to longformer + Remove copies * Revert the change to albert and convbert + Remove copies * make quality * make style * Add TFRembertEncoderDecoderModelTest * make quality and fix-copies * test TFRobertaForCausalLM * Fixes for failed tests * Fixes for failed tests * fix more tests * Fixes for failed tests * Fix Auto mapping order * Fix TFRemBertEncoder return value * fix tf_rembert * Check copies are OK * Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined * Add TFEncoderDecoderModelSaveLoadTests * fix tf weight loading * check the change of use_cache * Revert the change * Add missing test_for_causal_lm for TFRobertaModelTest * Try cleaning past * fix _reorder_cache * Revert some files to original versions * Keep as many copies as possible * Apply suggested changes - Use raise ValueError instead of assert * Move import to top * Fix wrong require_torch * Replace more assert by raise ValueError * Add test_pt_tf_model_equivalence (the test won't pass for now) * add test for loading/saving * finish * finish * Remove test_pt_tf_model_equivalence * Update tf modeling template * Remove pooling, added in the prev. commit, from MainLayer * Update tf modeling test template * Move inputs["use_cache"] = False to modeling_tf_utils.py * Fix torch.Tensor in the comment * fix use_cache * Fix missing use_cache in ElectraConfig * Add a note to from_pretrained * Fix style * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt * Fix TFMLP (in TFGPT2) activation issue * Fix None past_key_values value in serving_output * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub * Apply review suggestions - style for cross_attns in serving_output * Apply review suggestions - change assert + docstrings * break the error message to respect the char limit * deprecate the argument past * fix docstring style * Update the encoder-decoder rst file * fix Unknown interpreted text role "method" * fix typo Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-10-13 00:10:34 +02:00
Adam Kaczmarek	23ee06ed55	Fixed typo: herBERT -> HerBERT (#13936 )	2021-10-08 10:27:32 -04:00
Mishig Davaadorj	026866df92	Image Segmentation pipeline (#13828 ) * Implement img seg pipeline * Update src/transformers/pipelines/image_segmentation.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/pipelines/image_segmentation.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update output shape with individual masks * Rm dev change * Remove loops in test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2021-10-08 09:59:53 +02:00
Lysandre	5be59a3649	Deploy docs for v4.11.3	2021-10-06 12:58:47 -04:00
Sylvain Gugger	7d83655da9	Autodocument the list of ONNX-supported models (#13884 )	2021-10-05 22:43:16 -04:00
Hyunwoong Ko	36fc401621	Update parallelism.md (#13892 ) * Update parallelism.md * Update docs/source/parallelism.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/parallelism.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/parallelism.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/parallelism.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/parallelism.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/parallelism.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-10-05 17:42:12 -07:00

... 7 8 9 10 11 ...

1734 Commits