transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 12:08:22 +06:00

Author	SHA1	Message	Date
Michael Benayoun	4bfe75bd08	M2M100 support for ONNX export (#15193 ) * Add M2M100 support for ONNX export * Delete useless imports * Add M2M100 to tests * Fix protobuf issue	2022-03-02 10:03:14 +01:00
Steven Liu	6ccfa2170c	Inference for multilingual models (#15836 ) * 📝 first draft for multilingual models * 🖍 make style	2022-03-01 15:10:31 -06:00
NielsRogge	c008afea3c	Add link to notebooks (#15791 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-01 17:44:20 +01:00
Patrick von Platen	9863f7d228	[Benchmark tools] Deprecate all (#15848 ) * [Benchmark tools] Deprecate all * up	2022-03-01 11:26:20 +01:00
Eduardo Gonzalez Ponferrada	df5a4094a6	Add Data2Vec (#15507 ) * Add data2vec model cloned from roberta * Add checkpoint conversion script * Fix copies * Update docs * Add checkpoint conversion script * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update documentation * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. * add inputs to logits to data2vec' * correct autio models * correct config auto * correct tok auto * Update utils/tests_fetcher.py * delete unnecessary files * delete unnecessary files * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Fix copies * Update docs * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update documentation * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * add inputs to logits to data2vec' * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. * correct autio models * correct config auto * correct tok auto * delete unnecessary files * delete unnecessary files * Update utils/tests_fetcher.py * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Move data2vec tests to new structure * Fix test imports for text tests * Remove fairseq files * Change paper link to arxiv * Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation. * Update text model checkpoint to be facebook/data2vec-text-base * Add 'Copy from' statements and update paper links and docs * fix copy from statements * improve copied from * correct more copied from statements * finish copied from stuff * make style * add model to README * add to master Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-01 11:09:20 +01:00
Sanchit Gandhi	e3342edc4e	Flax Speech-Encoder-Decoder Model (#15613 ) * rebase * Delete shift tokens func * downsample decoder input seq len for init * correct attention mask * add tests * pt flax cross test * make fixup * init file for import * change pt-flax cross test threshold * pt-flax test logits only * move tests * make repo-consistency * consistent indentation Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-28 12:22:36 +01:00
Sayak Paul	84eaa6acf5	Add TFConvNextModel (#15750 ) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed ununsed imports * feat: enabled argument during using drop_path. * chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propoagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality enviornment style. * chore: applied formatting with quality enviornment. * chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-02-25 18:19:16 +01:00
Sylvain Gugger	0118c4f6a8	Re-enable doctests for the quicktour (#15828 ) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove &	2022-02-25 17:46:38 +01:00
Tanay Mehta	7566734d6f	Add model specific output classes to PoolFormer model docs (#15746 ) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs	2022-02-25 13:43:56 +01:00
Steven Liu	fecb08c2b8	🧼 NLP task guides (#15731 ) * clean commit of changes to NLP tasks * 🖍 apply feedback * 📝 move tf data collator in multiple choice Co-authored-by: Steven <stevhliu@gmail.com>	2022-02-23 13:58:33 -06:00
Julien Chaumond	32f5de10a0	[doc] custom_models: mention security features of the Hub (#15768 ) * custom_models: tiny doc addition * mention security feature earlier in the section Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-02-23 11:40:06 -05:00
Nicolas Patry	f9582c205a	Adding ZeroShotImageClassificationPipeline (#12119 ) * [Proposal] Adding ZeroShotImageClassificationPipeline - Based on CLIP * WIP, Resurection in progress. * Resurrection... achieved. * Reword handling different `padding_value` for `feature_extractor` and `tokenizer`. * Thanks doc-builder ! * Adding docs + global namespace `ZeroShotImageClassificationPipeline`. * Fixing templates. * Make the test pass and be robust to floating error. * Adressing suraj's comments on docs mostly. * Tf support start. * TF support. * Update src/transformers/pipelines/zero_shot_image_classification.py Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-23 09:41:42 +01:00
Patrick von Platen	c44d3675c2	Time stamps for CTC models (#15687 ) * [Wav2Vec2 Time Stamps] * Add first version * add word time stamps * Fix * save intermediate space * improve * [Finish CTC Tokenizer] * remove @ * remove @ * push * continue with phonemes * up * finish PR * up * add example * rename * finish * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct split * finalize Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-22 19:26:44 +01:00
Francesco Saverio Zuppichini	38bed912e3	added link to our writing-doc document (#15756 )	2022-02-22 09:57:28 +01:00
Joao Gante	3956b133b6	TF text classification examples (#15704 ) * Working example with to_tf_dataset * updated text_classification * more comments	2022-02-21 17:17:59 +00:00
Gunjan Chhablani	2c2a31ffbc	Add missing PLBart entry in README (#15721 ) * Add missing PLBart entry in index * Fix README * Fix README * Fix style * Change to master model doc	2022-02-18 21:11:42 +01:00
Gunjan Chhablani	ae1f835028	Add PLBart (#13269 ) * Init PLBART * Add missing configuration file * Add conversion script and configurationf ile * Fix style * Update modeling and conversion scripts * Fix scale embedding in config * Add comment * Fix conversion script * Add classification option to conversion script * Fix vocab size in config doc * Add tokenizer files from MBart50 * Allow no lang code in regular tokenizer * Add PLBart Tokenizer Converters * Remove mask from multi tokenizer * Remove mask from multi tokenizer * Change from MBart-50 to MBart tokenizer * Fix names and modify src/tgt behavior * Fix imports for tokenizer * Remove <mask> from multi tokenizer * Fix style * Change tokenizer_class to processor_class * Add attribute map to config class * Update modeling file to modified MBart code * Update configuration file to MBart style configuration * Fix tokenizer * Separate tokenizers * Fix error in tokenization auto * Copy MBart tests * Replace with MBart tokenization tests * Fix style * Fix language code in multi tokenizer * Fix configuration docs * Add entry for plbart_multi in transformers init * Add dummy objects and fix imports * Fix modeling tests * Add TODO in config * Fix copyright year * Fix modeling docs and test * Fix some tokenization tests and style * Add changes from review * Fix copies * Fix docs * Fix docs * Fix style * Fix year * Add changes from review * Remove extra changes * Fix base tokenizer and doc * Fix style * Fix modeling and slow tokenizer tests * Remove Multi-tokenizer Converter and Tests * Delete QA model and Multi Tokenizer dummy objects * Fix repo consistency and code quality issues * Fix example documentation * Fix style * Remove PLBartTokenizer from type checking in init * Fix consistency issue * Add changes from review * Fix style * Remove PLBartTokenizerFast * Remove FastTokenizer converter * Fix AutoTokenzier mapping * Add plbart to toctree and fix consistency issues * Add language codes tokenizer test * Fix styling and doc issues * Add fixes for failing tests * Fix copies * Fix failing modeling test * Change assert to assertTrue in modeling tests	2022-02-18 14:17:09 +01:00
Francesco Saverio Zuppichini	240cc6cbdc	Adding a model, more doc for pushing to the hub (#15690 ) * doc for adding a model to the hub * run make style * resolved conversation * removed a line * removed ) * Update docs/source/add_new_model.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/add_new_model.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-18 09:11:18 +01:00
NielsRogge	57882177be	Add SimMIM (#15586 ) * Add first draft * Make model importable * Make SwinForMaskedImageModeling importable * Fix imports * Add missing inits * Add support for Swin * Fix bug * Fix bug * Fix another bug * Fix Swin MIM implementation * Fix default encoder stride * Fix Swin * Add print statements for debugging * Add image_size data argument * Fix Swin * Fix image_size * Add print statements for debugging * Fix print statement * Remove print statements * Improve reshaping of bool_masked_pos * Add support for DeiT, fix tests * Improve docstrings * Apply new black version * Improve script * Fix bug * Improve README * Apply suggestions from code review * Remove DS_Store and add to gitignore * Apply suggestions from code review + fix BEiT Flax * Revert BEiT changes * Improve README * Fix code quality * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-02-17 19:44:55 +01:00
Yih-Dar	92a537d938	Minor fix on README.md (#15688 ) * fix README * fix more arxiv links * make fix-copies Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-17 08:38:32 -05:00
Tanay Mehta	f84e0dbd2a	Add PoolFormer (#15531 ) * Added all files, PoolFormerFeatureExtractor still failing tests * Fixed PoolFormerFeatureExtractor not being able to import * Completed Poolformer doc * Applied Suggested fixes * Fixed errors in modeling_auto.py * Fix feature extractor, convert docs to Markdown, styling of code * Remove PoolFormer from check_repo and fix integration test * Remove Poolformer from check_repo * Fixed configuration_poolformer.py docs and removed inference.py from poolformer * Ran with black v22 * Added PoolFormer to _toctree.yml * Updated poolformer doc * Applied suggested fixes and added on README.md * Did make fixup and make fix-copies, tests should pass now * Changed PoolFormer weights conversion script name and fixed README * Applied fixes in test_modeling_poolformer.py and modeling_poolformer.py * Added PoolFormerFeatureExtractor to AutoFeatureExtractor API Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-17 13:16:37 +01:00
Francesco Saverio Zuppichini	b87c044c79	Usage examples for logger (#15657 ) * logger * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-02-16 10:15:13 +01:00
Stas Bekman	bee361c6f1	[t5/t0/mt5 models] faster/leaner custom layer norm (#14656 ) * [t5] faster/leaner custom layer norm * wip * apex.normalization.FusedRMSNorm * cleanup * cleanup * add doc * add catch all * Trigger CI * expand	2022-02-15 16:49:57 -08:00
Patrick von Platen	2e12b907ae	TF generate refactor - Greedy Search (#15562 ) * TF generate start refactor * Add tf tests for sample generate * re-organize * boom boom * Apply suggestions from code review * re-add * add all code * make random greedy pass * make encoder-decoder random work * further improvements * delete bogus file * make gpt2 and t5 tests work * finish logits tests * correct logits processors * correct past / encoder_outputs drama * refactor some methods * another fix * refactor shape_list * fix more shape list * import shape _list * finish docs * fix imports * make style * correct tf utils * Fix TFRag as well * Apply Lysandre's and Sylvais suggestions * Update tests/test_generation_tf_logits_process.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/tf_utils.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * remove cpu according to gante * correct logit processor Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-02-15 17:54:43 +01:00
Nicolas Patry	cdf19c501d	Re-export `KeyDataset`. (#15645 ) * Re-export `KeyDataset`. * Update the docs locations.	2022-02-15 17:49:38 +01:00
Stas Bekman	28e6155d8a	add a network debug script and document it (#15652 ) * add a network debug script and document it * doc	2022-02-15 08:48:00 -08:00
jonrbates	86a7845c0c	Fix typo in speech2text2 doc (#15617 ) Forward looks for inputs, not input_ids	2022-02-15 13:54:34 +01:00
fra	05a8580964	Revert "logger doc" This reverts commit `41168a49ce`.	2022-02-15 10:46:45 +01:00
fra	41168a49ce	logger doc	2022-02-15 10:03:28 +01:00
NielsRogge	b090b79022	Make Swin work with VisionEncoderDecoderModel (#15527 ) * Add attribute_map * Add mention in docs * Set hidden_size attribute correctly * Add note about Transformer-based models only Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-14 17:33:35 +01:00
Daniel Erenrich	4f403ea899	Fix grammar in tokenizer_summary (#15614 ) "to make ensure" is redundant.	2022-02-11 16:51:30 -05:00
Stas Bekman	f15c99fabf	[deepspeed docs] misc additions (#15585 ) * [deepspeed docs] round_robin_gradients * training and/or eval/predict loss is * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-11 10:54:04 -08:00
Steven Liu	85aee09e9a	🖍 remove broken link (#15615 )	2022-02-11 12:33:55 -06:00
Sylvain Gugger	6cf06d198c	Mark "code in the Hub" API as experimental (#15624 )	2022-02-11 09:55:31 -05:00
Ngo Quang Huy	c0864d98ba	Correct JSON format (#15600 )	2022-02-10 09:02:03 -08:00
lewtun	2e8b85f72e	Add local and TensorFlow ONNX export examples to docs (#15604 ) * Add local and TensorFlow ONNX export examples to docs * Use PyTorch - TensorFlow split	2022-02-10 16:31:00 +01:00
Alberto Bégué	cb7ed6e083	Add Tensorflow handling of ONNX conversion (#13831 ) * Add TensorFlow support for ONNX export * Change documentation to mention conversion with Tensorflow * Refactor export into export_pytorch and export_tensorflow * Check model's type instead of framework installation to choose between TF and Pytorch Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Alberto Bégué <alberto.begue@della.ai> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-02-10 11:18:41 +01:00
Sylvain Gugger	c722753afd	Expand tutorial for custom models (#15587 ) * Expand tutorial for custom models * Style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-02-09 17:44:28 -05:00
NielsRogge	a86ee2261e	Add link (#15588 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-09 23:33:39 +01:00
Stas Bekman	dee17d5676	[trainer docs] document how to select specific gpus (#15551 ) * [trainer docs] document how to select specific gpus * expand * add urls * add accelerate launcher	2022-02-09 10:12:29 -08:00
Chan Woo Kim	2b5603f6ac	Constrained Beam Search [without disjunctive decoding] (#15416 ) * added classes to get started with constrained beam search * in progress, think i can directly force tokens now but not yet with the round robin * think now i have total control, now need to code the bank selection * technically works as desired, need to optimize and fix design choices leading to undersirable outputs * complete PR #1 without disjunctive decoding * removed incorrect tests * Delete k.txt * Delete test.py * Delete test.sh * revert changes to test scripts * genutils * full implementation with testing, no disjunctive yet * shifted docs * passing all tests realistically ran locally * removing accidentally included print statements * fixed source of error in initial PR test * fixing the get_device() vs device trap * fixed documentation docstrings about constrained_beam_search * fixed tests having failing for Speech2TextModel's floating point inputs * fix cuda long tensor * added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search * deleted accidentally added test halting code with assert False * code reformat * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py * fixing based on comments on PR * took out the testing code that should but work fails without the beam search moditification ; style changes * fixing comments issues * docstrings for ConstraintListState * typo in PhrsalConstraint docstring * docstrings improvements Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 16:59:26 +01:00
Leandro von Werra	d923f76203	add model scaling section (#15119 ) * add model scaling section * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * integrate reviewer feedback * initialize GPU properly * add note about BnB optimizer * move doc from `scaling.mdx` to `performance.mdx` * integrate reviewer feedback * revert section levels Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-09 15:27:30 +01:00
Sylvain Gugger	b5c6fdecf0	PoC for a ProcessorMixin class (#15549 ) * PoC for a ProcessorMixin class * Documentation * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Roll out to other processors * Add base feature extractor class in init * Use args and kwargs Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 09:24:49 -05:00
Nathan Raw	fcb4f11c92	📝 Add codecarbon callback to docs (#15563 )	2022-02-08 14:10:53 -05:00
Joao Gante	8406fa6dd5	Add TFSpeech2Text (#15113 ) * Add wrapper classes * convert inner layers to tf * Add TF Encoder and Decoder layers * TFSpeech2Text models * Loadable model * TF model with same outputs as PT model * test skeleton * correct tests and run the fixup * correct attention expansion * TFSpeech2Text pask_key_values with TF format	2022-02-08 16:27:23 +00:00
aaron	87d08afb16	electra is added to onnx supported model (#15084 ) * electra is added to onnx supported model * add google/electra-base-generator for test onnx module Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>	2022-02-08 15:47:49 +01:00
Steven Liu	552f8d3091	Create a custom model guide (#15489 ) * 📝 add config section * 📝 finish first draft * 📝 add feature extractor and processor * 🖍 apply feedback from review * 📝 minor edits * last review	2022-02-07 12:34:56 -06:00
lewtun	6775b211b6	Remove Longformers from ONNX-supported models (#15273 )	2022-02-07 17:32:13 +01:00
NielsRogge	84eec9e6ba	Add ConvNeXT (#15277 ) * First draft * Add conversion script * Improve conversion script * Improve docs and implement tests * Define model output class * Fix tests * Fix more tests * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Apply suggestions from code review * Rename dims to hidden_sizes * Fix equivalence test * Rename gamma to gamma_parameter * Clean up conversion script * Add ConvNextFeatureExtractor * Add corresponding tests * Implement feature extractor correctly * Make implementation cleaner * Add ConvNextStem class * Improve design * Update design to also include encoder * Fix gamma parameter * Use sample docstrings * Finish conversion, add center cropping * Replace nielsr by facebook, make feature extractor tests smaller * Fix integration test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-07 16:11:37 +01:00
Stas Bekman	8ce1330631	[deepspeed docs] DeepSpeed ZeRO Inference (#15486 ) * [deepspeed docs] DeepSpeed ZeRO Inference * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweak * deal with black * extra cleanup, better comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-04 13:51:02 -08:00
Sylvain Gugger	ac6aa10f23	Standardize semantic segmentation models outputs (#15469 ) * Standardize instance segmentation models outputs * Rename output * Update src/transformers/modeling_outputs.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add legacy argument to the config and model forward * Update src/transformers/models/beit/modeling_beit.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Copy fix in Segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-02-04 14:52:07 -05:00
Stas Bekman	31be2f45a9	[deepspeed docs] Megatron-Deepspeed info (#15488 )	2022-02-04 11:15:13 -08:00
Stas Bekman	21dcaec5d5	[deepspeed docs] memory requirements (#15506 )	2022-02-03 10:55:14 -08:00
Sylvain Gugger	44b21f117b	Save code of registered custom models (#15379 ) * Allow dynamic modules to use relative imports * Work for configs * Fix last merge conflict * Save code of registered custom objects * Map strings to strings * Fix test * Add tokenizer * Rework tests * Tests * Ignore fixtures py files for tests * Tokenizer test + fix collection * With full path * Rework integration * Fix typo * Remove changes in conftest * Test for tokenizers * Add documentation * Update docs/source/custom_models.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add file structure and file content * Add more doc * Style * Update docs/source/custom_models.mdx Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-02 10:44:37 -05:00
Steven Liu	b9418a1d97	Update tutorial docs (#15165 ) * first draft of pipeline, autoclass, preprocess tutorials * apply review feedback * 🖍 apply feedback from patrick/niels * 📝add output image to preprocessed image * 🖍 apply feedback from patrick	2022-02-01 18:31:35 -06:00
Steven Liu	c157c7e3fd	Update fine-tune docs (#15259 ) * add fine-tune tutorial * make edits, fix style * 📝 make edits * 🖍 fix code format links to external libraries * 🔄revert code formatting * 🖍 use DefaultDataCollator instead of DataCollatorWithPadding	2022-02-01 18:28:12 -06:00
Stas Bekman	44c7857b87	[deepspeed doc] fix import, extra notes (#15400 ) * [deepspeed doc] fix import, extra notes * typo	2022-01-31 08:28:10 -08:00
NielsRogge	47df0f2234	Add header (#15434 )	2022-01-31 11:15:54 -05:00
Ogundepo Odunayo	282ae123e2	add t5 ner finetuning (#15432 )	2022-01-31 17:03:06 +01:00
Soonhwan-Kwon	e09473a817	Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py (#13727 ) * add xlm roberta xl * add convert xlm xl fairseq checkpoint to pytorch * fix init and documents for xlm-roberta-xl * fix indention * add test for XLM-R xl,xxl * fix model hub name * fix some stuff * up * correct init * fix more * fix as suggestions * add torch_device * fix default values of doc strings * fix leftovers * merge to master * up * correct hub names * fix docs * fix model * up * finalize * last fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add copied from * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-29 13:42:37 +01:00
Steven Liu	16d4acbfdb	Get started docs (#15098 ) * clean commit of changes * apply review feedback, make edits * fix backticks, minor formatting * 🖍 make fixup and minor edits * 🖍 fix # in header * 📝 update code sample without from_pt * 📝 final review	2022-01-28 19:01:37 -06:00
Steven Liu	cabd6d26a2	Update model share tutorial (#15288 ) * add model sharing tutorial * 🖍 apply feedback from review * 📝 make edits * 🖍 fix formatting * 📝 convert from pt checkpoint to flax * 📝 final review	2022-01-28 18:49:26 -06:00
Suraj Patil	d25e25ee2b	Add XGLM models (#14876 ) * add xglm * update vocab size * fix model name * style and tokenizer * typo * no mask token * fix pos embed compute * fix args * fix tokenizer * fix positions * fix tokenization * style and dic fixes * fix imports * add fast tokenizer * update names * add pt tests * fix tokenizer * fix typo * fix tokenizer import * fix fast tokenizer * fix tokenizer * fix converter * add tokenizer test * update checkpoint names * fix tokenizer tests * fix slow tests * add copied from comments * rst -> mdx * flax model * update flax tests * quality * style * doc * update index and readme * fix copies * fix doc * update toctrr * fix indent * minor fixes * fix config doc * don't save embed_pos weights * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * address Sylvains commnets, few doc fixes * fix check_repo * align order of arguments * fix copies * fix labels * remove unnecessary mapping * fix saving tokenizer Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-28 18:55:23 +01:00
Ngo Quang Huy	4996922b6d	[docs] fix wrong file name in `pr_check` (#15380 )	2022-01-28 07:52:01 -05:00
Steven Liu	f5db6ce76a	Fix code format for Accelerate doc (#15335 ) * 🖍 fix code syntax to external libraries and replace image * 🔄revert code formatting, replace image with code block * 🖍 apply feedback	2022-01-27 13:49:04 -06:00
Lysandre	f87db5e412	Release: v4.16.0	2022-01-27 13:06:33 -05:00
Sylvain Gugger	8f6454bfac	Add proper documentation for Keras callbacks (#15374 ) * Add proper documentation for Keras callbacks * Add dummies	2022-01-27 10:51:38 -05:00
Stas Bekman	fc8fc400e3	[docs] post-PR merge fix (#15355 ) * [docs] post-PR merge fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-26 11:23:32 -08:00
novice	99a2771189	Add YOSO (#15091 ) * Add cookiecutter files * Add cuda kernels and cpp files * Update modeling_yoso.py * Add .h files * Update configuration_yoso.py * Updates * Remove tokenizer * Code quality * Update modeling_yoso.py * Update modeling_yoso.py * Fix failing test * Update modeling_yoso.py * Fix code quality * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review and fix integration tests * Update src/transformers/models/yoso/modeling_yoso.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Apply suggestions from code review * Fix copied from statement * Fix docstring * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions and fix mask * Apply suggestions from code review * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings * Fix code quality * Remove trailing whitespace * Update yoso.mdx * Move kernel loading to YosoEncoder * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/yoso/modeling_yoso.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add short summary to docs * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update yoso.mdx * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove CausalLM model and add copied from * Remove autoregressive code * Remove unused imports * add copied from for embeddings * Fix code quality * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestion from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-26 19:18:29 +01:00
Ngo Quang Huy	5d8b98608c	Fix deepspeed docs (#15346 )	2022-01-26 07:24:33 -05:00
Jacob Deppen	96161ac408	make table into valid Markdown table syntax (#15337 )	2022-01-26 07:10:00 -05:00
Maciej Pawłowski	e79a0faeae	Added missing code in exemplary notebook - custom datasets fine-tuning (#15300 ) * Added missing code in exemplary notebook - custom datasets fine-tuning Added missing code in tokenize_and_align_labels function in the exemplary notebook on custom datasets - token classification. The missing code concerns adding labels for all but first token in a single word. The added code was taken directly from huggingface official example - this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb). * Changes requested in the review - keep the code as simple as possible	2022-01-25 17:26:17 -05:00
Steven Liu	0501beb846	Add 🤗 Accelerate tutorial (#15263 ) * add accelerate tutorial * 🖍 apply feedback from review * 📝 make edits	2022-01-25 13:46:11 -06:00
novice	d43e308e7f	Add Swin Transformer (#15085 ) * Add all files * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updates * Apply suggestions from review * Fix failing tests * Update __init__.py * Update configuration_swin.py * Update auto_factory.py * Fix pytests * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fix tests and default checkpoint * Fix Recursion error * Code quality * Remove copied from * Update modeling_swin.py * Code quality * Update modeling_swin.py * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Fix feature extractor * Fix code quality * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Update configuration_swin.py * Update default checkpoint * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/swin.mdx Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> * Update conversion script * Reformat conversion script Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>	2022-01-21 12:10:41 +01:00
NielsRogge	515ed3ad2a	Fix doc examples (#15257 )	2022-01-20 21:51:51 +01:00
Kamal Raj	08b41b413a	Update pipelines.mdx (#15243 ) fix few spelling mistakes	2022-01-20 08:46:48 -05:00
NielsRogge	80f7296091	Update Trainer code example (#15070 ) * Update code example * Fix code quality * Add comment	2022-01-19 20:15:12 +01:00
NielsRogge	ac227093e4	Add ViLT (#14895 ) * First commit * Add conversion script * Make conversion script work for base model * More improvements * Update conversion script, works for vqa * Add indexing argument to meshgrid * Make conversion script work for ViltForPreTraining * Add ViltForPreTraining to docs * Fix device issue * Add processor * Add MinMaxResize to feature extractor * Implement call method of ViltProcessor * Fix tests * Add integration test * Add loss calculation for VQA * Improve tests * Improve some more tests * Debug tests * Small improvements * Add support for attention_mask * Remove mask_it * Add pixel_mask * Add tests for ViltFeatureExtractor * Improve tests * Add ViltForNaturalLanguageVisualReasoning * Add ViltForNaturalLanguageVisualReasoning to conversion script * Minor fixes * Add support for image_embeds, update docstrings to markdown * Update docs to markdown * Improve conversion script * Rename ViltForPreTraining to ViltForMaskedLM * Improve conversion script * Convert docstrings to markdown * Fix code example of retrieval model * Properly convert masked language model * Add integration test for nlvr * Fix code quality * Apply suggestions from code review * Add copied from statements * Fix pretrained_config_archive_map * Fix docs * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Make code more readable * Add ViltForNaturalLanguageVisualReasoning to the tests * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering * Replace pixel_values_2 by single tensor * Add hidden_states and attentions * Fix one more test * Fix all tests * Update year * Fix rebase issues * Fix another rebase issue * Remove ViltForPreTraining from auto mapping * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval * Make it possible to use BertTokenizerFast in the processor * Use BertTokenizerFast by default * Rename ViltForNaturalLanguageVisualReasoning, define custom model output Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-19 19:51:59 +01:00
NielsRogge	842298f84f	[ViTMAE] Various fixes (#15221 ) * Add MAE to AutoFeatureExtractor * Add link to notebook * Fix relative paths	2022-01-19 15:27:57 +01:00
Li-Huai (Allan) Lin	841d979190	Add FastTokenizer to REALM (#15211 ) * Remove BertTokenizer abstraction * Add FastTokenizer to REALM * Fix config archive map * Fix copies * Update realm.mdx * Apply suggestions from code review	2022-01-19 15:19:36 +01:00
Sylvain Gugger	db3503949d	Finish conversion of REALM doc to MDX	2022-01-18 18:00:30 -05:00
Jake Tae	fe78fe98ca	Enable tqdm toggling (#15167 ) * feature: enable tqdm toggle * test: add tqdm unit test * style: run linter * Update tests/test_tqdm_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * refactor: use tiny model, run linter * docs: add tqdm to logging * docs: add tqdm reference to `http_get` * style: run linter * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * fix: use `AutoConfig` for framework agnostic testing * chore: mv tqdm test to `test_logging.py` * feature: implement enable/disable functions * docs: mv docstring to comment * chore: mv tqdm functions to `logging.py` * docs: update docs to reference `enable/disable` funcs * test: update test to use `enable/disable` func * chore: update function reference in comment Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-01-18 17:52:35 -05:00
NielsRogge	74bec9865c	Add MAE (#15120 ) * First draft * More improvements * More improvements * More improvements * Fix embeddings * Add conversion script * Finish conversion script * More improvements * Fix forward pass * Remove print statements * Add weights initialization * Add initialization of decoder weights * Add support for other models in the conversion script * Fix patch_size for huge model * Fix most of the tests * Fix integration test * Fix docs * Fix archive_list * Apply suggestions from code review * Improve documentation * Apply more suggestions * Skip some tests due to non-deterministic behaviour * Fix test_initialization * Remove unneccessary initialization of nn.Embedding * Improve docs * Fix dummies * Remove ViTMAEFeatureExtractor from docs * Add model to README and table of contents * Delete inference file	2022-01-18 16:21:32 +01:00
Li-Huai (Allan) Lin	22454ae492	Add REALM (#13292 ) * REALM initial commit * Retriever OK (Update new_gelu). * Encoder prediction score OK * Encoder pretrained model OK * Update retriever comments * Update docs, tests, and imports * Prune unused models * Make embedder as a module `RealmEmbedder` * Add RealmRetrieverOutput * Update tokenization * Pass all tests in test_modeling_realm.py * Prune RealmModel * Update docs * Add training test. * Remove completed TODO * Style & Quality * Prune `RealmModel` * Fixup * Changes: 1. Remove RealmTokenizerFast 2. Update docstrings 3. Add a method to RealmTokenizer to handle candidates tokenization. * Fix up * Style * Add tokenization tests * Update `from_pretrained` tests * Apply suggestions * Style & Quality * Copy BERT model * Fix comment to avoid docstring copying * Make RealmBertModel private * Fix bug * Style * Basic QA * Save * Complete reader logits * Add searcher * Complete searcher & reader * Move block records init to constructor * Fix training bug * Add some outputs to RealmReader * Add finetuned checkpoint variable names parsing * Fix bug * Update REALM config * Add RealmForOpenQA * Update convert_tfrecord logits * Fix bugs * Complete imports * Update docs * Update naming * Add brute-force searcher * Pass realm model tests * Style * Exclude RealmReader from common tests * Fix * Fix * convert docs * up * up * more make style * up * upload * up * Fix * Update src/transformers/__init__.py * adapt testing * change modeling code * fix test * up * up * up * correct more * make retriever work * update * make style * finish main structure * Resolve merge conflict * Make everything work * Style * Fixup * Fixup * Update training test * fix retriever * remove hardcoded path * Fix * Fix modeling test * Update model links * Initial retrieval test * Fix modeling test * Complete retrieval tests * Fix * style * Fix tests * Fix docstring example * Minor fix of retrieval test * Update license headers and docs * Apply suggestions from code review * Style * Apply suggestions from code review * Add an example to RealmEmbedder * Fix Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-18 07:24:13 -05:00
Stas Bekman	edd3fce2f7	[doc] new MoE paper (#15184 ) add new paper	2022-01-17 09:10:51 -08:00
Stas Bekman	669e3c50c9	[doc] performance: Efficient Software Prebuilds (#15147 ) * Efficient Software Prebuilds * improve	2022-01-14 18:25:20 -08:00
AK391	4663c609b9	Add "open in hf spaces" gradio button issue #73 (#15106 ) * update XLMProphetNet link * update DPR link * change prophetnet link * change link MBART * change link GPT * update gpt2 link * ctrl update link * update Transformer-XL link * Update Reformer link * update xlnet link * bert update link * udpate albert link * roberta update link * update distilbert link * update convbert link * update XLM link * xlm roberta update link * update Flaubert link * update electra link * update funnel transformer and longformer * bart update link * pegasus update link * udpate marianmt link * t5 update link * mt5 update link	2022-01-14 10:12:30 -05:00
Carlos Aguayo	3fc221d077	Update model_sharing.mdx (#15142 ) Fix typo	2022-01-13 12:26:02 -05:00
lewtun	021f2ea987	Add ONNX configuration classes to docs (#15121 ) * Add ONNX classes to main package * Remove permalinks from ONNX guide * Fix ToC entry * Revert "Add ONNX classes to main package" This reverts commit `eb794a5b00`. * Add ONNX classes to main doc * Fix syntax highlighting in doc * Fix text * Add FeaturesManager to doc * Use paths to reference ONNX classes * Add FeaturesManager to init * Add missing ONNX paths	2022-01-12 16:33:32 +01:00
Sylvain Gugger	c425d60bb9	Fix link to deepspeed config	2022-01-12 09:32:53 -05:00
lewtun	16f0b7d72c	Update ONNX docs (#14904 ) * Remove docs for deprecated ONNX export * Tidy up the CLI help messages * Revamp ONNX docs * Update auto-config table * Use DistilBERT as example for consistency * Wrap up first pass at ONNX docs * Fix table check * Add tweaks and introduction * Add cross-ref * Fix missing import * Fix style * Add permalinks to ONNX configs * Clarify role of OrderedDict * Update docs/source/serialization.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add doctest syntax to code blocks * Remove permalinks * Revert "Remove permalinks" This reverts commit `099701daf0`. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-11 18:06:05 +01:00
AK391	68d925195e	Merge branch 'master' into master	2022-01-11 11:11:29 -05:00
novice	28e091430e	Add Nystromformer (#14659 ) * Initial commit * Config and modelling changes Added Nystromformer-specific attributes to config and removed all decoder functionality from modelling. * Modelling and test changes Added Nystrom approximation and removed decoder tests. * Code quality fixes * Modeling changes and conversion script Initial commits to conversion script, modeling changes. * Minor modeling changes and conversion script * Modeling changes * Correct modeling, add tests and documentation * Code refactor * Remove tokenizers * Code refactor * Update __init__.py * Fix bugs * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/nystromformer.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/convert_nystromformer_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update modeling and test_modeling * Code refactor * .rst to .mdx * doc changes * Doc changes * Update modeling_nystromformer.py * Doc changes * Fix copies * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update configuration_nystromformer.py * Fix copies * Update tests/test_modeling_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update test_modeling_nystromformer.py * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Fix code style * Update modeling_nystromformer.py * Update modeling_nystromformer.py * Fix code style * Reformat modeling file * Update modeling_nystromformer.py * Modify NystromformerForMultipleChoice * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Code style changes and torch.no_grad() * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-11 14:25:49 +01:00
Virus	c4fa908fa9	Adds IBERT to models exportable with ONNX (#14868 ) * Add IBertOnnxConfig and tests * add all the supported features for IBERT and remove outputs in IbertOnnxConfig * use OnnxConfig * fix codestyle * remove serialization.rst * codestyle	2022-01-11 12:17:08 +01:00
AK391	5cd7086fdb	XLM-ProphetNet Spaces badge	2022-01-11 00:11:31 -05:00
AK391	4e3208662e	DPR Spaces badge	2022-01-10 13:50:40 -05:00
AK391	ac2c06d492	ProphetNet spaces badge	2022-01-10 13:43:34 -05:00
AK391	bf0201e184	MBART spaces badge	2022-01-10 13:37:17 -05:00
Yih-Dar	b67fd797be	Add TFVisionEncoderDecoderModel (#14148 ) * Start the work on TFVisionEncoderDecoderModel * Expose TFVisionEncoderDecoderModel * fix import * Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules() * reorder * Apply the fix for checkpoint loading as in #14016 * remove attention_mask + fix VISION_DUMMY_INPUTS * A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting * fix wrong condition: shape_list(input_ids) == 2 * add tests * use personal TFViTModel checkpoint (for now) * Add equivalence tests + projection layer * style * make sure projection layer can run * Add examples * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean comments (need to work on TODOs for PyTorch models) * Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel * fixes * Revert changes in PT code. * Update tests/test_modeling_tf_vision_encoder_decoder.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Add test_inference_coco_en for TF test * fix quality * fix name * build doc * add main_input_name * Fix ckpt name in test * fix diff between master and this PR * fix doc * fix style and quality * fix missing doc * fix labels handling * Delete auto.rst * Add the changes done in #14016 * fix prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-10 13:30:14 -05:00
AK391	c9504b2f50	MT5 Spaces badge	2022-01-10 12:57:08 -05:00
AK391	daec528ca9	T5 Spaces badge	2022-01-10 12:51:39 -05:00
AK391	0554e4d5c5	MarianMT Spaces badge	2022-01-10 12:47:12 -05:00
AK391	7ec6aad23d	Pegasus Spaces badge	2022-01-10 12:39:22 -05:00
AK391	03f8b9c9e0	BART Spaces badge	2022-01-10 12:33:59 -05:00
Stas Bekman	37bc0b4e53	[performance doc] Power and Cooling (#14935 ) * [performance doc] Power and Cooling * more docs * Update docs/source/performance.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * reword Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-10 09:21:04 -08:00
AK391	20f169b523	Longformer Spaces badge	2022-01-10 12:14:18 -05:00
AK391	4fbc924d0a	Funnel Transformer spaces badge	2022-01-10 12:06:05 -05:00
AK391	222c09a635	ELECTRA Spaces badge	2022-01-10 11:53:23 -05:00
Stas Bekman	31838d3e11	[doc] normalize HF Transformers string (#15023 )	2022-01-10 08:44:33 -08:00
AK391	84f360e862	FlauBERT spaces badge	2022-01-10 11:41:10 -05:00
AK391	9f33116898	XLM-Roberta Spaces badge	2022-01-10 10:54:18 -05:00
AK391	20fa9eb035	XLM Spaces badge	2022-01-10 10:48:06 -05:00
AK391	16b6df6fca	ConvBERT spaces badge	2022-01-10 10:33:03 -05:00
Santiago Castro	f21bc4215a	Use tqdm.auto in Pipeline docs (#14920 ) It's better for e.g. notebook.	2022-01-10 10:28:34 -05:00
Mishig Davaadorj	f012c00ada	Model summary horizontal banners (#15058 )	2022-01-10 10:06:14 -05:00
Minghao Li	b2c477fc6d	support the trocr small models (#14893 ) * support the trocr small models * resolve conflict * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix unexpected indent in processing_trocr.py * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * update the docstring of processing_trocr * remove extra space Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-01-10 09:28:03 -05:00
Yih-Dar	0a03a86813	fix model table cell text alignment (#14999 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-10 06:44:11 -05:00
AK391	5be1242ac0	Merge branch 'huggingface:master' into master	2022-01-07 11:48:22 -05:00
AK391	484e7a441f	Distilbert spaces badge	2022-01-07 11:47:56 -05:00
K.C. Tung	f18c6fa94c	Resubmit changes after rebase to master (#14982 )	2022-01-07 08:34:12 +01:00
AK391	1d71227295	Roberta spaces badge	2022-01-06 18:50:19 -05:00
AK391	cac877425c	ALBERT spaces badge	2022-01-06 13:01:23 -05:00
AK391	794441c379	BERT spaces badge	2022-01-06 12:22:09 -05:00
AK391	f872f18dca	XLNet spaces badge	2022-01-06 12:09:50 -05:00
AK391	8d187e7feb	Reformer Spaces badge	2022-01-06 11:59:21 -05:00
AK391	59fb636948	Transformer-XL badge	2022-01-06 11:47:41 -05:00
AK391	2380136722	add spaces badges	2022-01-04 16:13:57 -05:00
Kevin Ko	857ab55c01	[doc] Update parallelism.mdx (#15018 ) * Update parallelism.mdx * Update parallelism.mdx	2022-01-04 09:58:27 -08:00
Daniel Stancl	21aecc0971	Add Flax RoFormer (#15005 ) * Add FlaxRoFormer * Clean code + make quality * Fix output pooling for FlaxRoFormerForMultipleChoiceModule * Apply suggestions from code review * add flax model to repos Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-04 13:23:10 +01:00
Kevin Ko	f2ab21833f	Update parallelism.mdx (#15013 ) * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx	2022-01-03 11:49:27 -08:00
Sylvain Gugger	8f6373c61c	Map model_type and doc pages names (#14944 ) * Map model_type and doc pages names * Add script * Fix typo * Quality * Manual check for Auto Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2022-01-03 05:08:55 -05:00
Sylvain Gugger	2c5597f6c7	Style	2021-12-27 19:18:08 -05:00
Sylvain Gugger	b5e2b183af	Doc styler examples (#14953 ) * Fix bad examples * Add black formatting to style_doc * Use first nonempty line * Put it at the right place * Don't add spaces to empty lines * Better templates * Deal with triple quotes in docstrings * Result of style_doc * Enable mdx treatment and fix code examples in MDXs * Result of doc styler on doc source files * Last fixes * Break copy from	2021-12-27 19:07:46 -05:00
Stas Bekman	e13f72fbff	[doc] :obj: hunt (#14954 ) * redo sans examples * style	2021-12-27 15:49:48 -08:00
Stas Bekman	133c5e40c4	[doc] consistent True/False/None default format (#14951 ) * [doc] consistent True/False/None default format * Update src/transformers/models/xlnet/modeling_xlnet.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-27 14:31:40 -08:00
Sylvain Gugger	b2f500256e	Convert last rst file (#14952 )	2021-12-27 17:09:37 -05:00
Daniel Stancl	501307b58b	Add `ElectraForCausalLM` -> Enable Electra encoder-decoder model (#14729 ) * Add ElectraForCausalLM and cover some basic tests & need to fix a few tests * Fix bugs * make style * make fix-copies * Update doc * Change docstring to markdown format * Remove redundant update_keys_to_ignore	2021-12-27 12:37:52 +01:00
Nicolas Patry	b058490ceb	ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines. (#14225 ) * Pipeline chunks. * Batching for Chunking pipelines ? * Batching for `question-answering` and `zero-shot-cls`. * Fixing for FNet. * Making ASR a chunk pipeline. * Chunking ASR API. * doc style. * Fixing ASR test. * Fixing QA eror (p_mask, padding is 1, not 0). * Enable both vad and simple chunking. * Max length for vad. * remove inference mode, crashing on s2t. * Revert ChunkPipeline for ASRpipeline. Too many knobs for simple integration within the pipeline, better stick to external convenience functions instead, more control to be had, simpler pipeline and also easier to replace with other things later. * Drop necessity for PT for these. * Enabling generators. * Add mic + cleanup. * Typo. * Typo2. * Remove ASR work, it does not belong in this PR anymore. * Update src/transformers/pipelines/pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding many comments. * Doc quality. * `hidden_states` handling. * Adding doc. * Bad rebase. * Autofixing docs. * Fixing CRITICAL bug in the new Zerocls pipeline. Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-12-27 11:26:20 +01:00
Yih-Dar	8f2cc1c3ab	Add TFCLIPModel (#13967 ) * Start the work for TFCLIPModel * Convert to TF code (TODO: loss + doc) * Clean up * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd * assert -> raise error * Expose TFCLIPModel * Deal with dummy_inputs * Add tests * Fix all tests. TODO: manual check weight loading + add more comments * Fix pt tf equivalence test * fixes * update TFCLIPVisionEmbeddings's Conv2D * Fix loss + overwrite test_pt_tf_model_equivalence from common * Add a comment about the change about MainLayer in test_keras_save_load * Set return_loss=True in TFCLIPModelTester + make tests pass * overwrite test_pt_tf_model_equivalence from tf common * fix base_model_prefix * Fix examples * remove unused * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply review suggestions * change self.pre_layrnorm to self.pre_layernorm * apply more review suggestions * return attention probs before dropout (to align with PT) * fix weight init * fix * build doc * fix missing doc * fix for test Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-23 11:19:44 -05:00
lewtun	6b655cc63f	Add ONNX support for MarianMT models (#14586 ) * First commit to add MarianMT to ONNX * Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward() * Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature * Style fix * Added support for other features for already supported models * Partial support for causal and seq2seq models * Partial support for causal and seq2seq models * Add default task for MarianMT ONNX * Remove automatic creation of decoder_input_ids * Extend inputs and outputs for MarianMT ONNX config * Add MarianMT to ONNX unit tests * Refactor * OnnxSeq2SeqConfigWithPast to support seq2seq models * Parameterized the onnx tests * Restored run_mlm.py * Restored run_mlm.py * [WIP] BART update * BART and MBART * Add past_key_values and fix dummy decoder inputs Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations. * Refactor MarianOnnxConfig to remove custom past_key_values logic * Fix quality * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Refactor Marian export to account for base changes * Fix copies * Implemented suggestions * Extend support for causal LM * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. 
* is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Remove commented import * Remove ONNX model * Remove redundant class method * Tidy up imports * Fix quality * Refactor dummy input function * Add copied from statements to Marian config functions * Remove false copied from comments * Fix copy from comment Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2021-12-23 13:35:56 +01:00
Sylvain Gugger	207594be81	Convert rst files (#14888 ) * Convert all tutorials and guides * Convert all remaining rst to mdx * Track and fix bad links	2021-12-22 16:14:35 -05:00
NielsRogge	7df4b90c76	Fix Perceiver docs (#14879 )	2021-12-22 14:18:03 +01:00
Ryokan RI	824fd44fc3	Feature/fix slow test in mluke (#14749 ) * make MLukeTokenizerTest fast * make LukeTokenizerTest fast * add entry to _toctree.yaml	2021-12-22 06:35:59 -05:00
Lysandre Debut	ec3567fe20	Convert model files from rst to mdx (#14865 ) * First pass * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-22 03:27:30 -05:00
Stas Bekman	185876392c	[doc porting] several docs (#14858 ) * [doc porting] 2 docs * [doc porting] 2 docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/main_classes/deepspeed.mdx * cleanup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-21 09:55:25 -08:00
Stas Bekman	b6ec956976	[logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS (#14669 ) * [logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS * reword	2021-12-20 20:48:38 -08:00
Stas Bekman	c1125dc2ba	[doc] typo (#14849 ) fix small typo	2021-12-20 12:20:21 -05:00
Patrick von Platen	952a77b05d	[Perceiver] Skip multi-gpu tests for now (#14813 ) * [Perceiver] Skip multi-gpu tests for now * Update tests/test_modeling_perceiver.py * up * up	2021-12-20 15:22:50 +01:00
Derek Chia	8a818c26cb	Fix dead link to benchmarks.ipynb (#14842 ) Notebook has been updated here https://github.com/huggingface/notebooks/tree/master/examples/benchmark.ipynb	2021-12-20 09:08:05 -05:00
Anton Lozhkov	3883e3a75e	Add SD and SV heads for WavLM (#14847 ) * Add converted heads * Add dummies	2021-12-20 16:40:56 +03:00
Patrick von Platen	c4a96cecbc	Wav2Vec2 meets phonemes (#14353 ) * up * add tokenizer * improve more * finish tokenizer * finish * adapt speech recognition script * adapt convert * more fixes * more fixes * update phonemizer wav2vec2 * better naming * fix more tests * more fixes swedish * correct tests * finish * improve script * remove file * up * lets get those 100 model architectures until the end of the month * make fix-copies * correct more * correct script * more fixes * more fixes * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace assert * fix copies * fix docs * new try docs * boom boom * update * add phonemizer to audio tests * make fix-copies * up * upload models * some changes * Update tests/test_tokenization_wav2vec2_phoneme.py Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * more fixes * remove @ Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-17 19:56:44 +01:00
Lysandre Debut	77d6c826d8	Convert rst to mdx bert (#14806 ) * BERT to mdx mdx :) c * Update docs/source/model_doc/bert.mdx Co-authored-by: Julien Chaumond <julien@huggingface.co> * Remove all Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-17 11:13:34 -05:00
Patrick von Platen	bef1e3e4a0	Add WavLM (#14354 ) * first commit * fix some stuff * fix more readme * Apply suggestions from code review * update * correct * up * attn layer works * push code * make modedls work * Small change * more refactor * finish * up * fix convertsion * fix position bias * Fix style * fix conversion * make fix-copies * add * clean * fix docs * fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply final changes * make fix-copies Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-16 18:57:05 +01:00
Anton Lozhkov	48463ebb33	Add Speaker Diarization and Verification heads (#14723 ) * Models * Squashed commit of the following: commit 72278e1e931a16d0879acc77f65762f3364833d0 Author: anton-l <aglozhkov@gmail.com> Date: Fri Dec 10 21:45:08 2021 +0300 * Add unispeech heads * Add sd/sv automodels * Docs cleanup * Fix docstrings * rename xvector classes * examples * Tests cleanup * Style * Better checkpoints for tests * leftover docs * apply review suggestions * Style + init tests * Update unispeech-sat tdnn downsampling	2021-12-16 19:22:14 +03:00
Lysandre Debut	8010fda9bf	Removes images to put them in a dataset (#14781 ) * First try * Update instructions	2021-12-16 04:42:02 -05:00
Sylvain Gugger	459677aebe	PoC for conserving old links (#14754 ) * PoC for conserving old links * Do the same for other links * remap the redirects section * add instructions on how to move sections * improve Co-authored-by: Stas Bekman <stas@stason.org>	2021-12-15 11:40:47 -08:00
NielsRogge	50bc57cef8	Update Perceiver code examples (#14783 ) * Fix code examples * Fix code example	2021-12-15 11:06:38 -05:00
Xing Han Lu	72c6e8b8bf	Update t5.rst (#14776 )	2021-12-15 14:59:11 +01:00
Stas Bekman	fdf3ce2827	[doc] performance: groups of operations by compute-intensity (#14757 )	2021-12-14 19:01:23 -08:00
Amit Chaudhary	851a78978a	Fix broken links to distillation on index page of documentation (#14722 ) * Fix broken links to distillation on index page of documentation * Fix broken link for distillation in main README * Run make fixup	2021-12-14 21:55:33 -05:00
Sylvain Gugger	322d416916	Update Table of Contents (#14755 )	2021-12-13 17:15:19 -05:00
Sylvain Gugger	7533d30acd	Convert Trainer doc page to MarkDown (#14753 ) * Convert Trainer doc page to MarkDown * Fix repo consistency * Fix the doc build test job	2021-12-13 13:09:50 -05:00
Sylvain Gugger	c3cd88a9ba	Small fixes for the doc (#14751 )	2021-12-13 11:17:01 -05:00
Lucien	fc74c84537	Swap TF and PT code inside two blocks (#14742 )	2021-12-13 10:31:11 -05:00
Lysandre Debut	6e05bb1c96	Fix the perceiver docs (#14748 )	2021-12-13 09:29:47 -05:00
NielsRogge	4c99e553c1	Improve documentation of some models (#14695 ) * Migrate docs to mdx * Update TAPAS docs * Remove lines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add pt/tf switch to code examples * More improvements * Improve docstrings * More improvements Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-13 13:24:36 +01:00
Stas Bekman	027074f4d0	[doc] document MoE model approach and current solutions (#14725 ) * document MoE model approach * additional info from Samyam * fix	2021-12-10 18:24:38 -08:00
Sylvain Gugger	5eca742f6c	Fix special character in MDX (#14721 )	2021-12-10 16:02:48 -05:00
Sylvain Gugger	63c284c2d4	Prevent style_doc from tempering the config file	2021-12-10 15:31:43 -05:00
Sylvain Gugger	1b75d7238c	Automatically build doc notebooks (#14718 ) * Test workflow * Build doc * Make a clean build * Add doc config * Restore other workflows * Final job * Print something in else statements * Pull before making changes	2021-12-10 14:20:56 -05:00
Sylvain Gugger	bab1556456	Put back open in colab markers (#14684 )	2021-12-09 12:00:06 -05:00
Tikeng Notsawo Pascal Junior	3bc7d70e9c	Fix : wrong link in the documentation (ConvBERT vs DistilBERT) (#14705 )	2021-12-09 11:35:22 -05:00
Mishig Davaadorj	60be4bf8ac	Fix typo in toctree (#14704 )	2021-12-09 09:25:31 -05:00
Sylvain Gugger	13186d7152	Move pyctcdecode (#14686 ) * Move pyctcdecode dep * Fix doc and last objects * Quality * Style * Ignore this black	2021-12-08 15:41:58 -05:00
Stas Bekman	1228661285	[bf16 support] tweaks (#14580 ) * [bf16 support] tweaks * corrections Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>	2021-12-08 11:33:24 -08:00
Sylvain Gugger	01b8cd5932	Revert open-in-colab and add perceiver (#14683 )	2021-12-08 13:52:31 -05:00
Sylvain Gugger	cf36f4d7a8	Convert tutorials (#14665 ) * Convert a few docs * And another * Last tutorials * New syntax for colab links * Convert a few docs * And another * Last tutorials * New syntax for colab links	2021-12-08 13:19:46 -05:00
NielsRogge	65b20b739b	Add Perceiver IO (#14487 ) * First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic 
decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal modal * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debuggin * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init * Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main 
init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI	2021-12-08 14:20:34 +01:00
Patrick von Platen	961732c276	[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339 ) * up * up * up * make it cleaner * correct * make styhahalal * add more tests * finish * small fix * make style * up * tryout to solve cicrle ci * up * fix more tests * fix more tests * apply sylvains suggestions * fix import * correct docs * add pyctcdecode only to speech tests * fix more tests * add tf, flax and pt tests * add pt * fix last tests * fix more tests * Apply suggestions from code review * change lines * Apply suggestions from code review Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * correct tests * correct tests * add doc string Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-08 12:07:54 +01:00
Ryokan RI	30646a0a3c	Add mLUKE (#14640 ) * implement MLukeTokenizer and LukeForMaskedLM * update tests * update docs * add LukeForMaskedLM to check_repo.py * update README * fix test and specify the entity pad id in tokenization_(m)luke * fix EntityPredictionHeadTransform	2021-12-07 00:25:28 -05:00
tucan9389	0f3f045ebd	Add GPTJForQuestionAnswering (#14503 ) * Add GPTJForQuestionAnswering * Reformat for GPTJForQuestionAnswering * Fix isort error * make style for GPTJForQA * Add _keys_to_ignore_on_load_missing * Change the sequence of qa and classification Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-06 11:44:10 -05:00
Matt	73ec4340ec	Make DefaultDataCollator importable from root (#14588 ) * Make DefaultDataCollator importable from root * Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings * make style * Add DefaultDataCollator to data_collator.rst * Add DefaultDataCollator to data_collator.rst	2021-12-03 15:15:09 -05:00
Stas Bekman	71b1bf7ea8	[trainer] add tf32-mode control (#14606 ) * [trainer] add --tf32 support * it's pt>=.17 * it's pt>=.17 * flip the default to True * add experimental note * simplify logic * style * switch to 3-state logic * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * re-style code Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-03 10:08:58 -08:00
Lysandre Debut	ec47baeba2	2022 is the year of multi-modality (#14610 ) * 2022 is the year of multi-modality * Small fix * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Apply suggestions from code review * Apply to documentation index * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-03 11:35:44 -05:00
Daniel Stancl	50d909be28	[Flax] Add FlaxBlenderbotSmall (#14576 ) * [WIP] Add FlaxBlenderbotSmall * Revert some unintentionally changed files Revert some unintentionally files changed by improperly filled cookiecutter instructions. * Fix repo consistency * Fix Flax-PT equivalence * Apply suggestions from code review * Update index.mdx * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-02 14:21:48 +05:30
Mishig Davaadorj	275402bf2b	Update doc img links (#14593 ) * Update doc img links * Rename toctree.yml -> _toctree.yml (#14594) * Update doc img links * Update performance.md img link	2021-12-02 09:01:35 +01:00
Mishig Davaadorj	4f68de625c	Rename toctree.yml -> _toctree.yml (#14594 )	2021-12-02 08:58:39 +01:00
Stas Bekman	fbe278c76c	[doc] bf16/tf32 guide (#14579 ) * [doc] bf16/tf32 guide * expand * expand * Update docs/source/performance.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-01 14:18:58 -08:00
Sylvain Gugger	4df7d05a87	Doc new front (#14590 ) * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix typo in toctree (#14516) * Fix checkpoints badge * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix checkpoints badge * Fix typo in toctree (#14516) * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * 
Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Styling Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-01 14:13:02 -05:00
Suraj Patil	4c0dd199c8	FlaxGPTJ (#14396 ) * add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotray embeds * quality * doc and quality * fix equivalence tests	2021-12-01 10:57:39 +05:30
Suraj Patil	fc1d97f29d	VisionTextDualEncoder (#13511 ) * init vision_text_dual_encoder * fix merge * remove extra heads * fix tests * remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP * remove archive map * fix imports * fix more imports * fix init * delete tokenizers * fix imports * clean * support clip's vision model * handle None config * begin tests * more test and few fixes * warn about newly init weights * more tests * add loss to model * remove extra classes from doc * add processor * doc and small fixes * add start docstr * update flax model * flax tests * more flax tests * doc * quality * doc and quality * fix doc * doc * remove comments * update warning * quality * fix docs * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * replace asserts, fix imports * update imports * fix import * address some review comments * fix check * reduce tolerance * fix test * add flax integration test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address Sylvain's comments * fix style * add pt_flax_equivalence test in PT tests * add pt integration test * update test * use pre-trained checkpoint in examples Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-30 22:21:48 +05:30
Daniel Stancl	faacd74729	[Flax] Add FlaxBlenderbot (#13633 ) * Init Flax implementation for Blenderbot * Add a majority of stuff except for tests * make style quality * Add tests and fix some bugs * Add tests * Clean source code and fix some bugs * Fix copies and docs * Fix jax device condition for tests * Fix layer norm in the encoder * Fix a few typos in the test file * make fix-copies * make fix-copies * fix layer norm * Fix Flax params dtype (#13090) * Fix PR reference (#13098) * make fix-copies * Update tests/test_modeling_flax_blenderbot.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-11-30 17:36:54 +05:30
Kamal Raj	c468a87a69	Tapas tf (#13393 ) * TF Tapas first commit * updated docs * updated logger message * updated pytorch weight conversion script to support scalar array * added use_cache to tapas model config to work properly with tf input_processing * 1. rm embeddings_sum 2. added # Copied 3. + TFTapasMLMHead 4. and lot other small fixes * updated docs * + test for tapas * updated testing_utils to check is_tensorflow_probability_available * converted model logits post processing using numpy to work with both PT and TF models * + TFAutoModelForTableQuestionAnswering * added TF support * added test for TFAutoModelForTableQuestionAnswering * added test for TFAutoModelForTableQuestionAnswering pipeline * updated auto model docs * fixed typo in import * added tensorflow_probability to run tests * updated MLM head * updated tapas.rst with TF model docs * fixed optimizer import in docs * updated convert to np data from pt model is not `transformers.tokenization_utils_base.BatchEncoding` after pipeline upgrade * updated pipeline: 1. with torch.no_gard removed, pipeline forward handles 2. token_type_ids converted to numpy * updated docs. * removed `use_cache` from config * removed floats_tensor * updated code comment * updated Copyright Year and logits_aggregation Optional * updated docs and comments * updated docstring * fixed model weight loading * make fixup * fix indentation * added tf slow pipeline test * pip upgrade * upgrade python to 3.7 * removed from_pt from tests * revert commit `f18cfa9`	2021-11-30 11:07:55 +01:00
NielsRogge	25156eb296	Rename ImageGPT (#14526 ) * Rename * Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING	2021-11-29 10:19:11 +01:00
Xing Han Lu	ebbe8cc3fe	Tokenizers docs: Specify which class contains `__call__` method (#14379 ) * Update tokenizer.rst * Apply `make fixup`	2021-11-28 18:55:38 -05:00
Lysandre Debut	2318bf77eb	Fixes (#14534 )	2021-11-26 04:35:08 -05:00
Lysandre Debut	c15f4f203f	Quicktour updates (#14533 )	2021-11-26 04:09:31 -05:00
Chris Fregly	1bbd6fcdeb	added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error (#14529 ) * added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error * Update quicktour.rst * added >>> * dependencies * added space	2021-11-26 03:46:07 -05:00
Stas Bekman	956a483173	[deepspeed] zero inference (#14253 ) * [deepspeed] zero inference * only z3 makes sense for inference * fix and style * docs * rework * fix test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * responding to suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-23 14:09:15 -08:00
Sylvain Gugger	204d251310	Auto processor (#14465 ) * Add AutoProcessor class * Init and tests * Add doc * Fix init * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverts to tokenizer or feature extractor when available * Adapt test Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-11-22 12:17:38 -05:00

1283 Commits