transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-06 22:30:09 +06:00

Author	SHA1	Message	Date
Joao Gante	af37d183b3	Generate: documented function to compute the transition scores (#21191 ) Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-01-20 12:50:01 +00:00
Bartosz Szmelczynski	1b37fb5e17	Efficientformer (#20459 ) - Adds EfficientFormer V1 to transformers - PR co-authored by @novice03 and @Bearnardd Co-authored-by: novice <pranavpulijala@gmail.com> Co-authored-by: novice <44259234+novice03@users.noreply.github.com>	2023-01-20 11:35:42 +03:00
Clémentine Fourrier	87208a05af	Graphormer model for Graph Classification (#20968 ) * [FT] First commit for graphormer architecture. The model has no tokenizer, as it uses a collator and preprocessing function for its input management. Architecture to be tested against original one. The arch might need to be changed to fit the checkpoint, but a revert to the original arch will make the code less nice to read. TODO: doc * [FIX] removed test model * [FIX] import error * [FIX] black and flake * [DOC] added paper refs * [FIX] [DOC] * [FIX] black * [DOC] Updated READMEs * [FIX] Order of imports + rm Tokenizer calls * [FIX] Moved assert in class to prevent doc build failure * [FIX] make fix-copies * [Doc] update from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [FIX] Removed Graphormer from Sequence classification model list * [DOC] Added HF copyright to Cython file * [DOC] Fixed comments * [FIX] typos in class doc + removed config classes. Todo: update doc from paper definitions * [FIX] Removed dependency to fairseq, and replaced all asserts with Exception management * [FIX] Homogenized initialization of weights to pretrained constructor * [FIX] [CP] Updated multi_hop parameter to get same results as in original implementation * [DOC] Relevant parameter description in the configuration file * [DOC] Updated doc and comments in main graphormer file * [FIX] make style and quality checks * [DOC] Fix doc format * [FIX] [WIP] Updated part of the tests, though still a wip * [FIX] [WIP] * [FIX] repo consistency * [FIX] Changed input names for more understandability * [FIX] [BUG] updated num_classes params for propagation in the model * simplified collator * [FIX] Updated tests to follow new naming pattern * [TESTS] Updated test suite along with model * [FIX] rm tokenizer import * [DOC] add link to graphormerdoc * Changed section in doc from text model to graph model * Apply suggestions from code review Spacing, inits Co-authored-by: 
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [DOC] Explain algos_graphormer functions * Cython soft import protection * Rm call to Callable in configuration graphormer * [FIX] replaced asserts with Exceptions * Add org to graphormer checkpoints * Prefixed classes with Graphormer * Management of init functions * format * fixes * fix length file * update indent * relaunching ci * Errors for missing cython imports * fix style * fix style doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-01-19 13:05:59 -05:00
Kambe Hiroyuki	705e332b46	Add Japanese translation index.mdx (#21186 ) * Add Japanese translation index.mdx * Fix the year of the license * Change the models list to Japanese	2023-01-19 17:53:28 +01:00
Maria Khalusova	0359e2e15f	Updates to computer vision section of the Preprocess doc (#21181 ) * Extended the CV preprocessing section with more details and refactored the example * added padding to the CV section, though it is a special case * Added a tip about post processing methods * make style * link update * Apply suggestions from review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * review feedback Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-01-19 08:43:36 -05:00
Jitesh Jain	5b949623c7	Add OneFormer Model (#20577 ) * Add Oneformer Model * Add OneFormer Tests * Add UNIVERSAL_SEGMENTATION_MAPPING * Fix config * 🐛 Fix error encountered while writing tests * 🔨 Fix instance segmentation post processing * Format Files and Add Documentation * Add Documentation mdx file * Run make fixup * Run make fix-copies * Remove unnecessary code * Format modeling_oneformer.py * Add OneFormer to ImageSegmentationPipeline * Format files * Add Demo link to Readme * Fix fomatting errors * Fix test failures * Update Table in index.mdx * Fix version * Fix style * Remove OneFormer from TF * Fix Imports * Fix dummy objects * Fix tests * Add newline * Remove OneFormerFeatureExtractor * Remove CUDA Kernels * Use AutoBackbone for Swin * Fix description * Use Image Processor * Fix copies * Fix formatting * Fix import order * Fix flake8 errors * Fix doc errors * Add Hindi Readme entry * Update supported backbones * Update supported backbones * Undo Changes * Fix type of config * Fix isort * Fix auto.mdx * Fix swin config * Replace DinatBackbone with AutoBackbone * Use SwinBackbone * Use SwinBackbone * Fix conversion script * Fix arguments * Add argument description * Fix style * Add OneFormerProcessor * Fix OneFormerProcessor Tests * Fix mapping * Fix imports * Fix inits * Fix style * Fix comment * Fix docstring * Move OneFormer to MultiModal * Fix Copies * Remove size divisor * Fix check_repo.py * Fix copies * Add Processor for Testing Pipeline * Fix padding for tokens * Fix variables * Fix formatting with correct black version * Add Image Processor Test * Apply suggestions * Revert common modeling * Add check for task * Fix conversion script * Fix initialization order * Fix tests * Undo Pipeline Changes * Fix layers in MLP * Fix copies * Update image paths * Fix copies * Apply suggestions	2023-01-19 09:31:07 +01:00
Matt	00ba7cadd8	Rewrite a couple of lines in the TF XLA doc (#21177 ) * Rewrite a couple of lines in the TF XLA doc to explain that jit_compile can be used in model.compile() too * Remove extra )	2023-01-18 17:53:05 +00:00
Samuel Xu	defdcd2862	Remove Roberta Dependencies from XLM Roberta Flax and Tensorflow models (#21047 ) * Added flax model code * Added tf changes * missed some * Added copy comments * Added style hints * Fixed copy statements * Added suggested fixes * Made some fixes * Style fixup * Added necessary copy statements * Fixing copy statements * Added more copies * Final copy fix * Some bugfixes * Adding imports to init * Fixed up all make fixup errors * Fixed doc errors * Auto model changes	2023-01-18 07:49:39 -05:00
Younes Belkada	023f51fe16	`blip` support for training (#21021 ) * `blip` support for training * remove labels creation * remove unneeded `decoder_input_ids` creation * final changes - add colab link to documentation - reduction = mean for loss * fix nits * update link * clearer error message	2023-01-18 11:24:37 +01:00
Shogo Hida	14154f7238	Add Japanese translation to multilingual.mdx (#21084 ) * Create toctree for Japanese translations Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Copy English version Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add Japanese translations Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add Japanese translations Signed-off-by: Shogo Hida <shogo.hida@gmail.com> Signed-off-by: Shogo Hida <shogo.hida@gmail.com>	2023-01-18 10:08:18 +01:00
Wonhyeong Seo	30c12301f8	🌐 [i18n-KO] Translated `installation.mdx` to Korean (#20948 ) docs: ko: installation.mdx	2023-01-18 10:05:23 +01:00
Maria Khalusova	0248810300	Refactoring of the text generate API docs (#21112 ) * initial commit, refactoring the text generation api reference * removed repetitive code examples * Refactoring the text generation docs to reduce repetition * make style	2023-01-17 12:23:48 -05:00
Maria Khalusova	d386fd646a	Add: An introductory guide for text generation (#21090 ) * Part of the "text generation" rework: adding a high-level overview of the text generation strategies * code samples update via make style * fixed a few formatting issues * Apply suggestions from review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fixed spaces, and switched two links to markdown * Apply Steven's suggestions from review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * new lines after headers to fix link rendering * review feedback addressed. added links to image captioning and audio transcription examples * minor capitalization fix * addressed the review feedback * Apply suggestions from review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Applied review suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-01-17 12:23:22 -05:00
Maria Khalusova	868d37165f	Add: tensorflow example for image classification task guide (#21038 ) * Added TF example for image classification * Code style polishing * code style polishing * minor polishing * fixed a link in a tip, and a typo in the inference TF content * Apply Amy's suggestions from review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_classification.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * review feedback addressed * make style * added PushToHubCallback with save_strategy="no" * minor polishing * added PushToHubCallback with save_strategy=no * minor polishing * Update docs/source/en/tasks/image_classification.mdx * added data augmentation Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * make style Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2023-01-17 12:20:08 -05:00
NielsRogge	3a9bd972e2	Add resources (#20872 ) * Add resources * Add more resources * Remove pipeline tag * Add more resources * Add more resources Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2023-01-17 17:42:33 +01:00
Sayak Paul	f3feaf7f22	Change variable name to prevent shadowing (#21153 ) fix: input -> input_string.	2023-01-17 11:29:23 -05:00
NielsRogge	cf028d0c3d	Add batch of resources (#20647 ) * Add resources * Add more resources * Add more resources * Add TAPAS * Fix pipeline tag * Fix pipeline tags * Remove pipeline tag * Remove depth-estimation tag * Update docs/source/en/model_doc/segformer.mdx Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Apply suggestion * Fix segformer Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Maria Khalusova <kafooster@gmail.com>	2023-01-17 17:18:56 +01:00
Sayak Paul	f30bcd5357	feat: add standalone guide on XLA support. (#21141 ) * feat: add standalone guide on XLA support. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Empty commit to trigger CI * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address PR comments. Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-01-17 15:07:59 +01:00
Alara Dirik	2411f0e465	Add Mask2Former (#20792 ) * Adds Mask2Former to transformers Co-authored-by: Shivalika Singh <shivalikasingh95@gmail.com> Co-authored-by: Shivalika Singh <73357305+shivalikasingh95@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-01-16 20:37:07 +03:00
NielsRogge	4ed89d48ab	Add UperNet (#20648 ) * First draft * More improvements * Add convnext backbone * Add conversion script * Add more improvements * Comment out to_dict * Add to_dict method * Add default config * Fix config * Fix backbone * Fix backbone some more * Add docs, auto mapping, tests * Fix some tests * Fix more tests * Fix more tests * Add conversion script * Improve conversion script * Add support for getting reshaped undownsampled hidden states * Fix forward pass * Add print statements * Comment out set_shift_and_window_size * More improvements * Correct downsampling layers conversion * Fix style * First draft * Fix conversion script * Remove config attribute * Fix more tests * Update READMEs * Update ConvNextBackbone * Fix ConvNext tests * Align ConvNext with Swin * Remove files * Fix index * Improve docs * Add output_attentions to model forward * Add backbone mixin, improve tests * More improvements * Update init_weights * Fix interpolation of logits * Add UperNetImageProcessor * Improve image processor * Fix image processor * Remove print statements * Remove script * Update import * Add image processor tests * Remove print statements * Fix test * Add integration test * Add convnext integration test * Update docstring * Fix README * Simplify config * Apply suggestions * Improve docs * Rename class * Fix test_initialization * Fix import * Address review * Fix config * Convert all checkpoints * Fix default backbone * Usage same processor as segformer * Apply suggestions * Fix init_weights, update conversion scripts * Improve config * Use Auto API instead of creating a new image processor * Fix docs * Add doctests * Remove ResNetConfig dependency * Add always_partition argument * Fix rebase * Improve docs * Convert checkpoints Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2023-01-16 09:39:13 +01:00
Shogo Hida	7f65d2366a	Add Spanish translation to community.mdx (#21055 ) * Add community to toctree Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Copy English content Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add some translations Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add some translations Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add some translations Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Fix position of community Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Fix translation Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add translation Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add translation Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add translation Signed-off-by: Shogo Hida <shogo.hida@gmail.com> * Add translation Signed-off-by: Shogo Hida <shogo.hida@gmail.com> Signed-off-by: Shogo Hida <shogo.hida@gmail.com>	2023-01-14 09:25:05 +01:00
Steven Liu	f58248b824	Update task summary part 1 (#21014 ) * first draft of new task summary * make style * review * apply feedback * apply feedbacks * final touches	2023-01-13 11:01:53 -08:00
Steven Liu	8f796960f6	Fix header level (#21072 ) fix header level	2023-01-10 10:24:10 -08:00
Sayak Paul	263fd3c4c7	add: task guide on video classification model fine-tuning. (#20827 ) * add: task guide on video classification model fine-tuning. * apply make style from hf-formatting. * add: toc entry. * chore: address PR comments. Co-authored-by Maria Khalusova * Reflect Maria's contributions. Co-authored-by: Maria Khalusova <1065417+MKhalusova@users.noreply.github.com> * chore: minor correction. * Apply suggestions from code review Co-authored-by: Nathan Raw <nxr9266@g.rit.edu> * PyTorch Video -> PyTorchVideo. * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change licensing year. * minor rewording. * apply make style. * address Sylvain's comments. * replace links. Co-authored-by: Maria Khalusova <1065417+MKhalusova@users.noreply.github.com> Co-authored-by: Nathan Raw <nxr9266@g.rit.edu> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-01-05 00:43:40 +05:30
Maria Khalusova	b493fee958	Add: doc page for the object detection task (#20925 ) * Added Object Detection task guide (new branch) * Polished code examples after running make style * Update docs/source/en/tasks/object_detection.mdx Rephrasing suggestion from Sayak Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx A rephrasing suggestion from Sayak Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx typo Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update 
docs/source/en/tasks/object_detection.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Applied reviewers suggestions > > Co-authored-by: sayakpaul <spsayakpaul@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com> * polished code examples * Added a visualization of the inference result. Slightly changed hyperparameters, and updated the results. * polished code examples * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/object_detection.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Applying Steven's review suggestions Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * minor punctuation fix Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-01-04 08:36:37 -05:00
Jongjyh	ce85686a1f	Add AltCLIP (#20446 ) * add altclip * update * fix wrong title * fix the copyright in readme * add altclip model * add altclip * fix test_gradient_checkpointing_enable_disable * code * add return class * add projection_state * "fix pretrained model bug" * delete print and fix 2 test instances. * delete token * rm xlmr * one model one file. * empty commit to trigger CI * Fix modeling_outputs.py * Fix __init__ * Fix quality * Fix modeling file docstring * Fix README.md * Fix test file * add vision model * empty commit to trigger CI * fix * fix * fix * fix * fix * fix * fix * fix * fix * del token in mdx file * fix * fix * fix * remove altrob from test list * add vision test * fix fx * fix * fix * fix * trigger CI * fix copies * fix tests * fix style * fix quality * update * recover import * recover * add , * recover * fix copies * trigger CI * fix * some of review * update * remove import * last 2 * fix * fix style * fix style * fix bug * fix uncomment * fix * update * fix * second review * empty commit to trigger CI * empty commit to trigger CI * fix position * fix * empty commit to trigger CI * empty commit to trigger CI * third comment * Update docs/source/en/model_doc/altclip.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update docs/source/en/model_doc/altclip.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/configuration_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/modeling_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/processing_altclip.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/altclip/modeling_altclip.py Co-authored-by: Yih-Dar 
<2521628+ydshieh@users.noreply.github.com> * fix merge * fix copies * update * update * empty commit to trigger CI * fix code example * empty commit to trigger CI * fix * empty commit to trigger CI * empty commit to trigger CI Co-authored-by: shunxing1234 <xw747777271@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: shunxing1234 <33774367+shunxing1234@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-01-04 09:18:57 +01:00
Alara Dirik	cd2457809f	Improve OWL-ViT postprocessing (#20980 ) * add post_process_object_detection method * style changes	2023-01-03 19:25:09 +03:00
NielsRogge	9c6f7485a6	Add GIT (GenerativeImage2Text) (#20295 ) * First draft * Make model instantiation work * Fix copied from statement * More fixes * Add correct output head * Improve configuration * Add conversion script * Improve conversion script * Remove token_type_ids * Fix conversion of projection layers * Convert all weights * Use cats image * Make logits match * Generate caption on cats image * Add GITProcessor * Update conversion script * Add support for more checkpoints * Fix conversion script * Add initial tests * Remove cross-attention * More improvements * Remove is_decoder * Improve model tests * Improve tests * Improve model outputs * Fix model outputs equivalence * Fix more tests * Remove unused code * Use generate to generate text, no use of cache for now * Use generate more appropriately * Fix config tests * Fix style * Add support for use_cache Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix style * Fix GIT vision encoder * Update README * Fix integration test * Set bos and eos token ids * Improve docs * Improve code * Add support for provided attention_mask * Add copied from statement * Fix gradient checkpointing test * Set model_input_names * Investigate model_input_names * Remove script * Fix model inputs * Fix docstring * Rename GIT to Git * Support more models * Add support for textvqa model * Add video support * Extend conversion script for video * Add support for large variant * Add support for more models * Fix config archive map * Update integration test * Fix README * Fix CLIP mean and std * Update processor * Fix use_cache for video, thanks @gante * Remove print statements * Remove assertion * Add processor tests * Fix model_input_names * Use Auto API for processor * Fix processor tests * Fix integration test * Fix pipeline test * Make tests faster * Update conversion script * Update conversion script * Convert more checkpoints * Update conversion script * Fix typo * Update docstrings * Improve code snippets * 
Fix doc tests * Add more code examples * Fix doc tests * Add integration tests * Fix unused variable * revert * Add GIT to Japanese README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-01-03 14:17:18 +01:00
Konstantin Kotik	367fdf3330	`MinNewTokensLengthLogitsProcessor` for `.generate` method #20814 (#20892 ) * feat: add min new length logit processor * test: add min new length logit processor * docs: add MinNewTokensLengthLogitsProcessor * feat: import MinNewTokensLengthLogitsProcessor * fix: update pytorch dummy objects * refactor & fix: rename attributes and var and get rid of dynamic attribute * tests: align test with new interface * docs: fix typo * docs: minor clarification * Empty-Commit * empty commit * run automated quality edits Co-authored-by: Joao Gante <joao@huggingface.co>	2023-01-03 06:29:02 -05:00
Alex Hedges	0b686a8a1e	Remove non-breaking spaces (#20929 ) * Remove non-breaking space in comment It was likely added unintentionally. * Remove remaining non-breaking spaces	2022-12-29 02:12:40 -05:00
Yih-Dar	5fa0b17c3d	[Past CI] 🔥 Leave Past CI failures in the past 🔥 (#20861 ) * torch.jit._state * Fix past CI * Fix for perceiver * Fix REALM * Fix for Bloom * Fix for SwinMode * Fix for TrajectoryTransformerModel * Fix for test_wav2vec2_with_lm * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-27 18:37:25 +01:00
Eli Simhayev	e35bc46af6	fix docs typos in "add_new_model" (#20900 ) fix Jupyter typos	2022-12-27 02:49:15 -05:00
Kamal Raj Kanakarajan	d1b3011292	Update flan-t5 original model link (#20897 ) Update flan-t5.mdx	2022-12-27 02:26:14 -05:00
Nathan Barry	47146721b8	typo fix (#20891 )	2022-12-26 02:06:23 -05:00
Syed Abdul Gaffar Shakhadri	15bc776fec	Add Onnx Config for PoolFormer (#20868 ) poolformer onnx Co-authored-by: syed <syed.abdul@sandlogic.com>	2022-12-23 01:30:57 -05:00
Maria Khalusova	04c560225b	Adding `evaluate` to the list of libraries required in generated notebooks (#20850 ) Adding `evaluate` to the list of libraries to be installed for every generated notebook in transformers	2022-12-21 14:04:08 +01:00
Younes Belkada	0d284bd574	Add BLIP (#20716 ) * add new model like * add v1 * v1 * v1 * vision encoder logits match * v2 * fix * add docstring * CI tests pass * fix tests * make fixup * add to `toctree` * fix processors * fix processors * fix doc * fill title * add content doc * remove from tokenization auto * fix config * change order * add `# Copied from` * few fixes - add correct license on modeling text - remove dummy argument * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * replace name * refactor a bit * more refactor * remove unused arg * make fixup + remove some `# Adapted from ...` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more `# Copied from` * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * now `generate` supports no prefix * remove `FeatureExtractor` * fix path * correct dependency * fix tests * few fixes * add integration tests * add correct conversion script * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add `blip` to tokenization auto * fix docstrings * fix test + add image * remove processor from uncorrect place * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean up a bit * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * clean pixel mask * clean pixel mask * fix `F` * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix output * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix pad token id * remove 
`token_type_ids` * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * make fixup * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add comments * Update src/transformers/models/blip/modeling_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove `token_type_ids` * make fixup * better name * replace with `image_attention_mask` * refactor * make fixup * better docstring * replace `answer_xx` * remove ununsed args * add `labels` * add `labels` * fix processing tests * make fixup * make fixup * put correct repo * remove `pad` * remove `crop` and `center_crop` * Update src/transformers/models/blip/image_processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix * remove `size_divisor` * fix weights `init` * remove unneeded functions * add suggestions * minor changes - change slow test output for PT 1.13 - docstring order * replace `feature_extractor` by `image_processor` * fix doctests * fix weight init order + add fp16 slow test * add `blip` to doctest * add correct repo name and fix test * Update src/transformers/models/blip/processing_blip.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix tests * use `convert_to_rgb` from `image_transforms` * make fixup * fix large loading issue Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-21 09:39:10 +01:00
Steven Liu	3be028bc9d	Embed circle packing chart for model summary (#20791 ) * embed circle packing chart * trim whitespace from bottom * explain bubble sizes	2022-12-20 10:26:52 -08:00
stanleycai95	bdb84e2bad	Add model resources for ViT (#20723 ) * Set up overall resources documentation structure * Update vit.mdx * Removing irrelevant sections on text models * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx * Update vit.mdx	2022-12-19 10:59:34 -08:00
Andreas Madsen	b4b613b102	Implement Roberta PreLayerNorm (#20305 ) * Copy RoBERTa * formatting * implement RoBERTa with prelayer normalization * update test expectations * add documentation * add convertion script for DinkyTrain weights * update checkpoint repo Unfortunately the original checkpoints assumes a hacked roberta model * add to RoBERTa-PreLayerNorm docs to toc * run utils/check_copies.py * lint files * remove unused import * fix check_repo reporting wrongly a test is missing * fix import error, caused by rebase * run make fix-copies * add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS * Fix documentation <Facebook> -> Facebook Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup: Fix documentation <Facebook> -> Facebook Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add missing Flax header Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * expected_slice -> EXPECTED_SLICE Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update copies after rebase * add missing copied from statements * make fix-copies * make prelayernorm explicit in code * fix checkpoint path for the original implementation * add flax integration tests * improve docs * update utils/documentation_tests.txt * lint files * Remove Copyright notice Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fix-copies * Remove EXPECTED_SLICE calculation comments Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-19 09:30:17 +01:00
NielsRogge	26dd041c6e	Add Swin2SR (#19784 ) * First draft * Add more improvements * Improve forward pass * Fix layernorm * Add upscaler * More improvements * More improvements * More improvements * Improve conversion script * Add preprocessing * Make output match original implementation * Add additional attributes * Add support for more models * Support more models * Add support for real world sr * Add initial Swin2SRFeatureExtractor * Add ImageSuperResolutionOutput * Make more tests pass * Use BaseModelOutput * Fix one more test * Fix more tests * Fix another test * Fix all tests * Rename to Swin2SRImageProcessor * Fix toctree * Fix toctree * Fix rebase * Improve Swin2SRImageProcessor * Remove feature extractor file * Improve model * Improve conversion script * Fix integration test * Fix init * Fix conversion script * Address comments * Improve upsampler * Add NearestConvUpsampler * Improve pixel shuffle upsampler * Improve auxiliary upsampler * Improve conversion script * Rename conv_last to final_convolution * Fix rebase * Improve upsample module * Add padding to image processor * Fix bug * Update padding * Remove print statement and fix integration test * Improve docs * Add image processor tests * Convert all checkpoints, fix tests * Remove print statements * Fix import Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-16 16:24:01 +01:00
NielsRogge	7f99861218	Add Universal Segmentation class + mapping (#20766 ) * Add mapping * Add mapping to pipeline * Apply suggestions * Fix feature extractor tests * Use ForInstance, add model to universal mapping * More fixes * Remove model from deprecated objects Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-16 14:22:46 +01:00
Joao Gante	4bc723f87d	Generate: use `GenerationConfig` as the basis for `.generate()` parametrization (#20388 ) * generate from config mvp * fix failing tests * max_time test * Load default gen config at model load time; Update docs * further documentation; add tests * adapt rag to the new structure * handle models not instantiated with from_pretained (like in tests) * better default generation config * add can_generate fn * handle legacy use case of ad hoc model config changes * initialize gen config from config in individual methods, if gen config is none * fix _get_decoder_start_token_id when called outside GenerationMixin * correct model config load order (set attr > model config > decoder config) * update rag to match latest changes * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * load gen config from model config in model.from_pretrained * fix can_generate fn * handle generate calls without a previous from_pretrained (e.g. tests) * add legacy behavior (and a warning) * lower logger severity Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-15 18:27:20 +00:00
Nicolas Patry	ba9da49aa2	Fixing the pipeline tutorial test (#20746 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-13 19:08:30 +01:00
Hazrul Akmal	f28c918c7e	Add docs xlm roberta (#20742 ) * added model resources for xlm-roberta * added model resources for xlm-roberta * resolve suggested changes * add resources to xlm-roberta	2022-12-13 09:25:55 -08:00
Ariel Ekgren	5f94855dc3	Add gpt-sw3 model to transformers (#20209 ) * Add templates for gpt-sw3 * Add templates for gpt-sw3 * Added sentencepiece tokenizer * intermediate commit with many changes * fixed conflicts * Init commit for tokenization port * Tokenization progress * Remove fast tokenizer * Clean up and rename spm.model -> spiece.model * Remove TF -> PT conversion script template, Clean up Megatron -> PT script * Optimize encode & decode performance * added new attention * added new attention * attention for gpt-sw3 working * attention good * Cache is now working * fixed attention mask so that it works with causal attention * fixed badbmm bug for cpu and caching * updated config with correct parameters * Refactor and leave optimizations as separate functions to avoid breaking expected functionality * Fix special tokens mapping for both tokenizers * cleaning up of code and comments * HF compatible attention outputs * Tokenizer now passing tests, add documentation * Update documentation * reverted back to base implementation after checking that it is identical to pretrained model * updated gpt-sw3 config * updated conversion script * aligned parameters with gpt-sw3 config * changed default scale_attn_by_inverse_layer_idx to true * removed flag from conversion script * added temporary model path * reverted back to functioning convert script * small changes to default config * updated tests for gpt-sw3 * make style, make quality, minor cleanup * Change local paths to testing online repository * Change name: GptSw3 -> GPTSw3 * Remove GPTSw3TokenizerFast references * Use official model repository and add more model sizes * Added reference to 6.7b model * Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel * Remove pointers to non-existing TFGPTSw3 * Add GPTSw3 to docs/_toctree.yml * Remove TF artifacts from GPTSw3 in __init__ files * Update README:s with 'make fix-copies' * Add 20b model to archive list * Add documentation for 
GPT-Sw3 * Fix typo in documentation for GPT-Sw3 * Do 'make fix-copies' again after having updated docs * Fix some typos in docs * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Resolve comments from PR feedback * Resolve more comments from PR feedback, also set use_cache=True in convert script * Add '# Copied from' comments for GPTSw3 modeling * Set 'is_parallelizable = False' * Remove '# Copied from' where code was modified and add 'with x->y' when appropriate * Remove parallelize in mdx * make style, make quality * Update GPTSw3Config default values and corresponding documentation * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available * Make style, make quality * Add dummy object for GPTSw3Tokenizer via 'make fix-copies' * make fix-copies * Remove GPTSw3 modeling classes * make style, make quality * Add GPTSw3 auto-mappings for other GPT2 heads * Update docs/source/en/model_doc/gpt-sw3.mdx Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove old TODO-comment * Add example usage to GPTSw3Tokenizer docstring * make style, make quality * Add implementation details and example usage to gpt-sw3.mdx Co-authored-by: JoeyOhman <joeyoh@kth.se> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-12 13:12:13 -05:00
Matt	c1b9a11dd4	Convert tokenizer outputs for Keras in doc example (#20732 ) * Convert tokenizer outputs for Keras in doc example * Also fix the German example	2022-12-12 16:14:04 +00:00
Juanjo do Olmo	0ba94aceb6	Spanish translation of the file debugging.mdx (#20566 ) * Create and translate to Spanish debugging.mdx * solved typo error in a header * Update debugging.mdx * Update debugging.mdx * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update _toctree.yml Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-12 10:38:56 -05:00
stanleycai95	17c742bbf5	Very small edit to change name to OpenAI GPT (#20722 )	2022-12-12 09:43:43 -05:00
Alberto Mario Ceballos-Arroyo	8286af6f54	Spanish translation of asr.mdx and add_new_pipeline.mdx (#20569 ) * Fix minor typo in question_answering.mdx * Fixes minor typo in the english version of tasks/asr.mdx * Update _toctree.yml * Translate add_new_pipeline.mdx into Spanish * Fixes some typos in the English version of add_new_pipeline.mdx * Translate asr.mdx into Spanish * Fixes small typos in add_new_pipeline.mdx * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero: use "biblioteca" instead of "librería." Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Suggestion by @osanseviero. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>	2022-12-12 09:23:23 -05:00
Sylvain Gugger	799cea64ac	Fix rendering issue in quicktour (#20708 ) * Fix rendering issue in quicktour * Separate in two blocks	2022-12-09 13:51:35 -05:00
Michael Benayoun	6a062a3ed9	Change transformers.onnx to use optimum.exporters.onnx (#20529 ) * Change transformers.onnx to use optimum.exporters.onnx * Update doc * Remove print * Fix transformers.onnx cli * Update documentation * Update documentation * Small fixes * Fix log message * Apply suggestions * Update src/transformers/onnx/__main__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions * Add missing line break * Ran make fix-copies * Update src/transformers/onnx/__main__.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update src/transformers/onnx/__main__.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Michael Benayoun <michael@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-12-09 10:42:02 +01:00
Nathan Raw	9e56aff58a	Add video classification pipeline (#20151 ) * 🚧 wip video classification pipeline * 🚧 wip - add is_decord_available check * 🐛 add missing import * ✅ add tests * 🔧 add decord to setup extras * 🚧 add is_decord_available * ✨ add video-classification pipeline * 📝 add video classification pipe to docs * 🐛 add missing VideoClassificationPipeline import * 📌 add decord install in test runner * ✅ fix url inputs to video-classification pipeline * ✨ updates from review * 📝 add video cls pipeline to docs * 📝 add docstring * 🔥 remove unused import * 🔥 remove some code * 📝 docfix	2022-12-08 16:22:43 -05:00
Sylvain Gugger	9cc65f8701	Migrate torchdynamo to torch.compile (#20634 ) * Migrate torchdynamo to torch.compile * Add docstring and generic option * Properly use the function... * Reorg args	2022-12-08 11:18:52 -05:00
Cole Howard	fc95386ea1	Add TFBartForSequenceClassification (#20570 ) * read to load * base functionality * revert init * fix dummy data * moving right along * moving right along * finally * cleanup * pull out comment * add test * update docstring for main class * flake comments and rewriting copies from `make repo-consistency` * remove irrelevant differences/accidental spaces * put copies back after space removals * mid * final test pass * stray comment * update test file * update test file * fixup * black * missed * black missed one more * style * add doc update * fix order of output class * comment * Revert "comment" This reverts commit `03f86b6948`. * remove redundant function, and redundant reshape * move change out of common * style * put common spaces back * reorder kwargs in output * doc style	2022-12-07 18:05:39 +01:00
NielsRogge	d151a8c550	Add BiT + ViT hybrid (#20550 ) * First draft * More improvements * Add backbone, first draft of ViT hybrid * Add AutoBackbone * More improvements * Fix bug * More improvements * More improvements * Convert ViT-hybrid * More improvements * add patch bit * Fix style * Improve code * cleaned v1 * more cleaning * more refactoring * Improve models, add tests * Add docs and tests * Make more tests pass * Improve default backbone config * Update model_type * Fix more tests * Add more copied from statements * More improvements * Add push to hub to conversion scripts * clean * more cleanup * clean * replace to * fix * Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix base model prefix * more cleaning * get rid of stem * clean * replace flag * Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add check * another check * fix for hybrid vit * final fix * update config * fix class name * fix `make fix-copies` * remove `use_activation` * Update src/transformers/models/bit/configuration_bit.py * rm unneeded file * Add BiT image processor * rm unneeded file * add doc * Add image processor to conversion script * Add ViTHybrid image processor * Add resources * Move bit to correct position * Fix auto mapping * Rename hybrid to Hybrid * Fix name in toctree * Fix READMEs' * Improve config * Simplify GroupNormActivation layer * fix test + make style * Improve config * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * remove comment * remove comment * replace * replace * remove all conv_layer * refactor norm_layer * revert x * add copied from * last changes + integration tests * make fixup * Apply suggestions from code 
review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix name * fix message * remove assert and refactor * refactor + make fixup * refactor - add + sfety checker * fix docstring + checkpoint names * fix merge issues * fix function name * fix copies * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix model checkpoint * fix doctest output * vit name on doc * fix name on doc * fix small nits * fixed integration tests * final changes - slow tests pass Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-07 11:03:39 +01:00
Samuel Xu	e842e181df	Documentation fixes (#20607 )	2022-12-06 07:32:46 -05:00
Nicolas Patry	28f3d431d4	Rework the pipeline tutorial (#20437 ) * [WIP] Rework the pipeline tutorial - Switch to `asr` instead of another NLP task. - It also has simpler to understand results. - Added a section with interaction with `datasets`. - Added a section with writing a simple webserver. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Addressing comments. * Links. * Fixing docs format. * Adding pipeline_webserver to _toctree. * Warnig -> Tip warnings={true}. * Fix link ? * Links ? * Fixing link, adding chunk batching. * Oops. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/pipeline_tutorial.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-12-06 10:47:31 +01:00
Steven Liu	720e9599c1	Split autoclasses on modality (#20559 ) * split autoclasses on modality * apply review * auto classes	2022-12-05 12:28:44 -08:00
Steven Liu	7d1c1c5b21	Fix code sample in preprocess (#20561 ) * change to image_processor * apply review	2022-12-05 11:49:43 -08:00
Francisco Kurucz	ac3bccdc74	Fix link to Swin Model contributor novice03 (#20557 )	2022-12-05 11:42:29 -05:00
Erin	87282cb73c	Add RemBERT ONNX config (#20520 ) * rembert onnx config * formatting Co-authored-by: Ho <erincho@bcd0745f972b.ant.amazon.com>	2022-12-05 11:39:09 -05:00
Kamal Raj Kanakarajan	13e736685a	Add BioGPT (#20420 ) * biogpt initial commit * updated init * fix faster decoding with use_cache * 1. fix input_ids and input_embeds with correct device 2. added _keys_to_ignore_on_load_missing 3. updated prepare_inputs_for_generation * add activation_dropout and scale_embedding * replace fsmt attention with bart attention * added test * run make fix-copies * doc init and fix build * updated README with proper information * 1. added tips to docs 2. updated BioGptTokenizer func * 1. added tokenizer test 2. refactor tokenizer * make fixup * add biogpt fairseq to hf converter * updated layer names more similar to original checkpoints * config update doc string and set defaults * added "#copied" from bart model and updated doc strings * enable model_input_names in tokenizer * 1. positionalembedding depending on attention_mask 2. added attention mask to prepare for generation * added test to verify past and generation * BioGptLMHeadModel -> BioGptForCausalLM * fix typo * tokenization and test Copyright and updated assertion * updated Copyright and one func at time in line * Copyright updates and minor doc fix * replace assertion with ValueError * rm extra space * added code syntax * revert cmnt position change * add tokenizer to auto * updated doc string * tokenizer doc string update * biogpt hub model update to microsoft/biogpt * make fixup * rm cmnt to fix flake8 5.0.4 vs 6 error	2022-12-05 10:12:03 -05:00
szhublox	699e90437f	flan-t5.mdx: fix link to large model (#20555 )	2022-12-02 19:27:46 +01:00
fatih	cc3d0e1b01	[New Model] Add TimeSformer model (#18908 ) * init timesformer * apply fix-copies * reformat style * revert back some incoorect style updates * init timesformer * apply fix-copies * reformat style * revert back some incoorect style updates * update timseformer doc * add some functions and classes * add new config params * implement multiple classes * update TimeSformerLayer * update TimeSformerModel, TimeSformerPreTrainedModel, TimeSformerEncoder * several fixes * reformat * temporary update * fix some typos * fix weight converter * more fixes * fix a typo * fix typo * remove redundant params * fix for latest hf-hub * merge fix * fix some checks * video classification works with einops * add paper info to docs * merge fix * remove redundant line * remove redundant docstring * update config * fix some typos * fix converter * update some test constants * refactor einops functions * reformat * fix a comment * remove redundat imports * reformat * fix a typo * remove comment * remove unused imports * remove redundant doc line * reformat * add missing line * fix docs * fix timesformer auto feat ext * add unittests * reformat * fix docs * some fixes and updates * fix readme * fix modeling * fix readme * update index * revert _toctree.yml changes * update timseformer.mdx * update drop_path_prob to drop_path_rate * add dosctring for drop_path_rate * update TimeSformerPatchEmbed naming * remove to_2tuple * explicit use of nn.functional * reformat * many updates from review comments * fix a typo * reformat * remove assert, better variable name * make variable names more explicit * add some adapted from * more explicit variable names * remove redundant docstring * fix initilaization * move permute inside embedding * update class names * remove unused imports * add test for video classification * update PretrainedModel with PreTrainedModel * remove double permute * update based on sylvain's review * aply auto fix * update image_processing_auto for timesformer * 
update hub urls * reformat * remove duplicate import * update doc link	2022-12-02 09:13:25 +01:00
Younes Belkada	8b486c0310	add doc for (#20525 )	2022-12-01 16:52:13 +01:00
Yang An	721764028e	Add Chinese-CLIP implementation (#20368 ) * init chinese-clip model from clip * init model tests and docs * implement chinese-clip into hf * implement chinese-clip into hf * implement chinese-clip into hf * implement chinese-clip into hf * implement chinese-clip into hf * update usecase example in model implementation * fix codestyle * fix model_type typo in readme * add placeholder in doc * add placeholder in doc * update the init script * update usecase * fix codestyle * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * update testcase * forward the convert_rgb * update testcase * update testcase * update testcase * merge the recent update from clip about model_input_name property * update the doc * update the doc * update the doc * update the doc * remove unused imports * reformat code style * update the doc * fix isort style * bypass a weird failed unit test which is unrelated with my PR * update the doc * implement independent vision config class * implement independent vision model class * fix refactor bug * fix refactor bug * fix refactor bug * make style * fix refactor bug * make style * fix refactor bug * fix refactor bug * make style * fix refactor bug * fix refactor bug * doc-build restyle * implement independent text config class * implement independent text model class * implement independent text model class * make style * make fix-copies * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * fix refactor bug * make style * update doc * black and isort * update doc * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/tokenization_auto.py 
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * modify the model type from chinese-clip to chinese_clip * format the example comment of ChineseCLIPVisionConfig * correct the copyright comment * fix the tokenizer specification * add copied from for loss function * remove unused class * update CHINESE_CLIP_TEXT_INPUTS_DOCSTRING * update CHINESE_CLIP_INPUTS_DOCSTRING * update doc * update doc * update code comment in config * update copied from statement * make style * rename the doc file * add copied statement * remove unused attention_mask, causal_attention_mask in ChineseCLIPVisionEncoder * remove ChineseCLIPTextPreTrainedModel * fix bug * fix bug * fix bug * update doc * make style * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update ChineseCLIPImageProcessor in image_processing_auto * fix config_class of chinesecliptextmodel * fix the test case * update the docs * remove the copied from comment for ChineseCLIPTextModel, since it has diverged from BertModel with customed config_class * update the testcase * final fix Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-11-30 19:22:23 +01:00
Sylvain Gugger	08b4621899	Repurpose torchdynamo training args towards torch._dynamo (#20498 ) * Repurpose torchdynamo training args towards torch._dynamo * Add doc	2022-11-30 11:10:45 -05:00
Julian Pollmann	829374e4fc	Fix Typo in Docs for GPU (#20509 )	2022-11-30 10:41:18 -05:00
amyeroberts	17a7b49bda	Update doc examples feature extractor -> image processor (#20501 ) * Update doc example feature extractor -> image processor * Apply suggestions from code review	2022-11-30 14:50:55 +00:00
amyeroberts	de6d19ea92	Add segmentation + object detection image processors (#20160 ) * Add transforms for object detection * DETR models + Yolos * Scrappy additions * Maskformer image processor * Fix up; MaskFormer tests * Update owlvit processor * Add to docs * OwlViT tests * Update pad logic * Remove changes to transforms * Import fn directly * Update to include pad transformation * Remove uninstended changes * Add new owlvit post processing function * Tidy up * Fix copies * Fix some copies * Include device fix * Fix scipy imports * Update _pad_image * Update padding functionality * Fix bug * Properly handle ignore index * Fix up * Remove defaults to None in docstrings * Fix docstrings & docs * Fix sizes bug * Resolve conflicts in init * Cast to float after resizing * Tidy & add size if missing * Allow kwards when processing for owlvit * Update test values	2022-11-30 10:24:03 +00:00
Francisco Kurucz	4aa630eeab	Fix documentation code to import facebook/detr-resnet-50 model (#20491 )	2022-11-29 13:30:26 -05:00
Pi Esposito	fb2b45e562	add in layer gpt2 tokenizer (#20421 ) * add minimal working gpt2 tokenizer * graph mode and output equivalence tests working * not today tensorflow. serialization test passing! * fix style, documentation, docstrings and all that jazz * passing consistency checks * move keras nlp to tf dependencies * fix tf modeling utils and gpt2 attention to enable compiling * fix (I hope) keras nlp dependencies * revert changes on generation * remove debug prints * remove redundant tf dummy objects * add from config, get config and max length settings to address review * let flake ignore the error on distillation you are welcome * test from config * add padding test * address sgugger review	2022-11-29 10:02:40 -05:00
amyeroberts	ae1cffaf3c	Add Donut image processor (#20425 ) * Add Donut image processor * Update src/transformers/image_transforms.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Fix docstrings * Full var names in docstring Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>	2022-11-29 10:38:01 +00:00
NielsRogge	6dc884abc8	[Maskformer] Add MaskFormerSwin backbone (#20344 ) * First draft * Fix backwards compatibility * More fixes * More fixes * Make backbone more general * Improve backbone * Improve test * Fix config checkpoint * Address comments * Use model_type * Address more comments * Fix special model names * Remove MaskFormerSwinModel and MaskFormerSwinPreTrainedModel from main init * Fix typo * Update backbone * Apply suggestion Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-11-28 20:33:49 +01:00
Sayak Paul	d59d5a618b	chore: add link to the video cls notebook. (#20386 ) * chore: add link to the video cls notebook. * chore: segregate as resources.	2022-11-28 12:10:24 -05:00
Wang, Yi	ca3b652bbd	update cpu related doc (#20444 )	2022-11-28 08:54:35 -05:00
Ian C	bc00c29d11	Add Spanish translation of pr_checks.mdx (#20339 ) * Update _toctree and clone original doc * Forgot to translate (lol) * Translate documentation and update toctree * Add suggested changes from review	2022-11-23 15:06:29 -05:00
Ian C	c3eb01013b	Fix toctree for Section 3 in Spanish Documentation (#20360 ) * Order and group topics in the right section * Translate "Computer Vision"	2022-11-21 16:44:34 -05:00
Steven Liu	d896029e27	Add inference section to task guides (#18781 ) * 📝 start adding inference section to task guides * ✨ make style * 📝 add multiple choice * add rest of inference sections * make style * add compute_metric, push_to_hub, pipeline * make style * add updated sequence and token classification * make style * make edits in token classification * add audio classification * make style * add asr * make style * add image classification * make style * add summarization * make style * add translation * make style * add multiple choice * add language modeling * add qa * make style * review and edits * apply reviews * make style * fix call to processor * apply audio reviews * update to better asr model * make style	2022-11-21 10:06:21 -08:00
NielsRogge	4973d2a04c	Add Audio Spectogram Transformer (#19981 ) * First draft * Make conversion script work * Add id2label mapping, run code quality * Fix copies * Add first draft of feature extractor * Update conversion script to use feature extractor * Make more tests pass * Add docs * update input_features to input_values + pad by default to max length * Fix doc tests * Add feature extractor tests * Add proper padding/truncation to feature extractor * Add support for conversion of all audioset checkpoints * Improve docs and extend conversion script * Fix README * Rename spectogram to spectrogram * Fix copies * Add integration test * Remove dummy conv * Update to ast * Update organization * Fix init * Rename model to AST * Add require_torchaudio annotator * Move import of ASTFeatureExtractor under a is_speech_available * Fix rebase * Add pipeline config * Update name of classifier head * Rename time_dimension and frequency_dimension for clarity * Remove print statement * Fix pipeline test * Fix pipeline test * Fix index table * Fix init * Fix conversion script * Rename to ForAudioClassification * Fix index table Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-11-21 18:58:54 +01:00
NielsRogge	96783e53b4	Add resources (#20296 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-11-21 18:24:32 +01:00
Matthijs Hollemans	d21c97cc0f	add MobileNetV1 model (#17799 ) * add model files etc for MobileNetV2 rename files for MobileNetV1 initial implementation of MobileNetV1 fix conversion script cleanup write docs tweaks fix conversion script extract hidden states fix test cases make fixup fixup it all remove main from doc link fixes fix tests fix up use google org fix weird assert * fixup * use google organization for checkpoints	2022-11-21 10:21:28 -05:00
Raj Rajhans	22d7161a52	fix: "BigSicence" typo in docs (#20331 )	2022-11-21 09:44:54 -05:00
Ian C	d28448c5cd	Add Spanish translation of serialization.mdx (#20245 ) * Update _toctree and clone original content * Translate first three sections * Add more translated chapters. Only 3 more left. * Finish translation * Run style from doc-builder * Address recommended changes from reviewer	2022-11-21 08:46:54 -05:00
BFSS	05d80d856c	translate zh quicktour(#20095 ) (#20181 ) * zh quicktour(#20095) * add zh to doc workflow * remove untranslation from toctree Co-authored-by: BeifangSusu <BeifangSusu@bfss.com>	2022-11-21 08:44:18 -05:00
Joao Gante	3de07473da	Generate: add generation config class (#20218 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-11-21 13:30:15 +00:00
Steven Liu	d316037ad7	organize pipelines by modality (#20306 )	2022-11-18 12:06:25 -08:00
Ali Hassani	fc4a993e1b	Add Neighborhood Attention Transformer (NAT) and Dilated NAT (DiNAT) models (#20219 ) * Add DiNAT * Adds DiNAT + tests * Minor fixes * Added HF model * Add natten to dependencies. * Cleanup * Minor fixup * Reformat * Optional NATTEN import. * Reformat & add doc to _toctree * Reformat (finally) * Dummy objects for DiNAT * Add NAT + minor changes Adds NAT as its own independent model + docs, tests Adds NATTEN to ext deps to ensure ci picks it up. * Remove natten from `all` and `dev-torch` deps, add manual pip install to ci tests * Minor fixes. * Fix READMEs. * Requested changes to docs + minor fixes. * Requested changes. * Add NAT/DiNAT tests to layoutlm_job * Correction to Dinat doc. * Requested changes.	2022-11-18 13:08:26 -05:00
amyeroberts	b98269425e	Add padding image transformation (#19838 ) * Add padding transformation * Add in upstream changes * Update tests & docs * Code formatting tuples in docstring	2022-11-18 11:27:21 +00:00
Shogo Hida	389702242d	[Docs] Add resources of OpenAI GPT (#20084 ) * Add resources of OpenAI GPT * Delete Deploy section and add . * Add scripts * Update docs/source/en/model_doc/openai-gpt.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Delete causal-language-modeling section * Add TFOpenAIGPTLMHeadModel * Add resources from community * Delete a link Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-11-16 11:17:32 -05:00
Alara Dirik	a00b7e85ea	Adds image-guided object detection support to OWL-ViT (#20136 ) Adds image-guided object detection method to OwlViTForObjectDetection class as described in the original paper. One-shot/ image-guided object detection enables users to use a query image to search for similar objects in the input image. Co-Authored-By: Dhruv Karan k4r4n.dhruv@gmail.com	2022-11-16 09:07:46 +03:00
Ambuj Pawar	c19aa7acce	Add clip resources to the transformers documentation (#20190 ) * WIP: Added CLIP resources from HuggingFace blog * ADD: Notebooks documentation to clip * Add link straight to notebook Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Change notebook links to colab Co-authored-by: Ambuj Pawar <your_email@abc.example> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-11-15 13:26:46 -05:00
Saad Mahmud	5b62f8ea2b	Add to DeBERTa resources (#20155 ) * Add to DeBERTa resources * Fix mistakes with chapter number * Add fill-mask pipeline * Add sequence, token and QA pipeline * Change token classification pipeline order * Remove flax script and notebook links	2022-11-15 13:26:07 -05:00
Suraj Patil	7f74433814	[CLIP] allow loading projection layer in vision and text model (#18962 ) * allow loading projection in text and vision model * begin tests * finish test for CLIPTextModelTest * style * add slow tests * add new classes for projection heads * remove with_projection * add in init * add in doc * fix tests * fix some more tests * fix copies * fix docs * remove leftover from fix-copies * add the head models in IGNORE_NON_AUTO_CONFIGURED * fix docstr * fix tests * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add docstr for models Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-11-15 17:50:07 +01:00
Muhammad Sakib Khan Inan	777b1bfe62	New logging support to "Trainer" Class (ClearML Logger) (#20184 ) * Init Update * ClearML Callbacks integration * update corrections * args reporting updated * {'tensorboard': False, 'pytorch': False} * ClearML Tests added * add clearml * output_uri=True in Task.init * reformatted integrations.py * reformatted and fixed * IF-ELSE statement issue on "has_clearml" resolved * Add clearml in main callback docs * Add additional clearml documentation * Update src/transformers/integrations.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Accept suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Accept suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Small change in comments * Make style clearml * Accept suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Victor Sonck <victor.sonck@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-11-15 10:08:59 -05:00
Kendall	683cbc4c34	fixed spelling error in testing.mdx (#20220 )	2022-11-15 09:40:06 -05:00
amyeroberts	4c7e8d0900	Add object detection + segmentation transforms (#20003 ) * Add transforms for object detection * Update src/transformers/image_transforms.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Better var names & docstring * Remove unused var desc in docstring * Update src/transformers/image_transforms.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-11-15 12:50:03 +00:00
Younes Belkada	163ac3d3ee	Add Switch transformers (#19323 ) * first commit * add more comments * add router v1 * clean up - remove `tf` modeling files * clean up - remove `tf` modeling files * clean up * v0 routers * added more router - Implemented `ExpertsChooseMaskedRouter` - added tests - 2 more routers to implement * last router * improved docstring - completed the docstring in `router.py` - added more args in the config * v0 sparse mlp * replace wrong naming * forward pass run * update MOE layer * small router update * fixup * consistency * remove scatter router * remove abstract layer * update test and model for integration testing * v1 conversion * update * hardcode hack * all keys match * add gin conversion, without additional libraries * update conversion sctipy * delete router file * update tests wrt router deletion * fix router issues * update expert code * update, logits match, code needsREFACTORING * Refactor code Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * add generate tests Co-authored-by: younesbelkada <younesbelkada@gmail.com> * add support for router loss Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * fix forward error * refactor a bit * remove `FlaxSwitchTransformers` modules * more tests pass * Update code Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * fixup * fix tests * fix doc * fix doc + tokenization * fix tokenizer test * fix test * fix loss output * update code for backward pass * add loss support * update documentation * fix documentation, clean tokenizer * more doc fix, cleanup example_switch * fix failing test * fix test * fix test * fix loss issue * move layer * update doc and fix router capacity usage * fixup * add sparse mlp index for documentation on hub * fixup * test sparse mix architecture * Apply suggestions from code review * Update docs/source/en/model_doc/switch_transformers.mdx * fixup on update * fix tests * fix another test * attempt fix * 
Update src/transformers/models/switch_transformers/configuration_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * try * all tests pass * fix jitter noise * Apply suggestions from code review * doc tests pass * Update src/transformers/models/switch_transformers/modeling_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/switch_transformers/modeling_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove assert * change config order * fix readme japanese * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove parallelizable tests + add one liners * remove ONNX config * fix nits - add `T5Tokenizer` in auto mapping - remove `Switch Transformers` from ONNX supported models * remove `_get_router` * remove asserts * add check in test for `router_dtype` * add `SwitchTransformersConfig` in `run_pipeline_test` * Update tests/pipelines/test_pipelines_summarization.py * add huge model conversion script * fix slow tests - add better casting for `Linear8bitLt` - remove `torchscript` tests * add make dir * style on new script * fix nits - doctest - remove `_keys_to_ignore_on_load_unexpected` * Update src/transformers/models/switch_transformers/configuration_switch_transformers.py * add google as authors * fix year * remove last `assert` statements * standardize vertical spaces * fix failing import * fix another failing test * Remove strange `authorized_keys` * removing todo and padding that is never used Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: ybelkada <younes@huggingface.co> Co-authored-by: Younes Belkada 
<younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur Zucker <arthur@huggingface.co>	2022-11-15 13:06:45 +01:00
bofeng huang	9625924c60	Update tokenizer_summary.mdx (#20135 )	2022-11-15 01:18:13 +01:00
Wonhyeong Seo	8fadfd5035	[docs] set overflowing image width to auto-scale (#20197 ) * docs: fix: set overflowing image width to auto-scale * docs: fix: new language Korean is also affected * docs: fix: unnecessary line break in index page	2022-11-15 01:13:40 +01:00
Wonhyeong Seo	07d8d6e2f7	docs: translated index page to korean (#20180 ) docs: i18n: first draft of index page docs: fix: first revision of index page docs: i18n: missed section - supported frameworks docs: fix: second revision of index page review by @ArthurZucker refactor: remove untranslated files from korean docs: fix: remove untranslated references from toctree.yml feat: enable korean docs in gh actions docs: feat: add in_translation page as placeholder docs: bug: testing if internal toc need alphabet chars docs: fix: custom english anchor for non-alphanumeric headings review by @sgugger docs: i18n: translate comments on install methods in _config.py docs: refactor: more concise wording for translations	2022-11-14 12:09:21 -05:00
Bartosz Szmelczynski	78a471ff71	Fix tapas scatter (#20149 ) * First draft * Remove scatter dependency * Add require_torch * update vectorized sum test, add clone call * remove artifacts * fix style * fix style v2 * remove "scatter" mentions from the code base * fix isort error Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-11-14 01:04:26 -05:00
Matthijs Hollemans	f711d683b5	add MobileNetV2 model (#17845 ) * add model files etc for MobileNetV2 * rename files for MobileNetV1 * initial implementation of MobileNetV1 * fix conversion script * cleanup * write docs * tweaks * fix conversion script * extract hidden states * fix test cases * make fixup * fixup it all * rename V1 to V2 * fix checkpoints * fixup * implement first block + weight conversion * add remaining layers * add output stride and dilation * fixup * add tests * add deeplabv3+ head * a bit of fixup * finish deeplab conversion * add link to doc * fix issue with JIT trace in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. By making them ints, the result of the padding calculation becomes a constant value. * cleanup * fix order of models * fix rebase error * remove main from doc link * add image processor * remove old feature extractor * fix converter + other issues * fixup * fix unit test * add to onnx tests (but these appear broken now) * add post_process_semantic_segmentation * use google org * remove unused imports * move args * replace weird assert	2022-11-14 01:00:10 -05:00
Arthur	61a51f5f23	Add Jukebox model (replaces #16875 ) (#17826 )	2022-11-10 21:05:27 +01:00
NielsRogge	9f0c72f93b	Add doc tests (#20158 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-11-10 15:25:30 +01:00
NielsRogge	93e14486d6	[CLIPSeg] Add resources (#20118 ) * Add resource * Add tag Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-11-09 18:31:22 +01:00
Steven Liu	a44985b41c	add cv + audio labels (#20114 )	2022-11-09 07:40:15 -08:00
Joao Gante	f270b960d6	Generate: move generation_*.py src files into generation/*.py (#20096 ) * move generation_*.py src files into generation/*.py * populate generation.__init__ with lazy loading * move imports and references from generation.xxx.object to generation.object	2022-11-09 15:34:08 +00:00
amyeroberts	4eb918e656	AutoImageProcessor (#20111 ) * AutoImageProcessor skeleton * Update references * Add mapping in init * Add model image processors to __init__ for importing * Add AutoImageProcessor tests * Fix up * Image Processor documentation * Remove pdb * Update docs/source/en/model_doc/mobilevit.mdx * Update docs * Don't add whitespace on json files * Remove fixtures * Move checking model config down * Fix up * Add check for image processor * Remove FeatureExtractorMixin in docstrings * Rename model_tmpfile to config_tmpfile * Don't make None if not in image processor map	2022-11-08 19:54:41 +00:00
Weiwe Shi	efa889d2e4	Add RocBert (#20013 ) * add roc_bert * update roc_bert readme * code style * change name and delete unused file * update model file * delete unused log file * delete tokenizer fast * reformat code and change model file path * add RocBertForPreTraining * update docs * delete wrong notes * fix copies * fix make repo-consistency error * fix files are not present in the table of contents error * change RocBert -> RoCBert * add doc, add detail test Co-authored-by: weiweishi <weiweishi@tencent.com>	2022-11-08 10:03:43 -05:00
NielsRogge	258963062b	Add CLIPSeg (#20066 ) * Add first draft * Update conversion script * Improve conversion script * Improve conversion script some more * Add conditional embeddings * Add initial decoder * Fix activation function of decoder * Make decoder outputs match original implementation * Make decoder outputs match original implementation * Add more copied from statements * Improve model outputs * Fix auto tokenizer file * Fix more tests * Add test * Improve README and docs, improve conditional embeddings * Fix more tests * Remove print statements * Remove initial embeddings * Improve conversion script * Add interpolation of position embeddings * Finish addition of interpolation of position embeddings * Add support for refined checkpoint * Fix refined checkpoint * Remove unused parameter * Improve conversion script * Add support for training * Fix conversion script * Add CLIPSegFeatureExtractor * Fix processor * Fix CLIPSegProcessor * Fix conversion script * Fix most tests * Fix equivalence test * Fix README * Add model to doc tests * Use better variable name * Convert other checkpoint as well * Update config, add link to paper * Add docs * Update organization * Replace base_model_prefix with clip * Fix base_model_prefix * Fix checkpoint of config * Fix config checkpoint * Remove file * Use logits for output * Fix tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-11-08 10:55:47 +01:00
Tom Aarsen	6156bffa2b	Replace awkward timm link with the expected one (#20109 )	2022-11-07 13:57:39 -05:00
Steven Liu	71f772ebd0	Add new terms to the glossary (#20051 ) * add new terms * apply review	2022-11-07 10:45:27 -08:00
Tom Aarsen	d44ac47bac	docs: Fixed variables in f-strings (#20087 ) * docs: Fixed variables in f-strings * Replace unknown `block` with known `block_type` in ValueError Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add missing torch import in docs code block Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-11-07 13:18:09 -05:00
Tom Aarsen	3222fc645b	docs: Resolve many typos in the English docs (#20088 ) * docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance' * docs: Resolve many typos in the English docs Typos found via 'codespell ./docs/source/en'	2022-11-07 09:19:04 -05:00
Tom Aarsen	b8112eddec	Replace unsupported facebookresearch/bitsandbytes (#20093 ) With https://github.com/TimDettmers/bitsandbytes, which is by the same author and is still being updated	2022-11-07 08:52:03 -05:00
Jordan Clive	3bd0007e87	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 ) Co-authored-by: jordiclive <jordiclive19@imperial.ac.uk>	2022-11-04 11:32:44 -04:00
Matt	6e1c5786dc	Update READMEs for ESMFold and add notebooks (#20067 ) * Update READMEs for ESMFold and add notebooks * Fix PyCharm formatting * make fix-copies	2022-11-04 15:10:13 +00:00
Wang, Yi	2564f0c21d	fix jit trace error for model forward sequence is not aligned with jit.trace tuple input sequence, update related doc (#19891 ) * fix jit trace error for classification usecase, update related doc Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * add implementation in torch 1.14.0 Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * update_doc Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * update_doc Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2022-11-03 10:50:03 -04:00
Sanchit Gandhi	06d488061f	[Whisper Tokenizer] Make more user-friendly (#19921 ) * [Whisper Tokenizer] Make more user-friendly * use property * make indexing rigorous * small clean-up * tests * skip seq2seq tests * remove multilingual arg * reorder args * collapse to one function Co-authored-by: ArthurZucker <arthur@huggingface.co> * option to override attributes Co-authored-by: ArthurZucker <arthur@huggingface.co> * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make comment more clear Co-authored-by: sgugger <sylvain@huggingface.co> * don't add special tokens in get_decoder_prompt_ids * add test for set_prefix_tokens Co-authored-by: ArthurZucker <arthur@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sgugger <sylvain@huggingface.co>	2022-11-03 14:22:40 +00:00
Yih-Dar	9ccea7acb1	Fix some doctests after PR 15775 (#20036 ) * Add skip_special_tokens=True in some doctest * For T5 * Fix for speech_to_text.mdx Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-11-03 14:18:45 +01:00
Steven Liu	aa39967b28	reorganize glossary (#20010 )	2022-11-02 16:58:17 -07:00
Yih-Dar	fb7cbe236b	Fix doctest (#20023 ) * Fix doctest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-11-02 19:37:25 +01:00
amyeroberts	a6b7759880	Add Image Processors (#19796 ) * Add CLIP image processor * Crop size as dict too * Update warning * Actually use logger this time * Normalize doesn't change dtype of input * Add perceiver image processor * Tidy up * Add DPT image processor * Add Vilt image processor * Tidy up * Add poolformer image processor * Tidy up * Add LayoutLM v2 and v3 imsge processors * Tidy up * Add Flava image processor * Tidy up * Add deit image processor * Tidy up * Add ConvNext image processor * Tidy up * Add levit image processor * Add segformer image processor * Add in post processing * Fix up * Add ImageGPT image processor * Fixup * Add mobilevit image processor * Tidy up * Add postprocessing * Fixup * Add VideoMAE image processor * Tidy up * Add ImageGPT image processor * Fixup * Add ViT image processor * Tidy up * Add beit image processor * Add mobilevit image processor * Tidy up * Add postprocessing * Fixup * Fix up * Fix flava and remove tree module * Fix image classification pipeline failing tests * Update feature extractor in trainer scripts * Update pad_if_smaller to accept tuple and int size * Update for image segmentation pipeline * Update src/transformers/models/perceiver/image_processing_perceiver.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Update src/transformers/image_processing_utils.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/beit/image_processing_beit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * PR comments - docstrings; remove accidentally added resize; var names * Update docstrings * Add exception if size is not in the right format * Fix exception check * Fix up * Use shortest_edge in tuple in script Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-11-02 11:57:36 +00:00
Steven Liu	79c720c062	fix typo (#20006 )	2022-11-01 11:30:36 -07:00
Steven Liu	ab74ac11e4	Add LayoutLMv3 resource (#19932 ) * add layoutlmv3 resource * add layoutlmv2 resources * fix button	2022-11-01 11:10:46 -07:00
Steven Liu	dec8578e70	Add BERT resources (#19852 ) * add resources for bert * add course chapters * apply reviews * add pipeline icons and community resource * fix buttons	2022-11-01 11:09:53 -07:00
Steven Liu	1f6885bad0	add dataset (#20005 )	2022-11-01 10:37:20 -07:00
Sayak Paul	c87ae86a8f	Update image_classification.mdx (#19996 )	2022-11-01 07:54:41 -04:00
Mohit Sharma	c796b6dea6	Added onnx config whisper (#19525 ) * Added onnx config whisper * added whisper support onnx * add audio input data * added whisper support onnx * fixed the seqlength value * Updated the whisper onnx config * restore files to old version * removed attention mask from inputs * Updated get_dummy_input_onnxruntime docstring * Updated relative imports and token generation * update docstring	2022-11-01 07:50:42 -04:00
Matt	7f9b7b3f0e	Add ESMFold (#19977 ) * initial commit * First draft that gets outputs without crashing! * Add all the ported openfold dependencies * testing * Restructure config files for ESMFold * Debugging to find output discrepancies * Mainly style * Make model runnable without extra deps * Remove utils and merge them to the modeling file * Use correct gelu and remove some debug prints * More cleanup * Update esm docs * Update conversion script to support ESMFold properly * Port some top-level changes from ESMFold repo * Expand EsmFold docstrings * Make attention_mask optional (default to all 1s) * Add inference test for ESMFold * Use config and not n kwargs * Add modeling output class * Remove einops * Remove chunking in ESM FFN * Update tests for ESMFold * Quality * REpo consistency * Remove tree dependency from ESMFold * make fixup * Add an error in case my structure map function breaks later * Remove needless code * Stop auto-casting the LM to float16 so CPU tests pass * Stop auto-casting the LM to float16 so CPU tests pass * Final test updates * Split test file * Copyright and quality * Unpin PyTorch to see built doc * Fix config file to_dict() method * Add some docstrings to the output * Skip TF checkpoint tests for ESM until we reupload those * make fixup * More docstrings * Unpin to get even with main * Flag example to write Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-10-31 21:32:58 -04:00
Jean Charles Kouame	6aede2d602	Transformers documentation translation to Italian #17459 (#19988 )	2022-10-31 13:19:15 -04:00
NielsRogge	0b294c2334	[Conditional, Deformable DETR] Add postprocessing methods (#19709 ) * Add postprocessing methods * Update docs * Add fix * Add test * Add test for deformable detr postprocessing * Add post processing methods for segmentation * Update code examples * Add post_process to make the pipeline work * Apply updates Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-10-31 08:28:44 +01:00
Steven Liu	2e35bac4e7	Add wav2vec2 resources (#19931 ) * add wav2vec2 resources * apply review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2022-10-28 13:28:18 -07:00
Steven Liu	9d2788b46b	add resources for distilbert (#19930 )	2022-10-28 13:16:07 -07:00
Steven Liu	b0a2c3a2d6	add resources for bart (#19928 )	2022-10-28 13:15:43 -07:00
Raghav Prabhakar	0d4c45c585	Add Onnx Config for ImageGPT (#19868 ) * add Onnx Config for ImageGPT * add generate_dummy_inputs for onnx config * add TYPE_CHECKING clause * Update doc for generate_dummy_inputs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-28 09:39:53 -04:00
Steven Liu	e4132952a1	Add GPT2 resources (#19879 ) * add resources for gpt2 * add pipeline icons and community resources	2022-10-27 11:34:00 -07:00
Steven Liu	d818dd3a41	Add BLOOM resources (#19881 ) * add bloom resources * add pipeline icon	2022-10-27 11:33:52 -07:00
Steven Liu	50f5266b2c	Add T5 resources (#19878 ) * add resources for t5 * add pipeline icons and community resources	2022-10-27 11:33:37 -07:00
Steven Liu	536a8ae6ad	Add RoBERTa resources (#19911 ) * add roberta resources * fix typo	2022-10-27 11:33:15 -07:00
Younes Belkada	7a1c68a845	Add `flan-t5` documentation page (#19892 ) * add `flan-t5` documentation page * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add more content * revert `_toctree` modif * revert `toctree` modif - 2 * Update README.md * Revert "Update README.md" This reverts commit `5660714429`. * Update README_es.md * Update README_zh-hans.md * Update README_zh-hant.md * Update README_ko.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-26 17:22:57 +02:00
Lysandre Debut	eedaba682f	[Past CI] Vilt only supports PT >= v1.10 (#19851 ) * Support for Vilt in v1.9 * Skip if not higher or equal than 1.10 * Move test :) * I am bad at python	2022-10-25 15:59:35 +02:00
Davi Alves	0a77249178	Added translation of serialization.mdx to Portuguese Issue #16824 (#19869 ) * [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. * [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial. * [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial. * [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial. * [ serialization.mdx ] - Translated to Portuguese the serialization tutorial.	2022-10-25 09:34:28 -04:00
Alberto Mario Ceballos-Arroyo	371337a95b	Spanish translation of multiple_choice.mdx, question_answering.mdx. (#19821 ) * Translated multiple_choice.mdx, question_answering.mdx. Added them to _toctree.yml * Added translation for a missed line. * Update _toctree.yml as per Omar's suggestions * Update multiple_choice.mdx as per Omar's comments * Updt question_answering.mdx as per Omar's comments	2022-10-24 20:11:34 -04:00
Steven Liu	9ecb13d63a	add small updates only (#19847 )	2022-10-24 10:18:20 -07:00
Yih-Dar	072ed01c38	Fix doctest for `MarkupLM` (#19845 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-10-24 17:54:23 +02:00
Dhruv Singal	5cbf1fa8ca	fixed typo in fp16 training section for perf_train_gpu_one (#19736 )	2022-10-24 10:04:28 -04:00
Davi Alves	743995e0e6	Added translation of converting_tensorflow_models.mdx to Portuguese Issue #16824 (#19824 ) * [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. * [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial. * [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial. * [ converting_tensorflow_models.mdx ] - Translated to Portuguese the converting tensorflow models tutorial.	2022-10-24 09:50:16 -04:00
zhou fan	3b419cfc6f	fix broken links in testing.mdx (#19820 )	2022-10-24 09:48:02 -04:00
Davi Alves	74b3eb3dea	Added translation of run_scripts.mdx to Portuguese Issue #16824 (#19800 ) * [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. * [ run_scripts.mdx ] - Translated to Portuguese the run scripts tutorial.	2022-10-21 17:38:35 -04:00
Davi Alves	2ebf4e6a7b	[ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. (#19779 )	2022-10-21 09:48:19 -04:00
ftorres16	c1f009ad9a	Update training.mdx (#19791 )	2022-10-21 09:46:44 -04:00
Rohit Gupta	2dd1b8f0c5	adding key pair dataset (#19765 )	2022-10-20 09:05:49 -04:00
amyeroberts	5041bc3511	Image transforms add center crop (#19718 ) * Add center crop to transforms library * Return PIL images if PIL image input by default * Fixup and add docstring * Trigger CI * Update src/transformers/image_transforms.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/image_transforms.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * PR comments - move comments; unindent Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-19 16:15:01 +01:00
GMFTBY	71786b10c5	Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py (#19477 ) * add: the contrastive search for generation_utils * add: testing scripts for contrastive search under examples/text-generation * update the quality of codes * revise the docstring; make the generation_contrastive_search.py scripts; * revise the examples/pytorch/text-generation/run_generation_contrastive_search.py to the auto-APIs format * revise the necessary documents * fix: revise the docstring of generation_contrastive_search.py * Fix the code indentation * fix: revise the nits and examples in contrastive_search docstring. * fix the copyright * delete generation_contrastive_search.py * revise the logic in contrastive_search * update the integration test and the docstring * run the tests over * add the slow decorator to the contrastive_search integration test * add more tests * do the style, quality, consistency checks	2022-10-19 10:17:46 +01:00
NielsRogge	14fe3e0410	Add docs (#19729 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-10-18 17:42:46 +02:00
NielsRogge	dd523da577	Add table transformer [v2] (#19614 ) * First draft * Add conversion script * Make conversion work * Upload checkpoints * Add final fixes * Revert changes of conditional and deformable detr * Fix toctree, add and remove copied from * Use model type * Improve docs * Improve code example * Update copies * Add copied formt * Don't update conditional detr * Don't update deformable detr	2022-10-18 15:20:09 +02:00
Antonio Carlos Falcão Petri	af150e4a1c	Allow user-managed Pool in Wav2Vec2ProcessorWithLM.batch_decode (#18351 ) * [Wav2Vec2] Allow user-managed Pool in Wav2Vec2ProcessorWithLM.batch_decode * [Wav2Vec2] Add user-managed LM's pool tests and usage examples * Improve styling Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [Wav2Vec2] Fix hyperlink references Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-18 08:48:03 -04:00
Christopher Akiki	71ca79448c	Fix typo in perf docs (#19705 )	2022-10-18 12:18:19 +02:00
NielsRogge	90071fe42b	Improve DETR models (#19644 ) * Improve DETR models * Fix Deformable DETR loss and matcher * Fixup * Fix integration tests * Improve variable names * Apply suggestion * Fix copies * Fix DeformableDetrLoss * Make Conditional DETR copy from Deformable DETR * Copy from deformable detr's hungarian matcher * Fix bug	2022-10-18 10:29:14 +02:00
NielsRogge	fd9a027aca	Fix docs (#19687 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-10-18 09:52:51 +02:00
amyeroberts	4181320b8c	Add normalize to image transforms module (#19544 ) * Adapt FE methods to transforms library * Mixin for saving the image processor * Base processor skeleton * BatchFeature for packaging image processor outputs * Initial image processor for GLPN * REmove accidental import * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Add rescale back and remove ImageType * fix import mistake * Fix enum var reference * Can transform and specify image data format * Remove redundant function * Update reference * Data format flag for rescale * Fix typo * Fix dimension check * Fixes to make IP and FE outputs match * Add tests for transforms * Add test for utils * Update some docstrings * Make sure in channels last before converting to PIL * Remove default to numpy batching * Fix up * Add docstring and model_input_types * Use feature processor config from hub * Alias GLPN feature extractor to image processor * Alias feature extractor mixin * Add return_numpy=False flag for resize * Fix up * Fix up * Use different frameworks safely * Safely import PIL * Call function checking if PIL available * Only import if vision available * Address Sylvain PR comments Co-authored-by: Sylvain.gugger@gmail.com * Apply suggestions from code review Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/image_transforms.py Co-authored-by: Alara Dirik 
<8944735+alaradirik@users.noreply.github.com> * Update src/transformers/models/glpn/feature_extraction_glpn.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add in docstrings * Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226) Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Refactor `TFSwinLayer` to increase serving compatibility (#18352) * Refactor `TFSwinLayer` to increase serving compatibility Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix missed parameters while refactoring Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix window_reverse to calculate batch size Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com> Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add TF prefix to TF-Res test class (#18481) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove py.typed (#18485) * Fix pipeline tests (#18487) * Fix pipeline tests * Make sure all pipelines tests run with init changes * Use new huggingface_hub tools for download models (#18438) * Draft new cached_file * Initial draft for config and model * Small fixes * Fix first batch of tests * Look in cache when internet is down * Fix last tests * Bad black, not fixing all quality errors * Make diff less * Implement change for TF and Flax models * Add tokenizer and feature extractor * For compatibility with main * Add utils to move the cache and auto-do it at first use. 
* Quality * Deal with empty commit shas * Deal with empty etag * Address review comments * Fix `test_dbmdz_english` by updating expected values (#18482) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Move cache folder to huggingface/hub for consistency with hf_hub (#18492) * Move cache folder to just huggingface * Thank you VsCode for this needless import * Move to hub * Forgot one * Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Forgot one new_ for cache migration * disable Onnx test for google/long-t5-tglobal-base (#18454) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Typo reported by Joel Grus on TWTR (#18493) * Just re-reading the whole doc every couple of months 😬 (#18489) * Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task * `transformers-cli login` => `huggingface-cli login` (#18490) * zero chance anyone's using that constant no? * `transformers-cli login` => `huggingface-cli login` * `transformers-cli repo create` => `huggingface-cli repo create` * `make style` * Add seed setting to image classification example (#18519) * [DX fix] Fixing QA pipeline streaming a dataset. (#18516) * [DX fix] Fixing QA pipeline streaming a dataset. QuestionAnsweringArgumentHandler would iterate over the whole dataset effectively killing all properties of the pipeline. This restores nice properties when using `Dataset` or `Generator` since those are meant to be consumed lazily. * Handling TF better. 
* Clean up hub (#18497) * Clean up utils.hub * Remove imports * More fixes * Last fix * update fsdp docs (#18521) * updating fsdp documentation * typo fix * Fix compatibility with 1.12 (#17925) * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * fix torch.onnx.symbolic_opset12 import * Reject bad version Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove debug statement * Specify en in doc-builder README example (#18526) Co-authored-by: Ankur Goyal <ankur@impira.com> * New cache fixes: add safeguard before looking in folders (#18522) * unpin resampy (#18527) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * ✨ update to use interlibrary links instead of Markdown (#18500) * Add example of multimodal usage to pipeline tutorial (#18498) * 📝 add example of multimodal usage to pipeline tutorial * 🖍 apply feedbacks * 🖍 apply niels feedback * [VideoMAE] Add model to doc tests (#18523) * Add videomae to doc tests * Add pip install decord Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update perf_train_gpu_one.mdx (#18532) * Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473) * Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script * make fixup changes * PR comments * changed input to Acceletor based on PR comment, ran make fixup * Added comment explaining the sync_gradients statement * Fixed lr scheduler max steps * Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper * Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper * Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script * make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py * removed changes to 
run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script * Add Spanish translation of converting_tensorflow_models.mdx (#18512) * Add file in spanish docs to be translated * Finish translation to Spanish * Improve Spanish wording * Add suggested changes from review * Spanish translation of summarization.mdx (#15947) (#18477) * Add Spanish translation of summarization.mdx * Apply suggestions from code review Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Let's not cast them all (#18471) * add correct dtypes when checking for params dtype * forward contrib credits * Update src/transformers/modeling_utils.py Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * more comments - added more comments on why we cast only floating point parameters * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * fix: data2vec-vision Onnx ready-made configuration. 
(#18427) * feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization * fix: wrong config * Add mt5 onnx config (#18394) * update features * MT5OnnxConfig added with updated with tests and docs * fix imports * fix onnc_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai> * Minor update of `run_call_with_unpacked_inputs` (#18541) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * BART - Fix attention mask device issue on copied models (#18540) * attempt to fix attn mask device * fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix * Adding a new `align_to_words` param to qa pipeline. (#18010) * Adding a new `align_to_words` param to qa pipeline. * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Import protection. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * 📝 update metric with evaluate (#18535) * Restore _init_weights value in no_init_weights (#18504) * Recover _init_weights value in no_init_weights For potential nested use. In addition, users might modify private no_init_weights as well. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove private variable change check Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean up comment * 📝 update documentation build section (#18548) * `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) * first commit * correct replace function * add final changes - works like charm! 
- cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * revmoe the small hack * modify utils file * make style + refactor a bit * create correctly device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions - remove with torch.grad - do not rely on Python bool magic! 
* add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor a bit testing class * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * male style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit testing code + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * add few comments * add more doctring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * odify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from 
code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace `n`by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * TF: XLA-trainable DeBERTa v2 (#18546) * fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float * Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) * Preserve hub-related kwargs in AutoModel.from_pretrained * Fix tests * Remove debug statement * TF Examples Rewrite (#18451) * Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add 
tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci * Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use commit hash to look in cache instead of calling head (#18534) * Use commit hash to look in cache instead of calling head * Add tests * Add attr for local configs too * Stupid typos * Fix tests * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address Julien's comments Co-authored-by: Julien Chaumond <julien@huggingface.co> * `pipeline` support for `device="mps"` (or any other string) (#18494) * `pipeline` support for `device="mps"` (or any other string) * Simplify `if` nesting * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix? 
@sgugger * passing `attr=None` is not the same as not passing `attr` 🤯 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update philosophy to include other preprocessing classes (#18550) * 📝 update philosophy to include other preprocessing classes * 🖍 apply feedbacks * Properly move cache when it is not in default path (#18563) * Adds CLIP to models exportable with ONNX (#18515) * onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format * review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * remove comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * raise atol for MT5OnnxConfig (#18560) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * fix string (#18568) * Segformer TF: fix output size in documentation (#18572) * Segformer TF: fix output size in doc * Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> * Fix resizing bug in OWL-ViT (#18573) * Fixes resizing bug in OWL-ViT * Defaults to square resize if size is set to an int * Sets do_center_crop default value to False * Fix LayoutLMv3 documentation (#17932) * fix typos * fix sequence_length docs of LayoutLMv3Model * delete trailing white spaces * fix layoutlmv3 docs more * apply make fixup & quality * change to two versions of input docstring * apply make fixup & quality * Skip broken tests * Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486) * changing BartLearnedPositionalEmbedding forward signature and references to it * removing debugging 
dead code (thanks style checker) * blackened modeling_bart file * removing copy inconsistencies via make fix-copies * changing references to copied signatures in Bart variants * make fix-copies once more * using expand over repeat (thanks @michaelbenayoun) * expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> * german docs translation (#18544) * Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" oder "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml * Deberta V2: Fix critical trace warnings to allow ONNX export (#18272) * Fix critical trace warnings to allow ONNX export * Force input to `sqrt` to be float type * Cleanup code * Remove unused import statement * Update model sew * Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Use broadcasting instead of repeat * Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Match deberta v2 changes in sew_d * Improve code quality * Update code quality * Consistency of small refactor * Match changes in sew_d Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * [FX] _generate_dummy_input supports audio-classification models for labels (#18580) * Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not * Use ENV_VARS_TRUE_VALUES * Fix docstrings with last version of hf-doc-builder styler (#18581) * Fix docstrings with last version of hf-doc-builder styler * Remove empty Parameter block * Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. 
- [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump nbconvert in /examples/research_projects/visual_bert (#18566) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix owlvit tests, update docstring examples (#18586) * Return the permuted hidden states if return_dict=True (#18578) * Load sharded pt to flax (#18419) * initial commit * add small test * add cross pt tf flag to test * fix quality * style * update test with new repo * fix failing test * update * fix wrong param ordering * style * update based on review * update related to recent new caching mechanism * quality * Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * quality and style * Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type hints for ViLT models (#18577) * Add type hints for Vilt models * Add missing return type for TokenClassification class * update doc for perf_train_cpu_many, add intel mpi introduction (#18576) * update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update 
docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * typos (#18594) * FSDP bug fix for `load_state_dict` (#18596) * Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261) * validate generate model_kwargs * generate tests -- not all models have an attn mask * Supporting seq2seq models for `bitsandbytes` integration (#18579) * Supporting seq2seq models for `bitsandbytes` integration - `bitsandbytes` integration supports now seq2seq models - check if a model has tied weights as an additional check * small modification - tie the weights before looking at tied weights! 
* Add Donut (#18488) * First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Fix URLs (#18604) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update BLOOM parameter counts (#18531) * Update BLOOM parameter counts * Update BLOOM parameter counts * [doc] fix anchors (#18591) the manual anchors end up being duplicated with automatically added anchors and no longer work. * [fsmt] deal with -100 indices in decoder ids (#18592) * [fsmt] deal with -100 indices in decoder ids Fixes: https://github.com/huggingface/transformers/issues/17945 decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems. Any suggestions to how to add a simple test here? 
or perhaps we have something similar already? user's script is quite massive. * style * small change (#18584) * Flax Remat for LongT5 (#17994) * [Flax] Add remat (gradient checkpointing) * fix variable naming in test * flip: checkpoint using a method * fix naming * fix class naming * apply PVP's suggestions from code review * add gradient_checkpointing to examples * Add gradient_checkpointing to run_mlm_flax * Add remat to longt5 * Add gradient checkpointing test longt5 * Fix args errors * Fix remaining tests * Make fixup & quality fixes * replace kwargs * remove unecessary kwargs * Make fixup changes * revert long_t5_flax changes * Remove return_dict and copy to LongT5 * Remove test_gradient_checkpointing Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> * mac m1 `mps` integration (#18598) * mac m1 `mps` integration * Update docs/source/en/main_classes/trainer.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * addressing comments * Apply suggestions from code review Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * resolve comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * Change scheduled CIs to use torch 1.12.1 (#18644) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add checks for some workflow jobs (#18583) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * TF: Fix generation repetition penalty with XLA (#18648) * Update longt5.mdx (#18634) * Update run_translation_no_trainer.py (#18637) * Update run_translation_no_trainer.py found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint * fixs `no_decay` and `resume_step` issue 1. change `no_decay` list 2. 
if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1` * [bnb] Minor modifications (#18631) * bnb minor modifications - refactor documentation - add troubleshooting README - add PyPi library on DockerFile * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * put in one block - put bash instructions in one block * update readme - refactor a bit hardware requirements * change text a bit * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * apply suggestions Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * add link to paper * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update tests/mixed_int8/README.md * Apply suggestions from code review * refactor a bit * add instructions Turing & Amperer Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add A6000 * clarify a bit * remove small part * Update tests/mixed_int8/README.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Examples: add Bloom support for token classification (#18632) * examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow) * examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it) * Fix Yolos ONNX export test (#18606) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fixup * Fix up * Move PIL default arguments inside function for safe imports * Add image utils to toctree * Update `rescale` method to reflect changes in #18677 * Update docs/source/en/internal/image_processing_utils.mdx 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Address Niels PR comments * Add normalize method to transforms library * Apply suggestions from code review - remove defaults to None Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings and revert to PIL.Image.XXX resampling Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated * Some more docstrings and PIL.Image tidy up * Reorganise arguments so flags by modifiers * Few last docstring fixes * Add normalize to docs * Accept PIL.Image inputs with deprecation warning * Update src/transformers/image_transforms.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update warning to include version * Trigger CI - hash clash on doc build Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com> Co-authored-by: Ankur Goyal <ankrgyl@gmail.com> Co-authored-by: Ankur 
Goyal <ankur@impira.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Mishig Davaadorj <dmishig@gmail.com> Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com> Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com> Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com> Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com> Co-authored-by: YouJiacheng <1503679330@qq.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com> Co-authored-by: Michael Wyatt <mrwyattii@gmail.com> Co-authored-by: Maxime G <joihn@users.noreply.github.com> Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr> Co-authored-by: Dan Jones <dan.j.jones2@gmail.com> Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> Co-authored-by: flozi00 <flozi00.fz@gmail.com> Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Dan Saattrup Nielsen 
<47701536+saattrupdan@users.noreply.github.com> Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-10-17 17:02:14 +01:00
Christopher Akiki	aa629e7a7c	Update perf_train_gpu_one.mdx (#19676)	2022-10-17 16:54:35 +02:00
Matt	3b3024da70	TF port of ESM (#19587) * Partial TF port for ESM model * Add ESM-TF tests * Add the various imports for TF-ESM * TF weight conversion almost ready * Stop ignoring the decoder weights in PT * Add tests and lots of fixes * fix-copies * Fix imports, add model docs * Add get_vocab() to tokenizer * Fix vocab links for pretrained files * Allow multiple inputs with a sep * Use EOS as SEP token because ESM vocab lacks SEP * Correctly return special tokens mask from ESM tokenizer * make fixup * Stop testing unsupported embedding resizing * Handle TF bias correctly * Skip all models with slow tokenizers in the token classification test * Fixing the batch/unbatcher of pipelines to accomodate the `None` being passed around. * Fixing pipeline bug caused by slow tokenizer being different. * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update set_input_embeddings and the copyright notices Co-authored-by: Your Name <you@example.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-10-17 14:16:16 +01:00
Arthur	d2e5b19b82	Add doctest info in testingmdx (#19623)	2022-10-17 11:23:20 +02:00
0xflotus	5ef2186692	fix: small error (#19612) * fix: small error * fix: another typo error	2022-10-14 11:10:33 -04:00
Akash Mahajan	504cd71a6b	add a note to whisper docs clarifying support of long-form decoding (#19497)	2022-10-13 10:39:03 +02:00
amyeroberts	1973b7716b	Image transforms library (#18520 ) * Adapt FE methods to transforms library * Mixin for saving the image processor * Base processor skeleton * BatchFeature for packaging image processor outputs * Initial image processor for GLPN * REmove accidental import * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Import BatchFeature from feature_extraction_utils * Fixup and docs * Fixup and docs * BatchFeature for packaging image processor outputs * Import BatchFeature from feature_extraction_utils * Fixup and docs * Mixin for saving the image processor * Fixup and docs * Add rescale back and remove ImageType * fix import mistake * Fix enum var reference * Can transform and specify image data format * Remove redundant function * Update reference * Data format flag for rescale * Fix typo * Fix dimension check * Fixes to make IP and FE outputs match * Add tests for transforms * Add test for utils * Update some docstrings * Make sure in channels last before converting to PIL * Remove default to numpy batching * Fix up * Add docstring and model_input_types * Use feature processor config from hub * Alias GLPN feature extractor to image processor * Alias feature extractor mixin * Add return_numpy=False flag for resize * Fix up * Fix up * Use different frameworks safely * Safely import PIL * Call function checking if PIL available * Only import if vision available * Address Sylvain PR comments Co-authored-by: Sylvain.gugger@gmail.com * Apply suggestions from code review Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/image_transforms.py Co-authored-by: Alara Dirik 
<8944735+alaradirik@users.noreply.github.com> * Update src/transformers/models/glpn/feature_extraction_glpn.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add in docstrings * Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226) Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Refactor `TFSwinLayer` to increase serving compatibility (#18352) * Refactor `TFSwinLayer` to increase serving compatibility Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix missed parameters while refactoring Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix window_reverse to calculate batch size Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com> Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add TF prefix to TF-Res test class (#18481) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove py.typed (#18485) * Fix pipeline tests (#18487) * Fix pipeline tests * Make sure all pipelines tests run with init changes * Use new huggingface_hub tools for download models (#18438) * Draft new cached_file * Initial draft for config and model * Small fixes * Fix first batch of tests * Look in cache when internet is down * Fix last tests * Bad black, not fixing all quality errors * Make diff less * Implement change for TF and Flax models * Add tokenizer and feature extractor * For compatibility with main * Add utils to move the cache and auto-do it at first use. 
* Quality * Deal with empty commit shas * Deal with empty etag * Address review comments * Fix `test_dbmdz_english` by updating expected values (#18482) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Move cache folder to huggingface/hub for consistency with hf_hub (#18492) * Move cache folder to just huggingface * Thank you VsCode for this needless import * Move to hub * Forgot one * Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Forgot one new_ for cache migration * disable Onnx test for google/long-t5-tglobal-base (#18454) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Typo reported by Joel Grus on TWTR (#18493) * Just re-reading the whole doc every couple of months 😬 (#18489) * Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task * `transformers-cli login` => `huggingface-cli login` (#18490) * zero chance anyone's using that constant no? * `transformers-cli login` => `huggingface-cli login` * `transformers-cli repo create` => `huggingface-cli repo create` * `make style` * Add seed setting to image classification example (#18519) * [DX fix] Fixing QA pipeline streaming a dataset. (#18516) * [DX fix] Fixing QA pipeline streaming a dataset. QuestionAnsweringArgumentHandler would iterate over the whole dataset effectively killing all properties of the pipeline. This restores nice properties when using `Dataset` or `Generator` since those are meant to be consumed lazily. * Handling TF better. 
* Clean up hub (#18497) * Clean up utils.hub * Remove imports * More fixes * Last fix * update fsdp docs (#18521) * updating fsdp documentation * typo fix * Fix compatibility with 1.12 (#17925) * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * fix torch.onnx.symbolic_opset12 import * Reject bad version Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Remove debug statement * Specify en in doc-builder README example (#18526) Co-authored-by: Ankur Goyal <ankur@impira.com> * New cache fixes: add safeguard before looking in folders (#18522) * unpin resampy (#18527) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * ✨ update to use interlibrary links instead of Markdown (#18500) * Add example of multimodal usage to pipeline tutorial (#18498) * 📝 add example of multimodal usage to pipeline tutorial * 🖍 apply feedbacks * 🖍 apply niels feedback * [VideoMAE] Add model to doc tests (#18523) * Add videomae to doc tests * Add pip install decord Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update perf_train_gpu_one.mdx (#18532) * Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473) * Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script * make fixup changes * PR comments * changed input to Acceletor based on PR comment, ran make fixup * Added comment explaining the sync_gradients statement * Fixed lr scheduler max steps * Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper * Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper * Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script * make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py * removed changes to 
run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script * Add Spanish translation of converting_tensorflow_models.mdx (#18512) * Add file in spanish docs to be translated * Finish translation to Spanish * Improve Spanish wording * Add suggested changes from review * Spanish translation of summarization.mdx (#15947) (#18477) * Add Spanish translation of summarization.mdx * Apply suggestions from code review Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Let's not cast them all (#18471) * add correct dtypes when checking for params dtype * forward contrib credits * Update src/transformers/modeling_utils.py Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * more comments - added more comments on why we cast only floating point parameters * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> * fix: data2vec-vision Onnx ready-made configuration. 
(#18427) * feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization * fix: wrong config * Add mt5 onnx config (#18394) * update features * MT5OnnxConfig added with updated with tests and docs * fix imports * fix onnc_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai> * Minor update of `run_call_with_unpacked_inputs` (#18541) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * BART - Fix attention mask device issue on copied models (#18540) * attempt to fix attn mask device * fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix * Adding a new `align_to_words` param to qa pipeline. (#18010) * Adding a new `align_to_words` param to qa pipeline. * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Import protection. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * 📝 update metric with evaluate (#18535) * Restore _init_weights value in no_init_weights (#18504) * Recover _init_weights value in no_init_weights For potential nested use. In addition, users might modify private no_init_weights as well. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove private variable change check Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean up comment * 📝 update documentation build section (#18548) * `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) * first commit * correct replace function * add final changes - works like charm! 
- cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * revmoe the small hack * modify utils file * make style + refactor a bit * create correctly device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions - remove with torch.grad - do not rely on Python bool magic! 
* add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor a bit testing class * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * male style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit testing code + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * add few comments * add more doctring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * odify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from 
code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace `n`by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * TF: XLA-trainable DeBERTa v2 (#18546) * fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float * Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) * Preserve hub-related kwargs in AutoModel.from_pretrained * Fix tests * Remove debug statement * TF Examples Rewrite (#18451) * Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add 
tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci * Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Use commit hash to look in cache instead of calling head (#18534) * Use commit hash to look in cache instead of calling head * Add tests * Add attr for local configs too * Stupid typos * Fix tests * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Address Julien's comments Co-authored-by: Julien Chaumond <julien@huggingface.co> * `pipeline` support for `device="mps"` (or any other string) (#18494) * `pipeline` support for `device="mps"` (or any other string) * Simplify `if` nesting * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix? 
@sgugger * passing `attr=None` is not the same as not passing `attr` 🤯 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update philosophy to include other preprocessing classes (#18550) * 📝 update philosophy to include other preprocessing classes * 🖍 apply feedbacks * Properly move cache when it is not in default path (#18563) * Adds CLIP to models exportable with ONNX (#18515) * onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format * review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * remove comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * raise atol for MT5OnnxConfig (#18560) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * fix string (#18568) * Segformer TF: fix output size in documentation (#18572) * Segformer TF: fix output size in doc * Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> * Fix resizing bug in OWL-ViT (#18573) * Fixes resizing bug in OWL-ViT * Defaults to square resize if size is set to an int * Sets do_center_crop default value to False * Fix LayoutLMv3 documentation (#17932) * fix typos * fix sequence_length docs of LayoutLMv3Model * delete trailing white spaces * fix layoutlmv3 docs more * apply make fixup & quality * change to two versions of input docstring * apply make fixup & quality * Skip broken tests * Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486) * changing BartLearnedPositionalEmbedding forward signature and references to it * removing debugging 
dead code (thanks style checker) * blackened modeling_bart file * removing copy inconsistencies via make fix-copies * changing references to copied signatures in Bart variants * make fix-copies once more * using expand over repeat (thanks @michaelbenayoun) * expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> * german docs translation (#18544) * Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" oder "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml * Deberta V2: Fix critical trace warnings to allow ONNX export (#18272) * Fix critical trace warnings to allow ONNX export * Force input to `sqrt` to be float type * Cleanup code * Remove unused import statement * Update model sew * Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Use broadcasting instead of repeat * Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * Match deberta v2 changes in sew_d * Improve code quality * Update code quality * Consistency of small refactor * Match changes in sew_d Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * [FX] _generate_dummy_input supports audio-classification models for labels (#18580) * Support audio classification architectures for labels generation, as well as provides a flag to print warnings or not * Use ENV_VARS_TRUE_VALUES * Fix docstrings with last version of hf-doc-builder styler (#18581) * Fix docstrings with last version of hf-doc-builder styler * Remove empty Parameter block * Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. 
- [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump nbconvert in /examples/research_projects/visual_bert (#18566) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix owlvit tests, update docstring examples (#18586) * Return the permuted hidden states if return_dict=True (#18578) * Load sharded pt to flax (#18419) * initial commit * add small test * add cross pt tf flag to test * fix quality * style * update test with new repo * fix failing test * update * fix wrong param ordering * style * update based on review * update related to recent new caching mechanism * quality * Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * quality and style * Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type hints for ViLT models (#18577) * Add type hints for Vilt models * Add missing return type for TokenClassification class * update doc for perf_train_cpu_many, add intel mpi introduction (#18576) * update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update 
docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * typos (#18594) * FSDP bug fix for `load_state_dict` (#18596) * Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261) * validate generate model_kwargs * generate tests -- not all models have an attn mask * Supporting seq2seq models for `bitsandbytes` integration (#18579) * Supporting seq2seq models for `bitsandbytes` integration - `bitsandbytes` integration supports now seq2seq models - check if a model has tied weights as an additional check * small modification - tie the weights before looking at tied weights! 
* Add Donut (#18488) * First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Fix URLs (#18604) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Update BLOOM parameter counts (#18531) * Update BLOOM parameter counts * Update BLOOM parameter counts * [doc] fix anchors (#18591) the manual anchors end up being duplicated with automatically added anchors and no longer work. * [fsmt] deal with -100 indices in decoder ids (#18592) * [fsmt] deal with -100 indices in decoder ids Fixes: https://github.com/huggingface/transformers/issues/17945 decoder ids get the default index -100, which breaks the model - like t5 and many other models add a fix to replace -100 with the correct pad index. For some reason this use case hasn't been used with this model until recently - so this issue was there since the beginning it seems. Any suggestions to how to add a simple test here? 
or perhaps we have something similar already? user's script is quite massive. * style * small change (#18584) * Flax Remat for LongT5 (#17994) * [Flax] Add remat (gradient checkpointing) * fix variable naming in test * flip: checkpoint using a method * fix naming * fix class naming * apply PVP's suggestions from code review * add gradient_checkpointing to examples * Add gradient_checkpointing to run_mlm_flax * Add remat to longt5 * Add gradient checkpointing test longt5 * Fix args errors * Fix remaining tests * Make fixup & quality fixes * replace kwargs * remove unecessary kwargs * Make fixup changes * revert long_t5_flax changes * Remove return_dict and copy to LongT5 * Remove test_gradient_checkpointing Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> * mac m1 `mps` integration (#18598) * mac m1 `mps` integration * Update docs/source/en/main_classes/trainer.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * addressing comments * Apply suggestions from code review Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * resolve comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * Change scheduled CIs to use torch 1.12.1 (#18644) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Add checks for some workflow jobs (#18583) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * TF: Fix generation repetition penalty with XLA (#18648) * Update longt5.mdx (#18634) * Update run_translation_no_trainer.py (#18637) * Update run_translation_no_trainer.py found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint * fixs `no_decay` and `resume_step` issue 1. change `no_decay` list 2. 
if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1` * [bnb] Minor modifications (#18631) * bnb minor modifications - refactor documentation - add troubleshooting README - add PyPi library on DockerFile * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * put in one block - put bash instructions in one block * update readme - refactor a bit hardware requirements * change text a bit * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * apply suggestions Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * add link to paper * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update tests/mixed_int8/README.md * Apply suggestions from code review * refactor a bit * add instructions Turing & Amperer Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add A6000 * clarify a bit * remove small part * Update tests/mixed_int8/README.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Examples: add Bloom support for token classification (#18632) * examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow) * examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it) * Fix Yolos ONNX export test (#18606) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Fixup * Fix up * Move PIL default arguments inside function for safe imports * Add image utils to toctree * Update `rescale` method to reflect changes in #18677 * Update docs/source/en/internal/image_processing_utils.mdx 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Address Niels PR comments * Apply suggestions from code review - remove defaults to None Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings and revert to PIL.Image.XXX resampling Use PIL.Image.XXX resampling values instead of PIL.Image.Resampling.XXX enum as it's only in the recent version >= 9.10 and version is not yet pinned and older version support deprecated * Some more docstrings and PIL.Image tidy up * Reorganise arguments so flags by modifiers * Few last docstring fixes Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com> Co-authored-by: Ankur Goyal <ankrgyl@gmail.com> Co-authored-by: Ankur Goyal <ankur@impira.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Mishig Davaadorj <dmishig@gmail.com> Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com> Co-authored-by: Ian Castillo 
<7807897+donelianc@users.noreply.github.com> Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com> Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com> Co-authored-by: YouJiacheng <1503679330@qq.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com> Co-authored-by: Michael Wyatt <mrwyattii@gmail.com> Co-authored-by: Maxime G <joihn@users.noreply.github.com> Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com> Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr> Co-authored-by: Dan Jones <dan.j.jones2@gmail.com> Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com> Co-authored-by: flozi00 <flozi00.fz@gmail.com> Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Wang, Yi <yi.a.wang@intel.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-10-12 18:32:02 +01:00
Darío Hereñú	5760a8fcf6	Syntax issues (paragraphs 122, 130, 147, 155) Documentation: @sgugger (#19437 ) * Syntax issues (paragraphs 122, 130, 147, 155) `preentramiento` > `preentrenamiento` * semantic issue (paragraph 220 & 232 & 252) * Update docs/source/es/create_a_model.mdx with approval of @ignacioct and scrutiny of @sgugger Co-authored-by: Ignacio Talavera <ignaciotalaveracepeda@gmail.com>	2022-10-12 13:18:11 -04:00
Daniel van Strien	af539d6f0a	fix MarkupLMProcessor option flag (#19526 )	2022-10-12 15:08:48 +02:00
Ritik Nandwal	e94384e4d8	Add depth estimation pipeline (#18618 ) * Add initial files for depth estimation pipelines * Add test file for depth estimation pipeline * Update model mapping names * Add updates for depth estimation output * Add generic test * Hopefully fixing the tests. * Check if test passes * Add make fixup and make fix-copies changes after rebase with main * Rebase with main * Fixing up depth pipeline. * This is not used anymore. * Fixing the test. `Image` is a module `Image.Image` is the type. * Update docs/source/en/main_classes/pipelines.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-12 08:54:20 -04:00
Darío Hereñú	c60381e90d	Syntax issue (line 497, 526) Documentation @ssuggen (#19442 )	2022-10-12 08:28:54 -04:00
NielsRogge	4d367a3c81	Add LiLT (#19450 ) * First draft * Fix more things * Improve more things * Remove some head models * Fix more things * Add missing layers * Remove tokenizer * Fix more things * Fix copied from statements * Make all tests pass * Remove print statements * Remove files * Fix README and docs * Add integration test and fix organization * Add tips * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Make tests faster, improve docs * Fix doc tests * Add model to toctree * Add docs * Add note about creating new checkpoint * Remove is_decoder * Make tests smaller, add docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-12 10:11:20 +02:00
Wang, Yi	7543e275d4	update doc for perf_train_cpu_many (#19506 ) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2022-10-11 22:54:19 -04:00
Mathieu Jouffroy	5ca131f3d4	[CvT] Tensorflow implementation (#18597 ) * implemented TFCvtModel and TFCvtForImageClassification and modified relevant files, added an exception in convert_tf_weight_name_to_pt_weight_name, added quick testing file to compare with pytorch model * added docstring + testing file in transformers testing suite * added test in testing file, modified docs to pass repo-consistency, passed formatting test * refactoring + passing all test * small refactor, removing unwanted comments * improved testing config * corrected import error * modified access to pretrained model archive list, to pass tf_test * corrected import structure in init files * modified testing for keras_fit with cpu * correcting PR issues + Refactoring * Refactoring : improving readability and reducing the number of permutations * corrected momentum value + cls_token initialization * removed from_pt as weights were added to the hub * Update tests/models/cvt/test_modeling_tf_cvt.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-10-11 18:16:52 +01:00
Darío Hereñú	ae710425d2	Syntax issues (lines 126, 203) (#19444 )	2022-10-11 08:14:21 -04:00
Stefano Bosisio	b0b962ccca	Add Italian translation for `add_new_model.mdx` (#18713 ) * fix conflicts * start translating * proof check * add toc * fix errors and typos	2022-10-10 10:12:40 -04:00
amyeroberts	e3f028f3af	Add TF whisper (#19378 ) * simplify loop * add featur extractor * add model * start conversion * add dropout * initial commit of test files * copnversion for all models * update processor for correct padding * update feature extraction * update integration test logits match * fmnt: off for the logits * on the fly mel bank * small nit * update test * update tokenizer * nit feature extraction * update * update tokenizer test * adds logit processor and update tokenizer to get supress tokens * style * clean convert * revert to original modeling tf utils * Update * update * nit * clean convert file * update tests and nits * quality * slow generation test * ffn_dim to allow customization * update readme * add to toctreee * start fixing integration tests * update tests and code * fix feature extractor * fix config tests common * update code to fix tests * fix feature exctractor * nit feature extraction * update test for new feature extractor * style * add absrtact * large logits wioth custom decoder input ids * wraap around is otrch available * fix feature extractor * correct logits for whisper small.en * nit * fix encoder_attentino_mask * some fixes * remove unnecessary inputs * nits * add normalizer file * update etst tokenization * fix attention mask not defined * fix generate * remove uncoder attention mask useless * update test modeling whisper * update condfig to add second non supress tokens * nits on feature exrtactor * nit for test tokenizers * update etsts * update tests * update tokenization test * fixup * invalidated hf token. 
Clean convert openai to whisper * fix logit tests * fixup * Add model to README * Fix doc tests * clean merge * revert toc_tree changes * remove useless LogitProcessor * Update whisper .mdx * update config file doc * update configuration docstring * update test tokenization * update test tokenization * update tokenization whisper Added copied from where needed * update feature extraction * nit test name * style * quality * remove get suppress tokens and update non_speech tokens global variables * Update src/transformers/models/whisper/feature_extraction_whisper.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * clean modeling whisper and test Removed the attention mask arguments that are deprecated * fix large test * Add multilingual audio test, and translate test * style * fix larg multilingual test * nits * add copied from for attention layer * remove attention masks in doc * add english normalizer * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update tokenization test * remove copied from in whisper attention : no bias in k_proj only * wrap around dependencies in english normalizer * style * correct import generation logits * for now, wrap feature extractor with torch * remove torch depencies for feature extraction and style * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixup * nit * update logitds * style * nit * nits and fix final tests * add `is_more_itertools_available` to utils * quality * add begin supress tokens, supress tokens to generate args and config * clean supressTokensLogitProcessor in generation logits * 
Nit naming * add supressTokensAtBegin * udpate tests, supress tokens to None or correct values * nit and style * update RAG to fit test and generate_logit * add copy pasted statment on english normalizer * add arguments to config_common_kwargs * Update src/transformers/generation_utils.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/generation_logits_process.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * revert changes based on reviews * update doc and nits * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * more nits * last nits * update test configuration common * add BART name in decoder attention mask documentation * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * style * nit * nit * add english.json file to git * nits on documentation * nit * nits * last styling * add main toctree file * remove sentence piece dependency * clean init file * fix tokenizer that has no dependencies on sentencepiece * update whisper init file, nit * remove english.json file * add get decoder prompt id * All weights loading * Remove hanging pdb * Fixup and tidy up * Use same copied from as PT model * Remove whitespace changes * Remove torch references * Tie embeddings * Remove logits processor input to generate * Update logit values * revert changes and add forced logit processor * nit * clean normalizer * remove protected * Add logit processors and update generation code & tests * Some tidy up * Update docstring * update * update based on review * Update 
src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update to reflect changes on the PT model branch * Tidy up * Remove extra whitespace * Fix test - make input ids small enough we can append * Include upstream changes on main * PR comments - add batch tests, remove comments & defaults * Fix model output imports * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation_tf_logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/whisper/test_modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update docstring example * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Remove changes to adjust_logits_during_generation function * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Tidy up imports that don't require TF * Update tests - skip and no more skip * Update tests/generation/test_generation_tf_logits_process.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update 
src/transformers/models/whisper/modeling_tf_whisper.py * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Add training flags * Add (skipped) XLA generation tests * Add embedding correctness test * Add constant ids for generation tests * Make logits finding a bit tidier * Remove unused args * xla generation enabled * Don't skip XLA tests anymore * Fix tests - add position ids to expected signature and update rag generation * Undo method reorder * Remove added whitespace * Remove copy-paste gradient checkopint ref * Remove * Trigger CI - (issue with refs when pulling) Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge <niels.rogge1@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co>	2022-10-10 14:48:17 +01:00
APAVOU Clément	af69360bf9	Add `OPTForQuestionAnswering` (#19402 ) * Add `OPTForQuestionAnswering` - added `OPTForQuestionAnswering` class based on `BloomForQuestionAnswering` - added `OPTForQuestionAnswering` in common tests - all common tests pass - make fixup done * added docstrings for OPTForQuestionAnswering * Fix docstrings for OPTForQuestionAnswering	2022-10-10 09:30:59 -04:00
Mohit Sharma	3080bb4754	Add onnx support for VisionEncoderDecoder (#19254 ) * Add onnx support for VisionEncoderDecoder * Add onnx support for VisionEncoderDecoder * Removed unused import * Rename encoder hidden state Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update docstrings and removed redundant code * Added test function for enc-dec models * Update doc string text Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * fixed code style Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-10-10 09:20:19 -04:00
Darío Hereñú	3410705730	Fixed duplicated line (paragraph #83 ) Documentation: @sgugger (#19436 ) * Fixed duplicated line (paragraph #83) @omarespejel @sgugger * Datasets map denomination fixed (paragraph 42)	2022-10-10 09:08:34 -04:00
Darío Hereñú	83dc49b69b	Backtick fixed (paragraph 68) (#19440 )	2022-10-10 08:47:14 -04:00
Amrit Sahu	e9a49babee	[WIP] Add ZeroShotObjectDetectionPipeline (#18445 ) (#18930 ) * Add ZeroShotObjectDetectionPipeline (#18445) * Add AutoModelForZeroShotObjectDetection task This commit also adds the following - Add explicit _processor method for ZeroShotObjectDetectionPipeline. This is necessary as pipelines don't auto infer processors yet and `OwlVitProcessor` wraps tokenizer and feature_extractor together, to process multiple images at once - Add auto tests and other tests for ZeroShotObjectDetectionPipeline * Add batching for ZeroShotObjectDetectionPipeline * Fix doc-string ZeroShotObjectDetectionPipeline * Fix output format: ZeroShotObjectDetectionPipeline	2022-10-07 10:00:19 -04:00
Bibhabasu Mohapatra	e162cebfa3	add ONNX support for swin transformer (#19390 ) * swin transformer onnx support * Updated image dimensions as dynamic Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-10-07 09:23:24 -04:00
Alara Dirik	ae3e3bc60a	fix docs example, add object_detection to DETR docs (#19377 )	2022-10-07 00:02:26 +02:00
Arthur	45e14038f2	Add WhisperModel to transformers (#19166 ) * simplify loop * add featur extractor * add model * start conversion * add dropout * initial commit of test files * copnversion for all models * update processor for correct padding * update feature extraction * update integration test logits match * fmnt: off for the logits * on the fly mel bank * small nit * update test * update tokenizer * nit feature extraction * update * update tokenizer test * adds logit processor and update tokenizer to get supress tokens * style * clean convert * revert to original modeling tf utils * Update * update * nit * clean convert file * update tests and nits * quality * slow generation test * ffn_dim to allow customization * update readme * add to toctreee * start fixing integration tests * update tests and code * fix feature extractor * fix config tests common * update code to fix tests * fix feature exctractor * nit feature extraction * update test for new feature extractor * style * add absrtact * large logits wioth custom decoder input ids * wraap around is otrch available * fix feature extractor * correct logits for whisper small.en * nit * fix encoder_attentino_mask * some fixes * remove unnecessary inputs * nits * add normalizer file * update etst tokenization * fix attention mask not defined * Add model to README * Fix doc tests * fix generate * remove uncoder attention mask useless * update test modeling whisper * update condfig to add second non supress tokens * nits on feature exrtactor * nit for test tokenizers * update etsts * update tests * update tokenization test * fixup * invalidated hf token. 
Clean convert openai to whisper * fix logit tests * fixup * clean merge * revert toc_tree changes * remove useless LogitProcessor * Update whisper .mdx * update config file doc * update configuration docstring * update test tokenization * update test tokenization * update tokenization whisper Added copied from where needed * update feature extraction * nit test name * style * quality * remove get suppress tokens and update non_speech tokens global variables * Update src/transformers/models/whisper/feature_extraction_whisper.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * clean modeling whisper and test Removed the attention mask arguments that are deprecated * fix large test * Add multilingual audio test, and translate test * style * fix larg multilingual test * nits * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add copied from for attention layer * remove attention masks in doc * add english normalizer * update tokenization test * remove copied from in whisper attention : no bias in k_proj only * wrap around dependencies in english normalizer * style * correct import generation logits * for now, wrap feature extractor with torch * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/whisper.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove torch depencies for feature extraction and style * fixup * nit * update logitds * style * nit * nits and fix final tests * add `is_more_itertools_available` to utils * quality * add begin supress tokens, supress tokens to generate args and config * clean supressTokensLogitProcessor in generation logits * Nit naming * add supressTokensAtBegin * 
udpate tests, supress tokens to None or correct values * nit and style * update RAG to fit test and generate_logit * add copy pasted statment on english normalizer * add arguments to config_common_kwargs * Update src/transformers/generation_utils.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/generation_logits_process.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * revert changes based on reviews * update doc and nits * more nits * last nits * update test configuration common * add BART name in decoder attention mask documentation * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * style * nit * nit * add english.json file to git * nits on documentation * nit * nits * last styling * add main toctree file * remove sentence piece dependency * clean init file * fix tokenizer that has no dependencies on sentencepiece * update whisper init file, nit * remove english.json file * add get decoder prompt id * revert changes and add forced logit processor * nit * clean normalizer * remove protected * update * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update based on review * Update src/transformers/models/whisper/configuration_whisper.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add batched tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: NielsRogge 
<niels.rogge1@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-05 22:28:31 +02:00
Alara Dirik	07e94bf159	Maskformer post-processing fixes and improvements (#19172 ) - Improves MaskFormer docs, corrects minor typos - Restructures MaskFormerFeatureExtractor.post_process_panoptic_segmentation for better readability, adds target_sizes argument for optional resizing - Adds post_process_semantic_segmentation and post_process_instance_segmentation methods. - Adds a deprecation warning to post_process_segmentation method in favour of post_process_instance_segmentation	2022-10-05 15:27:15 +03:00
Younes Belkada	587d84b178	Add `BloomForQuestionAnswering` (#19310 ) * add bloom for question answering - attempt to add Bloom for question answering - adapted from `GPTJForQuestionAnswering` - Fixed `num_labels` to `2` for common tests - Added a bit of docstring - All common tests pass * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert changes related to `num_labels` Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-04 17:52:13 +02:00
Steven Liu	68f50f3453	Breakup export guide (#19271 ) * split onnx and torchscript docs * make style * apply reviews	2022-10-03 13:18:29 -07:00
Alara Dirik	36f52e9593	Restructure DETR post-processing, return prediction scores (#19262 ) * Restructure DetrFeatureExtractor post-processing methods * Update post_process_instance_segmentation and post_process_panoptic_segmentation methods to return prediction scores * Update DETR models docs	2022-10-03 12:02:51 +03:00
Kashif Rasul	5cd16f01db	time series forecasting model (#17965 ) * initial files * initial model via cli * typos * make a start on the model config * ready with configuation * remove tokenizer ref. * init the transformer * added initial model forward to return dec_output * require gluonts * update dep. ver table and add as extra * fixed typo * add type for prediction_length * use num_time_features * use config * more config * typos * opps another typo * freq can be none * default via transformation is 1 * initial transformations * fix imports * added transform_start_field * add helper to create pytorch dataloader * added inital val and test data loader * added initial distr head and loss * training working * remove TimeSeriesTransformerTokenizer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed copyright * removed docs * remove time series tokenizer * fixed docs * fix text * fix second * fix default * fix order * use config directly * undo change * fix comment * fix year * fix import * add additional arguments for training vs. 
test * initial greedy inference loop * fix inference * comment out token inputs to enc dec * Use HF encoder/decoder * fix inference * Use Seq2SeqTSModelOutput output * return Seq2SeqTSPredictionOutput * added default arguments * fix return_dict true * scale is a tensor * output static_features for inference * clean up some unused bits * fixed typo * set return_dict if none * call model once for both train/predict * use cache if future_target is none * initial generate func * generate arguments * future_time_feat is required * return SampleTSPredictionOutput * removed unneeded classes * fix when params is none * fix return dict * fix num_attention_heads * fix arguments * remove unused shift_tokens_right * add different dropout configs * implement FeatureEmbedder, Scaler and weighted_average * remove gluonts dependency * fix class names * avoid _variable names * remove gluonts dependency * fix imports * remove gluonts from configuration * fix docs * fixed typo * move utils to examples * add example requirements * config has no freq * initial run_ts_no_trainer * remove from ignore * fix output_attentions and removed unsued getters/setters * removed unsed tests * add dec seq len * add test_attention_outputs * set has_text_modality=False * add config attribute_map * make style * make fix-copies * add encoder_outputs to TimeSeriesTransformerForPrediction forward * Improve docs, add model to README * added test_forward_signature * More improvements * Add more copied from * Fix README * Fix remaining quality issues * updated encoder and decoder * fix generate * output_hidden_states and use_cache are optional * past key_values returned too * initialize weights of distribution_output module * fixed more tests * update test_forward_signature * fix return_dict outputs * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update 
src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * removed commented out tests * added neg. 
bin and normal output * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * move to one line * Add docstrings * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add try except for assert and raise * try and raise exception * fix the documentation formatting * fix assert call * fix docstring formatting * removed input_ids from DOCSTRING * Update input docstring * Improve variable names * Update order of inputs * Improve configuration * Improve variable names * Improve docs * Remove key_length from tests * Add extra docs * initial unittests * added test_inference_no_head test * added test_inference_head * add test_seq_to_seq_generation * make style * one line * assert mean prediction * removed comments * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix order of args * make past_observed_mask optional as well * added Amazon license header * updated utils with new fieldnames * make style * cleanup * undo position of past_observed_mask * fix import * typo * more typo * rename example files * remove example for now * Update docs/source/en/_toctree.yml Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger 
<35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update modeling_time_series_transformer.py fix style * fixed typo * fix typo and grammer * fix style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <niels.rogge1@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-09-30 15:32:59 -04:00
Joao Gante	cfb777f27c	Docs - Guide to add a new TensorFlow model (#19256 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-09-30 20:30:38 +01:00
Matt	368b649af6	Rebase ESM PR and update all file formats (#19055 ) * Rebase ESM PR and update all file formats * Fix test relative imports * Add __init__.py to the test dir * Disable gradient checkpointing * Remove references to TFESM... FOR NOW >:\| * Remove completed TODOs from tests * Convert docstrings to mdx, fix-copies from BERT * fix-copies for the README and index * Update ESM's __init__.py to the modern format * Add to _toctree.yml * Ensure we correctly copy the pad_token_id from the original ESM model * Ensure we correctly copy the pad_token_id from the original ESM model * Tiny grammar nitpicks * Make the layer norm after embeddings an optional flag * Make the layer norm after embeddings an optional flag * Update the conversion script to handle other model classes * Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py * Break the copied from link from BertModel.forward to remove token_type_ids * Remove debug array saves * Begin ESM-2 porting * Add a hacky workaround for the precision issue in original repo * Code cleanup * Remove unused checkpoint conversion code * Remove unused checkpoint conversion code * Fix copyright notices * Get rid of all references to the TF weights conversion * Remove token_type_ids from the tests * Fix test code * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add credit * Remove _ args and __ kwargs in rotary embedding * Assertively remove asserts * Replace einsum with torch.outer() * Fix docstring formatting * Remove assertions in tokenization * Add paper citation to ESMModel docstring * Move vocab list to single line * Remove ESMLayer from init * Add Facebook copyrights * Clean up RotaryEmbedding docstring * Fix docstring 
formatting * Fix docstring for config object * Add explanation for new config methods * make fix-copies * Rename all the ESM- classes to Esm- * Update conversion script to allow pushing to hub * Update tests to point at my repo for now * Set config properly for tests * Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted * make fixup * Update expected values for slow tests * make fixup * Remove EsmForCausalLM for now * Remove EsmForCausalLM for now * Fix padding idx test * Updated README and docs with ESM-1b and ESM-2 separately (#19221) * Updated README and docs with ESM-1b and ESM-2 separately * Update READMEs, longer entry with 3 citations * make fix-copies Co-authored-by: Your Name <you@example.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Tom Sercu <tsercu@fb.com> Co-authored-by: Your Name <you@example.com>	2022-09-30 14:16:25 +01:00
NielsRogge	f3d2f7a6e0	Add MarkupLM (#19198 ) * First draft * Make basic test work * Fix most tokenizer tests * More improvements * Make more tests pass * Fix more tests * Fix some code quality * Improve truncation * Implement feature extractor * Improve feature extractor and add tests * Improve feature extractor tests * Fix pair_input test partly * Add fast tokenizer * Improve implementation * Fix rebase * Fix rebase * Fix most of the tokenizer tests. * propose solution for fast * add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer * add: modify markuplmconverter * add: some modify on converter and tokenizerfast * Fix style, copies * Make fixup * Update tokenization_markuplm.py * Update test_tokenization_markuplm.py * Update markuplm related * Improve processor, add integration test * Add processor test file * Improve processor * Improve processor tests * Fix more processor tests * Fix processor tests * Update docstrings * Add Copied from statements * Add more Copied from statements * Add code examples * Improve code examples * Add model to doc tests * Adding dependency check * Add dummy file * Add requires_backends * Add model to toctree * Fix more things, disable dependency check for now * Apply more suggestions * Add soft dependency * Add annotators to tests * Fix style * Remove from_slow=True * Remove print statements * Add sanity check * Fix processor test * Fix processor tests, add more docs * Add doc tests for mdx file * Add more tips * Apply suggestions Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com> Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: lockon-n <dd098309@126.com>	2022-09-30 08:25:43 +02:00
mustapha ajeghrir	ba9e336fa3	Fix `m2m_100.mdx` doc example missing `labels` (#19149 ) The `labels` variable is not defined, the `model_inputs` already contain this information.	2022-09-29 13:27:58 +02:00
Aritra Roy Gosthipaty	0dc7b3a785	[TensorFlow] Adding GroupViT (#18020 ) * chore: initial commit * chore: adding util methods yet to work on the nn.functional.interpolate port with align_corener=True * chore: refactor the utils * used tf.compat.v1.image.resize to align the F.interpolate function * added type hints to the method signatures * added references to the gists where one 2 one alignment of torch and tf has been shown * chore: adding the layers * chore: porting all the layers from torch to tf This is the initial draft, nothing is tested yet. * chore: aligning the layers with reference to tf clip * chore: aligning the modules * added demaraction comments * added copied and adapted from comments * chore: aligning with CLIP * chore: wrangling the layers to keep it tf compatible * chore: aligning the names of the layers for porting * chore: style changes * chore: adding docs and inits * chore: adding tfp dependencis the code is taken from TAPAS * chore: initial commit for testing * chore: aligning the vision embeddings with the vit implementatino * chore: changing model prefix * chore: fixing the name of the model and the layer normalization test case * chore: every test passes but the slow ones * chore: fix style and integration test * chore: moving comments below decorators * chore: make fixup and fix-copies changes * chore: adding the Vision and Text Model to check_repo * chore: modifying the prefix name to align it with the torch implementation * chore: fix typo in configuration * choer: changing the name of the model variable * chore: adding segmentation flag * chore: gante's review * chore: style refactor * chore: amy review * chore: adding shape_list to parts that have been copied from other snippets * chore: init batchnorm with torch defaults * chore: adding shape_list to pass the tests * test fix: adding seed as 0 * set seed * chore: changing the straight through trick to fix -ve dimensinos * chore: adding a dimension to the loss * chore: adding 
reviewers and contributors names to the docs * chore: added changes after review * chore: code quality fixup * chore: fixing the segmentation snippet * chore: adding to the layer calls * chore: changing int32 to int64 for inputs of serving * chore: review changes * chore: style changes * chore: remove from_pt=True * fix: repo consistency Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-29 10:48:04 +01:00
Steven Liu	6957350c2b	Focus doc around preprocessing classes (#18768 ) * 📝 reframe docs around preprocessing classes * small edits * edits and review * fix typo * apply review * clarify processor	2022-09-28 17:09:44 -07:00
Steven Liu	990936a868	Move AutoClasses under Main Classes (#19163 ) * move autoclasses to main classes * keep auto.mdx in model_doc	2022-09-28 17:09:29 -07:00
Nicola Procopio	e3a30e2b99	translated add_new_pipeline (#19215 )	2022-09-27 08:55:41 -04:00
Wang, Yi	88f597ba6a	add doc for hyperparameter search (#19192 ) * add doc for hyperparameter search * update doc	2022-09-27 07:51:51 -04:00
Sylvain Gugger	c20b2c7e18	Use repo_type instead of deprecated datasets repo IDs (#19202 ) * Use repo_type instead of deprecated datasets repo IDs * Add missing one in doc	2022-09-26 09:50:48 -04:00
flozi00	fa4eeb4fd3	german training, accelerate and model sharing (#19171 ) * correct spelling in README * processing * german training * accelerate * german model sharing * build doc * ttf links * casing	2022-09-23 14:52:09 -04:00
Alara Dirik	7e84723fe4	Add semantic segmentation post-processing method to MobileViT (#19105 ) * add post-processing method for semantic segmentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-09-23 16:24:28 +03:00
Wang, Yi	e5b7cff5fe	update perf_train_cpu_many doc (#19151 ) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2022-09-22 09:20:15 -04:00
NielsRogge	cf6308ef9b	Improve conditional detr docs (#19154 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-22 13:21:05 +02:00
Sayak Paul	2d9853b226	MSN (Masked Siamese Networks) for ViT (#18815 ) * feat: modeling and conversion scripts for msn. * chore: change license year. * chore: remove unneeded modules. * feat: direct loading of state_dict from remote url. * fix: import paths. * add: rest of the files. * add and fix rest of the files. Co-authored-by: Niels <niels.rogge1@gmail.com> * chore: formatting. * code quality fix. * chore: remove pooler. * feat: add classification top. * fix: configuration object. * add: initial test cases (one failing). * fix: basemodeloutput. * add: caution on using the classification head. * add: rest of the model related files. * add: vit msn readme. * fix: copied from statement. * fix: dummy objects. * add: ViTMSNPreTrainedModel to inits. * fix: repo consistency. * minor change in the model doc. * fix: tests. * Empty-Commit * Update src/transformers/models/vit_msn/configuration_vit_msn.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address PR comments. * Update src/transformers/models/vit_msn/modeling_vit_msn.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * chore: put model in no_grad() and formatting. Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2022-09-22 07:15:03 -04:00
NielsRogge	9393f966bc	[fix] Add DeformableDetrFeatureExtractor (#19140 ) * Add DeformableDetrFeatureExtractor * Fix post_process * Fix name * Add tests for feature extractor * Fix doc tests * Fix name * Address comments * Apply same fix to DETR and YOLOS as well Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-22 09:45:24 +02:00
DepuMeng	126a739058	Add support for conditional detr (#18948 ) * added conditional_detr files * checked copies * fixed style and copies * fixed hub * fixed style * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/configuration_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/conditional_detr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixed some doc issue * changed prefix to ConditionalDetr * fixed docs * Update README_ko.md * added spatial_model_name * fixed fix-copies * Update src/transformers/models/conditional_detr/feature_extraction_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added some copied from * fixed use_pretrained issue * changed post-process * fix style quality and copies * add more fix-copies * fixed some variable names & added more fix-copies * added more copied from * fixed quality * changed pretrained config * added more copied-from and fixed the issue in feature_extraction_auto * rebased Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Depu Meng <depumeng@Depus-MacBook-Pro.local>	2022-09-22 09:45:04 +02:00
Alara Dirik	e7fdfc720a	Add post_process_semantic_segmentation method to DPTFeatureExtractor (#19107 ) * add post-processing method for semantic segmentation * add test for post-processing	2022-09-21 15:15:26 +03:00
Alara Dirik	9e95706648	Add post_process_semantic_segmentation method to SegFormer (#19072 ) * add post_process_semantic_segmentation method to SegformerFeatureExtractor * add test for semantic segmentation post-processing	2022-09-21 11:40:35 +03:00
flozi00	de26241645	german processing (#19121 ) * correct spelling in README * processing	2022-09-20 09:18:21 -04:00
Alara Dirik	c81ebd1c39	Beit postprocessing (#19099 ) * add post_process_semantic_segmentation method to BeiTFeatureExtractor	2022-09-20 10:41:56 +03:00
NielsRogge	e7206ceab9	Improve vision models docs (#19103 ) * Add tips * Add BEiT figure * Fix URL * Move tip to start * Add tip to TF model as well Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-19 19:22:34 +02:00
flozi00	ae219532e3	german autoclass (#19049 ) * german autoclass * Update _toctree.yml	2022-09-16 16:16:00 -04:00
Stas Bekman	8edf196310	[doc] debug: fix import (#19042 ) correct the import statement	2022-09-14 16:29:58 -07:00
Hakjin Lee	abca1741cf	Fix a broken link for deepspeed ZeRO inference in the docs (#19001 ) * Fix a broken link for deepspeed ZeRO inference * fix link Co-authored-by: Stas Bekman <stas@stason.org>	2022-09-14 16:21:06 -07:00
Shinya Otani	f5f430e5c8	Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814 ) * add gpt-neox-japanese model and tokenizer as new model * Correction to PR's comment for GPT NeoX Japanese - Fix to be able to use gpu - Add comment # Copied... at the top of RotaryEmbedding - Implement nn.Linear instead of original linear class - Add generation test under @slow * fix bias treatment for gpt-neox-japanese * Modidy gpt-neox-japanese following PR - add doc for bias_dropout_add - style change following a PR comment * add document for gpt-neox-japanese * remove unused import from gpt-neox-japanese * fix README for gpt-neox-japanese	2022-09-14 10:17:40 -04:00
NielsRogge	59407bbeb3	Add Deformable DETR (#17281 ) * First draft * More improvements * Improve model, add custom CUDA code * Import torch before * Add script that imports custom layer * Add everything in new ops directory * Import custom layer in modeling file * Fix ARCHIVE_MAP typo * Creating the custom kernel on the fly. * Import custom layer in modeling file * More improvements * Fix CUDA loading * More improvements * Improve conversion script * Improve conversion script * Make it work until encoder_outputs * Make forward pass work * More improvements * Make logits match original implementation * Make implementation also support single_scale model * Add support for single_scale and dilation checkpoint * Add support for with_box_refine model * Support also two stage model * Improve tests * Fix more tests * Make more tests pass * Upload all models to the hub * Clean up some code * Improve decoder outputs * Rename intermediate hidden states and reference points * Improve model outputs * Move tests to dedicated folder * Improve model outputs * Fix retain_grad test * Improve docs * Clean up and make test_initialization pass * Improve variable names * Add copied from statements * Improve docs * Fix style * Improve docs * Improve docs, move tests to model folder * Fix rebase * Remove DetrForSegmentation from auto mapping * Apply suggestions from code review * Improve variable names and docstrings * Apply some more suggestions from code review * Apply suggestion from code review * better docs and variables names * hint to num_queries and two_stage confusion * remove asserts and code refactor * add exception if two_stage is True and with_box_refine is False * use f-strings * Improve docs and variable names * Fix code quality * Fix rebase * Add require_torch_gpu decorator * Add pip install ninja to CI jobs * Apply suggestion of @sgugger * Remove DeformableDetrForObjectDetection from auto mapping * Remove DeformableDetrModel from auto mapping * Add model to toctree * Add 
model back to mappings, skip model in pipeline tests * Apply @sgugger's suggestion * Fix imports in the init * Fix copies * Add CPU implementation * Comment out GPU function * Undo previous change * Apply more suggestions * Remove require_torch_gpu annotator * Fix quality * Add logger.info * Fix logger * Fix variable names * Fix initializaztion * Add missing initialization * Update checkpoint name * Add model to doc tests * Add CPU/GPU equivalence test * Add Deformable DETR to pipeline tests * Skip model for object detection pipeline Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Nouamane Tazi <nouamane98@gmail.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-09-14 11:45:21 +02:00
Chris Emezue	470799b3a6	Removed issue in wav2vec link (#18945 ) Fix connected to [this issue](https://github.com/huggingface/transformers/issues/18944)	2022-09-12 21:59:19 +02:00
Tobias Nusser	4c2e983f44	Fixed typo (#18921 ) Fixed typo itmes --> items	2022-09-12 21:03:48 +02:00
Rafał Jankowski	85125fcffd	Neptune.ai integration improvements (#18934 ) * NeptuneCallback improvements * After review suggestions and deduplication of initial run * Added volatile checkpoints support due to missing post-rebase commit * Update README per review comments - Remove list formatting - Correct Neptune docs link Co-authored-by: Sabine <sabine.nyholm@neptune.ai>	2022-09-09 11:37:34 -04:00
HuYong	22f7218560	add task_type_id to BERT to support ERNIE-2.0 and ERNIE-3.0 models (#18686 ) * add_ernie * remove Tokenizer in ernie * polish code * format code style * polish code * fix style * update doc * make fix-copies * change model name * change model name * fix dependency * add more copied from * rename ErnieLMHeadModel to ErnieForCausalLM do not expose ErnieLayer update doc * fix * make style * polish code * polish code * fix * fix * fix * fix * fix * final fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-09 07:36:46 -04:00
NielsRogge	bb6f6d5338	Add X-CLIP (#18852 ) * First draft * Improve conversion script * Make vision encoder work * More improvements * Improve conversion script * Fix quality * Add MultiframeIntegrationTransformer * More improvements * Make MiT output work * Fix quality * Add prompts generator * Add tests * Fix some tests * Fix some more tests * Fix more tests * Improve conversion script * Fix model outputs * Fix more tests * Add XClipProcessor * Use processor in conversion script * Fix integration test * Update README, fix docs * Fix all tests * Add MIT output to XClipOutput * Create better variable names * Rename XClip to XCLIP * Extend conversion script * Add support for large models * Add support for 16 frame models * Add another model' * Fix module issue * Apply suggestions from code review * Add figure to docs * Fix CLIPProcessor issue * Apply suggestions from code review * Delete file * Convert more checkpoints * Convert last checkpoint * Update nielsr to microsoft	2022-09-08 14:50:30 +02:00
Devlee247	9832ac7c73	Fix LayoutXLM wrong link in README (#18932 ) * fix LayoutXLM wrong link in README * fix LayoutXLM wrong link in index.mdx	2022-09-08 07:32:41 -04:00
Steven Liu	90f6fe9155	Skip some doctests in quicktour (#18927 ) * skip some code examples for doctests * make style * fix code snippet formatting * separate code snippet into two blocks	2022-09-07 14:45:22 -07:00
Ankur Goyal	2ef7742117	Add DocumentQuestionAnswering pipeline (#18414 ) * [WIP] Skeleton of VisualQuestionAnweringPipeline extended to support LayoutLM-like models * Fixup * Use the full encoding * Basic refactoring to DocumentQuestionAnsweringPipeline * Cleanup * Improve args, docs, and implement preprocessing * Integrate OCR * Refactor question_answering pipeline * Use refactored QA code in the document qa pipeline * Fix tests * Some small cleanups * Use a string type annotation for Image.Image * Update encoding with image features * Wire through the basic docs * Handle invalid response * Handle empty word_boxes properly * Docstring fix * Integrate Donut model * Fixup * Incorporate comments * Address comments * Initial incorporation of tests * Address Comments * Change assert to ValueError * Comments * Wrap `score` in float to make it JSON serializable * Incorporate AutoModeLForDocumentQuestionAnswering changes * Fixup * Rename postprocess function * Fix auto import * Applying comments * Improve docs * Remove extra assets and add copyright * Address comments Co-authored-by: Ankur Goyal <ankur@impira.com>	2022-09-07 13:38:49 -04:00
Matt	2b9513fdab	Update TF fine-tuning docs (#18654 ) * Update TF fine-tuning docs * Fix formatting * Add some section headers so the right sidebar works better * Squiggly it * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Explain things in the text, not the comments * Make the two dataset creation methods into a list * Move the advice about collation out of a <Tip> * Edits for clarity * Edits for clarity * Edits for clarity * Replace `to_tf_dataset` with `prepare_tf_dataset` in the fine-tuning pages * Restructure the page a little bit * Restructure the page a little bit * Restructure the page a little bit 
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-09-07 13:30:07 +01:00
Ekagra Ranjan	0a632f076d	Fix incorrect size of input for 1st strided window length in `Perplexity of fixed-length models` (#18906 ) * update the PPL for stride 512 * fix 1st strided window size * linting * fix typo * styling	2022-09-06 15:20:12 -04:00
Ekagra Ranjan	f85acb4d73	Fix decode_input_ids to bare T5Model and improve doc (#18791 ) * use tokenizer to output tensor * add preprocessing for decoder_input_ids for bare T5Model * add preprocessing to tf and flax * linting * linting * Update src/transformers/models/t5/modeling_flax_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/t5/modeling_tf_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-09-06 14:12:26 +02:00
Surya Prakash Sahu	17c634fd5b	Update perf_train_gpu_one.mdx (#18442 )	2022-09-05 14:06:36 +02:00
Lysandre Debut	591cfc6c90	Mention TF and Flax checkpoints (#18894 )	2022-09-05 11:09:39 +02:00
Steven Liu	65fb71bc76	Add Trainer to quicktour (#18723 ) * 📝 update quicktour * 📝 add trainer section * 🖍 markdown table, apply feedbacks * ✨ make style * add tf training section * make style	2022-09-02 15:05:31 -05:00
Steven Liu	ae32f3afef	Finetune guide for semantic segmentation (#18640 ) * 📝 first draft * oops add to toctree * make style * 📝 add inference section * 🖍 make style * 📝 add images * 🖍 apply feedbacks * remove num_labels and pytorch block * apply feedbacks, add colab notebook Co-authored-by: Steven <stevhliu@gmail.com>	2022-09-02 14:29:51 -05:00
Steven Liu	bf9d506137	Update docs landing page (#18590 ) * 📝 update docs landing page * 🖍 apply feedbacks * apply feedbacks * apply feedbacks, use <br> for list	2022-09-02 14:29:06 -05:00
Jason Phang	53e33e6f1b	PEGASUS-X (#18551 ) * PegasusX Initial commit * rename * pegasus X implementation * pegx update * pegx fix * pegasus-x fixes * pegx updates * cleanup * cleanup * cleanup * tests * stylefixes * Documentation update * Model hub fix * cleanup * update * update * testfix * Check fix * tweaks for merging * style * style * updates for pr * style * change pegasus-x repo	2022-09-02 19:54:02 +02:00
NielsRogge	17981faf67	Add OWL-ViT to the appropriate section (#18867 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-02 15:59:25 +02:00
NielsRogge	c60dd98e87	[LayoutLM] Add clarification to docs (#18716 ) * Add clarification * Add another clarification * Apply suggestion Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-02 14:48:19 +02:00
OlivierDehaene	129d73294e	Fix naming issue with ImageToText pipeline (#18864 ) Co-authored-by: Olivier Dehaene <olivier@huggingface.co>	2022-09-02 07:55:30 -04:00
Steven Liu	142e12afb4	Split docs on modality (#18205 ) * update * 🖍 add missing files * 📝 add nested sections * 🖍 align titles with tasks * oops * remove quotes from titles	2022-09-01 15:19:11 -05:00
OlivierDehaene	ddb69e5af8	Add Image To Text Generation pipeline (#18821 ) * Add Image2TextGenerationPipeline to supported pipelines * Add Flax and Tensorflow support * Add Flax and Tensorflow small tests * Add default model for Tensorflow * Add docstring * Fix doc style * Add tiny models for pytorch and flax * Remove flax from pipeline. Fix tests * Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and Tensorflow * Fix Tensorflow support Co-authored-by: Olivier Dehaene <olivier@huggingface.co>	2022-09-01 12:07:14 -04:00
Sayak Paul	954e18ab97	TensorFlow MobileViT (#18555 ) * initial implementation. * add: working model till image classification. * add: initial implementation that passes intg tests. Co-authored-by: Amy <aeroberts4444@gmail.com> * chore: formatting. * add: tests (still breaking because of config mismatch). Co-authored-by: Yih <2521628+ydshieh@users.noreply.github.com> * add: corrected tests and remaining changes. * fix code style and repo consistency. * address PR comments. * address Amy's comments. * chore: remove from_pt argument. * chore: add full-stop. * fix: TFLite model conversion in the doc. * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply formatting. * chore: remove comments from the example block. * remove indentation in the example. Co-authored-by: Amy <aeroberts4444@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-09-01 10:35:15 -04:00
Pedro Cuenca	f719c0377f	Minor typo in prose of model outputs documentation. (#18848 )	2022-09-01 12:05:40 +02:00
flozi00	359f7b4b8d	Create pipeline_tutorial.mdx german docs (#18625 ) * Create pipeline_tutorial.mdx * Update _toctree.yml	2022-09-01 09:57:59 +02:00
lewtun	80367cd1fb	Add security warning about the from_pretrained() method (#18801 ) * Add security warning about from_pretrained() method * Add sentence about malware scanner Co-authored-by: Julien Chaumond <julien@huggingface.co>	2022-08-31 21:48:40 +02:00
NielsRogge	7e7f743481	Add SegFormer ONNX support (#18006 ) * Add ONNX support * Make height and width dynamic axes Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-31 20:58:44 +02:00
Ankur Goyal	5c4c869014	Add LayoutLMForQuestionAnswering model (#18407 ) * Add LayoutLMForQuestionAnswering model * Fix output * Remove TF TODOs * Add test cases * Add docs * TF implementation * Fix PT/TF equivalence * Fix loss * make fixup * Fix up documentation code examples * Fix up documentation examples + test them * Remove LayoutLMForQuestionAnswering from the auto mapping * Docstrings * Add better docstrings * Undo whitespace changes * Update tokenizers in comments * Fixup code and remove `from_pt=True` * Fix tests * Revert some unexpected docstring changes * Fix tests by overriding _prepare_for_class Co-authored-by: Ankur Goyal <ankur@impira.com>	2022-08-31 10:05:33 +02:00
Dhruv Karan	220da3b8a1	Adds GroupViT to models exportable with ONNX (#18628 ) * groupvit to onnx * dynamic shape for pixel values dim	2022-08-30 14:31:35 +02:00
Dhruv Karan	46d0e26a27	Adds OWLViT to models exportable with ONNX (#18588 ) * onnx conversion for owlvit * .T to .t() * dynamic shapes for pixel values	2022-08-30 14:30:59 +02:00
Christoffer Koo Øhrstrøm	de8548ebf3	[LayoutLMv3] Add TensorFlow implementation (#18678 ) Co-authored-by: Esben Toke Christensen <esben.christensen@visma.com> Co-authored-by: Lasse Reedtz <lasse.reedtz@visma.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-08-30 11:48:11 +01:00
Philipp Schmid	f2fbe44753	Fix broken link DeepSpeed documentation link (#18783 ) * Fix broken link * Trigger CI Co-authored-by: Stas Bekman <stas@stason.org>	2022-08-28 19:32:19 -07:00
Patrick Deutschmann	3223d49354	Add ONNX support for Longformer (#17176 ) * Implement ONNX support for Longformer Fix repo consistency check complaints Fix value mismatches Add pooler output for default model Increase validation atol to accommodate multiple-choice error Fix copies Fix chunking for longer sequence lengths Add future comment * Fix issue in mask_invalid_locations * Remove torch imports in configuration_longformer * Change config access to fix LED * Push opset version to support tril * Work in review comments (mostly style) * Add Longformer to ONNX tests	2022-08-25 08:34:42 +02:00
Daniel Stancl	c72d7d91bf	Add TF implementation of `XGLMModel` (#16543 ) * Add TFXGLM models * Add todo: self.supports_xla_generation = False Co-authored-by: Daniel Stancl <stancld@Daniels-MacBook-Pro.local> Co-authored-by: Daniel Stancl <stancld@daniels-mbp.home> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Daniel <daniel.stancl@rossum.ai> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-08-24 10:51:05 +01:00
Yih-Dar	cecf9f9b27	fix pipeline_tutorial.mdx doctest (#18717 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-24 05:38:03 -04:00
Mishig Davaadorj	c12dbdc246	Update perf_infer_gpu_many.mdx (#18744 )	2022-08-24 10:37:52 +02:00
Younes Belkada	a123eee9df	[bnb] Move documentation (#18671 ) * fix bnb documentation - move bnb documentation to `infer_gpu_many` * small refactoring - added text on infer_gpu_one - added a small note on infer_gpu_many - added customized multi gpu example on infer_gpu_many * Update docs/source/en/perf_infer_gpu_many.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply suggestions Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-08-18 17:34:48 +02:00
Younes Belkada	6d175c1129	[bnb] Minor modifications (#18631 ) * bnb minor modifications - refactor documentation - add troubleshooting README - add PyPi library on DockerFile * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * put in one block - put bash instructions in one block * update readme - refactor a bit hardware requirements * change text a bit * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * apply suggestions Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * add link to paper * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update tests/mixed_int8/README.md * Apply suggestions from code review * refactor a bit * add instructions Turing & Ampere Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add A6000 * clarify a bit * remove small part * Update tests/mixed_int8/README.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2022-08-17 00:48:10 +02:00
flozi00	a27195b1de	Update longt5.mdx (#18634 )	2022-08-16 10:20:46 -05:00
Sourab Mangrulkar	9cf274685a	mac m1 `mps` integration (#18598 ) * mac m1 `mps` integration * Update docs/source/en/main_classes/trainer.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * addressing comments * Apply suggestions from code review Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com> * resolve comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>	2022-08-16 16:34:51 +05:30
Stas Bekman	37c5991843	[doc] fix anchors (#18591 ) the manual anchors end up being duplicated with automatically added anchors and no longer work.	2022-08-12 10:49:59 -07:00
Niklas Muennighoff	56ef0ba447	Update BLOOM parameter counts (#18531 ) * Update BLOOM parameter counts * Update BLOOM parameter counts	2022-08-12 19:36:18 +02:00
NielsRogge	153d1361c7	Fix URLs (#18604 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-12 18:52:49 +02:00
NielsRogge	2ab790e82d	Add Donut (#18488 ) * First draft * Improve script * Update script * Make conversion work * Add final_layer_norm attribute to Swin's config * Add DonutProcessor * Convert more models * Improve feature extractor and convert base models * Fix bug * Improve integration tests * Improve integration tests and add model to README * Add doc test * Add feature extractor to docs * Fix integration tests * Remove register_buffer * Fix toctree and add missing attribute * Add DonutSwin * Make conversion script work * Improve conversion script * Address comment * Fix bug * Fix another bug * Remove deprecated method from docs * Make Swin and Swinv2 untouched * Fix code examples * Fix processor * Update model_type to donut-swin * Add feature extractor tests, add token2json method, improve feature extractor * Fix failing tests, remove integration test * Add do_thumbnail for consistency * Improve code examples * Add code example for document parsing * Add DonutSwin to MODEL_NAMES_MAPPING * Add model to appropriate place in toctree * Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-12 16:40:58 +02:00
Yih-Dar	2156619f10	Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-12 15:10:00 +02:00
Wang, Yi	3cdaea47ec	update doc for perf_train_cpu_many, add intel mpi introduction (#18576 ) * update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-08-12 08:36:27 -04:00
Alara Dirik	f28f240828	fix owlvit tests, update docstring examples (#18586 )	2022-08-11 19:10:25 +03:00
flozi00	5d3f037433	german docs translation (#18544 ) * Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" or "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml	2022-08-11 09:52:27 -04:00
Dhruv Karan	f62cb8313c	Adds CLIP to models exportable with ONNX (#18515 ) * onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format * review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * remove comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-08-10 15:47:31 -04:00
Steven Liu	6936e7c487	Update philosophy to include other preprocessing classes (#18550 ) * 📝 update philosophy to include other preprocessing classes * 🖍 apply feedbacks	2022-08-10 13:20:39 -05:00
Younes Belkada	4a51075a96	`bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901 ) * first commit * correct replace function * add final changes - works like charm! - cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * revmoe the small hack * modify utils file * make style + refactor a bit * create correctly device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions - remove with torch.grad - do not rely on Python bool magic! 
* add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor a bit testing class * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * male style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor a bit testing code + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * add few comments * add more doctring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * odify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from 
code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace `n`by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2022-08-10 09:13:36 +02:00
Steven Liu	8cf4a6f0a6	📝 update documentation build section (#18548 )	2022-08-09 18:22:55 -05:00
Steven Liu	0c183cc2f4	📝 update metric with evaluate (#18535 )	2022-08-09 11:58:11 -05:00
Thomas Chaigneau	8cb5ecd912	Add mt5 onnx config (#18394 ) * update features * MT5OnnxConfig added with updated with tests and docs * fix imports * fix onnc_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>	2022-08-09 03:46:53 -04:00
AguilaCudicio	499450ed75	Spanish translation of summarization.mdx (#15947 ) (#18477 ) * Add Spanish translation of summarization.mdx * Apply suggestions from code review Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-08-08 15:54:11 -04:00
Ian Castillo	ed70f24291	Add Spanish translation of converting_tensorflow_models.mdx (#18512 ) * Add file in spanish docs to be translated * Finish translation to Spanish * Improve Spanish wording * Add suggested changes from review	2022-08-08 15:53:43 -04:00
Mishig Davaadorj	f1f5de31ed	Update perf_train_gpu_one.mdx (#18532 )	2022-08-08 20:33:34 +02:00
Steven Liu	3632531ec6	Add example of multimodal usage to pipeline tutorial (#18498 ) * 📝 add example of multimodal usage to pipeline tutorial * 🖍 apply feedbacks * 🖍 apply niels feedback	2022-08-08 11:31:31 -05:00
Steven Liu	36b37990af	✨ update to use interlibrary links instead of Markdown (#18500 )	2022-08-08 10:53:52 -05:00
Sourab Mangrulkar	2fecde742d	update fsdp docs (#18521 ) * updating fsdp documentation * typo fix	2022-08-08 18:56:51 +05:30
Julien Chaumond	8d1f9039d0	Just re-reading the whole doc every couple of months 😬 (#18489 ) * Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task	2022-08-06 09:38:55 +02:00
Yih-Dar	9d64f7f00c	Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-05 19:17:51 +02:00
Sylvain Gugger	faacdf007b	Move cache folder to huggingface/hub for consistency with hf_hub (#18492 ) * Move cache folder to just huggingface * Thank you VsCode for this needless import * Move to hub * Forgot one	2022-08-05 13:14:00 -04:00
NielsRogge	f9a0008d2d	Add VideoMAE (#17821 ) * First draft * Add VideoMAEForVideoClassification * Improve conversion script * Add VideoMAEForPreTraining * Add VideoMAEFeatureExtractor * Improve VideoMAEFeatureExtractor * Improve docs * Add first draft of model tests * Improve VideoMAEForPreTraining * Fix base_model_prefix * Make model take pixel_values of shape (B, T, C, H, W) * Add loss computation of VideoMAEForPreTraining * Improve tests * Improve model tests * Make all tests pass * Add VideoMAE to main README * Add tests for VideoMAEFeatureExtractor * Add integration test * Improve conversion script * Rename patch embedding class * Remove VideoMAELayer from init * Update design of patch embeddings * Improve comments * Improve conversion script * Improve conversion script * Add conversion of pretrained model * Add loss verification of pretrained model * Add loss verification of unnormalized targets * Add integration test for pretraining model * Apply suggestions from code review * Fix bug to make feature extractor resize only shorter edge * Address more comments * Improve normalization of videos * Add doc examples * Move constants to dedicated script * Remove scripts * Transfer checkpoints, fix docs * Update script * Update image mean and std * Fix doc tests * Set return_tensors to NumPy by default * Revert the previous change Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-04 18:02:55 +02:00
Ian Castillo	10e1ec9a8c	Add Spanish translation of run_scripts.mdx (#18415 ) * Add file in spanish docs to be translated * Translate first two sections to Spanish * Translate four additional sections to Spanish * Finish translation to Spanish * Improve writing style in Spanish * Add suggested changes from reviewer	2022-08-03 07:32:20 -04:00
Steven Liu	92915ebec2	Update _toctree.yml (#18440 ) This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.	2022-08-03 12:26:01 +02:00
Christopher Akiki	5096a654b7	Add programming languages (#18434 ) The current wording makes it sound as if the programming languages are part of the 46 natural languages.	2022-08-02 16:02:25 -04:00
Alara Dirik	8ae7784256	update maskformer docs (#18423 ) * update maskformer docs * fix typo	2022-08-02 18:43:58 +03:00
Steven Liu	151a2aaa4e	Split model list on modality (#18328 ) * 📝 split up model list * Adapt script to reorg * apply niels feedback Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-08-01 11:10:20 -05:00
Sylvain Gugger	01db72abd4	Rewrite push_to_hub to use upload_files (#18366 ) * Rewrite push_to_hub to use upload_files * Adapt the doc a bit * Address review comments and clean doc	2022-08-01 12:07:30 -04:00
Ikuya Yamada	62098b9348	Adding fine-tuning models to LUKE (#18353 ) * add LUKE models for downstream tasks * add new LUKE models to docs * fix typos * remove commented lines * exclude None items from tuple return values	2022-08-01 11:09:47 -04:00
NielsRogge	7b9e995b70	Fix docs (#18399 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-01 17:02:51 +02:00
Sylvain Gugger	986526a0e4	Replace `as_target` context managers by direct calls (#18325 ) * Preliminary work on tokenizers * Quality + fix tests * Treat processors * Fix pad * Remove all uses of in tests, docs and examples * Replace all as_target_tokenizer * Fix tests * Fix quality * Update examples/flax/image-captioning/run_image_captioning_flax.py Co-authored-by: amyeroberts <amy@huggingface.co> * Style Co-authored-by: amyeroberts <amy@huggingface.co>	2022-07-29 08:09:09 -04:00
Sanchit Gandhi	a4ee463d95	[Docs] Fix Speech Encoder Decoder doc sample (#18346 ) * [Docs] Fix Speech Encoder Decoder doc sample * improve pre-processing comment * make style	2022-07-29 09:11:28 +01:00
Nicola Procopio	985c7e3ac9	Updated _toctree.yml (#18337 )	2022-07-28 09:04:32 -04:00
Edoardo Federici	a8e279579b	updated translation (#18333 ) Left the term fine-tuning since there is no correct translation into Italian and the English term is generally used. The same was done with some terms like "learning rate"	2022-07-28 08:14:15 -04:00
Edoardo Federici	1e380c7dcb	fixed typo (#18331 )	2022-07-28 06:14:56 -04:00
Steven Liu	96be1b7f49	Update feature extractor docs (#18324 ) As pointed out by @NielsRogge, a feature extractor is used to prepare inputs for a model with a single modality rather than multimodal models.	2022-07-27 15:32:57 -05:00
Wang, Yi	2b81f72be9	start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch … (#18229 ) * start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch and should import it before use Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * add doc for perf_train_cpu_many Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * update doc Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2022-07-27 11:15:41 -04:00
Ritik Nandwal	e87ac9d18b	Add swin transformer v2 (#17469 ) * Add files generated using transformer-cli add-new-model-like command * Add changes for swinv2 attention and forward method * Add fixes * Add modifications for weight conversion and remaining args in swin model * Add changes for patchmerging * Add changes for SwinV2selfattention * Update conversion script * Add final fixes for the swin_v2 model * Add changes for conversion script for pretrained window size case * Add pretrained window size value from config in SwinV2Encoder class * Make fixup * Add swinv2 to models_not_in_readme to utils/check_copies.py * Modify Swinv2v2 to Swin Transformer V2 * Remove copied from, to run make fixup command * Add updates to swinv2tf from main branch * Add pretrained_window_size to config, to make tests pass * Add modified weights from nandwalritik profile for swinv2 * Update model weights from swinv2 from nandwalritik profile * Add fix for build_pr_documentation CI fix * Add fixes for weight conversion * Add change to make input with padding work * Add fixes for test cases * Add few changes from swin to swinv2 to pass test cases * Remove tests for tensorflow as swinv2 for TF is not added yet * Overide test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet * Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now. 
* Update docs url for swinv2 in README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Undo changes for check_repo * Update url in readme.md * Remove overrided function to test pt_tf_model_equivalence * Remove TF model imports for Swinv2 as its not implemented in this PR * Add changes for index.mdx * Add swinv2 papers link,abstract and contributors details * Rename cpb_mlp to continous_position_bias_mlp * Add tips for swinv2 model * Update src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update import order in src/transformers/models/swinv2/configuration_swinv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add copyright statements in weights conversion script. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove Swinv2 from models_not_in_readme * Reformat code * Remove TF implementation file for swinv2 * Update start docstring. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add changes for docstring * Update orgname for weights to microsoft * Remove to_2tuple function * Add copied from statements wherever applicable * Add copied from to Swinv2ForMaskedImageModelling class * Reformat code. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add unittest.skip(with reason.) for test_inputs_embeds test case. 
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add updates for test_modeling_swinv2.py * Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function * Add continuous_position_bias_mlp parameter to conversion script * Add test for testing masked_image_modelling for swinv2 * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add suggested changes * Add copied from to forward methods of Swinv2Stage and Swinv2Encoder * Add push_to_hub flag to weight conversion script * Change order or Swinv2DropPath class * Add id2label mapping for imagenet 21k * Add updated url for SwinV2 functions and classes used in implementation * Update input_feature dimensions format, mentioned in comments. Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Add suggested changes for modeling_swin2.py * Update docs * Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient. * Fix indentation. 
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add changes for making Nit objects in code style * Add suggested changes * Add suggested changes for test_modelling_swinv2 * make fix-copies * Update docs/source/en/model_doc/swinv2.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-27 11:14:47 -04:00
NielsRogge	ccd4180f8a	[EncoderDecoder] Improve docs (#18271 ) * Improve docs * Improve docs of speech one as well * Apply suggestions from code review Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-07-27 10:08:59 +02:00
Ian Castillo	a5d504834d	Add Spanish translation of custom_models.mdx (#17807 ) * Update index * Translate to Spanish two sections from custom_models * Translate to Spanish custom models documentation * Fixing typos and grammatical errors * Add requested changes from reviewer	2022-07-26 10:10:37 -04:00
Federico Panero	7ea7eba39d	Add Italian translation of sharing_custom_models.mdx (#17631 ) * work in progress: custom_models * Update custom_models.mdx * Update custom_models.mdx * Update _toctree.yml * Update _toctree.yml * Update custom_models.mdx * Update custom_models.mdx * Update _toctree.yml * Update _toctree.yml Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-26 09:48:58 -04:00
Federico Panero	bbc28106e0	Add Italian translation of converting_tensorflow_models.mdx (#18283 ) * Add Italian translation of converting_tensorflow_models.mdx * Update _toctree.yml * Update converting_tensorflow_models.mdx * Update docs/source/it/_toctree.yml Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-26 08:37:34 -04:00
Fellip Silva Alves	5e0ffd9183	[ create_a_model.mdx ] translate to pt (#18098 ) * [ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial * Delete docs/source/pt-br directory * [ fast_tokenizers.mdx ] - Continuing work on file * [ fast_tokenizers.mdx ] - Continuing work on file * Add fast tokenizers to _toctree.yml * Eliminated config and toctree.yml * Nits in fast_tokenizers.mdx * Finishing create_a_model * [ create_a_model.mdx ] finishing create a model in pt-br * [ Changing _toctree.yml ] adding create a model in pt Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-07-26 08:01:08 -04:00
Gorkem Ozkaya	f58b9c0522	Update translation.mdx (#18169 ) * Update translation.mdx * update translation.mdx by running make style	2022-07-26 07:56:40 -04:00
gilad19	2b09650885	Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924 ) * Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) * Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) * provide classifier only text hidden states * add test_for_token_classification * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/vilt/modeling_vilt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add test_for_token_classification Co-authored-by: gfuchs <gfuchs@ebay.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-07-26 10:11:32 +02:00
Alara Dirik	002915aa2a	Owlvit docs test (#18257 ) * fix docs and add owlvit docs test * fix minor bug in post_process, add to processor * improve owlvit code examples * fix hardcoded image size	2022-07-26 10:55:14 +03:00
Muhammad Ahmed	7cb4da13fe	change bloom parameters to 176B (#18235 )	2022-07-22 10:17:48 -04:00
Fx039482	4935409757	Add Italian translation of create_model.mdx and serialization.mdx (#17640 ) * First commit * final changes * Changed create_model to create_a_model Translated into crea un'architettura personalizzata in the file it/_toctree.yml * Added _toctree.yml in the italian translation loca: serialization title Esporta modelli transformers * Edit translation for create_model.mdx * Added file serialization for translation in italian * Fix toctree serialization position I checked the eng toctree and realized I made a mistake. * Update _toctree.yml Correct spacing Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-22 13:53:54 +02:00
Alara Dirik	12d66b4701	Add OWL-ViT model for zero-shot object detection (#17938 ) * add owlvit model skeleton * add class and box predictor heads * convert modified flax clip to pytorch * fix box and class predictors * add OwlViTImageTextEmbedder * convert class and box head checkpoints * convert image text embedder checkpoints * add object detection head * fix bugs * update conversion script * update conversion script * fix q,v,k,out weight conversion conversion * add owlvit object detection output * fix bug in image embedder * fix bugs in text embedder * fix positional embeddings * fix bug in inference mode vision pooling * update docs, init tokenizer and processor files * support batch processing * add OwlViTProcessor * remove merge conflicts * readd owlvit imports * fix bug in OwlViTProcessor imports * fix bugs in processor * update docs * fix bugs in processor * update owlvit docs * add OwlViTFeatureExtractor * style changes, add postprocess method to feature extractor * add feature extractor and processor tests * add object detection tests * update conversion script * update config paths * update config paths * fix configuration paths and bugs * fix bugs in OwlViT tests * add import checks to processor * fix docs and minor issues * fix docs and minor issues * fix bugs and issues * fix bugs and issues * fix bugs and issues * fix bugs and issues * update docs and examples * fix bugs and issues * update conversion script, fix positional embeddings * process 2D input ids, update tests * fix style and quality issues * update docs * update docs and imports * update OWL-ViT index.md * fix bug in OwlViT feature ext tests * fix code examples, return_dict by default * return_dict by default * minor fixes, add tests to processor * small fixes * add output_attentions arg to main model * fix bugs * remove output_hidden_states arg from main model * update self.config variables * add option to return last_hidden_states * fix bug in config variables * fix copied from 
statements * fix small issues and bugs * fix bugs * fix bugs, support greyscale images * run fixup * update repo name * merge OwlViTImageTextEmbedder with obj detection head * fix merge conflict * fix merge conflict * make fixup * fix bugs * fix bugs * add additional processor test	2022-07-22 13:35:32 +03:00
Sayak Paul	561b9a8c00	[SegFormer] TensorFlow port (#17910 ) * add: segformer utils and img. classification. * add: segmentation layer. * feat: working implementation of segformer. * chore: remove unused variable. * add test, remaining modifications. * remove: unnecessary files. * add: rest of the files. Co-authored-by: matt <rocketknight1@gmail.com> * chore: remove ModuleList comment. * chore: apply make style. * chore: apply make fixup-copies. * add to check_repo.py * add decode head to IGNORE_NON_TESTED * chore: run make style. * chore: PR comments. * chore: minor changes to model doc. * tests: reduction across samples. * add a note on the space. * sort importats. * fix: reduction in loss computation. * chore: align loss function with that of NER. * chore: correct utils/documentation_tests.txt Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * chore: simplify the interpolation of logits in loss computation. * chore: return transposed logits when return_dict=False. * chore: add link to the tf fine-tuning repo. * address pr comments. * address niels's comments. * remove from_pt=True since tf weights are in. * remove comment from pt model. * address niels's comments. Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-07-21 18:22:37 +01:00
Martina Fumanelli	07575e869d	Italian/accelerate (#17698 ) * Add 'accelerate' to _toctree file * Fix 'training with a nb' title Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-21 14:23:47 +02:00
Martina Fumanelli	8881e58b22	Italian/model sharing (#17828 ) * Add Italian translation of the doc file model_sharing.mdx * Fix style * Fix typo * Update docs/source/it/_toctree.yml Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-21 14:07:53 +02:00
Lorenzo Balzani	0d971be84f	Italian translation of run_scripts.mdx gh-17459 (#17642 ) * Run_scripts Italian translation gh-17459 * Updated run_scripts gh-17642 * Updated run_scripts gh-17642 Made the text more gender-neutral. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-21 12:02:08 +02:00
Nicola Procopio	9f787ce874	Translation/debugging (#18230 ) * added debugging.mdx * updated debugging.mdx * updated translation * updated translation debugging * translated debugging * updated _toctree.yml	2022-07-21 11:02:26 +02:00
Zhi Zheng	dbfeffd7c9	Update add_new_pipeline.mdx (#18224 ) fix typo	2022-07-21 07:55:30 +02:00
Steven Liu	ff56b8fbff	Add custom config to quicktour (#18115 ) * 📝 first draft of new quicktour * make style * 🖍 edit and review * 🖍 small fixes * 🖍 only add custom config section * 🖍 use autoclass instead	2022-07-20 12:23:03 -05:00
Raghavan	dcec4c4387	Adding OPTForSeqClassification class (#18123 ) * Adding OPTForSeqClassification class * Fix import issues * Add documentation for optforseqclassification * Remove checkout * fix failing tests * fix typo * Fix code formatting * Incorporating the PR feedbacks * Incorporate PR Feedbacks * Fix failing test and add new test for multi label setup * Fix formatting issue * Fix failing tests * Fix formatting issues * Fix failing tests * Fix failing tests * Fix failing tests * Fix failing tests * PR feedback	2022-07-20 10:14:21 +02:00
Sylvain Gugger	dc9147ff36	Custom pipeline (#18079 ) * Initial work * More work * Add tests for custom pipelines on the Hub * Protect import * Make the test work for TF as well * Last PyTorch specific bit * Add documentation * Style * Title in toc * Bad names! * Update docs/source/en/add_new_pipeline.mdx Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Auto stash before merge of "custom_pipeline" and "origin/custom_pipeline" * Address review comments * Address more review comments * Update src/transformers/pipelines/__init__.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-07-19 12:02:35 +02:00
Nicola Procopio	8e445ca51d	Translation/training: italian translation training.mdx (#17662 ) * added training.mdx * updated training.mdx * updated training.mdx * updated training.mdx * updated _toctree.yml * fixed typos after review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-18 19:21:07 +02:00
Nicola Procopio	c4cc894086	Translation italian: multilingual.mdx (#17768 ) * added multilingual.mdx * updated multilingual.mdx * italian translation multilingual.mdx * updated _toctree.yml * fixed typos _toctree.yml * fixed typos after review * fixed error after review	2022-07-18 19:09:08 +02:00
Nicola Procopio	0a5b61d004	Added preprocessing.mdx italian translation (#17600 ) * updated _toctree.yml * added preprocessing * updated preprocessing.mdx * updated preprocessing.mdx updated after review	2022-07-18 19:06:10 +02:00
gcheron	8c14b342aa	add ONNX support for LeVit (#18154 ) Co-authored-by: Guilhem Chéron <guilhemc@authentifier.com>	2022-07-18 15:17:07 +02:00
Lysandre Debut	c1c79b0655	NLLB tokenizer (#18126 ) * NLLB tokenizer * Apply suggestions from code review - Thanks Stefan! Co-authored-by: Stefan Schweter <stefan@schweter.it> * Final touches * Style :) * Update docs/source/en/model_doc/nllb.mdx Co-authored-by: Stefan Schweter <stefan@schweter.it> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * PR reviews * Auto models Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-18 08:12:34 -04:00
amyeroberts	8581a798c0	Add TF DeiT implementation (#17806 ) * Initial TF DeiT implementation * Fix copies naming issues * Fix up + docs * Properly same main layer * Name layers properly * Initial TF DeiT implementation * Fix copies naming issues * Fix up + docs * Properly same main layer * Name layers properly * Fixup * Fix import * Fix import * Fix import * Fix weight loading for tests whilst not on hub * Add doc tests and remove to_2tuple * Add back to_2tuple Removing to_2tuple results in many downstream changes needed because of the copies checks * Incorporate updates in Improve vision models #17731 PR * Don't hard code num_channels * Copy PyTorch DeiT embeddings and remove pytorch operations with mask * Fix patch embeddings & tidy up * Update PixelShuffle to move logic into class layer * Update doc strings - remove PT references * Use NHWC format in internal layers * Fix up * Use linear activation layer * Remove unused import * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Move dataclass to top of file * Remove from_pt now weights on hub * Fixup Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>	2022-07-13 18:04:08 +01:00
Wei	7ea6ccc2b3	Enable torchdynamo with torch_tensorrt(fx path) (#17765 ) * enable fx2trt * Update perf_train_gpu_one.mdx * Update perf_train_gpu_one.mdx * add lib check * update * format * update * fix import check * fix isort * improve doc * refactor ctx manager * fix isort * black format * isort fix * fix format * update args * update black * cleanups * Update perf_train_gpu_one.mdx * code refactor * code refactor to init * remove redundancy * isort * replace self.args with args Co-authored-by: Stas Bekman <stas@stason.org>	2022-07-13 12:43:28 -04:00
Yulv-git	95113d1365	Fix some typos. (#17560 ) * Fix some typos. Signed-off-by: Yulv-git <yulvchi@qq.com> * Fix typo. Signed-off-by: Yulv-git <yulvchi@qq.com> * make fixup.	2022-07-11 05:00:13 -04:00
varshith	91c4a3ab1a	Added Command for windows VENV activation in installation docs (#18008 ) * Added command for windows VENV activation * changed linux and macos specification	2022-07-07 08:18:44 -04:00
Sylvain Gugger	1b749a7f8d	Sort doc toc (#18034 ) * Add script to sort doc ToC * Style and fixes * Add check to quality job	2022-07-07 08:17:58 -04:00
Sylvain Gugger	2e90c3df8f	Doc to dataset (#18037 ) * Link to the Datasets doc * Remove unwanted file	2022-07-06 12:10:06 -04:00
NielsRogge	22edb68d49	Squash commits (#17981 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-07-06 08:11:48 -04:00
Matthijs Hollemans	6cb19540c9	sort list of models (#18011 )	2022-07-04 09:20:55 -04:00
amyeroberts	77ea5130a1	Add TF ResNet model (#17427 ) * Rought TF conversion outline * Tidy up * Fix padding differences between layers * Add back embedder - whoops * Match test file to main * Match upstream test file * Correctly pass and assign image_size parameter Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Add in MainLayer * Correctly name layer * Tidy up AdaptivePooler * Small tidy-up More accurate type hints and remove whitespaces * Change AdaptiveAvgPool Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools if the output shape does not evenly divide by input shape c.f. `9e26607e22 (r900109509)` Co-authored-by: From: matt <rocketknight1@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Use updated AdaptiveAvgPool Co-authored-by: matt <rocketknight1@gmail.com> * Make AdaptiveAvgPool compatible with CPU * Remove image_size from configuration * Fixup * Tensorflow -> TensorFlow * Fix pt references in tests * Apply suggestions from code review - grammar and wording Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add TFResNet to doc tests * PR comments - GlobalAveragePooling and clearer comments * Remove unused import * Add in keepdims argument * Add num_channels check * grammar fix: by -> of Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Remove transposes - keep NHWC throughout forward pass * Fixup look sharp * Add missing layer names * Final tidy up - remove from_pt now weights on hub Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-07-04 10:59:15 +01:00
Lysandre Debut	7b18702ca7	Add link to existing documentation (#17931 )	2022-07-04 04:13:05 -04:00
Nouamane Tazi	b68d408f1b	add ONNX support for BLOOM (#17961 ) * add onnx support for BLOOM * use TYPE_CHECKING for type annotations * fix past_shape for bloom (different from gpt2) * use logical_or instead of `+` for onnx support * bigger `atol_for_validation` for larger bloom models * copied -> taken because it's no longer an exact copy * remove "copied from" comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-07-01 10:44:42 -04:00
Billy Cao	cb42502410	Fix typo in perf_train_gpu_one.mdx (#17983 )	2022-07-01 09:19:13 -04:00
Aaron Pham	49cd736a28	feat: add pipeline registry abstraction (#17905 ) * feat: add pipeline registry abstraction - added `PipelineRegistry` abstraction - updates `add_new_pipeline.mdx` (english docs) to reflect the api addition - migrate `check_task` and `get_supported_tasks` from transformers/pipelines/__init__.py to transformers/pipelines/base.py#PipelineRegistry.{check_task,get_supported_tasks} Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * fix: update with upstream/main chore: Apply suggestions from sgugger's code review Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * chore: PR updates - revert src/transformers/dependency_versions_table.py from upstream/main - updates pipeline registry to use global variables Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * tests: add tests for pipeline registry Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * tests: add test for output warning. Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * chore: fmt and cleanup unused imports Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * fix: change imports to top of the file and address comments Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-30 12:11:08 -04:00
regisss	9cb7cef285	Add ONNX support for LayoutLMv3 (#17953 ) * Add ONNX support for LayoutLMv3 * Update docstrings * Update empty description in docstring * Fix imports and type hints	2022-06-30 12:09:52 -04:00
Crystina	692e61e91a	Flax t5 Encoder (#17784 ) * first draft adding Flax-t5-encoder and Flax-mt5-encoder * imports * after make fixup * flax t5 encoder test * black on test * make fix-copies * clean * all_model_classes -> tuple * clean test * is_encoder_decoder=False in t5-enc tester * remove file docstring before FlaxT5Encoder * black * isort * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * remove _get_encoder_module * self.decoder_seq_length -> self.encoder_seq_length as t5-enc does not have decoder * bugfix - self.module_class is class itself, not instance; * docs for mt5 and t5 * call -> __call__ in t5 doc * FlaxMT5EncoderModel to TYPE_HINT * run doc-builder to allow change the files Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-06-30 00:49:02 +02:00
Matthijs Hollemans	fbc7598bab	add MobileViT model (#17354 ) * add MobileViT * fixup * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * remove empty line Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * use clearer variable names * rename to MobileViTTransformerLayer * no longer inherit from nn.Sequential * fixup * fixup * not sure why this got added twice * rename organization for checkpoints * fix it up * Update src/transformers/models/mobilevit/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/mobilevit/test_modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * code style improvements * fixup * Update docs/source/en/model_doc/mobilevit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/mobilevit.mdx Co-authored-by: NielsRogge 
<48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/mobilevit/configuration_mobilevit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * download labels from hub * rename layers * rename more layers * don't compute loss in separate function * remove some nn.Sequential * replace nn.Sequential with new MobileViTTransformer class * replace nn.Sequential with MobileViTMobileNetLayer * fix pruning since model structure changed * fixup * fix doc comment * remove custom resize from feature extractor * fix ONNX import * add to doc tests * use center_crop from image_utils * move RGB->BGR flipping into image_utils * fix broken tests * wrong type hint * small tweaks Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-29 16:07:51 -04:00
StevenTang1998	3cff4cc587	Add MVP model (#17787 ) * Add MVP model * Update README * Remove useless module * Update docs * Fix bugs in tokenizer * Remove useless test * Remove useless module * Update vocab * Remove specifying * Remove specifying * Add #Copied ... statement * Update paper link * Remove useless TFMvp * Add #Copied ... statement * Fix style in test mvp model * Fix some typos * Fix properties of unset special tokens in non verbose mode * Update paper link * Update MVP doc * Update MVP doc * Fix README * Fix typos in docs * Update docs	2022-06-29 09:30:55 -04:00
Aritra Roy Gosthipaty	a7eba83161	TF implementation of RegNets (#17554 ) * chore: initial commit Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets. * chore: porting the rest of the modules to tensorflow did not change the documentation yet, yet to try the playground on the model * Fix initilizations (#1) * fix: code structure in few cases. * fix: code structure to align tf models. * fix: layer naming, bn layer still remains. * chore: change default epsilon and momentum in bn. * chore: styling nits. * fix: cross-loading bn params. * fix: regnet tf model, integration passing. * add: tests for TF regnet. * fix: code quality related issues. * chore: added rest of the files. * minor additions.. * fix: repo consistency. * fix: regnet tf tests. * chore: reorganize dummy_tf_objects for regnet. * chore: remove checkpoint var. * chore: remov unnecessary files. * chore: run make style. * Update docs/source/en/model_doc/regnet.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * chore: PR feedback I. * fix: pt test. thanks to @ydshieh. * New adaptive pooler (#3) * feat: new adaptive pooler Co-authored-by: @Rocketknight1 * chore: remove image_size argument. Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: matt <rocketknight1@gmail.com> * Empty-Commit * chore: remove image_size comment. * chore: remove playground_tf.py * chore: minor changes related to spacing. * chore: make style. * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: amyeroberts <aeroberts4444@gmail.com> * Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: amyeroberts <aeroberts4444@gmail.com> * chore: refactored __init__. * chore: copied from -> taken from./g * adaptive pool -> global avg pool, channel check. * chore: move channel check to stem. * pr comments - minor refactor and add regnets to doc tests. 
* Update src/transformers/models/regnet/modeling_tf_regnet.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * minor fix in the xlayer. * Empty-Commit * chore: removed from_pt=True. Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-29 13:45:14 +01:00
Jerry Jiarui XU	6c8f4c9a93	Adding GroupViT Models (#17313 ) * add group vit and fixed test (except slow) * passing slow test * addressed some comments * fixed test * fixed style * fixed copy * fixed segmentation output * fixed test * fixed relative path * fixed copy * add ignore non auto configured * fixed docstring, add doc * fixed copies * Apply suggestions from code review merge suggestions Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolve comment, renaming model * delete unused attr * use fix copies * resolve comments * fixed attn * remove unused vars * refactor tests * resolve final comments * add demo notebook * fixed inconsitent default * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * rename stage->stages * Create single GroupViTEncoderLayer class * Update conversion script * Simplify conversion script * Remove cross-attention class in favor of GroupViTAttention * Convert other model as well, add processor to conversion script * addressing final comment * fixed args * Update src/transformers/models/groupvit/modeling_groupvit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-06-28 20:51:47 +02:00
regisss	76d13de5ae	Add ONNX support for DETR (#17904 )	2022-06-28 14:48:43 +02:00
Bill Ray	bfcd5743ee	In `group_texts` function, drop last block if smaller than `block_size` (#17908 )	2022-06-28 08:34:55 -04:00
Matt	ee0d001de7	Add a TF in-graph tokenizer for BERT (#17701 ) * Add a TF in-graph tokenizer for BERT * Add from_pretrained * Add proper truncation, option handling to match other tokenizers * Add proper imports and guards * Add test, fix all the bugs exposed by said test * Fix truncation of paired texts in graph mode, more test updates * Small fixes, add a (very careful) test for savedmodel * Add tensorflow-text dependency, make fixup * Update documentation * Update documentation * make fixup * Slight changes to tests * Add some docstring examples * Update tests * Update tests and add proper lowercasing/normalization * make fixup * Add docstring for padding! * Mark slow tests * make fixup * Fall back to BertTokenizerFast if BertTokenizer is unavailable * Fall back to BertTokenizerFast if BertTokenizer is unavailable * make fixup * Properly handle tensorflow-text dummies	2022-06-27 12:06:21 +01:00
rooa	d6b6fb9963	Add CodeGen model (#17443 ) * Add CodeGen model * Add missing key and switch order of super() * Fix torch.ones init with uint8 instead of bool * Address comments: copy statements and doc * update tests * remove old model parallel * fix batch gen tests * fix batch gen test * update test_gpt2_sample_max_time * fix codgen test and revert gpt2 test change * Fix incorrect tie_word_embedding value, typo, URL * Fix model order in README and styling * Reorder model list alphabetically * Set tie_word_embedding to False by default * Apply suggestions from code review * Better attn mask name & remove attn masked_bias * add tokenizer for codegen * quality * doc tokenizer * fix-copies * add CodeGenTokenizer in converter * make truncation optional * add test for truncation * add copyright * fix-copies * fix fast tokenizer decode * Update src/transformers/models/codegen/tokenization_codegen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * increase vocab_size in tests Co-authored-by: patil-suraj <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-06-24 17:10:38 +02:00
Vishwas	c2c0d9db5f	Improve encoder decoder model docs (#17815 ) * Copied all the changes from the last PR * added in documentation_tests.txt * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/encoder-decoder.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: vishwaspai <vishwas.pai@emplay.net> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2022-06-24 14:48:19 +02:00
Sijun He	7cf52a49de	Nezha Pytorch implementation (#17776 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * wip * wip * wip * most basic tests passes * all tests pass now * relative embedding * wip * running make fixup * remove bert changes * fix doc * fix doc * fix issues * fix doc * address comments * fix CI * remove redundant copied from * address comments * fix broken test Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-06-23 12:36:22 -04:00
Leandro von Werra	6f29029b05	Improve performance docs (#17750 ) * add skeleton files * fix cpu inference link * add hint to make clear that single gpu section contains general info * add new files to ToC * update toctree to have subsection for performance * add "coming soon" to the still empty sections * fix missing title * fix typo * add reference to empty documents * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-23 14:51:54 +02:00
Anugunj Naman	27e907386a	Fix Automatic Download of Pretrained Weights in DETR (#17712 ) * added use_backbone_pretrained * style fixes * update * Update detr.mdx * Update detr.mdx * Update detr.mdx * update using doc py * Update detr.mdx * Update src/transformers/models/detr/configuration_detr.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-21 16:45:35 +02:00
mrbean	eb16be415a	add onnx support for deberta and debertav2 (#17617 ) * add onnx support for debertav2 * debertav2 -> deberta-v2 in onnx features file * remove causal lm * add deberta-v2-xlarge to onnx tests * use self.type().dtype() in xsoftmax Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * remove hack for deberta * remove unused imports * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * use generate dummy inputs * linter * add imports * add support for deberta v1 as well * deberta does not support multiple choice * Update src/transformers/models/deberta/configuration_deberta.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com> * one line ordered dict * fire build Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>	2022-06-21 11:04:15 +02:00
Patrick von Platen	8fcbe275c3	Add UL2 (just docs) (#17740 ) * Add UL2 Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com> * Correct naming * sort better * up * apply sylvains suggestion	2022-06-21 10:24:50 +02:00
Rafael Zimmer	0d92798b45	Added translation of index.mdx to Portuguese Issue #16824 (#17565 ) * Added translation of installation.mdx to Portuguese, as well as default templates of _toctree.yml and _config.py * [ build_documentation.yml ] - Updated doc_builder to build documentation in Portuguese. [ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx. * [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder. [ pipeline_tutorial.mdx ] - Grammar changes. * [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial. * [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial. [ training.mdx ] - Added portuguese translation for training tutorial. * [ preprocessing.mdx ] - WIP * Update _toctree.yml * Adding Pré-processamento to _toctree.yml * Update accelerate.mdx * Nits and eliminate preprocessing file while it is ready * [ index.mdx ] - Translated to Portuguese the index apresentation page. * [ docs/source/pt ] - Updated _toctree.yml to match newest translations. * Fix build_pr_documentation.yml * Fix index nits * nits in _toctree Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-06-17 20:06:05 -04:00
Sylvain Gugger	3981ee8650	Sort the model doc Toc Alphabetically (#17723 )	2022-06-15 16:11:56 -04:00
Patrick von Platen	7f14839f55	[Wav2Vec2Conformer] Official release (#17709 ) * [Wav2Vec2Conformer] Official release * remove from not-in-readme	2022-06-15 18:34:15 +02:00
Hailey Schoelkopf	edb672ac5e	Add `BloomForSequenceClassification` and `BloomForTokenClassification` classes (#17639 ) * add new bloom classes * (feat) add bloom classification tests; make style * style: change import in test * add some typehints to bloom classes * merge main into branch * fix: input checking in bloom seq classification * fix tests * change model class tests * fix few tests - more tests should pass - one test left * make token classifier return hidden states * style: make BLOOM typehints consistent Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2022-06-14 17:10:12 +02:00
jianan-gu	3b29c9fdb7	Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153 ) * add jit mode option and model wrap * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add ut and refine code * code refine * refine code * add inference doc * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add cpu inference performance doc * Update perf_infer_cpu.mdx * Update perf_infer_cpu.mdx * Update performance.mdx * Update _toctree.yml * refine jit func naming * Update _toctree.yml * Delete perf_infer_gpu_one.mdx * Update perf_infer_cpu.mdx * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add none check before jit * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-14 07:56:47 -04:00
Daniel Stancl	a72f1c9f5b	Add `LongT5` model (#16792 ) * Initial commit * Make some fixes * Make PT model full forward pass * Drop TF & Flax implementation, fix copies etc * Add Flax model and update some corresponding stuff * Drop some TF things * Update config and flax local attn * Add encoder_attention_type to config * . * Update docs * Do some cleansing * Fix some issues -> make style; add some docs * Fix position_bias + mask addition + Update tests * Fix repo consistency * Fix model consistency by removing flax operation over attn_mask * [WIP] Add PT TGlobal LongT5 * . * [WIP] Add flax tglobal model * [WIP] Update flax model to use the right attention type in the encoder * Fix flax tglobal model forward pass * Make the use of global_relative_attention_bias * Add test suites for TGlobal model * Fix minor bugs, clean code * Fix pt-flax equivalence though not convinced with correctness * Fix LocalAttn implementation to match the original impl. + update READMEs * Few updates * Update: [Flax] improve large model init and loading #16148 * Add ckpt conversion script accoring to #16853 + handle torch device placement * Minor updates to conversion script. * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM * gpu support + dtype fix * Apply some suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * * Remove (de)parallelize stuff * Edit shape comments * Update README.md * make fix-copies * Remove caching logic for local & tglobal attention * Apply another batch of suggestions from code review * Add missing checkpoints * Format converting scripts * Drop (de)parallelize links from longT5 mdx * Fix converting script + revert config file change * Revert "Remove caching logic for local & tglobal attention" This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46. 
* Stash caching logic in Flax model * Make side relative bias used always * Drop caching logic in PT model * Return side bias as it was * Drop all remaining model parallel logic * Remove clamp statements * Move test files to the proper place * Update docs with new version of hf-doc-builder * Fix test imports * Make some minor improvements * Add missing checkpoints to docs * Make TGlobal model compatible with torch.onnx.export * Replace some np.ndarray with jnp.ndarray * Fix TGlobal for ONNX conversion + update docs * fix _make_global_fixed_block_ids and masked neg value * update flax model * style and quality * fix imports * remove load_tf_weights_in_longt5 from init and fix copies * add slow test for TGlobal model * typo fix * Drop obsolete is_parallelizable and one warning * Update __init__ files to fix repo-consistency * fix pipeline test * Fix some device placements * [wip]: Update tests -- need to generate summaries to update expected_summary * Fix quality * Update LongT5 model card * Update (slow) summarization tests * make style * rename checkpoitns * finish * fix flax tests Co-authored-by: phungvanduy <pvduy23@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: patil-suraj <surajp815@gmail.com>	2022-06-13 22:36:58 +02:00
Sijun He	66336dc183	Add Visual Question Answering (VQA) pipeline (#17286 ) * wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * Update src/transformers/models/auto/modeling_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * merge Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-13 07:49:44 -04:00
Martina Fumanelli	e0b58fb5ba	Translation/autoclass (#17615 ) * Add Italian translation for autoclass_tutorial.mdx * Fix synthesis Co-authored-by: martina.fumanelli <martina.fumanelli@MBP-di-martinafumanelli.local>	2022-06-09 20:56:44 -04:00
Sylvain Gugger	29080643eb	Mention in the doc we drop support for fairscale (#17610 )	2022-06-09 12:20:39 -04:00
regisss	e0be053e43	Add ONNX support for ConvNeXT (#17627 )	2022-06-09 09:31:02 -04:00
regisss	5323094a22	Add ONNX support for ResNet (#17585 ) * Add ONNX support for ResNet * Add ONNX test * make fix-copies	2022-06-09 08:44:27 -04:00
Younes Belkada	ca2a55e9df	BLOOM (#17474 ) * adding template * update model * model update * update conf for debug model * update conversion * update conversion script * update conversion script * fix missing keys check * add tests to test the tokenizer in the local machine * Change variable name * add tests on xnli dataset * add more description * add descriptions + clearer code * clearer code * adding new tests + skipping few tests because of env problems * change comment * add dtype on the configuration * add test embeddings * add hardcoded test * fix dtype issue * adding torch.float16 to config * adding more metrics (min, max, mean) * add sum * now the test passes with almost equal * add files for conversion - test passes on cpu gpu * add final changes * cleaning code * add new args in the docstring * fix one liner function * remove macros * remove forward attention * clean up init funtion * add comments on the issue * rm scale mask softmax * do make style * fix dtype in init * fixing for loop on att probs * fix style with black * fix style + doc error * fix and debug CI errors (docs + style) * some updates - change new operations - finally add scaled softmax - added new args in the config * make use cache working * add changes - save sharded models - final changes on the modeling script * add changes - comment on alibi - add TODO on seq length * test commit - added a text to test the commit Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * final changes - attention mask change - generation works on BS176b Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * changes - model + conversion * move to correct dir * put , * fex fixes * fix tokenizer autodoc * fix minor CI issues * fix minor CI issues * fix minor CI issues * fix style issue * fix minor import issues * fix few issues * remove def main on the test * add require torch * replace decorator with 'with' * fix style * change to bloom * add quick fix tokenizer * fix 
tokenizer file * fix tokenizer - merge tests - small fixes * fix import issue * add bloom to readme * fix consistency * Update docs/source/en/model_doc/bloom.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review fix comment issues on file headers Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix doc issue * small fix - modeling test * some changes - refactor some code - taking into account reviews - more tests should pass - removed pruning tests * remove useless division * more tests should pass * more tests should pass * more tests should pass * let's try this one -add alibi offset - remove all permutes to make the grad operations work - finger crossed * refactor - refactor code - style changes - add new threshold for test * major changes - change BLOOM to Bloom - add quick doc on bloom.mdx - move embeddings test on modeling test * modify readme * small fixes * small fix - better threshold for a test * remove old test file from fetcher * fix small typo * major change - change BloomLMHead to BloomForCausalLM * remove onnx config * major changes - refactor the code - remove asserts - change tol for test * make style * small change * adding a slow test + commenting old ones for now * make style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style * fix duplicates * cleaning comments on config * clean a bit conversion file * refacor a bit modeling file * refactor tokenizer file * fix tokenization test issue * fix tokenization issue #2 * fix tokenization issue second try * fix test issue * make style + add suggestions * change test fetcher * try this one - slow tests should pass - finger crossed * possible final changes * make style * try fix padding side issue * fix side * fix padding issue * fix ko-readme * fix config auto * cleaning modeling file * keep bloom in caps in ko * update config docs * remove 
pretraining_pp * remove model parallel * update config - add correct config files * fix duplicates * fix fetcher * fix refactor issue - remove divide function * try to remove alibi * small fixes - fix alibi - remove seq length - refactor a bit the code * put correct values - fix bos and eos token ids * fix attention mask loop Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> * small fixes: - remove skip bias add * small fixes - fix typo in readme - fix typos in config * small changes - remove a test - add reconstruction test - change config * small changes - change Scaled Softmax to BloomScaledSoftmax * small fixes - fix alibi dtype * major changes - removing explicit dtype when loading modules - fixing test args (torch_dtype=auto) - add dosctring * fix readmes * major changes - now bloom supports alibi shifting - refactor a bit the code - better test tolerance now * refactor a bit * refactor a bit * put correct name on test * change docstring * small changes - fix docstring modeling - fix test tolerance * fix small nit - take dtype from tensors in the conversion script * minor fix - fix mdx issue * minor fix - change config docstring * forward contrib credits from PR14084 * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply modifications Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * resolve softmax upcast * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * final changes modeling Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Merge commit 'd156898f3b9b2c990e5963f5030a7143d57921a2' * merge commit * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * apply suggestions Apply suggestions from Stas comments Co-authored-by: Stas Bekman 
<stas00@users.noreply.github.com> * Fix gradient checkpointing Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add slow but exact * add accelerate compatibility Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com> * forward contrib credits Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com> Co-authored-by: sgugger <sgugger@users.noreply.github.com> Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix torch device on tests * make style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix nits Co-authored-by: patrickvonplaten<patrickvonplaten@users.noreply.github.com> * remove final nits * fix doc - add more details on the doc - add links to checkpoints * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions Co-authored-by: sgugger <sgugger@users.noreply.github.com> * put test torchscript to false * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by: justheuristic <justheuristic@gmail.com> * fix alibi - create alibi only once * add small doc * make quality * replace torch.nn * remove token type emb * fix fused op + output bias * add fused op - now can control fused operation from config * remove fused op * make quality * small changes - remove unsed args on config - removed bias gelu file - make the model torchscriptable - add torchscript slow tests * Update src/transformers/models/bloom/modeling_bloom.py * fix slow * make style * add accelerate support * add bloom to deepspeed tests * minor changes 
* Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * minor change * slow tests pass * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/bloom.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor changes: - change docstring - add link to paper Co-authored-by: Thomwolf <thomwolf@gmail.com> Co-authored-by: Thomas Wolf <thomas@huggingface.co> Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sIncerass <sheng.s@berkeley.edu> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com> Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com> Co-authored-by: sgugger <sgugger@users.noreply.github.com> Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com> Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: justheuristic <justheuristic@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-06-09 12:00:40 +02:00
jianan-gu	34097b3304	Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138 ) * init PR * fix import ipex * minor fix on bf16 * refine optimizer * refine args notes * refine code * refine ipex optimize args * refine half_precision_backend * black format * isort format * isort format files * flake8 format * doc builder format * refine codes * remove jit and optim bits * black preview format * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * refine notes * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * code refine * add ipex ut * add performance cpu doc * link to the cpu doc from main perf doc * install ipex into CI's docker * Update perf_train_cpu.mdx * Update docs/source/en/perf_train_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update perf_train_cpu.mdx * Update perf_train_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-08 09:41:57 -04:00
Sayak Paul	9d99489f2f	Add TFData2VecVision for semantic segmentation (#17271 ) * feat: initial implementation of data2vec segmentation model in TF. * chore: minor corrections to make the segmenter work. * chore: removed unncessary files. * chore: add tests and other modifications. * fix: loss computation for segmentation. * chore: remove unused variable. * chore: formatting. * added a dummy adaptive pooling layer. * removed unnecessary file. * potentially add identifiers to layer names. * fix: layer naming. * chore: removed unnecessary print. * Skipping unneeded test * chore: add logging to debug tolerance. * fix: segmentation tests for tfdata2vecvision * chore: make style. * fix: layer names, assertion to be resolved. * Bumping test tolerance a bit * chore: bump the tol in PT test. Co-authored-by: matt <rocketknight1@gmail.com>	2022-06-08 14:03:18 +01:00
Chan Woo Kim	119e3c0fc8	M-CTC-T Model (#16402 ) * added cbs to notebooks, made copy-paste error fix in generation_utils * initial push for mctc model * mctc feature extractor done * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly. * passing attention, now struggling to figure out how attention masks make sense here * works when excluding attention masks. ask later how one would integrate attention maskshere * bizarre configuration error (model prefix comes first in config dict json and messes up the order) * all passing but bizzarre config dict ordering issue when to_dict * passing all major tests * feature extraction, processor, tokenizer added & tests passing * style & consistency & other logistical fixes * copy paste fix * model after feature extraction working * commiting final feature extraction results; need to fix normalization * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions? * delete print ; format code a bit * fixing tests * passing major tests * fixing styles * completed tokenization test with real example; not sure if these values are entirely correct. 
* last test fixes from local * reverting accidentally included custom setup configs * remove load tf weights; fix config error * testing couldnt import featureextractor * fix docs * fix docs * resolving comments * style fixes * style fixes * Update to MCTCConv1dSubSampler Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * relposemb fixes * conv1d name issue; expecting config fail with paraentheses * fix config issue * fix config issue * fix config issue * change everything to MCTCT * fixing naming change errors * archive list * copyrights and docs * copyrights and docs * copyrights and docs * merge resolution * move tests, fix to changed optionaldependency structure * test directories changed * fixing tests * how to avoid tf tests? * how to avoid tf tests? * tests passing locally * allow mctctprocessor imported any env * allow mctctprocessor imported any env * fixed second round of feedback, need to fix docs * doc changes not being applied * all fixed * style fix * feedback fixes * fix copies and feature extraction style fix * Update tests/models/visual_bert/test_modeling_visual_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * copy paste huggingface:main visual bert * added eof newline to visual bert; all tests are passing otherwise * fix slow tests by adding attention mask * change model id to speechbrain * make fix-copies * fix readme unwanted deletes * fixing readmes, make fix-copies * consistent M-CTC-T naming * Update src/transformers/models/mctct/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * all fixed but variable naming * adjust double quotes * fixed variable names * copyright and mr quilter * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct slow tests * make fix-copies * Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * 
Update src/transformers/models/mctct/configuration_mctct.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * m-ctc-t not mctct Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-08 00:33:07 +02:00
Vítor Fróis	706bb8364d	quicktour.mdx en -> pt translation (#17074 ) * Quicktour Portuguese Translation Translated quicktour.mdx until line 161 * Finished translating quicktour.mdx Ready to upload and adjust eventual .mdx or translation mistakes. * Add _toctree.yml and fix nits * Fixed pt-br mdx syntax problem Closed <frameworkcontent> instance * Changed </frameworkcontent> line * Copied missing block from English version of quicktour.mdx * Reviewed the entire file once again. It should be working now. Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-06-07 17:35:05 -04:00
Omar U. Espejel	b118730745	Fix gendered sentence in Spanish translation (#17558 )	2022-06-07 14:09:39 +02:00
Nicola Procopio	34a886fce3	Translation/italian: added pipeline_tutorial.mdx [Issue: #17459 ] (#17507 ) * added toctree.yml file * first translation * added pipeline_tutorial.mdx translation added pipeline_tutorial.mdx updated _toctree.yml * updated pipeline_tutorial.mdx * updated _toctree.yml Updated preprocessing and training * updated preprocessing.mdx start translation * Update _toctree.yml * Delete preprocessing.mdx * Update _toctree.yml * updated _toctree.yml * added preprocessing * Update _toctree.yml * updated _toctree.yml * undo * Revert "undo" This reverts commit `5d38d76875`. * Revert "Revert "undo"" This reverts commit `8aa0830b58`.	2022-06-06 10:35:20 -04:00
Martina Fumanelli	f6ad0e0556	Add installation.mdx Italian translation (#17530 ) * Add the Italian translation of the file installation.mdx and edit _toctree * Add the Italian translation of the file installation.mdx and edit _toctree	2022-06-06 07:48:08 -04:00
Jonatas Grosman	4aed1dc81b	Adding the Portuguese version of the tasks/token_classification.mdx documentation (#17492 ) * add tasks/token_classification pt doc structure * add tasks/token_classification pt doc translation * add tasks/token_classification pt doc translation	2022-06-06 07:47:34 -04:00
Britney Muller	72f5b94984	Update index.mdx (#17547 ) This PR updates our Expert Acceleration Program image with a new image featuring our experts. This is similar to our Transformers/README.md image update that has proven to be successful.	2022-06-03 12:56:37 -05:00
Patrick Deutschmann	babeff5524	Add support for Perceiver ONNX export (#17213 ) * Start adding perceiver support for ONNX * Fix pad token bug for fast tokenizers * Fix formatting * Make get_preprocesor more opinionated (processor priority, otherwise tokenizer/feature extractor) * Clean docs format * Minor cleanup following @sgugger's comments * Fix typo in docs * Fix another docs typo * Fix one more typo in docs * Update src/transformers/onnx/utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/onnx/utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/onnx/utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-03 07:40:22 -04:00
Robert Dargavel Smith	5c17918fe4	Allow from transformers import TypicalLogitsWarper (#17477 ) * Allow from transformers import TypicalLogitsWarper * Added TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper * Allow from transformers import TypicalLogitsWarper Added TypicalLogitsWarper Allow from transformers import TypicalLogitsWarper Allow from transformers import TypicalLogitsWarper Allow from transformers import TypicalLogitsWarper	2022-06-03 11:08:35 +02:00
Sylvain Gugger	048dd73bba	Check list of models in the main README and sort it (#17517 ) * Script for README * Fix copies * Complete error message	2022-06-02 08:10:08 -04:00
Anugunj Naman	84aaadd8c5	Adding LeViT Model by Facebook (#17466 ) * levit files * levit tests * weights script * weights script * update * style fixes * few minor corrections * Added teacher model * edit docs * fix-copies * style fixes * pr error resolved * Update README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/index.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/configuration_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/configuration_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * suggested pr changes * style fixes * minor bug * update * minor doc edit * style * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/levit/test_modeling_levit.py 
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * residual layer readable * style * Update docs/source/en/model_doc/levit.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update tests/models/levit/test_feature_extraction_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * change checkpoints and style * update * minor changes * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/levit/modeling_levit.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: 
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-06-01 17:06:20 +02:00
Ruihua Fang	4f38808e9e	Add OnnxConfig for SqueezeBert iss17314 (#17315 ) * add onnx config for SqueezeBert * add test for onnx config for SqueezeBert * add automatically updated doc for onnx config for SqueezeBert * Update src/transformers/onnx/features.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update src/transformers/models/squeezebert/configuration_squeezebert.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-06-01 06:16:15 -04:00
Arthur	7822a9b7a7	Opt in flax and tf (#17388 ) * initial commit * add init file * update globakl init * update index and dummy objects * style * update modelling auto * fix initi typo in src/transformers * fix typo in modeling tf auto, opt was in wrong mapping name * fixed a slow test : saved_model * style * fix positionnal embedding if no position id is provided * update tf test * update test flax requirements * fixed serialization * update * update tf name to allow smooth convertion * update flax tests * style * fix test typo * fix tf typo test * add xla for generate support in causal LM * fixed bug * cleaned tf tests * style * removed from PT for slow tests * fix typp * opt test as slow * trying to fix GPT2 undefined * correct documentation and add to test doc * update tf doc * fix doc * fake commit * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * update test based on review * merged main layer for functionning test * fixup + quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update long comment * make fix copies Co-authored-by: Arthur <arthur@huggingface.co> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-31 18:41:22 +02:00
Martina Fumanelli	dfc38463b8	Setup for Italian translation and add quicktour.mdx translation (#17472 ) * Setup for Italian translation and add first document - Add 'it' folder for files translated into Italian - Add _config.py and _toctree.yml files - Add translation of quicktour.mdx * Fix style issue of italian documentation files * Add 'it' to the languages section in the .github/workflows * Remove - installation from _toctree for Italian * Translation for index file - Add index to _toctree.yml - Add translation of index.mdx * Fix typo in docs/source/it/index.mdx * Translate code comments in docs/source/it/_config.py Co-authored-by: Martina Fumanelli <martinafumanelli@Martinas-MBP.homenet.telecomitalia.it>	2022-05-31 09:57:43 -04:00
Ritik Nandwal	5af38953bb	Added XLM onnx config (#17030 ) * Add onnx configuration for xlm * Add supported features for xlm * Add xlm to models exportable with onnx * Add xlm architecture to test file * Modify docs * Make code quality fixes	2022-05-31 09:26:06 -04:00
Omar U. Espejel	2ef09ecfb8	Fix nits (#17349 )	2022-05-31 08:41:54 -04:00
Yhary Arias	2295bcaea8	Spanish translation of the file preprocessing.mdx (#16299 ) * Spanish translation of the file training.mdx * Settings - Spanish translation of the file training.mdx * Latest changes to the Spanish translation of the training.mdx file * Delete Hugging.mdx * Last changes to the training fil Espanish version * Latest modifications * Latest changes, document ready for PR * Nits * Spanish translation of the preprocessing file * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Update docs/source_es/preprocessing.mdx * Nits and add preprocessing to _toctree.yml Co-authored-by: Yhary Arias <yharystefa@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-26 07:28:14 -04:00
Juanjo do Olmo	8f46ac9849	Spanish translation of the files sagemaker.mdx and image_classification.mdx (#17262 ) * Duplication of the source eng file * Spanish translation of the file multilingual.mdx * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source_es/multilingual.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Fix nits and finish translation * Spanish translation of sagemaker.mdx * Was deleted in main * Security saving * Complete translation of image_classification.mdx * Nits * nits * Update docs/source/es/image_classification.mdx * Add files to _toctree.yml * Fix toctree and add tasks folder Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-25 19:10:16 -04:00
Joaq	5e7f085fcc	Added es version of bertology.mdx doc (#17255 ) * added bertology es doc * toctree fix * Update docs/source/es/bertology.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/bertology.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/bertology.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * change position of bertology in _toctree.yml Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-25 18:46:53 -04:00
Jonatas Grosman	70484a8d74	Adding the Portuguese version of the tasks/sequence_classification.mdx documentation (#17352 ) * add sequence_classification pt doc structure * add Portuguese tasks/sequence_classification.mdx	2022-05-25 16:21:27 -04:00
Leandro von Werra	740a1574f1	fix link in performance docs (#17419 )	2022-05-25 20:54:43 +02:00
Jason Phang	71e602725b	[WIP] Adding GPT-NeoX-20B (#16659 ) * initial * first try * working 20B * 20B tokenizers * Docs * Import fixes for missing classes * Update docs, fixup * black formatting * isort * flake * dummy objects * documentation * Documentation yml * more docs * tweaks for tests * tokenization auto * fix neox tests * test * test * einsum * address PR feedback * Documentation * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_neox/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_neox/configuration_gpt_neox.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove undefined LaTeX syntax * Update to full url to avoid confusion about if that's supposed to refer to the Hub * fix auto * move tests * documentation fix * more doc fixes * test refactor * fix import * fix import * fix import * fix import * fix import * style fixes * More modeling fixes Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu> Co-authored-by: Stella Biderman <stellabiderman@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-24 09:31:10 -04:00
NielsRogge	31ee80d556	Add LayoutLMv3 (#17060 ) * Make forward pass work * More improvements * Remove unused imports * Remove timm dependency * Improve loss calculation of token classifier * Fix most tests * Add docs * Add model integration test * Make all tests pass * Add LayoutLMv3FeatureExtractor * Improve integration test + make fixup * Add example script * Fix style * Add LayoutLMv3Processor * Fix style * Add option to add visual labels * Make more tokenizer tests pass * Fix more tests * Make more tests pass * Fix bug and improve docs * Fix import of processors * Improve docstrings * Fix toctree and improve docs * Fix auto tokenizer * Move tests to model folder * Move tests to model folder * change default behavior add_prefix_space * add prefix space for fast * add_prefix_spcae set to True for Fast * no space before `unique_no_split` token * add test to hightligh special treatment of added tokens * fix `test_batch_encode_dynamic_overflowing` by building a long enough example * fix `test_full_tokenizer` with add_prefix_token * Fix tokenizer integration test * Make the code more readable * Add tests for LayoutLMv3Processor * Fix style * Add model to README and update init * Apply suggestions from code review * Replace asserts by value errors * Add suggestion by @ducviet00 * Add model to doc tests * Simplify script * Improve README * a step ahead to fix * Update pair_input_test * Make all tokenizer tests pass - phew * Make style * Add LayoutLMv3 to CI job * Fix auto mapping * Fix CI job name * Make all processor tests pass * Make tests of LayoutLMv2 and LayoutXLM consistent * Add copied from statements to fast tokenizer * Add copied from statements to slow tokenizer * Remove add_visual_labels attribute * Fix tests * Add link to notebooks * Improve docs of LayoutLMv3Processor * Fix reference to section Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-05-24 09:53:45 +02:00
Sylvain Gugger	56f50590d5	Use Accelerate in `from_pretrained` for big model inference (#17341 ) * Initial work * More or less finished with first draft * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Fix randomly initialized weights * Update src/transformers/modeling_utils.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Address review comments * Rename DeepSpeed folder to temporarily fix the test issue? * Revert to try if Accelerate fix works * Use latest Accelerate release * Quality and fixes * Style * Quality * Add doc * Test + fix * More blocks Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-05-23 14:32:21 -04:00
Anugunj Naman	c86aad6110	Fix cvt docstrings (#17367 )	2022-05-23 16:11:09 +02:00
ghlai9665	7b8cb26953	Correct & Improve Doctests for LayoutLMv2 (#17168 ) * add inference example to LayoutLMv2ForQuestionAnswering, passing doctest * add loss example to LayoutLMv2ForQuestionAnswering, passing doctest * Add correct doctest for LayoutLMv2ForTokenClassification, passing doctest * add correct doctest for LayoutLMv2ForSequenceClassification, passing test * add correct doctest for LayoutLMv2Model, passing test * make fixup * fix to address review comments * make style * fix doctest line break issue, add to documentaiton_tests.txt, address review comments * move comment about layoutlmv2 dependencies to the doc page * format doc page as suggested Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * delete extraneous backtick Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-23 08:02:31 -04:00
NielsRogge	adc0ff2502	Add CvT (#17299 ) * Adding cvt files * Adding cvt files * changes in init file * Adding cvt files * changes in init file * Style fixes * Address comments from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Format lists in docstring * Fix copies * Apply suggestion from code review Co-authored-by: AnugunjNaman <anugunjjha@gmail.com> Co-authored-by: Ayushman Singh <singhayushman13@protonmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-18 17:47:18 +02:00
Carl	d6b8e9cec7	Add trajectory transformer (#17141 ) * Add trajectory transformer Fix model init Fix end of lines for .mdx files Add trajectory transformer model to toctree Add forward input docs Fix docs, remove prints, simplify prediction test Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Update docs, more descriptive comments Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Update readme Small comment update and add conversion script Rebase and reformat Fix copies Fix rebase, remove duplicates Fix rebase, remove duplicates * Remove tapex * Remove tapex * Remove tapex	2022-05-17 19:07:43 -04:00
Cesare Campagnano	d9050dc768	[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112 ) * [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing * LED docs clarification Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * [LED] gradient_checkpointing=True should be passed to TrainingArguments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * [LED] docs: remove wrong word Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * [LED] docs fix typo Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-05-17 23:44:37 +02:00
Jean Vancoppenolle	bad358398a	Add support for pretraining recurring span selection to Splinter (#17247 ) * Add SplinterForSpanSelection for pre-training recurring span selection. * Formatting. * Rename SplinterForSpanSelection to SplinterForPreTraining. * Ensure repo consistency * Fixup changes * Address SplinterForPreTraining PR comments * Incorporate feedback and derive multiple question tokens per example. * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Jean Vancoppenole <jean.vancoppenolle@retresco.de> Co-authored-by: Tobias Günther <tobias.guenther@retresco.de> Co-authored-by: Tobias Günther <github@tobigue.de> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-05-17 23:42:14 +02:00
Patrick von Platen	5a9957358c	Add Wav2Vec2Conformer (#16812 ) * save intermediate * add wav2vec2 conformer * add more code * more * first test passes * make all checkpoints work * update * up * more clean ups * save clean-up * save clean-up * save more * remove bogus * finalize design conformer * remove vision * finish all tests * more changes * finish code * add doc tests * add slow tests * fix autoconfig test * up * correct docstring * up * update * fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update docs/source/en/model_doc/wav2vec2-conformer.mdx * upload * save copied from * correct configs * fix model outputs * add to docs * fix imports * finish * finish code * correct copied from * correct again * correct make fix * improve make fix copies * save * correct fix copy from * correct init structure * correct * fix import * apply suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2022-05-17 00:43:16 +02:00
amyeroberts	f6a6388972	Add Tensorflow Swin model (#16988 ) Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-16 22:19:53 +01:00
Kevin Zehnder	6cb7187324	docs(transformers): fix typo (#17263 )	2022-05-16 17:04:30 -04:00
Sander Land	053a80c606	logging documentation update (#17174 ) * logging documentation * style Co-authored-by: Sander Land <sander@chatdesk.com>	2022-05-16 16:47:28 -04:00
Sylvain Gugger	ddb1a47ec8	Automatically sort auto mappings (#17250 ) * Automatically sort auto mappings * Better class extraction * Some auto class magic * Adapt test and underlying behavior * Remove re-used config * Quality	2022-05-16 13:24:20 -04:00
Stas Bekman	71abd3ade1	[WIP] [doc] performance/scalability revamp (#15723 ) * [doc] performance/scalability revamp * link the new docs * no : * mixed precision * work on the first doc * expand the main doc * Trigger CI * style * revamp single GPU training section * work on training performance * remove files not used anymore or will be added later * final touches * fix rebase * Add hardware section to toctree * fix toctree again * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove `fast_tokenizers` entry that was copied in rebase * add warning about DP vs DDP * remove todo * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix missing closure of codeblock * Update docs/source/en/perf_train_gpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * sync with #16860 * update toc Co-authored-by: leandro <leandro.vonwerra@spoud.io> Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-16 13:36:41 +02:00
Ignacio Talavera	ee393c009a	Guide to create custom models in Spanish (#17158 ) * file copied and toctree updated * Intro and configuration translated * model section translated * enter hotfix * Translation over, correction pending * Typos and corrections * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/create_a_model.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 16:19:29 -04:00
Gerardo Huerta Robles	16be422912	Translated version of model_sharing.mdx doc to spanish (#16184 ) * Translated version of model_sharing to spanish * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Update docs/source_es/model_sharing.mdx * Addind model sharing to _toctree.yml Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 16:18:46 -04:00
Fellip Silva Alves	f9024814e1	[ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial (#17076 ) * [ fast_tokenizers.mdx ] - Added translation to portuguese to tutorial * Delete docs/source/pt-br directory * [ fast_tokenizers.mdx ] - Continuing work on file * [ fast_tokenizers.mdx ] - Continuing work on file * Add fast tokenizers to _toctree.yml * Eliminated config and toctree.yml * Nits in fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 16:18:14 -04:00
Rafael Zimmer	85fc455972	Added translation of installation.mdx to Portuguese Issue #16824 (#16979 ) * Added translation of installation.mdx to Portuguese, as well as default templates of _toctree.yml and _config.py * [ build_documentation.yml ] - Updated doc_builder to build documentation in Portuguese. [ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx. * [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder. [ pipeline_tutorial.mdx ] - Grammar changes. * [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial. * [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial. [ training.mdx ] - Added portuguese translation for training tutorial. * [ preprocessing.mdx ] - WIP * Update _toctree.yml * Adding Pré-processamento to _toctree.yml * Update accelerate.mdx * Nits and eliminate preprocessing file while it is ready Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-13 07:55:44 -04:00
Sayak Paul	9f16a1cc13	Update data2vec.mdx to include a Colab Notebook link (that shows fine-tuning) (#17194 ) * Update data2vec.mdx * Update data2vec.mdx * Update docs/source/en/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-12 10:22:00 -04:00
Younes Belkada	b971c769e8	Add OPT (#17088 ) * First version - OPT model * Final changes - putting use cache to False * few changes - remove commented block * few changes - remove unecessary files * fix style issues * few changes - remove a test file - added the logits test * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add gen tests * few changes - rm mask filling example on docstring * few changes - remove useless args * some changes - more tests should pass now - needs to clean more - documentation still needs to be done * fix code quality * major changes - change attention architecture to BART-like - modify some tests - style fix * rm useless classes - remove opt for: - QA - cond generation - seq classif * Removed autodoc calls to non-existant classes TOkenizers are not implemented * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Replaced OPTTokeniser with GPT2 tokenizer * added GPT2Tokenizer.from_pretrained("patrickvonplaten/opt_gpt2_tokenizer") * Removed OPTTokenizer * make style * Make style replaces ``` ...).unsqueeze(``` by ``` >>>).unsqueeze(``` * make repo consistency * Removed PretrainedOPTModel * fix opt.mdx removed other heads * fix init, removed 3 heads * removed heads * finished cleaning head * removed seauence classif and question answering * removed unused imports * removed useless dummy object for QA, SC and CG * removed tests for removed useless dummy object for QA, SC and CG * Removed head_mask using encoder layers which don't exist * fixed test * fix line * added OPT to toctree * Updated model path with pushed weigths * fix model path * fixed code quality * fixed 
embeddings and generation tests * update paths * clean comments * removed OPTClassificationHead for sentence classification * renamed hidden layer * renamed num layers to standard num_hidden_layers * num_attention_heads fix * changes for 125m * add first version for 125m * add first version - flax * add new version * causal LM output * replace output type with BaseModelOutputWithPastAndCrossAttentions * revert working config from 150m to 350m * clean * removed decoder input ids * fixed embed dim * more embed_dim issues * make style + removed enc_dec test * update falx model * removed troublesome copy * added is_encoder_decoder=False to config * added set_input emb fuinction to model class * requires torch on embed test * use head mask instead of decoder head mask input param solves a test * 8 test remaining, update * Updated create_and_check_decoder_model_past_large_inputs * Make style * update op tokenizer with condition * make style * See if I can push * some clean up * remove linear head hack * save intermediate * save correct attention * add copied from from bart * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix part of the reviewss Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * same changes in naming / conversion * correct mask * more fixes * delete FlaxOPT and TfOPT * clean traces of Flax and Tf * fix mask * fixed positionnal embedding length when past key value is provoded * get 125m, 6.7b to work * Added do_layer_norm * solved mismatch in load dictionnary * clean up preapre opt input dict * fixed past key value as bool * fix previus * fixed return dict False tuple issue * All tests are passing * Make style * Ignore OPTDecoder non tested * make fix-copies * make repo consistency * small fix * removed uselss @torch.no_grad decorator * make styl;e * fix previous opt test * style * make style * added opt documentation * update OPT_PRETRAINED_MODEL_ARCHIVE_LIST * up * 
more fixes * model & config work * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * added comment on padding hack (+2) * cleaup * review update * docstring for missing arg * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/opt/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update pretrained map * update path and tests * make style * styling * make consistency * add gpt2 tok new * more tok fixes * Update src/transformers/models/auto/tokenization_auto.py * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/opt.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/models/opt/test_modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py 
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/opt/modeling_opt.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update based on reviews * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * make style * make tokenizer auto tests pass * apply Lysandre suggestion * finish tests * add some good tokenizer tests * improve docs slighly Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: ArthurZucker <arthur.zucker@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-05-12 12:24:35 +02:00
Omar U. Espejel	1a688709b3	Fix contents in index.mdx to match docs' sidebar (#17198 ) * Fix contents in index.mdx to match docs' sidebar * Eliminates api section from contents	2022-05-12 02:37:13 -05:00
Omar Sanseviero	b17b78897b	Fix style error in Spanish docs (#17197 )	2022-05-12 08:51:46 +02:00
Omar U. Espejel	1a66a6c677	Translate index.mdx (to ES) and add Spanish models to quicktour.mdx examples (#16685 ) * Change nits in Spanish for quicktour.mdx - Add tasks names in English too. - Fix small nits in Spanish * Translate index.mdx to Spanish * Translate body of index. * Translated the compatible models list (not the papers' names). Since this should not be updated manually, I can come back to the original text. * Add models and a dataset for Spanish in the code examples * Replaced the English models to Spanish versions. * Add index to _toctree.yml and fix Spanish * Fix double ““ error * Change negative example in ASR example * make style * Debug style in quicktour.mdx	2022-05-11 23:35:07 -05:00
Jorge Loayza R	e2d678b71c	Documentation: Spanish translation of fast_tokenizers.mdx (#16882 ) * Spanish translation of fast_tokenizers.mdx * add fast_tokenizers to the spanish _toctree.yml * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/fast_tokenizers.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-11 22:25:44 -05:00
Joaq	ae82da2181	Added es version of language_modeling.mdx doc (#17021 ) * Spanish version of language_modeling.mdx doc file * modification to toctree.yml file * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. 
Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/language_modeling.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Correct position of Guías conceptuales Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-11 22:04:56 -05:00
jkmg	36ddcc0d35	Spanish translation of philosophy.mdx #15947 (#16922 ) * adding philosophy.mdx translation to Spanish * adding philosophy.mdx translation to Spanish * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * Update docs/source/es/philosophy.mdx Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> * philosophy translation to Spanish * Update _toctree.yml * Update _toctree.yml * nits Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>	2022-05-11 20:47:50 -05:00
Amanpreet Singh	a10f61834d	[feat] Add FLAVA model (#16654 ) * [WIP] Add FLAVA model This PR aims to add [FLAVA](https://arxiv.org/abs/2112.04482) model to the transformers repo. Following checklist delineates the list of things to be done for this PR to be complete: [x] Flava init [x] Flava base models [x] Flava layers [x] Flava Configs [x] Flava encoders [x] Flava pretraining models [ ] Flava classification/retrieval models (To be added in a separate PR) [x] Documentation updates [x] Imports updates [x] Argstring updates [x] Flava pretrained checkpoints [x] Flava tests [x] Flava processors [x] Sanity check [x] Lint	2022-05-11 14:56:48 -07:00
hasan salim kanmaz	c33f6046c3	[WIP] Enable reproducibility for distributed trainings (#16907 ) * add seed worker and set_deterministic_seed_for_cuda function to enforce reproducibility * change function name to enable determinism, add docstrings, reproducibility support for tf * change function name to enable_determinism_for_distributed_training * revert changes in set_seed and call set_seed within enable_full_determinism * add one position argument for seed_worker function * add full_determinism flag in training args and call enable_full_determinism when it is true * add enable_full_determinism to documentation * apply make fixup after the last commit * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-05-11 09:37:13 -04:00
Jason Phang	48a8f3daa1	Add DebertaV2ForMultipleChoice (#17135 )	2022-05-10 16:21:44 -04:00
Patrick Haller	259eeb6dab	Fixing the output of code examples in the preprocessing chapter (#17162 )	2022-05-10 12:16:28 -04:00
Zachary Mueller	d719bcd46a	Fix all docs for accelerate install directions (#17145 )	2022-05-09 15:45:18 -04:00
Sylvain Gugger	7783fa6bb3	Fix quality and repo consistency	2022-05-09 11:14:36 -04:00
Sourab Mangrulkar	05fc1766ff	PyTorch FSDP integration in Trainer (#17136 ) * PyTorch FSDP integration in Trainer * reformatting make style and make quality are now compliant. * Updating dependency check * Trigger CI Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-05-09 20:40:56 +05:30
Manan Dey	dc3645dc9c	add `mobilebert` onnx configs (#17029 ) * update docs of length_penalty * Revert "update docs of length_penalty" This reverts commit `466bf4800b`. * add mobilebert onnx config * address suggestions * Update auto.mdx * Update __init__.py * Update features.py	2022-05-09 10:36:53 -04:00
Ritik Nandwal	215e0681e4	Added BigBirdPegasus onnx config (#17104 ) * Add onnx configuration for bigbird-pegasus * Modify docs	2022-05-06 17:31:00 +02:00
Steven Liu	cad61b6839	Fix link to example scripts (#17103 )	2022-05-05 15:20:27 -05:00
Daniel Espejel	db377a0b37	Added Spanish translation of autoclass_tutorial. (#17069 ) * Added Spanish translation of autoclass_tutorial. Added 'local' and 'title' fields for autoclass_tutorial. * Fixed autoclass_tutorial title in _toctree.yml and autoclass_tutorial.mdx	2022-05-04 14:18:24 -05:00
Steven Liu	23619ef6b7	📝 open fresh PR for pipeline doctests (#17073 )	2022-05-04 11:30:34 -05:00
Sayak Paul	049e791758	Add Data2Vec for Vision in TF (#17008 ) * add utilities till TFData2VecVisionLayer. * chore: pass window_size to attention layer. * feat: add TFData2VecVisionRelativePositionBias. * feat: initial implementation ready for tf data2vec. * fix: relative position bias index, table to be fixed. * chore: implementation added, tests remaining. * add: tests, other PR files. * fix: code quality. * fix: import structure in init. * chore: run make fix-copies. * chore: address PR feedback (round I). * chore: styling nit. * fix: tests due to removal of to_2tuple(). * chore: rebase with upstream main and move the test. * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix: layer call. * chore: remove from_pt=True and rerun test. * chore: remove cast and tf.divide. * chore: minor edits to the test script. * Update src/transformers/models/data2vec/modeling_tf_data2vec_vision.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * fix: expand() on TF tensors with broadcast_to(). * fix: test import. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-05-04 08:08:25 -04:00
Sylvain Gugger	a8fa2f91f4	Make Trainer compatible with sharded checkpoints (#17053 ) * Make Trainer compatible with sharded checkpoints * Add doc	2022-05-03 09:55:10 -04:00
Sanchit Gandhi	cd9274d010	[FlaxBert] Add ForCausalLM (#16995 ) * [FlaxBert] Add ForCausalLM * make style * fix output attentions * Add RobertaForCausalLM * remove comment * fix fx-to-pt model loading * remove comment * add modeling tests * add enc-dec model tests * add big_bird * add electra * make style * make repo-consistency * add to docs * remove roberta test * quality * amend cookiecutter * fix attention_mask bug in flax bert model tester * tighten pt-fx thresholds to 1e-5 * add 'copied from' statements * amend 'copied from' statements * amend 'copied from' statements * quality	2022-05-03 11:26:19 +02:00
Lysandre Debut	bb2e088be7	Allow all imports from transformers (#17050 )	2022-05-02 12:47:39 -04:00
NielsRogge	1ac698744c	Add YOLOS (#16848 ) * First draft * Add YolosForObjectDetection * Make forward pass work * Add mid position embeddings * Add interpolation of position encodings * Add expected values * Add YOLOS to tests * Add integration test * Support tiny model as well * Support all models in conversion script * Remove mid_pe_size attribute * Make more tests pass * Add model to README and fix config * Add copied from statements * Rename base_model_prefix to vit * Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP * Apply suggestions from code review * Apply more suggestions from code review * Convert remaining checkpoints * Improve docstrings * Add YolosFeatureExtractor * Add feature extractor to docs * Add corresponding tests * Fix style * Fix docs * Apply suggestion from code review * Fix bad rebase * Fix some more bad rebase * Fix missing character * Improve docs and variable names Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-05-02 18:30:55 +02:00
Yih-Dar	ede5e04191	Add a check on config classes docstring checkpoints (#17012 ) * Add the check * add missing ckpts * add a list to ignore * call the added check script * better regex pattern Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-30 10:40:46 +02:00
Sylvain Gugger	7152ed2bae	Result of new doc style with fixes (#17015 ) * Result of new doc style with fixes * Add last two files * Bump hf-doc-builder	2022-04-29 17:42:15 -04:00
Mishig Davaadorj	cf8a7c2490	Update custom_models.mdx (#16964 ) BertModelForSequenceClassification -> BertForSequenceClassification	2022-04-27 16:46:55 +02:00
Yang Ming	10dfa126b7	documentation: some minor clean up (#16850 )	2022-04-26 16:56:08 -04:00
Krishna Sirumalla	aaee4038c3	Add onnx config for RoFormer (#16861 ) * add roformer onnx config	2022-04-26 16:51:15 +02:00
Rushi Chaudhari	8246caf3eb	added deit onnx config (#16887 ) * added deit onnx config	2022-04-25 20:50:45 +02:00
Patrick von Platen	3a71e94a92	Fix doc test quicktour dataset (#16929 ) * fix doc test * fix doc test Co-authored-by: Patrick <patrick@pop-os.localdomain>	2022-04-25 16:26:59 +02:00
Patrick von Platen	72728be3db	[DocTests] Fix some doc tests (#16889 ) * [DocTests] Fix some doc tests * hacky fix * correct	2022-04-23 08:40:14 +02:00
Thomas Chaigneau	ec81c11a18	Add OnnxConfig for ConvBERT (#16859 ) * add OnnxConfig for ConvBert Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>	2022-04-22 18:19:15 +02:00
Nicolas Patry	e789418ebe	Adding support for `array` key in raw dictionaries in ASR pipeline. (#16827 ) * Adding support for `array` key in raw dictionaries in ASR pipeline. * ES . * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Making it work by not popping `array` first. * Black 22.3 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-21 14:39:10 +02:00
Stas Bekman	67ed0e43dc	[docs] fix url (#16860 )	2022-04-20 11:01:24 -07:00
Yang Ming	ff06b17791	add DebertaV2 fast tokenizer (#15529 ) Co-authored-by: alcinos <carion.nicolas@gmail.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-20 10:26:51 +02:00
Patrick von Platen	8d3f952adb	[Data2Vec] Add data2vec vision (#16760 ) * save intermediate * add vision * add vision * save * finish models * finish models * continue * finish * up * up * up * tests all pass * clean up * up * up * fix bugs in beit * correct docs * finish * finish docs * make style * up * more fixes * fix type hint * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/data2vec/test_modeling_data2vec_vision.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-18 17:52:13 +02:00
Patrick von Platen	9a2995ee39	[Quicktour Audio] Improve && remove ffmpeg dependency (#16723 ) * [Quicktour Audio] Improve && remove ffmpeg dependency * final fix * final touches	2022-04-18 16:50:13 +02:00
Patrick von Platen	b24201fa44	[Doctests] Fix all T5 doc tests (#16646 ) * [Doctests] Fix all T5 doc tests * make style * Update docs/source/en/model_doc/t5.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply Sylvains comments * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-13 11:36:54 +02:00
Minh Chien Vu	9c9db751e2	add Bigbird ONNX config (#16427 ) * add Bigbird ONNX config	2022-04-12 20:46:06 +02:00
Anmol Joshi	a315988bae	Moved functions to pytorch_utils.py (#16625 ) * Moved functions to pytorch_utils.py * isort formatting * Reverted tf changes * isort, make fix-copies * documentation fix * Fixed Conv1D import * Reverted research examples file * backward compatibility for pytorch_utils * missing import * isort fix	2022-04-12 12:38:50 -04:00
Sylvain Gugger	0711c45eae	Remove duplicate header (#16732 )	2022-04-12 12:37:13 -04:00
Patrick von Platen	098b002644	[Doctests] Correct task summary (#16644 )	2022-04-11 14:59:35 +02:00
Yih-Dar	8e93dc7eaf	Fix some doc examples in task summary (#16666 ) * Fix some doc examples Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-11 11:20:03 +02:00
Steven Liu	7c5d79912a	Update audio examples with MInDS-14 (#16633 ) * ✨ update audio examples with minds dataset * 🖍 make style * 🖍 minor fixes for doctests	2022-04-08 15:55:42 -05:00
NielsRogge	4ef0abb738	Add TAPEX (#16473 ) * Add TapexTokenizer * Improve docstrings and provide option to provide answer * Remove option for pretokenized inputs * Add TAPEX to README * Fix copies * Remove option for pretokenized inputs * Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification. * - Draft a README file for running the script and introducing some background. - Remove unused code lines in tabfact script. - Disable the default `pad_to_max_length` option which is memory-consuming. * * Support `as_target_tokenizer` function for TapexTokenizer. * Fix the do_lower_case behaviour of TapexTokenizer. * Add unit tests for target scenarios and cased/uncased scenarios for both source and target. * * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function. * Fix typos in tapex example README. * * fix the evaluation script - remove the property `task_name` * * Make the label space more clear for tabfact tasks * * Using a new fine-tuning script for tapex-base on tabfact. * * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case * Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql * * Remove the default tokenizer_name option. * Provide evaluation command. * * Support for WikiTableQuestion dataset. * Fix a typo in README. 
* * Fix the datasets's key name in WikiTableQuestions * Run make fixup and move test to folder * Fix quality * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review * Improve docstrings * Overwrite failing test * Improve comment in example scripts * Fix rebase * Add TAPEX to Auto mapping * Add TAPEX to auto config mappings * Put TAPEX higher than BART in auto mapping * Add TAPEX to doc tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: SivilTaram <qianlxc@outlook.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-08 10:57:51 +02:00
Francesco Saverio Zuppichini	af14c61973	RegNet (#16188 ) * base model done * make style * done * added files * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Trigger doc build * resolved conversations * resolved conversations * seer models * minor changes * minor changes * make fixup * glob variables * minor changes * fix copies * config when possible * resolved conflicts * resolved conflicts * resolved conflicts * CI * conversion script for 10b param * fixed for 10b model * minor updates in the doc + make style * removed unused code * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * removed unused code * removed unused code * updated modeling_utils from main Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-04-07 21:58:00 +02:00
Joao Gante	3f43d824b9	TF generate refactor - Beam Search (#16374 ) * refactor TF beam search * refactored generate can now properly use attention masks * add force bos/eos logit processors	2022-04-06 18:19:34 +01:00
Patrick von Platen	c65633156b	[Speech2Text Doc] Fix docs (#16611 ) * [Speech2Text Doc] Fix docs * apply ydshiehs suggestions	2022-04-06 14:19:00 +02:00
Patrick von Platen	0bf18643f4	[Minds14] Correct quicktour (#16626 )	2022-04-06 11:27:11 +02:00
Sylvain Gugger	208f4c109a	Quality	2022-04-05 14:12:01 -04:00
Steven Liu	f553c3ce4c	Update summary of the tasks (#16528 ) * 📝 add image/vision classification and asr * 🖍 minor formatting fixes * Fixed a typo in legacy seq2seq_trainer.py (#16531) * Add ONNX export for BeiT (#16498) * Add beit onnx conversion support * Updated docs * Added cross reference to ViT ONNX config * call on_train_end when trial is pruned (#16536) * Type hints added (#16529) * Fix Bart type hints (#16297) * Add type hints to PLBart PyTorch * Remove pending merge conflicts * Fix PLBart Type Hints * Add changes from review * Add VisualBert type hints (#16544) * Adding missing type hints for mBART model (PyTorch) (#16429) * added type hints for mbart tensorflow tf implementation * Adding missing type hints for mBART model Tensorflow Implementation model added with missing type hints * Missing Type hints - correction For TF model * Code fixup using make quality tests * Hint types - typo error * make fix-copies and make fixup * type hints * updated files * type hints update * making dependent modesls coherent Co-authored-by: matt <rocketknight1@gmail.com> * Remove MBart subclass of XLMRoberta in tokenzier docs (#16546) * Remove MBart subclass of XLMRoberta in tokenzier * Fix style * Copy docs from MBart50 tokenizer * Use random_attention_mask for TF tests (#16517) * use random_attention_mask for TF tests * Fix for TFCLIP test (for now). Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * Improve code example (#16450) Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> * Pin tokenizers version <0.13 (#16539) * Pin tokenizers version <0.13 * Style * Add code samples for TF speech models (#16494) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * [FlaxSpeechEncoderDecoder] Fix dtype bug (#16581) * [FlaxSpeechEncoderDecoder] Fix dtype bug * more fixes * Making the impossible to connect error actually report the right URL. 
(#16446) * Fix flax import in __init__.py: modeling_xglm -> modeling_flax_xglm (#16556) * Add utility to find model labels (#16526) * Add utility to find model labels * Use it in the Trainer * Update src/transformers/utils/generic.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Quality Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Enable doc in Spanish (#16518) * Reorganize doc for multilingual support * Fix style * Style * Toc trees * Adapt templates * Add use_auth to load_datasets for private datasets to PT and TF examples (#16521) * fix formatting and remove use_auth * Add use_auth_token to Flax examples * add a test checking the format of `convert_tokens_to_string`'s output (#16540) * add new tests * add comment to overridden tests * TF: Finalize `unpack_inputs`-related changes (#16499) * Add unpack_inputs to remaining models * removed kwargs to `call()` in TF models * fix TF T5 tests * [SpeechEncoderDecoderModel] Correct Encoder Last Hidden State Output (#16586) * initialize the default rank set on TrainerState (#16530) * initialize the default rank set on TrainerState * fix style * Trigger doc build * Fix CI: test_inference_for_pretraining in ViTMAEModelTest (#16591) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> * add a template to add missing tokenization test (#16553) * add a template to add missing tokenization test * add cookiecutter setting * improve doc * Update templates/adding_a_missing_tokenization_test/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * made _load_pretrained_model_low_mem static + bug fix (#16548) * handle torch_dtype in low cpu mem usage (#16580) * [Doctests] Correct filenaming (#16599) * [Doctests] Correct filenaming * improve quicktour * make style * Adding new train_step logic to make things less confusing for users (#15994) * Adding new train_step logic to make things 
less confusing for users * DO NOT ASK WHY WE NEED THAT SUBCLASS * Metrics now working, at least for single-output models with type annotations! * Updates and TODOs for the new train_step * Make fixup * Temporary test workaround until T5 has types * Temporary test workaround until T5 has types * I think this actually works! Needs a lot of tests though * MAke style/quality * Revert changes to T5 tests * Deleting the aforementioned unmentionable subclass * Deleting the aforementioned unmentionable subclass * Adding a Keras API test * Style fixes * Removing unneeded TODO and comments * Update test_step too * Stop trying to compute metrics with the dummy_loss, patch up test * Make style * make fixup * Docstring cleanup * make fixup * make fixup * Stop expanding 1D input tensors when using dummy loss * Adjust T5 test given the new compile() * make fixup * Skipping test for convnext * Removing old T5-specific Keras test now that we have a common one * make fixup * make fixup * Only skip convnext test on CPU * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Avoiding TF import issues * make fixup * Update compile() to support TF 2.3 * Skipping model.fit() on template classes for now * Skipping model.fit() on template class tests for now * Replace ad-hoc solution with find_labels * make fixup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Adding missing type hints for BigBird model (#16555) * added type hints for mbart tensorflow tf implementation * Adding missing type hints for mBART model Tensorflow Implementation model added with missing type hints * Missing Type hints - correction For TF model * Code fixup using make quality tests * Hint types - typo error * make fix-copies and make fixup * type hints * updated files * type hints update * making 
dependent modesls coherent * Type hints for BigBird * removing typos Co-authored-by: matt <rocketknight1@gmail.com> * [deepspeed] fix typo, adjust config name (#16597) * 🖍 apply feedback Co-authored-by: Cathy <815244047@qq.com> Co-authored-by: Jim Rohrer <jrohrer1@gmail.com> Co-authored-by: Ferdinand Schlatt <fschlatt@gmail.com> Co-authored-by: Dahlbomii <101373053+Dahlbomii@users.noreply.github.com> Co-authored-by: Gunjan Chhablani <chhablani.gunjan@gmail.com> Co-authored-by: Rishav Chandra Varma <rishavchandra.v16@iiits.in> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Daniel Stancl <46073029+stancld@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Andres Codas <andrescodas@users.noreply.github.com> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-04-05 12:48:42 -05:00
Patrick von Platen	7ccacdf10f	[Doctests] Correct filenaming (#16599 ) * [Doctests] Correct filenaming * improve quicktour * make style	2022-04-05 14:15:02 +02:00
Sylvain Gugger	b9a768b3ff	Enable doc in Spanish (#16518 ) * Reorganize doc for multilingual support * Fix style * Style * Toc trees * Adapt templates	2022-04-04 10:25:46 -04:00
Jim Rohrer	9de70f213e	Add ONNX export for BeiT (#16498 ) * Add beit onnx conversion support * Updated docs * Added cross reference to ViT ONNX config	2022-04-01 10:52:42 +02:00
Francesco Saverio Zuppichini	a8b6443e06	Refactor Modeling Outputs (#16341 ) * first proposal * replace model outputs in various models * conflicts * docstring * update poolformer * minor change in docstring * CI * removed poolformer specific outputs from doc * removed convnext specific outputs from doc * CI * weird char in segformer * conversations * reverted docstring for BaseModelOutputWithPooling * update outputs * changed docstring in BaseModelOutput * updated docstring in modeling outputs * typos :) * fixed typo after copy & paste it all around * CI * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-31 09:32:33 +02:00
Sayak Paul	5b40a37bc4	Add TF ViT MAE (#16255 ) * ported TFViTMAEIntermediate and TFViTMAEOutput. * added TFViTMAEModel and TFViTMAEDecoder. * feat: added a noise argument in the implementation for reproducibility. * feat: vit mae models with an additional noise argument for reproducibility. Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-03-29 18:24:15 +01:00
Wesley A. Cheng	875e07a9e3	[doc] Fix missing trainer import (#16469 )	2022-03-29 18:57:43 +02:00
Wesley A. Cheng	3015d12bfb	fix wrong variable name (#16467 )	2022-03-29 18:55:40 +02:00
Steven Liu	45abb37ac9	Remove duplicate mLuke (#16460 ) * Remove duplicate mLuke * 🖍 apply feedback	2022-03-29 10:34:30 -05:00
NielsRogge	979b039c89	Add DPT (#15991 ) * First draft * More improvements * Add fusion blocks * Make conversion script work for dpt_large * Make conversion script work * Improve implementation * Improve conversion script * Add DPTForSemanticSegmentation * Make conversion work for semantic segmentation * Add tests * Remove print statements * First draft * Redesign neck * Improve tests * Improve implementation some more * Make neck output list of tensors * Improve neck and feature extractor * Fix integration tests * Make more tests pass * Make all tests pass * Add missing config archive map * Add in_index attribute to make heads accept list of tensors * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions * Add copied from statements * Remove assert * Apply suggestions from code review * Apply suggestions from code review * Remove DPTInterpolate in favor of nn.Upsample * Add comments * Apply suggestions from code review * Apply suggestions from code review * Add proposed design * Update design * Add DPTReassembleLayer * Add DPTFeatureFusionStage * Apply more suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Fix rebase * Update in_index and out_indices * Fix conversion script * Fix code quality * Add model to toctree and use DepthEstimatorOutput * Fix rebase * Fix code examples * Improve code * Fix copied from statements * Apply suggestions from code review * Remove compute_loss method * Apply suggestions from code review * Fix documentation tests file * Remove test.py file * Improve doc example Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>	2022-03-28 16:28:10 +02:00
Sylvain Gugger	473709fc76	Use doc builder styler (#16412 ) * Config update * Use doc-builder styler * Cleanup * Adapt import * We need it there too!	2022-03-28 07:45:18 -04:00
Kurian Benoy	c88ff66cc8	Fix broken links (#16113 ) * Update marian.mdx * Update marian.mdx * Update docs/source/model_doc/marian.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update marian.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-03-28 05:38:17 -04:00
Nathan Glenn	e02f95b229	remove references to PDF reading via PIL (#15293 ) * fix confusing PIL instructions As stated in the documentation [here](https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html?highlight=pdf#write-only-formats), PIL can only write PDF's, not read them. Remove references to reading PDF's via PIL from this page to avoid confusion. * mention PDF in doc examples using PIL Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Be explicit: PDFs must be converted to images * fix formatting Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-28 05:00:29 -04:00
Steven Liu	b320d87ece	Create concept guide section (#16369 ) * ✨ create concept guide section * 🖍 make fixup * 🖍 apply feedback Co-authored-by: Steven <stevhliu@gmail.com>	2022-03-25 14:51:43 -05:00
Daniel Stancl	ed2ee373d0	Add TF implementation of GPT-J (#15623 ) * Initial commit * Add TFGPTJModel * Fix a forward pass * Add TFGPTJCausalLM * Add TFGPTJForSequenceClassification * Add TFGPTJForQuestionAnswering * Fix docs * Deal with TF dynamic shapes * Add Loss parents to models * Adjust split and merge heads to handle 4 and 5-dim tensors * Update outputs for @tooslow tests	2022-03-25 19:27:19 +00:00
lewtun	a97f3150c4	Add ONNX support for Blenderbot and BlenderbotSmall (#15875 ) * Add ONNX support for Blenderbot * Add BlenderbotSmall ONNX configuration * Update serialization table	2022-03-25 17:04:43 +01:00
Sylvain Gugger	867f3950fa	Rename master to main for notebooks links and leftovers (#16397 )	2022-03-25 09:12:23 -04:00
Sylvain Gugger	088c1880b7	Big file_utils cleanup (#16396 ) * Big file_utils cleanup * This one still needs to be treated separately	2022-03-25 07:25:20 -04:00
Sylvain Gugger	3a0f1684c3	Fix readme links and add CI check (#16392 ) * Fix doc links in README * Fix name * Fix links in READMEs and doc index * Error if there is something wrong so the CI knows	2022-03-24 11:59:09 -04:00
Thomas Chaigneau	029b0d95ed	add GPT-J ONNX config to Transformers (#16274 ) * add GPT-J ONNX config to Transformers * remove token-classification features mapping Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * add question-answering features mapping Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * add GPT2 config init to GPT2 config + copie shebang for fix-copies Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-03-23 16:36:11 -04:00
Edward Beeching	aff9bc405a	Decision transformer gym (#15845 ) * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. 
* Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Add type hints for Pegasus (#16324) * Funnel type hints (#16323) * add pt funnel type hints * add tf funnel type hints * Add type hints for ProphetNet PyTorch (#16272) * [GLPN] Improve docs (#16331) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Added type hints for Pytorch Marian calls (#16200) * Added type hinting for forward functions in pytorch marian * typo correction * Removed type hints on functions from BART per Suraj Patil request * fix import pb * fix typo * corrected tuple call * ran black * after fix-copies Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List * Fixing copies to roformer and pegasus Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain 
Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * done (#16340) * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type annotations for Rembert/Splinter and copies (#16338) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com> * [Bug template] Shift responsibilities for long-range (#16344) * Fix code repetition in serialization guide (#16346) * Adopt framework-specific blocks for content (#16342) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback * Updates the default branch from master to main (#16326) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last 
hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. 
* Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Updated copies, config auto, and readme files. 
Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com> Co-authored-by: Adam Montgomerie <adam@avanssion.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com> Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-03-23 16:18:43 -04:00
Lysandre Debut	eca77f4719	Updates the default branch from master to main (#16326 ) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-23 03:46:59 -04:00
Steven Liu	7732148124	Adopt framework-specific blocks for content (#16342 ) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback	2022-03-22 16:14:58 -05:00
Omar Sanseviero	62cbd8423b	Fix code repetition in serialization guide (#16346 )	2022-03-22 16:57:19 -04:00
NielsRogge	a2379b9257	[GLPN] Improve docs (#16331 ) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-22 15:45:29 +01:00
NielsRogge	0c55d47cde	Add GLPN (#16199 ) * First draft * Fix logits calculation * Improve tests * Add copied from statements * Fix base_model_prefix * Improve implementation, upload new models * Update design * Fix integration test * Add model to README and toctree * Add document image * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add decoder_hidden_size attribute * Update design of decoder * Add DepthEstimatorOutput class * Rename in_index to head_in_index and add feature extractor tests * Apply suggestions from code review * Apply suggestions from code review * Update pretrained model name and add to doc tests * Remove test.py script * Update copied from statements and clean up Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-22 08:51:13 +01:00
Thomas Chaigneau	0aac9ba2da	Add Flaubert OnnxConfig to Transformers (#16279 ) * Add Flaubert to ONNX to make it available for conversion. * Fixed features for FlauBERT. fixup command remove flaubert to docs list. Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>	2022-03-21 21:46:31 +01:00
Steven Liu	5a42bb431e	Update troubleshoot with more content (#16243 ) * 📝 first draft * 🖍 apply feedback	2022-03-21 11:37:18 -05:00
NielsRogge	fbb454307d	[SegFormer] Remove unused attributes (#16285 ) * Remove unused attributes * Add link to blog and add clarification about input size * Improve readability of the code Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-21 17:34:10 +01:00
Gunjan Chhablani	3f0f75e497	Remove disclaimer from Longformer docs (#16296 )	2022-03-21 10:05:47 -04:00
PolarisRisingWar	abf3cc7064	Fix a typo (add a comma) (#16291 ) As mentioned: https://github.com/huggingface/transformers/issues/16277	2022-03-21 12:10:24 +00:00
Aflah	f393868073	Fixed Error Raised Due to Wrongly Accessing Training Sample (#16115 ) * Update training.mdx Fixed Error Raised Due to Wrongly Accessing Training Sample * Ran make style * Revert to Old Commit * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-03-21 12:54:54 +01:00
Sylvain Gugger	4ecb022eb1	Draft a guide with our code quirks for new models (#16237 ) * Draft a guide with our code quirks for new models * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-03-21 07:44:03 -04:00
Patrick von Platen	c1af180dfe	Add Slack notification support for doc tests (#16253 ) * up * up * up * fix * yeh * ups * Empty test commit * correct quicktour * correct * correct * up * up * uP * uP * up * up * uP * up * up * up * up * up * up * up * up * up * up * Update src/transformers/models/van/modeling_van.py * finish * apply suggestions * remove folder * revert to daily testing	2022-03-21 11:33:18 +01:00
Steven Liu	ffc319e7b8	Fix links in guides (#16182 ) * 🖍 fix links in guides * 🖍 apply feedback	2022-03-18 16:16:16 -05:00
Stas Bekman	47cccb5318	[Deepspeed] non-HF Trainer doc update (#16238 )	2022-03-17 13:33:55 -07:00
Patrick von Platen	8a96b0f10a	[Generate Docs] Correct docs (#16133 ) * [Generate Docs] Correct docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-03-17 20:05:28 +01:00
Francesco Saverio Zuppichini	667b823b89	Swin support for any input size (#15986 ) * padding done * correctly return one attention per layer * almost correct, attentions are not flatten one tuple per stage * tests green * doc * conversations * reshaping hidden_states * view in the test * reshape_hidden_states in Encoder and Model * new outputs with reshaped_hidden_states * conversations * doc * Update docs/source/model_doc/swin.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversations * fix tests * minor changes * resolved conversations * attentions one per stage * typo * typos * typos * function signature * CI * clean up tests Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-16 18:38:25 +01:00
Sylvain Gugger	4f4e5ddbcb	Framework split (#16030 ) * First files * More files * Last files * Style	2022-03-15 10:13:34 -04:00
Markus Sagen	bcaf566038	[Fix doc example] Fix first example for the custom_datasets tutorial (#16087 ) * Fix inconsistent example variable naming - Example code for a sequence classification in Tensorflow had spelling mistakes and incorrect and inconsistent naming - Changed variable naming to be consistent with the two other TF examples * Fix incorrect training examples	2022-03-15 08:17:51 -04:00
Francesco Saverio Zuppichini	0a057201a9	Visual Attention Network (VAN) (#16027 ) * encoder works * addded files * norm in stage * convertion script * tests * fix copies * make fix-copies * fixed __init__ * make fix-copies * fix * shapiro test needed * make fix-copie * minor changes * make style + quality * minor refactor conversion script * rebase + tests * removed unused variables * updated doc * toctree * CI * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * resolved conversations * make fixup * config passed to modules * config passed to modules * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversations * conversations * copyrights * normal test * tests Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-15 08:47:12 +01:00
Francesco Saverio Zuppichini	e3008c679f	[WIP] Resnet (#15770 ) * first commit * ResNet model correctly implemented. basic modeling + weights conversion is done removed unused doc mdx file doc and conversion script added feature_extractor to auto test minor changes + style + quality doc test Delete process.yml A left over from my attempt of running circleci locally * minor changes * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * new test format * minor changes from conversations * minor changes from conversations * make style + quality * readded the tests * test + README * minor changes from conversations * error in README * make fix-copies * removed regression for classification head * make quality * fixed loss control flow * fixed loss control flow * resolved conversations * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * READMEs * index.mdx * minor changes * updated tests and models * unused import * outputs * Update docs/source/model_doc/resnet.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * added embeddings_size * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * conversation * added push to hub * test * embedding_size * make fix-copies * resolved conversations * CI * changed organization * minor changes * CI * minor changes * conversations * conversation * doc * tests * removed unused docstring * conversation * removed unused outputs * CI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-03-14 19:57:55 +01:00
Omar Sanseviero	802984ad42	Fix and document Zero Shot Image Classification (#16079 )	2022-03-14 08:50:36 +01:00
lewtun	6e1e88fd38	Add TFCamembertForCausalLM and ONNX integration test (#16073 ) * Make Camembert great again! * Add Camembert to TensorFlow ONNX tests	2022-03-14 08:40:42 +01:00
Stas Bekman	580dd87c55	[Deepspeed] add support for bf16 mode (#14569 ) * [WIP] add support for bf16 mode * prep for bf16 * prep for bf16 * fix; zero2/bf16 is ok * check bf16 is available * test fixes * enable zero3_bf16 * config files * docs * split stage_dtype; merge back to non-dtype-specific config file * fix doc * cleanup * cleanup * bfloat16 => bf16 to match the PR changes * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/ * test fixes/skipping * move * fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * backticks * cleanup * cleanup * cleanup * new version * add note about grad accum in bf16 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-11 17:53:53 -08:00
Steven Liu	ae2dd42be5	Audio/vision task guides (#15808 ) * 📝 first draft of audio/vision guides * ✨ make fixup * 🖍 fix typo * 🖍 close parentheses * 🖍 apply feedback * 🖍 apply feedback, make fixup * 🖍 more fixup for perceiver * 🖍 apply feedback * ✨ make fixup * 🖍 fix data collator	2022-03-11 16:43:49 -06:00
Steven Liu	5b4c97d09d	Update troubleshoot guide (#16001 ) * 📝 first draft * 🖍 apply feedback * 🖍 apply feedback	2022-03-11 13:05:44 -06:00
Sylvain Gugger	f7708e1bed	Force default branch name via the config	2022-03-11 10:09:15 -05:00
David S. Batista	96ac7549cb	updating fine-tune classifier documentation (#16063 )	2022-03-10 16:21:56 -05:00
Patrick von Platen	6ce11c2c0f	[Docs] Improve PyTorch, Flax generate API (#15988 ) * Move generate docs * up * Update docs/source/_toctree.yml * correct * correct some stuff * correct tests * more fixes * finish generate * add to doc stest * finish * finalize * add warning to generate method	2022-03-10 11:54:45 +01:00
NielsRogge	0835119bf3	Add Document Image Transformer (DiT) (#15984 ) * Add conversion script * Improve script * Fix bug * Add option to push to hub * Add support for classification models * Update model name * Upload feature extractor files first * Remove hash checking * Fix config * Add id2label * Add import * Fix id2label file name * Fix expected shape * Add model to README * Improve docs * Add integration test and fix CI * Fix code style * Add missing init * Add model to SPECIAL_MODULE_TO_TEST_MAP Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-10 11:34:44 +01:00
Sanchit Gandhi	b256f3518d	Add FlaxBartForCausalLM (#15995 ) * add causal lm * add CausalLM tests * Add FlaxBartForCausalLM * Add EncoderDecoder model tests * change docstring * make repo-consistency * suggested changes * remove jax ops * correction * rename pre-trained decoder model	2022-03-09 19:53:01 +01:00
lewtun	50dd314d93	Add ONNX export for ViT (#15658 ) * Add ONNX support for ViT * Refactor to use generic preprocessor * Add vision dep to tests * Extend ONNX slow tests to ViT * Add dummy image generator * Use model_type to determine modality * Add deprecation warnings for tokenizer argument * Add warning when overwriting the preprocessor * Add optional args to docstrings * Add minimum PyTorch version to OnnxConfig * Refactor OnnxConfig class variables from CONSTANT_NAME to snake_case * Add reasonable value for default atol Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-09 17:36:59 +01:00
Patrick von Platen	c1aaa43935	[Doctests] Move doctests to new GPU & Fix bugs (#15969 ) * test * up * up * Empty test commit * up * update tests * up * fix some vision models * correct * correct docs * Trigger notification * finalize * check * correct quicktour * Apply suggestions from code review * improve doctests * Trigger Build * next try * next try * and again * Output current clone information * Output current clone information * Correct path * add tf round again * revert to daily job Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2022-03-09 13:09:56 +01:00
Steven Liu	38cc35069c	Update training scripts docs (#15931 ) * 📝 first draft * 🖍 apply feedback * 🖍 remove examples from toctree * 🗑 remove examples from docs/source	2022-03-07 13:29:14 -06:00
Chan Woo Kim	5c6f57ee75	Constrained Beam Search [With Disjunctive Decoding] (#15761 ) * added classes to get started with constrained beam search * in progress, think i can directly force tokens now but not yet with the round robin * think now i have total control, now need to code the bank selection * technically works as desired, need to optimize and fix design choices leading to undesirable outputs * complete PR #1 without disjunctive decoding * removed incorrect tests * Delete k.txt * Delete test.py * Delete test.sh * revert changes to test scripts * genutils * full implementation with testing, no disjunctive yet * shifted docs * passing all tests realistically ran locally * removing accidentally included print statements * fixed source of error in initial PR test * fixing the get_device() vs device trap * fixed documentation docstrings about constrained_beam_search * fixed tests failing for Speech2TextModel's floating point inputs * fix cuda long tensor * added examples and testing for them and found & fixed a bug in beam_search and constrained_beam_search * deleted accidentally added test halting code with assert False * code reformat * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py * fixing based on comments on PR * took out the testing code that should work but fails without the beam search modification ; style changes * fixing comments issues * docstrings for ConstraintListState * typo in PhrsalConstraint docstring * docstrings improvements * finished adding what is sort of an opinionated implementation of disjunctive generation, but it revealed 
errors in inner beam search logic during testing. * fixed bug found in constrained beam search that used beam_idx that were not global across all the batches * disjunctive constraint working 100% correctly * passing all tests * Accidentally included mlruns * Update src/transformers/generation_beam_constraints.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/generation_beam_constraints.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * complete overhaul of type complexities and other nits * strict type checks in generate() * fixing second round of feedback by narsil * fixed failing generation test because of type check overhaul * generation test fail fix * fixing test fails Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-03-04 18:18:34 +01:00
Javier de la Rosa	01485ceec3	Add missing support for Flax XLM-RoBERTa (#15900 ) * Adding Flax XLM-RoBERTa * Add Flax to __init__ * Adding doc and dummy objects * Add tests * Add Flax XLM-R models autodoc * Fix tests * Add Flax XLM-RoBERTa to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update tests/xlm_roberta/test_modeling_flax_xlm_roberta.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Remove test on large Flax XLM-RoBERTa * Add tokenizer to the test Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-03-04 14:36:28 +01:00
Nicolas Patry	89c7d9cfba	Making MaskFormerForInstanceSegmentation. (#15934 ) Small adjustments. Adding in type hint. Last fix ? Only include the default dict thing, not the pipelines.	2022-03-04 13:56:15 +01:00
Li-Huai (Allan) Lin	7b3bd1f21a	Fix and improve REALM fine-tuning (#15297 ) * Draft * Add test * Update src/transformers/models/realm/modeling_realm.py * Apply suggestion * Add block_mask * Update * Update * Add block_embedding_to * Remove no_grad * Use AutoTokenizer * Remove model.to overriding	2022-03-03 14:10:15 +01:00
Patrick von Platen	439de3f7f9	[Fix link in pipeline doc] (#15906 )	2022-03-03 07:43:13 -05:00
Joao Gante	baab5e7cdf	TF generate refactor - Sample (#15793 ) * Add TF logits wrappers * Add sample method * add tests for TF logit wrappers * TF generate sample tests now run on CPU Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-03-02 16:13:54 +00:00
Francesco Saverio Zuppichini	d83d22f578	Maskformer (#15682 ) * maskformer * conflicts * conflicts * minor fixes * feature extractor test fix refactor MaskFormerLoss following conversation MaskFormer related types should not trigger a module time import error missed one removed all the types that are not used update config mapping minor updates in the doc resolved conversation that doesn't need a discussion minor changes resolved conversations fixed DetrDecoder * minor changes minor changes fixed mdx file test feature_extractor return types functional losses -> classes removed the return type test for the feature extractor minor changes + style + quality * conflicts? * rebase master * readme * added missing files * deleted poolformers test that were in the wrong place * CI * minor changes * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * resolved conversations * minor changes * conversations [Unispeech] Fix slow tests (#15818) * remove soundfile old way of loading audio * Adapt slow test [Barthez Tokenizer] Fix saving (#15815) [TFXLNet] Correct tf xlnet generate (#15822) * [TFXLNet] Correct tf xlnet * adapt test comment Fix the push run (#15807) Fix semantic segmentation pipeline test (#15826) Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776) Add model specific output classes to PoolFormer model docs (#15746) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs Adding the option to return_timestamps on pure CTC ASR models. (#15792) * Adding the option to return_timestamps on pure CTC ASR models. * Remove `math.prod` which was introduced in Python 3.8 * int are not floats. * Reworking the PR to support "char" vs "word" output. * Fixup! 
* Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Quality. 
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824) Fix tf.concatenate + test past_key_values for TF models (#15774) * fix wrong method name tf.concatenate * add tests related to causal LM / decoder * make style and quality * clean-up * Fix TFBertModel's extended_attention_mask when past_key_values is provided * Fix tests * fix copies * More tf.int8 -> tf.int32 in TF test template * clean-up * Update TF test template * revert the previous commit + update the TF test template * Fix TF template extended_attention_mask when past_key_values is provided * Fix some styles manually * clean-up * Fix ValueError: too many values to unpack in the test * Fix more: too many values to unpack in the test * Add a comment for extended_attention_mask when there is past_key_values * Fix TFElectra extended_attention_mask when past_key_values is provided * Add tests to other TF models * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder * Fix not passing training arg to lm_head in TFRobertaForCausalLM * Fix tests (with past) for TF Roberta * add testing for pask_key_values for TFElectra model Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> [examples/summarization and translation] fix readme (#15833) Add ONNX Runtime quantization for text classification notebook (#15817) Re-enable doctests for the quicktour (#15828) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove & Framework split model report (#15825) Add TFConvNextModel (#15750) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed ununsed imports * feat: enabled argument during using drop_path. 
* chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propoagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality enviornment style. * chore: applied formatting with quality enviornment. 
* chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> * minor changes * doc fix in feature extractor * doc * typos * removed detr logic from config * removed detr logic from config * removed num_labels * small fix in the config * auxilary -> auxiliary * make style * some test is failing * fix a weird char in config preventing doc-builder * retry to fix the doc-builder issue * make style * new try to fix the doc builder * CI * change weights to facebook Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-03-02 15:48:20 +01:00
Patrick von Platen	40040727ab	[Bart] Fix implementation note doc (#15879 )	2022-03-02 10:24:32 +01:00
Michael Benayoun	4bfe75bd08	M2M100 support for ONNX export (#15193 ) * Add M2M100 support for ONNX export * Delete useless imports * Add M2M100 to tests * Fix protobuf issue	2022-03-02 10:03:14 +01:00
Steven Liu	6ccfa2170c	Inference for multilingual models (#15836 ) * 📝 first draft for multilingual models * 🖍 make style	2022-03-01 15:10:31 -06:00
NielsRogge	c008afea3c	Add link to notebooks (#15791 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-03-01 17:44:20 +01:00
Patrick von Platen	9863f7d228	[Benchmark tools] Deprecate all (#15848 ) * [Benchmark tools] Deprecate all * up	2022-03-01 11:26:20 +01:00
Eduardo Gonzalez Ponferrada	df5a4094a6	Add Data2Vec (#15507 ) * Add data2vec model cloned from roberta * Add checkpoint conversion script * Fix copies * Update docs * Add checkpoint conversion script * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/data2vec.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain 
Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update documentation * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. 
* add inputs to logits to data2vec' * correct autio models * correct config auto * correct tok auto * Update utils/tests_fetcher.py * delete unnecessary files * delete unnecessary files * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Fix copies * Update docs * Remove fairseq data2vec_text script and fix format * Add comment on where to get data2vec_text.py * Remove mock implementation cheat.py and fix style * Fix copies * Remove TF and Flax classes from init * Add back copy from fairseq data2vec_text.py and fix style * Update model name in docs/source/index.mdx to be CamelCase * Revert model name in table to lower-case to get check_table test to pass * Update documentation * Update src/transformers/models/data2vec/__init__.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/convert_data2vec_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain 
Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/configuration_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/data2vec/modeling_data2vec.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Copy-paste Data2VecConfig from BertConfig * Update config checkpoint to point to edugp/data2vec-nlp-base. Fix style and repo-consistency * Update config special tokens to match RoBERTa * Split multiple assertions and add individual error messages * Rename Data2VecModel to Data2VecForTextModel * Add Data2Vec to _toctree.yml * Rename Data2VecEmbeddings to Data2VecForTextEmbeddings * Add initial Data2VecForAudio model (unfinished). Only matching fairseq's implementation up to the feature encoder (before positional encoding). * finish audio model * finish audio file * add inputs to logits to data2vec' * Update names and fix style, quality and repo consistency * Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimicking the Wav2Vec2 test suite. Fix bias initilization in positional conv layers. Move back configurations for audio and text to separate files. 
* correct autio models * correct config auto * correct tok auto * delete unnecessary files * delete unnecessary files * Update utils/tests_fetcher.py * further renaming * make all tests pass * finish * remove useless test file * Update tests/test_modeling_common.py * Update utils/check_repo.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/data2vec/modeling_data2vec_text.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Move data2vec tests to new structure * Fix test imports for text tests * Remove fairseq files * Change paper link to arxiv * Modify Data2Vec documentation to reflect that the encoder is not shared across the audio and text models in the current implementation. * Update text model checkpoint to be facebook/data2vec-text-base * Add 'Copy from' statements and update paper links and docs * fix copy from statements * improve copied from * correct more copied from statements * finish copied from stuff * make style * add model to README * add to master Co-authored-by: Eduardo Gonzalez Ponferrada <eduardo@ferrumhealth.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-01 11:09:20 +01:00
Sanchit Gandhi	e3342edc4e	Flax Speech-Encoder-Decoder Model (#15613 ) * rebase * Delete shift tokens func * downsample decoder input seq len for init * correct attention mask * add tests * pt flax cross test * make fixup * init file for import * change pt-flax cross test threshold * pt-flax test logits only * move tests * make repo-consistency * consistent indentation Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-28 12:22:36 +01:00
Sayak Paul	84eaa6acf5	Add TFConvNextModel (#15750 ) * feat: initial implementation of convnext in tensorflow. * fix: sample code for the classification model. * chore: added checked for from the classification model. * chore: set bias initializer in the classification head. * chore: updated license terms. * chore: removed unused imports * feat: enabled argument during using drop_path. * chore: replaced tf.identity with layers.Activation(linear). * chore: edited default checkpoint. * fix: minor bugs in the initializations. * partial-fix: tf model errors for loading pretrained pt weights. * partial-fix: call method updated * partial-fix: cross loading of weights (4x3 variables to be matched) * chore: removed unneeded comment. * removed playground.py * rebasing * rebasing and removing playground.py. * fix: renaming TFConvNextStage conv and layer norm layers * chore: added initializers and other minor additions. * chore: added initializers and other minor additions. * add: tests for convnext. * fix: integration tester class. * fix: issues mentioned in pr feedback (round 1). * fix: how output_hidden_states arg is propagated inside the network. * feat: handling of arg for pure cnn models. * chore: added a note on equal contribution in model docs. * rebasing * rebasing and removing playground.py. * feat: encapsulation for the convnext trunk. * Fix variable naming; Test-related corrections; Run make fixup * chore: added Joao as a contributor to convnext. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * chore: corrected copyright year and added comment on NHWC. * chore: fixed the black version and ran formatting. * chore: ran make style. * chore: removed from_pt argument from test, ran make style. * rebasing * rebasing and removing playground.py. * rebasing * rebasing and removing playground.py. * fix: tests in the convnext subclass, ran make style. * rebasing * rebasing and removing playground.py. 
* rebasing * rebasing and removing playground.py. * chore: moved convnext test to the correct location * fix: locations for the test file of convnext. * fix: convnext tests. * chore: applied sgugger's suggestion for dealing w/ output_attentions. * chore: added comments. * chore: applied updated quality environment style. * chore: applied formatting with quality environment. * chore: revert to the previous tests/test_modeling_common.py. * chore: revert to the original test_modeling_common.py * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py * fix: tests for convnext. * chore: removed output_attentions argument from convnext config. * chore: revert to the earlier tf utils. * fix: output shapes of the hidden states * chore: removed unnecessary comment * chore: reverting to the right test_modeling_tf_common.py. * Styling nits Co-authored-by: ariG23498 <aritra.born2fly@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-02-25 18:19:16 +01:00
Sylvain Gugger	0118c4f6a8	Re-enable doctests for the quicktour (#15828 ) * Re-enable doctests for the quicktour * Re-enable doctests for task_summary (#15830) * Remove &	2022-02-25 17:46:38 +01:00
Tanay Mehta	7566734d6f	Add model specific output classes to PoolFormer model docs (#15746 ) * Added model specific output classes to poolformer docs * Fixed Segformer typo in Poolformer docs	2022-02-25 13:43:56 +01:00
Steven Liu	fecb08c2b8	🧼 NLP task guides (#15731 ) * clean commit of changes to NLP tasks * 🖍 apply feedback * 📝 move tf data collator in multiple choice Co-authored-by: Steven <stevhliu@gmail.com>	2022-02-23 13:58:33 -06:00
Julien Chaumond	32f5de10a0	[doc] custom_models: mention security features of the Hub (#15768 ) * custom_models: tiny doc addition * mention security feature earlier in the section Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2022-02-23 11:40:06 -05:00
Nicolas Patry	f9582c205a	Adding ZeroShotImageClassificationPipeline (#12119 ) * [Proposal] Adding ZeroShotImageClassificationPipeline - Based on CLIP * WIP, Resurrection in progress. * Resurrection... achieved. * Reword handling different `padding_value` for `feature_extractor` and `tokenizer`. * Thanks doc-builder ! * Adding docs + global namespace `ZeroShotImageClassificationPipeline`. * Fixing templates. * Make the test pass and be robust to floating error. * Addressing suraj's comments on docs mostly. * Tf support start. * TF support. * Update src/transformers/pipelines/zero_shot_image_classification.py Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-23 09:41:42 +01:00
Patrick von Platen	c44d3675c2	Time stamps for CTC models (#15687 ) * [Wav2Vec2 Time Stamps] * Add first version * add word time stamps * Fix * save intermediate space * improve * [Finish CTC Tokenizer] * remove @ * remove @ * push * continue with phonemes * up * finish PR * up * add example * rename * finish * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct split * finalize Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-22 19:26:44 +01:00
Francesco Saverio Zuppichini	38bed912e3	added link to our writing-doc document (#15756 )	2022-02-22 09:57:28 +01:00
Joao Gante	3956b133b6	TF text classification examples (#15704 ) * Working example with to_tf_dataset * updated text_classification * more comments	2022-02-21 17:17:59 +00:00
Gunjan Chhablani	2c2a31ffbc	Add missing PLBart entry in README (#15721 ) * Add missing PLBart entry in index * Fix README * Fix README * Fix style * Change to master model doc	2022-02-18 21:11:42 +01:00
Gunjan Chhablani	ae1f835028	Add PLBart (#13269 ) * Init PLBART * Add missing configuration file * Add conversion script and configuration file * Fix style * Update modeling and conversion scripts * Fix scale embedding in config * Add comment * Fix conversion script * Add classification option to conversion script * Fix vocab size in config doc * Add tokenizer files from MBart50 * Allow no lang code in regular tokenizer * Add PLBart Tokenizer Converters * Remove mask from multi tokenizer * Remove mask from multi tokenizer * Change from MBart-50 to MBart tokenizer * Fix names and modify src/tgt behavior * Fix imports for tokenizer * Remove <mask> from multi tokenizer * Fix style * Change tokenizer_class to processor_class * Add attribute map to config class * Update modeling file to modified MBart code * Update configuration file to MBart style configuration * Fix tokenizer * Separate tokenizers * Fix error in tokenization auto * Copy MBart tests * Replace with MBart tokenization tests * Fix style * Fix language code in multi tokenizer * Fix configuration docs * Add entry for plbart_multi in transformers init * Add dummy objects and fix imports * Fix modeling tests * Add TODO in config * Fix copyright year * Fix modeling docs and test * Fix some tokenization tests and style * Add changes from review * Fix copies * Fix docs * Fix docs * Fix style * Fix year * Add changes from review * Remove extra changes * Fix base tokenizer and doc * Fix style * Fix modeling and slow tokenizer tests * Remove Multi-tokenizer Converter and Tests * Delete QA model and Multi Tokenizer dummy objects * Fix repo consistency and code quality issues * Fix example documentation * Fix style * Remove PLBartTokenizer from type checking in init * Fix consistency issue * Add changes from review * Fix style * Remove PLBartTokenizerFast * Remove FastTokenizer converter * Fix AutoTokenizer mapping * Add plbart to toctree and fix consistency issues * Add language codes tokenizer test * Fix styling and 
doc issues * Add fixes for failing tests * Fix copies * Fix failing modeling test * Change assert to assertTrue in modeling tests	2022-02-18 14:17:09 +01:00
Francesco Saverio Zuppichini	240cc6cbdc	Adding a model, more doc for pushing to the hub (#15690 ) * doc for adding a model to the hub * run make style * resolved conversation * removed a line * removed ) * Update docs/source/add_new_model.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/add_new_model.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-18 09:11:18 +01:00
NielsRogge	57882177be	Add SimMIM (#15586 ) * Add first draft * Make model importable * Make SwinForMaskedImageModeling importable * Fix imports * Add missing inits * Add support for Swin * Fix bug * Fix bug * Fix another bug * Fix Swin MIM implementation * Fix default encoder stride * Fix Swin * Add print statements for debugging * Add image_size data argument * Fix Swin * Fix image_size * Add print statements for debugging * Fix print statement * Remove print statements * Improve reshaping of bool_masked_pos * Add support for DeiT, fix tests * Improve docstrings * Apply new black version * Improve script * Fix bug * Improve README * Apply suggestions from code review * Remove DS_Store and add to gitignore * Apply suggestions from code review + fix BEiT Flax * Revert BEiT changes * Improve README * Fix code quality * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-02-17 19:44:55 +01:00
Yih-Dar	92a537d938	Minor fix on README.md (#15688 ) * fix README * fix more arxiv links * make fix-copies Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-02-17 08:38:32 -05:00
Tanay Mehta	f84e0dbd2a	Add PoolFormer (#15531 ) * Added all files, PoolFormerFeatureExtractor still failing tests * Fixed PoolFormerFeatureExtractor not being able to import * Completed Poolformer doc * Applied Suggested fixes * Fixed errors in modeling_auto.py * Fix feature extractor, convert docs to Markdown, styling of code * Remove PoolFormer from check_repo and fix integration test * Remove Poolformer from check_repo * Fixed configuration_poolformer.py docs and removed inference.py from poolformer * Ran with black v22 * Added PoolFormer to _toctree.yml * Updated poolformer doc * Applied suggested fixes and added on README.md * Did make fixup and make fix-copies, tests should pass now * Changed PoolFormer weights conversion script name and fixed README * Applied fixes in test_modeling_poolformer.py and modeling_poolformer.py * Added PoolFormerFeatureExtractor to AutoFeatureExtractor API Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-17 13:16:37 +01:00
Francesco Saverio Zuppichini	b87c044c79	Usage examples for logger (#15657 ) * logger * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-02-16 10:15:13 +01:00
Stas Bekman	bee361c6f1	[t5/t0/mt5 models] faster/leaner custom layer norm (#14656 ) * [t5] faster/leaner custom layer norm * wip * apex.normalization.FusedRMSNorm * cleanup * cleanup * add doc * add catch all * Trigger CI * expand	2022-02-15 16:49:57 -08:00
Patrick von Platen	2e12b907ae	TF generate refactor - Greedy Search (#15562 ) * TF generate start refactor * Add tf tests for sample generate * re-organize * boom boom * Apply suggestions from code review * re-add * add all code * make random greedy pass * make encoder-decoder random work * further improvements * delete bogus file * make gpt2 and t5 tests work * finish logits tests * correct logits processors * correct past / encoder_outputs drama * refactor some methods * another fix * refactor shape_list * fix more shape list * import shape _list * finish docs * fix imports * make style * correct tf utils * Fix TFRag as well * Apply Lysandre's and Sylvais suggestions * Update tests/test_generation_tf_logits_process.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/tf_utils.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * remove cpu according to gante * correct logit processor Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-02-15 17:54:43 +01:00
Nicolas Patry	cdf19c501d	Re-export `KeyDataset`. (#15645 ) * Re-export `KeyDataset`. * Update the docs locations.	2022-02-15 17:49:38 +01:00
Stas Bekman	28e6155d8a	add a network debug script and document it (#15652 ) * add a network debug script and document it * doc	2022-02-15 08:48:00 -08:00
jonrbates	86a7845c0c	Fix typo in speech2text2 doc (#15617 ) Forward looks for inputs, not input_ids	2022-02-15 13:54:34 +01:00
fra	05a8580964	Revert "logger doc" This reverts commit `41168a49ce`.	2022-02-15 10:46:45 +01:00
fra	41168a49ce	logger doc	2022-02-15 10:03:28 +01:00
NielsRogge	b090b79022	Make Swin work with VisionEncoderDecoderModel (#15527 ) * Add attribute_map * Add mention in docs * Set hidden_size attribute correctly * Add note about Transformer-based models only Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-14 17:33:35 +01:00
Daniel Erenrich	4f403ea899	Fix grammar in tokenizer_summary (#15614 ) "to make ensure" is redundant.	2022-02-11 16:51:30 -05:00
Stas Bekman	f15c99fabf	[deepspeed docs] misc additions (#15585 ) * [deepspeed docs] round_robin_gradients * training and/or eval/predict loss is * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-11 10:54:04 -08:00
Steven Liu	85aee09e9a	🖍 remove broken link (#15615 )	2022-02-11 12:33:55 -06:00
Sylvain Gugger	6cf06d198c	Mark "code in the Hub" API as experimental (#15624 )	2022-02-11 09:55:31 -05:00
Ngo Quang Huy	c0864d98ba	Correct JSON format (#15600 )	2022-02-10 09:02:03 -08:00
lewtun	2e8b85f72e	Add local and TensorFlow ONNX export examples to docs (#15604 ) * Add local and TensorFlow ONNX export examples to docs * Use PyTorch - TensorFlow split	2022-02-10 16:31:00 +01:00
Alberto Bégué	cb7ed6e083	Add Tensorflow handling of ONNX conversion (#13831 ) * Add TensorFlow support for ONNX export * Change documentation to mention conversion with Tensorflow * Refactor export into export_pytorch and export_tensorflow * Check model's type instead of framework installation to choose between TF and Pytorch Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Alberto Bégué <alberto.begue@della.ai> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-02-10 11:18:41 +01:00
Sylvain Gugger	c722753afd	Expand tutorial for custom models (#15587 ) * Expand tutorial for custom models * Style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-02-09 17:44:28 -05:00
NielsRogge	a86ee2261e	Add link (#15588 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>	2022-02-09 23:33:39 +01:00
Stas Bekman	dee17d5676	[trainer docs] document how to select specific gpus (#15551 ) * [trainer docs] document how to select specific gpus * expand * add urls * add accelerate launcher	2022-02-09 10:12:29 -08:00
Chan Woo Kim	2b5603f6ac	Constrained Beam Search [without disjunctive decoding] (#15416 ) * added classes to get started with constrained beam search * in progress, think i can directly force tokens now but not yet with the round robin * think now i have total control, now need to code the bank selection * technically works as desired, need to optimize and fix design choices leading to undersirable outputs * complete PR #1 without disjunctive decoding * removed incorrect tests * Delete k.txt * Delete test.py * Delete test.sh * revert changes to test scripts * genutils * full implementation with testing, no disjunctive yet * shifted docs * passing all tests realistically ran locally * removing accidentally included print statements * fixed source of error in initial PR test * fixing the get_device() vs device trap * fixed documentation docstrings about constrained_beam_search * fixed tests having failing for Speech2TextModel's floating point inputs * fix cuda long tensor * added examples and testing for them and founx & fixed a bug in beam_search and constrained_beam_search * deleted accidentally added test halting code with assert False * code reformat * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update tests/test_generation_utils.py * fixing based on comments on PR * took out the testing code that should but work fails without the beam search moditification ; style changes * fixing comments issues * docstrings for ConstraintListState * typo in PhrsalConstraint docstring * docstrings improvements Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 16:59:26 +01:00
Leandro von Werra	d923f76203	add model scaling section (#15119 ) * add model scaling section * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * integrate reviewer feedback * initialize GPU properly * add note about BnB optimizer * move doc from `scaling.mdx` to `performance.mdx` * integrate reviewer feedback * revert section levels Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-09 15:27:30 +01:00
Sylvain Gugger	b5c6fdecf0	PoC for a ProcessorMixin class (#15549 ) * PoC for a ProcessorMixin class * Documentation * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Roll out to other processors * Add base feature extractor class in init * Use args and kwargs Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-09 09:24:49 -05:00
Nathan Raw	fcb4f11c92	📝 Add codecarbon callback to docs (#15563 )	2022-02-08 14:10:53 -05:00
Joao Gante	8406fa6dd5	Add TFSpeech2Text (#15113 ) * Add wrapper classes * convert inner layers to tf * Add TF Encoder and Decoder layers * TFSpeech2Text models * Loadable model * TF model with same outputs as PT model * test skeleton * correct tests and run the fixup * correct attention expansion * TFSpeech2Text past_key_values with TF format	2022-02-08 16:27:23 +00:00
aaron	87d08afb16	electra is added to onnx supported model (#15084 ) * electra is added to onnx supported model * add google/electra-base-generator for test onnx module Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>	2022-02-08 15:47:49 +01:00
Steven Liu	552f8d3091	Create a custom model guide (#15489 ) * 📝 add config section * 📝 finish first draft * 📝 add feature extractor and processor * 🖍 apply feedback from review * 📝 minor edits * last review	2022-02-07 12:34:56 -06:00
lewtun	6775b211b6	Remove Longformers from ONNX-supported models (#15273 )	2022-02-07 17:32:13 +01:00
NielsRogge	84eec9e6ba	Add ConvNeXT (#15277 ) * First draft * Add conversion script * Improve conversion script * Improve docs and implement tests * Define model output class * Fix tests * Fix more tests * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Apply suggestions from code review * Rename dims to hidden_sizes * Fix equivalence test * Rename gamma to gamma_parameter * Clean up conversion script * Add ConvNextFeatureExtractor * Add corresponding tests * Implement feature extractor correctly * Make implementation cleaner * Add ConvNextStem class * Improve design * Update design to also include encoder * Fix gamma parameter * Use sample docstrings * Finish conversion, add center cropping * Replace nielsr by facebook, make feature extractor tests smaller * Fix integration test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-07 16:11:37 +01:00
Stas Bekman	8ce1330631	[deepspeed docs] DeepSpeed ZeRO Inference (#15486 ) * [deepspeed docs] DeepSpeed ZeRO Inference * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * tweak * deal with black * extra cleanup, better comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-02-04 13:51:02 -08:00
Sylvain Gugger	ac6aa10f23	Standardize semantic segmentation models outputs (#15469 ) * Standardize instance segmentation models outputs * Rename output * Update src/transformers/modeling_outputs.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add legacy argument to the config and model forward * Update src/transformers/models/beit/modeling_beit.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Copy fix in Segformer Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-02-04 14:52:07 -05:00
Stas Bekman	31be2f45a9	[deepspeed docs] Megatron-Deepspeed info (#15488 )	2022-02-04 11:15:13 -08:00
Stas Bekman	21dcaec5d5	[deepspeed docs] memory requirements (#15506 )	2022-02-03 10:55:14 -08:00
Sylvain Gugger	44b21f117b	Save code of registered custom models (#15379 ) * Allow dynamic modules to use relative imports * Work for configs * Fix last merge conflict * Save code of registered custom objects * Map strings to strings * Fix test * Add tokenizer * Rework tests * Tests * Ignore fixtures py files for tests * Tokenizer test + fix collection * With full path * Rework integration * Fix typo * Remove changes in conftest * Test for tokenizers * Add documentation * Update docs/source/custom_models.mdx Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add file structure and file content * Add more doc * Style * Update docs/source/custom_models.mdx Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-02-02 10:44:37 -05:00
Steven Liu	b9418a1d97	Update tutorial docs (#15165 ) * first draft of pipeline, autoclass, preprocess tutorials * apply review feedback * 🖍 apply feedback from patrick/niels * 📝add output image to preprocessed image * 🖍 apply feedback from patrick	2022-02-01 18:31:35 -06:00
Steven Liu	c157c7e3fd	Update fine-tune docs (#15259 ) * add fine-tune tutorial * make edits, fix style * 📝 make edits * 🖍 fix code format links to external libraries * 🔄revert code formatting * 🖍 use DefaultDataCollator instead of DataCollatorWithPadding	2022-02-01 18:28:12 -06:00
Stas Bekman	44c7857b87	[deepspeed doc] fix import, extra notes (#15400 ) * [deepspeed doc] fix import, extra notes * typo	2022-01-31 08:28:10 -08:00
NielsRogge	47df0f2234	Add header (#15434 )	2022-01-31 11:15:54 -05:00
Ogundepo Odunayo	282ae123e2	add t5 ner finetuning (#15432 )	2022-01-31 17:03:06 +01:00
Soonhwan-Kwon	e09473a817	Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py (#13727 ) * add xlm roberta xl * add convert xlm xl fairseq checkpoint to pytorch * fix init and documents for xlm-roberta-xl * fix indention * add test for XLM-R xl,xxl * fix model hub name * fix some stuff * up * correct init * fix more * fix as suggestions * add torch_device * fix default values of doc strings * fix leftovers * merge to master * up * correct hub names * fix docs * fix model * up * finalize * last fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add copied from * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-29 13:42:37 +01:00
Steven Liu	16d4acbfdb	Get started docs (#15098 ) * clean commit of changes * apply review feedback, make edits * fix backticks, minor formatting * 🖍 make fixup and minor edits * 🖍 fix # in header * 📝 update code sample without from_pt * 📝 final review	2022-01-28 19:01:37 -06:00
Steven Liu	cabd6d26a2	Update model share tutorial (#15288 ) * add model sharing tutorial * 🖍 apply feedback from review * 📝 make edits * 🖍 fix formatting * 📝 convert from pt checkpoint to flax * 📝 final review	2022-01-28 18:49:26 -06:00
Suraj Patil	d25e25ee2b	Add XGLM models (#14876 ) * add xglm * update vocab size * fix model name * style and tokenizer * typo * no mask token * fix pos embed compute * fix args * fix tokenizer * fix positions * fix tokenization * style and dic fixes * fix imports * add fast tokenizer * update names * add pt tests * fix tokenizer * fix typo * fix tokenizer import * fix fast tokenizer * fix tokenizer * fix converter * add tokenizer test * update checkpoint names * fix tokenizer tests * fix slow tests * add copied from comments * rst -> mdx * flax model * update flax tests * quality * style * doc * update index and readme * fix copies * fix doc * update toctrr * fix indent * minor fixes * fix config doc * don't save embed_pos weights * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * address Sylvains commnets, few doc fixes * fix check_repo * align order of arguments * fix copies * fix labels * remove unnecessary mapping * fix saving tokenizer Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-28 18:55:23 +01:00
Ngo Quang Huy	4996922b6d	[docs] fix wrong file name in `pr_check` (#15380 )	2022-01-28 07:52:01 -05:00
Steven Liu	f5db6ce76a	Fix code format for Accelerate doc (#15335 ) * 🖍 fix code syntax to external libraries and replace image * 🔄revert code formatting, replace image with code block * 🖍 apply feedback	2022-01-27 13:49:04 -06:00
Lysandre	f87db5e412	Release: v4.16.0	2022-01-27 13:06:33 -05:00
Sylvain Gugger	8f6454bfac	Add proper documentation for Keras callbacks (#15374 ) * Add proper documentation for Keras callbacks * Add dummies	2022-01-27 10:51:38 -05:00
Stas Bekman	fc8fc400e3	[docs] post-PR merge fix (#15355 ) * [docs] post-PR merge fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-26 11:23:32 -08:00
novice	99a2771189	Add YOSO (#15091 ) * Add cookiecutter files * Add cuda kernels and cpp files * Update modeling_yoso.py * Add .h files * Update configuration_yoso.py * Updates * Remove tokenizer * Code quality * Update modeling_yoso.py * Update modeling_yoso.py * Fix failing test * Update modeling_yoso.py * Fix code quality * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review and fix integration tests * Update src/transformers/models/yoso/modeling_yoso.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Apply suggestions from code review * Fix copied from statement * Fix docstring * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions and fix mask * Apply suggestions from code review * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix docstrings * Fix code quality * Remove trailing whitespace * Update yoso.mdx * Move kernel loading to YosoEncoder * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/yoso/modeling_yoso.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Add short summary to docs * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update yoso.mdx * Update docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove CausalLM model and add copied from * Remove autoregressive code * Remove unused imports * add copied from for embeddings * Fix code quality * Update 
docs/source/model_doc/yoso.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestion from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-26 19:18:29 +01:00
Ngo Quang Huy	5d8b98608c	Fix deepspeed docs (#15346 )	2022-01-26 07:24:33 -05:00
Jacob Deppen	96161ac408	make table into valid Markdown table syntax (#15337 )	2022-01-26 07:10:00 -05:00
Maciej Pawłowski	e79a0faeae	Added missing code in exemplary notebook - custom datasets fine-tuning (#15300 ) * Added missing code in exemplary notebook - custom datasets fine-tuning Added missing code in tokenize_and_align_labels function in the exemplary notebook on custom datasets - token classification. The missing code concerns adding labels for all but first token in a single word. The added code was taken directly from huggingface official example - this [colab notebook](https://github.com/huggingface/notebooks/blob/master/transformers_doc/custom_datasets.ipynb). * Changes requested in the review - keep the code as simple as possible	2022-01-25 17:26:17 -05:00
Steven Liu	0501beb846	Add 🤗 Accelerate tutorial (#15263 ) * add accelerate tutorial * 🖍 apply feedback from review * 📝 make edits	2022-01-25 13:46:11 -06:00
novice	d43e308e7f	Add Swin Transformer (#15085 ) * Add all files * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updates * Apply suggestions from review * Fix failing tests * Update __init__.py * Update configuration_swin.py * Update auto_factory.py * Fix pytests * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fix tests and default checkpoint * Fix Recursion error * Code quality * Remove copied from * Update modeling_swin.py * Code quality * Update modeling_swin.py * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Fix feature extractor * Fix code quality * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review * Update configuration_swin.py * Update default checkpoint * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/swin.mdx Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> * Update conversion script * Reformat conversion script Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>	2022-01-21 12:10:41 +01:00
NielsRogge	515ed3ad2a	Fix doc examples (#15257 )	2022-01-20 21:51:51 +01:00
Kamal Raj	08b41b413a	Update pipelines.mdx (#15243 ) fix few spelling mistakes	2022-01-20 08:46:48 -05:00
NielsRogge	80f7296091	Update Trainer code example (#15070 ) * Update code example * Fix code quality * Add comment	2022-01-19 20:15:12 +01:00
NielsRogge	ac227093e4	Add ViLT (#14895 ) * First commit * Add conversion script * Make conversion script work for base model * More improvements * Update conversion script, works for vqa * Add indexing argument to meshgrid * Make conversion script work for ViltForPreTraining * Add ViltForPreTraining to docs * Fix device issue * Add processor * Add MinMaxResize to feature extractor * Implement call method of ViltProcessor * Fix tests * Add integration test * Add loss calculation for VQA * Improve tests * Improve some more tests * Debug tests * Small improvements * Add support for attention_mask * Remove mask_it * Add pixel_mask * Add tests for ViltFeatureExtractor * Improve tests * Add ViltForNaturalLanguageVisualReasoning * Add ViltForNaturalLanguageVisualReasoning to conversion script * Minor fixes * Add support for image_embeds, update docstrings to markdown * Update docs to markdown * Improve conversion script * Rename ViltForPreTraining to ViltForMaskedLM * Improve conversion script * Convert docstrings to markdown * Fix code example of retrieval model * Properly convert masked language model * Add integration test for nlvr * Fix code quality * Apply suggestions from code review * Add copied from statements * Fix pretrained_config_archive_map * Fix docs * Add model to README * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply more suggestions from code review * Make code more readable * Add ViltForNaturalLanguageVisualReasoning to the tests * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering * Replace pixel_values_2 by single tensor * Add hidden_states and attentions * Fix one more test * Fix all tests * Update year * Fix rebase issues * Fix another rebase issue * Remove ViltForPreTraining from auto mapping * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval * Make it possible to use BertTokenizerFast in the processor * Use BertTokenizerFast by default * 
Rename ViltForNaturalLanguageVisualReasoning, define custom model output Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-19 19:51:59 +01:00
NielsRogge	842298f84f	[ViTMAE] Various fixes (#15221 ) * Add MAE to AutoFeatureExtractor * Add link to notebook * Fix relative paths	2022-01-19 15:27:57 +01:00
Li-Huai (Allan) Lin	841d979190	Add FastTokenizer to REALM (#15211 ) * Remove BertTokenizer abstraction * Add FastTokenizer to REALM * Fix config archive map * Fix copies * Update realm.mdx * Apply suggestions from code review	2022-01-19 15:19:36 +01:00
Sylvain Gugger	db3503949d	Finish conversion of REALM doc to MDX	2022-01-18 18:00:30 -05:00
Jake Tae	fe78fe98ca	Enable tqdm toggling (#15167 ) * feature: enable tqdm toggle * test: add tqdm unit test * style: run linter * Update tests/test_tqdm_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * refactor: use tiny model, run linter * docs: add tqdm to logging * docs: add tqdm reference to `http_get` * style: run linter * Update docs/source/main_classes/logging.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * fix: use `AutoConfig` for framework agnostic testing * chore: mv tqdm test to `test_logging.py` * feature: implement enable/disable functions * docs: mv docstring to comment * chore: mv tqdm functions to `logging.py` * docs: update docs to reference `enable/disable` funcs * test: update test to use `enable/disable` func * chore: update function reference in comment Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-01-18 17:52:35 -05:00
NielsRogge	74bec9865c	Add MAE (#15120 ) * First draft * More improvements * More improvements * More improvements * Fix embeddings * Add conversion script * Finish conversion script * More improvements * Fix forward pass * Remove print statements * Add weights initialization * Add initialization of decoder weights * Add support for other models in the conversion script * Fix patch_size for huge model * Fix most of the tests * Fix integration test * Fix docs * Fix archive_list * Apply suggestions from code review * Improve documentation * Apply more suggestions * Skip some tests due to non-deterministic behaviour * Fix test_initialization * Remove unnecessary initialization of nn.Embedding * Improve docs * Fix dummies * Remove ViTMAEFeatureExtractor from docs * Add model to README and table of contents * Delete inference file	2022-01-18 16:21:32 +01:00
Li-Huai (Allan) Lin	22454ae492	Add REALM (#13292 ) * REALM initial commit * Retriever OK (Update new_gelu). * Encoder prediction score OK * Encoder pretrained model OK * Update retriever comments * Update docs, tests, and imports * Prune unused models * Make embedder as a module `RealmEmbedder` * Add RealmRetrieverOutput * Update tokenization * Pass all tests in test_modeling_realm.py * Prune RealmModel * Update docs * Add training test. * Remove completed TODO * Style & Quality * Prune `RealmModel` * Fixup * Changes: 1. Remove RealmTokenizerFast 2. Update docstrings 3. Add a method to RealmTokenizer to handle candidates tokenization. * Fix up * Style * Add tokenization tests * Update `from_pretrained` tests * Apply suggestions * Style & Quality * Copy BERT model * Fix comment to avoid docstring copying * Make RealmBertModel private * Fix bug * Style * Basic QA * Save * Complete reader logits * Add searcher * Complete searcher & reader * Move block records init to constructor * Fix training bug * Add some outputs to RealmReader * Add finetuned checkpoint variable names parsing * Fix bug * Update REALM config * Add RealmForOpenQA * Update convert_tfrecord logits * Fix bugs * Complete imports * Update docs * Update naming * Add brute-force searcher * Pass realm model tests * Style * Exclude RealmReader from common tests * Fix * Fix * convert docs * up * up * more make style * up * upload * up * Fix * Update src/transformers/__init__.py * adapt testing * change modeling code * fix test * up * up * up * correct more * make retriever work * update * make style * finish main structure * Resolve merge conflict * Make everything work * Style * Fixup * Fixup * Update training test * fix retriever * remove hardcoded path * Fix * Fix modeling test * Update model links * Initial retrieval test * Fix modeling test * Complete retrieval tests * Fix * style * Fix tests * Fix docstring example * Minor fix of retrieval test * Update license headers and docs * Apply suggestions from 
code review * Style * Apply suggestions from code review * Add an example to RealmEmbedder * Fix Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-18 07:24:13 -05:00
Stas Bekman	edd3fce2f7	[doc] new MoE paper (#15184 ) add new paper	2022-01-17 09:10:51 -08:00
Stas Bekman	669e3c50c9	[doc] performance: Efficient Software Prebuilds (#15147 ) * Efficient Software Prebuilds * improve	2022-01-14 18:25:20 -08:00
AK391	4663c609b9	Add "open in hf spaces" gradio button issue #73 (#15106 ) * update XLMProphetNet link * update DPR link * change prophetnet link * change link MBART * change link GPT * update gpt2 link * ctrl update link * update Transformer-XL link * Update Reformer link * update xlnet link * bert update link * update albert link * roberta update link * update distilbert link * update convbert link * update XLM link * xlm roberta update link * update Flaubert link * update electra link * update funnel transformer and longformer * bart update link * pegasus update link * update marianmt link * t5 update link * mt5 update link	2022-01-14 10:12:30 -05:00
Carlos Aguayo	3fc221d077	Update model_sharing.mdx (#15142 ) Fix typo	2022-01-13 12:26:02 -05:00
lewtun	021f2ea987	Add ONNX configuration classes to docs (#15121 ) * Add ONNX classes to main package * Remove permalinks from ONNX guide * Fix ToC entry * Revert "Add ONNX classes to main package" This reverts commit `eb794a5b00`. * Add ONNX classes to main doc * Fix syntax highlighting in doc * Fix text * Add FeaturesManager to doc * Use paths to reference ONNX classes * Add FeaturesManager to init * Add missing ONNX paths	2022-01-12 16:33:32 +01:00
Sylvain Gugger	c425d60bb9	Fix link to deepspeed config	2022-01-12 09:32:53 -05:00
lewtun	16f0b7d72c	Update ONNX docs (#14904 ) * Remove docs for deprecated ONNX export * Tidy up the CLI help messages * Revamp ONNX docs * Update auto-config table * Use DistilBERT as example for consistency * Wrap up first pass at ONNX docs * Fix table check * Add tweaks and introduction * Add cross-ref * Fix missing import * Fix style * Add permalinks to ONNX configs * Clarify role of OrderedDict * Update docs/source/serialization.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add doctest syntax to code blocks * Remove permalinks * Revert "Remove permalinks" This reverts commit `099701daf0`. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-11 18:06:05 +01:00
AK391	68d925195e	Merge branch 'master' into master	2022-01-11 11:11:29 -05:00
novice	28e091430e	Add Nystromformer (#14659 ) * Initial commit * Config and modelling changes Added Nystromformer-specific attributes to config and removed all decoder functionality from modelling. * Modelling and test changes Added Nystrom approximation and removed decoder tests. * Code quality fixes * Modeling changes and conversion script Initial commits to conversion script, modeling changes. * Minor modeling changes and conversion script * Modeling changes * Correct modeling, add tests and documentation * Code refactor * Remove tokenizers * Code refactor * Update __init__.py * Fix bugs * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/nystromformer.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/convert_nystromformer_original_pytorch_checkpoint_to_pytorch.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/nystromformer/configuration_nystromformer.py Co-authored-by: NielsRogge 
<48327001+NielsRogge@users.noreply.github.com> * Update modeling and test_modeling * Code refactor * .rst to .mdx * doc changes * Doc changes * Update modeling_nystromformer.py * Doc changes * Fix copies * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update configuration_nystromformer.py * Fix copies * Update tests/test_modeling_nystromformer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update test_modeling_nystromformer.py * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Fix code style * Update modeling_nystromformer.py * Update modeling_nystromformer.py * Fix code style * Reformat modeling file * Update modeling_nystromformer.py * Modify NystromformerForMultipleChoice * Fix code quality * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Code style changes and torch.no_grad() * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-11 14:25:49 +01:00
Virus	c4fa908fa9	Adds IBERT to models exportable with ONNX (#14868 ) * Add IBertOnnxConfig and tests * add all the supported features for IBERT and remove outputs in IbertOnnxConfig * use OnnxConfig * fix codestyle * remove serialization.rst * codestyle	2022-01-11 12:17:08 +01:00
AK391	5cd7086fdb	XLM-ProphetNet Spaces badge	2022-01-11 00:11:31 -05:00
AK391	4e3208662e	DPR Spaces badge	2022-01-10 13:50:40 -05:00
AK391	ac2c06d492	ProphetNet spaces badge	2022-01-10 13:43:34 -05:00
AK391	bf0201e184	MBART spaces badge	2022-01-10 13:37:17 -05:00
Yih-Dar	b67fd797be	Add TFVisionEncoderDecoderModel (#14148 ) * Start the work on TFVisionEncoderDecoderModel * Expose TFVisionEncoderDecoderModel * fix import * Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules() * reorder * Apply the fix for checkpoint loading as in #14016 * remove attention_mask + fix VISION_DUMMY_INPUTS * A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting * fix wrong condition: shape_list(input_ids) == 2 * add tests * use personal TFViTModel checkpoint (for now) * Add equivalence tests + projection layer * style * make sure projection layer can run * Add examples * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean comments (need to work on TODOs for PyTorch models) * Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel * fixes * Revert changes in PT code. * Update tests/test_modeling_tf_vision_encoder_decoder.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Add test_inference_coco_en for TF test * fix quality * fix name * build doc * add main_input_name * Fix ckpt name in test * fix diff between master and this PR * fix doc * fix style and quality * fix missing doc * fix labels handling * Delete auto.rst * Add the changes done in #14016 * fix prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-10 13:30:14 -05:00
AK391	c9504b2f50	MT5 Spaces badge	2022-01-10 12:57:08 -05:00
AK391	daec528ca9	T5 Spaces badge	2022-01-10 12:51:39 -05:00
AK391	0554e4d5c5	MarianMT Spaces badge	2022-01-10 12:47:12 -05:00
AK391	7ec6aad23d	Pegasus Spaces badge	2022-01-10 12:39:22 -05:00
AK391	03f8b9c9e0	BART Spaces badge	2022-01-10 12:33:59 -05:00
Stas Bekman	37bc0b4e53	[performance doc] Power and Cooling (#14935 ) * [performance doc] Power and Cooling * more docs * Update docs/source/performance.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * reword Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-10 09:21:04 -08:00
AK391	20f169b523	Longformer Spaces badge	2022-01-10 12:14:18 -05:00
AK391	4fbc924d0a	Funnel Transformer spaces badge	2022-01-10 12:06:05 -05:00
AK391	222c09a635	ELECTRA Spaces badge	2022-01-10 11:53:23 -05:00
Stas Bekman	31838d3e11	[doc] normalize HF Transformers string (#15023 )	2022-01-10 08:44:33 -08:00
AK391	84f360e862	FlauBERT spaces badge	2022-01-10 11:41:10 -05:00
AK391	9f33116898	XLM-Roberta Spaces badge	2022-01-10 10:54:18 -05:00
AK391	20fa9eb035	XLM Spaces badge	2022-01-10 10:48:06 -05:00
AK391	16b6df6fca	ConvBERT spaces badge	2022-01-10 10:33:03 -05:00
Santiago Castro	f21bc4215a	Use tqdm.auto in Pipeline docs (#14920 ) It's better for e.g. notebook.	2022-01-10 10:28:34 -05:00
Mishig Davaadorj	f012c00ada	Model summary horizontal banners (#15058 )	2022-01-10 10:06:14 -05:00
Minghao Li	b2c477fc6d	support the trocr small models (#14893 ) * support the trocr small models * resolve conflict * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/model_doc/trocr.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix unexpected indent in processing_trocr.py * Update src/transformers/models/trocr/processing_trocr.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * update the docstring of processing_trocr * remove extra space Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-01-10 09:28:03 -05:00
Yih-Dar	0a03a86813	fix model table cell text alignment (#14999 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-01-10 06:44:11 -05:00
AK391	5be1242ac0	Merge branch 'huggingface:master' into master	2022-01-07 11:48:22 -05:00
AK391	484e7a441f	Distilbert spaces badge	2022-01-07 11:47:56 -05:00
K.C. Tung	f18c6fa94c	Resubmit changes after rebase to master (#14982 )	2022-01-07 08:34:12 +01:00
AK391	1d71227295	Roberta spaces badge	2022-01-06 18:50:19 -05:00
AK391	cac877425c	ALBERT spaces badge	2022-01-06 13:01:23 -05:00
AK391	794441c379	BERT spaces badge	2022-01-06 12:22:09 -05:00
AK391	f872f18dca	XLNet spaces badge	2022-01-06 12:09:50 -05:00
AK391	8d187e7feb	Reformer Spaces badge	2022-01-06 11:59:21 -05:00
AK391	59fb636948	Transformer-XL badge	2022-01-06 11:47:41 -05:00
AK391	2380136722	add spaces badges	2022-01-04 16:13:57 -05:00
Kevin Ko	857ab55c01	[doc] Update parallelism.mdx (#15018 ) * Update parallelism.mdx * Update parallelism.mdx	2022-01-04 09:58:27 -08:00
Daniel Stancl	21aecc0971	Add Flax RoFormer (#15005 ) * Add FlaxRoFormer * Clean code + make quality * Fix output pooling for FlaxRoFormerForMultipleChoiceModule * Apply suggestions from code review * add flax model to repos Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-04 13:23:10 +01:00
Kevin Ko	f2ab21833f	Update parallelism.mdx (#15013 ) * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx * Update parallelism.mdx	2022-01-03 11:49:27 -08:00
Sylvain Gugger	8f6373c61c	Map model_type and doc pages names (#14944 ) * Map model_type and doc pages names * Add script * Fix typo * Quality * Manual check for Auto Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2022-01-03 05:08:55 -05:00
Sylvain Gugger	2c5597f6c7	Style	2021-12-27 19:18:08 -05:00
Sylvain Gugger	b5e2b183af	Doc styler examples (#14953 ) * Fix bad examples * Add black formatting to style_doc * Use first nonempty line * Put it at the right place * Don't add spaces to empty lines * Better templates * Deal with triple quotes in docstrings * Result of style_doc * Enable mdx treatment and fix code examples in MDXs * Result of doc styler on doc source files * Last fixes * Break copy from	2021-12-27 19:07:46 -05:00
Stas Bekman	e13f72fbff	[doc] :obj: hunt (#14954 ) * redo sans examples * style	2021-12-27 15:49:48 -08:00
Stas Bekman	133c5e40c4	[doc] consistent True/False/None default format (#14951 ) * [doc] consistent True/False/None default format * Update src/transformers/models/xlnet/modeling_xlnet.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-27 14:31:40 -08:00
Sylvain Gugger	b2f500256e	Convert last rst file (#14952 )	2021-12-27 17:09:37 -05:00
Daniel Stancl	501307b58b	Add `ElectraForCausalLM` -> Enable Electra encoder-decoder model (#14729 ) * Add ElectraForCausalLM and cover some basic tests & need to fix a few tests * Fix bugs * make style * make fix-copies * Update doc * Change docstring to markdown format * Remove redundant update_keys_to_ignore	2021-12-27 12:37:52 +01:00
Nicolas Patry	b058490ceb	ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines. (#14225 ) * Pipeline chunks. * Batching for Chunking pipelines ? * Batching for `question-answering` and `zero-shot-cls`. * Fixing for FNet. * Making ASR a chunk pipeline. * Chunking ASR API. * doc style. * Fixing ASR test. * Fixing QA eror (p_mask, padding is 1, not 0). * Enable both vad and simple chunking. * Max length for vad. * remove inference mode, crashing on s2t. * Revert ChunkPipeline for ASRpipeline. Too many knobs for simple integration within the pipeline, better stick to external convenience functions instead, more control to be had, simpler pipeline and also easier to replace with other things later. * Drop necessity for PT for these. * Enabling generators. * Add mic + cleanup. * Typo. * Typo2. * Remove ASR work, it does not belong in this PR anymore. * Update src/transformers/pipelines/pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding many comments. * Doc quality. * `hidden_states` handling. * Adding doc. * Bad rebase. * Autofixing docs. * Fixing CRITICAL bug in the new Zerocls pipeline. Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-12-27 11:26:20 +01:00
Yih-Dar	8f2cc1c3ab	Add TFCLIPModel (#13967 ) * Start the work for TFCLIPModel * Convert to TF code (TODO: loss + doc) * Clean up * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd * assert -> raise error * Expose TFCLIPModel * Deal with dummy_inputs * Add tests * Fix all tests. TODO: manual check weight loading + add more comments * Fix pt tf equivalence test * fixes * update TFCLIPVisionEmbeddings's Conv2D * Fix loss + overwrite test_pt_tf_model_equivalence from common * Add a comment about the change about MainLayer in test_keras_save_load * Set return_loss=True in TFCLIPModelTester + make tests pass * overwrite test_pt_tf_model_equivalence from tf common * fix base_model_prefix * Fix examples * remove unused * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply review suggestions * change self.pre_layrnorm to self.pre_layernorm * apply more review suggestions * return attention probs before dropout (to align with PT) * fix weight init * fix * build doc * fix missing doc * fix for test Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-23 11:19:44 -05:00
lewtun	6b655cc63f	Add ONNX support for MarianMT models (#14586 ) * First commit to add MarianMT to ONNX * Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward() * Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature * Style fix * Added support for other features for already supported models * Partial support for causal and seq2seq models * Partial support for causal and seq2seq models * Add default task for MarianMT ONNX * Remove automatic creation of decoder_input_ids * Extend inputs and outputs for MarianMT ONNX config * Add MarianMT to ONNX unit tests * Refactor * OnnxSeq2SeqConfigWithPast to support seq2seq models * Parameterized the onnx tests * Restored run_mlm.py * Restored run_mlm.py * [WIP] BART update * BART and MBART * Add past_key_values and fix dummy decoder inputs Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations. * Refactor MarianOnnxConfig to remove custom past_key_values logic * Fix quality * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Refactor Marian export to account for base changes * Fix copies * Implemented suggestions * Extend support for causal LM * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. 
* is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)" This reverts commit `0f4e39c559`. * is_torch_available test to avoid failing imports * sorting parameterize parameters to solve ERROR gw0 gw1 * tests fix * tests fix * GPT2 with past fix * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially * Removed onnx file * Implemented suggestions * Fixed __init__ to resolve conflict with master * Remove commented import * Remove ONNX model * Remove redundant class method * Tidy up imports * Fix quality * Refactor dummy input function * Add copied from statements to Marian config functions * Remove false copied from comments * Fix copy from comment Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2021-12-23 13:35:56 +01:00
Sylvain Gugger	207594be81	Convert rst files (#14888 ) * Convert all tutorials and guides * Convert all remaining rst to mdx * Track and fix bad links	2021-12-22 16:14:35 -05:00
NielsRogge	7df4b90c76	Fix Perceiver docs (#14879 )	2021-12-22 14:18:03 +01:00
Ryokan RI	824fd44fc3	Feature/fix slow test in mluke (#14749 ) * make MLukeTokenizerTest fast * make LukeTokenizerTest fast * add entry to _toctree.yaml	2021-12-22 06:35:59 -05:00
Lysandre Debut	ec3567fe20	Convert model files from rst to mdx (#14865 ) * First pass * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-22 03:27:30 -05:00
Stas Bekman	185876392c	[doc porting] several docs (#14858 ) * [doc porting] 2 docs * [doc porting] 2 docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/main_classes/deepspeed.mdx * cleanup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-21 09:55:25 -08:00
Stas Bekman	b6ec956976	[logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS (#14669 ) * [logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS * reword	2021-12-20 20:48:38 -08:00
Stas Bekman	c1125dc2ba	[doc] typo (#14849 ) fix small typo	2021-12-20 12:20:21 -05:00
Patrick von Platen	952a77b05d	[Perceiver] Skip multi-gpu tests for now (#14813 ) * [Perceiver] Skip multi-gpu tests for now * Update tests/test_modeling_perceiver.py * up * up	2021-12-20 15:22:50 +01:00
Derek Chia	8a818c26cb	Fix dead link to benchmarks.ipynb (#14842 ) Notebook has been updated here https://github.com/huggingface/notebooks/tree/master/examples/benchmark.ipynb	2021-12-20 09:08:05 -05:00
Anton Lozhkov	3883e3a75e	Add SD and SV heads for WavLM (#14847 ) * Add converted heads * Add dummies	2021-12-20 16:40:56 +03:00
Patrick von Platen	c4a96cecbc	Wav2Vec2 meets phonemes (#14353 ) * up * add tokenizer * improve more * finish tokenizer * finish * adapt speech recognition script * adapt convert * more fixes * more fixes * update phonemizer wav2vec2 * better naming * fix more tests * more fixes swedish * correct tests * finish * improve script * remove file * up * lets get those 100 model architectures until the end of the month * make fix-copies * correct more * correct script * more fixes * more fixes * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace assert * fix copies * fix docs * new try docs * boom boom * update * add phonemizer to audio tests * make fix-copies * up * upload models * some changes * Update tests/test_tokenization_wav2vec2_phoneme.py Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * more fixes * remove @ Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-17 19:56:44 +01:00
Lysandre Debut	77d6c826d8	Convert rst to mdx bert (#14806 ) * BERT to mdx mdx :) c * Update docs/source/model_doc/bert.mdx Co-authored-by: Julien Chaumond <julien@huggingface.co> * Remove all Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-17 11:13:34 -05:00
Patrick von Platen	bef1e3e4a0	Add WavLM (#14354 ) * first commit * fix some stuff * fix more readme * Apply suggestions from code review * update * correct * up * attn layer works * push code * make modedls work * Small change * more refactor * finish * up * fix convertsion * fix position bias * Fix style * fix conversion * make fix-copies * add * clean * fix docs * fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply final changes * make fix-copies Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-16 18:57:05 +01:00
Anton Lozhkov	48463ebb33	Add Speaker Diarization and Verification heads (#14723 ) * Models * Squashed commit of the following: commit 72278e1e931a16d0879acc77f65762f3364833d0 Author: anton-l <aglozhkov@gmail.com> Date: Fri Dec 10 21:45:08 2021 +0300 * Add unispeech heads * Add sd/sv automodels * Docs cleanup * Fix docstrings * rename xvector classes * examples * Tests cleanup * Style * Better checkpoints for tests * leftover docs * apply review suggestions * Style + init tests * Update unispeech-sat tdnn downsampling	2021-12-16 19:22:14 +03:00
Lysandre Debut	8010fda9bf	Removes images to put them in a dataset (#14781 ) * First try * Update instructions	2021-12-16 04:42:02 -05:00
Sylvain Gugger	459677aebe	PoC for conserving old links (#14754 ) * PoC for conserving old links * Do the same for other links * remap the redirects section * add instructions on how to move sections * improve Co-authored-by: Stas Bekman <stas@stason.org>	2021-12-15 11:40:47 -08:00
NielsRogge	50bc57cef8	Update Perceiver code examples (#14783 ) * Fix code examples * Fix code example	2021-12-15 11:06:38 -05:00
Xing Han Lu	72c6e8b8bf	Update t5.rst (#14776 )	2021-12-15 14:59:11 +01:00
Stas Bekman	fdf3ce2827	[doc] performance: groups of operations by compute-intensity (#14757 )	2021-12-14 19:01:23 -08:00
Amit Chaudhary	851a78978a	Fix broken links to distillation on index page of documentation (#14722 ) * Fix broken links to distillation on index page of documentation * Fix broken link for distillation in main README * Run make fixup	2021-12-14 21:55:33 -05:00
Sylvain Gugger	322d416916	Update Table of Contents (#14755 )	2021-12-13 17:15:19 -05:00
Sylvain Gugger	7533d30acd	Convert Trainer doc page to MarkDown (#14753 ) * Convert Trainer doc page to MarkDown * Fix repo consistency * Fix the doc build test job	2021-12-13 13:09:50 -05:00
Sylvain Gugger	c3cd88a9ba	Small fixes for the doc (#14751 )	2021-12-13 11:17:01 -05:00
Lucien	fc74c84537	Swap TF and PT code inside two blocks (#14742 )	2021-12-13 10:31:11 -05:00
Lysandre Debut	6e05bb1c96	Fix the perceiver docs (#14748 )	2021-12-13 09:29:47 -05:00
NielsRogge	4c99e553c1	Improve documentation of some models (#14695 ) * Migrate docs to mdx * Update TAPAS docs * Remove lines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add pt/tf switch to code examples * More improvements * Improve docstrings * More improvements Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-13 13:24:36 +01:00
Stas Bekman	027074f4d0	[doc] document MoE model approach and current solutions (#14725 ) * document MoE model approach * additional info from Samyam * fix	2021-12-10 18:24:38 -08:00
Sylvain Gugger	5eca742f6c	Fix special character in MDX (#14721 )	2021-12-10 16:02:48 -05:00
Sylvain Gugger	63c284c2d4	Prevent style_doc from tampering with the config file	2021-12-10 15:31:43 -05:00
Sylvain Gugger	1b75d7238c	Automatically build doc notebooks (#14718 ) * Test workflow * Build doc * Make a clean build * Add doc config * Restore other workflows * Final job * Print something in else statements * Pull before making changes	2021-12-10 14:20:56 -05:00
Sylvain Gugger	bab1556456	Put back open in colab markers (#14684 )	2021-12-09 12:00:06 -05:00
Tikeng Notsawo Pascal Junior	3bc7d70e9c	Fix : wrong link in the documentation (ConvBERT vs DistilBERT) (#14705 )	2021-12-09 11:35:22 -05:00
Mishig Davaadorj	60be4bf8ac	Fix typo in toctree (#14704 )	2021-12-09 09:25:31 -05:00
Sylvain Gugger	13186d7152	Move pyctcdecode (#14686 ) * Move pyctcdecode dep * Fix doc and last objects * Quality * Style * Ignore this black	2021-12-08 15:41:58 -05:00
Stas Bekman	1228661285	[bf16 support] tweaks (#14580 ) * [bf16 support] tweaks * corrections Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>	2021-12-08 11:33:24 -08:00
Sylvain Gugger	01b8cd5932	Revert open-in-colab and add perceiver (#14683 )	2021-12-08 13:52:31 -05:00
Sylvain Gugger	cf36f4d7a8	Convert tutorials (#14665 ) * Convert a few docs * And another * Last tutorials * New syntax for colab links * Convert a few docs * And another * Last tutorials * New syntax for colab links	2021-12-08 13:19:46 -05:00
NielsRogge	65b20b739b	Add Perceiver IO (#14487 ) * First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic 
decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal modal * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debuggin * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init * Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main 
init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI	2021-12-08 14:20:34 +01:00
Patrick von Platen	961732c276	[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339 ) * up * up * up * make it cleaner * correct * make styhahalal * add more tests * finish * small fix * make style * up * tryout to solve cicrle ci * up * fix more tests * fix more tests * apply sylvains suggestions * fix import * correct docs * add pyctcdecode only to speech tests * fix more tests * add tf, flax and pt tests * add pt * fix last tests * fix more tests * Apply suggestions from code review * change lines * Apply suggestions from code review Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * correct tests * correct tests * add doc string Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-08 12:07:54 +01:00
Ryokan RI	30646a0a3c	Add mLUKE (#14640 ) * implement MLukeTokenizer and LukeForMaskedLM * update tests * update docs * add LukeForMaskedLM to check_repo.py * update README * fix test and specify the entity pad id in tokenization_(m)luke * fix EntityPredictionHeadTransform	2021-12-07 00:25:28 -05:00
tucan9389	0f3f045ebd	Add GPTJForQuestionAnswering (#14503 ) * Add GPTJForQuestionAnswering * Reformat for GPTJForQuestionAnswering * Fix isort error * make style for GPTJForQA * Add _keys_to_ignore_on_load_missing * Change the sequence of qa and classification Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-06 11:44:10 -05:00
Matt	73ec4340ec	Make DefaultDataCollator importable from root (#14588 ) * Make DefaultDataCollator importable from root * Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings * make style * Add DefaultDataCollator to data_collator.rst * Add DefaultDataCollator to data_collator.rst	2021-12-03 15:15:09 -05:00
Stas Bekman	71b1bf7ea8	[trainer] add tf32-mode control (#14606 ) * [trainer] add --tf32 support * it's pt>=.17 * it's pt>=.17 * flip the default to True * add experimental note * simplify logic * style * switch to 3-state logic * doc * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * re-style code Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-03 10:08:58 -08:00
Lysandre Debut	ec47baeba2	2022 is the year of multi-modality (#14610 ) * 2022 is the year of multi-modality * Small fix * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Apply suggestions from code review * Apply to documentation index * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-03 11:35:44 -05:00
Daniel Stancl	50d909be28	[Flax] Add FlaxBlenderbotSmall (#14576 ) * [WIP] Add FlaxBlenderbotSmall * Revert some unintentionally changed files Revert some unintentionally files changed by improperly filled cookiecutter instructions. * Fix repo consistency * Fix Flax-PT equivalence * Apply suggestions from code review * Update index.mdx * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-12-02 14:21:48 +05:30
Mishig Davaadorj	275402bf2b	Update doc img links (#14593 ) * Update doc img links * Rename toctree.yml -> _toctree.yml (#14594) * Update doc img links * Update performance.md img link	2021-12-02 09:01:35 +01:00
Mishig Davaadorj	4f68de625c	Rename toctree.yml -> _toctree.yml (#14594 )	2021-12-02 08:58:39 +01:00
Stas Bekman	fbe278c76c	[doc] bf16/tf32 guide (#14579 ) * [doc] bf16/tf32 guide * expand * expand * Update docs/source/performance.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-12-01 14:18:58 -08:00
Sylvain Gugger	4df7d05a87	Doc new front (#14590 ) * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix typo in toctree (#14516) * Fix checkpoints badge * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix checkpoints badge * Fix typo in toctree (#14516) * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * 
Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Styling Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-01 14:13:02 -05:00
Suraj Patil	4c0dd199c8	FlaxGPTJ (#14396 ) * add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotray embeds * quality * doc and quality * fix equivalence tests	2021-12-01 10:57:39 +05:30
Suraj Patil	fc1d97f29d	VisionTextDualEncoder (#13511 ) * init vision_text_dual_encoder * fix merge * remove extra heads * fix tests * remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP * remove archive map * fix imports * fix more imports * fix init * delete tokenizers * fix imports * clean * support clip's vision model * handle None config * begin tests * more test and few fixes * warn about newly init weights * more tests * add loss to model * remove extra classes from doc * add processor * doc and small fixes * add start docstr * update flax model * flax tests * more flax tests * doc * quality * doc and quality * fix doc * doc * remove comments * update warning * quality * fix docs * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * replace asserts, fix imports * update imports * fix import * address some review comments * fix check * reduce tolerance * fix test * add flax integration test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address Sylvain's comments * fix style * add pt_flax_equivalence test in PT tests * add pt integration test * update test * use pre-trained checkpoint in examples Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-30 22:21:48 +05:30
Daniel Stancl	faacd74729	[Flax] Add FlaxBlenderbot (#13633 ) * Init Flax implementation for Blenderbot * Add a majority of stuff except for tests * make style quality * Add tests and fix some bugs * Add tests * Clean source code and fix some bugs * Fix copies and docs * Fix jax device condition for tests * Fix layer norm in the encoder * Fix a few typos in the test file * make fix-copies * make fix-copies * fix layer norm * Fix Flax params dtype (#13090) * Fix PR reference (#13098) * make fix-copies * Update tests/test_modeling_flax_blenderbot.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-11-30 17:36:54 +05:30
Kamal Raj	c468a87a69	Tapas tf (#13393 ) * TF Tapas first commit * updated docs * updated logger message * updated pytorch weight conversion script to support scalar array * added use_cache to tapas model config to work properly with tf input_processing * 1. rm embeddings_sum 2. added # Copied 3. + TFTapasMLMHead 4. and lot other small fixes * updated docs * + test for tapas * updated testing_utils to check is_tensorflow_probability_available * converted model logits post processing using numpy to work with both PT and TF models * + TFAutoModelForTableQuestionAnswering * added TF support * added test for TFAutoModelForTableQuestionAnswering * added test for TFAutoModelForTableQuestionAnswering pipeline * updated auto model docs * fixed typo in import * added tensorflow_probability to run tests * updated MLM head * updated tapas.rst with TF model docs * fixed optimizer import in docs * updated convert to np data from pt model is not `transformers.tokenization_utils_base.BatchEncoding` after pipeline upgrade * updated pipeline: 1. with torch.no_gard removed, pipeline forward handles 2. token_type_ids converted to numpy * updated docs. * removed `use_cache` from config * removed floats_tensor * updated code comment * updated Copyright Year and logits_aggregation Optional * updated docs and comments * updated docstring * fixed model weight loading * make fixup * fix indentation * added tf slow pipeline test * pip upgrade * upgrade python to 3.7 * removed from_pt from tests * revert commit `f18cfa9`	2021-11-30 11:07:55 +01:00
NielsRogge	25156eb296	Rename ImageGPT (#14526 ) * Rename * Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING	2021-11-29 10:19:11 +01:00
Xing Han Lu	ebbe8cc3fe	Tokenizers docs: Specify which class contains `__call__` method (#14379 ) * Update tokenizer.rst * Apply `make fixup`	2021-11-28 18:55:38 -05:00
Lysandre Debut	2318bf77eb	Fixes (#14534 )	2021-11-26 04:35:08 -05:00
Lysandre Debut	c15f4f203f	Quicktour updates (#14533 )	2021-11-26 04:09:31 -05:00
Chris Fregly	1bbd6fcdeb	added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error (#14529 ) * added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error * Update quicktour.rst * added >>> * dependencies * added space	2021-11-26 03:46:07 -05:00
Stas Bekman	956a483173	[deepspeed] zero inference (#14253 ) * [deepspeed] zero inference * only z3 makes sense for inference * fix and style * docs * rework * fix test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * responding to suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-23 14:09:15 -08:00
Sylvain Gugger	204d251310	Auto processor (#14465 ) * Add AutoProcessor class * Init and tests * Add doc * Fix init * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverts to tokenizer or feature extractor when available * Adapt test Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-11-22 12:17:38 -05:00
Daniel Stancl	e0e2da1194	Improve a add-new-pipeline docs a bit (#14485 )	2021-11-22 10:35:49 -05:00
Shang Zhang	a59e7c1ed4	Add QDQBert model and quantization examples of SQUAD task (#14066 ) * clean up branch for add-qdqbert-model * README update for QAT example; update docstrings in modeling_qdqbert.py * Update qdqbert.rst * Update README.md * Update README.md * calibration data using traning set; QAT example runs in fp32 * re-use BERTtokenizer for qdqbert * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove qdqbert tokenizer * Update qdqbert.rst * update evaluate-hf-trt-qa.py * update configuration_qdqbert.py * update modeling_qdqbert.py: add copied statement; replace assert with ValueError * update copied from statement * add is_quantization_available; run make fix-copies * unittest add require_quantization * add backend dependency to qdqbert model * update README; update evaluate script; make style * lint * docs qdqbert update * circleci build_doc add pytorch-quantization for qdqbert * update README * update example readme with instructions to upgrade TensorRT to 8.2 * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * change quantization to pytorch_quantization for backend requirement * feed_forward_chunking not supported in QDQBert * make style * update model docstrings and comments in testing scripts * rename example to 
quantization-qdqbert; rename example scripts from qat to quant * Update src/transformers/models/qdqbert/modeling_qdqbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * rm experimental functions in quant_trainer * qa cleanup * make fix-copies for docs index.rst * fix doctree; use post_init() for qdqbert * fix early device assignment for qdqbert * fix CI:Model templates runner Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-19 13:33:39 -05:00
NielsRogge	0490b98877	[ImageGPT] Small fixes (#14460 ) * Add integration test * Fix typo	2021-11-19 15:15:02 +01:00
NielsRogge	da36c557f7	Add ImageGPT (#14240 ) * First draft * More improvements * Improve conversion script * Fix init weights for layer norm * Fix correct model for conversion script * Don't tie input and output embeddings * Add print statements for debugging * Add print statements for debugging * Fix vocab size of model * Improve documentation, remove fast tokenizer * Add ImageGPTForImageClassification, improve docs * Fix docs issue * Set verbosity level back to info * Improve tests * Fix tests and add figure * Delete tokenizer file * Remove ImageGPTTokenizer from init files * Remove ImageGPTLayer from init files * Remove ImageGPT tokenizer from docs * First draft of ImageGPTFeatureExtractor * Fix typo * Fix bug * More improvements * Apply suggestions from code review, add tests for feature extractor * Fix layernorm * Update save_pretrained method * Fix issue * Make all tests of ImageGPTFeatureExtractor pass * Update code examples * Rename model inputs to pixel_values * Improve code examples * Update init_weights to post_init * Fix post_init	2021-11-18 16:24:34 +01:00
Patrick von Platen	754202de4f	[Bart] Fix docs (#14434 )	2021-11-17 19:02:33 +01:00
Lysandre	c6c075544d	Docs for version v4.12.5	2021-11-17 11:39:12 -05:00
NielsRogge	a2864a50e7	Improve semantic segmentation models (#14355 ) * Improve tests * Improve documentation * Add ignore_index attribute * Add semantic_ignore_index to BEiT model * Add segmentation maps argument to BEiTFeatureExtractor * Simplify SegformerFeatureExtractor and corresponding tests * Improve tests * Apply suggestions from code review * Minor docs improvements * Streamline segmentation map tests of SegFormer and BEiT * Improve reduce_labels docs and test * Fix code quality * Fix code quality again	2021-11-17 15:29:58 +01:00
Lysandre	888fb21159	Docs for v4.12.4	2021-11-16 17:40:58 -05:00
Patrick von Platen	4ce74edf51	[Speech2Text2] Enable tokenizers (#14390 ) * [Speech2Text2] Enable tokenizers * minor fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-15 16:34:11 +01:00
Stas Bekman	29dfb2dbb1	[doc] performance and parallelism updates (#14391 ) * [doc] performance and parallelism doc update * improve * improve	2021-11-14 17:19:15 -08:00
Nicolas Patry	5c153079e2	Adding some quality of life for `pipeline` function. (#14322 ) * Adding some quality of life for `pipeline` function. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improve the tests. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-10 10:18:35 +01:00
Steven Liu	e4d8f517b9	Rewrite guides for fine-tuning with Datasets (#13923 ) * rewrite guides for fine-tuning with datasets * simple qa code example * use anonymous rST links * style	2021-11-09 14:12:50 -05:00
Yih-Dar	be4a6c64dc	Add TFViTModel (#13778 ) * Start the work for TFViTModel * Convert to TF code - need to check in the follow up commits * Clean up model code * Expose TFViTModel * make style * make quality * Add test * make style & quality * Fix some imports * fix wrong usage - kwargs => * kwargs * Fix Conv2D weight loading (PT->TF) issue * Add tests for images with different sizes + fix model * Fix some common tests for TFViTModel * Use inputs instead of input_ids in test_compile_tf_model * Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name * Avoid transpose in TFViT call * Fix Conv2D issue in load_tf2_weights_in_pytorch_model * Use tf.keras.layers.Conv2D instead of tf.nn.conv2d * Using simpler heuristic to detect Conv2D layer * Change convert_tf_weight_name_to_pt_weight_name to return TransposeType * Check tf_weight_shape is not None before using it * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix missing comma * fix input dtype Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-09 07:54:37 -05:00
Yih-Dar	95b3ec3bc9	Add FlaxVisionEncoderDecoderModel (#13359 ) * Start the work on FlaxVisionEncoderDecoderModel * Add FlaxVisionEncoderDecoderModel * Add VisionEncoderDecoderConfig * Make FlaxVisionEncoderDecoderModel visible to transformers * Add test * Fix wrong getattr usage * Fix tests * Add FlaxAutoModelForVision2Seq * Expose FLAX_MODEL_FOR_VISION_2_SEQ_MAPPING * clean-up * add integration test * update expected logits * update expected scores * Add ViT2GPT2ModelIntegrationTest + some cleaning * Add projection layer + PT/Flax equivalence tests * Fix import * minor changes * make test slow again * Apply suggestions * Add modeling_flax_vision_encoder_decoder to _ignore_modules in get_model_modules() * fix copies * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * split long strings in multiple lines * decoder_input_ids can't be None * Add back test_configuration_tie * Remove attention_mask parameter * fix test - encoder_last_hidden_state should be encoder_outputs.last_hidden_state instead of the projected vector * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Remove more encoder_attention_mask * remove encoder_attention_mask when calling self.decode (in FlaxVisionEncoderDecoderModule) * Fix style + pass 1s instead of None as encoder_attention_mask * fix init_weights * pass None for encoder_attention_mask * pass 1s instead of None as encoder_attention_mask * Fix doc style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-09 15:14:28 +05:30
Xing Han Lu	843c326ee1	Update dpr.rst (#14300 )	2021-11-06 09:41:02 -04:00
Sylvain Gugger	f0d6e952c0	Quality explain (#14264 ) * Start PR doc * Cleanup the quality checks and document them * Add reference in the contributing guide * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename file as per review suggestion Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-11-03 17:43:19 -04:00
NielsRogge	5f789a687a	Add LayoutXLMProcessor (and LayoutXLMTokenizer, LayoutXLMTokenizerFast) (#14115 ) * Add LayoutXLMTokenizer and LayoutXLMTokenizerFast * Fix styling issues * Fix more styling issues * Fix more styling issues * Fix docstring * Fix unit tests * Fix docs * Fix unit tests * Fix typos and styling issues * Fix styling issues * Fix docstring * Make all tests of test_tokenization_layoutxlm pass * Add LayoutXLMProcessor * Make fixup * Make all LayoutXLMProcessor tests pass * Minor fixes * Leave LayoutLMv2Processor tests unchanged * Fix code quality * Move LayoutXLM tokenizers and processor to separate folder * Fix code quality * Apply suggestions from code review * Replace assertions by value errors * Remove methods from fast tokenizer Co-authored-by: King Yiu Suen <kingyiusuen@gmail.com>	2021-11-03 08:59:44 +01:00
Sylvain Gugger	558f8543ba	Update Transformers to huggingface_hub >= 0.1.0 (#14251 ) * Update Transformers to huggingface_hub >= 0.1.0 * Forgot to save... * Style * Fix test	2021-11-02 18:58:42 -04:00
lumliolum	519a677e87	Added Beit model output class (#14133 ) * add Beit model ouput class * inherting from BaseModelOuputWithPooling * updated docs if use_mean_pooling is False * added beit specific outputs in model docs * changed the import path * Fix docs Co-authored-by: Niels Rogge <niels.rogge1@gmail.com>	2021-11-02 18:29:14 +01:00
NielsRogge	e20faa6f03	Add BeitForSemanticSegmentation (#14096 ) * Add first draft * Make forward pass work * Improve conversion script * Add notebook that checks if it works * Add BeitForSemanticSegmentation to the tests * More improvements * Make BeitForSemanticSegmentation consistent with Segformer * Small bug fix * Add BeitForSemanticSegmentation to docs * Make sure model doesn't output hidden states when the user doesn't want to * Make it possible to convert the large model * Fix issue * Fix conversion script for large model * Add auxiliary_head option to semantic segmentation model * Apply suggestions from @sgugger's review * Apply suggestions from code review * Fix failing test Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-11-01 19:55:45 +01:00
Lysandre	9fc1951711	Docs for v4.12.2	2021-10-29 14:51:05 -04:00
Lysandre	513fa30a63	Docs for v4.12.1	2021-10-29 13:49:50 -04:00
Daniel Stancl	d37f1fb8ba	Add `BlenderbotTokenizerFast` (#13720 ) * Add the support for the fast (rust) implementation of BlenbderbotTokenizer * Fix a converter and a typo in a doc * Apply the patil-suraj's suggestion * (Nitpick) Fast tokenization -> Fast Tokenization in doc * Apply the SaulLu's suggestion * Apply Narsil's suggestion to fix test pipelines * Add encoder_no_repeat_ngram_size according to the Narsil's suggestion * Revert the last (unnecessary) commit * Override pipeline config for Blenderbot to allow for larger pos. emb. * make fix-copies	2021-10-29 09:19:01 -04:00
Nicolas Patry	be236361f1	Adding `batch_size` support for (almost) all pipelines (#13724 ) * Tentative enabling of `batch_size` for pipelines. * Add systematic test for pipeline batching. * Enabling batch_size on almost all pipelines - Not `zero-shot` (it's already passing stuff as batched so trickier) - Not `QA` (preprocess uses squad features, we need to switch to real tensors at this boundary. * Adding `min_length_for_response` for conversational. * Making CTC, speech mappings avaiable regardless of framework. * Attempt at fixing automatic tests (ffmpeg not enabled for fast tests) * Removing ffmpeg dependency in tests. * Small fixes. * Slight cleanup. * Adding docs and adressing comments. * Quality. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/zero_shot_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improving docs. * Update docs/source/main_classes/pipelines.rst Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com> * N -> oberved_batch_size softmax trick. * Follow `padding_side`. * Supporting image pipeline batching (and padding). * Rename `unbatch` -> `loader_batch`. * unbatch_size forgot. * Custom padding for offset mappings. * Attempt to remove librosa. * Adding require_audio. * torchaudio. * Back to using datasets librosa. * Adding help to set a pad_token on the tokenizer. * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Quality. 
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2021-10-29 11:34:18 +02:00
Lysandre	b8fad022a0	v4.13.0.dev0	2021-10-28 12:56:46 -04:00
Lysandre	62bf536631	Release v4.12.0	2021-10-28 12:09:49 -04:00
NielsRogge	1dc96a760d	Add SegFormer (#14019 ) * First draft * Make style & quality * Improve conversion script * Add print statement to see actual slice * Make absolute tolerance smaller * Fix image classification models * Add post_process_semantic method * Disable padding * Improve conversion script * Rename to ForSemanticSegmentation, add integration test, remove post_process methods * Improve docs * Fix code quality * Fix feature extractor tests * Fix tests for image classification model * Delete file * Add is_torch_available to feature extractor * Improve documentation of feature extractor methods * Apply suggestions from @sgugger's code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions of code review * Rebase with master * Fix rebase issues * Make sure model only outputs hidden states when the user wants to * Apply suggestions from code review * Add pad method * Support padding of 2d images * Add print statement * Add print statement * Move padding method to SegformerFeatureExtractor * Fix issue * Add casting of segmentation maps * Add test for padding * Add small note about padding Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-10-28 08:23:52 -04:00

2356 Commits