Adds FocalNet by Microsoft to transformers
---------
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: alaradirik <alaradirik@gmail.com>
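A minimal image-classification sketch for the newly added model (the checkpoint name `microsoft/focalnet-tiny` is an assumption based on the Microsoft release):
```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, FocalNetForImageClassification

# Checkpoint name assumed from the Microsoft release.
processor = AutoImageProcessor.from_pretrained("microsoft/focalnet-tiny")
model = FocalNetForImageClassification.from_pretrained("microsoft/focalnet-tiny")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```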
fix: docs: ko: sagemaker anchors and `_toctree.yml`
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Na Yeon Han <nayeon2.han@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
docs: ko: translated `custom_models.mdx`
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* docs: ko: init: tasks/sequence_classification.mdx
* docs: ko: revised: change vocabulary in tasks/sequence_classification.mdx
* docs: ko: revised: [RE] change vocabulary in tasks/sequence_classification.mdx
* docs: ko: revised: spell check and more natural sentences in tasks/sequence_classification.mdx
* docs: ko: revised: spell check and consistent vocabulary in tasks/sequence_classification.mdx
* docs: ko: revised: Add full stops and change vocabulary in tasks/sequence_classification.mdx
* docs: ko: revised: sync first section templates in tasks/sequence_classification.mdx
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* fix: revert use of full-stops to colons
* colons are used to emphasize the code block that follows
* @0525hhgus @wonhyeongseo docs: ko: revised: sync second section templates in tasks/sequence_classification.mdx
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
* docs: ko: revised: change 'train', 'finetuning' in tasks/sequence_classification.mdx
---------
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Add model to doc tests
* Remove generate and replace it with prepare_inputs_for_generation
* More fixes
* Remove print statements
* Update integration tests
* Fix generate
* Remove model from auto mapping
* Use auto processor
* Fix integration tests
* Fix test
* Add inference code snippet
* Remove is_encoder_decoder
* Update docs
* Remove notebook link
`generator(model="openai/whisper-large")` always returns an error. As the error says, the generator expects an input, such as the .flac file above. Moreover, the generator object has no parameter called `model`; parameters like `batch_size` can be passed when calling the instance, but the model has to be specified when instantiating the pipeline, not when calling it.
The correct call should therefore be:
generator = pipeline(model="openai/whisper-large", device=0)
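For completeness, a minimal end-to-end sketch of that usage (the audio file path is a placeholder):
```python
from transformers import pipeline

# Pass the model when instantiating the pipeline...
generator = pipeline(model="openai/whisper-large", device=0)
# ...and pass the input (e.g. a .flac file path) when calling the instance.
result = generator("audio.flac")  # "audio.flac" is an illustrative path
print(result["text"])
```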
* resolve conflicts
* rebase and make style
* test
* test
* test
* rebase and make style
* rebase and make style
* tests
* tests
* rewrite some functions
* rebase and make style
* fix load_tf_weights_in_cpmant
* reformat some unrelated files
* upgrade quality
* fix some bugs & docstring
* add models and tests
* solve conflicts
* resolve conflicts
* resolve conflicts
* resolve conflicts
* resolve conflicts
* tests
* resolve conflicts
* resolve conflicts
* fix load_tf_weights_in_cpmant
* reformat some unrelated files
* upgrade quality
* fix some bugs & docstring
* save resolution
* make style
* delete redefinition code
* reformat function
* reformat
* resolve conflicts
* resolve conflicts
* resolve conflicts
* resolve conflicts
* resolve conflicts
* tests
* resolve conflicts
* resolve conflicts
* fix load_tf_weights_in_cpmant
* reformat some unrelated files
* upgrade quality
* resolve conflicts
* resolve conflicts
* resolve conflicts
* resolve conflicts
* resolve conflicts
* fix load_tf_weights_in_cpmant
* reformat some unrelated files
* upgrade quality
* resolve conflicts
* make style
* fix bugs and refactor
* modify docstrings and make style
* unify import format in __init__.py
* fix import-altclip bug
* fix copies to update index.md
* fix unused config parameters
* fix unused config parameters
* fix unused config parameters
* update README_ja.md
* dummy commit for unit test
* fix attention mask
* add CPMAntTokenizer&-Fast to auto-mapping
* drop redundant changes in README_ko
* fix defaults in docstring
* fix use_cache and some docstring
* add missing args in tokenizer
* modify tester inheritance
* add is_jieba_available
* fix some bugs
* make style and fix-copies
* add doctests
* skip integration tests
* add is_jieba_available
* fix bugs in common tests
* adjust docstrings and make style
* add argument docstring
* adjust code to some specifications
* make style and fix-copies
* add fast tokenization test
* dummy commit for unit test
* dummy commit for unit test
* dummy commit for unit test
* normalize some comments and names
* Bert->CPMAnt
* camel-case names and drop redundant code
* make style and fix-copies
* add CpmTokenizerFast to _import_structure
* drop cpmanttokenizerfast in model_doc
* fix some problems
* fix CPMAnt tokenization for common test
* make style and fixup
* fix copies and fixup
* fix bugs in tokenization test
* dummy commit for connection failure in unittest
* fix copies
* drop trailing comma
* fix decorator in tests
* dummy commit for connection failure in unittest
---------
Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>
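A minimal generation sketch for the CPM-Ant model added above (checkpoint name assumed; per the `is_jieba_available` commits, CPM-Ant also needs the `jieba` package):
```python
from transformers import CpmAntForCausalLM, CpmAntTokenizer

# Sketch only: checkpoint name assumed from the OpenBMB release.
tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")
model = CpmAntForCausalLM.from_pretrained("openbmb/cpm-ant-10b")

# CPM-Ant is a Chinese LM; the prompt means "The weather is really nice today,"
inputs = tokenizer("今天天气真好，", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```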
* Adding Llama FastTokenizer support.
- Requires the version from https://github.com/huggingface/tokenizers/pull/1183.
- Only supports byte_fallback for Llama; raises otherwise (safety net).
- Lots of open questions about special tokens.
How to test:
```python
from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids
    assert encoded == encoded2, f"{encoded} != {encoded2}"
    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)
    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```
The converter + some test script.
The test script.
Tmp save.
Adding Fast tokenizer + tests.
Adding the tokenization tests.
Correct combination.
Small fix.
Fixing tests.
Fixing with latest update.
Rebased.
fix copies + normalized added tokens + copies.
Adding doc.
TMP.
Doc + split files.
Doc.
Versions + try import.
Fix Camembert + warnings -> Error.
Fix by ArthurZucker.
Not a decorator.
* Fixing comments.
* Adding more to docstring.
* Doc rewriting.
* Initial commit
* more stash commit
* Yet another stash commit
* yet more stash commit
* Mostly working except for docs / repo consistency
* Stop importing model list from torch file
* Add TF BLIP models to docs
* Add auto classes
* Move get_text_features and get_image_features
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/blip/test_modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/blip/test_modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/models/blip/test_modeling_tf_blip_text.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Use channels_last convolutions in TF (better performance + compatibility)
* Remove _shape function
* Move multi-line statement to one line in PT + TF
* Specify tf.keras.layers instead of importing from it
* Remove test_gradient_checkpointing and empty test_training methods
* move some multi-line statements to one line
* Update docstring for generate
* Remove pruned heads set
* Remove self.seq_len_dim
* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states
* ensure original model follows config in more cases
* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!
* Add training args throughout the models and layers
* make fixup
* Fix docstring for inputs_embeds
* Add docstring for is_decoder
* Add docstrings to text models
* Remove redundant computation
* Add unpack_inputs / keras_serializable
* Add modeling_tf_blip to doctests
* Add config classes for keras serialization
* Changes to allow model porting with pt-to-tf
* Quick fix to decoder head and test tweaks
* Revert an issue with masking the embeddings outputs
* Allow missing keys in some equivalence tests (for unused layers)
* Add tf-pt equivalence tests back in
* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make fixup
* Refactor invert_attention_mask out into tf_utils
* Re-enable cross-tests on the PT side too
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
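A short captioning sketch for the TF BLIP port added above (the checkpoint name is an assumption from the BLIP release):
```python
import requests
from PIL import Image
from transformers import BlipProcessor, TFBlipForConditionalGeneration

# Checkpoint name assumed; TF weights load with the usual from_pretrained call.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = TFBlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="tf")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```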
* Initial commit
* update modeling code
* update doc
* add functions necessary
* fix imports
* revert changes
* fixup
* more styling to get going
* remove standalone encoder
* update code
* styling
* fix config and model
* update code and some refactoring
* make more tests pass
* Adding NLLB-200 - MoE - 54.5B for No Language Left Behind (usage sketch after this commit block)
Fixes #21300
* fix more common tests
* style
* update testing file
* update
* update
* Router2 doc
* update check config with sparse layer
* add dummy router
* update current conversion script
* create on the fly conversion script
* Fixup
* style
* style 2
* fix empty return
* fix return
* Update default config sparse layers
* easier to create sparse layers
* update
* update conversion script
* update modeling
* add to toctree
* styling
* make ruff happy
* update docstring
* update conversion script
* update; will break tests but implementing top-2
* update
* ❗local groups are supported here
* ⚠️ Support for local groups is now removed ⚠️
This is because it would have to work with model parallelism, which we do not support
* finish simplification
* Fix forward
* style
* fixup
* Update modelling and test, refactoring
* update tests
* remove final layer norm as it is done in the FF
* routing works! Logits test added
* nit in test
* remove top1router
* style
* make sure sparse layers are tested; had to change route_tokens a little bit
* add support for unslip models when converting
* fixup
* style
* update tests
* update test
* REFACTOR
* encoder outputs match!
* style
* update testing
* 🎉encoder and decoder logits match 🎉
* styling
* update tests
* cleanup tests
* fix router test and CIs
* cleanup
* cleanup test styling
* fix tests
* Finally the generation tests match!
* cleanup
* update test
* style testing file
* remove script
* cleanup
* more cleanup
* nits
* update
* NLLB tokenizer is wrong and will be fixed soon
* use LongTensors
* update tests
* revert some small changes
* fix second expert sampling and batch prioritized routing
* update tests
* finish last tests
* make ruff happy
* update
* ruff again
* style
* Update docs/source/en/model_doc/nllb-moe.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updates based on review
* style and fix import issue
* nit
* more nits
* cleanup
* styling
* update test_seconde_expert_policy
* fix name
* last nit on the markdown examples
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
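The usage sketch referenced above for the NLLB-MoE commits. It is a sketch only: the checkpoint name is assumed from the release, and at 54.5B parameters the model will not fit on a single GPU.
```python
from transformers import AutoTokenizer, NllbMoeForConditionalGeneration

# Sketch only: checkpoint name assumed from the release.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-moe-54b")
model = NllbMoeForConditionalGeneration.from_pretrained("facebook/nllb-moe-54b")

inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["fra_Latn"],  # translate into French
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```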
* add mega file structure and plain pytorch version of mega source code
* added config class with old naming conventions
* filled in mega documentation
* added config class and embeddings with optional token types
* updated notes
* starting the conversion process, deleted intermediate and added use_cache back to config
* renamed config attributes in modeling_mega.py
* checkpointing before refactoring incremental decoding functions
* removed stateful incremental key/values for EMA and self-attention
* refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask
* MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement
* more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention
* bug fix in attention mask handling in MovingAverageGatedAttention
* removed incremental state from GatedCrossAttention and removed IncrementalState class
* finished gated cross attention and got MegaLayer working
* fixed causal masking in mega decoder
* fixed how padding and causal masks are passed through MegaLayer with and without k/v caching
* finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids
* added optional dense hidden layer for masked and causal LM classes
* docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention
* removed before_attn_fn in Mega class and updated docstrings and comments up to there
* bug fix in MovingAverageGatedAttention masking
* working conversion of MLM checkpoint in scratchpad script -- perfect matches
* moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters
* renamed gamma and beta parameters to avoid HF renaming when loading from checkpoint
* finished checkpoint conversion script
* cleanup old class in mega config script
* removed 'copied from' statements and passing integration tests
* added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing
* fixed tuple output of megamodel
* all common tests passing after fixing issues in decoder, gradient retention, and initialization
* added mega-specific tests, ready for more documentation and style checks
* updated docstrings; checkpoint before style fixes
* style and quality checks, fixed initialization problem in float_tensor, ready for PR
* added mega to toctree
* removed unnecessary arg in megaconfig
* removed unused arg and fixed code samples with leftover roberta models
* Apply suggestions from code review
Applied all suggestions except the one renaming a class, as I'll need to update that throughout
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA
* removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACT2FN, and added sequencenorm to layer norms
* reformatted .forward() docstrings to match style and removed unused mask input in cross-attention
* removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights()
* renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files
* variable names in NFFN
* manual Mega->MEGA changes in docs
* Mega->MEGA in config auto
* style and quality fixes
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments
* commit before dealing with merge conflicts
* made new attention activation functions available in ACT2FN and added generation test from OPT
* style and quality in activations and tests
* documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings
* style and quality fixes after latest updates, before rotary position ids
* causal mask in MegaBlock docstring + added missing device passing
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR
* style and quality fixes + readme updates pointing to main
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updated glossary with new terms, added abbreviations for certain terms, and merged autoencoding models, autoregressive models, and causal language modeling into encoder and decoder models
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Added link to 'Pipeline for inference' tutorial
* Trigger CI
* Update docs/source/en/glossary.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Added entry for self-supervised learning, re-added deleted entries + fixed broken links
* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add new model of MGP-STR
* fix the check failings
* remove torch and numpy from mgp_tokenization
* remove unused import from modeling_mgp_str
* add test_processing_mgp_str
* rm test_processing_mgp_str.py
* add test_processing_mgp_str
* add test_processing_mgp_str
* add test_processing_mgp_str
* rm test_processing_mgp_str and add softmax outs to model
* rm test_processing_mgp_str and add softmax outs to model
* rewrite the code of mgp-str according to PR suggestions
* rewrite the code of mgp-str according to PR suggestions
* add new model of MGP-STR
* fix the check failings
* remove torch and numpy from mgp_tokenization
* remove unused import from modeling_mgp_str
* add test_processing_mgp_str
* rm test_processing_mgp_str.py
* add test_processing_mgp_str
* add test_processing_mgp_str
* add test_processing_mgp_str
* rm test_processing_mgp_str and add softmax outs to model
* rewrite the code of mgp-str according to PR suggestions
* rewrite the code of mgp-str according to PR suggestions
* remove representation_size from MGPSTRConfig
* reformat configuration_mgp_str.py
* format test_processor_mgp_str.py
* add test for tokenizer and complete model/processor test and model file
* rm unnecessary tuple in modeling_mgp_str
* reduce hidden_size/layers/label_size in test_model
* add integration tests and change MGPSTR to Mgpstr
* add test for logit values
* reformat test model file
---------
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
In zsh, pip install fails when the package specifier is not quoted
Running
```
pip install transformers[torch]
```
in the default ZSH terminal will fail with the error `zsh: no matches found: transformers[torch]`
The solution is to wrap the package specifier in quotes, like
```
pip install 'transformers[torch]'
```
Relevant StackOverflow: https://stackoverflow.com/questions/30539798/zsh-no-matches-found-requestssecurity
* added informer to gitignore
* added informer to gitignore
* WIP informer2020
* added checking that instantiate works
* added config using gluonTS by kashif
* WIP config
* adding InformerConfig; need to remove FeatureEmbedder
* done InformerConfig, but need to change the names
* Done informer model init. working on enc-dec
* added things to address, after reading again enc-dec in the paper
* done modeling - checking initialization work
* added informer to gitignore
* WIP informer2020
* added checking that instantiate works
* added config using gluonTS by kashif
* WIP config
* adding InformerConfig; need to remove FeatureEmbedder
* done InformerConfig, but need to change the names
* Done informer model init. working on enc-dec
* added things to address, after reading again enc-dec in the paper
* done modeling - checking initialization work
* moved enc-dec init to InformerEncoder/Decoder init
* added 'init_std' to config, now model init works!
* WIP conversion script, and added code sources
* WIP conversion script: loading original informer pth works
* WIP conversion script: change defaults in the config
* WIP conversion script: supporting Informer input embedding
* WIP conversion script: added parameters for the informer embed
* WIP conversion script: change dim_feedforward=2048
* WIP conversion script: remove unused args for loading checkpoint
* just cleaning up
* DataEmbedding removed, after thinking with Kashif
* working on forward pass
* WIP forward pass: trying to establish working batch for forward pass
* cleaning and finalizing
* adding HF names and docs
* init after cleaning works
* WIP in tests
* added docs for the informer specific args
* fix style
* undo change
* cleaning informer, now need to work only on enc-dec
* initial enc-dec classes
* added encoder and decoder
* added todo
* add todos for conv_layers
* added decoder docs from vanilla
* added encoder docs from vanilla
* remove encoder decoder from the original informer
* removed AttentionLayer from the original paper
* removed TriangularCausalMask, same as decoder_attention_mask
* initial sparse attention
* use conv_layers
* fixed test_config test
* fix parenthesis when iterating zip(layers, conv_layers)
* error found in prob attention, added sizes as comments
* fix sizes
* added proposal for q_reduce indexing, and remove unused
* WIP ProbMask, and changed factor=2 for testing
* remove unused libs for this PR for creating the env
* fix checking the attn_weights.size() after bmm
* Q_reduce: changed from torch.gather to simple slicing
* WIP calculate final attn_output
* finish adding v_aggregated, attn_output ready
* changed tgt_len to u in attention_mask, need to fix the size error
* comment attention_mask for encoder, and fix if cond for v_agg
* added ProbMask support (wip), removed old original code
* finished ProbMask 😃
* Revert "remove unused libs for this PR for creating the env"
This reverts commit 11a081e09e.
* fixes
* make style
* fix initial tests
* fix more tests
* dry
* make style
* remove unused files
* style
* added integration tests
* fix num_static_real_features
* fix header
* remove unused function
* fix example
* fix docs
* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/informer/modeling_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fixes for reviewer
* use prediction_length from model
* fix style
* fixed informer.mdx
* added to index
* updated readme
* undo
* make fix-copies
* typo
* fix copy
* added Informer to toctree
* in order
* fixed comments
* remove unneeded new lines in docs
* make static real and cat optional
* fix use of distil conv layers
* fixed integration test
* added checkpoint for convlayer
* make fix-copies
* updated from time series model
* make fix-copies
* copy decoder
* fix unit tests
* updated scaling config
* fix integration tests
* IGNORE_NON_TESTED
* IGNORE_NON_AUTO_CONFIGURED
* IGNORE_NON_AUTO_CONFIGURED
* updated check configs
* fix formatting
* undo change from time series
* prediction_length should not be None
* align with the blog: prettify ProbSparse and change attention_factor to sampling_factor
* make style
* make fix-copies
* niels CR: update contributed by
* niels CR: update configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* niels CR: update kashif -> huggingface
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* niels CR: `sampling_factor` only relevant when `attention_type`=prob
* make style
* fixed U_part: added multiplication by `L_Q`
* fixed bug: remove `is not None` from `if config.distil`
* fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check
* fix integration tests
* updated model hub
* do not shift as in training
* undo
* fix make-copies
* make fix-copies
* added `if prediction_length is None`
* changed `ProbSparseAttention` to `InformerProbSparseAttention`
* changed `V_sum` -> `v_mean_dim_time`
* changed `ConvLayer` to `InformerConvLayer` and fixed `super()`
* TimeSeriesTransformer->Informer in decoder's Copied from
* more descriptive in ProbSparse
* make style
* fix copied from
* Revert "added `if prediction_length is None`"
This reverts commit b4cbddfa05.
* fixed indent
* use InformerSinusoidalPositionalEmbedding
* make fix-style
* fix from #21860
* fix name
* make fix-copies
* use time series utils
* fix dec num_heads
* docstring
* added time series util doc
* _import_structure
* formatting
* changes from review
* make style
* fix docs
* fix doc
* removed NegativeLogLikelihood
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
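A minimal forecasting sketch for the finished Informer model (the dataset repo and checkpoint names are assumptions based on the time-series examples):
```python
import torch
from huggingface_hub import hf_hub_download
from transformers import InformerForPrediction

# Dataset and checkpoint names assumed from the time-series examples.
file = hf_hub_download(
    repo_id="hf-internal-testing/tourism-monthly-batch", filename="train-batch.pt", repo_type="dataset"
)
batch = torch.load(file)

model = InformerForPrediction.from_pretrained("huggingface/informer-tourism-monthly")
with torch.no_grad():
    outputs = model(
        past_values=batch["past_values"],
        past_time_features=batch["past_time_features"],
        past_observed_mask=batch["past_observed_mask"],
        static_categorical_features=batch["static_categorical_features"],
        future_values=batch["future_values"],
        future_time_features=batch["future_time_features"],
    )
print(outputs.loss)
```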
* [Whisper] Add model for audio classification
* make fix-copies
* add to docs
* add docstring
* empty returns
* add code example
* switch to fleurs
* stick everything on one line
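A sketch of the code example mentioned above, for Whisper audio classification on FLEURS language ID (the checkpoint name is an assumption):
```python
import torch
from datasets import load_dataset
from transformers import AutoFeatureExtractor, WhisperForAudioClassification

# FLEURS language-ID checkpoint name assumed.
feature_extractor = AutoFeatureExtractor.from_pretrained("sanchit-gandhi/whisper-medium-fleurs-lang-id")
model = WhisperForAudioClassification.from_pretrained("sanchit-gandhi/whisper-medium-fleurs-lang-id")

ds = load_dataset("google/fleurs", "all", split="validation", streaming=True)
sample = next(iter(ds))["audio"]
inputs = feature_extractor(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```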
Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
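A minimal zero-shot image-text matching sketch for ALIGN (the checkpoint name `kakaobrain/align-base` is an assumption based on the release):
```python
import requests
import torch
from PIL import Image
from transformers import AlignProcessor, AlignModel

# Checkpoint name assumed from the release.
processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
model = AlignModel.from_pretrained("kakaobrain/align-base")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
# Softmax over image-text similarity scores gives label probabilities.
print(outputs.logits_per_image.softmax(dim=1))
```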
* zero shot object detection part 1
* added batch prediction section
* added image guided object detection section
* make style
* added the task guide to the TOC
* minor polishing
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* added embedded owlvit demo
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* minor fix
* make style
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
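A short sketch of the task the guide above covers, using the OwlViT checkpoint the guide is built around (checkpoint name assumed):
```python
import requests
from PIL import Image
from transformers import pipeline

detector = pipeline(task="zero-shot-object-detection", model="google/owlvit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Candidate labels are free-form text; no fixed label set is needed.
predictions = detector(image, candidate_labels=["cat", "remote control"])
for pred in predictions:
    print(pred["label"], round(pred["score"], 3), pred["box"])
```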
* add pipeline
* update init
* add zero shot to init
* update inits and correct checkpoints
* update base to support input features
* add tests
* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* update pipeline code
* use tiny checkpoint
* nits and expected value with tiny model
* style
* last nit on tests values
* fix styling
* fix collate fn that was casting to float
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
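A minimal sketch of the new zero-shot audio classification pipeline (the CLAP checkpoint and dataset names are assumptions):
```python
from datasets import load_dataset
from transformers import pipeline

# CLAP checkpoint name assumed.
classifier = pipeline(task="zero-shot-audio-classification", model="laion/clap-htsat-unfused")

dataset = load_dataset("ashraq/esc50", split="train[:1]")
audio = dataset[0]["audio"]["array"]

# Candidate labels are free-form text, matched against the audio embedding.
print(classifier(audio, candidate_labels=["Sound of a dog", "Sound of vacuum cleaner"]))
```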
* troubleshooting guide: added an error description for missing auto-mapping
* minor polishing
* changed the example
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/troubleshooting.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update expected output values, as Hub repo files are updated
* Update expected output values, as librosa is bumped from 0.9.2 to 0.10.0 in the CI docker image
* fix
* update one more
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* first draft of model summary
* restructure docs
* finish first draft
* ✨minor reviews and edits
* apply feedback
* save important info, create new page for attention
* add attention doc to toctree
* ✨ few more minor fixes
* config and tokenization (fast too) changed and ErnieEncoder added
* Slow Tokenization Added
* Tokenizer(slow) is now working and Fast Tokenizer removed
* Added Config code
* Added Base Model and utils
* ErnieMModel is now working
* All added except tests
* All tests passed except ErnieUIEM
* All tests passed
* all fixes done
* all fixes done
* fixed MAP
* fixed check_code_quality
* fixed Build PR Documentation issue
* Added changes(comments) and also updated to the latest upstream/main
* Added fixup
* Added # Copied comments
* Added fixup
* Added more comments and some nits
* Added fixup
* Fixed README_hd.md
* Added more fixes
* ErnieMTokenizer (being sentencepiece) protected and other docs edited
* Added code_quality fix
* Fixed for
* Added more fix
* modified AZ
* ernie-m tokenization test added!
* attention mask part fixed (with 0->self.config.pad_token_id)
* applied make fixup
* add: task guide on image captioning.
* Empty commit to trigger CI
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* address additional comments from the PR.
* fix: wording.
* Update docs/source/en/tasks/image_captioning.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add X-MOD to Readme
* Add documentation for X-MOD
* Implement X-MOD
* Fix formatting of X-MOD docs
* Change signature of X-MOD forward methods to use lang_ids
* Minor changes
* Rebase with main and run make fix-copies
* Make suggested changes to docstrings
* Improve code readability
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Fix code style
* Conversion script: Remove asserts and type annotations
* Remove _TOKENIZER_FOR_DOC
* XMOD -> Xmod
* Update copyright note
* Fix doctests
* Fix docstring
* Add integration test for FillMaskPipeline
* Revert "Add integration test for FillMaskPipeline"
This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f.
* Add end-to-end integration test for mask fill
* make style
* Rebase with main and make fix-copies
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
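A minimal sketch for the X-MOD model added above, showing the language-adapter selection the lang_ids commits refer to (checkpoint name assumed):
```python
import torch
from transformers import AutoTokenizer, XmodModel

# Checkpoint name assumed from the X-MOD release.
tokenizer = AutoTokenizer.from_pretrained("facebook/xmod-base")
model = XmodModel.from_pretrained("facebook/xmod-base")

# X-MOD routes through language-specific adapters; pick the language before the forward pass.
model.set_default_language("en_XX")

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```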
* Enforce single model initialization
* Add OneFormer example for problem 3
* Do it the Stas way
* Actually rename the uses...
* Rewrite test
* Try to change the test this way
* Fix all init slow/fast tests
* Break connection
* Fix more tests
* Fix test for initialization
* Remove custom test
* Quality
* Fix last failing tests
* The end?
* First draft
* More improvements
* More improvements
* Improve conversion script
* Convert all weights
* Make forward pass work
* Make logits match
* More improvements
* More improvements
* More improvements
* Use get_input_embeddings
* Improve some more
* Improve model tests
* Improve model tests
* More improvements
* Fix processor
* Update files
* Update prepare_inputs_for_generation
* More improvements
* Fix copies
* More fixes
* Make fixup
* More improvements
* Add support for seq2seq language model
* More improvements
* Fix test
* More improvements
* Improve conversion script
* Remove some todo's
* Fix README's
* Improve conversion script
* Fix generation
* Fix style and remove Blip2Model
* Fix model outputs
* More improvements
* Set eos_token_id in config
* Fix quality
* Small improvements
* Add processor tests
* More improvements
* Apply suggestions
* Apply suggestions
* Add integration test
* Update image URL
* Add integration test
* Fix model_type
* Update style
* Improve docs
* Add doc tests
* Fix copies
* Remove tests which are passing
* Improve some more
* Add tests for seq2seq language models
* Minor fix
* Convert more checkpoints
* finalize CI
* Fix blip and blip2 processors
* add `accelerate` support for `blip2`
* clean up
* make style
* Update conversion script
* Update conversion script some more
* Update organization
* revert toc file
* add blip-2 to toc file
* Some more improvements
* Fix docstring
* Improve docs
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
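A short captioning sketch for the BLIP-2 model added above (the checkpoint name is an assumption based on the converted checkpoints mentioned in the commits):
```python
import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Checkpoint name assumed from the converted checkpoints.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```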
* doc: introduce new section for XLM-V model
* doc: mention more details for XLM-V integration
* docs: paper abstract in italics, model identifier for base model added
* doc: mention new XLM-V support
* auto: add XLM-V mapping
* doc: run make fix-copies ;)
* Add a new test to check config attributes being used
* Add a new test to check config attributes being used
* Add a new test to check config attributes being used
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions
* Update allowed cases - part 1
* Update allowed cases - part 2
* final
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
* Add tutorial doc for TF + TPU
* Fix all those extra asterisks in the markdown
* Use the actual Tip formatting
* Remove unnecessary spaces
* Reformat checklist
* Fix checklist and reformat tips slightly
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add link to TPU notebook in the notebooks list
* Add links to the TPU notebook in the tutorial doc
* Make the markdown table a bit less wild
* Fix notebook link
* More notebook links
* More fixes to wild tables
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_decode in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash
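A text-to-speech sketch for the SpeechT5 model added above, in the spirit of the "add usage examples" commit (checkpoint names assumed; real speaker embeddings come from a speaker-verification model, so zeros are only a placeholder):
```python
import torch
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Checkpoint names assumed from the release.
processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Hello, my dog is cute.", return_tensors="pt")
speaker_embeddings = torch.zeros((1, 512))  # placeholder speaker embedding

# generate_speech runs the spectrogram generation loop and the HiFi-GAN vocoder.
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
print(speech.shape)
```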
* updated resources for LayoutLM
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fixed formatting, removed extra section
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>