transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-14 10:08:29 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	cf43200861	Add local agent (#23438 ) * Add local agent * Document LocalAgent	2023-05-18 11:09:55 -04:00
Joao Gante	db13634183	TF: GPT2 with native embedding layers (#23436 )	2023-05-18 14:46:40 +01:00
Nayeon Han	8cfae44093	🌐 [i18n-KO] Translated `tasks/zero_shot_object_detection.mdx` to Korean (#23430 ) docs: ko: zero_shot_object_detection	2023-05-18 08:52:17 -04:00
Joao Gante	5b1ad0eb73	Docs: add link to assisted generation blog post (#23397 )	2023-05-16 18:54:34 +01:00
Sohyun Sim	728c5e82cc	🌐 [i18n-KO] Translated `asr.mdx` to Korean (#23106 ) * docs: ko: task/asr.mdx * feat: manual draft * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>	2023-05-16 09:22:56 -04:00
Yih-Dar	21741e8c7e	Update `test_batched_inference_image_captioning_conditioned` (#23391 ) * fix * fix * fix test + add more docs --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-16 14:49:24 +02:00
richardachen	65b885027a	Typo suggestion (#23360 ) Update graphormer.mdx Typo suggestion	2023-05-15 12:04:16 +01:00
Shehan Munasinghe	c045249049	Add swiftformer (#22686 ) * Commit the automatically generated code using add-new-model-like * Update description at swiftformer.mdx file * remove autogenerated code for MaskedImageModeling * update weight conversion scripts * Update modeling_swiftformer.py * update configuration_swiftformer.py * Update test_modeling_swiftformer.py * update modeling code - remove einops dependency * Update _toctree.yml * update modeling code - remove copied from comments * update docs * Revert "update docs" This reverts commit `c2e05e2998`. * update docs * remove unused reference SwiftFormerImageProcessor * update dependency_versions_table.py * update swiftformer.mdx * update swiftformer.mdx * change model output type - no attentions * update model org name * Fix typo * fix copies * Update tests/models/swiftformer/test_modeling_swiftformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/swiftformer.mdx Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/swiftformer/configuration_swiftformer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_swiftformer.py fix-copies * make style, make quality, fix-copies * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fix-copies * Update modeling_swiftformer.py * Update modeling_swiftformer.py * Add suggestions from code review Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-12 11:52:31 +01:00
Freddy Boulton	662751b4e2	Fix typo in gradio-tools docs (#23305 ) Fix typo	2023-05-11 14:31:28 -04:00
Sylvain Gugger	f76fb3aeea	Fix broken links in the agent docs (#23297 )	2023-05-11 14:26:19 -04:00
Lysandre Debut	71b19ee251	Agents extras (#23301 ) * Agents extras * Add to docs	2023-05-11 14:25:51 -04:00
Mishig	436dc779a5	Update transformers_agents.mdx (#23289 ) Make `huggingface-tools` to [`huggingface-tools`](https://huggingface.co/huggingface-tools)	2023-05-11 08:54:02 -04:00
Mishig	125516977d	Update custom_tools.mdx: fix link (#23292 ) Wrong parantheses	2023-05-11 08:50:04 -04:00
Yih-Dar	9088fcae82	Bring back the PR `Refactor doctests + add CI` to `main` (#23271 ) * Revert "Revert "[Doctests] Refactor doctests + add CI" (#23245)" This reverts commit `69ee46243c`. * try not expose HfDocTestParser * move into testing_utils.py * remove pytest install --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-10 22:00:48 +02:00
Sylvain Gugger	eb5b5ce641	Render custom tool docs a bit better (#23269 ) * Try on a couple of blocks to see * Build the doc please * Build the doc please * Build the doc please * add more * Finish with all * Style	2023-05-10 11:58:20 -04:00
Sylvain Gugger	f93509b114	Refine documentation for Tools (#23266 ) * refine documentation for Tools * + one bugfix	2023-05-10 11:03:53 -04:00
Patrick von Platen	996f127a90	Improve Docs of Custom Tools and Agents (#23255 ) * Improve docs * correct tip format * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Correct grammer & spelling * Improve code style * make style ruff * make style final	2023-05-10 08:55:26 -04:00
Maria Khalusova	d3cbc997a2	[docs] Audio task guides fixes (#23239 ) trainer parameters fixed	2023-05-10 07:45:33 -04:00
Sylvain Gugger	3335724376	Test composition (#23214 ) * Remove nestedness in tool config * Really do it * Use remote tools descriptions * Work * Clean up eval * Changes * Tools * Tools * tool * Fix everything * Use last result/assign for evaluation * Prompt * Remove hardcoded selection * Evaluation for chat agents * correct some spelling * Small fixes * Change summarization model (#23172) * Fix link displayed * Update description of the tool * Fixes in chat prompt * Custom tools, custom prompt * Tool clean up * save_pretrained and push_to_hub for tool * Fix init * Tests * Fix tests * Tool save/from_hub/push_to_hub and tool->load_tool * Clean push_to_hub and add app file * Custom inference API for endpoints too * Clean up * old remote tool and new remote tool * Make a requirements * return_code adds tool creation * Avoid redundancy between global variables * Remote tools can be loaded * Tests * Text summarization tests * Quality * Properly mark tests * Test the python interpreter * And the CI shall be green. * fix loading of additional tools * Work on RemoteTool and fix tests * General clean up * Guard imports * Fix tools * docs: Fix broken link in 'How to add a model...' (#23216) fix link * Get default endpoint from the Hub * Add guide * Simplify tool config * Docs * Some fixes * Docs * Docs * Docs * Fix code returned by agent * Try this * Match args with signature in remote tool * Should fix python interpreter for Python 3.8 * Fix push_to_hub for tools * Other fixes to push_to_hub * Add API doc page * Docs * Docs * Custom tools * Pin tensorflow-probability (#23220) * Pin tensorflow-probability * [all-test] * [all-test] Fix syntax for bash * PoC for some chaining API * Text to speech * J'ai pris des libertés * Rename * Basic python interpreter * Add agents * Quality * Add translation tool * temp * GenQA + LID + S2T * Quality + word missing in translation * Add open assistance, support f-strings in evaluate * captioning + s2t fixes * Style * Refactor descriptions and remove chain * Support errors and rename OpenAssistantAgent * Add setup * Deal with typos + example of inference API * Some rename + README * Fixes * Update prompt * Unwanted change * Make sure everyone has a default * One prompt to rule them all. * SD * Description * Clean up remote tools * More remote tools * Add option to return code and update doc * Image segmentation * ControlNet * Gradio demo * Diffusers protection * Lib protection * ControlNet description * Cleanup * Style * Remove accelerate and try to be reproducible * No randomness * Male Basic optional in token * Clean description * Better prompts * Fix args eval in interpreter * Add tool wrapper * Tool on the Hub * Style post-rebase * Big refactor of descriptions, batch generation and evaluation for agents * Make problems easier - interface to debug * More problems, add python primitives * Back to one prompt * Remove dict for translation * Be consistent * Add prompts * New version of the agent * Evaluate new agents * New endpoints agents * Make all tools a dict variable * Typo * Add problems * Add to big prompt * Harmonize * Add tools * New evaluation * Add more tools * Build prompt with tools descriptions * Tools on the Hub * Let's chat! * Cleanup * Temporary bs4 safeguard * Cache agents and clean up * Blank init * Fix evaluation for agents * New format for tools on the Hub * Add method to reset state * Remove nestedness in tool config * Really do it * Use remote tools descriptions * Work * Clean up eval * Changes * Tools * Tools * tool * Fix everything * Use last result/assign for evaluation * Prompt * Remove hardcoded selection * Evaluation for chat agents * correct some spelling * Small fixes * Change summarization model (#23172) * Fix link displayed * Update description of the tool * Fixes in chat prompt * Custom tools, custom prompt * Tool clean up * save_pretrained and push_to_hub for tool * Fix init * Tests * Fix tests * Tool save/from_hub/push_to_hub and tool->load_tool * Clean push_to_hub and add app file * Custom inference API for endpoints too * Clean up * old remote tool and new remote tool * Make a requirements * return_code adds tool creation * Avoid redundancy between global variables * Remote tools can be loaded * Tests * Text summarization tests * Quality * Properly mark tests * Test the python interpreter * And the CI shall be green. * Work on RemoteTool and fix tests * fix loading of additional tools * General clean up * Guard imports * Fix tools * Get default endpoint from the Hub * Simplify tool config * Add guide * Docs * Some fixes * Docs * Docs * Fix code returned by agent * Try this * Docs * Match args with signature in remote tool * Should fix python interpreter for Python 3.8 * Fix push_to_hub for tools * Other fixes to push_to_hub * Add API doc page * Fixes * Doc fixes * Docs * Fix audio * Custom tools * Audio fix * Improve custom tools docstring * Docstrings * Trigger CI * Mode docstrings * More docstrings * Improve custom tools * Fix for remote tools * Style * Fix repo consistency * Quality * Tip * Cleanup on doc * Cleanup toc * Add disclaimer for starcoder vs openai * Remove disclaimer * Small fixed in the prompts * 4.29 * Update src/transformers/tools/agents.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Complete documentation * Small fixes * Agent evaluation * Note about gradio-tools & LC * Clean up agents and prompt * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Note about gradio-tools & LC * Add copyrights and address review comments * Quality * Add all language codes * Add remote tool tests * Move custom prompts to other docs * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * TTS tests * Quality --------- Co-authored-by: Lysandre <hi@lyand.re> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com> Co-authored-by: Connor Henderson <connor.henderson@talkiatry.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre <lysandre@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-09 20:37:57 -04:00
Sylvain Gugger	69ee46243c	Revert "[Doctests] Refactor doctests + add CI" (#23245 ) Revert "[Doctests] Refactor doctests + add CI (#22987)" This reverts commit `627f44799a`.	2023-05-09 15:26:15 -04:00
Arthur	627f44799a	[Doctests] Refactor doctests + add CI (#22987 ) * intiial commit * new styling * update * just run doctest in CI * remove more test for fast dev * update * update refs * update path and fetch upstream * update documentatyion trests * typo * parse pwd * don't check for files that are in hidden folders * just give paths relative to transformers * update * update * update * major refactoring * make sure options is ok * lest test that mdx is tested * doctest glob * nits * update doctest nightly * some cleaning * run correct test on diff * debug * run on a single worker * skip_cuda_test tampkate * updates * add rA and continue on failure * test options * parse `py` codeblock? * we don't need to replace ignore results, don't remember whyu I put it * cleanup * more cleaning * fix arg * more cleaning * clean an todo * more pre-processing * doctest-module has none so extra `- ` is needed * remove logs * nits * doctest-modules .... * oups * let's use sugar * make dataset go quiet * add proper timeout * nites * spleling timeout * update * properly skip tests that have CUDSA * proper skipping * cleaning main and get tests to run * remove make report? * remove tee * some updates * tee was removed but is the full output still available? * [all-test] * only our tests * don't touch tee in this PR * no atee-sys * proper sub * monkey * only replace call * fix sub * nits * nits * fix invalid syntax * add skip cuda doctest env variable * make sure all packages are installed * move file * update check repo * revert changes * nit * finish cleanup * fix re * findall * update don't test init files * ignore pycache * `-ignore-pycache` when running pytests * try to fix the import missmatch error * install dec * pytest is required as doctest_utils imports things from it * the only log issues were dataset, ignore results should work * more cleaning * Update .circleci/create_circleci_config.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * [ydshieh] empty string if cuda is found * [ydshieh] fix condition * style * [ydshieh] fix * Add comment * style * style * show failure * trigger CI --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-09 20:34:48 +02:00
Sylvain Gugger	b4d4d6fe87	Add RWKV-4 (#22797 ) * First draft of RWKV-4 * Add support for generate * Style post-rebase * Properly use state * Write doc * Fix doc * More math * Add model to README, dummies and clean config * Fix init * multiple fixes: - fix common tests - fix configuraion default values - add CI test for checking state computation - fix some CI tests * correct tokenizer * some tweaks - fix config docstring - fix failing tests * fix CI tests - add output_attention / output_hidden_states - override test_initialization - fix failing CIs * fix conversion script - fix sharded case - add new arguments * add slow tests + more fixes on conversion script * add another test * final fixes * change single name variable * add mock attention mask for pipeline to work * correct eos token id * fix nits * add checkpoints * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `tie_word_embeddings` in docstring * change tensor name * fix final nits * Trigger CI --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-09 13:04:10 -04:00
Rustin Welter	9a50cb6195	Add Japanese translation to accelerate.mdx (#23232 ) Co-authored-by: rustinwelter <rustinwelter.alwp9@slmails.com>	2023-05-09 10:51:43 -04:00
Furkan Akkurt	51ae566511	Fix typo ; Update output.mdx (#23227 )	2023-05-09 09:19:38 -04:00
Matthijs Hollemans	7f91950901	audio_utils improvements (#21998 ) * silly change to allow making a PR * clean up doc comments * simplify hertz_to_mel and mel_to_hertz * fixup * clean up power_to_db * also add amplitude_to_db * move functions * clean up mel_filter_bank * fixup * credit librosa & torchaudio authors * add unit tests * tests for power_to_db and amplitude_to_db * add mel_filter_bank tests * rewrite STFT * add convenience spectrogram function * missing transpose * fewer transposes * add integration test to M-CTC-T * frame length can be either window or FFT length * rewrite stft API * add preemphasis coefficient * move argument * add log option to spectrogram * replace M-CTC-T feature extractor * fix api thing * replace whisper STFT * replace whisper mel filters * replace tvlt's stft * allow alternate window names * replace speecht5 stft * fixup * fix integration tests * fix doc comments * remove manual FFT length calculation * fix docs * go away, deprecation warnings * combine everything into spectrogram function * add deprecated functions back * fixup	2023-05-09 09:10:17 -04:00
NielsRogge	431b04d8c4	[SAM] Add resources (#23224 ) Add resources	2023-05-09 08:58:19 -04:00
Connor Henderson	188a8bfccc	docs: Fix broken link in 'How to add a model...' (#23216 ) fix link	2023-05-08 14:56:42 -04:00
Ashwin Mathur	ef0c380c12	Update LLaMA docs with arxiv link (#23191 ) * Update docs with arxiv link * Update llama model docs	2023-05-07 18:52:44 -04:00
raghavanone	312b104ff6	Add FlaxWhisperForAudioClassification model (#23173 ) * Add FlaxWhisperForAudioClassification model * Add models to init * Add models to init * Fix copies * Fix automapping * Fix failing test	2023-05-05 13:23:46 -04:00
Gabriel Yang	40082d598b	🌐 [i18n-KO] docs: ko: Translate `multiple_choice.mdx` (#23064 ) * update doctree * doc: ko: translate multiple choice * Update reviews	2023-05-05 11:36:56 -04:00
Perry Huang	1b9c352e55	Add TrOCR resources (#23142 ) * Add TrOCR resources * Made fixes suggested by stevhliu	2023-05-05 11:29:20 -04:00
Sylvain Gugger	01734dba84	Revert "Add FlaxWhisperForAudioClassification model" (#23154 ) Revert "Add FlaxWhisperForAudioClassification model (#22883)" This reverts commit `c8f2c5c56e`.	2023-05-04 13:47:07 -04:00
Maria Khalusova	516dc6305f	[docs] Text to speech task guide (#23107 ) * First draft * Some polishing * Text polishing * added TOC entry for TTS * make style * added links to images * fixed links to images * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * feedback addressed * feedback from Matthijs addresed * Update docs/source/en/tasks/text-to-speech.mdx Co-authored-by: Matthijs Hollemans <mail@hollance.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Matthijs Hollemans <mail@hollance.com>	2023-05-04 13:17:13 -04:00
raghavanone	c8f2c5c56e	Add FlaxWhisperForAudioClassification model (#22883 ) * Add FlaxWhisperForAudioClassification model * Add models to init * Add models to init * Fix copies * Fix automapping	2023-05-04 13:00:16 -04:00
peter-sk	83b38fbea8	GPTNeoXForQuestionAnswering (#23059 ) * first draft - gives index error in question_answering.py * maturing * no labels * pipeline should know about QA * fixing checks * formatting * fixed docstring * initial commit * formatting * adding the class to many places * towards less unhappy checks * nearly there * and gpt neox for qa * use right model * forgot this one * base_model_prefix is "gpt_neox" for GPTNeoX* models * unnecessary stuff * Update src/transformers/models/gpt_neox/modeling_gpt_neox.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * format * Update src/transformers/models/gpt_neox/modeling_gpt_neox.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * removed gpt2 stuff --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-04 10:15:15 -04:00
Victor Geislinger	3b74889e8f	Remove typo in perf_train_gpu_many.mdx (#23144 ) - Excess `w` in the word `bottom`	2023-05-04 09:56:45 -04:00
digger-yu	5eeb556484	fix spelling error (#23143 ) change referrred to referred	2023-05-04 09:56:28 -04:00
peter-sk	78b7debf56	GPTNeoForQuestionAnswering (#23057 ) * first draft - gives index error in question_answering.py * maturing * no labels * pipeline should know about QA * fixing checks * formatting * fixed docstring * initial commit * formatting * adding the class to many places * towards less unhappy checks * nearly there * Update src/transformers/models/gpt_neo/modeling_gpt_neo.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * avoid error * moving to device of star/end_logits --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-03 15:59:19 -04:00
Julien Chaumond	ca7eb27ed5	[doc] Try a few ≠ ways of linking to Papers, users, and org profiles (#22611 ) * [doc] Try a few ≠ ways of linking to Papers, users, and org profiles * Empty commit * Empty commit now that the backend is fixed --------- Co-authored-by: Lysandre <lysandre@huggingface.co>	2023-05-03 18:23:09 +02:00
Nayeon Han	fbe0178f08	docs: ko: update `_toctree.yml` (#23112 ) * docs: ko: update `_toctree.yml` * fix: ko: update toc * fix: resolve suggestions * fix: resolve build issue --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-05-03 11:04:58 -04:00
Samin Yasar	b53004fdce	Add resources for LayoutLmV2 and reformat documentation resources (#23115 ) * add resources for layoutlmv2 * remove 🌎 from some resources	2023-05-03 09:53:00 -04:00
Sohyun Sim	f31a510bb3	🌐 [i18n-KO] Translated `torchscript.mdx` to Korean (#23060 ) * docs: ko: torchscript.mdx * feat: gpt and deepl draft * fix: manual edits * fix: edit anchor link * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>	2023-05-02 09:27:59 -04:00
peter-sk	2b0c924568	GPT2ForQuestionAnswering (#23030 ) * first draft - gives index error in question_answering.py * maturing * no labels * pipeline should know about QA * fixing checks * formatting * fixed docstring * make sure legacy code executes * comment * like this --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-05-02 09:25:46 -04:00
Nayeon Han	f9426eeb94	🌐 [i18n-KO] Translated `tasks/zero_shot_image_classification.mdx` to Korean (#23065 ) docs: ko: `tasks/zero_shot_image_classification` Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-05-01 20:11:56 -04:00
Jungnerd	92601d2eb1	🌐 [i18n-KO] Translated `tasks/question_answering.mdx` to Korean (#23012 ) docs: ko: `tasks/question_answering.mdx` to Korean Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com>	2023-05-01 11:05:40 -04:00
Hyeonseo Yun	78941b9fe5	🌐 [i18n-KO] Translated `tasks/image_classification.mdx` to Korean (#23048 ) * ko: init: tasks/image_classification.mdx * docs: ko: trans: tasks/image_classification.mdx * docs: ko: revise: sync glossary and spell check tasks/image_classification.mdx * docs: ko: revise: sync glossary tasks/image_classification.mdx * fix: resolve suggestions (github) image_classification.mdx Only github code review suggestion Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> * fix: resolve suggestions image_classification.mdx Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> --------- Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>	2023-05-01 09:50:05 -04:00
Zachary Mueller	9884862383	Depricate xpu_backend for ddp_backend (#23085 ) * Depricate xpu_backend for ddp_backend * Typo * Only do a minor deprecation, no need for major Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-01 09:44:47 -04:00
Ashwin Mathur	487f132a6f	Add `BioGPTForSequenceClassification` (#22253 ) * added BioGptForSequenceClassification * added source of copied code * typo * Format code with black * Update comments for copied code * Remove code copy comment * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix failing tests * Update code copied from comments * Fix code quality * Update src/transformers/models/biogpt/modeling_biogpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix lint error * Update src/transformers/models/biogpt/modeling_biogpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Rename model to biogpt for consistency * Add PipelineTesterMixin to test_modeling_biogpt.py * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Resolve merge confict --------- Co-authored-by: Guillem García Subies <37592763+GuillemGSubies@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-01 09:17:27 -04:00
s-JoL	c2c99dc7ef	add open-llama model with ckpt (#22795 ) * update Open-Llama model * update * update format * update doc * update * update stable embedding test * update test case * update format * update readme * fix typo * update name * remove tokenizer and update format * remove convert_open_llama_weights_to_hf * update warning and doc_string --------- Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com>	2023-04-28 11:01:32 -04:00
Maria Khalusova	521a8ffa53	[docs] Doc TOC updates (#23049 ) * first draft of toc restructure * polishing based on feedback	2023-04-28 09:24:28 -04:00
Hyeonseo Yun	4893d919f1	🌐 [i18n-KO] Translated `model_sharing.mdx` to Korean (#22991 ) * docs: ko: init: model_sharing.mdx * docs: ko: trans: model_sharing.mdx Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com> * docs: ko: revised: apply code reviews model_sharing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> * docs: ko: revised: apply aditional reviews model_sharing.mdx 1. Natural Expression 2. `파인 튜닝` to `미세 조정` 3. Glossary Sync Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> * docs: ko: revised: apply aditional reviews in model_sharing.mdx 1. Spell check 2. Natural Expression 3. Sync Glossary Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> * docs: ko: revised: `프로그래밍 방식` to `API` in model_sharing.mdx Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> --------- Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-04-28 09:20:33 -04:00
Ehsan M. Kermani	a0e7332839	Fix CLAP link across all READMEs (#23032 ) * Fix CLAP link across all READMEs * Fix copy only for en	2023-04-27 18:07:02 -04:00
peter-sk	d65b14ed67	added GPTNeoForTokenClassification (#22908 ) * added GPTNeoForTokenClassification * add to top-level init * fixup * test * more fixup * add to gpt_neo.mdx * repo consistency * dummy copy * fix copies * optax >= 0.1.5 assumes jax.Array exists - which it doesn't for jax <= 0.3.6 * merge with main made this superfluous * added classifier_dropout * remove legacy code * removed fmt:on/off removed expected_outputs * doc style fix * classifier_dropout is always in config --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-04-27 12:10:03 -04:00
peter-sk	614e191c4d	added GPTNeoXForTokenClassification (#23002 ) * initial commit * added GPTNeoXForTokenClassification * typo * doc fixed extra comma that turned into a tuple * unifying variable names fixing forward call * classifier_dropout is in config Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-27 11:08:26 -04:00
Nayeon Han	e28fff18b8	🌐 [i18n-KO] Translated `multilingual.mdx` to Korean (#23008 ) docs: ko: `multilingual.mdx` Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-04-27 08:06:12 -04:00
fxmarty	3042c63a95	Add methods to PreTrainedModel to use PyTorch's BetterTransformer (#21259 ) * fix mess * better documentation * typo * fix doc * update * add test * fix test * more tests * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * move to utils * Apply suggestions from code review Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com> * nit --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>	2023-04-27 11:03:42 +02:00
Ritik Nandwal	20ac86c6f1	Add TensorFlow Wav2Vec2 for sequence classification (#22073 ) * Add initial changes for TF wav2vec2 for sequence classification * Add suggested changes * Add serving and serving output methods * Add serving_output implementation and fix layer_weights * Add fixes * Fixed test cases * Fixing test and adding suggested changes	2023-04-26 13:35:30 +01:00
Hyeonseo Yun	4c2b4c4c3c	🌐 [i18n-KO] Translated `token_classification.mdx` to Korean (#22945 ) * docs: ko: init: token_classification.mdx * docs: ko: trans: tasks/token_classification.mdx * docs: ko: revise: apply suggestions tasks/token_classification.mdx right vocabulary, spell check, natural expression Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> * docs: ko: revise: `Hub` to `허브` in tasks/token_classification.mdx * docs: ko: revise: `example` in tasks/token_classification.mdx Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com> * docs: ko: revise: ko expression in tasks/token_classification.mdx Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> * Revert "docs: ko: revise: ko expression in tasks/token_classification.mdx" This reverts commit `8efe28059b`. * docs: ko: revise: `quick tour` in tasks/token_classification.mdx Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> --------- Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-04-26 07:56:14 -04:00
Sohyun Sim	6dc2474727	🌐 [i18n-KO] Translated `tasks/image_captioning.mdx` to Korean (#22943 ) docs: ko: tasks/image_captioning.mdx Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com> Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>	2023-04-26 07:54:58 -04:00
Daniel Levenson	4e1522d65a	Fix typo in mega.mdx (#22998 ) MegaConfiig -> MegaConfig	2023-04-25 17:58:45 -04:00
Wonhyeong Seo	d95045717e	🌐 [i18n-KO] Translated `serialization.mdx` to Korean (#22806 ) docs: ko: serialization.mdx Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>	2023-04-25 12:38:51 -04:00
Jari Van Melckebeke	81c1910c86	fixed small typo in code example (#22982 ) fixed typo in code example fixed a really small typo in the docs of single gpu inference	2023-04-25 08:56:21 -04:00
Nayeon Han	f0f5e28f82	🌐 [i18n-KO] Fixed `tasks/masked_language_modeling.mdx` (#22965 ) fix: docs: missing newline before code block	2023-04-25 09:59:17 +02:00
Joao Gante	e4a97f82bf	Generate: assisted generation with sample (take 2) (#22949 ) * temperature controls speed	2023-04-24 19:54:55 +01:00
Gabriel Yang	7701716efc	🌐 [i18n-KO] translate `create_a_model` doc to Korean (#22754 ) docs: ko: translates create_a_model.mdx Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-04-24 13:02:19 -04:00
amyeroberts	8f20e61c85	Update feature selection in to_tf_dataset (#21935 ) * Update feature selection * Check compatibility with datasets version * Checkout from datasets main	2023-04-24 17:34:30 +01:00
Matt	345a1371d8	Fix TF example in quicktour (#22960 ) * Fix TF example in quicktour * Fix model.fit() and the dataset section too	2023-04-24 17:25:13 +01:00
Nayeon Han	d6f1da6b71	🌐 [i18n-KO] Translated `run_scripts.mdx` to Korean (#22793 ) docs: ko: `run_scripts` to Korean Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-04-24 10:18:20 -04:00
Sohyun Sim	84097f6d38	🌐 [i18n-KO] Translated `tasks/summarization.mdx` to Korean (#22783 ) docs: ko: tasks/summarization.mdx Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>	2023-04-24 09:03:02 -04:00
Nayeon Han	093be36f6c	🌐 [i18n-KO] Translated `tasks/masked_language_modeling.mdx` to Korean (#22838 ) docs: ko: `tasks/masked_language_modeling.mdx` to Korean Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-04-24 09:02:21 -04:00
Arthur	df017c3ccc	[CLAP] Doc nits (#22957 ) clap nits	2023-04-24 14:00:29 +02:00
Hyeonseo Yun	137eb8e663	[i18n-KO] Translated `accelerate.mdx` to Korean (#22830 ) * docs: ko: init: accelerate.mdx * docs: ko: translated: accelerate.mdx * docs: ko: revised: natural expression accelerate.mdx Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> * docs: ko: revised: natural expression2 accelerate.mdx Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> --------- Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>	2023-04-24 07:49:05 -04:00
NielsRogge	3d3204c025	Add FocalNet (#21532 ) Adds FocalNet by Microsoft to transformers --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: alaradirik <alaradirik@gmail.com>	2023-04-23 20:03:05 +03:00
Connor Henderson	b950c38565	tests: Fix flaky test for NLLB-MoE (#22880 ) * add test update and docs edits * docs edit suggestion	2023-04-21 17:09:40 +01:00
fxmarty	3d852da2db	Expose AutoModelForMaskGeneration (#22910 ) * expose * style * add dummy object * amazed by the quality of transformers CI	2023-04-21 10:04:45 -04:00
Arthur	f143037789	Add `automatic-mask-generation` pipeline for Segment Anything Model (SAM) (#22840 ) * cleanup * updates * more refactoring * make style * update inits * support other inputs in base * update based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * Update tests/pipelines/test_pipelines_automatic_mask_generation.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * update * fixup * TODO x and y to refactor, _h _w refactored here * update docstring * more nits * style on these * more doc fix * rename variables * update * updates * style * update * fix `_mask_to_rle_pytorch` * styling * fix ask to rle, wrong outputs * add device arg * update * more updates, fix tets * udpate * update docstrings * styling * fixup * add notebook on the docs * update orginal sizes * fix docstring * updat condition on point_per-batch * updates tests * fix CI test * extend is required, append does not work! * fixup * fix CI tests * whit pixels left * address doc comments * fix doc * slow pipeline tests * update auto init * add revision * make fixup * update p!ipoeline tag when calling tests * alphabeitcal order in inits * fix copies * last style nits * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * reformat docstring * more reformat * address most of the comments * Update src/transformers/pipelines/mask_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * final refactor * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixup and fix slow tests * revert --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-04-20 19:27:24 +02:00
fxmarty	4cfe328bae	Fix SAM example in documentation (#22887 ) fix sam example	2023-04-20 12:22:42 +02:00
Younes Belkada	2da73f6302	[`SAM`] Correct arxiv link (#22886 ) put correct link	2023-04-20 11:23:12 +02:00
Arthur	474bf508df	Add Segment Anything Model (SAM) (#22654 ) * initial commit * keys match * update, fix conversion * fixes, inference working * fix * more fixes * more fixes * clean up * more clean up * fix copies and add convext copied layer norm * stash * pretty big upfate * cleaning * more cleaning * fixup stuffs * fix copies * fix iinit * update test removing tokenizer * nits * add pretrained * more nits * remove tracking of pipeline * few fixes * update san and conversion script * fix mask decoder and prompt encoder conversion * fixes * small update * fix order * fix * fix image embeddings * nites * few fixes * fix logits * clean up * fixes boxes inference * v1 AMG * clean up * some clean up * multi points support * amg working * fixup * clean up * readme * update toctree * fix type hint * multiple fixes * fixup * fixes * updates * updates * more tests * few fixes * change to `SamForMaskGeneration` * doc * fixup * fix more tests * multiple fixes * fix CI tests * refactor processor * renamings * draft the pipeline * refactor * fix tests * fix test * few cleanings * fix test * edit pipelien support chunking * udate * add slow tests * fix nit * fixup * fix nit * current chunk pipleine * cast boxes in fp32 * nit * current updates * piepleine works * fixup * clean up config * fix slow tests * fix slow tests * clean up * update doc and pipeline * adds more slow tests * fix slow tests * cleaning * tests pass * add docstring * fix copies * clean up * support batch of images * style * dummy is needed, add tests * fix slow tests * fix CI * update * adds more tests * fixes * fixes * fixup * fixes * few fixes * filter * few fixes * some refactor * touches finales * fix * style * remove pipeline files * fixes nits * revert pipeline changes * fix test * fixup * remove automodel for automatic mask generation * fix failing torch tests * update mdx * revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING` * update sam config based on review Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com> * update low_resolution_masks -> pred_masks inti ln with layer_norm_eps add_decomposed_rel_pos doc forward doc of SamForMaskGeneration * update processor docstring * remove image processor import empty * update for testing * output vision hidden states + clean recomm also test all iou values * fixup * fixup * remove unused * Update src/transformers/models/sam/modeling_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * nits * fix * fix CI tests and slow tests * replace with Amy's processor * clearer docstring * add `SamVisionNeck` * refactor - all CI tests should pass * fix broken import on Gcolab * few fixes here and there * fix another bug * fix more bugs * update and merge * correct ckpt * address comments * add tips * revert * fix docstring * replace with `SamModel` * make fixup * add support for bathed images and batch ed points * make fixup this time, really * make fixup again and again * few fixes here and there, this should be the touche finale * Update docs/source/en/model_doc/sam.mdx * fixup * correct checkpoints * correct name * rm unneeded file * add notebook --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: amyeroberts <aeroberts4444@gmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-04-19 21:01:49 +02:00
Joao Gante	78cda46f17	Generate: Add assisted generation (#22211 ) * working mvp * remove breakpoint * fix commit * standardize outputs * tmp commit * tests almost ready * tmp commit * skip a few models * Add streaming; Docs and examples * document limitations * PR commits * Amy PR comments	2023-04-18 17:36:56 +01:00
Gabriel Yang	42288269c3	🌐 [i18n-KO] Fix anchor links for docs `auto_tutorial`, `training` (#22796 ) docs: ko: fix anchor links for docs (auto_tutorial, training) Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Na Yeon Han <nayeon2.han@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-04-18 09:11:30 -04:00
Sylvain Gugger	dacd34568d	Mark auto models as important (#22815 ) * Mark auto models as important * Annoying file with bad line endings	2023-04-17 15:33:01 -04:00
Wonhyeong Seo	4d2c52e830	🌐 [i18n-KO] Translated `tasks/translation.mdx` to Korean (#22805 ) docs: ko: tasks/translation.mdx	2023-04-17 11:30:17 -04:00
Jungnerd	abbc96a214	[i18n-KO] fix: docs: ko: sagemaker anchors and `_toctree.yml` (#22549 ) fix: docs: ko: sagemaker anchors and `_toctree.yml` Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Na Yeon Han <nayeon2.han@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-04-17 07:41:52 -04:00
Na Yeon Han	18c894814e	🌐 [i18n-KO] Translated `custom_models.mdx` to Korean (#22534 ) docs: ko: translated `custom_models.mdx` Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-04-17 07:39:53 -04:00
Mayank Agarwal	daf53241d6	Fix word_ids hyperlink (#22765 ) * Fix word_ids hyperlink * Add suggested fix	2023-04-14 16:18:15 +01:00
Sohyun Sim	c8df3900c8	[WIP]🌐 [i18n-KO] Translated `tutorial/proprecssing.mdx` to Korean (#22578 ) * add ko preprocessing * translate preprocessing.mdx to korean * translate preprocessing.mdx * Update preprocessing.mdx Fixed the line 273 as below: 또한, 특징 추출기에 `sampling_rate` 인자를 추가하여 발생할 수 있는 조용한 오류(silent errors)를 더 잘 디버깅하는 것을 권장합니다. * translate Image part * translated preprocess.mdx * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * fixed translation --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-04-14 07:26:44 -04:00
Hyeonseo Yun	bfb3925fcb	🌐 [i18n-KO] Translated `sequence_classification.mdx` to Korean (#22655 ) * docs: ko: init: tasks/sequence_classification.mdx * docs: ko: revised: change voca in tasks/sequence_classification.mdx * docs: ko: revised: [RE] change voca in tasks/sequence_classification.mdx * docs: ko: revised: spell check and sentence naturally in tasks/sequence_classification.mdx * docs: ko: revised: spell check and consistent vocabulary in tasks/sequence_classification.mdx * docs: ko: revised: Add full stop and change voca in tasks/sequence_classification.mdx * docs: ko: revised: sync first section templates in tasks/sequence_classification.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * fix: revert use of full-stops to colons * colons are used to emphasize the code block that follows * @0525hhgus @wonhyeongseo docs: ko: revised: sync second section templates in tasks/sequence_classification.mdx Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> * docs: ko: revised: change 'train', 'finetuning' in tasks/sequence_classification.mdx --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-04-13 21:40:36 -04:00
Joao Gante	9dfd6a4baa	Generate: handle text conditioning with multimodal encoder-decoder models (#22748 )	2023-04-13 19:51:13 +01:00
Gabriel Yang	4def2fe969	🌐 [i18n-KO] Translated `training.mdx` to Korean (#22670 ) translate training doc to Korean	2023-04-13 11:04:47 -04:00
NielsRogge	8eb38f638d	[Pix2struct] Simplify generation (#22527 ) * Add model to doc tests * Remove generate and replace by prepare_inputs_for_generation * More fixes * Remove print statements * Update integration tests * Fix generate * Remove model from auto mapping * Use auto processor * Fix integration tests * Fix test * Add inference code snippet * Remove is_encoder_decoder * Update docs * Remove notebook link	2023-04-13 09:01:14 -04:00
ARKA1112	d87ef00c31	Modify pipeline_tutorial.mdx (#22726 ) generator(model="openai/whisper-large") always returns error. As the error says the generator expects an input, just like the .flac file above. Even the generator object has no parameters called model. While there are parameters which can be passed to generator like 'batch_size' but to pass a model i believe the the parameter has to be passed while instantiating the pipeline and not as a parameter to the instance. I believe the correct term should be: generator = pipeline(model="openai/whisper-large", device=0)	2023-04-12 15:20:25 +01:00
Younes Belkada	370f0ca18c	[`bnb`] Let's make serialization of int8 models possible (#22177 ) * make serialization of int8 models possible * make fixup * add docs * add ability to push to hub and save pretrained * fixes * more addition * more tests * fix issues * change variable * clearer message * adapt from suggestions * few fixes * remove unused function * Update src/transformers/utils/quantization_config.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address last comments * last warning * clarify doc * protect import * Update src/transformers/modeling_utils.py * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-12 08:01:18 -04:00
pioliverse	523ca4e016	add model resources for CPMAnt (new) (#20906 ) * resolve conflicts * rebase and make style * test * test * test * rebase and make style * rebase and make style * tests * tests * rewrite some functions * rebase and make style * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * add models and tests * solve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * save resolution * make style * delete redefinition code * reformat function * reformat * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * make style * fix bugs and refactor * modify docstrings and make style * unify import format in __init__.py * fix import-altclp bug * fix copies to update index.md * fix unused config parameters * fix unused config parameters * fix unused config parameters * update README_ja.md * dummy commit for unit test * fix attention mask * add CPMAntTokenizer&-Fast to auto-mapping * drop redundant changes in README_ko * fix defaults in docstring * fix use_cache and some docstring * add missing args in tokenizer * modify tester inheritance * add is_jieba_available * fix some bugs * make style and fix-copies * add doctests * skip integration tests * add is_jieba_available * fix bugs in common tests * adjust docstrings and make style * add argument docstring * adjust code to some specifications * make style and fix-copies * add fast tokenization test * dummy commit for unit test * dummy commit for unit test * dummy commit for unit test * normalize some comments and names * Bert->CPMAnt * camel names and drop redundant codes * make style and fix-coies * add CpmTokenizerFast _import_structure * drop cpmanttokenizerfast in model_doc * fix some problems * fix CPMAnt tokenization for common test * make style and fixup * fix copies and fixup * fix bugs in tokenization test * dummy commit for connection failure in unittest * fix copies * drop trailing comma * fix decorator in tests * dummy commit for connection failure in unittest --------- Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>	2023-04-12 07:33:20 -04:00
Arthur	b76e6ebd44	remove wrong doc in readme (#22723 )	2023-04-12 07:11:12 -04:00
Sylvain Gugger	28c19ab58d	Make it easier to develop without a dev install (#22697 ) * Make it easier to develop without a dev install * Remove ugly hack that doesn't work anyway	2023-04-11 08:41:53 -04:00
Sugawara	6daa9cb515	add GPTNeoXForSequenceClassification (#22671 ) * add GPTNeoXForSequenceClassification * move the labels to logits.device (ref: #22561) * fix	2023-04-10 11:52:23 -04:00
Kirill	14fc1a2467	Fix quantization docs typo (#22666 )	2023-04-10 08:53:53 -04:00
Joel Lamy-Poirier	e0921c6b53	Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575 ) * Add model with cli tool * Remove unwanted stuff * Add new code * Remove inference runner * Style * Fix checks * Test updates * make fixup * fix docs * fix doc * fix test * hopefully fix pipeline tests * refactor * fix CIs * add comment * rename to `GPTBigCodeForCausalLM` * correct readme * make fixup + docs * make fixup * fixes * fixes * Remove pruning * Remove import * Doc updates * More pruning removal * Combine copies * Single MQA implementation, remove kv cache pre-allocation and padding * Update doc * Revert refactor to match gpt2 style * Merge back key and value caches, fix some type hints * Update doc * Fix position ids pith padding (PR 21080) * Add conversion script temporarily * Update conversion script * Remove checkpoint conversion * New model * Fix MQA test * Fix copies * try fix tests * FIX TEST!! * remove `DoubleHeadsModel` * add MQA tests * add slow tests * clean up * add CPU checker * final fixes * fixes - fix GPU issue - fixed slow tests - skip disk offload * fix final issue * Simplify and comment baddbmm fix * Remove unnecessary code * Transpose tweaks * Use beta=1 on cpu, improve tests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-04-10 10:57:21 +02:00
Joao Gante	3f96e0b4e4	Generate: add API warning to streamers (#22659 ) add API warning	2023-04-07 14:15:20 -04:00
Wonhyeong Seo	fc1ba6fd11	🌐 [i18n-KO] Translated `pipeline_tutorial.mdx` to Korean (#22508 ) docs: feat: Korean pipeline_tutorial Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by: gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com> Co-authored-by: Na Yeon Han <nayeon2.han@gmail.com>	2023-04-07 11:27:59 -04:00
gabrielwithappy	d59034ff6f	🌐[i18n-KO] Translate `autoclass_tutorial` to Korean and Fix the typo of `quicktour` (#22533 ) translate the autoclass_tutorial and fix the typo of the quicktour	2023-04-07 08:12:35 -04:00
Nicolas Patry	1670be4bde	Adding Llama FastTokenizer support. (#22264 ) * Adding Llama FastTokenizer support. - Requires https://github.com/huggingface/tokenizers/pull/1183 version - Only support byte_fallback for llama, raise otherwise (safety net). - Lots of questions are special tokens How to test:
```python
from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```
The converter + some test script. The test script. Tmp save. Adding Fast tokenizer + tests. Adding the tokenization tests. Correct combination. Small fix. Fixing tests. Fixing with latest update. Rebased. fix copies + normalized added tokens + copies. Adding doc. TMP. Doc + split files. Doc. Versions + try import. Fix Camembert + warnings -> Error. Fix by ArthurZucker. Not a decorator. * Fixing comments. * Adding more to docstring. * Doc rewriting.	2023-04-06 09:53:03 +02:00
Younes Belkada	176ceff91f	Add DePlot + MatCha on `transformers` (#22528 ) * add deplot + matcha on `transformers` * more docs * correct path * Update docs/source/en/model_doc/deplot.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix * use auto processor * Update docs/source/en/model_doc/matcha.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Update docs/source/en/model_doc/deplot.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add correct names --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-04-05 17:43:48 +02:00
Wonhyeong Seo	f49b0762a1	docs: ko: complete `_toctree.yml` (#22581 ) Co-authored-by: gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com>	2023-04-05 09:32:17 -04:00
Shubhamai	900677487d	Flax Regnet (#21867 ) * initial commit * review changes * post model PR merge * updating doc	2023-04-04 12:41:12 -04:00
Matt	5f3ea66bc0	Add TF port of BLIP (#22090 ) * Initial commit * more stash commit * Yet another stash commit * yet more stash commit * Mostly working except for docs / repo consistency * Stop importing model list from torch file * Add TF BLIP models to docs * Add auto classes * Move get_text_features and get_image_features * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/blip/test_modeling_tf_blip_text.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use channels_last convolutions in TF (better performance + compatibility) * Remove _shape function * Move multi-line statement to one line in PT + TF * Specify tf.keras.layers instead of importing from it * Remove test_gradient_checkpointing and empty test_training methods * move some multi-line statements to one line * Update docstring for generate * Remove pruned heads set * Remove self.seq_len_dim * Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states * ensure original model follows config in more cases * Skip the same cross-attention tests in the PT tests - didn't realize we did it twice! * Add training args throughout the models and layers * make fixup * Fix docstring for inputs_embeds * Add docstring for is_decoder * Add docstrings to text models * Remove redundant computation * Add unpack_inputs / keras_serializable * Add modeling_tf_blip to doctests * Add config classes for keras serialization * Changes to allow model porting with pt-to-tf * Quick fix to decoder head and test tweaks * Revert an issue with masking the embeddings outputs * Allow missing keys in some equivalence tests (for unused layers) * Add tf-pt equivalence tests back in * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Refactor invert_attention_mask out into tf_utils * Re-enable cross-tests on the PT side too --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 16:05:22 +01:00
Arthur	00b5887b94	🚨🚨🚨 `[NLLB Tokenizer]` Fix the prefix tokens 🚨🚨🚨 (#22313 ) * fix the prefix tokens * update fast and test values * add legacy behaviour Co-authored-by: sgugger <sylvain.gugger@gmail.com> * update disclaimer, linkissue PR and behaviral changes * Apply suggestions from code review Co-authored-by: Lysandre Debut <hi@lysand.re> * styling * make a quote * quote this time --------- Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Lysandre Debut <hi@lysand.re>	2023-04-04 14:53:06 +02:00
Kirill	a60010566a	llama docs: fix conversion script url (#22514 )	2023-04-03 10:28:40 -04:00
Joao Gante	a55a822adf	Generate: `TextIteratorStreamer` (streamer for gradio) (#22501 ) * haha text go brrr (but in gradio)	2023-04-03 15:04:37 +01:00
Mohammed Jabir	7d25c9c81e	added biogpt token classifier (#22447 ) * added biogpt token classifier * fix reviews * Updated modeling_biogpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-04-03 09:20:02 -04:00
Jungnerd	1194c3e315	[WIP] docs: ko: sagemaker.mdx (#22509 ) docs: ko: sagemaker.mdx	2023-04-03 09:17:02 -04:00
Manuel de Prada	d5de578c22	Docs fix: Multinomial sampling decoding needs "num_beams=1", since by default it is usually not 1. (#22473 ) Fix: Multinomial sampling needs "num_beams=1", since by default is 5.	2023-03-30 11:04:12 -04:00
Joao Gante	228792a9dc	Generate: basic token streaming (#22449 ) * haha tokens go brrrr	2023-03-30 12:00:12 +01:00
fpgaminer	ed57c979b9	Fix bug in perplexity guide calculations and update perplexity numbers. Fixes #22348 (#22411 ) Fix bug in perplexity guide calculations and update perplexity numbers.	2023-03-28 09:09:17 -04:00
Arthur	19ade2426a	[WIP]`NLLB-MoE` Adds the moe model (#22024 ) * Initial commit * update modeling code * update doc * add functions necessary * fix impotrs * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix mor common tests * styke * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but impelemting top2 * update * ❗local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplificaiton * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer)norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a liottle bit * add support for unslip models when converting * fixup * style * update test s * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉encoder and decoder logits match 🎉 * styleing * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! * cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-27 19:42:00 +02:00
Nicola Procopio	204737fcc5	Translated documentation in italian (#22388 ) * updated toctree * added and translated mdx documents	2023-03-27 09:48:49 -04:00
Shubhamai	a0cbbba31f	Resnet flax (#21472 ) * [WIP] flax resnet * added pretrained flax models, results reproducible * Added pretrained flax models, results reproducible * working on tests * no real code change, just some comments * [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * removing loss from modeling flax output class * fixing classifier tests * fixing comments, model output * cleaning comments * review changes * review changes * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * renaming Flax to PyTorch --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-03-24 19:45:57 +00:00
Mitch Naylor	57f25f4b7f	Add Mega: Moving Average Equipped Gated Attention (#21766 ) * add mega file structure and plain pytorch version of mega source code * added config class with old naming conventions * filled in mega documentation * added config class and embeddings with optional token types * updated notes * starting the conversion process, deleted intermediate and added use_cache back to config * renamed config attributes in modeling_mega.py * checkpointing before refactoring incremental decoding functions * removed stateful incremental key/values for EMA and self-attention * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention * bug fix in attention mask handling in MovingAverageGatedAttention * removed incremental state from GatedCrossAttention and removed IncrementalState class * finished gated cross attention and got MegaLayer working * fixed causal masking in mega decoder * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids * added optional dense hidden layer for masked and causal LM classes * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention * removed before_attn_fn in Mega class and updated docstrings and comments up to there * bug fix in MovingAverageGatedAttention masking * working conversion of MLM checkpoint in scratchpad script -- perfect matches * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters * renamed gamma and beta parameters to avoid HF renaming when loading from 
checkpoint * finished checkpoint conversion script * cleanup old class in mega config script * removed 'copied from' statements and passing integration tests * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing * fixed tuple output of megamodel * all common tests passing after fixing issues in decoder, gradient retention, and initialization * added mega-specific tests, ready for more documentation and style checks * updated docstrings; checkpoint before style fixes * style and quality checks, fixed initialization problem in float_tensor, ready for PR * added mega to toctree * removed unnecessary arg in megaconfig * removed unused arg and fixed code samples with leftover roberta models * Apply suggestions from code review Applied all suggestions except the one renaming a class, as I'll need to update that througout Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights() * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files * variable names in NFFN * manual Mega->MEGA changes in docs * Mega->MEGA in config auto * style and quality fixes * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments * commit before dealing with merge conflicts * made 
new attention activation functions available in ACT2FN and added generation test from OPT * style and quality in activations and tests * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings * style and quality fixes after latest updates, before rotary position ids * causal mask in MegaBlock docstring + added missing device passing * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR * style and quality fixes + readme updates pointing to main --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-24 08:17:27 -04:00
Ashwin Mathur	b79607656b	Fix typo in Greedy Search Description (#22345 ) Fix typo in greedy search docs	2023-03-24 07:32:18 -04:00
Stas Bekman	73fdc8c5b4	[deepspeed zero3] need `generate(synced_gpus=True, ...)` (#22242 ) * [deepspeed zero3] need generate(synced_gpus=True, ...) * fix * rework per Sylvain's suggestion * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 12:18:57 -07:00
Younes Belkada	0f68a7f408	Add Pix2Struct (#21400 ) * v1 all keys match * clean up * forward pass ok * add correct image transform * generate works, logits matching * clean up * more refactor * revert * revert * clean up * clean ups * clean up * refactor * refactor * fix doc * fix tokenizer test * fix toctree * revert toctree * oops * few fixes * replace to `pixel_embeds` * make fixup * test processing & feat extractor * fix some tests * more fixes * make fixup * clean up * more clean up * add a single slow test * fix test * make fixup * fix * fix authors * fix toctree * update docs * add docstring * revert change * Update src/transformers/models/pix2struct/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer * fix processor test * fix test * make fixup * refactor * fix config * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * format * fix * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup * add docstring * fix issues * fix * fix * fix * add slow test * fix * fix * fix batched issue * fix training issues * fix ci test * fix slow test * fix conversion script * remove unneeded classes * fix slow test * fix require backends * fix masked fill * revert * fix softmax * add large models support * fix conditional generation * few fixes * add instructions * rm unneeded file * Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py * fix ci test * fix ci test really * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix nit * fix nits * fix image processors nits * docstring * clean up * fix nit * fix tests * docstring nit * fix reshape * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: 
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix nit * fix repetition * refactor processor * make patch size consistent * refactor forward * fix docstring * fix max_patches issue * update docstirng * update docstring * fix coped from * add skip reasons * few fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * format * fix doctests * refactor and fix * fix doc build issue * fix processor test * small fix conversion script * replace correct weights * make fixup * fix some issues * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert config and fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more details * fixes * fix processor * fix processor test * fix * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * fix processor * Update src/transformers/models/pix2struct/modeling_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add copied * make fixup * fix copies * update docstring * refactor * fix docstring * fix conversion script * fix vqa issue * replace to `flattened_patches` * nit * fix numpy issue * fix image processors * add batched vqa support * fix vqa conversion * make fixup * fix conversion script * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add correct docstring * update docstring * fix module level + channel dim * use `make_list_of_images` * refactor * correct docstring * fix authors * remove `data_format` * add header text test * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * make fixup * add checkpoints --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-22 16:53:52 +01:00
Stas Bekman	89a0a9eace	[deepspeed] offload + non-cpuadam optimizer exception doc (#22044 ) * [deepspeed] offload + non-cpuadam optimizer exception doc * deps	2023-03-21 17:00:05 -07:00
Davide Gazzè	86c7931a70	Add translation perf_infer_gpu_one for it (#22296 ) Add translation	2023-03-21 13:07:30 -04:00
Maria Khalusova	7bd8650512	Example of pad_to_multiple_of for padding and truncation guide & docstring update (#22278 ) * added an example of pad_to_multiple_of * make style * addressed feedback	2023-03-20 14:18:55 -04:00
amyeroberts	8ac29fe090	Fix doc links (#22274 )	2023-03-20 17:07:31 +00:00
Sylvain Gugger	786092a35e	Rework a bit the LLaMA conversion script (#22236 ) * Update LLaMA conversion script * Doc * Fix the weight size for the 13B checkpoint * Update src/transformers/models/llama/convert_llama_weights_to_hf.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-03-20 11:30:36 -04:00
Nicola Procopio	c4bf6f38bd	Italian translation perf_infer_cpu (#22243 ) * added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree * updated toctree * added file perf_infer_cpu.medx * italian translation perf_infer_cpu.mdx	2023-03-20 09:16:07 -04:00
Seb0	074490b2c2	fix(docs): fix task guide links in model docs (#22226 ) fix(docs): task guide links in model docs	2023-03-17 14:30:17 +00:00
Maria Khalusova	314cdf7c25	Removed .mdx extension in two links (#22230 ) removed .mdx extension	2023-03-17 10:27:12 -04:00
lewtun	f251441387	Add LlamaForSequenceClassification (#22209 ) * Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-17 14:39:26 +01:00
Sylvain Gugger	00934026a4	LLaMA house-keeping (#22216 ) * LLaMA house-keeping * Doc links	2023-03-17 08:55:15 -04:00
Maria Khalusova	42f8f76402	Depth estimation task guide (#22205 ) * added doc to toc, auto tip with supported models, mention of task guide in model docs * make style * removed "see also" * minor fix	2023-03-17 08:36:23 -04:00
wangpeng	af1c864cdc	fix code example in mgp-str doc (#22219 ) Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>	2023-03-17 09:40:06 +00:00
Kevin Turner	33d033d694	fix typos in llama.mdx (#22223 )	2023-03-17 08:43:18 +00:00
Jason Phang	0041be5b3d	LLaMA Implementation (#21955 ) * LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by: Stella Biderman <stellabiderman@gmail.com>	2023-03-16 09:00:53 -04:00
Baelish03	09922da4a7	Italian Translation of migration.mdx (#22183 ) * Tranlstion Italian: migration * Update migration.mdx minor fixes * Update _toctree.yml * Delete migration.mdx * Add italian translation of migration.mdx * Update of migration.mdx translation and toctree	2023-03-16 12:00:07 +00:00
Alara Dirik	1485bd9c02	Fix typo in Align docs (#22199 ) Fix align docs typo	2023-03-16 13:41:48 +03:00
Nicola Procopio	7f5ad6c35b	Translation Italian: perf_train_cpu and perf_train_cpu_many (#22151 ) * added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree	2023-03-14 11:09:36 +00:00
Yih-Dar	ff88703501	Update 2 doctest expected values for torch 2.0.0 (#22148 ) update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 09:13:16 +00:00
Alara Dirik	cdddfbffa1	Add ConvNeXT V2 (#21679 ) * Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues	2023-03-14 12:08:14 +03:00
MichaelRipa	101a6cd276	docs: New terms and updates to glossary (#21982 ) * Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added link to 'Pipeline for inference' tutorial * Trigger CI * Update docs/source/en/glossary.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added entry for self supervised learning, added deleted entries + fixed broken links * Update docs/source/en/glossary.mdx Co-authored-by: Steven 
Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 19:09:37 -04:00
Stas Bekman	618697ef53	[deepspeed docs] Activation Checkpointing (#22099 ) * [deepspeed docs] Activation Checkpointing * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update deepspeed.mdx --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 12:52:42 -07:00
Maria Khalusova	8def252de2	Zero-shot image classification task guide (#22132 ) * WIP * WIP * manual inference example * make style * Apply suggestions from code review Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> --------- Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>	2023-03-13 10:57:17 -04:00
Nicola Procopio	dd3a0580a6	Added big_models.mdx italian translation #17600 (#22115 ) * updated toctree * italian translation big_model.mdx * italian translation big_models	2023-03-13 10:02:03 -04:00
Alex Calabrese	0c883766bd	Add pr_checks.mdx Italian translation (#17459 ) (#22116 ) * Add pr_checks.mdx Italian translation (#17459) * Updated pr_checks.mdx Italian translation (#17459)	2023-03-13 09:24:34 -04:00
wangpeng	102b5ff4a8	add new model of MGP-STR (#21418 ) * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processer test and model file * rm Unnecessary tupple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>	2023-03-13 10:11:31 +00:00
Alara Dirik	32e3466d38	Add AutoModelForZeroShotImageClassification (#22087 ) Adds AutoModelForZeroShotImageClassification to transformers	2023-03-13 12:46:14 +03:00
Maria Khalusova	bdec2768bd	GPT-J specific half precision on CPU note (#22086 ) * re: #21989 * update re: #21989 * removed cpu option * make style	2023-03-10 14:03:43 -05:00
Kevin Jiang	ade26bf991	Fix small typo in flan-ul2.mdx (#22068 ) * Update flan-ul2.mdx * Update flan-ul2.mdx	2023-03-10 07:44:45 -05:00


1993 Commits