transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-21 21:49:06 +06:00

Author	SHA1	Message	Date
Wonhyeong Seo	57943630e2	Add Llama2 resources (#25531 ) * docs: feat: model resources for llama2 Co-authored-by: Woojun Jung <hello_984@naver.com> * fix: add description for dpo and rearrange posts * docs: feat: add llama2 notebook resources * style: one liners for each resource Co-Authored-By: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-Authored-By: Kihoon Son <75935546+kihoon71@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fix typo Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Woojun Jung <hello_984@naver.com> Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-08-22 17:14:54 -07:00
Yih-Dar	40a0cabd93	Update doc toctree (#25661 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-22 22:58:55 +02:00
Alex McKinney	5eeaef921f	Adds `TRANSFORMERS_TEST_BACKEND` (#25655 ) * Adds `TRANSFORMERS_TEST_BACKEND` Allows specifying arbitrary additional import following first `import torch`. This is useful for some custom backends, that will require additional imports to trigger backend registration with upstream torch. See https://github.com/pytorch/benchmark/pull/1805 for a similar change in `torchbench`. * Update src/transformers/testing_utils.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Adds real backend example to documentation --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-08-22 17:08:13 +02:00
Sylvain Gugger	3629190689	Put IDEFICS in the right section of the doc (#25650 )	2023-08-22 10:39:10 +02:00
Blake Wyatt	6a314ea7cd	[DOCS] MusicGen Docs Update (#25510 ) * docs: note token limitations for MusicGen * docs: note token limitations for MusicGen * docs: fix token count with token limitations for MusicGen	2023-08-22 08:22:45 +02:00
Susnato Dhar	450a181d8b	Add Pop2Piano (#21785 ) * init commit * config updated also some modeling * Processor and Model config combined * extraction pipeline(upto before spectogram & mel_conditioner) added but not properly tested * model loading successful! * feature extractor done! * FE can now be called from HF * postprocessing added in fe file * same as prev commit * Pop2PianoConfig doc done * cfg docs slightly changed * fe docs done * batched * batched working! * temp * v1 * checking * trying to go with generate * with generate and model tests passed * before rebasing * . * tests done docs done remaining others & nits * nits * LogMelSpectogram shifted to FeatureExtractor * is_tf rmeoved from pop2piano/init * import solved * tokenization tests added * minor fixed regarding modeling_pop2piano * tokenizer changed to only return midi_object and other changes * Updated paper abstract(Camera-ready version) (#2) * more comments and nits * ruff changes * code quality fix * sg comments * t5 change added and rebased * comments except batching * batching done * comments * small doc fix * example removed from modeling * ckpt * forward it compatible with fe and generation done * comments * comments * code-quality fix(maybe) * ckpts changed * doc file changed from mdx to md * test fixes * tokenizer test fix * changes * nits done main changes remaining * code modified * Pop2PianoProcessor added with tests * other comments * added Pop2PianoProcessor to dummy_objects * added require_onnx to modeling file * changes * update .md file * remove extra line in index.md * back to the main index * added pop2piano to index * Added tokenizer.__call__ with valid args and batch_decode and aligned the processor part too * changes * added return types to 2 tokenizer methods * the PR build test might work now * added backends * PR build fix * vocab added * comments * refactored vocab into 1 file * added conversion script * comments * essentia version changed in .md * comments * more tokenizer 
tests added * minor fix * tests extended for outputs acc check * small fix --------- Co-authored-by: Jongho Choi <sweetcocoa@snu.ac.kr>	2023-08-21 16:35:00 +01:00
mchau	6f041fcbb8	fix documentation for CustomTrainer (#25635 ) fix doc	2023-08-21 17:23:17 +02:00
Stas Bekman	6c811a322f	new model: IDEFICS via HuggingFaceM4 (#24796 ) * rename * restore * mappings * unedited tests+docs * docs * fixes * fix auto-sync breakage * cleanup * wip * wip * add fetch_images * remove einops dependency * update * fix * fix * fix * fix * fix * re-add * add batching * rework * fix * improve * add Leo as I am extending his work * cleanup * fix * cleanup * slow-test * fix * fix * fixes * deal with warning * rename modified llama classes * rework fetch_images * alternative implementation * cleanup * strict version * cleanup * [`IDEFICS`] Fix idefics ci (#25056) * Fix IDEFICS CI * fix test file * fixup * some changes to make tests pass * fix * fixup * Update src/transformers/models/idefics/configuration_idefics.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * remove compat checks * style * explain that Idefics is not for training from scratch * require pt>=2.0 * fix idefics vision config (#25092) * fix idefics vision config * fixup * clean * Update src/transformers/models/idefics/configuration_idefics.py --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * cleanup * style * cleanup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * upcase * sequence of images * handle the case with no images * Update src/transformers/image_processing_utils.py Co-authored-by: Victor SANH <victorsanh@gmail.com> * support pure lm take 2 * support tokenizer options * parameterize num_channels * fix upcase * s\|IdeficsForCausalLM\|IdeficsForVisionText2Text\|g * manual to one line * addressing review * unbreak * remove clip dependency * fix test * consistency * PIL import * Idefics prefix * Idefics prefix * hack to make tests work * style * fix * fix * revert * try/finally * cleanup * clean up * move * [`IDEFICS`] Fix idefics config refactor (#25149) * refactor config * nuke init weights * more 
refactor * oops * remove visual question answering pipeline support * Update src/transformers/models/idefics/clip.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/models/idefics/modeling_idefics.py * cleanup * mv clip.py vision.py * tidyup --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> * fix * license * condition on pt * fix * style * fix * rm torchvision dependency, allow custom transforms * address review * rework device arg * add_eos_token * s/transforms/transform/ * fix top level imports * fix return value * cleanup * cleanup * fix * style * license * license * Update src/transformers/models/idefics/image_processing_idefics.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add a wrapper to freeze vision layears * tidyup * use the correct std/mean settings * parameterize values from config * add tests/models/idefics/test_image_processing_idefics.py * add test_processor_idefics.py * cleanup * cleanups * fix * fix * move to the right group * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add perceiver config * reset * missing arg docs * Apply suggestions from code review Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com> * address review comments * inject automatic end of utterance tokens (#25218) * inject automatic end of utterance tokens * fix * fix * fix * rework to not use the config * not end_of_utterance_token at the end * Update src/transformers/models/idefics/processing_idefics.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address review * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/image_processing_utils.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * [`Idefics`] add image_embeddings option in 
generate-related methods (#25442) * add image_embeddings option in generate-related methods * style * rename image_embeddings and allow perceiver embeddings precomputation * compute embeddings within generate * make is_encoder_decoder= True the default in config * nested if else fix * better triple check * switch if elif order for pixel values / img embeds * update model_kwargs perceiver only at the end * use _prepare_model_inputs instead of encoder_decoder logic * fix comment typo * fix config default for is_encoder_decoder * style * add typehints * precompute in forward * doc builder * style * pop instead of get image hidden states * Trigger CI * Update src/transformers/models/idefics/modeling_idefics.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/idefics/modeling_idefics.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * + indentation + style * simplify a bit the use_resampler logic using comments * update diocstrings * Trigger CI --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix rebase changes * unbreak #25237 - to be fixed in follow up PRs * is_composition = False * no longer needed --------- Co-authored-by: leot13 <leo.tronchon@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Victor SANH <victorsanh@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-08-18 14:12:28 -07:00
Omar Sanseviero	6f4424bb08	Make TTS automodels importable (#25595 ) * Add auto model for spectrogram/waveform * Add doc and install * Add dummy objects * Did I miss anything?	2023-08-18 22:01:35 +02:00
Younes Belkada	faed2ca46f	[`PEFT`] Peft integration alternative design (#25077 ) * a draft version * v2 integration * fix * make it more generic and works for IA3 * add set adapter and multiple adapters support * fixup * adapt a bit * oops * oops * oops * adapt more * fix * add more refactor * now works with model class * change it to instance method as it causes issues with `jit`. * add CR * change method name * add `add_adapter` method * clean up * Update src/transformers/adapters/peft_mixin.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * add moe utils * fixup * Update src/transformers/adapters/peft_mixin.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * adapt * oops * fixup * add is_peft_available * remove `requires_backend` * trainer compatibility * fixup + docstring * more details * trigger CI * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_utils.py * fixup + is_main_process * added `save_peft_format` in save_pretrained * up * fix nits here and there * nits here and there. * docs * revert `encoding="utf-8"` * comment * added slow tests before the PEFT release. * fixup and nits * let's be on the safe zone * added more comments * v1 docs * add remaining docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * move to `lib_integrations` * fixup * this time fixup * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address final comments * refactor to use `token` * add PEFT to DockerFile for slow tests. * added pipeline support. --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-08-18 19:08:03 +02:00
Younes Belkada	940d1a76b0	[`Docs` / `BetterTransformer` ] Added more details about flash attention + SDPA (#25265 ) * added more details about flash attention * correct and add more details * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * few modifs * more details * up * Apply suggestions from code review Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> * adapt from suggestion * Apply suggestions from code review Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> * trigger CI * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix nits and copies * add new section --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>	2023-08-18 10:32:28 +02:00
Kihoon Son	08e32519f8	Suggestions on Pipeline_webserver (#25570 ) * Suggestions on Pipeline_webserver docs: reorder the warning tip for pseudo-code Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ko/pipeline_webserver.md Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-08-18 10:17:44 +02:00
Amélie T. Reymond	659ab0423e	Fix typo in example code (#25583 ) `lang_code_to_id("en_XX")` => `lang_code_to_id["en_XX"]` lang_code_to_id is a dict	2023-08-18 07:58:59 +02:00
Yoach Lacombe	b8f69d0d10	Add Text-To-Speech pipeline (#24952 ) * add AutoModelForTextToSpeech class * add TTS pipeline and tessting * add docstrings to text_to_speech pipeline * fix torch dependency * corrector 'processor is None' case in Pipeline * correct repo id * modify text-to-speech -> text-to-audio * remove processor * rename text_to_speech pipelines files to text_audio * add textToWaveform and textToSpectrogram instead of textToAudio classes * update TTS pipeline to the bare minimum * update tests TTS pipeline * make style and erase useless import torch in TTS pipeline tests * modify how to check if generate or forward in TTS pipeline * remove unnecessary extra new lines * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * refactor input_texts -> text_inputs * correct docstrings of TTS.__call__ * correct the shape of generated waveform * take care of Bark tokenizer special case * correct run_pipeline_test TTS * make style * update TTS docstrings * address Sylvain nit refactors * make style * refactor into one liners * correct squeeze * correct way to test if forward or generate * Update output audio waveform shape * make style * correct import * modify how the TTS pipeline test if a model can generate * align shape output of TTS pipeline with consistent shape --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-08-17 17:34:47 +01:00
Alex McKinney	1791ef8df6	Adds `TRANSFORMERS_TEST_DEVICE` (#25506 ) * Adds `TRANSFORMERS_TEST_DEVICE` Mirrors the same API in the diffusers library. Useful in transformers too. * replace backend checking with trying `torch.device` * Adds better error message for unknown test devices * `make style` * adds documentation showing `TRANSFORMERS_TEST_DEVICE` usage.	2023-08-17 13:41:34 +02:00
Younes Belkada	e7e9261a20	[`Docs`] Fix un-rendered images (#25561 ) fix un-rendered images	2023-08-17 12:08:11 +02:00
lishukan	c385de2441	[TYPO] fix typo/format in quicktour.md (#25519 ) * fix_all_language_quicktour * give up ! before bash command --------- Co-authored-by: lishukan <lishukan@dxy.cn>	2023-08-16 08:03:23 +02:00
Marc Sun	06a1d75bd5	fix gptq nits (#25500 ) * fix nits * fix docstring * fix doc * fix damp_percent * fix doc	2023-08-14 11:43:38 -04:00
Erfan Zekri Esfahani	892f9ea0db	import required torch and numpy libraries (#25483 )	2023-08-13 19:26:40 +02:00
Marc Sun	55db70c63d	GPTQ integration (#25062 ) * GTPQ integration * Add tests for gptq * support for more quantization model * fix style * typo * fix method * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add dataclass and fix quantization_method * fix doc * Update tests/quantization/gptq/test_gptq.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * modify dataclass * add gtpqconfig import * fix typo * fix tests * remove dataset as req arg * remove tokenizer import * add offload cpu quantization test * fix check dataset * modify dockerfile * protect trainer * style * test for config * add more log * overwrite torch_dtype * draft doc * modify quantization_config docstring * fix class name in docstring * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * more warning * fix 8bit kwargs tests * peft compatibility * remove var * fix is_gptq_quantized * remove is_gptq_quantized * fix wrap * Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add exllama * skip test * overwrite float16 * style * fix skip test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix docsting formatting * add doc * better test --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-08-10 16:06:29 -04:00
Merve Noyan	e7b001db4f	Fix rendering for `torch.compile()` docs (#25432 ) fix rendering	2023-08-10 13:25:00 +02:00
Maria Khalusova	f2a43c7383	VQA task guide (#25244 ) * initial commit * semi-finished task guide draft * image link * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/visual_question_answering.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * feedback addressed * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits addressed --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-09 08:29:06 -04:00
Joao Gante	f456b4d10b	Generate: generation config validation fixes in docs (#25405 )	2023-08-09 13:07:11 +01:00
Joao Gante	d59b872c9e	Docs: introduction to generation with LLMs (#25240 ) Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-08-09 11:09:20 +01:00
Merve Noyan	5ee9693a1c	Docs: Added benchmarks for `torch.compile()` for vision models (#24748 ) * added benchmarks for compile * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * added more models * added more models fr * added visualizations * minor fix * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added links to models and put charts side by side * Added batch comparisons * Added more comparisons * Fix table * Added link to wheel * Update 
perf_torch_compile.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-07 17:18:43 +01:00
Sylvain Gugger	f0fd73a2de	Document check copies (#25291 ) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes	2023-08-04 14:56:29 +02:00
Victor Geislinger	641adca558	Fix typo: Roberta -> RoBERTa (#25302 )	2023-08-03 14:17:30 -07:00
Howard Huang	33da2db5ea	[small] llama2.md typo (#25295 ) `groupe` -> `grouped`	2023-08-03 14:17:06 -07:00
Yoach Lacombe	6d3f9c1e2e	add generate method to SpeechT5ForTextToSpeech (#25233 ) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-08-03 14:12:07 +01:00
Yoach Lacombe	8455346c5c	Update bark doc (#25234 ) * add mention to optimization in Bark docs * add offload mention in docs * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update bark docs. * Update bark.md --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-08-03 14:08:39 +01:00
Joao Gante	a8817371c9	Docs: separate generate section (#25235 ) Separate generate doc section	2023-08-03 13:51:56 +01:00
Kevin Lloyd Bernal	ad8321512d	recommend DeepSpeed's Argument Parsing documentation (#25268 )	2023-08-02 11:48:39 -04:00
Younes Belkada	972fdcc778	[`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216 ) * clearer explanation on how things works under the hood. * Update docs/source/en/main_classes/quantization.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `load_in_4bit` in `from_pretrained` --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-01 10:56:52 +02:00
Stas Bekman	5220606607	[quantization.md] fix (#25190 ) Update quantization.md	2023-07-31 09:37:29 -07:00
Sanchit Gandhi	e93103632b	Add bloom flax (#25094 ) * First commit * step 1 working * add alibi * placeholder for `scan` * add matrix mult alibi * beta scaling factor for bmm * working v1 - simple forward pass * move layer_number from attribute to arg in call * partial functioning scan * hacky working scan * add more modifs * add test * update scan for new kwarg order * fix position_ids problem * fix bug in attention layer * small fix - do the alibi broadcasting only once * prelim refactor * finish refactor * alibi shifting * incorporate dropout_add to attention module * make style * make padding work again * update * remove bogus file * up * get generation to work * clean code a bit * added small tests * adding albii test * make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work * fix few nits * fix nit onnx * fix onnx nit * add missing dtype args to nn.Modules * remove debugging statements * fix scan generate * Update modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * fix small test issue + make style * clean up * Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fix function name * small fix test * forward contrib credits from PR17761 * Fix failing test * fix small typo documentation * fix non passing test - remove device from build alibi * refactor call - refactor `FlaxBloomBlockCollection` module * make style * upcast to fp32 * cleaner way to upcast * remove unused args * remove layer number * fix scan test * make style * fix i4 casting * fix slow test * Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove `layer_past` * refactor a bit * fix `scan` slow test * remove useless import * major changes - remove unused code - refactor a 
bit - revert import `torch` * major refactoring - change build alibi * remove scan * fix tests * make style * clean-up alibi * add integration tests * up * fix batch norm conversion * style * style * update pt-fx cross tests * update copyright * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * per-weight check * style * line formats --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-27 18:24:56 +01:00
Yih-Dar	da5ff18a4a	Fix doctest (#25031 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-25 22:10:06 +02:00
Sebastian Husch Lee	8f36ab3e22	[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726 ) * Initial addition of t5forsequenceclassification * Adding imports and adding tests * Formatting * Running make fix-copies * Adding mt5forseq * Formatting * run make fix-copies * Adding to docs * Add model_parallel * Fix bug * Fix * Remove TODO * Fixing tests for T5ForSequenceClassification * Undo changes to dependency_versions_table.py * Change classification head to work with T5Config directly * Change seq length to let tests pass * PR comments for formatting * Formatting * Initial addition of UMT5ForSequenceClassification * Adding to inits and formatting * run make fix-copies * Add doc for UMT5ForSeqClass * Update UMT5 config * Fix docs * Skip torch fx test for SequenceClassification * Formatting * Add skip to UMT5 tests as well * Fix umt5 tests * Running make fix-copies * PR comments * Fix for change to sentence_representation * Rename seq_len to hidden_size since that's what it is * Use base_model to follow format of the rest of the library * Update docs * Extract the decoder_input_ids changes and make one liner * Make one-liner	2023-07-25 21:02:49 +02:00
Arthur	dcb183f4bd	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629 ) * draft add new model like * some cleaning of the config * nits * add nested configs * nits * update * update * added layer norms + triton kernels * consider only LPLayerNorm for now. * update * all keys match. * Update * fixing nits here and there * working forward pass. * removed einops dependency * nits * format * add alibi * byebye head mask * refactor attention * nits. * format * fix nits. * nuke ande updates * nuke tokenizer test * don't reshape query with kv heads * added a bit of documentation. * remove unneeded things * nuke more stuff * nit * logits match - same generations * rm unneeded methods * 1 remaining failing CI test * nit * fix nits * fix docs * fix docs * rm tokenizer * fixup * fixup * fixup and fix tests * fixed configuration object. * use correct activation * few minor fixes * clarify docs a bit * logits match à 1e-12 * skip and unskip a test * added some slow tests. * fix readme * add more details * Update docs/source/en/model_doc/mpt.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix configuration issues * more fixes in config * added more models * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove unneeded position ids * fix some comments * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * revert suggestion * mpt alibi + added batched generation * Update src/transformers/models/mpt/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove init config * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix nit * add another slow test * Apply suggestions from code review Co-authored-by: Sylvain 
Gugger <35901082+sgugger@users.noreply.github.com> * fits in one line * some refactor because make fixup doesn't pass * add ft notebook * update md * correct doc path --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-25 14:32:40 +02:00
Xuehai Pan	6bc61aa7af	Set `TF32` flag for PyTorch cuDNN backend (#25075 )	2023-07-25 08:04:48 -04:00
Injin Paek	5dba88b2d2	fix: add TOC anchor link (#25066 )	2023-07-25 08:02:33 -04:00
Sangam Lee	ee1eb3b325	🌐 [i18n-KO] Translated `perf_hardware.md` to Korean (#24966 ) * docs: ko: perf_hardware.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * Fix: manual edits * fix: manual edits * fix: manual edits * fix: manual edits * fix: fix rendering error of perf_hardware.md --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Haewon Kim <ehdvkf02@naver.com>	2023-07-25 07:44:24 -04:00
Arthur	c53a6eae74	[`RWKV`] Add note in doc on `RwkvStoppingCriteria` (#25055 ) * Add note in doc on `RwkvStoppingCriteria` * give some breathing space to the code	2023-07-25 10:15:00 +02:00
Rinat	a03d13c83d	Pvt model (#24720 ) * pull and push updates * add docs * fix modeling * Add and run test * make copies * add task * fix tests and fix small issues * Checks on a Pull Request * fix docs * add desc pvt.md	2023-07-24 15:34:19 +01:00
Maria Khalusova	75317aefb3	[docs] Performance docs tidy up, part 1 (#23963 ) * first pass at the single gpu doc * overview: improved clarity and navigation * WIP * updated intro and deepspeed sections * improved torch.compile section * more improvements * minor improvements * make style * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * feedback addressed * mdx -> md * link fix * feedback addressed --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-07-24 08:57:24 -04:00
Sylvain Gugger	640e1b6c6f	Remove tokenizers from the doc table (#24963 )	2023-07-21 09:41:36 -04:00
Sourab Mangrulkar	f4eb459ef2	fsdp fixes and enhancements (#24980 ) * fix fsdp prepare to remove the warnings and fix excess memory usage * Update training_args.py * parity for FSDP+XLA * Update trainer.py	2023-07-21 17:52:48 +05:30
Wonhyeong Seo	ec3dfe5e24	🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664 ) * fix: english/korean quicktour.md * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com> * fix: follow glossary * 파인튜닝 -> 미세조정 --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>	2023-07-21 08:19:28 -04:00
Tom Aarsen	79444f370f	Deprecate unused OpenLlama architecture (#24922 ) * Resolve typo in check_repo.py * Specify encoding when opening modeling files * Deprecate the OpenLlama architecture * Add disclaimer pointing to Llama I'm open to different wordings here * Match the capitalisation of LLaMA	2023-07-20 07:03:24 -04:00
Travis Cline	3a43794dd6	Fix minor llama2.md model doc typos (#24909 ) Update llama2.md Fix typos in the llama2 model doc	2023-07-19 08:13:14 -04:00
Eliah Kagan	c035970212	Update tested versions in READMEs (#24895 ) * Update supported Python and PyTorch versions in readme * Update Python, etc. versions in non-English readmes These were more out of date than in the English readme. This updates all the versions the readmes claim the repository is tested with to the same versions stated in the English readme. Those versions are current at least in the case of the Python and PyTorch versions (and less out of date for the others). * Propagate trailing whitespace fix to model list This runs "make fix-copies". The only change is the removal of whitespace. No actual information or wording is changed. * Update tested TensorFlow to 2.6 in all readmes Per pinning in setup.py Unlike Python and PyTorch, the minimum supported TensorFlow version has not very recently changed, but old versions were listed in all READMEs.	2023-07-19 07:17:34 -04:00
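
The `TRANSFORMERS_TEST_BACKEND` commit (5eeaef921f) in the log above describes importing an extra module after `import torch` so that a custom backend can register itself. The following is a minimal sketch of that idea, not the code in `src/transformers/testing_utils.py`. Only the environment variable name comes from the commit message; the helper name `import_test_backend` is hypothetical.

```python
import importlib
import os


def import_test_backend(env_var="TRANSFORMERS_TEST_BACKEND"):
    """Import the module named in `env_var`, if any, and return it.

    Hypothetical helper (not the library's API): importing the module is
    what gives a custom backend the chance to register itself as a side
    effect of the import.
    """
    name = os.environ.get(env_var)
    if not name:
        return None  # nothing requested, nothing imported
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        # Re-raise with a message that points at the environment variable.
        raise ModuleNotFoundError(
            f"Could not import the backend {name!r} named in {env_var}."
        ) from exc
```

In a real test setup, a call like this would presumably sit right after the first `import torch`, as the commit message describes.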

700 Commits
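
Commit 659ab0423e in the log above fixes an example that called `lang_code_to_id` like a function even though it is a dict. The plain-Python sketch below shows the difference. The mapping is a stand-in with a made-up id, not a tokenizer's real table, and both helper names are hypothetical.

```python
# Stand-in for a tokenizer's language-code table; the id value is made up.
lang_code_to_id = {"en_XX": 0}


def lookup_by_call(table, code):
    """Reproduce the typo: calling a dict raises TypeError."""
    try:
        return table(code)
    except TypeError as exc:
        return f"TypeError: {exc}"


def lookup_by_index(table, code):
    """The corrected form: subscript the dict with the language code."""
    return table[code]


print(lookup_by_call(lang_code_to_id, "en_XX"))
print(lookup_by_index(lang_code_to_id, "en_XX"))
```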