* add FA-2 support for mistral
* fixup
* add sliding windows
* fix a few nits
* v1 slicing cache - logits do not match
* add comment
* fix bugs
* more mem efficient
* add warning once
* add warning once
* oops
* fixup
* more comments
* copy
* add safety checker
* fixup
* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copied from
* up
* raise an error when padding side is right (see the sketch after this PR's commits)
* fixup
* add doc + few minor changes
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
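A minimal sketch of the padding-side guard mentioned in "raise an error when padding side is right" above. The helper name and exact error text are illustrative, not the actual Mistral Flash Attention 2 code; the point is that FA-2 batched generation assumes left padding, so right-padded attention masks should be rejected early.

```python
import torch

def check_left_padding(attention_mask: torch.Tensor) -> None:
    """Hypothetical guard: reject right-padded batches before the FA-2 forward pass."""
    if attention_mask is not None and attention_mask.dim() == 2:
        # With 1 = real token and 0 = padding, a row ending in 0 is right-padded.
        if (attention_mask[:, -1] == 0).any():
            raise ValueError(
                "Batched input appears to be right-padded, which is not supported with "
                "Flash Attention 2. Set `tokenizer.padding_side = 'left'` before tokenizing."
            )

# Example: the second sequence is right-padded, so this raises.
mask = torch.tensor([[1, 1, 1, 1], [1, 1, 0, 0]])
check_left_padding(mask)
```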
* fix stripping
* nits
* fix another test
* styling
* fix?
* update
* revert bad merge
* found the bug
* YES SIR
* is that change really required?
* make fast even faster
* re order functions
* add tokenizer kwarg inputs
* Add tokenizer_kwargs to _sanitize_parameters (see the usage sketch after this PR's commits)
* Add truncation=True example to tests
* Update test_pipelines_fill_mask.py
* Update test_pipelines_fill_mask.py
* make fix-copies and make style
* Update fill_mask.py
Replace single tick with double
* make fix-copies
* Style
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
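A hedged usage sketch of the fill-mask change above: `tokenizer_kwargs` passed at call time are forwarded to the tokenizer (as suggested by "Add tokenizer_kwargs to _sanitize_parameters" and the truncation test), so over-long inputs can be truncated instead of erroring. The checkpoint is just a convenient public one.

```python
from transformers import pipeline

fill_masker = pipeline(task="fill-mask", model="distilbert-base-uncased")

# A prompt that exceeds the model's maximum length; truncation keeps the leading
# part, so the [MASK] token survives.
long_text = "Paris is the [MASK] of France. " + "This sentence only adds length. " * 400
outputs = fill_masker(long_text, tokenizer_kwargs={"truncation": True})
print(outputs[0]["token_str"], outputs[0]["score"])
```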
* start working on next chapter
* finish testing
* Update docs/source/de/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/testing.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix wav2vec2
* nit
* stash
* one more file to update
* fix byt5
* vocab size is 256, don't change that!
* use other revision
* test persimmon in a smaller size
* style
* tests
* nits
* update add tokens from pretrained
* test tokenization
* nits
* potential fnet fix?
* more nits
* nits
* correct test
* assert close
* update
* ouch
* fix it
* some more nits
* FINALLY
* use `adept` checkpoints
* more adept checkpoints
* that was involved!
* fix issue where canine forward requires input_ids anyway
The `forward` requires `input_ids` for deriving other variables in all cases. Change this to use whichever of `input_ids` and `inputs_embeds` is given.
* fix canine forward
The current `forward` requires (the shape of) `input_ids` to derive other variables regardless of whether `input_ids` or `inputs_embeds` is provided. Change this to use whichever one is given instead of always relying on `input_ids` (see the sketch after this PR's commits).
* fix format
* fix format
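A minimal sketch (not the actual CANINE code) of the pattern the commits above describe: derive the sequence shape and device from whichever of `input_ids` / `inputs_embeds` was actually passed, rather than always reading them from `input_ids`.

```python
import torch

def get_shape_and_device(input_ids=None, inputs_embeds=None):
    """Illustrative helper: pick the provided input to derive (batch, seq_len) and device."""
    if input_ids is not None and inputs_embeds is not None:
        raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    if input_ids is not None:
        return input_ids.size(), input_ids.device
    if inputs_embeds is not None:
        # Embeddings carry an extra hidden dimension; drop it to get (batch, seq_len).
        return inputs_embeds.size()[:-1], inputs_embeds.device
    raise ValueError("You have to specify either input_ids or inputs_embeds")
```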
* Fix num_heads in _upad_input
The variable num_key_value_heads was incorrectly named num_heads, which led to the query_layer being reshaped with the wrong attention head count. (It would have been enough to use the correct variable self.num_heads instead of num_heads, but num_heads was renamed to num_key_value_heads for clarity; see the sketch after this PR's commits.)
* fixed copies using make fix-copies and ran make fixup
---------
Co-authored-by: fseiler <f.seiler@jerocom.de>
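An illustrative sketch of the head-count fix described above, assuming a grouped-query attention layout where key/value tensors carry `num_key_value_heads` heads and the query tensor carries `num_heads`; the function and argument names are made up for the example, not the library's `_upad_input` signature.

```python
import torch

def reshape_qkv_for_unpadding(query_layer, key_layer, value_layer,
                              num_heads, num_key_value_heads, head_dim):
    batch_size, kv_seq_len = key_layer.shape[0], key_layer.shape[1]
    q_len = query_layer.shape[1]
    # Key/value tensors use the (possibly smaller) key/value head count...
    key_layer = key_layer.reshape(batch_size * kv_seq_len, num_key_value_heads, head_dim)
    value_layer = value_layer.reshape(batch_size * kv_seq_len, num_key_value_heads, head_dim)
    # ...while the query must use the full head count. Reshaping it with
    # num_key_value_heads (the old, mis-named num_heads) breaks as soon as the two differ.
    query_layer = query_layer.reshape(batch_size * q_len, num_heads, head_dim)
    return query_layer, key_layer, value_layer
```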
* from seq2seq speech
* [Flax] Example script for speech seq2seq
* tests and fixes
* make style
* fix: label padding tokens
* fix: label padding tokens over list
* update ln names for Whisper
* try datasets iter loader
* create readme and append results
* style
* make style
* adjust lr
* use pt dataloader
* make fast
* pin gen max len
* finish
* add pt to requirements for test
* fix pt -> torch
* add accelerate
Ignore decoder weights when using T5EncoderModel and LongT5EncoderModel
Neither T5EncoderModel nor LongT5EncoderModel has any decoder layers, so loading a
pretrained checkpoint such as t5-small emits warnings about keys found in the
checkpoint that are not present in the model itself.
To silence this warning, r"decoder" has been added to _keys_to_ignore_on_load_unexpected
for both T5EncoderModel and LongT5EncoderModel (see the sketch below).
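A short sketch of the effect, assuming the attribute is set as described; t5-small is a full encoder-decoder checkpoint, so before this change loading it into the encoder-only class warned about every decoder.* key.

```python
from transformers import T5EncoderModel

# The class-level ignore list now covers the decoder prefix, so the checkpoint's
# decoder weights are silently skipped instead of being reported as unexpected keys.
print(T5EncoderModel._keys_to_ignore_on_load_unexpected)  # expected to include r"decoder"

model = T5EncoderModel.from_pretrained("t5-small")  # no "unexpected keys" warning for decoder.*
```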
* make use of adapter_revision (see the usage sketch after this PR's commits)
* v1 adapter kwargs
* fix CI
* fix CI
* fix CI
* fixup
* add BC
* Update src/transformers/integrations/peft.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* change it to error
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* fixup
* change
* Update src/transformers/integrations/peft.py
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
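A hedged usage sketch of the adapter-kwargs change above. The keyword name `revision` on `load_adapter` is assumed from the commit titles, and the adapter repo below is a placeholder, not a real checkpoint.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Load a PEFT adapter from a specific branch/revision of its Hub repo
# ("my-user/opt-350m-lora" and "v2" are placeholders).
model.load_adapter("my-user/opt-350m-lora", revision="v2")
```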
* [VITS] Fix speaker_embed device mismatch
- pass device arg to speaker_id tensor
* [VITS] put speaker_embed on device when int
* [VITS] device=self.device
instead of self.embed_speaker.weight.device
* [VITS] create the tensor directly on device
using torch.full() (see the sketch below)
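A minimal sketch of the device fix: when `speaker_id` arrives as a plain Python int, build the tensor directly on the target device with `torch.full()` instead of creating it on CPU and hitting a device mismatch in the speaker-embedding lookup. The helper name is illustrative.

```python
import torch

def speaker_id_to_tensor(speaker_id: int, device: torch.device) -> torch.Tensor:
    # Create the 1-element id tensor on the right device in one step.
    return torch.full(size=(1,), fill_value=speaker_id, dtype=torch.long, device=device)

speaker_ids = speaker_id_to_tensor(3, device=torch.device("cpu"))
print(speaker_ids)  # tensor([3])
```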
* change mentions of decoder_input_ids to input_ids, and likewise for decoder_input_embeds
* Style
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
* initial
* toctree
* add tf model
* run scripts
* peft
* llm and agents
* Update docs/source/de/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/peft.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/run_scripts.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/run_scripts.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/transformers_agents.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/de/transformers_agents.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix PEFT multi-adapter support (see the usage sketch after this PR's commits)
* refactor a bit
* save pretrained + BC + added tests
* Update src/transformers/integrations/peft.py
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* add more tests
* add suggestion
* final changes
* adapt a bit
* fixup
* Update src/transformers/integrations/peft.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* adapt from suggestions
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
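A hedged usage sketch of the multi-adapter support this PR touches: several PEFT adapters attached to one base model, with `set_adapter()` selecting the active one. The method names are assumed from the transformers PEFT integration, and the adapter repos are placeholders, not real checkpoints.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Attach two adapters under distinct names (placeholder repo ids).
model.load_adapter("my-user/opt-350m-lora-summarization", adapter_name="summarization")
model.load_adapter("my-user/opt-350m-lora-translation", adapter_name="translation")

# Switch which adapter is used at inference time.
model.set_adapter("translation")
print(model.active_adapters())  # ["translation"]
```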