transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
Susnato Dhar	b5db8ca66f	Add flash attention for `gpt_bigcode` (#26479 ) * added flash attention of gpt_bigcode * changed docs * Update src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py * add FA-2 docs * oops * Update docs/source/en/perf_infer_gpu_one.md Last Nit Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * oops * remove padding_mask * change getattr->hasattr logic * changed .md file --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-31 11:21:02 +00:00
Yih-Dar	9dc4ce9ea7	Disable CI runner check (#27170 ) Disable runner check Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-31 11:59:21 +01:00
Seungwoo, Jeong	14bb196cc8	[doctring] Fix docstring for BlipTextConfig, BlipVisionConfig (#27173 ) Update configuration_blip.py edit docstrings	2023-10-31 10:41:56 +00:00
Akshar Goyal	9234caefb0	[docstring] Fix docstring for AltCLIPTextConfig, AltCLIPVisionConfig and AltCLIPConfig (#27128 ) * [docstring] Fix docstring for AltCLIPVisionConfig, AltCLIPTextConfig + cleaned some docstring * Removed entries from check_docstring.py * Removed entries from check_docstring.py * Removed entry from check_docstring.py * [docstring] Fix docstring for AltCLIPTextConfig, AltCLIPVisionConfig and AltCLIPConfig	2023-10-31 10:20:14 +00:00
Clifford Ressel	b5c8e23f0f	Remove broken links to s-JoL/Open-Llama (#27164 )	2023-10-31 10:17:54 +00:00
Hz, Ji	df6f36a171	deprecate function `get_default_device` in `tools/base.py` (#26774 ) * get default device through `PartialState().default_device` as is has been officially released * apply code review suggestion * apply code review suggestion Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2023-10-31 09:15:39 +00:00
NielsRogge	8211c59b9a	[KOSMOS-2] Update docs (#27157 ) Update docs	2023-10-30 21:42:19 +01:00
NielsRogge	d39352d12c	Fix import of torch.utils.checkpoint (#27155 ) * Fix import * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-10-30 20:08:29 +00:00
MD FAIZAN KHAN	e971486d89	Fix: typos in README.md (#27154 )	2023-10-30 19:12:09 +00:00
Younes Belkada	f7ea959b96	[`core`/ `GC` / `tests`] Stronger GC tests (#27124 ) * stronger GC tests * better tests and skip failing tests * break down into 3 sub-tests * break down into 3 sub-tests * refactor a bit * more refactor * fix * last nit * credits contrib and suggestions * credits contrib and suggestions --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-30 19:53:46 +01:00
Hz, Ji	5bbf671276	Device agnostic trainer testing (#27131 )	2023-10-30 18:16:40 +00:00
Rockerz	84724efd10	Translating `en/main_classes` folder docs to Japanese 🇯🇵 (#26894 ) * add * add * add * Add deepspeed.md * Add * add * Update docs/source/ja/main_classes/callback.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/output.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/pipelines.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/processors.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/processors.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/text_generation.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/processors.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update logging.md * Update toctree.yml * Update docs/source/ja/main_classes/deepspeed.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Add suggesitons * m * Update docs/source/ja/main_classes/trainer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update toctree.yml * Update Quantization.md * Update docs/source/ja/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update toctree.yml * Update docs/source/en/main_classes/deepspeed.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/main_classes/deepspeed.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-30 09:39:14 -07:00
Yeyang	9093b19b13	🌐 [i18n-ZH] Translate serialization.md into Chinese (#27076 ) * docs(zh): translate serialization.md * docs(zh): add space around links	2023-10-30 08:50:29 -07:00
Yih-Dar	3224c0c13f	Remove some Kosmos-2 `copied from` (#27149 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-30 16:07:27 +01:00
Hz, Ji	cd19b19378	make tests of pytorch_example device agnostic (#27081 )	2023-10-30 14:56:41 +00:00
Younes Belkada	6b466771b0	[`tests` / `Quantization`] Fix bnb test (#27145 ) * fix bnb test * link to GH issue	2023-10-30 15:43:08 +01:00
Yih-Dar	576994963f	Fix some tests using `"common_voice"` (#27147 ) * Use mozilla-foundation/common_voice_11_0 * Update expected values * Update expected values * For test_word_time_stamp_integration --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-30 15:27:15 +01:00
Yih-Dar	691fd8fdde	Add `Kosmos-2` model (#24709 ) * Add KOSMOS-2 model * update * update * update * address review comment - 001 * address review comment - 002 * address review comment - 003 * style * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * address review comment - 004 * address review comment - 005 * address review comment - 006 * address review comment - 007 * address review comment - 008 * address review comment - 009 * address review comment - 010 * address review comment - 011 * update readme * fix * fix * fix * [skip ci] fix * revert the change in _decode * fix docstring * fix docstring * Update docs/source/en/model_doc/kosmos-2.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * no more Kosmos2Tokenizer * style * remove "returned when being computed by the model" * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * UTM5 Atten * fix attn mask * use present_key_value_states instead of next_decoder_cache * style * conversion scripts * conversion scripts * conversion scripts * Add _reorder_cache * fix doctest and copies * rename 1 * rename 2 * rename 3 * make fixup * fix table * fix docstring * rename 4 * change repo_id * remove tip * update md file * make style * update md file * put docs/source/en/model_doc/kosmos-2.md to slow * update conversion script * Use CLIPImageProcessor in Kosmos2Processor * Remove Kosmos2ImageProcessor * Remove to_dict in Kosmos2Config * Remove files * fix import * Update conversion * normalized=False * Not using hardcoded values like <image> * elt --> element * Apply suggestion * Not using hardcoded values like </image> * No assert * No nested functions * Fix md file * copy * update doc * fix docstring * fix name * Remove _add_remove_spaces_around_tag_tokens * Remove dummy docstring of _preprocess_single_example * Use `BatchEncoding` * temp * temp * temp * Update * Update * Make Kosmos2ProcessorTest a bit pretty * Update gradient checkpointing * Fix gradient checkpointing test * Remove one liner remove_special_fields * Simplify conversion script * fix add_eos_token * update readme * update tests * Change to microsoft/kosmos-2-patch14-224 * style * Fix doc --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-30 13:32:17 +01:00
Hz, Ji	d751dbecb2	remove the obsolete code related to fairscale FSDP (#26651 ) * remove the obsolete code related to fairscale FSDP * apple review suggestion	2023-10-30 11:55:03 +00:00
Younes Belkada	5fbed2d7ca	[`Trainer` / `GC`] Add `gradient_checkpointing_kwargs` in trainer and training arguments (#27068 ) * add `gradient_checkpointing_kwargs` in trainer and training arguments * add comment * add test - currently failing * now tests pass	2023-10-30 12:41:48 +01:00
Thien Tran	e830495c1c	Fix data2vec-audio note about attention mask (#27116 ) fix data2vec audio note about attention mask	2023-10-30 10:52:24 +00:00
Younes Belkada	160432110c	[`FA2`/ `Mistral`] Revert previous behavior with right padding + forward (#27125 ) Update modeling_mistral.py	2023-10-30 11:04:50 +01:00
Yih-Dar	211ad4c9cc	Fix slack report failing for doctest (#27042 ) * fix slack report for doctest * separate reports * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-30 10:48:24 +01:00
Gema Parreño	722e936491	[Typo fix] flag config in WANDB (#27130 ) typo fix flag config	2023-10-29 18:22:26 +00:00
Daniil	9e87618f2b	Fix docstring and type hint for resize (#27104 ) fix docstring and type hint for resize	2023-10-27 16:50:10 -03:00
jiaqiw09	ef23b68ebf	translate transformers_agents.md to Chinese (#27046 ) * update translation * fix problems mentioned in reviews	2023-10-27 12:45:43 -07:00
Akhil	96f9e78f4c	Added Telugu [te] translation for README.md in main (#27077 ) * Create index.md * Create _toctree.yml * Updated index.md in telugu * Update _toctree.yml * Create quicktour.md * Update quicktour.md * Create index.md * Update quicktour.md * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Delete docs/source/hi/index.md * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/te/quicktour.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update build_documentation.yml Added telugu [te] * Update build_pr_documentation.yml Added Telugu [te] * Update _toctree.yml * Create README_te.md Telugu translation for README.md * Update README_te.md Added Telugu translation for Readme.md * Update README_te.md * Update README_te.md * Update README_te.md * Update README_te.md * Update README.md * Update README_es.md * Update README_es.md * Update README_hd.md * Update README_ja.md * Update README_ko.md * Update README_pt-br.md * Update README_ru.md * Update README_zh-hans.md * Update README_zh-hant.md * Update README_te.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-27 11:40:10 -07:00
Patrick von Platen	ac5893756b	[Attention Mask] Refactor all encoder-decoder attention mask (#27086 ) * [FA2 Bart] Add FA2 to all Bart-like * better * Refactor attention mask * remove all customized atteniton logic * format * mass rename * replace _expand_mask * replace _expand_mask * mass rename * add pt files * mass replace & rename * mass replace & rename * mass replace & rename * mass replace & rename * Update src/transformers/models/idefics/modeling_idefics.py * fix more * clean more * fix more * make style * fix again * finish * finish * finish * finish * finish * finish * finish * finish * finish * finish * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * small fix mistral * finish * finish * finish * finish --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-27 16:42:01 +02:00
Marc Sun	29c74f58ae	fix detr device map (#27089 ) * fix detr device map * add comments	2023-10-27 10:28:12 -04:00
Younes Belkada	ffff9e70ab	[`core`/ `gradient_checkpointing`] Refactor GC - part 2 (#27073 ) * fix * more fixes * fix other models * fix long t5 * use `gradient_checkpointing_func` instead * fix copies * set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * replace it with `is_gradient_checkpointing_set` * remove default * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-27 16:15:22 +02:00
Marc Sun	5be1fb6d1f	Fix no split modules underlying modules (#27090 ) * fix no split * style * remove comm * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * rename modules --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-27 09:49:20 -04:00
Lucain	66b088faf0	Provide alternative when warning on use_auth_token (#27105 )	2023-10-27 14:32:54 +02:00
Isaac Chung	e2bffcfafd	Add early stopping for Bark generation via logits processor (#26675 ) * add early stopping logits processor * black formmated * indent * follow method signature * actual logic * check for None * address comments on docstrings and method signature * add unit test under `LogitsProcessorTest` wip * unit test passing * black formatted * condition per sample * add to BarkModelIntegrationTests * wip BarkSemanticModelTest * rename and add to kwargs handling * not add to BarkSemanticModelTest * correct logic and assert last outputs tokens different in test * doc-builder style * read from kwargs as well * assert len of with less than that of without * ruff * add back seed and test case * add original impl default suggestion * doc-builder * rename and use softmax * switch back to LogitsProcessor and update docs wording * camelCase and spelling and saving compute * assert strictly less than * assert less than * expand test_generate_semantic_early_stop instead	2023-10-27 11:07:33 +01:00
Arthur	90ee9cea19	Revert "add exllamav2 arg" (#27102 ) Revert "add exllamav2 arg (#26437)" This reverts commit `8214d6e7b1`.	2023-10-27 11:23:06 +02:00
Arthur	aa4198a238	[`T5Tokenizer`] Fix fast and extra tokens (#27085 ) * v4.35.dev.0 * nit t5fast match t5 slow	2023-10-27 08:18:24 +02:00
Varshaa Shetty	6f31601687	Added huggingface emoji instead of the markdown format (#27091 ) Added huggingface emoji instead of the markdown format as it was not displaying the required emoji in that format	2023-10-26 14:10:16 -07:00
Zach Mueller	34a640642b	Save TB logs as part of push_to_hub (#27022 ) * Support runs/ * Upload runs folder as part of push to hub * Add a test * Add to test deps * Update with proposed solution from Slack * Ensure that repo gets deleted in tests	2023-10-26 12:13:19 -04:00
L. Yeung	1892592530	Correct docstrings and a typo in comments (#27047 ) * docs(training_args): correct docstrings Correct docstrings of these methods in `TrainingArguments`: - `set_save` - `set_logging` * docs(training_args): adjust words in docstrings Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * docs(trainer): correct a typo in comments --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-26 08:46:17 -07:00
Marc Sun	8214d6e7b1	add exllamav2 arg (#26437 ) * add_ xllamav2 arg * add test * style * add check * add doc * replace by use_exllama_v2 * fix tests * fix doc * style * better condition * fix logic * add deprecate msg	2023-10-26 10:15:05 -04:00
Patrick von Platen	d7cb5e138e	[Llama FA2] Re-add _expand_attention_mask and clean a couple things (#27074 ) * clean * clean llama * fix more * make style * Apply suggestions from code review * Apply suggestions from code review * Update src/transformers/models/llama/modeling_llama.py * Update src/transformers/models/llama/modeling_llama.py * Apply suggestions from code review * finish * make style	2023-10-26 13:06:21 +02:00
Arthur	4864d08d3e	Add-support for commit description (#26704 ) * fix * update * revert * add dosctring * good to go * update * add a test	2023-10-26 12:37:09 +02:00
Arthur	15cd096288	Create SECURITY.md	2023-10-26 12:26:47 +02:00
Younes Belkada	fe2877ce21	Remove unneeded prints in modeling_gpt_neox.py (#27080 )	2023-10-26 11:55:31 +02:00
Younes Belkada	efba1a1744	Bump`flash_attn` version to `2.1` (#27079 ) * pin FA-2 to `2.1` * fix on modeling	2023-10-26 11:21:04 +02:00
Zach Mueller	90412401e6	Bring back `set_epoch` for Accelerate-based dataloaders (#26850 ) * Working tests! * Fix sampler * Fix * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix check * Clean --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-26 11:20:11 +02:00
dependabot[bot]	3c2692407d	Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/lxmert (#26888 ) Bump urllib3 in /examples/research_projects/lxmert Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-10-26 09:10:29 +02:00
dependabot[bot]	9c5240af14	Bump werkzeug from 2.2.3 to 3.0.1 in /examples/research_projects/decision_transformer (#27072 ) Bump werkzeug in /examples/research_projects/decision_transformer Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.2.3 to 3.0.1. - [Release notes](https://github.com/pallets/werkzeug/releases) - [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/werkzeug/compare/2.2.3...3.0.1) --- updated-dependencies: - dependency-name: werkzeug dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-10-26 08:56:28 +02:00
corey hu	df2eebf1e7	Handle unsharded Llama2 model types in conversion script (#27069 ) Handle all unshared models types	2023-10-26 08:41:07 +02:00
Aarya Balwadkar	a2f55a65cd	Hindi translation of pipeline_tutorial.md (#26837 ) * hindi translation of pipeline_tutorial.md * Update pipeline_tutorial.md * Update build_documentation.yml * Update build_pr_documentation.yml * Updated build_documentation.yml --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-25 11:21:49 -07:00
Yeyang	ba5144f7a9	🌐 [i18n-ZH] Translate custom_models.md into Chinese (#27065 ) * docs(zh): translate custom_models.md * minor fix in customer_models Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-10-25 11:20:32 -07:00

14393 Commits