transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 02:28:24 +06:00

Author	SHA1	Message	Date
Joao Gante	c0742b15cb	Generate - add beam indices output in contrained beam search (#25042)	2023-07-25 11:12:29 +01:00
Arthur	c53a6eae74	[`RWKV`] Add note in doc on `RwkvStoppingCriteria` (#25055) * Add note in doc on `RwkvStoppingCriteria` * give some breathing space to the code	2023-07-25 10:15:00 +02:00
Sylvain Gugger	d2295708a6	Better error message when signal is not supported on OS (#25049) * Better error message when signal is not supported on OS * Address review comments	2023-07-24 14:34:16 -04:00
seank021	c0d1c33022	🌐 [i18n-KO] Translated `perf_train_cpu.md` to Korean (#24911) * dos: ko: perf_train_cpu.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions * fix: manual edits Co-authored-by: Haewon Kim <ehdvkf02@naver.com> --------- Co-authored-by: Haewon Kim <ehdvkf02@naver.com>	2023-07-24 17:54:13 +02:00
Younes Belkada	b08f41e62a	[`8bit`] Fix 8bit corner case with Blip2 8bit (#25047) fix 8bit corner case with Blip2 8bit	2023-07-24 16:58:40 +02:00
Nate Brake	3611fc90e0	compute_loss in trainer failing to label shift for PEFT model when label smoothing enabled. (#25044) * added PeftModelForCausalLM to MODEL_FOR_CAUSAL_LM_MAPPING_NAMES dict * check for PEFT model in compute_loss section --------- Co-authored-by: Nathan Brake <nbrake3@mmm.com>	2023-07-24 10:53:10 -04:00
Rinat	a03d13c83d	Pvt model (#24720) * pull and push updates * add docs * fix modeling * Add and run test * make copies * add task * fix tests and fix small issues * Checks on a Pull Request * fix docs * add desc pvt.md	2023-07-24 15:34:19 +01:00
Sylvain Gugger	afe8bfc075	Comment again print statement	2023-07-24 10:12:20 -04:00
Sylvain Gugger	42571f6eb8	Make more test models smaller (#25005) * Make more test models tiny * Make more test models tiny * More models * More models	2023-07-24 10:08:47 -04:00
Sören Brunk	8f1f0bf50f	Fix typo in LlamaTokenizerFast docstring example (#25018)	2023-07-24 09:37:58 -04:00
Zach Mueller	3b734f5042	Add dispatch_batches to training arguments (#25038) * Dispatch batches * Copy items	2023-07-24 09:27:19 -04:00
Sunmin Cho	9d2b983ed0	🌐 [i18n-KO] Translated `testing.md` to Korean (#24900) * docs: ko: testing.md * feat: draft * fix: manual edits * fix: edit ko/_toctree.yml * fix: manual edits * fix: manual edits * fix: manual edits * fix: manual edits * fix: resolve suggestions	2023-07-24 09:24:11 -04:00
Sangam Lee	383be1b763	🌐[i18n-KO] Translated performance.md to Korean (#24883) * dos: ko: performance.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * Update docs/source/ko/performance.md Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com> * Update docs/source/ko/performance.md --------- Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>	2023-07-24 09:23:34 -04:00
Iskren Ivov Chernev	efb2ba666d	Better handling missing SYS in llama conversation tokenizer (#24997) * Better handling missing SYS in llama conversation tokenizer The existing code failed to add SYS if the conversation has history without SYS, but did modify the passed conversation as it did. Rearrange the code so modification to the conversation object are taken into account for token id generation. * Fix formatting with black * Avoid one-liners * Also fix fast tokenizer * Drop List decl	2023-07-24 09:21:10 -04:00
Lucain	6704923107	Support GatedRepoError + use raise from (#25034) * Support GatedRepoError + use raise from * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Use token instead of use_auth_token in error messages --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-24 09:12:39 -04:00
Maria Khalusova	75317aefb3	[docs] Performance docs tidy up, part 1 (#23963) * first pass at the single gpu doc * overview: improved clarity and navigation * WIP * updated intro and deepspeed sections * improved torch.compile section * more improvements * minor improvements * make style * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * feedback addressed * mdx -> md * link fix * feedback addressed --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-07-24 08:57:24 -04:00
Bharat Ramanathan	54ba8608d0	fix(integrations): store serialized `TrainingArgs` to `wandb.config` without sanitization. (#25035) fix: store training args to wandb config without sanitization. Allows resuming runs by reusing the wandb config. Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>	2023-07-24 08:42:39 -04:00
Arthur	0906d21203	[`logging.py`] set default `stderr` path if `None` (#25033) set default logger	2023-07-24 14:31:45 +02:00
Stas Bekman	c9a82be592	[check_config_docstrings.py] improve diagnostics (#25012) * [check_config_docstrings.py] improve diagnostics * style * rephrase * fix	2023-07-23 21:17:26 -07:00
Wonhyeong Seo	b257c46a07	🌐 [i18n-KO] Updated Korean `serialization.md` (#24686) fix: update ko/serialization.md * chatgpt draft	2023-07-21 19:23:59 -04:00
Sylvain Gugger	87fba947a5	Move template doc file to md (#25004)	2023-07-21 16:49:44 -04:00
Ivan Sorokin	ea41e18cfc	improve from_pretrained for zero3 multi gpus mode (#24964) * improve from_pretrained for zero3 multi gpus mode * Add check if torch.distributed.is_initialized * Revert torch.distributed --------- Co-authored-by: Stas Bekman <stas@stason.org>	2023-07-21 15:39:28 -04:00
Arthur	95f96b45ff	[`Llama`] remove persistent `inv_freq` tensor (#24998) remove persistent tensor	2023-07-21 18:11:08 +02:00
Younes Belkada	d3ce048c20	[`bnb`] Add simple check for bnb import (#24995) add simple check for bnb	2023-07-21 17:50:52 +02:00
Yih-Dar	f1a1eb4ae1	Fix `llama` tokenization doctest (#24990) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-21 16:47:51 +02:00
Sylvain Gugger	a7d213189d	Use main_input_name for include_inputs_for_metrics (#24993)	2023-07-21 10:30:17 -04:00
Sylvain Gugger	a6484c89b9	Fix type annotation for deepspeed training arg (#24988)	2023-07-21 09:42:05 -04:00
Sylvain Gugger	5b7ffd5492	Avoid importing all models when instantiating a pipeline (#24960) * Avoid importing all models when instantiating a pipeline * Remove sums that don't work	2023-07-21 09:41:56 -04:00
Sylvain Gugger	640e1b6c6f	Remove tokenizers from the doc table (#24963)	2023-07-21 09:41:36 -04:00
Arthur	0511369a8b	[`LlamaConfig`] Nit: pad token should be None by default (#24958) * pad token should be None by default * fix tests * nits	2023-07-21 14:32:34 +02:00
Joya Chen	f74560d007	Fix missing spaces in system prompt of Llama2 tokenizer (#24930) * Update tokenization_llama.py * Update tokenization_llama_fast.py * Update src/transformers/models/llama/tokenization_llama_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/llama/tokenization_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/llama/tokenization_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/llama/tokenization_llama_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-07-21 08:28:54 -04:00
Sourab Mangrulkar	f4eb459ef2	fsdp fixes and enhancements (#24980) * fix fsdp prepare to remove the warnings and fix excess memory usage * Update training_args.py * parity for FSDP+XLA * Update trainer.py	2023-07-21 17:52:48 +05:30
Wonhyeong Seo	ec3dfe5e24	🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664) * fix: english/korean quicktour.md * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com> * fix: follow glossary * 파인튜닝 -> 미세조정 --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>	2023-07-21 08:19:28 -04:00
Jim Allanson	83f9314d10	fix: cast input pixels to appropriate dtype for image_to_text pipelines (#24947) * fix: cast input pixels to appropriate dtype for image_to_text tasks * fix: add casting to pixel inputs of additional models after running copy checks	2023-07-21 08:16:57 -04:00
Sourab Mangrulkar	1c7e5e2368	fix fsdp checkpointing issues (#24926) * fix fsdp load * Update trainer.py * remove saving duplicate state_dict	2023-07-21 12:17:26 +05:30
Apoorv Khandelwal	9ef5256dfb	Fallback for missing attribute `Parameter.ds_numel` (#24942) * [trainer] fallback for deepspeed param count * [trainer] more readable numel count	2023-07-20 15:19:35 -04:00
Benjamin Badger	caf5e369fc	Contrastive Search peak memory reduction (#24120) Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-07-20 18:46:53 +01:00
Zach Mueller	aa1b09c5d1	Change logic for logging in the examples (#24956) Change logic	2023-07-20 12:30:10 -04:00
Younes Belkada	89a1f34271	[`RWKV`] Add Gradient Checkpointing support for RWKV (#24955) add GC support for RWKV	2023-07-20 18:29:23 +02:00
dependabot[bot]	9f912ef62a	Bump aiohttp from 3.8.1 to 3.8.5 in /examples/research_projects/decision_transformer (#24954) Bump aiohttp in /examples/research_projects/decision_transformer Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.1 to 3.8.5. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.8.5/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.8.1...v3.8.5) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-07-20 12:17:38 -04:00
Shauray Singh	e75cb0cb3c	fix type annotations for arguments in training_args (#24550) * testing * example script * fix typehinting * some tests * make test * optional update * Union of arguments * does this fix the issue * remove reports * set default to False * documentation change * None support * does not need None * Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549) * Fix typing annotations for FSDP and DeepSpeed in TrainingArguments * Change dict to Dict * Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574) Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)" This reverts commit `c5e29d4381`. * Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549) * Fix typing annotations for FSDP and DeepSpeed in TrainingArguments * Change dict to Dict * merge * hacky fix * fixup --------- Co-authored-by: Max Ryabinin <mryabinin0@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-20 10:13:13 -04:00
Shauray Singh	0c41765df4	[DOCS] Example for `LogitsProcessor` class (#24848) * make docs * fixup * resolved * remove debugs * Revert "fixup" This reverts commit `5e0f636aae`. * prev (ignore) * fixup broke some files * remove files * reverting modeling_reformer * lang fix	2023-07-20 10:09:40 -04:00
Yih-Dar	35c04596f8	Fix `main_input_name` in `src/transformers/keras_callbacks.py` (#24916) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-20 15:01:37 +02:00
Premtim Sa	85514c17d1	Update processing_vision_text_dual_encoder.py (#24950) Fixing small typo: kwrags -> kwargs	2023-07-20 08:25:38 -04:00
dependabot[bot]	9859806608	Bump pygments from 2.11.2 to 2.15.0 in /examples/research_projects/decision_transformer (#24949) Bump pygments in /examples/research_projects/decision_transformer Bumps [pygments](https://github.com/pygments/pygments) from 2.11.2 to 2.15.0. - [Release notes](https://github.com/pygments/pygments/releases) - [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES) - [Commits](https://github.com/pygments/pygments/compare/2.11.2...2.15.0) --- updated-dependencies: - dependency-name: pygments dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-07-20 07:43:48 -04:00
Joao Gante	89136ff7f8	Generate: sequence bias can handle same terminations (#24822)	2023-07-20 12:23:17 +01:00
statelesshz	37d8611ac9	replace no_cuda with use_cpu in test_pytorch_examples (#24944) * replace no_cuda with use_cpu in test_pytorch_examples * remove codes that never be used * fix style	2023-07-20 07:09:04 -04:00
Tom Aarsen	79444f370f	Deprecate unused OpenLlama architecture (#24922) * Resolve typo in check_repo.py * Specify encoding when opening modeling files * Deprecate the OpenLlama architecture * Add disclaimer pointing to Llama I'm open to different wordings here * Match the capitalisation of LLaMA	2023-07-20 07:03:24 -04:00
ranchlai	8fd8c8e49e	Add multi-label text classification support to pytorch example (#24770) * Add text classification example * set the problem type and finetuning task * ruff reformated * fix bug for unseting label_to_id for regression * update README.md * fixed finetuning task * update comment * check if label exists in feature before removing * add useful logging	2023-07-20 07:02:44 -04:00
Jungnerd	7381987f90	🌐 [i18n-KO] Translated`tasks/document_question_answering.md` to Korean (#24588) * docs: ko: `document_question_answering.md` * fix: resolve suggestions Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> * fix: resolve suggestions Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> --------- Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>	2023-07-20 06:19:36 -04:00

15053 Commits