transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 05:10:06 +06:00

Author	SHA1	Message	Date
amyeroberts	edb170238f	SiLU activation wrapper for safe importing (#28509 ) Add back in wrapper for safe importing	2024-01-15 19:36:59 +00:00
Timothy Cronin	ff86bc364d	improve dev setup comments and hints (#28495 ) * improve dev setup comments and hints * fix tests for new dev setup hints	2024-01-15 18:36:40 +00:00
Boris Dayma	735968b61c	fix: sampling in flax keeps EOS (#28378 )	2024-01-15 18:12:09 +00:00
Joao Gante	7e0ddf89f4	Generate: consolidate output classes (#28494 )	2024-01-15 17:04:08 +00:00
Matt	72db39c065	Add a use_safetensors arg to TFPreTrainedModel.from_pretrained() (#28511 ) * Add a use_safetensors arg to TFPreTrainedModel.from_pretrained() * One more catch! * One more one more catch	2024-01-15 17:00:54 +00:00
Rishit Ratna	78d767e3c8	Fixed minor typos (#28489 )	2024-01-15 16:45:15 +00:00
Marc Sun	7c8dd88d13	[GPTQ] Fix test (#28018 ) * fix test * reduce length * smaller model	2024-01-15 11:22:54 -05:00
thedamnedrhino	366c03271e	Tokenizer kwargs in textgeneration pipe (#28362 ) * added args to the pipeline * added test * more sensical tests * fixup * docs * typo ; * docs * made changes to support named args * fixed test * docs update * styles * docs * docs	2024-01-15 16:52:18 +01:00
yuanwu2017	a573ac74fd	Add the XPU device check for pipeline mode (#28326 ) * Add the XPU check for pipeline mode When setting xpu device for pipeline, It needs to use is_torch_xpu_available to load ipex and determine whether the device is available. Signed-off-by: yuanwu <yuan.wu@intel.com> * Don't move model to device when hf_device_map isn't None 1. Don't move model to device when hf_device_map is not None 2. The device string maybe includes the device index, so use 'in'instead of equal Signed-off-by: yuanwu <yuan.wu@intel.com> * Raise the error when xpu is not available Signed-off-by: yuanwu <yuan.wu@intel.com> * Update src/transformers/pipelines/base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/pipelines/base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Modify the error message Signed-off-by: yuanwu <yuan.wu@intel.com> * Change message format. Signed-off-by: yuanwu <yuan.wu@intel.com> --------- Signed-off-by: yuanwu <yuan.wu@intel.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-15 15:39:11 +00:00
Younes Belkada	1b9a2e4c80	[`core`/ FEAT] Add the possibility to push custom tags using `PreTrainedModel` itself (#28405 ) * v1 tags * remove unneeded conversion * v2 * rm unneeded warning * add more utility methods * Update src/transformers/utils/hub.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * more enhancements * oops * merge tags * clean up * revert unneeded change * add extensive docs * more docs * more kwargs * add test * oops * fix test * Update src/transformers/modeling_utils.py Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/modeling_utils.py * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more conditions * more logic --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>	2024-01-15 14:48:07 +01:00
Yih-Dar	64bdbd888c	Don't set `finetuned_from` if it is a local path (#28482 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-15 11:38:20 +01:00
Tom Aarsen	881e966ace	[`chore`] Update warning text, a word was missing (#28017 ) Update warning, a word was missing	2024-01-15 10:08:03 +01:00
Francisco Kurucz	121641cab1	Fix paths to AI Sweden Models reference and model loading (#28423 ) Fix URL to Ai Sweden Models reference and model loading	2024-01-15 09:09:22 +01:00
Joao Gante	bc72b4e2cd	Generate: fix candidate device placement (#28493 ) * fix candidate device * this line shouldn't have been in	2024-01-13 21:31:25 +01:00
Apoorv Saxena	e304f9769c	Adding Prompt lookup decoding (#27775 ) * MVP * fix ci * more ci * remove redundant kwarg * added and wired up PromptLookupCandidateGenerator * rebased with main, working * removed print * style fixes * fix test * fixed tests * added test for prompt lookup decoding * fixed circleci * fixed test issue * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py * Update src/transformers/generation/candidate_generator.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-13 17:15:58 +00:00
Siddartha Naidu	29a2b14206	Change progress logging to once across all nodes (#28373 )	2024-01-12 15:01:21 -05:00
Matt	2382706a1c	Fix docstrings and update docstring checker error message (#28460 ) * Fix TF Regnet docstring * Fix TF Regnet docstring * Make a change to the PyTorch Regnet too to make sure the CI is checking it * Add skips for TFRegnet * Update error message for docstring checker	2024-01-12 17:54:11 +00:00
Joao Gante	4fb3d3a0f6	TF: purge `TFTrainer` (#28483 )	2024-01-12 16:56:34 +00:00
Joao Gante	afc45b13ca	Generate: refuse to save bad generation config files (#28477 )	2024-01-12 16:01:17 +00:00
Joao Gante	dc01cf9c5e	Docs: add model paths (#28475 )	2024-01-12 15:25:43 +00:00
Joao Gante	d026498830	Generate: deprecate old public functions (#28478 )	2024-01-12 15:21:15 +00:00
sungho-ham	edb314ae2b	Fix torch.ones usage in xlnet (#28471 ) Fix xlnet torch.ones usage Co-authored-by: sungho-ham <sungho.ham@linecorp.com>	2024-01-12 15:31:00 +01:00
dependabot[bot]	c45ef1c0d1	Bump jinja2 from 2.11.3 to 3.1.3 in /examples/research_projects/decision_transformer (#28457 ) Bump jinja2 in /examples/research_projects/decision_transformer Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.3 to 3.1.3. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/2.11.3...3.1.3) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-12 15:28:55 +01:00
Younes Belkada	266c67b06a	[`Mixtral` / `Awq`] Add mixtral fused modules for Awq (#28240 ) * add mixtral fused modules * add changes from modeling utils * add test * fix test + rope theta issue * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add tests --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-12 14:29:35 +01:00
amyeroberts	666a6f078c	Update metadata loading for oneformer (#28398 ) * Update metadata loading for oneformer * Enable loading from a model repo * Update docstrings * Fix tests * Update tests * Clarify repo_path behaviour	2024-01-12 12:35:31 +00:00
amyeroberts	4e36a6cd00	Mark two logger tests as flaky (#28458 ) * Mark two logger tests as flaky * Add description to is_flaky	2024-01-12 11:58:59 +00:00
Younes Belkada	07bdbebb48	[`Awq`] Add llava fused modules support (#28239 ) * add llava + fused modules * Update src/transformers/models/llava/modeling_llava.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-12 06:55:54 +01:00
Hankyeol Kyung	995a7ce9a8	Fix broken link on page (#28451 ) * [docs] Fix broken link Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com> * [docs] Use shorter domain Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com> --------- Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>	2024-01-11 09:26:13 -08:00
Matt	143451355c	Fix docstring checker issues with PIL enums (#28450 )	2024-01-11 17:23:41 +00:00
jiqing-feng	19e83d174c	Doc (#28431 ) * update version for cpu training * update docs for cpu training * fix readme * fix readme	2024-01-11 08:55:48 -08:00
Yih-Dar	59cd9de39d	Byebye torch 1.10 (#28207 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-11 16:18:27 +01:00
liangxuZhang	e768616afa	Fix load balancing loss func for mixtral (#28256 ) * Correct the implementation of auxiliary loss of mixtral * correct the implementation of auxiliary loss of mixtral * Implement a simpler calculation method --------- Co-authored-by: zhangliangxu3 <zhangliangxu3@jd.com>	2024-01-11 16:16:12 +01:00
Matt	5d4d62d0a2	Correctly resolve trust_remote_code=None for AutoTokenizer (#28419 ) * Correctly resolve trust_remote_code=None for AutoTokenizer * Second attempt at a proper resolution	2024-01-11 15:12:08 +00:00
Gustavo de Rosa	5509058561	[Phi] Extend implementation to use GQA/MQA. (#28163 ) * chore(phi): Updates configuration_phi with missing keys. * chore(phi): Adds first draft of combined modeling_phi. * fix(phi): Fixes according to latest review. * fix(phi): Removes pad_vocab_size_multiple to prevent inconsistencies. * fix(phi): Fixes unit and integration tests. * fix(phi): Ensures that everything works with microsoft/phi-1 for first integration. * fix(phi): Fixes output of docstring generation. * fix(phi): Fixes according to latest review. * fix(phi): Fixes according to latest review. * fix(tests): Re-enables Phi-1.5 test. * fix(phi): Fixes attention overflow on PhiAttention (for Phi-2). * fix(phi): Improves how queries and keys are upcast. * fix(phi): Small updates on latest changes.	2024-01-11 15:58:02 +01:00
Harisankar Babu	d560637885	Optionally preprocess segmentation maps for MobileViT (#28420 ) * optionally preprocess segmentation maps for mobilevit * changed pretrained model name to that of segmentation model * removed voc-deeplabv3 from model archive list * added preprocess_image and preprocess_mask methods for processing images and segmentation masks respectively * added tests for segmentation masks based on segformer feature extractor * use crop_size instead of size * reverting to initial model	2024-01-11 14:52:14 +00:00
Alex Hedges	95091e1582	Set `cache_dir` for `evaluate.load()` in example scripts (#28422 ) While using `run_clm.py`,[^1] I noticed that some files were being added to my global cache, not the local cache. I set the `cache_dir` parameter for the one call to `evaluate.load()`, which partially solved the problem. I figured that while I was fixing the one script upstream, I might as well fix the problem in all other example scripts that I could. There are still some files being added to my global cache, but this appears to be a bug in `evaluate` itself. This commit at least moves some of the files into the local cache, which is better than before. To create this PR, I made the following regex-based transformation: `evaluate\.load\((.*?)\)` -> `evaluate\.load\($1, cache_dir=model_args.cache_dir\)`. After using that, I manually fixed all modified files with `ruff` serving as useful guidance. During the process, I removed one existing usage of the `cache_dir` parameter in a script that did not have a corresponding `--cache-dir` argument declared. [^1]: I specifically used `pytorch/language-modeling/run_clm.py` from v4.34.1 of the library. For the original code, see the following URL: `acc394c4f5/examples/pytorch/language-modeling/run_clm.py`.	2024-01-11 15:38:44 +01:00
Yih-Dar	5fd5ef7624	Fix docker file (#28452 ) fix docker file Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-11 15:34:05 +01:00
Yih-Dar	d019acb858	Use python 3.10 for docbuild (#28399 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-11 14:39:49 +01:00
ikkvix	2a85345a23	Optimize the speed of the truncate_sequences function. (#28263 ) * change truncate_sequences * Update tokenization_utils_base.py * change format * fix when ids_to_move=0 * fix * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-11 11:42:14 +01:00
amyeroberts	66964c00f6	Enable multi-label image classification in pipeline (#28433 ) Enable multi-label image classification	2024-01-11 10:29:38 +00:00
jiqing-feng	8205b2647c	Assistant model may be on a different device (#27995 ) * Assistant model may be on a different device * fix tensor device	2024-01-11 11:24:59 +01:00
Patrick von Platen	cbbe30749b	[Whisper] Fix slow test (#28407 ) * [Whisper] Fix slow test * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-10 22:35:36 +01:00
Sparty	6c78bbcb83	[docstring] Fix docstring for ErnieConfig, ErnieMConfig (#27029 ) * Remove ErnieConfig, ErnieMConfig check_docstrings * Run fix_and_overwrite for ErnieConfig, ErnieMConfig * Replace <fill_type> and <fill_docstring> in configuration_ernie, configuration_ernie_m.py with type and docstring values --------- Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>	2024-01-10 18:20:39 +01:00
Francisco Kurucz	3724156b4d	Fix load correct tokenizer in Mixtral model documentation (#28437 )	2024-01-10 18:09:06 +01:00
Timothy Blattner	cef2e40e0f	Fix for checkpoint rename race condition (#28364 ) * Changed logic for renaming staging directory when saving checkpoint to only operate with the main process. Added fsync functionality to attempt to flush the write changes in case os.rename is not atomic. * Updated styling using make fixup * Updated check for main process to use built-in versions from trainer Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Fixed incorrect usage of trainer main process checks Added with open usage to ensure better file closing as suggested from PR Added rotate_checkpoints into main process logic * Removed "with open" due to not working with directory. os.open seems to work for directories. --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-01-10 16:55:42 +01:00
Susnato Dhar	fff8ca8e59	update docs to add the `phi-2` example (#28392 ) * update docs * added Tip	2024-01-10 16:07:47 +01:00
Joao Gante	ee2482b6f8	CI: limit natten version (#28432 )	2024-01-10 12:39:05 +00:00
prasatee	ffd3710391	Fix number of models in README.md (#28430 )	2024-01-10 12:11:08 +01:00
Zach Mueller	6015d0ad6c	Support `DeepSpeed` when using auto find batch size (#28088 ) Fixup test	2024-01-10 06:03:13 -05:00
Zach Mueller	a777f52599	Skip now failing test in the Trainer tests (#28421 ) * Fix test * Skip	2024-01-10 06:02:31 -05:00


15053 Commits