transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Alex Hedges	95091e1582	Set `cache_dir` for `evaluate.load()` in example scripts (#28422 ) While using `run_clm.py`,[^1] I noticed that some files were being added to my global cache, not the local cache. I set the `cache_dir` parameter for the one call to `evaluate.load()`, which partially solved the problem. I figured that while I was fixing the one script upstream, I might as well fix the problem in all other example scripts that I could. There are still some files being added to my global cache, but this appears to be a bug in `evaluate` itself. This commit at least moves some of the files into the local cache, which is better than before. To create this PR, I made the following regex-based transformation: `evaluate\.load\((.*?)\)` -> `evaluate\.load\($1, cache_dir=model_args.cache_dir\)`. After using that, I manually fixed all modified files with `ruff` serving as useful guidance. During the process, I removed one existing usage of the `cache_dir` parameter in a script that did not have a corresponding `--cache-dir` argument declared. [^1]: I specifically used `pytorch/language-modeling/run_clm.py` from v4.34.1 of the library. For the original code, see the following URL: `acc394c4f5/examples/pytorch/language-modeling/run_clm.py`.	2024-01-11 15:38:44 +01:00
Yih-Dar	5fd5ef7624	Fix docker file (#28452 ) fix docker file Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-11 15:34:05 +01:00
Yih-Dar	d019acb858	Use python 3.10 for docbuild (#28399 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-11 14:39:49 +01:00
ikkvix	2a85345a23	Optimize the speed of the truncate_sequences function. (#28263 ) * change truncate_sequences * Update tokenization_utils_base.py * change format * fix when ids_to_move=0 * fix * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-11 11:42:14 +01:00
amyeroberts	66964c00f6	Enable multi-label image classification in pipeline (#28433 ) Enable multi-label image classification	2024-01-11 10:29:38 +00:00
jiqing-feng	8205b2647c	Assistant model may be on a different device (#27995 ) * Assistant model may be on a different device * fix tensor device	2024-01-11 11:24:59 +01:00
Patrick von Platen	cbbe30749b	[Whisper] Fix slow test (#28407 ) * [Whisper] Fix slow test * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-10 22:35:36 +01:00
Sparty	6c78bbcb83	[docstring] Fix docstring for ErnieConfig, ErnieMConfig (#27029 ) * Remove ErnieConfig, ErnieMConfig check_docstrings * Run fix_and_overwrite for ErnieConfig, ErnieMConfig * Replace <fill_type> and <fill_docstring> in configuration_ernie, configuration_ernie_m.py with type and docstring values --------- Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>	2024-01-10 18:20:39 +01:00
Francisco Kurucz	3724156b4d	Fix load correct tokenizer in Mixtral model documentation (#28437 )	2024-01-10 18:09:06 +01:00
Timothy Blattner	cef2e40e0f	Fix for checkpoint rename race condition (#28364 ) * Changed logic for renaming staging directory when saving checkpoint to only operate with the main process. Added fsync functionality to attempt to flush the write changes in case os.rename is not atomic. * Updated styling using make fixup * Updated check for main process to use built-in versions from trainer Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Fixed incorrect usage of trainer main process checks Added with open usage to ensure better file closing as suggested from PR Added rotate_checkpoints into main process logic * Removed "with open" due to not working with directory. os.open seems to work for directories. --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-01-10 16:55:42 +01:00
Susnato Dhar	fff8ca8e59	update docs to add the `phi-2` example (#28392 ) * update docs * added Tip	2024-01-10 16:07:47 +01:00
Joao Gante	ee2482b6f8	CI: limit natten version (#28432 )	2024-01-10 12:39:05 +00:00
prasatee	ffd3710391	Fix number of models in README.md (#28430 )	2024-01-10 12:11:08 +01:00
Zach Mueller	6015d0ad6c	Support `DeepSpeed` when using auto find batch size (#28088 ) Fixup test	2024-01-10 06:03:13 -05:00
Zach Mueller	a777f52599	Skip now failing test in the Trainer tests (#28421 ) * Fix test * Skip	2024-01-10 06:02:31 -05:00
HanHui	4df1d69634	[BUG] BarkEosPrioritizerLogitsProcessor eos_token_id use list, tensor size mismatch (#28201 ) fix(generation/logits_process.py): BarkEosPrioritizerLogitsProcessor eos_token_id use list, tensor size mismatch Co-authored-by: chenhanhui <chenhanhui@kanzhun.com>	2024-01-10 11:46:49 +01:00
dependabot[bot]	932ad8af7a	Bump fonttools from 4.31.1 to 4.43.0 in /examples/research_projects/decision_transformer (#28417 ) Bump fonttools in /examples/research_projects/decision_transformer Bumps [fonttools](https://github.com/fonttools/fonttools) from 4.31.1 to 4.43.0. - [Release notes](https://github.com/fonttools/fonttools/releases) - [Changelog](https://github.com/fonttools/fonttools/blob/main/NEWS.rst) - [Commits](https://github.com/fonttools/fonttools/compare/4.31.1...4.43.0) --- updated-dependencies: - dependency-name: fonttools dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 11:22:43 +01:00
Weiming Zhao	701298d2d3	Use mmap option to load_state_dict (#28331 ) Use mmap option to load_state_dict (#28331)	2024-01-10 09:57:30 +01:00
Victor SANH	0f2f0c634f	Fix `_merge_input_ids_with_image_features` for llava model (#28333 ) * fix `_merge_input_ids_with_image_features` for llava model * Update src/transformers/models/llava/modeling_llava.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * adress comments * style and tests * ooops * test the backward too * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update tests/models/vipllava/test_modeling_vipllava.py * style and quality --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-01-10 08:33:33 +01:00
Xuehai Pan	976189a6df	Fix initialization for missing parameters in `from_pretrained` under ZeRO-3 (#28245 ) * Fix initialization for missing parameters in `from_pretrained` under ZeRO-3 * Test initialization for missing parameters under ZeRO-3 * Add more tests * Only enable deepspeed context for per-module level parameters * Enable deepspeed context only once * Move class definition inside test case body	2024-01-09 14:58:21 +00:00
Sangbum Daniel Choi	357971ec36	fix auxiliary loss training in DetrSegmentation (#28354 ) * fix auxiliary loss training in detrSegmentation * add auxiliary_loss testing	2024-01-09 10:17:07 +00:00
Patrick von Platen	8604dd308d	[SDPA] Make sure attn mask creation is always done on CPU (#28400 ) * [SDPA] Make sure attn mask creation is always done on CPU * Update docker to 2.1.1 * revert test change	2024-01-09 11:05:19 +01:00
Yih-Dar	5c7e11e010	update warning for image processor loading (#28209 ) * info * update * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-09 08:51:37 +01:00
NielsRogge	3b742ea84c	Add SigLIP (#26522 ) * Add first draft * Use appropriate gelu function * More improvements * More improvements * More improvements * Convert checkpoint * More improvements * Improve docs, remove print statements * More improvements * Add link * remove unused masking function * begin tokenizer * do_lower_case * debug * set split_special_tokens=True * Remove script * Fix style * Fix rebase * Use same design as CLIP * Add fast tokenizer * Add SiglipTokenizer to init, remove extra_ids * Improve conversion script * Use smaller inputs in conversion script * Update conversion script * More improvements * Add processor to conversion script * Add tests * Remove print statements * Add tokenizer tests * Fix more tests * More improvements related to weight initialization * More improvements * Make more tests pass * More improvements * More improvements * Add copied from * Add canonicalize_text * Enable fast tokenizer tests * More improvements * Fix most slow tokenizer tests * Address comments * Fix style * Remove script * Address some comments * Add copied from to tests * Add more copied from * Add more copied from * Add more copied from * Remove is_flax_available * More updates * Address comment * Remove SiglipTokenizerFast for now * Add caching * Remove umt5 test * Add canonicalize_text inside _tokenize, thanks Arthur * Fix image processor tests * Skip tests which are not applicable * Skip test_initialization * More improvements * Compare pixel values * Fix doc tests, add integration test * Add do_normalize * Remove causal mask and leverage ignore copy * Fix attention_mask * Fix remaining tests * Fix dummies * Rename temperature and bias * Address comments * Add copied from to tokenizer tests * Add SiglipVisionModel to auto mapping * Add copied from to image processor tests * Improve doc * Remove SiglipVisionModel from index * Address comments * Improve docs * Simplify config * Add first draft * Make it like mistral * More improvements * Fix attention_mask * Fix output_attentions * Add note in docs * Convert multilingual model * Convert large checkpoint * Convert more checkpoints * Add pipeline support, correct image_mean and image_std * Use padding=max_length by default * Make processor like llava * Add code snippet * Convert more checkpoints * Set keep_punctuation_string=None as in OpenCLIP * Set normalized=False for special tokens * Fix doc test * Update integration test * Add figure * Update organization * Happy new year * Use AutoModel everywhere --------- Co-authored-by: patil-suraj <surajp815@gmail.com>	2024-01-08 18:17:16 +01:00
Rosie Wood	73c88012b7	Add segmentation map processing to SAM Image Processor (#27463 ) * add segmentation map processing to sam image processor * fixup * add tests * reshaped_input_size is shape before padding * update tests for size/shape outputs * fixup * add code snippet to docs * Update docs/source/en/model_doc/sam.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add missing backticks * add `segmentation_maps` as arg for SamProcessor.__call__() --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-08 16:40:36 +00:00
Avimanyu Bandyopadhyay	2272ab57a9	Remove shell=True from subprocess.Popen to Mitigate Security Risk (#28299 ) Remove shell=True from subprocess.Popen to mitigate security risk	2024-01-08 14:33:28 +00:00
zspo	87a6cf41d0	[AttentionMaskConverter] fix sdpa unmask unattended (#28369 ) fix tensor device	2024-01-08 13:33:44 +01:00
Ondrej Major	98dba52ccd	Bugfix / ffmpeg input device (mic) not working on Windows (#27051 ) * fix input audio device for windows. * ffmpeg audio device Windows * Fixes wrong input device assignment in Windows * Fixed getting mic on Windows systems by adding _get_microphone_name() function.	2024-01-08 13:32:36 +01:00
Hz, Ji	7d9d5cea55	remove two deprecated function (#28220 )	2024-01-08 11:33:58 +00:00
Mohamed Abu El-Nasr	0c2121f99b	Fix building alibi tensor when num_heads is not a power of 2 (#28380 ) * Fix building alibi tensor when num_heads is not a power of 2 * Remove print function	2024-01-08 10:39:40 +01:00
Chi	53cffeb33c	Enhancing Code Readability and Maintainability with Simplified Activation Function Selection. (#28349 ) * Little bit change code in get_activation() * proper area to deffine gelu_activation() in this two file * Fix github issue * Mistake some typo * My mistake to self using to call config * Reformat my two file * Update src/transformers/activations.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/electra/modeling_electra.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/convbert/modeling_convbert.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Rename gelu_act to activatioin --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-08 09:19:06 +01:00
Susnato Dhar	3eddda1111	[Phi2] Add support for phi2 models (#28211 ) * modified script and added test for phi2 * changes	2024-01-07 08:19:14 +01:00
hugo-syn	4ab5fb8941	chore: Fix typo s/exclusivelly/exclusively/ (#28361 )	2024-01-05 13:19:15 -08:00
Ella Charlaix	7226f3d2b0	Update VITS modeling to enable ONNX export (#28141 ) * Update vits modeling for onnx export compatibility * fix style * Update src/transformers/models/vits/modeling_vits.py	2024-01-05 17:52:32 +01:00
Susnato Dhar	cadf93a6fc	fix FA2 when using quantization for remaining models (#28341 ) * fix fa2 autocasting when using quantization * Update src/transformers/models/distilbert/modeling_distilbert.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/distilbert/modeling_distilbert.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-05 16:46:55 +01:00
Sangbum Daniel Choi	899d8351f9	[DETA] Improvement and Sync from DETA especially for training (#27990 ) * [DETA] fix freeze/unfreeze function * Update src/transformers/models/deta/modeling_deta.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deta/modeling_deta.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add freeze/unfreeze test case in DETA * fix type * fix typo 2 * fix : enable aux and enc loss in training pipeline * Add unsynced variables from original DETA for training * modification for passing CI test * make style * make fix * manual make fix * change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking * remove print * divide configuration in DetaModel and DetaForObjectDetection * image smaller size than 224 will give topk error * pred_boxes and logits should be equivalent to two_stage_num_proposals * add missing part in DetaConfig * Update src/transformers/models/deta/modeling_deta.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add docstring in configure and prettify TO DO part * change distribute related code to accelerate * Update src/transformers/models/deta/configuration_deta.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/deta/test_modeling_deta.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * protect importing accelerate * change variable name to specific value * wrong import --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-05 14:20:21 +00:00
Fernando Rodriguez Sanchez	57e9c83213	Fix pos_mask application and update tests accordingly (#27892 ) * Fix pos_mask application and update tests accordingly * Fix style * Adding comments --------- Co-authored-by: Fernando Rodriguez <fernando.rodriguez@nielseniq.com>	2024-01-05 12:36:10 +01:00
yuanwu2017	03b980990a	Don't check the device when device_map=auto (#28351 ) When running on a multi-card server with device_map=auto, the model is not always allocated to device 0, because other processes may be using those cards. Instead, the devices that can accommodate the model are selected. Signed-off-by: yuanwu <yuan.wu@intel.com>	2024-01-05 12:21:29 +01:00
Kevin Herro	5d36025ca1	README: install transformers from conda-forge channel (#28313 ) Switch to the conda-forge channel for transformer installation, as the huggingface channel does not offer the latest version. Fixes #28248	2024-01-04 09:36:16 -08:00
Yoach Lacombe	35e9d2b223	Fix error in M4T feature extractor (#28340 ) * fix M4T FE error when no attention mask * modify logic * add test * go back to initial test situation + add other tests	2024-01-04 16:40:53 +00:00
Sangbum Daniel Choi	4a66c0d952	enable training mask2former and maskformer for transformers trainer (#28277 ) * fix get_num_masks output as [int] to int * fix loss size from torch.Size([1]) to torch.Size([])	2024-01-04 09:53:25 +01:00
Aaron Jimenez	6b8ec2588e	[docs] Sort es/toctree.yml | Translate performance.md (#28262 ) * Sort es/_toctree.yml like en/_toctree.yml * Run make style * Add -Rendimiento y escalabilidad- section to es/_toctree.yml * Run make style * Add s to section * Add translate of performance.md * Add performance.md to es/_toctree.yml * Run make style * Fix docs links * Run make style	2024-01-03 14:35:58 -08:00
Mayfsz	3ea8833676	Translate contributing.md into Chinese (#28243 ) * Translate contributing.md into Chinese * Update review comments	2024-01-03 14:35:02 -08:00
Apsod	45b1dfa342	Remove token_type_ids from model_input_names (like #24788 ) (#28325 ) * remove token_type_ids from model_input_names (like #24788) * removed test that assumed token_type_ids should be present and updated a model reference so that it points to an available model)	2024-01-03 19:26:07 +01:00
Connor Henderson	d83ff5eeff	Add FastSpeech2Conformer (#23439 ) * start - docs, SpeechT5 copy and rename * add relevant code from FastSpeech2 draft, have tests pass * make it an actual conformer, demo ex. * matching inference with original repo, includes debug code * refactor nn.Sequentials, start more desc. var names * more renaming * more renaming * vocoder scratchwork * matching vocoder outputs * hifigan vocoder conversion script * convert model script, rename some config vars * replace postnet with speecht5's implementation * passing common tests, file cleanup * expand testing, add output hidden states and attention * tokenizer + passing tokenizer tests * variety of updates and tests * g2p_en pckg setup * import structure edits * docstrings and cleanup * repo consistency * deps * small cleanup * forward signature param order * address comments except for masks and labels * address comments on attention_mask and labels * address second round of comments * remove old unneeded line * address comments part 1 * address comments pt 2 * rename auto mapping * fixes for failing tests * address comments part 3 (bart-like, train loss) * make style * pass config where possible * add forward method + tests to WithHifiGan model * make style * address arg passing and generate_speech comments * address Arthur comments * address Arthur comments pt2 * lint changes * Sanchit comment * add g2p-en to doctest deps * move up self.encoder * onnx compatible tensor method * fix is symbolic * fix paper url * move models to espnet org * make style * make fix-copies * update docstring * Arthur comments * update docstring w/ new updates * add model architecture images * header size * md wording update * make style	2024-01-03 18:01:06 +00:00
lain	6eba901d88	fix documentation for zero_shot_object_detection (#28267 ) remove broken space	2024-01-03 09:20:34 -08:00
dependabot[bot]	c2d283a64a	Bump tj-actions/changed-files from 22.2 to 41 in /.github/workflows (#28311 ) Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 22.2 to 41. - [Release notes](https://github.com/tj-actions/changed-files/releases) - [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md) - [Commits](https://github.com/tj-actions/changed-files/compare/v22.2...v41) --- updated-dependencies: - dependency-name: tj-actions/changed-files dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-03 09:12:53 +01:00
Daniel Bustamante Ospina	aa4a0f8ef3	Remove fast tokenization warning in Data Collators (#28213 )	2024-01-02 18:32:23 +00:00
Marco Carosi	5be46dfc09	[Whisper] Fix errors with MPS backend introduced by new code on word-level timestamps computation (#28288 ) * Update modeling_whisper.py to support MPS backend Fixed some issue with MPS backend. First, the torch.std_mean is not implemented and is not scheduled for implementation, while the single torch.std and torch.mean are. Second, MPS backend does not support float64, so it can not cast from float32 to float64. Inverting the double() when the matrix is in the cpu fixes the issue while should not change the logic. * Found another instruction in modeling_whisper.py not implemented byor MPS After a load test, where I transcribed a 2 hours audio file, I got into a branch that did not fix in the previous commit. Similar fix, where the torch.std_mean is changed into torch.std and torch.mean * Update modeling_whisper.py removed trailing white spaces Removed trailing white spaces * Update modeling_whisper.py to use is_torch_mps_available() Using is_torch_mps_available() instead of capturing the NotImplemented exception * Update modeling_whisper.py sorting the import block Sorting the utils import block * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-02 16:22:28 +00:00
frankenliu	87ae2a4632	fix bug:divide by zero in _maybe_log_save_evaluate() (#28251 ) Co-authored-by: liujizhong1 <liujizhong1@xiaomi.com>	2024-01-02 14:19:42 +00:00
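
Commit 95091e1582 above describes a regex-based transformation that appends `cache_dir=model_args.cache_dir` to every `evaluate.load(...)` call. The sketch below is one way to reproduce that rewrite in Python; it is an illustration, not the author's actual tooling. The helper name `add_cache_dir` and the sample input line are assumptions, and Python's `re` module writes the backreference as `\1` rather than `$1`.

```python
import re

# Pattern taken from the commit message: capture the argument list lazily.
PATTERN = re.compile(r"evaluate\.load\((.*?)\)")
# Python's re module uses \1 where the commit message writes $1.
REPLACEMENT = r"evaluate.load(\1, cache_dir=model_args.cache_dir)"


def add_cache_dir(source: str) -> str:
    """Hypothetical helper: append the cache_dir argument to evaluate.load() calls."""
    return PATTERN.sub(REPLACEMENT, source)


# Sample input line (an assumption, not copied from any example script).
before = 'metric = evaluate.load("accuracy")'
print(add_cache_dir(before))
```

Because `(.*?)` is lazy, the match stops at the first closing parenthesis, so a call whose arguments contain nested parentheses would be rewritten incorrectly. That may be one reason the commit message mentions a manual pass over the modified files afterwards.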

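Commit cef2e40e0f above describes renaming the checkpoint staging directory only on the main process, then calling fsync through `os.open` because `with open` does not work on a directory. The following is a minimal sketch of that idea under stated assumptions, not the Trainer's actual code: the function name `finalize_checkpoint`, its parameters, and the boolean return value are hypothetical, and opening a directory with `os.open` is assumed to run on a POSIX system.

```python
import os


def finalize_checkpoint(staging_dir: str, final_dir: str, is_main_process: bool) -> bool:
    """Hypothetical helper: rename a staging directory from one process only."""
    if not is_main_process:
        # Other processes skip the rename so they cannot race on it.
        return False

    os.rename(staging_dir, final_dir)

    # open() cannot open a directory, but os.open() can on POSIX systems.
    fd = os.open(final_dir, os.O_RDONLY)
    try:
        # Ask the OS to flush pending writes for the directory.
        os.fsync(fd)
    finally:
        os.close(fd)
    return True
```

In a distributed run, the other processes would typically wait at a barrier until the main process has finished the rename; that synchronization is left out of this sketch.
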

14868 Commits