transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 13:20:12 +06:00

Author	SHA1	Message	Date
Amy Roberts	b2748a6efd	v4.38.dev.0	2024-01-19 10:43:28 +00:00
Alex Hedges	95091e1582	Set `cache_dir` for `evaluate.load()` in example scripts (#28422 ) While using `run_clm.py`,[^1] I noticed that some files were being added to my global cache, not the local cache. I set the `cache_dir` parameter for the one call to `evaluate.load()`, which partially solved the problem. I figured that while I was fixing the one script upstream, I might as well fix the problem in all other example scripts that I could. There are still some files being added to my global cache, but this appears to be a bug in `evaluate` itself. This commit at least moves some of the files into the local cache, which is better than before. To create this PR, I made the following regex-based transformation: `evaluate\.load\((.*?)\)` -> `evaluate\.load\($1, cache_dir=model_args.cache_dir\)`. After using that, I manually fixed all modified files with `ruff` serving as useful guidance. During the process, I removed one existing usage of the `cache_dir` parameter in a script that did not have a corresponding `--cache-dir` argument declared. [^1]: I specifically used `pytorch/language-modeling/run_clm.py` from v4.34.1 of the library. For the original code, see the following URL: `acc394c4f5/examples/pytorch/language-modeling/run_clm.py`.	2024-01-11 15:38:44 +01:00
Lysandre	3ed3e3190c	Dev version	2023-12-13 18:29:31 +01:00
Adam Louly	4850aaba6f	fix no sequence length models error (#27522 ) * fix no sequence length models error * block size check --------- Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-12-11 18:01:26 +00:00
V.Prasanna kumar	ffbcfc0166	Broken links fixed related to datasets docs (#27569 ) fixed the broken links belogs to dataset library of transformers	2023-11-17 13:44:09 -08:00
Adam Louly	e6522e49a7	Fixing the failure of models without max_position_embeddings attribute. (#27499 ) fix max pos issue Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-11-15 18:16:42 +00:00
Lysandre	bc78fd1274	Dev version	2023-11-02 18:15:36 +01:00
Dong-geon Lee	25e6e9418c	Unify warning styles for better readability (#27184 )	2023-10-31 18:12:14 +00:00
Lucain	66b088faf0	Provide alternative when warning on use_auth_token (#27105 )	2023-10-27 14:32:54 +02:00
Tom Aarsen	40ea9ab2a1	Add many missing spaces in adjacent strings (#26751 ) Add missing spaces in adjacent strings	2023-10-12 10:28:40 +02:00
Phuc Van Phan	6015f91a5a	refactor: change default block_size (#26229 ) * refactor: change default block_size * fix: return tf to origin * fix: change files to origin * rebase * rebase * rebase * rebase * rebase * rebase * rebase * rebase * refactor: add min block_size to files * reformat: add min block_size for run_clm tf	2023-10-04 15:31:38 +01:00
Lysandre	bd6205919a	v4.35.0.dev0	2023-10-03 16:54:37 +02:00
Phuc Van Phan	5af2c62696	docs: add space to docs (#26067 ) * docs: add space to docs * docs: remove reduntant space	2023-09-11 22:03:26 +01:00
Phuc Van Phan	9cebae64ad	docs: update link huggingface map (#26077 )	2023-09-11 12:57:04 +01:00
Lysandre	d8e13b3e04	v4.34.dev.0	2023-09-04 15:12:11 -04:00
Sylvain Gugger	5c67682b16	v4.33.0.dev0	2023-08-21 07:07:04 -04:00
Jackmin801	145109382a	Allow `trust_remote_code` in example scripts (#25248 ) * pytorch examples * pytorch mim no trainer * cookiecutter * flax examples * missed line in pytorch run_glue * tensorflow examples * tensorflow run_clip * tensorflow run_mlm * tensorflow run_ner * tensorflow run_clm * pytorch example from_configs * pytorch no trainer examples * Revert "tensorflow run_clip" This reverts commit `261f86ac1f`. * fix: duplicated argument	2023-08-07 16:32:25 +02:00
Yih-Dar	149cb0cce2	Add `token` arugment in example scripts (#25172 ) * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-02 11:17:31 +02:00
Yih-Dar	d53b8ad780	Update `use_auth_token` -> `token` in example scripts (#25167 ) * pytorch examples * tensorflow examples * flax examples --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-28 15:33:45 +02:00
Zach Mueller	aa1b09c5d1	Change logic for logging in the examples (#24956 ) Change logic	2023-07-20 12:30:10 -04:00
Sylvain Gugger	e9ad51306f	4.32.0.dev0	2023-07-17 13:30:44 -04:00
Sylvain Gugger	ba695c1efd	v4.31.0.dev0	2023-06-07 16:49:00 -04:00
Boda Sadallah	a7920065f2	fix bug in group_texts function, that was inserting short batches (#23429 ) * fix bug in group_texts function, that was inserting short batches * fully exclude short batches and return empty dict instead * fix style	2023-05-18 14:22:30 -04:00
Sylvain Gugger	a0c0a78233	v4.30.0.dev0	2023-05-09 14:59:38 -04:00
Sylvain Gugger	888c4a2ae0	v4.29.0.dev0	2023-04-12 20:04:29 -04:00
Wang, Yi	4ccaf268fb	add low_cpu_mem_usage option in run_clm.py example which will benefit… (#22288 ) * add low_cpu_mem_usage option in run_clm.py example which will benefit LLM loading Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * update all the example and README under language-modeling Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-03-22 10:42:39 +00:00
Sylvain Gugger	ebdb185bef	v4.28.0.dev0	2023-03-14 13:49:10 -04:00
Sylvain Gugger	b19d64d852	Respect documentation on passive log level (#21700 ) * Respect documentation on passive log level * Fix test and set log level in examples * Add doc	2023-02-22 09:39:18 +01:00
Aaron Gokaslan	5e8c8eb5ba	Apply ruff flake8-comprehensions (#21694 )	2023-02-22 09:14:54 +01:00
Sylvain Gugger	6f79d26442	Update quality tooling for formatting (#21480 ) * Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies	2023-02-06 18:10:56 -05:00
Stas Bekman	3b9a1dc132	[examples] improve block_size warning message (#21463 )	2023-02-06 08:36:12 -08:00
Quentin Lhoest	074d6b75fd	Simplify column_names in run_clm/mlm (#21382 ) * simplify column_names in run_clm * simplify column_names in run_mlm * minor	2023-01-31 15:23:47 +01:00
Stas Bekman	98d88b23f5	[`run_(clm\|mlm).py` examples] add streaming dataset support (#21343 ) * [run_clm example] add streaming dataset support * unrefactor kwargs * fix * fix * require datasets>=2.0.0 * port to mlm	2023-01-30 14:01:35 -08:00
Sylvain Gugger	7119bb052a	v4.27.0.dev0	2023-01-23 16:52:35 -05:00
Wang, Yi	9c9fe89f84	[run_clm example] add torch_dtype option for model load. (#20971 ) * [run_clm example] add torch_dtype option for model load. for BLOOM 175B model. peak memory will reduce about 350G for inference. the weight of BLOOM in model hub is bfloat16 Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * add other type in option * fix style Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-01-03 09:33:11 -05:00
Sylvain Gugger	60d1f31bb0	v4.26.0.dev0	2022-12-01 16:19:33 -05:00
Sylvain Gugger	06886d5a68	Only resize embeddings when necessary (#20043 ) * Only resize embeddings when necessary * Add comment	2022-11-03 12:05:04 -04:00
Sylvain Gugger	c3a93d8d82	v4.25.0.dev0	2022-10-31 21:48:40 -04:00
Lysandre	10100979ed	Dev version	2022-10-10 17:25:40 -04:00
Lysandre	16913b3c92	Dev version	2022-09-14 14:58:20 -04:00
Julien Chaumond	9129fd0377	`transformers-cli login` => `huggingface-cli login` (#18490 ) * zero chance anyone's using that constant no? * `transformers-cli login` => `huggingface-cli login` * `transformers-cli repo create` => `huggingface-cli repo create` * `make style`	2022-08-06 09:42:55 +02:00
atturaioe	1f84399171	Migrate metric to Evaluate in Pytorch examples (#18369 ) * Migrate metric to Evaluate in pytorch examples * Remove unused imports	2022-08-01 07:40:25 -04:00
Lysandre	c89a592e87	Dev version	2022-07-27 17:13:57 +02:00
Sylvain Gugger	7c6ec195ad	v4.21.0.dev0	2022-06-16 12:20:53 -04:00
Sylvain Gugger	3cab90279f	Add examples telemetry (#17552 ) * Add examples telemetry * Alternative approach * Add to all other examples * Add to templates as well * Put framework separately * Same for TensorFlow	2022-06-07 11:57:52 -04:00
Sylvain Gugger	afe5d42d8d	Black preview (#17217 ) * Black preview * Fixup too! * Fix check copies * Use the same version as the CI * Bump black	2022-05-12 16:25:55 -04:00
Lysandre Debut	5294fa12ee	Dev version	2022-05-12 11:04:23 -04:00
Lysandre Debut	a180efe7fd	Dev version	2022-04-06 11:08:12 -04:00
Karim Foda	24a85cca61	Add use_auth to load_datasets for private datasets to PT and TF examples (#16521 ) * fix formatting and remove use_auth * Add use_auth_token to Flax examples	2022-04-04 10:27:45 -04:00
Stas Bekman	a73281e3e4	[examples] max samples can't be bigger than the len of dataset (#16501 ) * [examples] max samples can't be bigger than then len of dataset * do tf and flax	2022-03-30 12:33:16 -07:00


99 Commits
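
The message of commit 95091e1582 describes a regex-based rewrite that appends `cache_dir=model_args.cache_dir` to every `evaluate.load(...)` call. The following is a minimal sketch of that idea using Python's `re` module, not the author's actual tooling. The pattern is read here as matching a literal opening parenthesis, a lazily captured argument list, and a literal closing parenthesis; the helper name `add_cache_dir` and the sample input line are hypothetical.

```python
import re

# Assumed reading of the pattern from the commit message:
# literal "evaluate.load(", a lazily captured argument list, then ")".
PATTERN = re.compile(r"evaluate\.load\((.*?)\)")

# Python's re module refers to capture groups as \1 rather than $1.
REPLACEMENT = r"evaluate.load(\1, cache_dir=model_args.cache_dir)"


def add_cache_dir(source: str) -> str:
    """Append the cache_dir keyword to each evaluate.load(...) call.

    Hypothetical helper for illustration. The lazy match stops at the
    first closing parenthesis, so calls with nested parentheses are
    rewritten incorrectly and need manual review.
    """
    return PATTERN.sub(REPLACEMENT, source)


# Hypothetical input line, not taken from any specific example script.
before = 'metric = evaluate.load("accuracy")'
after = add_cache_dir(before)
print(after)
```

Because the lazy capture cannot handle nested parentheses or calls that span several lines, a purely mechanical rewrite like this is only a first pass, which fits the commit message's note that the modified files were fixed by hand afterwards with `ruff` as guidance.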