Lysandre
ce8e64fbe2
Dev version
2024-04-18 15:53:25 +02:00
Haz Sameen Shahgir
5e673ed2dc
updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 ( #30120 )
...
updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0
2024-04-08 12:41:28 +01:00
Arthur Zucker
1248f09252
v4.40.0.dev.0
2024-03-20 23:31:47 +09:00
Yitong Huang
873d9bb3cc
Make torch xla available on GPU ( #29334 )
...
* add USE_TORCH_XLA env
* rename torch_tpu to torch_xla
* better is_torch_xla_available; fix some fsdp and performance issues
* fix format
* fix bug when pjrt_device is cpu
* fix bug
* fix the deprecation handling
---------
Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
2024-03-11 14:07:16 +00:00
Arthur Zucker
1a77f07f65
v4.39.dev.0
2024-02-21 15:23:22 +09:00
zspo
d98591a12b
[docs] fix some bugs about parameter description ( #28806 )
...
Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
2024-02-01 16:59:29 +00:00
Amy Roberts
b2748a6efd
v4.38.dev.0
2024-01-19 10:43:28 +00:00
Alex Hedges
95091e1582
Set cache_dir
for evaluate.load()
in example scripts ( #28422 )
...
While using `run_clm.py`,[^1] I noticed that some files were being added
to my global cache, not the local cache. I set the `cache_dir` parameter
for the one call to `evaluate.load()`, which partially solved the
problem. I figured that while I was fixing the one script upstream, I
might as well fix the problem in all other example scripts that I could.
There are still some files being added to my global cache, but this
appears to be a bug in `evaluate` itself. This commit at least moves
some of the files into the local cache, which is better than before.
To create this PR, I made the following regex-based transformation:
`evaluate\.load\((.*?)\)` -> `evaluate\.load\($1,
cache_dir=model_args.cache_dir\)`. After using that, I manually fixed
all modified files with `ruff` serving as useful guidance. During the
process, I removed one existing usage of the `cache_dir` parameter in a
script that did not have a corresponding `--cache-dir` argument
declared.
[^1]: I specifically used `pytorch/language-modeling/run_clm.py` from
v4.34.1 of the library. For the original code, see the following URL:
acc394c4f5/examples/pytorch/language-modeling/run_clm.py
.
2024-01-11 15:38:44 +01:00
Lysandre
3ed3e3190c
Dev version
2023-12-13 18:29:31 +01:00
Adam Louly
4850aaba6f
fix no sequence length models error ( #27522 )
...
* fix no sequence length models error
* block size check
---------
Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-12-11 18:01:26 +00:00
V.Prasanna kumar
ffbcfc0166
Broken links fixed related to datasets docs ( #27569 )
...
fixed the broken links belogs to dataset library of transformers
2023-11-17 13:44:09 -08:00
Adam Louly
e6522e49a7
Fixing the failure of models without max_position_embeddings attribute. ( #27499 )
...
fix max pos issue
Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-11-15 18:16:42 +00:00
Lysandre
bc78fd1274
Dev version
2023-11-02 18:15:36 +01:00
Dong-geon Lee
25e6e9418c
Unify warning styles for better readability ( #27184 )
2023-10-31 18:12:14 +00:00
Lucain
66b088faf0
Provide alternative when warning on use_auth_token ( #27105 )
2023-10-27 14:32:54 +02:00
Tom Aarsen
40ea9ab2a1
Add many missing spaces in adjacent strings ( #26751 )
...
Add missing spaces in adjacent strings
2023-10-12 10:28:40 +02:00
Phuc Van Phan
6015f91a5a
refactor: change default block_size ( #26229 )
...
* refactor: change default block_size
* fix: return tf to origin
* fix: change files to origin
* rebase
* rebase
* rebase
* rebase
* rebase
* rebase
* rebase
* rebase
* refactor: add min block_size to files
* reformat: add min block_size for run_clm tf
2023-10-04 15:31:38 +01:00
Lysandre
bd6205919a
v4.35.0.dev0
2023-10-03 16:54:37 +02:00
Phuc Van Phan
5af2c62696
docs: add space to docs ( #26067 )
...
* docs: add space to docs
* docs: remove reduntant space
2023-09-11 22:03:26 +01:00
Phuc Van Phan
9cebae64ad
docs: update link huggingface map ( #26077 )
2023-09-11 12:57:04 +01:00
Lysandre
d8e13b3e04
v4.34.dev.0
2023-09-04 15:12:11 -04:00
Sylvain Gugger
5c67682b16
v4.33.0.dev0
2023-08-21 07:07:04 -04:00
Jackmin801
145109382a
Allow trust_remote_code
in example scripts ( #25248 )
...
* pytorch examples
* pytorch mim no trainer
* cookiecutter
* flax examples
* missed line in pytorch run_glue
* tensorflow examples
* tensorflow run_clip
* tensorflow run_mlm
* tensorflow run_ner
* tensorflow run_clm
* pytorch example from_configs
* pytorch no trainer examples
* Revert "tensorflow run_clip"
This reverts commit 261f86ac1f
.
* fix: duplicated argument
2023-08-07 16:32:25 +02:00
Yih-Dar
149cb0cce2
Add token
arugment in example scripts ( #25172 )
...
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-08-02 11:17:31 +02:00
Yih-Dar
d53b8ad780
Update use_auth_token
-> token
in example scripts ( #25167 )
...
* pytorch examples
* tensorflow examples
* flax examples
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-28 15:33:45 +02:00
Zach Mueller
aa1b09c5d1
Change logic for logging in the examples ( #24956 )
...
Change logic
2023-07-20 12:30:10 -04:00
Sylvain Gugger
e9ad51306f
4.32.0.dev0
2023-07-17 13:30:44 -04:00
Sylvain Gugger
ba695c1efd
v4.31.0.dev0
2023-06-07 16:49:00 -04:00
Boda Sadallah
a7920065f2
fix bug in group_texts function, that was inserting short batches ( #23429 )
...
* fix bug in group_texts function, that was inserting short batches
* fully exclude short batches and return empty dict instead
* fix style
2023-05-18 14:22:30 -04:00
Sylvain Gugger
a0c0a78233
v4.30.0.dev0
2023-05-09 14:59:38 -04:00
Sylvain Gugger
888c4a2ae0
v4.29.0.dev0
2023-04-12 20:04:29 -04:00
Wang, Yi
4ccaf268fb
add low_cpu_mem_usage option in run_clm.py example which will benefit… ( #22288 )
...
* add low_cpu_mem_usage option in run_clm.py example which will benefit LLM loading
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* update all the example and README under language-modeling
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-03-22 10:42:39 +00:00
Sylvain Gugger
ebdb185bef
v4.28.0.dev0
2023-03-14 13:49:10 -04:00
Sylvain Gugger
b19d64d852
Respect documentation on passive log level ( #21700 )
...
* Respect documentation on passive log level
* Fix test and set log level in examples
* Add doc
2023-02-22 09:39:18 +01:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions ( #21694 )
2023-02-22 09:14:54 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting ( #21480 )
...
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
Stas Bekman
3b9a1dc132
[examples] improve block_size warning message ( #21463 )
2023-02-06 08:36:12 -08:00
Quentin Lhoest
074d6b75fd
Simplify column_names in run_clm/mlm ( #21382 )
...
* simplify column_names in run_clm
* simplify column_names in run_mlm
* minor
2023-01-31 15:23:47 +01:00
Stas Bekman
98d88b23f5
[run_(clm|mlm).py
examples] add streaming dataset support ( #21343 )
...
* [run_clm example] add streaming dataset support
* unrefactor kwargs
* fix
* fix
* require datasets>=2.0.0
* port to mlm
2023-01-30 14:01:35 -08:00
Sylvain Gugger
7119bb052a
v4.27.0.dev0
2023-01-23 16:52:35 -05:00
Wang, Yi
9c9fe89f84
[run_clm example] add torch_dtype option for model load. ( #20971 )
...
* [run_clm example] add torch_dtype option for model load.
for BLOOM 175B model. peak memory will reduce about 350G for inference. the weight of BLOOM in model hub is bfloat16
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add other type in option
* fix style
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-01-03 09:33:11 -05:00
Sylvain Gugger
60d1f31bb0
v4.26.0.dev0
2022-12-01 16:19:33 -05:00
Sylvain Gugger
06886d5a68
Only resize embeddings when necessary ( #20043 )
...
* Only resize embeddings when necessary
* Add comment
2022-11-03 12:05:04 -04:00
Sylvain Gugger
c3a93d8d82
v4.25.0.dev0
2022-10-31 21:48:40 -04:00
Lysandre
10100979ed
Dev version
2022-10-10 17:25:40 -04:00
Lysandre
16913b3c92
Dev version
2022-09-14 14:58:20 -04:00
Julien Chaumond
9129fd0377
transformers-cli login
=> huggingface-cli login
(#18490 )
...
* zero chance anyone's using that constant no?
* `transformers-cli login` => `huggingface-cli login`
* `transformers-cli repo create` => `huggingface-cli repo create`
* `make style`
2022-08-06 09:42:55 +02:00
atturaioe
1f84399171
Migrate metric to Evaluate in Pytorch examples ( #18369 )
...
* Migrate metric to Evaluate in pytorch examples
* Remove unused imports
2022-08-01 07:40:25 -04:00
Lysandre
c89a592e87
Dev version
2022-07-27 17:13:57 +02:00
Sylvain Gugger
7c6ec195ad
v4.21.0.dev0
2022-06-16 12:20:53 -04:00