Lingepumpe
5427250351
Avoid invalid escape sequences, use raw strings ( #22936 )
...
* Avoid invalid escape sequences, use raw strings
* Integrate PR feedback
2023-04-25 09:17:56 -04:00
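For illustration only, here is a minimal sketch of the kind of change this commit title describes. It is not taken from the commit itself, and the pattern and the `parse_version` helper are hypothetical. In an ordinary string literal, a sequence such as `\d` is an invalid escape that Python warns about. A raw string passes the backslash through to the regex engine unchanged.

```python
import re

# Hypothetical pattern: the raw-string literal keeps the backslashes intact,
# so the regex engine (not the Python parser) interprets \d and \.
PATTERN = re.compile(r"(\d+)\.(\d+)")


def parse_version(text):
    """Return (major, minor) as ints, or None if no version is found."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The doubled-backslash spelling builds the same string as the raw literal.
SAME = r"\d" == "\\d"
```

Raw strings are usually the more readable of the two equivalent spellings, which is presumably why the commit prefers them.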
Roy Hvaara
874c7caf19
Remove broken test_data symlink in legacy s2s examples ( #22876 )
2023-04-21 15:35:42 +01:00
Sayak Paul
4116d1ec75
[Examples/TensorFlow] minor refactoring to allow compatible datasets to work ( #22879 )
...
minor refactoring to allow compatible datasets to work.
2023-04-20 18:21:01 +05:30
Zachary Mueller
cd3e0211a6
Remove accelerate from tf test reqs ( #22777 )
...
Remove accelerate from tf
2023-04-17 12:31:21 -04:00
Matt
2237127a6c
Fix sneaky torch dependency in TF example ( #22804 )
2023-04-17 16:11:52 +01:00
Sayak Paul
390e121fb5
[Examples] TPU-based training of a language model using TensorFlow ( #21657 )
...
* add: tokenizer training script for TF TPU LM training.
* add: script for preparing the TFRecord shards.
* add: sequence of execution to readme.
* remove limit from the tfrecord shard name.
* Add initial train_model.py
* Add basic training arguments and model init
* Get up to the point of writing the data collator
* Pushing progress so far!
* Complete first draft of model training code
* feat: grouping of texts efficiently.
Co-authored-by: Matt <rocketknight1@gmail.com>
* Add proper masking collator and get training loop working
* fix: things.
* Read sample counts from filenames
* Draft README
* Improve TPU warning
* Use distribute instead of distribute.experimental
* Apply suggestions from code review
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Modularize loading and add MLM probability as arg
* minor refactoring to better use the cli args.
* readme fillup.
* include tpu and inference sections in the readme.
* table of contents.
* parallelize maps.
* polish readme.
* change script name to run_mlm.py
* address PR feedback (round I).
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2023-04-14 10:41:01 +05:30
Sylvain Gugger
888c4a2ae0
v4.29.0.dev0
2023-04-12 20:04:29 -04:00
Sylvain Gugger
1b1867d86b
Replace -100s in predictions by the pad token ( #22693 )
...
* Replace -100s in predictions by the pad token
* Style
* Try to catch them all
2023-04-11 09:32:20 -04:00
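As a rough sketch of the idea in this commit title (an assumption about the general technique, not the code from the PR): label positions are conventionally marked with -100 so the loss ignores them, and those values must be swapped for a real token id before decoding. The helper name and the pure-Python list representation below are hypothetical.

```python
def replace_ignore_index(predictions, pad_token_id, ignore_index=-100):
    """Return a copy of `predictions` with every ignore_index replaced by pad_token_id.

    `predictions` is a list of token-id sequences (hypothetical representation).
    """
    return [
        [pad_token_id if token == ignore_index else token for token in sequence]
        for sequence in predictions
    ]


# Example: -100 entries become the pad id (0 here) so a tokenizer could decode them.
cleaned = replace_ignore_index([[5, -100, 7], [-100, -100, 2]], pad_token_id=0)
```

With array inputs the same replacement is commonly written as a single vectorized `where` call.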
Mikel Penagarikano
d5239bab5b
Sync preprocesses before loading the processor at run_speech_recognition_ctc.py ( #21926 )
...
* Update run_speech_recognition_ctc.py
Make sure all processes wait until data is saved before loading the processor from the output_dir
* Make sure all processes wait until data is saved before loading the processor from the output_dir
* Update run_speech_recognition_ctc.py
* Update run_speech_recognition_seq2seq.py
2023-04-05 09:36:04 -04:00
Maziyar Panahi
98268b2e76
Add id2label and label2id to model's config in run_xnli ( #22558 )
...
Add id2label and label2id to config in run_xnli
2023-04-04 09:28:57 -04:00
dependabot[bot]
6fc44656b4
Bump redis from 4.5.3 to 4.5.4 in /examples/research_projects/decision_transformer ( #22494 )
...
Bump redis in /examples/research_projects/decision_transformer
Bumps [redis](https://github.com/redis/redis-py ) from 4.5.3 to 4.5.4.
- [Release notes](https://github.com/redis/redis-py/releases )
- [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES )
- [Commits](https://github.com/redis/redis-py/compare/v4.5.3...v4.5.4 )
---
updated-dependencies:
- dependency-name: redis
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-31 10:50:33 -04:00
Sabine
173193ccd0
Update Neptune docs ( #22452 )
2023-03-29 13:15:38 -04:00
dependabot[bot]
32ff06403d
Bump redis from 4.1.4 to 4.5.3 in /examples/research_projects/decision_transformer ( #22410 )
...
Bump redis in /examples/research_projects/decision_transformer
Bumps [redis](https://github.com/redis/redis-py ) from 4.1.4 to 4.5.3.
- [Release notes](https://github.com/redis/redis-py/releases )
- [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES )
- [Commits](https://github.com/redis/redis-py/compare/v4.1.4...v4.5.3 )
---
updated-dependencies:
- dependency-name: redis
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 20:23:55 -04:00
Sylvain Gugger
057e1d7473
Fix quality
2023-03-27 13:17:14 -04:00
Donny Greenberg
f02e3a2b18
Hardware Auto-Setup for Examples ( #22319 )
...
* Add initial remote hardware auto-setup docs
* Fix a few typos and clarify some language
* Add missing dependency
* Update self-hosted launch script with Sylvain's comments.
* Formatting.
* Trigger CI
* Style
2023-03-27 13:07:53 -04:00
Joao Gante
88dae78f4d
TensorFlow: pin maximum version to 2.12 ( #22364 )
2023-03-24 18:45:03 +00:00
Sylvain Gugger
6587125c0a
Pin tensorflow-text to go with tensorflow ( #22362 )
...
* Pin tensorflow-text to go with tensorflow
* Make it more convenient to pin TensorFlow
* setup doesn't like f-strings
2023-03-24 10:54:06 -04:00
Sylvain
ef28df0572
Fix quality due to ruff release
2023-03-22 20:45:08 -04:00
Connor Henderson
8e6c34b390
fix: Allow only test_file in pytorch and flax summarization ( #22293 )
...
allow only test_file in pytorch and flax summarization
2023-03-22 10:46:56 +00:00
Wang, Yi
4ccaf268fb
add low_cpu_mem_usage option in run_clm.py example which will benefit… ( #22288 )
...
* add low_cpu_mem_usage option in run_clm.py example which will benefit LLM loading
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* update all the example and README under language-modeling
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-03-22 10:42:39 +00:00
jiqing-feng
8472a224fb
Enable traced model for text-generation task ( #22265 )
2023-03-22 10:19:26 +00:00
Sylvain Gugger
ebdb185bef
v4.28.0.dev0
2023-03-14 13:49:10 -04:00
bofeng huang
6192549c1f
[examples/speech-recognition] Add SpecAugment to run_speech_recognition_seq2seq.py ( #21942 )
...
* Add specaugment to run_speech_recognition_seq2seq.py
* Remove useless argument: text_column
* Fix quality
* Update return_attention_mask condition
* Update specaugment arguments only for whisper models
* Remove SpecAugment arguments from ModelArguments, only leave default values for simplicity
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update apply_spec_augment only for whisper models
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Rename return_attention_mask to forward_attention_mask to avoid confusion with wav2vec2 models
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-03-08 17:59:31 +01:00
Matt
d128f2ffab
Stop requiring Torch for our TF examples! ( #21997 )
...
* Stop requiring Torch for our TF examples!
* Slight tweak to logging in the example itself
2023-03-07 15:54:10 +00:00
Matt
5d8efc79db
Add TF contrastive image text finetuning example ( #21939 )
...
* Initial commit
* stash commit
* Add model checkpointing and pushing
* Fix model name inference
* Update README
* Remove a couple of Torch references
* Update copyright date
* make fixup
* Update PushToHubCallback args!
* Remove the torch summary
* Add strategy.scope
2023-03-06 16:57:40 +00:00
Matt
1d3a1cc44b
Add check for different embedding types in examples ( #21881 )
...
* Add check for different embedding types in examples
* Correctly update summarization example
2023-03-01 16:57:06 +00:00
bofeng huang
3c0ce60855
[examples/summarization] deal with max_length and num_beams ( #21740 )
...
* Override the decoding parameters of Seq2SeqTrainer
* Fix quality
* Fix max_length parameter
* Fix quality
* Remove redundant parameter max_length
* Separate the preprocess of train and validation to use different max_target_length
2023-02-27 08:18:14 +01:00
Sanchit Gandhi
13489248fa
[Examples] Generalise run audio classification for log-mel models ( #21756 )
...
* [Examples] Generalise run audio classification for log-mel models
* batch feature extractor
* make style
2023-02-24 09:19:07 +01:00
Sylvain Gugger
b19d64d852
Respect documentation on passive log level ( #21700 )
...
* Respect documentation on passive log level
* Fix test and set log level in examples
* Add doc
2023-02-22 09:39:18 +01:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions ( #21694 )
2023-02-22 09:14:54 +01:00
Arthur
4194e5f42b
Fix-rag-finetune-project-requirement ( #21697 )
...
pin pytorch lightning requirement
2023-02-20 17:23:39 +01:00
dependabot[bot]
fcfd4ec789
Bump werkzeug from 2.0.3 to 2.2.3 in /examples/research_projects/decision_transformer ( #21658 )
...
Bump werkzeug in /examples/research_projects/decision_transformer
Bumps [werkzeug](https://github.com/pallets/werkzeug ) from 2.0.3 to 2.2.3.
- [Release notes](https://github.com/pallets/werkzeug/releases )
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst )
- [Commits](https://github.com/pallets/werkzeug/compare/2.0.3...2.2.3 )
---
updated-dependencies:
- dependency-name: werkzeug
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-16 09:23:43 -05:00
regisss
751f17aa48
Fix typos in contrastive-image-text example README ( #21665 )
2023-02-16 09:10:25 -05:00
Warren Green
fd5320bb57
Add missing argument to run_clip.py ( #21588 )
2023-02-13 10:27:23 -05:00
dependabot[bot]
92487f5d0b
Bump ipython from 8.1.1 to 8.10.0 in /examples/research_projects/decision_transformer ( #21577 )
...
Bump ipython in /examples/research_projects/decision_transformer
Bumps [ipython](https://github.com/ipython/ipython ) from 8.1.1 to 8.10.0.
- [Release notes](https://github.com/ipython/ipython/releases )
- [Commits](https://github.com/ipython/ipython/compare/8.1.1...8.10.0 )
---
updated-dependencies:
- dependency-name: ipython
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-13 10:21:50 -05:00
steventk-g
c88b11c591
Add _mp_fn to run_mae.py for XLA testing ( #21551 )
...
Update run_mae.py
2023-02-10 09:53:55 -05:00
lee1jun
b31cee6727
fix typo in run_speech_recognition_ctc.py ( #21528 )
...
Update run_speech_recognition_ctc.py
There should be a `# limitations under the License` line at the end of the documentation section.
2023-02-09 09:46:40 -05:00
Stefan Schweter
d3046dad80
[Doc] Minor URL fixes in PyTorch Text Classification Readme ( #21511 )
...
docs: fix some references in PyTorch text classification readme
2023-02-08 09:39:52 -05:00
dependabot[bot]
e024cd715e
Bump cryptography from 36.0.2 to 39.0.1 in /examples/research_projects/decision_transformer ( #21507 )
...
Bump cryptography in /examples/research_projects/decision_transformer
Bumps [cryptography](https://github.com/pyca/cryptography ) from 36.0.2 to 39.0.1.
- [Release notes](https://github.com/pyca/cryptography/releases )
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst )
- [Commits](https://github.com/pyca/cryptography/compare/36.0.2...39.0.1 )
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-08 09:25:06 -05:00
Sylvain Gugger
67d074874d
Cleanup quality ( #21493 )
...
* Remove mentions of flake8/isort
* Clean up inits
* Deal with all other inits
* Last special rule for dummy files
2023-02-07 12:27:31 -05:00
Jeroen Van Der Donckt
bbe98ea9c3
🖊️ fix typo in pytorch semantic segmentation readme ( #21492 )
2023-02-07 09:39:24 -05:00
dependabot[bot]
35f93f299f
Bump oauthlib from 3.2.1 to 3.2.2 in /examples/research_projects/decision_transformer ( #21481 )
...
Bump oauthlib in /examples/research_projects/decision_transformer
Bumps [oauthlib](https://github.com/oauthlib/oauthlib ) from 3.2.1 to 3.2.2.
- [Release notes](https://github.com/oauthlib/oauthlib/releases )
- [Changelog](https://github.com/oauthlib/oauthlib/blob/master/CHANGELOG.rst )
- [Commits](https://github.com/oauthlib/oauthlib/compare/v3.2.1...v3.2.2 )
---
updated-dependencies:
- dependency-name: oauthlib
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-06 18:27:14 -05:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting ( #21480 )
...
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
Stas Bekman
3b9a1dc132
[examples] improve block_size warning message ( #21463 )
2023-02-06 08:36:12 -08:00
Kaustubh Dhole
182afb7dc6
Fixed RAG script which was failing on dummy example ( #21416 )
...
* do not use prefix="val" for test
The dummy example fails when test_epoch_end is called. The prefix="test" should be dynamic in the log metrics too.
* Create test.source
* Create test.target
2023-02-06 09:27:34 -05:00
Erwann Millon
ea55bd86b9
Add VQGAN-CLIP research project ( #21329 )
...
* Add VQGAN-CLIP research project
* fixed style issues
* Update examples/research_projects/vqgan-clip/README.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/requirements.txt
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/README.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/loaders.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* replace CLIPProcessor with tokenizer, change asserts to exceptions
* rm unused import
* remove large files (jupyter notebook linked in readme, imgs migrated to hf dataset)
* add tokenizers dependency
* Remove comment
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* rm model checkpoints
---------
Co-authored-by: Erwann Millon <erwann@Erwanns-MacBook-Air.local>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-02-02 14:45:35 -05:00
amyeroberts
e5db7051a8
Add TF image classification example script ( #19956 )
...
* TF image classification script
* Update requirements
* Fix up
* Add tests
* Update test fetcher
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix directory path
* Adding `zero-shot-object-detection` pipeline doctest. (#20274 )
* Adding `zero-shot-object-detection` pipeline doctest.
* Remove nested_simplify.
* Add generate kwargs to `AutomaticSpeechRecognitionPipeline` (#20952 )
* Add generate kwargs to AutomaticSpeechRecognitionPipeline
* Add test for generation kwargs
* Trigger CI
* Data collator returns np
* Update feature extractor -> image processor
* Bug fixes - updates to reflect changes in API
* Update flags to match PT & run faster
* Update instructions - Maria's comment
* Update examples/tensorflow/image-classification/README.md
* Remove slow decorator
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2023-02-01 19:09:36 +00:00
Quentin Lhoest
074d6b75fd
Simplify column_names in run_clm/mlm ( #21382 )
...
* simplify column_names in run_clm
* simplify column_names in run_mlm
* minor
2023-01-31 15:23:47 +01:00
Stas Bekman
98d88b23f5
[run_(clm|mlm).py examples] add streaming dataset support ( #21343 )
...
* [run_clm example] add streaming dataset support
* unrefactor kwargs
* fix
* fix
* require datasets>=2.0.0
* port to mlm
2023-01-30 14:01:35 -08:00
dependabot[bot]
36b668fa06
Bump onnx from 1.11.0 to 1.13.0 in /examples/research_projects/decision_transformer ( #21331 )
...
Bump onnx in /examples/research_projects/decision_transformer
Bumps [onnx](https://github.com/onnx/onnx ) from 1.11.0 to 1.13.0.
- [Release notes](https://github.com/onnx/onnx/releases )
- [Changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog.md )
- [Commits](https://github.com/onnx/onnx/compare/v1.11.0...v1.13.0 )
---
updated-dependencies:
- dependency-name: onnx
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-27 10:13:13 -05:00