Martina Fumanelli
07575e869d
Italian/accelerate ( #17698 )
...
* Add 'accelerate' to _toctree file
* Fix 'training with a nb' title
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-21 14:23:47 +02:00
Martina Fumanelli
8881e58b22
Italian/model sharing ( #17828 )
...
* Add Italian translation of the doc file model_sharing.mdx
* Fix style
* Fix typo
* Update docs/source/it/_toctree.yml
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-21 14:07:53 +02:00
Lorenzo Balzani
0d971be84f
Italian translation of run_scripts.mdx gh-17459 ( #17642 )
...
* Run_scripts Italian translation gh-17459
* Updated run_scripts gh-17642
* Updated run_scripts gh-17642
Made the text more gender-neutral.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-21 12:02:08 +02:00
Sylvain Gugger
ba552dd027
Make errors for loss-less models more user-friendly ( #18233 )
2022-07-21 11:52:33 +02:00
Sylvain Gugger
43a5375cc1
Fix TrainingArguments help section ( #18232 )
2022-07-21 11:03:25 +02:00
Nicola Procopio
9f787ce874
Translation/debugging ( #18230 )
...
* added debugging.mdx
* updated debugging.mdx
* updated translation
* updated translation debugging
* translated debugging
* updated _toctree.yml
2022-07-21 11:02:26 +02:00
Sebastian Sosa
5e2f2d7dd2
Better messaging and fix for incorrect shape when collating data. ( #18119 )
...
* More informative error message
* raise dynamic error
* remove_excess_nesting application
* incorrect shape assertion for collator & function to remove excess nesting from DatasetDict
* formatting
* eliminating datasets import
* removed and relocated remove_excess_nesting to the datasets library and updated docs accordingly
* independent assert instructions
* inform user of excess nesting
2022-07-21 10:35:41 +02:00
Victor Zhu
d23cf5b1f1
Add support for Sagemaker Model Parallel >= 1.10 new checkpoint API ( #18221 )
...
* Add support for Sagemaker Model Parallel >= 1.10 new checkpoint API
* Support loading checkpoints saved with SMP < 1.10 in SMP < 1.10 and SMP >= 1.10
* Support loading checkpoints saved with SMP >= 1.10 in SMP >= 1.10
* Fix bug and styling
* Update based on reviewer feedback
2022-07-21 07:56:20 +02:00
Zhi Zheng
dbfeffd7c9
Update add_new_pipeline.mdx ( #18224 )
...
fix typo
2022-07-21 07:55:30 +02:00
Steven Liu
ff56b8fbff
Add custom config to quicktour ( #18115 )
...
* 📝 first draft of new quicktour
* make style
* 🖍 edit and review
* 🖍 small fixes
* 🖍 only add custom config section
* 🖍 use autoclass instead
2022-07-20 12:23:03 -05:00
Yih-Dar
9edff45362
skip some test_multi_gpu_data_parallel_forward ( #18188 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-20 15:54:44 +02:00
Yih-Dar
bc6fe6fbcf
Change to FlavaProcessor in PROCESSOR_MAPPING_NAMES ( #18213 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-20 12:30:14 +02:00
Raghavan
dcec4c4387
Adding OPTForSeqClassification class ( #18123 )
...
* Adding OPTForSeqClassification class
* Fix import issues
* Add documentation for optforseqclassification
* Remove checkout
* fix failing tests
* fix typo
* Fix code formatting
* Incorporating the PR feedbacks
* Incorporate PR Feedbacks
* Fix failing test and add new test for multi label setup
* Fix formatting issue
* Fix failing tests
* Fix formatting issues
* Fix failing tests
* Fix failing tests
* Fix failing tests
* Fix failing tests
* PR feedback
2022-07-20 10:14:21 +02:00
Li-Huai (Allan) Lin
0ed4d0dfb6
Fix LayoutXLM docstrings ( #17038 )
...
* Fix docstrings
* Fix legacy issue
* up
* apply suggestions
* up
* quality
2022-07-20 09:49:57 +02:00
Yih-Dar
4b1ed7979f
update cache to v0.5 ( #18203 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-20 08:14:10 +02:00
Matt
8a61fe0234
Reduce console spam when using the KerasMetricCallback ( #18202 )
...
* Reduce console spam when using the KerasMetricCallback
* Switch to predict_on_batch to improve performance
2022-07-19 17:00:35 +01:00
Joao Gante
ec6cd7633f
TF: Add missing cast to GPT-J ( #18201 )
...
* Fix TF GPT-J tests
* add try/finally block
2022-07-19 15:58:42 +01:00
Yih-Dar
05ed569c79
Use next-gen CircleCI convenience images ( #18197 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-19 15:43:05 +02:00
flozi00
9f12ec7d87
Typo in readme ( #18195 )
2022-07-19 15:28:37 +02:00
Sylvain Gugger
dc9147ff36
Custom pipeline ( #18079 )
...
* Initial work
* More work
* Add tests for custom pipelines on the Hub
* Protect import
* Make the test work for TF as well
* Last PyTorch specific bit
* Add documentation
* Style
* Title in toc
* Bad names!
* Update docs/source/en/add_new_pipeline.mdx
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
* Auto stash before merge of "custom_pipeline" and "origin/custom_pipeline"
* Address review comments
* Address more review comments
* Update src/transformers/pipelines/__init__.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-07-19 12:02:35 +02:00
Patrick von Platen
3bb6356d4d
[From pretrained] Allow download from subfolder inside model repo ( #18184 )
...
* add first generation tutorial
* [from_pretrained] Allow loading models from subfolders
* remove gen file
* add doc strings
* allow download from subfolder
* add tests
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply comments
* correct doc string
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-19 11:53:53 +02:00
Snehan Kekre
ce0152819d
Update docs README with instructions on locally previewing docs ( #18196 )
...
* Update docs README with instructions on locally previewing docs
* Add instructions to install `watchdog` before previewing the docs
2022-07-19 11:47:26 +02:00
orgoro
798384467b
bugfix: div-->dim ( #18135 )
2022-07-19 10:24:56 +02:00
Sylvain Gugger
e630dad555
Add vision example to README ( #18194 )
2022-07-19 09:46:18 +02:00
Duong A. Nguyen
4bea6584e3
Remove use_auth_token from the from_config method ( #18192 )
...
* remove use_auth_token from from_config
* restore use_auth_token from_pretrained run_t5_mlm_flax
2022-07-19 08:13:20 +02:00
Sylvain Gugger
29fd471556
Use smaller variant of BLOOM for doc to fix tests
2022-07-18 15:17:29 -04:00
Sourab Mangrulkar
bc8e30bab9
FSDP integration enhancements and fixes ( #18134 )
...
* FSDP integration enhancements and fixes
* resolving comments
* fsdp fp16 mixed precision requires `ShardedGradScaler`
2022-07-19 00:02:10 +05:30
Nicola Procopio
8e445ca51d
Translation/training: italian translation training.mdx ( #17662 )
...
* added training.mdx
* updated training.mdx
* updated training.mdx
* updated training.mdx
* updated _toctree.yml
* fixed typos after review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-18 19:21:07 +02:00
Younes Belkada
6a1b1bf7a6
BLOOM minor fixes small test ( #18175 )
...
* minor fixes
- add correct revision
- corrected docstring for test
- removed a test
* contrib credits
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
2022-07-18 19:18:19 +02:00
Nicola Procopio
c4cc894086
Translation italian: multilingual.mdx ( #17768 )
...
* added multilingual.mdx
* updated multilingual.mdx
* italian translation multilingual.mdx
* updated _toctree.yml
* fixed typos _toctree.yml
* fixed typos after review
* fixed error after review
2022-07-18 19:09:08 +02:00
Nicola Procopio
0a5b61d004
Added preprocessing.mdx italian translation ( #17600 )
...
* updated _toctree.yml
* added preprocessing
* updated preprocessing.mdx
* updated preprocessing.mdx
updated after review
2022-07-18 19:06:10 +02:00
SaulLu
ced1f1f5db
fix typo inside bloom documentation ( #18187 )
2022-07-18 17:43:52 +02:00
Sylvain Gugger
edadfc58af
Better default for offload_state_dict in from_pretrained ( #18183 )
2022-07-18 16:02:41 +02:00
Sylvain Gugger
aeeab1ffd0
Fix template for new models in README ( #18182 )
2022-07-18 16:01:51 +02:00
Ayan Sengupta
45255814a2
FIX: Typo ( #18156 )
2022-07-18 15:46:08 +02:00
Yih-Dar
6561fbcc6e
Update TF(Vision)EncoderDecoderModel PT/TF equivalence tests ( #18073 )
...
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:29:14 +02:00
Yih-Dar
cb19c2afdc
Fix expected loss values in some (m)T5 tests ( #18177 )
...
* fix expected loss values
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-18 15:26:21 +02:00
Wang, Yi
7417f3acb7
[HPO] update to sigopt new experiment api ( #18147 )
...
* [HPO] update to sigopt new experiment api
* follow https://docs.sigopt.com/experiments
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* [HPO] use new API if sigopt version >= 8.0.0
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-07-18 15:19:40 +02:00
gcheron
8c14b342aa
add ONNX support for LeVit ( #18154 )
...
Co-authored-by: Guilhem Chéron <guilhemc@authentifier.com>
2022-07-18 15:17:07 +02:00
Lysandre Debut
c1c79b0655
NLLB tokenizer ( #18126 )
...
* NLLB tokenizer
* Apply suggestions from code review - Thanks Stefan!
Co-authored-by: Stefan Schweter <stefan@schweter.it>
* Final touches
* Style :)
* Update docs/source/en/model_doc/nllb.mdx
Co-authored-by: Stefan Schweter <stefan@schweter.it>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* PR reviews
* Auto models
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-18 08:12:34 -04:00
John Giorgi
a4f97e6ce0
Fix incorrect type hint for lang ( #18161 )
2022-07-18 09:53:18 +02:00
John Giorgi
c46d39f390
Fix check for falsey inputs in run_summarization ( #18155 )
2022-07-18 09:50:32 +02:00
Nicolas Patry
ccc0897804
Adding support for `device_map` directly in `pipeline(..)` function. ( #17902 )
...
* Adding support for `device_map` directly in `pipeline(..)` function.
* Updating the docstring.
* Adding a better docstring
* Put back type hints.
* Blacked. (`make fixup` didn't work ??!!)
2022-07-15 15:54:26 +02:00
Nicolas Patry
fca66ec4ef
Fixing a hard to trigger bug for text-generation pipeline. ( #18131 )
...
* Fixing a bug where attention mask was not passed to generate.
* Fixing zero-size prompts.
* Comment on top.
2022-07-15 15:54:07 +02:00
amyeroberts
8581a798c0
Add TF DeiT implementation ( #17806 )
...
* Initial TF DeiT implementation
* Fix copies naming issues
* Fix up + docs
* Properly same main layer
* Name layers properly
* Fixup
* Fix import
* Fix import
* Fix import
* Fix weight loading for tests whilst not on hub
* Add doc tests and remove to_2tuple
* Add back to_2tuple
Removing to_2tuple results in many downstream changes needed because of the copies checks
* Incorporate updates in Improve vision models #17731 PR
* Don't hard code num_channels
* Copy PyTorch DeiT embeddings and remove pytorch operations with mask
* Fix patch embeddings & tidy up
* Update PixelShuffle to move logic into class layer
* Update doc strings - remove PT references
* Use NHWC format in internal layers
* Fix up
* Use linear activation layer
* Remove unused import
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Move dataclass to top of file
* Remove from_pt now weights on hub
* Fixup
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-13 18:04:08 +01:00
Wei
7ea6ccc2b3
Enable torchdynamo with torch_tensorrt(fx path) ( #17765 )
...
* enable fx2trt
* Update perf_train_gpu_one.mdx
* Update perf_train_gpu_one.mdx
* add lib check
* update
* format
* update
* fix import check
* fix isort
* improve doc
* refactor ctx manager
* fix isort
* black format
* isort fix
* fix format
* update args
* update black
* cleanups
* Update perf_train_gpu_one.mdx
* code refactor
* code refactor to init
* remove redundancy
* isort
* replace self.args with args
Co-authored-by: Stas Bekman <stas@stason.org>
2022-07-13 12:43:28 -04:00
Sylvain Gugger
37aeb5787a
Make sharded checkpoints work in offline mode ( #18125 )
...
* Make sharded checkpoints work in offline mode
* Add test
2022-07-13 12:43:08 -04:00
Sylvain Gugger
0a21a48564
Revert "Make sharded checkpoints work in offline mode"
...
This reverts commit 3564c65786.
2022-07-13 10:53:25 -04:00
Sylvain Gugger
3564c65786
Make sharded checkpoints work in offline mode
2022-07-13 10:51:56 -04:00
lmagne
56e6487c40
add dataset split and config to model-index in TrainingSummary.from_trainer ( #18064 )
...
* added metadata to training summary
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-13 16:07:20 +02:00