Lysandre Debut
245cdb469d
Fix barthez tokenizer ( #9562 )
2021-01-13 06:24:10 -05:00
Julien Chaumond
247a7b2029
Doc: Update pretrained_models wording ( #9545 )
...
* Update pretrained_models.rst
To clarify things, cf. for instance this tweet: https://twitter.com/RTomMcCoy/status/1349094111505211395
* format
2021-01-13 05:58:05 -05:00
Suraj Patil
69ed36063a
fix BlenderbotSmallTokenizer ( #9538 )
...
* add model_input_names
* fix test
2021-01-13 10:53:43 +05:30
Stas Bekman
2df34f4aba
[trainer] deepspeed integration ( #9211 )
...
* deepspeed integration
* style
* add test
* ds wants to do its own backward
* fp16 assert
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* for clarity extract what args are being passed to deepspeed
* introduce the concept of self.wrapped_model
* s/self.wrapped_model/self.model_wrapped/
* complete transition to self.wrapped_model / self.model
* fix
* doc
* give ds its own init
* add custom overrides, handle bs correctly
* fix test
* clean up model_init logic, fix small bug
* complete fix
* collapse --deepspeed_config into --deepspeed
* style
* start adding doc notes
* style
* implement hf2ds optimizer and scheduler configuration remapping
* oops
* call get_num_training_steps absolutely when needed
* workaround broken auto-formatter
* deepspeed_config arg is no longer needed - fixed in deepspeed master
* use hf's fp16 args in config
* clean
* start on the docs
* rebase cleanup
* finish up --fp16
* clarify the supported stages
* big refactor thanks to discovering deepspeed.init_distributed
* cleanup
* revert fp16 part
* add checkpoint-support
* move ds init into integrations
* extend docs
* cleanup
* unfix docs
* clean up old code
* imports
* move docs
* fix logic
* make it clear which file it's referring to
* document nodes/gpus
* style
* wrong format
* style
* deepspeed handles gradient clipping
* easier to read
* major doc rewrite
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docs
* switch to AdamW optimizer
* style
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00
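The PR above collapses the separate config flag into a single `--deepspeed` flag pointing at a DeepSpeed JSON config file. A minimal sketch of writing such a config (keys follow DeepSpeed's documented schema; the values are illustrative assumptions, not the PR's defaults):

```python
import json

# Illustrative DeepSpeed config: fp16 training with ZeRO stage 2 and AdamW,
# matching the commit notes about fp16 args and the AdamW optimizer switch.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 3e-5}},
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# The resulting file is what --deepspeed ds_config.json would consume.
print(json.load(open("ds_config.json"))["optimizer"]["type"])  # AdamW
```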
Sylvain Gugger
5f6721032a
Use the right version of tokenizers ( #9550 )
...
* Use the right version of tokenizers
* Try another way
* Try another way
* Deps are installed from there...
* Deps are installed from there...
* Revert last
* remove needless comment
2021-01-12 18:55:45 -05:00
Sylvain Gugger
063d8d27f4
Refactor prepare_seq2seq_batch ( #9524 )
...
* Add target contextmanager and rework prepare_seq2seq_batch
* Fix tests, treat BART and Barthez
* Add last tokenizers
* Fix test
* Set src token before calling the superclass
* Remove special behavior for T5
* Remove needless imports
* Remove needless asserts
2021-01-12 18:19:38 -05:00
Sylvain Gugger
e6ecef711e
Revert, it was not the issue.
2021-01-12 18:00:22 -05:00
Sylvain Gugger
250f27f207
Fix tokenizers install for now
2021-01-12 17:50:27 -05:00
Lysandre Debut
dfbf0f5598
topk -> top_k ( #9541 )
2021-01-12 16:21:29 -05:00
Lysandre Debut
a1100fac67
LayoutLM Config ( #9539 )
2021-01-12 10:03:50 -05:00
NielsRogge
e45eba3b1c
Improve LayoutLM ( #9476 )
...
* Add LayoutLMForSequenceClassification and integration tests
Improve docs
Add LayoutLM notebook to list of community notebooks
* Make style & quality
* Address comments by @sgugger, @patrickvonplaten and @LysandreJik
* Fix rebase with master
* Reformat in one line
* Improve code examples as requested by @patrickvonplaten
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
Suraj Patil
ccd1923f46
[T5] enable T5 fp16 ( #9487 )
...
* fix t5 fp16
2021-01-12 17:12:33 +05:30
Patrick von Platen
2aa9c2f204
fix blenderbot tok ( #9532 )
2021-01-12 05:53:32 -05:00
Lysandre Debut
406cbf58b2
Shouldn't stale issues/PRs with feature request label ( #9511 )
2021-01-12 04:49:15 -05:00
Simon Brandeis
3b67c5abb0
Update 'Develop on Windows' guidelines ( #9519 )
2021-01-12 04:15:16 -05:00
Patrick von Platen
a051d8928a
[ProphetNet] Fix naming and wrong config ( #9514 )
...
* fix naming issues
* better names
2021-01-12 04:10:05 -05:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart ( #9497 )
...
* make templates ready
* make add_new_model_command_ready
* finish tf bart
* prepare tf mbart
* finish tf bart
* add tf mbart
* add marian
* prep pegasus
* add tf pegasus
* push blenderbot tf
* add blenderbot
* add blenderbot small
* clean-up
* make fix copy
* define blend bot tok
* fix
* up
* make style
* add to docs
* add copy statements
* overwrite changes
* improve
* fix docs
* finish
* fix last slow test
* fix missing git conflict line
* fix blenderbot
* up
* fix blenderbot small
* load changes
* finish copied from
* upload fix
2021-01-12 02:06:32 +01:00
Stas Bekman
0ecbb69806
[make docs] parallel build ( #9522 )
...
After experimenting with different numbers of workers (https://github.com/huggingface/transformers/issues/9496#issuecomment-758145868), 4-5 workers seem to be optimal; let's go with 4, as we are unlikely to find a CPU with fewer cores these days.
Fixes part of https://github.com/huggingface/transformers/issues/9496
@sgugger
2021-01-11 13:00:08 -08:00
Stas Bekman
e6f211cade
[trainer] round numbers in trainer state ( #9491 )
...
* round numbers
* style
* round only on logging
2021-01-11 10:17:49 -08:00
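A minimal stdlib sketch of the "round only on logging" idea from the commit above (the keys and precision are illustrative, not the Trainer's actual fields):

```python
# Keep full-precision values in the trainer state; round only when
# producing the log entry, so accumulated values stay exact.
state = {"epoch": 1.3333333333333333, "loss": 0.5234567}

def to_log(state, ndigits=4):
    # Rounding happens here, at logging time, not in the stored state.
    return {k: round(v, ndigits) for k, v in state.items()}

print(to_log(state))  # {'epoch': 1.3333, 'loss': 0.5235}
assert state["epoch"] == 1.3333333333333333  # stored state untouched
```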
Sylvain Gugger
01a1684078
Make doc styler behave properly on Windows ( #9516 )
2021-01-11 10:25:24 -05:00
Sylvain Gugger
6009668c63
Add link to forums thread
2021-01-11 10:00:59 -05:00
Julien Plu
ba702966ba
Fix cardinality ( #9505 )
2021-01-11 09:42:19 -05:00
Stas Bekman
33b7422839
[trainer] remove --model_parallel ( #9451 )
...
* fix bad merge - dropped code
* remove --model_parallel
* Deal with TrainingArguments
* Use a private attr and fix batch sizes
* fix _n_gpu
* add is_parallel helper wrapper
* fix attribute
* introduce a new attribute is_model_parallel
* docs
* docs
* Put back init False and rearrange doc
* Ignore non-init args in HFArgumentParser
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-01-11 09:39:28 -05:00
Stas Bekman
6f63501383
[doc] How To Request Support document stub ( #9288 )
...
* How To Request Support document stub
* integrate suggestions
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* small corrections
* expand on how to search for issues with examples
* address issues
* Update ISSUES.md
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* patrick's suggestion
* patrick's suggestion
* small fix
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-11 09:23:51 -05:00
Nicolas Patry
d20e9c7299
Enable TruncationStrategy override for pipelines ( #9432 )
...
* Enable TruncationStrategy override for pipelines
* Update isort.
* Fixing test
* Fixing text_generation pipeline.
* Using same DummyTok as other PR for easier merge later.
* Some more import guards.
* Remove bogus file.
* Do not pass `generate_kwargs` to `_parse_and_tokenize`.
@patrickvonplaten
* Removed DummyTok.
* Doc quality.
2021-01-11 09:23:28 -05:00
Sylvain Gugger
8d25df2c7a
Make doc styler detect lists on rst ( #9488 )
2021-01-11 08:53:41 -05:00
Aakash Tripathi
5a442a8db1
New Updated DistilGPT-2 Finetuning and Generation ( #9494 )
...
https://github.com/huggingface/transformers/pull/3177
2021-01-11 14:34:39 +01:00
Patrick von Platen
6c8ec2a931
fix tf led pt test ( #9513 )
2021-01-11 14:14:48 +01:00
Julien Plu
1e3c362235
Fix template ( #9512 )
2021-01-11 08:03:28 -05:00
Lysandre Debut
d415882b41
Remove tolerance + drop_rows_to_fit by default ( #9507 )
...
* Remove tolerance + drop_rows_to_fit by default
* remove drop_rows_to_fit
2021-01-11 08:02:41 -05:00
Julien Plu
1243ee7d0c
Full rework of the TF input/output embeddings and bias resizing ( #9193 )
...
* Start rework resizing
* Rework bias/decoder resizing
* Full resizing rework
* Full resizing rework
* Start to update the models with the new approach
* Finish to update the models
* Update all the tests
* Update the template
* Fix tests
* Fix tests
* Test a new approach
* Refactoring
* Refactoring
* Refactoring
* New rework
* Rework BART
* Rework bert+blenderbot
* Rework CTRL
* Rework Distilbert
* Rework DPR
* Rework Electra
* Rework Flaubert
* Rework Funnel
* Rework GPT2
* Rework Longformer
* Rework Lxmert
* Rework marian+mbart
* Rework mobilebert
* Rework mpnet
* Rework openai
* Rework pegasus
* Rework Roberta
* Rework T5
* Rework xlm+xlnet
* Rework template
* Fix TFT5EncoderOnly + DPRs
* Restore previous methods
* Fix Funnel
* Fix CTRL and TransfoXL
* Apply style
* Apply Sylvain's comments
* Restore a test in DPR
* Address the comments
* Fix bug
* Apply style
* remove unused import
* Fix test
* Forgot a method
* missing test
* Trigger CI
* naming update
* Rebase
* Trigger CI
2021-01-11 06:27:28 -05:00
Julien Plu
cf416764f4
Fix template ( #9504 )
2021-01-11 05:21:25 -05:00
Richard Liaw
09926c8e86
fix-template ( #9499 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-10 20:34:17 -05:00
Julien Plu
4f7022d68d
Reformat ( #9482 )
2021-01-10 15:10:15 +01:00
Nicolas Patry
96f1f74aaf
Fixing tests. It seems master changed something in the warnings. ( #9483 )
...
Trying to keep warning tests for now. Should be discarded if it becomes too hard to maintain.
2021-01-10 15:08:20 +01:00
Boris Dayma
1c19b423bf
fix(wandb): fix config ( #9489 )
2021-01-08 14:32:02 -05:00
Nicolas Patry
02e05fb0a5
Making Conversation possible to create directly a full conversation ( #9434 )
...
* Cleaning up conversation tests.
* Adding tests that don't require downloading models + conversation can be fully created from static state.
* Making tests non flaky (by fixing generation length)
* Bumping isort version.
* Doc cleanup.
* Remove unused test in this PR.
* Torch import guard for TF.
* Missing torch guard.
* Small mistake in doc.
* Actually uses `_history` and `_index` cache.
+ remove dead enumerate
+ improve warning message.
* Update src/transformers/pipelines/conversational.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/pipelines/conversational.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/pipelines/conversational.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Adding comments and cleaner code to address history copy.
* Improving pipeline name in tests.
* Change tokenizer to a real one (still created at runtime with no external dependency)
* Simplify DummyTok, reverse changes on tokenization.
* Removing DummyTok.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-08 14:33:25 +01:00
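A stdlib-only sketch of the idea in the commit above — constructing a conversation whole from prior turns rather than building it up incrementally. The class below is a hypothetical stand-in, not transformers' actual `Conversation` API:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Conversation:
    # Hypothetical stand-in for the pipeline's conversation state.
    past_user_inputs: List[str] = field(default_factory=list)
    generated_responses: List[str] = field(default_factory=list)
    new_user_input: Optional[str] = None

    def add_user_input(self, text: str) -> None:
        self.new_user_input = text

# Fully created from static state in one call, no incremental build-up:
conv = Conversation(
    past_user_inputs=["Hi there!"],
    generated_responses=["Hello! How can I help?"],
)
conv.add_user_input("Tell me a joke.")
print(len(conv.past_user_inputs), conv.new_user_input)  # 1 Tell me a joke.
```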
Julien Plu
4fbcf8ea49
Fix TF input for np.ndarray ( #9294 )
...
* Fix input for np.ndarray
* add a test
* add a test
* Add a test
* Apply style
* Fix test
2021-01-08 08:23:29 -05:00
Thomas Tanon
e34e45536f
Makes HfArgumentParser compatible with Python 3.9 ( #9479 )
...
Python 3.9 changed the format of the string serialization of `typing.Optional`. For example, `str(typing.Optional[str])` is `typing.Union[str, NoneType]` in Python 3.8 and `typing.Optional[str]` in Python 3.9.
2021-01-08 08:10:44 -05:00
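The version difference described above can be handled by accepting both renderings; a minimal sketch (the helper name is hypothetical, not HfArgumentParser's actual code):

```python
import typing

def is_optional_repr(rendered: str, inner: str = "str") -> bool:
    # Hypothetical helper: recognize Optional[inner] from its string form,
    # which Python 3.9 changed (both accepted spellings below).
    return rendered in (
        f"typing.Union[{inner}, NoneType]",  # Python <= 3.8
        f"typing.Optional[{inner}]",         # Python >= 3.9
    )

rendered = str(typing.Optional[str])
print(is_optional_repr(rendered))  # True on both 3.8 and 3.9
```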
Sylvain Gugger
1bdf42409c
Fast imports part 3 ( #9474 )
...
* New intermediate inits
* Update template
* Avoid importing torch/tf/flax in tokenization unless necessary
* Styling
* Shutup flake8
* Better python version check
2021-01-08 07:40:59 -05:00
Patrick von Platen
79bbcc5260
[Generation] Fix bug for manual decoder_input_ids + warning message ( #9472 )
...
* up
* improve style
2021-01-08 05:50:39 -05:00
Patrick von Platen
9e1ea846bc
[README] Add new models ( #9465 )
...
* add new models
* make fix-copies
2021-01-08 05:49:43 -05:00
Nicolas Patry
bf9056442a
Removing duplicated code for Translation,Summarization and Text2TextGeneration pipelines ( #9433 )
...
* Merging all duplicated code for Text2TextPipeline while preserving backward compat.
* Fixing TranslationPipeline Hierarchy + return_name
* torch import guard.
* Update isort version.
* Remove code from other PR disentanglement.
* Removed named example to something more agnostic.
2021-01-07 23:10:16 +01:00
Patrick von Platen
f33a6f3446
[TFGPT2] - Fix flaky past_key_values test ( #9460 )
...
* fix tf flaky
* remove test files
2021-01-07 16:12:08 +01:00
Sylvain Gugger
758ed3332b
Transformers fast import part 2 ( #9446 )
...
* Main init work
* Add version
* Change from absolute to relative imports
* Fix imports
* One more typo
* More typos
* Styling
* Make quality script pass
* Add necessary replace in template
* Fix typos
* Spaces are ignored in replace for some reason
* Forgot one models.
* Fixes for import
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Add documentation
* Styling
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2021-01-07 09:36:14 -05:00
Patrick von Platen
a400fe8931
[LED Test] fix common inputs pt for flaky pt-tf led test ( #9459 )
...
* fix common inputs pt flaky led
* fix other tests correspondingly
2021-01-07 12:29:03 +01:00
Patrick von Platen
ae5a32bb0d
up ( #9454 )
2021-01-07 11:51:02 +01:00
Julien Plu
812045adcc
New serving ( #9419 )
...
* Add a serving method
* Add albert
* Add serving for BERT and BART
* Add more models
* Finish the serving addition
* Temp fix
* Restore DPR
* Fix funnel attribute
* Fix attributes GPT2
* Fix OpenAIGPT attribute
* Fix T5 attributes
* Fix Bart attributes
* Fix TransfoXL attributes
* Add versioning
* better test
* Update template
* Fix Flaubert
* Fix T5
* Apply style
* Remove unused imports
* Deactivate extra parameters
* Remove too long test + saved_model default to False
* Ignore the saved model test for some models
* Fix some inputs
* Fix mpnet serving
* Trigger CI
* Address all comments
2021-01-07 11:48:49 +01:00
guillaume-be
390cf16bc8
Prophetnet optimization ( #9453 )
...
* Vectorized `ngram_attention_bias` calculation
* updated formatting with black
* Further optimization
* one (last) optimization
2021-01-07 11:41:58 +01:00
Stas Bekman
28d74872cc
a more reliable version of branching point discovery ( #9449 )
2021-01-07 04:47:50 -05:00