Lysandre Debut
5dcc08f1df
Fix S2T example ( #10741 )
2021-03-16 08:55:07 -04:00
Théo Matussière
6f840990a7
split seq2seq script into summarization & translation ( #10611 )
...
* split seq2seq script, update docs
* needless diff
* fix readme
* remove test diff
* s/summarization/translation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* cr
* fix arguments & better mbart/t5 refs
* copyright
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* reword readme
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* s/summarization/translation
* short script names
* fix tests
* fix isort, include mbart doc
* delete old script, update tests
* automate source prefix
* automate source prefix for translation
* s/translation/trans
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* fix script name (short version)
* typos
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* exact parameter
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* remove superfluous source_prefix calls in docs
* rename scripts & warn for source prefix
* black
* flake8
Co-authored-by: theo <theo@matussie.re>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-15 09:11:42 -04:00
Stas Bekman
4c32f9f26e
AdamW is now supported by default ( #9624 )
2021-03-12 13:40:07 -08:00
Sylvain Gugger
e8246f78f9
Add auto_wrap option in fairscale integration ( #10673 )
...
* Add auto_wrap option in fairscale integration
* Style
2021-03-12 07:50:20 -05:00
Nicolas Patry
543d0549f8
Adding new parameter to generate
: max_time
. ( #9846 )
...
* [WIP] Adding new parameter to `generate`: `max_time`.
Generation by tokens number is sometimes a bit clunky because we don't
know how many tokens are good enough or even how many tokens are in
the payload (for pipelines users for instance). This leads to hard
to understand behavior.
This PR proposes a new argument `max_time` which is a float of seconds
for the allowed time for `generate` to run on.
Ideally combinations of `max_tokens=None`, `max_time=2` could be used to
generate as many tokens as possible within time budget.
NB: Another possible approach consists of passing a callback to `generate`
putting the caller in charge of the actual decision of when to stop
generating tokens. It opens the door to 'which args should we pass'
to this callback. It's hard to imagine other use-cases for this
early stopping behavior than time (that are not already covered by
parameters of generate)
* Revamp with StoppingCriteria
* Removing deprecated mentions.
* Forgot arguments to stopping criteria.
* Readding max_length it's not just used as a stopping criteria.
* Default value for `stopping_criteria`.
* Address @patrickvonplaten comments.
- More docstrings
- Actual doc
- Include in global namespace
- Remove TF work.
* Put back `max_length` (deprecation different PR).
* Doc quality.
* Fixing old behavior without `stopping_criteria` but with `max_length`.
Making sure we don't break that in the future.
* Adding more tests for possible inconsistencies between
`max_length` and `stopping_criteria`.
* Fixing the torch imports.
2021-03-12 10:11:50 +01:00
WybeKoper
2f8485199c
Fix broken link ( #10656 )
...
* Fixed broken link
* fixed max length violation
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-11 14:29:02 -05:00
Suraj Patil
055ed78f52
[S2T] fix example in docs ( #10667 )
2021-03-11 22:43:37 +05:30
Patrick von Platen
602d63f05c
[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models ( #10648 )
...
* add conversion script
* add wav2vec2 xslr models
* finish
* Update docs/source/model_doc/xlsr_wav2vec2.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-11 17:44:18 +03:00
Sylvain Gugger
26a33cfd8c
Document Trainer limitation on custom models ( #10635 )
2021-03-10 14:58:22 -05:00
Suraj Patil
d26b37e744
Speech2TextTransformer ( #10175 )
...
* s2t
* fix config
* conversion script
* fix import
* add tokenizer
* fix tok init
* fix tokenizer
* first version working
* fix embeds
* fix lm head
* remove extra heads
* fix convert script
* handle encoder attn mask
* style
* better enc attn mask
* override _prepare_attention_mask_for_generation
* handle attn_maks in encoder and decoder
* input_ids => input_features
* enable use_cache
* remove old code
* expand embeddings if needed
* remove logits bias
* masked_lm_loss => loss
* hack tokenizer to support feature processing
* fix model_input_names
* style
* fix error message
* doc
* remove inputs_embeds
* remove input_embeds
* remove unnecessary docstring
* quality
* SpeechToText => Speech2Text
* style
* remove shared_embeds
* subsample => conv
* remove Speech2TextTransformerDecoderWrapper
* update output_lengths formula
* fix table
* remove max_position_embeddings
* update conversion scripts
* add possibility to do upper case for now
* add FeatureExtractor and Processor
* add tests for extractor
* require_torch_audio => require_torchaudio
* add processor test
* update import
* remove classification head
* attention mask is now 1D
* update docstrings
* attention mask should be of type long
* handle attention mask from generate
* alwyas return attention_mask
* fix test
* style
* doc
* Speech2TextTransformer => Speech2Text
* Speech2TextTransformerConfig => Speech2TextConfig
* remove dummy_inputs
* nit
* style
* multilinguial tok
* fix tokenizer
* add tgt_lang setter
* save lang_codes
* fix tokenizer
* add forced_bos_token_id to tokenizer
* apply review suggestions
* add torchaudio to extra deps
* add speech deps to CI
* fix dep
* add libsndfile to ci
* libsndfile1
* add speech to extras all
* libsndfile1 -> libsndfile1
* libsndfile
* libsndfile1-dev
* apt update
* add sudo to install
* update deps table
* install libsndfile1-dev on CI
* tuple to list
* init conv layer
* add model tests
* quality
* add integration tests
* skip_special_tokens
* add speech_to_text_transformer in toctree
* fix tokenizer
* fix fp16 tests
* add tokenizer tests
* fix copyright
* input_values => input_features
* doc
* add model in readme
* doc
* change checkpoint names
* fix copyright
* fix code example
* add max_model_input_sizes in tokenizer
* fix integration tests
* add do_lower_case to tokenizer
* remove clamp trick
* fix "Add modeling imports here"
* fix copyrights
* fix tests
* SpeechToTextTransformer => SpeechToText
* fix naming
* fix table formatting
* fix typo
* style
* fix typos
* remove speech dep from extras[testing]
* fix copies
* rename doc file,
* put imports under is_torch_available
* run feat extract tests when torch is available
* dummy objects for processor and extractor
* fix imports in tests
* fix import in modeling test
* fxi imports
* fix torch import
* fix imports again
* fix positional embeddings
* fix typo in import
* adapt new extractor refactor
* style
* fix torchscript test
* doc
* doc
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix docs, copied from, style
* fix docstring
* handle imports
* remove speech from all extra deps
* remove s2t from seq2seq lm mapping
* better names
* skip training tests
* add install instructions
* List => Tuple
* doc
* fix conversion script
* fix urls
* add instruction for libsndfile
* fix fp16 test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Patrick von Platen
9a06b6b11b
[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor ( #10594 )
...
* save first version
* finish refactor
* finish refactor
* correct naming
* correct naming
* shorter names
* Update src/transformers/feature_extraction_common_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* change name
* finish
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-09 12:16:59 +03:00
Ratthachat (Jung)
696e8a4365
Add TFRag ( #9002 )
...
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I recover them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version is already passed all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They shoud be 'pytorch only'
* add config_class back
after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
* create modeling_tf_rag
* add tests for tf
* add tf tests
* revert wrong pt commit
* further refactor
* further refactor
* refactor
* Update modeling_tf_rag.py
- input_processing
- fix prepare_input_for_generation (mostly fix generate bug)
- bring back from_pretrained hack in order to test generate
* delete colab pieces of code
* Show case of greedy "generate"
Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.
* cosmetic update
* correct typos
* update
* push some progress
* make easy check
* fix rag save from pretrained
* Update src/transformers/modeling_tf_utils.py
* remove commented out lines
* delete unnecessary lines
* add simple test case for nq_checkpoint
Add nq_checkpoint test to show that current version without hack still fails
* temporarily put ugly hack back again
* Add TFRagSequenceForGeneration!!
* __init__.py , import TFRagSequenceForGeneration
* Add TFRagSequence tests!
* rag init.py - add TFRagSequenceForGeneration
* fix from_pretrained
* fix prepare_inputs_for_generation
* Beam search for RagToken!
* minor clean up
* add tf.cast in TFRagModel
* More tf.cast
* Add all remaining tests (still have issues)
* delete all T5 related
* make style
* fix load weight prefix
* fix bart
* fix return_dict for tf_rag
make all tests pass .. Hooray
* fix some tests
* fix code quality
* fix qualtiy check
* finish tests tf rag
* add tf rag to docs
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Delete outdated comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve doc strings
* add generative model classes
* fix adjust token logic
* refactor generate for TFRag
* using shape_list, not _get_shape
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* axis=[1]->axis=1
* delete NEED_HELP comment
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Indicating model is in a developing state in docstrings
As suggested by Julien
* small last changes
* apply sylvains suggestions
* finish tf rag
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-09 00:49:51 +03:00
Sylvain Gugger
5469369480
Fix version control with anchors ( #10595 )
...
* Fix version control with anchors
* Simplify
2021-03-08 10:19:22 -05:00
Suraj Patil
f6e74a63ca
Add m2m100 ( #10236 )
...
* m2m_100
* no layernorm_embedding
* sinusoidal positional embeddings
* update pos embeddings
* add default config values
* tokenizer
* add conversion script
* fix config
* fix pos embed
* remove _float_tensor
* update tokenizer
* update lang codes
* handle lang codes
* fix pos embeds
* fix spm key
* put embedding weights on device
* remove qa and seq classification heads
* fix convert script
* lang codes pn one line
* fix embeds
* fix tokenizer
* fix tokenizer
* add fast tokenizer
* style
* M2M100MT => M2M100
* fix copyright, style
* tokenizer converter
* vocab file
* remove fast tokenizer
* fix embeds
* fix tokenizer
* fix tests
* add tokenizer tests
* add integration test
* quality
* fix model name
* fix test
* doc
* doc
* fix doc
* add copied from statements
* fix tokenizer tests
* apply review suggestions
* fix urls
* fix shift_tokens_right
* apply review suggestions
* fix
* fix doc
* add lang code to id
* remove unused function
* update checkpoint names
* fix copy
* fix tokenizer
* fix checkpoint names
* fix merge issue
* style
2021-03-06 22:14:16 +05:30
Stas Bekman
88a951e3cc
offline mode for firewalled envs ( #10407 )
...
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
lewtun
12b66215cf
Fix example of custom Trainer to reflect signature of compute_loss ( #10537 )
2021-03-05 07:44:53 -05:00
Sylvain Gugger
948b730f97
Remove unsupported methods from ModelOutput doc ( #10505 )
2021-03-03 14:55:18 -05:00
Jeff Yang
39f70a4058
feat(docs): navigate with left/right arrow keys ( #10481 )
...
* feat(docs): navigate with left/right arrow keys
* fix: add missing comma
2021-03-03 11:17:12 -05:00
Lysandre Debut
0c2325198f
Add I-BERT to README ( #10462 )
2021-03-01 12:12:31 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push confg for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen
3c733f3208
Update ibert.rst ( #10445 )
2021-02-28 19:03:49 +03:00
Darigov Research
aeba4f95bb
Adds terms to Glossary ( #10443 )
...
* feat: Adds three definitions to glossary from @cronoik
Needed a definition for transformer which in turn needed 2 more definitions
To do with issue https://github.com/huggingface/transformers/issues/9078
* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg
256482ac92
Introduce save_strategy training argument ( #10286 )
...
* Introduce save_strategy training argument
* deprecate EvaluationStrategy
* collapse EvaluationStrategy and LoggingStrategy into a single
IntervalStrategy enum
* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
Andrea Bacciu
b040e6efc1
Fix None in add_token_positions - issue #10210 ( #10374 )
...
* Fix None in add_token_positions - issue #10210
Fix None in add_token_positions related to the issue #10210
* add_token_positions fix None values in end_positions vector
add_token_positions fix None in end_positions vector as proposed by @joeddav
2021-02-25 09:18:33 -07:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale ( #10354 )
...
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Sehoon Kim
63645b3b11
I-BERT model support ( #10153 )
...
* IBertConfig, IBertTokentizer added
* IBert Model names moified
* tokenizer bugfix
* embedding -> QuantEmbedding
* quant utils added
* quant_mode added to configuration
* QuantAct added, Embedding layer + QuantAct addition
* QuantAct added
* unused path removed, QKV quantized
* self attention layer all quantized, except softmax
* temporarl commit
* all liner layers quantized
* quant_utils bugfix
* bugfix: requantization missing
* IntGELU added
* IntSoftmax added
* LayerNorm implemented
* LayerNorm implemented all
* names changed: roberta->ibert
* config not inherit from ROberta
* No support for CausalLM
* static quantization added, quantize_model.py removed
* import modules uncommented
* copyrights fixed
* minor bugfix
* quant_modules, quant_utils merged as one file
* import * fixed
* unused runfile removed
* make style run
* configutration.py docstring fixed
* refactoring: comments removed, function name fixed
* unused dependency removed
* typo fixed
* comments(Copied from), assertion string added
* refactoring: super(..) -> super(), etc.
* refactoring
* refarctoring
* make style
* refactoring
* cuda -> to(x.device)
* weight initialization removed
* QuantLinear set_param removed
* QuantEmbedding set_param removed
* IntLayerNorm set_param removed
* assert string added
* assertion error message fixed
* is_decoder removed
* enc-dec arguments/functions removed
* Converter removed
* quant_modules docstring fixed
* conver_slow_tokenizer rolled back
* quant_utils docstring fixed
* unused aruments e.g. use_cache removed from config
* weight initialization condition fixed
* x_min, x_max initialized with small values to avoid div-zero exceptions
* testing code for ibert
* test emb, linear, gelu, softmax added
* test ln and act added
* style reformatted
* force_dequant added
* error tests overrided
* make style
* Style + Docs
* force dequant tests added
* Fix fast tokenizer in init
* Fix doc
* Remove space
* docstring, IBertConfig, chunk_size
* test_modeling_ibert refactoring
* quant_modules.py refactoring
* e2e integration test added
* tokenizers removed
* IBertConfig added to tokenizer_auto.py
* bugfix
* fix docs & test
* fix style num 2
* final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply surajs recommend
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
abhishek thakur
9dc7825744
Remove unused variable in example for Q&A ( #10392 )
2021-02-25 09:18:47 -05:00
Lysandre
3591844306
v4.3.3 docs
2021-02-24 15:19:01 -05:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration ( #10310 )
...
* implement gradient_accumulation_steps support in DeepSpeed integration
* typo
* cleanup
* cleanup
2021-02-22 11:15:59 -08:00
Sylvain Gugger
9e147d31f6
Deprecate prepare_seq2seq_batch ( #10287 )
...
* Deprecate prepare_seq2seq_batch
* Fix last tests
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* More review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-22 12:36:16 -05:00
Lysandre Debut
cd8c4c3fc2
DeBERTa-v2 fixes ( #10328 )
...
Co-authored-by: Pengcheng He <penhe@microsoft.com>
Co-authored-by: Pengcheng He <penhe@microsoft.com>
2021-02-22 07:45:18 -05:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… ( #10018 )
...
* Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models;
* DeBERTa-v2
* Fix v2 model loading issue (#10129 )
* Doc members
* Update src/transformers/models/deberta/modeling_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Address Sylvain's comments
* Address Patrick's comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
Sylvain Gugger
f6e53e3c2b
Fix example links in the task summary ( #10291 )
2021-02-19 18:04:15 -05:00
Stas Bekman
5da7c78ed8
update to new script; notebook notes ( #10241 )
2021-02-17 15:58:08 -08:00
Joe Davison
4210cd96fc
fix add_token_positions fn ( #10217 )
2021-02-16 14:00:05 -05:00
Suraj Patil
6fc940ed09
Add mBART-50 ( #10154 )
...
* add tokenizer for mBART-50
* update tokenizers
* make src_lang and tgt_lang optional
* update tokenizer test
* add setter
* update docs
* update conversion script
* update docs
* update conversion script
* update tokenizer
* update test
* update docs
* doc
* address Sylvain's suggestions
* fix test
* fix formatting
* nits
2021-02-15 20:58:54 +05:30
Sylvain Gugger
803498318c
[Doc] Fix version control in internal pages ( #10124 )
2021-02-13 08:52:30 -05:00
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab ( #10130 )
...
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Tanmay Thakur
2f3b5f4dcc
Add new community notebook - Blenderbot ( #10126 )
...
* Update:community.md, new nb add
* feat: updated grammar on nb description
* Update: Train summarizer for BlenderBotSmall
2021-02-11 12:53:40 +03:00
Stas Bekman
7c07a47dfb
[DeepSpeed docs] new information ( #9610 )
...
* how to specify a specific gpu
* new paper
* expand on buffer sizes
* style
* where to find config examples
* specific example
* small updates
2021-02-09 22:16:20 -08:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
...
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Sylvain Gugger
0c3d23dff7
Add patch releases to the doc
2021-02-09 14:17:09 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation ( #10092 )
...
* Enable propagation by default
* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC ( #10089 )
...
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
2021-02-09 03:49:02 -05:00
Juan Cruz-Benito
e4bf9910dc
Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py ( #10066 )
...
* Removing run_pl_glue.py from seq classification docs
* Adding run_tf_text_classification.py
* Using :prefix_link: to refer local files
* Applying "make style" to the branch
* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removing last underscores
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-08 13:04:21 -05:00
Lysandre
0dd579c9cf
Docs for v4.3.0
2021-02-08 18:53:24 +01:00
Sylvain Gugger
45aaf5f7ab
A few fixes in the documentation ( #10033 )
2021-02-08 05:02:01 -05:00
Patrick von Platen
89be094e29
[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown ( #9921 )
...
* add big bird
* change teacher to mentor
* add proposal template
* adapt template
* delete old template
* correct some links
* finish template
* create big bird from template
* add big bird
* improve boxes
* finish boxes
* add pointers for BigBird
* finish big bird
* up
* up
* up
* up
* apply lysandres and sylvains suggestions
* delete bogus file
* correct markdown
* try different style
* try different style
* finalize
2021-02-05 15:47:54 +03:00
Sylvain Gugger
3be965c5db
Update doc for pre-release ( #10014 )
...
* Update doc for pre-release
* Use stable as default
* Use the right commit :facepalms:
2021-02-04 16:52:27 -05:00
Sylvain Gugger
b72f16b3ec
Fix doc for TFConverBertModel
2021-02-04 10:14:46 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM
( #9128 )
...
* initiliaze bart4causalLM
* create BartDecoderWrapper, setters/getters
* delete spaces
* forward and additional methods
* update cache function, loss function, remove ngram* params in data class.
* add bartcausallm, bartdecoder testing
* correct bart for causal lm
* remove at
* add mbart as well
* up
* fix typo
* up
* correct
* add pegasusforcausallm
* add blenderbotforcausallm
* add blenderbotsmallforcausallm
* add marianforcausallm
* add test for MarianForCausalLM
* add Pegasus test
* add BlenderbotSmall test
* add blenderbot test
* fix a fail
* fix an import fail
* a fix
* fix
* Update modeling_pegasus.py
* fix models
* fix inputs_embeds setting getter
* adapt tests
* correct repo utils check
* finish test improvement
* fix tf models as well
* make style
* make fix-copies
* fix copies
* run all tests
* last changes
* fix all tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-04 11:56:12 +03:00
yylun
5442a11f5f
fix steps_in_epoch variable in trainer when using max_steps ( #9969 )
...
* fix steps_in_epoch variable when using max_steps
* redundant sentence
* Revert "redundant sentence"
This reverts commit ad5c0e9b6e
.
* remove redundant sentence
Co-authored-by: wujindou <wujindou@sogou-inc.com>
2021-02-03 09:30:37 -05:00
Patrick von Platen
d6217fb30c
Wav2Vec2 ( #9659 )
...
* add raw scaffold
* implement feat extract layers
* make style
* remove +
* correctly convert weights
* make feat extractor work
* make feature extraction proj work
* run forward pass
* finish forward pass
* Succesful decoding example
* remove unused files
* more changes
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer norm architecture
* finish 2nd structure
* add model tests
* finish tests for tok and model
* clean-up
* make style
* finish docstring for model and config
* make style
* correct docstring
* correct tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* make style
* apply sylvains suggestions
* apply lysandres suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply sylvains suggestions
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-02 15:52:10 +03:00
Sylvain Gugger
de38a6e4d2
Fix 9918 ( #9932 )
...
* Initial work
* Fix doc styler and other models
2021-02-02 05:22:20 -05:00
Patrick von Platen
0e3be1ac8f
Add new model docs ( #9667 )
...
* add new model logic
* fix docs
* change structure
* improve add_new_model
* push new changes
* up
* up
* correct spelling
* improve docstring
* correct line length
* update readme
* correct links
* correct typos
* only add rst file for now
* Apply suggestions from code review 1
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* finish adding all suggestions
* make style
* apply Niels feedback
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply sylvains suggestions
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-01 17:55:10 +03:00
Stas Bekman
40cfc355f1
[doc] nested markup is invalid in rst ( #9898 )
...
Apparently nested markup in RST is invalid: https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible
So currently this line doesn't get rendered properly, leaving inner markdown unrendered, resulting in:
```
https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible
```
This PR removes the bold which fixes the link.
2021-01-30 09:59:19 -05:00
Stas Bekman
15e4ce353a
[docs] expand install instructions ( #9817 )
...
* expand install instructions
* fix
* white space
* rewrite as discussed in the PR
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change the wording to encourage issue report
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-28 09:36:46 -08:00
Joe Davison
caddf9126b
tutorial typo
2021-01-28 09:21:58 -05:00
Stefan Schweter
5ed5a54684
ADD BORT ( #9813 )
...
* tests: add integration tests for new Bort model
* bort: add conversion script from Gluonnlp to Transformers 🚀
* bort: minor cleanup (BORT -> Bort)
* add docs
* make fix-copies
* clean doc a bit
* correct docs
* Update docs/source/model_doc/bort.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/bort.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* correct dialogpt doc
* correct link
* Update docs/source/model_doc/bort.rst
* Update docs/source/model_doc/dialogpt.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-27 21:25:11 +03:00
abhishek thakur
f617490e71
ConvBERT Model ( #9717 )
...
* finalize convbert
* finalize convbert
* fix
* fix
* fix
* push
* fix
* tf image patches
* fix torch model
* tf tests
* conversion
* everything aligned
* remove print
* tf tests
* fix tf
* make tf tests pass
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* add electra test again
* fix doc
* fix doc again
* fix doc again
* Update src/transformers/modeling_tf_pytorch_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/conv_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* conv_bert -> convbert
* more fixes from review
* add conversion script
* dont use pretrained embed
* unused config
* suggestions from julien
* some more fixes
* p -> param
* fix copyright
* fix doc
* Update src/transformers/models/convbert/configuration_convbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* comments from reviews
* fix-copies
* fix style
* revert shape_list
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-27 03:20:09 -05:00
Yusuke Mori
cb73ab5a38
Fix broken links in the converting tf ckpt document ( #9791 )
...
* Fix broken links in the converting tf ckpt document
* Update docs/source/converting_tensorflow_models.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Reflect the review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-26 03:37:57 -05:00
Sylvain Gugger
7acfa95afb
Add missing new line
2021-01-20 14:13:16 -05:00
Darigov Research
5a307ece82
Adds flashcards to Glossary & makes small corrections ( #8949 )
...
* fix: Makes small typo corrections & standardises glossary
* feat: Adds introduction & links to transformer flashcards
* feat: Adds attribution & adjustments requested in #8949
* feat: Adds flashcards to community.md
* refactor: Removes flashcards from glossary
2021-01-20 13:28:40 -05:00
NielsRogge
88583d4958
Add notebook ( #9696 )
2021-01-20 10:19:26 -05:00
NielsRogge
d1370d29b1
Add DeBERTa head models ( #9691 )
...
* Add DebertaForMaskedLM, DebertaForTokenClassification, DebertaForQuestionAnswering
* Add docs and fix quality
* Fix Deberta not having pooler
2021-01-20 10:18:50 -05:00
acul3
8940c7662d
Add t5 convert to transformers-cli ( #9654 )
...
* Update run_mlm.py
* add t5 model to transformers-cli convert
* update rum_mlm.py same as master
* update converting model docs
* update converting model docs
* Update convert.py
* Trigger notification
* update import sorted
* fix typo t5
2021-01-20 09:34:27 -05:00
Sylvain Gugger
76f36e183a
Add a community page to the docs ( #9682 )
2021-01-20 04:54:36 -05:00
Stas Bekman
82498cbc37
[deepspeed doc] install issues + 1-gpu deployment ( #9582 )
...
* [doc] install + 1-gpu deployment
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improvements
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 11:05:04 -08:00
Lysandre
e43f3b6190
v4.2.1 in docs
2021-01-14 14:25:30 +01:00
Lysandre
33a8497db8
v4.2.0 documentation
2021-01-13 16:15:40 +01:00
Lysandre
7d9a9d0c72
Release: v4.2.0
2021-01-13 16:01:51 +01:00
Julien Chaumond
247a7b2029
Doc: Update pretrained_models wording ( #9545 )
...
* Update pretrained_models.rst
To clarify things cf. this tweet for instance https://twitter.com/RTomMcCoy/status/1349094111505211395
* format
2021-01-13 05:58:05 -05:00
Stas Bekman
2df34f4aba
[trainer] deepspeed integration ( #9211 )
...
* deepspeed integration
* style
* add test
* ds wants to do its own backward
* fp16 assert
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* for clarity extract what args are being passed to deepspeed
* introduce the concept of self.wrapped_model
* s/self.wrapped_model/self.model_wrapped/
* complete transition to self.wrapped_model / self.model
* fix
* doc
* give ds its own init
* add custom overrides, handle bs correctly
* fix test
* clean up model_init logic, fix small bug
* complete fix
* collapse --deepspeed_config into --deepspeed
* style
* start adding doc notes
* style
* implement hf2ds optimizer and scheduler configuration remapping
* oops
* call get_num_training_steps absolutely when needed
* workaround broken auto-formatter
* deepspeed_config arg is no longer needed - fixed in deepspeed master
* use hf's fp16 args in config
* clean
* start on the docs
* rebase cleanup
* finish up --fp16
* clarify the supported stages
* big refactor thanks to discovering deepspeed.init_distributed
* cleanup
* revert fp16 part
* add checkpoint-support
* more init ds into integrations
* extend docs
* cleanup
* unfix docs
* clean up old code
* imports
* move docs
* fix logic
* make it clear which file it's referring to
* document nodes/gpus
* style
* wrong format
* style
* deepspeed handles gradient clipping
* easier to read
* major doc rewrite
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docs
* switch to AdamW optimizer
* style
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00
NielsRogge
e45eba3b1c
Improve LayoutLM ( #9476 )
...
* Add LayoutLMForSequenceClassification and integration tests
Improve docs
Add LayoutLM notebook to list of community notebooks
* Make style & quality
* Address comments by @sgugger, @patrickvonplaten and @LysandreJik
* Fix rebase with master
* Reformat in one line
* Improve code examples as requested by @patrickvonplaten
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart ( #9497 )
...
* make templates ready
* make add_new_model_command_ready
* finish tf bart
* prepare tf mbart
* finish tf bart
* add tf mbart
* add marian
* prep pegasus
* add tf pegasus
* push blenderbot tf
* add blenderbot
* add blenderbot small
* clean-up
* make fix copy
* define blend bot tok
* fix
* up
* make style
* add to docs
* add copy statements
* overwrite changes
* improve
* fix docs
* finish
* fix last slow test
* fix missing git conflict line
* fix blenderbot
* up
* fix blenderbot small
* load changes
* finish copied from
* upload fix
2021-01-12 02:06:32 +01:00
Sylvain Gugger
8d25df2c7a
Make doc styler detect lists on rst ( #9488 )
2021-01-11 08:53:41 -05:00
Patrick von Platen
9e1ea846bc
[README] Add new models ( #9465 )
...
* add new models
* make fix-copies
2021-01-08 05:49:43 -05:00
Patrick von Platen
ae5a32bb0d
up ( #9454 )
2021-01-07 11:51:02 +01:00
Simon Brandeis
c89f1bc92e
Add flags to return scores, hidden states and / or attention weights in GenerationMixin ( #9150 )
...
* Define new output dataclasses for greedy generation
* Add output_[...] flags in greedy generation methods
Added output_attentions, output_hidden_states, output_scores flags in
generate and greedy_search methods in GenerationMixin.
* [WIP] Implement logic and tests for output flags in generation
* Update GreedySearchOutput classes & docstring
* Implement greedy search output accumulation logic
Update greedy_search unittests
Fix generate method return value docstring
Properly init flags with the default config
* Update configuration to add output_scores flag
* Fix test_generation_utils
Sort imports and fix isinstance tests for GreedySearchOutputs
* Fix typo in generation_utils
* Add return_dict_in_generate for backwards compatibility
* Add return_dict_in_generate flag in config
* Fix tyPo in configuration
* Fix handling of attentions and hidden_states flags
* Make style & quality
* first attempt attentions
* some corrections
* improve tests
* special models requires special test
* disable xlm test for now
* clean tests
* fix for tf
* isort
* Add output dataclasses for other generation methods
* Add logic to return dict in sample generation
* Complete test for sample generation
- Pass output_attentions and output_hidden_states flags to encoder in
encoder-decoder models
- Fix import satements order in test_generation_utils file
* Add logic to return dict in sample generation
- Refactor tests to avoid using self.assertTrue, which provides
scarce information when the test fails
- Add tests for the three beam_search methods: vanilla, sample and
grouped
* Style doc
* Fix copy-paste error in generation tests
* Rename logits to scores and refactor
* Refactor group_beam_search for consistency
* make style
* add sequences_scores
* fix all tests
* add docs
* fix beam search finalize test
* correct docstring
* clean some files
* Made suggested changes to the documentation
* Style doc ?
* Style doc using the Python util
* Update src/transformers/generation_utils.py
* fix empty lines
* fix all test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-06 17:11:42 +01:00
Qbiwan
ecfcac223c
Improve documentation coverage for Phobert ( #9427 )
...
* first commit
* change phobert to phoBERT as per author in overview
* v3 and v4 both runs on same code hence there is no need to differentiate them
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-06 10:04:32 -05:00
Qbiwan
be898998bb
Improve documentation coverage for Herbert ( #9428 )
...
* first commit
* changed XLMTokenizer to HerbertTokenizer in code example
2021-01-06 09:13:43 -05:00
Patrick von Platen
b972c1bfb0
finalize ( #9431 )
2021-01-06 14:36:55 +01:00
Sylvain Gugger
bcb55d33ce
Upgrade styler to better handle lists ( #9423 )
...
* Add missing lines before a new list.
* Update doc styler and restyle some files.
* Fix docstrings of LED and Longformer
2021-01-06 07:46:17 -05:00
NielsRogge
b7e548976f
Fix URLs to TAPAS notebooks ( #9435 )
2021-01-06 07:20:41 -05:00
Stas Bekman
d64372fdfc
[docs] outline sharded ddp doc ( #9208 )
...
* outline sharded dpp doc
* fix link
* add example
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* narrow the command and remove non-essentials
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-05 17:34:15 -08:00
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models ( #9343 )
...
* first try
* remove old template
* finish bart
* finish mbart
* delete unnecessary line
* init pegasus
* save intermediate
* correct pegasus
* finish pegasus
* remove cookie cutter leftover
* add marian
* finish blenderbot
* replace in file
* correctly split blenderbot
* delete "old" folder
* correct "add statement"
* adapt config for tf comp
* correct configs for tf
* remove ipdb
* fix more stuff
* fix mbart
* push pegasus fix
* fix mbart
* more fixes
* fix research projects code
* finish docs for bart, mbart, and marian
* delete unnecessary file
* correct attn typo
* correct configs
* remove pegasus for seq class
* correct peg docs
* correct peg docs
* finish configs
* further improve docs
* add copied from statements to mbart
* fix copied from in mbart
* add copy statements to marian
* add copied from to marian
* add pegasus copied from
* finish pegasus
* finish copied from
* Apply suggestions from code review
* make style
* backward comp blenderbot
* apply lysandres and sylvains suggestions
* apply suggestions
* push last fixes
* fix docs
* fix tok tests
* fix imports code style
* fix doc
2021-01-05 22:00:05 +01:00
Patrick von Platen
189387e9b2
LED ( #9278 )
...
* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
2021-01-05 13:14:30 +01:00
Sugeeth
314cca2842
Fix documentation links always pointing to master. ( #9217 )
...
* Use extlinks to point hyperlink with the version of code
* Point to version on release and master until then
* Apply style
* Correct links
* Add missing backtick
* Simple missing backtick after all.
Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-01-05 06:18:48 -05:00
Qbiwan
086718ac6e
Improve documentation coverage for Bertweet ( #9379 )
...
* bertweet docs coverage
* style doc max len 119
* maxlen style rst
* run main() from style_doc
* changed according to comments
2021-01-04 13:12:59 -05:00
Patrick von Platen
75ff530551
correct docs ( #9378 )
2021-01-04 17:27:29 +01:00
Patrick von Platen
52b3a05e83
[Bart doc] Fix outdated statement ( #9299 )
...
* fix bart doc
* fix docs
2020-12-24 14:47:53 +01:00
Suraj Patil
88ef8893cd
Add caching mechanism to BERT, RoBERTa ( #9183 )
...
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to false when not decoder, fix attention mask init
* add test for caching
* add past_key_values for Roberta
* fix position embeds
* add caching test for roberta
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* adress patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with bert
* make copies consistent
* add use_cache to encoder
* add past_key_values to tapas attention
* apply suggestions from code review
* make coppies consistent
* add attn mask in tests
* remove copied from longformer
* apply suggestions from code review
* fix bart test
* nit
* simplify model outputs
* fix doc
* fix output ordering
2020-12-23 23:01:32 +05:30
Connor Brinton
bcc87c639f
Minor documentation revisions from copyediting ( #9266 )
...
* typo: Revise "checkout" to "check out"
* typo: Change "seemlessly" to "seamlessly"
* typo: Close parentheses in "Using the tokenizer"
* typo: Add closing parenthesis to supported models aside
* docs: Treat ``position_ids`` as plural
Alternatively, the word "argument" could be added to make the subject singular.
* docs: Remove comma, making subordinate clause
* docs: Remove comma separating verb and direct object
* docs: Fix typo ("next" -> "text")
* docs: Reverse phrase order to simplify sentence
* docs: "quicktour" -> "quick tour"
* docs: "to throw" -> "from throwing"
* docs: Remove disruptive newline in padding/truncation section
* docs: "show exemplary" -> "show examples of"
* docs: "much harder as" -> "much harder than"
* docs: Fix typo "seach" -> "search"
* docs: Fix subject-verb disagreement in WordPiece description
* docs: Fix style in preprocessing.rst
2020-12-23 10:15:49 -05:00
Sylvain Gugger
490b39e614
Seq2seq trainer ( #9241 )
...
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-22 11:33:44 -05:00
Sylvain Gugger
1fc7119181
Fix script that check objects are documented ( #9259 )
2020-12-22 11:12:58 -05:00
Suraj Patil
f4432b7e01
add base model classes to bart subclassed models ( #9230 )
...
* add base model classes to bart subclassed models
* add doc
2020-12-21 19:56:46 +05:30
Stas Bekman
3ff5e8955a
[t5 doc] typos ( #9199 )
...
* [t5 doc] typos
a few run away backticks
@sgugger
* style
2020-12-18 16:03:26 -08:00
Sylvain Gugger
3e56e2ce04
Fix typo
2020-12-18 10:11:07 -05:00
sandip
467e9158b4
Added TF CTRL Sequence Classification ( #9151 )
...
* Added TF CTRL Sequence Classification
* code refactor
2020-12-17 18:10:57 -05:00