Suraj Patil
055ed78f52
[S2T] fix example in docs ( #10667 )
2021-03-11 22:43:37 +05:30
Patrick von Platen
602d63f05c
[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models ( #10648 )
...
* add conversion script
* add wav2vec2 xslr models
* finish
* Update docs/source/model_doc/xlsr_wav2vec2.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-11 17:44:18 +03:00
Sylvain Gugger
26a33cfd8c
Document Trainer limitation on custom models ( #10635 )
2021-03-10 14:58:22 -05:00
Suraj Patil
d26b37e744
Speech2TextTransformer ( #10175 )
...
* s2t
* fix config
* conversion script
* fix import
* add tokenizer
* fix tok init
* fix tokenizer
* first version working
* fix embeds
* fix lm head
* remove extra heads
* fix convert script
* handle encoder attn mask
* style
* better enc attn mask
* override _prepare_attention_mask_for_generation
* handle attn_maks in encoder and decoder
* input_ids => input_features
* enable use_cache
* remove old code
* expand embeddings if needed
* remove logits bias
* masked_lm_loss => loss
* hack tokenizer to support feature processing
* fix model_input_names
* style
* fix error message
* doc
* remove inputs_embeds
* remove input_embeds
* remove unnecessary docstring
* quality
* SpeechToText => Speech2Text
* style
* remove shared_embeds
* subsample => conv
* remove Speech2TextTransformerDecoderWrapper
* update output_lengths formula
* fix table
* remove max_position_embeddings
* update conversion scripts
* add possibility to do upper case for now
* add FeatureExtractor and Processor
* add tests for extractor
* require_torch_audio => require_torchaudio
* add processor test
* update import
* remove classification head
* attention mask is now 1D
* update docstrings
* attention mask should be of type long
* handle attention mask from generate
* alwyas return attention_mask
* fix test
* style
* doc
* Speech2TextTransformer => Speech2Text
* Speech2TextTransformerConfig => Speech2TextConfig
* remove dummy_inputs
* nit
* style
* multilinguial tok
* fix tokenizer
* add tgt_lang setter
* save lang_codes
* fix tokenizer
* add forced_bos_token_id to tokenizer
* apply review suggestions
* add torchaudio to extra deps
* add speech deps to CI
* fix dep
* add libsndfile to ci
* libsndfile1
* add speech to extras all
* libsndfile1 -> libsndfile1
* libsndfile
* libsndfile1-dev
* apt update
* add sudo to install
* update deps table
* install libsndfile1-dev on CI
* tuple to list
* init conv layer
* add model tests
* quality
* add integration tests
* skip_special_tokens
* add speech_to_text_transformer in toctree
* fix tokenizer
* fix fp16 tests
* add tokenizer tests
* fix copyright
* input_values => input_features
* doc
* add model in readme
* doc
* change checkpoint names
* fix copyright
* fix code example
* add max_model_input_sizes in tokenizer
* fix integration tests
* add do_lower_case to tokenizer
* remove clamp trick
* fix "Add modeling imports here"
* fix copyrights
* fix tests
* SpeechToTextTransformer => SpeechToText
* fix naming
* fix table formatting
* fix typo
* style
* fix typos
* remove speech dep from extras[testing]
* fix copies
* rename doc file,
* put imports under is_torch_available
* run feat extract tests when torch is available
* dummy objects for processor and extractor
* fix imports in tests
* fix import in modeling test
* fxi imports
* fix torch import
* fix imports again
* fix positional embeddings
* fix typo in import
* adapt new extractor refactor
* style
* fix torchscript test
* doc
* doc
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix docs, copied from, style
* fix docstring
* handle imports
* remove speech from all extra deps
* remove s2t from seq2seq lm mapping
* better names
* skip training tests
* add install instructions
* List => Tuple
* doc
* fix conversion script
* fix urls
* add instruction for libsndfile
* fix fp16 test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Patrick von Platen
9a06b6b11b
[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor ( #10594 )
...
* save first version
* finish refactor
* finish refactor
* correct naming
* correct naming
* shorter names
* Update src/transformers/feature_extraction_common_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* change name
* finish
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-09 12:16:59 +03:00
Ratthachat (Jung)
696e8a4365
Add TFRag ( #9002 )
...
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
last commit accidentally deleted these 4 lines, so I recover them back
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version is already passed all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They shoud be 'pytorch only'
* add config_class back
after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
* create modeling_tf_rag
* add tests for tf
* add tf tests
* revert wrong pt commit
* further refactor
* further refactor
* refactor
* Update modeling_tf_rag.py
- input_processing
- fix prepare_input_for_generation (mostly fix generate bug)
- bring back from_pretrained hack in order to test generate
* delete colab pieces of code
* Show case of greedy "generate"
Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.
* cosmetic update
* correct typos
* update
* push some progress
* make easy check
* fix rag save from pretrained
* Update src/transformers/modeling_tf_utils.py
* remove commented out lines
* delete unnecessary lines
* add simple test case for nq_checkpoint
Add nq_checkpoint test to show that current version without hack still fails
* temporarily put ugly hack back again
* Add TFRagSequenceForGeneration!!
* __init__.py , import TFRagSequenceForGeneration
* Add TFRagSequence tests!
* rag init.py - add TFRagSequenceForGeneration
* fix from_pretrained
* fix prepare_inputs_for_generation
* Beam search for RagToken!
* minor clean up
* add tf.cast in TFRagModel
* More tf.cast
* Add all remaining tests (still have issues)
* delete all T5 related
* make style
* fix load weight prefix
* fix bart
* fix return_dict for tf_rag
make all tests pass .. Hooray
* fix some tests
* fix code quality
* fix qualtiy check
* finish tests tf rag
* add tf rag to docs
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove TFT5 from docstring
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Delete outdated comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve doc strings
* add generative model classes
* fix adjust token logic
* refactor generate for TFRag
* using shape_list, not _get_shape
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* axis=[1]->axis=1
* delete NEED_HELP comment
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improve readability
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Indicating model is in a developing state in docstrings
As suggested by Julien
* small last changes
* apply sylvains suggestions
* finish tf rag
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-09 00:49:51 +03:00
Sylvain Gugger
5469369480
Fix version control with anchors ( #10595 )
...
* Fix version control with anchors
* Simplify
2021-03-08 10:19:22 -05:00
Suraj Patil
f6e74a63ca
Add m2m100 ( #10236 )
...
* m2m_100
* no layernorm_embedding
* sinusoidal positional embeddings
* update pos embeddings
* add default config values
* tokenizer
* add conversion script
* fix config
* fix pos embed
* remove _float_tensor
* update tokenizer
* update lang codes
* handle lang codes
* fix pos embeds
* fix spm key
* put embedding weights on device
* remove qa and seq classification heads
* fix convert script
* lang codes pn one line
* fix embeds
* fix tokenizer
* fix tokenizer
* add fast tokenizer
* style
* M2M100MT => M2M100
* fix copyright, style
* tokenizer converter
* vocab file
* remove fast tokenizer
* fix embeds
* fix tokenizer
* fix tests
* add tokenizer tests
* add integration test
* quality
* fix model name
* fix test
* doc
* doc
* fix doc
* add copied from statements
* fix tokenizer tests
* apply review suggestions
* fix urls
* fix shift_tokens_right
* apply review suggestions
* fix
* fix doc
* add lang code to id
* remove unused function
* update checkpoint names
* fix copy
* fix tokenizer
* fix checkpoint names
* fix merge issue
* style
2021-03-06 22:14:16 +05:30
Stas Bekman
88a951e3cc
offline mode for firewalled envs ( #10407 )
...
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
lewtun
12b66215cf
Fix example of custom Trainer to reflect signature of compute_loss ( #10537 )
2021-03-05 07:44:53 -05:00
Sylvain Gugger
948b730f97
Remove unsupported methods from ModelOutput doc ( #10505 )
2021-03-03 14:55:18 -05:00
Jeff Yang
39f70a4058
feat(docs): navigate with left/right arrow keys ( #10481 )
...
* feat(docs): navigate with left/right arrow keys
* fix: add missing comma
2021-03-03 11:17:12 -05:00
Lysandre Debut
0c2325198f
Add I-BERT to README ( #10462 )
2021-03-01 12:12:31 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push confg for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen
3c733f3208
Update ibert.rst ( #10445 )
2021-02-28 19:03:49 +03:00
Darigov Research
aeba4f95bb
Adds terms to Glossary ( #10443 )
...
* feat: Adds three definitions to glossary from @cronoik
Needed a definition for transformer which in turn needed 2 more definitions
To do with issue https://github.com/huggingface/transformers/issues/9078
* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg
256482ac92
Introduce save_strategy training argument ( #10286 )
...
* Introduce save_strategy training argument
* deprecate EvaluationStrategy
* collapse EvaluationStrategy and LoggingStrategy into a single
IntervalStrategy enum
* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
Andrea Bacciu
b040e6efc1
Fix None in add_token_positions - issue #10210 ( #10374 )
...
* Fix None in add_token_positions - issue #10210
Fix None in add_token_positions related to the issue #10210
* add_token_positions fix None values in end_positions vector
add_token_positions fix None in end_positions vector as proposed by @joeddav
2021-02-25 09:18:33 -07:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale ( #10354 )
...
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Sehoon Kim
63645b3b11
I-BERT model support ( #10153 )
...
* IBertConfig, IBertTokentizer added
* IBert Model names moified
* tokenizer bugfix
* embedding -> QuantEmbedding
* quant utils added
* quant_mode added to configuration
* QuantAct added, Embedding layer + QuantAct addition
* QuantAct added
* unused path removed, QKV quantized
* self attention layer all quantized, except softmax
* temporarl commit
* all liner layers quantized
* quant_utils bugfix
* bugfix: requantization missing
* IntGELU added
* IntSoftmax added
* LayerNorm implemented
* LayerNorm implemented all
* names changed: roberta->ibert
* config not inherit from ROberta
* No support for CausalLM
* static quantization added, quantize_model.py removed
* import modules uncommented
* copyrights fixed
* minor bugfix
* quant_modules, quant_utils merged as one file
* import * fixed
* unused runfile removed
* make style run
* configutration.py docstring fixed
* refactoring: comments removed, function name fixed
* unused dependency removed
* typo fixed
* comments(Copied from), assertion string added
* refactoring: super(..) -> super(), etc.
* refactoring
* refarctoring
* make style
* refactoring
* cuda -> to(x.device)
* weight initialization removed
* QuantLinear set_param removed
* QuantEmbedding set_param removed
* IntLayerNorm set_param removed
* assert string added
* assertion error message fixed
* is_decoder removed
* enc-dec arguments/functions removed
* Converter removed
* quant_modules docstring fixed
* conver_slow_tokenizer rolled back
* quant_utils docstring fixed
* unused aruments e.g. use_cache removed from config
* weight initialization condition fixed
* x_min, x_max initialized with small values to avoid div-zero exceptions
* testing code for ibert
* test emb, linear, gelu, softmax added
* test ln and act added
* style reformatted
* force_dequant added
* error tests overrided
* make style
* Style + Docs
* force dequant tests added
* Fix fast tokenizer in init
* Fix doc
* Remove space
* docstring, IBertConfig, chunk_size
* test_modeling_ibert refactoring
* quant_modules.py refactoring
* e2e integration test added
* tokenizers removed
* IBertConfig added to tokenizer_auto.py
* bugfix
* fix docs & test
* fix style num 2
* final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply surajs recommend
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
abhishek thakur
9dc7825744
Remove unused variable in example for Q&A ( #10392 )
2021-02-25 09:18:47 -05:00
Lysandre
3591844306
v4.3.3 docs
2021-02-24 15:19:01 -05:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration ( #10310 )
...
* implement gradient_accumulation_steps support in DeepSpeed integration
* typo
* cleanup
* cleanup
2021-02-22 11:15:59 -08:00
Sylvain Gugger
9e147d31f6
Deprecate prepare_seq2seq_batch ( #10287 )
...
* Deprecate prepare_seq2seq_batch
* Fix last tests
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* More review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-22 12:36:16 -05:00
Lysandre Debut
cd8c4c3fc2
DeBERTa-v2 fixes ( #10328 )
...
Co-authored-by: Pengcheng He <penhe@microsoft.com>
Co-authored-by: Pengcheng He <penhe@microsoft.com>
2021-02-22 07:45:18 -05:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… ( #10018 )
...
* Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models;
* DeBERTa-v2
* Fix v2 model loading issue (#10129 )
* Doc members
* Update src/transformers/models/deberta/modeling_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Address Sylvain's comments
* Address Patrick's comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
Sylvain Gugger
f6e53e3c2b
Fix example links in the task summary ( #10291 )
2021-02-19 18:04:15 -05:00
Stas Bekman
5da7c78ed8
update to new script; notebook notes ( #10241 )
2021-02-17 15:58:08 -08:00
Joe Davison
4210cd96fc
fix add_token_positions fn ( #10217 )
2021-02-16 14:00:05 -05:00
Suraj Patil
6fc940ed09
Add mBART-50 ( #10154 )
...
* add tokenizer for mBART-50
* update tokenizers
* make src_lang and tgt_lang optional
* update tokenizer test
* add setter
* update docs
* update conversion script
* update docs
* update conversion script
* update tokenizer
* update test
* update docs
* doc
* address Sylvain's suggestions
* fix test
* fix formatting
* nits
2021-02-15 20:58:54 +05:30
Sylvain Gugger
803498318c
[Doc] Fix version control in internal pages ( #10124 )
2021-02-13 08:52:30 -05:00
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab ( #10130 )
...
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Tanmay Thakur
2f3b5f4dcc
Add new community notebook - Blenderbot ( #10126 )
...
* Update:community.md, new nb add
* feat: updated grammar on nb description
* Update: Train summarizer for BlenderBotSmall
2021-02-11 12:53:40 +03:00
Stas Bekman
7c07a47dfb
[DeepSpeed docs] new information ( #9610 )
...
* how to specify a specific gpu
* new paper
* expand on buffer sizes
* style
* where to find config examples
* specific example
* small updates
2021-02-09 22:16:20 -08:00
Boris Dayma
7c7962ba89
doc: update W&B related doc ( #10086 )
...
* doc: update W&B related doc
* doc(wandb): mention report_to
* doc(wandb): commit suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* doc(wandb): fix typo
* doc(wandb): remove WANDB_DISABLED
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Sylvain Gugger
0c3d23dff7
Add patch releases to the doc
2021-02-09 14:17:09 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation ( #10092 )
...
* Enable propagation by default
* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC ( #10089 )
...
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
2021-02-09 03:49:02 -05:00
Juan Cruz-Benito
e4bf9910dc
Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py ( #10066 )
...
* Removing run_pl_glue.py from seq classification docs
* Adding run_tf_text_classification.py
* Using :prefix_link: to refer local files
* Applying "make style" to the branch
* Update docs/source/task_summary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Removing last underscores
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-08 13:04:21 -05:00
Lysandre
0dd579c9cf
Docs for v4.3.0
2021-02-08 18:53:24 +01:00
Sylvain Gugger
45aaf5f7ab
A few fixes in the documentation ( #10033 )
2021-02-08 05:02:01 -05:00
Patrick von Platen
89be094e29
[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown ( #9921 )
...
* add big bird
* change teacher to mentor
* add proposal template
* adapt template
* delete old template
* correct some links
* finish template
* create big bird from template
* add big bird
* improve boxes
* finish boxes
* add pointers for BigBird
* finish big bird
* up
* up
* up
* up
* apply lysandres and sylvains suggestions
* delete bogus file
* correct markdown
* try different style
* try different style
* finalize
2021-02-05 15:47:54 +03:00
Sylvain Gugger
3be965c5db
Update doc for pre-release ( #10014 )
...
* Update doc for pre-release
* Use stable as default
* Use the right commit :facepalms:
2021-02-04 16:52:27 -05:00
Sylvain Gugger
b72f16b3ec
Fix doc for TFConverBertModel
2021-02-04 10:14:46 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM
( #9128 )
...
* initiliaze bart4causalLM
* create BartDecoderWrapper, setters/getters
* delete spaces
* forward and additional methods
* update cache function, loss function, remove ngram* params in data class.
* add bartcausallm, bartdecoder testing
* correct bart for causal lm
* remove at
* add mbart as well
* up
* fix typo
* up
* correct
* add pegasusforcausallm
* add blenderbotforcausallm
* add blenderbotsmallforcausallm
* add marianforcausallm
* add test for MarianForCausalLM
* add Pegasus test
* add BlenderbotSmall test
* add blenderbot test
* fix a fail
* fix an import fail
* a fix
* fix
* Update modeling_pegasus.py
* fix models
* fix inputs_embeds setting getter
* adapt tests
* correct repo utils check
* finish test improvement
* fix tf models as well
* make style
* make fix-copies
* fix copies
* run all tests
* last changes
* fix all tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-04 11:56:12 +03:00
yylun
5442a11f5f
fix steps_in_epoch variable in trainer when using max_steps ( #9969 )
...
* fix steps_in_epoch variable when using max_steps
* redundant sentence
* Revert "redundant sentence"
This reverts commit ad5c0e9b6e
.
* remove redundant sentence
Co-authored-by: wujindou <wujindou@sogou-inc.com>
2021-02-03 09:30:37 -05:00
Patrick von Platen
d6217fb30c
Wav2Vec2 ( #9659 )
...
* add raw scaffold
* implement feat extract layers
* make style
* remove +
* correctly convert weights
* make feat extractor work
* make feature extraction proj work
* run forward pass
* finish forward pass
* Succesful decoding example
* remove unused files
* more changes
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer norm architecture
* finish 2nd structure
* add model tests
* finish tests for tok and model
* clean-up
* make style
* finish docstring for model and config
* make style
* correct docstring
* correct tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* make style
* apply sylvains suggestions
* apply lysandres suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply sylvains suggestions
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-02 15:52:10 +03:00
Sylvain Gugger
de38a6e4d2
Fix 9918 ( #9932 )
...
* Initial work
* Fix doc styler and other models
2021-02-02 05:22:20 -05:00
Patrick von Platen
0e3be1ac8f
Add new model docs ( #9667 )
...
* add new model logic
* fix docs
* change structure
* improve add_new_model
* push new changes
* up
* up
* correct spelling
* improve docstring
* correct line length
* update readme
* correct links
* correct typos
* only add rst file for now
* Apply suggestions from code review 1
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* finish adding all suggestions
* make style
* apply Niels feedback
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply sylvains suggestions
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-01 17:55:10 +03:00