Suraj Patil
f6e74a63ca
Add m2m100 ( #10236 )
...
* m2m_100
* no layernorm_embedding
* sinusoidal positional embeddings
* update pos embeddings
* add default config values
* tokenizer
* add conversion script
* fix config
* fix pos embed
* remove _float_tensor
* update tokenizer
* update lang codes
* handle lang codes
* fix pos embeds
* fix spm key
* put embedding weights on device
* remove qa and seq classification heads
* fix convert script
* lang codes on one line
* fix embeds
* fix tokenizer
* fix tokenizer
* add fast tokenizer
* style
* M2M100MT => M2M100
* fix copyright, style
* tokenizer converter
* vocab file
* remove fast tokenizer
* fix embeds
* fix tokenizer
* fix tests
* add tokenizer tests
* add integration test
* quality
* fix model name
* fix test
* doc
* doc
* fix doc
* add copied from statements
* fix tokenizer tests
* apply review suggestions
* fix urls
* fix shift_tokens_right
* apply review suggestions
* fix
* fix doc
* add lang code to id
* remove unused function
* update checkpoint names
* fix copy
* fix tokenizer
* fix checkpoint names
* fix merge issue
* style
2021-03-06 22:14:16 +05:30
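The "sinusoidal positional embeddings" this PR adds are deterministic rather than learned. A minimal pure-Python sketch of the fairseq-style table (first half sines, second half cosines of geometrically spaced frequencies; the real M2M100 code builds the same table as a tensor, so treat this only as an illustration):

```python
import math

def sinusoidal_embeddings(num_positions, dim):
    # fairseq-style table: for each position, the first dim//2 entries are
    # sines and the last dim//2 are cosines of geometrically spaced frequencies
    half = dim // 2
    scale = math.log(10000.0) / (half - 1)
    table = []
    for pos in range(num_positions):
        freqs = [pos * math.exp(-i * scale) for i in range(half)]
        table.append([math.sin(f) for f in freqs] + [math.cos(f) for f in freqs])
    return table

emb = sinusoidal_embeddings(4, 8)  # 4 positions, embedding dim 8
```

Because the table is a fixed function of position, it never appears in the checkpoint, which is why the conversion script can "remove _float_tensor" buffers.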
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen
3c733f3208
Update ibert.rst ( #10445 )
2021-02-28 19:03:49 +03:00
Sehoon Kim
63645b3b11
I-BERT model support ( #10153 )
...
* IBertConfig, IBertTokenizer added
* IBert model names modified
* tokenizer bugfix
* embedding -> QuantEmbedding
* quant utils added
* quant_mode added to configuration
* QuantAct added, Embedding layer + QuantAct addition
* QuantAct added
* unused path removed, QKV quantized
* self attention layer all quantized, except softmax
* temporary commit
* all linear layers quantized
* quant_utils bugfix
* bugfix: requantization missing
* IntGELU added
* IntSoftmax added
* LayerNorm implemented
* LayerNorm implemented all
* names changed: roberta->ibert
* config does not inherit from RoBERTa
* No support for CausalLM
* static quantization added, quantize_model.py removed
* import modules uncommented
* copyrights fixed
* minor bugfix
* quant_modules, quant_utils merged as one file
* import * fixed
* unused runfile removed
* make style run
* configuration.py docstring fixed
* refactoring: comments removed, function name fixed
* unused dependency removed
* typo fixed
* comments (Copied from), assertion string added
* refactoring: super(..) -> super(), etc.
* refactoring
* refactoring
* make style
* refactoring
* cuda -> to(x.device)
* weight initialization removed
* QuantLinear set_param removed
* QuantEmbedding set_param removed
* IntLayerNorm set_param removed
* assert string added
* assertion error message fixed
* is_decoder removed
* enc-dec arguments/functions removed
* Converter removed
* quant_modules docstring fixed
* convert_slow_tokenizer rolled back
* quant_utils docstring fixed
* unused arguments, e.g. use_cache, removed from config
* weight initialization condition fixed
* x_min, x_max initialized with small values to avoid div-zero exceptions
* testing code for ibert
* test emb, linear, gelu, softmax added
* test ln and act added
* style reformatted
* force_dequant added
* error tests overridden
* make style
* Style + Docs
* force dequant tests added
* Fix fast tokenizer in init
* Fix doc
* Remove space
* docstring, IBertConfig, chunk_size
* test_modeling_ibert refactoring
* quant_modules.py refactoring
* e2e integration test added
* tokenizers removed
* IBertConfig added to tokenizer_auto.py
* bugfix
* fix docs & test
* fix style num 2
* final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
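I-BERT's QuantLinear/QuantEmbedding/QuantAct modules are all built on symmetric linear quantization. A hedged, list-based sketch of that core mapping (purely illustrative; the library implements this with tensors and additional bookkeeping such as the x_min/x_max running statistics the commits mention):

```python
def symmetric_quantize(values, num_bits=8):
    # map floats to signed integers sharing one scale, as integer-only
    # inference requires; clamping guards the extreme rounding case
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax or 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

q, scale = symmetric_quantize([-1.0, 0.0, 0.5, 1.0])
dequantized = [qi * scale for qi in q]             # approximate reconstruction
```

The IntGELU, IntSoftmax, and IntLayerNorm layers then approximate their float counterparts using only such integer values and scales.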
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply Suraj's recommendations
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
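The "make padding work" step is the heart of a feature extractor's call method: pad raw float sequences to a common length and emit a matching attention mask. A simplified, list-based sketch of that behavior (illustrative only; the real utilities support several padding strategies, truncation, and tensor return types):

```python
def pad_features(batch, pad_value=0.0):
    # right-pad each 1-D feature sequence to the batch max length and
    # record which positions are real (1) vs. padding (0)
    max_len = max(len(seq) for seq in batch)
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in batch]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return padded, mask

padded, mask = pad_features([[0.1, 0.2], [0.3]])
```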
Sylvain Gugger
9e147d31f6
Deprecate prepare_seq2seq_batch ( #10287 )
...
* Deprecate prepare_seq2seq_batch
* Fix last tests
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* More review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-22 12:36:16 -05:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2 (the 1.5B model surpassed human performance on Su… ( #10018 )
...
* Integrate DeBERTa v2 (the 1.5B model surpassed human performance on SuperGLUE); add DeBERTa v2 900M, 1.5B models;
* DeBERTa-v2
* Fix v2 model loading issue (#10129 )
* Doc members
* Update src/transformers/models/deberta/modeling_deberta.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Address Sylvain's comments
* Address Patrick's comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
Suraj Patil
6fc940ed09
Add mBART-50 ( #10154 )
...
* add tokenizer for mBART-50
* update tokenizers
* make src_lang and tgt_lang optional
* update tokenizer test
* add setter
* update docs
* update conversion script
* update docs
* update conversion script
* update tokenizer
* update test
* update docs
* doc
* address Sylvain's suggestions
* fix test
* fix formatting
* nits
2021-02-15 20:58:54 +05:30
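The src_lang/tgt_lang handling this tokenizer adds reflects mBART-50's input format, where the language code is *prefixed* to the sequence ([lang_code] X </s>) rather than appended as in the original mBART. A sketch with placeholder ids (the actual ids come from the tokenizer's vocabulary, so treat these numbers as assumptions):

```python
def format_mbart50_source(token_ids, lang_code_id, eos_id=2):
    # mBART-50 source format: [lang_code] X </s>
    # (the original mBART instead used: X </s> [lang_code])
    return [lang_code_id] + token_ids + [eos_id]

ids = format_mbart50_source([10, 11], lang_code_id=250004)
```

Making src_lang and tgt_lang optional, with setters, lets users switch translation direction on one tokenizer instance instead of constructing a new one.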
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC ( #10089 )
...
* add wav2vec2CTC and deprecate for maskedlm
* remove from docs
2021-02-09 03:49:02 -05:00
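Wav2Vec2ForCTC adds a CTC head whose per-frame argmax predictions are decoded with the standard greedy CTC rule: collapse consecutive repeats, then drop the blank symbol. A minimal sketch of that rule (the blank id of 0 is an assumption for illustration; the real mapping comes from the tokenizer's vocabulary):

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    # collapse runs of identical ids, then remove the CTC blank
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank_id:
            out.append(t)
        prev = t
    return out

decoded = ctc_greedy_decode([0, 3, 3, 0, 4, 4, 4, 0])
```

Note that a blank between two identical ids keeps them distinct: [1, 0, 1] decodes to two tokens, which is exactly why CTC needs the blank symbol.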
Sylvain Gugger
b72f16b3ec
Fix doc for TFConvBertModel
2021-02-04 10:14:46 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM
( #9128 )
...
* initialize BartForCausalLM
* create BartDecoderWrapper, setters/getters
* delete spaces
* forward and additional methods
* update cache function, loss function, remove ngram* params in data class.
* add bartcausallm, bartdecoder testing
* correct bart for causal lm
* remove at
* add mbart as well
* up
* fix typo
* up
* correct
* add pegasusforcausallm
* add blenderbotforcausallm
* add blenderbotsmallforcausallm
* add marianforcausallm
* add test for MarianForCausalLM
* add Pegasus test
* add BlenderbotSmall test
* add blenderbot test
* fix a fail
* fix an import fail
* a fix
* fix
* Update modeling_pegasus.py
* fix models
* fix inputs_embeds setting getter
* adapt tests
* correct repo utils check
* finish test improvement
* fix tf models as well
* make style
* make fix-copies
* fix copies
* run all tests
* last changes
* fix all tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-04 11:56:12 +03:00
Patrick von Platen
d6217fb30c
Wav2Vec2 ( #9659 )
...
* add raw scaffold
* implement feat extract layers
* make style
* remove +
* correctly convert weights
* make feat extractor work
* make feature extraction proj work
* run forward pass
* finish forward pass
* Successful decoding example
* remove unused files
* more changes
* add wav2vec tokenizer
* add new structure
* fix run forward
* add other layer norm architecture
* finish 2nd structure
* add model tests
* finish tests for tok and model
* clean-up
* make style
* finish docstring for model and config
* make style
* correct docstring
* correct tests
* change checkpoints to fairseq
* fix examples
* finish wav2vec2
* make style
* apply Sylvain's suggestions
* apply Lysandre's suggestions
* change print to log.info
* re-add assert statement
* add input_values as required input name
* finish wav2vec2 tokenizer
* Update tests/test_tokenization_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* apply Sylvain's suggestions
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-02 15:52:10 +03:00
Stefan Schweter
5ed5a54684
ADD BORT ( #9813 )
...
* tests: add integration tests for new Bort model
* bort: add conversion script from GluonNLP to Transformers 🚀
* bort: minor cleanup (BORT -> Bort)
* add docs
* make fix-copies
* clean doc a bit
* correct docs
* Update docs/source/model_doc/bort.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/bort.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* correct dialogpt doc
* correct link
* Update docs/source/model_doc/bort.rst
* Update docs/source/model_doc/dialogpt.rst
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-27 21:25:11 +03:00
abhishek thakur
f617490e71
ConvBERT Model ( #9717 )
...
* finalize convbert
* finalize convbert
* fix
* fix
* fix
* push
* fix
* tf image patches
* fix torch model
* tf tests
* conversion
* everything aligned
* remove print
* tf tests
* fix tf
* make tf tests pass
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* add electra test again
* fix doc
* fix doc again
* fix doc again
* Update src/transformers/modeling_tf_pytorch_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/conv_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* conv_bert -> convbert
* more fixes from review
* add conversion script
* dont use pretrained embed
* unused config
* suggestions from julien
* some more fixes
* p -> param
* fix copyright
* fix doc
* Update src/transformers/models/convbert/configuration_convbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* comments from reviews
* fix-copies
* fix style
* revert shape_list
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-27 03:20:09 -05:00
NielsRogge
d1370d29b1
Add DeBERTa head models ( #9691 )
...
* Add DebertaForMaskedLM, DebertaForTokenClassification, DebertaForQuestionAnswering
* Add docs and fix quality
* Fix Deberta not having pooler
2021-01-20 10:18:50 -05:00
NielsRogge
e45eba3b1c
Improve LayoutLM ( #9476 )
...
* Add LayoutLMForSequenceClassification and integration tests
Improve docs
Add LayoutLM notebook to list of community notebooks
* Make style & quality
* Address comments by @sgugger, @patrickvonplaten and @LysandreJik
* Fix rebase with master
* Reformat in one line
* Improve code examples as requested by @patrickvonplaten
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart ( #9497 )
...
* make templates ready
* make add_new_model_command_ready
* finish tf bart
* prepare tf mbart
* finish tf bart
* add tf mbart
* add marian
* prep pegasus
* add tf pegasus
* push blenderbot tf
* add blenderbot
* add blenderbot small
* clean-up
* make fix copy
* define Blenderbot tokenizer
* fix
* up
* make style
* add to docs
* add copy statements
* overwrite changes
* improve
* fix docs
* finish
* fix last slow test
* fix missing git conflict line
* fix blenderbot
* up
* fix blenderbot small
* load changes
* finish copied from
* upload fix
2021-01-12 02:06:32 +01:00
Qbiwan
ecfcac223c
Improve documentation coverage for Phobert ( #9427 )
...
* first commit
* change phobert to phoBERT as per author in overview
* v3 and v4 both run on the same code, hence there is no need to differentiate them
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-06 10:04:32 -05:00
Qbiwan
be898998bb
Improve documentation coverage for Herbert ( #9428 )
...
* first commit
* changed XLMTokenizer to HerbertTokenizer in code example
2021-01-06 09:13:43 -05:00
Sylvain Gugger
bcb55d33ce
Upgrade styler to better handle lists ( #9423 )
...
* Add missing lines before a new list.
* Update doc styler and restyle some files.
* Fix docstrings of LED and Longformer
2021-01-06 07:46:17 -05:00
NielsRogge
b7e548976f
Fix URLs to TAPAS notebooks ( #9435 )
2021-01-06 07:20:41 -05:00
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models ( #9343 )
...
* first try
* remove old template
* finish bart
* finish mbart
* delete unnecessary line
* init pegasus
* save intermediate
* correct pegasus
* finish pegasus
* remove cookie cutter leftover
* add marian
* finish blenderbot
* replace in file
* correctly split blenderbot
* delete "old" folder
* correct "add statement"
* adapt config for tf comp
* correct configs for tf
* remove ipdb
* fix more stuff
* fix mbart
* push pegasus fix
* fix mbart
* more fixes
* fix research projects code
* finish docs for bart, mbart, and marian
* delete unnecessary file
* correct attn typo
* correct configs
* remove pegasus for seq class
* correct peg docs
* correct peg docs
* finish configs
* further improve docs
* add copied from statements to mbart
* fix copied from in mbart
* add copy statements to marian
* add copied from to marian
* add pegasus copied from
* finish pegasus
* finish copied from
* Apply suggestions from code review
* make style
* backward comp blenderbot
* apply Lysandre's and Sylvain's suggestions
* apply suggestions
* push last fixes
* fix docs
* fix tok tests
* fix imports code style
* fix doc
2021-01-05 22:00:05 +01:00
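All of the split-out seq2seq models share a shift_tokens_right helper that builds decoder inputs from labels. A list-based sketch of its logic (the library version operates on tensors, but the idea is the same: shift right, prepend the decoder start token, and map the -100 ignore index back to the pad token):

```python
def shift_tokens_right(input_ids, pad_token_id, decoder_start_token_id):
    # build decoder inputs from labels: drop the last token, prepend the
    # decoder start token, and replace -100 (ignored positions) with pad
    shifted = []
    for row in input_ids:
        new_row = [decoder_start_token_id] + row[:-1]
        shifted.append([pad_token_id if t == -100 else t for t in new_row])
    return shifted

out = shift_tokens_right([[5, 6, 7, 2]], pad_token_id=1, decoder_start_token_id=2)
```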
Patrick von Platen
189387e9b2
LED ( #9278 )
...
* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply Lysandre's suggestions
* apply Sylvain's suggestions
* Apply suggestions from code review
2021-01-05 13:14:30 +01:00
Sugeeth
314cca2842
Fix documentation links always pointing to master. ( #9217 )
...
* Use extlinks to point hyperlink with the version of code
* Point to version on release and master until then
* Apply style
* Correct links
* Add missing backtick
* Simple missing backtick after all.
Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-01-05 06:18:48 -05:00
Qbiwan
086718ac6e
Improve documentation coverage for Bertweet ( #9379 )
...
* bertweet docs coverage
* style doc max len 119
* maxlen style rst
* run main() from style_doc
* changed according to comments
2021-01-04 13:12:59 -05:00
Patrick von Platen
52b3a05e83
[Bart doc] Fix outdated statement ( #9299 )
...
* fix bart doc
* fix docs
2020-12-24 14:47:53 +01:00
Sylvain Gugger
1fc7119181
Fix script that check objects are documented ( #9259 )
2020-12-22 11:12:58 -05:00
Suraj Patil
f4432b7e01
add base model classes to bart subclassed models ( #9230 )
...
* add base model classes to bart subclassed models
* add doc
2020-12-21 19:56:46 +05:30
Stas Bekman
3ff5e8955a
[t5 doc] typos ( #9199 )
...
* [t5 doc] typos
a few runaway backticks
@sgugger
* style
2020-12-18 16:03:26 -08:00
sandip
467e9158b4
Added TF CTRL Sequence Classification ( #9151 )
...
* Added TF CTRL Sequence Classification
* code refactor
2020-12-17 18:10:57 -05:00
Lysandre
e0790cca78
Fix TAPAS doc
2020-12-17 11:25:05 -05:00
Lysandre
ac2c7e398f
Remove erroneous character
2020-12-17 09:47:19 -05:00
Lysandre Debut
1aca3d6afa
Add disclaimer to TAPAS rst file ( #9167 )
...
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-12-17 09:34:06 -05:00
Lysandre Debut
07384baf7a
AutoModelForTableQuestionAnswering ( #9154 )
...
* AutoModelForTableQuestionAnswering
* Update src/transformers/models/auto/modeling_auto.py
* Style
2020-12-16 12:14:33 -05:00
Hayden Housen
34334662df
Add message to documentation that longformer doesn't support token_type_ids ( #9152 )
...
* Add message to documentation that longformer doesn't support token_type_ids
* Format changes
2020-12-16 11:06:14 -05:00
NielsRogge
1551e2dc6d
[WIP] Tapas v4 (tres) ( #9117 )
...
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Test PyTorch scatter
* Set to slow + minify
* Calm flake8 down
* Add add_pooling_layer argument to TapasModel
Fix comments by @sgugger and @patrickvonplaten
* Fix issue in docs + fix style and quality
* Clean up conversion script and add task parameter to TapasConfig
* Revert the task parameter of TapasConfig
Some minor fixes
* Improve conversion script and add test for absolute position embeddings
* Improve conversion script and add test for absolute position embeddings
* Fix bug with reset_position_index_per_cell arg of the conversion cli
* Add notebooks to the examples directory and fix style and quality
* Apply suggestions from code review
* Move from `nielsr/` to `google/` namespace
* Apply Sylvain's comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-12-15 17:08:49 -05:00
sandip
389aba34bf
Added TF OpenAI GPT-1 Sequence Classification ( #9105 )
...
* TF OpenAI GPT Sequence Classification
* Update src/transformers/models/openai/modeling_tf_openai.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-12-15 11:27:08 -05:00
Ahmed Elnaggar
a9c8bff724
Add parallelization support for T5EncoderModel ( #9082 )
...
* add model parallelism to T5EncoderModel
* remove decoder from T5EncoderModel parallelize
* update T5EncoderModel docs
* Extend T5ModelTest for T5EncoderModel
* fix T5Stack using range for get_device_map
* fix style
Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>
2020-12-14 12:00:45 -05:00
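parallelize() takes a device_map from device index to the list of layer indices that device should host, and "using range for get_device_map" refers to computing an even split of layers across devices. An illustrative sketch of such a split (this is not the library's get_device_map itself, just the idea behind it):

```python
def even_device_map(num_layers, devices):
    # assign ceil(num_layers / len(devices)) consecutive layers per device
    per = -(-num_layers // len(devices))   # ceiling division
    return {d: list(range(i * per, min((i + 1) * per, num_layers)))
            for i, d in enumerate(devices)}

dm = even_device_map(12, [0, 1, 2])        # 12 encoder layers over 3 GPUs
```

A map like this is what a caller would pass to parallelize() to override the default even split.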
Sylvain Gugger
1310e1a758
Enforce all objects in the main init are documented ( #9014 )
2020-12-10 11:57:12 -05:00
Sylvain Gugger
51e81e5895
MPNet copyright files ( #9015 )
2020-12-10 09:29:38 -05:00
Patrick von Platen
06971ac4f9
[Bart] Refactor - fix issues, consistency with the library, naming ( #8900 )
...
* remove make on the fly linear embedding
* start refactor
* big first refactor
* save intermediate
* save intermediat
* correct mask issue
* save tests
* refactor padding masks
* make all tests pass
* further refactor
* make pegasus test pass
* fix bool if
* fix leftover tests
* continue
* bart renaming
* delete torchscript test hack
* fix imports in tests
* correct shift
* fix docs and repo cons
* re-add fix for FSTM
* typo in test
* fix typo
* fix another typo
* continue
* hot fix 2 for tf
* small fixes
* refactor types linting
* continue
* finish refactor
* fix import in tests
* better bart names
* further refactor and add test
* delete hack
* apply Sylvain's and Lysandre's comments
* small perf improv
* further perf improv
* improv perf
* fix typo
* make style
* small perf improv
2020-12-09 20:55:24 +01:00
StillKeepTry
df2af6d8b8
Add MP Net 2 ( #9004 )
2020-12-09 10:32:43 -05:00
Sylvain Gugger
00aa9dbca2
Copyright ( #8970 )
...
* Add copyright everywhere missing
* Style
2020-12-07 18:36:34 -05:00
sandip
483e13273f
Add TFGPT2ForSequenceClassification based on DialogRPT ( #8714 )
...
* Add TFGPT2ForSequenceClassification based on DialogRPT
* TFGPT2ForSequenceClassification based on DialogRPT - refactored code, implemented review comments and added input processing
* code refactor for latest other TF PR
* code refactor
* code refactor
* Update modeling_tf_gpt2.py
2020-12-07 16:58:37 +01:00
sandip
f6b44e6190
Transfoxl seq classification ( #8868 )
...
* Transfoxl sequence classification
2020-12-02 10:08:32 -05:00
elk-cloner
4a9e502a36
Ctrl for sequence classification ( #8812 )
...
* add CTRLForSequenceClassification
* pass local test
* merge with master
* fix modeling test for sequence classification
* fix deco
* fix assert
2020-12-01 09:49:27 +01:00
Ahmed Elnaggar
40ecaf0c2b
Add T5 Encoder for Feature Extraction ( #8717 )
...
* Add T5 Encoder class for feature extraction
* fix T5 encoder add_start_docstrings indent
* update init with T5 encoder
* update init with TFT5ModelEncoder
* remove TFT5ModelEncoder
* change T5ModelEncoder order in init
* add T5ModelEncoder to transformers init
* clean T5ModelEncoder
* update init with TFT5ModelEncoder
* add TFModelEncoder for Tensorflow
* update init with TFT5ModelEncoder
* Update src/transformers/models/t5/modeling_t5.py
change output from Seq2SeqModelOutput to BaseModelOutput
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* remove encoder_outputs
1. remove encoder_outputs from the function call.
2. remove the encoder_outputs If statement.
3. remove isinstance from return_dict.
* Authorize missing decoder keys
* remove unnecessary input parameters
remove past_key_values and use_cache
* remove use_cache
remove use_cache from the forward method
* add docstring for T5 encoder
add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
* change return_dict to dot access
* add T5_ENCODER_INPUTS_DOCSTRING for TF T5
* change TFT5Encoder output type to BaseModelOutput
* remove unnecessary parameters for TFT5Encoder
* remove unnecessary if statement
* add import BaseModelOutput
* fix BaseModelOutput typo to TFBaseModelOutput
* update T5 doc with T5ModelEncoder
* add T5ModelEncoder to tests
* finish pytorch
* finish docs and mt5
* add mtf to init
* fix init
* remove n_positions
* finish PR
* Update src/transformers/models/mt5/modeling_mt5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/t5/modeling_t5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/t5/modeling_tf_t5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/mt5/modeling_tf_mt5.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-30 08:34:40 +01:00
Moussa Kamal Eddine
81fe0bf085
Add barthez model ( #8393 )
...
* Add init barthez
* Add barthez model, tokenizer and docs
BARThez is a pre-trained French seq2seq model that uses the BART objective.
* Apply suggestions from code review docs typos
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add license
* Change URLs scheme
* Remove barthez model keep tokenizer
* Fix style
* Fix quality
* Update tokenizer
* Add fast tokenizer
* Add fast tokenizer test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-27 12:31:42 -05:00
Patrick von Platen
2a6fbe6a40
[XLNet] Fix mems behavior ( #8567 )
...
* fix mems in xlnet
* fix use_mems
* fix use_mem_len
* fix use mems
* clean docs
* fix tf typo
* make xlnet tf for generation work
* fix tf test
* refactor use cache
* add use cache for missing models
* correct use_cache in generate
* correct use cache in tf generate
* fix tf
* correct getattr typo
* make sylvain happy
* change in docs as well
* do not apply to cookie cutter statements
* fix tf test
* make pytorch model fully backward compatible
2020-11-25 16:54:59 -05:00
Lysandre Debut
02f48b9bfc
Model parallel documentation ( #8741 )
...
* Add parallelize methods to the .rst files
* Correct format
2020-11-23 20:14:48 -05:00