Commit Graph

6103 Commits

Author SHA1 Message Date
Stas Bekman
b00eb4fb02
Testing Experimental CI Features (#9070) 2020-12-14 10:34:59 -05:00
Simon Brandeis
74daf1f954
Fixed a broken link in documentation (#9101) 2020-12-14 09:12:27 -05:00
Navjot
d6af344c9e
correct var name in TrainingArguments docstring (#9096) 2020-12-14 09:02:54 -05:00
Patrick von Platen
fa1ddced9e
[RAG, Bart] Align RAG, Bart cache with T5 and other models of transformers (#9098)
* fix rag

* fix slow test

* fix past in bart
2020-12-14 12:32:26 +01:00
Lysandre Debut
6587cf9f84
Patch *ForCausalLM model (#9092) 2020-12-14 00:39:55 -05:00
Julien Plu
51d9c569fa
Fix embeddings resizing in TF models (#8657)
* Resize the biases in same time than the embeddings

* Trigger CI

* Biases are not reset anymore

* Remove get_output_embeddings + better LM model detection in generation utils

* Apply style

* First test on BERT

* Update docstring + new name

* Apply the new resizing logic to all the models

* fix tests

* Apply style

* Update the template

* Fix naming

* Fix naming

* Apply style

* Apply style

* Remove unused import

* Revert get_output_embeddings

* Trigger CI

* Update num parameters

* Restore get_output_embeddings in TFPretrainedModel and add comments

* Style

* Add decoder resizing

* Style

* Fix tests

* Separate bias and decoder resize

* Fix tests

* Fix tests

* Apply style

* Add bias resizing in MPNet

* Trigger CI

* Apply style
2020-12-13 23:05:24 -05:00
Julien Chaumond
3552d0e0d8
[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)
* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-11 18:24:42 -05:00
Sylvain Gugger
29e4597950
Fix min_null_pred in the run_qa script (#9067) 2020-12-11 16:26:05 -05:00
Patrick von Platen
9cc9f4122e
Make ProphetNetModel really compatible with EncoderDecoder (#9033)
* improve

* finish

* upload model

* fix lm head

* fix test
2020-12-11 16:59:54 +01:00
dependabot[bot]
24f6cdeab6
Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062)
Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-12-11 10:32:43 -05:00
Lysandre Debut
91fa707217
Remove docs only check (#9065) 2020-12-11 10:27:31 -05:00
Sylvain Gugger
70527ba694
Fix PreTrainedTokenizer.pad when first inputs are empty (#9018)
* Fix PreTrainedTokenizer.pad when first inputs are empty

* Handle empty inputs case
2020-12-11 10:25:00 -05:00
Sylvain Gugger
783d7d2629
Reorganize examples (#9010)
* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new strucutre

* Remove test

* Update READMEs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-12-11 10:07:02 -05:00
Suraj Patil
86896de064
update tatoeba workflow (#9051) 2020-12-11 20:29:15 +05:30
Ganesh Kharad
7c8f5f6487
Create README.md (#8096)
* Create README.md

* Fix model card

Co-authored-by: Julien Chaumond <julien@huggingface.co>
2020-12-11 09:45:12 -05:00
RamonMamon
5527f78721
Create README.md (#8281)
* Create README.md

* Update model_cards/kiri-ai/distiluse-base-multilingual-cased-et/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-12-11 09:41:29 -05:00
joangines
c615df7422
Create README.md (#8751)
* Create README.md

* Update model_cards/Cinnamon/electra-small-japanese-generator/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-12-11 09:40:14 -05:00
Ahmed Abdelali
76df559383
QARiB Arabic and dialects models (#8796)
* Add QARiB models

* fix README.md

* Fix README.md

* Fix README.md

* Fix README.md

* Fix QARiB files

* add models card for QARiB models 860k, 1790k, and 1970k

* try to fix PR

* re-add files

* links aren't allowed here :)

Co-authored-by: Ahmed Abdelali <aabdelali@hbku.edu.qa>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
2020-12-11 09:38:38 -05:00
moniquebm
b161f1ae54
Update README.md (#8820) 2020-12-11 09:24:21 -05:00
Panggi Libersa Jasri Akadol
649d389dab
Initial README for t5-base-indonesian-summarization-cased model (#9028)
* Create README.md

Initial README for `t5-base-indonesian-summarization-cased` model

* Update README for t5-base-indonesian-summarization-cased

Typo in README, change from `small` to `base`
2020-12-11 09:18:16 -05:00
Panggi Libersa Jasri Akadol
5e794b6628
Create README.md (#9030)
Initial README for `t5-small-indonesian-summarization-cased` model
2020-12-11 09:17:29 -05:00
Cola
935e346959
🎨 Change nn.dropout to layer.Dropout (#9047) 2020-12-11 10:40:25 +01:00
Julien Plu
b01ddc9577
Remove value error (#8985)
* Remove value error

* Try a fix for parameter ordering

* Restore previous behavior

* Add documentation

* Review the comment
2020-12-10 17:17:19 -05:00
NatLun137
91ab02af28
Fix typo #9012 (#1) (#9038)
There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)
2020-12-10 16:41:00 -05:00
Sylvain Gugger
8d4bb02056
Refactor FLAX tests (#9034) 2020-12-10 15:57:39 -05:00
Sylvain Gugger
1310e1a758
Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
Sylvain Gugger
51e81e5895
MPNet copyright files (#9015) 2020-12-10 09:29:38 -05:00
Sylvain Gugger
35bffd70e2
Fix documention of book in LayoutLM (#9017) 2020-12-10 09:28:49 -05:00
Cola
c95de29e31
✏️ Fix typo (#9020) 2020-12-10 08:22:52 +01:00
Stas Bekman
5e637e6c69
[wip] [ci] doc-job-skip take #4 dry-run (#8980)
* ci-doc-job-skip-take-4

* wip

* wip

* wip

* wip

* skip yaml

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* ready to test

* yet another way

* trying with HEAD

* trying with head.sha

* trying with head.sha fix

* trying with head.sha fix wip

* undo

* try to switch to sha

* current branch

* current branch

* PR number check

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride
2020-12-09 15:36:36 -05:00
Patrick von Platen
06971ac4f9
[Bart] Refactor - fix issues, consistency with the library, naming (#8900)
* remove make on the fly linear embedding

* start refactor

* big first refactor

* save intermediate

* save intermediat

* correct mask issue

* save tests

* refactor padding masks

* make all tests pass

* further refactor

* make pegasus test pass

* fix bool if

* fix leftover tests

* continue

* bart renaming

* delete torchscript test hack

* fix imports in tests

* correct shift

* fix docs and repo cons

* re-add fix for FSTM

* typo in test

* fix typo

* fix another typo

* continue

* hot fix 2 for tf

* small fixes

* refactor types linting

* continue

* finish refactor

* fix import in tests

* better bart names

* further refactor and add test

* delete hack

* apply sylvains and lysandres commens

* small perf improv

* further perf improv

* improv perf

* fix typo

* make style

* small perf improv
2020-12-09 20:55:24 +01:00
Funtowicz Morgan
75627148ee
Flax Masked Language Modeling training example (#8728)
* Remove "Model" suffix from Flax models to look more 🤗

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Initial working (forward + backward) for Flax MLM training example.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simply code

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing comments, using module and moving to LM task.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore parameter name "module" wrongly renamed model.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore correct output ordering...

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Actually commit the example 😅

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add FlaxBertModelForMaskedLM after rebasing.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make it possible to initialize the training from scratch

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Reuse flax linen example of cross entropy loss

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added specific data collator for flax

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove todo for data collator

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added evaluation step

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added ability to provide dtype to support bfloat16 on TPU

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable flax tensorboard output

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable jax.pmap support.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Ensure batches are correctly sized to be dispatched with jax.pmap

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable bfloat16 with --fp16 cmdline args

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Correctly export metrics to tensorboard

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added dropout and ability to use it.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Effectively enable & disable during training and evaluation steps.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Oops.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable specifying kernel initializer scale

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Style.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added warmup step to the learning rate scheduler.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix typo.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Print training loss

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make style

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix linter issue (flake8)

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix model matching

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix dummies

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix non default dtype on Flax models

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use the same create_position_ids_from_input_ids for FlaxRoberta

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make Roberta attention as Bert

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix copy

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Wording.

Co-authored-by: Marc van Zee <marcvanzee@gmail.com>

Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
2020-12-09 17:13:56 +01:00
StillKeepTry
df2af6d8b8 Add MP Net 2 (#9004) 2020-12-09 10:32:43 -05:00
cronoik
8729109855
fixes #8968 (#9009) 2020-12-09 16:21:41 +01:00
Simon Brandeis
e977ed2142
Add the code_search_net dataset tag to CodeBERTa model cards (#9005) 2020-12-09 15:43:19 +01:00
Patrick von Platen
da37a21c89
push (#9008) 2020-12-09 15:14:33 +01:00
Sylvain Gugger
61abd50b98
Remove use of deprected method in Trainer HP search (#8996) 2020-12-09 09:13:41 -05:00
Sylvain Gugger
7e1d709e2a
Fix link to stable version in the doc navbar (#9007) 2020-12-09 09:11:39 -05:00
Patrick von Platen
02d0e0355c
Diverse beam search 2 (#9006)
* diverse beam search

* bug fixes

* bug fixes

* bug fix

* separate out diverse_beam_search function

* separate out diverse_beam_search function

* bug fix

* improve code quality

* bug fix

* bug fix

* separate out diverse beam search scorer

* code format

* code format

* code format

* code format

* add test

* code format

* documentation changes

* code quality

* add slow integration tests

* more general name

* refactor into logits processor

* add test

* avoid too much copy paste

* refactor

* add to docs

* fix-copies

* bug fix

* Revert "bug fix"

This reverts commit c99eb5a8dc.

* improve comment

* implement sylvains feedback

Co-authored-by: Ayush Jain <a.jain@sprinklr.com>
Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
2020-12-09 15:00:37 +01:00
Lysandre Debut
67ff1c314a
Templates overhaul 1 (#8993) 2020-12-08 18:00:07 -05:00
Sylvain Gugger
447808c85f
New squad example (#8992)
* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add tick

* Update README

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-08 14:39:29 -05:00
guillaume-be
7809eb82ae
Removed unused encoder_hidden_states and encoder_attention_mask (#8972)
* Removed unused `encoder_hidden_states` and `encoder_attention_mask` from MobileBert

* Removed decoder tests for MobileBert

* Removed now unnecessary import
2020-12-08 12:04:34 -05:00
Lysandre Debut
b7cdd00f15
Fix interaction of return_token_type_ids and add_special_tokens (#8854) 2020-12-08 12:04:01 -05:00
Sylvain Gugger
04c446f764
Make ModelOutput pickle-able (#8989) 2020-12-08 11:59:40 -05:00
Julien Chaumond
0d9e6ca9ed
[model_card] remove bogus testing changes 2020-12-08 09:58:45 -05:00
Julien Plu
bf7f79cd57
Optional layers (#8961)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Fix wrong model name

* Fix BART

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Address Patrick's comments

* Address Patrick's comments

* Add boolean processing for the inputs

* Take into account the optional layers

* Add missing/unexpected weights in the other models

* Apply style

* rename parameters

* Apply style

* Remove useless

* Remove useless

* Remove useless

* Update num parameters

* Fix tests

* Address Patrick's comment

* Remove useless attribute
2020-12-08 09:14:09 -05:00
Stas Bekman
9d7d0005b0
[training] SAVE_STATE_WARNING was removed in pytorch (#8979)
* [training] SAVE_STATE_WARNING was removed in pytorch

FYI `SAVE_STATE_WARNING` has been removed 3 days ago: pytorch/pytorch#46813

Fixes: #8232

@sgugger

* style, but add () to prevent autoformatters from botching it

* switch to try/except

* cleanup
2020-12-07 21:59:55 -08:00
Lysandre Debut
2ae7388eee
Check table as independent script (#8976) 2020-12-07 19:55:12 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Navjot
c108d0b5a4
add max_length to showcase the use of truncation (#8975) 2020-12-07 18:35:39 -05:00