Commit Graph

51 Commits

Author SHA1 Message Date
Sylvain Gugger
76f36e183a
Add a community page to the docs (#9682) 2021-01-20 04:54:36 -05:00
Patrick von Platen
b39bd763e8
Update README.md 2021-01-19 12:25:51 +01:00
Patrick von Platen
126fd281bc
Update README.md 2021-01-13 16:55:59 +01:00
NielsRogge
e45eba3b1c
Improve LayoutLM (#9476)
* Add LayoutLMForSequenceClassification and integration tests

Improve docs

Add LayoutLM notebook to list of community notebooks

* Make style & quality

* Address comments by @sgugger, @patrickvonplaten and @LysandreJik

* Fix rebase with master

* Reformat in one line

* Improve code examples as requested by @patrickvonplaten

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
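The PR above adds `LayoutLMForSequenceClassification`. Below is a minimal, hedged sketch of how the new head might be called; the checkpoint name, the toy words, and the bounding boxes are illustrative assumptions, not excerpts from the PR's own examples.

```python
# Hedged sketch: using the sequence-classification head added in #9476.
# Checkpoint, words and (0-1000 normalized) boxes below are assumptions.
import torch
from transformers import LayoutLMTokenizer, LayoutLMForSequenceClassification

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForSequenceClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=2
)

words = ["Hello", "world"]
normalized_word_boxes = [[637, 773, 693, 782], [698, 773, 733, 782]]

# LayoutLM needs one bounding box per token, so repeat a word's box for each of its sub-tokens.
token_boxes = []
for word, box in zip(words, normalized_word_boxes):
    token_boxes.extend([box] * len(tokenizer.tokenize(word)))
token_boxes = [[0, 0, 0, 0]] + token_boxes + [[1000, 1000, 1000, 1000]]  # [CLS] / [SEP] boxes

encoding = tokenizer(" ".join(words), return_tensors="pt")
outputs = model(
    input_ids=encoding["input_ids"],
    bbox=torch.tensor([token_boxes]),
    attention_mask=encoding["attention_mask"],
    labels=torch.tensor([1]),
)
print(outputs.loss, outputs.logits)
```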
Aakash Tripathi
5a442a8db1
Updated DistilGPT-2 fine-tuning and generation notebook (#9494)

https://github.com/huggingface/transformers/pull/3177
2021-01-11 14:34:39 +01:00
NielsRogge
b7e548976f
Fix URLs to TAPAS notebooks (#9435) 2021-01-06 07:20:41 -05:00
Manuel Romero
7988edc031
Fix link to Notebook to fine-tune TAPAS (#9413)
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-06 03:44:52 -05:00
Manuel Romero
c9553c0352
Fix link to Evaluate TAPAS Notebook (#9414) 2021-01-06 03:42:50 -05:00
Vasudev Gupta
21fc676645
add translation example (#9303)
* Created using Colaboratory

* mbart-training examples add

* link add

* Update description

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2020-12-25 14:47:49 +05:30
Sylvain Gugger
4d48973523
Update notebook table and transformers intro notebook (#9136) 2020-12-16 10:24:31 -05:00
NielsRogge
1551e2dc6d
[WIP] Tapas v4 (tres) (#9117)
* First commit: adding all files from tapas_v3

* Fix multiple bugs including soft dependency and new structure of the library

* Improve testing by adding torch_device to inputs and adding dependency on scatter

* Use Python 3 inheritance rather than Python 2

* First draft model cards of base sized models

* Remove model cards as they are already on the hub

* Fix multiple bugs with integration tests

* All model integration tests pass

* Remove print statement

* Add test for convert_logits_to_predictions method of TapasTokenizer

* Incorporate suggestions by Google authors

* Fix remaining tests

* Change position embeddings sizes to 512 instead of 1024

* Comment out positional embedding sizes

* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

* Added more model names

* Fix truncation when no max length is specified

* Disable torchscript test

* Make style & make quality

* Quality

* Address CI needs

* Test the Masked LM model

* Fix the masked LM model

* Truncate when overflowing

* More much needed docs improvements

* Fix some URLs

* Some more docs improvements

* Test PyTorch scatter

* Set to slow + minify

* Calm flake8 down

* Add add_pooling_layer argument to TapasModel

Fix comments by @sgugger and @patrickvonplaten

* Fix issue in docs + fix style and quality

* Clean up conversion script and add task parameter to TapasConfig

* Revert the task parameter of TapasConfig

Some minor fixes

* Improve conversion script and add test for absolute position embeddings

* Improve conversion script and add test for absolute position embeddings

* Fix bug with reset_position_index_per_cell arg of the conversion cli

* Add notebooks to the examples directory and fix style and quality

* Apply suggestions from code review

* Move from `nielsr/` to `google/` namespace

* Apply Sylvain's comments

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-12-15 17:08:49 -05:00
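The TAPAS commit above introduces `TapasTokenizer` (with the `convert_logits_to_predictions` method it tests) and moves checkpoints to the `google/` namespace. A hedged sketch of the resulting table question-answering flow follows; the checkpoint, toy table, and query are assumptions, and the `torch-scatter` soft dependency mentioned above must be installed.

```python
# Hedged sketch of TAPAS table QA; checkpoint, table and query are illustrative.
# Requires the torch-scatter soft dependency noted in the commit above.
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

model_name = "google/tapas-base-finetuned-wtq"  # assumed checkpoint under google/
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# TapasTokenizer expects the table as a pandas DataFrame of strings.
table = pd.DataFrame({"Actor": ["Brad Pitt", "Leonardo DiCaprio"],
                      "Number of movies": ["87", "53"]})
queries = ["How many movies does Leonardo DiCaprio have?"]

inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# The method integration-tested in this PR maps raw logits back to cell
# coordinates and aggregation operators.
coords, agg = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
print(coords, agg)
```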
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Patrick von Platen
f744b81572
add new notebooks (#8246) 2020-11-02 20:21:55 +01:00
Martin Monperrus
93354bc779
doc: fix typo (#8235) 2020-11-02 08:53:17 -05:00
Peter Bayerle
cc2e312ca3
Adding text classification with DistilBERT/TF notebook (#7964)
Looking at the current community notebooks, it seems that few are targeted at absolute beginners and even fewer are written with TensorFlow. This notebook describes everything a beginner would need to know, including how to save/load their model and use it for new predictions (a step often omitted in tutorials).

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-22 09:30:50 -04:00
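Since the notebook's selling point is the save/load/predict round trip that tutorials often skip, here is a hedged sketch of that pattern with a TensorFlow DistilBERT classifier; the checkpoint, output directory, and example sentence are assumptions rather than excerpts from the notebook.

```python
# Hedged sketch of the save -> reload -> predict round trip; names are illustrative.
import tensorflow as tf
from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = TFDistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# ... fine-tune here, e.g. with model.compile(...) and model.fit(...) ...

# Save the fine-tuned weights and tokenizer, then reload them for inference.
model.save_pretrained("./distilbert-clf")
tokenizer.save_pretrained("./distilbert-clf")

reloaded = TFDistilBertForSequenceClassification.from_pretrained("./distilbert-clf")
inputs = tokenizer("This movie was great!", return_tensors="tf")
logits = reloaded(inputs).logits
print(int(tf.argmax(logits, axis=-1)[0]))  # predicted class id
```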
zolekode
4abb7ffc18
added qg evaluation notebook (#7958)
* added qg evaluation notebook

* Update notebooks/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-10-22 11:02:12 +02:00
Dhaval Taunk
2ca0fae9a6
added script for fine-tuning roberta for sentiment analysis task (#7505) 2020-10-05 03:57:15 -04:00
Muhammad Harris
a42f62d34f
Train T5 in TensorFlow 2 Community Notebook (#7428)
* T5 community notebook added

* author link updated

* new colab link updated

Co-authored-by: harris <muhammad.harris@visionx.io>
2020-10-01 16:54:29 +02:00
Nadir El Manouzi
4b3e55bdcc
Add "Fine-tune ALBERT for sentence-pair classification" notebook to the community notebooks (#7255) 2020-09-21 04:25:22 -04:00
Dhaval Taunk
c183d81e27
added multilabel text classification notebook using distilbert to community notebooks (#7201)
* added multilabel classification using distilbert notebook to community notebooks

* added multilabel classification using distilbert notebook to community notebooks
2020-09-17 05:58:57 -04:00
Philipp Schmid
8235426ee8
New Community NB "Fine tune GPT-2 with Trainer class" (#7005) 2020-09-08 03:42:20 -04:00
elsanns
9f57e39f71
Add notebook on fine-tuning and interpreting Electra (#6321)
Co-authored-by: eliska <3648991+elisans@users.noreply.github.com>
2020-08-08 11:47:33 +02:00
Tanmay Thakur
842eb45606
Add new community notebook (#5824)
Signed-off-by: lordtt13 <thakurtanmay72@yahoo.com>
2020-07-28 04:25:12 -04:00
Patrick von Platen
223084e42b
Add Reformer to notebooks 2020-07-10 18:34:25 +02:00
Patrick von Platen
306f1a2695
Add Reformer MLM notebook (#5450)
* Add Reformer MLM notebook

* Update notebooks/README.md
2020-07-02 00:20:49 +02:00
Patrick von Platen
834b6884c5
Add benchmark notebook (#5312)
* add notebook

* Created with Colaboratory

* move notebook to correct folder

* correct link

* correct filename

* correct filename

* better name
2020-06-26 17:38:13 +02:00
Sylvain Gugger
7c41057d50
Add hugs (#5225) 2020-06-24 07:56:14 -04:00
Michaël Benesty
0cca61925c
Add link to new community notebook (optimization) (#5195)
* Add link to new community notebook (optimization)

related to https://github.com/huggingface/transformers/issues/4842#event-3469184635

This notebook is about benchmarking model training with/without dynamic padding optimization. 
https://github.com/ELS-RD/transformers-notebook 

Using dynamic padding on MNLI provides a **4.7x reduction in training time**, with the max pad length set to 512. The effect is strong because few examples in this dataset are much longer than 400 tokens. In practice it will depend on the dataset, but it always brings an improvement and, after more than 20 experiments listed in this [article](https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9q-21bf7129db9e?source=friends_link&sk=10a45a0ace94b3255643d81b6475f409), it does not seem to hurt performance.

Following advice from @patrickvonplaten, I opened the PR myself :-)

* Update notebooks/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-22 23:47:33 +02:00
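For context, dynamic padding means padding each batch only to the length of its longest member instead of a fixed maximum. The sketch below is a hedged illustration using a generic padding collator; the checkpoint name and the toy sentence pairs are assumptions, and the actual benchmark lives in the linked ELS-RD repository.

```python
# Hedged illustration of dynamic padding: pad each batch to its own longest
# sequence rather than a fixed max_length. Checkpoint and sentences are assumptions.
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

samples = [
    tokenizer("a short premise", "a short hypothesis"),
    tokenizer("a much longer premise " * 20, "and a longer hypothesis " * 10),
]

# Without dynamic padding one would tokenize with padding="max_length", max_length=512.
# With dynamic padding, the collator pads each batch only to its longest member.
collator = DataCollatorWithPadding(tokenizer=tokenizer)
loader = DataLoader(samples, batch_size=2, collate_fn=collator)

batch = next(iter(loader))
print(batch["input_ids"].shape)  # far smaller than (2, 512) for short batches
```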
Abhishek Kumar Mishra
3e5928c57d
Adding notebooks for Fine Tuning [Community Notebook] (#4732)
* Added links to more community notebooks

Added links to 3 more community notebooks from the git repo: https://github.com/abhimishra91/transformers-tutorials
Different Transformer models are fine-tuned on datasets using PyTorch

* Update README.md

* Update README.md

* Update README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-03 11:07:26 +02:00
Lorenzo Ampil
d3ef14f931
Add community notebook for sentiment span extraction (#4700) 2020-06-02 09:59:53 +02:00
Patrick von Platen
6f82aea66b
Include nlp notebook for model evaluation (#4676) 2020-05-29 19:38:56 +02:00
Iz Beltagy
91487cbb8e
[Longformer] fix model name in examples (#4653)
* fix longformer model names in examples

* a better name for the notebook
2020-05-29 13:12:35 +02:00
Iz Beltagy
fe5cb1a1c8
Adding community notebook (#4642)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-28 22:35:15 +02:00
Suraj Patil
aecaaf73a4
[Community notebooks] add longformer-for-qa notebook (#4652) 2020-05-28 22:27:22 +02:00
Lavanya Shukla
3cc2c2a150
add 2 colab notebooks (#4505)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-28 11:18:16 +02:00
ohmeow
5ddd8d6531
Add BART fine-tuning summarization community notebook (#4539)
* adding BART summarization how-to community notebook

* Update notebooks/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-26 16:43:41 +02:00
Patrick von Platen
0f6969b7e9
Better github link for Reformer Colab Notebook 2020-05-22 23:51:36 +02:00
Patrick von Platen
12e6afe900
Add Reformer colab to community notebooks 2020-05-22 17:03:34 +02:00
Nathan Cooper
cacb654c7f
Add Fine-tune DialoGPT on new datasets notebook (#4473) 2020-05-20 16:17:52 -04:00
Suraj Patil
5856999a9f
add T5 fine-tuning notebook [Community notebooks] (#4462)
* add T5 fine-tuning notebook [Community notebooks]

* Update README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-19 18:26:28 +02:00
Patrick von Platen
24538df919
[Community notebooks] General notebooks (#4441)
* Update README.md

* Update README.md

* Update README.md

* Update README.md
2020-05-18 20:23:57 +02:00
Morgan Funtowicz
84894974bd
Updated ONNX notebook link in README. 2020-05-14 22:40:59 +02:00
Funtowicz Morgan
db0076a9df
Conversion script to export transformers models to ONNX IR. (#4253)
* Added generic ONNX conversion script for PyTorch model.

* WIP initial TF support.

* TensorFlow/Keras ONNX export working.

* Print framework version info

* Add possibility to check the model is correctly loading on ONNX runtime.

* Remove quantization option.

* Specify ONNX opset version when exporting.

* Formatting.

* Remove unused imports.

* Make functions more generally reusable from other part of the code.

* isort happy.

* flake happy

* Export only feature-extraction for now

* Correctly check inputs order / filter before export.

* Removed task variable

* Fix invalid args call in load_graph_from_args.

* Fix invalid args call in convert.

* Fix invalid args call in infer_shapes.

* Raise exception and catch in caller function instead of exit.

* Add 04-onnx-export.ipynb notebook

* More WIP on the notebook

* Remove unused imports

* Simplify & remove unused constants.

* Export with constant_folding in PyTorch

* Let's try to put function args in the right order this time ...

* Disable external_data_format temporary

* ONNX notebook draft ready.

* Updated notebooks charts + wording

* Correct error while exporting last chart in notebook.

* Addressing @LysandreJik's comment.

* Set ONNX opset to 11 as default value.

* Set opset param mandatory

* Added ONNX export unittests

* Quality.

* flake8 happy

* Add keras2onnx dependency on extras["tf"]

* Pin keras2onnx on github master to v1.6.5

* Second attempt.

* Third attempt.

* Use the right repo URL this time ...

* Do the same for onnxconverter-common

* Added keras2onnx and onnxconverter-common 1.7.0 to support TF 2.2

* Correct commit hash.

* Addressing PR review: Optimization are enabled by default.

* Addressing PR review: small changes in the notebook

* setup.py comment about keras2onnx versioning.
2020-05-14 16:35:52 -04:00
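A hedged sketch of driving the exporter this PR introduces (the accompanying notebook is 04-onnx-export.ipynb). It assumes the helper is importable as `transformers.convert_graph_to_onnx.convert` with roughly this signature; the checkpoint and output path are illustrative.

```python
# Hedged sketch of exporting a checkpoint to ONNX with the script added in #4253.
# Assumption: the convert helper accepts framework/model/output/opset as below.
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

convert(
    framework="pt",                       # "pt" for PyTorch, "tf" for TensorFlow/Keras
    model="bert-base-cased",              # assumed checkpoint name
    output=Path("onnx/bert-base-cased.onnx"),
    opset=11,                             # opset 11 is the default the PR settled on
)
```

Only the "feature-extraction" graph is exported for now, per the commit notes above.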
Lysandre Debut
261c4ff4e2
Update notebooks (#3620)
* Update notebooks

* From local to global link

* from local links to *actual* global links
2020-04-06 14:32:39 -04:00
Kyeongpil Kang
3bedfd3347
Fix wrong link for the notebook file (#3344)
For the tutorial of "How to generate text", the URL link was wrong (it was linked to the tutorial of "How to train a language model").

I fixed the URL.
2020-03-19 17:22:47 +01:00
Patrick von Platen
efdb46b6e2
add link to blog post (#3326) 2020-03-18 13:24:28 +01:00
Morgan Funtowicz
012cbdb0f5
Updating colab links in notebooks README.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-05 15:34:15 +01:00
Morgan Funtowicz
30624f7056
Fix Colab links + install dependencies first.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-05 11:40:15 +01:00
Morgan Funtowicz
1bca97ec7f
Update notebook link and fix a few issues.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-04 21:19:33 +01:00
Julien Chaumond
256cbbc4a2
[doc] Fix link to how-to-train Colab 2020-03-04 12:01:45 -05:00