Commit Graph

90 Commits

Author SHA1 Message Date
Nathan Cooper
cacb654c7f
Add Fine-tune DialoGPT on new datasets notebook (#4473) 2020-05-20 16:17:52 -04:00
Suraj Patil
5856999a9f
add T5 fine-tuning notebook [Community notebooks] (#4462)
* add T5 fine-tuning notebook [Community notebooks]

* Update README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-19 18:26:28 +02:00
Funtowicz Morgan
ca4a3f4da9
Adding optimizations block from ONNXRuntime. (#4431)
* Adding optimizations block from ONNXRuntime.

* Turn off external data format by default for PyTorch export.

* Correct the way use_external_format is passed through the cmdline args.
2020-05-18 20:32:33 +02:00
Patrick von Platen
24538df919
[Community notebooks] General notebooks (#4441)
* Update README.md

* Update README.md

* Update README.md

* Update README.md
2020-05-18 20:23:57 +02:00
Nikita
62427d0815
rerun notebook 02-transformers (#4341) 2020-05-15 10:33:08 -04:00
Morgan Funtowicz
84894974bd Updated ONNX notebook link in README. 2020-05-14 22:40:59 +02:00
Funtowicz Morgan
db0076a9df
Conversion script to export transformers models to ONNX IR. (#4253)
* Added generic ONNX conversion script for PyTorch model.

* WIP initial TF support.

* TensorFlow/Keras ONNX export working.

* Print framework version info

* Add possibility to check the model is correctly loading on ONNX runtime.

* Remove quantization option.

* Specify ONNX opset version when exporting.

* Formatting.

* Remove unused imports.

* Make functions more generally reusable from other parts of the code.

* isort happy.

* flake happy

* Export only feature-extraction for now

* Correctly check inputs order / filter before export.

* Removed task variable

* Fix invalid args call in load_graph_from_args.

* Fix invalid args call in convert.

* Fix invalid args call in infer_shapes.

* Raise exception and catch in caller function instead of exit.

* Add 04-onnx-export.ipynb notebook

* More WIP on the notebook

* Remove unused imports

* Simplify & remove unused constants.

* Export with constant_folding in PyTorch

* Let's try to put function args in the right order this time ...

* Disable external_data_format temporarily

* ONNX notebook draft ready.

* Updated notebooks charts + wording

* Correct error while exporting last chart in notebook.

* Addressing @LysandreJik comment.

* Set ONNX opset to 11 as default value.

* Set opset param mandatory

* Added ONNX export unittests

* Quality.

* flake8 happy

* Add keras2onnx dependency on extras["tf"]

* Pin keras2onnx on github master to v1.6.5

* Second attempt.

* Third attempt.

* Use the right repo URL this time ...

* Do the same for onnxconverter-common

* Added keras2onnx and onnxconverter-common to 1.7.0 to support TF2.2

* Correct commit hash.

* Addressing PR review: Optimizations are enabled by default.

* Addressing PR review: small changes in the notebook

* setup.py comment about keras2onnx versioning.
2020-05-14 16:35:52 -04:00
Patrick von Platen
839bfaedb2
[Docs, Notebook] Include generation pipeline (#4295)
* add first text for generation

* add generation pipeline to usage

* Created using Colaboratory

* correct docstring

* finish
2020-05-13 14:24:08 -04:00
Stefan Schweter
b5c6d3d4c7
notebooks: minor fix for community provided models example (#4025) 2020-04-28 09:12:25 +02:00
Jonathan Sum
0cec4fab7d typo: fine-grained token-leven
Changing from "fine-grained token-leven" to "fine-grained token-level"
2020-04-16 15:11:23 -04:00
Anthony MOI
b7cf9f43d2
Update tokenizers to 0.7.0-rc5 (#3705) 2020-04-10 14:23:49 -04:00
Lysandre Debut
261c4ff4e2
Update notebooks (#3620)
* Update notebooks

* From local to global link

* from local links to *actual* global links
2020-04-06 14:32:39 -04:00
Patrick von Platen
00ea100e96
add summarization and translation to notebook (#3478) 2020-03-27 11:05:37 -04:00
Kyeongpil Kang
8eeefcb576
Update 01-training-tokenizers.ipynb (typo issue) (#3343)
I found there are two grammar errors or typo issues in the explanation of the encoding properties.

The original sentences:
If your was made of multiple \"parts\" such as (question, context), then this would be a vector with for each token the segment it belongs to
If your has been truncated into multiple subparts because of a length limit (for BERT for example the sequence length is limited to 512), this will contain all the remaining overflowing parts.

I think "input" should be inserted after the phrase "If your".
2020-03-19 23:21:49 +01:00
Kyeongpil Kang
3bedfd3347
Fix wrong link for the notebook file (#3344)
For the tutorial of "How to generate text", the URL link was wrong (it was linked to the tutorial of "How to train a language model").

I fixed the URL.
2020-03-19 17:22:47 +01:00
Morgan Funtowicz
cae334c43c Improve fill-mask pipeline example in 03-pipelines notebook.
Remove hardcoded mask_token and use the value provided by the tokenizer.
2020-03-18 17:11:42 +01:00
Patrick von Platen
efdb46b6e2
add link to blog post (#3326) 2020-03-18 13:24:28 +01:00
Param bhavsar
b29fed790b Updated Tokenw ise in print statement to Token wise 2020-03-08 10:55:30 -04:00
Morgan Funtowicz
7ac47bfe69 Updated notebook dependencies for Colab.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-05 16:07:51 +01:00
Morgan Funtowicz
be02176a4b Fixing sentiment pipeline in 03-pipelines notebook.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-05 16:07:51 +01:00
Morgan Funtowicz
012cbdb0f5 Updating colab links in notebooks README.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-05 15:34:15 +01:00
Morgan Funtowicz
30624f7056 Fix Colab links + install dependencies first.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-05 11:40:15 +01:00
Morgan Funtowicz
1bca97ec7f Update notebook link and fix few working issues.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-04 21:19:33 +01:00
Julien Chaumond
256cbbc4a2
[doc] Fix link to how-to-train Colab 2020-03-04 12:01:45 -05:00
Funtowicz Morgan
71c8711970
Adding Docker images for transformers + notebooks (#3051)
* Added transformers-pytorch-cpu and gpu Docker images

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added automatic jupyter launch for Docker image.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Move image from alpine to Ubuntu to align with NVidia container images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added TRANSFORMERS_VERSION argument to Dockerfile.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added Pytorch-GPU based Docker image

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added Tensorflow images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use Python 3.7 as TensorFlow doesn't provide a 3.8-compatible wheel.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Remove double FROM instructions on transformers-pytorch-cpu image.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added transformers-tensorflow-gpu Docker image.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* use the correct ubuntu version for tensorflow-gpu

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added pipelines example notebook

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added transformers-cpu and transformers-gpu (including both PyTorch and TensorFlow) images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Docker images don't start jupyter notebook by default.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Tokenizers notebook

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Update images links

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Update Docker images to python 3.7.6 and transformers 2.5.1

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added 02-transformers notebook.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Trying to realign 02-transformers notebook ?

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added Transformer image schema

* Some tweaks on tokenizers notebook

* Removed old notebooks.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Attempt to provide table of content for each notebook

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Second attempt.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Reintroduce transformer image.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Keep trying

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* It's going to fly !

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Remaining of the Table of Content

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix inlined elements for the table of content

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Removed anaconda dependencies for Docker images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Removing notebooks ToC

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added LABEL to each docker image.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Removed old Dockerfile

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Directly use the context and include transformers from here.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Reduce overall size of compiled Docker images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Install jupyter by default and use CMD for easier launching of the images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Reduce number of layers in the images.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added README.md for notebooks.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix notebooks link in README

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix some wording issues.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added blog notebooks too.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing spelling errors in review comments.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

Co-authored-by: MOI Anthony <xn1t0x@gmail.com>
2020-03-04 11:45:57 -05:00
alberduris
81d6841b4b GPU text generation: Moved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
thomwolf
f31154cb9d Merge branch 'xlnet' 2019-07-16 11:51:13 +02:00
thomwolf
0bab55d5d5 [BIG] name change 2019-07-05 11:55:36 +02:00
thomwolf
c41f2bad69 WIP XLM + refactoring 2019-07-03 22:54:39 +02:00
thomwolf
62d78aa37e updating GLUE utils for compatibility with XLNet 2019-06-24 14:36:11 +02:00
chrislarson1
a8e071c690 added notebook to check correctness of the pytorch->tensorflow conversion 2019-06-19 23:08:08 -04:00
thomwolf
32167cdf4b remove convert_to_unicode and printable_text from examples 2018-11-26 23:33:22 +01:00
thomwolf
f920eff8c3 update readme 2018-11-17 08:42:45 +01:00
thomwolf
886cb49792 updating readme and notebooks 2018-11-16 14:31:15 +01:00
thomwolf
fd647e8c87 comparison masked LM ok 2018-11-16 11:04:31 +01:00
thomwolf
1de35b624b preparing for first release 2018-11-15 20:56:10 +01:00
thomwolf
907d3569c1 cleaning up SQuAD notebook - more explanation - fixing error 2018-11-06 11:13:43 +01:00
thomwolf
e6646751ac update notebooks 2018-11-05 15:02:50 +01:00
thomwolf
b705c9eff5 remove small script, moved notebooks to notebook folder 2018-11-05 14:55:08 +01:00