Commit Graph

297 Commits

Author SHA1 Message Date
ELanning
7ecff0ccbb
Fix typo in training (#5510) 2020-07-06 09:14:57 -04:00
Sylvain Gugger
6b735a7253
Tokenizer summary (#5467)
* Work on tokenizer summary

* Finish tutorial

* Link to it

* Apply suggestions from code review

Co-authored-by: Anthony MOI <xn1t0x@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add vocab definition

Co-authored-by: Anthony MOI <xn1t0x@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-07-02 17:07:42 -04:00
George Ho
84e56669af
Fix typo in glossary (#5466) 2020-07-02 09:19:33 -04:00
Patrick von Platen
d16e36c7e5
[Reformer] Add Masked LM Reformer (#5426)
* fix conflicts

* fix

* happy rebasing
2020-07-01 22:43:18 +02:00
Patrick von Platen
fe81f7d12c
finish reformer qa head (#5433) 2020-07-01 12:27:14 -04:00
Sylvain Gugger
6c55e9fc32
Fix dropdown bug in searches (#5440)
* Trigger CI

* Fix dropdown bug in searches
2020-07-01 11:02:59 -04:00
Sylvain Gugger
4ade7491f4
Fix examples titles and optimization doc page (#5408) 2020-07-01 08:11:25 -04:00
Sylvain Gugger
87716a6d07
Documentation for the Trainer API (#5383)
* Documentation for the Trainer API

* Address review comments

* Address comments
2020-06-30 11:43:43 -04:00
Sylvain Gugger
0607b88945
How to share model cards with the CLI (#5374)
* How to share model cards

* Switch the two options

* Fix bad copy/cut

* Julien's suggestion
2020-06-30 08:59:32 -04:00
Lysandre Debut
b9ee87f5c7
Doc for v3.0.0 (#5366)
* Doc for v3.0.0

* Update docs/source/_static/js/custom.js

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/_static/js/custom.js

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-06-29 11:08:54 -04:00
Lysandre
b62ca59527
Release: v3.0.0 2020-06-29 10:40:13 -04:00
Patrick von Platen
4bcc35cd69
[Docs] Benchmark docs (#5360)
* first doc version

* add benchmark docs

* fix typos

* improve README

* Update docs/source/benchmarks.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix naming and docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-29 16:08:57 +02:00
Julien Chaumond
c950fef545
[docs] Small tweaks to #5323 2020-06-29 14:24:33 +02:00
Sylvain Gugger
1af58c0706
New model sharing tutorial (#5323) 2020-06-27 11:10:02 -04:00
Thomas Wolf
601d4d699c
[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)
* remove references to old API in docstring - update data processors

* style

* fix tests - better type checking error messages

* better type checking

* include awesome fix by @LysandreJik for #5310

* updated doc and examples
2020-06-26 19:48:14 +02:00
Joe Davison
2ffef0d0c7
Training & fine-tuning quickstart (#5034)
* add initial fine-tuning guide

* split code blocks to smaller segments

* fix up trainer section of fine-tune doc

* a few last typos

* Update usage -> task summary link

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-06-25 15:11:11 -06:00
Lysandre Debut
364a5ae1f0
Refactor Code samples; Test code samples (#5036)
* Refactor code samples

* Test docstrings

* Style

* Tokenization examples

* Run rest of tests

* First step to testing source docs

* Style and BART comment

* Test the remainder of the code samples

* Style

* let to const

* Formatting fixes

* Ready for merge

* Fix fixture + Style

* Fix last tests

* Update docs/source/quicktour.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Addressing @sgugger's comments + Fix MobileBERT in TF

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-06-25 16:46:00 -04:00
Sylvain Gugger
d12ceb48ba
Tokenization tutorial (#5257)
* All done

* Link to the tutorial

* Typo fixes

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Add mention of the return_xxx args

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-06-24 18:43:20 -04:00
Sylvain Gugger
6894b486d0
Fix version controller links (for realsies) (#5251) 2020-06-24 12:13:43 -04:00
Sylvain Gugger
609e0c583f
Fix links (#5248) 2020-06-24 11:35:55 -04:00
Sylvain Gugger
7c41057d50
Add hugs (#5225) 2020-06-24 07:56:14 -04:00
Sylvain Gugger
173528e368
Add version control menu (#5222)
* Add version control menu

* Constify things

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-23 17:05:12 -04:00
Sylvain Gugger
417e492f1e
Quick tour (#5145)
* Quicktour part 1

* Update

* All done

* Typos

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address comments in quick tour

* Update docs/source/quicktour.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update from feedback

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-22 16:08:09 -04:00
Sylvain Gugger
1262495a91
Add TF auto model to the docs + fix sphinx warnings (#5187) 2020-06-22 14:43:52 -04:00
Sylvain Gugger
eb0ca71ef6
Update glossary (#5148)
* Update glossary

* Update docs/source/glossary.rst

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-22 08:30:49 -04:00
Vasily Shamporov
9a3f91088c
Add MobileBert (#4901)
* Add MobileBert

* Quality + Conversion script

* style

* Update src/transformers/modeling_mobilebert.py

* Links to S3

* Style

* TFMobileBert

Slight fixes to the pytorch MobileBert
Style

* MobileBertForMaskedLM (PT + TF)

* MobileBertForNextSentencePrediction (PT + TF)

* MobileFor{MultipleChoice, TokenClassification} (PT + TF)

* Tests + Auto

* Doc

* Tests

* Addressing @sgugger's comments

* Addressing @patrickvonplaten's comments

* Style

* Style

* Integration test

* style

* Model card

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-19 16:38:36 -04:00
Suraj Patil
18177a1a60
lm_labels => labels (#5080) 2020-06-18 09:16:29 +02:00
Sylvain Gugger
204ebc25e6
Update installation page and add contributing to the doc (#5084)
* Update installation page and add contributing to the doc

* Remove mention of symlinks
2020-06-17 14:01:10 -04:00
Sylvain Gugger
7291ea0bff
Reorganize documentation (#5064)
* Reorganize topics and add all models
2020-06-17 07:55:20 -04:00
Sylvain Gugger
011cc0be51
Fix all Sphinx warnings (#5068) 2020-06-16 16:50:02 -04:00
Yacine Jernite
49c5202522
Eli5 examples (#4968)
* add eli5 examples

* add dense query script

* query_di

* merging

* merging

* add_utils

* adds nearest neighbor wikipedia

* batch queries

* training_retriever

* new notebooks

* moved retriever training script

* finished wiki40b

* max_len_fix

* train_s2s

* retriever_batch_checkpointing

* cleanup

* merge

* dim_fix

* fix_indexer

* fix_wiki40b_snippets

* fix_embed_for_r

* fp32 index

* fix_sparse_q

* joint_training

* remove obsolete datasets

* add_passage_nn_results

* add_passage_nn_results

* add_batch_nn

* add_batch_nn

* add_data_scripts

* notebook

* notebook

* notebook

* fix_multi_gpu

* add_app

* full_caching

* full_caching

* notebook

* sparse_done

* images

* notebook

* add_image_gif

* with_Gif

* add_contr_image

* notebook

* notebook

* notebook

* train_functions

* notebook

* min_retrieval_length

* pandas_option

* notebook

* min_retrieval_length

* notebook

* notebook

* eval_Retriever

* notebook

* images

* notebook

* add_example

* add_example

* notebook

* fireworks

* notebook

* notebook

* joe's notebook comments

* app_update

* notebook

* notebook_link

* captions

* notebook

* adding RetriBert model

* add RetriBert to Auto

* change AutoLMHead to AutoSeq2Seq

* notebook downloads from hf models

* style_black

* style_black

* app_update

* app_update

* fix_app_update

* style

* style

* isort

* Delete WikiELI5training.ipynb

* Delete evaluate_eli5.py

* Delete WikiELI5explore.ipynb

* Delete ExploreWikiELI5Support.html

* Delete explainlikeimfive.py

* Delete wiki_snippets.py

* children before parent

* children before parent

* style_black

* style_black_only

* isort

* isort_new

* Update src/transformers/modeling_retribert.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* typo fixes

* app_without_asset

* cleanup

* Delete ELI5animation.gif

* Delete ELI5contrastive.svg

* Delete ELI5wiki_index.svg

* Delete choco_bis.svg

* Delete fireworks.gif

* Delete huggingface_logo.jpg

* Delete huggingface_logo.svg

* Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb

* Delete eli5_app.py

* Delete eli5_utils.py

* readme

* Update README.md

* unused imports

* moved_info

* default_beam

* ftuned model

* disclaimer

* Update src/transformers/modeling_retribert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* black

* add_doc

* names

* isort_Examples

* isort_Examples

* Add doc to index

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-06-16 16:36:58 -04:00
Sylvain Gugger
439aa1d6e9
Remove old section + caching in install (#5027) 2020-06-16 13:03:41 -04:00
Sylvain Gugger
f9f8a5312e
Add DistilBertForMultipleChoice (#5032)
* Add `DistilBertForMultipleChoice`
2020-06-15 18:31:41 -04:00
Anthony MOI
36434220fc
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510)
* Use tokenizers pre-tokenized pipeline

* failing pretokenized test

* Fix is_pretokenized in python

* add pretokenized tests

* style and quality

* better tests for batched pretokenized inputs

* tokenizers clean up - new padding_strategy - split the files

* [HUGE] refactoring tokenizers - padding - truncation - tests

* style and quality

* bump up required tokenizers version to 0.8.0-rc1

* switched padding/truncation API - simpler better backward compat

* updating tests for custom tokenizers

* style and quality - tests on pad

* fix QA pipeline

* fix backward compatibility for max_length only

* style and quality

* Various cleanups - add verbose

* fix tests

* update docstrings

* Fix tests

* Docs reformatted

* __call__ method documented

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-06-15 17:12:51 -04:00
Sam Shleifer
a9f1fc6c94
Add bart-base (#5014) 2020-06-15 13:29:26 -04:00
Suraj Patil
e93ccb3290
BartForQuestionAnswering (#4908) 2020-06-12 15:47:57 -04:00
Sylvain Gugger
538531cde5
Add AlbertForMultipleChoice (#4959)
* Add AlbertForMultipleChoice

* Make up to date and add all models to common tests
2020-06-12 14:20:19 -04:00
Suraj Patil
ef2dcdccaa
ElectraForQuestionAnswering (#4913)
* ElectraForQuestionAnswering

* update __init__

* add test for electra qa model

* add ElectraForQuestionAnswering in auto models

* add ElectraForQuestionAnswering in all_model_classes

* fix outputs, input_ids defaults to None

* add ElectraForQuestionAnswering in docs

* remove commented line
2020-06-10 15:17:52 -04:00
Sylvain Gugger
41a1d27cde
Add XLMRobertaForQuestionAnswering (#4855)
* Add XLMRobertaForQuestionAnswering

* Formatting

* Make test happy
2020-06-08 21:22:37 -04:00
Sylvain Gugger
37be3786cf
Clean documentation (#4849)
* Clean documentation
2020-06-08 11:28:19 -04:00
Sylvain Gugger
56d5d160cd
Add model and doc badges (#4811)
* Add badges for models and docs
2020-06-05 18:45:42 -04:00
Sylvain Gugger
5c0cfc2cf0
Add link to community models (#4804) 2020-06-05 15:29:20 -04:00
Sylvain Gugger
fa661ce749
Add model summary (#4789)
* Add model summary

* Add link to pretrained models
2020-06-05 12:22:50 -04:00
Julien Chaumond
99207bd112
Pipelines: miscellanea of QoL improvements and small features... (#4632)
* [hf_api] Attach all unknown attributes for future-proof compatibility

* [Pipeline] NerPipeline is really a TokenClassificationPipeline

* modelcard.py: I don't think we need to force the download

* Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer

* FillMaskPipeline: also output token in string form

* TextClassificationPipeline: option to return all scores, not just the argmax

* Update docs/source/main_classes/pipelines.rst
2020-06-03 03:51:31 -04:00
Julien Chaumond
b42586ea56
Fix CI after killing archive maps (#4724)
* 🐛 Fix model ids for BART and Flaubert
2020-06-02 10:21:09 -04:00
Lysandre
b43c78e5d3
Release: v2.11.0 2020-06-02 09:49:09 -04:00
Julien Chaumond
d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Patrick von Platen
56ee2560be
[Longformer] Better handling of global attention mask vs local attention mask (#4672)
* better api

* improve automatic setting of global attention mask

* fix longformer bug

* fix global attention mask in test

* fix global attn mask flatten

* fix slow tests

* update docstring

* update docs and make more robust

* improve attention mask
2020-05-29 17:58:42 +02:00
Patrick von Platen
9c17256447
[Longformer] Multiple choice for longformer (#4645)
* add multiple choice for longformer

* add models to docs

* adapt docstring

* add test to longformer

* add longformer for mc in init and modeling auto

* fix tests
2020-05-29 13:46:08 +02:00
Lysandre Debut
6a17688021
per_device instead of per_gpu/error thrown when argument unknown (#4618)
* per_device instead of per_gpu/error thrown when argument unknown

* [docs] Restore examples.md symlink

* Correct absolute links so that symlink to the doc works correctly

* Update src/transformers/hf_argparser.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Warning + reorder

* Docs

* Style

* not for squad

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-27 11:36:55 -04:00