Martin Malmsten
105dcb4162
Now passes style guide enforcement
2020-02-23 21:47:59 +01:00
Martin Malmsten
33eb8a165d
Added ,
2020-02-23 21:43:31 +01:00
Martin Malmsten
869b66f6b3
* Added support for Albert when fine-tuning for NER
...
* Added support for Albert in NER pipeline
* Added command-line options to examples/ner/run_ner.py to better control tokenization
* Added class AlbertForTokenClassification
* Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens
2020-02-23 21:13:03 +01:00
saippuakauppias
cafc4dfc7c
fix hardcoded path in examples readme
2020-02-22 11:12:38 -05:00
Patrick von Platen
fc38d4c86f
Improve special_token_id logic in run_generation.py and add tests ( #2885 )
...
* improving generation
* finalized special token behaviour for no_beam_search generation
* solved modeling_utils merge conflict
* solve merge conflicts in modeling_utils.py
* add run_generation improvements from PR #2749
* adapted language generation to not use hardcoded -1 if no padding token is available
* remove the -1 removal as hard coded -1`s are not necessary anymore
* add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown
* add slow language generation tests for pretrained models using hardcoded output with pytorch seed
* delete ipdb
* check that all generated tokens are valid
* renaming
* renaming Generation -> Generate
* make style
* updated so that generate_beam_search has same token behavior than generate_no_beam_search
* consistent return format for run_generation.py
* deleted pretrain lm generate tests -> will be added in another PR
* cleaning of unused if statements and renaming
* run_generate will always return an iterable
* make style
* consistent renaming
* improve naming, make sure generate function always returns the same tensor, add docstring
* add slow tests for all lmhead models
* make style and improve example comments modeling_utils
* better naming and refactoring in modeling_utils
* improving generation
* finalized special token behaviour for no_beam_search generation
* solved modeling_utils merge conflict
* solve merge conflicts in modeling_utils.py
* add run_generation improvements from PR #2749
* adapted language generation to not use hardcoded -1 if no padding token is available
* remove the -1 removal as hard coded -1`s are not necessary anymore
* add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown
* add slow language generation tests for pretrained models using hardcoded output with pytorch seed
* delete ipdb
* check that all generated tokens are valid
* renaming
* renaming Generation -> Generate
* make style
* updated so that generate_beam_search has same token behavior than generate_no_beam_search
* consistent return format for run_generation.py
* deleted pretrain lm generate tests -> will be added in another PR
* cleaning of unused if statements and renaming
* run_generate will always return an iterable
* make style
* consistent renaming
* improve naming, make sure generate function always returns the same tensor, add docstring
* add slow tests for all lmhead models
* make style and improve example comments modeling_utils
* better naming and refactoring in modeling_utils
* changed fast random lm generation testing design to more general one
* delete in old testing design in gpt2
* correct old variable name
* temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed
* adapted all fast random generate tests to new design
* better warning description in modeling_utils
* better comment
* better comment and error message
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-02-21 12:09:59 -05:00
maximeilluin
c749a543fa
Added CamembertForQuestionAnswering ( #2746 )
...
* Added CamembertForQuestionAnswering
* fixed camembert tokenizer case
2020-02-21 12:01:02 -05:00
Martin Malmsten
4452b44b90
Labels are now added to model config under id2label and label2id ( #2945 )
2020-02-21 08:53:05 -05:00
Sam Shleifer
53ce3854a1
New BartModel ( #2745 )
...
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
srush
889d3bfdbb
default arg fix ( #2937 )
2020-02-20 15:31:17 -05:00
srush
b662f0e625
Support for torch-lightning in NER examples ( #2890 )
...
* initial pytorch lightning commit
* tested multigpu
* Fix learning rate schedule
* black formatting
* fix flake8
* isort
* isort
* .
Co-authored-by: Check your git settings! <chris@chris-laptop>
2020-02-20 11:50:05 -05:00
VictorSanh
2ae98336d1
fix vocab size in binarized_data (distil): int16 vs int32
2020-02-18 16:17:35 +00:00
VictorSanh
0dbddba6d2
fix typo in hans example call
2020-02-17 20:19:57 +00:00
Manuel Romero
4e597c8e4d
Fix typo
2020-02-14 09:07:42 -05:00
Julien Chaumond
4d36472b96
[run_ner] Don't crash if fine-tuning local model that doesn't end with digit
2020-02-14 03:25:29 +00:00
Lysandre
f54a5bd37f
Raise error when using an mlm flag for a clm model + correct TextDataset
2020-02-12 13:23:14 -05:00
Lysandre
569897ce2c
Fix a few issues regarding the language modeling script
2020-02-12 13:23:14 -05:00
VictorSanh
ee5a6856ca
distilbert-base-cased weights + Readmes + omissions
2020-02-07 15:28:13 -05:00
Julien Chaumond
42f08e596f
[examples] rename run_lm_finetuning to run_language_modeling
2020-02-07 09:15:28 -05:00
Julien Chaumond
4f7bdb0958
[examples] Fix broken markdown
2020-02-07 09:15:28 -05:00
Peter Izsak
6fc3d34abd
Fix multi-gpu evaluation in run_glue.py
2020-02-06 16:38:55 -05:00
Julien Chaumond
ada24def22
[run_lm_finetuning] Tweak fix for non-long tensor, close #2728
...
see 1ebfeb7946
and #2728
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2020-02-05 12:49:18 -05:00
Yuval Pinter
d1ab1fab1b
pass langs parameter to certain XLM models ( #2734 )
...
* pass langs parameter to certain XLM models
Adding an argument that specifies the language the SQuAD dataset is in so language-sensitive XLMs (e.g. `xlm-mlm-tlm-xnli15-1024`) don't default to language `0`.
Allows resolution of issue #1799 .
* fixing from `make style`
* fixing style (again)
2020-02-04 17:12:42 -05:00
Lysandre
3bf5417258
Revert erroneous fix
2020-02-04 16:31:07 -05:00
Lysandre
1ebfeb7946
Cast to long when masking tokens
2020-02-04 15:56:16 -05:00
Lysandre
239dd23f64
[Follow up 213]
...
Masked indices should have -1 and not -100. Updating documentation + scripts that were forgotten
2020-02-03 16:08:05 -05:00
Antonio Carlos Falcão Petri
2ba147ecff
Fix typo in examples/utils_ner.py
...
"%s-%d".format() -> "{}-{}".format()
2020-02-01 11:10:57 -05:00
Lysandre
d18d47be67
run_generation style
2020-01-31 12:05:48 -05:00
Lysandre
7365f01d43
do_sample should be set to True in run_generation.py
2020-01-31 11:49:32 -05:00
Jared Nielsen
71a382319f
Correct documentation
2020-01-30 18:41:24 -05:00
Hang Le
f0a4fc6cd6
Add Flaubert
2020-01-30 10:04:18 -05:00
Jared Nielsen
adb8c93134
Remove lines causing a KeyError
2020-01-29 14:01:16 -05:00
Lysandre
335dd5e68a
Default save steps 50 to 500 in all scripts
2020-01-28 09:42:11 -05:00
Julien Chaumond
6b4c3ee234
[run_lm_finetuning] GPT2 tokenizer doesn't have a pad_token
...
ping @lysandrejik
2020-01-27 20:14:02 -05:00
VictorSanh
1ce3fb5cc7
update correct eval metrics (distilbert & co)
2020-01-24 11:45:22 -05:00
Julien Chaumond
1a8e87be4e
Line-by-line text dataset (including padding)
2020-01-21 16:57:38 -05:00
Julien Chaumond
b94cf7faac
change order
2020-01-21 16:57:38 -05:00
Julien Chaumond
2eaa8b6e56
Easier to not support this, as it could be confusing
...
cc @lysandrejik
2020-01-21 16:57:38 -05:00
Julien Chaumond
801aaa5508
make style
2020-01-21 16:57:38 -05:00
Julien Chaumond
56d4ba8ddb
[run_lm_finetuning] Train from scratch
2020-01-21 16:57:38 -05:00
jiyeon_baek
6d5049a24d
Fix typo in examples/run_squad.py
...
Rul -> Run
2020-01-17 11:22:51 -05:00
Lysandre
6e2c28a14a
Run SQuAD warning when the doc stride may be too high
2020-01-16 13:59:26 -05:00
thomwolf
258ed2eaa8
adding details in readme
2020-01-16 13:21:30 +01:00
thomwolf
50ee59578d
update formating - make flake8 happy
2020-01-16 13:21:30 +01:00
thomwolf
1c9333584a
formating
2020-01-16 13:21:30 +01:00
thomwolf
e25b6fe354
updating readme
2020-01-16 13:21:30 +01:00
thomwolf
27c7b99015
adding details in readme - moving file
2020-01-16 13:21:30 +01:00
Nafise Sadat Moosavi
99d4515572
HANS evaluation
2020-01-16 13:21:30 +01:00
Julien Chaumond
83a41d39b3
💄 super
2020-01-15 18:33:50 -05:00
Julien Chaumond
715fa638a7
Merge branch 'master' into from_scratch_training
2020-01-14 18:58:21 +00:00
Julien Chaumond
b803b067bf
Config to Model mapping
2020-01-13 20:05:20 +00:00
IWillPull
a3085020ed
Added repetition penalty to PPLM example ( #2436 )
...
* Added repetition penalty
* Default PPLM repetition_penalty to neutral
* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)
* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
VictorSanh
e83d9f1c1d
cleaning - change ' to " (black requirements)
2020-01-10 19:34:25 -05:00
VictorSanh
ebba9e929d
minor spring cleaning - missing configs + processing
2020-01-10 19:14:58 -05:00
Victor SANH
331065e62d
missing import
2020-01-10 11:42:53 +01:00
Victor SANH
414e9e7122
indents test
2020-01-10 11:42:53 +01:00
Victor SANH
3cdb38a7c0
indents
2020-01-10 11:42:53 +01:00
Victor SANH
ebd45980a0
Align with run_squad
+ fix some errors
2020-01-10 11:42:53 +01:00
Victor SANH
45634f87f8
fix Sampler in distributed training - evaluation
2020-01-10 11:42:53 +01:00
Victor SANH
af1ee9e648
Move torch.nn.utils.clip_grad_norm_
2020-01-10 11:42:53 +01:00
Lysandre
164c794eb3
New SQuAD API for distillation script
2020-01-10 11:42:53 +01:00
Lysandre
16ce15ed4b
DistilBERT token type ids removed from inputs in run_squad
2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b
Fix error with global step in run_squad.py
2020-01-08 11:39:00 +01:00
Oren Amsalem
43114b89ba
spelling correction ( #2434 )
2020-01-07 17:25:25 +01:00
Lysandre Debut
27c1b656cc
Fix error with global step in run_lm_finetuning.py
2020-01-07 16:16:12 +01:00
Simone Primarosa
176d3b3079
Add support for Albert and XLMRoberta for the Glue example ( #2403 )
...
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris
81d6841b4b
GPU text generation: mMoved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
karajan1001
f01b3e6680
fix #2399 an ImportError in official example ( #2400 )
...
* fix #2399 an ImportError in official example
* style
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
629b22adcf
[run_lm_finetuning] mask_tokens: document types
2020-01-01 12:55:10 -05:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
...
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin
a8d34e534e
Remove [--editable] in install instructions.
...
Use -e only in docs targeted at contributors.
If a user copy-pastes command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
81422c4e6d
Remove unused variables in examples.
2019-12-23 22:29:02 +01:00
Aymeric Augustin
c3783399db
Remove redundant requirements with transformers.
2019-12-23 19:17:27 +01:00
Aymeric Augustin
9fc8dcb2a0
Standardize import.
...
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
1c62e87b34
Use built-in open().
...
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
d6eaf4e6d2
Update comments mentioning Python 2.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
75a23d24af
Remove import fallbacks.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899
Remove sys.version_info[0] == 2 or 3.
2019-12-22 18:38:42 +01:00
Aymeric Augustin
6b2200fc88
Remove u-prefixes.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
c824d15aa1
Remove __future__ imports.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
7e98e211f0
Remove unittest.main() in test modules.
...
This construct isn't used anymore these days.
Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.
Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204
Switch test files to the standard test_*.py scheme.
2019-12-22 14:15:13 +01:00
Aymeric Augustin
c11b3e2926
Sort imports for optional third-party libraries.
...
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin
939148b050
Fix F401 flake8 warning (x28).
...
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
783a616999
Fix F401 flake8 warning (x88 / 116).
...
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
80327a13ea
Fix F401 flake8 warning (x152 / 268).
...
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
fa2ccbc081
Fix E266 flake8 warning (x90).
2019-12-22 10:59:08 +01:00
Aymeric Augustin
2ab78325f0
Fix F821 flake8 warning (x47).
...
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin
631be27078
Fix E722 flake8 warnings (x26).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
b0f7db73cd
Fix E741 flake8 warning (x14).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
fd2f17a7a1
Fix E714 flake8 warning (x8).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
5eab3cf6bc
Fix W605 flake8 warning (x5).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
7dce8dc7ac
Fix E731 flake8 warning (x3).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
357db7098c
Fix E712 flake8 warning (x1).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
f9c5317db2
Fix E265 flake8 warning (x1).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
28e608a2c2
Remove trailing whitespace from all Python files.
...
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
158e82e061
Sort imports with isort.
...
This is the result of:
$ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin
fa84ae26d6
Reformat source code with black.
...
This is the result of:
$ black --line-length 119 examples templates transformers utils hubconf.py setup.py
There's a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.
This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
...
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
344126fe58
move example to mm-imdb folder
2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
...
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
...
[BREAKING CHANGE] Setting all ignored index to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf
1ab25c49d3
Merge branch 'master' into pr/2115
2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0
fix merge
2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming
2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
...
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming
2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
...
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf
cfa0380515
Merge branch 'master' into generation_sampler
2019-12-21 14:12:52 +01:00
thomwolf
300ec3003c
fixing run_generation example - using torch.no_grad
2019-12-21 14:02:19 +01:00
thomwolf
1c37746892
fixing run_generation
2019-12-21 13:52:49 +01:00
thomwolf
8a2be93b4e
fix merge
2019-12-21 13:31:28 +01:00
Thomas Wolf
562f864038
Merge branch 'master' into fix-xlnet-squad2.0
2019-12-21 12:48:10 +01:00
Thomas Wolf
59941c5d1f
Merge pull request #2189 from stefan-it/xlmr
...
Add support for XLM-RoBERTa
2019-12-20 13:26:38 +01:00
Julien Chaumond
a5a06a851e
[doc] Param name consistency
2019-12-19 16:24:20 -05:00
Aidan Kierans
1718fb9e74
Minor/basic text fixes ( #2229 )
...
* Small clarification
Matches line 431 to line 435 for additional clarity and consistency.
* Fixed minor typo
The letter "s" was previously omitted from the word "docstrings".
2019-12-19 16:23:18 -05:00
Francesco
62c1fc3c1e
Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script
2019-12-19 09:50:56 -05:00
Ejar
284572efc0
Updated typo on the link
...
Updated documentation due to typo
2019-12-19 09:36:43 -05:00
Stefan Schweter
a26ce4dee1
examples: add XLM-RoBERTa to glue script
2019-12-19 02:23:01 +01:00
thomwolf
3d2096f516
further cleanup
2019-12-18 11:50:54 +01:00
thomwolf
83bc5235cf
Merge branch 'master' into pr/2189
2019-12-17 11:47:32 +01:00
Thomas Wolf
f061606277
Merge pull request #2164 from huggingface/cleanup-configs
...
[SMALL BREAKING CHANGE] Cleaning up configuration classes - Adding Model Cards
2019-12-17 09:10:16 +01:00
Lysandre
18a879f475
fix #2180
2019-12-16 16:44:29 -05:00
Lysandre
d803409215
Fix run squad evaluate during training
2019-12-16 16:31:38 -05:00
Stefan Schweter
71b4750517
examples: add support for XLM-RoBERTa to run_ner script
2019-12-16 16:37:27 +01:00
thomwolf
dc667ce1a7
double check cc @LysandreJik
2019-12-14 09:56:27 +01:00
thomwolf
7140363e09
update bertabs
2019-12-14 09:44:53 +01:00
Thomas Wolf
a52d56c8d9
Merge branch 'master' into cleanup-configs
2019-12-14 09:43:07 +01:00
erenup
c7780700f5
Merge branch 'refs/heads/squad_roberta'
...
# Conflicts:
# transformers/data/processors/squad.py
2019-12-14 08:53:59 +08:00
erenup
8e9526b4b5
add multiple processing
2019-12-14 08:43:58 +08:00
Lysandre
c8ed1c82c8
[SQUAD] Load checkpoint when evaluating without training
2019-12-13 12:13:48 -05:00
Pierric Cistac
5a5c4349e8
Fix summarization to_cpu
doc
2019-12-13 10:02:33 -05:00
thomwolf
47f0e3cfb7
cleaning up configuration classes
2019-12-13 14:33:24 +01:00
erenup
9b312f9d41
initial version for roberta squad
2019-12-13 14:51:40 +08:00
LysandreJik
7296f1010b
Cleanup squad and add allow train_file and predict_file usage
2019-12-12 13:01:04 -05:00
LysandreJik
3fd71c4431
Update example scripts
2019-12-12 12:08:54 -05:00
Alan deLevie
fbf5455a86
Fix typo in examples/run_glue.py args declaration.
...
deay -> decay
2019-12-12 11:16:19 -05:00
Bilal Khan
6aa919469d
Update run_xnli to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:31:22 -06:00
Bilal Khan
89896fe04f
Update run_ner to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:31:22 -06:00
Bilal Khan
fdc05cd68f
Update run_squad to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:31:22 -06:00
Bilal Khan
854ec5784e
Update run_glue to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:30:36 -06:00
LysandreJik
b72f9d340e
Correct index in script
2019-12-10 18:33:17 -05:00
LysandreJik
6a73382706
Complete warning + cleanup
2019-12-10 14:33:24 -05:00
Lysandre
dc4e9e5cb3
DataParallel for SQuAD + fix XLM
2019-12-10 19:21:20 +00:00
Rémi Louf
07bc8efbc3
add greedy decoding and sampling
2019-12-10 17:27:50 +01:00
Rémi Louf
4b82c485de
remove misplaced summarization documentation
2019-12-10 09:13:33 -05:00
Thomas Wolf
e57d00ee10
Merge pull request #1984 from huggingface/squad-refactor
...
[WIP] Squad refactor
2019-12-10 11:07:26 +01:00
Suvrat Bhooshan
df3961121f
Add MMBT Model to Transformers Repo
2019-12-09 18:36:48 -08:00
Julien Chaumond
1d18930462
Harmonize no_cuda
flag with other scripts
2019-12-09 20:37:55 -05:00
Rémi Louf
f7eba09007
clean for release
2019-12-09 20:37:55 -05:00
Rémi Louf
2a64107e44
improve device usage
2019-12-09 20:37:55 -05:00
Rémi Louf
c0707a85d2
add README
2019-12-09 20:37:55 -05:00
Rémi Louf
ade3cdf5ad
integrate ROUGE
2019-12-09 20:37:55 -05:00
Rémi Louf
076602bdc4
prevent BERT weights from being downloaded twice
2019-12-09 20:37:55 -05:00
Rémi Louf
a1994a71ee
simplified model and configuration
2019-12-09 20:37:55 -05:00
Rémi Louf
3a9a9f7861
default output dir to documents dir
2019-12-09 20:37:55 -05:00
Rémi Louf
693606a75c
update the docs
2019-12-09 20:37:55 -05:00
Rémi Louf
2403a66598
give transformers API to BertAbs
2019-12-09 20:37:55 -05:00
Rémi Louf
ba089c780b
share pretrained embeddings
2019-12-09 20:37:55 -05:00
Rémi Louf
9660ba1cbd
Add beam search
2019-12-09 20:37:55 -05:00
Rémi Louf
1c71ecc880
load the pretrained weights for encoder-decoder
...
We currently save the pretrained_weights of the encoder and decoder in
two separate directories `encoder` and `decoder`. However, for the
`from_pretrained` function to operate with automodels we need to
specify the type of model in the path to the weights.
The path to the encoder/decoder weights is handled by the
`PreTrainedEncoderDecoder` class in the `save_pretrained` function. Sice
there is no easy way to infer the type of model that was initialized for
the encoder and decoder we add a parameter `model_type` to the function.
This is not an ideal solution as it is error prone, and the model type
should be carried by the Model classes somehow.
This is a temporary fix that should be changed before merging.
2019-12-09 20:37:55 -05:00
Rémi Louf
07f4cd73f6
update function to add special tokens
...
Since I started my PR the `add_special_token_single_sequence` function
has been deprecated for another; I replaced it with the new function.
2019-12-09 20:37:55 -05:00
Bilal Khan
79526f82f5
Remove unnecessary epoch variable
2019-12-09 16:24:35 -05:00
Bilal Khan
9626e0458c
Add functionality to continue training from last saved global_step
2019-12-09 16:24:35 -05:00
Bilal Khan
2d73591a18
Stop saving current epoch
2019-12-09 16:24:35 -05:00
Bilal Khan
0eb973b0d9
Use saved optimizer and scheduler states if available
2019-12-09 16:24:35 -05:00
Bilal Khan
a03fcf570d
Save tokenizer after each epoch to be able to resume training from a checkpoint
2019-12-09 16:24:35 -05:00
Bilal Khan
f71b1bb05a
Save optimizer state, scheduler state and current epoch
2019-12-09 16:24:35 -05:00
LysandreJik
2a4ef098d6
Add ALBERT and XLM to SQuAD script
2019-12-09 10:46:47 -05:00
Lysandre Debut
00c4e39581
Merge branch 'master' into squad-refactor
2019-12-09 10:41:15 -05:00
Thomas Wolf
5482822a2b
Merge pull request #2046 from jplu/tf2-ner-example
...
Add NER TF2 example.
2019-12-06 12:12:22 +01:00
LysandreJik
e9217da5ff
Cleanup
...
Improve global visibility on the run_squad script, remove unused files and fixes related to XLNet.
2019-12-05 16:01:51 -05:00
LysandreJik
9ecd83dace
Patch evaluation for impossible values + cleanup
2019-12-05 14:44:57 -05:00
VictorSanh
35ff345fc9
update requirements
2019-12-05 12:07:04 -05:00
VictorSanh
552c44a9b1
release distilm-bert
2019-12-05 10:14:58 -05:00
Rosanne Liu
ee53de7aac
Pr for pplm ( #2060 )
...
* license
* changes
* ok
* Update paper link and commands to run
* pointer to uber repo
2019-12-05 09:20:07 -05:00
Julien Plu
9200a759d7
Add few tests on the TF optimization file with some info in the documentation. Complete the README.
2019-12-05 12:56:43 +01:00
thomwolf
75a97af6bc
fix #1450 - add doc
2019-12-05 11:26:55 +01:00
LysandreJik
f7e4a7cdfa
Cleanup
2019-12-04 16:24:15 -05:00
LysandreJik
cca75e7884
Kill the demon spawn
2019-12-04 15:42:29 -05:00
LysandreJik
9ddc3f1a12
Naming update + XLNet/XLM evaluation
2019-12-04 10:37:00 -05:00
thomwolf
5bfcd0485e
fix #1991
2019-12-04 14:53:11 +01:00
Julien Plu
ecb923da9c
Create a NER example similar to the Pytorch one. It takes the same options, and can be run the same way.
2019-12-04 09:43:15 +01:00
LysandreJik
de276de1c1
Working evaluation
2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5
[pplm] split classif head into its own file
2019-12-03 22:07:25 +00:00
VictorSanh
48cbf267c9
Use full dataset for eval (SequentialSampler in Distributed setting)
2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623
[pplm] Update S3 links
...
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1
Always use SequentialSampler during evaluation
...
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
Julien Chaumond
3b48806f75
[pplm] README: add setup + tweaks
2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890
readme
...
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc
[pplm] move scripts under examples/pplm/
2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627
generate_text_pplm now works with batch_size > 1
2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe
Changed order of some parameters to be more consistent. Identical results.
2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc
Added additional check for url and path in discriminator model params
2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015
Imrpovements: model_path renamed pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to discriminator's when discrim is specified, sample = False by default but cli parameter introduced. To obtain identical samples call the cli with --sample
2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66
Removed global variable device
2019-12-03 10:14:02 -05:00
piero
7fd54b55a3
Added support for generic discriminators
2019-12-03 10:14:02 -05:00
piero
b0eaff36e6
Added a +1 to epoch when saving weights
2019-12-03 10:14:02 -05:00
piero
611961ade7
Added tqdm to preprocessing
2019-12-03 10:14:02 -05:00
piero
afc7dcd94d
Now run_pplm works on cpu. Identical output as before (when using gpu).
2019-12-03 10:14:02 -05:00
piero
61399e5afe
Cleaned perturb_past. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ffc2935405
Fix for making unditioned generation work. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
9f693a0c48
Cleaned generate_text_pplm. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
61a12f790d
Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ef47b2c03a
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ea12db3f5
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
08c6e456a3
Cleaned full_text_generation. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
6c9c131780
More cleanup for run_model. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ffe47c888
Improved device specification
2019-12-03 10:14:02 -05:00
piero
4f2164e40e
First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
821de121e8
Minor changes
2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c
Fixed minor bug when running training on cuda
2019-12-03 10:14:02 -05:00
piero
0b51fba20b
Added script for training a discriminator for pplm to use
2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe
Let's make PPLM great again
2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6
tokenizer white space: revert to previous behavior
2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d
rm extraneous import
2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e
fix the loss backward issue
...
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2
PPLM (squashed)
...
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
...
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3
Added Camembert to available models
2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003
Refactor logs and fix loss bug
2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a
Cleanup & Evaluation now works
2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german
2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d
add distilbert + update run_xnli wrt run_glue
2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385
remove fstrings
2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac
update readme
2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954
uniformize w/ the cache_dir update
2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f
move xnli processor (and utils) to transformers/data/processors
2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7
change default for XNLI: dev --> test
2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf
mbert reproducibility results
2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd
xnli - output_mode consistency
2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a
add xnli examples/README.md
2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e
run_xnli + utils_xnli
2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c
Uniformize #1952
2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78
ALBERT for SQuAD
2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb
Run GLUE and remove LAMB
2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2
Minor bug fixes on run_ner.py
2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2
resize model when special tokenizer present
2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0
fix #1894
2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b
[doc] homogenize instructions slightly
2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237
Works for XLNet
2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e
wip
2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0
update the documentation
2019-11-21 14:41:19 -05:00
Juha Kiili
2cf3447e0a
Glue: log in Valohai-compatible JSON format too
2019-11-21 12:35:25 +02:00
Thomas Wolf
0cdfcca24b
Merge pull request #1860 from stefan-it/camembert-for-token-classification
...
[WIP] Add support for CamembertForTokenClassification
2019-11-21 10:56:07 +01:00
Jin Young Sohn
e70cdf083d
Cleanup TPU bits from run_glue.py
...
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py .
We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
2019-11-20 17:54:34 -05:00
Lysandre
454455c695
fix #1879
2019-11-20 09:42:48 -05:00
Stefan Schweter
e7cf2ccd15
distillation: add German distilbert model
2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383
typo "deay" -> "decay"
2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1
camembert: add support for CamemBERT in run_ner example
2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb
[camembert] Upload to s3 + rename script
2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871
Update demo_camembert.py with new classes
2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7
Move demo_camembert.py to examples/contrib
2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c
Add demo_camembert.py
2019-11-16 00:11:07 -05:00
Xu Hongshen
ca99a2d500
Update example readme
2019-11-15 14:55:26 +08:00
Xu Hongshen
7da3ef24cd
add is_impossible tensor to model inputs during fine-tuning xlnet on squad2.0
2019-11-15 14:18:53 +08:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
...
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT
2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
...
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
...
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
Rémi Louf
2276bf69b7
update the examples, docs and template
2019-11-14 20:38:02 +01:00
Lysandre
d7929899da
Specify checkpoint in saved file for run_lm_finetuning.py
2019-11-14 10:49:00 -05:00
ronakice
2e31176557
fix multi-gpu eval
2019-11-12 05:55:11 -05:00
Stefan Schweter
2b07b9e5ee
examples: add DistilBert support for NER fine-tuning
2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044
Fix run_bertology.py
...
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f
[run_tf_glue] Add comment for context
2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af
misc doc
2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
...
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
...
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
...
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
...
It takes considerate amount of time (~10 min) to parse the examples to features, it is good to have a progress-bar to track this
2019-11-05 10:34:27 +02:00
thomwolf
89d6272898
Fix #1623
2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master
2019-11-04 11:41:26 +01:00
thomwolf
ad90868627
Update example readme
2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training
2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53
run_tf_glue MRPC evaluation only for MRPC
2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb
Added mixed precision support to benchmarks.py
2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241
[run_generation] Fix generation with batch_size>1
2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering
2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil*
2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
...
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
...
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
...
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta
2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation
2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f
format utils for summarization
2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466
fix bug with padding mask + add corresponding test
2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e
rename seq2seq to encoder_decoder
2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61
revert renaming of lm_labels to ltr_lm_labels
2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312
update docstrings; rename lm_labels to more explicit ltr_lm_labels
2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691
resolve PR comments
2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb
Evaluation fixed.
2019-10-28 10:18:58 -04:00
altsoph
438f2730a0
Evaluation code fixed.
2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8
here's one big commit
2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e
fix test of truncation function
2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8
extend works in-place
2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe
pad sequence with 0, mask with -1
2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa
run_tf_glue works with all tasks
2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51
Add roberta to doc
2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f
Add Roberta to run_ner.py
2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b
[release] fix table weirdness
2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd
[RELEASE] DistilRoBERTa
2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7
fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when parallelly using multi-gpu.
2019-10-23 20:27:13 +08:00
Julien Chaumond
ef1b8b2ae5
[CTRL] warn if generation prompt does not start with a control code
...
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed
Remove
2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868
Option to benchmark only one of the two libraries
2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21
gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well
2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b
gradient norm clipping should be done right before calling the optimiser
2019-10-20 22:33:56 +01:00
LysandreJik
7dd29ed2f1
Benchmarks example script
2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a
Add speed log to examples/run_squad.py
...
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
leo-du
ecd15667f3
fix repetition penalty
2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036
fix data processing in script
2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061
add training pipeline (formatting temporary)
2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0
use two different tokenizers for storyand summary
2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4
add Model2Model to __init__
2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35
correct the truncation and padding of dataset
2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd
add separator between data import and train
2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e
specify in readme that both datasets are required
2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587
test the full story processing
2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859
truncation function is fully tested
2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8
wip commit, switching computers
2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb
Merge branch 'master' of https://github.com/huggingface/transformers
2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762
install tensorboard
2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
...
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
...
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad
add readme information
2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0
Add NER finetuning details by @stefan-it in example readme
2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622
[ner] Honor args.overwrite_cache
2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43
add seqeval as requirement for examples
2019-10-15 09:17:31 +02:00
thomwolf
66adb71734
update to transformers
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a
Add option to predict on test set
2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1
Add cli argument for configuring labels
2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64
Make file reading more robust
2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757
Add option to predict on test set
2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d
Add cli argument for configuring labels
2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1
Make file reading more robust
2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:14:53 +02:00
hlums
74c5035808
Fix token order in xlnet preprocessing.
2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15
add instructions to fetch the dataset
2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d
delegate the padding with special tokens to the tokenizer
2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f
process the raw CNN/Daily Mail dataset
...
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Simon Layton
4e6a55751a
Force einsum to fp16
2019-10-14 11:12:41 -04:00
Rémi Louf
67d10960ae
load and prepare CNN/Daily Mail data
...
We write a function to load an preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674
Added automatic mixed precision and XLA options to run_tf_glue.py
2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944
Minor enhancements to run_tf_glue.py
2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005
Add citation bibtex
2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace
read parameters from CLI, load model & tokenizer
2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b
add base for seq2seq finetuning
2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00