srush
889d3bfdbb
default arg fix ( #2937 )
2020-02-20 15:31:17 -05:00
srush
b662f0e625
Support for torch-lightning in NER examples ( #2890 )
...
* initial pytorch lightning commit
* tested multigpu
* Fix learning rate schedule
* black formatting
* fix flake8
* isort
* isort
* .
Co-authored-by: Check your git settings! <chris@chris-laptop>
2020-02-20 11:50:05 -05:00
VictorSanh
2ae98336d1
fix vocab size in binarized_data (distil): int16 vs int32
2020-02-18 16:17:35 +00:00
VictorSanh
0dbddba6d2
fix typo in hans example call
2020-02-17 20:19:57 +00:00
Manuel Romero
4e597c8e4d
Fix typo
2020-02-14 09:07:42 -05:00
Julien Chaumond
4d36472b96
[run_ner] Don't crash if fine-tuning local model that doesn't end with digit
2020-02-14 03:25:29 +00:00
Lysandre
f54a5bd37f
Raise error when using an mlm flag for a clm model + correct TextDataset
2020-02-12 13:23:14 -05:00
Lysandre
569897ce2c
Fix a few issues regarding the language modeling script
2020-02-12 13:23:14 -05:00
VictorSanh
ee5a6856ca
distilbert-base-cased weights + Readmes + omissions
2020-02-07 15:28:13 -05:00
Julien Chaumond
42f08e596f
[examples] rename run_lm_finetuning to run_language_modeling
2020-02-07 09:15:28 -05:00
Julien Chaumond
4f7bdb0958
[examples] Fix broken markdown
2020-02-07 09:15:28 -05:00
Peter Izsak
6fc3d34abd
Fix multi-gpu evaluation in run_glue.py
2020-02-06 16:38:55 -05:00
Julien Chaumond
ada24def22
[run_lm_finetuning] Tweak fix for non-long tensor, close #2728
...
see 1ebfeb7946
and #2728
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2020-02-05 12:49:18 -05:00
Yuval Pinter
d1ab1fab1b
pass langs parameter to certain XLM models ( #2734 )
...
* pass langs parameter to certain XLM models
Adding an argument that specifies the language the SQuAD dataset is in so language-sensitive XLMs (e.g. `xlm-mlm-tlm-xnli15-1024`) don't default to language `0`.
Allows resolution of issue #1799 .
* fixing from `make style`
* fixing style (again)
2020-02-04 17:12:42 -05:00
Lysandre
3bf5417258
Revert erroneous fix
2020-02-04 16:31:07 -05:00
Lysandre
1ebfeb7946
Cast to long when masking tokens
2020-02-04 15:56:16 -05:00
Lysandre
239dd23f64
[Follow up 213]
...
Masked indices should have -100 and not -1. Updating documentation + scripts that were forgotten
2020-02-03 16:08:05 -05:00
Antonio Carlos Falcão Petri
2ba147ecff
Fix typo in examples/utils_ner.py
...
"%s-%d".format() -> "{}-{}".format()
2020-02-01 11:10:57 -05:00
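The typo fixed in the entry above is easy to reproduce. The sketch below uses hypothetical values (`mode`, `index`) that are not taken from `examples/utils_ner.py`; it only shows why printf-style placeholders do nothing when passed through `str.format()`.

```python
# Hypothetical values, for illustration only (not from utils_ner.py).
mode, index = "train", 3

# str.format() only substitutes brace placeholders, so a printf-style
# template is returned unchanged and the arguments are silently ignored.
broken = "%s-%d".format(mode, index)

# Brace placeholders are what str.format() actually fills in.
fixed = "{}-{}".format(mode, index)

print(broken)
print(fixed)
```

No exception is raised in the broken case, which is presumably why the typo was easy to miss.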
Lysandre
d18d47be67
run_generation style
2020-01-31 12:05:48 -05:00
Lysandre
7365f01d43
do_sample should be set to True in run_generation.py
2020-01-31 11:49:32 -05:00
Jared Nielsen
71a382319f
Correct documentation
2020-01-30 18:41:24 -05:00
Hang Le
f0a4fc6cd6
Add Flaubert
2020-01-30 10:04:18 -05:00
Jared Nielsen
adb8c93134
Remove lines causing a KeyError
2020-01-29 14:01:16 -05:00
Lysandre
335dd5e68a
Default save steps 50 to 500 in all scripts
2020-01-28 09:42:11 -05:00
Julien Chaumond
6b4c3ee234
[run_lm_finetuning] GPT2 tokenizer doesn't have a pad_token
...
ping @lysandrejik
2020-01-27 20:14:02 -05:00
VictorSanh
1ce3fb5cc7
update correct eval metrics (distilbert & co)
2020-01-24 11:45:22 -05:00
Julien Chaumond
1a8e87be4e
Line-by-line text dataset (including padding)
2020-01-21 16:57:38 -05:00
Julien Chaumond
b94cf7faac
change order
2020-01-21 16:57:38 -05:00
Julien Chaumond
2eaa8b6e56
Easier to not support this, as it could be confusing
...
cc @lysandrejik
2020-01-21 16:57:38 -05:00
Julien Chaumond
801aaa5508
make style
2020-01-21 16:57:38 -05:00
Julien Chaumond
56d4ba8ddb
[run_lm_finetuning] Train from scratch
2020-01-21 16:57:38 -05:00
jiyeon_baek
6d5049a24d
Fix typo in examples/run_squad.py
...
Rul -> Run
2020-01-17 11:22:51 -05:00
Lysandre
6e2c28a14a
Run SQuAD warning when the doc stride may be too high
2020-01-16 13:59:26 -05:00
thomwolf
258ed2eaa8
adding details in readme
2020-01-16 13:21:30 +01:00
thomwolf
50ee59578d
update formatting - make flake8 happy
2020-01-16 13:21:30 +01:00
thomwolf
1c9333584a
formatting
2020-01-16 13:21:30 +01:00
thomwolf
e25b6fe354
updating readme
2020-01-16 13:21:30 +01:00
thomwolf
27c7b99015
adding details in readme - moving file
2020-01-16 13:21:30 +01:00
Nafise Sadat Moosavi
99d4515572
HANS evaluation
2020-01-16 13:21:30 +01:00
Julien Chaumond
83a41d39b3
💄 super
2020-01-15 18:33:50 -05:00
Julien Chaumond
715fa638a7
Merge branch 'master' into from_scratch_training
2020-01-14 18:58:21 +00:00
Julien Chaumond
b803b067bf
Config to Model mapping
2020-01-13 20:05:20 +00:00
IWillPull
a3085020ed
Added repetition penalty to PPLM example ( #2436 )
...
* Added repetition penalty
* Default PPLM repetition_penalty to neutral
* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)
* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
VictorSanh
e83d9f1c1d
cleaning - change ' to " (black requirements)
2020-01-10 19:34:25 -05:00
VictorSanh
ebba9e929d
minor spring cleaning - missing configs + processing
2020-01-10 19:14:58 -05:00
Victor SANH
331065e62d
missing import
2020-01-10 11:42:53 +01:00
Victor SANH
414e9e7122
indents test
2020-01-10 11:42:53 +01:00
Victor SANH
3cdb38a7c0
indents
2020-01-10 11:42:53 +01:00
Victor SANH
ebd45980a0
Align with run_squad
+ fix some errors
2020-01-10 11:42:53 +01:00
Victor SANH
45634f87f8
fix Sampler in distributed training - evaluation
2020-01-10 11:42:53 +01:00
Victor SANH
af1ee9e648
Move torch.nn.utils.clip_grad_norm_
2020-01-10 11:42:53 +01:00
Lysandre
164c794eb3
New SQuAD API for distillation script
2020-01-10 11:42:53 +01:00
Lysandre
16ce15ed4b
DistilBERT token type ids removed from inputs in run_squad
2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b
Fix error with global step in run_squad.py
2020-01-08 11:39:00 +01:00
Oren Amsalem
43114b89ba
spelling correction ( #2434 )
2020-01-07 17:25:25 +01:00
Lysandre Debut
27c1b656cc
Fix error with global step in run_lm_finetuning.py
2020-01-07 16:16:12 +01:00
Simone Primarosa
176d3b3079
Add support for Albert and XLMRoberta for the Glue example ( #2403 )
...
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris
81d6841b4b
GPU text generation: Moved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
karajan1001
f01b3e6680
fix #2399 an ImportError in official example ( #2400 )
...
* fix #2399 an ImportError in official example
* style
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
629b22adcf
[run_lm_finetuning] mask_tokens: document types
2020-01-01 12:55:10 -05:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
...
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin
a8d34e534e
Remove [--editable] in install instructions.
...
Use -e only in docs targeted at contributors.
If a user copy-pastes command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
81422c4e6d
Remove unused variables in examples.
2019-12-23 22:29:02 +01:00
Aymeric Augustin
c3783399db
Remove redundant requirements with transformers.
2019-12-23 19:17:27 +01:00
Aymeric Augustin
9fc8dcb2a0
Standardize import.
...
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
1c62e87b34
Use built-in open().
...
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
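The claim in the entry above can be checked directly. This is a minimal check, assuming a Python 3 interpreter; the variable name `is_same` is illustrative only.

```python
import io

# On Python 3 the built-in open and io.open are the same function object,
# so an explicit `from io import open` adds nothing.
is_same = open is io.open

print(is_same)
```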
Aymeric Augustin
d6eaf4e6d2
Update comments mentioning Python 2.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
75a23d24af
Remove import fallbacks.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899
Remove sys.version_info[0] == 2 or 3.
2019-12-22 18:38:42 +01:00
Aymeric Augustin
6b2200fc88
Remove u-prefixes.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
c824d15aa1
Remove __future__ imports.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
7e98e211f0
Remove unittest.main() in test modules.
...
This construct isn't used anymore these days.
Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.
Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204
Switch test files to the standard test_*.py scheme.
2019-12-22 14:15:13 +01:00
Aymeric Augustin
c11b3e2926
Sort imports for optional third-party libraries.
...
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin
939148b050
Fix F401 flake8 warning (x28).
...
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
783a616999
Fix F401 flake8 warning (x88 / 116).
...
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
80327a13ea
Fix F401 flake8 warning (x152 / 268).
...
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
fa2ccbc081
Fix E266 flake8 warning (x90).
2019-12-22 10:59:08 +01:00
Aymeric Augustin
2ab78325f0
Fix F821 flake8 warning (x47).
...
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin
631be27078
Fix E722 flake8 warnings (x26).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
b0f7db73cd
Fix E741 flake8 warning (x14).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
fd2f17a7a1
Fix E714 flake8 warning (x8).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
5eab3cf6bc
Fix W605 flake8 warning (x5).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
7dce8dc7ac
Fix E731 flake8 warning (x3).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
357db7098c
Fix E712 flake8 warning (x1).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
f9c5317db2
Fix E265 flake8 warning (x1).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
28e608a2c2
Remove trailing whitespace from all Python files.
...
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
158e82e061
Sort imports with isort.
...
This is the result of:
$ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin
fa84ae26d6
Reformat source code with black.
...
This is the result of:
$ black --line-length 119 examples templates transformers utils hubconf.py setup.py
There are a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.
This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
...
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
344126fe58
move example to mm-imdb folder
2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
...
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
...
[BREAKING CHANGE] Setting all ignored index to the PyTorch standard
2019-12-21 14:55:40 +01:00
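For context on the breaking change above: the "PyTorch standard" is the default `ignore_index=-100` of `torch.nn.CrossEntropyLoss`. The sketch below is a pure-Python illustration of what an ignored index means for a mean loss. It does not use PyTorch, and `masked_nll` and its inputs are hypothetical names and values, not code from the repository.

```python
import math

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def masked_nll(log_probs, labels, ignore_index=IGNORE_INDEX):
    """Mean negative log-likelihood over positions whose label is not ignored."""
    kept = [-row[label] for row, label in zip(log_probs, labels) if label != ignore_index]
    # Returning 0.0 when every position is ignored is a choice made for this sketch.
    return sum(kept) / len(kept) if kept else 0.0


# Three positions, two classes; the middle position is excluded from the loss.
log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
    [math.log(0.9), math.log(0.1)],
]
labels = [0, IGNORE_INDEX, 1]

loss = masked_nll(log_probs, labels)
print(loss)
```

Only the first and last positions contribute, so the result is the average of their two negative log-probabilities.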
thomwolf
1ab25c49d3
Merge branch 'master' into pr/2115
2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0
fix merge
2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming
2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
...
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming
2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
...
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00