Julien Chaumond
42f08e596f
[examples] rename run_lm_finetuning to run_language_modeling
2020-02-07 09:15:28 -05:00
Julien Chaumond
4f7bdb0958
[examples] Fix broken markdown
2020-02-07 09:15:28 -05:00
Peter Izsak
6fc3d34abd
Fix multi-gpu evaluation in run_glue.py
2020-02-06 16:38:55 -05:00
Julien Chaumond
ada24def22
[run_lm_finetuning] Tweak fix for non-long tensor, close #2728
...
see 1ebfeb7946
and #2728
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2020-02-05 12:49:18 -05:00
Yuval Pinter
d1ab1fab1b
pass langs parameter to certain XLM models ( #2734 )
...
* pass langs parameter to certain XLM models
Adding an argument that specifies the language the SQuAD dataset is in so language-sensitive XLMs (e.g. `xlm-mlm-tlm-xnli15-1024`) don't default to language `0`.
Allows resolution of issue #1799 .
* fixing from `make style`
* fixing style (again)
2020-02-04 17:12:42 -05:00
Lysandre
3bf5417258
Revert erroneous fix
2020-02-04 16:31:07 -05:00
Lysandre
1ebfeb7946
Cast to long when masking tokens
2020-02-04 15:56:16 -05:00
Lysandre
239dd23f64
[Follow up 213]
...
Masked indices should have -1 and not -100. Updating documentation + scripts that were forgotten
2020-02-03 16:08:05 -05:00
Antonio Carlos Falcão Petri
2ba147ecff
Fix typo in examples/utils_ner.py
...
"%s-%d".format() -> "{}-{}".format()
2020-02-01 11:10:57 -05:00
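The fix in this commit turns on the difference between printf-style placeholders and `str.format` replacement fields. A minimal sketch in plain Python follows; the `mode` and `index` names and their values are hypothetical and are not taken from `examples/utils_ner.py`.

```python
mode, index = "train", 3  # hypothetical values, for illustration only

# str.format() does not interpret printf-style placeholders,
# so the template comes back unchanged and the arguments are ignored.
broken = "%s-%d".format(mode, index)

# Replacement fields are what str.format() actually fills in.
fixed = "{}-{}".format(mode, index)

# The printf-style template only works with the % operator.
also_fixed = "%s-%d" % (mode, index)

print(broken)      # %s-%d
print(fixed)       # train-3
print(also_fixed)  # train-3
```

Because the broken form raises no error, it is easy to miss: every call returns the same literal string.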
Lysandre
d18d47be67
run_generation style
2020-01-31 12:05:48 -05:00
Lysandre
7365f01d43
do_sample should be set to True in run_generation.py
2020-01-31 11:49:32 -05:00
Jared Nielsen
71a382319f
Correct documentation
2020-01-30 18:41:24 -05:00
Hang Le
f0a4fc6cd6
Add Flaubert
2020-01-30 10:04:18 -05:00
Jared Nielsen
adb8c93134
Remove lines causing a KeyError
2020-01-29 14:01:16 -05:00
Lysandre
335dd5e68a
Default save steps 50 to 500 in all scripts
2020-01-28 09:42:11 -05:00
Julien Chaumond
6b4c3ee234
[run_lm_finetuning] GPT2 tokenizer doesn't have a pad_token
...
ping @lysandrejik
2020-01-27 20:14:02 -05:00
VictorSanh
1ce3fb5cc7
update correct eval metrics (distilbert & co)
2020-01-24 11:45:22 -05:00
Julien Chaumond
1a8e87be4e
Line-by-line text dataset (including padding)
2020-01-21 16:57:38 -05:00
Julien Chaumond
b94cf7faac
change order
2020-01-21 16:57:38 -05:00
Julien Chaumond
2eaa8b6e56
Easier to not support this, as it could be confusing
...
cc @lysandrejik
2020-01-21 16:57:38 -05:00
Julien Chaumond
801aaa5508
make style
2020-01-21 16:57:38 -05:00
Julien Chaumond
56d4ba8ddb
[run_lm_finetuning] Train from scratch
2020-01-21 16:57:38 -05:00
jiyeon_baek
6d5049a24d
Fix typo in examples/run_squad.py
...
Rul -> Run
2020-01-17 11:22:51 -05:00
Lysandre
6e2c28a14a
Run SQuAD warning when the doc stride may be too high
2020-01-16 13:59:26 -05:00
thomwolf
258ed2eaa8
adding details in readme
2020-01-16 13:21:30 +01:00
thomwolf
50ee59578d
update formating - make flake8 happy
2020-01-16 13:21:30 +01:00
thomwolf
1c9333584a
formating
2020-01-16 13:21:30 +01:00
thomwolf
e25b6fe354
updating readme
2020-01-16 13:21:30 +01:00
thomwolf
27c7b99015
adding details in readme - moving file
2020-01-16 13:21:30 +01:00
Nafise Sadat Moosavi
99d4515572
HANS evaluation
2020-01-16 13:21:30 +01:00
Julien Chaumond
83a41d39b3
💄 super
2020-01-15 18:33:50 -05:00
Julien Chaumond
715fa638a7
Merge branch 'master' into from_scratch_training
2020-01-14 18:58:21 +00:00
Julien Chaumond
b803b067bf
Config to Model mapping
2020-01-13 20:05:20 +00:00
IWillPull
a3085020ed
Added repetition penalty to PPLM example ( #2436 )
...
* Added repetition penalty
* Default PPLM repetition_penalty to neutral
* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)
* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
VictorSanh
e83d9f1c1d
cleaning - change ' to " (black requirements)
2020-01-10 19:34:25 -05:00
VictorSanh
ebba9e929d
minor spring cleaning - missing configs + processing
2020-01-10 19:14:58 -05:00
Victor SANH
331065e62d
missing import
2020-01-10 11:42:53 +01:00
Victor SANH
414e9e7122
indents test
2020-01-10 11:42:53 +01:00
Victor SANH
3cdb38a7c0
indents
2020-01-10 11:42:53 +01:00
Victor SANH
ebd45980a0
Align with run_squad + fix some errors
2020-01-10 11:42:53 +01:00
Victor SANH
45634f87f8
fix Sampler in distributed training - evaluation
2020-01-10 11:42:53 +01:00
Victor SANH
af1ee9e648
Move torch.nn.utils.clip_grad_norm_
2020-01-10 11:42:53 +01:00
Lysandre
164c794eb3
New SQuAD API for distillation script
2020-01-10 11:42:53 +01:00
Lysandre
16ce15ed4b
DistilBERT token type ids removed from inputs in run_squad
2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b
Fix error with global step in run_squad.py
2020-01-08 11:39:00 +01:00
Oren Amsalem
43114b89ba
spelling correction ( #2434 )
2020-01-07 17:25:25 +01:00
Lysandre Debut
27c1b656cc
Fix error with global step in run_lm_finetuning.py
2020-01-07 16:16:12 +01:00
Simone Primarosa
176d3b3079
Add support for Albert and XLMRoberta for the Glue example ( #2403 )
...
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris
81d6841b4b
GPU text generation: Moved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
karajan1001
f01b3e6680
fix #2399 an ImportError in official example ( #2400 )
...
* fix #2399 an ImportError in official example
* style
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
629b22adcf
[run_lm_finetuning] mask_tokens: document types
2020-01-01 12:55:10 -05:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
...
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin
a8d34e534e
Remove [--editable] in install instructions.
...
Use -e only in docs targeted at contributors.
If a user copy-pastes command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
81422c4e6d
Remove unused variables in examples.
2019-12-23 22:29:02 +01:00
Aymeric Augustin
c3783399db
Remove redundant requirements with transformers.
2019-12-23 19:17:27 +01:00
Aymeric Augustin
9fc8dcb2a0
Standardize import.
...
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
1c62e87b34
Use built-in open().
...
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
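The claim in this commit body can be checked directly in a Python 3 interpreter. A minimal sketch:

```python
import io

# On Python 3 the built-in open() and io.open() are the same object,
# so an explicit `from io import open` adds nothing.
same_object = open is io.open

print(same_object)  # True
```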
Aymeric Augustin
d6eaf4e6d2
Update comments mentioning Python 2.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
75a23d24af
Remove import fallbacks.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899
Remove sys.version_info[0] == 2 or 3.
2019-12-22 18:38:42 +01:00
Aymeric Augustin
6b2200fc88
Remove u-prefixes.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
c824d15aa1
Remove __future__ imports.
2019-12-22 17:47:54 +01:00
Aymeric Augustin
7e98e211f0
Remove unittest.main() in test modules.
...
This construct isn't used anymore these days.
Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.
Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204
Switch test files to the standard test_*.py scheme.
2019-12-22 14:15:13 +01:00
Aymeric Augustin
c11b3e2926
Sort imports for optional third-party libraries.
...
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin
939148b050
Fix F401 flake8 warning (x28).
...
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
783a616999
Fix F401 flake8 warning (x88 / 116).
...
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
80327a13ea
Fix F401 flake8 warning (x152 / 268).
...
This change is mostly autogenerated with:
$ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py
I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
fa2ccbc081
Fix E266 flake8 warning (x90).
2019-12-22 10:59:08 +01:00
Aymeric Augustin
2ab78325f0
Fix F821 flake8 warning (x47).
...
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin
631be27078
Fix E722 flake8 warnings (x26).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
b0f7db73cd
Fix E741 flake8 warning (x14).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
fd2f17a7a1
Fix E714 flake8 warning (x8).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
5eab3cf6bc
Fix W605 flake8 warning (x5).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
7dce8dc7ac
Fix E731 flake8 warning (x3).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
357db7098c
Fix E712 flake8 warning (x1).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
f9c5317db2
Fix E265 flake8 warning (x1).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
28e608a2c2
Remove trailing whitespace from all Python files.
...
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
158e82e061
Sort imports with isort.
...
This is the result of:
$ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin
fa84ae26d6
Reformat source code with black.
...
This is the result of:
$ black --line-length 119 examples templates transformers utils hubconf.py setup.py
There's a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.
This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
...
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
344126fe58
move example to mm-imdb folder
2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
...
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
...
[BREAKING CHANGE] Setting all ignored index to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf
1ab25c49d3
Merge branch 'master' into pr/2115
2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0
fix merge
2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming
2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
...
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming
2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
...
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf
cfa0380515
Merge branch 'master' into generation_sampler
2019-12-21 14:12:52 +01:00
thomwolf
300ec3003c
fixing run_generation example - using torch.no_grad
2019-12-21 14:02:19 +01:00
thomwolf
1c37746892
fixing run_generation
2019-12-21 13:52:49 +01:00
thomwolf
8a2be93b4e
fix merge
2019-12-21 13:31:28 +01:00
Thomas Wolf
562f864038
Merge branch 'master' into fix-xlnet-squad2.0
2019-12-21 12:48:10 +01:00
Thomas Wolf
59941c5d1f
Merge pull request #2189 from stefan-it/xlmr
...
Add support for XLM-RoBERTa
2019-12-20 13:26:38 +01:00
Julien Chaumond
a5a06a851e
[doc] Param name consistency
2019-12-19 16:24:20 -05:00
Aidan Kierans
1718fb9e74
Minor/basic text fixes ( #2229 )
...
* Small clarification
Matches line 431 to line 435 for additional clarity and consistency.
* Fixed minor typo
The letter "s" was previously omitted from the word "docstrings".
2019-12-19 16:23:18 -05:00
Francesco
62c1fc3c1e
Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script
2019-12-19 09:50:56 -05:00
Ejar
284572efc0
Updated typo on the link
...
Updated documentation due to typo
2019-12-19 09:36:43 -05:00
Stefan Schweter
a26ce4dee1
examples: add XLM-RoBERTa to glue script
2019-12-19 02:23:01 +01:00
thomwolf
3d2096f516
further cleanup
2019-12-18 11:50:54 +01:00
thomwolf
83bc5235cf
Merge branch 'master' into pr/2189
2019-12-17 11:47:32 +01:00
Thomas Wolf
f061606277
Merge pull request #2164 from huggingface/cleanup-configs
...
[SMALL BREAKING CHANGE] Cleaning up configuration classes - Adding Model Cards
2019-12-17 09:10:16 +01:00
Lysandre
18a879f475
fix #2180
2019-12-16 16:44:29 -05:00
Lysandre
d803409215
Fix run squad evaluate during training
2019-12-16 16:31:38 -05:00
Stefan Schweter
71b4750517
examples: add support for XLM-RoBERTa to run_ner script
2019-12-16 16:37:27 +01:00
thomwolf
dc667ce1a7
double check cc @LysandreJik
2019-12-14 09:56:27 +01:00
thomwolf
7140363e09
update bertabs
2019-12-14 09:44:53 +01:00
Thomas Wolf
a52d56c8d9
Merge branch 'master' into cleanup-configs
2019-12-14 09:43:07 +01:00
erenup
c7780700f5
Merge branch 'refs/heads/squad_roberta'
...
# Conflicts:
# transformers/data/processors/squad.py
2019-12-14 08:53:59 +08:00
erenup
8e9526b4b5
add multiple processing
2019-12-14 08:43:58 +08:00
Lysandre
c8ed1c82c8
[SQUAD] Load checkpoint when evaluating without training
2019-12-13 12:13:48 -05:00
Pierric Cistac
5a5c4349e8
Fix summarization to_cpu doc
2019-12-13 10:02:33 -05:00
thomwolf
47f0e3cfb7
cleaning up configuration classes
2019-12-13 14:33:24 +01:00
erenup
9b312f9d41
initial version for roberta squad
2019-12-13 14:51:40 +08:00
LysandreJik
7296f1010b
Cleanup squad and add allow train_file and predict_file usage
2019-12-12 13:01:04 -05:00
LysandreJik
3fd71c4431
Update example scripts
2019-12-12 12:08:54 -05:00
Alan deLevie
fbf5455a86
Fix typo in examples/run_glue.py args declaration.
...
deay -> decay
2019-12-12 11:16:19 -05:00
Bilal Khan
6aa919469d
Update run_xnli to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:31:22 -06:00
Bilal Khan
89896fe04f
Update run_ner to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:31:22 -06:00
Bilal Khan
fdc05cd68f
Update run_squad to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:31:22 -06:00
Bilal Khan
854ec5784e
Update run_glue to save optimizer and scheduler states, then resume training from a checkpoint
2019-12-10 19:30:36 -06:00
LysandreJik
b72f9d340e
Correct index in script
2019-12-10 18:33:17 -05:00
LysandreJik
6a73382706
Complete warning + cleanup
2019-12-10 14:33:24 -05:00
Lysandre
dc4e9e5cb3
DataParallel for SQuAD + fix XLM
2019-12-10 19:21:20 +00:00
Rémi Louf
07bc8efbc3
add greedy decoding and sampling
2019-12-10 17:27:50 +01:00
Rémi Louf
4b82c485de
remove misplaced summarization documentation
2019-12-10 09:13:33 -05:00
Thomas Wolf
e57d00ee10
Merge pull request #1984 from huggingface/squad-refactor
...
[WIP] Squad refactor
2019-12-10 11:07:26 +01:00
Suvrat Bhooshan
df3961121f
Add MMBT Model to Transformers Repo
2019-12-09 18:36:48 -08:00
Julien Chaumond
1d18930462
Harmonize no_cuda flag with other scripts
2019-12-09 20:37:55 -05:00
Rémi Louf
f7eba09007
clean for release
2019-12-09 20:37:55 -05:00
Rémi Louf
2a64107e44
improve device usage
2019-12-09 20:37:55 -05:00
Rémi Louf
c0707a85d2
add README
2019-12-09 20:37:55 -05:00
Rémi Louf
ade3cdf5ad
integrate ROUGE
2019-12-09 20:37:55 -05:00
Rémi Louf
076602bdc4
prevent BERT weights from being downloaded twice
2019-12-09 20:37:55 -05:00
Rémi Louf
a1994a71ee
simplified model and configuration
2019-12-09 20:37:55 -05:00
Rémi Louf
3a9a9f7861
default output dir to documents dir
2019-12-09 20:37:55 -05:00
Rémi Louf
693606a75c
update the docs
2019-12-09 20:37:55 -05:00
Rémi Louf
2403a66598
give transformers API to BertAbs
2019-12-09 20:37:55 -05:00
Rémi Louf
ba089c780b
share pretrained embeddings
2019-12-09 20:37:55 -05:00
Rémi Louf
9660ba1cbd
Add beam search
2019-12-09 20:37:55 -05:00
Rémi Louf
1c71ecc880
load the pretrained weights for encoder-decoder
...
We currently save the pretrained_weights of the encoder and decoder in
two separate directories `encoder` and `decoder`. However, for the
`from_pretrained` function to operate with automodels we need to
specify the type of model in the path to the weights.
The path to the encoder/decoder weights is handled by the
`PreTrainedEncoderDecoder` class in the `save_pretrained` function. Since
there is no easy way to infer the type of model that was initialized for
the encoder and decoder we add a parameter `model_type` to the function.
This is not an ideal solution as it is error prone, and the model type
should be carried by the Model classes somehow.
This is a temporary fix that should be changed before merging.
2019-12-09 20:37:55 -05:00
Rémi Louf
07f4cd73f6
update function to add special tokens
...
Since I started my PR the `add_special_token_single_sequence` function
has been deprecated for another; I replaced it with the new function.
2019-12-09 20:37:55 -05:00
Bilal Khan
79526f82f5
Remove unnecessary epoch variable
2019-12-09 16:24:35 -05:00
Bilal Khan
9626e0458c
Add functionality to continue training from last saved global_step
2019-12-09 16:24:35 -05:00
Bilal Khan
2d73591a18
Stop saving current epoch
2019-12-09 16:24:35 -05:00
Bilal Khan
0eb973b0d9
Use saved optimizer and scheduler states if available
2019-12-09 16:24:35 -05:00
Bilal Khan
a03fcf570d
Save tokenizer after each epoch to be able to resume training from a checkpoint
2019-12-09 16:24:35 -05:00
Bilal Khan
f71b1bb05a
Save optimizer state, scheduler state and current epoch
2019-12-09 16:24:35 -05:00
LysandreJik
2a4ef098d6
Add ALBERT and XLM to SQuAD script
2019-12-09 10:46:47 -05:00
Lysandre Debut
00c4e39581
Merge branch 'master' into squad-refactor
2019-12-09 10:41:15 -05:00
Thomas Wolf
5482822a2b
Merge pull request #2046 from jplu/tf2-ner-example
...
Add NER TF2 example.
2019-12-06 12:12:22 +01:00
LysandreJik
e9217da5ff
Cleanup
...
Improve global visibility on the run_squad script, remove unused files and fixes related to XLNet.
2019-12-05 16:01:51 -05:00
LysandreJik
9ecd83dace
Patch evaluation for impossible values + cleanup
2019-12-05 14:44:57 -05:00
VictorSanh
35ff345fc9
update requirements
2019-12-05 12:07:04 -05:00
VictorSanh
552c44a9b1
release distilm-bert
2019-12-05 10:14:58 -05:00
Rosanne Liu
ee53de7aac
Pr for pplm ( #2060 )
...
* license
* changes
* ok
* Update paper link and commands to run
* pointer to uber repo
2019-12-05 09:20:07 -05:00
Julien Plu
9200a759d7
Add few tests on the TF optimization file with some info in the documentation. Complete the README.
2019-12-05 12:56:43 +01:00
thomwolf
75a97af6bc
fix #1450 - add doc
2019-12-05 11:26:55 +01:00
LysandreJik
f7e4a7cdfa
Cleanup
2019-12-04 16:24:15 -05:00
LysandreJik
cca75e7884
Kill the demon spawn
2019-12-04 15:42:29 -05:00
LysandreJik
9ddc3f1a12
Naming update + XLNet/XLM evaluation
2019-12-04 10:37:00 -05:00
thomwolf
5bfcd0485e
fix #1991
2019-12-04 14:53:11 +01:00
Julien Plu
ecb923da9c
Create a NER example similar to the Pytorch one. It takes the same options, and can be run the same way.
2019-12-04 09:43:15 +01:00
LysandreJik
de276de1c1
Working evaluation
2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5
[pplm] split classif head into its own file
2019-12-03 22:07:25 +00:00
VictorSanh
48cbf267c9
Use full dataset for eval (SequentialSampler in Distributed setting)
2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623
[pplm] Update S3 links
...
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1
Always use SequentialSampler during evaluation
...
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
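The concern raised in this commit message can be illustrated without PyTorch. The sketch below imitates the strided split a distributed sampler performs across ranks. `shard_indices` is a hypothetical helper written for illustration; it is not the `torch.utils.data.DistributedSampler` implementation, which also pads the index list and can shuffle.

```python
def shard_indices(dataset_size, num_replicas, rank):
    """Hypothetical helper: indices one process sees under a strided split."""
    return list(range(dataset_size))[rank::num_replicas]


dataset_size, num_gpus = 12, 4  # illustrative sizes

# If evaluation runs on a single process but still uses the sharded split,
# that process only visits about 1/N of the evaluation set.
seen_by_rank0 = shard_indices(dataset_size, num_gpus, rank=0)

# A sequential pass visits every example exactly once.
seen_sequentially = list(range(dataset_size))

print(seen_by_rank0)           # [0, 4, 8]
print(len(seen_sequentially))  # 12
```

Under these assumptions, rank 0 evaluates 3 of 12 examples, which matches the 1/N coverage the author describes.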
Julien Chaumond
3b48806f75
[pplm] README: add setup + tweaks
2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890
readme
...
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc
[pplm] move scripts under examples/pplm/
2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627
generate_text_pplm now works with batch_size > 1
2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe
Changed order of some parameters to be more consistent. Identical results.
2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc
Added additional check for url and path in discriminator model params
2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015
Improvements: model_path renamed pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to discriminator's when discrim is specified, sample = False by default but cli parameter introduced. To obtain identical samples call the cli with --sample
2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66
Removed global variable device
2019-12-03 10:14:02 -05:00
piero
7fd54b55a3
Added support for generic discriminators
2019-12-03 10:14:02 -05:00
piero
b0eaff36e6
Added a +1 to epoch when saving weights
2019-12-03 10:14:02 -05:00
piero
611961ade7
Added tqdm to preprocessing
2019-12-03 10:14:02 -05:00
piero
afc7dcd94d
Now run_pplm works on cpu. Identical output as before (when using gpu).
2019-12-03 10:14:02 -05:00
piero
61399e5afe
Cleaned perturb_past. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ffc2935405
Fix for making unconditioned generation work. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
9f693a0c48
Cleaned generate_text_pplm. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
61a12f790d
Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
ef47b2c03a
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ea12db3f5
Removed commented code. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
08c6e456a3
Cleaned full_text_generation. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
6c9c131780
More cleanup for run_model. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
7ffe47c888
Improved device specification
2019-12-03 10:14:02 -05:00
piero
4f2164e40e
First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before.
2019-12-03 10:14:02 -05:00
piero
821de121e8
Minor changes
2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c
Fixed minor bug when running training on cuda
2019-12-03 10:14:02 -05:00
piero
0b51fba20b
Added script for training a discriminator for pplm to use
2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe
Let's make PPLM great again
2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6
tokenizer white space: revert to previous behavior
2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d
rm extraneous import
2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e
fix the loss backward issue
...
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2
PPLM (squashed)
...
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
...
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3
Added Camembert to available models
2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003
Refactor logs and fix loss bug
2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a
Cleanup & Evaluation now works
2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german
2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d
add distilbert + update run_xnli wrt run_glue
2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385
remove fstrings
2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac
update readme
2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954
uniformize w/ the cache_dir update
2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f
move xnli processor (and utils) to transformers/data/processors
2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7
change default for XNLI: dev --> test
2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf
mbert reproducibility results
2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd
xnli - output_mode consistency
2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a
add xnli examples/README.md
2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e
run_xnli + utils_xnli
2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c
Uniformize #1952
2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78
ALBERT for SQuAD
2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb
Run GLUE and remove LAMB
2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2
Minor bug fixes on run_ner.py
2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2
resize model when special tokenizer present
2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0
fix #1894
2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b
[doc] homogenize instructions slightly
2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237
Works for XLNet
2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e
wip
2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0
update the documentation
2019-11-21 14:41:19 -05:00
Juha Kiili
2cf3447e0a
Glue: log in Valohai-compatible JSON format too
2019-11-21 12:35:25 +02:00
Thomas Wolf
0cdfcca24b
Merge pull request #1860 from stefan-it/camembert-for-token-classification
...
[WIP] Add support for CamembertForTokenClassification
2019-11-21 10:56:07 +01:00
Jin Young Sohn
e70cdf083d
Cleanup TPU bits from run_glue.py
...
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py .
We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
2019-11-20 17:54:34 -05:00
Lysandre
454455c695
fix #1879
2019-11-20 09:42:48 -05:00
Stefan Schweter
e7cf2ccd15
distillation: add German distilbert model
2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383
typo "deay" -> "decay"
2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1
camembert: add support for CamemBERT in run_ner example
2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb
[camembert] Upload to s3 + rename script
2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871
Update demo_camembert.py with new classes
2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7
Move demo_camembert.py to examples/contrib
2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c
Add demo_camembert.py
2019-11-16 00:11:07 -05:00
Xu Hongshen
ca99a2d500
Update example readme
2019-11-15 14:55:26 +08:00
Xu Hongshen
7da3ef24cd
add is_impossible tensor to model inputs during fine-tuning xlnet on squad2.0
2019-11-15 14:18:53 +08:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
...
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT
2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
...
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
...
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
Rémi Louf
2276bf69b7
update the examples, docs and template
2019-11-14 20:38:02 +01:00
Lysandre
d7929899da
Specify checkpoint in saved file for run_lm_finetuning.py
2019-11-14 10:49:00 -05:00
ronakice
2e31176557
fix multi-gpu eval
2019-11-12 05:55:11 -05:00
Stefan Schweter
2b07b9e5ee
examples: add DistilBert support for NER fine-tuning
2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044
Fix run_bertology.py
...
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f
[run_tf_glue] Add comment for context
2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af
misc doc
2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
...
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
...
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
...
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
...
It takes a considerable amount of time (~10 min) to parse the examples into features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
thomwolf
89d6272898
Fix #1623
2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master
2019-11-04 11:41:26 +01:00
thomwolf
ad90868627
Update example readme
2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training
2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53
run_tf_glue MRPC evaluation only for MRPC
2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb
Added mixed precision support to benchmarks.py
2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241
[run_generation] Fix generation with batch_size>1
2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering
2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil*
2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
...
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
...
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
...
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta
2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation
2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f
format utils for summarization
2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466
fix bug with padding mask + add corresponding test
2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e
rename seq2seq to encoder_decoder
2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61
revert renaming of lm_labels to ltr_lm_labels
2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312
update docstrings; rename lm_labels to more explicit ltr_lm_labels
2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691
resolve PR comments
2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb
Evaluation fixed.
2019-10-28 10:18:58 -04:00
altsoph
438f2730a0
Evaluation code fixed.
2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8
here's one big commit
2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e
fix test of truncation function
2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8
extend works in-place
2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe
pad sequence with 0, mask with -1
2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa
run_tf_glue works with all tasks
2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51
Add roberta to doc
2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f
Add Roberta to run_ner.py
2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b
[release] fix table weirdness
2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd
[RELEASE] DistilRoBERTa
2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7
fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when parallelly using multi-gpu.
2019-10-23 20:27:13 +08:00
Julien Chaumond
ef1b8b2ae5
[CTRL] warn if generation prompt does not start with a control code
...
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed
Remove
2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868
Option to benchmark only one of the two libraries
2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21
gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well
2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b
gradient norm clipping should be done right before calling the optimiser
2019-10-20 22:33:56 +01:00
LysandreJik
7dd29ed2f1
Benchmarks example script
2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a
Add speed log to examples/run_squad.py
...
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
leo-du
ecd15667f3
fix repetition penalty
2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036
fix data processing in script
2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061
add training pipeline (formatting temporary)
2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0
use two different tokenizers for story and summary
2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4
add Model2Model to __init__
2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35
correct the truncation and padding of dataset
2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd
add separator between data import and train
2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e
specify in readme that both datasets are required
2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587
test the full story processing
2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859
truncation function is fully tested
2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8
wip commit, switching computers
2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb
Merge branch 'master' of https://github.com/huggingface/transformers
2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762
install tensorboard
2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
...
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
...
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad
add readme information
2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0
Add NER finetuning details by @stefan-it in example readme
2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622
[ner] Honor args.overwrite_cache
2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43
add seqeval as requirement for examples
2019-10-15 09:17:31 +02:00
thomwolf
66adb71734
update to transformers
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a
Add option to predict on test set
2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1
Add cli argument for configuring labels
2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64
Make file reading more robust
2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757
Add option to predict on test set
2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d
Add cli argument for configuring labels
2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1
Make file reading more robust
2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354
Implement fine-tuning BERT on CoNLL-2003 named entity recognition task
2019-10-15 09:14:53 +02:00
hlums
74c5035808
Fix token order in xlnet preprocessing.
2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15
add instructions to fetch the dataset
2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d
delegate the padding with special tokens to the tokenizer
2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f
process the raw CNN/Daily Mail dataset
...
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Simon Layton
4e6a55751a
Force einsum to fp16
2019-10-14 11:12:41 -04:00
Rémi Louf
67d10960ae
load and prepare CNN/Daily Mail data
...
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674
Added automatic mixed precision and XLA options to run_tf_glue.py
2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944
Minor enhancements to run_tf_glue.py
2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005
Add citation bibtex
2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace
read parameters from CLI, load model & tokenizer
2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b
add base for seq2seq finetuning
2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
...
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd
convert int to str before adding to a str
2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190
Don't save/load when on TPU
2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3
GLUE on TPU
2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670
add test for initialization of Bert2Rnd
2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
...
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
thomwolf
177a721205
move back to simple space splitting
2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a
better error messages
2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
...
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d
Merge branch 'master' into pr/1383
2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
...
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f
Improve naming and only do regex when necessary
2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8
Golden path
2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b
Improve readability and make fewer assumptions about checkpoint format
2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8
Extract method
2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712
Delete older checkpoint after saving new checkpoint
2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb
Add save_total_limit
2019-10-09 08:48:40 -04:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
...
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe
Change tensorboard imports to use built-in tensorboard if available
2019-10-08 16:29:43 -05:00
VictorSanh
7ce83b4931
update weights for distilgpt2
2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now ; wrote the unit tests
overall structure works;load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f
Update naming + remove f string in run_lm_finetuning example
2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb
LM finetuning won't mask special tokens anymore
2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910
Update run_glue.py
...
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a
prepare release
2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb
incoming release 🔥 update links to arxiv preprint
2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f
update README - TODO update link to paper
2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f
update train.py
2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0
update requirements
2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54
update distiller
2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db
update binarized_data
2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376
update extract
2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079
update extract_distilbert
2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205
update token_counts
2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f
grouped_batch_sampler
2019-10-03 10:27:11 -04:00
VictorSanh
594202a934
lm_seqs_dataset
2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4
add distillation_configs
2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9
Evaluation result.txt path changing #1286
2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
...
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00
thomwolf
f7978f70ec
use format instead of f-strings
2019-10-01 18:45:38 -04:00
Julien Chaumond
b350662955
overflowing_tokens do not really make sense here, let's just return a number
...
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2019-09-30 16:37:09 -04:00
Julien Chaumond
f5bcde0b2f
[multiple-choice] Simplify and use tokenizer.encode_plus
2019-09-30 16:04:55 -04:00
Denny
9478590630
Update run_lm_finetuning.py
...
The previous method, just as phrased, did not exist in the class.
2019-09-27 15:18:42 -03:00
Thomas Wolf
d83d295763
Merge pull request #1337 from mgrankin/fastdataset
...
faster dataset building
2019-09-27 10:35:12 +02:00
thomwolf
da2e47ad15
clean up a little run_tf_glue
2019-09-27 09:41:15 +02:00
thomwolf
528c288fa9
clean up run_tf_glue
2019-09-27 09:40:29 +02:00
VictorSanh
702f589848
fix input in run_glue for distilbert
2019-09-27 00:20:14 -04:00
mgrankin
f71a4577b8
faster dataset building
2019-09-26 16:53:13 +03:00
thomwolf
481d9c4fb5
Merge branch 'master' into tf2
2019-09-26 12:02:54 +02:00
thomwolf
31c23bd5ee
[BIG] pytorch-transformers => transformers
2019-09-26 10:15:53 +02:00
thomwolf
5705333441
add initialization for everybody
2019-09-26 10:06:20 +02:00
thomwolf
7c9f8f93f9
fix tests
2019-09-26 01:59:53 +02:00
thomwolf
d6dde438ea
add batch dimension in encode
2019-09-26 01:45:55 +02:00
thomwolf
4a21c4d88d
add warning if neither pt nor tf are found
2019-09-26 01:30:06 +02:00
thomwolf
3b7fb48c3b
fix loading from tf/pt
2019-09-25 17:46:16 +02:00
thomwolf
a049c8043b
push fix to training
2019-09-25 17:33:16 +02:00
mataney
a9f24a16bc
[FIX] fix run_generation.py to work with batch_size > 1
2019-09-25 15:53:29 +03:00
thomwolf
5def3302f4
update run_glue
2019-09-25 12:38:08 +02:00
thomwolf
f71758f7a4
update internal glue processors
2019-09-25 12:00:50 +02:00
thomwolf
b5ec526f85
updated data processor and metrics
2019-09-24 17:10:50 +02:00
LysandreJik
f09e5ecef0
[Proposal] GLUE processors included in library
2019-09-24 09:47:34 -04:00
LysandreJik
c832f43a4d
output_token_type -> token_type_ids
2019-09-24 07:21:38 -04:00
LysandreJik
3927d7756c
Updated the GLUE pre-processing method
2019-09-24 07:15:11 -04:00
LysandreJik
9d44236f70
Updated DistilBERT
2019-09-24 07:03:24 -04:00
Lorenzo Ampil
4b543c3007
Add option to use a 'stop token' which will be used to truncate the output text to everything till right before the 'stop token'
2019-09-22 21:38:38 +08:00
VictorSanh
9f995b99d4
minor fixes
2019-09-19 21:36:06 +00:00
VictorSanh
3fe5c8e8a8
update bert-base-uncased results
2019-09-19 19:34:22 +00:00
VictorSanh
354944e607
[distillation] big update w/ new weights
2019-09-19 19:25:21 +00:00
LysandreJik
60414f31a9
GLUE updated with new methods
2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5
Sentence -> Sequence. Removed output_mask from the special token addition methods.
2019-09-19 10:55:06 +02:00
LysandreJik
de8e14b6c0
Added DistilBERT to run_squad script
2019-09-19 10:55:06 +02:00
LysandreJik
88368c2a16
Added DistilBERT to run_lm_finetuning
2019-09-19 10:55:06 +02:00
LysandreJik
75635072e1
Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file.
2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52
typo
2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0
Updated the GLUE data processor. Corrections to RoBERTa and XLNet.
2019-09-19 10:55:06 +02:00
Erik Chan
f0340eccf9
Typo
2019-09-18 13:42:11 -07:00
erenup
8960988f35
fixed to find best dev acc
2019-09-19 01:10:05 +08:00
erenup
46ffc28329
Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
15143fbad6
move run_multiple_choice.py and utils_multiple_choice.py to examples
2019-09-18 21:18:46 +08:00
erenup
3cd6289758
Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
...
# Conflicts:
# examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086
move schedule.step after optimizer.step
2019-09-18 21:13:40 +08:00
thomwolf
e768f2322a
update run_openai_gpt to fix #1264
2019-09-18 10:07:47 +02:00
thomwolf
8334993915
clean up examples - updated to new keyword inputs - #1246
2019-09-18 10:01:27 +02:00
erenup
5882c442e5
add example usage
2019-09-16 22:38:08 +08:00
erenup
982f181aa7
Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc
2019-09-16 19:12:00 +08:00
erenup
84b9d1c423
Merge remote-tracking branch 'huggingface/master'
...
# Conflicts:
# pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d
add warning info
2019-09-16 18:53:37 +08:00
erenup
4812a5a767
add doc string
2019-09-16 11:50:18 +08:00
VictorSanh
32e1332acf
[distil] fix once and for all general logger for scripts
2019-09-11 14:19:07 +00:00
VictorSanh
364920e216
fix small bug/typo
2019-09-10 21:45:01 +00:00
Thomas Wolf
23c23f5399
Merge pull request #1229 from SKRohit/master
...
changes in evaluate function in run_lm_finetuning.py
2019-09-10 22:16:45 +02:00
searchivarius
eab980fd68
Fix to prevent crashing on assert len(tokens_b)>=1
2019-09-09 19:58:08 -04:00
VictorSanh
a95ced6260
[Distillation] save last chkpt as pytorch_model.bin
2019-09-09 19:53:35 +00:00
Rohit Kumar Singh
e5df36397b
changes in return statement of evaluate function
...
changed `results` to `result` and removed `results` dict defined previously
2019-09-09 19:55:57 +05:30
LysandreJik
3f91338be9
Patched a few outdated parameters
2019-09-06 17:48:06 -04:00
LysandreJik
f47f9a5874
Updated outdated examples
2019-09-06 17:10:33 -04:00
LysandreJik
5e151f5e77
Table of contents
2019-09-06 12:08:36 -04:00
LysandreJik
593c070435
Better examples
2019-09-06 12:00:12 -04:00
VictorSanh
dddd6b9927
Update DistilBERT training code
2019-09-05 18:26:14 +00:00
Stefan Schweter
a1c34bd286
distillation: fix ModuleNotFoundError error in token counts script
2019-08-31 12:21:38 +02:00
Thomas Wolf
51e980ce36
Merge pull request #1155 from anhnt170489/apex_fp16
...
Update apex fp16 implementation
2019-08-30 23:29:11 +02:00
VictorSanh
282c276e09
typos + file name coherence in distillation README
2019-08-30 12:02:29 -04:00
VictorSanh
803c1cc4ea
fix relative import bug cf Issue #1140
2019-08-30 12:01:27 -04:00
Thomas Wolf
0a2fecdf90
Merge branch 'master' into master
2019-08-30 16:30:08 +02:00
Rabeeh KARIMI
39eb31e11e
remove reloading tokenizer in the training, adding it to the evaluation part
2019-08-30 15:44:41 +02:00
Rabeeh KARIMI
350bb6bffa
updated tokenizer loading for addressing reproducibility issues
2019-08-30 15:34:28 +02:00
Thomas Wolf
01ad55f8cf
Merge pull request #1026 from rabeehk/master
...
loads the tokenizer for each checkpoint, to solve the reproducibility…
2019-08-30 14:15:36 +02:00