Commit Graph

2613 Commits

Author SHA1 Message Date
IWillPull
a3085020ed Added repetition penalty to PPLM example (#2436)
* Added repetition penalty

* Default PPLM repetition_penalty to neutral

* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)

* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
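For context, a minimal sketch of how a repetition penalty of this kind is typically applied to next-token logits (the function and variable names are illustrative, not the PPLM example's actual code; a penalty of 1.0 is neutral, matching the default above):

    import torch

    def apply_repetition_penalty(logits, generated_token_ids, repetition_penalty=1.0):
        # Discourage already-generated tokens: positive logits are divided by
        # the penalty, negative logits are multiplied by it.
        for token_idx in set(generated_token_ids):
            if logits[token_idx] > 0:
                logits[token_idx] = logits[token_idx] / repetition_penalty
            else:
                logits[token_idx] = logits[token_idx] * repetition_penalty
        return logits

    logits = torch.randn(50257)  # e.g. GPT-2 vocabulary size
    logits = apply_repetition_penalty(logits, [464, 318, 464], repetition_penalty=1.2)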
VictorSanh
e83d9f1c1d cleaning - change ' to " (black requirements) 2020-01-10 19:34:25 -05:00
VictorSanh
ebba9e929d minor spring cleaning - missing configs + processing 2020-01-10 19:14:58 -05:00
Victor SANH
331065e62d missing import 2020-01-10 11:42:53 +01:00
Victor SANH
414e9e7122 indents test 2020-01-10 11:42:53 +01:00
Victor SANH
3cdb38a7c0 indents 2020-01-10 11:42:53 +01:00
Victor SANH
ebd45980a0 Align with run_squad + fix some errors 2020-01-10 11:42:53 +01:00
Victor SANH
45634f87f8 fix Sampler in distributed training - evaluation 2020-01-10 11:42:53 +01:00
Victor SANH
af1ee9e648 Move torch.nn.utils.clip_grad_norm_ 2020-01-10 11:42:53 +01:00
Lysandre
164c794eb3 New SQuAD API for distillation script 2020-01-10 11:42:53 +01:00
Lysandre
16ce15ed4b DistilBERT token type ids removed from inputs in run_squad 2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b Fix error with global step in run_squad.py 2020-01-08 11:39:00 +01:00
Oren Amsalem
43114b89ba spelling correction (#2434) 2020-01-07 17:25:25 +01:00
Lysandre Debut
27c1b656cc Fix error with global step in run_lm_finetuning.py 2020-01-07 16:16:12 +01:00
Simone Primarosa
176d3b3079 Add support for Albert and XLMRoberta for the Glue example (#2403)
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris
81d6841b4b GPU text generation: moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
karajan1001
f01b3e6680 fix #2399 an ImportError in official example (#2400)
* fix #2399 an ImportError in official example

* style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
629b22adcf [run_lm_finetuning] mask_tokens: document types 2020-01-01 12:55:10 -05:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin
a8d34e534e Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.

If a user copy-pastes a command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
81422c4e6d Remove unused variables in examples. 2019-12-23 22:29:02 +01:00
Aymeric Augustin
c3783399db Remove redundant requirements with transformers. 2019-12-23 19:17:27 +01:00
Aymeric Augustin
9fc8dcb2a0 Standardize import.
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
1c62e87b34 Use built-in open().
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
d6eaf4e6d2 Update comments mentioning Python 2. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
75a23d24af Remove import fallbacks. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899 Remove sys.version_info[0] == 2 or 3. 2019-12-22 18:38:42 +01:00
Aymeric Augustin
6b2200fc88 Remove u-prefixes. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
c824d15aa1 Remove __future__ imports. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
7e98e211f0 Remove unittest.main() in test modules.
This construct isn't used anymore.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00
Aymeric Augustin
c11b3e2926 Sort imports for optional third-party libraries.
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin
939148b050 Fix F401 flake8 warning (x28).
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
783a616999 Fix F401 flake8 warning (x88 / 116).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
80327a13ea Fix F401 flake8 warning (x152 / 268).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
fa2ccbc081 Fix E266 flake8 warning (x90). 2019-12-22 10:59:08 +01:00
Aymeric Augustin
2ab78325f0 Fix F821 flake8 warning (x47).
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin
631be27078 Fix E722 flake8 warnings (x26). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
b0f7db73cd Fix E741 flake8 warning (x14). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
fd2f17a7a1 Fix E714 flake8 warning (x8). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
5eab3cf6bc Fix W605 flake8 warning (x5). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
7dce8dc7ac Fix E731 flake8 warning (x3). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
357db7098c Fix E712 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
f9c5317db2 Fix E265 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
28e608a2c2 Remove trailing whitespace from all Python files.
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
158e82e061 Sort imports with isort.
This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin
fa84ae26d6 Reformat source code with black.
This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There's a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
344126fe58 move example to mm-imdb folder 2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
[BREAKING CHANGE] Setting all ignored indices to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf
1ab25c49d3 Merge branch 'master' into pr/2115 2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0 fix merge 2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming 2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming 2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf
cfa0380515 Merge branch 'master' into generation_sampler 2019-12-21 14:12:52 +01:00
thomwolf
300ec3003c fixing run_generation example - using torch.no_grad 2019-12-21 14:02:19 +01:00
thomwolf
1c37746892 fixing run_generation 2019-12-21 13:52:49 +01:00
thomwolf
8a2be93b4e fix merge 2019-12-21 13:31:28 +01:00
Thomas Wolf
562f864038
Merge branch 'master' into fix-xlnet-squad2.0 2019-12-21 12:48:10 +01:00
Thomas Wolf
59941c5d1f
Merge pull request #2189 from stefan-it/xlmr
Add support for XLM-RoBERTa
2019-12-20 13:26:38 +01:00
Julien Chaumond
a5a06a851e [doc] Param name consistency 2019-12-19 16:24:20 -05:00
Aidan Kierans
1718fb9e74 Minor/basic text fixes (#2229)
* Small clarification

Matches line 431 to line 435 for additional clarity and consistency.

* Fixed minor typo

The letter "s" was previously omitted from the word "docstrings".
2019-12-19 16:23:18 -05:00
Francesco
62c1fc3c1e Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script 2019-12-19 09:50:56 -05:00
Ejar
284572efc0 Fixed typo in the link
Updated documentation due to typo
2019-12-19 09:36:43 -05:00
Stefan Schweter
a26ce4dee1 examples: add XLM-RoBERTa to glue script 2019-12-19 02:23:01 +01:00
thomwolf
3d2096f516 further cleanup 2019-12-18 11:50:54 +01:00
thomwolf
83bc5235cf Merge branch 'master' into pr/2189 2019-12-17 11:47:32 +01:00
Thomas Wolf
f061606277
Merge pull request #2164 from huggingface/cleanup-configs
[SMALL BREAKING CHANGE] Cleaning up configuration classes - Adding Model Cards
2019-12-17 09:10:16 +01:00
Lysandre
18a879f475 fix #2180 2019-12-16 16:44:29 -05:00
Lysandre
d803409215 Fix run squad evaluate during training 2019-12-16 16:31:38 -05:00
Stefan Schweter
71b4750517 examples: add support for XLM-RoBERTa to run_ner script 2019-12-16 16:37:27 +01:00
thomwolf
dc667ce1a7 double check cc @LysandreJik 2019-12-14 09:56:27 +01:00
thomwolf
7140363e09 update bertabs 2019-12-14 09:44:53 +01:00
Thomas Wolf
a52d56c8d9
Merge branch 'master' into cleanup-configs 2019-12-14 09:43:07 +01:00
erenup
c7780700f5 Merge branch 'refs/heads/squad_roberta'
# Conflicts:
#	transformers/data/processors/squad.py
2019-12-14 08:53:59 +08:00
erenup
8e9526b4b5 add multiprocessing 2019-12-14 08:43:58 +08:00
Lysandre
c8ed1c82c8 [SQUAD] Load checkpoint when evaluating without training 2019-12-13 12:13:48 -05:00
Pierric Cistac
5a5c4349e8
Fix summarization to_cpu doc 2019-12-13 10:02:33 -05:00
thomwolf
47f0e3cfb7 cleaning up configuration classes 2019-12-13 14:33:24 +01:00
erenup
9b312f9d41 initial version for roberta squad 2019-12-13 14:51:40 +08:00
LysandreJik
7296f1010b Cleanup squad and allow train_file and predict_file usage 2019-12-12 13:01:04 -05:00
LysandreJik
3fd71c4431 Update example scripts 2019-12-12 12:08:54 -05:00
Alan deLevie
fbf5455a86 Fix typo in examples/run_glue.py args declaration.
deay -> decay
2019-12-12 11:16:19 -05:00
Bilal Khan
6aa919469d Update run_xnli to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
89896fe04f Update run_ner to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
fdc05cd68f Update run_squad to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
854ec5784e Update run_glue to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:30:36 -06:00
LysandreJik
b72f9d340e Correct index in script 2019-12-10 18:33:17 -05:00
LysandreJik
6a73382706 Complete warning + cleanup 2019-12-10 14:33:24 -05:00
Lysandre
dc4e9e5cb3 DataParallel for SQuAD + fix XLM 2019-12-10 19:21:20 +00:00
Rémi Louf
07bc8efbc3 add greedy decoding and sampling 2019-12-10 17:27:50 +01:00
Rémi Louf
4b82c485de remove misplaced summarization documentation 2019-12-10 09:13:33 -05:00
Thomas Wolf
e57d00ee10
Merge pull request #1984 from huggingface/squad-refactor
[WIP] Squad refactor
2019-12-10 11:07:26 +01:00
Suvrat Bhooshan
df3961121f Add MMBT Model to Transformers Repo 2019-12-09 18:36:48 -08:00
Julien Chaumond
1d18930462 Harmonize no_cuda flag with other scripts 2019-12-09 20:37:55 -05:00
Rémi Louf
f7eba09007 clean for release 2019-12-09 20:37:55 -05:00
Rémi Louf
2a64107e44 improve device usage 2019-12-09 20:37:55 -05:00
Rémi Louf
c0707a85d2 add README 2019-12-09 20:37:55 -05:00
Rémi Louf
ade3cdf5ad integrate ROUGE 2019-12-09 20:37:55 -05:00
Rémi Louf
076602bdc4 prevent BERT weights from being downloaded twice 2019-12-09 20:37:55 -05:00
Rémi Louf
a1994a71ee simplified model and configuration 2019-12-09 20:37:55 -05:00
Rémi Louf
3a9a9f7861 default output dir to documents dir 2019-12-09 20:37:55 -05:00
Rémi Louf
693606a75c update the docs 2019-12-09 20:37:55 -05:00
Rémi Louf
2403a66598 give transformers API to BertAbs 2019-12-09 20:37:55 -05:00
Rémi Louf
ba089c780b share pretrained embeddings 2019-12-09 20:37:55 -05:00
Rémi Louf
9660ba1cbd Add beam search 2019-12-09 20:37:55 -05:00
Rémi Louf
1c71ecc880 load the pretrained weights for encoder-decoder
We currently save the pretrained_weights of the encoder and decoder in
two separate directories `encoder` and `decoder`. However, for the
`from_pretrained` function to operate with automodels we need to
specify the type of model in the path to the weights.

The path to the encoder/decoder weights is handled by the
`PreTrainedEncoderDecoder` class in the `save_pretrained` function. Since
there is no easy way to infer the type of model that was initialized for
the encoder and decoder, we add a parameter `model_type` to the function.
This is not an ideal solution as it is error prone, and the model type
should be carried by the Model classes somehow.

This is a temporary fix that should be changed before merging.
2019-12-09 20:37:55 -05:00
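A hypothetical sketch of the save layout this commit describes: encoder and decoder weights go to separate subdirectories, with the model type embedded in the path so automodels can later infer the architecture (the directory names and helper below are assumptions for illustration, not the actual `PreTrainedEncoderDecoder` code):

    import os

    def save_encoder_decoder(encoder, decoder, save_directory, model_type):
        # e.g. model_type="bert" yields save_directory/bert_encoder and
        # save_directory/bert_decoder, so from_pretrained can detect the type.
        encoder.save_pretrained(os.path.join(save_directory, model_type + "_encoder"))
        decoder.save_pretrained(os.path.join(save_directory, model_type + "_decoder"))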
Rémi Louf
07f4cd73f6 update function to add special tokens
Since I started my PR the `add_special_token_single_sequence` function
has been deprecated in favor of another; I replaced it with the new one.
2019-12-09 20:37:55 -05:00
Bilal Khan
79526f82f5 Remove unnecessary epoch variable 2019-12-09 16:24:35 -05:00
Bilal Khan
9626e0458c Add functionality to continue training from last saved global_step 2019-12-09 16:24:35 -05:00
Bilal Khan
2d73591a18 Stop saving current epoch 2019-12-09 16:24:35 -05:00
Bilal Khan
0eb973b0d9 Use saved optimizer and scheduler states if available 2019-12-09 16:24:35 -05:00
Bilal Khan
a03fcf570d Save tokenizer after each epoch to be able to resume training from a checkpoint 2019-12-09 16:24:35 -05:00
Bilal Khan
f71b1bb05a Save optimizer state, scheduler state and current epoch 2019-12-09 16:24:35 -05:00
LysandreJik
2a4ef098d6 Add ALBERT and XLM to SQuAD script 2019-12-09 10:46:47 -05:00
Lysandre Debut
00c4e39581
Merge branch 'master' into squad-refactor 2019-12-09 10:41:15 -05:00
Thomas Wolf
5482822a2b
Merge pull request #2046 from jplu/tf2-ner-example
Add NER TF2 example.
2019-12-06 12:12:22 +01:00
LysandreJik
e9217da5ff Cleanup
Improve global visibility of the run_squad script, remove unused files, and apply fixes related to XLNet.
2019-12-05 16:01:51 -05:00
LysandreJik
9ecd83dace Patch evaluation for impossible values + cleanup 2019-12-05 14:44:57 -05:00
VictorSanh
35ff345fc9 update requirements 2019-12-05 12:07:04 -05:00
VictorSanh
552c44a9b1 release distilm-bert 2019-12-05 10:14:58 -05:00
Rosanne Liu
ee53de7aac Pr for pplm (#2060)
* license

* changes

* ok

* Update paper link and commands to run

* pointer to uber repo
2019-12-05 09:20:07 -05:00
Julien Plu
9200a759d7 Add a few tests on the TF optimization file, with some info in the documentation. Complete the README. 2019-12-05 12:56:43 +01:00
thomwolf
75a97af6bc fix #1450 - add doc 2019-12-05 11:26:55 +01:00
LysandreJik
f7e4a7cdfa Cleanup 2019-12-04 16:24:15 -05:00
LysandreJik
cca75e7884 Kill the demon spawn 2019-12-04 15:42:29 -05:00
LysandreJik
9ddc3f1a12 Naming update + XLNet/XLM evaluation 2019-12-04 10:37:00 -05:00
thomwolf
5bfcd0485e fix #1991 2019-12-04 14:53:11 +01:00
Julien Plu
ecb923da9c Create a NER example similar to the PyTorch one. It takes the same options, and can be run the same way. 2019-12-04 09:43:15 +01:00
LysandreJik
de276de1c1 Working evaluation 2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5 [pplm] split classif head into its own file 2019-12-03 22:07:25 +00:00
VictorSanh
48cbf267c9 Use full dataset for eval (SequentialSampler in Distributed setting) 2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623 [pplm] Update S3 links
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1 Always use SequentialSampler during evaluation
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
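A short sketch of the sampler choice argued for above, so the full evaluation set is covered regardless of the distributed setup (the toy dataset is illustrative):

    import torch
    from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

    eval_dataset = TensorDataset(torch.randn(8, 4))
    # SequentialSampler, not DistributedSampler: evaluation sees every example once.
    eval_sampler = SequentialSampler(eval_dataset)
    eval_dataloader = DataLoader(eval_dataset, sampler=eval_sampler, batch_size=2)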
Julien Chaumond
3b48806f75 [pplm] README: add setup + tweaks 2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890 readme
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc [pplm] move scripts under examples/pplm/ 2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627 generate_text_pplm now works with batch_size > 1 2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe Changed order of some parameters to be more consistent. Identical results. 2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc Added additional check for url and path in discriminator model params 2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015 Improvements: model_path renamed pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to discriminator's when discrim is specified, sample = False by default but cli parameter introduced. To obtain identical samples call the cli with --sample 2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66 Removed global variable device 2019-12-03 10:14:02 -05:00
piero
7fd54b55a3 Added support for generic discriminators 2019-12-03 10:14:02 -05:00
piero
b0eaff36e6 Added a +1 to epoch when saving weights 2019-12-03 10:14:02 -05:00
piero
611961ade7 Added tqdm to preprocessing 2019-12-03 10:14:02 -05:00
piero
afc7dcd94d Now run_pplm works on cpu. Identical output as before (when using gpu). 2019-12-03 10:14:02 -05:00
piero
61399e5afe Cleaned perturb_past. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ffc2935405 Fix for making unconditioned generation work. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
9f693a0c48 Cleaned generate_text_pplm. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
61a12f790d Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ef47b2c03a Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ea12db3f5 Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
08c6e456a3 Cleaned full_text_generation. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
6c9c131780 More cleanup for run_model. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ffe47c888 Improved device specification 2019-12-03 10:14:02 -05:00
piero
4f2164e40e First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
821de121e8 Minor changes 2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c Fixed minor bug when running training on cuda 2019-12-03 10:14:02 -05:00
piero
0b51fba20b Added script for training a discriminator for pplm to use 2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe Let's make PPLM great again 2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6 tokenizer white space: revert to previous behavior 2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d rm extraneous import 2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e fix the loss backward issue
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2 PPLM (squashed)
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3 Added Camembert to available models 2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003 Refactor logs and fix loss bug 2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a Cleanup & Evaluation now works 2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german 2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d add distilbert + update run_xnli wrt run_glue 2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385 remove fstrings 2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac update readme 2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954 uniformize w/ the cache_dir update 2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f move xnli processor (and utils) to transformers/data/processors 2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7 change default for XNLI: dev --> test 2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf mbert reproducibility results 2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd xnli - output_mode consistency 2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a add xnli examples/README.md 2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e run_xnli + utils_xnli 2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c Uniformize #1952 2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78 ALBERT for SQuAD 2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb Run GLUE and remove LAMB 2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2 Minor bug fixes on run_ner.py 2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2 resize model when special tokenizer present 2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0 fix #1894 2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b [doc] homogenize instructions slightly 2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237 Works for XLNet 2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e wip 2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0 update the documentation 2019-11-21 14:41:19 -05:00
Juha Kiili
2cf3447e0a Glue: log in Valohai-compatible JSON format too 2019-11-21 12:35:25 +02:00
Thomas Wolf
0cdfcca24b
Merge pull request #1860 from stefan-it/camembert-for-token-classification
[WIP] Add support for CamembertForTokenClassification
2019-11-21 10:56:07 +01:00
Jin Young Sohn
e70cdf083d Cleanup TPU bits from run_glue.py
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
2019-11-20 17:54:34 -05:00
Lysandre
454455c695 fix #1879 2019-11-20 09:42:48 -05:00
Stefan Schweter
e7cf2ccd15 distillation: add German distilbert model 2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383 typo "deay" -> "decay" 2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1 camembert: add support for CamemBERT in run_ner example 2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb [camembert] Upload to s3 + rename script 2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871 Update demo_camembert.py with new classes 2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7 Move demo_camembert.py to examples/contrib 2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c Add demo_camembert.py 2019-11-16 00:11:07 -05:00
Xu Hongshen
ca99a2d500 Update example readme 2019-11-15 14:55:26 +08:00
Xu Hongshen
7da3ef24cd add is_impossible tensor to model inputs during fine-tuning xlnet on squad2.0 2019-11-15 14:18:53 +08:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT 2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
Rémi Louf
2276bf69b7 update the examples, docs and template 2019-11-14 20:38:02 +01:00
Lysandre
d7929899da Specify checkpoint in saved file for run_lm_finetuning.py 2019-11-14 10:49:00 -05:00
ronakice
2e31176557 fix multi-gpu eval 2019-11-12 05:55:11 -05:00
Stefan Schweter
2b07b9e5ee examples: add DistilBert support for NER fine-tuning 2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044 Fix run_bertology.py
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f [run_tf_glue] Add comment for context 2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af misc doc 2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
It takes a considerable amount of time (~10 min) to parse the examples into features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
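An illustrative sketch of the change described above: wrapping the conversion loop in tqdm so the long preprocessing step shows a progress bar (the conversion body here is a placeholder):

    from tqdm import tqdm

    def convert_examples_to_features(examples):
        features = []
        for example in tqdm(examples, desc="convert examples to features"):
            features.append(example.lower())  # placeholder for the real conversion
        return features

    convert_examples_to_features(["An example sentence."] * 1000)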
thomwolf
89d6272898 Fix #1623 2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master 2019-11-04 11:41:26 +01:00
thomwolf
ad90868627 Update example readme 2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training 2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53 run_tf_glue MRPC evaluation only for MRPC 2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb Added mixed precision support to benchmarks.py 2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241 [run_generation] Fix generation with batch_size>1 2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering 2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil* 2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta 2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation 2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f format utils for summarization 2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466 fix bug with padding mask + add corresponding test 2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e rename seq2seq to encoder_decoder 2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61 revert renaming of lm_labels to ltr_lm_labels 2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312 update docstrings; rename lm_labels to more explicit ltr_lm_labels 2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691 resolve PR comments 2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb Evaluation fixed. 2019-10-28 10:18:58 -04:00
altsoph
438f2730a0 Evaluation code fixed. 2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8 here's one big commit 2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e fix test of truncation function 2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8 extend works in-place 2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe pad sequence with 0, mask with -1 2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa run_tf_glue works with all tasks 2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51 Add roberta to doc 2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f Add Roberta to run_ner.py 2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b [release] fix table weirdness 2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd [RELEASE] DistilRoBERTa 2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7 fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multiple GPUs in parallel. 2019-10-23 20:27:13 +08:00
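A minimal sketch of the failure mode behind this fix: under torch.nn.DataParallel the per-batch loss comes back as a vector with one entry per GPU, so it must be reduced to a scalar before calling .item() (the values below are illustrative):

    import torch

    tmp_eval_loss = torch.tensor([0.41, 0.57])  # one loss per GPU under DataParallel
    eval_loss = 0.0
    if tmp_eval_loss.dim() > 0:
        tmp_eval_loss = tmp_eval_loss.mean()  # reduce to a scalar first
    eval_loss += tmp_eval_loss.item()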
Julien Chaumond
ef1b8b2ae5 [CTRL] warn if generation prompt does not start with a control code
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed Remove 2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868 Option to benchmark only one of the two libraries 2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21 gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well 2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b gradient norm clipping should be done right before calling the optimiser 2019-10-20 22:33:56 +01:00
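A minimal sketch (assuming a generic PyTorch training loop, not the repo's actual scripts) of the ordering these two commits enforce: gradients are clipped after backward() and immediately before optimizer.step():

    import torch

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # right before the step
    optimizer.step()
    optimizer.zero_grad()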
LysandreJik
7dd29ed2f1 Benchmarks example script 2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a Add speed log to examples/run_squad.py
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
leo-du
ecd15667f3 fix repetition penalty 2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036 fix data processing in script 2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061 add training pipeline (formatting temporary) 2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0 use two different tokenizers for story and summary 2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4 add Model2Model to __init__ 2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35 correct the truncation and padding of dataset 2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd add separator between data import and train 2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e specify in readme that both datasets are required 2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587 test the full story processing 2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859 truncation function is fully tested 2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8 wip commit, switching computers 2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb Merge branch 'master' of https://github.com/huggingface/transformers 2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762 install tensorboard 2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad add readme information 2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0 Add NER finetuning details by @stefan-it in example readme 2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622 [ner] Honor args.overwrite_cache 2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43 add seqeval as requirement for examples 2019-10-15 09:17:31 +02:00
thomwolf
66adb71734 update to transformers 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a Add option to predict on test set 2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1 Add cli argument for configuring labels 2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64 Make file reading more robust 2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757 Add option to predict on test set 2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d Add cli argument for configuring labels 2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1 Make file reading more robust 2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:14:53 +02:00
hlums
74c5035808 Fix token order in xlnet preprocessing. 2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15 add instructions to fetch the dataset 2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d delegate the padding with special tokens to the tokenizer 2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f process the raw CNN/Daily Mail dataset
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Simon Layton
4e6a55751a Force einsum to fp16 2019-10-14 11:12:41 -04:00
Rémi Louf
67d10960ae load and prepare CNN/Daily Mail data
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674 Added automatic mixed precision and XLA options to run_tf_glue.py 2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944 Minor enhancements to run_tf_glue.py 2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005 Add citation bibtex 2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace read parameters from CLI, load model & tokenizer 2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b add base for seq2seq finetuning 2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd convert int to str before adding to a str 2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190 Don't save/load when on TPU 2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3 GLUE on TPU 2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670 add test for initialization of Bert2Rnd 2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
thomwolf
177a721205 move back to simple space splitting 2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a better error messages 2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d Merge branch 'master' into pr/1383 2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f Improve naming and only do regex when necessary 2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8 Golden path 2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b Improve readability and make fewer assumptions about checkpoint format 2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8 Extract method 2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712 Delete older checkpoint after saving new checkpoint 2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb Add save_total_limit 2019-10-09 08:48:40 -04:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe Change tensorboard imports to use built-in tensorboard if available 2019-10-08 16:29:43 -05:00
VictorSanh
7ce83b4931 update weights for distilgpt2 2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb Correct device assignment in run_generation 2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930 fixes 2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066 update encode_plus - add truncation strategies 2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555 unnecessary carriage return 2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821 run_squad --> run_squad_w_distillation 2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec add distillation+finetuning option in run_squad 2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90 Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol 2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c Honor args.overwrite_cache (h/t @erenup) 2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94 Adding CTRL (squashed commit)
adding conversion script

adding first draft of modeling & tokenization

adding placeholder for test files

bunch of changes

registering the tokenizer/model/etc

tests

change link; something is very VERY wrong here

weird end-of-word thingy going on

i think the tokenization works now ; wrote the unit tests

overall structure works;load w next

the monster is alive!

works after some cleanup as well

adding emacs autosave to gitignore

currently only supporting the 48 layer one; seems to infer fine on my macbook

cleanup

fixing some documentation

fixing some documentation

tests passing?

now works on CUDA also

adding greedy?

adding greedy sampling

works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master 2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa XLM use_lang_embedding flag in run_generation 2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb Added XLM to run_generation, with prompt language selection. 2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f Update naming + remove f string in run_lm_finetuning example 2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb LM finetuning won't mask special tokens anymore 2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910 Update run_glue.py
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a prepare release 2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb incoming release 🔥 update links to arxiv preprint 2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f update README - TODO update link to paper 2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f update train.py 2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0 update requirements 2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54 update distiller 2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db update binarized_data 2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376 update extract 2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079 update extract_distilbert 2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205 update token_counts 2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f grouped_batch_sampler 2019-10-03 10:27:11 -04:00
VictorSanh
594202a934 lm_seqs_dataset 2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4 add distillation_configs 2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9 Evaluation result.txt path changing #1286 2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00
thomwolf
f7978f70ec use format instead of f-strings 2019-10-01 18:45:38 -04:00
Julien Chaumond
b350662955 overflowing_tokens do not really make sense here, let's just return a number
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2019-09-30 16:37:09 -04:00
Julien Chaumond
f5bcde0b2f [multiple-choice] Simplify and use tokenizer.encode_plus 2019-09-30 16:04:55 -04:00
Denny
9478590630
Update run_lm_finetuning.py
The previous method, just as phrased, did not exist in the class.
2019-09-27 15:18:42 -03:00
Thomas Wolf
d83d295763
Merge pull request #1337 from mgrankin/fastdataset
faster dataset building
2019-09-27 10:35:12 +02:00
thomwolf
da2e47ad15 clean up a little run_tf_glue 2019-09-27 09:41:15 +02:00
thomwolf
528c288fa9 clean up run_tf_glue 2019-09-27 09:40:29 +02:00
VictorSanh
702f589848 fix input in run_glue for distilbert 2019-09-27 00:20:14 -04:00
mgrankin
f71a4577b8 faster dataset building 2019-09-26 16:53:13 +03:00
thomwolf
481d9c4fb5 Merge branch 'master' into tf2 2019-09-26 12:02:54 +02:00
thomwolf
31c23bd5ee [BIG] pytorch-transformers => transformers 2019-09-26 10:15:53 +02:00
thomwolf
5705333441 add initialization for everybody 2019-09-26 10:06:20 +02:00
thomwolf
7c9f8f93f9 fix tests 2019-09-26 01:59:53 +02:00
thomwolf
d6dde438ea add batch dimension in encode 2019-09-26 01:45:55 +02:00
thomwolf
4a21c4d88d add warning if neither pt nor tf are found 2019-09-26 01:30:06 +02:00
thomwolf
3b7fb48c3b fix loading from tf/pt 2019-09-25 17:46:16 +02:00
thomwolf
a049c8043b push fix to training 2019-09-25 17:33:16 +02:00
mataney
a9f24a16bc [FIX] fix run_generation.py to work with batch_size > 1 2019-09-25 15:53:29 +03:00
thomwolf
5def3302f4 update run_glue 2019-09-25 12:38:08 +02:00
thomwolf
f71758f7a4 update internal glue processors 2019-09-25 12:00:50 +02:00
thomwolf
b5ec526f85 updated data processor and metrics 2019-09-24 17:10:50 +02:00
LysandreJik
f09e5ecef0 [Proposal] GLUE processors included in library 2019-09-24 09:47:34 -04:00
LysandreJik
c832f43a4d output_token_type -> token_type_ids 2019-09-24 07:21:38 -04:00
LysandreJik
3927d7756c Updated the GLUE pre-processing method 2019-09-24 07:15:11 -04:00
LysandreJik
9d44236f70 Updated DistilBERT 2019-09-24 07:03:24 -04:00
Lorenzo Ampil
4b543c3007 Add option to use a 'stop token', which truncates the output text to everything before the first occurrence of the 'stop token' 2019-09-22 21:38:38 +08:00
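A hedged sketch of the stop-token behaviour described above (the helper name is hypothetical):

    def truncate_at_stop_token(text, stop_token):
        # Keep everything up to, but not including, the first stop token.
        return text[: text.index(stop_token)] if stop_token in text else text

    print(truncate_at_stop_token("A generated sentence. <|endoftext|> trailing junk", "<|endoftext|>"))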
VictorSanh
9f995b99d4 minor fixes 2019-09-19 21:36:06 +00:00
VictorSanh
3fe5c8e8a8 update bert-base-uncased results 2019-09-19 19:34:22 +00:00
VictorSanh
354944e607 [distillation] big update w/ new weights 2019-09-19 19:25:21 +00:00
LysandreJik
60414f31a9 GLUE updated with new methods 2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5 Sentence -> Sequence. Removed output_mask from the special token addition methods. 2019-09-19 10:55:06 +02:00
LysandreJik
de8e14b6c0 Added DistilBERT to run_squad script 2019-09-19 10:55:06 +02:00
LysandreJik
88368c2a16 Added DistilBERT to run_lm_finetuning 2019-09-19 10:55:06 +02:00
LysandreJik
75635072e1 Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file. 2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52 typo 2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0 Updated the GLUE data processor. Corrections to RoBERTa and XLNet. 2019-09-19 10:55:06 +02:00
Erik Chan
f0340eccf9
Typo
2019-09-18 13:42:11 -07:00
erenup
8960988f35 fixed to find best dev acc 2019-09-19 01:10:05 +08:00
erenup
46ffc28329 Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
15143fbad6 move run_multiple_choice.py and utils_multiple_choice.py to examples 2019-09-18 21:18:46 +08:00
erenup
3cd6289758 Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
# Conflicts:
#	examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086 move schedule.step after optimizer.step 2019-09-18 21:13:40 +08:00
thomwolf
e768f2322a update run_openai_gpt to fix #1264 2019-09-18 10:07:47 +02:00
thomwolf
8334993915 clean up examples - updated to new keyword inputs - #1246 2019-09-18 10:01:27 +02:00
erenup
5882c442e5 add example usage 2019-09-16 22:38:08 +08:00
erenup
982f181aa7 Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc 2019-09-16 19:12:00 +08:00
erenup
84b9d1c423 Merge remote-tracking branch 'huggingface/master'
# Conflicts:
#	pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d add warning info 2019-09-16 18:53:37 +08:00
erenup
4812a5a767 add doc string 2019-09-16 11:50:18 +08:00
VictorSanh
32e1332acf [distil] fix once and for all the general logger for scripts 2019-09-11 14:19:07 +00:00
VictorSanh
364920e216 fix small bug/typo 2019-09-10 21:45:01 +00:00
Thomas Wolf
23c23f5399
Merge pull request #1229 from SKRohit/master
changes in evaluate function in run_lm_finetuning.py
2019-09-10 22:16:45 +02:00
searchivarius
eab980fd68 Fix to prevent crashing on assert len(tokens_b)>=1 2019-09-09 19:58:08 -04:00
VictorSanh
a95ced6260 [Distillation] save last chkpt as pytorch_model.bin 2019-09-09 19:53:35 +00:00
Rohit Kumar Singh
e5df36397b
changes in return statement of evaluate function
changed `results` to `result` and removed `results` dict defined previously
2019-09-09 19:55:57 +05:30
LysandreJik
3f91338be9 Patched a few outdated parameters 2019-09-06 17:48:06 -04:00
LysandreJik
f47f9a5874 Updated outdated examples 2019-09-06 17:10:33 -04:00
LysandreJik
5e151f5e77 Table of contents 2019-09-06 12:08:36 -04:00
LysandreJik
593c070435 Better examples 2019-09-06 12:00:12 -04:00
VictorSanh
dddd6b9927 Update DistilBERT training code 2019-09-05 18:26:14 +00:00
Stefan Schweter
a1c34bd286 distillation: fix ModuleNotFoundError in token counts script 2019-08-31 12:21:38 +02:00
Thomas Wolf
51e980ce36
Merge pull request #1155 from anhnt170489/apex_fp16
Update apex fp16 implementation
2019-08-30 23:29:11 +02:00
VictorSanh
282c276e09 typos + file name coherence in distillation README 2019-08-30 12:02:29 -04:00
VictorSanh
803c1cc4ea fix relative import bug cf Issue #1140 2019-08-30 12:01:27 -04:00
Thomas Wolf
0a2fecdf90
Merge branch 'master' into master 2019-08-30 16:30:08 +02:00
Rabeeh KARIMI
39eb31e11e remove reloading tokenizer in the training, adding it to the evaluation part 2019-08-30 15:44:41 +02:00
Rabeeh KARIMI
350bb6bffa updated tokenizer loading for addressing reproducibility issues 2019-08-30 15:34:28 +02:00
Thomas Wolf
01ad55f8cf
Merge pull request #1026 from rabeehk/master
loads the tokenizer for each checkpoint, to solve the reproducibility…
2019-08-30 14:15:36 +02:00
erenup
6e1ac34e2b Merge remote-tracking branch 'huggingface/master' 2019-08-30 15:50:11 +08:00
jamin
2fb9a934b4 re-format 2019-08-30 14:05:28 +09:00
jamin
c8731b9583 update apex fp16 implementation 2019-08-30 13:54:00 +09:00
LysandreJik
caf1d116a6 Closing bracket in DistilBERT's token count. 2019-08-29 15:30:10 -04:00
Luis
fe8fb10b44 Small modification of comment in the run_glue.py example
Add RoBERTa to the comment as it was not explicit that RoBERTa don't use token_type_ids.
2019-08-29 14:43:30 +02:00
erenup
942d3f4b20 modify code of arc label insurance 2019-08-29 10:21:17 +08:00
LysandreJik
bf3dc778b8 Changed learning rate for run_squad test 2019-08-28 18:24:43 -04:00
Andreas Daiminger
1d15a7f278 swap order of optimizer.step() and scheduler.step() 2019-08-28 19:18:27 +02:00
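Several commits in this log make the same swap; a minimal sketch of the resulting order (since PyTorch 1.1, scheduler.step() should run after optimizer.step(), otherwise the first value of the schedule is skipped). The toy model and constant schedule are illustrative:

    import torch

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda step: 1.0)

    for batch in [torch.randn(2, 4)] * 3:
        model(batch).sum().backward()
        optimizer.step()
        scheduler.step()  # after optimizer.step(), not before
        optimizer.zero_grad()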
Thomas Wolf
0ecfd17f49
Merge pull request #987 from huggingface/generative-finetuning
Generative finetuning
2019-08-28 16:51:50 +02:00
thomwolf
b5eb283aaa update credits 2019-08-28 16:36:55 +02:00
thomwolf
912a377e90 dilbert -> distilbert 2019-08-28 13:59:42 +02:00
thomwolf
4ce5f36f78 update readmes 2019-08-28 12:14:31 +02:00
erenup
ec4b1c659f logging truth error 2019-08-28 16:50:40 +08:00
erenup
df52abe373 add sep_token between question and choice 2019-08-28 16:36:21 +08:00
erenup
43c243254a avoid invalid labels of truth 2019-08-28 16:03:17 +08:00
erenup
3c7e676f8b add test related code: test the best dev acc model when model is training 2019-08-28 15:57:29 +08:00
VictorSanh
93e82ab424 Write README for DilBERT 2019-08-28 06:26:09 +00:00
VictorSanh
fea921d382 add licensing 2019-08-28 04:45:39 +00:00
VictorSanh
da1e4e53fc some fixes in train.py for loading previous checkpoint 2019-08-28 04:01:03 +00:00
VictorSanh
0d8f8848d5 add scripts/extract_for_distil.py 2019-08-28 04:00:19 +00:00
VictorSanh
7f2c384c80 add scripts/token_counts.py 2019-08-28 04:00:03 +00:00
VictorSanh
4d16b279e5 add scripts/binarized_data.py 2019-08-28 03:59:48 +00:00
VictorSanh
b247b0d880 add train.py for distillation 2019-08-28 02:12:47 +00:00
VictorSanh
780f183e55 add requirements 2019-08-28 01:39:52 +00:00
VictorSanh
e424d2e45d add README 2019-08-28 01:10:10 +00:00
VictorSanh
1ae81e4aa1 add dataset. distiller, utils 2019-08-28 01:10:05 +00:00
thomwolf
06510ccb53 typo 2019-08-23 22:08:10 +02:00
thomwolf
ab7bd5ef98 fixing tokenization and training 2019-08-23 17:31:21 +02:00
Thomas Wolf
90dcd8c05d
Merge branch 'master' into generative-finetuning 2019-08-22 10:43:30 +02:00
VictorSanh
57272d5ddf fix for glue 2019-08-22 00:25:49 -04:00
VictorSanh
b006a7a12f fix for squad 2019-08-22 00:25:42 -04:00
Thomas Wolf
9beaa85b07
Merge pull request #1055 from qipeng/run_squad_fix
Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models)
2019-08-21 01:20:46 +02:00
Lysandre
2d042274ac Sequence special token handling for BERT and RoBERTa 2019-08-20 14:15:28 -04:00
Peng Qi
3bffd2e8e5 more fixes 2019-08-20 10:59:28 -07:00
Thomas Wolf
3b56427a1e
Merge pull request #1040 from FeiWang96/multi_gpu
Fix bug of multi-gpu training in lm finetuning
2019-08-20 17:13:44 +02:00
thomwolf
a690edab17 various fix and clean up on run_lm_finetuning 2019-08-20 15:52:12 +02:00
erenup
fc74132598 add best steps to train 2019-08-20 19:06:41 +08:00
Duzeyao
d86b49ac86 swap optimizer.step and scheduler.step 2019-08-20 16:46:34 +08:00
Duzeyao
45ab8bf60e Revert "Update finetune_on_pregenerated.py"
This reverts commit a1359b970c.
2019-08-20 16:40:39 +08:00
erenup
97c30b73d5 add test related code 2019-08-20 16:31:04 +08:00
erenup
d5e60e5b7a add test related code 2019-08-20 16:25:50 +08:00
Zeyao Du
a1359b970c
Update finetune_on_pregenerated.py 2019-08-20 16:00:07 +08:00
Zeyao Du
28f7ca1f80
swap optimizer.step and scheduler.step 2019-08-20 15:58:42 +08:00
Peng Qi
a368b87791 Fix #1015 2019-08-19 13:07:00 -07:00
Lysandre
f94f1c6016 Distributed training + tokenizer agnostic mask token 2019-08-19 14:58:50 -04:00
Thomas Wolf
5a49b793d9
Merge pull request #1023 from tuvuumass/patch-1
fix issue #824
2019-08-19 15:31:46 +02:00
erenup
4270d3da1b fix a bug of evaluating 2019-08-19 16:38:52 +08:00
Chi-Liang Liu
40acf6b52a don't save model without training 2019-08-18 05:02:25 -04:00
erenup
47e9aea0fe add args info to evaluate_result.txt 2019-08-18 17:00:53 +08:00
erenup
5582bc4b23 add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80
2019-08-18 16:01:48 +08:00
wangfei
856a63da4d Fix: save model/model.module 2019-08-18 11:03:47 +08:00
wangfei
1ef41b8337 Revert "Fix: save model/model.module"
This reverts commit 00e9c4cc96.
2019-08-18 11:03:12 +08:00
wangfei
00e9c4cc96 Fix: save model/model.module 2019-08-18 11:02:02 +08:00
erenup
e384ae2b9d Merge remote-tracking branch 'huggingface/master'
merge huggingface/master to update
2019-08-17 12:05:57 +08:00
Jason Phang
d8923270e6 Correct truncation for RoBERTa in 2-input GLUE 2019-08-16 16:30:38 -04:00
Lysandre
5652f54ac2 Simplified data generator + better perplexity calculator
GPT-2 now obtains ~20 perplexity on WikiText-2
2019-08-16 13:49:56 -04:00
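A brief sketch of the perplexity measure mentioned above: perplexity is the exponential of the average per-token negative log-likelihood (the numbers below are illustrative):

    import math

    def perplexity(total_neg_log_likelihood, num_tokens):
        return math.exp(total_neg_log_likelihood / num_tokens)

    print(perplexity(total_neg_log_likelihood=3000.0, num_tokens=1000))  # ~20.1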
LysandreJik
7e7fc53da5 Fixing run_glue example with RoBERTa 2019-08-16 11:53:10 -04:00
LysandreJik
715534800a BERT + RoBERTa masking tokens handling + GPU device update. 2019-08-16 10:10:21 -04:00
LysandreJik
339e556feb CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism. 2019-08-16 10:10:20 -04:00
LysandreJik
5c18825a18 Removed dataset limit 2019-08-16 10:10:20 -04:00
LysandreJik
3e3e145497 Added GPT to the generative fine-tuning. 2019-08-16 10:10:20 -04:00
LysandreJik
47975ed53e Language Modeling fine-tuning using GPT-2. 2019-08-16 10:10:20 -04:00
wangfei
b8ff56896c Fix bug of multi-gpu training in lm finetuning 2019-08-16 12:11:05 +08:00
Rabeeh KARIMI
3d47a7f8ab loads the tokenizer for each checkpoint, to solve the reproducibility issue 2019-08-14 10:58:26 +02:00
LysandreJik
39f426be65 Added special tokens <pad> and <mask> to RoBERTa. 2019-08-13 15:19:50 -04:00
Julien Chaumond
baf08ca1d4 [RoBERTa] run_glue: correct pad_token + reorder labels 2019-08-13 12:51:15 -04:00
tuvuumass
ba4bce2581
fix issue #824 2019-08-13 11:26:27 -04:00
Julien Chaumond
912fdff899 [RoBERTa] Update run_glue for RoBERTa 2019-08-12 13:49:50 -04:00
erenup
b219029c45 refactoring old run_swag. This script is mainly refactored from run_squad in pytorch_transformers 2019-08-11 15:20:37 +08:00
Thomas Wolf
b4f9464f90
Merge pull request #960 from ethanjperez/patch-1
Fixing unused weight_decay argument
2019-08-07 10:09:55 +02:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models 2019-08-05 19:17:35 +02:00
thomwolf
70c10caa06 add option mentioned in #940 2019-08-05 17:09:37 +02:00
thomwolf
b90e29d52c working on automodels 2019-08-05 16:06:34 +02:00
Ethan Perez
28ba345ecc
Fixing unused weight_decay argument
Currently the L2 regularization is hard-coded to "0.01", even though there is a --weight_decay flag implemented (that is unused). I'm making this flag control the weight decay used for fine-tuning in this script.
2019-08-04 12:31:46 -04:00
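A hedged sketch of wiring a --weight_decay flag into the optimizer instead of a hard-coded 0.01, using the parameter-grouping pattern common in these scripts (the helper is illustrative; biases and LayerNorm weights are conventionally exempted from decay):

    import torch

    def build_optimizer(model, learning_rate, weight_decay):
        no_decay = ("bias", "LayerNorm.weight")
        grouped_parameters = [
            {"params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
             "weight_decay": weight_decay},  # taken from the flag, not a literal 0.01
            {"params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)],
             "weight_decay": 0.0},
        ]
        return torch.optim.AdamW(grouped_parameters, lr=learning_rate)

    optimizer = build_optimizer(torch.nn.Linear(4, 2), learning_rate=2e-5, weight_decay=0.01)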
Thomas Wolf
c054b5ee64
Merge pull request #896 from zijunsun/master
fix multi-gpu training bug when using fp16
2019-07-26 19:31:02 +02:00
zijunsun
f0aeb7a814 multi-gpu training also should be after apex fp16 (squad) 2019-07-26 15:23:29 +08:00
zijunsun
adb3ef6368 multi-gpu training also should be after apex fp16 2019-07-25 13:09:10 +08:00
Chi-Liang Liu
a7fce6d917 fix squad v1 error (na_prob_file should be None) 2019-07-24 16:11:36 +08:00
thomwolf
6070b55443 fix #868 2019-07-23 17:46:01 +02:00
thomwolf
2c9a3115b7 fix #858 2019-07-23 16:45:55 +02:00
Thomas Wolf
268c6cc160
Merge pull request #845 from rabeehk/master
fixed version issues in run_openai_gpt
2019-07-23 15:29:31 +02:00
Peiqin Lin
76be189b08 typos 2019-07-21 20:39:42 +08:00
Rabeeh KARIMI
f63ff536ad fixed version issues in run_openai_gpt 2019-07-20 12:43:07 +02:00
Thomas Wolf
a615499076
Merge pull request #797 from yzy5630/fix-examples
fix some errors for distributed lm_finetuning
2019-07-18 23:32:33 +02:00
yzy5630
a1fe4ba9c9 use new API for save and load 2019-07-18 15:45:23 +08:00
yzy5630
a7ba27b1b4 add parser for adam 2019-07-18 08:52:51 +08:00
yzy5630
d6522e2873 change loss and optimizer to new API 2019-07-17 21:22:34 +08:00
thomwolf
71d597dad0 fix #800 2019-07-17 13:51:09 +02:00
yzy5630
123da5a2fa fix errors for lm_finetuning examples 2019-07-17 09:56:07 +08:00
yzy5630
60a1bdcdac fix some errors for distributed lm_finetuning 2019-07-17 09:16:20 +08:00
thomwolf
e848b54730 fix #792 2019-07-16 21:22:19 +02:00
thomwolf
1849aa7d39 update readme and pretrained model weight files 2019-07-16 15:11:29 +02:00
thomwolf
f31154cb9d Merge branch 'xlnet' 2019-07-16 11:51:13 +02:00
thomwolf
76da9765b6 fix run_generation test 2019-07-15 17:52:35 +02:00
thomwolf
e691fc0963 update QA models tests + run_generation 2019-07-15 17:45:24 +02:00
thomwolf
15d8b1266c update tokenizer - update squad example for xlnet 2019-07-15 17:30:42 +02:00
thomwolf
3b469cb422 updating squad for compatibility with XLNet 2019-07-15 15:28:37 +02:00
thomwolf
0e9825e252 small fix to run_glue 2019-07-14 23:43:28 +02:00
thomwolf
2397f958f9 updating examples and doc 2019-07-14 23:20:10 +02:00
thomwolf
c490f5ce87 added generation examples in tests 2019-07-13 15:26:58 +02:00
thomwolf
7d4b200e40 good quality generation example for GPT, GPT-2, Transfo-XL, XLNet 2019-07-13 15:25:03 +02:00
thomwolf
7322c314a6 remove python2 testing for examples 2019-07-12 14:24:08 +02:00
thomwolf
936e813c84 clean up examples - added squad example and test 2019-07-12 14:16:06 +02:00
thomwolf
762ded9b1c wip examples 2019-07-12 11:28:52 +02:00
LysandreJik
3821ecbf4a Byte order mark management in TSV glue reading. 2019-07-11 20:16:28 -04:00
thomwolf
c6bf1a400d fix test examples and pretrained model 2019-07-11 22:29:08 +02:00
thomwolf
92a782b108 fix run_glue test 2019-07-11 22:20:10 +02:00
thomwolf
ccb6947dc1 optimization tests 2019-07-11 17:39:47 +02:00
thomwolf
b21d84b027 update examples 2019-07-11 15:37:34 +02:00
thomwolf
ec07cf5a66 revamp optimization 2019-07-11 14:48:22 +02:00
thomwolf
4fef5919a5 updating examples 2019-07-11 12:03:08 +02:00
thomwolf
50b7e52a7f WIP examples 2019-07-10 15:33:34 +02:00
thomwolf
ed6c8d37f4 fix merge 2019-07-09 17:14:52 +02:00
thomwolf
4ce237c880 update run_glue 2019-07-09 17:00:32 +02:00
thomwolf
3b7cb7bf44 small update to run_glue 2019-07-09 16:12:15 +02:00
thomwolf
d0efbd3cd1 update sequencesummary module 2019-07-09 15:46:43 +02:00
thomwolf
d5481cbe1b adding tests to examples - updating summary module - coverage update 2019-07-09 15:29:42 +02:00
thomwolf
b19786985d unified tokenizer api and serialization + tests 2019-07-09 10:25:18 +02:00
thomwolf
3d5f291386 updates to run_glue 2019-07-05 17:22:15 +02:00
thomwolf
99b90edab1 cleaning up run_glue example 2019-07-05 17:09:35 +02:00
thomwolf
1113f97f33 clean up glue example 2019-07-05 16:31:13 +02:00
thomwolf
162ba383b0 fix model loading 2019-07-05 15:57:14 +02:00
thomwolf
36bca545ff tokenization abstract class - tests for examples 2019-07-05 15:02:59 +02:00
Thomas Wolf
78462aad61
Merge pull request #733 from ceremonious/parallel-generation
Added option to use multiple workers to create training data
2019-07-05 12:04:30 +02:00
thomwolf
0bab55d5d5 [BIG] name change 2019-07-05 11:55:36 +02:00
thomwolf
c41f2bad69 WIP XLM + refactoring 2019-07-03 22:54:39 +02:00
Lei Mao
64b2a828c0 fix evaluation bug 2019-07-01 14:56:24 -07:00
thomwolf
2b56e98892 standardizing API across models - XLNetForSeqClass working 2019-06-28 16:35:09 +02:00
thomwolf
3a00674cbf fix imports 2019-06-27 17:18:46 +02:00
Mayhul Arora
08ff056c43 Added option to use multiple workers to create training data for lm fine tuning 2019-06-26 16:16:12 -07:00
thomwolf
59cefd4f98 fix #726 - get_lr in examples 2019-06-26 11:28:27 +02:00
thomwolf
092dacfd62 changing is_regression to unified API 2019-06-26 09:54:05 +02:00
thomwolf
e55d4c4ede various updates to conversion, models and examples 2019-06-26 00:57:53 +02:00
thomwolf
7334bf6c21 pad on left for xlnet 2019-06-24 15:05:11 +02:00
thomwolf
c888663f18 overwrite output directories if needed 2019-06-24 14:38:24 +02:00
thomwolf
62d78aa37e updating GLUE utils for compatibility with XLNet 2019-06-24 14:36:11 +02:00
thomwolf
24ed0b9346 updating run_xlnet_classifier 2019-06-24 12:00:09 +02:00
thomwolf
f6081f2255 add XLNetForSequenceClassification and run_classifier example for XLNet 2019-06-24 10:01:07 +02:00
Rocketknight1
c7b2808ed7 Update LM finetuning README to include a literature reference 2019-06-22 15:04:01 +01:00
thomwolf
181075635d updating model loading and adding special tokens ids 2019-06-21 23:23:37 +02:00
thomwolf
ebd2cb8d74 update from_pretrained to load XLNetModel as well 2019-06-21 21:08:44 +02:00
thomwolf
edfe91c36e first version bertology ok 2019-06-19 23:43:04 +02:00
thomwolf
7766ce66dd update bertology 2019-06-19 22:29:51 +02:00
thomwolf
e4b46d86ce update head pruning 2019-06-19 22:16:30 +02:00
thomwolf
0f40e8d6a6 debugger 2019-06-19 15:38:46 +02:00
thomwolf
0e1e8128bf more logging 2019-06-19 15:35:49 +02:00
thomwolf
909d4f1af2 cuda again 2019-06-19 15:32:10 +02:00
thomwolf
14f0e8e557 fix cuda 2019-06-19 15:29:28 +02:00
thomwolf
34d706a0e1 pruning in bertology 2019-06-19 15:25:49 +02:00
thomwolf
dc8e0019b7 updating examples 2019-06-19 13:23:20 +02:00
thomwolf
68ab9599ce small fix and updates to readme 2019-06-19 09:38:38 +02:00
thomwolf
f7e2ac01ea update barrier 2019-06-18 22:43:35 +02:00
thomwolf
4d8c4337ae test barrier in distrib training 2019-06-18 22:41:28 +02:00
thomwolf
3359955622 updating run_classif 2019-06-18 22:23:10 +02:00
thomwolf
29b7b30eaa updating evaluation on a single gpu 2019-06-18 22:20:21 +02:00
thomwolf
7d2001aa44 overwrite_output_dir 2019-06-18 22:13:30 +02:00
thomwolf
16a1f338c4 fixing 2019-06-18 17:06:31 +02:00
thomwolf
92e0ad5aba no numpy 2019-06-18 17:00:52 +02:00
thomwolf
4e6edc3274 hop 2019-06-18 16:57:15 +02:00
thomwolf
f55b60b9ee fixing again 2019-06-18 16:56:52 +02:00
thomwolf
8bd9118294 quick fix 2019-06-18 16:54:41 +02:00
thomwolf
3e847449ad fix out_label_ids 2019-06-18 16:53:31 +02:00
thomwolf
aad3a54e9c fix paths 2019-06-18 16:48:04 +02:00
thomwolf
40dbda6871 updating classification example 2019-06-18 16:45:52 +02:00
thomwolf
7388c83b60 update run_classifier for distributed eval 2019-06-18 16:32:49 +02:00
thomwolf
9727723243 fix pickle 2019-06-18 16:02:42 +02:00
thomwolf
9710b68dbc fix pickles 2019-06-18 16:01:15 +02:00
thomwolf
15ebd67d4e cache in run_classifier + various fixes to the examples 2019-06-18 15:58:22 +02:00
thomwolf
e6e5f19257 fix 2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466 distributed training t_total 2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f split squad example in two 2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8 only on main process 2019-06-18 14:03:46 +02:00
thomwolf
326944d627 add tensorboard to run_squad 2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1 set find_unused_parameters=True in DDP 2019-06-18 12:13:14 +02:00
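For context, `find_unused_parameters=True` lets DDP tolerate parameters that take no part in a given forward pass, which would otherwise stall gradient synchronization. A minimal single-process sketch, using a gloo process group purely so the example runs standalone:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process group just to make the sketch runnable; real runs use
# torchrun / multiple ranks.
dist.init_process_group(backend="gloo",
                        init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)

# find_unused_parameters=True tolerates parameters with no gradient in a
# given iteration, at the cost of an extra traversal of the autograd graph.
model = DDP(torch.nn.Linear(4, 2), find_unused_parameters=True)
```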
thomwolf
a59abedfb5 DDP update 2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87 switch to pytorch DistributedDataParallel 2019-06-18 12:03:13 +02:00
thomwolf
9ce37af99b oops 2019-06-18 11:47:54 +02:00
thomwolf
a40955f071 no need to duplicate models anymore 2019-06-18 11:46:14 +02:00
thomwolf
382e2d1e50 spliting config and weight files for bert also 2019-06-18 10:37:16 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273 adds the tokenizer + model config to the output 2019-06-11 13:46:33 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
Apply the Whole Word Masking technique,
referring to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py).
2019-06-10 12:17:23 +09:00
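Whole Word Masking masks all WordPiece pieces of a word together instead of masking sub-tokens independently. A minimal sketch of the idea (function name and tokens are illustrative, not the script's actual code):

```python
import random

def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Mask whole words: a '##' continuation piece is always masked
    together with the other sub-tokens of the word it belongs to."""
    # Group token indices into words: a '##' piece extends the previous word.
    word_spans = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and word_spans:
            word_spans[-1].append(i)
        else:
            word_spans.append([i])

    output = list(tokens)
    for span in word_spans:
        if random.random() < mask_prob:
            for i in span:          # mask every piece of the word, or none
                output[i] = mask_token
    return output

print(whole_word_mask(["the", "phil", "##am", "##mon", "is", "playing"]))
```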
Ahmad Barqawi
c4fe56dcc0 support latest multilingual BERT fine-tuning
fix issue with bert-base-multilingual and add support for uncased multilingual
2019-05-27 11:27:41 +02:00
tguens
9e7bc51b95
Update run_squad.py
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb Make num_train_optimization_steps int 2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f Clean up a little bit 2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
The reason for the issue was that optimization steps were computed from the example count, which differs from the actual length of the dataloader when an example is chunked into multiple instances.

Solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
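Put differently: `len(train_dataloader)` already counts the chunked instances per batch, so deriving the schedule length from it keeps warmup aligned with the real number of optimizer steps. A hedged sketch with illustrative numbers:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(1000, 8))   # 1000 training *instances*
train_dataloader = DataLoader(dataset, batch_size=32)
gradient_accumulation_steps, num_train_epochs = 2, 3

# Derive the schedule length from the dataloader, whose len() reflects any
# example chunking, rather than from len(train_examples).
num_train_optimization_steps = (
    len(train_dataloader) // gradient_accumulation_steps * num_train_epochs
)
print(num_train_optimization_steps)   # 48 optimizer steps
```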
burcturkoglu
00c7fd2b79 Removed the division of global_step by num_train_optimizer in lr_this_step. 2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0 Removed the division of global_step by num_train_optimizer in lr_this_step. 2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3 Fix documentation typo 2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc Prepare optimizer only when args.do_train is True 2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during preprocessing of roc stories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843 Fix tr_loss rescaling factor using global_step 2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390 Update example files so that tr_loss is not affected by args.gradient_accumulation_step 2019-04-24 14:07:00 +02:00
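A toy illustration (made-up numbers) of why the displayed loss must account for gradient accumulation: `tr_loss` accumulates losses that were already divided by the accumulation factor, so dividing by `global_step` recovers the true mean mini-batch loss:

```python
gradient_accumulation_steps = 4
tr_loss, global_step = 0.0, 0

for step, batch_loss in enumerate([0.8, 0.6, 0.9, 0.7]):
    # each mini-batch loss is scaled down so accumulated gradients
    # average over the accumulation window
    tr_loss += batch_loss / gradient_accumulation_steps
    if (step + 1) % gradient_accumulation_steps == 0:
        global_step += 1            # one optimizer step per window

# tr_loss already carries the 1/accumulation factor, so dividing by the
# number of optimizer steps gives the mean mini-batch loss
print(tr_loss / global_step)        # 0.75
```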
thomwolf
d94c6b0144 fix training schedules in examples to match new API 2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584 Made --reduce_memory actually do something in finetune_on_pregenerated 2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4 Fix indentation weirdness in GPT-2 example. 2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int 2019-04-16 15:23:21 -07:00
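The argparse type matters here: with `type=int`, a request like `--temperature 0.7` is rejected outright, leaving an integral temperature of 1 as the only usable value. A small sketch of the corrected declaration:

```python
import argparse

parser = argparse.ArgumentParser()
# temperature divides the logits before sampling, so it must accept
# fractional values such as 0.7; type=int would reject them.
parser.add_argument("--temperature", type=float, default=1.0)

print(parser.parse_args(["--temperature", "0.7"]).temperature)   # 0.7
```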
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation 2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6 fix saving models in distributed setting examples 2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc add ptvsd to run_squad 2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f clean up distributed training logging in run_squad example 2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a clean up logger in examples for distributed case 2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6 update example to work with new serialization semantic 2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master 2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500 Replaced some randints with cleaner randranges, and added a helpful
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
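Two separate points are bundled in this commit: `random.randrange(a, b)` excludes `b`, matching Python's half-open indexing convention, and a single-document corpus cannot supply next-sentence negatives. A hedged sketch (the error message is illustrative):

```python
import random

docs = [["doc", "one"], ["doc", "two"]]
if len(docs) <= 1:
    raise ValueError(
        "The corpus must contain more than one document; next-sentence "
        "negatives are sampled from a *different* document."
    )

# randrange uses a half-open interval, like range() and slicing, so it
# replaces the more error-prone randint(0, len(docs) - 1).
idx = random.randrange(len(docs))
print(docs[idx])
```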
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722 fix tsv read error in Windows 2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9 finetuning any BERT model - fixes #455 2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
Fix for

```
04/09/2019 21:39:38 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0 fix run_gpt2.py 2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282 Merge remote-tracking branch 'upstream/master' 2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602 removing some redundant lines 2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8 Fix #419 2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
If randint returns the value rand_end itself, searchsorted yields a sampled_doc_index equal to current_idx.

example:
cumsum_max = {int64} 30
doc_cumsum = {ndarray} [ 5  7 11 19 30]
doc_lengths = {list} <class 'list'>: [5, 2, 4, 8, 11]
if current_idx  = 1,
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
if randint returns 35, sentence_index becomes 5.
if sentence_index is 5, np.searchsorted returns 1, which equals current_idx.
2019-03-30 14:50:17 +09:00
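A hedged reconstruction of the corrected sampling, using the numbers from the commit message: with the half-open `randrange`, the draw can never hit `rand_end`, so the wrapped index can never land back inside the current document:

```python
import random
import numpy as np

doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = np.cumsum(doc_lengths)     # [ 5  7 11 19 30]
cumsum_max = int(doc_cumsum[-1])        # 30

def sample_other_doc(current_idx):
    """Sample a document index != current_idx, weighted by document length."""
    rand_start = int(doc_cumsum[current_idx])
    rand_end = rand_start + cumsum_max - doc_lengths[current_idx]
    # randrange excludes rand_end, so the modulo can never wrap back onto
    # the current document (the off-by-one of the inclusive randint).
    sentence_index = random.randrange(rand_start, rand_end) % cumsum_max
    return int(np.searchsorted(doc_cumsum, sentence_index, side="right"))

print(sample_other_doc(1))   # any index in {0, 2, 3, 4}, never 1
```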
dhanajitb
f872eb98c2
making unconditional generation work
The unconditional generation works now, but with a fixed seed the sample is the same every time.
n_samples > 1 will still give different samples.
The start token for unconditional generation is '<|endoftext|>'.
2019-03-28 22:46:15 +05:30
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
thomwolf
361aff6de5 typos 2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59 adjusted formating and some wording in the readme 2019-03-27 11:53:44 +01:00
Matthew Carrigan
24e67fbf75 Minor README update 2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2 Corrected the displayed loss when gradient_accumulation_steps > 1 2019-03-25 12:15:19 +00:00
Matthew Carrigan
abb7d1ff6d Added proper context management to ensure cleanup happens in the right
order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3 Added a --reduce_memory option to the training script to keep training
data on disc as a memmap rather than in memory
2019-03-21 17:04:12 +00:00
Matthew Carrigan
7d1ae644ef Added a --reduce_memory option to the training script to keep training
data on disc as a memmap rather than in memory
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory. 2019-03-21 16:50:16 +00:00
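Keeping pregenerated instances in a disk-backed array bounds RAM usage by the pages actually touched, not by the corpus size. A minimal sketch of the memmap idea with numpy (file name, shape, and dtype are illustrative):

```python
import numpy as np

num_instances, seq_len = 100_000, 128

# Disk-backed array: elements are paged in on access instead of holding
# all pregenerated training data in memory at once.
input_ids = np.memmap("train_input_ids.memmap", dtype=np.int32,
                      mode="w+", shape=(num_instances, seq_len))

input_ids[0] = np.zeros(seq_len, dtype=np.int32)   # write one instance
input_ids.flush()                                  # persist to disk
```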
Matthew Carrigan
8733ffcb5e Removing a couple of other old unnecessary comments 2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd Fixed up the notes on a possible future low-memory path 2019-03-21 14:08:39 +00:00
Matthew Carrigan
a8a577ba93 Reduced memory usage for pregenerating the data a lot by writing it
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:05:52 +00:00
Matthew Carrigan
0ae59e662d Reduced memory usage for pregenerating the data a lot by writing it
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:04:17 +00:00
Matthew Carrigan
6a9038ba53 Removed an old irrelevant comment 2019-03-21 13:36:41 +00:00
Yuqiang Xie
77944d1b31
add tqdm to the process of eval
Maybe better.
2019-03-21 20:59:33 +08:00
Matthew Carrigan
29a392fbcf Small README changes 2019-03-20 17:35:17 +00:00
Matthew Carrigan
832b2b0058 Adding README 2019-03-20 17:31:49 +00:00
Matthew Carrigan
934d3f4d2f Syncing up argument names between the scripts 2019-03-20 17:23:23 +00:00
Matthew Carrigan
f19ba35b2b Move old finetuning script into the new folder 2019-03-20 16:47:06 +00:00
Matthew Carrigan
7de5c6aa5e PEP8 and formatting cleanups 2019-03-20 16:44:04 +00:00
Matthew Carrigan
1798e98e5a Added final TODOs 2019-03-20 16:42:37 +00:00
Matthew Carrigan
c64c2fc4c2 Fixed embarrassing indentation problem 2019-03-20 15:42:57 +00:00
Matthew Carrigan
0540d360f2 Fixed logging 2019-03-20 15:36:51 +00:00
Matthew Carrigan
976554a472 First commit of the new LM finetuning 2019-03-20 14:23:51 +00:00
Ananya Harsh Jha
e5b63fb542 Merge branch 'master' of https://github.com/ananyahjha93/pytorch-pretrained-BERT
pull current master to local
2019-03-17 08:30:13 -04:00
Ananya Harsh Jha
8a4e90ff40 corrected folder creation error for MNLI-MM, verified GLUE results 2019-03-17 08:16:50 -04:00
Ananya Harsh Jha
e0bf01d9a9 added hack for mismatched MNLI 2019-03-16 14:10:48 -04:00
Ananya Harsh Jha
4c721c6b6a added eval time metrics for GLUE tasks 2019-03-15 23:21:24 -04:00
tseretelitornike
83857ffeaa
Added missing imports. 2019-03-15 12:45:48 +01:00
Yongbo Wang
d1e4fa98a9
typo in annotation
modify `heruistic` to `heuristic` in line 660, `charcter` to `character` in line 661.
2019-03-14 17:32:15 +08:00
Yongbo Wang
3d6452163d
typo
modify `mull` to `null` in line 474 annotation.
2019-03-14 17:03:38 +08:00
thomwolf
a98dfe4ced fixing #377 (empty nbest_predictions.json) 2019-03-14 09:57:06 +01:00
Ananya Harsh Jha
043c8781ef added code for all glue task processors 2019-03-14 04:24:04 -04:00
Yongbo Wang
22a465a91f
Simplify code, delete redundant line
delete the redundant line `if args.train`, simplify code.
2019-03-13 09:42:06 +08:00
Elon Musk
66d8206809
Update run_gpt2.py 2019-03-08 11:59:08 -05:00
thomwolf
7cc35c3104 fix openai gpt example and updating readme 2019-03-06 11:43:21 +01:00
thomwolf
994d86609b fixing PYTORCH_PRETRAINED_BERT_CACHE use in examples 2019-03-06 10:21:24 +01:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Thomas Wolf
8e36da7acb
Merge pull request #347 from jplehmann/feature/sst2-processor
Processor for SST-2 task
2019-03-06 09:48:27 +01:00
Thomas Wolf
3c01dfb775
Merge pull request #338 from CatalinVoss/patch-3
Fix top k generation for k != 0
2019-03-06 09:47:33 +01:00
John Lehmann
0f96d4b1f7 Run classifier processor for SST-2. 2019-03-05 13:38:28 -06:00
Catalin Voss
4b4b079272
Fix top k generation for k != 0 2019-03-02 21:54:44 -08:00
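A hedged sketch of top-k logit filtering in the spirit of run_gpt2.py's helper (an illustration of the technique, not the exact patch): everything below the k-th largest logit per row is pushed to -inf so it receives zero probability after softmax:

```python
import torch

def top_k_logits(logits, k):
    """Keep only the k highest logits per row; k == 0 disables filtering."""
    if k == 0:
        return logits
    values, _ = torch.topk(logits, k)
    min_values = values[:, -1].unsqueeze(-1)   # k-th largest logit per row
    return torch.where(logits < min_values,
                       torch.full_like(logits, float("-inf")),
                       logits)

logits = torch.tensor([[1.0, 3.0, 2.0, 0.5]])
print(top_k_logits(logits, 2))   # tensor([[-inf, 3., 2., -inf]])
```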
Catalin Voss
c0cf0a04d5
Fix typo 2019-02-27 18:01:06 -08:00
Ben Johnson
8607233679
Update run_openai_gpt.py 2019-02-20 13:58:54 -05:00
thomwolf
0202da0271 remove unnecessary example 2019-02-18 13:51:42 +01:00
thomwolf
690a0dbf36 fix example - masking 2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4 examples testing 2019-02-18 01:28:18 +01:00
thomwolf
b65f07d8c0 adding examples 2019-02-18 00:55:33 +01:00
wlhgtc
8efaf8f176
fix 'best_non_null_entry' is None error 2019-02-15 15:57:25 +08:00
Davide Fiocco
65df0d78ed
--do_lower_case is duplicated in parser args
Deleting one repetition (please review!)
2019-02-13 15:30:05 +01:00
Thomas Wolf
03cdb2a390
Merge pull request #254 from huggingface/python_2
Adding OpenAI GPT and Transformer-XL models, compatibility with Python 2
2019-02-11 14:19:26 +01:00
thomwolf
d38caba169 typo in run_squad 2019-02-11 14:10:27 +01:00
thomwolf
af62cc5f20 fix run_squad example 2019-02-11 14:06:32 +01:00
thomwolf
eebc8abbe2 clarify and unify model saving logic in examples 2019-02-11 14:04:19 +01:00
thomwolf
32fea876bb add distant debugging to run_transfo_xl 2019-02-11 12:53:32 +01:00
thomwolf
b31ba23913 cuda on in the examples by default 2019-02-11 12:15:43 +01:00
thomwolf
6cd769957e update transfo xl example 2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c mc_token_mask => mc_token_ids 2019-02-09 16:58:53 +01:00
thomwolf
f4a07a392c mems not split 2019-02-09 16:14:31 +01:00
thomwolf
43b9af0cac mems initialized to None in run_transfo 2019-02-09 16:12:19 +01:00
thomwolf
b80684b23f fixing run openai gpt example 2019-02-08 22:31:32 +01:00
thomwolf
7b4b0cf966 logging 2019-02-08 11:16:29 +01:00
thomwolf
4bbb9f2d68 log loss - helpers 2019-02-08 11:14:29 +01:00
thomwolf
5d7e845712 fix model on cuda 2019-02-08 11:08:43 +01:00
thomwolf
eccb2f0163 hot fix 2019-02-08 11:05:20 +01:00
thomwolf
5adc20723b add distant debugging 2019-02-08 11:03:59 +01:00
thomwolf
777459b471 run openai example running 2019-02-08 10:33:14 +01:00
thomwolf
6bc082da0a updating examples 2019-02-08 00:02:26 +01:00
thomwolf
e77721e4fe renamed examples 2019-02-07 23:15:15 +01:00
thomwolf
d482e3d79d adding examples for openai and transformer-xl 2019-02-07 17:06:41 +01:00
tholor
9aebc711c9 adjust error message related to args.do_eval 2019-02-07 11:49:38 +01:00
tholor
4a450b25d5 removing unused argument eval_batch_size from LM finetuning #256 2019-02-07 10:06:38 +01:00
Baoyang Song
7ac3311e48
Fix the undefined variable in squad example 2019-02-06 19:36:08 +01:00
thomwolf
ed47cb6cba fixing transfo eval script 2019-02-06 16:22:17 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2 2019-02-06 00:13:20 +01:00
thomwolf
448937c00d python 2 compatibility 2019-02-06 00:07:46 +01:00
thomwolf
d609ba24cb resolving merge conflicts 2019-02-05 16:14:25 +01:00
Thomas Wolf
64ce900974
Merge pull request #248 from JoeDumoulin/squad1.1-fix
fix prediction on run-squad.py example
2019-02-05 16:00:51 +01:00
Thomas Wolf
e9e77cd3c4
Merge pull request #218 from matej-svejda/master
Fix learning rate problems in run_classifier.py
2019-02-05 15:40:44 +01:00
thomwolf
1579c53635 more explicit notation: num_train_step => num_train_optimization_steps 2019-02-05 15:36:33 +01:00
joe dumoulin
aa90e0c36a fix prediction on run-squad.py example 2019-02-01 10:15:44 -08:00
Thomas Wolf
8f8bbd4a4c
Merge pull request #244 from deepset-ai/prettify_lm_masking
Avoid confusion of inplace LM masking
2019-02-01 12:17:50 +01:00
tholor
ce75b169bd avoid confusion of inplace masking of tokens_a / tokens_b 2019-01-31 11:42:06 +01:00
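Masking the token lists in place mutates the caller's `tokens_a`/`tokens_b`, which is easy to misread; masking a copy keeps the original sequences intact. A sketch of the safer pattern (logic is illustrative, not the script's exact code):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Return a masked copy plus LM labels; the caller's list is untouched."""
    output = list(tokens)            # copy instead of mutating in place
    labels = [-1] * len(tokens)      # -1 = position is not a prediction target
    for i in range(len(tokens)):
        if random.random() < mask_prob:
            labels[i] = tokens[i]    # the real script stores the vocab id here
            output[i] = mask_token
    return output, labels

tokens_a = ["the", "cat", "sat"]
masked_a, lm_labels = mask_tokens(tokens_a, mask_prob=0.5)
assert tokens_a == ["the", "cat", "sat"]   # original sequence is unchanged
```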
Surya Kasturi
9bf528877e
Update run_squad.py 2019-01-30 15:09:31 -05:00
Surya Kasturi
af2b78601b
Update run_squad2.py 2019-01-30 15:08:56 -05:00
Matej Svejda
5169069997 make examples consistent, revert error in num_train_steps calculation 2019-01-30 11:47:25 +01:00
Matej Svejda
9c6a48c8c3 fix learning rate/fp16 and warmup problem for all examples 2019-01-27 14:07:24 +01:00
Matej Svejda
01ff4f82ba learning rate problems in run_classifier.py 2019-01-22 23:40:06 +01:00
liangtaiwan
be9fa192f0 don't save if do not train 2019-01-18 00:41:55 +08:00
thomwolf
a28dfc8659 fix eval for wt103 2019-01-16 11:18:19 +01:00
thomwolf
8831c68803 fixing various parts of model conversion, loading and weights sharing 2019-01-16 10:31:16 +01:00
thomwolf
bcd4aa8fe0 update evaluation example 2019-01-15 23:32:34 +01:00
thomwolf
a69ec2c722 improved corpus and tokenization conversion - added evaluation script 2019-01-15 23:17:46 +01:00
Thomas Wolf
4e0cba1053
Merge pull request #191 from nhatchan/20190113_py35_finetune
lm_finetuning compatibility with Python 3.5
2019-01-14 09:40:07 +01:00
nhatchan
6c65cb2492 lm_finetuning compatibility with Python 3.5
dicts are not ordered in Python 3.5 or prior, which is the cause of #175.
This PR replaces the dict with a list to preserve its order.
2019-01-13 21:09:13 +09:00
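For illustration (entry names are made up): on Python 3.5 and earlier, dict iteration order is arbitrary, so order-dependent logic can behave differently between runs; a list of pairs is ordered on every version:

```python
# A list of tuples preserves insertion order on all Python versions,
# unlike a dict on Python <= 3.5.
ordered_entries = [
    ("first_chunk", "epoch_0.json"),
    ("second_chunk", "epoch_1.json"),
]
for name, path in ordered_entries:   # deterministic iteration order
    print(name, "->", path)
```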
Li Dong
a2da2b4109
[bug fix] args.do_lower_case is always True
The "default=True" makes args.do_lower_case always True.

```python
parser.add_argument("--do_lower_case",
                        default=True,
                        action='store_true')
```
2019-01-13 19:51:11 +08:00
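With `action='store_true'`, argparse already defaults the flag to False when it is absent; adding `default=True` overrides that, so the flag can never be switched off. A sketch of the corrected declaration:

```python
import argparse

parser = argparse.ArgumentParser()
# 'store_true' implies default=False; passing --do_lower_case flips it on.
parser.add_argument("--do_lower_case", action="store_true")

print(parser.parse_args([]).do_lower_case)                    # False
print(parser.parse_args(["--do_lower_case"]).do_lower_case)   # True
```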
tholor
506e5bb0c8 add do_lower_case arg and adjust model saving for lm finetuning. 2019-01-11 08:32:46 +01:00
Thomas Wolf
e485829a41
Merge pull request #174 from abeljim/master
Added Squad 2.0
2019-01-10 23:40:45 +01:00
Sang-Kil Park
64326dccfb
Fix it to run properly even without the --do_train param.
Modified similarly to `run_classifier.py`, and fixed to run properly even without the `--do_train` param.
2019-01-10 21:51:39 +09:00
thomwolf
e5c78c6684 update readme and few typos 2019-01-10 01:40:00 +01:00
thomwolf
fa5222c296 update readme 2019-01-10 01:25:28 +01:00
Unknown
b3628f117e Added Squad 2.0 2019-01-08 15:13:13 -08:00
thomwolf
ab90d4cddd adding docs and example for OpenAI GPT 2019-01-09 00:12:43 +01:00
thomwolf
2e4db64cab add do_lower_case tokenizer loading option in run_squad and fine-tuning examples 2019-01-07 13:06:42 +01:00
thomwolf
c9fd350567 remove default when action is store_true in arguments 2019-01-07 13:01:54 +01:00
Thomas Wolf
d3d56f9a0b
Merge pull request #166 from likejazz/patch-1
Fix error when `bert_model` param is path or url.
2019-01-07 12:40:55 +01:00
Thomas Wolf
766c6b2ce3
Merge pull request #159 from jaderabbit/master
Allow do_eval to be used without do_train and to use the pretrained model in the output folder
2019-01-07 12:31:06 +01:00
Thomas Wolf
77966a43a4
Merge pull request #156 from rodgzilla/cl_args_doc
Adding new pretrained model to the help of the `bert_model` argument.
2019-01-07 12:27:16 +01:00
Thomas Wolf
2e8c5c00ec
Merge pull request #141 from SinghJasdeep/patch-1
loading saved model when n_classes != 2
2019-01-07 12:21:13 +01:00
Sang-Kil Park
ca4e7aaa72
Fix error when bert_model param is path or url.
An error occurs when the `bert_model` param is a path or URL. Therefore, if it is a path, use only its last component to prevent the error.
2019-01-05 11:42:54 +09:00
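Assuming the fix follows the usual pattern (the path below is a placeholder), taking only the last path component yields a clean model name for derived filenames:

```python
import os

bert_model = "/home/user/models/bert-base-uncased/"
# When --bert_model is a filesystem path, keep only its last component for
# naming derived files; the full path would nest or break the filenames.
model_name = os.path.basename(bert_model.rstrip("/"))
print(model_name)   # bert-base-uncased
```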
Jade Abbott
193e2df8ba Remove rogue comment 2019-01-03 13:13:06 +02:00
Jade Abbott
c64de50ea4 nb_tr_steps is not initialized 2019-01-03 12:34:57 +02:00
Jade Abbott
b96149a19b Training loss is not initialized if only do_eval is specified 2019-01-03 10:32:10 +02:00
Jade Abbott
be3b9bcf4d Allow one to use the pretrained model in evaluation when do_train is not selected 2019-01-03 09:02:33 +02:00
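The underlying bug in these three commits: the training statistics were only defined inside the `do_train` branch, so an eval-only run crashed when logging them. A hedged sketch of the initialize-up-front pattern (values and logging are illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--do_train", action="store_true")
parser.add_argument("--do_eval", action="store_true")
args = parser.parse_args(["--do_eval"])

# Initialize training stats unconditionally so an eval-only run can still
# log them without raising a NameError.
tr_loss, nb_tr_steps, global_step = 0.0, 0, 0

if args.do_train:
    tr_loss, nb_tr_steps = 1.23, 100    # would be produced by the train loop

if args.do_eval:
    loss = tr_loss / nb_tr_steps if args.do_train else None
    print({"global_step": global_step, "loss": loss})
```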
Grégory Châtel
186f75342e Adding new pretrained model to the help of the bert_model argument. 2019-01-02 14:00:59 +01:00
Jasdeep Singh
99709ee61d
loading saved model when n_classes != 2
Required for: Assertion `t >= 0 && t < n_classes` failed, when your number of classes is not the default 2.
2018-12-20 13:55:47 -08:00
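The classifier head's shape must match before the fine-tuned weights can be loaded, and the label count must match at loss time. A hedged sketch against the era's pytorch-pretrained-BERT API (label count and checkpoint path are placeholders):

```python
import torch
from pytorch_pretrained_bert import BertForSequenceClassification

num_labels = 5   # must match the trained head, not the default of 2

# Rebuild the architecture with the right class count *before* loading the
# fine-tuned weights; otherwise CrossEntropyLoss asserts with
# "t >= 0 && t < n_classes" as soon as a label id >= 2 appears.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=num_labels
)
model.load_state_dict(torch.load("output/pytorch_model.bin"))
```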
tholor
e5fc98c542 add example training data. update to nvidia apex. refactor 'item -> line in doc' mapping. add warning for unknown words. 2018-12-20 18:30:52 +01:00
deepset
a58361f197
Add example for fine tuning BERT language model (#1)
Adds an example of loading a pre-trained BERT model and fine-tuning it as a language model (masked tokens & nextSentence) on your target corpus.
2018-12-18 10:32:25 +01:00
thomwolf
ae88eb88a4 set encoding to 'utf-8' in calls to open 2018-12-14 13:48:58 +01:00
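Passing an explicit encoding makes file reads deterministic across platforms instead of depending on the locale's default codec; a minimal sketch:

```python
with open("sample.txt", "w", encoding="utf-8") as f:
    f.write("héllo wörld\n")

# Explicit utf-8 keeps reads platform-independent (Windows would otherwise
# fall back to a locale codec such as cp1252).
with open("sample.txt", "r", encoding="utf-8") as f:
    print(f.read())
```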
thomwolf
e1eab59aac no fp16 on evaluation 2018-12-13 14:54:02 +01:00
thomwolf
087798b7fa fix reloading model for evaluation in examples 2018-12-13 14:48:12 +01:00
thomwolf
0f544625f4 fix swag example for work with apex 2018-12-13 13:35:59 +01:00
thomwolf
0cf88ff084 make examples work without apex 2018-12-13 13:28:00 +01:00
thomwolf
d3fcec1a3e add saving and loading model in examples 2018-12-13 12:50:44 +01:00
thomwolf
b3caec5a56 adding save checkpoint and loading in examples 2018-12-13 12:48:13 +01:00
Thomas Wolf
91aab2a6d3
Merge pull request #116 from FDecaYed/deyuf/fp16_with_apex
Change to use apex for better fp16 and multi-gpu support
2018-12-13 12:32:37 +01:00
Thomas Wolf
ffe9075f48
Merge pull request #96 from rodgzilla/multiple-choice-code
BertForMultipleChoice and Swag dataset example.
2018-12-13 12:05:11 +01:00
Deyu Fu
c8ea286048 change to apex for better fp16 and multi-gpu support 2018-12-11 17:13:58 -08:00
Thomas Wolf
e622790a93
Merge pull request #91 from rodgzilla/convert-examples-code-improvement
run_classifier.py improvements
2018-12-11 05:12:04 -05:00
Grégory Châtel
df34f22854 Removing the dependency to pandas and using the csv module to load data. 2018-12-10 17:45:23 +01:00
Grégory Châtel
d429c15f25 Removing old code from copy-paste. 2018-12-06 19:19:21 +01:00
Grégory Châtel
63c45056aa Finishing the code for the Swag task. 2018-12-06 18:53:05 +01:00
Grégory Châtel
c45d8ac554 Storing the feature of each choice as a dict for readability. 2018-12-06 16:01:28 +01:00
Grégory Châtel
0812aee2c3 Fixing problems in convert_examples_to_features. 2018-12-06 15:53:07 +01:00
Grégory Châtel
f2b873e995 convert_examples_to_features code and small improvements. 2018-12-06 15:40:47 +01:00
Grégory Châtel
83fdbd6043 Adding read_swag_examples to load the dataset. 2018-12-06 14:02:46 +01:00
Grégory Châtel
7183cded4e SwagExample class. 2018-12-06 13:39:44 +01:00
Grégory Châtel
fa7daa247d Fixing the commentary of the SquadExample class. 2018-12-06 13:14:33 +01:00
Grégory Châtel
a994bf4076 Fixing related to issue #83. 2018-12-05 18:16:30 +01:00
Grégory Châtel
c6d9d5394e Simplifying code for easier understanding. 2018-12-05 17:53:09 +01:00
Grégory Châtel
793262e8ec Removing trailing whitespaces. 2018-12-05 17:52:39 +01:00
Davide Fiocco
e60e8a6068
Correct assignment for logits in classifier example
I tried to address https://github.com/huggingface/pytorch-pretrained-BERT/issues/76
should be correct, but there's likely a more efficient way.
2018-12-02 12:38:26 +01:00
Davide Fiocco
dc13e276ee
Point typo fix 2018-12-01 01:02:16 +01:00
thomwolf
89d47230d7 clean up classification model output 2018-11-30 22:54:53 +01:00
thomwolf
c588453a0f fix run_squad 2018-11-30 14:22:40 +01:00
thomwolf
0541442558 add do_lower_case in examples 2018-11-30 13:47:33 +01:00
Li Li
0aaedcc02f Bug fix in examples; correct t_total for distributed training; run prediction for full dataset 2018-11-27 01:08:37 -08:00
thomwolf
32167cdf4b remove convert_to_unicode and printable_text from examples 2018-11-26 23:33:22 +01:00
thomwolf
05053d163c update cache_dir in readme and examples 2018-11-26 10:45:13 +01:00
thomwolf
6b2136a8a9 fixing weights decay in run_squad example 2018-11-20 10:12:44 +01:00
Thomas Wolf
061eeca84a
Merge pull request #32 from xiaoda99/master
Fix ineffective no_decay bug when using BERTAdam
2018-11-20 10:11:46 +01:00
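The essence of this bug: testing parameter names with exact equality (`n not in no_decay`) never matches full names like `bert.encoder.layer.0.output.dense.bias`, so weight decay was silently applied everywhere. A hedged sketch of the substring-matching fix, using plain Adam to stay self-contained:

```python
import torch

class Head(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dense = torch.nn.Linear(4, 4)
        self.LayerNorm = torch.nn.LayerNorm(4)

model = Head()
no_decay = ["bias", "LayerNorm"]

# Substring match on parameter names, so "dense.bias" and
# "LayerNorm.weight" really end up in the no-decay group.
grouped = [
    {"params": [p for n, p in model.named_parameters()
                if not any(nd in n for nd in no_decay)],
     "weight_decay": 0.01},
    {"params": [p for n, p in model.named_parameters()
                if any(nd in n for nd in no_decay)],
     "weight_decay": 0.0},
]
optimizer = torch.optim.Adam(grouped, lr=3e-5)
```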
thomwolf
2f21497d3e fixing param.grad is None in fp16 examples 2018-11-20 10:01:21 +01:00
xiaoda99
6c4789e4e8
Fix ineffective no_decay bug 2018-11-18 16:16:21 +08:00
thomwolf
27ee0fff3c add no_cuda args in extract_features 2018-11-17 23:04:44 +01:00
thomwolf
aa50fd196f remove unused arguments in example scripts 2018-11-17 23:01:05 +01:00
thomwolf
47a7d4ec14 update examples from master 2018-11-17 12:21:35 +01:00
thomwolf
c8cba67742 clean up readme and examples 2018-11-17 12:19:16 +01:00
thomwolf
757750d6f6 fix tests 2018-11-17 11:58:14 +01:00
thomwolf
4e46affc34 updating examples 2018-11-17 10:30:54 +01:00
thomwolf
cba85a67b9 fix nan in optimizer_on_cpu 2018-11-15 21:47:41 +01:00
thomwolf
1de35b624b preparing for first release 2018-11-15 20:56:10 +01:00