Commit Graph

2613 Commits

Author SHA1 Message Date
IWillPull
a3085020ed Added repetition penalty to PPLM example (#2436)
* Added repetition penalty

* Default PPLM repetition_penalty to neutral

* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)

* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
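For context, a minimal sketch of how a repetition penalty of this kind is typically applied to next-token logits (the function and variable names are illustrative, not the PPLM example's actual code; a penalty of 1.0 is neutral, matching the default above):

    import torch

    def apply_repetition_penalty(logits, generated_token_ids, repetition_penalty=1.0):
        # Discourage already-generated tokens: positive logits are divided by
        # the penalty, negative logits are multiplied by it.
        for token_idx in set(generated_token_ids):
            if logits[token_idx] > 0:
                logits[token_idx] = logits[token_idx] / repetition_penalty
            else:
                logits[token_idx] = logits[token_idx] * repetition_penalty
        return logits

    logits = torch.randn(50257)  # e.g. GPT-2 vocabulary size
    logits = apply_repetition_penalty(logits, [464, 318, 464], repetition_penalty=1.2)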
VictorSanh
e83d9f1c1d cleaning - change ' to " (black requirements) 2020-01-10 19:34:25 -05:00
VictorSanh
ebba9e929d minor spring cleaning - missing configs + processing 2020-01-10 19:14:58 -05:00
Victor SANH
331065e62d missing import 2020-01-10 11:42:53 +01:00
Victor SANH
414e9e7122 indents test 2020-01-10 11:42:53 +01:00
Victor SANH
3cdb38a7c0 indents 2020-01-10 11:42:53 +01:00
Victor SANH
ebd45980a0 Align with run_squad + fix some errors 2020-01-10 11:42:53 +01:00
Victor SANH
45634f87f8 fix Sampler in distributed training - evaluation 2020-01-10 11:42:53 +01:00
Victor SANH
af1ee9e648 Move torch.nn.utils.clip_grad_norm_ 2020-01-10 11:42:53 +01:00
Lysandre
164c794eb3 New SQuAD API for distillation script 2020-01-10 11:42:53 +01:00
Lysandre
16ce15ed4b DistilBERT token type ids removed from inputs in run_squad 2020-01-08 13:18:30 +01:00
Lysandre Debut
f24232cd1b Fix error with global step in run_squad.py 2020-01-08 11:39:00 +01:00
Oren Amsalem
43114b89ba spelling correction (#2434) 2020-01-07 17:25:25 +01:00
Lysandre Debut
27c1b656cc Fix error with global step in run_lm_finetuning.py 2020-01-07 16:16:12 +01:00
Simone Primarosa
176d3b3079 Add support for Albert and XLMRoberta for the Glue example (#2403)
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris
81d6841b4b GPU text generation: moved the encoded_prompt to the correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
karajan1001
f01b3e6680 fix #2399 an ImportError in official example (#2400)
* fix #2399 an ImportError in official example

* style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond
629b22adcf [run_lm_finetuning] mask_tokens: document types 2020-01-01 12:55:10 -05:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin
a8d34e534e Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.

If a user copy-pastes a command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
81422c4e6d Remove unused variables in examples. 2019-12-23 22:29:02 +01:00
Aymeric Augustin
c3783399db Remove redundant requirements with transformers. 2019-12-23 19:17:27 +01:00
Aymeric Augustin
9fc8dcb2a0 Standardize import.
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
1c62e87b34 Use built-in open().
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
Aymeric Augustin
d6eaf4e6d2 Update comments mentioning Python 2. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
75a23d24af Remove import fallbacks. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
798b3b3899 Remove sys.version_info[0] == 2 or 3. 2019-12-22 18:38:42 +01:00
Aymeric Augustin
6b2200fc88 Remove u-prefixes. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
c824d15aa1 Remove __future__ imports. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
7e98e211f0 Remove unittest.main() in test modules.
This construct isn't used anymore.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00
Aymeric Augustin
c11b3e2926 Sort imports for optional third-party libraries.
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin
939148b050 Fix F401 flake8 warning (x28).
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
783a616999 Fix F401 flake8 warning (x88 / 116).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
80327a13ea Fix F401 flake8 warning (x152 / 268).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin
fa2ccbc081 Fix E266 flake8 warning (x90). 2019-12-22 10:59:08 +01:00
Aymeric Augustin
2ab78325f0 Fix F821 flake8 warning (x47).
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin
631be27078 Fix E722 flake8 warnings (x26). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
b0f7db73cd Fix E741 flake8 warning (x14). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
fd2f17a7a1 Fix E714 flake8 warning (x8). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
5eab3cf6bc Fix W605 flake8 warning (x5). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
7dce8dc7ac Fix E731 flake8 warning (x3). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
357db7098c Fix E712 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
f9c5317db2 Fix E265 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin
28e608a2c2 Remove trailing whitespace from all Python files.
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin
158e82e061 Sort imports with isort.
This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin
fa84ae26d6 Reformat source code with black.
This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There's a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
344126fe58 move example to mm-imdb folder 2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
[BREAKING CHANGE] Setting all ignored indices to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf
1ab25c49d3 Merge branch 'master' into pr/2115 2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0 fix merge 2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming 2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming 2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf
cfa0380515 Merge branch 'master' into generation_sampler 2019-12-21 14:12:52 +01:00
thomwolf
300ec3003c fixing run_generation example - using torch.no_grad 2019-12-21 14:02:19 +01:00
thomwolf
1c37746892 fixing run_generation 2019-12-21 13:52:49 +01:00
thomwolf
8a2be93b4e fix merge 2019-12-21 13:31:28 +01:00
Thomas Wolf
562f864038
Merge branch 'master' into fix-xlnet-squad2.0 2019-12-21 12:48:10 +01:00
Thomas Wolf
59941c5d1f
Merge pull request #2189 from stefan-it/xlmr
Add support for XLM-RoBERTa
2019-12-20 13:26:38 +01:00
Julien Chaumond
a5a06a851e [doc] Param name consistency 2019-12-19 16:24:20 -05:00
Aidan Kierans
1718fb9e74 Minor/basic text fixes (#2229)
* Small clarification

Matches line 431 to line 435 for additional clarity and consistency.

* Fixed minor typo

The letter "s" was previously omitted from the word "docstrings".
2019-12-19 16:23:18 -05:00
Francesco
62c1fc3c1e Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script 2019-12-19 09:50:56 -05:00
Ejar
284572efc0 Fixed typo in the link
Updated documentation due to typo
2019-12-19 09:36:43 -05:00
Stefan Schweter
a26ce4dee1 examples: add XLM-RoBERTa to glue script 2019-12-19 02:23:01 +01:00
thomwolf
3d2096f516 further cleanup 2019-12-18 11:50:54 +01:00
thomwolf
83bc5235cf Merge branch 'master' into pr/2189 2019-12-17 11:47:32 +01:00
Thomas Wolf
f061606277
Merge pull request #2164 from huggingface/cleanup-configs
[SMALL BREAKING CHANGE] Cleaning up configuration classes - Adding Model Cards
2019-12-17 09:10:16 +01:00
Lysandre
18a879f475 fix #2180 2019-12-16 16:44:29 -05:00
Lysandre
d803409215 Fix run squad evaluate during training 2019-12-16 16:31:38 -05:00
Stefan Schweter
71b4750517 examples: add support for XLM-RoBERTa to run_ner script 2019-12-16 16:37:27 +01:00
thomwolf
dc667ce1a7 double check cc @LysandreJik 2019-12-14 09:56:27 +01:00
thomwolf
7140363e09 update bertabs 2019-12-14 09:44:53 +01:00
Thomas Wolf
a52d56c8d9
Merge branch 'master' into cleanup-configs 2019-12-14 09:43:07 +01:00
erenup
c7780700f5 Merge branch 'refs/heads/squad_roberta'
# Conflicts:
#	transformers/data/processors/squad.py
2019-12-14 08:53:59 +08:00
erenup
8e9526b4b5 add multiprocessing 2019-12-14 08:43:58 +08:00
Lysandre
c8ed1c82c8 [SQUAD] Load checkpoint when evaluating without training 2019-12-13 12:13:48 -05:00
Pierric Cistac
5a5c4349e8
Fix summarization to_cpu doc 2019-12-13 10:02:33 -05:00
thomwolf
47f0e3cfb7 cleaning up configuration classes 2019-12-13 14:33:24 +01:00
erenup
9b312f9d41 initial version for roberta squad 2019-12-13 14:51:40 +08:00
LysandreJik
7296f1010b Cleanup squad and allow train_file and predict_file usage 2019-12-12 13:01:04 -05:00
LysandreJik
3fd71c4431 Update example scripts 2019-12-12 12:08:54 -05:00
Alan deLevie
fbf5455a86 Fix typo in examples/run_glue.py args declaration.
deay -> decay
2019-12-12 11:16:19 -05:00
Bilal Khan
6aa919469d Update run_xnli to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
89896fe04f Update run_ner to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
fdc05cd68f Update run_squad to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:31:22 -06:00
Bilal Khan
854ec5784e Update run_glue to save optimizer and scheduler states, then resume training from a checkpoint 2019-12-10 19:30:36 -06:00
LysandreJik
b72f9d340e Correct index in script 2019-12-10 18:33:17 -05:00
LysandreJik
6a73382706 Complete warning + cleanup 2019-12-10 14:33:24 -05:00
Lysandre
dc4e9e5cb3 DataParallel for SQuAD + fix XLM 2019-12-10 19:21:20 +00:00
Rémi Louf
07bc8efbc3 add greedy decoding and sampling 2019-12-10 17:27:50 +01:00
Rémi Louf
4b82c485de remove misplaced summarization documentation 2019-12-10 09:13:33 -05:00
Thomas Wolf
e57d00ee10
Merge pull request #1984 from huggingface/squad-refactor
[WIP] Squad refactor
2019-12-10 11:07:26 +01:00
Suvrat Bhooshan
df3961121f Add MMBT Model to Transformers Repo 2019-12-09 18:36:48 -08:00
Julien Chaumond
1d18930462 Harmonize no_cuda flag with other scripts 2019-12-09 20:37:55 -05:00
Rémi Louf
f7eba09007 clean for release 2019-12-09 20:37:55 -05:00
Rémi Louf
2a64107e44 improve device usage 2019-12-09 20:37:55 -05:00
Rémi Louf
c0707a85d2 add README 2019-12-09 20:37:55 -05:00
Rémi Louf
ade3cdf5ad integrate ROUGE 2019-12-09 20:37:55 -05:00
Rémi Louf
076602bdc4 prevent BERT weights from being downloaded twice 2019-12-09 20:37:55 -05:00
Rémi Louf
a1994a71ee simplified model and configuration 2019-12-09 20:37:55 -05:00
Rémi Louf
3a9a9f7861 default output dir to documents dir 2019-12-09 20:37:55 -05:00
Rémi Louf
693606a75c update the docs 2019-12-09 20:37:55 -05:00
Rémi Louf
2403a66598 give transformers API to BertAbs 2019-12-09 20:37:55 -05:00
Rémi Louf
ba089c780b share pretrained embeddings 2019-12-09 20:37:55 -05:00
Rémi Louf
9660ba1cbd Add beam search 2019-12-09 20:37:55 -05:00
Rémi Louf
1c71ecc880 load the pretrained weights for encoder-decoder
We currently save the pretrained_weights of the encoder and decoder in
two separate directories `encoder` and `decoder`. However, for the
`from_pretrained` function to operate with automodels we need to
specify the type of model in the path to the weights.

The path to the encoder/decoder weights is handled by the
`PreTrainedEncoderDecoder` class in the `save_pretrained` function. Since
there is no easy way to infer the type of model that was initialized for
the encoder and decoder, we add a parameter `model_type` to the function.
This is not an ideal solution as it is error prone, and the model type
should be carried by the Model classes somehow.

This is a temporary fix that should be changed before merging.
2019-12-09 20:37:55 -05:00
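A hypothetical sketch of the save layout this commit describes: encoder and decoder weights go to separate subdirectories, with the model type embedded in the path so automodels can later infer the architecture (the directory names and helper below are assumptions for illustration, not the actual `PreTrainedEncoderDecoder` code):

    import os

    def save_encoder_decoder(encoder, decoder, save_directory, model_type):
        # e.g. model_type="bert" yields save_directory/bert_encoder and
        # save_directory/bert_decoder, so from_pretrained can detect the type.
        encoder.save_pretrained(os.path.join(save_directory, model_type + "_encoder"))
        decoder.save_pretrained(os.path.join(save_directory, model_type + "_decoder"))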
Rémi Louf
07f4cd73f6 update function to add special tokens
Since I started my PR the `add_special_token_single_sequence` function
has been deprecated in favor of another; I replaced it with the new one.
2019-12-09 20:37:55 -05:00
Bilal Khan
79526f82f5 Remove unnecessary epoch variable 2019-12-09 16:24:35 -05:00
Bilal Khan
9626e0458c Add functionality to continue training from last saved global_step 2019-12-09 16:24:35 -05:00
Bilal Khan
2d73591a18 Stop saving current epoch 2019-12-09 16:24:35 -05:00
Bilal Khan
0eb973b0d9 Use saved optimizer and scheduler states if available 2019-12-09 16:24:35 -05:00
Bilal Khan
a03fcf570d Save tokenizer after each epoch to be able to resume training from a checkpoint 2019-12-09 16:24:35 -05:00
Bilal Khan
f71b1bb05a Save optimizer state, scheduler state and current epoch 2019-12-09 16:24:35 -05:00
LysandreJik
2a4ef098d6 Add ALBERT and XLM to SQuAD script 2019-12-09 10:46:47 -05:00
Lysandre Debut
00c4e39581
Merge branch 'master' into squad-refactor 2019-12-09 10:41:15 -05:00
Thomas Wolf
5482822a2b
Merge pull request #2046 from jplu/tf2-ner-example
Add NER TF2 example.
2019-12-06 12:12:22 +01:00
LysandreJik
e9217da5ff Cleanup
Improve global visibility of the run_squad script, remove unused files, and apply fixes related to XLNet.
2019-12-05 16:01:51 -05:00
LysandreJik
9ecd83dace Patch evaluation for impossible values + cleanup 2019-12-05 14:44:57 -05:00
VictorSanh
35ff345fc9 update requirements 2019-12-05 12:07:04 -05:00
VictorSanh
552c44a9b1 release distilm-bert 2019-12-05 10:14:58 -05:00
Rosanne Liu
ee53de7aac Pr for pplm (#2060)
* license

* changes

* ok

* Update paper link and commands to run

* pointer to uber repo
2019-12-05 09:20:07 -05:00
Julien Plu
9200a759d7 Add a few tests on the TF optimization file, with some info in the documentation. Complete the README. 2019-12-05 12:56:43 +01:00
thomwolf
75a97af6bc fix #1450 - add doc 2019-12-05 11:26:55 +01:00
LysandreJik
f7e4a7cdfa Cleanup 2019-12-04 16:24:15 -05:00
LysandreJik
cca75e7884 Kill the demon spawn 2019-12-04 15:42:29 -05:00
LysandreJik
9ddc3f1a12 Naming update + XLNet/XLM evaluation 2019-12-04 10:37:00 -05:00
thomwolf
5bfcd0485e fix #1991 2019-12-04 14:53:11 +01:00
Julien Plu
ecb923da9c Create a NER example similar to the PyTorch one. It takes the same options, and can be run the same way. 2019-12-04 09:43:15 +01:00
LysandreJik
de276de1c1 Working evaluation 2019-12-03 17:15:51 -05:00
Julien Chaumond
7edb51f3a5 [pplm] split classif head into its own file 2019-12-03 22:07:25 +00:00
VictorSanh
48cbf267c9 Use full dataset for eval (SequentialSampler in Distributed setting) 2019-12-03 11:01:37 -05:00
Julien Chaumond
f434bfc623 [pplm] Update S3 links
Co-Authored-By: Piero Molino <w4nderlust@gmail.com>
2019-12-03 10:53:02 -05:00
Ethan Perez
96e83506d1 Always use SequentialSampler during evaluation
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
2019-12-03 10:15:39 -05:00
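A short sketch of the sampler choice argued for above, so the full evaluation set is covered regardless of the distributed setup (the toy dataset is illustrative):

    import torch
    from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

    eval_dataset = TensorDataset(torch.randn(8, 4))
    # SequentialSampler, not DistributedSampler: evaluation sees every example once.
    eval_sampler = SequentialSampler(eval_dataset)
    eval_dataloader = DataLoader(eval_dataset, sampler=eval_sampler, batch_size=2)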
Julien Chaumond
3b48806f75 [pplm] README: add setup + tweaks 2019-12-03 10:14:02 -05:00
Julien Chaumond
0cb2c90890 readme
Co-Authored-By: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Julien Chaumond
1efb2ae7fc [pplm] move scripts under examples/pplm/ 2019-12-03 10:14:02 -05:00
Piero Molino
a59fdd1627 generate_text_pplm now works with batch_size > 1 2019-12-03 10:14:02 -05:00
w4nderlust
893d0d64fe Changed order of some parameters to be more consistent. Identical results. 2019-12-03 10:14:02 -05:00
w4nderlust
f42816e7fc Added additional check for url and path in discriminator model params 2019-12-03 10:14:02 -05:00
w4nderlust
f10b925015 Improvements: model_path renamed pretrained_model, tokenizer loaded from pretrained_model, pretrained_model set to discriminator's when discrim is specified, sample = False by default but cli parameter introduced. To obtain identical samples call the cli with --sample 2019-12-03 10:14:02 -05:00
w4nderlust
75904dae66 Removed global variable device 2019-12-03 10:14:02 -05:00
piero
7fd54b55a3 Added support for generic discriminators 2019-12-03 10:14:02 -05:00
piero
b0eaff36e6 Added a +1 to epoch when saving weights 2019-12-03 10:14:02 -05:00
piero
611961ade7 Added tqdm to preprocessing 2019-12-03 10:14:02 -05:00
piero
afc7dcd94d Now run_pplm works on cpu. Identical output as before (when using gpu). 2019-12-03 10:14:02 -05:00
piero
61399e5afe Cleaned perturb_past. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ffc2935405 Fix for making unconditioned generation work. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
9f693a0c48 Cleaned generate_text_pplm. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
61a12f790d Renamed SmallConst to SMALL_CONST and introduced BIG_CONST. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
ef47b2c03a Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ea12db3f5 Removed commented code. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
08c6e456a3 Cleaned full_text_generation. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
6c9c131780 More cleanup for run_model. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
7ffe47c888 Improved device specification 2019-12-03 10:14:02 -05:00
piero
4f2164e40e First cleanup step, changing function names and passing parameters all the way through without using args. Identical output as before. 2019-12-03 10:14:02 -05:00
piero
821de121e8 Minor changes 2019-12-03 10:14:02 -05:00
w4nderlust
7469d03b1c Fixed minor bug when running training on cuda 2019-12-03 10:14:02 -05:00
piero
0b51fba20b Added script for training a discriminator for pplm to use 2019-12-03 10:14:02 -05:00
Piero Molino
34a83faabe Let's make PPLM great again 2019-12-03 10:14:02 -05:00
Julien Chaumond
d5faa74cd6 tokenizer white space: revert to previous behavior 2019-12-03 10:14:02 -05:00
Julien Chaumond
0b77d66a6d rm extraneous import 2019-12-03 10:14:02 -05:00
Rosanne Liu
83b1e6ac9e fix the loss backward issue
(cherry picked from commit 566468cc984c6ec7e10dfc62b5b4191781a99cd2)
2019-12-03 10:14:02 -05:00
Julien Chaumond
572c24cfa2 PPLM (squashed)
Co-authored-by: piero <piero@uber.com>
Co-authored-by: Rosanne Liu <mimosavvy@gmail.com>
2019-12-03 10:14:02 -05:00
Thomas Wolf
f19a78a634
Merge pull request #1903 from valohai/master
Valohai integration
2019-12-03 16:13:01 +01:00
maxvidal
b0ee7c7df3 Added Camembert to available models 2019-11-29 14:17:02 -05:00
Juha Kiili
41aa0e8003 Refactor logs and fix loss bug 2019-11-29 15:33:25 +02:00
Lysandre
bd41e8292a Cleanup & Evaluation now works 2019-11-28 16:03:56 -05:00
Stefan Schweter
8c276b9c92
Merge branch 'master' into distilbert-german 2019-11-27 18:11:49 +01:00
VictorSanh
d5478b939d add distilbert + update run_xnli wrt run_glue 2019-11-27 11:07:22 -05:00
VictorSanh
73fe2e7385 remove fstrings 2019-11-27 11:07:22 -05:00
VictorSanh
3e7656f7ac update readme 2019-11-27 11:07:22 -05:00
VictorSanh
abd397e954 uniformize w/ the cache_dir update 2019-11-27 11:07:22 -05:00
VictorSanh
d5910b312f move xnli processor (and utils) to transformers/data/processors 2019-11-27 11:07:22 -05:00
VictorSanh
289cf4d2b7 change default for XNLI: dev --> test 2019-11-27 11:07:22 -05:00
VictorSanh
84a0b522cf mbert reproducibility results 2019-11-27 11:07:22 -05:00
VictorSanh
c4336ecbbd xnli - output_mode consistency 2019-11-27 11:07:22 -05:00
VictorSanh
d52e98ff9a add xnli examples/README.md 2019-11-27 11:07:22 -05:00
VictorSanh
71f71ddb3e run_xnli + utils_xnli 2019-11-27 11:07:22 -05:00
Julien Chaumond
b5d884d25c Uniformize #1952 2019-11-27 11:05:55 -05:00
Lysandre
4374eaea78 ALBERT for SQuAD 2019-11-26 13:08:12 -05:00
Lysandre
c110c41fdb Run GLUE and remove LAMB 2019-11-26 13:08:12 -05:00
manansanghi
5d3b8daad2 Minor bug fixes on run_ner.py 2019-11-25 16:48:03 -05:00
İbrahim Ethem Demirci
aa92a184d2 resize model when special tokenizer present 2019-11-25 15:06:32 -05:00
Lysandre
7485caefb0 fix #1894 2019-11-25 09:33:39 -05:00
Julien Chaumond
176cd1ce1b [doc] homogenize instructions slightly 2019-11-23 11:18:54 -05:00
Lysandre
c3ba645237 Works for XLNet 2019-11-22 16:27:37 -05:00
Lysandre
72e506b22e wip 2019-11-22 16:26:00 -05:00
Rémi Louf
26db31e0c0 update the documentation 2019-11-21 14:41:19 -05:00
Juha Kiili
2cf3447e0a Glue: log in Valohai-compatible JSON format too 2019-11-21 12:35:25 +02:00
Thomas Wolf
0cdfcca24b
Merge pull request #1860 from stefan-it/camembert-for-token-classification
[WIP] Add support for CamembertForTokenClassification
2019-11-21 10:56:07 +01:00
Jin Young Sohn
e70cdf083d Cleanup TPU bits from run_glue.py
TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.
2019-11-20 17:54:34 -05:00
Lysandre
454455c695 fix #1879 2019-11-20 09:42:48 -05:00
Stefan Schweter
e7cf2ccd15 distillation: add German distilbert model 2019-11-19 19:55:19 +01:00
Kazutoshi Shinoda
f3386d9383 typo "deay" -> "decay" 2019-11-18 11:50:06 -05:00
Stefan Schweter
56c84863a1 camembert: add support for CamemBERT in run_ner example 2019-11-18 17:06:57 +01:00
Julien Chaumond
26858f27cb [camembert] Upload to s3 + rename script 2019-11-16 00:11:07 -05:00
Louis MARTIN
3e20c2e871 Update demo_camembert.py with new classes 2019-11-16 00:11:07 -05:00
Louis MARTIN
f12e4d8da7 Move demo_camembert.py to examples/contrib 2019-11-16 00:11:07 -05:00
Louis MARTIN
6e72fd094c Add demo_camembert.py 2019-11-16 00:11:07 -05:00
Xu Hongshen
ca99a2d500 Update example readme 2019-11-15 14:55:26 +08:00
Xu Hongshen
7da3ef24cd add is_impossible tensor to model inputs during fine-tuning xlnet on squad2.0 2019-11-15 14:18:53 +08:00
Thomas Wolf
74ce8de7d8
Merge pull request #1792 from stefan-it/distilbert-for-token-classification
DistilBERT for token classification
2019-11-14 22:47:53 +01:00
Thomas Wolf
05db5bc1af
added small comparison between BERT, RoBERTa and DistilBERT 2019-11-14 22:40:22 +01:00
Thomas Wolf
9629e2c676
Merge pull request #1804 from ronakice/master
fix multi-gpu eval in torch examples
2019-11-14 22:24:05 +01:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
Rémi Louf
2276bf69b7 update the examples, docs and template 2019-11-14 20:38:02 +01:00
Lysandre
d7929899da Specify checkpoint in saved file for run_lm_finetuning.py 2019-11-14 10:49:00 -05:00
ronakice
2e31176557 fix multi-gpu eval 2019-11-12 05:55:11 -05:00
Stefan Schweter
2b07b9e5ee examples: add DistilBert support for NER fine-tuning 2019-11-11 16:19:34 +01:00
Adrian Bauer
7a9aae1044 Fix run_bertology.py
Make imports and args.overwrite_cache match run_glue.py
2019-11-08 16:28:40 -05:00
Julien Chaumond
f88c104d8f [run_tf_glue] Add comment for context 2019-11-05 19:56:43 -05:00
Julien Chaumond
30968d70af misc doc 2019-11-05 19:06:12 -05:00
Thomas Wolf
e99071f105
Merge pull request #1734 from orena1/patch-1
add progress bar to convert_examples_to_features
2019-11-05 11:34:20 +01:00
Thomas Wolf
ba973342e3
Merge pull request #1553 from WilliamTambellini/timeSquadInference
Add speed log to examples/run_squad.py
2019-11-05 11:13:12 +01:00
Thomas Wolf
237fad339c
Merge pull request #1709 from oneraghavan/master
Fixing mode in evaluate during training
2019-11-05 10:55:33 +01:00
Oren Amsalem
d7906165a3
add progress bar for convert_examples_to_features
It takes a considerable amount of time (~10 min) to parse the examples into features, so it is good to have a progress bar to track this
2019-11-05 10:34:27 +02:00
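An illustrative sketch of the change described above: wrapping the conversion loop in tqdm so the long preprocessing step shows a progress bar (the conversion body here is a placeholder):

    from tqdm import tqdm

    def convert_examples_to_features(examples):
        features = []
        for example in tqdm(examples, desc="convert examples to features"):
            features.append(example.lower())  # placeholder for the real conversion
        return features

    convert_examples_to_features(["An example sentence."] * 1000)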
thomwolf
89d6272898 Fix #1623 2019-11-04 16:21:12 +01:00
Thomas Wolf
9a3b173cd3
Merge branch 'master' into master 2019-11-04 11:41:26 +01:00
thomwolf
ad90868627 Update example readme 2019-11-04 11:27:22 +01:00
Raghavan
e5b1048bae
Fixing mode in evaluate during training 2019-11-03 16:14:46 +05:30
Lysandre
1a2b40cb53 run_tf_glue MRPC evaluation only for MRPC 2019-10-31 18:00:51 -04:00
Timothy Liu
be36cf92fb Added mixed precision support to benchmarks.py 2019-10-31 17:24:37 -04:00
Julien Chaumond
f96ce1c241 [run_generation] Fix generation with batch_size>1 2019-10-31 18:27:11 +00:00
Julien Chaumond
3c1b6f594e
Merge branch 'master' into fix_top_k_top_p_filtering 2019-10-31 13:53:51 -04:00
Victor SANH
fa735208c9
update readme - fix example command distil* 2019-10-30 14:27:28 -04:00
Thomas Wolf
c7058d8224
Merge pull request #1608 from focox/master
Error raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multi-gpu
2019-10-30 17:14:07 +01:00
Thomas Wolf
04c69db399
Merge pull request #1628 from huggingface/tfglue
run_tf_glue works with all tasks
2019-10-30 17:04:03 +01:00
Thomas Wolf
3df4367244
Merge pull request #1601 from huggingface/clean-roberta
Clean roberta model & all tokenizers now add special tokens by default (breaking change)
2019-10-30 17:00:40 +01:00
Thomas Wolf
36174696cc
Merge branch 'master' into clean-roberta 2019-10-30 16:51:06 +01:00
Thomas Wolf
228cdd6a6e
Merge branch 'master' into conditional-generation 2019-10-30 16:40:35 +01:00
Rémi Louf
070507df1f format utils for summarization 2019-10-30 11:24:12 +01:00
Rémi Louf
da10de8466 fix bug with padding mask + add corresponding test 2019-10-30 11:19:58 +01:00
Rémi Louf
3b0d2fa30e rename seq2seq to encoder_decoder 2019-10-30 10:54:46 +01:00
Rémi Louf
9c1bdb5b61 revert renaming of lm_labels to ltr_lm_labels 2019-10-30 10:43:13 +01:00
Rémi Louf
098a89f312 update docstrings; rename lm_labels to more explicit ltr_lm_labels 2019-10-29 20:08:03 +01:00
Rémi Louf
dfce409691 resolve PR comments 2019-10-29 17:10:20 +01:00
altsoph
079bfb32fb Evaluation fixed. 2019-10-28 10:18:58 -04:00
altsoph
438f2730a0 Evaluation code fixed. 2019-10-28 10:18:58 -04:00
Rémi Louf
4c3ac4a7d8 here's one big commit 2019-10-28 10:49:50 +01:00
Rémi Louf
932543f77e fix test of truncation function 2019-10-28 10:49:49 +01:00
Rémi Louf
a67413ccc8 extend works in-place 2019-10-28 10:49:49 +01:00
Rémi Louf
b915ba9dfe pad sequence with 0, mask with -1 2019-10-28 10:49:49 +01:00
Lysandre
bab6ad01aa run_tf_glue works with all tasks 2019-10-24 21:41:45 +00:00
Matt Maybeno
ae1d03fc51 Add roberta to doc 2019-10-24 14:32:48 -04:00
Matt Maybeno
4e5f88b74f Add Roberta to run_ner.py 2019-10-24 14:32:48 -04:00
VictorSanh
5b6cafb11b [release] fix table weirdness 2019-10-23 10:35:16 -04:00
VictorSanh
8ad5c591cd [RELEASE] DistilRoBERTa 2019-10-23 10:29:47 -04:00
focox@qq.com
bd847ce7d7 fixed the bug raised by "tmp_eval_loss += tmp_eval_loss.item()" when using multiple GPUs in parallel. 2019-10-23 20:27:13 +08:00
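A minimal sketch of the failure mode behind this fix: under torch.nn.DataParallel the per-batch loss comes back as a vector with one entry per GPU, so it must be reduced to a scalar before calling .item() (the values below are illustrative):

    import torch

    tmp_eval_loss = torch.tensor([0.41, 0.57])  # one loss per GPU under DataParallel
    eval_loss = 0.0
    if tmp_eval_loss.dim() > 0:
        tmp_eval_loss = tmp_eval_loss.mean()  # reduce to a scalar first
    eval_loss += tmp_eval_loss.item()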
Julien Chaumond
ef1b8b2ae5 [CTRL] warn if generation prompt does not start with a control code
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
7d709e55ed Remove 2019-10-22 14:12:33 -04:00
Lysandre
1cfd974868 Option to benchmark only one of the two libraries 2019-10-22 13:32:23 -04:00
Pasquale Minervini
abd7110e21 gradient norm clipping should be done right before calling the optimiser - fixing run_glue and run_ner as well 2019-10-21 19:56:52 +01:00
Pasquale Minervini
3775550c4b gradient norm clipping should be done right before calling the optimiser 2019-10-20 22:33:56 +01:00
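A minimal sketch (assuming a generic PyTorch training loop, not the repo's actual scripts) of the ordering these two commits enforce: gradients are clipped after backward() and immediately before optimizer.step():

    import torch

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # right before the step
    optimizer.step()
    optimizer.zero_grad()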
LysandreJik
7dd29ed2f1 Benchmarks example script 2019-10-18 10:53:04 -04:00
William Tambellini
0919389d9a Add speed log to examples/run_squad.py
Add a speed estimate log (time per example)
for evaluation to examples/run_squad.py
2019-10-17 14:41:04 -07:00
leo-du
ecd15667f3 fix repetition penalty 2019-10-17 14:47:14 -04:00
thomwolf
8cd56e3036 fix data processing in script 2019-10-17 16:33:26 +02:00
Rémi Louf
578d23e061 add training pipeline (formatting temporary) 2019-10-17 14:02:27 +02:00
Rémi Louf
47a06d88a0 use two different tokenizers for story and summary 2019-10-17 13:04:26 +02:00
Rémi Louf
bfb9b540d4 add Model2Model to __init__ 2019-10-17 12:59:51 +02:00
Rémi Louf
c1bc709c35 correct the truncation and padding of dataset 2019-10-17 10:41:53 +02:00
Rémi Louf
e4e0ee14bd add separator between data import and train 2019-10-16 20:05:32 +02:00
Rémi Louf
0d81fc853e specify in readme that both datasets are required 2019-10-15 15:26:33 +02:00
Rémi Louf
1aec940587 test the full story processing 2019-10-15 15:18:07 +02:00
Rémi Louf
22e1af6859 truncation function is fully tested 2019-10-15 14:43:50 +02:00
Rémi Louf
260ac7d9a8 wip commit, switching computers 2019-10-15 12:24:35 +02:00
thomwolf
be916cb3fb Merge branch 'master' of https://github.com/huggingface/transformers 2019-10-15 10:37:13 +02:00
thomwolf
5875aaf762 install tensorboard 2019-10-15 10:36:46 +02:00
Thomas Wolf
40f14ff545
Merge pull request #1513 from slayton58/amp_fp16_einsum
Force einsum to run in fp16
2019-10-15 10:25:00 +02:00
Thomas Wolf
d147671c6c
Merge pull request #1508 from tlkh/master
Added performance enhancements (XLA, AMP) to examples
2019-10-15 09:57:18 +02:00
thomwolf
2c1d5564ad add readme information 2019-10-15 09:56:52 +02:00
thomwolf
c55badcee0 Add NER finetuning details by @stefan-it in example readme 2019-10-15 09:33:52 +02:00
Julien Chaumond
788e632622 [ner] Honor args.overwrite_cache 2019-10-15 09:17:31 +02:00
thomwolf
0f9ebb0b43 add seqeval as requirement for examples 2019-10-15 09:17:31 +02:00
thomwolf
66adb71734 update to transformers 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5ff9cd158a Add option to predict on test set 2019-10-15 09:17:31 +02:00
Marianne Stecklina
7f5367e0b1 Add cli argument for configuring labels 2019-10-15 09:17:31 +02:00
Marianne Stecklina
e1d4179b64 Make file reading more robust 2019-10-15 09:17:31 +02:00
Marianne Stecklina
383ef96747 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:17:31 +02:00
Marianne Stecklina
5adb39e757 Add option to predict on test set 2019-10-15 09:14:53 +02:00
Marianne Stecklina
99b189df6d Add cli argument for configuring labels 2019-10-15 09:14:53 +02:00
Marianne Stecklina
3e9420add1 Make file reading more robust 2019-10-15 09:14:53 +02:00
Marianne Stecklina
cde42c4354 Implement fine-tuning BERT on CoNLL-2003 named entity recognition task 2019-10-15 09:14:53 +02:00
hlums
74c5035808 Fix token order in xlnet preprocessing. 2019-10-14 21:27:11 +00:00
Rémi Louf
fe25eefc15 add instructions to fetch the dataset 2019-10-14 20:45:39 +02:00
Rémi Louf
412793275d delegate the padding with special tokens to the tokenizer 2019-10-14 20:45:16 +02:00
Rémi Louf
447fffb21f process the raw CNN/Daily Mail dataset
the data provided by Li Dong et al. were already tokenized, which means
that they are not compatible with all the models in the library. We
thus process the raw data directly and tokenize them using the models'
tokenizers.
2019-10-14 18:12:20 +02:00
Simon Layton
4e6a55751a Force einsum to fp16 2019-10-14 11:12:41 -04:00
Rémi Louf
67d10960ae load and prepare CNN/Daily Mail data
We write a function to load and preprocess the CNN/Daily Mail dataset as
provided by Li Dong et al. The issue is that this dataset has already
been tokenized by the authors, so we actually need to find the original,
plain-text dataset if we want to apply it to all models.
2019-10-14 14:11:20 +02:00
Timothy Liu
376e65a674 Added automatic mixed precision and XLA options to run_tf_glue.py 2019-10-13 13:19:06 +00:00
Timothy Liu
86f23a1944 Minor enhancements to run_tf_glue.py 2019-10-13 10:21:35 +00:00
VictorSanh
d844db4005 Add citation bibtex 2019-10-11 16:55:42 -04:00
Rémi Louf
b3261e7ace read parameters from CLI, load model & tokenizer 2019-10-11 18:40:38 +02:00
Rémi Louf
d889e0b71b add base for seq2seq finetuning 2019-10-11 17:36:12 +02:00
Thomas Wolf
4428aefc63
Merge pull request #1488 from huggingface/pytorch-tpu
GLUE on TPU
2019-10-11 16:33:00 +02:00
Luran He
f382a8decd convert int to str before adding to a str 2019-10-10 19:20:39 -04:00
Lysandre
639f4b7190 Don't save/load when on TPU 2019-10-10 19:17:25 +00:00
Lysandre
d4e7934ac3 GLUE on TPU 2019-10-10 19:03:06 +00:00
Rémi Louf
1e68c28670 add test for initialization of Bert2Rnd 2019-10-10 18:07:11 +02:00
Thomas Wolf
6596e3d566
Merge pull request #1454 from bkkaggle/pytorch-built-in-tensorboard
Change tensorboard imports to use built-in tensorboard if available
2019-10-10 11:56:55 +02:00
thomwolf
177a721205 move back to simple space splitting 2019-10-10 11:45:47 +02:00
thomwolf
a5997dd81a better error messages 2019-10-10 11:31:01 +02:00
Lysandre Debut
2431fea98a
Merge pull request #1383 from keskarnitish/master
Adding CTRL
2019-10-09 11:31:05 -04:00
thomwolf
d9e60f4f0d Merge branch 'master' into pr/1383 2019-10-09 17:25:08 +02:00
Lysandre Debut
e84470ef81
Merge pull request #1384 from huggingface/encoding-qol
Quality of life enhancements in encoding + patch MLM masking
2019-10-09 11:18:24 -04:00
jinoobaek-qz
69629c4f0f Improve naming and only do regex when necessary 2019-10-09 08:48:40 -04:00
jinoobaek-qz
bf34a252b8 Golden path 2019-10-09 08:48:40 -04:00
jinoobaek-qz
528d3f327b Improve readability and make fewer assumptions about checkpoint format 2019-10-09 08:48:40 -04:00
jinoobaek-qz
56301bd9e8 Extract method 2019-10-09 08:48:40 -04:00
jinoobaek-qz
d6c5469712 Delete older checkpoint after saving new checkpoint 2019-10-09 08:48:40 -04:00
jinoobaek-qz
54a31f50fb Add save_total_limit 2019-10-09 08:48:40 -04:00
Thomas Wolf
439fac723a
Merge pull request #1409 from brian41005/master
Evaluation result.txt path changing #1286
2019-10-09 03:14:34 +02:00
Bilal Khan
5ce8d29abe Change tensorboard imports to use built-in tensorboard if available 2019-10-08 16:29:43 -05:00
VictorSanh
7ce83b4931 update weights for distilgpt2 2019-10-07 12:30:27 -04:00
LysandreJik
f3e0218fbb Correct device assignment in run_generation 2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930 fixes 2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066 update encode_plus - add truncation strategies 2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555 unnecessary carriage return 2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821 run_squad --> run_squad_w_distillation 2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec add distillation+finetuning option in run_squad 2019-10-04 17:23:15 -04:00
thomwolf
92c0f2fb90 Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol 2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c Honor args.overwrite_cache (h/t @erenup) 2019-10-04 15:00:56 -04:00
keskarnitish
dbed1c5d94 Adding CTRL (squashed commit)
adding conversion script

adding first draft of modeling & tokenization

adding placeholder for test files

bunch of changes

registering the tokenizer/model/etc

tests

change link; something is very VERY wrong here

weird end-of-word thingy going on

i think the tokenization works now ; wrote the unit tests

overall structure works;load w next

the monster is alive!

works after some cleanup as well

adding emacs autosave to gitignore

currently only supporting the 48 layer one; seems to infer fine on my macbook

cleanup

fixing some documentation

fixing some documentation

tests passing?

now works on CUDA also

adding greedy?

adding greedy sampling

works well
2019-10-03 22:29:03 -07:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master 2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa XLM use_lang_embedding flag in run_generation 2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb Added XLM to run_generation, with prompt language selection. 2019-10-03 17:18:48 -04:00
LysandreJik
aebd83230f Update naming + remove f string in run_lm_finetuning example 2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb LM finetuning won't mask special tokens anymore 2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910 Update run_glue.py
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
5f07d8f11a prepare release 2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb incoming release 🔥 update links to arxiv preprint 2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f update README - TODO update link to paper 2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f update train.py 2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0 update requirements 2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54 update distiller 2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db update binarized_data 2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376 update extract 2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079 update extract_distilbert 2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205 update token_counts 2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f grouped_batch_sampler 2019-10-03 10:27:11 -04:00
VictorSanh
594202a934 lm_seqs_dataset 2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4 add distillation_configs 2019-10-03 10:27:11 -04:00
Brian Ma
2195c0d5f9 Evaluation result.txt path changing #1286 2019-10-03 12:49:12 +08:00
Thomas Wolf
963529e29b
Merge pull request #1288 from echan00/master
Typo with LM Fine tuning script
2019-10-01 18:46:07 -04:00
thomwolf
f7978f70ec use format instead of f-strings 2019-10-01 18:45:38 -04:00
Julien Chaumond
b350662955 overflowing_tokens do not really make sense here, let's just return a number
Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2019-09-30 16:37:09 -04:00
Julien Chaumond
f5bcde0b2f [multiple-choice] Simplify and use tokenizer.encode_plus 2019-09-30 16:04:55 -04:00
Denny
9478590630
Update run_lm_finetuning.py
The previous method, just as phrased, did not exist in the class.
2019-09-27 15:18:42 -03:00
Thomas Wolf
d83d295763
Merge pull request #1337 from mgrankin/fastdataset
faster dataset building
2019-09-27 10:35:12 +02:00
thomwolf
da2e47ad15 clean up a little run_tf_glue 2019-09-27 09:41:15 +02:00
thomwolf
528c288fa9 clean up run_tf_glue 2019-09-27 09:40:29 +02:00
VictorSanh
702f589848 fix input in run_glue for distilbert 2019-09-27 00:20:14 -04:00
mgrankin
f71a4577b8 faster dataset building 2019-09-26 16:53:13 +03:00
thomwolf
481d9c4fb5 Merge branch 'master' into tf2 2019-09-26 12:02:54 +02:00
thomwolf
31c23bd5ee [BIG] pytorch-transformers => transformers 2019-09-26 10:15:53 +02:00
thomwolf
5705333441 add initialization for everybody 2019-09-26 10:06:20 +02:00
thomwolf
7c9f8f93f9 fix tests 2019-09-26 01:59:53 +02:00
thomwolf
d6dde438ea add batch dimension in encode 2019-09-26 01:45:55 +02:00
thomwolf
4a21c4d88d add warning if neither pt nor tf are found 2019-09-26 01:30:06 +02:00
thomwolf
3b7fb48c3b fix loading from tf/pt 2019-09-25 17:46:16 +02:00
thomwolf
a049c8043b push fix to training 2019-09-25 17:33:16 +02:00
mataney
a9f24a16bc [FIX] fix run_generation.py to work with batch_size > 1 2019-09-25 15:53:29 +03:00
thomwolf
5def3302f4 update run_glue 2019-09-25 12:38:08 +02:00
thomwolf
f71758f7a4 update internal glue processors 2019-09-25 12:00:50 +02:00
thomwolf
b5ec526f85 updated data processor and metrics 2019-09-24 17:10:50 +02:00
LysandreJik
f09e5ecef0 [Proposal] GLUE processors included in library 2019-09-24 09:47:34 -04:00
LysandreJik
c832f43a4d output_token_type -> token_type_ids 2019-09-24 07:21:38 -04:00
LysandreJik
3927d7756c Updated the GLUE pre-processing method 2019-09-24 07:15:11 -04:00
LysandreJik
9d44236f70 Updated DistilBERT 2019-09-24 07:03:24 -04:00
Lorenzo Ampil
4b543c3007 Add option to use a 'stop token', which truncates the output text to everything before the first occurrence of the 'stop token' 2019-09-22 21:38:38 +08:00
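A hedged sketch of the stop-token behaviour described above (the helper name is hypothetical):

    def truncate_at_stop_token(text, stop_token):
        # Keep everything up to, but not including, the first stop token.
        return text[: text.index(stop_token)] if stop_token in text else text

    print(truncate_at_stop_token("A generated sentence. <|endoftext|> trailing junk", "<|endoftext|>"))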
VictorSanh
9f995b99d4 minor fixes 2019-09-19 21:36:06 +00:00
VictorSanh
3fe5c8e8a8 update bert-base-uncased results 2019-09-19 19:34:22 +00:00
VictorSanh
354944e607 [distillation] big update w/ new weights 2019-09-19 19:25:21 +00:00
LysandreJik
60414f31a9 GLUE updated with new methods 2019-09-19 10:55:06 +02:00
LysandreJik
bf503158c5 Sentence -> Sequence. Removed output_mask from the special token addition methods. 2019-09-19 10:55:06 +02:00
LysandreJik
de8e14b6c0 Added DistilBERT to run_squad script 2019-09-19 10:55:06 +02:00
LysandreJik
88368c2a16 Added DistilBERT to run_lm_finetuning 2019-09-19 10:55:06 +02:00
LysandreJik
75635072e1 Updated GLUE script to add DistilBERT. Cleaned up unused args in the utils file. 2019-09-19 10:55:06 +02:00
LysandreJik
59057abe52 typo 2019-09-19 10:55:06 +02:00
LysandreJik
bac332fec0 Updated the GLUE data processor. Corrections to RoBERTa and XLNet. 2019-09-19 10:55:06 +02:00
Erik Chan
f0340eccf9
Typo
2019-09-18 13:42:11 -07:00
erenup
8960988f35 fixed to find best dev acc 2019-09-19 01:10:05 +08:00
erenup
46ffc28329 Merge branch 'master' into run_multiple_choice_merge
2019-09-18 21:43:46 +08:00
erenup
15143fbad6 move run_multiple_choice.py and utils_multiple_choice.py to examples 2019-09-18 21:18:46 +08:00
erenup
3cd6289758 Merge remote-tracking branch 'huggingface/master' into run_multiple_choice_merge
# Conflicts:
#	examples/contrib/run_swag.py
2019-09-18 21:16:59 +08:00
erenup
36362cf086 move schedule.step after optimizer.step 2019-09-18 21:13:40 +08:00
thomwolf
e768f2322a update run_openai_gpt to fix #1264 2019-09-18 10:07:47 +02:00
thomwolf
8334993915 clean up examples - updated to new keyword inputs - #1246 2019-09-18 10:01:27 +02:00
erenup
5882c442e5 add example usage 2019-09-16 22:38:08 +08:00
erenup
982f181aa7 Merge remote-tracking branch 'origin/master' into run_multiple_choice_add_doc 2019-09-16 19:12:00 +08:00
erenup
84b9d1c423 Merge remote-tracking branch 'huggingface/master'
# Conflicts:
#	pytorch_transformers/__init__.py
2019-09-16 19:06:12 +08:00
erenup
603b470a3d add warning info 2019-09-16 18:53:37 +08:00
erenup
4812a5a767 add doc string 2019-09-16 11:50:18 +08:00
VictorSanh
32e1332acf [distil] fix once and for all the general logger for scripts 2019-09-11 14:19:07 +00:00
VictorSanh
364920e216 fix small bug/typo 2019-09-10 21:45:01 +00:00
Thomas Wolf
23c23f5399
Merge pull request #1229 from SKRohit/master
changes in evaluate function in run_lm_finetuning.py
2019-09-10 22:16:45 +02:00
searchivarius
eab980fd68 Fix to prevent crashing on assert len(tokens_b)>=1 2019-09-09 19:58:08 -04:00
VictorSanh
a95ced6260 [Distillation] save last chkpt as pytorch_model.bin 2019-09-09 19:53:35 +00:00
Rohit Kumar Singh
e5df36397b
changes in return statement of evaluate function
changed `results` to `result` and removed `results` dict defined previously
2019-09-09 19:55:57 +05:30
LysandreJik
3f91338be9 Patched a few outdated parameters 2019-09-06 17:48:06 -04:00
LysandreJik
f47f9a5874 Updated outdated examples 2019-09-06 17:10:33 -04:00
LysandreJik
5e151f5e77 Table of contents 2019-09-06 12:08:36 -04:00
LysandreJik
593c070435 Better examples 2019-09-06 12:00:12 -04:00
VictorSanh
dddd6b9927 Update DistilBERT training code 2019-09-05 18:26:14 +00:00
Stefan Schweter
a1c34bd286 distillation: fix ModuleNotFoundError in token counts script 2019-08-31 12:21:38 +02:00
Thomas Wolf
51e980ce36
Merge pull request #1155 from anhnt170489/apex_fp16
Update apex fp16 implementation
2019-08-30 23:29:11 +02:00
VictorSanh
282c276e09 typos + file name coherence in distillation README 2019-08-30 12:02:29 -04:00
VictorSanh
803c1cc4ea fix relative import bug cf Issue #1140 2019-08-30 12:01:27 -04:00
Thomas Wolf
0a2fecdf90
Merge branch 'master' into master 2019-08-30 16:30:08 +02:00
Rabeeh KARIMI
39eb31e11e remove reloading tokenizer in the training, adding it to the evaluation part 2019-08-30 15:44:41 +02:00
Rabeeh KARIMI
350bb6bffa updated tokenizer loading for addressing reproducibility issues 2019-08-30 15:34:28 +02:00
Thomas Wolf
01ad55f8cf
Merge pull request #1026 from rabeehk/master
loads the tokenizer for each checkpoint, to solve the reproducibility…
2019-08-30 14:15:36 +02:00
erenup
6e1ac34e2b Merge remote-tracking branch 'huggingface/master' 2019-08-30 15:50:11 +08:00
jamin
2fb9a934b4 re-format 2019-08-30 14:05:28 +09:00
jamin
c8731b9583 update apex fp16 implementation 2019-08-30 13:54:00 +09:00
LysandreJik
caf1d116a6 Closing bracket in DistilBERT's token count. 2019-08-29 15:30:10 -04:00
Luis
fe8fb10b44 Small modification of comment in the run_glue.py example
Add RoBERTa to the comment as it was not explicit that RoBERTa don't use token_type_ids.
2019-08-29 14:43:30 +02:00
erenup
942d3f4b20 modify code of arc label insurance 2019-08-29 10:21:17 +08:00
LysandreJik
bf3dc778b8 Changed learning rate for run_squad test 2019-08-28 18:24:43 -04:00
Andreas Daiminger
1d15a7f278 swap order of optimizer.step() and scheduler.step() 2019-08-28 19:18:27 +02:00
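Several commits in this log make the same swap; a minimal sketch of the resulting order (since PyTorch 1.1, scheduler.step() should run after optimizer.step(), otherwise the first value of the schedule is skipped). The toy model and constant schedule are illustrative:

    import torch

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda step: 1.0)

    for batch in [torch.randn(2, 4)] * 3:
        model(batch).sum().backward()
        optimizer.step()
        scheduler.step()  # after optimizer.step(), not before
        optimizer.zero_grad()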
Thomas Wolf
0ecfd17f49
Merge pull request #987 from huggingface/generative-finetuning
Generative finetuning
2019-08-28 16:51:50 +02:00
thomwolf
b5eb283aaa update credits 2019-08-28 16:36:55 +02:00
thomwolf
912a377e90 dilbert -> distilbert 2019-08-28 13:59:42 +02:00
thomwolf
4ce5f36f78 update readmes 2019-08-28 12:14:31 +02:00
erenup
ec4b1c659f logging truth error 2019-08-28 16:50:40 +08:00
erenup
df52abe373 add sep_token between question and choice 2019-08-28 16:36:21 +08:00
erenup
43c243254a avoid invalid labels of truth 2019-08-28 16:03:17 +08:00
erenup
3c7e676f8b add test related code: test the best dev acc model when model is training 2019-08-28 15:57:29 +08:00
VictorSanh
93e82ab424 Write README for DilBERT 2019-08-28 06:26:09 +00:00
VictorSanh
fea921d382 add licensing 2019-08-28 04:45:39 +00:00
VictorSanh
da1e4e53fc some fixes in train.py for loading previous checkpoint 2019-08-28 04:01:03 +00:00
VictorSanh
0d8f8848d5 add scripts/extract_for_distil.py 2019-08-28 04:00:19 +00:00
VictorSanh
7f2c384c80 add scripts/token_counts.py 2019-08-28 04:00:03 +00:00
VictorSanh
4d16b279e5 add scripts/binarized_data.py 2019-08-28 03:59:48 +00:00
VictorSanh
b247b0d880 add train.py for distillation 2019-08-28 02:12:47 +00:00
VictorSanh
780f183e55 add requirements 2019-08-28 01:39:52 +00:00
VictorSanh
e424d2e45d add README 2019-08-28 01:10:10 +00:00
VictorSanh
1ae81e4aa1 add dataset. distiller, utils 2019-08-28 01:10:05 +00:00
thomwolf
06510ccb53 typo 2019-08-23 22:08:10 +02:00
thomwolf
ab7bd5ef98 fixing tokenization and training 2019-08-23 17:31:21 +02:00
Thomas Wolf
90dcd8c05d
Merge branch 'master' into generative-finetuning 2019-08-22 10:43:30 +02:00
VictorSanh
57272d5ddf fix for glue 2019-08-22 00:25:49 -04:00
VictorSanh
b006a7a12f fix for squad 2019-08-22 00:25:42 -04:00
Thomas Wolf
9beaa85b07
Merge pull request #1055 from qipeng/run_squad_fix
Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models)
2019-08-21 01:20:46 +02:00
Lysandre
2d042274ac Sequence special token handling for BERT and RoBERTa 2019-08-20 14:15:28 -04:00
Peng Qi
3bffd2e8e5 more fixes 2019-08-20 10:59:28 -07:00
Thomas Wolf
3b56427a1e
Merge pull request #1040 from FeiWang96/multi_gpu
Fix bug of multi-gpu training in lm finetuning
2019-08-20 17:13:44 +02:00
thomwolf
a690edab17 various fix and clean up on run_lm_finetuning 2019-08-20 15:52:12 +02:00
erenup
fc74132598 add best steps to train 2019-08-20 19:06:41 +08:00
Duzeyao
d86b49ac86 swap optimizer.step and scheduler.step 2019-08-20 16:46:34 +08:00
Duzeyao
45ab8bf60e Revert "Update finetune_on_pregenerated.py"
This reverts commit a1359b970c.
2019-08-20 16:40:39 +08:00
erenup
97c30b73d5 add test related code 2019-08-20 16:31:04 +08:00
erenup
d5e60e5b7a add test related code 2019-08-20 16:25:50 +08:00
Zeyao Du
a1359b970c
Update finetune_on_pregenerated.py 2019-08-20 16:00:07 +08:00
Zeyao Du
28f7ca1f80
swap optimizer.step and scheduler.step 2019-08-20 15:58:42 +08:00
Peng Qi
a368b87791 Fix #1015 2019-08-19 13:07:00 -07:00
Lysandre
f94f1c6016 Distributed training + tokenizer agnostic mask token 2019-08-19 14:58:50 -04:00
Thomas Wolf
5a49b793d9
Merge pull request #1023 from tuvuumass/patch-1
fix issue #824
2019-08-19 15:31:46 +02:00
erenup
4270d3da1b fix a bug of evaluating 2019-08-19 16:38:52 +08:00
Chi-Liang Liu
40acf6b52a don't save model without training 2019-08-18 05:02:25 -04:00
erenup
47e9aea0fe add args info to evaluate_result.txt 2019-08-18 17:00:53 +08:00
erenup
5582bc4b23 add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80
2019-08-18 16:01:48 +08:00
wangfei
856a63da4d Fix: save model/model.module 2019-08-18 11:03:47 +08:00
wangfei
1ef41b8337 Revert "Fix: save model/model.module"
This reverts commit 00e9c4cc96.
2019-08-18 11:03:12 +08:00
wangfei
00e9c4cc96 Fix: save model/model.module 2019-08-18 11:02:02 +08:00
erenup
e384ae2b9d Merge remote-tracking branch 'huggingface/master'
merge huggingface/master to update
2019-08-17 12:05:57 +08:00
Jason Phang
d8923270e6 Correct truncation for RoBERTa in 2-input GLUE 2019-08-16 16:30:38 -04:00
Lysandre
5652f54ac2 Simplified data generator + better perplexity calculator
GPT-2 now obtains ~20 perplexity on WikiText-2
2019-08-16 13:49:56 -04:00
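A brief sketch of the perplexity measure mentioned above: perplexity is the exponential of the average per-token negative log-likelihood (the numbers below are illustrative):

    import math

    def perplexity(total_neg_log_likelihood, num_tokens):
        return math.exp(total_neg_log_likelihood / num_tokens)

    print(perplexity(total_neg_log_likelihood=3000.0, num_tokens=1000))  # ~20.1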
LysandreJik
7e7fc53da5 Fixing run_glue example with RoBERTa 2019-08-16 11:53:10 -04:00
LysandreJik
715534800a BERT + RoBERTa masking tokens handling + GPU device update. 2019-08-16 10:10:21 -04:00
LysandreJik
339e556feb CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism. 2019-08-16 10:10:20 -04:00
LysandreJik
5c18825a18 Removed dataset limit 2019-08-16 10:10:20 -04:00
LysandreJik
3e3e145497 Added GPT to the generative fine-tuning. 2019-08-16 10:10:20 -04:00
LysandreJik
47975ed53e Language Modeling fine-tuning using GPT-2. 2019-08-16 10:10:20 -04:00
wangfei
b8ff56896c Fix bug of multi-gpu training in lm finetuning 2019-08-16 12:11:05 +08:00
Rabeeh KARIMI
3d47a7f8ab loads the tokenizer for each checkpoint, to solve the reproducibility issue 2019-08-14 10:58:26 +02:00
LysandreJik
39f426be65 Added special tokens <pad> and <mask> to RoBERTa. 2019-08-13 15:19:50 -04:00
Julien Chaumond
baf08ca1d4 [RoBERTa] run_glue: correct pad_token + reorder labels 2019-08-13 12:51:15 -04:00
tuvuumass
ba4bce2581
fix issue #824 2019-08-13 11:26:27 -04:00
Julien Chaumond
912fdff899 [RoBERTa] Update run_glue for RoBERTa 2019-08-12 13:49:50 -04:00
erenup
b219029c45 refactoring old run_swag. This script is mainly refactored from run_squad in pytorch_transformers 2019-08-11 15:20:37 +08:00
Thomas Wolf
b4f9464f90
Merge pull request #960 from ethanjperez/patch-1
Fixing unused weight_decay argument
2019-08-07 10:09:55 +02:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models 2019-08-05 19:17:35 +02:00
thomwolf
70c10caa06 add option mentioned in #940 2019-08-05 17:09:37 +02:00
thomwolf
b90e29d52c working on automodels 2019-08-05 16:06:34 +02:00
Ethan Perez
28ba345ecc
Fixing unused weight_decay argument
Currently the L2 regularization is hard-coded to "0.01", even though there is a --weight_decay flag implemented (that is unused). I'm making this flag control the weight decay used for fine-tuning in this script.
2019-08-04 12:31:46 -04:00
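A hedged sketch of wiring a --weight_decay flag into the optimizer instead of a hard-coded 0.01, using the parameter-grouping pattern common in these scripts (the helper is illustrative; biases and LayerNorm weights are conventionally exempted from decay):

    import torch

    def build_optimizer(model, learning_rate, weight_decay):
        no_decay = ("bias", "LayerNorm.weight")
        grouped_parameters = [
            {"params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
             "weight_decay": weight_decay},  # taken from the flag, not a literal 0.01
            {"params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)],
             "weight_decay": 0.0},
        ]
        return torch.optim.AdamW(grouped_parameters, lr=learning_rate)

    optimizer = build_optimizer(torch.nn.Linear(4, 2), learning_rate=2e-5, weight_decay=0.01)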
Thomas Wolf
c054b5ee64
Merge pull request #896 from zijunsun/master
fix multi-gpu training bug when using fp16
2019-07-26 19:31:02 +02:00
zijunsun
f0aeb7a814 multi-gpu training also should be after apex fp16 (squad) 2019-07-26 15:23:29 +08:00
zijunsun
adb3ef6368 multi-gpu training also should be after apex fp16 2019-07-25 13:09:10 +08:00
Chi-Liang Liu
a7fce6d917 fix squad v1 error (na_prob_file should be None) 2019-07-24 16:11:36 +08:00
thomwolf
6070b55443 fix #868 2019-07-23 17:46:01 +02:00
thomwolf
2c9a3115b7 fix #858 2019-07-23 16:45:55 +02:00
Thomas Wolf
268c6cc160
Merge pull request #845 from rabeehk/master
fixed version issues in run_openai_gpt
2019-07-23 15:29:31 +02:00
Peiqin Lin
76be189b08 typos 2019-07-21 20:39:42 +08:00
Rabeeh KARIMI
f63ff536ad fixed version issues in run_openai_gpt 2019-07-20 12:43:07 +02:00
Thomas Wolf
a615499076
Merge pull request #797 from yzy5630/fix-examples
fix some errors for distributed lm_finetuning
2019-07-18 23:32:33 +02:00
yzy5630
a1fe4ba9c9 use new API for save and load 2019-07-18 15:45:23 +08:00
yzy5630
a7ba27b1b4 add parser for adam 2019-07-18 08:52:51 +08:00
yzy5630
d6522e2873 change loss and optimizer to new API 2019-07-17 21:22:34 +08:00
thomwolf
71d597dad0 fix #800 2019-07-17 13:51:09 +02:00
yzy5630
123da5a2fa fix errors for lm_finetuning examples 2019-07-17 09:56:07 +08:00
yzy5630
60a1bdcdac fix some errors for distributed lm_finetuning 2019-07-17 09:16:20 +08:00
thomwolf
e848b54730 fix #792 2019-07-16 21:22:19 +02:00
thomwolf
1849aa7d39 update readme and pretrained model weight files 2019-07-16 15:11:29 +02:00
thomwolf
f31154cb9d Merge branch 'xlnet' 2019-07-16 11:51:13 +02:00
thomwolf
76da9765b6 fix run_generation test 2019-07-15 17:52:35 +02:00
thomwolf
e691fc0963 update QA models tests + run_generation 2019-07-15 17:45:24 +02:00
thomwolf
15d8b1266c update tokenizer - update squad example for xlnet 2019-07-15 17:30:42 +02:00
thomwolf
3b469cb422 updating squad for compatibility with XLNet 2019-07-15 15:28:37 +02:00
thomwolf
0e9825e252 small fix to run_glue 2019-07-14 23:43:28 +02:00
thomwolf
2397f958f9 updating examples and doc 2019-07-14 23:20:10 +02:00
thomwolf
c490f5ce87 added generation examples in tests 2019-07-13 15:26:58 +02:00
thomwolf
7d4b200e40 good quality generation example for GPT, GPT-2, Transfo-XL, XLNet 2019-07-13 15:25:03 +02:00
thomwolf
7322c314a6 remove python2 testing for examples 2019-07-12 14:24:08 +02:00
thomwolf
936e813c84 clean up examples - added squad example and test 2019-07-12 14:16:06 +02:00
thomwolf
762ded9b1c wip examples 2019-07-12 11:28:52 +02:00
LysandreJik
3821ecbf4a Byte order mark management in TSV glue reading. 2019-07-11 20:16:28 -04:00
thomwolf
c6bf1a400d fix test examples and pretrained model 2019-07-11 22:29:08 +02:00
thomwolf
92a782b108 fix run_glue test 2019-07-11 22:20:10 +02:00
thomwolf
ccb6947dc1 optimization tests 2019-07-11 17:39:47 +02:00
thomwolf
b21d84b027 update examples 2019-07-11 15:37:34 +02:00
thomwolf
ec07cf5a66 revamp optimization 2019-07-11 14:48:22 +02:00
thomwolf
4fef5919a5 updating examples 2019-07-11 12:03:08 +02:00
thomwolf
50b7e52a7f WIP examples 2019-07-10 15:33:34 +02:00
thomwolf
ed6c8d37f4 fix merge 2019-07-09 17:14:52 +02:00
thomwolf
4ce237c880 update run_glue 2019-07-09 17:00:32 +02:00
thomwolf
3b7cb7bf44 small update to run_glue 2019-07-09 16:12:15 +02:00
thomwolf
d0efbd3cd1 update sequencesummary module 2019-07-09 15:46:43 +02:00
thomwolf
d5481cbe1b adding tests to examples - updating summary module - coverage update 2019-07-09 15:29:42 +02:00
thomwolf
b19786985d unified tokenizer api and serialization + tests 2019-07-09 10:25:18 +02:00
thomwolf
3d5f291386 updates to run_glue 2019-07-05 17:22:15 +02:00
thomwolf
99b90edab1 cleaning up run_glue example 2019-07-05 17:09:35 +02:00
thomwolf
1113f97f33 clean up glue example 2019-07-05 16:31:13 +02:00
thomwolf
162ba383b0 fix model loading 2019-07-05 15:57:14 +02:00
thomwolf
36bca545ff tokenization abstract class - tests for examples 2019-07-05 15:02:59 +02:00
Thomas Wolf
78462aad61
Merge pull request #733 from ceremonious/parallel-generation
Added option to use multiple workers to create training data
2019-07-05 12:04:30 +02:00
thomwolf
0bab55d5d5 [BIG] name change 2019-07-05 11:55:36 +02:00
thomwolf
c41f2bad69 WIP XLM + refactoring 2019-07-03 22:54:39 +02:00
Lei Mao
64b2a828c0 fix evaluation bug 2019-07-01 14:56:24 -07:00
thomwolf
2b56e98892 standardizing API across models - XLNetForSeqClass working 2019-06-28 16:35:09 +02:00
thomwolf
3a00674cbf fix imports 2019-06-27 17:18:46 +02:00
Mayhul Arora
08ff056c43 Added option to use multiple workers to create training data for lm fine tuning 2019-06-26 16:16:12 -07:00
thomwolf
59cefd4f98 fix #726 - get_lr in examples 2019-06-26 11:28:27 +02:00
thomwolf
092dacfd62 changing is_regression to unified API 2019-06-26 09:54:05 +02:00
thomwolf
e55d4c4ede various updates to conversion, models and examples 2019-06-26 00:57:53 +02:00
thomwolf
7334bf6c21 pad on left for xlnet 2019-06-24 15:05:11 +02:00
thomwolf
c888663f18 overwrite output directories if needed 2019-06-24 14:38:24 +02:00
thomwolf
62d78aa37e updating GLUE utils for compatibility with XLNet 2019-06-24 14:36:11 +02:00
thomwolf
24ed0b9346 updating run_xlnet_classifier 2019-06-24 12:00:09 +02:00
thomwolf
f6081f2255 add XLNetForSequenceClassification and run_classifier example for XLNet 2019-06-24 10:01:07 +02:00
Rocketknight1
c7b2808ed7 Update LM finetuning README to include a literature reference 2019-06-22 15:04:01 +01:00
thomwolf
181075635d updating model loading and adding special tokens ids 2019-06-21 23:23:37 +02:00
thomwolf
ebd2cb8d74 update from_pretrained to load XLNetModel as well 2019-06-21 21:08:44 +02:00
thomwolf
edfe91c36e first version bertology ok 2019-06-19 23:43:04 +02:00
thomwolf
7766ce66dd update bertology 2019-06-19 22:29:51 +02:00
thomwolf
e4b46d86ce update head pruning 2019-06-19 22:16:30 +02:00
thomwolf
0f40e8d6a6 debugger 2019-06-19 15:38:46 +02:00
thomwolf
0e1e8128bf more logging 2019-06-19 15:35:49 +02:00
thomwolf
909d4f1af2 cuda again 2019-06-19 15:32:10 +02:00
thomwolf
14f0e8e557 fix cuda 2019-06-19 15:29:28 +02:00
thomwolf
34d706a0e1 pruning in bertology 2019-06-19 15:25:49 +02:00
thomwolf
dc8e0019b7 updating examples 2019-06-19 13:23:20 +02:00
thomwolf
68ab9599ce small fix and updates to readme 2019-06-19 09:38:38 +02:00
thomwolf
f7e2ac01ea update barrier 2019-06-18 22:43:35 +02:00
thomwolf
4d8c4337ae test barrier in distrib training 2019-06-18 22:41:28 +02:00
thomwolf
3359955622 updating run_classif 2019-06-18 22:23:10 +02:00
thomwolf
29b7b30eaa updating evaluation on a single gpu 2019-06-18 22:20:21 +02:00
thomwolf
7d2001aa44 overwrite_output_dir 2019-06-18 22:13:30 +02:00
thomwolf
16a1f338c4 fixing 2019-06-18 17:06:31 +02:00
thomwolf
92e0ad5aba no numpy 2019-06-18 17:00:52 +02:00
thomwolf
4e6edc3274 hop 2019-06-18 16:57:15 +02:00
thomwolf
f55b60b9ee fixing again 2019-06-18 16:56:52 +02:00
thomwolf
8bd9118294 quick fix 2019-06-18 16:54:41 +02:00
thomwolf
3e847449ad fix out_label_ids 2019-06-18 16:53:31 +02:00
thomwolf
aad3a54e9c fix paths 2019-06-18 16:48:04 +02:00
thomwolf
40dbda6871 updating classification example 2019-06-18 16:45:52 +02:00
thomwolf
7388c83b60 update run_classifier for distributed eval 2019-06-18 16:32:49 +02:00
thomwolf
9727723243 fix pickle 2019-06-18 16:02:42 +02:00
thomwolf
9710b68dbc fix pickles 2019-06-18 16:01:15 +02:00
thomwolf
15ebd67d4e cache in run_classifier + various fixes to the examples 2019-06-18 15:58:22 +02:00
thomwolf
e6e5f19257 fix 2019-06-18 14:45:14 +02:00
thomwolf
a432b3d466 distributed training t_total 2019-06-18 14:39:09 +02:00
thomwolf
c5407f343f split squad example in two 2019-06-18 14:29:03 +02:00
thomwolf
335f57baf8 only on main process 2019-06-18 14:03:46 +02:00
thomwolf
326944d627 add tensorboard to run_squad 2019-06-18 14:02:42 +02:00
thomwolf
d82e5deeb1 set find_unused_parameters=True in DDP 2019-06-18 12:13:14 +02:00
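For context, `find_unused_parameters=True` lets DDP tolerate parameters that take no part in a given forward pass, which would otherwise stall gradient synchronization. A minimal single-process sketch, using a gloo process group purely so the example runs standalone:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process group just to make the sketch runnable; real runs use
# torchrun / multiple ranks.
dist.init_process_group(backend="gloo",
                        init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)

# find_unused_parameters=True tolerates parameters with no gradient in a
# given iteration, at the cost of an extra traversal of the autograd graph.
model = DDP(torch.nn.Linear(4, 2), find_unused_parameters=True)
```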
thomwolf
a59abedfb5 DDP update 2019-06-18 12:06:26 +02:00
thomwolf
2ef5e0de87 switch to pytorch DistributedDataParallel 2019-06-18 12:03:13 +02:00
thomwolf
9ce37af99b oops 2019-06-18 11:47:54 +02:00
thomwolf
a40955f071 no need to duplicate models anymore 2019-06-18 11:46:14 +02:00
thomwolf
382e2d1e50 spliting config and weight files for bert also 2019-06-18 10:37:16 +02:00
Thomas Wolf
cad88e19de
Merge pull request #672 from oliverguhr/master
Add vocabulary and model config to the finetune output
2019-06-14 17:02:47 +02:00
Thomas Wolf
460d9afd45
Merge pull request #640 from Barqawiz/master
Support latest multi language bert fine tune
2019-06-14 16:57:02 +02:00
Thomas Wolf
277c77f1c5
Merge pull request #630 from tguens/master
Update run_squad.py
2019-06-14 16:56:26 +02:00
Thomas Wolf
659af2cbd0
Merge pull request #604 from samuelbroscheit/master
Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556
2019-06-14 16:49:24 +02:00
Meet Pragnesh Shah
e02ce4dc79
[hotfix] Fix frozen pooler parameters in SWAG example. 2019-06-11 15:13:53 -07:00
Oliver Guhr
5c08c8c273 adds the tokenizer + model config to the output 2019-06-11 13:46:33 +02:00
jeonsworld
a3a604cefb
Update pregenerate_training_data.py
Apply the Whole Word Masking technique,
referring to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py).
2019-06-10 12:17:23 +09:00
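Whole Word Masking masks all WordPiece pieces of a word together instead of masking sub-tokens independently. A minimal sketch of the idea (function name and tokens are illustrative, not the script's actual code):

```python
import random

def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Mask whole words: a '##' continuation piece is always masked
    together with the other sub-tokens of the word it belongs to."""
    # Group token indices into words: a '##' piece extends the previous word.
    word_spans = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and word_spans:
            word_spans[-1].append(i)
        else:
            word_spans.append([i])

    output = list(tokens)
    for span in word_spans:
        if random.random() < mask_prob:
            for i in span:          # mask every piece of the word, or none
                output[i] = mask_token
    return output

print(whole_word_mask(["the", "phil", "##am", "##mon", "is", "playing"]))
```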
Ahmad Barqawi
c4fe56dcc0 support latest multilingual BERT fine-tuning
fix issue with bert-base-multilingual and add support for uncased multilingual
2019-05-27 11:27:41 +02:00
tguens
9e7bc51b95
Update run_squad.py
Indentation change so that the output "nbest_predictions.json" is not empty.
2019-05-22 17:27:59 +08:00
samuelbroscheit
94247ad6cb Make num_train_optimization_steps int 2019-05-13 12:38:22 +02:00
samuel.broscheit
49a77ac16f Clean up a little bit 2019-05-12 00:31:10 +02:00
samuel.broscheit
3bf3f9596f Fixing the issues reported in https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
The reason for the issue was that optimization steps were computed from the example count, which differs from the actual length of the dataloader when an example is chunked into multiple instances.

Solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
2019-05-12 00:13:45 +02:00
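Put differently: `len(train_dataloader)` already counts the chunked instances per batch, so deriving the schedule length from it keeps warmup aligned with the real number of optimizer steps. A hedged sketch with illustrative numbers:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(1000, 8))   # 1000 training *instances*
train_dataloader = DataLoader(dataset, batch_size=32)
gradient_accumulation_steps, num_train_epochs = 2, 3

# Derive the schedule length from the dataloader, whose len() reflects any
# example chunking, rather than from len(train_examples).
num_train_optimization_steps = (
    len(train_dataloader) // gradient_accumulation_steps * num_train_epochs
)
print(num_train_optimization_steps)   # 48 optimizer steps
```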
burcturkoglu
00c7fd2b79 Removed the division of global_step by num_train_optimizer in lr_this_step. 2019-05-09 10:57:03 +03:00
burcturkoglu
fa37b4da77 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-05-09 10:55:24 +03:00
burcturkoglu
5289b4b9e0 Removed the division of global_step by num_train_optimizer in lr_this_step. 2019-05-09 10:51:38 +03:00
Thomas Wolf
0198399d84
Merge pull request #570 from MottoX/fix-1
Create optimizer only when args.do_train is True
2019-05-08 16:07:50 +02:00
MottoX
18c8aef9d3 Fix documentation typo 2019-05-02 19:23:36 +08:00
MottoX
74dbba64bc Prepare optimizer only when args.do_train is True 2019-05-02 19:09:29 +08:00
Aneesh Pappu
365fb34c6c small fix to remove shifting of lm labels during preprocessing of roc stories, as this shifting happens internally in the model 2019-04-30 13:53:04 -07:00
Thomas Wolf
2dee86319d
Merge pull request #527 from Mathieu-Prouveur/fix_value_training_loss
Update example files so that tr_loss is not affected by args.gradient…
2019-04-30 11:12:55 +02:00
Mathieu Prouveur
87b9ec3843 Fix tr_loss rescaling factor using global_step 2019-04-29 12:58:29 +02:00
Mathieu Prouveur
ed8fad7390 Update example files so that tr_loss is not affected by args.gradient_accumulation_step 2019-04-24 14:07:00 +02:00
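A toy illustration (made-up numbers) of why the displayed loss must account for gradient accumulation: `tr_loss` accumulates losses that were already divided by the accumulation factor, so dividing by `global_step` recovers the true mean mini-batch loss:

```python
gradient_accumulation_steps = 4
tr_loss, global_step = 0.0, 0

for step, batch_loss in enumerate([0.8, 0.6, 0.9, 0.7]):
    # each mini-batch loss is scaled down so accumulated gradients
    # average over the accumulation window
    tr_loss += batch_loss / gradient_accumulation_steps
    if (step + 1) % gradient_accumulation_steps == 0:
        global_step += 1            # one optimizer step per window

# tr_loss already carries the 1/accumulation factor, so dividing by the
# number of optimizer steps gives the mean mini-batch loss
print(tr_loss / global_step)        # 0.75
```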
thomwolf
d94c6b0144 fix training schedules in examples to match new API 2019-04-23 11:17:06 +02:00
Thomas Wolf
c36cca075a
Merge pull request #515 from Rocketknight1/master
Fix --reduce_memory in finetune_on_pregenerated
2019-04-23 10:30:23 +02:00
Matthew Carrigan
b8e2a9c584 Made --reduce_memory actually do something in finetune_on_pregenerated 2019-04-22 14:01:48 +01:00
Sangwhan Moon
14b1f719f4 Fix indentation weirdness in GPT-2 example. 2019-04-22 02:20:22 +09:00
Thomas Wolf
8407429d74
Merge pull request #494 from SudoSharma/patch-1
Fix indentation for unconditional generation
2019-04-17 11:11:36 +02:00
Ben Mann
87677fcc4d
[run_gpt2.py] temperature should be a float, not int 2019-04-16 15:23:21 -07:00
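The argparse type matters here: with `type=int`, a request like `--temperature 0.7` is rejected outright, leaving an integral temperature of 1 as the only usable value. A small sketch of the corrected declaration:

```python
import argparse

parser = argparse.ArgumentParser()
# temperature divides the logits before sampling, so it must accept
# fractional values such as 0.7; type=int would reject them.
parser.add_argument("--temperature", type=float, default=1.0)

print(parser.parse_args(["--temperature", "0.7"]).temperature)   # 0.7
```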
Abhi Sharma
07154dadb4
Fix indentation for unconditional generation 2019-04-16 11:11:49 -07:00
Thomas Wolf
3d78e226e6
Merge pull request #489 from huggingface/tokenization_serialization
Better serialization for Tokenizers and Configuration classes - Also fix #466
2019-04-16 08:49:54 +02:00
thomwolf
3571187ef6 fix saving models in distributed setting examples 2019-04-15 16:43:56 +02:00
thomwolf
2499b0a5fc add ptvsd to run_squad 2019-04-15 15:33:04 +02:00
thomwolf
7816f7921f clean up distributed training logging in run_squad example 2019-04-15 15:27:10 +02:00
thomwolf
1135f2384a clean up logger in examples for distributed case 2019-04-15 15:22:40 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
179a2c2ff6 update example to work with new serialization semantic 2019-04-15 14:33:23 +02:00
thomwolf
3e65f255dc add serialization semantics to tokenizers - fix transfo-xl tokenizer 2019-04-15 11:47:25 +02:00
Thomas Wolf
aff44f0c08
Merge branch 'master' into master 2019-04-15 10:58:34 +02:00
Thomas Wolf
bb61b747df
Merge pull request #474 from jiesutd/master
Fix tsv read error in Windows
2019-04-15 10:56:48 +02:00
Matthew Carrigan
dbbd6c7500 Replaced some randints with cleaner randranges, and added a helpful
error for users whose corpus is just one giant document.
2019-04-12 15:07:58 +01:00
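Two separate points are bundled in this commit: `random.randrange(a, b)` excludes `b`, matching Python's half-open indexing convention, and a single-document corpus cannot supply next-sentence negatives. A hedged sketch (the error message is illustrative):

```python
import random

docs = [["doc", "one"], ["doc", "two"]]
if len(docs) <= 1:
    raise ValueError(
        "The corpus must contain more than one document; next-sentence "
        "negatives are sampled from a *different* document."
    )

# randrange uses a half-open interval, like range() and slicing, so it
# replaces the more error-prone randint(0, len(docs) - 1).
idx = random.randrange(len(docs))
print(docs[idx])
```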
Thomas Wolf
616743330e
Merge pull request #462 from 8enmann/master
fix run_gpt2.py
2019-04-11 21:54:46 +02:00
Thomas Wolf
2cdfb8b254
Merge pull request #467 from yaroslavvb/patch-2
Update README.md
2019-04-11 21:53:23 +02:00
Jie Yang
c49ce3c722 fix tsv read error in Windows 2019-04-11 15:40:19 -04:00
thomwolf
4bc4c69af9 finetuning any BERT model - fixes #455 2019-04-11 16:57:59 +02:00
Yaroslav Bulatov
8fffba5f47
Update README.md
Fix for

```
04/09/2019 21:39:38 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 642, in <module>
    main()
  File "/home/ubuntu/pytorch-pretrained-BERT/examples/lm_finetuning/simple_lm_finetuning.py", line 502, in main
    raise ValueError("Training is currently the only implemented execution option. Please set `do_train`.")
ValueError: Training is currently the only implemented execution option. Please set `do_train`.
```
2019-04-09 14:45:47 -07:00
Benjamin Mann
fd8a3556f0 fix run_gpt2.py 2019-04-08 17:20:35 -07:00
Dhanajit Brahma
6c4c7be282 Merge remote-tracking branch 'upstream/master' 2019-04-07 16:59:36 +05:30
Dhanajit Brahma
4d3cf0d602 removing some redundant lines 2019-04-07 16:59:07 +05:30
Thomas Wolf
9ca25ce828
Merge pull request #427 from jeonsworld/patch-1
fix sample_doc
2019-04-03 11:26:58 +02:00
thomwolf
846b1fd6f8 Fix #419 2019-04-03 10:50:38 +02:00
Thomas Wolf
2f80dbbc0d
Merge pull request #430 from MottoX/master
Fix typo in example code
2019-04-02 10:41:56 +02:00
Mike Arpaia
8b5c63e4de Fixes to the TensorFlow conversion tool 2019-04-01 13:17:54 -06:00
Weixin Wang
d07db28f52
Fix typo in example code
Modify 'unambigiously' to 'unambiguously'
2019-03-31 01:20:18 +08:00
jeonsworld
60005f464d
Update pregenerate_training_data.py
If randint returns the value rand_end itself, searchsorted yields a sampled_doc_index equal to current_idx.

example:
cumsum_max = {int64} 30
doc_cumsum = {ndarray} [ 5  7 11 19 30]
doc_lengths = {list} <class 'list'>: [5, 2, 4, 8, 11]
if current_idx  = 1,
rand_start = 7
rand_end = 35
sentence_index = randint(7, 35) % cumsum_max
if randint returns 35, sentence_index becomes 5.
if sentence_index is 5, np.searchsorted returns 1, which equals current_idx.
2019-03-30 14:50:17 +09:00
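A hedged reconstruction of the corrected sampling, using the numbers from the commit message: with the half-open `randrange`, the draw can never hit `rand_end`, so the wrapped index can never land back inside the current document:

```python
import random
import numpy as np

doc_lengths = [5, 2, 4, 8, 11]
doc_cumsum = np.cumsum(doc_lengths)     # [ 5  7 11 19 30]
cumsum_max = int(doc_cumsum[-1])        # 30

def sample_other_doc(current_idx):
    """Sample a document index != current_idx, weighted by document length."""
    rand_start = int(doc_cumsum[current_idx])
    rand_end = rand_start + cumsum_max - doc_lengths[current_idx]
    # randrange excludes rand_end, so the modulo can never wrap back onto
    # the current document (the off-by-one of the inclusive randint).
    sentence_index = random.randrange(rand_start, rand_end) % cumsum_max
    return int(np.searchsorted(doc_cumsum, sentence_index, side="right"))

print(sample_other_doc(1))   # any index in {0, 2, 3, 4}, never 1
```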
dhanajitb
f872eb98c2
making unconditional generation work
The unconditional generation works now, but with a fixed seed the sample is the same every time.
n_samples > 1 will still give different samples.
The start token for unconditional generation is '<|endoftext|>'.
2019-03-28 22:46:15 +05:30
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
cc8c2d2332
Merge pull request #396 from IndexFziQ/IndexFziQ
add tqdm to the process of eval in examples/run_swag.py
2019-03-27 12:03:26 +01:00
thomwolf
361aff6de5 typos 2019-03-27 11:54:59 +01:00
thomwolf
cea8ba1d59 adjusted formating and some wording in the readme 2019-03-27 11:53:44 +01:00
Matthew Carrigan
24e67fbf75 Minor README update 2019-03-25 12:33:30 +00:00
Matthew Carrigan
8d1d1ffde2 Corrected the displayed loss when gradient_accumulation_steps > 1 2019-03-25 12:15:19 +00:00
Matthew Carrigan
abb7d1ff6d Added proper context management to ensure cleanup happens in the right
order.
2019-03-21 17:50:03 +00:00
Matthew Carrigan
06a30cfdf3 Added a --reduce_memory option to the training script to keep training
data on disc as a memmap rather than in memory
2019-03-21 17:04:12 +00:00
Matthew Carrigan
7d1ae644ef Added a --reduce_memory option to the training script to keep training
data on disc as a memmap rather than in memory
2019-03-21 17:02:18 +00:00
Matthew Carrigan
2bba7f810e Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory. 2019-03-21 16:50:16 +00:00
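Keeping pregenerated instances in a disk-backed array bounds RAM usage by the pages actually touched, not by the corpus size. A minimal sketch of the memmap idea with numpy (file name, shape, and dtype are illustrative):

```python
import numpy as np

num_instances, seq_len = 100_000, 128

# Disk-backed array: elements are paged in on access instead of holding
# all pregenerated training data in memory at once.
input_ids = np.memmap("train_input_ids.memmap", dtype=np.int32,
                      mode="w+", shape=(num_instances, seq_len))

input_ids[0] = np.zeros(seq_len, dtype=np.int32)   # write one instance
input_ids.flush()                                  # persist to disk
```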
Matthew Carrigan
8733ffcb5e Removing a couple of other old unnecessary comments 2019-03-21 14:09:57 +00:00
Matthew Carrigan
8a861048dd Fixed up the notes on a possible future low-memory path 2019-03-21 14:08:39 +00:00
Matthew Carrigan
a8a577ba93 Reduced memory usage for pregenerating the data a lot by writing it
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:05:52 +00:00
Matthew Carrigan
0ae59e662d Reduced memory usage for pregenerating the data a lot by writing it
out on the fly without shuffling - the Sampler in the finetuning script
will shuffle for us.
2019-03-21 14:04:17 +00:00
Matthew Carrigan
6a9038ba53 Removed an old irrelevant comment 2019-03-21 13:36:41 +00:00
Yuqiang Xie
77944d1b31
add tqdm to the process of eval
Maybe better.
2019-03-21 20:59:33 +08:00
Matthew Carrigan
29a392fbcf Small README changes 2019-03-20 17:35:17 +00:00
Matthew Carrigan
832b2b0058 Adding README 2019-03-20 17:31:49 +00:00
Matthew Carrigan
934d3f4d2f Syncing up argument names between the scripts 2019-03-20 17:23:23 +00:00
Matthew Carrigan
f19ba35b2b Move old finetuning script into the new folder 2019-03-20 16:47:06 +00:00
Matthew Carrigan
7de5c6aa5e PEP8 and formatting cleanups 2019-03-20 16:44:04 +00:00
Matthew Carrigan
1798e98e5a Added final TODOs 2019-03-20 16:42:37 +00:00
Matthew Carrigan
c64c2fc4c2 Fixed embarrassing indentation problem 2019-03-20 15:42:57 +00:00
Matthew Carrigan
0540d360f2 Fixed logging 2019-03-20 15:36:51 +00:00
Matthew Carrigan
976554a472 First commit of the new LM finetuning 2019-03-20 14:23:51 +00:00
Ananya Harsh Jha
e5b63fb542 Merge branch 'master' of https://github.com/ananyahjha93/pytorch-pretrained-BERT
pull current master to local
2019-03-17 08:30:13 -04:00
Ananya Harsh Jha
8a4e90ff40 corrected folder creation error for MNLI-MM, verified GLUE results 2019-03-17 08:16:50 -04:00
Ananya Harsh Jha
e0bf01d9a9 added hack for mismatched MNLI 2019-03-16 14:10:48 -04:00
Ananya Harsh Jha
4c721c6b6a added eval time metrics for GLUE tasks 2019-03-15 23:21:24 -04:00
tseretelitornike
83857ffeaa
Added missing imports. 2019-03-15 12:45:48 +01:00
Yongbo Wang
d1e4fa98a9
typo in annotation
modify `heruistic` to `heuristic` in line 660, `charcter` to `character` in line 661.
2019-03-14 17:32:15 +08:00
Yongbo Wang
3d6452163d
typo
modify `mull` to `null` in line 474 annotation.
2019-03-14 17:03:38 +08:00
thomwolf
a98dfe4ced fixing #377 (empty nbest_predictions.json) 2019-03-14 09:57:06 +01:00
Ananya Harsh Jha
043c8781ef added code for all glue task processors 2019-03-14 04:24:04 -04:00
Yongbo Wang
22a465a91f
Simplify code, delete redundant line
delete the redundant line `if args.train`, simplify code.
2019-03-13 09:42:06 +08:00
Elon Musk
66d8206809
Update run_gpt2.py 2019-03-08 11:59:08 -05:00
thomwolf
7cc35c3104 fix openai gpt example and updating readme 2019-03-06 11:43:21 +01:00
thomwolf
994d86609b fixing PYTORCH_PRETRAINED_BERT_CACHE use in examples 2019-03-06 10:21:24 +01:00
thomwolf
5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Thomas Wolf
8e36da7acb
Merge pull request #347 from jplehmann/feature/sst2-processor
Processor for SST-2 task
2019-03-06 09:48:27 +01:00
Thomas Wolf
3c01dfb775
Merge pull request #338 from CatalinVoss/patch-3
Fix top k generation for k != 0
2019-03-06 09:47:33 +01:00
John Lehmann
0f96d4b1f7 Run classifier processor for SST-2. 2019-03-05 13:38:28 -06:00
Catalin Voss
4b4b079272
Fix top k generation for k != 0 2019-03-02 21:54:44 -08:00
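A hedged sketch of top-k logit filtering in the spirit of run_gpt2.py's helper (an illustration of the technique, not the exact patch): everything below the k-th largest logit per row is pushed to -inf so it receives zero probability after softmax:

```python
import torch

def top_k_logits(logits, k):
    """Keep only the k highest logits per row; k == 0 disables filtering."""
    if k == 0:
        return logits
    values, _ = torch.topk(logits, k)
    min_values = values[:, -1].unsqueeze(-1)   # k-th largest logit per row
    return torch.where(logits < min_values,
                       torch.full_like(logits, float("-inf")),
                       logits)

logits = torch.tensor([[1.0, 3.0, 2.0, 0.5]])
print(top_k_logits(logits, 2))   # tensor([[-inf, 3., 2., -inf]])
```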
Catalin Voss
c0cf0a04d5
Fix typo 2019-02-27 18:01:06 -08:00
Ben Johnson
8607233679
Update run_openai_gpt.py 2019-02-20 13:58:54 -05:00
thomwolf
0202da0271 remove unnecessary example 2019-02-18 13:51:42 +01:00
thomwolf
690a0dbf36 fix example - masking 2019-02-18 10:50:30 +01:00
thomwolf
fbb248a2e4 examples testing 2019-02-18 01:28:18 +01:00
thomwolf
b65f07d8c0 adding examples 2019-02-18 00:55:33 +01:00
wlhgtc
8efaf8f176
fix 'best_non_null_entry' is None error 2019-02-15 15:57:25 +08:00
Davide Fiocco
65df0d78ed
--do_lower_case is duplicated in parser args
Deleting one repetition (please review!)
2019-02-13 15:30:05 +01:00
Thomas Wolf
03cdb2a390
Merge pull request #254 from huggingface/python_2
Adding OpenAI GPT and Transformer-XL models, compatibility with Python 2
2019-02-11 14:19:26 +01:00
thomwolf
d38caba169 typo in run_squad 2019-02-11 14:10:27 +01:00
thomwolf
af62cc5f20 fix run_squad example 2019-02-11 14:06:32 +01:00
thomwolf
eebc8abbe2 clarify and unify model saving logic in examples 2019-02-11 14:04:19 +01:00
thomwolf
32fea876bb add distant debugging to run_transfo_xl 2019-02-11 12:53:32 +01:00
thomwolf
b31ba23913 cuda on in the examples by default 2019-02-11 12:15:43 +01:00
thomwolf
6cd769957e update transfo xl example 2019-02-09 16:59:17 +01:00
thomwolf
1320e4ec0c mc_token_mask => mc_token_ids 2019-02-09 16:58:53 +01:00
thomwolf
f4a07a392c mems not split 2019-02-09 16:14:31 +01:00
thomwolf
43b9af0cac mems initialized to None in run_transfo 2019-02-09 16:12:19 +01:00
thomwolf
b80684b23f fixing run openai gpt example 2019-02-08 22:31:32 +01:00
thomwolf
7b4b0cf966 logging 2019-02-08 11:16:29 +01:00
thomwolf
4bbb9f2d68 log loss - helpers 2019-02-08 11:14:29 +01:00
thomwolf
5d7e845712 fix model on cuda 2019-02-08 11:08:43 +01:00
thomwolf
eccb2f0163 hot fix 2019-02-08 11:05:20 +01:00
thomwolf
5adc20723b add distant debugging 2019-02-08 11:03:59 +01:00
thomwolf
777459b471 run openai example running 2019-02-08 10:33:14 +01:00
thomwolf
6bc082da0a updating examples 2019-02-08 00:02:26 +01:00
thomwolf
e77721e4fe renamed examples 2019-02-07 23:15:15 +01:00
thomwolf
d482e3d79d adding examples for openai and transformer-xl 2019-02-07 17:06:41 +01:00
tholor
9aebc711c9 adjust error message related to args.do_eval 2019-02-07 11:49:38 +01:00
tholor
4a450b25d5 removing unused argument eval_batch_size from LM finetuning #256 2019-02-07 10:06:38 +01:00
Baoyang Song
7ac3311e48
Fix the undefined variable in squad example 2019-02-06 19:36:08 +01:00
thomwolf
ed47cb6cba fixing transfo eval script 2019-02-06 16:22:17 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2 2019-02-06 00:13:20 +01:00
thomwolf
448937c00d python 2 compatibility 2019-02-06 00:07:46 +01:00
thomwolf
d609ba24cb resolving merge conflicts 2019-02-05 16:14:25 +01:00
Thomas Wolf
64ce900974
Merge pull request #248 from JoeDumoulin/squad1.1-fix
fix prediction on run-squad.py example
2019-02-05 16:00:51 +01:00
Thomas Wolf
e9e77cd3c4
Merge pull request #218 from matej-svejda/master
Fix learning rate problems in run_classifier.py
2019-02-05 15:40:44 +01:00
thomwolf
1579c53635 more explicit notation: num_train_step => num_train_optimization_steps 2019-02-05 15:36:33 +01:00
joe dumoulin
aa90e0c36a fix prediction on run-squad.py example 2019-02-01 10:15:44 -08:00
Thomas Wolf
8f8bbd4a4c
Merge pull request #244 from deepset-ai/prettify_lm_masking
Avoid confusion of inplace LM masking
2019-02-01 12:17:50 +01:00
tholor
ce75b169bd avoid confusion of inplace masking of tokens_a / tokens_b 2019-01-31 11:42:06 +01:00
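Masking the token lists in place mutates the caller's `tokens_a`/`tokens_b`, which is easy to misread; masking a copy keeps the original sequences intact. A sketch of the safer pattern (logic is illustrative, not the script's exact code):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Return a masked copy plus LM labels; the caller's list is untouched."""
    output = list(tokens)            # copy instead of mutating in place
    labels = [-1] * len(tokens)      # -1 = position is not a prediction target
    for i in range(len(tokens)):
        if random.random() < mask_prob:
            labels[i] = tokens[i]    # the real script stores the vocab id here
            output[i] = mask_token
    return output, labels

tokens_a = ["the", "cat", "sat"]
masked_a, lm_labels = mask_tokens(tokens_a, mask_prob=0.5)
assert tokens_a == ["the", "cat", "sat"]   # original sequence is unchanged
```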
Surya Kasturi
9bf528877e
Update run_squad.py 2019-01-30 15:09:31 -05:00
Surya Kasturi
af2b78601b
Update run_squad2.py 2019-01-30 15:08:56 -05:00
Matej Svejda
5169069997 make examples consistent, revert error in num_train_steps calculation 2019-01-30 11:47:25 +01:00
Matej Svejda
9c6a48c8c3 fix learning rate/fp16 and warmup problem for all examples 2019-01-27 14:07:24 +01:00
Matej Svejda
01ff4f82ba learning rate problems in run_classifier.py 2019-01-22 23:40:06 +01:00
liangtaiwan
be9fa192f0 don't save if do not train 2019-01-18 00:41:55 +08:00
thomwolf
a28dfc8659 fix eval for wt103 2019-01-16 11:18:19 +01:00
thomwolf
8831c68803 fixing various parts of model conversion, loading and weights sharing 2019-01-16 10:31:16 +01:00
thomwolf
bcd4aa8fe0 update evaluation example 2019-01-15 23:32:34 +01:00
thomwolf
a69ec2c722 improved corpus and tokenization conversion - added evaluation script 2019-01-15 23:17:46 +01:00
Thomas Wolf
4e0cba1053
Merge pull request #191 from nhatchan/20190113_py35_finetune
lm_finetuning compatibility with Python 3.5
2019-01-14 09:40:07 +01:00
nhatchan
6c65cb2492 lm_finetuning compatibility with Python 3.5
dicts are not ordered in Python 3.5 or prior, which is the cause of #175.
This PR replaces the dict with a list to preserve its order.
2019-01-13 21:09:13 +09:00
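For illustration (entry names are made up): on Python 3.5 and earlier, dict iteration order is arbitrary, so order-dependent logic can behave differently between runs; a list of pairs is ordered on every version:

```python
# A list of tuples preserves insertion order on all Python versions,
# unlike a dict on Python <= 3.5.
ordered_entries = [
    ("first_chunk", "epoch_0.json"),
    ("second_chunk", "epoch_1.json"),
]
for name, path in ordered_entries:   # deterministic iteration order
    print(name, "->", path)
```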
Li Dong
a2da2b4109
[bug fix] args.do_lower_case is always True
The "default=True" makes args.do_lower_case always True.

```python
parser.add_argument("--do_lower_case",
                        default=True,
                        action='store_true')
```
2019-01-13 19:51:11 +08:00
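With `action='store_true'`, argparse already defaults the flag to False when it is absent; adding `default=True` overrides that, so the flag can never be switched off. A sketch of the corrected declaration:

```python
import argparse

parser = argparse.ArgumentParser()
# 'store_true' implies default=False; passing --do_lower_case flips it on.
parser.add_argument("--do_lower_case", action="store_true")

print(parser.parse_args([]).do_lower_case)                    # False
print(parser.parse_args(["--do_lower_case"]).do_lower_case)   # True
```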
tholor
506e5bb0c8 add do_lower_case arg and adjust model saving for lm finetuning. 2019-01-11 08:32:46 +01:00
Thomas Wolf
e485829a41
Merge pull request #174 from abeljim/master
Added Squad 2.0
2019-01-10 23:40:45 +01:00
Sang-Kil Park
64326dccfb
Fix it to run properly even without the --do_train param.
Modified similarly to `run_classifier.py`, and fixed to run properly even without the `--do_train` param.
2019-01-10 21:51:39 +09:00
thomwolf
e5c78c6684 update readme and few typos 2019-01-10 01:40:00 +01:00
thomwolf
fa5222c296 update readme 2019-01-10 01:25:28 +01:00
Unknown
b3628f117e Added Squad 2.0 2019-01-08 15:13:13 -08:00
thomwolf
ab90d4cddd adding docs and example for OpenAI GPT 2019-01-09 00:12:43 +01:00
thomwolf
2e4db64cab add do_lower_case tokenizer loading option in run_squad and fine-tuning examples 2019-01-07 13:06:42 +01:00
thomwolf
c9fd350567 remove default when action is store_true in arguments 2019-01-07 13:01:54 +01:00
Thomas Wolf
d3d56f9a0b
Merge pull request #166 from likejazz/patch-1
Fix error when `bert_model` param is path or url.
2019-01-07 12:40:55 +01:00
Thomas Wolf
766c6b2ce3
Merge pull request #159 from jaderabbit/master
Allow do_eval to be used without do_train and to use the pretrained model in the output folder
2019-01-07 12:31:06 +01:00
Thomas Wolf
77966a43a4
Merge pull request #156 from rodgzilla/cl_args_doc
Adding new pretrained model to the help of the `bert_model` argument.
2019-01-07 12:27:16 +01:00
Thomas Wolf
2e8c5c00ec
Merge pull request #141 from SinghJasdeep/patch-1
loading saved model when n_classes != 2
2019-01-07 12:21:13 +01:00
Sang-Kil Park
ca4e7aaa72
Fix error when bert_model param is path or url.
An error occurs when the `bert_model` param is a path or URL. Therefore, if it is a path, use only its last component to prevent the error.
2019-01-05 11:42:54 +09:00
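Assuming the fix follows the usual pattern (the path below is a placeholder), taking only the last path component yields a clean model name for derived filenames:

```python
import os

bert_model = "/home/user/models/bert-base-uncased/"
# When --bert_model is a filesystem path, keep only its last component for
# naming derived files; the full path would nest or break the filenames.
model_name = os.path.basename(bert_model.rstrip("/"))
print(model_name)   # bert-base-uncased
```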
Jade Abbott
193e2df8ba Remove rogue comment 2019-01-03 13:13:06 +02:00
Jade Abbott
c64de50ea4 nb_tr_steps is not initialized 2019-01-03 12:34:57 +02:00
Jade Abbott
b96149a19b Training loss is not initialized if only do_eval is specified 2019-01-03 10:32:10 +02:00
Jade Abbott
be3b9bcf4d Allow one to use the pretrained model in evaluation when do_train is not selected 2019-01-03 09:02:33 +02:00
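The underlying bug in these three commits: the training statistics were only defined inside the `do_train` branch, so an eval-only run crashed when logging them. A hedged sketch of the initialize-up-front pattern (values and logging are illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--do_train", action="store_true")
parser.add_argument("--do_eval", action="store_true")
args = parser.parse_args(["--do_eval"])

# Initialize training stats unconditionally so an eval-only run can still
# log them without raising a NameError.
tr_loss, nb_tr_steps, global_step = 0.0, 0, 0

if args.do_train:
    tr_loss, nb_tr_steps = 1.23, 100    # would be produced by the train loop

if args.do_eval:
    loss = tr_loss / nb_tr_steps if args.do_train else None
    print({"global_step": global_step, "loss": loss})
```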
Grégory Châtel
186f75342e Adding new pretrained model to the help of the bert_model argument. 2019-01-02 14:00:59 +01:00
Jasdeep Singh
99709ee61d
loading saved model when n_classes != 2
Required for: Assertion `t >= 0 && t < n_classes` failed, when your number of classes is not the default 2.
2018-12-20 13:55:47 -08:00
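The classifier head's shape must match before the fine-tuned weights can be loaded, and the label count must match at loss time. A hedged sketch against the era's pytorch-pretrained-BERT API (label count and checkpoint path are placeholders):

```python
import torch
from pytorch_pretrained_bert import BertForSequenceClassification

num_labels = 5   # must match the trained head, not the default of 2

# Rebuild the architecture with the right class count *before* loading the
# fine-tuned weights; otherwise CrossEntropyLoss asserts with
# "t >= 0 && t < n_classes" as soon as a label id >= 2 appears.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=num_labels
)
model.load_state_dict(torch.load("output/pytorch_model.bin"))
```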
tholor
e5fc98c542 add example training data. update to nvidia apex. refactor 'item -> line in doc' mapping. add warning for unknown words. 2018-12-20 18:30:52 +01:00
deepset
a58361f197
Add example for fine tuning BERT language model (#1)
Adds an example of loading a pre-trained BERT model and fine-tuning it as a language model (masked tokens & nextSentence) on your target corpus.
2018-12-18 10:32:25 +01:00
thomwolf
ae88eb88a4 set encoding to 'utf-8' in calls to open 2018-12-14 13:48:58 +01:00
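Passing an explicit encoding makes file reads deterministic across platforms instead of depending on the locale's default codec; a minimal sketch:

```python
with open("sample.txt", "w", encoding="utf-8") as f:
    f.write("héllo wörld\n")

# Explicit utf-8 keeps reads platform-independent (Windows would otherwise
# fall back to a locale codec such as cp1252).
with open("sample.txt", "r", encoding="utf-8") as f:
    print(f.read())
```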
thomwolf
e1eab59aac no fp16 on evaluation 2018-12-13 14:54:02 +01:00
thomwolf
087798b7fa fix reloading model for evaluation in examples 2018-12-13 14:48:12 +01:00
thomwolf
0f544625f4 fix swag example for work with apex 2018-12-13 13:35:59 +01:00
thomwolf
0cf88ff084 make examples work without apex 2018-12-13 13:28:00 +01:00
thomwolf
d3fcec1a3e add saving and loading model in examples 2018-12-13 12:50:44 +01:00
thomwolf
b3caec5a56 adding save checkpoint and loading in examples 2018-12-13 12:48:13 +01:00
Thomas Wolf
91aab2a6d3
Merge pull request #116 from FDecaYed/deyuf/fp16_with_apex
Change to use apex for better fp16 and multi-gpu support
2018-12-13 12:32:37 +01:00
Thomas Wolf
ffe9075f48
Merge pull request #96 from rodgzilla/multiple-choice-code
BertForMultipleChoice and Swag dataset example.
2018-12-13 12:05:11 +01:00
Deyu Fu
c8ea286048 change to apex for better fp16 and multi-gpu support 2018-12-11 17:13:58 -08:00
Thomas Wolf
e622790a93
Merge pull request #91 from rodgzilla/convert-examples-code-improvement
run_classifier.py improvements
2018-12-11 05:12:04 -05:00
Grégory Châtel
df34f22854 Removing the dependency to pandas and using the csv module to load data. 2018-12-10 17:45:23 +01:00
Grégory Châtel
d429c15f25 Removing old code from copy-paste. 2018-12-06 19:19:21 +01:00
Grégory Châtel
63c45056aa Finishing the code for the Swag task. 2018-12-06 18:53:05 +01:00
Grégory Châtel
c45d8ac554 Storing the feature of each choice as a dict for readability. 2018-12-06 16:01:28 +01:00
Grégory Châtel
0812aee2c3 Fixing problems in convert_examples_to_features. 2018-12-06 15:53:07 +01:00
Grégory Châtel
f2b873e995 convert_examples_to_features code and small improvements. 2018-12-06 15:40:47 +01:00
Grégory Châtel
83fdbd6043 Adding read_swag_examples to load the dataset. 2018-12-06 14:02:46 +01:00
Grégory Châtel
7183cded4e SwagExample class. 2018-12-06 13:39:44 +01:00
Grégory Châtel
fa7daa247d Fixing the commentary of the SquadExample class. 2018-12-06 13:14:33 +01:00
Grégory Châtel
a994bf4076 Fixing related to issue #83. 2018-12-05 18:16:30 +01:00
Grégory Châtel
c6d9d5394e Simplifying code for easier understanding. 2018-12-05 17:53:09 +01:00
Grégory Châtel
793262e8ec Removing trailing whitespaces. 2018-12-05 17:52:39 +01:00
Davide Fiocco
e60e8a6068
Correct assignment for logits in classifier example
I tried to address https://github.com/huggingface/pytorch-pretrained-BERT/issues/76
should be correct, but there's likely a more efficient way.
2018-12-02 12:38:26 +01:00
Davide Fiocco
dc13e276ee
Point typo fix 2018-12-01 01:02:16 +01:00
thomwolf
89d47230d7 clean up classification model output 2018-11-30 22:54:53 +01:00
thomwolf
c588453a0f fix run_squad 2018-11-30 14:22:40 +01:00
thomwolf
0541442558 add do_lower_case in examples 2018-11-30 13:47:33 +01:00
Li Li
0aaedcc02f Bug fix in examples; correct t_total for distributed training; run prediction for full dataset 2018-11-27 01:08:37 -08:00
thomwolf
32167cdf4b remove convert_to_unicode and printable_text from examples 2018-11-26 23:33:22 +01:00
thomwolf
05053d163c update cache_dir in readme and examples 2018-11-26 10:45:13 +01:00
thomwolf
6b2136a8a9 fixing weights decay in run_squad example 2018-11-20 10:12:44 +01:00
Thomas Wolf
061eeca84a
Merge pull request #32 from xiaoda99/master
Fix ineffective no_decay bug when using BERTAdam
2018-11-20 10:11:46 +01:00
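The essence of this bug: testing parameter names with exact equality (`n not in no_decay`) never matches full names like `bert.encoder.layer.0.output.dense.bias`, so weight decay was silently applied everywhere. A hedged sketch of the substring-matching fix, using plain Adam to stay self-contained:

```python
import torch

class Head(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dense = torch.nn.Linear(4, 4)
        self.LayerNorm = torch.nn.LayerNorm(4)

model = Head()
no_decay = ["bias", "LayerNorm"]

# Substring match on parameter names, so "dense.bias" and
# "LayerNorm.weight" really end up in the no-decay group.
grouped = [
    {"params": [p for n, p in model.named_parameters()
                if not any(nd in n for nd in no_decay)],
     "weight_decay": 0.01},
    {"params": [p for n, p in model.named_parameters()
                if any(nd in n for nd in no_decay)],
     "weight_decay": 0.0},
]
optimizer = torch.optim.Adam(grouped, lr=3e-5)
```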
thomwolf
2f21497d3e fixing param.grad is None in fp16 examples 2018-11-20 10:01:21 +01:00
xiaoda99
6c4789e4e8
Fix ineffective no_decay bug 2018-11-18 16:16:21 +08:00
thomwolf
27ee0fff3c add no_cuda args in extract_features 2018-11-17 23:04:44 +01:00
thomwolf
aa50fd196f remove unused arguments in example scripts 2018-11-17 23:01:05 +01:00
thomwolf
47a7d4ec14 update examples from master 2018-11-17 12:21:35 +01:00
thomwolf
c8cba67742 clean up readme and examples 2018-11-17 12:19:16 +01:00
thomwolf
757750d6f6 fix tests 2018-11-17 11:58:14 +01:00
thomwolf
4e46affc34 updating examples 2018-11-17 10:30:54 +01:00
thomwolf
cba85a67b9 fix nan in optimizer_on_cpu 2018-11-15 21:47:41 +01:00
thomwolf
1de35b624b preparing for first release 2018-11-15 20:56:10 +01:00