transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 20:18:24 +06:00

Author	SHA1	Message	Date
Thomas Wolf	7e17f09fb5	Merge pull request #1803 from importpandas/fix-xlnet-squad2.0 fix run_squad.py during fine-tuning xlnet on squad2.0	2019-12-21 13:38:48 +01:00
thomwolf	8a2be93b4e	fix merge	2019-12-21 13:31:28 +01:00
Thomas Wolf	562f864038	Merge branch 'master' into fix-xlnet-squad2.0	2019-12-21 12:48:10 +01:00
Thomas Wolf	8618bf15d6	Merge pull request #1736 from huggingface/fix-tf-xlnet Fix TFXLNet	2019-12-21 12:42:05 +01:00
Thomas Wolf	2fa8737c44	Merge pull request #1586 from enzoampil/include_special_tokens_in_bert_examples Add special tokens to documentation for bert examples to resolve issue: #1561	2019-12-21 12:36:11 +01:00
Thomas Wolf	f15f087143	Merge pull request #1764 from DomHudson/bug-fix-1761 Bug-fix: Roberta Embeddings Not Masked	2019-12-21 12:13:27 +01:00
Thomas Wolf	fae4d1c266	Merge pull request #2217 from aaugustin/test-parallelization Support running tests in parallel	2019-12-21 11:54:23 +01:00
Aymeric Augustin	b8e924e10d	Restore test. This looks like debug code accidentally committed in `b18509c2`. Refs #2250.	2019-12-21 08:50:15 +01:00
Aymeric Augustin	767bc3ca68	Fix typo in model name. This looks like a copy/paste mistake. Probably this test was never run. Refs #2250.	2019-12-21 08:46:26 +01:00
Aymeric Augustin	343c094f21	Run examples separately from tests. This optimizes the total run time of the Circle CI test suite.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	80caf79d07	Prevent excessive parallelism in PyTorch. We're already using as many processes in parallel as we have CPU cores. Furthermore, the number of cores may be incorrectly calculated as 36 (we've seen this in pytest-xdist), which compounds the problem. PyTorch performance craters without this.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	bb3bfa2d29	Distribute tests from the same file to the same worker. This should prevent two issues: - hitting API rate limits for tests that hit the HF API - multiplying the cost of expensive test setups	2019-12-21 08:43:19 +01:00
Aymeric Augustin	29cbab98f0	Parallelize tests on Circle CI. Set the number of CPUs manually based on the Circle CI resource class, or else we're getting 36 CPUs, which is far too much (perhaps that's the underlying hardware and not what Circle CI allocates to us). Don't parallelize the custom tokenizers tests because they take less than one second to run and parallelization actually makes them slower.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	a4c9338b83	Prevent parallel downloads of the same file with a lock. Since the file is written to the filesystem, a filesystem lock is the way to go here. Add a dependency on the third-party filelock library to get cross-platform functionality.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	b670c26684	Take advantage of the cache when running tests. Caching models across test cases and across runs of the test suite makes slow tests somewhat more bearable. Use gettempdir() instead of /tmp in tests. This makes it easier to change the location of the cache with semi-standard TMPDIR/TEMP/TMP environment variables. Fix #2222.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	b67fa1a8d2	Download models directly to cache_dir. This allows moving the file instead of copying it, which is more reliable. Also it avoids writing large amounts of data to /tmp, which may not be large enough to accommodate it. Refs #2222.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	286d5bb6b7	Use a random temp dir for writing pruned models in tests.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	478e456e83	Use a random temp dir for writing file in tests.	2019-12-21 08:43:19 +01:00
Aymeric Augustin	12726f8556	Remove redundant torch.jit.trace in tests. This looks like it could be expensive, so don't run it twice.	2019-12-21 08:43:19 +01:00
Julien Chaumond	ac1b449cc9	[doc] move distilroberta to more appropriate place cc @lysandrejik	2019-12-21 00:09:01 -05:00
Julien Chaumond	3e52915fa7	[RoBERTa] Embeddings: fix dimensionality bug	2019-12-20 19:01:27 -05:00
Dom Hudson	228f52867c	Bug fix: 1764	2019-12-20 18:27:35 -05:00
Francesco	a80778f40e	small refactoring (only esthetic, not functional)	2019-12-20 17:21:24 -05:00
Francesco	3df1d2d144	- Create the output directory (whose name is passed by the user in the "save_directory" parameter), where the encoder and decoder will be saved, if it does not exist. - Empty the output directory, if it contains any files or subdirectories. - Create the "encoder" directory inside "save_directory", if it does not exist. - Create the "decoder" directory inside "save_directory", if it does not exist. - Save the encoder and the decoder in the previous two directories, respectively.	2019-12-20 17:21:24 -05:00
Lysandre	a436574bfd	Release: v2.3.0	2019-12-20 16:22:20 -05:00
Thomas Wolf	d0f8b9a978	Merge pull request #2244 from huggingface/fix-tok-pipe Fix Camembert and XLM-R `decode` method - Fix NER pipeline alignment	2019-12-20 22:10:39 +01:00
Thomas Wolf	a557836a70	Merge pull request #2191 from huggingface/fix_sp_np Numpy compatibility for sentence piece	2019-12-20 22:08:08 +01:00
thomwolf	655fd06853	clean up	2019-12-20 21:57:49 +01:00
thomwolf	e5812462fc	clean up debug and less verbose tqdm	2019-12-20 21:51:48 +01:00
thomwolf	4775ec354b	add overwrite - fix ner decoding	2019-12-20 21:47:15 +01:00
Lysandre	cb6d54bfda	Numpy compatibility for sentence piece convert to int earlier	2019-12-20 15:06:28 -05:00
thomwolf	f79a7dc661	fix NER pipeline	2019-12-20 20:57:45 +01:00
thomwolf	a241011057	fix pipeline NER	2019-12-20 20:43:48 +01:00
thomwolf	e37ca8e11a	fix camembert and XLM-R tokenizer	2019-12-20 20:43:42 +01:00
thomwolf	ceae85ad60	fix mc loading	2019-12-20 19:52:24 +01:00
thomwolf	71883b6ddc	update link in readme	2019-12-20 19:40:23 +01:00
Thomas Wolf	8d5a47c79b	Merge pull request #2243 from huggingface/fix-xlm-roberta fixing xlm-roberta tokenizer max_length and automodels	2019-12-20 19:34:08 +01:00
thomwolf	79e4a6a25c	update serving API	2019-12-20 19:33:12 +01:00
thomwolf	bbaaec046c	fixing CLI pipeline	2019-12-20 19:19:20 +01:00
thomwolf	1c12ee0e55	fixing xlm-roberta tokenizer max_length and automodels	2019-12-20 18:28:27 +01:00
Lysandre	65c75fc587	Clean special tokens test	2019-12-20 11:34:16 -05:00
Lysandre	fb393ad994	Added test for all special tokens	2019-12-20 11:29:58 -05:00
Dirk Groeneveld	90debb9ff2	Keep even the first of the special tokens intact while lowercasing.	2019-12-20 11:29:43 -05:00
Morgan Funtowicz	b98ff88544	Added pipelines quick tour in README	2019-12-20 15:52:50 +01:00
Thomas Wolf	3a2c4e6f63	Merge pull request #1548 from huggingface/cli [2.2] - Command-line interface - Pipeline class	2019-12-20 15:28:29 +01:00
Rémi Louf	4e3f745ba4	add example for Model2Model in quickstart	2019-12-20 09:12:31 -05:00
thomwolf	db0795b5d0	defaults models for tf and pt - update tests	2019-12-20 15:07:00 +01:00
Morgan Funtowicz	7f74084528	Fix leading axis added when saving through the command run	2019-12-20 14:47:04 +01:00
thomwolf	c37815f130	clean up PT <=> TF 2.0 conversion and config loading	2019-12-20 14:35:40 +01:00
thomwolf	73fcebf7ec	update serving command	2019-12-20 13:47:35 +01:00

15053 Commits