Commit Graph

19383 Commits

Author SHA1 Message Date
Aymeric Augustin
1efa0a7552 Add black-compatible flake8 configuration. 2019-12-22 10:59:07 +01:00
Aymeric Augustin
d0c9fe277a Fix circular import in transformers.pipelines.
Submodules shouldn't import from their parent in general.
2019-12-22 10:59:07 +01:00
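
A minimal, self-contained sketch of the cycle this commit fixes (the package name pkg and the names HELPER/run are hypothetical, not the actual transformers code): a submodule that imports from its parent package runs while the parent is still only partially initialized.

    import os
    import sys
    import tempfile
    import textwrap

    # Build a throwaway package where the parent imports the child
    # and the child imports back from the parent.
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, "pkg")
    os.makedirs(pkg_dir)

    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(textwrap.dedent("""\
            from pkg.sub import run   # parent -> child
            HELPER = "helper"
        """))

    with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
        f.write(textwrap.dedent("""\
            from pkg import HELPER    # child -> parent: pkg is half-built
            def run():
                return HELPER
        """))

    sys.path.insert(0, root)
    try:
        import pkg
    except ImportError as exc:
        print("circular import:", exc)

Having sub.py import from a sibling module instead of from pkg itself breaks the cycle, which is the rule this commit applies.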
Aymeric Augustin
5ca054757f Update "make style" to sort imports with isort. 2019-12-22 10:59:07 +01:00
Aymeric Augustin
9e80fc7b2f Enforce isort in CI.
We need https://github.com/timothycrosley/isort/pull/1000, but there's no
release with this fix yet, so we'll install from GitHub.
2019-12-22 10:59:00 +01:00
Aymeric Augustin
158e82e061 Sort imports with isort.
This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
upura
9d00f78f16 fix doc link 2019-12-22 16:07:05 +09:00
Daniil Larionov
b668a740ca
Fixing incorrect link in model docstring
The docstring contains a link to the Salesforce/CTRL repo, while the model itself is Facebookresearch/mmbt. It is likely a copy/paste mistake.
2019-12-22 00:01:14 +03:00
Aymeric Augustin
bc1715c1e0 Add black-compatible isort configuration.
lines_after_imports = 2 is a matter of taste; I like it.
2019-12-21 17:53:18 +01:00
Aymeric Augustin
36883c1192 Add "make style" to format code with black. 2019-12-21 17:53:18 +01:00
Aymeric Augustin
6e5291a915 Enforce black in CI. 2019-12-21 17:53:18 +01:00
Aymeric Augustin
fa84ae26d6 Reformat source code with black.
This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There are a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, which make the code easier to understand.
2019-12-21 17:52:29 +01:00
Aymeric Augustin
63e3827c6b Remove empty file.
Likely it was added by accident.
2019-12-21 15:38:08 +01:00
Thomas Wolf
645713e2cb
Merge pull request #2254 from huggingface/fix-tfroberta
adding positional embeds masking to TFRoBERTa
2019-12-21 15:33:22 +01:00
Thomas Wolf
73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf
77676c27d2 adding positional embeds masking to TFRoBERTa 2019-12-21 15:24:48 +01:00
thomwolf
344126fe58 move example to mm-imdb folder 2019-12-21 15:06:52 +01:00
Thomas Wolf
5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf
6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
[BREAKING CHANGE] Setting all ignored index to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf
1ab25c49d3 Merge branch 'master' into pr/2115 2019-12-21 14:54:30 +01:00
thomwolf
b03872aae0 fix merge 2019-12-21 14:49:54 +01:00
Thomas Wolf
518ba748e0
Merge branch 'master' into saving-and-resuming 2019-12-21 14:41:39 +01:00
Thomas Wolf
18601c3b6e
Merge pull request #2173 from erenup/master
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf
6e7102cfb3
Merge pull request #2203 from gthb/patch-1
fix: wrong architecture count in README
2019-12-21 14:31:44 +01:00
Thomas Wolf
deceb00161
Merge pull request #2177 from mandubian/issue-2106
:zip: #2106 tokenizer.tokenize speed improvement (3-8x) by caching added_tokens in a Set
2019-12-21 14:31:20 +01:00
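
A hedged sketch of the optimization this PR title describes (the Tokenizer class here is illustrative, not the actual transformers implementation): membership tests against a Python list are O(n) per lookup, while a set is O(1), and tokenize() performs that check for every candidate token.

    class Tokenizer:
        def __init__(self, added_tokens):
            self.added_tokens = list(added_tokens)
            # Cache as a set once, instead of scanning the list on
            # every tokenize() call.
            self._added_tokens_set = set(self.added_tokens)

        def add_tokens(self, tokens):
            self.added_tokens.extend(tokens)
            self._added_tokens_set.update(tokens)

        def tokenize(self, pieces):
            # O(1) membership test per piece instead of O(n).
            return [p for p in pieces if p in self._added_tokens_set]

    tok = Tokenizer(["<mask>", "<sep>"])
    print(tok.tokenize(["hello", "<mask>"]))  # ['<mask>']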
Thomas Wolf
eeb70cdd77
Merge branch 'master' into saving-and-resuming 2019-12-21 14:29:59 +01:00
Thomas Wolf
ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf
f86ed23189 update doc 2019-12-21 14:13:06 +01:00
thomwolf
cfa0380515 Merge branch 'master' into generation_sampler 2019-12-21 14:12:52 +01:00
thomwolf
300ec3003c fixing run_generation example - using torch.no_grad 2019-12-21 14:02:19 +01:00
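
A minimal sketch of the fix named in this commit (the model and inputs are placeholders): wrapping generation in torch.no_grad() stops autograd from recording every forward pass, which otherwise accumulates activation memory across sampling steps.

    import torch

    model = torch.nn.Linear(10, 10)   # placeholder for a language model
    generated = torch.randn(1, 10)    # placeholder for the context

    # Inference only: with gradient tracking disabled, activations are
    # freed after each step instead of being kept for backward().
    with torch.no_grad():
        for _ in range(20):
            generated = model(generated)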
thomwolf
1c37746892 fixing run_generation 2019-12-21 13:52:49 +01:00
Thomas Wolf
7e17f09fb5
Merge pull request #1803 from importpandas/fix-xlnet-squad2.0
fix run_squad.py during fine-tuning xlnet on squad2.0
2019-12-21 13:38:48 +01:00
thomwolf
8a2be93b4e fix merge 2019-12-21 13:31:28 +01:00
Thomas Wolf
562f864038
Merge branch 'master' into fix-xlnet-squad2.0 2019-12-21 12:48:10 +01:00
Thomas Wolf
8618bf15d6
Merge pull request #1736 from huggingface/fix-tf-xlnet
Fix TFXLNet
2019-12-21 12:42:05 +01:00
Thomas Wolf
2fa8737c44
Merge pull request #1586 from enzoampil/include_special_tokens_in_bert_examples
Add special tokens to documentation for bert examples to resolve issue: #1561
2019-12-21 12:36:11 +01:00
Thomas Wolf
f15f087143
Merge pull request #1764 from DomHudson/bug-fix-1761
Bug-fix: Roberta Embeddings Not Masked
2019-12-21 12:13:27 +01:00
Thomas Wolf
fae4d1c266
Merge pull request #2217 from aaugustin/test-parallelization
Support running tests in parallel
2019-12-21 11:54:23 +01:00
Aymeric Augustin
b8e924e10d Restore test.
This looks like debug code accidentally committed in b18509c2.

Refs #2250.
2019-12-21 08:50:15 +01:00
Aymeric Augustin
767bc3ca68 Fix typo in model name.
This looks like a copy/paste mistake. Probably this test was never run.

Refs #2250.
2019-12-21 08:46:26 +01:00
Aymeric Augustin
343c094f21 Run examples separately from tests.
This optimizes the total run time of the Circle CI test suite.
2019-12-21 08:43:19 +01:00
Aymeric Augustin
80caf79d07 Prevent excessive parallelism in PyTorch.
We're already using as many processes in parallel as we have CPU cores.
Furthermore, the number of cores may be incorrectly calculated as 36
(we've seen this in pytest-xdist), which compounds the problem.

PyTorch performance craters without this.
2019-12-21 08:43:19 +01:00
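
A hedged sketch of one standard way to impose such a cap (the commit's exact mechanism isn't shown here; torch.set_num_threads is an assumption): with one test process per core already running, each process should keep its own intra-op thread pool at a single thread.

    import torch

    # N worker processes x N intra-op threads oversubscribes the CPUs;
    # cap each worker's own thread pool instead.
    torch.set_num_threads(1)
    print(torch.get_num_threads())  # 1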
Aymeric Augustin
bb3bfa2d29 Distribute tests from the same file to the same worker.
This should prevent two issues:

- hitting API rate limits for tests that hit the HF API
- multiplying the cost of expensive test setups
2019-12-21 08:43:19 +01:00
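
pytest-xdist's --dist=loadfile mode implements exactly this distribution rule. A sketch of invoking it programmatically (the test path is a placeholder; running it requires pytest and pytest-xdist installed):

    import pytest

    # -n 4: four worker processes; --dist=loadfile: all tests from one
    # file land on the same worker, so a file's expensive setup (or its
    # API rate-limit budget) is paid once rather than per worker.
    exit_code = pytest.main(["-n", "4", "--dist=loadfile", "tests/"])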
Aymeric Augustin
29cbab98f0 Parallelize tests on Circle CI.
Set the number of CPUs manually based on the Circle CI resource class,
or else we're getting 36 CPUs, which is far too many (perhaps that's
the underlying hardware and not what Circle CI allocates to us).

Don't parallelize the custom tokenizers tests because they take less
than one second to run and parallelization actually makes them slower.
2019-12-21 08:43:19 +01:00
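
A hedged sketch of pinning the worker count rather than trusting auto-detection (the environment variable name and default are illustrative):

    import os
    import pytest

    # '-n auto' would detect the host's 36 CPUs; read an explicit count
    # wired up in the CI config instead.
    n_cpus = os.environ.get("CI_CPUS", "8")
    pytest.main(["-n", n_cpus, "tests/"])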
Aymeric Augustin
a4c9338b83 Prevent parallel downloads of the same file with a lock.
Since the file is written to the filesystem, a filesystem lock is the
way to go here. Add a dependency on the third-party filelock library to
get cross-platform functionality.
2019-12-21 08:43:19 +01:00
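
A minimal sketch using the filelock library this commit adds (the target path is a placeholder): the lock file lives next to the file being written, so all processes resolving the same download serialize on it.

    import os
    from filelock import FileLock

    target = "/tmp/models/pytorch_model.bin"  # placeholder path
    os.makedirs(os.path.dirname(target), exist_ok=True)

    # A second process reaching this point blocks until the first one
    # releases the lock, then finds the file present and skips the work.
    with FileLock(target + ".lock"):
        if not os.path.exists(target):
            open(target, "wb").close()  # stand-in for the real download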
Aymeric Augustin
b670c26684 Take advantage of the cache when running tests.
Caching models across test cases and across runs of the test suite makes
slow tests somewhat more bearable.

Use gettempdir() instead of /tmp in tests. This makes it easier to
change the location of the cache with semi-standard TMPDIR/TEMP/TMP
environment variables.

Fix #2222.
2019-12-21 08:43:19 +01:00
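
A sketch of the relocatable cache path described here (the subdirectory name is illustrative): gettempdir() consults TMPDIR, TEMP, and TMP, so the cache moves with the environment instead of being pinned to /tmp.

    import os
    from tempfile import gettempdir

    # e.g. TMPDIR=/big-disk/tmp relocates this for the whole test run.
    cache_dir = os.path.join(gettempdir(), "transformers_test_cache")
    os.makedirs(cache_dir, exist_ok=True)
    print(cache_dir)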
Aymeric Augustin
b67fa1a8d2 Download models directly to cache_dir.
This allows moving the file instead of copying it, which is more
reliable. It also avoids writing large amounts of data to /tmp,
which may not be large enough to accommodate it.

Refs #2222.
2019-12-21 08:43:19 +01:00
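
A hedged sketch of the pattern this commit describes (file names are placeholders): creating the temporary file inside cache_dir keeps it on the same filesystem as its final destination, so the last step is an atomic rename rather than a copy out of /tmp.

    import os
    import tempfile

    cache_dir = os.path.join(tempfile.gettempdir(), "model_cache")
    os.makedirs(cache_dir, exist_ok=True)
    final_path = os.path.join(cache_dir, "pytorch_model.bin")

    # Temp file in cache_dir itself: same filesystem, so os.replace is
    # an atomic move and readers never see a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(b"...")  # stand-in for the streamed download
    os.replace(tmp_path, final_path)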
Aymeric Augustin
286d5bb6b7 Use a random temp dir for writing pruned models in tests. 2019-12-21 08:43:19 +01:00
Aymeric Augustin
478e456e83 Use a random temp dir for writing files in tests. 2019-12-21 08:43:19 +01:00
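
A sketch of the per-test temp directory pattern behind these two commits (the file content is illustrative): each test gets its own random directory, so parallel workers can't clobber each other's output, and cleanup is automatic.

    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "pruned_model.bin")
        with open(path, "wb") as f:
            f.write(b"weights")  # stand-in for the saved model
        assert os.path.exists(path)
    # The directory and its contents are gone here.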
Aymeric Augustin
12726f8556 Remove redundant torch.jit.trace in tests.
This looks like it could be expensive, so don't run it twice.
2019-12-21 08:43:19 +01:00
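
A hedged sketch of why the duplicate call is worth removing (model and inputs are placeholders): torch.jit.trace executes a full forward pass to record the graph, so tracing the same model twice repeats that work for an identical result.

    import torch

    model = torch.nn.Linear(4, 4)   # placeholder model
    example = torch.randn(1, 4)

    # trace() runs the model once to record the graph; do it a single
    # time and reuse the traced module.
    traced = torch.jit.trace(model, example)
    out = traced(example)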
Julien Chaumond
ac1b449cc9 [doc] move distilroberta to more appropriate place
cc @lysandrejik
2019-12-21 00:09:01 -05:00