Commit Graph

19383 Commits

Author SHA1 Message Date
patrickvonplaten
7bb4271291 remove ipdb debugging statements 2019-12-25 23:17:24 +01:00
patrickvonplaten
267587c258 add and improve comments 2019-12-25 23:17:24 +01:00
patrickvonplaten
d891fd0ae0 add past hidden key states for more efficient language generation & add prepare_inputs for gpt2 and ctrl model 2019-12-25 23:17:24 +01:00
Thomas Wolf
aeef4823ab
Merge pull request #2303 from patrickvonplaten/fix_error_with_repetition_penalty
fix repetition penalty error in modeling_utils.py
2019-12-25 22:39:20 +01:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Thomas Wolf
8742c95461
Merge pull request #2289 from patrickvonplaten/fix_effective_batch_size_lang_gen_xlm
fix bug in prepare inputs for language generation for xlm for effective batch_size > 1
2019-12-25 22:30:46 +01:00
Thomas Wolf
1240be3ed9
Merge pull request #2312 from vitaliyradchenko/fix_special_and_add_tokens_loading
Correct tokenization for special and added tokens
2019-12-25 20:52:30 +01:00
vitaliyradchenko
b262577d17 add special tokens to unique_added_tokens_encoder 2019-12-25 18:31:35 +02:00
vitaliyradchenko
83a2347952 fixed lack of added and special tokens 2019-12-25 18:03:19 +02:00
Thomas Wolf
cea04a2443
Merge pull request #2310 from ShnitzelKiller/scatter-unfix
revert erroneous fix #2276
2019-12-25 12:43:22 +01:00
James Noeckel
e1844d9a45 use positional arguments due to inconsistent API 2019-12-25 01:34:02 -08:00
James Noeckel
9fb7addd4d revert erroneous fix 2019-12-24 22:26:09 -08:00
Anthony MOI
734d29b03d
tokenizers is now a real dependency 2019-12-24 13:32:41 -05:00
Anthony MOI
2818e50569
Add tests for fast tokenizers 2019-12-24 13:29:01 -05:00
Anthony MOI
31c56f2e0b
Fix style 2019-12-24 12:43:27 -05:00
Anthony MOI
951ae99bea
BertTokenizerFast 2019-12-24 12:24:24 -05:00
Anthony MOI
041eac2d6d
GPT2TokenizerFast 2019-12-24 12:24:14 -05:00
Anthony MOI
3471ff0d35
FastPreTrainedTokenizer 2019-12-24 12:23:30 -05:00
patrickvonplaten
18e5bdbec5 fix repetition penalty error in modeling_utils.py 2019-12-24 17:18:05 +01:00
patrickvonplaten
f18ac4c28e fix sequence length for prepare_inputs for xlnet 2019-12-24 16:43:24 +01:00
patrickvonplaten
359dc43837 fix effective batch_size error in prepare_inputs also for xlnet 2019-12-24 16:33:20 +01:00
patrickvonplaten
d98a384cb0 fix bug in prepare inputs for language generation for xlm for effective batch_size > 1 2019-12-24 16:29:54 +01:00
thomwolf
3e0cf49514 adding back last dropout in TF 2.0 T5 2019-12-24 11:30:56 +01:00
thomwolf
35d32308de adding back final dropout in T5 2019-12-24 11:29:49 +01:00
Thomas Wolf
81db12c3ba
Merge pull request #2271 from aaugustin/improve-setup-and-requirements
Improve setup and requirements
2019-12-24 11:21:20 +01:00
Aymeric Augustin
10724a8123 Run the slow tests every Monday morning. 2019-12-24 09:09:43 +01:00
Aymeric Augustin
a8d34e534e Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.

If a user copy-pastes  command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
e74c73a85d Enable F841 warning in flake8. 2019-12-23 22:38:23 +01:00
Aymeric Augustin
e6c0019c80 Remove unused variables in tests. 2019-12-23 22:38:18 +01:00
Aymeric Augustin
495580dad1 Remove unused variables in templates. 2019-12-23 22:38:18 +01:00
Aymeric Augustin
71f94a8a1c Remove unused variables in src. 2019-12-23 22:38:09 +01:00
Aymeric Augustin
81422c4e6d Remove unused variables in examples. 2019-12-23 22:29:02 +01:00
Aymeric Augustin
072750f4dc
Merge pull request #2288 from aaugustin/better-handle-optional-imports
Improve handling of optional imports
2019-12-23 22:28:47 +01:00
Aymeric Augustin
4621ad6f9d Use the same pattern as everywhere else.
This is really just for consistency.
2019-12-23 21:30:04 +01:00
Aymeric Augustin
a31d4a2971 Reraise ImportError when sentencepiece isn't installed.
Else, the next line fails with a confusion exception because the spm
variable isn't defined.
2019-12-23 21:27:42 +01:00
Aymeric Augustin
c8b0c1e551 Improve exception type.
ImportError isn't really appropriate when there's no import involved.
2019-12-23 21:27:38 +01:00
Aymeric Augustin
4c09a96096 Simplify re-raising exceptions.
Most module use the simpler `raise` version. Normalize those that don't.
2019-12-23 21:20:54 +01:00
Aymeric Augustin
5565dcdd35 Remove warning when scikit-learn isn't available.
Most users don't need it.
2019-12-23 21:16:26 +01:00
Aymeric Augustin
8a6881822a Run some tests on Python 3.7.
This will improve version coverage.
2019-12-23 21:06:23 +01:00
Aymeric Augustin
7a865821d9 Remove stray egg-info directory automatically.
If a user or contributor ran `pip install -e .` on transformers < 3.0,
pip created a transformers.egg-info directory next to the transformers
directory at the root of the repository.

In transformers 3.0, the source is in a `src` subdirectory.
`pip install -e .` creates a transformers.egg-info directory there.
However, pip will still pick transformers.egg-info from the previous
location. This is a bug: https://github.com/pypa/pip/issues/5466

Users and contributors are likely to hit this problem because the
documentation for transformers 3.0 relies heavily on extra_requires
which didn't exist in earlier versions, so aren't defined in a stale
transformers.egg-info directory.

If such a directory exists, remove it. It's autogenerated, gitignored
and not supposed to contain anything of value.
2019-12-23 21:06:23 +01:00
Aymeric Augustin
70373a5f7c Update contribution instructions.
Also provide shortcuts in a Makefile.
2019-12-23 21:05:30 +01:00
Aymeric Augustin
c3783399db Remove redundant requirements with transformers. 2019-12-23 19:17:27 +01:00
Aymeric Augustin
d79e9c9a9a Remove docs/requirements.txt.
It's superseded by the "docs" extras.
2019-12-23 19:17:07 +01:00
Aymeric Augustin
d73eb552e8 Remove requirements.txt.
It's redundant with setup.py and, also, incomplete (e.g. numpy).
2019-12-23 19:15:08 +01:00
Aymeric Augustin
9fcc532df6 Remove requirements-dev.txt.
It was generated once, likely in a non-reproducible way (pip freeze
in a contributor's local environment), and never updated.
2019-12-23 19:14:36 +01:00
Aymeric Augustin
76a1417f2a Include all optional dependencies in extras.
Take advantage of this to simplify the Circle CI configuration.

Don't bother with tensorboardX: it's a fallback for PyTorch < 1.1.0.
2019-12-23 19:14:31 +01:00
Aymeric Augustin
9fc8dcb2a0 Standardize import.
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin
f2522869ea Review and update setup.py. 2019-12-23 18:45:42 +01:00
Alan deLevie
7cef764ec0
Typo in tokenization_utils.py
avoir -> avoid
2019-12-23 12:14:50 -05:00
Aymeric Augustin
23dad8447c Install deps from setup.py for building docs.
requirements.txt isn't up to date.
2019-12-23 17:06:32 +01:00