Anthony MOI
e6ec24fa88
Better added_tokens handling
2019-12-26 16:49:48 -05:00
Anthony MOI
599db139f9
Code style update
2019-12-26 15:13:30 -05:00
Anthony MOI
835b76a46f
Handle unk_token
As we discussed, this is handled here directly
cc @thomwolf
2019-12-26 14:42:55 -05:00
Anthony MOI
7ead04ce14
FastPreTrainedTokenizer => PreTrainedTokenizerFast
2019-12-26 14:39:39 -05:00
Anthony MOI
1f82a5d910
Update for changes in tokenizers API
2019-12-26 14:37:55 -05:00
Thomas Wolf
8c67b529f6
Merge pull request #2324 from kashif/patch-1
Typo in serving.py
2019-12-26 12:38:06 +01:00
Kashif Rasul
7211541ade
Typo in serving.py
2019-12-26 12:21:40 +01:00
patrickvonplaten
0f6017bee3
improve comments for examples
2019-12-26 00:35:11 +01:00
patrickvonplaten
87c8fca9bc
add example for ctrl text generation in docs
2019-12-26 00:29:19 +01:00
patrickvonplaten
88def24c45
merge conflicts - renamed to previous_token singular
2019-12-26 00:27:16 +01:00
patrickvonplaten
822f725a07
duplicated line for repeating_words_penalty_for_language_generation
2019-12-26 00:25:29 +01:00
patrickvonplaten
fc84bd5254
adapt style to predefined style layout
2019-12-25 23:32:44 +01:00
patrickvonplaten
deff792bb6
add prepare inputs for transfo_xl and xlnet
2019-12-25 23:17:24 +01:00
patrickvonplaten
9398058e19
add easy tensor shape match test
2019-12-25 23:17:24 +01:00
patrickvonplaten
90cda45e9e
add past re-ordering for beam search
2019-12-25 23:17:24 +01:00
patrickvonplaten
6bca56fdb0
check for self.config.mem_len instead of self.mem_len in _do_output_past
2019-12-25 23:17:24 +01:00
patrickvonplaten
365ccd0af2
make if statements cleaner for prepare_inputs_for_generation
2019-12-25 23:17:24 +01:00
patrickvonplaten
d039c679d2
better naming for if statement
2019-12-25 23:17:24 +01:00
patrickvonplaten
7e0c5c731a
changed do_output_past function to check for self.config.output_past instead of self.output_past
2019-12-25 23:17:24 +01:00
patrickvonplaten
eeaa402cd4
rename comments
2019-12-25 23:17:24 +01:00
patrickvonplaten
7bb4271291
remove ipdb debugging statements
2019-12-25 23:17:24 +01:00
patrickvonplaten
267587c258
add and improve comments
2019-12-25 23:17:24 +01:00
patrickvonplaten
d891fd0ae0
add past hidden key states for more efficient language generation & add prepare_inputs for gpt2 and ctrl model
2019-12-25 23:17:24 +01:00
Thomas Wolf
aeef4823ab
Merge pull request #2303 from patrickvonplaten/fix_error_with_repetition_penalty
fix repetition penalty error in modeling_utils.py
2019-12-25 22:39:20 +01:00
Thomas Wolf
0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Thomas Wolf
8742c95461
Merge pull request #2289 from patrickvonplaten/fix_effective_batch_size_lang_gen_xlm
fix bug in prepare inputs for language generation for xlm for effective batch_size > 1
2019-12-25 22:30:46 +01:00
Thomas Wolf
1240be3ed9
Merge pull request #2312 from vitaliyradchenko/fix_special_and_add_tokens_loading
Correct tokenization for special and added tokens
2019-12-25 20:52:30 +01:00
vitaliyradchenko
b262577d17
add special tokens to unique_added_tokens_encoder
2019-12-25 18:31:35 +02:00
vitaliyradchenko
83a2347952
fixed lack of added and special tokens
2019-12-25 18:03:19 +02:00
Thomas Wolf
cea04a2443
Merge pull request #2310 from ShnitzelKiller/scatter-unfix
revert erroneous fix #2276
2019-12-25 12:43:22 +01:00
James Noeckel
e1844d9a45
use positional arguments due to inconsistent API
2019-12-25 01:34:02 -08:00
James Noeckel
9fb7addd4d
revert erroneous fix
2019-12-24 22:26:09 -08:00
Anthony MOI
734d29b03d
tokenizers is now a real dependency
2019-12-24 13:32:41 -05:00
Anthony MOI
2818e50569
Add tests for fast tokenizers
2019-12-24 13:29:01 -05:00
Anthony MOI
31c56f2e0b
Fix style
2019-12-24 12:43:27 -05:00
Anthony MOI
951ae99bea
BertTokenizerFast
2019-12-24 12:24:24 -05:00
Anthony MOI
041eac2d6d
GPT2TokenizerFast
2019-12-24 12:24:14 -05:00
Anthony MOI
3471ff0d35
FastPreTrainedTokenizer
2019-12-24 12:23:30 -05:00
patrickvonplaten
18e5bdbec5
fix repetition penalty error in modeling_utils.py
2019-12-24 17:18:05 +01:00
patrickvonplaten
f18ac4c28e
fix sequence length for prepare_inputs for xlnet
2019-12-24 16:43:24 +01:00
patrickvonplaten
359dc43837
fix effective batch_size error in prepare_inputs also for xlnet
2019-12-24 16:33:20 +01:00
patrickvonplaten
d98a384cb0
fix bug in prepare inputs for language generation for xlm for effective batch_size > 1
2019-12-24 16:29:54 +01:00
thomwolf
3e0cf49514
adding back last dropout in TF 2.0 T5
2019-12-24 11:30:56 +01:00
thomwolf
35d32308de
adding back final dropout in T5
2019-12-24 11:29:49 +01:00
Thomas Wolf
81db12c3ba
Merge pull request #2271 from aaugustin/improve-setup-and-requirements
Improve setup and requirements
2019-12-24 11:21:20 +01:00
Aymeric Augustin
10724a8123
Run the slow tests every Monday morning.
2019-12-24 09:09:43 +01:00
Aymeric Augustin
a8d34e534e
Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.
If a user copy-pastes a command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forward, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
e74c73a85d
Enable F841 warning in flake8.
2019-12-23 22:38:23 +01:00
Aymeric Augustin
e6c0019c80
Remove unused variables in tests.
2019-12-23 22:38:18 +01:00
Aymeric Augustin
495580dad1
Remove unused variables in templates.
2019-12-23 22:38:18 +01:00