transformers/tests
David 042f420364
Update pipeline word heuristic to work with whitespace in token offsets (#18402)
* Update pipeline word heuristic to work with whitespace in token offsets

This change checks for whitespace in the input string at either the
character preceding the token or in the first character of the token.
This works with tokenizers that return offsets excluding whitespace
between words or with offsets including whitespace.

fixes #18111

starting

* Use smaller model, ensure expected tokenization

* Re-run CI (please squash)
2022-08-02 15:31:01 -04:00
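The whitespace check described in the commit message above can be illustrated with a minimal sketch. It is not the pipeline's actual code: the helper name `is_subword` and its signature are hypothetical, and it assumes each token carries a character start offset into the original sentence. A token is treated as continuing the previous word unless whitespace appears at the character just before it or at its first character.

```python
def is_subword(sentence: str, start: int) -> bool:
    """Hypothetical helper sketching the heuristic from the commit message.

    `start` is the token's start offset in `sentence` (an assumption about
    how the tokenizer reports offsets).
    """
    if start == 0:
        # The first token of the input always begins a word.
        return False
    # Look at the character before the token and the token's first character.
    window = sentence[start - 1 : start + 1]
    return not any(ch.isspace() for ch in window)


sentence = "Hello world"
print(is_subword(sentence, 6))  # offsets exclude the space: "world" -> False
print(is_subword(sentence, 5))  # offsets include the space: " world" -> False
print(is_subword(sentence, 2))  # token starting mid-word: "llo" -> True
```

Checking both positions in one two-character window is what lets the same rule cover tokenizers whose offsets skip the whitespace between words and tokenizers whose offsets include it.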
benchmark [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
deepspeed deprecate is_torch_bf16_available (#17738) 2022-06-20 08:40:11 -04:00
extended Update self-push workflow (#17177) 2022-05-13 16:28:00 +02:00
fixtures add a warning in SpmConverter for sentencepiece's model using the byte fallback feature (#16629) 2022-04-11 11:06:10 +02:00
generation Generate: deprecate default max_length (#18018) 2022-07-23 18:02:03 +01:00
models Adding fine-tuning models to LUKE (#18353) 2022-08-01 11:09:47 -04:00
onnx add ONNX support for LeVit (#18154) 2022-07-18 15:17:07 +02:00
optimization [Test refactor 1/5] Per-folder tests reorganization (#15725) 2022-02-23 15:46:28 -05:00
pipelines Update pipeline word heuristic to work with whitespace in token offsets (#18402) 2022-08-02 15:31:01 -04:00
sagemaker Black preview (#17217) 2022-05-12 16:25:55 -04:00
tokenization fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
trainer Enable torchdynamo with torch_tensorrt(fx path) (#17765) 2022-07-13 12:43:28 -04:00
utils Update serving code to enable saved_model=True (#18153) 2022-07-22 18:05:38 +01:00
__init__.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
test_configuration_common.py Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00
test_feature_extraction_common.py Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00
test_modeling_common.py Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00
test_modeling_flax_common.py Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00
test_modeling_tf_common.py Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00
test_sequence_feature_extraction_common.py Some tests misusing assertTrue for comparisons fix (#16771) 2022-04-19 14:44:08 +02:00
test_tokenization_common.py Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00