transformers/tests/models/layoutlmv3
Thibault Douzon 4e441e529c
fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695)
LayoutLMv3TokenizerFast produces empty 'Ġ' token with `offset_mapping = (0, 0)`.
Next token is wrongly assumed to also be beginning of word and isn't
correctly assigned `pad_token_label`.
Modify test with text that produce 'Ġ' token.
Remove copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`.

solves issue: #19978
2023-04-03 10:32:36 -04:00
..
__init__.py
test_image_processing_layoutlmv3.py Update quality tooling for formatting (#21480) 2023-02-06 18:10:56 -05:00
test_modeling_layoutlmv3.py Move is_pipeline_test_to_skip to specific model test classes (#21999) 2023-03-14 10:03:02 +01:00
test_modeling_tf_layoutlmv3.py Move is_pipeline_test_to_skip to specific model test classes (#21999) 2023-03-14 10:03:02 +01:00
test_processor_layoutlmv3.py Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
test_tokenization_layoutlmv3.py fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695) 2023-04-03 10:32:36 -04:00