mirror of
https://github.com/huggingface/transformers.git
synced 2025-08-02 11:11:05 +06:00
![]() LayoutLMv3TokenizerFast produces empty 'Ġ' token with `offset_mapping = (0, 0)`. Next token is wrongly assumed to also be beginning of word and isn't correctly assigned `pad_token_label`. Modify test with text that produce 'Ġ' token. Remove copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`. solves issue: #19978 |
||
---|---|---|
.. | ||
__init__.py | ||
test_image_processing_layoutlmv3.py | ||
test_modeling_layoutlmv3.py | ||
test_modeling_tf_layoutlmv3.py | ||
test_processor_layoutlmv3.py | ||
test_tokenization_layoutlmv3.py |