transformers/docs/source/en
NielsRogge f3d2f7a6e0
Add MarkupLM (#19198)
* First draft

* Make basic test work

* Fix most tokenizer tests

* More improvements

* Make more tests pass

* Fix more tests

* Fix some code quality

* Improve truncation

* Implement feature extractor

* Improve feature extractor and add tests

* Improve feature extractor tests

* Fix pair_input test partly

* Add fast tokenizer

* Improve implementation

* Fix rebase

* Fix rebase

* Fix most of the tokenizer tests.

* propose solution for fast

* add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer

* add: modify markuplmconverter

* add: some modify on converter and tokenizerfast

* Fix style, copies

* Make fixup

* Update tokenization_markuplm.py

* Update test_tokenization_markuplm.py

* Update markuplm related

* Improve processor, add integration test

* Add processor test file

* Improve processor

* Improve processor tests

* Fix more processor tests

* Fix processor tests

* Update docstrings

* Add Copied from statements

* Add more Copied from statements

* Add code examples

* Improve code examples

* Add model to doc tests

* Adding dependency check

* Add dummy file

* Add requires_backends

* Add model to toctree

* Fix more things, disable dependency check for now

* Apply more suggestions

* Add soft dependency

* Add annotators to tests

* Fix style

* Remove from_slow=True

* Remove print statements

* Add sanity check

* Fix processor test

* Fix processor tests, add more docs

* Add doc tests for mdx file

* Add more tips

* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: lockon-n <dd098309@126.com>
2022-09-30 08:25:43 +02:00
internal Allow from transformers import TypicalLogitsWarper (#17477) 2022-06-03 11:08:35 +02:00
main_classes Fix a broken link for deepspeed ZeRO inference in the docs (#19001) 2022-09-14 16:21:06 -07:00
model_doc Add MarkupLM (#19198) 2022-09-30 08:25:43 +02:00
tasks Use repo_type instead of deprecated datasets repo IDs (#19202) 2022-09-26 09:50:48 -04:00
_config.py Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
_toctree.yml Add MarkupLM (#19198) 2022-09-30 08:25:43 +02:00
accelerate.mdx update to use interlibrary links instead of Markdown (#18500) 2022-08-08 10:53:52 -05:00
add_new_model.mdx Rewrite push_to_hub to use upload_files (#18366) 2022-08-01 12:07:30 -04:00
add_new_pipeline.mdx Update add_new_pipeline.mdx (#18224) 2022-07-21 07:55:30 +02:00
autoclass_tutorial.mdx Mention TF and Flax checkpoints (#18894) 2022-09-05 11:09:39 +02:00
benchmarks.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
bertology.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
big_models.mdx Add link to existing documentation (#17931) 2022-07-04 04:13:05 -04:00
community.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
converting_tensorflow_models.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
create_a_model.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
custom_models.mdx Fix some typos. (#17560) 2022-07-11 05:00:13 -04:00
debugging.mdx [doc] debug: fix import (#19042) 2022-09-14 16:29:58 -07:00
fast_tokenizers.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
glossary.mdx [doc] fix anchors (#18591) 2022-08-12 10:49:59 -07:00
hpo_train.mdx add doc for hyperparameter search (#19192) 2022-09-27 07:51:51 -04:00
index.mdx Add MarkupLM (#19198) 2022-09-30 08:25:43 +02:00
installation.mdx Move cache folder to huggingface/hub for consistency with hf_hub (#18492) 2022-08-05 13:14:00 -04:00
migration.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
model_sharing.mdx Just re-reading the whole doc every couple of months 😬 (#18489) 2022-08-06 09:38:55 +02:00
model_summary.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
multilingual.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
perf_hardware.mdx [WIP] [doc] performance/scalability revamp (#15723) 2022-05-16 13:36:41 +02:00
perf_infer_cpu.mdx Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153) 2022-06-14 07:56:47 -04:00
perf_infer_gpu_many.mdx Update perf_infer_gpu_many.mdx (#18744) 2022-08-24 10:37:52 +02:00
perf_infer_gpu_one.mdx [bnb] Move documentation (#18671) 2022-08-18 17:34:48 +02:00
perf_infer_special.mdx Improve performance docs (#17750) 2022-06-23 14:51:54 +02:00
perf_train_cpu_many.mdx update perf_train_cpu_many doc (#19151) 2022-09-22 09:20:15 -04:00
perf_train_cpu.mdx Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138) 2022-06-08 09:41:57 -04:00
perf_train_gpu_many.mdx Improve performance docs (#17750) 2022-06-23 14:51:54 +02:00
perf_train_gpu_one.mdx Update perf_train_gpu_one.mdx (#18442) 2022-09-05 14:06:36 +02:00
perf_train_special.mdx Improve performance docs (#17750) 2022-06-23 14:51:54 +02:00
perf_train_tpu.mdx Improve performance docs (#17750) 2022-06-23 14:51:54 +02:00
performance.mdx Improve performance docs (#17750) 2022-06-23 14:51:54 +02:00
perplexity.mdx Fix incorrect size of input for 1st strided window length in Perplexity of fixed-length models (#18906) 2022-09-06 15:20:12 -04:00
philosophy.mdx Update philosophy to include other preprocessing classes (#18550) 2022-08-10 13:20:39 -05:00
pipeline_tutorial.mdx fix pipeline_tutorial.mdx doctest (#18717) 2022-08-24 05:38:03 -04:00
pr_checks.mdx 📝 update documentation build section (#18548) 2022-08-09 18:22:55 -05:00
preprocessing.mdx Focus doc around preprocessing classes (#18768) 2022-09-28 17:09:44 -07:00
quicktour.mdx Skip some doctests in quicktour (#18927) 2022-09-07 14:45:22 -07:00
run_scripts.mdx Just re-reading the whole doc every couple of months 😬 (#18489) 2022-08-06 09:38:55 +02:00
sagemaker.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
serialization.mdx Add support for conditional detr (#18948) 2022-09-22 09:45:04 +02:00
task_summary.mdx Just re-reading the whole doc every couple of months 😬 (#18489) 2022-08-06 09:38:55 +02:00
testing.mdx Fix some typos. (#17560) 2022-07-11 05:00:13 -04:00
tokenizer_summary.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
training.mdx Update TF fine-tuning docs (#18654) 2022-09-07 13:30:07 +01:00
troubleshooting.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00