transformers/docs/source/en
NielsRogge 4ef0abb738
Add TAPEX (#16473)
* Add TapexTokenizer

* Improve docstrings and provide option to provide answer

* Remove option for pretokenized inputs

* Add TAPEX to README

* Fix copies

* Remove option for pretokenized inputs

* Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.

* - Draft a README file for running the script and introducing some background.
- Remove unused code lines in tabfact script.
- Disable the default `pad_to_max_length` option, which is memory-consuming.

* * Support `as_target_tokenizer` function for TapexTokenizer.
* Fix the do_lower_case behaviour of TapexTokenizer.
* Add unit tests for target scenarios and cased/uncased scenarios for both source and target.

* * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
* Fix typos in tapex example README.

* * fix the evaluation script - remove the property `task_name`

* * Make the label space clearer for tabfact tasks

* * Use a new fine-tuning script for tapex-base on tabfact.

* * Remove the lowercasing code outside the tokenizer - we use the tokenizer's do_lower_case option to control lowercasing
* Ensure the hyper-parameters run without out-of-memory errors on a 16GB card and report the newly reproduced number on wikisql

* * Remove the default tokenizer_name option.
* Provide evaluation command.

* * Support for WikiTableQuestion dataset.

* Fix a typo in README.

* * Fix the dataset's key name in WikiTableQuestions

* Run make fixup and move test to folder

* Fix quality

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply some more suggestions from code review

* Improve docstrings

* Overwrite failing test

* Improve comment in example scripts

* Fix rebase

* Add TAPEX to Auto mapping

* Add TAPEX to auto config mappings

* Put TAPEX higher than BART in auto mapping

* Add TAPEX to doc tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: SivilTaram <qianlxc@outlook.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-08 10:57:51 +02:00
..
internal TF generate refactor - Beam Search (#16374) 2022-04-06 18:19:34 +01:00
main_classes Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
model_doc Add TAPEX (#16473) 2022-04-08 10:57:51 +02:00
tasks Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
_config.py Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
_toctree.yml Add TAPEX (#16473) 2022-04-08 10:57:51 +02:00
accelerate.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
add_new_model.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
add_new_pipeline.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
autoclass_tutorial.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
benchmarks.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
bertology.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
community.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
converting_tensorflow_models.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
create_a_model.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
custom_models.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
debugging.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
fast_tokenizers.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
glossary.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
index.mdx Add TAPEX (#16473) 2022-04-08 10:57:51 +02:00
installation.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
migration.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
model_sharing.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
model_summary.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
multilingual.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
parallelism.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
performance.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
perplexity.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
philosophy.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pipeline_tutorial.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pr_checks.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
preprocessing.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
quicktour.mdx [Minds14] Correct quicktour (#16626) 2022-04-06 11:27:11 +02:00
run_scripts.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
sagemaker.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
serialization.mdx Add TAPEX (#16473) 2022-04-08 10:57:51 +02:00
task_summary.mdx Quality 2022-04-05 14:12:01 -04:00
testing.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
tokenizer_summary.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
training.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
troubleshooting.mdx Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00