transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 11:08:23 +06:00

Ola Piktus c754c41c61 RAG (#6813)

* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* added rag WIP
* Formatting / renaming prior to actual work
* First commit
* improve comments
* Retrieval evaluation scripts
* refactor to include modeling outputs + MPI retriever
* Fix rag-token model + refactor
* Various fixes + finetuning logic
* use_bos fix
* Retrieval refactor
* Finetuning refactoring and cleanup
* Add documentation and cleanup
* Remove set_up_rag_env.sh file
* Fix retrieval wit HF index
* Fix import errors
* Fix quality errors
* Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867
* fix quality
* Fix RAG Sequence generation
* minor cleanup plus initial tests
* fix test
* fix tests 2
* Comments fix
* post-merge fixes
* Improve readme + post-rebase refactor
* Extra dependencied for tests
* Fix tests
* Fix tests 2
* Refactor test requirements
* Fix tests 3
* Post-rebase refactor
* rename nlp->datasets
* RAG integration tests
* add tokenizer to slow integration test and allow retriever to run on cpu
* add tests; fix position ids warning
* change structure
* change structure
* add from encoder generator
* save working solution
* make all integration tests pass
* add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained
* don't save paths
* delete unnecessary imports
* pass config to AutoTokenizer.from_pretrained for Rag tokenizers
* init wiki_dpr only once
* hardcode legacy index and passages paths (todo: add the right urls)
* finalize config
* finalize retriver api and config api
* LegacyIndex index download refactor
* add dpr to autotokenizer
* make from pretrained more flexible
* fix ragfortokengeneration
* small name changes in tokenizer
* add labels to models
* change default index name
* add retrieval tests
* finish token generate
* align test with previous version and make all tests pass
* add tests
* finalize tests
* implement thoms suggestions
* add first version of test
* make first tests work
* make retriever platform agnostic
* naming
* style
* add legacy index URL
* docstrings + simple retrieval test for distributed
* clean model api
* add doc_ids to retriever's outputs
* fix retrieval tests
* finish model outputs
* finalize model api
* fix generate problem for rag
* fix generate for other modles
* fix some tests
* save intermediate
* set generate to default
* big refactor generate
* delete rag_api
* correct pip faiss install
* fix auto tokenization test
* fix faiss install
* fix test
* move the distributed logic to examples
* model page
* docs
* finish tests
* fix dependencies
* fix import in __init__
* Refactor eval_rag and finetune scripts
* start docstring
* add psutil to test
* fix tf test
* move require torch to top
* fix retrieval test
* align naming
* finish automodel
* fix repo consistency
* test ragtokenizer save/load
* add rag model output docs
* fix ragtokenizer save/load from pretrained
* fix tokenizer dir
* remove torch in retrieval
* fix docs
* fixe finetune scripts
* finish model docs
* finish docs
* remove auto model for now
* add require torch
* remove solved todos
* integrate sylvains suggestions
* sams comments
* correct mistake on purpose
* improve README
* Add generation test cases
* fix rag token
* clean token generate
* fix test
* add note to test
* fix attention mask
* add t5 test for rag
* Fix handling prefix in finetune.py
* don't overwrite index_name

Co-authored-by: Patrick Lewis <plewis@fb.com>
Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2020-09-22 18:29:58 +02:00
albert.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00
auto.rst	Extra )	2020-09-14 09:37:55 -04:00
bart.rst	remove BartForConditionalGeneration.generate (#6659)	2020-08-25 00:42:34 +08:00
bert.rst	Add a script to check all models are tested and documented (#6298)	2020-08-07 09:18:37 -04:00
bertgeneration.rst	[BertGeneration, Docs] Fix another old name in docs (#7050)	2020-09-10 17:12:33 +02:00
camembert.rst	CamembertForCausalLM (#6577)	2020-08-21 13:52:54 +02:00
ctrl.rst	Clean documentation (#4849)	2020-06-08 11:28:19 -04:00
dialogpt.rst	add dialogpt training tips (#3996)	2020-04-28 14:32:31 +02:00
distilbert.rst	Add DistilBertForMultipleChoice (#5032)	2020-06-15 18:31:41 -04:00
dpr.rst	Document model outputs (#5673)	2020-07-10 17:31:02 -04:00
electra.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00
encoderdecoder.rst	fix link to paper (#7116)	2020-09-14 07:43:40 -04:00
flaubert.rst	Add a script to check all models are tested and documented (#6298)	2020-08-07 09:18:37 -04:00
fsmt.rst	[ported model] FSMT (FairSeq MachineTranslation) (#6940)	2020-09-17 11:31:29 -04:00
funnel.rst	Add TF Funnel Transformer (#7029)	2020-09-10 10:41:56 -04:00
gpt.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00
gpt2.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00
layoutlm.rst	Add LayoutLM Model (#7064)	2020-09-22 09:28:02 -04:00
longformer.rst	TF Longformer (#5764)	2020-08-10 23:25:06 +02:00
lxmert.rst	Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)	2020-09-03 04:02:25 -04:00
marian.rst	[marian] converter supports models from new Tatoeba project (#6342)	2020-08-17 23:55:42 -04:00
mbart.rst	[Doc] add more MBart and other doc (#6490)	2020-08-17 12:30:26 -04:00
mobilebert.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00
pegasus.rst	pegasus.rst: fix expected output (#7017)	2020-09-08 13:29:16 -04:00
rag.rst	RAG (#6813)	2020-09-22 18:29:58 +02:00
reformer.rst	Add a script to check all models are tested and documented (#6298)	2020-08-07 09:18:37 -04:00
retribert.rst	Eli5 examples (#4968)	2020-06-16 16:36:58 -04:00
roberta.rst	[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411)	2020-08-12 18:23:30 +02:00
t5.rst	Actually the extra_id are from 0-99 and not from 1-100 (#5967)	2020-07-30 06:13:29 -04:00
transformerxl.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00
xlm.rst	Add a script to check all models are tested and documented (#6298)	2020-08-07 09:18:37 -04:00
xlmroberta.rst	[EncoderDecoder] Add xlm-roberta to encoder decoder (#6878)	2020-09-01 21:56:39 +02:00
xlnet.rst	Tf model outputs (#6247)	2020-08-05 11:34:39 -04:00