DPR
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

Dense Passage Retrieval (DPR) is a set of tools and models for state-of-the-art open-domain Q&A research.
It is based on the following paper:

Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih, Dense Passage Retrieval for Open-Domain Question Answering.

The abstract from the paper is the following:

*Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional
sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can
be practically implemented using dense representations alone, where embeddings are learned from a small number of
questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets,
our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage
retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA
benchmarks.*

The original code can be found `here <https://github.com/facebookresearch/DPR>`_.

DPRConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRConfig
    :members:
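As a quick illustration (a sketch, assuming the BERT-base-like defaults of recent library versions), a model can be built directly from a configuration:

```python
from transformers import DPRConfig, DPRQuestionEncoder

# The default configuration mirrors BERT-base hyperparameters.
config = DPRConfig()

# Building a model from a config gives randomly initialized weights;
# use `from_pretrained` to load a trained checkpoint instead.
model = DPRQuestionEncoder(config)
```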

DPRContextEncoderTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRContextEncoderTokenizer
    :members:

DPRContextEncoderTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRContextEncoderTokenizerFast
    :members:

DPRQuestionEncoderTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRQuestionEncoderTokenizer
    :members:

DPRQuestionEncoderTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRQuestionEncoderTokenizerFast
    :members:

DPRReaderTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRReaderTokenizer
    :members:

DPRReaderTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRReaderTokenizerFast
    :members:

DPRContextEncoder
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRContextEncoder
    :members:

DPRQuestionEncoder
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRQuestionEncoder
    :members:

DPRReader
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DPRReader
    :members: