* Clean up model documentation
* Formatting
* Preparation work
* Long lines
* Main work on rst files
* Cleanup all config files
* Syntax fix
* Clean all tokenizers
* Work on first models
* Models beginning
* FlauBERT
* All PyTorch models
* All models
* Long lines again
* Fixes
* More fixes
* Update docs/source/model_doc/bert.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/electra.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Last fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
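For context on the documentation cleanup above, a hedged sketch of the docstring style this rst/docstring work converged on (the argument wording below is illustrative, not copied from any one model):

    def forward(self, input_ids=None, attention_mask=None):
        r"""
        Args:
            input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`):
                Indices of input sequence tokens in the vocabulary.
            attention_mask (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
                Mask to avoid performing attention on padding token indices.
        """
        ...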
* skip decorators: docs, tests, bugs
* another important note
* style
* bloody style
* add @pytest.mark.parametrize
* add note
* no idea what it wants :(
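The skip-decorator commits above boil down to composing pytest markers; a minimal sketch, assuming a hypothetical require_cuda helper in the spirit of transformers' require_* decorators (not the library's actual API):

    import pytest
    import torch

    # Hypothetical skip decorator: skip (rather than fail) when no GPU is present.
    require_cuda = pytest.mark.skipif(
        not torch.cuda.is_available(), reason="test requires a CUDA device"
    )

    # @pytest.mark.parametrize works on plain (non-unittest) tests; each tuple
    # becomes its own test case, and the skip condition applies to all of them.
    @require_cuda
    @pytest.mark.parametrize("batch_size, seq_len", [(1, 8), (4, 128)])
    def test_ones_shape(batch_size, seq_len):
        x = torch.ones(batch_size, seq_len, device="cuda")
        assert x.shape == (batch_size, seq_len)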
* added rag WIP
* path fix
* Formatting / renaming prior to actual work
* First commit
* improve comments
* Retrieval evaluation scripts
* refactor to include modeling outputs + MPI retriever
* Fix rag-token model + refactor
* Various fixes + finetuning logic
* use_bos fix
* Retrieval refactor
* Finetuning refactoring and cleanup
* Add documentation and cleanup
* Remove set_up_rag_env.sh file
* Fix retrieval with HF index
* Fix import errors
* Fix quality errors
* Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867
* fix quality
* Fix RAG Sequence generation
* minor cleanup plus initial tests
* fix test
* fix tests 2
* Comments fix
* post-merge fixes
* Improve readme + post-rebase refactor
* Extra dependencies for tests
* Fix tests
* Fix tests 2
* Refactor test requirements
* Fix tests 3
* Post-rebase refactor
* rename nlp->datasets
* RAG integration tests
* add tokenizer to slow integration test and allow retriever to run on cpu
* add tests; fix position ids warning
* change structure
* add from encoder generator
* save working solution
* make all integration tests pass
* add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained
* don't save paths
* delete unnecessary imports
* pass config to AutoTokenizer.from_pretrained for Rag tokenizers
* init wiki_dpr only once
* hardcode legacy index and passages paths (todo: add the right urls)
* finalize config
* finalize retriever API and config API
* LegacyIndex index download refactor
* add dpr to autotokenizer
* make from pretrained more flexible
* fix RagTokenForGeneration
* small name changes in tokenizer
* add labels to models
* change default index name
* add retrieval tests
* finish token generate
* align test with previous version and make all tests pass
* add tests
* finalize tests
* implement Thom's suggestions
* add first version of test
* make first tests work
* make retriever platform agnostic
* naming
* style
* add legacy index URL
* docstrings + simple retrieval test for distributed
* clean model api
* add doc_ids to retriever's outputs
* fix retrieval tests
* finish model outputs
* finalize model api
* fix generate problem for rag
* fix generate for other models
* fix some tests
* save intermediate
* set generate to default
* big refactor generate
* delete rag_api
* correct pip faiss install
* fix auto tokenization test
* fix faiss install
* fix test
* move the distributed logic to examples
* model page
* docs
* finish tests
* fix dependencies
* fix import in __init__
* Refactor eval_rag and finetune scripts
* start docstring
* add psutil to test
* fix tf test
* move require torch to top
* fix retrieval test
* align naming
* finish automodel
* fix repo consistency
* test ragtokenizer save/load
* add rag model output docs
* fix ragtokenizer save/load from pretrained
* fix tokenizer dir
* remove torch in retrieval
* fix docs
* fix finetune scripts
* finish model docs
* finish docs
* remove auto model for now
* add require torch
* remove solved todos
* integrate Sylvain's suggestions
* address Sam's comments
* correct mistake on purpose
* improve README
* Add generation test cases
* fix rag token
* clean token generate
* fix test
* add note to test
* fix attention mask
* add t5 test for rag
* Fix handling prefix in finetune.py
* don't overwrite index_name
Co-authored-by: Patrick Lewis <plewis@fb.com>
Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
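To make the RAG commits above concrete, a minimal usage sketch of the resulting API (model id and flags as documented for transformers; use_dummy_dataset avoids downloading the full wiki_dpr index):

    from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

    tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
    retriever = RagRetriever.from_pretrained(
        "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
    )
    model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)

    # Retrieval-augmented generation: the retriever fetches supporting passages,
    # the generator conditions on them.
    inputs = tokenizer("who holds the record in 100m freestyle", return_tensors="pt")
    generated = model.generate(input_ids=inputs["input_ids"])
    print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])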
* ready for PR
* cleanup
* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
* fix
* perfectionism
* revert change from another PR
* odd, already committed this one
* non-interactive upload workaround
* backup the failed experiment
* store langs in config
* workaround for localizing model path
* doc clean up as in https://github.com/huggingface/transformers/pull/6956
* style
* back out debug mode
* document: run_eval.py --num_beams 10
* remove unneeded constant
* typo
* re-use bart's Attention
* re-use EncoderLayer, DecoderLayer from bart
* refactor
* send to cuda and fp16
* cleanup
* revert (moved to another PR)
* better error message
* document run_eval --num_beams
* solve the problem of tokenizer finding the right files when model is local
* polish, remove hardcoded config
* add a note that the file is autogenerated to avoid losing changes
* prep for org change, remove unneeded code
* switch to model4.pt, update scores
* s/python/bash/
* missing init (but doesn't impact the finetuned model)
* cleanup
* major refactor (reuse-bart)
* new model, new expected weights
* cleanup
* cleanup
* full link
* fix model type
* merge porting notes
* style
* cleanup
* have to create a DecoderConfig object to handle vocab_size properly
* doc fix
* add note (not a public class)
* parametrize
* add BLEU score integration tests
* skip test if sacrebleu is not installed
* cache heavy models/tokenizers
* some tweaks
* remove tokens that aren't used
* more purging
* simplify code
* switch to using decoder_start_token_id
* add doc
* Revert "major refactor (reuse-bart)"
This reverts commit 226dad15ca.
* decouple from bart
* remove unused code #1
* remove unused code #2
* remove unused code #3
* update instructions
* clean up
* move bleu eval to examples
* check import only once
* move data+gen script into files
* reuse via import
* take less space
* add prepare_seq2seq_batch (auto-tested)
* cleanup
* recode test to use json instead of yaml
* ignore keys not needed
* use the new -y flag in transformers-cli upload
* [xlm tok] config dict: convert str to int to match definition (#7034)
* [s2s] --eval_max_generate_length (#7018)
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
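In code terms, the nlp -> datasets rename above is a drop-in import swap; the loading API carried over (wiki_dpr is the index dataset the RAG retriever uses, and the config name below is an example from the hub; the full index download is large):

    # before the rename: import nlp; nlp.load_dataset(...)
    import datasets

    # same call shape as the old nlp library
    wiki = datasets.load_dataset("wiki_dpr", "psgs_w100.nq.exact", split="train")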
* extending to support allen_nlp wmt models
- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models
* sync with changes
* s/fsmt-wmt/wmt/ in model names
* s/fsmt-wmt/wmt/ in model names (p2)
* s/fsmt-wmt/wmt/ in model names (p3)
* switch to a better checkpoint
* typo
* make non-optional args required - adjust tests where possible or skip when there is no other choice
* consistency
* style
* adjust header
* cards moved (model rename)
* use best custom hparams
* update info
* remove old cards
* cleanup
* s/stas/facebook/
* update scores
* s/allen_nlp/allenai/
* url maps aren't needed
* typo
* move all the doc / build /eval generators to their own scripts
* cleanup
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix indent
* duplicated line
* style
* use the correct add_start_docstrings
* oops
* resizing can't be done with the core approach, due to 2 dicts
* check that the arg is a list
* style
* style
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
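To make the FSMT port above concrete, a minimal usage sketch (checkpoint id per the s/stas/facebook/ rename; num_beams=10 mirrors the documented run_eval.py setting):

    from transformers import FSMTForConditionalGeneration, FSMTTokenizer

    mname = "facebook/wmt19-en-de"
    tokenizer = FSMTTokenizer.from_pretrained(mname)
    model = FSMTForConditionalGeneration.from_pretrained(mname)

    # Translate one sentence with beam search, as in the eval script.
    input_ids = tokenizer("Machine learning is great!", return_tensors="pt").input_ids
    outputs = model.generate(input_ids, num_beams=10)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))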
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and first FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Fix copyright
* Forgot some layers can be repeated
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/modeling_funnel.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
* Update src/transformers/modeling_funnel.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Slow integration test
* Make small integration test
* Formatting
* Add checkpoint and separate classification head
* Formatting
* Expand list, fix link and add in pretrained models
* Styling
* Add the model in all summaries
* Typo fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
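To make the Funnel Transformer commits above concrete, a minimal usage sketch (checkpoint id as published on the hub):

    import torch
    from transformers import FunnelModel, FunnelTokenizer

    tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
    model = FunnelModel.from_pretrained("funnel-transformer/small")

    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # FunnelModel's decoder upsamples back to the input resolution, so there is
    # one hidden vector per input token (cf. the "Fix upsampling" commit above).
    print(outputs.last_hidden_state.shape)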
Tested in a local build of the docs.
E.g. just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling, the docs render this snippet with prompts:

    >>> for token in top_5_tokens:
    ...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

With the fix, Copy will copy the full code, e.g.

    for token in top_5_tokens:
        print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

instead of currently only:

    for token in top_5_tokens:
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
Docs for the option used in the fix:
https://sphinx-copybutton.readthedocs.io/en/latest/
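A sketch of the conf.py settings such a fix would use, per the sphinx-copybutton docs (the exact values in the PR may differ):

    # docs/source/conf.py
    # Treating the prompt as a regexp strips both ">>> " and "... " prefixes on
    # copy, so continuation lines are included instead of being skipped.
    copybutton_prompt_text = r">>> |\.\.\. "
    copybutton_prompt_is_regexp = True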