transformers/docs/source/model_doc
Stas Bekman · 1eeb206bef
[ported model] FSMT (FairSeq Machine Translation) (#6940)
* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config
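
A minimal sketch of what this means, assuming the final public `FSMTConfig` name and its `langs` field:

```python
from transformers import FSMTConfig

# The source/target pair travels with the config instead of being hard-coded.
# `langs` is the field name used by the final FSMT port (an assumption here).
config = FSMTConfig(langs=["en", "ru"])
print(config.langs)  # ['en', 'ru']
```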

* workaround for localizing model path

* doc clean up as in https://github.com/huggingface/transformers/pull/6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams
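
For context, `--num_beams` maps to beam search at generation time; a hedged sketch (the facebook/* model name is the final published one, used here only for illustration):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "facebook/wmt19-en-ru"  # illustrative; the rename happens later in this log
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

inputs = tokenizer("Machine learning is great, isn't it?", return_tensors="pt")
# run_eval.py's --num_beams 10 corresponds to generate(num_beams=10).
outputs = model.generate(**inputs, num_beams=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```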

* solve the problem of the tokenizer finding the right files when the model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* - add bleu scores integration tests

* skip test if sacrebleu is not installed
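
A sketch of the guard described by the last two items, using pytest's importorskip:

```python
import pytest

# Skip the whole BLEU integration test when sacrebleu isn't installed.
sacrebleu = pytest.importorskip("sacrebleu")

def test_bleu_sanity():
    hyps = ["The quick brown fox jumps over the lazy dog ."]
    refs = [["The quick brown fox jumps over the lazy dog ."]]
    score = sacrebleu.corpus_bleu(hyps, refs).score
    assert score > 99  # identical hypothesis/reference should score ~100
```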

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id
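
That is, generation now seeds the decoder from the config rather than a hard-coded constant; a sketch with illustrative values:

```python
import torch

# The first decoder input comes from config.decoder_start_token_id.
decoder_start_token_id = 2  # e.g. </s> in a fairseq-style vocab (assumption)
batch_size = 4
decoder_input_ids = torch.full(
    (batch_size, 1), decoder_start_token_id, dtype=torch.long
)
```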

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad15ca.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code #3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)
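
A hedged usage sketch of the new helper (model name illustrative; return keys beyond input_ids/attention_mask varied across versions, so only the source side is shown):

```python
from transformers import FSMTTokenizer

mname = "facebook/wmt19-en-ru"  # illustrative
tokenizer = FSMTTokenizer.from_pretrained(mname)

# Tokenize a list of source sentences in one call.
batch = tokenizer.prepare_seq2seq_batch(
    ["Machine learning is great!"], return_tensors="pt"
)
print(batch.keys())  # expected: input_ids, attention_mask
```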

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y flag in transformers-cli upload

* [xlm tok] config dict: cast str to int to match the definition (#7034)

* [s2s] --eval_max_generate_length (#7018)

* Fix CI with change of name of nlp (#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last
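
The rename in a nutshell (dataset id illustrative):

```python
# before:
#   import nlp
#   wmt = nlp.load_dataset("wmt19", "ru-en")
# after:
import datasets

wmt = datasets.load_dataset("wmt19", "ru-en")
```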

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make previously optional args non-optional - adjust tests where possible, or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/
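
After the rename, the ported AllenAI checkpoints load like any other FSMT model; a hedged sketch (checkpoint id per the model cards of this era, an assumption here):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

# One of the ported AllenAI WMT16 distilled checkpoints.
mname = "allenai/wmt16-en-de-dist-12-1"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)
```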

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, since FSMT uses 2 separate vocab dicts (source and target)
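
The constraint, sketched: FSMT carries two independent vocab sizes, so there is no single vocab_size for the core resize helper to act on (sizes illustrative):

```python
from transformers import FSMTConfig

# Two independent vocabularies instead of one shared `vocab_size`.
config = FSMTConfig(src_vocab_size=31640, tgt_vocab_size=31232)
print(config.src_vocab_size, config.tgt_vocab_size)
```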

* check that the arg is a list

* style

* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-17 11:31:29 -04:00
albert.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
auto.rst Extra ) 2020-09-14 09:37:55 -04:00
bart.rst remove BartForConditionalGeneration.generate (#6659) 2020-08-25 00:42:34 +08:00
bert.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
bertgeneration.rst [BertGeneration, Docs] Fix another old name in docs (#7050) 2020-09-10 17:12:33 +02:00
camembert.rst CamembertForCausalLM (#6577) 2020-08-21 13:52:54 +02:00
ctrl.rst Clean documentation (#4849) 2020-06-08 11:28:19 -04:00
dialogpt.rst add dialogpt training tips (#3996) 2020-04-28 14:32:31 +02:00
distilbert.rst Add DistilBertForMultipleChoice (#5032) 2020-06-15 18:31:41 -04:00
dpr.rst Document model outputs (#5673) 2020-07-10 17:31:02 -04:00
electra.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
encoderdecoder.rst fix link to paper (#7116) 2020-09-14 07:43:40 -04:00
flaubert.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
fsmt.rst [ported model] FSMT (FairSeq Machine Translation) (#6940) 2020-09-17 11:31:29 -04:00
funnel.rst Add TF Funnel Transformer (#7029) 2020-09-10 10:41:56 -04:00
gpt.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
gpt2.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
longformer.rst TF Longformer (#5764) 2020-08-10 23:25:06 +02:00
lxmert.rst Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793) 2020-09-03 04:02:25 -04:00
marian.rst [marian] converter supports models from new Tatoeba project (#6342) 2020-08-17 23:55:42 -04:00
mbart.rst [Doc] add more MBart and other doc (#6490) 2020-08-17 12:30:26 -04:00
mobilebert.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
pegasus.rst pegasus.rst: fix expected output (#7017) 2020-09-08 13:29:16 -04:00
reformer.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
retribert.rst Eli5 examples (#4968) 2020-06-16 16:36:58 -04:00
roberta.rst [EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411) 2020-08-12 18:23:30 +02:00
t5.rst Actually the extra_id are from 0-99 and not from 1-100 (#5967) 2020-07-30 06:13:29 -04:00
transformerxl.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
xlm.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
xlmroberta.rst [EncoderDecoder] Add xlm-roberta to encoder decoder (#6878) 2020-09-01 21:56:39 +02:00
xlnet.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00