transformers/docs/source/model_doc
Stas Bekman · 1eeb206bef
[ported model] FSMT (FairSeq Machine Translation) (#6940)
* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config
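
A minimal sketch of what this means, assuming the final public `FSMTConfig` name and its `langs` field:

```python
from transformers import FSMTConfig

# The source/target pair travels with the config instead of being hard-coded.
# `langs` is the field name used by the final FSMT port (an assumption here).
config = FSMTConfig(langs=["en", "ru"])
print(config.langs)  # ['en', 'ru']
```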

* workaround for localizing model path

* doc clean up as in https://github.com/huggingface/transformers/pull/6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams
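
For context, `--num_beams` maps to beam search at generation time; a hedged sketch (the facebook/* model name is the final published one, used here only for illustration):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "facebook/wmt19-en-ru"  # illustrative; the rename happens later in this log
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

inputs = tokenizer("Machine learning is great, isn't it?", return_tensors="pt")
# run_eval.py's --num_beams 10 corresponds to generate(num_beams=10).
outputs = model.generate(**inputs, num_beams=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```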

* solve the problem of the tokenizer finding the right files when the model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* - add bleu scores integration tests

* skip test if sacrebleu is not installed
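
A sketch of the guard described by the last two items, using pytest's importorskip:

```python
import pytest

# Skip the whole BLEU integration test when sacrebleu isn't installed.
sacrebleu = pytest.importorskip("sacrebleu")

def test_bleu_sanity():
    hyps = ["The quick brown fox jumps over the lazy dog ."]
    refs = [["The quick brown fox jumps over the lazy dog ."]]
    score = sacrebleu.corpus_bleu(hyps, refs).score
    assert score > 99  # identical hypothesis/reference should score ~100
```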

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id
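
That is, generation now seeds the decoder from the config rather than a hard-coded constant; a sketch with illustrative values:

```python
import torch

# The first decoder input comes from config.decoder_start_token_id.
decoder_start_token_id = 2  # e.g. </s> in a fairseq-style vocab (assumption)
batch_size = 4
decoder_input_ids = torch.full(
    (batch_size, 1), decoder_start_token_id, dtype=torch.long
)
```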

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad15ca.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code #3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)
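
A hedged usage sketch of the new helper (model name illustrative; return keys beyond input_ids/attention_mask varied across versions, so only the source side is shown):

```python
from transformers import FSMTTokenizer

mname = "facebook/wmt19-en-ru"  # illustrative
tokenizer = FSMTTokenizer.from_pretrained(mname)

# Tokenize a list of source sentences in one call.
batch = tokenizer.prepare_seq2seq_batch(
    ["Machine learning is great!"], return_tensors="pt"
)
print(batch.keys())  # expected: input_ids, attention_mask
```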

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y flag in transformers-cli upload

* [xlm tok] config dict: cast str to int to match the definition (#7034)

* [s2s] --eval_max_generate_length (#7018)

* Fix CI with change of name of nlp (#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last
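
The rename in a nutshell (dataset id illustrative):

```python
# before:
#   import nlp
#   wmt = nlp.load_dataset("wmt19", "ru-en")
# after:
import datasets

wmt = datasets.load_dataset("wmt19", "ru-en")
```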

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make previously optional args non-optional - adjust tests where possible, or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/
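
After the rename, the ported AllenAI checkpoints load like any other FSMT model; a hedged sketch (checkpoint id per the model cards of this era, an assumption here):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

# One of the ported AllenAI WMT16 distilled checkpoints.
mname = "allenai/wmt16-en-de-dist-12-1"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)
```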

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, since FSMT uses 2 separate vocab dicts (source and target)
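
The constraint, sketched: FSMT carries two independent vocab sizes, so there is no single vocab_size for the core resize helper to act on (sizes illustrative):

```python
from transformers import FSMTConfig

# Two independent vocabularies instead of one shared `vocab_size`.
config = FSMTConfig(src_vocab_size=31640, tgt_vocab_size=31232)
print(config.src_vocab_size, config.tgt_vocab_size)
```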

* check that the arg is a list

* style

* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-17 11:31:29 -04:00
albert.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
auto.rst Extra ) 2020-09-14 09:37:55 -04:00
bart.rst remove BartForConditionalGeneration.generate (#6659) 2020-08-25 00:42:34 +08:00
bert.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
bertgeneration.rst [BertGeneration, Docs] Fix another old name in docs (#7050) 2020-09-10 17:12:33 +02:00
camembert.rst CamembertForCausalLM (#6577) 2020-08-21 13:52:54 +02:00
ctrl.rst Clean documentation (#4849) 2020-06-08 11:28:19 -04:00
dialogpt.rst add dialogpt training tips (#3996) 2020-04-28 14:32:31 +02:00
distilbert.rst Add DistilBertForMultipleChoice (#5032) 2020-06-15 18:31:41 -04:00
dpr.rst Document model outputs (#5673) 2020-07-10 17:31:02 -04:00
electra.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
encoderdecoder.rst fix link to paper (#7116) 2020-09-14 07:43:40 -04:00
flaubert.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
fsmt.rst [ported model] FSMT (FairSeq Machine Translation) (#6940) 2020-09-17 11:31:29 -04:00
funnel.rst Add TF Funnel Transformer (#7029) 2020-09-10 10:41:56 -04:00
gpt.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
gpt2.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
longformer.rst TF Longformer (#5764) 2020-08-10 23:25:06 +02:00
lxmert.rst Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793) 2020-09-03 04:02:25 -04:00
marian.rst [marian] converter supports models from new Tatoeba project (#6342) 2020-08-17 23:55:42 -04:00
mbart.rst [Doc] add more MBart and other doc (#6490) 2020-08-17 12:30:26 -04:00
mobilebert.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
pegasus.rst pegasus.rst: fix expected output (#7017) 2020-09-08 13:29:16 -04:00
reformer.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
retribert.rst Eli5 examples (#4968) 2020-06-16 16:36:58 -04:00
roberta.rst [EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411) 2020-08-12 18:23:30 +02:00
t5.rst Actually the extra_id are from 0-99 and not from 1-100 (#5967) 2020-07-30 06:13:29 -04:00
transformerxl.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00
xlm.rst Add a script to check all models are tested and documented (#6298) 2020-08-07 09:18:37 -04:00
xlmroberta.rst [EncoderDecoder] Add xlm-roberta to encoder decoder (#6878) 2020-09-01 21:56:39 +02:00
xlnet.rst Tf model outputs (#6247) 2020-08-05 11:34:39 -04:00