transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-14 01:58:22 +06:00

Author	SHA1	Message	Date
Sam Shleifer	96e47d9229	[cleanup] assign todos, faster bart-cnn test (#7835) * 2 beam output * unassign/remove TODOs * remove one more	2020-10-16 03:11:18 -04:00
Stas Bekman	2255c2c7a0	[seq2seq] get_git_info fails gracefully (#7843) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-16 00:22:43 -04:00
Sylvain Gugger	a1d1b332d0	Add predict step accumulation (#7767) * Add eval_accumulation_step and clean distributed eval * Add TPU test * Add TPU stuff * Fix arg name * Fix Seq2SeqTrainer * Fix total_size * Update src/transformers/trainer_pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Doc and add test to TPU * Add unit test * Adapt name Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-14 11:41:45 -04:00
Tiger	7e73c12805	fixed lots of typos. (#7758)	2020-10-13 10:00:20 -04:00
Sam Shleifer	9c2b2db2cd	[marian] Automate Tatoeba-Challenge conversion (#7709)	2020-10-12 12:24:25 -04:00
Sam Shleifer	827c519494	[examples] bump pl=0.9.0 (#7053)	2020-10-11 16:39:38 -04:00
Sam Shleifer	297233fa92	[s2s] Switch README urls to cdn (#7670)	2020-10-08 21:22:22 -04:00
Sam Shleifer	a1ecc90d6b	[pseudo] Switch URLS to CDN (#7661)	2020-10-08 14:12:39 -04:00
Suraj Patil	06a973fd2a	[s2s] configure lr_scheduler from command line (#7641)	2020-10-08 13:06:35 -04:00
Sam Shleifer	aba4e22944	[pseudolabels] cleanup markdown table (#7653)	2020-10-07 23:04:18 -04:00
Sam Shleifer	e2bb9abb6a	[s2s] release pseudolabel links and instructions (#7639)	2020-10-07 11:20:44 -04:00
Sylvain Gugger	08ba4b4902	Trainer callbacks (#7596) * Initial callback proposal * Finish various callbacks * Post-rebase conflicts * Fix tests * Don't use something that's not set * Documentation * Remove unwanted print. * Document all models can work * Add tests + small fixes * Update docs/source/internal/trainer_utils.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Fix TF tests * Real fix this time * This one should work * Fix typo * Really fix typo Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-07 10:50:21 -04:00
Sam Shleifer	500be01c5d	[s2s] save first batch to json for debugging purposes (#6810)	2020-10-06 16:11:56 -04:00
Sam Shleifer	d5d2744aa7	Support T5 Distillation w/hidden state supervision (#7599)	2020-10-05 21:31:48 -04:00
Suraj Patil	99cb924bfb	[s2s] add config params like Dropout in Seq2SeqTrainingArguments (#7532)	2020-10-04 12:42:30 -04:00
Sam Shleifer	9bdce3a4f9	[s2s] fix lockfile and peg distillation constants (#7545)	2020-10-02 15:58:14 -04:00
Sam Shleifer	de4d7b004a	[s2s] Adafactor support for builtin trainer (#7522)	2020-10-01 17:27:45 -04:00
Sam Shleifer	d3a9601a11	[s2s] trainer scripts: Remove --run_name, thanks sylvain! (#7521)	2020-10-01 17:18:47 -04:00
Sylvain Gugger	bdcc4b78a2	Fix seq2seq example test (#7518) * Fix seq2seq example test * Fix bad copy-paste * Also save the state	2020-10-01 14:13:29 -04:00
Sam Shleifer	2a358f45ef	[s2s] fix nltk pytest race condition with FileLock (#7515)	2020-10-01 12:51:09 -04:00
Suraj Patil	72d363d979	[examples/s2s] clean up finetune_trainer (#7509)	2020-10-01 12:19:29 -04:00
Sam Shleifer	48f23f92a8	[s2sTrainer] test + code cleanup (#7467)	2020-10-01 00:33:01 -04:00
Sam Shleifer	03e46c1de3	[s2s] fix kwargs style (#7488)	2020-09-30 17:00:06 -04:00
Sam Shleifer	6fe8a693eb	[s2s] Fix t5 warning for distributed eval (#7487)	2020-09-30 16:58:03 -04:00
Amanpreet Singh	c031d01023	Seq2SeqDataset: avoid passing src_lang everywhere (#7470) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-30 13:27:48 -04:00
Suraj Patil	08939cfdf7	[s2strainer] fix eval dataset loading (#7477)	2020-09-30 12:39:13 -04:00
Sam Shleifer	74d8d69bd4	[s2s] consistent output format across eval scripts (#7435)	2020-09-28 23:20:03 -04:00
Sam Shleifer	748425d47d	[T5] allow config.decoder_layers to control decoder size (#7409) * Working assymmetrical T5 * rename decoder_layers -> num_decoder_layers * Fix docstring * Allow creation of asymmetric t5 students	2020-09-28 03:08:04 -04:00
Sam Shleifer	7296fea1d6	[s2s] rougeLSum expects \n between sentences (#7410) Co-authored-by: Swetha Mandava <smandava@nvidia.com>	2020-09-27 16:27:19 -04:00
Suraj Patil	eab5f59682	[s2s] add create student script (#7290) Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-27 15:10:46 -04:00
Suraj Patil	415071b4c2	doc changes (#7385)	2020-09-25 08:00:36 -04:00
Suraj Patil	9e68d075a4	Seq2SeqTrainer (#6769) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-24 18:46:58 -04:00
Sam Shleifer	d9d0f1140b	[s2s] distributed eval allows num_return_sequences > 1 (#7254)	2020-09-24 17:30:09 -04:00
Stas Bekman	eadd870b2f	[seq2seq] make it easier to run the scripts (#7274)	2020-09-24 15:23:48 -04:00
Sam Shleifer	78387cc63e	[s2s] only save metrics.json from rank zero (#7331)	2020-09-22 18:27:28 -04:00
Sam Shleifer	e53138a1b9	[s2s] add src_lang kwarg for distributed eval (#7300)	2020-09-22 18:26:37 -04:00
Sam Shleifer	25b0463d0b	[s2s] add supported architecures to MD (#7252)	2020-09-22 13:09:35 -04:00
Sam Shleifer	656c27c3a3	[s2s] save hostname with repo info (#7301) * save hostname	2020-09-21 17:26:24 -04:00
Stas Bekman	af4b98ed97	[s2s] adjust finetune + test to work with fsmt (#7263)	2020-09-21 15:13:19 -04:00
Stas Bekman	8d562a2d1a	[s2s] s/alpha_loss_encoder/alpha_encoder_loss/ (#7298) fix to match `distillation.py: self.alpha_encoder_loss`	2020-09-21 14:14:26 -04:00
Stas Bekman	cbb2f75a16	[s2s tests] fix test_run_eval_search (#7297)	2020-09-21 14:00:40 -04:00
Stas Bekman	7cbf0f722d	examples/seq2seq/__init__.py mutates sys.path (#7194)	2020-09-20 16:54:42 -04:00
Sam Shleifer	83dba10b8f	[s2s] distributed_eval.py saves better speed info (#7242)	2020-09-18 15:46:01 -04:00
Sam Shleifer	67d9fc50d9	[s2s] remove double assert (#7223)	2020-09-17 18:32:31 -04:00
Sam Shleifer	a5638b2b3a	[s2s] dynamic batch size with --max_tokens_per_batch (#7030)	2020-09-17 15:19:34 -04:00
Stas Bekman	efeab6a3f1	[s2s] run_eval/run_eval_search tweaks (#7192) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-17 14:26:38 -04:00
Stas Bekman	1eeb206bef	[ported model] FSMT (FairSeq MachineTranslation) (#6940 ) * ready for PR * cleanup * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST * fix * perfectionism * revert change from another PR * odd, already committed this one * non-interactive upload workaround * backup the failed experiment * store langs in config * workaround for localizing model path * doc clean up as in https://github.com/huggingface/transformers/pull/6956 * style * back out debug mode * document: run_eval.py --num_beams 10 * remove unneeded constant * typo * re-use bart's Attention * re-use EncoderLayer, DecoderLayer from bart * refactor * send to cuda and fp16 * cleanup * revert (moved to another PR) * better error message * document run_eval --num_beams * solve the problem of tokenizer finding the right files when model is local * polish, remove hardcoded config * add a note that the file is autogenerated to avoid losing changes * prep for org change, remove unneeded code * switch to model4.pt, update scores * s/python/bash/ * missing init (but doesn't impact the finetuned model) * cleanup * major refactor (reuse-bart) * new model, new expected weights * cleanup * cleanup * full link * fix model type * merge porting notes * style * cleanup * have to create a DecoderConfig object to handle vocab_size properly * doc fix * add note (not a public class) * parametrize * - add bleu scores integration tests * skip test if sacrebleu is not installed * cache heavy models/tokenizers * some tweaks * remove tokens that aren't used * more purging * simplify code * switch to using decoder_start_token_id * add doc * Revert "major refactor (reuse-bart)" This reverts commit `226dad15ca`. 
* decouple from bart * remove unused code #1 * remove unused code #2 * remove unused code #3 * update instructions * clean up * move bleu eval to examples * check import only once * move data+gen script into files * reuse via import * take less space * add prepare_seq2seq_batch (auto-tested) * cleanup * recode test to use json instead of yaml * ignore keys not needed * use the new -y in transformers-cli upload -y * [xlm tok] config dict: fix str into int to match definition (#7034) * [s2s] --eval_max_generate_length (#7018) * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models * sync with changes * s/fsmt-wmt/wmt/ in model names * s/fsmt-wmt/wmt/ in model names (p2) * s/fsmt-wmt/wmt/ in model names (p3) * switch to a better checkpoint * typo * make non-optional args such - adjust tests where possible or skip when there is no other choice * consistency * style * adjust header * cards moved (model rename) * use best custom hparams * update info * remove old cards * cleanup * s/stas/facebook/ * update scores * s/allen_nlp/allenai/ * url maps aren't needed * typo * move all the doc / build /eval generators to their own scripts * cleanup * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix indent * duplicated line * style * use the correct add_start_docstrings * oops * resizing can't be done with the core approach, due to 2 dicts * check that the arg is a list * style * style Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-17 11:31:29 -04:00
Sam Shleifer	45b0b1ff2f	[s2s] fix kwarg typo (#7196)	2020-09-16 21:58:57 -04:00
Sam Shleifer	0203ad43bc	[s2s] distributed eval cleanup (#7186)	2020-09-16 15:38:37 -04:00
sgugger	3babef815c	Formatting	2020-09-16 14:57:09 -04:00

137 Commits