transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-13 17:48:22 +06:00

Author	SHA1	Message	Date
Suraj Patil	06a973fd2a	[s2s] configure lr_scheduler from command line (#7641 )	2020-10-08 13:06:35 -04:00
Sam Shleifer	aba4e22944	[pseudolabels] cleanup markdown table (#7653 )	2020-10-07 23:04:18 -04:00
Sam Shleifer	e2bb9abb6a	[s2s] release pseudolabel links and instructions (#7639 )	2020-10-07 11:20:44 -04:00
Sylvain Gugger	08ba4b4902	Trainer callbacks (#7596 ) * Initial callback proposal * Finish various callbacks * Post-rebase conflicts * Fix tests * Don't use something that's not set * Documentation * Remove unwanted print. * Document all models can work * Add tests + small fixes * Update docs/source/internal/trainer_utils.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Fix TF tests * Real fix this time * This one should work * Fix typo * Really fix typo Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-07 10:50:21 -04:00
Sam Shleifer	500be01c5d	[s2s] save first batch to json for debugging purposes (#6810 )	2020-10-06 16:11:56 -04:00
Sam Shleifer	d5d2744aa7	Support T5 Distillation w/hidden state supervision (#7599 )	2020-10-05 21:31:48 -04:00
Suraj Patil	99cb924bfb	[s2s] add config params like Dropout in Seq2SeqTrainingArguments (#7532 )	2020-10-04 12:42:30 -04:00
Sam Shleifer	9bdce3a4f9	[s2s] fix lockfile and peg distillation constants (#7545 )	2020-10-02 15:58:14 -04:00
Sam Shleifer	de4d7b004a	[s2s] Adafactor support for builtin trainer (#7522 )	2020-10-01 17:27:45 -04:00
Sam Shleifer	d3a9601a11	[s2s] trainer scripts: Remove --run_name, thanks sylvain! (#7521 )	2020-10-01 17:18:47 -04:00
Sylvain Gugger	bdcc4b78a2	Fix seq2seq example test (#7518 ) * Fix seq2seq example test * Fix bad copy-paste * Also save the state	2020-10-01 14:13:29 -04:00
Sam Shleifer	2a358f45ef	[s2s] fix nltk pytest race condition with FileLock (#7515 )	2020-10-01 12:51:09 -04:00
Suraj Patil	72d363d979	[examples/s2s] clean up finetune_trainer (#7509 )	2020-10-01 12:19:29 -04:00
Sam Shleifer	48f23f92a8	[s2sTrainer] test + code cleanup (#7467 )	2020-10-01 00:33:01 -04:00
Sam Shleifer	03e46c1de3	[s2s] fix kwargs style (#7488 )	2020-09-30 17:00:06 -04:00
Sam Shleifer	6fe8a693eb	[s2s] Fix t5 warning for distributed eval (#7487 )	2020-09-30 16:58:03 -04:00
Amanpreet Singh	c031d01023	Seq2SeqDataset: avoid passing src_lang everywhere (#7470 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-30 13:27:48 -04:00
Suraj Patil	08939cfdf7	[s2strainer] fix eval dataset loading (#7477 )	2020-09-30 12:39:13 -04:00
Sam Shleifer	74d8d69bd4	[s2s] consistent output format across eval scripts (#7435 )	2020-09-28 23:20:03 -04:00
Sam Shleifer	748425d47d	[T5] allow config.decoder_layers to control decoder size (#7409 ) * Working assymmetrical T5 * rename decoder_layers -> num_decoder_layers * Fix docstring * Allow creation of asymmetric t5 students	2020-09-28 03:08:04 -04:00
Sam Shleifer	7296fea1d6	[s2s] rougeLSum expects \n between sentences (#7410 ) Co-authored-by: Swetha Mandava <smandava@nvidia.com>	2020-09-27 16:27:19 -04:00
Suraj Patil	eab5f59682	[s2s] add create student script (#7290 ) Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-27 15:10:46 -04:00
Suraj Patil	415071b4c2	doc changes (#7385 )	2020-09-25 08:00:36 -04:00
Suraj Patil	9e68d075a4	Seq2SeqTrainer (#6769 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-24 18:46:58 -04:00
Sam Shleifer	d9d0f1140b	[s2s] distributed eval allows num_return_sequences > 1 (#7254 )	2020-09-24 17:30:09 -04:00
Stas Bekman	eadd870b2f	[seq2seq] make it easier to run the scripts (#7274 )	2020-09-24 15:23:48 -04:00
Sam Shleifer	78387cc63e	[s2s] only save metrics.json from rank zero (#7331 )	2020-09-22 18:27:28 -04:00
Sam Shleifer	e53138a1b9	[s2s] add src_lang kwarg for distributed eval (#7300 )	2020-09-22 18:26:37 -04:00
Sam Shleifer	25b0463d0b	[s2s] add supported architecures to MD (#7252 )	2020-09-22 13:09:35 -04:00
Sam Shleifer	656c27c3a3	[s2s] save hostname with repo info (#7301 ) * save hostname	2020-09-21 17:26:24 -04:00
Stas Bekman	af4b98ed97	[s2s] adjust finetune + test to work with fsmt (#7263 )	2020-09-21 15:13:19 -04:00
Stas Bekman	8d562a2d1a	[s2s] s/alpha_loss_encoder/alpha_encoder_loss/ (#7298 ) fix to match `distillation.py: self.alpha_encoder_loss`	2020-09-21 14:14:26 -04:00
Stas Bekman	cbb2f75a16	[s2s tests] fix test_run_eval_search (#7297 )	2020-09-21 14:00:40 -04:00
Stas Bekman	7cbf0f722d	examples/seq2seq/__init__.py mutates sys.path (#7194 )	2020-09-20 16:54:42 -04:00
Sam Shleifer	83dba10b8f	[s2s] distributed_eval.py saves better speed info (#7242 )	2020-09-18 15:46:01 -04:00
Sam Shleifer	67d9fc50d9	[s2s] remove double assert (#7223 )	2020-09-17 18:32:31 -04:00
Sam Shleifer	a5638b2b3a	[s2s] dynamic batch size with --max_tokens_per_batch (#7030 )	2020-09-17 15:19:34 -04:00
Stas Bekman	efeab6a3f1	[s2s] run_eval/run_eval_search tweaks (#7192 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-17 14:26:38 -04:00
Stas Bekman	1eeb206bef	[ported model] FSMT (FairSeq MachineTranslation) (#6940 ) * ready for PR * cleanup * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST * fix * perfectionism * revert change from another PR * odd, already committed this one * non-interactive upload workaround * backup the failed experiment * store langs in config * workaround for localizing model path * doc clean up as in https://github.com/huggingface/transformers/pull/6956 * style * back out debug mode * document: run_eval.py --num_beams 10 * remove unneeded constant * typo * re-use bart's Attention * re-use EncoderLayer, DecoderLayer from bart * refactor * send to cuda and fp16 * cleanup * revert (moved to another PR) * better error message * document run_eval --num_beams * solve the problem of tokenizer finding the right files when model is local * polish, remove hardcoded config * add a note that the file is autogenerated to avoid losing changes * prep for org change, remove unneeded code * switch to model4.pt, update scores * s/python/bash/ * missing init (but doesn't impact the finetuned model) * cleanup * major refactor (reuse-bart) * new model, new expected weights * cleanup * cleanup * full link * fix model type * merge porting notes * style * cleanup * have to create a DecoderConfig object to handle vocab_size properly * doc fix * add note (not a public class) * parametrize * - add bleu scores integration tests * skip test if sacrebleu is not installed * cache heavy models/tokenizers * some tweaks * remove tokens that aren't used * more purging * simplify code * switch to using decoder_start_token_id * add doc * Revert "major refactor (reuse-bart)" This reverts commit `226dad15ca`. 
* decouple from bart * remove unused code #1 * remove unused code #2 * remove unused code #3 * update instructions * clean up * move bleu eval to examples * check import only once * move data+gen script into files * reuse via import * take less space * add prepare_seq2seq_batch (auto-tested) * cleanup * recode test to use json instead of yaml * ignore keys not needed * use the new -y in transformers-cli upload -y * [xlm tok] config dict: fix str into int to match definition (#7034) * [s2s] --eval_max_generate_length (#7018) * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models * sync with changes * s/fsmt-wmt/wmt/ in model names * s/fsmt-wmt/wmt/ in model names (p2) * s/fsmt-wmt/wmt/ in model names (p3) * switch to a better checkpoint * typo * make non-optional args such - adjust tests where possible or skip when there is no other choice * consistency * style * adjust header * cards moved (model rename) * use best custom hparams * update info * remove old cards * cleanup * s/stas/facebook/ * update scores * s/allen_nlp/allenai/ * url maps aren't needed * typo * move all the doc / build /eval generators to their own scripts * cleanup * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix indent * duplicated line * style * use the correct add_start_docstrings * oops * resizing can't be done with the core approach, due to 2 dicts * check that the arg is a list * style * style Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-17 11:31:29 -04:00
Sam Shleifer	45b0b1ff2f	[s2s] fix kwarg typo (#7196 )	2020-09-16 21:58:57 -04:00
Sam Shleifer	0203ad43bc	[s2s] distributed eval cleanup (#7186 )	2020-09-16 15:38:37 -04:00
sgugger	3babef815c	Formatting	2020-09-16 14:57:09 -04:00
Stas Bekman	fdaf8ab349	[s2s run_eval] new features (#7109 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-16 13:59:57 -04:00
Sam Shleifer	33d479d2b2	[s2s] distributed eval in one command (#7124 )	2020-09-14 15:57:56 -04:00
Sam Shleifer	0fab39695a	[s2s distill] allow pegasus-12-12 (#7104 )	2020-09-14 00:03:59 -04:00
Sam Shleifer	de9e297964	[s2s] distributed eval cleanup (#7110 )	2020-09-13 23:40:38 -04:00
Sam Shleifer	e7f8d2ab64	[s2s] two stage run_distributed_eval.py (#7105 )	2020-09-13 17:28:18 -04:00
Sam Shleifer	b76cb1c3df	[s2s] run_eval supports --prefix clarg. (#6953 )	2020-09-12 01:08:21 -04:00
Sam Shleifer	77950c485a	[wip/s2s] DistributedSortishSampler (#7056 )	2020-09-10 15:23:44 -04:00
Sylvain Gugger	514486739c	Fix CI with change of name of nlp (#7054 ) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last	2020-09-10 14:51:08 -04:00
Sam Shleifer	e9a2f772bc	[s2s] --eval_max_generate_length (#7018 )	2020-09-10 14:11:34 -04:00
Sam Shleifer	ce37be9d94	[s2s] warn if --fp16 for torch 1.6 (#6977 )	2020-09-06 20:41:29 -04:00
Sam Shleifer	a4fc0c80b1	[s2s] run_eval.py parses generate_kwargs (#6948 )	2020-09-04 14:19:31 -04:00
Sam Shleifer	6078b12098	[s2s] distill: --normalize_hidden --supervise_forward (#6834 )	2020-09-04 14:05:56 -04:00
Sam Shleifer	e95d262f25	[s2s] support early stopping based on loss, rather than rouge (#6927 )	2020-09-03 17:31:35 -04:00
Sam Shleifer	207ed8cb78	[s2s] use --eval_beams command line arg (#6926 )	2020-09-03 12:42:09 -04:00
Sam Shleifer	39ed68d597	[s2s] allow task_specific_params=summarization_xsum (#6923 )	2020-09-03 11:11:40 -04:00
Sam Shleifer	5a318f075a	[s2s]: script to convert pl checkpoints to hf checkpoints (#6911 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-03 09:47:00 -04:00
brett koonce	b8e4906c97	tweak tar command in readme (#6919 )	2020-09-03 09:29:01 -04:00
Sam Shleifer	b9772897ec	[s2s] command line args for faster val steps (#6833 )	2020-08-31 16:16:10 -04:00
Sam Shleifer	61b7ba93f5	Marian distill scripts + integration test (#6799 )	2020-08-31 13:48:26 -04:00
Sam Shleifer	dfa10a41ba	[s2s README] Add more dataset download instructions (#6737 )	2020-08-30 16:29:24 -04:00
Sam Shleifer	0f58903bb6	Pegasus finetune script: add --adafactor (#6811 )	2020-08-29 17:43:32 -04:00
Sam Shleifer	ac47458a02	[s2s] round runtime in run_eval (#6798 )	2020-08-29 17:36:31 -04:00
Sam Shleifer	5ab21b072f	[s2s] Test hub configs in self-scheduled CI (#6809 )	2020-08-28 17:05:52 -04:00
Sam Shleifer	9336086ab5	prepare_seq2seq_batch makes labels/ decoder_input_ids made later. (#6654 ) * broken test * batch parity * tests pass * boom boom * boom boom * split out bart tokenizer tests * fix tests * boom boom * Fixed dataset bug * Fix marian * Undo extra * Get marian working * Fix t5 tok tests * Test passing * Cleanup * better assert msg * require torch * Fix mbart tests * undo extra decoder_attn_mask change * Fix import * pegasus tokenizer can ignore src_lang kwargs * unused kwarg test cov * boom boom * add todo for pegasus issue * cover one word translation edge case * Cleanup * doc	2020-08-28 11:15:17 -04:00
Sam Shleifer	fb78a90d6a	PL: --adafactor option (#6776 )	2020-08-27 22:19:46 -04:00
Sam Shleifer	4bd7be9a42	s2s distillation uses AutoModelForSeqToSeqLM (#6761 )	2020-08-26 23:25:11 -04:00
Sam Shleifer	61518e2df3	[s2s] run_eval.py QOL improvements and cleanup(#6746 )	2020-08-26 18:59:20 -04:00
Lysandre	a75c64d80c	Black 20 release	2020-08-26 17:20:22 +02:00
Sam Shleifer	0344428f79	[s2s] round bleu, rouge to 4 digits (#6704 )	2020-08-25 00:33:11 -04:00
Sylvain Gugger	a573777901	Update repo to isort v5 (#6686 ) * Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks	2020-08-24 11:03:01 -04:00
Sam Shleifer	d2da2cb232	allow spaces in bash args with "$@" (#6521 )	2020-08-17 09:06:35 -04:00
Sam Shleifer	84c265ffcc	[lightning_base] fix s2s logging, only make train_loader once (#6404 )	2020-08-16 22:49:41 -04:00
Sam Shleifer	72add6c98f	[s2s] docs, document desired filenames nicely (#6525 )	2020-08-16 20:31:22 -04:00
Kyle Piira	2060181126	Fixes paths with spaces in seq2seq example (#6493 )	2020-08-16 13:36:38 -04:00
Sam Shleifer	e92efcf728	Mult rouge by 100: standard units (#6359 )	2020-08-13 12:15:54 -04:00
Sam Shleifer	f94a52cd79	[s2s] add BartTranslationDistiller for distilling mBART (#6363 )	2020-08-12 11:41:04 -04:00
Stas Bekman	87b359439f	[test] replace capsys with the more refined CaptureStderr/CaptureStdout (#6422 ) * replace capsys with the more refined CaptureStderr/CaptureStdout * Update examples/seq2seq/test_seq2seq_examples.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-12 07:54:28 -04:00
Sam Shleifer	be1520d3a3	rename prepare_translation_batch -> prepare_seq2seq_batch (#6103 )	2020-08-11 15:57:07 -04:00
Sam Shleifer	66fa8ceaea	PegasusForConditionalGeneration (torch version) (#6340 ) Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>	2020-08-11 14:31:23 -04:00
Stas Bekman	f6cb0f806e	[s2s] wmt download script use less ram (#6405 )	2020-08-11 12:04:17 -04:00
Sam Shleifer	b9ecd92ee4	[s2s] Script to save wmt data to disk (#6403 )	2020-08-10 22:49:39 -04:00
Stas Bekman	0830e79512	the test now works again (#6371 )	2020-08-10 02:55:52 -04:00
Suraj Patil	9bed355449	[s2s] fix label_smoothed_nll_loss (#6344 )	2020-08-08 04:21:12 -04:00
Sam Shleifer	99f73bcc71	[s2s] tiny QOL improvement: run_eval prints scores (#6341 )	2020-08-08 02:45:55 -04:00
Stas Bekman	175cd45e13	fix the shuffle agrument usage and the default (#6307 )	2020-08-06 20:32:28 -04:00
Sam Shleifer	2804fff839	[s2s]Use prepare_translation_batch for Marian finetuning (#6293 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-06 14:58:38 -04:00
Stas Bekman	376c02e9a9	[WIP] lightning_base: support --lr_scheduler with multiple possibilities (#6232 ) * support --lr_scheduler with multiple possibilities * correct the error message * add a note about supported schedulers * cleanup * cleanup2 * needs the argument default * style * add another assert in the test * implement requested changes * cleanups * fix relative import * cleanup	2020-08-05 09:01:17 -04:00
Sam Shleifer	57eb1cb68d	[s2s] Document better mbart finetuning command (#6229 ) * Document better MT command * improve multigpu command	2020-08-03 18:22:31 -04:00
Sam Shleifer	b6b2f2270f	s2s: fix LR logging, remove some dead code. (#6205 )	2020-08-03 10:36:26 -04:00
Stas Bekman	d8dbf3b75d	[s2s] clean up + doc (#6184 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-01 14:51:07 -04:00
Sylvain Gugger	91cb95461e	Switch from return_tuple to return_dict (#6138 ) * Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Plu <plu.julien@gmail.com>	2020-07-30 09:17:00 -04:00
Stas Bekman	3212b8850d	[s2s] add support for overriding config params (#6149 )	2020-07-30 01:09:46 -04:00
Sam Shleifer	dafa296c95	[s2s] Delete useless method, log tokens_per_batch (#6081 )	2020-07-28 11:24:23 -04:00
Stas Bekman	f0c70085c2	link to README.md (#6068 ) * add a link to README.md * Update README.md	2020-07-28 20:34:58 +08:00
Sam Shleifer	3c7fbf35a6	MBART: support summarization tasks where max_src_len > max_tgt_len (#6003 ) * MBART: support summarization tasks * fix test * Style * add tokenizer test	2020-07-28 08:18:11 -04:00
Sam Shleifer	7a68d40138	[s2s] Don't mention packed data in README (#6079 )	2020-07-27 20:07:21 -04:00
Sam Shleifer	1e00ef681d	[s2s] dont document packing because it hurts performance (#6077 )	2020-07-27 18:26:00 -04:00
Sam Shleifer	11792d7826	CL util to convert models to fp16 before upload (#5953 )	2020-07-27 12:21:25 -04:00
Sam Shleifer	4302ace5bd	[pack_dataset] don't sort before packing, only pack train (#5954 )	2020-07-27 12:14:23 -04:00
Suraj Patil	d1d15d6f2d	[examples (seq2seq)] fix preparing decoder_input_ids for T5 (#5994 )	2020-07-27 10:10:43 -04:00
Sam Shleifer	c69ea5efc4	[CI] Don't test apex (#6021 )	2020-07-24 15:34:16 -04:00
Sam Shleifer	c3206eef44	[test] partial coverage for train_mbart_enro_cc25.sh (#5976 )	2020-07-22 14:34:49 -04:00
Sam Shleifer	9dab39feea	seq2seq/run_eval.py can take decoder_start_token_id (#5949 )	2020-07-21 16:58:45 -04:00
Sam Shleifer	5b193b39b0	[examples/seq2seq]: add --label_smoothing option (#5919 )	2020-07-21 16:51:39 -04:00
Sam Shleifer	95d1962b9c	[Doc] explaining romanian postprocessing for MBART BLEU hacking (#5943 )	2020-07-21 14:12:48 -04:00
Aditya Soni	ccbf74a685	typos in seq2seq/readme (#5937 )	2020-07-21 09:44:59 -04:00
Sam Shleifer	f1a4e06f1f	[Fix] seq2seq pack_dataset.py actually packs (#5913 ) Huge MT speedup!	2020-07-20 15:18:26 -04:00
Sam Shleifer	09a2f40684	Seq2SeqDataset uses linecache to save memory by @Pradhy729 (#5792 ) Co-authored-by: Pradhy729 <49659913+Pradhy729@users.noreply.github.com>	2020-07-18 13:57:33 -04:00
Sam Shleifer	dad5e12e54	[seq2seq] distillation.py accepts trainer arguments (#5865 )	2020-07-18 07:43:57 -04:00
Sam Shleifer	ba2400189b	[seq2seq] MAX_LEN env var for MT commands (#5837 )	2020-07-17 22:51:31 -04:00
Nathan Raw	529850ae7b	Lightning Updates for v0.8.5 (#5798 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-07-17 22:43:06 -04:00
Sam Shleifer	e238e3d55a	[seq2seq] Don't copy self.source in sortishsampler (#5818 )	2020-07-17 01:53:25 -04:00
Sam Shleifer	283500ff9f	[seq2seq] pack_dataset.py rewrites dataset in max_tokens format (#5819 )	2020-07-16 14:06:49 -04:00
Sam Shleifer	1a647abf0b	[fix] check code quality (#5772 )	2020-07-15 14:59:38 -04:00
Sam Shleifer	d0486c8bc2	[cleanup] T5 test, warnings (#5761 )	2020-07-15 08:23:22 -04:00
Sam Shleifer	353b8f1e7a	Add mbart-large-cc25, support translation finetuning (#5129 ) improve unittests for finetuning, especially w.r.t testing frozen parameters fix freeze_embeds for T5 add streamlit setup.cfg	2020-07-07 13:23:01 -04:00
Sam Shleifer	13deb95a40	Move tests/utils.py -> transformers/testing_utils.py (#5350 )	2020-07-01 10:31:17 -04:00
Sam Shleifer	27a7fe7a8d	examples/seq2seq: never override $WANDB_PROJECT (#5407 )	2020-06-30 15:29:13 -04:00
Kevin Canwen Xu	331d8d2936	Upload DistilBART artwork (#5394 )	2020-06-30 18:11:11 +08:00
MichaelJanz	9a473f1e43	Update Bertabs example to work again (#5355 ) * Fix the bug 'Attempted relative import with no known parent package' when using the bertabs example. Also change the used model from bertabs-finetuned-cnndm, since it seems not be accessible anymore * Update run_summarization.py Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-06-30 14:05:01 +08:00
Sam Shleifer	a316a6aaa8	[seq2seq docs] Move evaluation down, fix typo (#5365 )	2020-06-29 10:36:04 -04:00
Sam Shleifer	45e26125de	save_pretrained: mkdir(exist_ok=True) (#5258 ) * all save_pretrained methods mkdir if not os.path.exists	2020-06-28 14:53:47 -04:00
Sam Shleifer	393b8dc09a	examples/seq2seq/run_eval.py fixes and docs (#5322 )	2020-06-26 19:20:43 -04:00
Sam Shleifer	5543b30aa6	[pl_examples] default warmup steps=0 (#5316 )	2020-06-26 15:03:41 -04:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Sam Shleifer	e008d520bb	[examples/seq2seq] more README improvements (#5274 )	2020-06-25 10:13:01 -04:00
Sam Shleifer	40457bcebb	examples/seq2seq supports translation (#5202 )	2020-06-24 23:58:11 -04:00

229 Commits