transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 11:08:23 +06:00

Author	SHA1	Message	Date
Sam Shleifer	25b0463d0b	[s2s] add supported architecures to MD (#7252 )	2020-09-22 13:09:35 -04:00
Ola Piktus	c754c41c61	RAG (#6813 ) * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * path fix * Formatting / renaming prior to actual work * added rag WIP * Formatting / renaming prior to actual work * First commit * improve comments * Retrieval evaluation scripts * refactor to include modeling outputs + MPI retriever * Fix rag-token model + refactor * Various fixes + finetuning logic * use_bos fix * Retrieval refactor * Finetuning refactoring and cleanup * Add documentation and cleanup * Remove set_up_rag_env.sh file * Fix retrieval wit HF index * Fix import errors * Fix quality errors * Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867 * fix quality * Fix RAG Sequence generation * minor cleanup plus initial tests * fix test * fix tests 2 * Comments fix * post-merge fixes * Improve readme + post-rebase refactor * Extra dependencied for tests * Fix tests * Fix tests 2 * Refactor test requirements * Fix tests 3 * Post-rebase refactor * rename nlp->datasets * RAG integration tests * add tokenizer to slow integration test and allow retriever to run on cpu * add tests; fix position ids warning * change structure * change structure * add from encoder generator * save working solution * make all integration tests pass * add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained * don't save paths * delete unnecessary imports * pass config to AutoTokenizer.from_pretrained for Rag tokenizers * init wiki_dpr only once * hardcode legacy index and passages paths (todo: add the right urls) * finalize config * finalize retriver api and config api * LegacyIndex index download refactor * add dpr to autotokenizer * make from pretrained more flexible * fix ragfortokengeneration * small name changes in tokenizer * add labels to models * change default index name * add retrieval tests * finish token 
generate * align test with previous version and make all tests pass * add tests * finalize tests * implement thoms suggestions * add first version of test * make first tests work * make retriever platform agnostic * naming * style * add legacy index URL * docstrings + simple retrieval test for distributed * clean model api * add doc_ids to retriever's outputs * fix retrieval tests * finish model outputs * finalize model api * fix generate problem for rag * fix generate for other modles * fix some tests * save intermediate * set generate to default * big refactor generate * delete rag_api * correct pip faiss install * fix auto tokenization test * fix faiss install * fix test * move the distributed logic to examples * model page * docs * finish tests * fix dependencies * fix import in __init__ * Refactor eval_rag and finetune scripts * start docstring * add psutil to test * fix tf test * move require torch to top * fix retrieval test * align naming * finish automodel * fix repo consistency * test ragtokenizer save/load * add rag model output docs * fix ragtokenizer save/load from pretrained * fix tokenizer dir * remove torch in retrieval * fix docs * fixe finetune scripts * finish model docs * finish docs * remove auto model for now * add require torch * remove solved todos * integrate sylvains suggestions * sams comments * correct mistake on purpose * improve README * Add generation test cases * fix rag token * clean token generate * fix test * add note to test * fix attention mask * add t5 test for rag * Fix handling prefix in finetune.py * don't overwrite index_name Co-authored-by: Patrick Lewis <plewis@fb.com> Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair> Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair> Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair> Co-authored-by: Your Name <you@example.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	
2020-09-22 18:29:58 +02:00
Julien Plu	585217c87f	Add generic text classification example in TF (#5716 ) * Add new example with nlp * Update README * replace nlp by datasets * Update examples/text-classification/README.md Add Lysandre's suggestion. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-22 12:05:05 -04:00
Sam Shleifer	656c27c3a3	[s2s] save hostname with repo info (#7301 ) * save hostname	2020-09-21 17:26:24 -04:00
Stas Bekman	af4b98ed97	[s2s] adjust finetune + test to work with fsmt (#7263 )	2020-09-21 15:13:19 -04:00
Stas Bekman	8d562a2d1a	[s2s] s/alpha_loss_encoder/alpha_encoder_loss/ (#7298 ) fix to match `distillation.py: self.alpha_encoder_loss`	2020-09-21 14:14:26 -04:00
Stas Bekman	cbb2f75a16	[s2s tests] fix test_run_eval_search (#7297 )	2020-09-21 14:00:40 -04:00
Lysandre	aae4edb5f0	Addressing review comment	2020-09-21 11:37:00 +02:00
Suraj Patil	43b9d93875	[example/glue] fix compute_metrics_fn for bart like models (#7248 ) * fix compute_metrics_fn * p.predictions -> preds * apply suggestions	2020-09-21 05:34:20 -04:00
Stas Bekman	7cbf0f722d	examples/seq2seq/__init__.py mutates sys.path (#7194 )	2020-09-20 16:54:42 -04:00
Sam Shleifer	83dba10b8f	[s2s] distributed_eval.py saves better speed info (#7242 )	2020-09-18 15:46:01 -04:00
Stefan Schweter	ee9eae4e06	token-classification: update url of GermEval 2014 dataset (#6571 )	2020-09-18 06:18:06 -04:00
Sam Shleifer	67d9fc50d9	[s2s] remove double assert (#7223 )	2020-09-17 18:32:31 -04:00
Sam Shleifer	a5638b2b3a	[s2s] dynamic batch size with --max_tokens_per_batch (#7030 )	2020-09-17 15:19:34 -04:00
Stas Bekman	efeab6a3f1	[s2s] run_eval/run_eval_search tweaks (#7192 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-17 14:26:38 -04:00
Stas Bekman	1eeb206bef	[ported model] FSMT (FairSeq MachineTranslation) (#6940 ) * ready for PR * cleanup * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST * fix * perfectionism * revert change from another PR * odd, already committed this one * non-interactive upload workaround * backup the failed experiment * store langs in config * workaround for localizing model path * doc clean up as in https://github.com/huggingface/transformers/pull/6956 * style * back out debug mode * document: run_eval.py --num_beams 10 * remove unneeded constant * typo * re-use bart's Attention * re-use EncoderLayer, DecoderLayer from bart * refactor * send to cuda and fp16 * cleanup * revert (moved to another PR) * better error message * document run_eval --num_beams * solve the problem of tokenizer finding the right files when model is local * polish, remove hardcoded config * add a note that the file is autogenerated to avoid losing changes * prep for org change, remove unneeded code * switch to model4.pt, update scores * s/python/bash/ * missing init (but doesn't impact the finetuned model) * cleanup * major refactor (reuse-bart) * new model, new expected weights * cleanup * cleanup * full link * fix model type * merge porting notes * style * cleanup * have to create a DecoderConfig object to handle vocab_size properly * doc fix * add note (not a public class) * parametrize * - add bleu scores integration tests * skip test if sacrebleu is not installed * cache heavy models/tokenizers * some tweaks * remove tokens that aren't used * more purging * simplify code * switch to using decoder_start_token_id * add doc * Revert "major refactor (reuse-bart)" This reverts commit `226dad15ca`. 
* decouple from bart * remove unused code #1 * remove unused code #2 * remove unused code #3 * update instructions * clean up * move bleu eval to examples * check import only once * move data+gen script into files * reuse via import * take less space * add prepare_seq2seq_batch (auto-tested) * cleanup * recode test to use json instead of yaml * ignore keys not needed * use the new -y in transformers-cli upload -y * [xlm tok] config dict: fix str into int to match definition (#7034) * [s2s] --eval_max_generate_length (#7018) * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models * sync with changes * s/fsmt-wmt/wmt/ in model names * s/fsmt-wmt/wmt/ in model names (p2) * s/fsmt-wmt/wmt/ in model names (p3) * switch to a better checkpoint * typo * make non-optional args such - adjust tests where possible or skip when there is no other choice * consistency * style * adjust header * cards moved (model rename) * use best custom hparams * update info * remove old cards * cleanup * s/stas/facebook/ * update scores * s/allen_nlp/allenai/ * url maps aren't needed * typo * move all the doc / build /eval generators to their own scripts * cleanup * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix indent * duplicated line * style * use the correct add_start_docstrings * oops * resizing can't be done with the core approach, due to 2 dicts * check that the arg is a list * style * style Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-09-17 11:31:29 -04:00
RafaelWO	709745927b	Transformer-XL: Remove unused parameters (#7087 ) * Removed 'tgt_len' and 'ext_len' from Transfomer-XL * Some changes are still to be done * Removed 'tgt_len' and 'ext_len' from Transfomer-XL (2) * Removed comments * Fixed quality * Changed warning to info	2020-09-17 06:10:34 -04:00
Sam Shleifer	45b0b1ff2f	[s2s] fix kwarg typo (#7196 )	2020-09-16 21:58:57 -04:00
Sam Shleifer	0203ad43bc	[s2s] distributed eval cleanup (#7186 )	2020-09-16 15:38:37 -04:00
sgugger	3babef815c	Formatting	2020-09-16 14:57:09 -04:00
Stas Bekman	fdaf8ab349	[s2s run_eval] new features (#7109 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-09-16 13:59:57 -04:00
Stas Bekman	b0cbcdb05b	[logging] remove no longer needed verbosity override (#7100 )	2020-09-15 04:01:14 -04:00
Sam Shleifer	33d479d2b2	[s2s] distributed eval in one command (#7124 )	2020-09-14 15:57:56 -04:00
Antonio V Mendoza	e0e0675ac7	Demoing LXMERT with raw images by incorporating the FRCNN model for roi-pooled extraction and bounding-box predction on the GQA answer set. (#6986 ) * adding demo * Update examples/lxmert/requirements.txt Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update examples/lxmert/checkpoint.sh Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * added user input for .py demo * updated model loading, data extrtaction, checkpoints, and lots of other automation * adding normalizing for bounding boxes * Update requirements.txt * some optimizations for extracting data * added data extracting file * added data extraction file * minor fixes to reqs and readme * Style * remove options Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-09-14 10:07:04 -04:00
Stas Bekman	3ca1874ca4	[examples testing] restore code (#7099 ) For some reason https://github.com/huggingface/transformers/pull/5512 re-added temp dir creation code that was removed by https://github.com/huggingface/transformers/pull/6494 defeating the purpose of that PR for those tests.	2020-09-14 08:54:23 -04:00
Lysandre Debut	bb3106f741	Temporarily skip failing tests due to dependency change (#7118 ) * Temporarily skip failing tests due to dependency change * Remove trace	2020-09-14 07:42:13 -04:00
Sam Shleifer	0fab39695a	[s2s distill] allow pegasus-12-12 (#7104 )	2020-09-14 00:03:59 -04:00
Sam Shleifer	de9e297964	[s2s] distributed eval cleanup (#7110 )	2020-09-13 23:40:38 -04:00
Sam Shleifer	e7f8d2ab64	[s2s] two stage run_distributed_eval.py (#7105 )	2020-09-13 17:28:18 -04:00
Sam Shleifer	b76cb1c3df	[s2s] run_eval supports --prefix clarg. (#6953 )	2020-09-12 01:08:21 -04:00
Sam Shleifer	77950c485a	[wip/s2s] DistributedSortishSampler (#7056 )	2020-09-10 15:23:44 -04:00
Sylvain Gugger	514486739c	Fix CI with change of name of nlp (#7054 ) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last	2020-09-10 14:51:08 -04:00
Sam Shleifer	e9a2f772bc	[s2s] --eval_max_generate_length (#7018 )	2020-09-10 14:11:34 -04:00
Manuel Romero	1b76936d1a	Fix typo (#6994 )	2020-09-08 04:22:57 -04:00
Lysandre	1650130b0f	Remove misleading docstring	2020-09-07 14:16:59 +02:00
Boris Dayma	995a958dd1	feat: allow prefix for any generative model (#5885 ) * feat: allow padding_text for any generative model * docs(pipelines.py): correct typo * Update src/transformers/pipelines.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * feat: rename padding_text to prefix * fix: cannot tokenize empty text * fix: pass prefix arg to pipeline * test: add prefix to text-generetation pipeline * style: fix style * style: clean code and variable name more explicit * set arg docstring to optional Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-07 03:03:45 -04:00
Sam Shleifer	ce37be9d94	[s2s] warn if --fp16 for torch 1.6 (#6977 )	2020-09-06 20:41:29 -04:00
Stas Bekman	48ff6d5109	[doc] remove the implied defaults to :obj:`None`, s/True/ :obj:`True/, etc. (#6956 ) * remove the implied defaults to :obj:`None` * fix bug in the original * replace to :obj:`True`, :obj:`False`	2020-09-04 18:22:25 -04:00
Sam Shleifer	a4fc0c80b1	[s2s] run_eval.py parses generate_kwargs (#6948 )	2020-09-04 14:19:31 -04:00
Sam Shleifer	6078b12098	[s2s] distill: --normalize_hidden --supervise_forward (#6834 )	2020-09-04 14:05:56 -04:00
Sam Shleifer	e95d262f25	[s2s] support early stopping based on loss, rather than rouge (#6927 )	2020-09-03 17:31:35 -04:00
Sam Shleifer	207ed8cb78	[s2s] use --eval_beams command line arg (#6926 )	2020-09-03 12:42:09 -04:00
Sam Shleifer	39ed68d597	[s2s] allow task_specific_params=summarization_xsum (#6923 )	2020-09-03 11:11:40 -04:00
Sam Shleifer	5a318f075a	[s2s]: script to convert pl checkpoints to hf checkpoints (#6911 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-09-03 09:47:00 -04:00
brett koonce	b8e4906c97	tweak tar command in readme (#6919 )	2020-09-03 09:29:01 -04:00
Jin Young (Daniel) Sohn	21d719238c	Add cache_dir to save features TextDataset (#6879 ) * Add cache_dir to save features TextDataset This is in case the dataset is in a RO filesystem, for which is the case in tests (GKE TPU tests). * style	2020-09-01 11:42:17 -04:00
Sam Shleifer	431ab19d7a	[fix] typo in available in helper function (#6859 )	2020-08-31 17:59:34 -04:00
Sam Shleifer	b9772897ec	[s2s] command line args for faster val steps (#6833 )	2020-08-31 16:16:10 -04:00
Sam Shleifer	61b7ba93f5	Marian distill scripts + integration test (#6799 )	2020-08-31 13:48:26 -04:00
Sam Shleifer	dfa10a41ba	[s2s README] Add more dataset download instructions (#6737 )	2020-08-30 16:29:24 -04:00
xujiaze13	32fe44086c	clearly indicate shuffle=False (#6312 ) * Clarify shuffle * clarify shuffle Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-08-30 19:26:10 +08:00
Sam Shleifer	0f58903bb6	Pegasus finetune script: add --adafactor (#6811 )	2020-08-29 17:43:32 -04:00
Sam Shleifer	ac47458a02	[s2s] round runtime in run_eval (#6798 )	2020-08-29 17:36:31 -04:00
Sam Shleifer	5ab21b072f	[s2s] Test hub configs in self-scheduled CI (#6809 )	2020-08-28 17:05:52 -04:00
Sam Shleifer	9336086ab5	prepare_seq2seq_batch makes labels/ decoder_input_ids made later. (#6654 ) * broken test * batch parity * tests pass * boom boom * boom boom * split out bart tokenizer tests * fix tests * boom boom * Fixed dataset bug * Fix marian * Undo extra * Get marian working * Fix t5 tok tests * Test passing * Cleanup * better assert msg * require torch * Fix mbart tests * undo extra decoder_attn_mask change * Fix import * pegasus tokenizer can ignore src_lang kwargs * unused kwarg test cov * boom boom * add todo for pegasus issue * cover one word translation edge case * Cleanup * doc	2020-08-28 11:15:17 -04:00
Sam Shleifer	fb78a90d6a	PL: --adafactor option (#6776 )	2020-08-27 22:19:46 -04:00
Tom Grek	c225e872ed	Fix it to work with BART (#6756 )	2020-08-27 09:04:50 -04:00
Julien Plu	6f289dc97a	Fix the TF Trainer gradient accumulation and the TF NER example (#6713 ) * Align TF NER example over the PT one * Fix Dataset call * Fix gradient accumulation training * Apply style * Address Sylvain's comments * Address Sylvain's comments * Apply style	2020-08-27 08:45:34 -04:00
Sam Shleifer	4bd7be9a42	s2s distillation uses AutoModelForSeqToSeqLM (#6761 )	2020-08-26 23:25:11 -04:00
Sam Shleifer	61518e2df3	[s2s] run_eval.py QOL improvements and cleanup(#6746 )	2020-08-26 18:59:20 -04:00
Lysandre	a75c64d80c	Black 20 release	2020-08-26 17:20:22 +02:00
Joel Hanson	4db2fa77d7	Allow tests in examples to use cuda or fp16,if they are available (#5512 ) * Allow tests in examples to use cuda or fp16,if they are available The tests in examples didn't use the cuda or fp16 even if they where available. - The text classification example (`run_glue.py`) didn't use the fp16 even if it was available but the device was take based on the availablity(cuda/cpu). - The language-modeling example (`run_language_modeling.py`) was having `--no_cuda` argument which made the test to work without cuda. This example is having issue when running with fp16 thus it not enabled (got an assertion error for perplexity due to it higher value). - The cuda and fp16 is not enabled for question-answering example (`run_squad.py`) as it is having a difference in the f1 score. - The text-generation example (`run_generation.py`) will take the cuda or fp16 whenever it is available. Resolves some of: #5057 * Unwanted import of is_apex_available was removed * Made changes to test examples file to have the pass --fp16 only if cuda and apex is avaliable - run_glue.py: Removed the check for cuda and fp16. - run_generation.py: Removed the check for cuda and fp16 also removed unwanted flag creation. * Incorrectly sorted imports fixed * The model needs to be converted to half precision * Formatted single line if condition statement to multiline * The torch_device also needed to be checked before running the test on examples - The tests in examples which uses cuda should also depend from the USE_CUDA flag, similarly to the rest of the test suite. 
Even if we decide to set USE_CUDA to True by default, setting USE_CUDA to False should result in the examples not using CUDA * Format some of the code in test_examples file * The improper import of is_apex_available was sorted * Formatted the code to keep the style standards * The comma at the end of list giving a flake8 issue was fixed * Import sort was fixed * Removed the clean_test_dir function as its not used right now	2020-08-25 06:02:07 -04:00
Sam Shleifer	0344428f79	[s2s] round bleu, rouge to 4 digits (#6704 )	2020-08-25 00:33:11 -04:00
vblagoje	dd522da004	Fix PL token classification examples (#6682 )	2020-08-24 11:30:06 -04:00
Sylvain Gugger	a573777901	Update repo to isort v5 (#6686 ) * Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks	2020-08-24 11:03:01 -04:00
Suraj Patil	6f972e1423	update xnli-mt url (#6580 )	2020-08-18 13:10:47 -04:00
Sam Shleifer	d2da2cb232	allow spaces in bash args with "$@" (#6521 )	2020-08-17 09:06:35 -04:00
Stas Bekman	9dbe4094f2	[testing] a new TestCasePlus subclass + get_auto_remove_tmp_dir() (#6494 ) * [testing] switch to a new TestCasePlus + get_auto_remove_tmp_dir() for auto-removal of tmp dirs * respect after=True for tempfile, simplify code * comments * comment fix * put `before` last in args, so can make debug even faster	2020-08-17 08:12:19 -04:00
Sam Shleifer	84c265ffcc	[lightning_base] fix s2s logging, only make train_loader once (#6404 )	2020-08-16 22:49:41 -04:00
Sam Shleifer	72add6c98f	[s2s] docs, document desired filenames nicely (#6525 )	2020-08-16 20:31:22 -04:00
Kyle Piira	2060181126	Fixes paths with spaces in seq2seq example (#6493 )	2020-08-16 13:36:38 -04:00
Kevin Canwen Xu	eb613b566a	Use hash to clean the test dirs (#6475 ) * Use hash to clean the test dirs * Use hash to clean the test dirs * Use hash to clean the test dirs * fix	2020-08-14 15:34:39 +08:00
Kevin Canwen Xu	7bc00569df	Clean directory after script testing (#6453 ) * Clean Dir after testing * remove pabee ignore	2020-08-14 00:34:03 +08:00
Sam Shleifer	e92efcf728	Mult rouge by 100: standard units (#6359 )	2020-08-13 12:15:54 -04:00
vblagoje	eda07efaa5	Add POS tagging and Phrase chunking token classification examples (#6457 ) * Add more token classification examples * POS tagging example * Phrase chunking example * PR review fixes * Add conllu to third party list (used in token classification examples)	2020-08-13 12:09:51 -04:00
Sam Shleifer	f94a52cd79	[s2s] add BartTranslationDistiller for distilling mBART (#6363 )	2020-08-12 11:41:04 -04:00
Stas Bekman	87b359439f	[test] replace capsys with the more refined CaptureStderr/CaptureStdout (#6422 ) * replace capsys with the more refined CaptureStderr/CaptureStdout * Update examples/seq2seq/test_seq2seq_examples.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-12 07:54:28 -04:00
Lysandre Debut	4ffea5ce2f	Disabled pabee test (#6431 )	2020-08-12 02:52:50 -04:00
Sam Shleifer	3f071c4b6e	[examples] add pytest dependency (#6425 )	2020-08-11 17:58:09 -04:00
Stas Bekman	ece0903e11	lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361 ) * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * [model_cards] electra-base-turkish-cased-ner (#6350) * for electra-base-turkish-cased-ner * Add metadata Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Temporarily de-activate TPU CI * Update modeling_tf_utils.py (#6372) fix typo: ckeckpoint->checkpoint * the test now works again (#6371) * correct pl link in readme (#6364) * refactor almost identical tests (#6339) * refactor almost identical tests * important to add a clear assert error message * make the assert error even more descriptive than the original bt * Small docfile fixes (#6328) * Patch models (#6326) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo * Ci GitHub caching (#6382) * Cache Github Actions CI * Remove useless file * Colab button (#6389) * Add colab button * Add colab link for tutorials * Fix links for open in colab (#6391) * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove dup (leftover from merge) * convert the test into the new refactored format * stick to using the current_step as is, without ++ Co-authored-by: M. 
Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Alexander Measure <ameasure@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-11 17:56:41 -04:00
Stas Bekman	0203d6517f	[pl] restore lr logging behavior for glue, ner examples (#6314 )	2020-08-11 16:27:11 -04:00
Sam Shleifer	be1520d3a3	rename prepare_translation_batch -> prepare_seq2seq_batch (#6103 )	2020-08-11 15:57:07 -04:00
Sam Shleifer	66fa8ceaea	PegasusForConditionalGeneration (torch version) (#6340 ) Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>	2020-08-11 14:31:23 -04:00
Stas Bekman	f6cb0f806e	[s2s] wmt download script use less ram (#6405 )	2020-08-11 12:04:17 -04:00
Stas Bekman	7c6a085ebf	pl version: examples/requirements.txt is single source of truth (#6309 )	2020-08-11 10:58:54 -04:00
Stas Bekman	f6c0680d36	add pl_glue example test (#6034 ) * add pl_glue example test * for now just test that it runs, next validate results of eval or predict? * complete the run_pl_glue test to validate the actual outcome * worked on my machine, CI gets less accuracy - trying higher epochs * match run_pl.sh hparms * more epochs? * trying higher lr * for now just test that the script runs to a completion * correct the comment * if cuda is available, add --fp16 --gpus=1 to cover more bases * style	2020-08-11 03:16:52 -04:00
Sam Shleifer	b9ecd92ee4	[s2s] Script to save wmt data to disk (#6403 )	2020-08-10 22:49:39 -04:00
Rohit Gupta	35eb96de4d	correct pl link in readme (#6364 )	2020-08-10 03:08:46 -04:00
Stas Bekman	0830e79512	the test now works again (#6371 )	2020-08-10 02:55:52 -04:00
Sam Shleifer	9a5ef83748	[s2s] fix --gpus clarg collision (#6358 )	2020-08-08 21:51:37 -04:00
Suraj Patil	9bed355449	[s2s] fix label_smoothed_nll_loss (#6344 )	2020-08-08 04:21:12 -04:00
Sam Shleifer	99f73bcc71	[s2s] tiny QOL improvement: run_eval prints scores (#6341 )	2020-08-08 02:45:55 -04:00
Stas Bekman	322dffc6c9	remove a TODO item to use a tiny model (#6338 ) as discussed with @sshleifer, removing this TODO to switch to a tiny model, since it won't be able to test the results of the evaluation (i.e. the results are meaningless).	2020-08-07 21:30:39 -04:00
zcain117	1b8a7ffcfd	Add setup for TPU CI to run every hour. (#6219 ) * Add setup for TPU CI to run every hour. * Re-organize config.yml Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-07 11:17:07 -04:00
Stas Bekman	6695450a23	[examples] consistently use --gpus, instead of --n_gpu (#6315 )	2020-08-07 10:36:32 -04:00
Stas Bekman	175cd45e13	fix the shuffle agrument usage and the default (#6307 )	2020-08-06 20:32:28 -04:00
Bhashithe Abeysinghe	ffceef2042	[Fix] text-classification PL example (#6027 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-06 15:46:43 -04:00
xujiaze13	eb2bd8d6eb	Remove redundant line in run_pl_glue.py (#6305 )	2020-08-06 15:43:45 -04:00
Sam Shleifer	2804fff839	[s2s]Use prepare_translation_batch for Marian finetuning (#6293 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-06 14:58:38 -04:00
Doug Blank	b923871bb7	Adds comet_ml to the list of auto-experiment loggers (#6176 ) * Support for Comet.ml * Need to import comet first * Log this model, not the one in the backprop step * Log args as hyperparameters; use framework to allow fine control * Log hyperparameters with context * Apply black formatting * isort fix integrations * isort fix __init__ * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address review comments * Style + Quality, remove Tensorboard import test Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-06 11:31:30 -04:00


1246 Commits