transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Author	SHA1	Message	Date
Shamane Siri	5257818e68	minor fixes in original RAG training (#12395 )	2021-06-29 13:39:48 +01:00
Jabin Huang	e3f39a2952	fix ids_to_tokens naming error in tokenizer of deberta v2 (#12412 ) Co-authored-by: Jipeng Huang <jihuan@microsoft.com>	2021-06-29 08:15:35 -04:00
Patrick von Platen	813328682e	[Flax] Example scripts - correct weight decay (#12409 ) * fix_torch_device_generate_test * remove @ * finish * finish * correct style	2021-06-29 12:01:08 +01:00
Suraj Patil	aecae53377	[example/flax] add summarization readme (#12393 ) * add readme * update readme and add requirements * Update examples/flax/summarization/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-29 14:02:33 +05:30
Will Rice	3886104574	Fix TFWav2Vec2 SpecAugment (#12289 ) * Fix TFWav2Vec2 SpecAugment * Invert masks * Feedback changes	2021-06-29 09:15:57 +01:00
Will Rice	bc084938f2	Add out of vocabulary error to ASR models (#12288 ) * Add OOV error to ASR models * Feedback changes	2021-06-29 08:57:46 +01:00
NielsRogge	1fc6817a30	Rename detr targets to labels (#12280 ) * Rename target to labels in DetrFeatureExtractor * Update DetrFeatureExtractor tests accordingly * Improve docs of DetrFeatureExtractor * Improve docs * Make style	2021-06-29 03:07:46 -04:00
Stas Bekman	7682e97702	[models] respect dtype of the model when instantiating it (#12316 ) * [models] respect dtype of the model when instantiating it * cleanup * cleanup * rework to handle non-float dtype * fix * switch to fp32 tiny model * improve * use dtype.is_floating_point * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix the doc * recode to use explicit torch_dtype_auto_detect, torch_dtype args * docs and tweaks * docs and tweaks * docs and tweaks * merge 2 args, add docs * fix * fix * better doc * better doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-28 20:11:21 -07:00
Patrick von Platen	31c3e7e75b	[Flax] Add T5 pretraining script (#12355 ) * fix_torch_device_generate_test * remove @ * add length computation * finish masking * finish * upload * fix some bugs * finish * fix dependency table * correct tensorboard * Apply suggestions from code review * correct processing * slight change init * correct some more mistakes * apply suggestions * improve readme * fix indent * Apply suggestions from code review Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * correct tokenizer * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2021-06-28 20:11:29 +01:00
Stas Bekman	e277074889	pass the matching trainer log level to deepspeed (#12401 )	2021-06-28 11:43:24 -07:00
Matt	7e22609e0f	Tensorflow LM examples (#12358 ) * Tensorflow MLM example * Add CLM example * Style fixes, adding missing checkpoint code from the CLM example * Fix TPU training, avoid massive dataset warnings * Fix incorrect training length calculation for multi-GPU training * Fix incorrect training length calculation for multi-GPU training * Refactors and nitpicks from the review * Style pass * Adding README	2021-06-28 19:31:44 +01:00
Patrick von Platen	2d70c91206	[Flax] Adapt flax examples to include `push_to_hub` (#12391 ) * fix_torch_device_generate_test * remove @ * finish * correct summary writer * correct push to hub * fix indent * finish * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-28 19:23:35 +01:00
Funtowicz Morgan	a7d0b288fa	Remove the need for `einsum` in Albert's attention computation (#12394 ) * debug albert einsum * Fix matmul computation * Let's use torch linear layer. * Style.	2021-06-28 18:30:05 +02:00
Sylvain Gugger	276bc149d2	Fix copies	2021-06-28 12:26:40 -04:00
Patrick von Platen	27b6ac4611	Update README.md	2021-06-28 17:22:10 +01:00
Patrick von Platen	89b57a6669	[Flax community event] Add more description to readme (#12398 ) * fix_torch_device_generate_test * remove @ * boom boom * correct typos * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Suzana Ilić <io.suzanai@gmail.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Suzana Ilić <io.suzanai@gmail.com>	2021-06-28 17:18:42 +01:00
Bhadresh Savani	04dbea31a9	[Examples] Added context manager to datasets map (#12367 ) * added context manager to datasets map * fixed style and spaces * fixed warning of deprecation * changed desc	2021-06-28 09:14:00 -07:00
Stas Bekman	d25ad34c82	[CI] add dependency table sync verification (#12364 ) * add dependency table sync verification * improve the message * improve the message * revert * ready to merge	2021-06-28 08:55:59 -07:00
Sylvain Gugger	57461ac0b4	Add possibility to maintain full copies of files (#12312 )	2021-06-28 10:02:53 -04:00
Taha ValizadehAslani	9490d668d2	Update run_mlm.py (#12344 ) Previously the script could not be used for validation only, because the line extension = data_args.train_file.split(".")[-1] assumed that the extension must be extracted from the training file. This line ran regardless of whether the user selected training or validation, so it raised an error when the user wanted to run evaluation only and no training file was given. The extension is now extracted from the training file if the user wants to train and from the validation file if the user wants to run eval, so the script can be used for training and validation separately.	2021-06-28 07:49:22 -04:00
Kilian Kluge	c7faf2ccc0	[Documentation] Warn that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers (#12371 ) * Notify users that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers * Fix code formatting	2021-06-28 07:39:56 -04:00
Bhadresh Savani	ff5cdc086b	replace print with logger (#12368 )	2021-06-26 09:31:25 -07:00
Bhadresh Savani	9a7545943d	updated example template (#12365 )	2021-06-25 20:50:30 -07:00
Bhadresh Savani	539ee456d4	[Examples] Replicates the new --log_level feature to all trainer-based pytorch (#12359 ) * added log_level * fix comment * fixed log_level * Trigger CI * Unified logging * simplified args for log_level	2021-06-25 14:58:42 -07:00
Stas Bekman	64e6098094	[trainer] add main_process_first context manager (#12351 ) * main_process_first context manager * handle multi-node, add context description * sync desc	2021-06-25 14:58:03 -07:00
cronoik	f866425898	fixed multiplechoice tokenization (#12362 ) * fixed multiplechoice tokenization The model would have seen two sequences: 1. [CLS]prompt[SEP]prompt[SEP] 2. [CLS]choice0[SEP]choice1[SEP] that is not correct as we want a contextualized embedding of prompt and choice * removed outer brackets for proper sequence generation	2021-06-25 17:41:08 -04:00
Stas Bekman	4a872caef4	remove extra white space from log format (#12360 )	2021-06-25 13:20:14 -07:00
Sylvain Gugger	a3daabfe14	Style	2021-06-25 15:54:31 -04:00
Kai Fricke	238521b0b6	Replace NotebookProgressReporter by ProgressReporter in Ray Tune run (#12357 ) * Replace NotebookProgressReporter by ProgressReporter in Ray Tune run * Move to local import	2021-06-25 14:12:03 -04:00
Vasudev Gupta	332a245861	Add FlaxBigBird QuestionAnswering script (#12233 ) * port bigbird script * adapt script a bit * change location * adapt more * save progress * init commit * style * dataset script tested * readme add	2021-06-25 18:05:48 +01:00
jglaser	55bb4c06f7	Fix exception in prediction loop occurring for certain batch sizes (#12350 ) * fix distributed_concat for scalar outputs * Update README.md * fixed typo (#12356) * simplify fix with terser syntax Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Trigger CI Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: michal pitr <21157924+MichalPitr@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-25 10:55:15 -04:00
michal pitr	d4ce31e839	fixed typo (#12356 )	2021-06-25 07:49:29 -04:00
Patrick von Platen	aa550c4a11	Update README.md	2021-06-25 11:55:51 +01:00
Marc van Zee	f2c4ce7e33	Add flax/jax quickstart (#12342 )	2021-06-24 17:04:18 +01:00
Sylvain Gugger	5b1b5635d3	Document patch release v4.8.1	2021-06-24 10:15:15 -04:00
Lysandre Debut	8ef62ec9e1	Fix torchscript tests (#12336 ) * Fix torchscript tests * Better test * Remove bogus print	2021-06-24 09:52:28 -04:00
Suraj Patil	aef3823e1a	[examples/Flax] move the examples table up (#12341 )	2021-06-24 16:03:37 +05:30
Richard Liaw	7875b638cd	try-this (#12338 ) Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2021-06-24 04:13:17 -04:00
Sylvain Gugger	cf3c9198aa	Fix default to logging_dir lost in merge conflict	2021-06-23 16:22:29 -04:00
Stas Bekman	07ae6103c3	[Deepspeed] new docs (#12077 ) * document sub_group_size * style * install + issues reporting * style * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * indent 4 * restore * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-23 11:07:37 -07:00
Sam Havens	3694484d0a	Update training_args.py (#12328 ) mention in `save_strategy` param description that `load_best_model_at_end` can override	2021-06-23 13:39:43 -04:00
Sylvain Gugger	2150dfed31	v4.9.0.dev0	2021-06-23 13:31:19 -04:00
Sylvain Gugger	9252a5127f	Release: v4.8.0	2021-06-23 13:25:56 -04:00
Patrick von Platen	468cda20f2	[Flax T5] Fix weight initialization and fix docs (#12327 ) * finish t5 flax fixes * improve naming	2021-06-23 17:39:21 +01:00
Sylvain Gugger	12a4457c56	Pin good version of huggingface_hub	2021-06-23 12:30:15 -04:00
Michael Benayoun	986ac03e37	changed modeling_fx_utils.py to utils/fx.py for clarity (#12326 ) Co-authored-by: Michael Benayoun <michael@huggingface.co>	2021-06-23 18:16:24 +02:00
Lysandre	941b4442ba	Temporarily revert the `fill-mask` improvements.	2021-06-23 17:46:24 +02:00
Lysandre Debut	4bdff2cdbe	Conda build (#12323 )	2021-06-23 11:07:07 -04:00
Sylvain Gugger	9eda6b52e2	Add all XxxPreTrainedModel to the main init (#12314 ) * Add all XxxPreTrainedModel to the main init * Add to template * Add to template bis * Add FlaxT5	2021-06-23 10:40:54 -04:00
Sylvain Gugger	53c60babe4	Clean push to hub API (#12187 ) * Clean push to hub API * Create working dir if it does not exist * Different tweak * New API + all models + test Flax * Adds the Trainer clean up * Update src/transformers/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * (nit) output types * No need to set clone_from when folder exists * Update src/transformers/trainer.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Add generated_from_trainer tag * Update to new version * Fixes Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-06-23 10:11:19 -04:00

7452 Commits