transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Patrick von Platen	31c3e7e75b	[Flax] Add T5 pretraining script (#12355 ) * fix_torch_device_generate_test * remove @ * add length computation * finish masking * finish * upload * fix some bugs * finish * fix dependency table * correct tensorboard * Apply suggestions from code review * correct processing * slight change init * correct some more mistakes * apply suggestions * improve readme * fix indent * Apply suggestions from code review Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * correct tokenizer * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2021-06-28 20:11:29 +01:00
Stas Bekman	e277074889	pass the matching trainer log level to deepspeed (#12401 )	2021-06-28 11:43:24 -07:00
Matt	7e22609e0f	Tensorflow LM examples (#12358 ) * Tensorflow MLM example * Add CLM example * Style fixes, adding missing checkpoint code from the CLM example * Fix TPU training, avoid massive dataset warnings * Fix incorrect training length calculation for multi-GPU training * Fix incorrect training length calculation for multi-GPU training * Refactors and nitpicks from the review * Style pass * Adding README	2021-06-28 19:31:44 +01:00
Patrick von Platen	2d70c91206	[Flax] Adapt flax examples to include `push_to_hub` (#12391 ) * fix_torch_device_generate_test * remove @ * finish * correct summary writer * correct push to hub * fix indent * finish * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-28 19:23:35 +01:00
Funtowicz Morgan	a7d0b288fa	Remove the need for `einsum` in Albert's attention computation (#12394 ) * debug albert einsum * Fix matmul computation * Let's use torch linear layer. * Style.	2021-06-28 18:30:05 +02:00
Sylvain Gugger	276bc149d2	Fix copies	2021-06-28 12:26:40 -04:00
Patrick von Platen	27b6ac4611	Update README.md	2021-06-28 17:22:10 +01:00
Patrick von Platen	89b57a6669	[Flax community event] Add more description to readme (#12398 ) * fix_torch_device_generate_test * remove @ * boom boom * correct typos * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Suzana Ilić <io.suzanai@gmail.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Suzana Ilić <io.suzanai@gmail.com>	2021-06-28 17:18:42 +01:00
Bhadresh Savani	04dbea31a9	[Examples] Added context manager to datasets map (#12367 ) * added context manager to datasets map * fixed style and spaces * fixed warning of deprecation * changed desc	2021-06-28 09:14:00 -07:00
Stas Bekman	d25ad34c82	[CI] add dependency table sync verification (#12364 ) * add dependency table sync verification * improve the message * improve the message * revert * ready to merge	2021-06-28 08:55:59 -07:00
Sylvain Gugger	57461ac0b4	Add possibility to maintain full copies of files (#12312 )	2021-06-28 10:02:53 -04:00
Taha ValizadehAslani	9490d668d2	Update run_mlm.py (#12344 ) Before, the code could not be used for validation only, because this line: extension = data_args.train_file.split(".")[-1] assumed that the extension must be extracted from the training dataset. This line would run regardless of whether the user chose training or validation. This would lead to an error if the user only wants to run evaluation and does not want to train (because the training file does not exist). I modified it to extract the extension from the training file if the user wants to train and from the validation file if the user wants to run eval. This way the code can be used for training and validation separately.	2021-06-28 07:49:22 -04:00
Kilian Kluge	c7faf2ccc0	[Documentation] Warn that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers (#12371 ) * Notify users that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers * Fix code formatting	2021-06-28 07:39:56 -04:00
Bhadresh Savani	ff5cdc086b	replace print with logger (#12368 )	2021-06-26 09:31:25 -07:00
Bhadresh Savani	9a7545943d	updated example template (#12365 )	2021-06-25 20:50:30 -07:00
Bhadresh Savani	539ee456d4	[Examples] Replicates the new --log_level feature to all trainer-based pytorch (#12359 ) * added log_level * fix comment * fixed log_level * Trigger CI * Unified logging * simplified args for log_level	2021-06-25 14:58:42 -07:00
Stas Bekman	64e6098094	[trainer] add main_process_first context manager (#12351 ) * main_process_first context manager * handle multi-node, add context description * sync desc	2021-06-25 14:58:03 -07:00
cronoik	f866425898	fixed multiplechoice tokenization (#12362 ) * fixed multiplechoice tokenization The model would have seen two sequences: 1. [CLS]prompt[SEP]prompt[SEP] 2. [CLS]choice0[SEP]choice1[SEP] that is not correct as we want a contextualized embedding of prompt and choice * removed outer brackets for proper sequence generation	2021-06-25 17:41:08 -04:00
Stas Bekman	4a872caef4	remove extra white space from log format (#12360 )	2021-06-25 13:20:14 -07:00
Sylvain Gugger	a3daabfe14	Style	2021-06-25 15:54:31 -04:00
Kai Fricke	238521b0b6	Replace NotebookProgressReporter by ProgressReporter in Ray Tune run (#12357 ) * Replace NotebookProgressReporter by ProgressReporter in Ray Tune run * Move to local import	2021-06-25 14:12:03 -04:00
Vasudev Gupta	332a245861	Add FlaxBigBird QuestionAnswering script (#12233 ) * port bigbird script * adapt script a bit * change location * adapt more * save progress * init commit * style * dataset script tested * readme add	2021-06-25 18:05:48 +01:00
jglaser	55bb4c06f7	Fix exception in prediction loop occurring for certain batch sizes (#12350 ) * fix distributed_concat for scalar outputs * Update README.md * fixed typo (#12356) * simplify fix with terser syntax Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Trigger CI Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: michal pitr <21157924+MichalPitr@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-25 10:55:15 -04:00
michal pitr	d4ce31e839	fixed typo (#12356 )	2021-06-25 07:49:29 -04:00
Patrick von Platen	aa550c4a11	Update README.md	2021-06-25 11:55:51 +01:00
Marc van Zee	f2c4ce7e33	Add flax/jax quickstart (#12342 )	2021-06-24 17:04:18 +01:00
Sylvain Gugger	5b1b5635d3	Document patch release v4.8.1	2021-06-24 10:15:15 -04:00
Lysandre Debut	8ef62ec9e1	Fix torchscript tests (#12336 ) * Fix torchscript tests * Better test * Remove bogus print	2021-06-24 09:52:28 -04:00
Suraj Patil	aef3823e1a	[examples/Flax] move the examples table up (#12341 )	2021-06-24 16:03:37 +05:30
Richard Liaw	7875b638cd	try-this (#12338 ) Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2021-06-24 04:13:17 -04:00
Sylvain Gugger	cf3c9198aa	Fix default to logging_dir lost in merge conflict	2021-06-23 16:22:29 -04:00
Stas Bekman	07ae6103c3	[Deepspeed] new docs (#12077 ) * document sub_group_size * style * install + issues reporting * style * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * indent 4 * restore * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-23 11:07:37 -07:00
Sam Havens	3694484d0a	Update training_args.py (#12328 ) mention in `save_strategy` param description that `load_best_model_at_end` can override	2021-06-23 13:39:43 -04:00
Sylvain Gugger	2150dfed31	v4.9.0.dev0	2021-06-23 13:31:19 -04:00
Sylvain Gugger	9252a5127f	Release: v4.8.0	2021-06-23 13:25:56 -04:00
Patrick von Platen	468cda20f2	[Flax T5] Fix weight initialization and fix docs (#12327 ) * finish t5 flax fixes * improve naming	2021-06-23 17:39:21 +01:00
Sylvain Gugger	12a4457c56	Pin good version of huggingface_hub	2021-06-23 12:30:15 -04:00
Michael Benayoun	986ac03e37	changed modeling_fx_utils.py to utils/fx.py for clarity (#12326 ) Co-authored-by: Michael Benayoun <michael@huggingface.co>	2021-06-23 18:16:24 +02:00
Lysandre	941b4442ba	Temporarily revert the `fill-mask` improvements.	2021-06-23 17:46:24 +02:00
Lysandre Debut	4bdff2cdbe	Conda build (#12323 )	2021-06-23 11:07:07 -04:00
Sylvain Gugger	9eda6b52e2	Add all XxxPreTrainedModel to the main init (#12314 ) * Add all XxxPreTrainedModel to the main init * Add to template * Add to template bis * Add FlaxT5	2021-06-23 10:40:54 -04:00
Sylvain Gugger	53c60babe4	Clean push to hub API (#12187 ) * Clean push to hub API * Create working dir if it does not exist * Different tweak * New API + all models + test Flax * Adds the Trainer clean up * Update src/transformers/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * (nit) output types * No need to set clone_from when folder exists * Update src/transformers/trainer.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Add generated_from_trainer tag * Update to new version * Fixes Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-06-23 10:11:19 -04:00
chenht2010	625f512d5e	[TFWav2Vec2] Fix docs (#12283 ) * fix error * make style check happy Co-authored-by: chenhaitao <chenhaitao@qiyi.com>	2021-06-23 14:51:31 +01:00
Patrick von Platen	44739c8180	[Flax/JAX] Add how to propose projects markdown (#12311 ) * fix_torch_device_generate_test * remove @ * finish * make style	2021-06-23 14:50:35 +01:00
Lysandre Debut	ef3dceff4a	Add mention of the huggingface_hub methods for offline mode (#12320 )	2021-06-23 09:45:30 -04:00
Vasudev Gupta	e98233dde1	Flax T5 (#12150 ) * copy pytorch-t5 * init * boom boom * forward pass same * make generation work * add more tests * make test work * finish normal tests * make fix-copies * finish quality * correct slow example * correct slow test * version table * upload models * Update tests/test_modeling_flax_t5.py * correct incorrectly deleted line Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-06-23 13:13:32 +01:00
David Fan	7d4cfa3b47	Rewrite ProphetNet to adapt converting ONNX friendly (#11981 ) * Rewrite * [ONNX] rewrite	2021-06-23 11:34:18 +01:00
Suraj Patil	c0fe3c9a7a	Flax summarization script (#12230 ) * add summarization script * fix arguments, preprocessing, metrics * add generation and metrics * auto model, prediction loop * prettify * label smoothing * address Sylvain and Patrick's suggestions * dynamically import shift_tokens_right * fix shift_tokens_right_fn call	2021-06-23 15:49:30 +05:30
Daniel Stancl	26a2e36595	Add output in a dictionary for TF `generate` method (#12139 ) * Add output args to greedy search * Fix critical typo + make style quality * Handle generate_beam_search * Add dict_specific tests and fix the placement of encoder outputs * Add specific outputs * Update doc * Fix typo * Adjust handling encoder_outputs + Fix generating for T5 * Fix generate for RAG * Fix handling output_attentions when target_mapping is not None Take care of situations when target_mapping is provided as there are 2-tuple of attentions Change from: if inputs["output_attentions"]: attentions = tuple(tf.transpose(t, perm(2, 3, 0, 1)) for t in attentions) to: if inputs["output_attentions"]: if inputs["target_mapping"] is not None: # when target_mapping is provided, there are 2-tuple of attentions attentions = tuple( tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions ) else: attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions) * Rename kwargs to model_kwargs * make style quality * Move imports in test_modeling_tf_common.py Move ModelOutput-related imports in test_modeling_tf_common.py into the `is_tf_available():` statement. * Rewrite nested if-statements * Fix added tests	2021-06-23 10:52:11 +01:00
Nicolas Patry	d4be498441	Optimizing away the `fill-mask` pipeline. (#12113 ) * Optimizing away the `fill-mask` pipeline. - Don't send anything to the tokenizer unless needed. Vocab check is much faster - Keep BC by sending data to the tokenizer when needed. User handling warning messages will see performance benefits again - Make `targets` and `top_k` work together better `top_k` cannot be higher than `len(targets)` but can be smaller still. - Actually simplify the `target_ids` in case of duplicate (it can happen because we're parsing raw strings) - Removed useless code to fail on empty strings. It works only if empty string is in first position, moved to ignoring them instead. - Changed the related tests as only the tests would fail correctly (having incorrect value in first position) * Make tests compatible for 2 different vocabs... (at the price of a warning). Co-authored-by: @EtaoinWu * ValueError working globally * Update src/transformers/pipelines/fill_mask.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatibility + fallback. Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-06-23 10:38:04 +02:00

7444 Commits