Before this change, the code could not be used for validation only because of this line:
extension = data_args.train_file.split(".")[-1]
which assumed that the extension must be extracted from the training dataset. This line ran regardless of the user's training or validation options, so it raised an error when the user only wanted to run an evaluation and did not want to train (because the training file does not exist). I modified it to extract the extension from the training file if the user wants to train, and from the validation file if the user wants to run evaluation. This way the code can be used for training and validation separately.
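A minimal sketch of the new logic (assuming the usual `training_args.do_train` flag and `data_args` fields from the example scripts; the real diff may key off `do_eval` instead):

# a sketch, not the exact diff: pick the file whose extension we read
if training_args.do_train:
    extension = data_args.train_file.split(".")[-1]
else:
    extension = data_args.validation_file.split(".")[-1]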
* fixed multiple-choice tokenization
The model would have seen two sequences:
1. [CLS]prompt[SEP]prompt[SEP]
2. [CLS]choice0[SEP]choice1[SEP]
which is not correct, as we want a contextualized embedding of the prompt together with each choice. Now each choice is tokenized as a pair with its prompt, as sketched below.
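A minimal sketch of the corrected pairing (variable names such as `prompts`, `all_choices`, and `tokenizer` are illustrative, loosely following the run_swag example):

# repeat each prompt once per choice, then tokenize prompt/choice pairs,
# so every choice is encoded as [CLS]prompt[SEP]choice_i[SEP]
first_sentences = [prompt for prompt, choices in zip(prompts, all_choices) for _ in choices]
second_sentences = [choice for choices in all_choices for choice in choices]
tokenized = tokenizer(first_sentences, second_sentences, truncation=True)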
* removed outer brackets for proper sequence generation
* Clean push to hub API
* Create working dir if it does not exist
* Different tweak
* New API + all models + test Flax
* Adds the Trainer clean up
* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* (nit) output types
* No need to set clone_from when folder exists
* Update src/transformers/trainer.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Add generated_from_trainer tag
* Update to new version
* Fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* copy pytorch-t5
* init
* boom boom
* forward pass same
* make generation work
* add more tests
* make test work
* finish normal tests
* make fix-copies
* finish quality
* correct slow example
* correct slow test
* version table
* upload models
* Update tests/test_modeling_flax_t5.py
* correct incorrectly deleted line
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
* Add output args to greedy search
* Fix critical typo + make style quality
* Handle generate_beam_search
* Add dict_specific tests and fix the placement of encoder outputs
* Add specific outputs
* Update doc
* Fix typo
* Adjust handling encoder_outputs + Fix generating for T5
* Fix generate for RAG
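For illustration, a usage sketch of the dict-style generate outputs (the flag and attribute names follow the transformers generate API; treat their exact mapping to these commits as an assumption):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

input_ids = tokenizer("translate English to German: Hello", return_tensors="pt").input_ids
outputs = model.generate(
    input_ids,
    return_dict_in_generate=True,  # return a ModelOutput instead of a plain tensor
    output_scores=True,            # include per-step scores in the output
)
print(outputs.sequences)    # generated token ids
print(len(outputs.scores))  # one score tensor per generated step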
* Fix handling output_attentions when target_mapping is not None
Take care of the situation when target_mapping is provided,
as the attentions then come as 2-tuples.
Change from:
if inputs["output_attentions"]:
    attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)
to:
if inputs["output_attentions"]:
    if inputs["target_mapping"] is not None:
        # when target_mapping is provided, there are 2-tuples of attentions
        attentions = tuple(
            tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions
        )
    else:
        attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions)
* Rename kwargs to model_kwargs
* make style quality
* Move imports in test_modeling_tf_common.py
Move the ModelOutput-related imports in test_modeling_tf_common.py
inside the `is_tf_available()` block.
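A minimal sketch of the guarded-import pattern (the specific imported name here is illustrative; the actual imports moved are in the diff):

from transformers import is_tf_available

# TF-dependent imports only run when TensorFlow is installed,
# so the test module still imports cleanly without TF
if is_tf_available():
    import tensorflow as tf
    from transformers.modeling_tf_outputs import TFBaseModelOutput  # illustrative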
* Rewrite nested if-statements
* Fix added tests
* Optimizing away the `fill-mask` pipeline.
- Don't send anything to the tokenizer unless needed; the vocab check is
much faster.
- Keep backward compatibility by sending data to the tokenizer when needed.
Users who handle the warning messages will see the performance benefits again.
- Make `targets` and `top_k` work together better: `top_k` cannot be
higher than `len(targets)`, but can be smaller (see the sketch after this list).
- Actually simplify the `target_ids` in case of duplicates (they can happen
because we're parsing raw strings).
- Removed the useless code that failed on empty strings; it only worked if the empty
string was in first position, so empty strings are now ignored instead.
- Changed the related tests, as the tests would only fail correctly
when the incorrect value was in first position.
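For illustration, how `targets` and `top_k` now interact (the model name is just an example; any fill-mask checkpoint works):

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="distilroberta-base")

# top_k is capped at len(targets): at most two predictions come back here
preds = fill_mask(
    "Paris is the <mask> of France.",
    targets=["capital", "center"],
    top_k=5,
)
print([p["token_str"] for p in preds])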
* Make tests compatible with 2 different vocabs... (at the price of a
warning).
Co-authored-by: @EtaoinWu
* ValueError working globally
* Update src/transformers/pipelines/fill_mask.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatibility +
fallback.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Replace conditional generation example (fixes #12268)
* Replace model in summarization example with finetuned checkpoint, adapt example text
* Fix typo in new summarization example
* Fix docstring formatting, add missing import statement to example
* registering a buffer for token_type_ids, to fix the error of the device id getting hardcoded when tracing
* style format
* adding a persistent flag to the registered buffers that prevents adding them to the state_dict and addresses the backward-compatibility issue
* adding a try/except to the fix, as the persistent flag is only available from PyTorch > 1.6
* adding version check
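Inside the embeddings module's __init__, the version-gated registration looks roughly like this (a sketch following the BertEmbeddings pattern, not the exact diff; it assumes `self.position_ids` is already registered):

from packaging import version
import torch

if version.parse(torch.__version__) > version.parse("1.6.0"):
    # persistent=False keeps the buffer out of the state_dict,
    # preserving backward compatibility of saved checkpoints
    self.register_buffer(
        "token_type_ids",
        torch.zeros(self.position_ids.size(), dtype=torch.long),
        persistent=False,
    )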
* added the condition to only use the token_type_ids buffer when it is auto-generated, not passed by the user
* adding comments and making the condition where token_type_ids is None use the registered buffer
* taking position embeddings out of the if block
* adding comments
* handling the case where the buffer for position_ids was not registered
* reverted the changes on position_ids, fixed the issue with the size of the token_type_ids buffer, and moved the modification for generated token_type_ids to BertModel instead of Embeddings (sketched below)
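A sketch of the BertModel-side use of the buffer (variable names are illustrative; `batch_size`, `seq_length`, `input_shape`, and `device` come from the input in the real code):

if token_type_ids is None:
    if hasattr(self.embeddings, "token_type_ids"):
        # trim the registered buffer to the current sequence length,
        # then expand the temp tensor to the batch size
        buffered_token_type_ids = self.embeddings.token_type_ids[:, :seq_length]
        token_type_ids = buffered_token_type_ids.expand(batch_size, seq_length)
    else:
        # fallback when the buffer was never registered (e.g. older PyTorch)
        token_type_ids = torch.zeros(input_shape, dtype=torch.long, device=device)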
* reverting the handling of token_type_ids in the None case to the previous version
* reverting changes on position_ids adding back the if block
* changes added by running make fix-copies
* changes added by running make fix-copies, and added the version import as it was being used
* changes added by running make fix-copies
* changes added by running make fix-copies
* fixing the import format
* fixing the import format
* modified to use a temp tensor for the trimmed and expanded token_type_ids buffer
* changes made by fix-copies after temp tensor modifications
* changes made by fix-copies after temp tensor modifications
* changes made by fix-copies after temp tensor modifications
* clean up
* clean up
* clean up
* clean up
* Nit
* Nit
* Nit
* modified according to support device conversion on traced models
* modified according to support device conversion on traced models
* modified according to support device conversion on traced models
* modified according to support device conversion on traced models
* changes based on latest in master
* Adapt templates
* Add version import
Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>