transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-02 19:21:31 +06:00

Author	SHA1	Message	Date
Wang, Yi	d842f2d5b9	update the train_batch_size in case HPO change batch_size_per_device (#18918 ) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2022-09-07 08:01:30 -04:00
Nicholas Broad	4f299b2446	Accelerator end training (#18910 ) * add accelerator.end_training() Some trackers need this to end their runs. * fixup and quality * add space * add space again ?!?	2022-09-07 07:46:26 -04:00
Yih-Dar	7a8118947f	Add checks for more workflow jobs (#18905 ) * add check for scheduled CI * Add check to other CIs Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-07 12:51:37 +02:00
NielsRogge	c25f27fa6a	[VideoMAE] Improve code examples (#18919 ) * Simplify code example * Add seed	2022-09-07 12:24:12 +02:00
Ekagra Ranjan	0a632f076d	Fix incorrect size of input for 1st strided window length in `Perplexity of fixed-length models` (#18906 ) * update the PPL for stride 512 * fix 1st strided window size * linting * fix typo * styling	2022-09-06 15:20:12 -04:00
Yih-Dar	7d5fde991d	unpin slack_sdk version (#18901 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-06 18:42:00 +02:00
Sylvain Gugger	71ff88fa4f	Further reduce the number of alls to head for cached objects (#18871 ) * Further reduce the number of alls to head for cached models/tokenizers/pipelines * Fix tests * Address review comments	2022-09-06 12:34:37 -04:00
Alara Dirik	6678350c01	fixes bugs to handle non-dict output (#18897 )	2022-09-06 16:13:34 +03:00
Yih-Dar	998a90bc7d	Fix `test_tf_encode_plus_sent_to_model` for `LayoutLMv3` (#18898 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-06 14:51:03 +02:00
Ekagra Ranjan	f85acb4d73	Fix decode_input_ids to bare T5Model and improve doc (#18791 ) * use tokenizer to output tensor * add preprocessing for decoder_input_ids for bare T5Model * add preprocessing to tf and flax * linting * linting * Update src/transformers/models/t5/modeling_flax_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/t5/modeling_tf_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-09-06 14:12:26 +02:00
arun99481	3b19c0317b	updating gather function with gather_for_metrics in run_wav2vec2_pretraining (#18877 ) Co-authored-by: Arun Rajaram <arunrajaram@Aruns-MacBook-Pro.local>	2022-09-06 07:36:37 -04:00
Had	734b7e2a5a	Mask t5 relative position bias then head pruned (#17968 ) * add position bias head masking if heads pruned * fix pruning function in t5 encoder * make style * make fix-copies * Revert added folder Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-09-06 10:39:31 +02:00
Joao Gante	d4dbd7ca59	Generate: get the correct beam index on eos token (#18851 )	2022-09-05 19:35:47 +01:00
zkep	c6d3daba54	Update Chinese documentation (#18893 ) * update the translation	2022-09-05 19:56:12 +02:00
Sofia Oliveira	cfd623a859	Add type hints to XLM-Roberta-XL models (#18475 ) * Add type hints to XLM-Roberta-XL models * Format	2022-09-05 13:38:08 +01:00
Surya Prakash Sahu	17c634fd5b	Update perf_train_gpu_one.mdx (#18442 )	2022-09-05 14:06:36 +02:00
Patrick von Platen	badb9d2aaa	Correct naming pegasus x (#18896 ) * add first generation tutorial * [Pegasus X] correct naming * [Generation] Remove	2022-09-05 11:25:00 +02:00
Lysandre Debut	591cfc6c90	Mention TF and Flax checkpoints (#18894 )	2022-09-05 11:09:39 +02:00
Joao Gante	7f27e002fd	TF: TFMarianMTModel final logits bias as a layer (#18833 ) * bias as a layer * alias the bias (hah, it rhymes) * add comment with info	2022-09-05 09:20:27 +01:00
Steven Liu	65fb71bc76	Add Trainer to quicktour (#18723 ) * 📝 update quicktour * 📝 add trainer section * 🖍 markdown table, apply feedbacks * ✨ make style * add tf training section * make style	2022-09-02 15:05:31 -05:00
Steven Liu	ae32f3afef	Finetune guide for semantic segmentation (#18640 ) * 📝 first draft * oops add to toctree * make style * 📝 add inference section * 🖍 make style * 📝 add images * 🖍 apply feedbacks * remove num_labels and pytorch block * apply feedbacks, add colab notebook Co-authored-by: Steven <stevhliu@gmail.com>	2022-09-02 14:29:51 -05:00
Steven Liu	bf9d506137	Update docs landing page (#18590 ) * 📝 update docs landing page * 🖍 apply feedbacks * apply feedbacks * apply feedbacks, use <br> for list	2022-09-02 14:29:06 -05:00
Jason Phang	53e33e6f1b	PEGASUS-X (#18551 ) * PegasusX Initial commit * rename * pegasus X implementation * pegx update * pegx fix * pegasus-x fixes * pegx updates * cleanup * cleanup * cleanup * tests * stylefixes * Documentation update * Model hub fix * cleanup * update * update * testfix * Check fix * tweaks for merging * style * style * updates for pr * style * change pegasus-x repo	2022-09-02 19:54:02 +02:00
Yih-Dar	ecdf9b06bc	Remove cached torch_extensions on CI runners (#18868 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-02 18:17:58 +02:00
Yih-Dar	4e29b3f884	A script to download artifacts and perform CI error statistics (#18865 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-02 17:59:26 +02:00
Joao Gante	9196f48b95	Generate: validate `model_kwargs` on TF (and catch typos in generate arguments) (#18651 )	2022-09-02 16:25:26 +01:00
Stas Bekman	c5be7cae59	postpone bnb load until it's needed (#18859 )	2022-09-02 08:22:46 -07:00
Sylvain Gugger	9e346f7436	Fix number of examples for iterable datasets in multiprocessing (#18856 ) * Fix number of examples for iterable datasets in multiprocessing * Add stronger check	2022-09-02 10:49:39 -04:00
Yih-Dar	0ab465a5d2	pin Slack SDK to 3.18.1 to avoid failing issue (#18869 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-02 16:49:08 +02:00
Sylvain Gugger	38c3cd52fb	Clean up utils.hub using the latest from hf_hub (#18857 ) * Clean up utils.hub using the latest from hf_hub * Adapt test * Address review comment * Fix test	2022-09-02 10:30:06 -04:00
NielsRogge	17981faf67	Add OWL-ViT to the appropriate section (#18867 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-02 15:59:25 +02:00
NielsRogge	c60dd98e87	[LayoutLM] Add clarification to docs (#18716 ) * Add clarification * Add another clarification * Apply suggestion Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-09-02 14:48:19 +02:00
OlivierDehaene	129d73294e	Fix naming issue with ImageToText pipeline (#18864 ) Co-authored-by: Olivier Dehaene <olivier@huggingface.co>	2022-09-02 07:55:30 -04:00
kmckiern	9b3eb81014	if learning rate is a tensor, get item (float) (#18861 )	2022-09-02 07:46:31 -04:00
Steven Liu	142e12afb4	Split docs on modality (#18205 ) * update * 🖍 add missing files * 📝 add nested sections * 🖍 align titles with tasks * oops * remove quotes from titles	2022-09-01 15:19:11 -05:00
Ankur Goyal	23fab60b67	Pin revision for LayoutLMForQuestionAnswering and TFLayoutLMForQuestionAnswering tests (#18854 ) * Pin revision for tests * Fixup * Update revision in models * Shorten revisions Co-authored-by: Ankur Goyal <ankur@impira.com>	2022-09-01 12:52:33 -04:00
OlivierDehaene	ddb69e5af8	Add Image To Text Generation pipeline (#18821 ) * Add Image2TextGenerationPipeline to supported pipelines * Add Flax and Tensorflow support * Add Flax and Tensorflow small tests * Add default model for Tensorflow * Add docstring * Fix doc style * Add tiny models for pytorch and flax * Remove flax from pipeline. Fix tests * Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and Tensorflow * Fix Tensorflow support Co-authored-by: Olivier Dehaene <olivier@huggingface.co>	2022-09-01 12:07:14 -04:00
Sylvain Gugger	c61f116b63	Tie weights after preparing the model in run_clm (#18855 )	2022-09-01 12:06:56 -04:00
Cody Yu	1c381f3600	Cache results of is_torch_tpu_available() (#18777 ) * Cache results of is_torch_tpu_available() * Update src/transformers/utils/import_utils.py * Update src/transformers/utils/import_utils.py	2022-09-01 11:45:33 -04:00
Sayak Paul	954e18ab97	TensorFlow MobileViT (#18555 ) * initial implementation. * add: working model till image classification. * add: initial implementation that passes intg tests. Co-authored-by: Amy <aeroberts4444@gmail.com> * chore: formatting. * add: tests (still breaking because of config mismatch). Coo-authored-by: Yih <2521628+ydshieh@users.noreply.github.com> * add: corrected tests and remaning changes. * fix code style and repo consistency. * address PR comments. * address Amy's comments. * chore: remove from_pt argument. * chore: add full-stop. * fix: TFLite model conversion in the doc. * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply formatting. * chore: remove comments from the example block. * remove identation in the example. Co-authored-by: Amy <aeroberts4444@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-09-01 10:35:15 -04:00
Gustavo de Rosa	fe58929ad6	Adds timeout argument to training_args to avoid socket timeouts in DDP (#18562 ) * chore(training_args): Adds support for timeout argument. * fix(training_args): Passes make style through changes. * fix(training_args): Removes wrong docstring sentence. * fix(training_args): Fixes timeout not being JSON serializable. * fix(training_args_sm): Also updates timeout to timeout_delta. * fix(training_args): Fixes PR according to suggestions.	2022-09-01 10:33:53 -04:00
kumapo	ab663b2274	reflect max_new_tokens in `Seq2SeqTrainer` (#18786 ) * reflect max_new_tokens in gen_kwargs to `trainer.generate()` * reflect max_new_tokens in `Seq2SeqTrainer` * remove unnecessary variable * Trigger CI * fix style	2022-09-01 09:12:38 -04:00
Pedro Cuenca	f719c0377f	Minor typo in prose of model outputs documentation. (#18848 )	2022-09-01 12:05:40 +02:00
Albert Villanova del Moral	fafbb57df1	Pin rouge_score (#18247 ) * Pin rouge_score * Pin also in dependency_versions_table * Update excluded versions * Revert "Update excluded versions" This reverts commit `0d0362df30`. * Revert "Revert "Update excluded versions"" This reverts commit `66c47af8a6`.	2022-09-01 12:04:49 +02:00
Yih-Dar	e7da38f5dc	add a script to get time info. from GA workflow jobs (#18822 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-01 12:02:52 +02:00
Joao Gante	6e016634f1	Generate: smaller TF serving test (#18840 )	2022-09-01 10:53:39 +01:00
Yih-Dar	563a8d58db	Delete `state_dict` to release memory as early as possible (#18832 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-01 10:55:30 +02:00
Albert Villanova del Moral	a26c752353	Unpin fsspec (#18846 )	2022-09-01 10:20:15 +02:00
flozi00	359f7b4b8d	Create pipeline_tutorial.mdx german docs (#18625 ) * Create pipeline_tutorial.mdx * Update _toctree.yml	2022-09-01 09:57:59 +02:00
Alara Dirik	5d81a56833	Owlvit memory leak fix (#18734 ) * fix memory leak * fix typos * use singular last hidden state variable names * eliminate double call to self.owlvit to return last hidden states * eliminate 2nd call to self.vision_model in OwlViTModel	2022-09-01 10:31:08 +03:00

10786 Commits