transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
Stas Bekman	67ed0e43dc	[docs] fix url (#16860 )	2022-04-20 11:01:24 -07:00
Stas Bekman	afa1ef0992	[modeling_utils] use less cpu memory with sharded checkpoint loading (#16844 ) * less cpu memory with sharded checkpoint loading * Trigger CI * Trigger CI	2022-04-20 07:44:37 -07:00
Nicolas Patry	e13a91fe60	Fixing return type tensor with `num_return_sequences>1`. (#16828 ) * Fixing return type tensor with `num_return_sequences>1`. * Nit.	2022-04-20 16:11:51 +02:00
Yang Ming	ff06b17791	add DebertaV2 fast tokenizer (#15529 ) Co-authored-by: alcinos <carion.nicolas@gmail.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-20 10:26:51 +02:00
Patrick von Platen	e1c153cbaa	[Typo] Fix typo in modeling utils (#16840 )	2022-04-19 23:09:03 +02:00
Manuel R. Ciosici	3104036e7f	Add support for bitsandbytes (#15622 ) * Add initial BNB integration * fixup! Add initial BNB integration * Add bnb test decorator * Update Adamw8bit option name * Use the full bnb package name * Overide bnb for all embedding layers * Fix package name * Formatting * Remove unnecessary import * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename AdamwBNB optimizer option * Add training test checking that bnb memory utilization is lower * fix merge * fix merge; fix + extend new test * cleanup * expand bnb * move all require_* candidates to testing_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-04-19 16:01:29 -04:00
Yih-Dar	e6d23a4b9b	Improve test_pt_tf_model_equivalence on PT side (#16731 ) * Update test_pt_tf_model_equivalence on PT side Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-19 21:13:27 +02:00
Dahlbomii	3dd57b15c5	Type hints added to Speech to Text (#16506 ) * Type hints added * return hints added * Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-04-19 17:58:08 +01:00
SaulLu	1efca4e6c8	replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in some docstrings (#16835 ) * replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring * quality	2022-04-19 18:32:22 +02:00
Jeevesh Juneja	b5c6a63ed9	Correct Logging of Eval metric to Tensorboard (#16825 ) * Correct Logging of Eval metric to Tensorboard An empty dictionary ``eval_metrics`` was being logged, is replaced by ``eval_metric`` which is the output dictionary of ``metric.compute()``. * Remove unused variable	2022-04-19 17:27:54 +02:00
Joao Gante	f09c45e067	TF: Add sigmoid activation function (#16819 )	2022-04-19 16:13:08 +01:00
wiio12	74814574ae	Add doc about `attention_mask` on gpt2 (#16829 ) * Add doc about `attention_mask` on gpt2 Add a simple sentence describing how `attention_mask` needs to be constructed when ``past_key_values` is used. * Add doc about attention_mask on gpt2_tf * clean up style * remove empty line white spaces * remove whitespace in empty line	2022-04-19 16:32:26 +02:00
NielsRogge	b96e82c80a	Add image classification script, no trainer (#16727 ) * Add first draft * Improve README and run fixup * Make script aligned with other scripts, improve README * Improve script and add test * Remove print statement * Apply suggestions from code review * Add num_labels to make test pass * Improve README	2022-04-19 16:32:08 +02:00
Patrick von Platen	db9f189121	[ASR Pipeline] Correct init docs (#16833 ) * correct * up	2022-04-19 16:12:36 +02:00
Ella Charlaix	77de8d6c31	Add onnx export of models with a multiple choice classification head (#16758 ) * Add export of models with a multiple-choice classification head	2022-04-19 15:51:51 +02:00
Wonjae Kim	b74a955325	fix `rum_clm.py` seeking text column name twice (#16624 )	2022-04-19 14:38:25 +01:00
Dahlbomii	3663fca41b	Type hints added for TFMobileBert (#16505 ) * Type hints added * make style * Return type hints added * fixed typo Co-authored-by: matt <rocketknight1@gmail.com>	2022-04-19 14:37:03 +01:00
code-review-doctor	a2392415e9	Some tests misusing assertTrue for comparisons fix (#16771 ) * Fix issue avoid-misusing-assert-true found at https://codereview.doctor * fix tests * fix tf Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:44:08 +02:00
Suraj Patil	d3bd9ac728	[Flax] improve large model init and loading (#16148 ) * begin do_init * add params_shape_tree * raise error if params are accessed when do_init is False * don't allow do_init=False when keys are missing * make shape tree a property * assign self._params at the end * add test for do_init * add do_init arg to all flax models * fix param setting * disbale do_init for composite models * update test * add do_init in FlaxBigBirdForMultipleChoice * better names and errors * improve test * style * add a warning when do_init=False * remove extra if * set params after _required_params * add test for from_pretrained * do_init => _do_init * chage warning to info * fix typo * add params in init_weights * add params to gpt neo init * add params to init_weights * update do_init test * Trigger CI * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update template * trigger CI * style * style * fix template Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:19:55 +02:00
Arthur	6de4ee61a0	Wav2 vec2 phoneme ctc tokenizer optimisation (#16817 ) * Solved href rendering issue in heading Markdown references in headings such as '####' don't render well. Replaced it with <h4>...<a></a></h> banners. * PhonemeTokenizer optimization using phonemizer lib The backend should only be initialized once, otherwise it is reloaded. Added `init_backend` function, intializes a backend attribute. Phonemize re-uses self.backend. Should give ~10 times faster phonemization. * formatted file with make style * Documentation suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update /tokenization_wav2vec2_phoneme.py based on PR suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update CONTRIBUTING.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-19 07:39:04 -04:00
Li-Huai (Allan) Lin	306c9ee966	Fix `LayoutLMv2` tokenization docstrings (#16187 ) * Fix docstrings * Fix up * Fix	2022-04-19 12:14:51 +02:00
NielsRogge	7db7aab439	Add semantic script no trainer, v2 (#16788 ) * Add first draft from previous PR * First draft * Improve README and remove num_labels * Make script more aligned with other scripts * Improve README and apply suggestion from code review	2022-04-19 09:07:29 +02:00
NielsRogge	494c2a8c4d	Clean up semantic segmentation tests (#16801 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-19 09:02:19 +02:00
David Hall	989a15d173	fix _setup_devices in case where there is no torch.distributed package in build (#16821 ) * fix _setup_devices in case where there is not torch.distributed * in training_args_sm.py as well	2022-04-18 18:36:46 -04:00
Lysandre Debut	c11a49573f	Refactor issues with yaml (#16772 ) * Refactor issues with yaml * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update .github/ISSUE_TEMPLATE/feature-request.yml Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-18 16:43:21 -04:00
jsnfly	51e0ebedcb	Allow passing encoder_ouputs as tuple to EncoderDecoder Models (#16814 ) * Add passing encoder_outputs as tuple to existing test * Add check for tuple * Add check for tuple also for speech and vision Co-authored-by: jsnfly <jsnfly@gmx.de>	2022-04-18 19:49:58 +02:00
Nicholas Broad	51fa7191b1	use base_version to check torch version in torch_less_than_1_11 (#16806 ) * use base_version * make is_torch_less_than_1_8 match 1_11 Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>	2022-04-18 13:02:00 -04:00
Patrick von Platen	8d3f952adb	[Data2Vec] Add data2vec vision (#16760 ) * save intermediate * add vision * add vision * save * finish models * finish models * continue * finish * up * up * up * tests all pass * clean up * up * up * fix bugs in beit * correct docs * finish * finish docs * make style * up * more fixes * fix type hint * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/data2vec/test_modeling_data2vec_vision.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-18 17:52:13 +02:00
Zhengqiang Yin	33cd4be576	fix megatron bert convert state dict naming (#15820 )	2022-04-18 11:34:36 -04:00
Patrick von Platen	9a2995ee39	[Quicktour Audio] Improve && remove ffmpeg dependency (#16723 ) * [Quicktour Audio] Improve && remove ffmpeg dependency * final fix * final touches	2022-04-18 16:50:13 +02:00
NielsRogge	d3c9d0e55f	[ViT, BEiT, DeiT, DPT] Improve code (#16799 ) * Improve code * Fix bugs * Fix another bug * Clean up DTP as well * Update DPT model outputs Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-18 09:25:08 -04:00
Sylvain Gugger	3785f4665a	Fix syntax error in TorchHub workflow	2022-04-18 07:54:00 -04:00
Joao Gante	6984848ed0	Create empty venv on cache miss (#16816 )	2022-04-18 07:49:31 -04:00
Allan Jie	438144832e	Raise error and suggestion when using custom optimizer with Fairscale or Deepspeed (#16786 ) * optimizer issues related to saving * remove the "optimizer saving" option * reformat using make style	2022-04-18 07:47:21 -04:00
Joao Gante	b4ddd2677c	TF generate refactor - XLA sample (#16713 )	2022-04-18 10:58:24 +01:00
Joao Gante	02de7a8e7f	CI: non-remote GH Actions now use a python venv (#16789 )	2022-04-18 09:47:38 +01:00
Sylvain Gugger	dee6f01636	Pin Jax to last working release (#16808 ) * Pin Jax to last working release * Try lower * Try lower	2022-04-16 21:15:19 -04:00
NielsRogge	78f346c2b5	Update README.md (#16797 )	2022-04-15 14:10:16 +02:00
Yih-Dar	ee209d4d01	Fix PT TF ViTMAE (#16766 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-15 06:37:10 +02:00
Stas Bekman	5da33f8729	[modeling utils] revamp `from_pretrained(..., low_cpu_mem_usage=True)` + tests (#16657 ) * add low_cpu_mem_usage tests * wip: revamping * wip * install /usr/bin/time * wip * cleanup * cleanup * cleanup * cleanup * cleanup * fix assert * put the wrapper back * cleanup; switch to bert-base-cased * Trigger CI * Trigger CI	2022-04-14 18:10:05 -07:00
Stas Bekman	ce2fef2ad2	[trainer / deepspeed] fix hyperparameter_search (#16740 ) * [trainer / deepspeed] fix hyperparameter_search * require optuna * style * oops * add dep in the right place * create deepspeed-testing dep group * Trigger CI	2022-04-14 17:24:38 -07:00
code-review-doctor	1b7de41a07	Fix issue avoid-missing-comma found at https://codereview.doctor (#16768 )	2022-04-14 16:42:27 -04:00
Sanchit Gandhi	de8b06f9bf	[SpeechEncoderDecoderModel] Fix bug in reshaping labels (#16748 )	2022-04-14 19:02:40 +01:00
NielsRogge	048443db86	Improve image classification example (#16585 ) * Improve README * Make dataset_name argument optional * Improve local data * Fix bug * Improve README some more * Apply suggestions from code review * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-14 18:10:52 +02:00
Sylvain Gugger	3e4eec47f5	Kill async pushes when calling push_to_hub with blocking=True (#16755 )	2022-04-14 10:02:29 -04:00
Stas Bekman	c21e1071a7	[deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop (#16717 ) * [deepspeed / m2m_100] make deepspeed 3 work with layerdrop * fix * revert last	2022-04-14 06:51:55 -07:00
Zachary Mueller	89293a0f6b	Make nightly install dev accelerate (#16783 )	2022-04-14 09:41:02 -04:00
Sylvain Gugger	b151ddb9b9	Fix batch size in evaluation loop (#16763 ) * Fix batch size in evaluation loop * remove debug statement	2022-04-14 09:22:54 -04:00
Sanchit Gandhi	d8269eb4d5	[Flax `.from_pretrained`] Raise a warning if model weights are not in float32 (#16762 ) * [Flax] Raise a warning if model weights are not in float32 * apply suggestions and few small changes * reorder wording for better readability	2022-04-14 11:52:15 +02:00
Nicolas Patry	195fbbb6cf	Enabling `Tapex` in table question answering pipeline. (#16663 ) * Enabling `Tapex` in table question answering pipeline. * Questions are independant for Tapex, making the test respect that. * Missing extra space.	2022-04-14 09:06:14 +02:00

9605 Commits