transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Joao Gante	f09c45e067	TF: Add sigmoid activation function (#16819 )	2022-04-19 16:13:08 +01:00
wiio12	74814574ae	Add doc about `attention_mask` on gpt2 (#16829 ) * Add doc about `attention_mask` on gpt2 Add a simple sentence describing how `attention_mask` needs to be constructed when ``past_key_values` is used. * Add doc about attention_mask on gpt2_tf * clean up style * remove empty line white spaces * remove whitespace in empty line	2022-04-19 16:32:26 +02:00
NielsRogge	b96e82c80a	Add image classification script, no trainer (#16727 ) * Add first draft * Improve README and run fixup * Make script aligned with other scripts, improve README * Improve script and add test * Remove print statement * Apply suggestions from code review * Add num_labels to make test pass * Improve README	2022-04-19 16:32:08 +02:00
Patrick von Platen	db9f189121	[ASR Pipeline] Correct init docs (#16833 ) * correct * up	2022-04-19 16:12:36 +02:00
Ella Charlaix	77de8d6c31	Add onnx export of models with a multiple choice classification head (#16758 ) * Add export of models with a multiple-choice classification head	2022-04-19 15:51:51 +02:00
Wonjae Kim	b74a955325	fix `rum_clm.py` seeking text column name twice (#16624 )	2022-04-19 14:38:25 +01:00
Dahlbomii	3663fca41b	Type hints added for TFMobileBert (#16505 ) * Type hints added * make style * Return type hints added * fixed typo Co-authored-by: matt <rocketknight1@gmail.com>	2022-04-19 14:37:03 +01:00
code-review-doctor	a2392415e9	Some tests misusing assertTrue for comparisons fix (#16771 ) * Fix issue avoid-misusing-assert-true found at https://codereview.doctor * fix tests * fix tf Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:44:08 +02:00
Suraj Patil	d3bd9ac728	[Flax] improve large model init and loading (#16148 ) * begin do_init * add params_shape_tree * raise error if params are accessed when do_init is False * don't allow do_init=False when keys are missing * make shape tree a property * assign self._params at the end * add test for do_init * add do_init arg to all flax models * fix param setting * disbale do_init for composite models * update test * add do_init in FlaxBigBirdForMultipleChoice * better names and errors * improve test * style * add a warning when do_init=False * remove extra if * set params after _required_params * add test for from_pretrained * do_init => _do_init * chage warning to info * fix typo * add params in init_weights * add params to gpt neo init * add params to init_weights * update do_init test * Trigger CI * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update template * trigger CI * style * style * fix template Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:19:55 +02:00
Arthur	6de4ee61a0	Wav2 vec2 phoneme ctc tokenizer optimisation (#16817 ) * Solved href rendering issue in heading Markdown references in headings such as '####' don't render well. Replaced it with <h4>...<a></a></h> banners. * PhonemeTokenizer optimization using phonemizer lib The backend should only be initialized once, otherwise it is reloaded. Added `init_backend` function, intializes a backend attribute. Phonemize re-uses self.backend. Should give ~10 times faster phonemization. * formatted file with make style * Documentation suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update /tokenization_wav2vec2_phoneme.py based on PR suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update CONTRIBUTING.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-19 07:39:04 -04:00
Li-Huai (Allan) Lin	306c9ee966	Fix `LayoutLMv2` tokenization docstrings (#16187 ) * Fix docstrings * Fix up * Fix	2022-04-19 12:14:51 +02:00
NielsRogge	7db7aab439	Add semantic script no trainer, v2 (#16788 ) * Add first draft from previous PR * First draft * Improve README and remove num_labels * Make script more aligned with other scripts * Improve README and apply suggestion from code review	2022-04-19 09:07:29 +02:00
NielsRogge	494c2a8c4d	Clean up semantic segmentation tests (#16801 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-19 09:02:19 +02:00
David Hall	989a15d173	fix _setup_devices in case where there is no torch.distributed package in build (#16821 ) * fix _setup_devices in case where there is not torch.distributed * in training_args_sm.py as well	2022-04-18 18:36:46 -04:00
Lysandre Debut	c11a49573f	Refactor issues with yaml (#16772 ) * Refactor issues with yaml * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update .github/ISSUE_TEMPLATE/feature-request.yml Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-18 16:43:21 -04:00
jsnfly	51e0ebedcb	Allow passing encoder_ouputs as tuple to EncoderDecoder Models (#16814 ) * Add passing encoder_outputs as tuple to existing test * Add check for tuple * Add check for tuple also for speech and vision Co-authored-by: jsnfly <jsnfly@gmx.de>	2022-04-18 19:49:58 +02:00
Nicholas Broad	51fa7191b1	use base_version to check torch version in torch_less_than_1_11 (#16806 ) * use base_version * make is_torch_less_than_1_8 match 1_11 Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>	2022-04-18 13:02:00 -04:00
Patrick von Platen	8d3f952adb	[Data2Vec] Add data2vec vision (#16760 ) * save intermediate * add vision * add vision * save * finish models * finish models * continue * finish * up * up * up * tests all pass * clean up * up * up * fix bugs in beit * correct docs * finish * finish docs * make style * up * more fixes * fix type hint * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/data2vec/test_modeling_data2vec_vision.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-18 17:52:13 +02:00
Zhengqiang Yin	33cd4be576	fix megatron bert convert state dict naming (#15820 )	2022-04-18 11:34:36 -04:00
Patrick von Platen	9a2995ee39	[Quicktour Audio] Improve && remove ffmpeg dependency (#16723 ) * [Quicktour Audio] Improve && remove ffmpeg dependency * final fix * final touches	2022-04-18 16:50:13 +02:00
NielsRogge	d3c9d0e55f	[ViT, BEiT, DeiT, DPT] Improve code (#16799 ) * Improve code * Fix bugs * Fix another bug * Clean up DTP as well * Update DPT model outputs Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-18 09:25:08 -04:00
Sylvain Gugger	3785f4665a	Fix syntax error in TorchHub workflow	2022-04-18 07:54:00 -04:00
Joao Gante	6984848ed0	Create empty venv on cache miss (#16816 )	2022-04-18 07:49:31 -04:00
Allan Jie	438144832e	Raise error and suggestion when using custom optimizer with Fairscale or Deepspeed (#16786 ) * optimizer issues related to saving * remove the "optimizer saving" option * reformat using make style	2022-04-18 07:47:21 -04:00
Joao Gante	b4ddd2677c	TF generate refactor - XLA sample (#16713 )	2022-04-18 10:58:24 +01:00
Joao Gante	02de7a8e7f	CI: non-remote GH Actions now use a python venv (#16789 )	2022-04-18 09:47:38 +01:00
Sylvain Gugger	dee6f01636	Pin Jax to last working release (#16808 ) * Pin Jax to last working release * Try lower * Try lower	2022-04-16 21:15:19 -04:00
NielsRogge	78f346c2b5	Update README.md (#16797 )	2022-04-15 14:10:16 +02:00
Yih-Dar	ee209d4d01	Fix PT TF ViTMAE (#16766 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-15 06:37:10 +02:00
Stas Bekman	5da33f8729	[modeling utils] revamp `from_pretrained(..., low_cpu_mem_usage=True)` + tests (#16657 ) * add low_cpu_mem_usage tests * wip: revamping * wip * install /usr/bin/time * wip * cleanup * cleanup * cleanup * cleanup * cleanup * fix assert * put the wrapper back * cleanup; switch to bert-base-cased * Trigger CI * Trigger CI	2022-04-14 18:10:05 -07:00
Stas Bekman	ce2fef2ad2	[trainer / deepspeed] fix hyperparameter_search (#16740 ) * [trainer / deepspeed] fix hyperparameter_search * require optuna * style * oops * add dep in the right place * create deepspeed-testing dep group * Trigger CI	2022-04-14 17:24:38 -07:00
code-review-doctor	1b7de41a07	Fix issue avoid-missing-comma found at https://codereview.doctor (#16768 )	2022-04-14 16:42:27 -04:00
Sanchit Gandhi	de8b06f9bf	[SpeechEncoderDecoderModel] Fix bug in reshaping labels (#16748 )	2022-04-14 19:02:40 +01:00
NielsRogge	048443db86	Improve image classification example (#16585 ) * Improve README * Make dataset_name argument optional * Improve local data * Fix bug * Improve README some more * Apply suggestions from code review * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-14 18:10:52 +02:00
Sylvain Gugger	3e4eec47f5	Kill async pushes when calling push_to_hub with blocking=True (#16755 )	2022-04-14 10:02:29 -04:00
Stas Bekman	c21e1071a7	[deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop (#16717 ) * [deepspeed / m2m_100] make deepspeed 3 work with layerdrop * fix * revert last	2022-04-14 06:51:55 -07:00
Zachary Mueller	89293a0f6b	Make nightly install dev accelerate (#16783 )	2022-04-14 09:41:02 -04:00
Sylvain Gugger	b151ddb9b9	Fix batch size in evaluation loop (#16763 ) * Fix batch size in evaluation loop * remove debug statement	2022-04-14 09:22:54 -04:00
Sanchit Gandhi	d8269eb4d5	[Flax `.from_pretrained`] Raise a warning if model weights are not in float32 (#16762 ) * [Flax] Raise a warning if model weights are not in float32 * apply suggestions and few small changes * reorder wording for better readability	2022-04-14 11:52:15 +02:00
Nicolas Patry	195fbbb6cf	Enabling `Tapex` in table question answering pipeline. (#16663 ) * Enabling `Tapex` in table question answering pipeline. * Questions are independant for Tapex, making the test respect that. * Missing extra space.	2022-04-14 09:06:14 +02:00
Bhadresh Savani	442dc45645	[Doctest] added doctest changes for electra (#16675 ) * added doctest changes for electra * fixed doctest tests * updated changes	2022-04-13 22:39:00 +02:00
Zachary Mueller	be752d12f8	Fixup no_trainer examples scripts and add more tests (#16765 ) * Change tracking to store_true * Remove step param and use it in the log dictionary directly * use vars(args) when passing args to init_trackers * Include tracking tests since tensorboard is already a dep	2022-04-13 14:40:48 -04:00
Stas Bekman	3a16ab25c8	[self-scheduled ci] explain where dependencies are (#16757 )	2022-04-13 12:28:02 -04:00
Tu Vu	34ef029dc0	Add self training code for text classification (#16738 ) * Add self-training code for text-classification * Add self-training code for text-classification * Add self-training code for text-classification * Add self-training code for text-classification * Add self-training code for text-classification * Delete strata	2022-04-13 12:03:24 -04:00
Sylvain Gugger	8e0d3b427f	Add defensive check for config num_labels and id2label (#16709 ) * Add defensive check for config num_labels and id2label * Actually check value... * Only warning inside init plus better error message	2022-04-13 11:28:19 -04:00
Yih-Dar	6bed0647fe	Reduce Funnel PT/TF diff (#16744 ) * Make Funnel Test less flaky Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-13 17:19:52 +02:00
Joao Gante	0b8f697219	CI: setup-dependent pip cache (#16751 ) * Setup-dependent pip cache * Do not restore from old versions	2022-04-13 16:19:14 +01:00
Stas Bekman	ac43a40e6a	[modeling_utils] better explanation of ignore keys (#16741 )	2022-04-13 08:03:20 -07:00
Jeremy Fisher	0235bc57ab	Fix and improve CTRL doctests (#16573 ) * Improve CTRL doctests * Fix `CTRLForSequenceClassification` flakiness with inconsistent losses * Remove unused * Fixup * Add CTRL to documentation_tests.txt * Fix control code not being first * Add output assertions * Change from sshleifer/tiny-ctrl -> ctrl * Run `make fixup` * apply `list` to output logits shape for clarity * Reduce output loss precision to make assertion more robust * Add assertion of control code being first * Fix docstyle * upper case sentence following control code * Weird bug fixes * Add a better generation example Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2022-04-13 15:44:31 +02:00
Michael Chung	06b4aac9eb	Add Doc Test for GPT-J (#16507 ) * Required the values GPTJ unfortunately cannot run the model =) * Added the file to the doc tests * Run Fixup and Style * Fixed with the test versions of gptj. Ran Style and Fixup. * Trigger ci * A Minor Change to License * Fixed spacing added to the benchmark_utils. Then refactored tests to const variables. * Removed strings that were included as default parameters anyways. Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>	2022-04-13 15:04:47 +02:00

9595 Commits