transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Leonid Boytsov	c82e017aa9	Misc. fixes for Pytorch QA examples: (#16958 ) 1. Fixes evaluation errors popping up when you train/eval on squad v2 (one was newly encountered and one was previously reported in "Running SQuAD 1.0 sample command raises IndexError" #15401 but not completely fixed). 2. Removes boolean arguments that don't use store_true. Please don't use these: ANY non-empty string is converted to True in this case, which is clearly not the desired behavior (and it creates a LOT of confusion). 3. All no-trainer test scripts now save metric values in the same way (with the right prefix eval_), which is consistent with the trainer-based versions. 4. Adds forgotten model.eval() in the no-trainer versions. This improved some results, but not everything (see the discussion at the end). Please see the F1 scores and the discussion below.	2022-04-27 08:51:39 -04:00
Yih-Dar	49d5bcb0f3	Fix HubertRobustTest PT/TF equivalence test on GPU (#16943 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-27 10:50:03 +02:00
NielsRogge	479fdc4925	Add semantic script, trainer (#16834 ) * Add first draft * Improve script and README * Improve README * Apply suggestions from code review * Improve script, add link to resulting model * Add corresponding test * Adjust learning rate	2022-04-27 10:12:18 +02:00
Anton Lozhkov	a4a88fa09f	[Research] Speed up evaluation for XTREME-S (#16785 ) * Avoid repeated per-lang filtering * Language groups and logits preprocessing * Style	2022-04-27 08:34:21 +02:00
Yongliang Shen	2d91e3c304	use original loaded keys to find mismatched keys (#16920 )	2022-04-26 17:29:52 -04:00
nikkie	d365f5074f	Fix RuntimeError message format (#16906 )	2022-04-26 17:08:28 -04:00
Yang Ming	10dfa126b7	documentation: some minor clean up (#16850 )	2022-04-26 16:56:08 -04:00
Krishna Sirumalla	aaee4038c3	Add onnx config for RoFormer (#16861 ) * add roformer onnx config	2022-04-26 16:51:15 +02:00
Ahmed Elnaggar	8afaaa26f5	Fix Iterations for decoder (#16934 )	2022-04-26 12:54:14 +02:00
Manuel	fa32247406	apply torch int div to layoutlmv2 (#15457 ) * apply torch int div * black linting fixup * update path to torch_int_div * clarify imports	2022-04-26 10:07:51 +02:00
Sylvain Gugger	344b9fb0c6	Limit the use of PreTrainedModel.device (#16935 ) * Limit the use of PreTrainedModel.device * Fix	2022-04-25 20:58:50 -04:00
code-review-doctor	6568752039	Fix issue probably-meant-fstring found at https://codereview.doctor (#16913 )	2022-04-25 15:15:00 -04:00
Sanchit Gandhi	fea94d6790	Replace deprecated logger.warn with warning (#16876 )	2022-04-25 15:12:51 -04:00
Joao Gante	e03966e404	TF: XLA stable softmax (#16892 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-25 20:10:51 +01:00
Rushi Chaudhari	8246caf3eb	added deit onnx config (#16887 ) * added deit onnx config	2022-04-25 20:50:45 +02:00
Joao Gante	9331b37967	TF: XLA Logits Warpers (#16899 ) Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-04-25 19:48:08 +01:00
Joao Gante	809dac48f9	TF: XLA logits processors - minimum length, forced eos, and forced bos (#16912 ) * XLA min len, forced eos, and forced bos Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-04-25 19:27:53 +01:00
Yih-Dar	f6210c49e2	Fix RemBertTokenizerFast (#16933 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-25 19:51:50 +02:00
Yih-Dar	32adbb26d6	Fix PyTorch RAG tests GPU OOM (#16881 ) * add torch.cuda.empty_cache in some PT RAG tests * torch.cuda.empty_cache in tearDownModule() * tearDown() * add gc.collect() Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-25 17:33:56 +02:00
Yih-Dar	3e47d19cfc	Add missing ckpt in config docs (#16900 ) * add missing ckpt in config docs * add more missing ckpt in config docs * fix wrong ckpts * fix realm ckpt * fix s2t2 * fix xlm_roberta ckpt * Fix for deberta v2 * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * use only one checkpoint for DPR * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-04-25 17:31:45 +02:00
Patrick von Platen	3a71e94a92	Fix doc test quicktour dataset (#16929 ) * fix doc test * fix doc test Co-authored-by: Patrick <patrick@pop-os.localdomain>	2022-04-25 16:26:59 +02:00
Thomas Chaigneau	508baf1943	add bigbird typo fixes (#16897 ) Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>	2022-04-25 11:32:06 +02:00
Patrick von Platen	72728be3db	[DocTests] Fix some doc tests (#16889 ) * [DocTests] Fix some doc tests * hacky fix * correct	2022-04-23 08:40:14 +02:00
cavdard	22fc93c4d9	Changes in create_optimizer to support tensor parallelism with SMP (#16880 ) * changes in create optimizer to support tensor parallelism with SMP * Update src/transformers/trainer.py Convert if check to one line. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-22 15:24:38 -04:00
Joao Gante	99c8226b12	TF: XLA repetition penalty (#16879 )	2022-04-22 18:29:32 +01:00
Thomas Chaigneau	ec81c11a18	Add OnnxConfig for ConvBERT (#16859 ) * add OnnxConfig for ConvBert Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>	2022-04-22 18:19:15 +02:00
Minh Chien Vu	0d1cff1195	Add doc tests for Albert and Bigbird (#16774 ) * Add doctest BERT * make fixup * fix typo * change checkpoints * make fixup * define doctest output value, update doctest for mobilebert * solve fix-copies * update QA target start index and end index * change checkpoint for docs and reuse defined variable * Update src/transformers/models/bert/modeling_tf_bert.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * make fixup * Add Doctest for Albert and Bigbird * make fixup * overwrite examples for Albert and Bigbird * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update longer examples for Bigbird * using examples from squad_v2 * print out example text * change name token-classification-big-bird checkpoint to random Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-22 18:07:16 +02:00
Mario Šaško	9fa88172c2	Minor fixes/improvements in `convert_file_size_to_int` (#16891 ) * Minor improvements to `convert_file_size_to_int` * Add <unit>bit version to kilos and megas * Minor fix	2022-04-22 16:54:20 +02:00
Joao Gante	6d90d76f5d	TF: rework XLA generate tests (#16866 )	2022-04-22 12:38:08 +01:00
Yih-Dar	3b1bbefc47	Add missing entries in mappings (#16857 ) * add missing entries in some mappings Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-22 10:53:24 +02:00
Loubna Ben Allal	d91841315a	New features for CodeParrot training script (#16851 ) * add tflops logging and fix grad accumulation * add accelerate tracking and checkpointing * scale loss of last batch correctly * fix typo * compress loss computation Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * add resume from checkpoint argument * add load_state accelerate from checkpoint, register lr scheduler and add tflops function * reformat code * reformat code * add condition on path for resume checkpoint * combine if conditions Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * add source for tflops formula Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2022-04-21 18:43:46 +02:00
Yih-Dar	eef2422e96	Fix doctest list (#16878 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-21 18:12:14 +02:00
Thomas Chaigneau	0b1e0fcf7a	Fix GPT-J onnx conversion (#16780 ) * add gptj to TOKENIZER_MAPPING_NAMES * fix int32 to float to avoid problem in onnx * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2022-04-21 15:55:30 +02:00
Eldar Kurtic	bae9b6458c	Use ACT2FN to fetch ReLU activation (#16874 ) - all activations should be fetched through ACT2FN - it returns ReLU as `nn.Module`, which allows attaching hooks on the activation function and prints it to stdout when `print(model)`	2022-04-21 09:33:29 -04:00
Sylvain Gugger	cb555af2c7	Return input_ids in ImageGPT feature extractor (#16872 )	2022-04-21 09:09:00 -04:00
Nicolas Patry	e789418ebe	Adding support for `array` key in raw dictionaries in ASR pipeline. (#16827 ) * Adding support for `array` key in raw dictionaries in ASR pipeline. * ES . * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Making it work by not popping `array` first. * Black 22.3 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-21 14:39:10 +02:00
ghlai9665	daf520b033	tiny tweak to allow BatchEncoding.token_to_char when token doesn't correspond to chars (#15901 ) * tweak to allow BatchEncoding.char_to_token(0) * update docstring * remove trailing whitespace * make fixup * make value checking for span_indices explicit Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-21 08:07:54 -04:00
Stefan Schweter	cb7e166428	t5: add conversion script for T5X to FLAX (#16853 ) * t5: add conversion script for T5X to FLAX * t5: make flake happy * t5: add copyright message to t5x conversion script * t5: fix lm head for v1.0 checkpoints	2022-04-21 13:00:35 +02:00
Nicolas Patry	6620f60c0a	Long QuestionAnsweringPipeline fix. (#16778 ) * Temporary commit with the long QA fix. * Adding slow tests covering this fix. * Removing fast test as it doesn't fail anyway.	2022-04-21 09:59:25 +02:00
Zachary Mueller	705d65368f	Fix multiproc metrics in no_trainer examples (#16865 )	2022-04-20 17:26:27 -04:00
Sylvain Gugger	175da8d182	Fix custom init sorting script (#16864 )	2022-04-20 17:05:39 -04:00
Stas Bekman	67ed0e43dc	[docs] fix url (#16860 )	2022-04-20 11:01:24 -07:00
Stas Bekman	afa1ef0992	[modeling_utils] use less cpu memory with sharded checkpoint loading (#16844 ) * less cpu memory with sharded checkpoint loading * Trigger CI * Trigger CI	2022-04-20 07:44:37 -07:00
Nicolas Patry	e13a91fe60	Fixing return type tensor with `num_return_sequences>1`. (#16828 ) * Fixing return type tensor with `num_return_sequences>1`. * Nit.	2022-04-20 16:11:51 +02:00
Yang Ming	ff06b17791	add DebertaV2 fast tokenizer (#15529 ) Co-authored-by: alcinos <carion.nicolas@gmail.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-20 10:26:51 +02:00
Patrick von Platen	e1c153cbaa	[Typo] Fix typo in modeling utils (#16840 )	2022-04-19 23:09:03 +02:00
Manuel R. Ciosici	3104036e7f	Add support for bitsandbytes (#15622 ) * Add initial BNB integration * fixup! Add initial BNB integration * Add bnb test decorator * Update Adamw8bit option name * Use the full bnb package name * Override bnb for all embedding layers * Fix package name * Formatting * Remove unnecessary import * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename AdamwBNB optimizer option * Add training test checking that bnb memory utilization is lower * fix merge * fix merge; fix + extend new test * cleanup * expand bnb * move all require_* candidates to testing_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-04-19 16:01:29 -04:00
Yih-Dar	e6d23a4b9b	Improve test_pt_tf_model_equivalence on PT side (#16731 ) * Update test_pt_tf_model_equivalence on PT side Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-04-19 21:13:27 +02:00
Dahlbomii	3dd57b15c5	Type hints added to Speech to Text (#16506 ) * Type hints added * return hints added * Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2022-04-19 17:58:08 +01:00
SaulLu	1efca4e6c8	replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in some docstrings (#16835 ) * replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring * quality	2022-04-19 18:32:22 +02:00


9646 Commits
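
Commit c82e017aa9 in the log above removes boolean arguments that don't use store_true, because any non-empty string is converted to True. A minimal sketch of that pitfall using only Python's standard argparse module is shown here; the flag names --buggy_flag and --fixed_flag are hypothetical, chosen for illustration, and are not taken from the example scripts themselves.

```python
import argparse

# Hypothetical parser for illustration; not the parser used by the QA examples.
parser = argparse.ArgumentParser()

# Pitfall: type=bool calls bool() on the raw string, and bool("False") is True
# because every non-empty string is truthy.
parser.add_argument("--buggy_flag", type=bool, default=False)

# Safer pattern: store_true takes no value; the flag's presence means True.
parser.add_argument("--fixed_flag", action="store_true")

# The user tries to switch the option off, but it ends up switched on.
buggy = parser.parse_args(["--buggy_flag", "False"])

# With store_true, omitting the flag gives False and passing it gives True.
fixed_off = parser.parse_args([])
fixed_on = parser.parse_args(["--fixed_flag"])

print(buggy.buggy_flag, fixed_off.fixed_flag, fixed_on.fixed_flag)
```

With type=bool, the only string that parses as False is the empty string, which is why the store_true form is the less surprising choice for on/off options.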