* Change the tracking argument to a `store_true` flag
* Remove the `step` param and pass the step in the log dictionary directly
* Use `vars(args)` when passing the args to `init_trackers`
* Include tracking tests since tensorboard is already a dep
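A minimal sketch of the resulting tracking setup, assuming the flag is named `--with_tracking` and the project name is illustrative:

```python
import argparse

from accelerate import Accelerator

parser = argparse.ArgumentParser()
# The tracking option is now a boolean flag instead of taking a value.
parser.add_argument(
    "--with_tracking", action="store_true", help="Enable experiment trackers for logging."
)
args = parser.parse_args()

accelerator = Accelerator(log_with="tensorboard") if args.with_tracking else Accelerator()
if args.with_tracking:
    # Pass the whole argument namespace as the tracked config via vars(args).
    accelerator.init_trackers("example_project", config=vars(args))
    # The step now travels inside the log dictionary rather than as a separate param.
    accelerator.log({"train_loss": 0.5, "epoch": 0, "step": 0})
```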
* Improve CTRL doctests
* Fix `CTRLForSequenceClassification` flakiness with inconsistent losses
* Remove unused
* Fixup
* Add CTRL to documentation_tests.txt
* Fix control code not being first
* Add output assertions
* Change from sshleifer/tiny-ctrl -> ctrl
* Run `make fixup`
* apply `list` to output logits shape for clarity
* Reduce output loss precision to make assertion more robust
* Add assertion of control code being first
* Fix docstyle
* Upper-case the sentence following the control code
* Weird bug fixes
* Add a better generation example
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
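A hedged sketch of the improved CTRL generation example described above: the prompt must begin with a control code ("Opinion" here), followed by an upper-cased sentence:

```python
from transformers import CTRLLMHeadModel, CTRLTokenizer

tokenizer = CTRLTokenizer.from_pretrained("ctrl")
model = CTRLLMHeadModel.from_pretrained("ctrl")

# CTRL was trained with control codes as the first token of the prompt.
inputs = tokenizer("Opinion My dog is cute", return_tensors="pt")
assert inputs["input_ids"][0, 0].item() in tokenizer.control_codes.values()

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```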
* Added the required values; unfortunately we cannot run the full GPT-J model =)
* Added the file to the doc tests
* Run Fixup and Style
* Fixed using the test versions of GPT-J; ran style and fixup.
* Trigger ci
* A minor change to the license
* Fixed the spacing added to `benchmark_utils`, then refactored the tests to use constant variables.
* Removed strings that were included as default parameters anyway.
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
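A sketch of the doc-test pattern with a tiny test checkpoint, since the full GPT-J is too large to run in CI (the checkpoint name is an assumption):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Tiny random GPT-J stand-in; the real doc test hard-codes the full model's values.
checkpoint = "hf-internal-testing/tiny-random-gptj"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
print(list(outputs.logits.shape))  # [batch_size, sequence_length, vocab_size]
```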
* Fix setters of *_token_id properties of SpecialTokensMixin
* Test the setters of common token ids
* Move the checks of the token id setters into a separate test
* Add independent test for ByT5
* Add Canine test
* Test speech to text
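A minimal sketch of the setter behavior under test: assigning a `*_token_id` through the `SpecialTokensMixin` setter should stay consistent with the matching `*_token`:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# Setting the id-side property should update the string-side property consistently.
tokenizer.pad_token_id = tokenizer.eos_token_id
assert tokenizer.pad_token == tokenizer.eos_token
assert tokenizer.pad_token_id == tokenizer.eos_token_id
```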
* Change the `chunk_iter` function to handle
the subtle case where the last chunk would get ignored because all of its
data is already contained in the `left_strided` data.
We need to remove the right striding on the previous item (see the sketch below).
* Remove commented line.
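An illustrative sketch (not the library's actual `chunk_iter`) of the striding logic after the fix: the final chunk drops its right stride, so a chunk whose fresh data is entirely covered by the left stride is not lost:

```python
def chunk_iter(seq, chunk_len, stride_left, stride_right):
    """Yield (chunk, left, right) windows over seq with overlap striding.

    Assumes chunk_len > stride_left + stride_right.
    """
    step = chunk_len - stride_left - stride_right
    for chunk_start in range(0, len(seq), step):
        chunk = seq[chunk_start : chunk_start + chunk_len]
        is_first = chunk_start == 0
        is_last = chunk_start + step >= len(seq)
        # The last chunk has no right striding to remove; forgetting this is
        # what caused the final chunk to be ignored.
        yield chunk, 0 if is_first else stride_left, 0 if is_last else stride_right
        if is_last:
            break
```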
This avoids an unnecessary call and avoids problems during
initialization of class hierarchies.
Co-authored-by: Samuel Melm <samuel.melm@stud.uni-heidelberg.de>
* First pass; all tests pass
* WIP
* Adding file to documentation tests
* Change the base model for the example in the doc test.
* Fix code styling by running `make fixup`
* Called Style
* Reverted to the gpt2 model rather than distilgpt2,
then used a token-classification model instead of a sequence-classification model for the example (see the sketch below).
* Fix Styling Issue
* Hopefully ignores the formatting issue.
Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
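A sketch of the kind of doc-test example described above, using a token-classification checkpoint (the checkpoint here is illustrative, not necessarily the one in the PR):

```python
import torch

from transformers import AutoModelForTokenClassification, AutoTokenizer

checkpoint = "dbmdz/bert-large-cased-finetuned-conll03-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(checkpoint)

inputs = tokenizer("HuggingFace is based in New York City", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# One label per token, taken from the model's id2label mapping.
predictions = logits.argmax(dim=-1)
print([model.config.id2label[p.item()] for p in predictions[0]])
```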
* Fix t5 shard on TPU Pods
The current script doesn't work properly on a TPU pod because the global batch is not divided correctly per host.
This pull request fixes the issue by dividing the global batch across the hosts before it is sharded on each host (see the sketch below).
* fix style
Co-authored-by: ahmed-elnaggar <ahmed.elnaggar@allianz.com>
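A hedged sketch of the per-host division on a TPU pod (JAX), assuming a pmap-style setup: each host first takes its own slice of the global batch, then shards that slice over its local devices:

```python
import jax

global_batch_size = 512
num_hosts = jax.process_count()  # number of hosts in the pod
per_host_batch_size = global_batch_size // num_hosts

# Each host selects only its own slice of the global batch...
host_start = jax.process_index() * per_host_batch_size

# ...and that slice is then split across the host's local devices, e.g. by
# reshaping to (local_devices, per_device_batch, ...) before a pmapped step.
per_device_batch_size = per_host_batch_size // jax.local_device_count()
```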
I create an archive of older checkpoints during training; the archived checkpoint has a name of the form `f"{checkpoint_prefix}-*.zip/.tar"`.
Previously, `glob(f"{checkpoint_prefix}-*")` matched all files and folders starting with the checkpoint prefix, while the later `shutil.rmtree(checkpoint)` expects a folder name; since it may at some point receive a zip file, it crashes training. Adding `if os.path.isdir(x)` keeps only folders in `glob_checkpoints`.
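A sketch of the guard described above, so rotation never hands `shutil.rmtree` a `.zip`/`.tar` archive:

```python
import glob
import os
import shutil

checkpoint_prefix = "checkpoint"
# Keep only directories: archives like "checkpoint-500.zip" are filtered out.
glob_checkpoints = [x for x in glob.glob(f"{checkpoint_prefix}-*") if os.path.isdir(x)]
for checkpoint in glob_checkpoints:  # e.g. the checkpoints selected for rotation
    shutil.rmtree(checkpoint)
```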
* add simple multi-GPU completion
* add human_eval_multi_gpu
* use a copy strategy to distribute across GPUs, to avoid padding
* add doc string
* update code style
* use task id to arrange output
* truncate the input to avoid zero padding
* Stop the copy mechanism
* update style
* restore copies to scale better in distributed mode
* update style
* replace human eval
* Apply suggestions from code review
1. Tokenize all inputs at the same time
2. Use the attention_mask to get the input length (see the sketch below)
3. Other small fixes
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
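A sketch of points 1 and 2 above: tokenize all prompts in one call and recover each prompt's true length from `attention_mask` (the model and prompts are illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

prompts = ["def add(a, b):", "def fibonacci(n):"]
batch = tokenizer(prompts, padding=True, return_tensors="pt")
# Number of real (non-pad) tokens per prompt.
input_lengths = batch["attention_mask"].sum(dim=1)
```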
* correct typo and update docstring
* update code style
* remove num sample division constraint
* remove max len calculation
* use accelerator.gather once to speed up
* use accelerate set_seed; update accelerate version
* correct gather bug
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
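A hedged sketch of the distributed pattern referenced above: seed once with accelerate's `set_seed`, let each process generate locally, and call `accelerator.gather` a single time at the end:

```python
import torch

from accelerate import Accelerator
from accelerate.utils import set_seed

set_seed(42)
accelerator = Accelerator()

# Each process produces its own generations (random ids stand in for them here).
local_outputs = torch.randint(0, 100, (8, 16), device=accelerator.device)

# One gather across processes at the end is far cheaper than gathering per step.
all_outputs = accelerator.gather(local_outputs)
```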
* update proto sentencepiece model
* Revert "update proto sentencepiece model"
This reverts commit b07f671747.
* add check
* add test
* Revert "Revert "update proto sentencepiece model""
This reverts commit 46108257b8.
* test for log level
* test for log level 2
* emit the warning at the warning level
* clean
* format
* add explanation in docstring
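A minimal sketch of what the log-level tests above check: the message is emitted at the WARNING level through transformers' logging utilities, so it honors the configured verbosity (the message text is illustrative):

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)
# Visible at the default verbosity; silenced by logging.set_verbosity_error().
logger.warning("An illustrative message emitted at the warning level.")
```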
* Fixed some bugs involving saving during epochs
* Added tests mimicking the existing examples tests
* Added in json exporting to all `no_trainer` examples for consistency
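A sketch of the consistent JSON export: final metrics dumped to `all_results.json` in the output directory (the metric values are illustrative; the file name matches the examples' convention):

```python
import json
import os

output_dir = "out"  # stands in for args.output_dir
os.makedirs(output_dir, exist_ok=True)

results = {"eval_accuracy": 0.87, "train_loss": 0.42}
with open(os.path.join(output_dir, "all_results.json"), "w") as f:
    json.dump(results, f)
```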