transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-02 19:21:31 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	7d4fe85ef3	Fix psuh_to_hub in Trainer when nothing needs pushing (#23751 )	2023-05-25 09:38:09 -04:00
Ravi Theja	06c28cd0fc	Add LlamaIndex to awesome-transformers.md (#23484 )	2023-05-25 09:35:10 -04:00
Eric J. Wang	f0a2a82ab4	Fix `pip install --upgrade accelerate` command in modeling_utils.py (#23747 ) Fix command in modeling_utils.py	2023-05-25 07:48:48 -04:00
Matt	e45e756d22	Remove the last few TF serving sigs (#23738 ) Remove some more serving methods that (I think?) turned up while this PR was open	2023-05-24 21:19:44 +01:00
Sylvain Gugger	9850e6ddab	Enable prompts on the Hub (#23662 ) * Enable prompts on the Hub * Update src/transformers/tools/prompts.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address review comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-05-24 16:09:13 -04:00
Zachary Mueller	75bbf20bce	Fix sagemaker DP/MP (#23681 ) * Check for use_sagemaker_dp * Add a check for is_sagemaker_mp when setting _n_gpu again. Should be last broken thing * Try explicit check? * Quality	2023-05-24 15:51:09 -04:00
Daniel King	89159651ba	Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725 ) * fix and test get_imports for multiline try blocks, and excepts with specific errors * fixup * add some more tests * add license	2023-05-24 15:40:19 -04:00
Sanchit Gandhi	d8222be57e	[Whisper] Reduce batch size in tests (#23736 )	2023-05-24 17:31:25 +01:00
Matt	814de8fac7	Overhaul TF serving signatures + dummy inputs (#23234 ) * Let's try autodetecting serving sigs * Don't clobber existing sigs * Change shapes for multiplechoice models * Make default dummy inputs smarter too * Fix missing f-string * Let's YOLO a serving output too * Read __class__.__name__ properly * Don't just pass naked lists in there and expect it to be okay * Code cleanup * Update default serving sig * Clearer error messages * Further updates to the default serving output * make fixup * Update the serving output a bit more * Cleanups and renames, raise errors appropriately when we can't infer inputs * More renames * we're building in a functional context again, yolo * import DUMMY_INPUTS from the right place * import DUMMY_INPUTS from the right place * Support cross-attention in the dummies * Support cross-attention in the dummies * Complete removal of dummy/serving overrides in BERT * Complete removal of dummy/serving overrides in RoBERTa * Obliterate lots and lots of serving sig and dummy overrides * merge type hint changes * Fix for token_type_ids with vocab_size 1 * Add missing property decorator * Fix T5 and hopefully some models that take conv inputs * More signature pruning * Fix T5's signature * Fix Wav2Vec2 signature * Fix LongformerForMultipleChoice input signature * Fix BLIP and LED * Better default serving output error handling * Fix BART dummies * Fix dummies for cross-attention, esp encoder-decoder models * Fix visionencoderdecoder signature * Fix BLIP serving output * Small tweak to BART dummies * Cleanup the ugly parameter inspection line that I used in a few places * committed a breakpoint again * Move the text_dims check * Remove blip_text serving_output * Add decoder_input_ids to the default input sig * Remove all the manual overrides for encoder-decoder model signatures * Tweak longformer/led input sigs * Tweak default serving output * output.keys() -> output * make fixup	2023-05-24 17:03:24 +01:00
Connor Henderson	3d7baef114	fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation (#23724 ) move text_prompt_ids trimming to top	2023-05-24 11:34:21 -04:00
Jungnerd	50a56bedb6	fix: delete duplicate sentences in `document_question_answering.mdx` (#23735 ) fix: delete duplicate sentence	2023-05-24 11:20:50 -04:00
Matt	d2d8822604	TF SAM memory reduction (#23732 ) * Extremely small change to TF SAM dummies to reduce memory usage on build * remove debug breakpoint * Debug print statement to track array sizes * More debug shape printing * More debug shape printing * Now remove the debug shape printing * make fixup * make fixup	2023-05-24 15:59:02 +01:00
pagarsky	28aa438cd2	Minor awesome-transformers.md fixes (#23453 ) Minor docs fixes	2023-05-24 08:57:52 -04:00
Matt	f8b2574416	Better TF docstring types (#23477 ) * Rework TF type hints to use \| None instead of Optional[] for tf.Tensor * Rework TF type hints to use \| None instead of Optional[] for tf.Tensor * Don't forget the imports * Add the imports to tests too * make fixup * Refactor tests that depended on get_type_hints * Better test refactor * Fix an old hidden bug in the test_keras_fit input creation code * Fix for the Deit tests	2023-05-24 13:52:52 +01:00
Wang, Yi	767e6b5314	fix gptj could not jit.trace in GPU (#23317 ) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-05-24 08:48:31 -04:00
uchuhimo	b4698b7ef2	fix: use bool instead of uint8/byte in Deberta/DebertaV2/SEW-D to make it compatible with TensorRT (#23683 ) * Use bool instead of uint8/byte in DebertaV2 to make it compatible with TensorRT TensorRT cannot accept onnx graph with uint8/byte intermediate tensors. This PR uses bool tensors instead of unit8/byte tensors to make the exported onnx file can work with TensorRT. * fix: use bool instead of uint8/byte in Deberta and SEW-D --------- Co-authored-by: Yuxian Qiu <yuxianq@nvidia.com>	2023-05-24 08:47:43 -04:00
Maria Khalusova	2eaaf17a0b	Export to ONNX doc refocused on using optimum, added tflite (#23434 ) * doc refocused on using optimum, tflite * minor updates to fix checks * Apply suggestions from code review Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> * TFLite to separate page, added links * Removed the onnx list builder * make style * Update docs/source/en/serialization.mdx Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> --------- Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>	2023-05-24 08:13:23 -04:00
Tim Dettmers	796162c512	Paged Optimizer + Lion Optimizer for Trainer (#23217 ) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-24 12:53:28 +02:00
Tim Dettmers	9d73b92269	4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479 ) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added fix for fp32 layer norms and bf16 compute in LLaMA. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Fixing issues for PR #23479. * Added fix for fp32 layer norms and bf16 compute in LLaMA. * Reverted variable name change. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Added missing tests. * Fixup changes. * Added fixup changes. * Missed some variables to rename. * revert trainer tests * revert test trainer * another revert * fix tests and safety checkers * protect import * simplify a bit * Update src/transformers/trainer.py * few fixes * add warning * replace with `load_in_kbit = load_in_4bit or load_in_8bit` * fix test * fix tests * this time fix tests * safety checker * add docs * revert torch_dtype * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * multiple fixes * update docs * version checks and multiple fixes * replace `is_loaded_in_kbit` * replace `load_in_kbit` * change methods names * better checks * oops * oops * address final comments --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-24 12:52:45 +02:00
Wang, Yi	33687a3f61	add GPTJ/bloom/llama/opt into model list and enhance the jit support (#23291 ) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-05-24 10:57:56 +01:00
zspo	003a0cf8cc	Fix some docs what layerdrop does (#23691 ) * Fix some docs what layerdrop does * Update src/transformers/models/data2vec/configuration_data2vec_audio.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix more docs --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-23 14:50:40 -04:00
小桐桐	357f281ba2	fix: load_best_model_at_end error when load_in_8bit is True (#23443 ) Ref: https://github.com/huggingface/peft/issues/394 Loading a quantized checkpoint into non-quantized Linear8bitLt is not supported. call module.cuda() before module.load_state_dict()	2023-05-23 14:50:27 -04:00
Yih-Dar	de5f86e59d	Skip `TFCvtModelTest::test_keras_fit_mixed_precision` for now (#23699 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-23 20:47:47 +02:00
LWprogramming	3d57404464	is_batched fix for remaining 2-D numpy arrays (#23309 ) * Fix is_batched code to allow 2-D numpy arrays for audio * Tests * Fix typo * Incorporate comments from PR #23223	2023-05-23 14:37:35 -04:00
Younes Belkada	6b7d6f848b	[`Blip`] Fix blip doctest (#23698 ) fix blip doctest	2023-05-23 18:25:44 +02:00
Matt	876d9a32c6	TF version compatibility fixes (#23663 ) * New TF version compatibility fixes * Remove dummy print statement, move expand_1d * Make a proper framework inference function * Make a proper framework inference function * ValueError -> TypeError	2023-05-23 16:42:11 +01:00
Younes Belkada	42baa58f90	[`SAM`] Fixes pipeline and adds a dummy pipeline test (#23684 ) * add a dummy pipeline test * change test name	2023-05-23 17:36:49 +02:00
Yih-Dar	71a5ed3433	Fix a `BridgeTower` test (#23694 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-23 17:32:57 +02:00
Nayeon Han	1fe1e3caa4	🌐 [i18n-KO] Translated `tasks/monocular_depth_estimation.mdx` to Korean (#23621 ) docs: ko: `tasks/monocular_depth_estimation` Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-05-23 15:54:39 +02:00
Nicolas Patry	9e8d7066e6	Making `safetensors` a core dependency. (#23254 ) * Making `safetensors` a core dependency. To be merged later, I'm creating the PR so we can try it out. * Update setup.py * Remove duplicates. * Even more redundant.	2023-05-23 15:16:34 +02:00
Yih-Dar	abf691aac0	Fix PyTorch SAM tests (#23682 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-23 14:48:38 +02:00
Alex	b687af0b36	Fix typo in a parameter name for open llama model (#23637 ) * Update modeling_open_llama.py Fix typo in `use_memorry_efficient_attention` parameter name * Update configuration_open_llama.py Fix typo in `use_memorry_efficient_attention` parameter name * Update configuration_open_llama.py Take care of backwards compatibility ensuring that the previous parameter name is taken into account if used * Update configuration_open_llama.py format to adjust the line length * Update configuration_open_llama.py proper code formatting using `make fixup` * Update configuration_open_llama.py pop the argument not to let it be set later down the line	2023-05-23 12:57:58 +01:00
NielsRogge	527ab894e5	Add PerSAM [bis] (#23659 ) * Add PerSAM args * Make attn_sim optional * Rename to attention_similarity * Add docstrigns * Improve docstrings	2023-05-23 11:43:12 +02:00
dependabot[bot]	aa30cd4f3f	Bump requests from 2.22.0 to 2.31.0 in /examples/research_projects/lxmert (#23668 ) Bump requests in /examples/research_projects/lxmert Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.31.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](https://github.com/psf/requests/compare/v2.22.0...v2.31.0) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-23 05:31:53 -04:00
dependabot[bot]	9bf72ae564	Bump requests from 2.22.0 to 2.31.0 in /examples/research_projects/visual_bert (#23670 ) Bump requests in /examples/research_projects/visual_bert Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.31.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](https://github.com/psf/requests/compare/v2.22.0...v2.31.0) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-23 05:31:30 -04:00
dependabot[bot]	ecc05f8c1e	Bump requests from 2.27.1 to 2.31.0 in /examples/research_projects/decision_transformer (#23673 ) Bump requests in /examples/research_projects/decision_transformer Bumps [requests](https://github.com/psf/requests) from 2.27.1 to 2.31.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](https://github.com/psf/requests/compare/v2.27.1...v2.31.0) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-23 05:28:09 -04:00
Nicolas Patry	e30ceae07b	small fix to remove unused eos in processor when it's not used. (#23408 )	2023-05-23 09:27:36 +02:00
NielsRogge	2f424d7979	[image-to-text pipeline] Add conditional text support + GIT (#23362 ) * First draft * Remove print statements * Add conditional generation * Add more tests * Remove scripts * Remove BLIP specific linkes * Add support for pix2struct * Add fast test * Address comment * Fix style	2023-05-22 21:45:50 +02:00
Yih-Dar	e69feab8a1	Update workflow files (#23658 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-22 21:26:51 +02:00
Zachary Mueller	b191d7db44	Update all no_trainer with skip_first_batches (#23664 )	2023-05-22 14:49:31 -04:00
Matt	26a06814a1	Fix SAM tests and use smaller checkpoints (#23656 ) * Fix SAM tests and use smaller checkpoints * Override test_model_from_pretrained to use sam-vit-base as well * make fixup	2023-05-22 19:42:35 +02:00
sshahrokhi	6f72e71f97	changing the requirements to a cpu torch version that works (#23483 )	2023-05-22 12:58:55 -04:00
LWprogramming	5de2a6d5e5	Fix wav2vec2 is_batched check to include 2-D numpy arrays (#23223 ) * Fix wav2vec2 is_batched check to include 2-D numpy arrays * address comment * Add tests * oops * oops * Switch to np array Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Switch to np array * condition merge * Specify mono channel only in comment * oops, add other comment too * make style * Switch list check from falsiness to empty --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-05-22 12:57:45 -04:00
Tim Dettmers	4ddd9de9d3	Bugfix: LLaMA layer norm incorrectly changes input type and consumers lots of memory (#23535 ) * Fixed bug where LLaMA layer norm would change input type. * make fix-copies --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-22 18:20:38 +02:00
Zachary Mueller	fe34486f12	Muellerzr fix deepspeed (#23657 ) * Fix deepspeed recursion * Better fix	2023-05-22 11:22:54 -04:00
Younes Belkada	7bbdfd7b24	Fix accelerate logger bug (#23650 ) * fix logger bug * Update tests/mixed_int8/test_mixed_int8.py Co-authored-by: Zachary Mueller <muellerzr@gmail.com> * import `PartialState` --------- Co-authored-by: Zachary Mueller <muellerzr@gmail.com>	2023-05-22 15:39:47 +02:00
zspo	29294b0e68	Fix tensor device while attention_mask is not None (#23538 ) * Fix tensor device while attention_mask is not None * Fix tensor device while attention_mask is not None	2023-05-22 09:30:46 -04:00
Joshua Lochner	12ec7f0c20	Remove erroneous `img` closing tag (#23646 ) See https://github.com/huggingface/transformers/pull/23625	2023-05-22 09:28:26 -04:00
Tyler	6397b7f008	Debug example code for MegaForCausalLM (#23382 ) * Debug example code for MegaForCausalLM set ignore_mismatched_sizes=True in model loading code * Fix up	2023-05-22 10:53:14 +01:00
Yih-Dar	3658488ff7	Fix `tests/repo_utils/test_get_test_info.py` (#23485 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-20 06:53:10 +02:00


12979 Commits