transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Lysandre Debut	39114c0383	Remove static pretrained maps from the library's internals (#29112 ) * [test_all] Remove static pretrained maps from the library's internals * Deprecate archive maps instead of removing them * Revert init changes * [test_all] Deprecate instead of removing * [test_all] PVT v2 support * [test_all] Tests should all pass * [test_all] Style * Address review comments * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * [test_all] trigger tests * [test_all] LLAVA * [test_all] Bad rebase --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-25 10:33:38 +01:00
gamepad_coder	76a33a1092	model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702 ) * model_summary.md - Add link to Harvard's Annotated Transformer. * model_summary.md - slight wording change + capitalize name of the paper * model_summary.md - moves the Annotated Transformer link in a praenthesis next to the link to the original paper (great idea, stevhliu!) * model_summary.md - moves the Annotated Transformer link in a praenthesis next to the link to the original paper (commit pt. 2, accidentally removed "has" in pt. 1)	2024-03-23 18:29:39 -07:00
Billy Cao	dafe370255	[DOCS] Fix typo for llava next docs (#29829 ) Fix typo for llava next docs	2024-03-23 11:32:31 -07:00
amyeroberts	c5f0288bc7	[`SuperPoint`] Fix doc example (#29816 ) [SuperPoint] Fix doc example	2024-03-22 16:04:30 +00:00
Lysandre Debut	7e1413d16a	Complete security policy with mentions of remote code (#29707 ) * Security policy * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: Michelle Habonneau <83347449+Michellehbn@users.noreply.github.com> * Update SECURITY.md Co-authored-by: Diogo Teles Sant'Anna <diogoteles@google.com> --------- Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: Michelle Habonneau <83347449+Michellehbn@users.noreply.github.com> Co-authored-by: Diogo Teles Sant'Anna <diogoteles@google.com>	2024-03-22 14:13:18 +01:00
Arthur	2e7cb46f85	[`cleanup`] vestiges of causal mask (#29806 ) nit	2024-03-22 12:25:40 +00:00
igeni	884b2215c3	replaced concatenation to f-strings to improve readability and unify … (#29785 ) replaced concatenation to f-strings to improve readability and unify with the rest code	2024-03-22 12:23:16 +00:00
Joao Gante	34e07f4ba8	Generate: remove unused attributes in `AssistedCandidateGenerator` (#29787 ) remove unused attrs	2024-03-22 12:20:32 +00:00
jiqing-feng	e85654f5ec	rm input dtype change in CPU (#28631 ) * rm input dtype change in CPU * add warning when use CPU low-precision * rm useless logging	2024-03-22 12:02:43 +00:00
fxmarty	13b23704a8	Correct llava mask & fix missing setter for `vocab_size` (#29389 ) * correct llava mask * fix vipllava as wlel * mask out embedding for padding tokens * add test * fix style * add setter * fix test on suggestion	2024-03-22 19:57:08 +08:00
Ilyas Moutawwakil	aa17cf986f	Enable AMD docker build CI (#29803 ) * enable amd ci * remove unnecessary clean up	2024-03-22 11:56:47 +01:00
Steven Madere	347916130c	Fix type hint for train_dataset param of Trainer.__init__() to allow IterableDataset. Issue 29678 (#29738 ) * Fixed typehint for train_dataset param in Trainer.__init__(). Added IterableDataset option. * make fixup	2024-03-22 10:46:14 +00:00
Arthur	e68ff30419	[`quality`] update quality check to make sure we check imports 😈 (#29771 ) * update quality check * make it nice * update * let's make sure it runs and we have the logs actually * update workflow * nits	2024-03-22 10:11:59 +01:00
Raushan Turganbay	fadb053379	Change in-place operations to out-of-place in LogitsProcessors (#29680 ) * change in-place -> out-of-place * add tests * add more tests * naming consistency * fix doctest * forgot min-length processors * empty * Revert "fix doctest" This reverts commit `4772768457`. * revert change in docstring * Update tests/generation/test_logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/generation/test_logits_process.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-21 16:37:33 +00:00
Raushan Turganbay	b469ebc5cf	Prepend `bos token` to Blip generations (#29642 ) * prepend "bos" to blip generation * minor changes * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/instructblip/modeling_instructblip.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add generation tester mixin --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-21 16:33:18 +00:00
Joao Gante	ee38fc31fb	Llama: always convert the causal mask in the SDPA code path (#29663 ) * always convert the mask * rebase and fix copies	2024-03-21 16:30:18 +00:00
Joao Gante	5ffef2a978	Generate: remove legacy generation mixin imports (#29782 )	2024-03-21 16:28:25 +00:00
Jacky Lee	ef6e371dba	Add support for `torch_dtype` in the run_mlm example (#29776 ) feat: add support for torch_dtype Co-authored-by: Jacky Lee <jackylee328@gmail.com>	2024-03-21 15:09:35 +00:00
Zach Mueller	10d232e88e	Add deterministic config to `set_seed` (#29778 ) * Add deterministic config * Add note on slowdown * English fails me again	2024-03-21 11:07:39 -04:00
Zach Mueller	f0bfb150fe	Silence deprecations and use the DataLoaderConfig (#29779 ) * Remove deprecations * Clean	2024-03-21 10:26:51 -04:00
Matt	de627f5a14	Cast bfloat16 to float32 for Numpy conversions (#29755 ) * Cast bfloat16 to float32 for Numpy conversions * Add test	2024-03-21 14:04:11 +00:00
Arthur	73a73b415e	[`LlavaNext`] Fix llava next unsafe imports (#29773 ) * path llava-next * styling * styling	2024-03-21 13:47:58 +01:00
Yih-Dar	2ddceef9a2	Fix docker image build for `Latest PyTorch + TensorFlow [dev]` (#29764 ) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-03-21 13:14:29 +01:00
théo gigant	fd734be1b6	fix issue with logit processor during beam search in Flax (#29636 ) fix issue with logit processor in beam search in Flax	2024-03-21 11:27:03 +00:00
Matthias Dittrich	691c3d7325	Allow `-OO` mode for `docstring_decorator` (#29689 ) Fixes ``` File "/nix/store/rv8xdwghdad9jv2w86b8g08kan9l6ksm-python3.11-transformers-4.38.2/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 987, in <module> class AutoConfig: File "/nix/store/rv8xdwghdad9jv2w86b8g08kan9l6ksm-python3.11-transformers-4.38.2/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1011, in AutoConfig @replace_list_option_in_docstrings() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/nix/store/rv8xdwghdad9jv2w86b8g08kan9l6ksm-python3.11-transformers-4.38.2/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 966, in docstring_decorator lines = docstrings.split("\n") ^^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'split' ```	2024-03-21 11:18:17 +00:00
Rahul Vinod Vishwakarma	9556054fb2	OWL-ViT box_predictor inefficiency issue (#29712 ) * Calculating box_bias at the start once, then reusing it at inference * Updating the compute_box_bias function for backwards compatibility * Caching compute_box_bias function * Bux fix * Update owlv2 accordingly to ensure repo consistency * Co-authored by: nvbinh15 <binh.pdc01@gmail.com> * Fixup changes * Made copied code consistent * Co-authored by: nvbinh15 <binh.pdc01@gmail.com> --------- Co-authored-by: Nguyen Van Binh <> Co-authored-by: Nguyen Van Binh <binh.pdc01@gmail.com>	2024-03-21 11:17:45 +00:00
Ash Kuroki	0639034a26	Fixed typo in quantization_config.py (#29766 ) Update quantization_config.py Fixed typo for clarity and correctness. previous: input time current: input type // changed time to type to fix the typo	2024-03-21 11:02:53 +00:00
Michael	5d1a58a646	[docs] Remove redundant `-` and `the` from custom_tools.md (#29767 ) [docs] Remove redundant and from custom_tools.md	2024-03-21 10:56:40 +00:00
Arthur	ff841900e4	[`BC 4.37 -> 4.38`] for Llama family, memory and speed (#29753 ) * attempt to fix * the actual fix that works with compilation! * this? * temporary update * nit? * dispatcg to memory efficient? * update both models that have static cache support * fix copies fix compile * make sure fix * fix cohere and gemma * fix beams? * nit * slipped through the cracks * nit * nits * update * fix-copies * skip failing tests * nits	2024-03-20 23:47:01 +01:00
Benjamin Ye	8dd4ce6f2c	[`BitsAndBytesConfig`] Warning for unused `kwargs` & safety checkers for `load_in_4bit` and `load_in_8bit` (#29761 ) * added safety checkers for load_in_4bit and load_in_8bit on init, as well as their setters * Update src/transformers/utils/quantization_config.py typo correction for load_in_8bit setter checks Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-03-20 18:37:28 +00:00
Yih-Dar	17e4467f0e	Fix docker image build (#29762 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-03-20 19:17:26 +01:00
Zach Mueller	c78f57729f	Update test reqs to include sentencepiece (#29756 ) * Update test reqs * Clean	2024-03-20 15:53:42 +00:00
NielsRogge	d91fd7f92c	Add LLaVa-1.6, bis (#29586 ) * First draft * Fix tests, add docs * Improve docstrings * Fix test * Address comments * Address comments * Remove vocab_size attribute * Remove batch_size * Address comment * Add image processor tests * Support fx * Update docstring * Add support for 34b * Convert 34b model * Add integration tests * Update checkpoints * Convert vicuna-13b, remove doc tests * Remove script * Remove file * Address comments * Improve docstrings * Deprecate vocab_size * Remove aspect_ratio_setting * Address comments * Update READMEs * Add tips about chat templates * Fix tests * Deprecate vocab_size safely * Update tests --------- Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 15:51:12 +00:00
Matt	9d999481b2	Add correct batched handling for apply_chat_template (#29222 ) * Add correct batched handling for apply_chat_template * Fix warning method * Add error for incompatible options * expand tests * Add a skip for markuplm * Add skips for other layout models * Skip for LayoutLMv2 * Slightly update the warning message * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * typo fix * Update docstring for conversation kwarg * Update return docstring * Remove the warning, improve error message * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_tokenization_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_tokenization_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove return_dict=None * Fix up some merge cruft * More merge cruft * Add another skip * Add another skip --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 15:50:22 +00:00
amyeroberts	3c17c529cc	SuperPointModel -> SuperPointForKeypointDetection (#29757 )	2024-03-20 15:41:03 +00:00
Arthur Zucker	1248f09252	v4.40.0.dev.0	2024-03-20 23:31:47 +09:00
Matt	11ef35e828	Support sharded safetensors in TF (#29350 ) * Initial commit (still lots of unfinished bits) * (Still untested) add safetensors sharding to save_pretrained * Fix savetensors saving, update default shard size to match PT * Add proper loading of TF-format safetensors * Revert default size in case that changes things * Fix incorrect index name * Update loading priority * Update tests * Make the tests a little more stringent * Expand tests * Add sharded cross-test * Fix argument name * One more test fix * Adding mlx to the list of allowed formats * Remove irrelevant block for safetensors * Refactor warning logging into a separate function * Remove unused skip_logger_warnings arg * Update src/transformers/modeling_tf_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Move function def --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 14:22:35 +00:00
Ricardo	870bbb4c6b	fix jinja2 package version check (#29754 )	2024-03-20 13:51:16 +00:00
Kola	76b3b20fb2	Update Mamba types and pass through use_cache attr to MambaModel (#29605 ) * Update docstring for RMSNorm * Update cache_params object to correct MambaCache type * Update docstrings and type info * Pass through use_cache * ruff * Reformat with 119 char limit per line (thanks Arthur) * Pass through use_cache specifically to the backbone rather than all keyword arguments * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/mamba/modeling_mamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tab * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-20 13:53:22 +01:00
NielsRogge	776c9d3af8	[Tests] Remove unused code (#29737 ) Remove unused code	2024-03-20 13:26:00 +01:00
peterjc123	a1a7454107	fix galore layerwise with frozen params (#29743 )	2024-03-20 11:06:52 +01:00
Peng Wei	8692aa88e2	fixed the issue of DPO trainer that using one node and mutiple GPUs and set the device_map='auto' (#29695 ) * fixed the issue of DPO trainer that using one node and mutiple GPUs * before update, add the assert * run the ruff formatter * Update src/transformers/trainer.py Thank you. Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * remember to do make style and make quality before commit * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-20 10:05:28 +00:00
Yih-Dar	243d0de997	Larger runner on CircleCI (#29750 ) larger runner Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-03-20 10:02:11 +01:00
Joao Gante	1a5c500f12	Tests: Musicgen tests + `make fix-copies` (#29734 ) * make fix-copies * some tests fixed * tests fixed	2024-03-20 08:45:53 +01:00
Yih-Dar	66ce9593fd	Fix `check_copies` not capturing the diff in model/paper title and link (#29724 ) * fix * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-03-19 18:52:36 +01:00
Joao Gante	4294f0c358	Llama: partial 4d masks (#29731 ) * partial 4d masks * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-03-19 17:32:01 +00:00
Raushan Turganbay	425ba56cdf	Clean-up generation tests after moving methods to private (#29582 ) * clean-up tests * refine comments * fix musicgen tests * make style * remove slow decorator from a test * more clean-up * fix other failing tests	2024-03-19 17:03:31 +00:00
StevenBucaille	56baa03380	Implementation of SuperPoint and AutoModelForKeypointDetection (#28966 ) * Added SuperPoint docs * Added tests * Removed commented part * Commit to create and fix add_superpoint branch with a new branch * Fixed dummy_pt_objects * Committed missing files * Fixed README.md * Apply suggestions from code review Fixed small changes Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py * Removed AutoModelForKeypointDetection and related stuff * Fixed inconsistencies in image_processing_superpoint.py * Moved infer_on_model logic simply in test_inference * Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py * Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale * Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixed from (w, h) to (h, w) as input for tests * Removed unnecessary condition * Moved last_hidden_state to be the first returned * Moved last_hidden_state to be the first returned (bis) * Moved last_hidden_state to be the first returned (ter) * Switched image_width and image_height in tests to match recent changes * Added config as first SuperPointConvBlock init argument * Reordered README's after merge * Added missing first config argument to SuperPointConvBlock instantiations * Removed formatting error * Added SuperPoint to README's de, pt-br, ru, te and vi * Checked out README_fr.md * Fixed README_fr.md * Test fix README_fr.md * Test fix README_fr.md * Last make fix-copies ! * Updated checkpoint path * Removed unused SuperPoint doc * Added missing image * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Removed unnecessary import * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added SuperPoint to _toctree.yml --------- Co-authored-by: steven <steven.bucaillle@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com>	2024-03-19 14:43:02 +00:00
Arthur	2f9a3edbb9	[`GemmaConverter`] use user_defined_symbols (#29473 ) * use user_defined_symbols * fixup * nit * add a very robust test * make sure all models are tested with the `pretrained_tokenizer_to_test` * should we make sure we test all of them? * merge * remove the id * fix test * update * ousies * oups * fixup * fix copies check * remove `pretrained_tokenizer_to_test`	2024-03-19 15:13:56 +01:00
Arthur	8e2fc52ea3	[`Gemma`] final fixes to the modeling (#29729 ) * gelu_pytorch_tanh * Force config.hidden_act to be approx gelu * Gemma bug fixes * force_use_exact_gelu * Update configuration_gemma.py * Update modeling_gemma.py * update * update for simpler handling * nit * nit * fixpup * update * also update the jax modeling! * add `"gelu_pytorch_tanh": partial(nn.gelu, approximate=True),` * fixup * fix order * act vs act_fn --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>	2024-03-19 14:47:42 +01:00

19383 Commits