transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Yih-Dar	eab6c491d4	Use torch 2.5 in scheduled CI (#34465 ) * torch 2.5 * try --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-30 14:54:10 +01:00
Pablo Montalvo	241d79026f	fix pixtral processor (#34486 ) * fix pixtral processor * test out full length batches + remove undue ValueError * fix up processing * fix tests * fix * last fixup * style * [run-slow] pixtral * [run-slow] pixtral * fix config key * skip torchscript tests * [run-slow] pixtral * add missing key * [run-slow] pixtral * fix docs * [run-slow] pixtral * fix wrong url for integration test * [run-slow] pixtral * pixtralVisionModel does not have a lm head * [run-slow] pixtral	2024-10-30 14:17:20 +01:00
Joao Gante	8a734ea2c3	Tests: move `generate` tests to the right mixin and delete redundant tests (#34464 ) * tmp commit * tmp commit * cull overwrites of deleted tests * typo * more specific docstring * make fixup * parameterize at the top? * correction * more deletions :D * tmp commit * for VLMs too * fix _check_outputs * test nit * make fixup * fix another flaky * test_generate_from_inputs_embeds -- handle missing attention mask	2024-10-30 10:59:08 +00:00
Raushan Turganbay	913330ca9f	VLMs: fix number of image tokens (#34332 ) * fix * fix tests * add tests * style * style * fix qwen after rebase * fix video llava	2024-10-30 10:21:37 +01:00
Raushan Turganbay	0f764a5af7	Mllama: update docs (#34334 ) * update docs * be more explicit * use available methods	2024-10-30 10:11:50 +01:00
Pethő Gergely	25a9fc584a	Fix format mistake in string repr of tokenizer objects (#34493 ) * fix repr string format for tokenizer objects The repr of tokenizer tokens looks confusing and just stupid, like this: `Tokenizer(...), added_tokens_decoder={1: ..., 2: ...}`. The dict that is the value of the added_tokens_decoder attribute is outside of the parentheses of the tokenizer object, whereas all other attributes are inside the parentheses like they should be. This commit fixes this bug. * cos: add newline before closing parenthesis of repr string	2024-10-30 10:03:41 +01:00
Guang Yang	cd277618d4	Roberta is ExecuTorch compatible (#34425 ) * Roberta is ExecuTorch compatible * [run_slow] roberta --------- Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-30 08:36:45 +00:00
Matt	9bee9ff5db	Un-deprecate timeout arg in pipelines (#34382 ) * Un-deprecate timeout * Put "timeout" on the allowed list * make fixup	2024-10-29 18:45:14 +00:00
Yoni Gozlan	e4449bb790	fix incorrect warning (#34416 )	2024-10-29 14:08:42 -04:00
Aleksey Lobanov	f55595b177	Fix performance in get_imports regexp (#34298 ) * fix: Fix performance in get_imports regexp * Minimize get_imports content regexp	2024-10-29 17:29:24 +00:00
dependabot[bot]	4e2e8809ff	Bump werkzeug from 3.0.3 to 3.0.6 in /examples/research_projects/decision_transformer (#34420 ) Bump werkzeug in /examples/research_projects/decision_transformer Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to 3.0.6. - [Release notes](https://github.com/pallets/werkzeug/releases) - [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/werkzeug/compare/3.0.3...3.0.6) --- updated-dependencies: - dependency-name: werkzeug dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-29 16:42:40 +00:00
Apoorv Khandelwal	e9ad460494	Adding `optimizer_cls_and_kwargs` to `Trainer.__init__` (#34358 ) * Adding `optimizer_cls_and_kwargs` to `Trainer.__init__` * formatting * make fix-copies docstring * added more docs for optimizer_cls_and_kwargs * add docs for Trainer(optimizer_cls_and_kwargs) * reverting anchor names	2024-10-29 16:23:16 +01:00
Guang Yang	f339042b0b	Albert is ExecuTorch compatible (#34476 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-29 16:22:13 +01:00
Guang Yang	34620e8f0a	MobileBERT is ExecuTorch compatible (#34473 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-29 16:14:31 +01:00
Abhijit Deo	56c45d5757	Bug fix for drop path decay rate in swin transformer (#34291 ) * potential bug fix for drop path * variable name change * forgot to rename the variables * back to original * modify dpr properly * check_copies auto fix * corresponding swin2 changes * auto fix * linting * default value for drop_path_rate as 0.0 * Update src/transformers/models/glm/modeling_glm.py * maskformer fix * ruff format * changes made to tf code as well * lint --------- Co-authored-by: abhijit deo <167164474+deo-abhijit@users.noreply.github.com>	2024-10-29 16:09:18 +01:00
Shijie	0ab0a42651	fix-qwen2vl-no-position_ids (#33487 )	2024-10-29 15:27:34 +01:00
Doohae Jung	8755dd26b7	manual `head_dim` for `mixtral` model (#34281 )	2024-10-29 14:31:36 +01:00
Guang Yang	5392f12e16	Bert is ExecuTorch compatible (#34424 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-29 14:30:02 +01:00
Marc Sun	004530aa05	Fix regression loading dtype (#34409 ) * fix regression * add test for torchao * expected output * better fix	2024-10-29 11:41:04 +01:00
hlky	9e3d704e23	Fixes for Modular Converter on Windows (#34266 ) * Separator in regex * Standardize separator for relative path in auto generated message * open() encoding * Replace `\` on `os.path.abspath` --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-29 11:40:41 +01:00
Martin Gubri	626c610a4d	Fix perplexity computation in perplexity.md (#34387 ) fix average NLL in perplexity.md	2024-10-29 11:10:10 +01:00
Yih-Dar	439334c8fb	Simplify running tests in a subprocess (#34213 ) * check * check * check * check * add docstring --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-29 10:48:57 +01:00
StevenBucaille	a1835195d1	🚨🚨🚨 [SuperPoint] Fix keypoint coordinate output and add post processing (#33200 ) * feat: Added int conversion and unwrapping * test: added tests for post_process_keypoint_detection of SuperPointImageProcessor * docs: changed docs to include post_process_keypoint_detection method and switched from opencv to matplotlib * test: changed test to not depend on SuperPointModel forward * test: added missing require_torch decorator * docs: changed pyplot parameters for the keypoints to be more visible in the example * tests: changed import torch location to make test_flax and test_tf * Revert "tests: changed import torch location to make test_flax and test_tf" This reverts commit `39b32a2f69`. * tests: fixed import * chore: applied suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * tests: fixed import * tests: fixed import (bis) * tests: fixed import (ter) * feat: added choice of type for target_size and changed tests accordingly * docs: updated code snippet to reflect the addition of target size type choice in post process method * tests: fixed imports (...) * tests: fixed imports (...) 
* style: formatting file * docs: fixed typo from image[0] to image.size[0] * docs: added output image and fixed some tests * Update docs/source/en/model_doc/superpoint.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix: included SuperPointKeypointDescriptionOutput in TYPE_CHECKING if statement and changed tests results to reflect changes to SuperPoint from absolute keypoints coordinates to relative * docs: changed SuperPoint's docs to print output instead of just accessing * style: applied make style * docs: added missing output type and precision in docstring of post_process_keypoint_detection * perf: deleted loop to perform keypoint conversion in one statement * fix: moved keypoint conversion at the end of model forward * docs: changed SuperPointInterestPointDecoder to SuperPointKeypointDecoder class name and added relative (x, y) coordinates information to its method * fix: changed type hint * refactor: removed unnecessary brackets * revert: SuperPointKeypointDecoder to SuperPointInterestPointDecoder * Update docs/source/en/model_doc/superpoint.md Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> --------- Co-authored-by: Steven Bucaille <steven.bucaille@buawei.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2024-10-29 09:36:03 +00:00
kang sheng	655bec2da7	use a tiny model to test generation config, which avoids timeout (#34482 ) * use a tiny model to test generation config, which avoids timeout * remove trailing whitespace	2024-10-29 09:39:06 +01:00
Raushan Turganbay	63ca6d9771	Fix CI (#34458 ) * fix * fix mistral	2024-10-29 08:26:04 +01:00
Raushan Turganbay	808d6c50f8	Generation: fix test (#34369 ) * fix test * fix copies	2024-10-29 07:57:10 +01:00
Raushan Turganbay	fe76b60370	LLaVA: latency issues (#34460 ) * fix llavas * code style * green ci	2024-10-29 07:54:51 +01:00
Alexandros Benetatos	a769ed45e1	Add `post_process_depth_estimation` for GLPN (#34413 ) * add depth postprocessing for GLPN * remove previous temp fix for glpn tests * Style changes for GLPN's `post_process_depth_estimation` Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * additional style fix --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-28 19:44:20 +01:00
Luc Georges	6cc4a67b3d	feat: run benchmarks on A100 (#34287 )	2024-10-28 19:33:17 +01:00
kang sheng	d21dbd1520	enable average tokens across devices (#34373 ) * enable average tokens across devices * reduce earlier in case model needs it * simplify if statement * reformat code to make ruff happy * add doc for argument: average_tokens_across_devices * cannot find world size when pytorch is unavailable * format code --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-28 18:59:38 +01:00
Ahmed Almaghz	a17f287ac0	[i18n-ar] Translated file : `docs/source/ar/fast_tokenizers.md` into Arabic (#33034 ) * Add docs/source/ar/fast_tokenizers.md to Add_docs_source_ar_fast_tokenizers.md * Update _toctree.yml * Update _toctree.yml * Update docs/source/ar/_toctree.yml Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/fast_tokenizers.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2024-10-28 10:54:37 -07:00
Shubham S Jagtap	084e946cfd	Apply linting to the important code blocks to make it readable (#34449 ) Enhance user experience using py-linting	2024-10-28 10:48:18 -07:00
wony617	1f7539c829	🌐 [i18n-KO] Translated `model_doc/barthez.md` to Korean (#33980 ) * docs: ko: model_doc/barthez.md * feat: nmt draft --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-10-28 10:46:49 -07:00
Vijay	fc1ae7f30f	[docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details (#34322 ) * [docs] update input documentation for MAMBA2 and MISTRAL models to include cache_position and attention_mask details * [docs] correct input documentation for MISTRAL model to reference `input_ids` instead of `decoder_input_ids` * [docs] clarify cache_position description in MISTRAL model documentation	2024-10-28 09:14:07 -07:00
Sean (Seok-Won) Yi	c1753436db	New option called `"best"` for `args.save_strategy`. (#31817 ) * Add _determine_best_metric and new saving logic. 1. Logic to determine the best metric was separated out from `_save_checkpoint`. 2. In `_maybe_log_save_evaluate`, whether or not a new best metric was achieved is determined after each evaluation, and if the save strategy is "best" then the TrainerControl is updated accordingly. * Added SaveStrategy. Same as IntervalStrategy, but with a new attribute called BEST. * IntervalStrategy -> SaveStrategy * IntervalStrategy -> SaveStrategy for save_strat. * Interval -> Save in docstring. * Updated docstring for save_strategy. * Added SaveStrategy and made according changes. `save_strategy` previously followed `IntervalStrategy` but now follows `SaveStrategy`. Changes were made accordingly to the code and the docstring. * Changes from `make fixup`. * Removed redundant metrics argument. * Added new test_save_best_checkpoint test. 1. Checks for both cases where `metric_for_best_model` is explicitly provided and when it's not provided. 2. The first case should have two checkpoints saved, whereas the second should have three saved. * Changed should_training_end saving logic. The Trainer saves a checkpoint at the end of training by default as long as `save_strategy != SaveStrategy.NO`. This condition was modified to include `SaveStrategy.BEST` because it would be counterintuitive that we'd only want the best checkpoint to be saved but the last one is as well. * `args.metric_for_best_model` default to loss. * Undo metric_for_best_model update. * Remove checking metric_for_best_model. * Added test cases for loss and no metric. * Added error for metric and changed default best_metric. * Removed unused import. * `new_best_metric` -> `is_new_best_metric` Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Applied `is_new_best_metric` to all. Changes were made for consistency and also to fix a potential bug. 
--------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-10-28 16:02:22 +01:00
AbdelKarim ELJANDOUBI	8b3b9b48fc	exclude fsdp from delay_optimizer_creation (#34140 ) * exclude fsdp from delay_optimizer_creation * add test case for trainer: FSDP mode and fp8 as mixed precision * rearrange imports * ruff formatted * adapt _init_fsdp to fp8 * use _init_fsdp only when resume_from_checkpoint * In case of FDP, self.layer will be CheckpointWrapper which has no len() method * delete _init_fsdp * solve conflict * fix conflict * make fixup	2024-10-28 13:50:16 +01:00
Nischay	92bcdff2ef	Fix batch size handling in prediction_loop for DataLoaderShard (#34343 ) * Fix batch size handling in prediction_loop for DataLoaderShard Updated the prediction_loop method in the Trainer class to correctly handle batch size when using DataLoaderShard. This ensures that the batch size is retrieved from total_batch_size for distributed training scenarios, preventing TypeError related to NoneType during evaluation. * Update src/transformers/trainer.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Applied the fix to remove unused imports --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-10-28 13:23:52 +01:00
Yih-Dar	9360f1827d	Tiny update after #34383 (#34404 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-28 12:01:05 +01:00
Yih-Dar	fc465bb196	pin `tensorflow_probability<0.22` in docker files (#34381 ) 0.21 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-28 11:59:46 +01:00
Ilyas Moutawwakil	fddbd3c13c	Fix pix2struct (#34374 ) * fix * fix and test use_cache test * style * remove atol	2024-10-28 11:24:56 +01:00
Steven Liu	1d06379331	[docs] Cache implementations (#34325 ) cache	2024-10-25 08:52:45 -07:00
Rudy Delouya	6a62a6d1b5	Fix typos in agents_advanced.md (#34405 )	2024-10-25 08:52:29 -07:00
Yih-Dar	f73f5e62e2	Avoid check expected exception when it is on CUDA (#34408 ) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-25 17:14:07 +02:00
Matthew Douglas	e447185b1f	Fix bnb training test failure (#34414 ) * Fix bnb training test: compatibility with OPTSdpaAttention	2024-10-25 10:23:20 -04:00
Joao Gante	186b8dc190	Tests: upgrade `test_eager_matches_sdpa_generate` (#34386 )	2024-10-25 11:55:07 +01:00
Joao Gante	8814043c8c	SynthID: better example (#34372 ) * better example * Update src/transformers/generation/configuration_utils.py * Update src/transformers/generation/logits_process.py * nits	2024-10-25 11:46:46 +01:00
Yih-Dar	223855314f	no filter (#34391 ) * no filter * no filter * no filter --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-25 12:32:39 +02:00
Raushan Turganbay	9f365fe0ac	Fix right padding in LLaVA models (#34305 ) * fix right pad llavas * device mismatch	2024-10-25 11:02:07 +02:00
Ilyas Moutawwakil	5779bac4c4	Fix onnx non-exportable inplace aten op (#34376 ) * fix onnx non-exportable inplace op * mistral, qwen2, qwen2_vl, starcoder2 * fixup copies	2024-10-25 09:44:09 +02:00
Yoni Gozlan	940a6bd343	Use non nested images and batched text Idefics2/3 (#34222 ) * add support for non nested images and add tests * add tests error scenario * fix style * added single and no image to error tests	2024-10-24 20:00:13 -04:00
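
The repr problem described in commit 25a9fc584a (#34493) can be illustrated with a small sketch. `ToyTokenizer`, its fields, and the two method names below are hypothetical stand-ins made up for this illustration; this is not the transformers tokenizer code, only a guess at the shape of the bug as the commit message describes it: the `added_tokens_decoder` dict was printed after the closing parenthesis instead of inside it.

```python
class ToyTokenizer:
    """Hypothetical stand-in used only to illustrate the repr formatting issue."""

    def __init__(self, name, vocab_size, added_tokens_decoder):
        self.name = name
        self.vocab_size = vocab_size
        self.added_tokens_decoder = added_tokens_decoder

    def buggy_repr(self):
        # The parenthesis is closed too early, so the dict ends up outside it.
        return (
            f"{type(self).__name__}(name='{self.name}', vocab_size={self.vocab_size}), "
            f"added_tokens_decoder={self.added_tokens_decoder}"
        )

    def fixed_repr(self):
        # All attributes stay inside the parentheses, with a newline before
        # the closing parenthesis as the second bullet of the commit mentions.
        return (
            f"{type(self).__name__}(name='{self.name}', vocab_size={self.vocab_size}, "
            f"added_tokens_decoder={self.added_tokens_decoder}\n)"
        )


tok = ToyTokenizer("toy", 3, {1: "<pad>", 2: "<eos>"})
print(tok.buggy_repr())
print(tok.fixed_repr())
```

In the buggy form the string no longer reads as a single constructor-like expression, which is what made the output confusing.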

17297 Commits