transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-19 12:38:23 +06:00

Author	SHA1	Message	Date
Joao Gante	7d312ad2e9	Llama: fix batched generation (#29109 )	2024-02-20 10:23:17 +00:00
Younes Belkada	ff76e7c212	FIX [`bnb` / `tests`] Propagate the changes from #29092 to 4-bit tests (#29122 ) * forgot to push the changes for 4bit .. * trigger CI	2024-02-20 11:11:15 +01:00
Pablo Montalvo	1c9134f004	Abstract image processor arg checks. (#28843 ) * abstract image processor arg checks. * fix signatures and quality * add validate_ method to rescale-prone processors * add more validations * quality * quality * fix formatting Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix formatting Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix formatting Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix formatting mishap Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix crop_size compatibility * fix default mutable arg * fix segmentation map + image arg validity * remove segmentation check from arg validation * fix quality * fix missing segmap * protect PILImageResampling type * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add back segmentation maps check --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-20 11:05:46 +01:00
Younes Belkada	f7ef7cec6c	FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082 ) * add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-20 02:43:02 +01:00
Erich Schubert	a7ff2f23a0	Move misplaced line (#29117 ) Move misplaced line, improve code comment	2024-02-20 02:24:48 +01:00
Arthur	9094abe8dc	[`gradient_checkpointing`] default to use it for torch 2.3 (#28538 ) * default to use it * style	2024-02-20 02:23:25 +01:00
Nilesh	49c0b293d2	Fixed nll with label_smoothing to just nll (#28708 ) * Fixed nll with label_smoothing to nll * Resolved conflict by rebase * Fixed nll with label_smoothing to nll * Resolved conflict by rebase * Added label_smoothing to config file * Fixed nits	2024-02-20 01:52:15 +01:00
Shijie Wu	4f09d0fd88	storing & logging gradient norm in trainer (#27326 ) * report grad_norm during training * support getting grad_norm from deepspeed	2024-02-19 19:07:41 +00:00
Sadra Barikbin	a4851d9477	Fix two tiny typos in `pipelines/base.py::Pipeline::_sanitize_parameters()`'s docstring (#29102 ) * Update base.py * Fix a typo	2024-02-19 18:50:28 +00:00
Titus	5ce90f3212	Bnb test fix for different hardwares (#29066 ) * generated text on A10G * generated text in CI * Apply suggestions from code review add explanatory comments Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-19 18:04:44 +00:00
Max Baak	08cd694ef0	ENH: added new output_logits option to generate function (#28667 ) output_logits option behaves like output_scores, but returns the raw, unprocessed prediction logit scores, ie. the values before they undergo logit processing and/or warping. The latter happens by default for the regular output scores. It's useful to have the unprocessed logit scores in certain circumstances. For example, unprocessed logit scores are very useful with causallm models when one wants to determine the probability of a certain answer, e.g. when asking a question with a yes/no answer. In that case getting the next-token probabilities of both "yes" and "no" (and/or their relative ratio) is of interest for classification. The reason for getting these _before_ logit processing and/or warping is b/c a) that can change the probabilities or b) reject the tokens of interest / reduce the number of tokens to just 1. For an example use-case see paper TabLLM: Few-shot Classification of Tabular Data with Large Language Models by Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag. https://arxiv.org/abs/2210.10723 In addition: - added dedicated unit test: tests/generation/test_utils/test_return_unprocessed_logit_scores which tests return of logics with output_logits=True in generation. - set output_logits=True in all other generation unit tests, that also have output_scores=True. Implemented @gante's and @amyeroberts review feedback Co-authored-by: kx79wq <max.baak@ing.com>	2024-02-19 17:34:17 +00:00
NielsRogge	07e3454f03	[Docs] Add resources (#28705 ) * Add resource * Add more resources * Add resources * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove mention * Remove pipeline tags --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-19 15:22:29 +01:00
Arthur	b2724d7b4c	change version (#29097 ) * change version * nuke * this doesn't make sense * update some requirements.py * revert + no main * nits * change cache number * more pin * revert --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-19 22:08:44 +08:00
Jay Zhou	79132d4cfe	Fix a typo in `examples/pytorch/text-classification/run_classification.py` (#29072 )	2024-02-19 13:01:15 +00:00
Lysandre Debut	9830858671	Fix the `bert-base-cased` tokenizer configuration test (#29105 ) Fix test	2024-02-19 13:23:25 +01:00
Winton Davies	593230f0a1	fix the post-processing link (#29091 ) The link in evaluation was missing a hyphen between post and processing. I fixed this, for English only. Someone with the ability to do a global search/replace should fix the other languages (if indeed they have this issue)/	2024-02-19 10:15:58 +00:00
Younes Belkada	a75a6c9315	FIX [`bnb` / `tests`]: Fix currently failing bnb tests (#29092 ) Update test_mixed_int8.py	2024-02-19 10:39:12 +01:00
Younes Belkada	864c8e6ea3	[`Awq`] Add peft support for AWQ (#28987 ) * add peft support for AWQ * Update src/transformers/quantizers/quantizer_awq.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-19 01:31:39 +01:00
Aaron Jimenez	ce4fff0be7	[Docs] Spanish translation of task_summary.md (#28844 ) * Add task_summary to es/_toctree.yml * Add task_summary.md to docs/es * Change title of task_summary.md * Translate firsts paragraphs * Translate middle paragraphs * Translte the rest of the doc * Edit firts paragraph	2024-02-16 15:50:06 -08:00
Matt	2f1003be86	Add chat support to text generation pipeline (#28945 ) * Add chat support to text generation pipeline * Better handling of single elements * Deprecate ConversationalPipeline * stash commit * Add missing add_special_tokens kwarg * Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline * Add ✨TF✨ tests * @require_tf * Add type hint * Add specific deprecation version * Remove unnecessary do_sample * Remove todo - the discrepancy has been resolved * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/pipelines/text_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 16:41:01 +00:00
Zach Mueller	636b03244c	Fix trainer test wrt DeepSpeed + auto_find_bs (#29061 ) * FIx trainer test * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 10:04:24 -05:00
Sean (Seok-Won) Yi	161fe425c9	Feature: Option to set the tracking URI for MLflowCallback. (#29032 ) * Added option to set tracking URI for MLflowCallback. * Added option to set tracking URI for MLflowCallback. * Changed to in docstring.	2024-02-16 14:47:18 +00:00
Richard Lee	be42c24d14	Honor trust_remote_code for custom tokenizers (#28854 ) * pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config add test * change directories back to previous directory after test * fix ruff check * Add a note to that block for future in case we want to remove it later --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2024-02-16 13:40:23 +00:00
Sourab Mangrulkar	4c18ddb5cf	`auto_find_batch_size` isn't yet supported with DeepSpeed/FSDP. Raise error accrodingly. (#29058 ) Update trainer.py	2024-02-16 18:11:09 +05:30
Sourab Mangrulkar	b262808656	fix failing trainer ds tests (#29057 )	2024-02-16 17:18:45 +05:30
Jonathan Mamou	258da40efd	fix num_assistant_tokens with heuristic schedule (#28759 ) * fix heuristic num_assistant_tokens_schedule * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update utils.py check that candidate_generator.assistant_model exists since some some speculations (like ngram and PLD) don't have assistant_model attribute * Update src/transformers/generation/candidate_generator.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * merge conflict * fix docstring * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 11:44:58 +00:00
Tanmay patil	0eb408551c	Support : Leverage Accelerate for object detection/segmentation models (#28312 ) * made changes for object detection models * added support for segmentation models. * Made changes for segmentaion models * Changed import statements * solving conflicts * removed conflicts * Resolving commits * Removed conflicts * Fix : Pixel_mask_value set to False	2024-02-16 11:38:59 +00:00
Raushan Turganbay	aee11fe427	Fix max_length criteria when using inputs_embeds (#28994 ) * fix max_length for inputs_embeds * make style * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Static Cache: load models with MQA or GQA (#28975) * fix * fix tests * fix tests * Update src/transformers/generation/utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more fixes * make style --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 11:25:12 +00:00
Lysandre Debut	8876ce8a5f	Update important model list (#29019 )	2024-02-16 11:31:51 +01:00
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001 ) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
Titus	1e402b957d	add test marker to run all tests with @require_bitsandbytes (#28278 )	2024-02-16 01:53:09 +01:00
Sadra Barikbin	f3aa7db439	Fix a tiny typo in `generation/utils.py::GenerateEncoderDecoderOutput`'s docstring (#29044 ) Update utils.py	2024-02-15 18:12:31 +00:00
Andrei Panferov	b0a7f44f85	Removed obsolete attribute setting for AQLM quantization. (#29034 ) removed redundant field	2024-02-15 18:11:13 +00:00
amyeroberts	4156f517ce	Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043 ) * Patch to skip currently failing tests * Whoops - wrong place	2024-02-15 17:26:33 +00:00
Younes Belkada	6d1f545665	FIX: Fix error with `logger.warning` + inline with recent refactor (#29039 ) Update modeling_utils.py	2024-02-15 15:33:26 +01:00
amyeroberts	8a0ed0a9a2	Fix copies between DETR and DETA (#29037 )	2024-02-15 14:02:58 +00:00
Donggeun Yu	5b6fa2306a	DeformableDetrModel support fp16 (#29013 ) * Update ms_deform_attn_cuda.cu * Update ms_deform_attn_cuda.cuh * Update modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_deformable_detr.py * python utils/check_copies.py --fix_and_overwrite * Fix dtype missmatch error * Update test_modeling_deformable_detr.py * Update test_modeling_deformable_detr.py * Update modeling_deformable_detr.py * Update modeling_deformable_detr.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-15 12:31:09 +00:00
Sangbum Daniel Choi	83e96dc0ab	Add cuda_custom_kernel in DETA (#28989 ) * enable graident checkpointing in DetaObjectDetection * fix missing part in original DETA * make style * make fix-copies * Revert "make fix-copies" This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358. * remove fix-copies of DetaDecoder * enable swin gradient checkpointing * fix gradient checkpointing in donut_swin * add tests for deta/swin/donut * Revert "fix gradient checkpointing in donut_swin" This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d. * change supports_gradient_checkpointing pipeline to PreTrainedModel * Revert "add tests for deta/swin/donut" This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b. * Revert "Revert "fix gradient checkpointing in donut_swin"" This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f. * Simple revert * enable deformable detr gradient checkpointing * add gradient in encoder * add cuda_custom_kernel function in MSDA * make style and fix input of DetaMSDA * make fix-copies * remove n_levels in input of DetaMSDA * minor changes * refactor custom_cuda_kernel like yoso format `0507e69d34/src/transformers/models/yoso/modeling_yoso.py (L53)`	2024-02-15 12:09:39 +00:00
Arthur	f3788b09e1	Fix static generation when compiling! (#28937 ) * wow I was scared! * fix everything * nits * make it BC? * add todo * nits * is_tracing should still be used to pass tracing tests * nits * some nits to make sure genration works with static cache uncompiled * fix sdpa * fix FA2 for both static and dynamic in a better way? * style * fix-copies * fix fix copies * fix sequential beam searcg * style * use `keys_to_ignore` * nit * correct dtype inference when init * :( the fix for FA2 is still not optimal to investigate! * styling * nits * nit * this might work better * add comment * Update src/transformers/models/llama/modeling_llama.py * "position_ids" -> "cache_position" * style * nit * Remove changes that should no be propagatted just yet * Apply suggestions from code review * Styling * make sure we raise an errir for static cache with FA2 enabled * move to the bottom of the signature * style * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py * nit in the name --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-15 06:27:40 +01:00
Arthur	609a1767e8	[`CLeanup`] Revert SDPA attention changes that got in the static kv cache PR (#29027 ) * revert unrelated changes that got in * style	2024-02-15 00:55:48 +01:00
Younes Belkada	7a0fccc6eb	FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009 ) * fix trainer tags * add test	2024-02-14 23:56:35 +01:00
Jiewen Tan	5f06053dd8	[TPU] Support PyTorch/XLA FSDP via SPMD (#28949 ) * Initial commit * Add guards for the global mesh * Address more comments * Move the dataloader into integrations/tpu.py * Fix linters * Make karg more explicitly * Remove the move device logic * Fix the CI * Fix linters * Re-enable checkpointing	2024-02-14 21:44:49 +00:00
amyeroberts	0199a484eb	Backbone kwargs in config (#28784 ) * Enable instantiating model with pretrained backbone weights * Clarify pretrained import * Use load_backbone instead * Add backbone_kwargs to config * Pass kwargs to constructors * Fix up * Input verification * Add tests * Tidy up * Update tests/utils/test_backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-14 20:46:44 +00:00
JB (Don)	725f4ad1cc	Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948 ) * Add tie_weights() to LM heads and set bias in set_output_embeddings() The bias were not tied correctly in some LM heads, and this change should fix that. * Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin * Adding _tie_weights() to MPNet and Vilt * Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device * Rename to test name to save_load to match the convention	2024-02-14 20:39:01 +00:00
Merve Noyan	3f4e79d29c	Mask Generation Task Guide (#28897 ) * Create mask_generation.md * add h1 * add to toctree * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update mask_generation.md * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update mask_generation.md * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md * Update mask_generation.md * Update mask_generation.md --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Maria Khalusova <kafooster@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>	2024-02-14 18:29:49 +00:00
Raushan Turganbay	354775bc57	Fix flaky test vision encoder-decoder generate (#28923 )	2024-02-14 15:40:57 +00:00
Zach Mueller	0507e69d34	Introduce AcceleratorConfig dataclass (#28664 ) * Introduce acceleratorconfig dataclass * Extra second warn * Move import * Try moving import under is_accelerate_available * Quality * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Clean * Remove to_kwargs * Change version * Improve tests by including dispatch and split batches * Improve reliability * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixup tests and review nits * Make tests pass * protect import * Protect import * Empty-Commit * Make training_args.to_dict handle the AcceleratorConfig --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-14 10:18:09 -05:00
Huazhong Ji	69ca640dd6	Set the dataset format used by `test_trainer` to float32 (#28920 ) Co-authored-by: unit_test <test@unit.com>	2024-02-14 13:55:12 +00:00
amyeroberts	7252e8d937	[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002 ) * Trigger doc build * Test removing references * Importable from utils * Trigger another run on a new commit for testing	2024-02-14 10:29:22 +00:00
Andrei Panferov	1ecf5f7c98	AQLM quantizer support (#28928 ) * aqlm init * calibration and dtypes * docs * Readme update * is_aqlm_available * Simpler link in docs * Test TODO real reference * init _import_structure fix * AqlmConfig autodoc * integration aqlm * integrations in tests * docstring fix * legacy typing * Less typings * More kernels information * Performance -> Accuracy * correct tests * remoced multi-gpu test * Update docs/source/en/quantization.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Brought back multi-gpu tests * Update src/transformers/integrations/aqlm.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/aqlm_integration/test_aqlm.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-02-14 09:25:41 +01:00
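
The `output_logits` commit in the log above (08cd694ef0) describes reading the next-token probabilities of "yes" and "no" from raw, unprocessed logits, because logit processing or warping can change those probabilities or reject the tokens of interest. The calculation can be sketched in plain Python with toy numbers. The function name, token ids and logit values are illustrative assumptions, not part of the transformers API, and the top-1 filter below only stands in for whatever processing a real generation run applies.

```python
import math


def yes_no_probabilities(logits, yes_id, no_id):
    """Softmax over the whole vocabulary, then renormalise over two tokens.

    Returns (p_yes, p_no, relative_yes) where relative_yes is
    p_yes / (p_yes + p_no).
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    p_yes = exps[yes_id] / total
    p_no = exps[no_id] / total
    return p_yes, p_no, p_yes / (p_yes + p_no)


# Hypothetical 4-token vocabulary: id 0 = "yes", id 2 = "no".
raw_logits = [2.0, 0.5, 1.0, -1.0]
p_yes, p_no, rel_yes = yes_no_probabilities(raw_logits, yes_id=0, no_id=2)

# Stand-in for logit warping: keep only the top-1 token, mask the rest.
top = max(raw_logits)
processed = [x if x == top else -math.inf for x in raw_logits]
_, p_no_processed, rel_yes_processed = yes_no_probabilities(
    processed, yes_id=0, no_id=2
)

print(f"raw:       P(yes | yes or no) = {rel_yes:.4f}")
print(f"processed: P(yes | yes or no) = {rel_yes_processed:.4f}")
```

With the raw logits both candidates keep a non-zero probability, whereas the masked version assigns "no" a probability of zero, which is the situation the commit message warns about.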


16108 Commits