15172 Commits

Author SHA1 Message Date
Arthur Zucker
1a77f07f65 v4.39.dev.0 2024-02-21 15:23:22 +09:00
amyeroberts
e770f0316d
[pipeline] Add pool option to image feature extraction pipeline (#28985)
* Add pool option

* PR comments - error message and exact outputs check
2024-02-20 20:22:08 +00:00
Fernando Pérez-García
c47576ca6e
Fix drop path being ignored in DINOv2 (#29147)
Fix drop path not being used
2024-02-20 17:31:59 +00:00
Gustavo Isturiz
3c00b885b9
Added image_captioning version in es and included in toctree file (#29104)
added image_captioning version in es and included in toctree file
2024-02-20 09:13:15 -08:00
Joao Gante
857fd8eaab
Generate: missing generation config eos token setting in encoder-decoder tests (#29146) 2024-02-20 16:17:51 +00:00
Pablo Montalvo
1c81132e80
Raise unused kwargs image processor (#29063)
* draft processor arg capture

* add missing vivit model

* add new common test for image preprocess signature

* fix quality

* fix up

* add back missing validations

* quality

* move info level to warning for unused kwargs
2024-02-20 16:20:20 +01:00
JB (Don)
b8b16475d4
[Phi] Add support for sdpa (#29108) 2024-02-20 14:33:12 +01:00
Yih-Dar
7688d8df84
Save (circleci) cache at the end of a job (#29141)
nice job

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-20 21:31:36 +08:00
Taylor Jackle Spriggs
ee3af60be0
Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070)
* add support for siglip and chinese-clip model training with contrastive-image-text example

* codebase fixups
2024-02-20 12:08:31 +00:00
amyeroberts
0996a10077
Revert low cpu mem tie weights (#29135)
* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)"

This reverts commit 725f4ad1cc.

* Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)"

This reverts commit 4156f517ce.
2024-02-20 12:06:46 +00:00
Arthur
15cfe38942
[Core tokenization] add_dummy_prefix_space option to help with latest issues (#28010)
* add add_dummy_prefix_space option to slow

* checking kwargs might be better. Should be there for all spm tokenizer IMO

* nits

* fix copies

* more copied

* nits

* add prefix space

* nit

* nits

* Update src/transformers/convert_slow_tokenizer.py

* fix init

* revert wrong styling

* fix

* nits

* style

* updates

* make sure we use slow tokenizer for conversion instead of looking for the decoder

* support llama as well

* update llama tokenizer fast

* nits

* nits nits nits

* update the doc

* update

* update to fix tests

* skip unrelated failing test

* Update src/transformers/convert_slow_tokenizer.py

* add proper testing

* test decode as well

* more testing

* format

* fix llama test

* Apply suggestions from code review
2024-02-20 12:50:31 +01:00
Younes Belkada
efdd436663
FIX [PEFT / Trainer ] Handle better peft + quantized compiled models (#29055)
* handle peft + compiled models

* add tests

* fixup

* adapt from suggestions

* clarify comment
2024-02-20 12:45:08 +01:00
Arthur
5e95dcabe1
[cuda kernels] only compile them when initializing (#29133)
* only compile when needed

* fix mra as well

* fix yoso as well

* update

* remove comment

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

* oops

* Update src/transformers/models/deta/modeling_deta.py

* nit
2024-02-20 12:38:59 +01:00
Joao Gante
a7755d2409
Generate: unset GenerationConfig parameters do not raise warning (#29119) 2024-02-20 11:34:31 +00:00
Joao Gante
7d312ad2e9
Llama: fix batched generation (#29109) 2024-02-20 10:23:17 +00:00
Younes Belkada
ff76e7c212
FIX [bnb / tests] Propagate the changes from #29092 to 4-bit tests (#29122)
* forgot to push the changes for 4bit ..

* trigger CI
2024-02-20 11:11:15 +01:00
Pablo Montalvo
1c9134f004
Abstract image processor arg checks. (#28843)
* abstract image processor arg checks.

* fix signatures and quality

* add validate_ method to rescale-prone processors

* add more validations

* quality

* quality

* fix formatting

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix formatting

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix formatting

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix formatting mishap

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix crop_size compatibility

* fix default mutable arg

* fix segmentation map + image arg validity

* remove segmentation check from arg validation

* fix quality

* fix missing segmap

* protect PILImageResampling type

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add back segmentation maps check

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-20 11:05:46 +01:00
Younes Belkada
f7ef7cec6c
FEAT [Trainer / bnb]: Add RMSProp from bitsandbytes to HF Trainer (#29082)
* add RMSProp to Trainer

* revert some change

* Update src/transformers/trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-20 02:43:02 +01:00
Erich Schubert
a7ff2f23a0
Move misplaced line (#29117)
Move misplaced line, improve code comment
2024-02-20 02:24:48 +01:00
Arthur
9094abe8dc
[gradient_checkpointing] default to use it for torch 2.3 (#28538)
* default to use it

* style
2024-02-20 02:23:25 +01:00
Nilesh
49c0b293d2
Fixed nll with label_smoothing to just nll (#28708)
* Fixed nll with label_smoothing to nll

* Resolved conflict by rebase

* Fixed nll with label_smoothing to nll

* Resolved conflict by rebase

* Added label_smoothing to config file

* Fixed nits
2024-02-20 01:52:15 +01:00
Shijie Wu
4f09d0fd88
storing & logging gradient norm in trainer (#27326)
* report grad_norm during training

* support getting grad_norm from deepspeed
2024-02-19 19:07:41 +00:00
Sadra Barikbin
a4851d9477
Fix two tiny typos in pipelines/base.py::Pipeline::_sanitize_parameters()'s docstring (#29102)
* Update base.py

* Fix a typo
2024-02-19 18:50:28 +00:00
Titus
5ce90f3212
Bnb test fix for different hardwares (#29066)
* generated text on A10G

* generated text in CI

* Apply suggestions from code review

add explanatory comments

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-19 18:04:44 +00:00
Max Baak
08cd694ef0
ENH: added new output_logits option to generate function (#28667)
The output_logits option behaves like output_scores, but returns the raw, unprocessed prediction logit scores,
i.e. the values before they undergo logit processing and/or warping. The regular output scores, by contrast,
are processed and/or warped by default.

Unprocessed logit scores are useful in certain circumstances. For example, with causal LM models they make it
possible to determine the probability of a certain answer, e.g. when asking a question with a yes/no answer.
In that case the next-token probabilities of both "yes" and "no" (and/or their relative ratio) are of interest
for classification. These are taken _before_ logit processing and/or warping because processing can
a) change the probabilities or b) reject the tokens of interest or reduce the number of tokens to just 1.
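
As a rough illustration of the point above, the following sketch uses plain Python only and does not call the library. The toy vocabulary, the logit values, and the helper names `softmax` and `top_k_warp` are all hypothetical, and top-k filtering stands in for whichever processing or warping step is actually configured. It shows how a warping step can remove the "no" token, so that the yes/no ratio can only be recovered from the raw logits.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_warp(logits, k):
    """Hypothetical warping step: keep the k largest logits, mask the rest with -inf."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]

# Toy next-token logits over a made-up four-token vocabulary (illustrative values only).
vocab = ["yes", "no", "maybe", "the"]
raw_logits = [2.0, 1.0, 0.5, -1.0]

# What a "processed scores" output would look like after top-1 warping.
processed_scores = top_k_warp(raw_logits, k=1)

raw_probs = softmax(raw_logits)
processed_probs = softmax(processed_scores)

# The yes/no ratio is only meaningful on the raw logits:
# after warping, "no" has probability 0 and the ratio is undefined.
p_yes, p_no = raw_probs[vocab.index("yes")], raw_probs[vocab.index("no")]
yes_no_ratio = p_yes / p_no

print("raw:      ", dict(zip(vocab, raw_probs)))
print("processed:", dict(zip(vocab, processed_probs)))
print("P(yes)/P(no) from raw logits:", yes_no_ratio)
```

With these toy values the ratio equals exp(2.0 - 1.0), since the softmax normalizer cancels out; the processed distribution puts all of its mass on "yes".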

For an example use-case see paper TabLLM: Few-shot Classification of Tabular Data with Large Language Models
by Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag.
https://arxiv.org/abs/2210.10723

In addition:
- added dedicated unit test: tests/generation/test_utils/test_return_unprocessed_logit_scores
  which tests the return of logits with output_logits=True in generation.
- set output_logits=True in all other generation unit tests, that also have output_scores=True.

Implemented @gante's and @amyeroberts review feedback

Co-authored-by: kx79wq <max.baak@ing.com>
2024-02-19 17:34:17 +00:00
NielsRogge
07e3454f03
[Docs] Add resources (#28705)
* Add resource

* Add more resources

* Add resources

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Remove mention

* Remove pipeline tags

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-19 15:22:29 +01:00
Arthur
b2724d7b4c
change version (#29097)
* change version

* nuke

* this doesn't make sense

* update some requirements.py

* revert + no main

* nits

* change cache number

* more pin

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-19 22:08:44 +08:00
Jay Zhou
79132d4cfe
Fix a typo in examples/pytorch/text-classification/run_classification.py (#29072) 2024-02-19 13:01:15 +00:00
Lysandre Debut
9830858671
Fix the bert-base-cased tokenizer configuration test (#29105)
Fix test
2024-02-19 13:23:25 +01:00
Winton Davies
593230f0a1
fix the post-processing link (#29091)
The link in evaluation was missing a hyphen between "post" and "processing". I fixed this for English only. Someone with the ability to do a global search/replace should fix the other languages (if indeed they have this issue).
2024-02-19 10:15:58 +00:00
Younes Belkada
a75a6c9315
FIX [bnb / tests]: Fix currently failing bnb tests (#29092)
Update test_mixed_int8.py
2024-02-19 10:39:12 +01:00
Younes Belkada
864c8e6ea3
[Awq] Add peft support for AWQ (#28987)
* add peft support for AWQ

* Update src/transformers/quantizers/quantizer_awq.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-19 01:31:39 +01:00
Aaron Jimenez
ce4fff0be7
[Docs] Spanish translation of task_summary.md (#28844)
* Add task_summary to es/_toctree.yml

* Add task_summary.md to docs/es

* Change title of task_summary.md

* Translate first paragraphs

* Translate middle paragraphs

* Translate the rest of the doc

* Edit first paragraph
2024-02-16 15:50:06 -08:00
Matt
2f1003be86
Add chat support to text generation pipeline (#28945)
* Add chat support to text generation pipeline

* Better handling of single elements

* Deprecate ConversationalPipeline

* stash commit

* Add missing add_special_tokens kwarg

* Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline

* Add TF tests

* @require_tf

* Add type hint

* Add specific deprecation version

* Remove unnecessary do_sample

* Remove todo - the discrepancy has been resolved

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/pipelines/text_generation.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-16 16:41:01 +00:00
Zach Mueller
636b03244c
Fix trainer test wrt DeepSpeed + auto_find_bs (#29061)
* Fix trainer test

* Update tests/trainer/test_trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-16 10:04:24 -05:00
Sean (Seok-Won) Yi
161fe425c9
Feature: Option to set the tracking URI for MLflowCallback. (#29032)
* Added option to set tracking URI for MLflowCallback.

* Added option to set tracking URI for MLflowCallback.

* Changed  to  in docstring.
2024-02-16 14:47:18 +00:00
Richard Lee
be42c24d14
Honor trust_remote_code for custom tokenizers (#28854)
* pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config
add test

* change directories back to previous directory after test

* fix ruff check

* Add a note to that block for future in case we want to remove it later

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-02-16 13:40:23 +00:00
Sourab Mangrulkar
4c18ddb5cf
auto_find_batch_size isn't yet supported with DeepSpeed/FSDP. Raise error accordingly. (#29058)
Update trainer.py
2024-02-16 18:11:09 +05:30
Sourab Mangrulkar
b262808656
fix failing trainer ds tests (#29057) 2024-02-16 17:18:45 +05:30
Jonathan Mamou
258da40efd
fix num_assistant_tokens with heuristic schedule (#28759)
* fix heuristic num_assistant_tokens_schedule

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update utils.py

check that candidate_generator.assistant_model exists, since some speculation methods (like ngram and PLD) don't have an assistant_model attribute

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/generation/test_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make fixup

* merge conflict

* fix docstring

* make fixup

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-16 11:44:58 +00:00
Tanmay patil
0eb408551c
Support : Leverage Accelerate for object detection/segmentation models (#28312)
* made changes for object detection models

* added support for segmentation models.

* Made changes for segmentation models

* Changed import statements

* solving conflicts

* removed conflicts

* Resolving commits

* Removed conflicts

* Fix : Pixel_mask_value set to False
2024-02-16 11:38:59 +00:00
Raushan Turganbay
aee11fe427
Fix max_length criteria when using inputs_embeds (#28994)
* fix max_length for inputs_embeds

* make style

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Static Cache: load models with MQA or GQA (#28975)

* fix

* fix tests

* fix tests

* Update src/transformers/generation/utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* more fixes

* make style

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-16 11:25:12 +00:00
Lysandre Debut
8876ce8a5f
Update important model list (#29019) 2024-02-16 11:31:51 +01:00
Lysandre Debut
f497f564bb
Update all references to canonical models (#29001)
* Script & Manual edition

* Update
2024-02-16 08:16:58 +01:00
Titus
1e402b957d
add test marker to run all tests with @require_bitsandbytes (#28278) 2024-02-16 01:53:09 +01:00
Sadra Barikbin
f3aa7db439
Fix a tiny typo in generation/utils.py::GenerateEncoderDecoderOutput's docstring (#29044)
Update utils.py
2024-02-15 18:12:31 +00:00
Andrei Panferov
b0a7f44f85
Removed obsolete attribute setting for AQLM quantization. (#29034)
removed redundant field
2024-02-15 18:11:13 +00:00
amyeroberts
4156f517ce
Patch to skip failing test_save_load_low_cpu_mem_usage tests (#29043)
* Patch to skip currently failing tests

* Whoops - wrong place
2024-02-15 17:26:33 +00:00
Younes Belkada
6d1f545665
FIX: Fix error with logger.warning + inline with recent refactor (#29039)
Update modeling_utils.py
2024-02-15 15:33:26 +01:00
amyeroberts
8a0ed0a9a2
Fix copies between DETR and DETA (#29037) 2024-02-15 14:02:58 +00:00