RUFFY-369
29f56e2230
chore:init files for sam2
2024-08-01 12:24:49 +05:30
sangbumchoi
a898595938
initial comment
2024-07-30 00:25:42 +00:00
Kamil Akesbi
3fbaaaa64d
Whisper tokenizer word level timestamps ( #32197 )
...
* fix _fix_key in PreTrainedModel
* fix _find_longest_common_sequence
* add test
* remove result.json
* nit
* update test
2024-07-29 11:19:52 +01:00
Joao Gante
7ffe25f2b9
Generate: end-to-end compilation ( #30788 )
...
* mvp
* added test (a few models need fixes)
* fix a few test cases
* test nits
* harder test 😈
* revert changes in stablelm
* test with improved condition
* add todo
* tmp commit
* merged with main
* nits
* add todo
* final corrections
* add docs for generation compilation
* docs nits
* add tip
* PR suggestions
* add more details to the compilation docs
* fix cache positions
* cache is now init in generate; update docs
* tag test as flaky
* docs
* post rebase make fixup and other nits
* remove unintended changes
* whisper (encoder-decoder) not supported
* move token default updates to ; add tests for token defaults
* push changes
* manual rebase
* chameleon doesn't support this
* fix test_static_cache_mha_mqa_gqa (broken in another PR)
* docs: dynamic is better with end-to-end compilation
2024-07-29 10:52:13 +01:00
Sai-Suraj-27
49928892d6
fix(docs): Fixed a link in docs ( #32274 )
...
Fixed a link in docs.
2024-07-29 10:50:43 +01:00
Fanli Lin
6494479f1d
make p_mask a numpy array before passing to select_starts_ends ( #32076 )
...
* fix
* bug fix
* refine
* fix
2024-07-29 10:29:11 +01:00
Joao Gante
535fe78b9f
Repo: remove exceptions in check_docstrings ( #32259 )
...
remove exceptions
2024-07-29 11:06:05 +02:00
Sai-Suraj-27
a2ad9d5ad5
fix: Fixed wrong argument passed to convert_blip_checkpoint function call ( #32262 )
...
Removed one wrong argument passed to convert_blip_checkpoint function call.
2024-07-29 10:43:09 +02:00
leejet
5019aabfac
Optimize t5 tokenize logic to avoid redundant calls ( #32270 )
...
* Optimize t5 tokenize logic to avoid redundant calls
* fix and overwrite copies
2024-07-29 09:51:43 +02:00
Yih-Dar
f2122cc6eb
Upload new model failure report to Hub ( #32264 )
...
upload
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-29 09:42:54 +02:00
Raushan Turganbay
f739687684
🚨 Bloom support for cache class ( #31445 )
...
* bloom dynamic cache
* bloom follows standard cache format
* no skips for bloom anymore
* use cache position when possible
* clean up
* codestyle
* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* pr comments
* isinstance fix
* address comments
* make musicgen test happy
* [run-slow] bloom
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-29 10:58:59 +05:00
Joao Gante
44f6fdd74f
Llama 3.1: replace for loop by tensor ops at inv_freq initialization ( #32244 )
...
* replace for loop by tensor ops
* rm assert; readability
2024-07-27 10:19:46 +01:00
Yih-Dar
8da9068730
More flexible trigger condition ( #32251 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-26 20:52:45 +02:00
Raushan Turganbay
81233c069c
Flash-Attn: fix generation when no attention mask or no padding ( #32241 )
...
* fix
* fix prev test (half of failures)
* [run-slow] llama, gemma2
* [run-slow] llama, gemma2
2024-07-26 14:45:55 +05:00
Fanli Lin
27c7f971c0
[tests] fix static cache implementation is not compatible with attn_implementation==flash_attention_2 ( #32039 )
...
* add flash attention check
* fix
* fix
2024-07-26 11:41:27 +02:00
Connor Anderson
5f841c74b6
Add check for target_sizes is None in post_process_image_guided_detection for owlv2 ( #31934 )
...
* Add check for target_sizes is None in post_process_image_guided_detection
* Make sure Owlvit and Owlv2 in sync
* Fix incorrect indentation; add check for correct size of target_sizes
2024-07-26 10:05:46 +01:00
Rohit Dwivedula
f9756d9edb
Adds: extra_repr for RMSNorm layers in most models ( #32204 )
...
* adds: extra_repr() to RMSNorm layers in multiple models
* adds: extra_repr for deprecated models as well
* formatting as per style guide
2024-07-26 11:05:38 +02:00
Sai-Suraj-27
b8e5cd5396
Refactor: Removed un-necessary object base class ( #32230 )
...
* Refactored to remove un-necessary object base class.
* small fix.
2024-07-26 10:33:02 +02:00
João Nadkarni
1c7ebf1d6e
don't log base model architecture in wandb if log model is false ( #32143 )
...
* don't log base model architecture in wandb if log model is false
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* convert log model setting into an enum
* fix formatting
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-26 09:38:59 +02:00
Raushan Turganbay
c46edfb823
Resize embeds with DeepSpeed ( #32214 )
...
* fix resize when deepspeed
* deepspeed uses new embeds
* we needed this
2024-07-26 10:52:06 +05:00
Raushan Turganbay
fad15fba78
Llava: generate without images ( #32183 )
...
* llava w/o images
* tests
2024-07-26 10:17:27 +05:00
Raushan Turganbay
4ab33c2d81
Generation: stop at eos for assisted decoding ( #31301 )
...
* fix
* move changes to prompt lookup
* add test
* set eos in assistant model
* style
* fix flakiness
* changes for new `main`
* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add comment to explain
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-26 10:16:06 +05:00
Pavel Iakubovskii
9d6c0641c4
Fix code snippet for Grounding DINO ( #32229 )
...
Fix code snippet for grounding-dino
2024-07-25 19:20:47 +01:00
jrhe
3a83ec48a6
Allow a specific microphone to be used by the ffmpeg audio pipeline utility functions. Default to using the currently active microphone on Mac ( #31846 )
...
* use currently active microphone on mac for ffmpeg_microphone
* Allow ffmpeg_microphone device to be specified
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-25 17:16:13 +01:00
Huazhong Ji
6ed0bf1e85
translate philosophy.md to chinese ( #32177 )
...
* translate philosophy.md to chinese
* add the missing link
2024-07-25 09:01:06 -07:00
Yih-Dar
df6eee9201
Follow up for #31973 ( #32025 )
...
* fix
* [test_all] trigger full CI
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-25 16:12:23 +02:00
Kashif Rasul
de2318894e
[warnings] fix E721 warnings ( #32223 )
...
fix E721 warnings
2024-07-25 15:12:23 +02:00
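For context on the entry above: E721 is the pycodestyle/ruff rule that flags comparing types with `==`. The log does not show which lines were changed, so the following is only a generic sketch of the kind of rewrite the rule asks for; the function names are hypothetical and not taken from the repository.

```python
# Generic illustration of an E721-style fix (hypothetical helper names).
# A comparison such as `type(value) == int` is what the rule flags.

def is_exact_int(value):
    # Exact type check: use `is` rather than `==` when comparing types.
    return type(value) is int


def is_int_like(value):
    # Instance check: also accepts subclasses of int (bool is one of them).
    return isinstance(value, int)
```

The two forms are not interchangeable: `isinstance` accepts subclasses, so which one is correct depends on what the original comparison intended.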
Kashif Rasul
9b9a54e61b
[BigBird Pegasus] set _supports_param_buffer_assignment to False ( #32222 )
...
set _supports_param_buffer_assignment to False
2024-07-25 15:11:43 +02:00
Austin
1ecedf1d9e
Update question_answering.py ( #32208 )
2024-07-25 13:20:27 +01:00
Huazhong Ji
f53a5dec7b
remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0 ( #32210 )
...
remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0
2024-07-25 11:04:04 +02:00
Sanchit Gandhi
5658e749ad
[whisper] fix short-form output type ( #32178 )
...
* [whisper] fix short-form output type
* add test
* make style
* update long-form tests
* fixes
* last fix
* finalise test
2024-07-25 16:58:02 +08:00
Sai-Suraj-27
85a1269e19
fix: Replaced deprecated unittest method with the correct one ( #32198 )
...
Replaced deprecated unittest method with the correct one.
2024-07-24 18:00:21 +01:00
Matt
edd68f4ed8
🚨 No more default chat templates ( #31733 )
...
* No more default chat templates
* Add the template to the GPT-SW3 tests since it's not available by default now
* Fix GPT2 test
* Fix Bloom test
* Fix Bloom test
* Remove default templates again
2024-07-24 17:36:32 +01:00
Penut Chen
1c122a46dc
Support dequantizing GGUF FP16 format ( #31783 )
...
* support gguf fp16
* support gguf bf16 with pytorch
* add gguf f16 test
* remove bf16
2024-07-24 17:59:59 +02:00
Marc Sun
af0e4b7b37
Fix float8_e4m3fn in modeling_utils ( #32193 )
...
* Fix float8_e4m3fn in modeling_utils
* style
* fix
* comment
2024-07-24 17:14:05 +02:00
Raushan Turganbay
1392a6867f
Fix resize embedding with Deepspeed ( #32192 )
...
fix resize when deepspeed
2024-07-24 19:26:20 +05:00
Arthur
8d2534c4d0
let's not warn when someone is running a forward ( #32176 )
...
* let's not warn when someone is running a forward without cache + self.training
* more models
* fixup
2024-07-24 16:06:39 +02:00
Joao Gante
e0182f3bd7
RoPE: relaxed rope validation ( #32182 )
...
* relaxed rope check
* lets also accept rope_type=None, defaulting to the original implementation
* type and rope_type can coexist
2024-07-24 15:00:48 +01:00
amyeroberts
165116bc14
Remove conversational pipeline tests ( #32099 )
...
Remove conversation pipeline tests
2024-07-24 14:03:40 +01:00
Dr. Artificial曾小健
5f4ee98a7a
Update qwen2.md ( #32108 )
...
* Update qwen2.md
outdated description
* Update qwen2.md
amended
* Update qwen2.md
Update
* Update qwen2.md
fix wrong version code, now good to go
2024-07-24 11:54:41 +01:00
조준래
8678879f1d
fix: default value reflects the runtime environment variables rather than the ones present at import time. ( #32153 )
...
* fix: default value reflects the runtime environment variables rather than the ones present at import time.
* Fix: Change `deterministic` to None by default; use env var if None
2024-07-24 11:38:49 +01:00
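The behaviour described in the entry above (a default that follows the environment at call time instead of import time, using `None` as the sentinel) can be sketched in plain Python. This is a minimal illustration, not the code from the commit: the variable name `EXAMPLE_DETERMINISTIC` and both function names are assumptions made up for the example.

```python
import os

# Hypothetical environment variable name, used only for this illustration.
ENV_VAR = "EXAMPLE_DETERMINISTIC"


def run_import_time_default(deterministic=os.environ.get(ENV_VAR, "0") == "1"):
    # The default above was evaluated once, when the function was defined,
    # so later changes to the environment are never seen.
    return deterministic


def run_runtime_default(deterministic=None):
    # None means "not specified": read the environment at call time.
    if deterministic is None:
        deterministic = os.environ.get(ENV_VAR, "0") == "1"
    return deterministic
```

With the second form, setting the variable after the module has been imported still takes effect, and an explicit `True` or `False` argument always wins over the environment.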
Rohit Dwivedula
01be5b4879
adds: extra_repr() to MambaRMSNorm to include hidden size / size of weights in the layer ( #32171 )
...
* adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer
* style fix with ruff:
2024-07-24 09:09:59 +02:00
Fanli Lin
c85510f958
[docs] change temperature to a positive value ( #32077 )
...
fix
2024-07-23 17:47:51 +01:00
Sai-Suraj-27
bc2adb0112
fix: Fixed an if condition that is always evaluating to true ( #32160 )
...
Fixed an if condition always evaluating to true.
2024-07-23 16:52:41 +01:00
Joao Gante
23f6a43f82
fix ( #32162 )
2024-07-23 16:48:16 +01:00
Lysandre
d5a99dfcee
Llama 3.1 conversion
...
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-07-23 17:13:25 +02:00
Lysandre
ff0d708fe6
Dev version: v4.44.0.dev0
2024-07-23 17:12:47 +02:00
Sai-Suraj-27
d2c687b3f1
Updated ruff to the latest version ( #31926 )
...
* Updated ruff version and fixed the required code according to the latest version.
* Updated ruff version and fixed the required code according to the latest version.
* Added noqa directive to ignore 1 error shown by ruff
2024-07-23 17:07:31 +02:00
RhuiDih
9cf4f2aa9a
Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs ( #31629 )
...
* add DataCollatorBatchFlattening
* Update data_collator.py
* change name
* new FA2 flow if position_ids is provided
* add comments
* minor fix
* minor fix data collator
* add test cases for models
* add test case for data collator
* remove extra code
* formatting for ruff check and check_repo.py
* ruff format
ruff format tests src utils
* custom_init_isort.py
2024-07-23 15:56:41 +02:00
Deep Gandhi
7d92009af6
Added additional kwarg for successful running of optuna hyperparameter search ( #31924 )
...
Update integration_utils.py
Added additional kwarg
2024-07-23 14:41:52 +01:00