Yih-Dar
cea254c909
Update CsmForConditionalGenerationIntegrationTest (#38424)
...
* require_read_token
* ruff
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-05-28 10:20:43 +02:00
eustlb
b9f8f863d9
[CSM] update model id (#38211)
...
* update model id
* codec_model eval
* add processor img
* use ungated repo for processor tests
2025-05-27 17:03:55 +02:00
Cyril Vallez
896833c183
Fix some tests (especially compile with fullgraph=True on Python<3.11) (#38319)
...
* fix tests
* better fix for python<3.11
* fixes
* style
2025-05-23 17:11:40 +02:00
Yao Matrix
0173a99e73
enable csm integration cases on xpu, all passed (#38140)
...
* enable csm test cases on XPU, all passed
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
---------
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-15 09:46:29 +02:00
eustlb
e0f225cb10
[CSM] update test for t4 runners (#38110)
...
update test for t4 runners
2025-05-13 11:59:26 -04:00
eustlb
798f948e88
Add CSM model (#36719)
...
* draft structure
* depth decoder with forward pre hook
* full model forward draft
* draft update
* depth decoder update
* ConversationalSpeechModelForCausalLM updates
* add generate
* max length criteria small fix
* update
* updates
* generation update
* update in loss compute
* conversion script
* update for correct input embeddings
* handle interleaved rope
* update
* update
* update
* support compile
* update training
* add doc
* update doc
* correct inits
* ConversationalSpeechModel -> Csm
* conf update
* name update
* tests CsmForCausalLMTest
* convert use cached_file
* conf + modeling updates
* generate utils handle third dim shape
* integration test
* modeling + conf updates
* common test handle more than 2 dims
* add nested audio list utils
* processing handle nested audio list
* csm processing draft
* mimi util
* init updates
* modular update
* convert modular
* processing update
* csm tests update
* generate tests handle third dim
* generate utils handle third dim
* propagate _get_initial_cache_position update
* tied_weight_keys update + convert correctly
* fix inputs_embeds
* revert audio nested list
* batch inference update + return audio
* audio_utils update
* processor update
* some more integration tests
* remove old test
* processing output labels
* improve
* fix
* update rope values with equivalent ones
* conversion update
* update tests
* handle depth decoder generation config
* remove default eos_token_id
* make style
* revert modeling_mimi
* add default generation_config
* remove sdpa since handled by default
* make
* fix conflict
* fix conflicts
* correct naming
* correct imports
* make
* causal -> conditional naming
* causal -> conditional naming
* auto update
* make
* make
* add doc
* test update
* fix weight init
* audio tokens offsets as buffer
* 4d mask in conditional class
* make
* doc update
* fix causal mask
* fix causal mask
* doc update
* doc update
* add processor doc
* update doc
* fix 4d causal mask
* update make_list_of_audio
* do not default to mutable
* remove duplicates
* remove useless reset_parameters
* use GradientCheckpointingLayer
* use can_return_tuple
* formatting
* prepend placeholder in _sample
* torch compile fix
* some more fixes
* convert modular
* fix
* default max_length in convert
* handle depth decoder generation config correctly
* clearer formulation
* handle output_loading_info
* handle softmax warning
* add doc
* propagate _get_initial_cache_position changes
* generation in its own module
* add processor tests
* fix compile with cuda graphs
* fix compile with cuda graphs
* add csm.md
* include CSM loss
* doc nit
* doc nit
* doc nit
* Update docs/source/en/model_doc/csm.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add save_audio to processor
* Update src/transformers/models/csm/modular_csm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* doc update
* simplify audio_codes_mask computation
* doc update
* simplify loss computation
* fix static cache test
* fix
* remove comment
* simplify encoded length computation
* use hf-internal-testing
* doc update
* cast to float before numpy
* nit
* mem efficient codebook head
* nit
* cat input values with cutoffs
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-05-07 10:20:13 -04:00