Arthur
307f632bb2
[`CI update`] Try to use dockers and no cache ( #29202 )
...
* change cis
* nits
* update
* minor updates
* [push-ci-image]
* nit [push-ci-image]
* nitsssss
* [build-ci-image]
* [push-ci-image]
* [push-ci-image]
* both
* [push-ci-image]
* this?
* [push-ci-image]
* pypi-kenlm needs g++
* [push-ci-image]
* nit
* more nits [push-ci-image]
* nits [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* add vision
* [push-ci-image]
* [push-ci-image]
* add new dummy file but will need to update them [push-ci-image]
* [push-ci-image]
* show package size as well
* [push-ci-image]
* potentially ignore failures
* workflow updates
* nits [push-ci-image]
* [push-ci-image]
* fix consistency
* clean nciida triton
* also show big packages [push-ci-image]
* nit
* update
* another one
* line escape?
* add accelerate [push-ci-image]
* updates [push-ci-image]
* nits to run tests, no push-ci
* try to parse skip reason to make sure nothing is skipped that should not be skipped
* nit?
* always show skipped reasons
* nits
* better parsing of the test outputs
* action="store_true",
* failure on failed
* show matched
* debug
* update short summary with skipped, failed and errors
* nits
* nits
* coolu pdates
* remove docbuilder
* fix
* always run checks
* oups
* nits
* don't error out on library printing
* non zero exi codes
* no warning
* nit
* WAT?
* format nit
* [push-ci-image]
* fail if fail is needed
* [push-ci-image]
* sound file for torch light?
* [push-ci-image]
* order is important [push-ci-image]
* [push-ci-image] reduce even further
* [push-ci-image]
* use pytest rich !
* yes [push-ci-image]
* oupsy
* bring back the full traceback, but pytest rich should help
* nit
* [push-ci-image]
* re run
* nit
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* empty push to trigger
* [push-ci-image]
* nit? [push-ci-image]
* empty
* try to install timm with no deps
* [push-ci-image]
* oups [push-ci-image]
* [push-ci-image]
* [push-ci-image] ?
* [push-ci-image] open ssh client for git checkout fast
* empty for torch light
* updates [push-ci-image]
* nit
* @v4 for checkout
* [push-ci-image]
* [push-ci-image]
* fix fetch tests with parallelism
* [push-ci-image]
* more parallelism
* nit
* more nits
* empty to re-trigger
* empty to re-trigger
* split by timing
* did not work with previous commit
* junit.xml
* no path?
* mmm this?
* junitxml format
* split by timing
* nit
* fix junit family
* now we can test if the xunit1 is compatible!
* this?
* fully list tests
* update
* update
* oups
* finally
* use classname
* remove working directory to make sure the path does not interfere
* okay no juni should have the correct path
* name split?
* sort by classname is what make most sense
* some testing
* naem
* oups
* test something fun
* autodetect
* 18?
* nit
* file size?
* uip
* 4 is best
* update to see versions
* better print
* [push-ci-image]
* [push-ci-image]
* please install the correct keras version
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* uv is fucking me up
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* nits
* [push-ci-image]
* [push-ci-image]
* install issues an pins
* tapas as well
* nits
* more paralellism
* short tb
* soundfile
* soundfile
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* oups
* [push-ci-image]
* fix some things
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* use torch-light for hub
* small git lfs for hub job
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* fix tf tapas
* [push-ci-image]
* nits
* [push-ci-image]
* don't update the test
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* no use them
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* update tf proba
* [push-ci-image]
* [push-ci-image]
* woops
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* test with built dockers
* [push-ci-image]
* skip annoying tests
* revert fix copy
* update test values
* update
* last skip and fixup
* nit
* ALL GOOOD
* quality
* Update tests/models/layoutlmv2/test_image_processing_layoutlmv2.py
* Update docker/quality.dockerfile
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update src/transformers/models/tapas/modeling_tf_tapas.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* use torch-speed
* updates
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* [push-ci-image]
* fuck ken-lm [push-ci-image]
* [push-ci-image]
* [push-ci-image]
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-05-06 10:10:32 +02:00
Yih-Dar
91d155ea92
Avoid duplication in PR slow CI model list ( #30634 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-03 18:19:30 +02:00
Yen Ting
deb7605a2a
Prevent `TextGenerationPipeline._sanitize_parameters` from overriding previously provided parameters ( #30362 )
...
* Fixed TextGenerationPipeline._sanitize_parameters default params
* removed empty spaces
---------
Co-authored-by: Ng, Yen Ting <yen.ting.ng@intel.com>
2024-05-03 17:49:28 +02:00
Younes Belkada
d0c72c15c2
HQQ: PEFT support for HQQ ( #30632 )
...
Update quantizer_hqq.py
2024-05-03 16:01:15 +02:00
Pavel Iakubovskii
66f675eb65
Fix W&B run name ( #30462 )
...
* Remove comparison to output_dir
* Update docs for `run_name`
* Add warning
2024-05-03 12:04:15 +01:00
Mayank Mishra
425e1a0426
add mlp bias for llama models ( #30031 )
...
* add bias
* fix quality
2024-05-03 11:02:17 +02:00
Raushan Turganbay
a0e77a1f6b
Fix CI after #30410 ( #30612 )
...
* Fix CI after #30410
* [run-slow] blenderbot
2024-05-03 01:18:48 +05:00
mobicham
59952994c4
Add HQQ quantization support ( #29637 )
...
* update HQQ transformers integration
* push import_utils.py
* add force_hooks check in modeling_utils.py
* fix | with Optional
* force bias as param
* check bias is Tensor
* force forward for multi-gpu
* review fixes pass
* remove torch grad()
* if any key in linear_tags fix
* add cpu/disk check
* isinstance return
* add multigpu test + refactor tests
* clean hqq_utils imports in hqq.py
* clean hqq_utils imports in quantizer_hqq.py
* delete hqq_utils.py
* Delete src/transformers/utils/hqq_utils.py
* ruff init
* remove torch.float16 from __init__ in test
* refactor test
* isinstance -> type in quantizer_hqq.py
* cpu/disk device_map check in quantizer_hqq.py
* remove type(module) nn.linear check in quantizer_hqq.py
* add BaseQuantizeConfig import inside HqqConfig init
* remove hqq import in hqq.py
* remove accelerate import from test_hqq.py
* quant config.py doc update
* add hqqconfig to main_classes doc
* make style
* __init__ fix
* ruff __init__
* skip_modules list
* hqqconfig format fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* test_hqq.py remove mistral comment
* remove self.using_multi_gpu is False
* torch_dtype default val set and logger.info
* hqq.py isinstance fix
* remove torch=None
* torch_device test_hqq
* rename test_hqq
* MODEL_ID in test_hqq
* quantizer_hqq setattr fix
* quantizer_hqq typo fix
* imports quantizer_hqq.py
* isinstance quantizer_hqq
* hqq_layer.bias reformat quantizer_hqq
* Step 2 as comment in quantizer_hqq
* prepare_for_hqq_linear() comment
* keep_in_fp32_modules fix
* HqqHfQuantizer reformat
* quantization.md hqqconfig
* quantization.md model example reformat
* quantization.md # space
* quantization.md space })
* quantization.md space })
* quantization_config fix doc
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* axis value check in quantization_config
* format
* dynamic config explanation
* quant config method in quantization.md
* remove shard-level progress
* .cuda fix modeling_utils
* test_hqq fixes
* make fix-copies
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 17:51:49 +01:00
Jonghwan Hyeon
4c940934da
Output `None` as attention when layer is skipped ( #30597 )
...
* Output `None` as attention when layer is skipped
* Add test for output_attentions
2024-05-02 17:25:19 +01:00
Michael Benayoun
39359e5b5f
Fix FX tracing issues for Llama ( #30619 )
2024-05-02 17:03:10 +02:00
Joao Gante
9719202d37
Generate: fix `SinkCache` on Llama models ( #30581 )
2024-05-02 15:24:33 +01:00
Joao Gante
66abe13951
Docs: add missing `StoppingCriteria` autodocs ( #30617 )
...
* add missing docstrings to docs
* Update src/transformers/generation/stopping_criteria.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 15:20:04 +01:00
Joao Gante
aa55ff44a2
Docs: fix `generate`-related rendering issues ( #30600 )
...
* does this work?
* like this?
* fix the other generate links
* missing these
2024-05-02 14:42:25 +01:00
amitportnoy
801894e08c
phi3 chat_template does not support system role ( #30606 )
...
* phi3 chat_template does not support system role
* fix doc test error
2024-05-02 15:30:21 +02:00
Yih-Dar
f57f014936
Use `contiguous()` in clip checkpoint conversion script ( #30613 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-02 13:59:40 +02:00
Zhan Lu
a65da83d75
fix:missing `output_router_logits` in SwitchTransformers ( #30573 )
...
* fix:missing `output_router_logits` in SwitchTransformers
* fix whitespace in blank line
2024-05-02 13:47:00 +02:00
amyeroberts
4ad5adaf1d
Fix copies for DBRX - neuron fix ( #30610 )
2024-05-02 11:00:26 +01:00
Richard Brown
f95302584b
🚨 Update image_processing_vitmatte.py ( #30566 )
...
* Update image_processing_vitmatte.py
* add test
* [run-slow]vitmatte
2024-05-02 11:00:07 +01:00
Bai Li
12c5544dca
Fix memory leak with CTC training script on Chinese languages ( #30358 )
...
* Fix memory leak with CTC training script on Chinese languages
* Fix lint
2024-05-02 09:33:36 +01:00
Michael Benayoun
fbabd6746f
Fix for Neuron ( #30259 )
2024-05-02 10:24:47 +02:00
Raushan Turganbay
5cf3e6bf05
Fix: failing CI after #30568 ( #30599 )
...
* failing CI
* no let's keep it until full deprecation in v4.42
2024-05-02 12:15:17 +05:00
dependabot[bot]
c681b58b06
Bump torch from 1.9.0+cpu to 1.13.1 in /examples/flax/vision ( #21168 )
...
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.9.0+cpu to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 20:14:57 +01:00
dependabot[bot]
3a36597a5f
Bump pillow from 10.0.1 to 10.2.0 in /examples/research_projects/decision_transformer ( #28655 )
...
Bump pillow in /examples/research_projects/decision_transformer
Bumps [pillow](https://github.com/python-pillow/Pillow ) from 10.0.1 to 10.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases )
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst )
- [Commits](https://github.com/python-pillow/Pillow/compare/10.0.1...10.2.0 )
---
updated-dependencies:
- dependency-name: pillow
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 19:58:34 +01:00
dependabot[bot]
4f3c7af489
Bump torch from 1.9.0+cpu to 1.13.1 in /examples/research_projects/jax-projects/hybrid_clip ( #21167 )
...
Bump torch in /examples/research_projects/jax-projects/hybrid_clip
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.9.0+cpu to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 18:37:55 +01:00
dependabot[bot]
6f465d45d9
Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/decision_transformer ( #21171 )
...
Bump torch in /examples/research_projects/decision_transformer
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.11.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 18:16:25 +01:00
Fraser Mince
5090ea3f68
Fix llava half precision and autocast issues ( #29721 )
...
* Ensure input_embeds and image_features are the same dtype in autocast
* Fix nans in half precision llava-next and fix autocasting behavior.
* Fix styling issues.
* fix randn newline instantiation
* fix broken slow llava test
* Fix llava next init.
* fix styling issues
* [run-slow]llava,llava_next
* fix styling issues
2024-05-01 17:49:44 +01:00
Joao Gante
d57ffb487f
Generate: remove deprecated public decoding functions and streamline logic 🧼 ( #29956 )
2024-05-01 17:38:44 +01:00
NielsRogge
dc401d3a4e
Improve object detection task guideline ( #29967 )
...
* Add improvements
* Address comment
2024-05-01 17:58:01 +02:00
amyeroberts
d2feb54591
Fix image segmentation example - don't reopen image ( #30481 )
...
Fix image segmentation example - don't reopen image
2024-05-01 16:52:57 +01:00
dependabot[bot]
6e0cba3cec
Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/visual_bert ( #21172 )
...
Bump torch in /examples/research_projects/visual_bert
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.6.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:40:54 +01:00
dependabot[bot]
ce66c0e989
Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/codeparrot ( #21170 )
...
Bump torch in /examples/research_projects/codeparrot
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.11.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:40:19 +01:00
dependabot[bot]
7a29c577e8
Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/lxmert ( #21174 )
...
Bumps [torch](https://github.com/pytorch/pytorch ) from 1.6.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases )
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md )
- [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1 )
---
updated-dependencies:
- dependency-name: torch
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:39:55 +01:00
dependabot[bot]
b33f01fe6b
Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/lxmert ( #30584 )
...
Bumps [pyarrow](https://github.com/apache/arrow ) from 1.0.1 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0 )
---
updated-dependencies:
- dependency-name: pyarrow
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:38:07 +01:00
dependabot[bot]
0ec3003ae9
Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/visual_bert ( #30583 )
...
Bump pyarrow in /examples/research_projects/visual_bert
Bumps [pyarrow](https://github.com/apache/arrow ) from 1.0.1 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0 )
---
updated-dependencies:
- dependency-name: pyarrow
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:37:54 +01:00
dependabot[bot]
aefbdfe8cf
Bump pyarrow from 7.0.0 to 15.0.0 in /examples/research_projects/decision_transformer ( #30582 )
...
Bump pyarrow in /examples/research_projects/decision_transformer
Bumps [pyarrow](https://github.com/apache/arrow ) from 7.0.0 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/go/v7.0.0...go/v15.0.0 )
---
updated-dependencies:
- dependency-name: pyarrow
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:37:40 +01:00
dependabot[bot]
7164171212
Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/distillation ( #30586 )
...
Bump gitpython in /examples/research_projects/distillation
Bumps [gitpython](https://github.com/gitpython-developers/GitPython ) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases )
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES )
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41 )
---
updated-dependencies:
- dependency-name: gitpython
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:36:57 +01:00
dependabot[bot]
ff8f624542
Bump grpcio from 1.44.0 to 1.53.2 in /examples/research_projects/decision_transformer ( #30585 )
...
Bump grpcio in /examples/research_projects/decision_transformer
Bumps [grpcio](https://github.com/grpc/grpc ) from 1.44.0 to 1.53.2.
- [Release notes](https://github.com/grpc/grpc/releases )
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md )
- [Commits](https://github.com/grpc/grpc/compare/v1.44.0...v1.53.2 )
---
updated-dependencies:
- dependency-name: grpcio
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:35:52 +01:00
dependabot[bot]
b71f512823
Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/decision_transformer ( #30587 )
...
Bump gitpython in /examples/research_projects/decision_transformer
Bumps [gitpython](https://github.com/gitpython-developers/GitPython ) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases )
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES )
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41 )
---
updated-dependencies:
- dependency-name: gitpython
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:30:24 +01:00
Pedro Cuenca
f4f18afde8
Gemma: update activation warning ( #29995 )
...
* Gemma: only display act. warning when necessary
This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.
Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".
* Change message, and set `config.hidden_activation`
2024-05-01 17:23:38 +02:00
amyeroberts
bbaa8ceff6
Fix canonical model --model_type in examples ( #30480 )
...
Fix --model_type in examples
2024-05-01 15:47:05 +01:00
Arthur
3c69d81eeb
remove jax example ( #30498 )
...
remove example
2024-05-01 16:34:57 +02:00
Matt
1e05671d21
Fix QA example ( #30580 )
...
* Handle cases when CLS token is absent
* Use BOS token as a fallback
2024-05-01 08:43:02 +01:00
Matt
4b4da18f53
Refactor default chat template warnings ( #30551 )
...
* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates
* make fixup
* Move the default chat template warning into apply_chat_template itself
* make fixup
2024-05-01 08:42:11 +01:00
Raushan Turganbay
4bc9cb36b7
Fix Marian model conversion ( #30173 )
...
* fix marian model conversion
* uncomment that line
* remove unnecessary code
* revert tie_weights, doesn't hurt
2024-05-01 12:33:12 +05:00
Raushan Turganbay
38a4bf79ad
Encoder-decoder models: move embedding scale to nn.Module ( #30410 )
...
* move scaling to nn.Module
* let the test be here for now (need to fix)
* failing tests
* last failing models
* Revert commit 4c14817f38
* clean-up
* oops forgot
* codestyle
* raise NotImplemented when possible
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* skip tests in respective modeling files
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 12:33:00 +05:00
Raushan Turganbay
9d31b32e9d
Use text config's vocab size in testing models ( #30568 )
...
use text config's vocab size
2024-05-01 12:32:45 +05:00
Yih-Dar
78fdd64dcf
Remove `use_square_size` after loading ( #30567 )
...
* fix
* add test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-30 21:11:37 +02:00
Yih-Dar
87927b248e
General PR slow CI ( #30540 )
...
* More general PR slow CI
* Update utils/pr_slow_ci_models.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-30 21:05:09 +02:00
Raushan Turganbay
b8ac4d035c
Fix generation doctests ( #30263 )
...
* fix doctest
* fix torch doctest
* make CI happy
* raise error
* make fixup
2024-04-30 21:02:26 +02:00
DarshanDeshpande
2ecefc3959
Add chat templating support for KeyDataset in text-generation pipeline ( #30558 )
...
* added chat templating support for keydataset in generation pipeline
* fixed and improved test
* fix formatting test failures
* Fix tests
* Fix tests
2024-04-30 19:51:41 +01:00