transformers/docker
Elvir Crnčević 845b0a2616
Efficient Inference Kernel for SpQR (#34976)
* Resolve vptq conflict

* Rename spqr package to spqr_quant

* Get rid of aqlm mention

* Start working on tests

* Resolve ruff code checks

* Ruff format

* Isort

* Test updates

* Add gpu tag

* Rename to modules_to_not_convert

* Config update

* Docs and config update

* Docs and config update

* Update to update_torch_dtype

* spqr config parameter validation

* Ruff update

* Apply ruff fixes

* Test fixes

* Ruff update

* Mark tests as @slow again; Ruff; Docstring update

* Ruff

* Remove absolute path

* Resolve typo

* Remove redundandt log

* Check accelerate/spqr availability

* Ruff fix

* Check if the config contains proper shapes

* Ruff test

* Documentation update

* overview update

* Ruff checks

* Ruff code quality

* Make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update spqr.md

* Enable gptqmodel (#35012)

* gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update readme

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix warning

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix version check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert unrelated changes

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix requires gptq

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format again

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attibutes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix memory check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix device mismatch

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix result check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* update tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix : Nemotron Processor in GGUF conversion (#35708)

* fixing nemotron processor

* make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing TOC to doc

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-13 16:22:58 +01:00
..
transformers-all-latest-gpu use torch 2.6 for daily CI (#35985) 2025-01-31 18:58:23 +01:00
transformers-doc-builder Use python 3.10 for docbuild (#28399) 2024-01-11 14:39:49 +01:00
transformers-gpu TF: TF 2.10 unpin + related onnx test skips (#18995) 2022-09-12 19:30:27 +01:00
transformers-past-gpu DeepSpeed github repo move sync (#36021) 2025-02-05 08:19:31 -08:00
transformers-pytorch-amd-gpu Add git LFS to AMD docker image (#36016) 2025-02-12 22:27:21 +01:00
transformers-pytorch-deepspeed-amd-gpu Use rocm6.2 for AMD images (#35930) 2025-01-28 11:10:28 +01:00
transformers-pytorch-deepspeed-latest-gpu update docker file transformers-pytorch-deepspeed-latest-gpu (#35940) 2025-01-29 16:01:27 +01:00
transformers-pytorch-deepspeed-nightly-gpu DeepSpeed github repo move sync (#36021) 2025-02-05 08:19:31 -08:00
transformers-pytorch-gpu use torch 2.6 for daily CI (#35985) 2025-01-31 18:58:23 +01:00
transformers-pytorch-tpu Rename master to main for notebooks links and leftovers (#16397) 2022-03-25 09:12:23 -04:00
transformers-quantization-latest-gpu Efficient Inference Kernel for SpQR (#34976) 2025-02-13 16:22:58 +01:00
transformers-tensorflow-gpu pin tensorflow_probability<0.22 in docker files (#34381) 2024-10-28 11:59:46 +01:00
consistency.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
custom-tokenizers.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
examples-tf.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
examples-torch.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
exotic-models.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
jax-light.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
pipeline-tf.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
pipeline-torch.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
quality.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
README.md Add documentation for docker (#33156) 2024-10-14 11:58:45 +02:00
tf-light.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
torch-jax-light.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
torch-light.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00
torch-tf-light.dockerfile CircleCI with python 3.9 (#36027) 2025-02-04 17:40:20 +01:00

Dockers for transformers

In this folder you will find various docker files, and some subfolders.

  • dockerfiles (ex: consistency.dockerfile) present under ~/docker are used for our "fast" CIs. You should be able to use them for tasks that only need CPU. For example torch-light is a very light weights container (703MiB).
  • subfloder contain dockerfiles used for our slow CIs, which can be used for GPU tasks, but they are BIG as they were not specifically designed for a single model / single task. Thus the ~/docker/transformers-pytorch-gpu includes additional dependencies to allow us to run ALL model tests (say librosa or tesseract, which you do not need to run LLMs)

Note that in both case, you need to run uv pip install -e ., which should take around 5 seconds. We do it outside the dockerfile for the need of our CI: we checkout a new branch each time, and the transformers code is thus updated.

We are open to contribution, and invite the community to create dockerfiles with potential arguments that properly choose extras depending on the model's dependencies! 🤗