transformers/docker
Younes Belkada fdb85be40f
Faster generation using AWQ + Fused modules (#27411)
* v1 fusing modules

* add fused mlp support

* up

* fix CI

* block save_pretrained

* fixup

* small fix

* add new condition

* add v1 docs

* add some comments

* style

* fix nit

* adapt from suggestion

* add check

* change arg names

* change variables name

* Update src/transformers/integrations/awq.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* style

* split up into 3 different private methods

* more conditions

* more checks

* add fused tests for custom models

* fix

* fix tests

* final update docs

* final fixes

* fix importlib metadata

* Update src/transformers/utils/quantization_config.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change it to `do_fuse`

* nit

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* few fixes

* revert

* fix test

* fix copies

* raise error if model is not quantized

* add test

* use quantization_config.config when fusing

* Update src/transformers/modeling_utils.py

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2023-12-05 12:14:45 +01:00
transformers-all-latest-gpu Faster generation using AWQ + Fused modules (#27411) 2023-12-05 12:14:45 +01:00
transformers-cpu TF: TF 2.10 unpin + related onnx test skips (#18995) 2022-09-12 19:30:27 +01:00
transformers-doc-builder Don't install pytorch-quantization in Doc Builder docker file (#26622) 2023-10-05 16:57:50 +02:00
transformers-gpu TF: TF 2.10 unpin + related onnx test skips (#18995) 2022-09-12 19:30:27 +01:00
transformers-past-gpu Byebye pytorch 1.9 (#24080) 2023-06-16 16:38:23 +02:00
transformers-pytorch-amd-gpu Add RoCm scheduled CI & upgrade RoCm CI to PyTorch 2.1 (#26940) 2023-11-21 14:55:13 +01:00
transformers-pytorch-cpu Adding Docker images for transformers + notebooks (#3051) 2020-03-04 11:45:57 -05:00
transformers-pytorch-deepspeed-latest-gpu Update docker files to use torch==2.1.0 (#26735) 2023-10-11 16:23:36 +02:00
transformers-pytorch-deepspeed-nightly-gpu Fix DeepSpeed stuff in the nightly CI (#23478) 2023-05-19 20:31:55 +02:00
transformers-pytorch-gpu Fix transformers-pytorch-gpu docker build (#26615) 2023-10-05 15:33:35 +02:00
transformers-pytorch-tpu Rename master to main for notebooks links and leftovers (#16397) 2022-03-25 09:12:23 -04:00
transformers-tensorflow-cpu TF: TF 2.10 unpin + related onnx test skips (#18995) 2022-09-12 19:30:27 +01:00
transformers-tensorflow-gpu Update TF pin in docker image (#25343) 2023-08-07 12:32:34 +02:00