transformers/docs/source
Tim Dettmers 9d73b92269
4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479)
* Added lion and paged optimizers and made original tests pass.

* Added tests for paged and lion optimizers.

* Added and fixed optimizer tests.

* Style and quality checks.

* Initial draft. Some tests fail.

* Fixed dtype bug.

* Fixed bug caused by torch_dtype='auto'.

* All tests green for 8-bit and 4-bit layers.

* Added fix for fp32 layer norms and bf16 compute in LLaMA.

* Fixing issues for PR #23479.

* Reverted variable name change.

* Added missing tests.

* Fixup changes.

* Added fixup changes.

* Renamed some variables that were missed earlier.

* revert trainer tests

* revert test trainer

* another revert

* fix tests and safety checkers

* protect import

* simplify a bit

* Update src/transformers/trainer.py

* few fixes

* add warning

* replace with `load_in_kbit = load_in_4bit or load_in_8bit`

* fix test

* fix tests

* this time fix tests

* safety checker

* add docs

* revert torch_dtype

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* multiple fixes

* update docs

* version checks and multiple fixes

* replace `is_loaded_in_kbit`

* replace `load_in_kbit`

* change methods names

* better checks

* oops

* oops

* address final comments

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-24 12:52:45 +02:00
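
For context, a minimal sketch of the 4-bit loading path this commit introduces, using the `BitsAndBytesConfig` options for NF4 quantization, double quantization, and a bfloat16 compute dtype; the checkpoint name is only a placeholder, any causal LM checkpoint works.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization settings: NF4 quant type, nested (double) quantization,
# and bf16 compute (matching the fp32 layer-norm / bf16 compute fix above).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "facebook/opt-350m"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```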
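
Similarly, a short sketch of selecting the paged and Lion optimizers added alongside this change via `TrainingArguments.optim`; the output directory and batch size below are placeholder values.

```python
from transformers import TrainingArguments

# "paged_adamw_8bit" is one of the new bitsandbytes-backed optimizer choices;
# others include "paged_adamw_32bit", "lion_8bit", "lion_32bit",
# "paged_lion_8bit", and "paged_lion_32bit".
training_args = TrainingArguments(
    output_dir="outputs",            # placeholder path
    per_device_train_batch_size=4,   # placeholder value
    optim="paged_adamw_8bit",
)
```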
de Flax Regnet (#21867) 2023-04-04 12:41:12 -04:00
en 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) 2023-05-24 12:52:45 +02:00
es Update feature selection in to_tf_dataset (#21935) 2023-04-24 17:34:30 +01:00
fr Flax Regnet (#21867) 2023-04-04 12:41:12 -04:00
it Deprecate xpu_backend for ddp_backend (#23085) 2023-05-01 09:44:47 -04:00
ja Add Japanese translation to accelerate.mdx (#23232) 2023-05-09 10:51:43 -04:00
ko 🌐 [i18n-KO] Translated tasks/monocular_depth_estimation.mdx to Korean (#23621) 2023-05-23 15:54:39 +02:00
pt Remove erroneous img closing tag (#23646) 2023-05-22 09:28:26 -04:00
zh Flax Regnet (#21867) 2023-04-04 12:41:12 -04:00
_config.py Adding evaluate to the list of libraries required in generated notebooks (#20850) 2022-12-21 14:04:08 +01:00