transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-20 13:08:21 +06:00

Tim Dettmers 9d73b92269 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479)	2023-05-24 12:52:45 +02:00

* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All test green for 8-bit and 4-bit layers.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Fixing issues for PR #23479.
* Reverted variable name change.
* Added missing tests.
* Fixup changes.
* Added fixup changes.
* Missed some variables to rename.
* revert trainer tests
* revert test trainer
* another revert
* fix tests and safety checkers
* protect import
* simplify a bit
* Update src/transformers/trainer.py
* few fixes
* add warning
* replace with `load_in_kbit = load_in_4bit or load_in_8bit`
* fix test
* fix tests
* this time fix tests
* safety checker
* add docs
* revert torch_dtype
* Apply suggestions from code review
* multiple fixes
* update docs
* version checks and multiple fixes
* replace `is_loaded_in_kbit`
* replace `load_in_kbit`
* change methods names
* better checks
* oops
* address final comments

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

..
agent.mdx	Add local agent (#23438)	2023-05-18 11:09:55 -04:00
callback.mdx	Added documentation for DagsHubCallback (#21452)	2023-02-06 09:24:18 -05:00
configuration.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
data_collator.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
deepspeed.mdx	[deepspeed zero3] need `generate(synced_gpus=True, ...)` (#22242)	2023-03-22 12:18:57 -07:00
feature_extractor.mdx	Update feature extractor docs (#18324)	2022-07-27 15:32:57 -05:00
image_processor.mdx	AutoImageProcessor (#20111)	2022-11-08 19:54:41 +00:00
keras_callbacks.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
logging.mdx	logging documentation update (#17174)	2022-05-16 16:47:28 -04:00
model.mdx	Generate: move generation_*.py src files into generation/*.py (#20096)	2022-11-09 15:34:08 +00:00
onnx.mdx	Enable doc in Spanish (#16518)	2022-04-04 10:25:46 -04:00
optimizer_schedules.mdx	Add inverse sqrt learning rate scheduler (#21495)	2023-02-07 15:00:50 -05:00
output.mdx	Fix typo ; Update output.mdx (#23227)	2023-05-09 09:19:38 -04:00
pipelines.mdx	[Pipeline] Add zero shot audio classificatoin pipeline (#21600)	2023-02-27 11:43:44 +01:00
processors.mdx	Update doc examples feature extractor -> image processor (#20501)	2022-11-30 14:50:55 +00:00
quantization.mdx	4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479)	2023-05-24 12:52:45 +02:00
text_generation.mdx	Generate: basic token streaming (#22449)	2023-03-30 12:00:12 +01:00
tokenizer.mdx	documentation: some minor clean up (#16850)	2022-04-26 16:56:08 -04:00
trainer.mdx	update FSDP and add XLA-FSDP documentation (#21812)	2023-03-01 19:51:07 +05:30