Mirror of https://github.com/huggingface/transformers.git (synced 2025-07-18 12:08:22 +06:00).
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All test green for 8-bit and 4-bit layers.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Fixing issues for PR #23479.
* Reverted variable name change.
* Added missing tests.
* Fixup changes.
* Added fixup changes.
* Missed some variables to rename.
* revert trainer tests
* revert test trainer
* another revert
* fix tests and safety checkers
* protect import
* simplify a bit
* Update src/transformers/trainer.py
* few fixes
* add warning
* replace with `load_in_kbit = load_in_4bit or load_in_8bit`
* fix test
* fix tests
* this time fix tests
* safety checker
* add docs
* revert torch_dtype
* Apply suggestions from code review (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* multiple fixes
* update docs
* version checks and multiple fixes
* replace `is_loaded_in_kbit`
* replace `load_in_kbit`
* change methods names
* better checks
* oops
* address final comments

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
internal/
main_classes/
model_doc/
tasks/
_config.py
_toctree.yml
accelerate.mdx
add_new_model.mdx
add_new_pipeline.mdx
add_tensorflow_model.mdx
attention.mdx
autoclass_tutorial.mdx
benchmarks.mdx
bertology.mdx
big_models.mdx
community.mdx
contributing.md
create_a_model.mdx
custom_models.mdx
custom_tools.mdx
debugging.mdx
fast_tokenizers.mdx
generation_strategies.mdx
glossary.mdx
hpo_train.mdx
index.mdx
installation.mdx
model_sharing.mdx
model_summary.mdx
multilingual.mdx
notebooks.md
pad_truncation.mdx
perf_hardware.mdx
perf_infer_cpu.mdx
perf_infer_gpu_many.mdx
perf_infer_gpu_one.mdx
perf_infer_special.mdx
perf_train_cpu_many.mdx
perf_train_cpu.mdx
perf_train_gpu_many.mdx
perf_train_gpu_one.mdx
perf_train_special.mdx
perf_train_tpu_tf.mdx
perf_train_tpu.mdx
performance.mdx
perplexity.mdx
philosophy.mdx
pipeline_tutorial.mdx
pipeline_webserver.mdx
pr_checks.mdx
preprocessing.mdx
quicktour.mdx
run_scripts.mdx
sagemaker.mdx
serialization.mdx
task_summary.mdx
tasks_explained.mdx
testing.mdx
tf_xla.mdx
tokenizer_summary.mdx
torchscript.mdx
training.mdx
transformers_agents.mdx
troubleshooting.mdx