Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

1Shanghai Jiao Tong University
2Harvard University

*Equal Contribution

A simple illustration of LoRA-Dash.

Abstract

Large language models demonstrate impressive performance on downstream tasks, yet fully fine-tuning all of their parameters requires extensive computational resources. To mitigate this, Parameter Efficient Fine-Tuning (PEFT) strategies, such as LoRA, have been developed. In this paper, we delve into the concept of task-specific directions (TSD), which are critical for transitioning large models from their pretrained states to task-specific enhancements in PEFT. We propose a framework that clearly defines these directions and explores their properties and the practical challenges of utilizing them. We then introduce a novel approach, LoRA-Dash, which aims to maximize the impact of task-specific directions during the fine-tuning process, thereby enhancing model performance on targeted tasks. Extensive experiments demonstrate the effectiveness of LoRA-Dash, and in-depth analyses further reveal its underlying mechanisms.


LoRA-Dash

Method Overview

LoRA-Dash comprises two phases: a pre-launch phase, which identifies the task-specific directions (TSD), and a dash phase, which maximizes their potential, as sketched below.
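This page does not spell out how the two phases are implemented, so the following is only a minimal PyTorch-style sketch, not the authors' code. It assumes that candidate directions come from the singular vectors of the frozen pretrained weight, that the pre-launch phase selects the k directions whose coordinates change most under an initial LoRA update, and that the dash phase adds a learnable coefficient along each selected direction. The class name `LoRADashLinear` and the hyperparameters `r`, `k`, and `alpha` are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn


class LoRADashLinear(nn.Module):
    """Wraps a frozen nn.Linear with LoRA factors plus a learnable
    "dash" term along k hypothesized task-specific directions."""

    def __init__(self, base: nn.Linear, r: int = 8, k: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        out_f, in_f = base.weight.shape
        # Standard LoRA factors: delta_W = B @ A, scaled by alpha / r.
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, r))
        self.scaling = alpha / r
        # Candidate directions u_i v_i^T from the SVD of the frozen weight
        # (an assumption about how TSD are parameterized; see lead-in).
        U, S, Vh = torch.linalg.svd(base.weight.detach(), full_matrices=False)
        self.register_buffer("U", U)    # (out_f, m)
        self.register_buffer("Vh", Vh)  # (m, in_f)
        self.k = k
        self.register_buffer("tsd_idx", torch.zeros(k, dtype=torch.long))
        # Learnable magnitude along each selected direction (dash phase).
        self.dash = nn.Parameter(torch.zeros(k))
        self.dash_active = False

    @torch.no_grad()
    def identify_tsd(self):
        """Pre-launch phase: select the k directions whose coordinates
        change most under the current LoRA update delta_W = scaling * B @ A."""
        delta_w = self.scaling * (self.B @ self.A)
        change = (self.U.T @ delta_w @ self.Vh.T).diagonal().abs()
        self.tsd_idx = change.topk(self.k).indices
        self.dash_active = True

    def forward(self, x):
        # Frozen base output plus the low-rank LoRA update.
        out = self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)
        if self.dash_active:
            u = self.U[:, self.tsd_idx]   # (out_f, k)
            v = self.Vh[self.tsd_idx]     # (k, in_f)
            # Dash phase: add dash_i * u_i v_i^T x for each selected direction.
            out = out + ((x @ v.T) * self.dash) @ u.T
        return out
```

A hypothetical training loop would warm up the LoRA factors for a few steps, call `identify_tsd()` once, and then continue fine-tuning with `dash` updated jointly with `A` and `B`.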


Results

Fine-tuning LLaMA-7B, LLaMA2-7B, and LLaMA3-8B on Commonsense Reasoning Tasks
Compared with LoRA



Fine-tuning LLaMA-7B, LLaMA2-7B, and LLaMA3-8B on Commonsense Reasoning Tasks
Compared with Other Methods



Fine-tuning DeBERTaV3-base and DeBERTaV3-large on Natural Language Understanding Tasks
Compared with LoRA



Fine-tuning DeBERTaV3-base on Natural Language Understanding Tasks
Compared with Other Methods



Fine-tuning SDXL on Subject-driven Generation Tasks

BibTeX

@article{si2024unleashing,
  title={Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning},
  author={Si, Chongjie and Shi, Zhiyi and Zhang, Shifan and Yang, Xiaokang and Pfister, Hanspeter and Shen, Wei},
  journal={arXiv preprint arXiv:2409.01035},
  year={2024}
}