AI Curated Updates
Smart Score 65
Doc-to-LoRA and Text-to-LoRA Research Published
Why AI Recommends This
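Compared with direct fine-tuning, this approach amortizes cost so that customization becomes a single forward pass, and it can also transfer visual information across modalities.
Key Takeaways
Sakana AI proposes Doc-to-LoRA and Text-to-LoRA, which use a hypernetwork to generate LoRA adapters for fast model customization. On a needle-in-a-haystack task, Doc-to-LoRA reaches near-perfect accuracy on instances five times longer than the base model's context window, and both methods run with sub-second latency. The code and papers have been released.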
Full Text
Sakana AI research scientist Rujikorn (Tan) Charakorn recently presented Doc-to-LoRA at @MLCollective’s DLCT journal club, covering hypernetworks, cost amortization, and future directions. A very lively discussion followed. Many thanks to the organizers!
https://www.youtube.com/watch?v=jb_0XcBMJQU https://t.co/NB6pJAsVyr

> **Quoted post from Sakana AI (@SakanaAILabs):**
> We’re excited to introduce Doc-to-LoRA and Text-to-LoRA, two related research projects exploring how to make LLM customization faster and more accessible.
> https://t.co/ApVzVsBuv1
> By training a Hypernetwork to generate LoRA adapters on the fly, these methods allow models to instantly internalize new information or adapt to new tasks.
> Biological systems naturally rely on two key cognitive abilities: durable long-term memory to store facts, and rapid adaptation to handle new tasks given limited sensory cues. While modern LLMs are highly capable, they still lack this flexibility. Traditionally, adding long-term memory or adapting an LLM to a specific downstream task requires an expensive and time-consuming model update, such as fine-tuning or context distillation, or relies on memory-intensive long prompts.
> To bypass these limitations, our work focuses on the concept of cost amortization. We pay the meta-training cost once to train a hypernetwork capable of producing task- or document-specific LoRAs on demand. This turns what used to be a heavy engineering pipeline into a single, inexpensive forward pass. Instead of performing per-task optimization, the hypernetwork meta-learns update rules to instantly modify an LLM given a new task description or a long document.
> In our experiments, Text-to-LoRA successfully specializes models to unseen tasks using just a natural language description. Building on this, Doc-to-LoRA is able to internalize factual documents. On a needle-in-a-haystack task, Doc-to-LoRA achieves near-perfect accuracy on instances five times longer than the base model's context window. It can even generalize to transfer visual information from a vision-language model into a text-only LLM, allowing it to classify images purely through internalized weights.
> Importantly, both methods run with sub-second latency, enabling rapid experimentation while avoiding the overhead of traditional model updates. This approach is a step towards lowering the technical barriers of model customization, allowing end-users to specialize foundation models via simple text inputs. We have released our code and papers for the community to explore.
> Doc-to-LoRA
> Paper: https://t.co/87xEEpf0GN
> Code: https://t.co/zBfQi2L9LW
> Text-to-LoRA
> Paper: https://t.co/emLRZ4Vdvo
> Code: https://t.co/b9mrdoWWRB
> https://x.com/SakanaAILabs/status/2027240298666209535
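
The post describes a hypernetwork that emits a LoRA adapter in a single forward pass but does not spell out the architecture. The toy sketch below illustrates only the general idea, and everything in it is an assumption for illustration rather than Sakana AI's implementation: the sizes, the two-layer hypernetwork, the names `generate_lora` and `adapted_forward`, and the use of a random vector in place of a real task or document embedding. The hypernetwork weights are random here, whereas in the described setting they would be meta-trained once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen arbitrarily for illustration.
D_MODEL, RANK, D_EMB, D_HID = 16, 2, 8, 32

# Frozen weight of one linear layer in the base model (never updated).
W_base = rng.normal(size=(D_MODEL, D_MODEL))

# Hypothetical hypernetwork parameters. In the setting described in the post
# these would be meta-trained once; here they are random placeholders.
H1 = rng.normal(scale=0.1, size=(D_EMB, D_HID))
H_A = rng.normal(scale=0.1, size=(D_HID, RANK * D_MODEL))
H_B = rng.normal(scale=0.1, size=(D_HID, D_MODEL * RANK))


def generate_lora(context_emb):
    """One hypernetwork forward pass: context embedding -> LoRA factors (A, B)."""
    h = np.tanh(context_emb @ H1)
    A = (h @ H_A).reshape(RANK, D_MODEL)   # down-projection, shape (r, d)
    B = (h @ H_B).reshape(D_MODEL, RANK)   # up-projection, shape (d, r)
    return A, B


def adapted_forward(x, A, B, scale=1.0):
    """Base layer plus low-rank update: x W^T + scale * (x A^T) B^T."""
    return x @ W_base.T + scale * (x @ A.T) @ B.T


# Stand-in for an embedding of a task description or document (an assumption).
doc_emb = rng.normal(size=D_EMB)
A, B = generate_lora(doc_emb)          # no gradient steps, just a forward pass

x = rng.normal(size=(3, D_MODEL))      # batch of 3 toy activations
y = adapted_forward(x, A, B)
```

The adapter is produced without any per-task optimization, which is where the amortization comes from: the expensive step is training the hypernetwork, not generating each new adapter. Because the update `B @ A` has rank at most `RANK`, it can also be merged into `W_base` after generation.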