LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents
Paper
• 2602.01053 • Published
LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents
Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment