Posts

  • Personalizing LLMs for High-Stakes Decisions: Four Lessons

    Most LLM personalization research assumes the easy case: writing style, tone, topical preferences. What happens when personalization has to survive real consequences — when “getting the user what they want” and “getting the user what they need” actively diverge?

  • Adaptive LoRA Rank Allocation Works for SFT. It Fails Under GRPO. Here's Why.

    If you’ve fine-tuned a large language model in the last two years, you’ve probably used LoRA. The trick is so good and so cheap that it feels like a cheat code: instead of updating the model’s billions of parameters, you train a pair of tiny low-rank matrices alongside each weight matrix you care about. The model behaves as if you’d fine-tuned the whole thing, but you’ve touched maybe one or two percent of the parameter count.
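
    A minimal sketch of what that trick looks like, assuming a plain PyTorch linear layer (the class name and hyperparameters here are illustrative, not any particular library's API):

    ```python
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen pretrained weight plus a trainable low-rank update: W + (alpha/r) * B A."""
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            self.base.weight.requires_grad_(False)   # the big pretrained matrix stays frozen
            d_out, d_in = base.weight.shape
            self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # tiny down-projection
            self.B = nn.Parameter(torch.zeros(d_out, rank))        # tiny up-projection, zero init
            self.scaling = alpha / rank

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Equivalent to running x through (W + scaling * B A); only A and B get gradients.
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
    ```

    For a 4096 × 4096 weight, rank 8 trains 8 × (4096 + 4096) ≈ 65k parameters against that layer's 16.8M, about 0.4 percent of the matrix you'd otherwise update.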

  • Gradient-Based LoRA Rank Allocation Fails in GRPO

    Adaptive rank allocation for LoRA — giving more capacity to layers that “matter” and less to layers that don’t — is one of those ideas that keep getting validated. AdaLoRA, GoRA, IGU-LoRA, Aletheia, ILA — every recent paper says the same thing: profile the gradients, allocate rank where the gradients are large, save parameters, get the same accuracy.
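
    The shared recipe is simple enough to sketch. Below is a hypothetical minimal version (function names, the budget parameter, and the clipping bounds are illustrative, not drawn from any of the papers above): read per-layer gradient norms after a backward pass on calibration data, then split a fixed rank budget across layers in proportion to them.

    ```python
    import torch

    def profile_grad_norms(model: torch.nn.Module) -> dict[str, float]:
        """Per-parameter gradient norms, read after a backward pass on calibration batches."""
        return {name: p.grad.norm().item()
                for name, p in model.named_parameters()
                if p.grad is not None}

    def allocate_ranks(grad_norms: dict[str, float],
                       rank_budget: int,
                       r_min: int = 2,
                       r_max: int = 64) -> dict[str, int]:
        """Hand out rank in proportion to gradient magnitude, under a fixed total budget."""
        total = sum(grad_norms.values())
        ranks = {}
        for name, g in grad_norms.items():
            share = g / total if total > 0 else 1.0 / len(grad_norms)
            ranks[name] = max(r_min, min(r_max, round(share * rank_budget)))
        return ranks
    ```

    The post's argument is that this profile, wherever you take it from, stops being predictive once the training signal comes from GRPO rather than supervised fine-tuning.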
