Posts
-
Personalizing LLMs for High-Stakes Decisions: Four Lessons
Most LLM personalization research assumes the easy case: writing style, tone, topical preferences. What happens when personalization has to survive real consequences — when “getting the user what they want” and “getting the user what they need” actively diverge?
-
Adaptive LoRA Rank Allocation Works for SFT. It Fails Under GRPO. Here's Why.
If you’ve fine-tuned a large language model in the last two years, you’ve probably used LoRA. The trick is so good and so cheap that it feels like a cheat code: instead of updating the model’s billions of parameters, you train a tiny pair of low-rank matrices alongside each weight matrix you care about. The model behaves as if you’d fine-tuned the whole thing, but you’ve touched maybe one or two percent of the parameter count.
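To make that concrete, here is a minimal sketch of the mechanism in PyTorch. It is illustrative rather than any particular library's implementation: the `LoRALinear` name, the init scheme, and the default hyperparameters are assumptions for the example.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update.

    The effective weight is W + (alpha / r) * B @ A, where A (r x in)
    and B (out x r) are the only parameters that receive gradients.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

For a single 4096x4096 projection, r=8 trains about 65K parameters against 16.8M frozen ones; the overall fraction you end up training depends on the rank you choose and on which weight matrices you adapt.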
-
Gradient-Based LoRA Rank Allocation Fails in GRPO
Adaptive rank allocation for LoRA — giving more capacity to layers that “matter” and less to layers that don’t — is one of those ideas that keeps getting validated. AdaLoRA, GoRA, IGU-LoRA, Aletheia, ILA — every recent paper says the same thing: profile the gradients, allocate rank where the gradients are large, save parameters, get the same accuracy.
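The shared recipe is simple enough to sketch. The following is a hedged illustration, not any of those papers' exact algorithms: `profile_grad_norms` and `allocate_ranks` are made-up names, each method's actual importance score and reallocation schedule differ, and the proportional rounding here is only a stand-in for their budgeting rules.

```python
import torch

def profile_grad_norms(model, batches, loss_fn):
    """Accumulate per-parameter gradient norms over a few profiling batches.
    Assumes loss_fn(model, batch) returns a scalar loss (illustrative)."""
    norms = {n: 0.0 for n, p in model.named_parameters() if p.requires_grad}
    for batch in batches:
        model.zero_grad()
        loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                norms[n] += p.grad.norm().item()
    return norms

def allocate_ranks(grad_norms, total_rank_budget, min_rank=2):
    """Split a fixed rank budget across layers in proportion to their
    accumulated gradient norms: more rank where gradients are large."""
    total = sum(grad_norms.values()) or 1.0
    return {
        name: max(min_rank, round(g / total * total_rank_budget))
        for name, g in grad_norms.items()
    }
```

The point of the sketch is the shape of the method: a profiling pass under the training objective, then a one-shot or periodic reallocation of rank. The post argues that the profiling step stops being informative when the objective is GRPO.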