The real move isn't in the papers. It's that RL without ground truth and probability-aware decoding are quietly becoming table stakes while everyone obsesses over scaling. Scaled Cognition's $100M bet on hallucination mitigation signals what the market actually fears: LLMs that sound confident while being wrong, which no increase in parameter count fixes. The llama.cpp churn (five releases in days) shows inference optimization is still the unglamorous bottleneck that matters: getting capable models to run cheaply on available hardware beats waiting for the next architecture paper. Watch the RL + uncertainty work; ignore the orbital congestion report unless you're building satellites.