AI Native — AI hardware brief, 2026-06-30

The real story this week isn't in the papers or the funding—it's the infrastructure quietly maturing. vLLM v0.24.0 and the llama.cpp iterations represent the unsexy but critical work of making open-model inference actually competitive with proprietary APIs; that's where real AI hardware economics get decided, not in another robotics pretraining paper. X Square Robot hitting $2.8B on "physical AI foundation models" is hype until someone shows actual generalization across tasks—compare that to the actual technical progress in GROW² (grounding tool use in robots), which is solving a real constraint, or the async training paper proving you don't need perfect synchronization for LLM scale, which directly impacts hardware utilization and thus chip economics. Skip the creator economy data-moat narrative and the corporate shuffleboard—watch whether Micron's Anthropic deal signals real memory bottleneck awareness in the industry, because *that's* the hardware constraint nobody's talking about yet.

Today in AI hardware