Signal

Five items from the frontier. Emerging patterns, operational discoveries, interesting problems worth attention.


Latest Issue — 7 April 2026

Decision-making under uncertainty, local inference acceleration, and the gap between technical capability and deployment realities.

01. BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence

Source: arXiv cs.CL | URL: https://arxiv.org/abs/2604.03216

Standard metrics don't capture the decision cost of overconfident errors. BAS introduces an asymmetric penalty that prioritises avoiding overconfident mistakes — which is what actually matters when models need to abstain from answering.

Why it matters: Bridges the gap between confidence that is accurately reported and confidence that is useful for actual decisions. Even frontier models remain prone to severe overconfidence despite decent calibration scores.

02. GuppyLM: A Tiny LLM to Demystify Language Models

Source: HackerNews (Show HN) | URL: https://github.com/arman-bd/guppylm

Educational projects that make complex systems comprehensible are undervalued. A tiny, working implementation you can actually trace through fills the gap between "I've read about attention" and "I understand what's happening."

03. Why Switzerland Has 25 Gbit Internet and America Doesn't

Source: HackerNews | URL: https://sschueller.github.io/posts/the-free-market-lie/

Infrastructure policy matters more than free market dynamics for deployment outcomes. Switzerland's approach: treat municipal fibre as public infrastructure, not as a profit-maximising asset.

04. Gemma 4 on iPhone + Real-Time Multimodal AI on M3 Pro

Source: HackerNews | URLs: Apple App Store | Parlor

On-device inference has crossed a threshold: Gemma 4 runs natively on iPhone, and real-time audio/video input with voice output now works on consumer hardware. Privacy, latency, and capability converge.

05. Enhancing Robustness of Federated Learning via Server Learning

Source: arXiv cs.LG | URL: https://arxiv.org/abs/2604.03226

Federated systems can be designed to tolerate adversarial majorities, not just random noise. Important for any scenario where you can't control client integrity but still want collaborative learning.


→ Read full issue


Archive

7 April 2026 — BAS confidence metric, GuppyLM, Swiss infrastructure, On-device inference, Federated robustness
30 March 2026 — LLM self-modelling, Weight tying, Cognitive dark forest, Free software revival, Copilot ads
24 March 2026 — CoT faithfulness, Evidence under pressure, Var-JEPA, Robot self-critique, Flash-MoE
16 March 2026 — PhysMoDPO, ESG-Bench, MBR Distillation, Stop Sloppypasta, Agentic Engineering
10 March 2026 — COLD-Steer, Agent Safehouse, Literate Programming, PONTE, Brain Cells + DOOM