GPT-5
Claude AI Beats Human Robotics Teams 20x: Anthropic Marks Physical AI Turn
Claude AI robotics benchmark shows Opus 4.7 finishing physical robot programming in 9 minutes, against 181 minutes for AI-assisted human teams, in Anthropic Project Fetch Phase Two published June 18.
OpenAI teases GPT-5 release for ChatGPT
OpenAI is having a big week. Yesterday, the leading AI firm dropped not one, but two open-weight models, including one that can run well on Apple silicon Macs. Next? GPT-5 is coming to ChatGPT. OpenAI ...
Show HN: Caliper – pass@k reliability testing for Claude Code and Codex skills
Skills for Claude Code and Codex are hard to test. By hard, I mean there's no standard way to do it. You evaluate the skill once on something, and it looks like it works. You publish it. Then a new super model is released (GLM 5.2, anyone?), the skill quietly breaks in places, and you won't find out until your users complain. I faced the same problem, so I tried to build something lightweight to stop doing that: Caliper. It's a local and lightweight harness that runs a s
Show HN: Chappie – Direct Desktop Search and macOS Control
*Chappie (Desktop Search)* is a local search and control input box. It is optimized for, and excels at, speedily launching apps and opening folders in common locations.
* Chappie owns `⌘+SPACE` when running.
* Quickly launch apps and open folders.
* Press `ENTER` to open, add `⌥` to open in terminal.
* Configure custom shortcuts like `?` to search the web.
* A dozen shortcuts out of the box.
* `/` tab through directories.
* `!` commands (volume, brightness, etc.) and settings.
* `=` to calculate, ENTE
Ask HN: Smallest amount of working ML weights that can be tattooed on a body?
Recently saw this comment on another HN thread about the US government gating access to GPT-5.6 and how it harkens back to the 1990s encryption-as-export-controllable-tech situation, when people tattooed the algorithm on their bodies:
> I can't wait for the first person to tattoo model weights on their body!
(https://news.ycombinator.com/item?id=48693721)
And I can't help but wonder, what would be the smallest functional amount of weights from any sort of ML model you
Show HN: QR code renderer in a TrueType font
In the "Libre Barcode Project" discussion yesterday, 1bpp asked: "Is anyone willing to sacrifice their sanity for the sake of implementing a QR renderer as TTF hinting code?"
Yes. I had some tokens to burn and was curious... turns out, it's possible. This was put together by a mix of Gemini, GPT, and Claude (depending on which usage limits kept running out).
Show HN: role-model, a router for hybrid local/cloud AI
Hey everyone, I'm launching role-model today: a routing protocol, a reference router runtime, and an extension for Pi that allows for better-informed routing decisions. role-model is mostly deterministic, with fallback to a controller model, and routes requests based on a chosen routing strategy. The protocol is structured around assigning domains and roles to models, where requests sent by consumer applications like Pi carry task types to enrich routing metadata and thereby improve accuracy. You ca
Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
Hi everyone, I started working on nanoeuler after the ban of Anthropic's Fable because my ambition and dream is to work in the AI field at Anthropic. The two reasons that led me to create nanoeuler were (1) interfacing with LLMs does not mean understanding how they are composed, and (2) working on LLMs at a very low level is a way to understand the relationship between parameters, data, and the growth of the model, how the GPU works, and how some layers can be optimized. So I starte
Show HN: LLMSim – a fast OpenAI LLM API simulator for load-testing LLM apps
Testing LLM apps and agent frameworks against real APIs is expensive, rate-limited, slow, and non-reproducible. LLMSim is a Rust simulator for the OpenAI Chat Completions and OpenResponses protocols, with other protocols to follow. LLMSim can be used in two forms: as a server, with high concurrency for load/stress testing (~40k req/s with p99 ≈ 5ms on 4 vCPUs, scaling with cores); or embedded directly as a crate in your tests, with no separate process, no network, and deterministic behavior. It focuses
Show HN: Looped Whisper (FOSS) – Voice transcription menubar app for macOS
I built a free, open-source (MIT) macOS menu-bar app that runs Whisper models locally to assist with dictation. There is also the option to use an LLM (BYOK). You hold a global hotkey, speak, and the text gets pasted at your cursor, as in other popular apps. Some details: - Transcription runs locally via WhisperKit (CoreML), so it works offline once the model is downloaded. BYO model: tiny through large-v3, auto-downloaded and cached. The base model has been working surprisingly
Show HN: Agnes AI – Free multimodal API (text, image, video), OpenAI-compatible
Hi HN, I'm Daniel, part of the team at Agnes AI, a Singapore-based AI lab. We've been building quietly for a while and I wanted to share what we've made with this community and get honest feedback from people who actually build things. What is Agnes AI?
Agnes AI gives you access to three models through a single free API. We have text, image and video models. The API is fully OpenAI-compatible, so you can just swap the base URL to https://apihub.agnes-ai.com/v1 and use
Show HN: Smart model routing directly in Claude, Codex and Cursor
We built a model router that plugs into coding agents (e.g. Claude Code, Codex, Cursor) and intelligently sends requests to the best model to serve them. Here's a quick demo of running it locally: https://www.youtube.com/watch?v=isKhAyivtfM. At Weave, we write most of our code with AI, and it's been getting more expensive. This came to a head when Opus 4.7 was released and, thanks to its tokenizer changes, our costs shot up. We knew we didn't need Opus for ever
Isn't the US government trying to monopolize AI as a superpower?
The US government recently directed OpenAI to delay the launch of its Fable 5 rival, GPT 5.6 Sol. This comes after it had already put the global public release of Claude Fable 5 on hold. Isn't this just creating an AI monopoly, where the most powerful AI models are restricted to 'trusted US organizations'?
Show HN: Mcpify – Turn any REST API into an MCP server in one command
Wiring an existing API into MCP means hand-writing a tool wrapper per endpoint. Almost every API has an OpenAPI spec, so mcpify just reads it and generates the whole server. Built it because I kept doing this by hand. Feedback welcome.
Show HN: Codex can track external events with respect to internal data
Most analytics tools track your data but completely ignore relevant external events such as world news, platform shifts, pandemics, and OpenAI announcements, so you may be completely blind to the signals that impact growth most. I built a tool that tracks our internal data with respect to external events. Plug in a Posthog API key, and a model searches for relevant news daily. This helps us 'close the loop' with Codex managing our human automations with enough data to make decisions that
Agentic AI Takes Over — 11 Shocking 2026 Predictions
Forbes contributors publish independent expert analyses and insights. Mark Minevich is a NY-based strategist focused on human centric AI. As 2025 comes to a close, it has become clear to me that we ...
Investors bet on AI again after Micron reports 346% sales jump
On Tuesday, investors were dumping AI stocks, worried that frothy valuations may be running away from reality. By Thursday, ...
In the Age of AI, Your Expertise Matters More Than Ever
A LinkedIn executive explains why solo entrepreneurs who pair deep expertise with AI tools are finding greater success.
Are ChatGPT and other AI chatbots politically biased? We tested them.
The Post tested ChatGPT, Gemini and other chatbots with political questions, and the results show that the AI tools have ...
The broader AI infrastructure trade
AI infrastructure spending is broadening beyond model training into inference, edge distribution, and data center ...