Show HN: The Mog Programming Language
Hi, Ted here, creator of Mog.
- Mog is a statically typed, compiled, embedded language (think statically typed Lua) designed to be written by LLMs -- the full spec fits in 3,200 tokens.
- An AI agent writes a Mog program, compiles it, and dynamically loads it as a plugin, script, or hook.
- The host controls exactly which functions a Mog program can call (capability-based permissions), so permissions propagate from agent to agent-written code.
- Compiled to native code for low-latency plugin execution.
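The capability model above can be sketched in plain Python. This is an illustration of the idea, not Mog's actual host API; the `Capabilities` class and function names are made up:

```python
# Hypothetical sketch of capability-based permissions: the host decides
# exactly which functions an agent-written program may call.

class Capabilities:
    def __init__(self):
        self._allowed = {}

    def grant(self, name, fn):
        """Expose a single host function to the embedded program."""
        self._allowed[name] = fn

    def call(self, name, *args):
        """The only way embedded code reaches the host."""
        if name not in self._allowed:
            raise PermissionError(f"capability not granted: {name}")
        return self._allowed[name](*args)

# Host grants read access but never write access.
caps = Capabilities()
caps.grant("read_file", lambda path: f"<contents of {path}>")

print(caps.call("read_file", "notes.txt"))      # allowed
try:
    caps.call("write_file", "notes.txt", "x")   # never granted
except PermissionError as e:
    print(e)
```

Because the agent's program can only reach the host through `call`, whatever permissions the agent itself holds are the ceiling for the code it writes.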
Show HN: Styx, Open-source AI gateway with intelligent auto-routing (MCP-native)
Hey HN,
We just open-sourced Styx — an AI gateway that sits between your app and AI providers (OpenAI, Anthropic, Google, Mistral). One endpoint, any model, self-hosted.
What makes it different from LiteLLM or OpenRouter:
styx:auto — send "model": "styx:auto" and the gateway picks the right model based on prompt complexity. Simple questions go to cheap models ($0.15/1M tokens), complex code goes to frontier models. 9-signal classifier, zero config.
MCP-native — first gate
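A toy sketch of the `styx:auto` routing idea: score the prompt on a few cheap signals and pick a tier. Styx's real classifier uses 9 signals; the two below (length and code markers) and the model names are illustrative assumptions only:

```python
# Toy complexity-based router: simple prompts -> cheap model,
# long or code-bearing prompts -> frontier model.

CHEAP, FRONTIER = "small-model", "frontier-model"

def route(prompt: str) -> str:
    score = 0
    if len(prompt.split()) > 50:             # long prompts lean complex
        score += 1
    if "```" in prompt or "def " in prompt:  # code leans complex
        score += 1
    return FRONTIER if score >= 1 else CHEAP

print(route("What's the capital of France?"))  # small-model
print(route("def fib(n): ..."))                # frontier-model
```

The point of the gateway design is that the caller never changes: it always sends `"model": "styx:auto"` and the classifier decides server-side.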
Show HN: A step debugger for AI agents
I've been experimenting with OpenClaw agents that call hardware tools. The initial goal was getting a local agent to solve a small maze using some benchtop hardware. The agent observes the maze through a webcam, decides its next move, and calls a hardware tool to move. When something goes wrong, it's hard to understand why. You usually end up staring at a huge JSON log of prompts, tool calls, and responses. So I started building a trace harness and an OpenClaw-specific shim to capture str
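The trace-harness idea can be sketched as a decorator that wraps each tool and records every call, its arguments, and its result for later step-through. The `move` tool and trace shape here are made up for illustration, not the project's actual shim:

```python
# Minimal tool-call trace harness: wrap tools so each invocation is
# appended to a trace you can inspect after a failed run.

import functools
import json

TRACE = []

def traced(tool):
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        result = tool(*args, **kwargs)
        TRACE.append({"tool": tool.__name__,
                      "args": args, "kwargs": kwargs,
                      "result": result})
        return result
    return wrapper

@traced
def move(direction):
    """Stand-in for a hardware tool the agent calls."""
    return f"moved {direction}"

move("north")
print(json.dumps(TRACE, default=str, indent=2))
```

A step debugger then becomes a viewer over `TRACE`: pause at entry N, show the prompt and tool call that led there, and replay from that point.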
Show HN: Polpo – Build zero-human companies. Open source
Hey, Alessio here. I built Polpo because AI agents are great at coding — and terrible at finishing real work on their own. The problem: you open Claude Code, give it a task, it does 80%. You fix the other 20%, open another chat for the next piece, copy context, retry when it drifts. Before you know it you're a full-time AI babysitter — 4 monitors, 12 terminals, zero confidence anything actually ships. Polpo fixes this. You build an AI company: hire agents, give them roles, skills, and credent
Show HN: API key leak scanner – finds and shows credentials in your codebase
Simple CLI tool, one Python file, no setup. Point it at a repo and it
finds leaked API keys (OpenAI, Anthropic, AWS, GitHub, Stripe, etc.)
and gives you the direct link to revoke each one.

Built it because I kept generating code with AI assistants and worrying about keys ending up in the wrong place. It's an off-brand TruffleHog.
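The core of a scanner like this fits in a few lines: known key prefixes as regexes, run over file contents. The prefixes below are well-known public formats; the actual tool covers more providers and adds the revocation links:

```python
# Sketch of the scanner's core: match known credential formats in text.

import re

PATTERNS = {
    "OpenAI": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "GitHub": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "AWS":    re.compile(r"AKIA[0-9A-Z]{16}"),
    "Stripe": re.compile(r"sk_live_[A-Za-z0-9]{24,}"),
}

def scan(text):
    """Return (provider, matched_key) pairs found in the text."""
    hits = []
    for provider, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((provider, m.group()))
    return hits

# A fake key for demonstration -- not a real credential.
sample = 'API_KEY = "AKIA' + "ABCDEFGHIJKLMNOP" + '"'
print(scan(sample))  # [('AWS', 'AKIAABCDEFGHIJKLMNOP')]
```

Pointing this at every file in a repo is then just a walk over the tree, calling `scan` on each file's contents.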
Ask HN: What's your favorite "what would SWEs do in 1-3 years from now?"
LLM-driven stacks by Anthropic and OpenAI are aiming for a monopoly on labor replacement by driving Claude Code and Codex development at rates never seen before; there will likely be a reordering of what SWEs do in the near future (1-3 years). What's your futuristic version of how this will turn out? Try justifying your answer, e.g. by citing previous re-organizations of labor during such upheavals, or by applying economic/market theory or precedent. My favorite one (right now) is: As traditional S
Show HN: VectorLens – See why your RAG hallucinates, no config
I built VectorLens because I was tired of "log file archaeology" every time my RAG pipeline hallucinated. Usually, when an LLM gives a wrong answer, you're stuck guessing which retrieved chunk misled it—or why the right chunk was ignored. Existing observability tools either require a cloud signup, an enterprise contract, or heavy manual instrumentation of your code. I wanted something that stayed local and just worked. The solution: three lines of code (Python):
import vectorlens
vector
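The snippet above is cut off, so here is an unofficial sketch of what "seeing why retrieval misled the model" means in practice: score each retrieved chunk against the query embedding and surface the ranking. The vectors and chunk names are invented for illustration; this is not VectorLens's API:

```python
# Rank retrieved chunks by cosine similarity to the query to spot
# distractor chunks that outranked the relevant one.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

query_vec = [1.0, 0.0, 1.0]
chunks = {
    "refund policy": [0.9, 0.1, 0.8],  # relevant
    "office hours":  [0.0, 1.0, 0.1],  # distractor
}

for name, vec in sorted(chunks.items(),
                        key=lambda kv: cosine(query_vec, kv[1]),
                        reverse=True):
    print(f"{cosine(query_vec, vec):.3f}  {name}")
```

Seeing the scores side by side is usually enough to tell whether the wrong chunk won the ranking or the right chunk was never retrieved at all.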
Show HN: UnifyRoute – Self-hosted OpenAI-compatible LLM gateway with failover
Hey HN, I built UnifyRoute because I kept running into the same problem: rate limits, quota exhaustion, and provider outages were breaking my LLM-powered apps at the worst times.
UnifyRoute is a self-hosted gateway that sits in front of your LLM providers (OpenAI, Anthropic, etc.) and handles routing, failover, and quota management automatically — with a fully OpenAI-compatible API, so you don't change a single line of your existing code.
What it does:
- Drop-in OpenAI-compatible API
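The failover behavior described above reduces to a loop over providers that falls through on rate limits or outages. The provider callables and exception names below are stand-ins, not UnifyRoute's code; the real gateway speaks the OpenAI wire protocol:

```python
# Minimal failover loop: try providers in priority order, fall through
# on retryable errors, fail only when every provider has failed.

class RateLimited(Exception):
    pass

def complete(providers, prompt):
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except (RateLimited, ConnectionError) as e:
            errors.append((name, e))
    raise RuntimeError(f"all providers failed: {errors}")

def flaky(prompt):
    raise RateLimited("quota exhausted")

def healthy(prompt):
    return f"answer to: {prompt}"

used, answer = complete([("openai", flaky), ("anthropic", healthy)], "hi")
print(used, answer)  # anthropic answer to: hi
```

Because the gateway owns this loop, the application only ever sees one endpoint and one response, regardless of which provider actually served it.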
I Wrote A Resume For A $180,000 Job Using ChatGPT 5.4 To Test The Hype
How do you write a resume using AI? I tested ChatGPT 5.4 against ChatGPT 5.2 to create an executive resume for a project director job. Here's what happened.
GPT 5.4 arrives on ChatGPT: 5 improvements to know
The latest model from OpenAI comes just days after the company launched GPT-5.3 Instant.
New ChatGPT 5.4: 1M-Token Context & “Extreme Reasoning” Targets Long Tasks
OpenAI has launched its new ChatGPT 5.4 with Extreme Reasoning mode for long-duration task focus. As well as a 1M-token context window ...
Why is GPT-5.4 obsessed with Goblins?
After the 5.4 update, ChatGPT uses "goblin" in almost every conversation. Sometimes it's "gremlin." A recent chat of mine used "goblin" 3 times in 4 messages:
> this stuff turns into legal goblins fast
> hiding exclusions like little goblins
> But here’s the important goblin
I am not the only one to notice this; there are many Reddit threads on it: https://www.reddit.com/r/ChatGPT/comments/1roci77/anyone_elses_chatgpt_obsessed_with_go
Ask HN: Any informed guesses on the actual size/architecture of GPT-5.4 etc.?
Does anyone have decent intuitions or hard clues on how big models like GPT-5.4, Gemini 3.1, and Opus 4.6 actually are, and how they compare to the best open models like GLM-5? Are they all roughly in the same range now (for example around 1T params, maybe MoE), or are the closed models still much bigger? Also curious about “pro” versions like GPT-5.4 Pro - is that likely a different model, or mostly the same model with more inference-time compute / longer reasoning / better orchestratio
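One reason the "1T params" guesses are slippery: for an MoE model, the headline (total) parameter count can be far larger than the parameters active per token. The arithmetic is simple; all numbers below are hypothetical, not leaked specs:

```python
# Back-of-the-envelope: total vs. per-token-active parameters
# for a simple mixture-of-experts layout.

def moe_params(dense_params, num_experts, top_k, expert_params):
    """Shared (dense) weights plus expert weights, with top-k routing."""
    total = dense_params + num_experts * expert_params
    active = dense_params + top_k * expert_params
    return total, active

# e.g. 200B shared weights, 64 experts of 12.5B each, top-2 routing
total, active = moe_params(200e9, 64, 2, 12.5e9)
print(f"total {total / 1e12:.1f}T, active {active / 1e9:.0f}B per token")
# total 1.0T, active 225B per token
```

So a "1T-param" MoE and a ~225B dense model can cost similar amounts per token, which is why raw size comparisons across closed and open models say little on their own.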
Ask HN: I built an AI-native codebase framework–could you evaluate it?
I built this open-source project and would really appreciate technical feedback from people here: https://github.com/xodn348/ai-native. The goal is to make AI-assisted development more reliable through clearer project structure, explicit contracts, and a verification workflow. I made this because applying these patterns from scratch in every project was repetitive and hard to maintain, so I wanted a reusable framework. If you have time, I’d love your evaluation on:
1. What is useful
Show HN: 2D RPG base game client recreated in modern HTML5 game engine with AI
When I was much younger, I used to play a Korean MMORPG called Helbreath, and I also hosted a bunch of private servers for it. I eventually moved on, but I always loved the game’s aesthetics, its 2D nature, and its atmosphere. That may just be nostalgia talking. The community-maintained private server and client, which to my knowledge were based on leaked official files, were written in fairly archaic C++. If you’re interested in the original sources, I’ve included the main client and server file
If you’re into AI coding, OpenAI just put its Codex on Windows
OpenAI has launched its Codex app on Windows, bringing a native AI coding assistant with project management, automations, and WSL support for developers.
OpenAI Launches Codex Security to Find, Patch Code Vulnerabilities
OpenAI’s Codex Security enters research preview, aiming to help teams find, validate, and patch code vulnerabilities with ...
OpenAI Announces Codex for Open Source With Six Months of Free ChatGPT Pro
OpenAI launches a new program offering free ChatGPT Pro, Codex tools, and API credits to support open-source developers and ...
Anthropic and OpenAI just exposed SAST's structural blind spot with free tools
Can free AI scanners replace enterprise SAST? Anthropic and OpenAI found 500-plus zero-days pattern-matching tools missed — and both scanners are free.
OpenAI Rolls Out Codex Security Vulnerability Scanner
Codex Security, formerly Aardvark, has found hundreds of critical vulnerabilities in tested software in the past month.