GPT-5

OpenClaw + Codex + ClaudeCode Is INSANE!

This presentation explores the synergy between Anthropic’s terminal-based Claude Code and OpenAI’s Codex, demonstrating how OpenClaw acts as a central "brain" to spawn specialized sub-agents.

Show HN: Kontext.dev – Runtime Credentials for Agents

Every AI agent that does something useful - opening a PR, posting in Slack, updating a ticket - needs to call an API on behalf of a user. That means OAuth: authorization flows, token storage, refresh logic, per-user credential isolation. Today, most teams solve this with a long-lived API key in an .env file, shared across every user and every session. As everyone in an organization becomes a software engineer - whether they know it or not - you can't expect each of them to roll their own OAu

Launch HN: Voygr (YC W26) – A better maps API for agents and AI apps

Hi HN, we’re Yarik and Vlad from VOYGR (https://voygr.tech/), working on better real-world place intelligence for app developers and agents. Here’s a demo: https://www.youtube.com/watch?v=cNIpcWIE0n4. Google Maps can tell you a restaurant is "4.2 stars, open till 10." Their API can't tell you the chef left last month, wait times doubled, and locals moved on. Maps APIs today just give you a fixed snapshot. We're building an infinite, queryable plac

Show HN: Smart glasses that tell me when to stop pouring

I've been experimenting with a more proactive AI interface for the physical world. This project is a drink-making assistant for smart glasses. It looks at the ingredients, selects a recipe, shows the steps, and guides me in real time based on what it sees. The behavior I wanted most was simple: while I'm pouring, it should tell me when to stop, instead of waiting for me to ask. The demo video is at the top of the README. The interaction model I'm aiming for is something like a helpfu

Stop trying to sell me things while I'm trying to use the thing you already sold

This may be an exercise in screaming into the aether, but I wanted to check how many others feel this way (and how strongly) when using modern software: When opening Gmail to send an email, I frequently have to dismiss a banner that tries to upsell a higher service tier that promises either to secure some Drive files it claims are insecure or to provide better AI tools to manage my work. When opening a food delivery app, I first have to dismiss a number of promotions before I can get to the orderi

Show HN: ARISE – Agents that create their own tools at runtime when they fail

I built a framework that lets LLM agents create their own tools at runtime. Most agent frameworks assume you'll hand-craft every tool upfront. That works until your agent hits something you didn't plan for. ARISE (Adaptive Runtime Improvement through Self-Evolution) lets agents synthesize their own tools at runtime when they detect gaps. ARISE sits between your agent and its tool library. When the agent keeps failing at a class of tasks, it analyzes what's missing, uses a cheap LLM

Show HN: From Claude Code to OpenCode – My Evolution in Vibe AI Engineering

I’ve spent the last few months iterating on my "Vibe Coding" workflow, moving away from closed-box solutions toward a more transparent, multi-provider stack. I documented the transition from Codex and Claude Code to an open-source setup using OpenCode and opencode serve. Cursor -> Claude Code -> OpenCode -> OpenCode + OpenCode-Manager -> Codex + Tmux + Tailscale -> OpenCode Serve + Tailscale. Key takeaways from the journey: The

AI coding agents accidentally introduced vulnerable dependencies

Recently we discovered something unexpected on one of our servers: a cryptominer running in the background. The machine was hosting a web service built using Next.js. The first sign of trouble was unusually high CPU usage. Even during low traffic periods, the server was consistently running near 100% utilization. After inspecting running processes and network activity, we found a background process downloading and executing a mining binary. Root cause: The entry point was CVE-2025-29927, a vulnerabi

Show HN: Synthea FHIR Data in BigQuery

We generated ~1,100 synthetic patients with Synthea, processed the FHIR R4 output through our normalization engine (Forge), and published it as a free public dataset on BigQuery Analytics Hub. 8 resource types: Patient, Encounter, Observation, Condition, Procedure, Immunization, MedicationRequest, DiagnosticReport. The raw Synthea output has 459 nested fields per resource, urn:uuid: references, and no column descriptions. We flatten it to clean views with ~15 columns each, pre-extracted IDs, and d

OpenAI Launches GPT-5.3 Instant for ChatGPT: Check Features, Accuracy Boost, and Availability

OpenAI launches GPT-5.3 Instant for ChatGPT with improved accuracy, lower hallucination rates, better tone, and full availability. Check features and retirement timeline for GPT-5.2.

OpenAI's GPT 5.3 Codex Drives Harness Engineering Need

“In an era where AI writes all the code, the task humans must excel at is ‘harness engineering,’” OpenAI wrote in a blog post ...

Show HN: Context Gateway – Compress agent context before it hits the LLM

We built an open-source proxy that sits between coding agents (Claude Code, OpenClaw, etc.) and the LLM, compressing tool outputs before they enter the context window. Demo: https://www.youtube.com/watch?v=-vFZ6MPrwjw#t=9s. Motivation: Agents are terrible at managing context. A single file read or grep can dump thousands of tokens into the window, most of it noise. This isn't just expensive; it actively degrades quality. Long-context benchmarks consistently show steep accuracy

Show HN: UberSKILLS – Open-source Workbench for building AI agent SKILLS

Agent Skills (SKILL.md files) are reusable instruction sets that teach code agents like Claude Code, GitHub Copilot, Cursor, and Windsurf how to perform specific tasks. Right now, creating them is entirely manual - you hand-write YAML frontmatter and markdown, with no way to preview, validate, or test before deploying. uberSKILLS is an open-source web app that gives you an integrated authoring environment for Agent Skills:
- AI-assisted creation - describe what you want in plain English, get a com

Show HN: Built an AI ad generator and ran $9K of FB ads with it

Been in the AI image gen space since 2023, before even GPT image gen was a thing, and after spending ~9K on Facebook ads for my own projects (made a video about that actually) I realized the thing I kept getting stuck on was the creatives themselves. I'm horrible at making reels-type video ads and Canva even with templates is surprisingly complicated for ad-specific stuff, plus everything ends up looking the same as everyone else using the same templates. Made a video going into detail about

Ek_ Leaks Persist

Vaults and proxy layers solve the "2am paste" vector: devs never touch raw keys, so nothing gets accidentally fed into prompts. But the leak keeps happening anyway. Across 60+ probes on GPT-4o (cost: $0.04), unrelated vectors consistently leaked the *same internal structure*:
- ek_ prefix on session tokens
- EPHEMERAL_KEY naming
- Realtime API client_secret endpoint
- Documented 60s TTL vs observed minutes-to-hours persistence
No real credential was in the prompt, just semantic pressure

Show HN: RunCycles – pre-execution budget enforcement for autonomous agents

I built this after reading too many incident reports of agent loops spending $200 in 4 minutes because a quality threshold was never met. The pattern is always the same: an agent retries, fans out, or loops. Each iteration passes individual rate-limit checks. Observability fires an alert after the money is gone. Provider caps are per-provider, not cross-provider. None of these stop the spend before it happens. RunCycles takes a different approach: reserve budget before the call, commit actual
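
The reserve-before-the-call idea in this snippet can be illustrated with a minimal sketch. This is not RunCycles' API: the `Budget` class, its method names, and the use of integer cents are all assumptions made for illustration. The point is only that a call is refused before it runs if its estimated cost would exceed the cap.

```python
from dataclasses import dataclass, field
from typing import Dict


class BudgetExceeded(Exception):
    """Raised when a reservation would push total spend past the cap."""


@dataclass
class Budget:
    # Hypothetical in-memory ledger; every name here is illustrative.
    limit_cents: int
    committed_cents: int = 0
    holds: Dict[int, int] = field(default_factory=dict)
    next_id: int = 0

    def reserve(self, estimate_cents: int) -> int:
        """Place a hold for an estimated cost before the call is made."""
        held = sum(self.holds.values())
        if self.committed_cents + held + estimate_cents > self.limit_cents:
            raise BudgetExceeded(
                f"reserving {estimate_cents} would exceed limit {self.limit_cents}"
            )
        self.next_id += 1
        self.holds[self.next_id] = estimate_cents
        return self.next_id

    def commit(self, hold_id: int, actual_cents: int) -> None:
        """Release the hold and record what the call actually cost."""
        del self.holds[hold_id]
        self.committed_cents += actual_cents
```

In a loop, an agent would call `reserve` with an estimate before each model call and `commit` with the real cost afterwards; a `BudgetExceeded` raised by `reserve` stops the iteration before any money is spent.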

Show HN: 1,011 AI crawler requests. Google Analytics saw zero

Google Analytics can't see GPTBot or ClaudeBot. Here's how I built a server-side tracker that can — and what I found in 72 hours.
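
A server-side tracker of the kind described here can be approximated by matching the User-Agent header of each request. The sketch below is an assumption about the general approach, not the author's implementation; the function names are hypothetical, and the token list contains only the two crawlers named above.

```python
from collections import Counter
from typing import Iterable, Optional

# Substrings to look for in the User-Agent header. Only the two crawlers
# named in the snippet are listed; extend the tuple as needed.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot")


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the matching crawler token, or None for any other client."""
    ua = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


def count_ai_crawlers(user_agents: Iterable[str]) -> Counter:
    """Tally requests per AI crawler from a stream of User-Agent strings."""
    counts: Counter = Counter()
    for ua in user_agents:
        name = classify_user_agent(ua)
        if name is not None:
            counts[name] += 1
    return counts
```

Feeding `count_ai_crawlers` the User-Agent column of an access log gives per-crawler request counts. Because the match happens on the server, it does not depend on the client running any JavaScript.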

Show HN: Plaidify – Give AI agents access to any login-protected website

Every AI agent hits the same wall: the world's most valuable data is locked behind login forms. Bank balances, utility bills, insurance policies, academic transcripts: none of them have APIs. Plaid covers banks for $500+/mo. Everything else? You write fragile Selenium scripts. Plaidify is open-source infrastructure that turns any login-protected website into a REST API. You drop a JSON "blueprint" into the connectors folder: CSS selectors for username, password, submit, and

OpenAI releases a Windows version of Codex coding app

Around one month after launching Codex for Mac, OpenAI brings Codex to Windows with a new suite of IDEs supported.

OpenAI launches Codex app to bring its coding models, which were used to build viral OpenClaw, to more users

OpenAI is launching a Codex app, a desktop experience for managing the company’s AI coding tools, as competitive pressure to deploy autonomous AI agents heats up. The desktop app is designed to be ...