GPT-5

Show HN: Web game, an AI decides if you would survive in a survival situation

Hello HN, I made this website where you are given a unique survival scenario made by AI and you have to describe what you would do. The AI then decides if you would live or not. If you progress, every round gets exponentially harder. Round 1 is a house fire; by round 6 you're fighting a kraken. I tried to make it as funny as possible. Built using Cloudflare and the GPT-4.1 Nano API.

Show HN: Vocab extractor for language learners using Stanza and frequency ranks

I'm building a Telegram bot to practice Dutch. GPT-4o-mini kept picking vocabulary words I already knew, so I built a classical NLP pipeline to do it instead. It takes a short text + learner level (A0–B1) and returns the best words to study, using Stanza for parsing and corpus frequency ranks (SUBTLEX-NL, srLex, SUBTLEX-US) for scoring. Wins at A1/A2, loses at A0 where the LLM picks more obvious words. I also tried adding multi-word phrases (ADJ+NOUN, VERB+NOUN, phrasal verbs) backed by
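The rank-based selection described in the post above could look roughly like the following sketch. It is not the author's pipeline: the rank table, the per-level rank windows, and the function name `pick_words` are all hypothetical, and a plain regex split stands in for Stanza's parsing and lemmatization.

```python
import re

# Hypothetical frequency ranks (1 = most frequent). A real pipeline would load
# these from a corpus list such as SUBTLEX-NL instead of a hand-written dict.
TOY_RANKS = {
    "de": 1, "het": 2, "is": 5, "huis": 300, "fiets": 900,
    "vergadering": 2500, "belastingaangifte": 18000,
}

# Assumed rank windows per learner level: words at or above the lower bound's
# frequency are treated as already known, words rarer than the upper bound as
# too hard for now. These numbers are illustrative only.
LEVEL_WINDOWS = {
    "A0": (0, 500),
    "A1": (200, 1500),
    "A2": (800, 4000),
    "B1": (2000, 20000),
}

def pick_words(text, level, ranks=TOY_RANKS, limit=5):
    """Return up to `limit` words from `text` whose rank fits the level window."""
    low, high = LEVEL_WINDOWS[level]
    # Crude tokenization: lowercase runs of letters (no lemmatization here).
    tokens = set(re.findall(r"[^\W\d_]+", text.lower()))
    candidates = [w for w in tokens if w in ranks and low < ranks[w] <= high]
    # Most frequent candidates first, since they are the most useful to learn.
    return sorted(candidates, key=lambda w: ranks[w])[:limit]

sample = "De fiets is bij het huis. De vergadering is morgen."
a1_words = pick_words(sample, "A1")
a2_words = pick_words(sample, "A2")
```

With the toy table, the same text yields different study lists per level, which is the behavior the post describes; words missing from the rank table are simply skipped.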

Show HN: OpenClaw Arena – Benchmark models on real tasks, rank by perf and cost

We built an arena for comparing AI models on real agentic tasks, not chat or static benchmarks. Models run as actual OpenClaw subagents in fresh VMs with full tool access, and results feed into two separate leaderboards: performance and cost-effectiveness. The problem: Chatbot Arena tests conversation quality. But most people using AI agents need them to do more: browse the web, manage files, write and run code, create full applications, automate multi-step workflows. There's no benchmark th

GPT 5.4 sucks at front end

Why does it always make those cards? Like, seriously? And people think frontend dev jobs are gone. Frontend is harder than backend in 2026: design-related things can't be done properly via AI, unlike developer/logic things.

Google Bard

Get all latest & breaking news on Google Bard. Watch videos, top stories and articles on Google Bard at moneycontrol.com.

Stop repeating the same repo rules every session in codex.

**Caption**: Stop repeating the same repo rules every session. This one file gives Codex persistent instructions for how your project works. **Hashtags**: #AI #Codex #SoftwareEngineering #DevSwarm #OpenAI

New OpenAI Codex Update is INSANE!

Learn about the research stats that are shocking the industry—including the detection of over 10,000 high-severity issues—and see the step-by-step process of how the AI validates exploits before generating ready-to-merge patches.

Show HN: Memv – Memory for AI Agents

memv is an open-source Python library that gives AI agents persistent memory. Feed it conversations; it extracts knowledge. The extraction mechanism is predict-calibrate (Nemori paper): given existing knowledge, it predicts what a new conversation should contain, then extracts only what the prediction missed. v0.1.2 adds the production path: - PostgreSQL backend (pgvector for vectors, tsvector for text search, asyncpg pooling). Single db_url parameter: file path for SQLite, connection string for
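The predict-calibrate idea in the post above can be illustrated with a toy sketch. This is not memv's API: the function names are hypothetical, facts are plain strings, and a trivial set lookup stands in for the model call that would actually predict a conversation's content.

```python
def predict(knowledge):
    """Stand-in for the prediction step.

    A real system would ask a model what the new conversation is expected to
    contain given existing knowledge; here we assume it simply expects every
    known fact to be restated.
    """
    return set(knowledge)

def extract_novel(knowledge, conversation_facts):
    """Calibration step: keep only the facts the prediction missed."""
    predicted = predict(knowledge)
    return [fact for fact in conversation_facts if fact not in predicted]

def update(knowledge, conversation_facts):
    """Return a new knowledge set extended with the novel facts only."""
    return set(knowledge) | set(extract_novel(knowledge, conversation_facts))

knowledge = {"user lives in Utrecht", "user likes cycling"}
conversation = ["user lives in Utrecht", "user bought a new bike"]

novel = extract_novel(knowledge, conversation)
updated = update(knowledge, conversation)
```

Only the fact the prediction did not cover is stored, so repeated conversations do not duplicate what is already known.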

Show HN: I made a cheaper alternative to Claude Code or Codex CLI

With longer agentic workflows becoming the norm, token cost can eat through usage so quickly that it prevents any real work from getting done. After studying the business models of top labs like Anthropic and OpenAI, I found they show about 80% margins on inference cost, which they use for R&D on the next model. Working with open source models is much cheaper and allows for 5-10x higher usage, so I decided to create Sweet! CLI as an alternative to the products offered by big labs. Sweet! CLI

Show HN: Botference – A TUI to Plan with Claude Code and Codex Simultaneously

I vibe-hacked a terminal app that puts you, Claude Code, and Codex in the same chat room for project planning. Two modes: “council” (open room: you steer the conversation, direct who speaks) and “caucus” (the AIs debate privately and bring you a recommendation). The output is an implementation-plan.md you take into whatever workflow you want. Plan mode is the part that works. Build mode exists but is experimental and I wouldn’t recommend it yet. It’s vibe-coded, uses Textual or Ink for the TUI, an

Ask HN: Are you too getting addicted to the dev workflow of coding with agents?

It's becoming an extremely dopaminergic work loop where I roughly define the scope of my task and meticulously explore and divide the problem space into smaller chunks, then iterate over them with the agent. Rinse and repeat. Each execution prompt after a long planning session feels like opening a lootbox back when I used to play Counter Strike. It's really fun to code like that; it's like riding a bike after a lifetime of only knowing how to run. But I'm really wary that's a

Show HN: Claude/OpenAI/Gemini agents compete as investors with $100K each

Claude, Gemini, and GPT each get $100K in virtual money and trade every morning at 9:30 AM ET using real Yahoo Finance prices. Same tools, same rules, different models. Live leaderboard shows portfolio values, holdings, and each agent's daily diary entry explaining their reasoning. Built this to demo Upstash Box, an agent server primitive. Unlike a web server, there's no app code. You give it prompts, tools, and skills. Each agent runs in its own isolated container, sleeps when idle,

Show HN: CLI to order groceries via reverse-engineered REWE API (Haskell)

I just had the best time learning about the REWE (German supermarket chain) API, how they use mTLS and what the workflows are. Also, `mitmproxy2swagger`[1] is a great tool to create an OpenAPI spec automatically. And 2026 feels like the perfect time to write Haskell. The code is handwritten, but whenever I got stuck with the build system or was just not getting the types right, I could fall back on asking AI to unblock me. It was never that smooth before. Finally, the best side projects are the ones y

Show HN: CMPSBL Software Factory — Free Daily Drop $2.9M

I’m a solo founder. I built a Cognitive Infrastructure Substrate: pure algorithmic code, zero AI API calls inside, no OpenAI dependency, fully patentable. A discovery engine collides software primitives against each other and crystallizes viable configurations into production-ready code. It has autonomously discovered over $4.3B in software capabilities. I didn’t write what I’m sharing. The substrate found it. Neural Arbiter: CJPI 100 | Governance | $3M substrate valuation. One sentence: an AI dec

Show HN: Agent Orchestrator, a local-first Harness Engineering control plane

I have spent a long time working in an XP/TDD style, so when AI coding tools became useful enough for real work, I adopted them quickly. The first bottleneck I hit was not code generation, it was verification: AI could write code and tests quickly, but I was still the person reviewing implementations, clicking through flows, checking logs, inspecting database state, and deciding whether the result was actually correct. That pushed me to move validation further left. Before implementation, AI

Ask HN: What's the latest consensus on OpenAI vs. Anthropic $20/month tier?

I'm considering $20/month variants only. I've had a Claude subscription for the past year, although I only really started properly using LLMs in the past couple of months. With Opus, I get about 5 messages every 5 hours (fairly small codebase); more with Sonnet. I then cancelled that, since it's practically unusable, and got a ChatGPT sub about a week ago. Currently using it with 5.4 High and I haven't had to worry about limits. But the code it produces is definitely "differe

Show HN: FastFN – a polyglot file-based runtime for APIs and SPAs

Part of the idea comes from liking the simplicity of older CGI-style models: put code in a place, map it to a route, and keep the deployment model understandable. FastFN takes that in a more modern direction with multiple runtimes, local dev, OpenAPI, and a simpler path for mixed API + frontend projects. It supports Python, Node, PHP, Lua, Rust, and Go in one project. This started as a weekend project that got out of hand. I am now thinking about where to take it next. Docs: https://fastfn.dev

OpenAI releases GPT-5.4 mini and nano, its most capable small models yet

OpenAI has introduced GPT-5.4 mini and nano, bringing faster performance and significantly lower costs to high-volume AI workloads while retaining many flagship capabilities.

Anthropic's Claude AI Can Now Use Your Mac While You're Away

Anthropic are out with yet another update to Claude AI: the company's Claude Code and Cowork tools can now remotely control your Mac on your behalf. When Claude lacks a direct connector for a given app like Slack or Google Calendar, it falls back to controlling the computer like a human, using the screen to navigate. From the Claude blog: In Claude Cowork and Claude Code, you can now enable Claude to use your computer to complete tasks.

Claude AI popular among Indians for coding and CV building, Anthropic report highlights shift in workforce tools

Claude AI is gaining popularity among Indians for coding and CV building, reflecting a shift in workforce tools, according to ...