GPT-5
This week in AI updates: GPT-5.3-Codex-Spark, GitHub Agentic Workflows, and more (February 13, 2026)
Value stream management involves people in the organization to examine workflows and other processes to ensure they are deriving the maximum value from their efforts while eliminating waste — of ...
Show HN: Agent Action Guard – AI agent action safety
Your agents can perform harmful actions without barriers. You do not know that yet. HarmActionBench experiments allowed AI agents to use tools based on harmful instructions, and the results are shocking. Even the latest popular AI models, including GPT and Claude, scored very low. They have no barriers against performing harmful actions. HarmActionsEval proves AI is not yet reliable enough for critical projects. Agent Action Guard blocks harmful actions.
GitHub: https://github.com/Pro-GenAI
Show HN: Oy – The Yo App for Agents
Howdy HN! A friend and I were recently chatting about the potential utility (or not) of "agent-only software" and the conversation quickly turned to meme apps like iBeer and Yo[0]. If agents use Moltbook[1] (with a little human nudging), would they use Yo? For how long? To test this, I built Oy, the Yo app for agents. I'd been wanting to play around with Cloudflare's Durable Objects[2], but didn't have a great use-case. This seemed like a decent fit. Each agent gets its
Scaling tool orchestration data will emerge different intelligence and LLMs
Tldr: We are only now gonna start to scale long term external orchestration, everything beforehand was mostly internal problem solving training with here and there a tool call. We don't actually know yet what scaling orchestration training produces. It might produce much better tool-using assistants that remain fundamentally reactive to human instructions. Or it might produce something with more emergent autonomy. My gut feeling tells me the second. For the first time I foresee in the near
Tinygrad Nvidia P2P hack on Talos OS: 3 days with AI, would have been 2 weeks without
I just open-sourced a fully working build of the tinygrad NVIDIA P2P hacked kernel modules packaged as a Talos Linux system extension: patched open driver, signing key lineage, overlay ordering, the whole nightmare. Verified on 4x 5090. GitHub: https://github.com/himekifee/talos-tinygrad-nvidia-p2p-driver Why this exists: The tinygrad community's P2P kernel patch (aikitoria's donor patch) enables PCIe peer-to-peer / GPUDirect on NVIDIA's open kernel modules
Show HN: Alys – ChatGPT for video editing
Hey everyone, I run a video editing agency for real estate agents (but I’m not a video editor). So I hired video editors to edit the videos for me, which works pretty well. However, I struggled with the delays in getting edits done, people taking days off, and the overall workload getting too large. So I’ve spent the last two months building Alys - Alys is an agentic video editor: There’s no timeline, you literally just chat to make edits to the videos. ChatGPT for video editing basically. My honest refl
Inside Midjourney 8: The Hidden New Features & Missing Legacy Tools
Midjourney 8 brings a personalization profile grid and conversation mode prompt rewrites, while image prompting and omnireference are missing.
Yeet for OpenAI
CAPTION: OpenAI really named a skill "yeet" and it actually ships your code to a draft PR in one word.
HASHTAGS: #vibecoding #codex #openai #developer #aitools
OpenAI's GPT-5.4 mini and nano launch - with near flagship performance at much lower cost
The latest GPT-5.4 mini model delivers benchmark results surprisingly close to the full GPT-5.4 model while running much faster, signaling a shift toward smaller AI models powering real-world applications.
OpenAI launches GPT 5.4 in ChatGPT; claimed to support up to 1M tokens of context
Tech News: OpenAI has launched GPT-5.4, a new version of its artificial intelligence model, in ChatGPT. The company says the model is its “most capable and effic.
Show HN: Travel Hacking Toolkit – Points search and trip planning with AI
I use points and miles for most of my travel. Every booking comes down to the same decision: use points or pay cash? To answer that, you need award availability across multiple programs, cash prices, your current balances, transfer partner ratios, and the math to compare them. I got tired of doing it manually across a dozen tabs. This toolkit teaches Claude Code and OpenCode how to do it. 7 skills (markdown files with API docs and curl examples) and 6 MCP servers (real-time tools the AI calls dir
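The points-versus-cash comparison described above can be sketched roughly as follows. This is not taken from the toolkit itself: the function names, the cents-per-point formula, and the personal valuation threshold are all illustrative assumptions about how such a comparison is commonly done.

```python
import math


def cents_per_point(cash_price: float, points: int, award_fees: float = 0.0) -> float:
    """Value obtained per point, in cents, for one award booking.

    Assumed formula: (cash price - fees still owed on the award) / points * 100.
    """
    if points <= 0:
        raise ValueError("points must be positive")
    return (cash_price - award_fees) / points * 100


def points_after_transfer(program_points: int, transfer_ratio: float) -> int:
    """Bank points needed to obtain `program_points` in a partner program.

    `transfer_ratio` is program points received per bank point
    (1.0 for a 1:1 partner, 0.5 for a 2:1 partner). Hypothetical helper.
    """
    if transfer_ratio <= 0:
        raise ValueError("transfer_ratio must be positive")
    return math.ceil(program_points / transfer_ratio)


def should_use_points(cash_price: float, points: int, award_fees: float,
                      valuation_cpp: float) -> bool:
    """True when the redemption meets or beats your own cents-per-point valuation."""
    return cents_per_point(cash_price, points, award_fees) >= valuation_cpp
```

For example, a $450 fare bookable for 30,000 points plus $30 in fees works out to about 1.4 cents per point, which clears a 1.2 cents-per-point valuation but not a 1.5 one.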
Tell HN: Anthropic no longer allowing Claude Code subscriptions to use OpenClaw
Received the following email from Anthropic:
Hi,
Starting April 4 at 12pm PT / 8pm BST, you’ll no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw. You can still use them with your Claude account, but they will require extra usage, a pay-as-you-go option billed separately from your subscription.
Your subscription still covers all Claude products, including Claude Code and Claude Cowork. To keep using third-party harnesses with your Claude login
Ask HN: Anthropic changing billing for third-party harnesses for Teams Accounts?
We just got an email from Anthropic about Claude Teams Accounts:
---
Hi Claude Admin,
We're offering your team a one-time credit of $200 to your Team plan. Redeem it for your team by April 17. Once claimed, it’s good for 90 days across Claude Code, Claude Cowork, chat, or third-party harnesses connected to your account.
You’re also now able to pre-purchase bundles of extra usage at up to 30% off. If your team regularly runs past subscription limits, this is the easiest way to keep them going.
On
Show HN: I adapted codex-plugin-cc's design for Gemini CLI's ACP
This started as a protocol / adapter exercise. codex-plugin-cc made me want the same kind of Claude Code integration for Gemini, so I built one. https://github.com/abiswas97/gemini-plugin-cc The repo is derived from openai/codex-plugin-cc, but the runtime layer is different: this plugin talks directly to Gemini CLI in ACP mode instead of Codex app-server. In practice, the Gemini side here is much more session-oriented:
- spawn `gemini --acp` (or `--experimental-acp` fo
Show HN: WordBattle – Daily word game where AI agents compete against humans
WordBattle is a daily 6-letter word guessing game with team leaderboards. The twist: AI agents get their own accounts, play the same daily puzzle, and rank alongside human players. It's also really fun to play in teams against your family, friends and co-workers. Agents are handicapped: humans see exact letter positions (correct/present/absent), but agents only learn whether a letter exists in the word or not. No positional info. It makes the game fair while giving agents a genuin
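The two feedback modes described above could look something like the sketch below. It is only an illustration: the function names are hypothetical, and how WordBattle actually scores repeated letters is an assumption (here, the usual Wordle-style rule).

```python
from collections import Counter


def human_feedback(guess: str, answer: str) -> list[str]:
    """Positional feedback: 'correct', 'present', or 'absent' per guessed letter."""
    guess, answer = guess.lower(), answer.lower()
    if len(guess) != len(answer):
        raise ValueError("guess and answer must have the same length")
    result = ["absent"] * len(guess)
    # Answer letters that were not matched in place; these can still earn 'present'.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "correct"
    for i, g in enumerate(guess):
        if result[i] == "absent" and remaining[g] > 0:
            result[i] = "present"
            remaining[g] -= 1
    return result


def agent_feedback(guess: str, answer: str) -> dict[str, bool]:
    """Membership-only feedback: does each guessed letter occur anywhere in the answer?"""
    answer_letters = set(answer.lower())
    return {letter: letter in answer_letters for letter in guess.lower()}
```

With the answer "planet" and the guess "plates", a human would learn that "t" is present but misplaced, while an agent would only learn that "t" occurs somewhere in the word.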
One-liners to check for bad litellm and axios on your computer
Search your drive (not mounts) for compromised versions of litellm and axios. Please comment if you see anything wrong or ways these can be improved! LiteLLM:
find / \( -type d -name "litellm-*.dist-info" -o -name "litellm_init.pth" \) 2>/dev/null \
| while read d; do
case "$d" in
*dist-info)
v=$(echo "$d" | sed 's/.*litellm-\(.*\)\.dist-info/\1/')
if echo "$v" | grep -qE &#
Show HN: I built a full LLM chat client as a Neovim filetype
I started Flemma because I wanted to bring my AI workload into Neovim. My workflow started with me living in developer portals - Claude Workbench, OpenAI Platform, Vertex (the horrors of GCP's Console UI!) I would spend hours in these web UIs crafting prompts, iterating on system instructions, maintaining a carefully curated library of sessions. But the browser really wasn't optimised for this kind of interaction. Editing was clunky (muscle memory <C-W> would close the tab instead
Show HN: Dochia – automated API testing for agentic build-test-fix loops
I've enhanced Dochia with the ability to generate agent skills. Being a CLI already, it makes it really simple to plug it into agentic workflows. Run `dochia init-skills` and the coding agent(s) can trigger tests as it builds:
1. Agent writes endpoint and the OpenAPI spec (or that gets generated from code)
2. Agent runs: dochia test -c api.yml -s localhost:3000
3. Dochia produces dochia-summary-report.json + per-endpoint test files
4. Agent reads errors, fixes code, re-runs
5.
Show HN: TalkType – Offline Linux Speech-to-Text (Whisper, Wayland, AppImage)
I'm a hobbyist developer (van builder by trade) and built this because I needed hands-free dictation on Linux that actually works offline. TalkType uses OpenAI's Whisper locally — no cloud, no subscription. It supports Wayland, GNOME via a Shell extension, and ships as a single AppImage. GPU acceleration optional. Would love feedback from the HN community.
XC Scribe – AI product description generator with direct e-commerce sync
Hey HN,
I built XC Scribe, a tool that generates product descriptions for e-commerce stores and pushes them live without copy-pasting. I was doing freelance work for a shop with 10,000 SKUs. Writing descriptions manually was not realistic. Existing tools would spit out text but you still had to get it into the store yourself. You import your catalog via CSV, XML, or direct store connection. Pick your AI provider (OpenAI, Anthropic, or Google). Review generated content side by side with your origi