Claude AI uncovers 500+ critical bugs, sending open-source developers into patch mode
Anthropic's Claude AI autonomously discovered 500+ critical vulnerabilities in popular open-source software using only basic debugging tools.
U.S. military reportedly used Anthropic’s Claude AI in Venezuela raid capturing Nicolás Maduro, raising ethical and AI deployment concerns.
The release of GPT-5.3 Codex marks the transition from AI that "helps you code" to AI that "builds for you." This professional-grade overview explores OpenAI's latest release, focusing on its ability to function as a complete development team.
Summary:
- Non-determinism in LLMs: why the same prompt can produce different outputs
- Importance of starting new chats to test this properly
- Coding example (Google Apps Script with Claude) and why code is easier to verify than prose or analysis
- Context, memory, and “bringing your own data”
- How ChatGPT memories work, how to view/curate/delete them
- Risks of “context rot” when there’s too much accumulated history
- Migrating personal cont
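The non-determinism point is easy to demonstrate with any LLM API; a minimal sketch, assuming the OpenAI Node SDK and an OPENAI_API_KEY in the environment (model and prompt are illustrative, not from the post):

    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Send the identical prompt twice in fresh requests; at the default
    // temperature the two completions will usually differ.
    async function sampleTwice(prompt: string) {
      for (let i = 1; i <= 2; i++) {
        const res = await client.chat.completions.create({
          model: "gpt-4o-mini",
          messages: [{ role: "user", content: prompt }],
        });
        console.log(`run ${i}:`, res.choices[0].message.content);
      }
    }

    sampleTwice("Name three surprising uses for a paperclip.");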
I launched Book Digest (AI book summaries) on Product Hunt a few days ago. Clear feedback: summaries were too short (~800 words). People expect Blinkist-level depth (2500+ words). I spent 2 days debugging OpenAI JSON parsing, Prisma database persistence, and token limits, then regenerated 450 books overnight with an improved AI prompt. Result: 2-3x deeper summaries with detailed chapters, insights, and action items. Demo (no signup): https://book-digest.com/books/6c8e5031-1c55-4bdd
Hi HN, I built Jsiphon to solve a common frustration with LLM streaming: you ask for structured JSON output, but can't use any of it until the entire stream finishes. If you've used JSON mode (OpenAI, Anthropic, etc.), you've hit this — you want {"answer": "...", "sources": [...]}, but JSON.parse() fails on every incomplete chunk. LLM responses are inherently append-only (tokens arrive left to right, never go back), so Jsiphon leans into that with three
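The underlying problem is easy to reproduce; a minimal sketch of the naive retry-parse baseline that Jsiphon improves on (this is not Jsiphon's API, just the behaviour described above):

    // Accumulate streamed chunks and try to parse after each one.
    // JSON.parse throws on every incomplete prefix, so nothing is usable
    // until the final chunk arrives -- exactly the frustration above.
    function makeCollector() {
      let buffer = "";
      return (chunk: string): unknown => {
        buffer += chunk;
        try {
          return JSON.parse(buffer); // succeeds only once the stream is complete
        } catch {
          return null; // incomplete JSON: no partial data available
        }
      };
    }

    const push = makeCollector();
    console.log(push('{"answer": "42"'));  // null
    console.log(push(', "sources": []}')); // { answer: "42", sources: [] }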
I was spending more time orchestrating Claude Code and Cursor than actually coding. Run command → wait → check output → repeat. So I built v16: persistent AI agents that work autonomously on my laptop.
- Each agent: ~40MB Go process
- Chat via Telegram (@devops, @research, @monitor)
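The Telegram side of a setup like this is typically a long-poll loop against the Bot API; a rough sketch of that pattern (v16 itself is Go, and its actual implementation surely differs):

    // Minimal Telegram long-poll loop: fetch updates, hand text to an agent,
    // reply in the same chat. BOT_TOKEN is assumed to be set in the environment.
    const API = `https://api.telegram.org/bot${process.env.BOT_TOKEN}`;

    async function poll(handle: (text: string) => Promise<string>) {
      let offset = 0;
      while (true) {
        const res = await fetch(`${API}/getUpdates?timeout=30&offset=${offset}`);
        const { result } = (await res.json()) as { result: any[] };
        for (const update of result) {
          offset = update.update_id + 1;
          const msg = update.message;
          if (!msg?.text) continue;
          const reply = await handle(msg.text); // route to @devops, @research, ...
          await fetch(`${API}/sendMessage`, {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({ chat_id: msg.chat.id, text: reply }),
          });
        }
      }
    }

    poll(async (text) => `agent received: ${text}`);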
Fun little side project I built after learning about circuit bending in cameras for intentional glitch effects. It's a browser-based camera toy where you "rewire" CCD pin pairs and turn knobs to get different glitch artefacts in real time, which you can capture as photos. I had fun learning to simulate different pin modes - channel split, hue/phase shifts, horizontal clock delays, colour kill etc. Here are some photos taken: https://glitchycam.com/gallery I intentionally leaned toward
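Channel split, the simplest of those pin modes, can be approximated on a canvas; a rough sketch of the idea (the site's real simulation of CCD timing is presumably more involved):

    // Approximate a "channel split" glitch: sample the red channel from the
    // right and the blue channel from the left, leaving green in place.
    function channelSplit(ctx: CanvasRenderingContext2D, shift: number) {
      const { width, height } = ctx.canvas;
      const src = ctx.getImageData(0, 0, width, height);
      const dst = ctx.createImageData(width, height);
      for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
          const i = (y * width + x) * 4;
          const r = (y * width + Math.min(width - 1, x + shift)) * 4;
          const b = (y * width + Math.max(0, x - shift)) * 4;
          dst.data[i] = src.data[r];         // red shifted one way
          dst.data[i + 1] = src.data[i + 1]; // green unchanged
          dst.data[i + 2] = src.data[b + 2]; // blue shifted the other way
          dst.data[i + 3] = 255;             // fully opaque
        }
      }
      ctx.putImageData(dst, 0, 0);
    }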
I am a pre-tenure researcher in theoretical quantum physics. I am looking for nuanced opinions on the longevity of pure theory roles in STEM due to the acceleration in AI capabilities. I use AI almost daily in my work: for helping me write code to test new ideas, for doing literature reviews to scope out whether an idea I have has been done before, and (probably most worryingly) to help come up with ideas for proofs. I also use it to help restructure grant applications between different format re
I wanted a way to prototype an agent and have it serving requests in minutes. InitRunner is a YAML-first platform where one config file gives you a working agent with RAG, memory, and an API endpoint.

    apiVersion: initrunner/v1
    kind: Agent
    metadata:
      name: acme-support
      description: Support agent for Acme Corp
    spec:
      role: You are a support agent for Acme Corp.
      model:
        provider: openai
        name: gpt-4o-mini
      ingest:
        sources:
          - ./docs/*/.md
          - ./know
If you were dropped into a coding environment where the underlying model was hidden (GPT-x, Claude, Gemini, Grok, etc.), do you think you could reliably tell which one you were using, or at least which family? In other words... after all this vibe coding, could you identify the model strictly off vibes? If yes, how long would it take you to be confident? And what constraints would you need for the test to be meaningful (i.e. familiar codebase vs greenfield, real bugs vs toy tasks, time-boxed, language
I got tired of context-switching to write commit messages and PR descriptions, so I built gut, a CLI that uses AI to handle the boring parts of git workflows. Examples:
- gut commit → generates a commit message from the staged diff
- gut pr → generates a PR title and description
- gut review → AI code review of your changes
- gut find "login bug" → finds commits by vague description
- gut stash → stash with an auto-generated name
It focuses only on git operations, so responses
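The heart of a command like gut commit is small; a hedged sketch of the general approach (not gut's actual code), assuming the OpenAI SDK and a staged diff:

    import { execSync } from "node:child_process";
    import OpenAI from "openai";

    const client = new OpenAI();

    // Read the staged diff and ask the model for a one-line commit message.
    async function suggestCommitMessage(): Promise<string> {
      const diff = execSync("git diff --staged", { encoding: "utf8" });
      const res = await client.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [
          { role: "system", content: "Write a one-line conventional commit message for this diff." },
          { role: "user", content: diff },
        ],
      });
      return res.choices[0].message.content ?? "";
    }

    suggestCommitMessage().then(console.log);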
I built AI Usage Tracker, an iOS app that warns you before AI subscription limits cut you off mid-session (e.g. 5-hour windows, weekly caps). I hit this daily while coding: I’d be deep in a session and suddenly hit the cap. Dashboards exist, but they’re not glanceable and there are no practical alerts/widgets. It supports multiple providers on a single screen: Anthropic, OpenAI, MiniMax, Z.ai, Kimi, Codex. Features:
- 5-hour window + weekly status (simple gauges). Makes it easy to plan your workloa
Hey HN, I don’t know who else has the same issue, but: textbooks often bury good ideas in dense notation, skip the intuition, assume you already know half the material, and get outdated in fast-moving fields like AI. Over the past 7 years of working in AI/ML, I filled notebooks with intuition-first explanations of maths, computing, and AI concepts, grounded in real-world context with no hand-waving. In 2024, a few friends used these notes to prep for interviews at DeepMind, OpenAI, Nvidia, etc. They all got
I got frustrated paying $60/M tokens for reasoning queries when a $0.80/M model gives comparable results for most of them. So I built Komilion, a model router that classifies each API request and routes it to a cheaper model that fits.
- Drop-in replacement for the OpenAI SDK (change one line: base_url)
- Each query gets classified (regex fast path + lightweight LLM classifier) and matched against ~390 models
- Three tiers (Frugal/Balanced/Premium) to control the quality-cost
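The drop-in claim comes down to pointing the SDK at a different host; a sketch, with a placeholder router URL and a hypothetical "auto" model name (neither is documented Komilion behaviour):

    import OpenAI from "openai";

    // Same SDK, same calls; only the base URL changes, so the router can
    // classify each request and forward it to a cheaper model.
    const client = new OpenAI({
      apiKey: process.env.ROUTER_API_KEY,
      baseURL: "https://api.example-router.com/v1", // placeholder endpoint
    });

    const res = await client.chat.completions.create({
      model: "auto", // hypothetical: let the router pick the tier/model
      messages: [{ role: "user", content: "Summarize this ticket..." }],
    });
    console.log(res.choices[0].message.content);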
Inspired by the Million Dollar Homepage, this is the Million Dollar Chat. People fill the chat's one-million-character brain, one character at a time. The Million Dollar Homepage of the AI age. My initial design used one million tokens, but I quickly discovered that tokens are not created equal, which made it very difficult to reason about. Eventually, I settled on one million characters. The chat has a few different capabilities; for example, you can ask it to navigate to a position for you (e.g:
I built a tool that scans AI platforms with buyer questions relevant to your domain and shows you whether they mention you or not. Enter your domain: it generates queries based on your space, sends them to ChatGPT, Perplexity, and Google AI, then scores you out of 100 based on how often you show up in the responses. The part I think is actually useful: it doesn't just tell you the problem. It shows you which queries you're missing from, why, and gives you fix pages (structured content de
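The scoring step is presumably straightforward; a sketch of the obvious approach (assumed, not taken from the tool):

    // Naive visibility score: the fraction of query responses that mention
    // the domain, scaled to 0-100.
    function visibilityScore(domain: string, responses: string[]): number {
      const hits = responses.filter((r) =>
        r.toLowerCase().includes(domain.toLowerCase())
      ).length;
      return Math.round((hits / responses.length) * 100);
    }

    console.log(visibilityScore("example.com", [
      "Top tools include example.com and others...",
      "I'd recommend a competitor here.",
    ])); // 50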
Hey all, I built an open-source tool that lets you give an Android phone a goal in plain English. It reads the accessibility tree, sends the UI state to an LLM, executes actions via ADB, and loops until the task is done. The core loop: dump accessibility tree via uiautomator → parse and filter to ~40 relevant elements → LLM returns {think, plan, action} → execute via ADB → repeat. Some technical decisions worth noting:
- Primary input is the accessibility tree, not vision. Vision (screenshots + mul
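A skeleton of that core loop, heavily simplified (the adb and uiautomator commands are standard; the element filtering is omitted and the prompt is a stand-in):

    import { execSync } from "node:child_process";
    import OpenAI from "openai";

    const client = new OpenAI();

    type Step = { think: string; plan: string; action: string };

    // One iteration of the loop: dump the UI, ask the model, run the action.
    async function step(goal: string): Promise<Step> {
      // Dump the accessibility tree on-device, then read it back over adb.
      execSync("adb shell uiautomator dump /sdcard/ui.xml");
      const ui = execSync("adb shell cat /sdcard/ui.xml", { encoding: "utf8" });
      // (The real tool parses and filters this down to ~40 relevant elements.)
      const res = await client.chat.completions.create({
        model: "gpt-4o-mini",
        response_format: { type: "json_object" },
        messages: [
          { role: "system", content: "Return JSON {think, plan, action}; action is one adb shell command." },
          { role: "user", content: `Goal: ${goal}\nUI dump: ${ui}` },
        ],
      });
      const next = JSON.parse(res.choices[0].message.content ?? "{}") as Step;
      execSync(`adb shell ${next.action}`); // e.g. "input tap 540 1200"
      return next;
    }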
Hi everyone, I am Vincenzo and I'm working on PolyMCP, an open-source framework that not only exposes Python functions as AI-callable MCP tools but also lets you orchestrate agents across multiple MCP servers. The idea: instead of rewriting code or wrapping every function with a special SDK, you can:
1. Publish your existing Python functions as MCP tools automatically
2. Spin up a UnifiedPolyAgent that coordinates multiple MCP servers
3. Ask your agent to perform complex workflows spanning diff
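For readers new to MCP, exposing a function as a tool looks roughly like this in the official TypeScript SDK (PolyMCP does the Python equivalent automatically; this just shows the shape):

    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    // A plain function exposed as an MCP tool that any MCP client can call.
    const server = new McpServer({ name: "demo-tools", version: "1.0.0" });

    server.tool(
      "add",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => ({
        content: [{ type: "text", text: String(a + b) }],
      })
    );

    await server.connect(new StdioServerTransport());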
I built devday because I use multiple AI coding tools (OpenCode, Claude Code, Cursor) and wanted a single command to see what I actually accomplished each day. It reads local session data, cross-references it with git commits, and optionally generates standup-ready summaries via OpenAI or Anthropic. Everything runs locally: no data leaves your machine unless you opt into LLM summaries. Install with npm install -g devday. Currently supports OpenCode, Claude Code, and Cursor on macOS. Would love feedba
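The git half of a tool like this is nearly a one-liner; a sketch of how the day's commits might be collected (an assumed approach, not devday's code):

    import { execSync } from "node:child_process";

    // Collect today's commits by the current user, one subject line each,
    // ready to be cross-referenced with AI-session logs or summarized.
    function todaysCommits(repo: string): string[] {
      const out = execSync(
        'git log --since=midnight --author="$(git config user.name)" --pretty=%s',
        { cwd: repo, encoding: "utf8" }
      );
      return out.split("\n").filter(Boolean);
    }

    console.log(todaysCommits("."));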