# Ask HN: What is your (AI) dev tech stack / workflow?

- Source: Hacker News front page (Chinese translation by buzzing.cc)
- Author: dv35z
- Published: 2026-06-06 04:43
- AIHOT score: 46
- AIHOT link: https://aihot.virxact.com/items/cmq1eg8uo0f1rsltr36l59mr2
- Original link: https://news.ycombinator.com/item?id=48413629

## AI Summary

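A Hacker News discussion thread asks developers which AI development tech stacks and workflows they use; it had 101 upvotes at the time of summarization.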

## Main Text

Ask HN: What is your (AI) dev tech stack / workflow?

169 points by dv35z 9 days ago | 135 comments

Hello, happy Friday!

I am looking to do some in-person "developer boot-up" workshops, and seek your suggestions for "modern tooling".

The background of the participants ranges from motivated newbie ("I heard you can make your own app with AI!") to existing software developers who want to get up to speed on modern development for the purposes of building stuff, and getting jobs where AI tools are being used.

For those who have been doing software development & "tech" lately using AI tools, and feel they have a great setup & flow - I would love to hear what your dev setup is, what tools you're using and what workflow has been working best for you (and your team).

// My Background

I have been programming / building for 20+ years, but have not been using AI tools much (aside from hitting up LLM APIs on a few projects).

I value open-source, and aim for long-term quality and supportability. Techniques like test-driven development (TDD), using proven / well documented tools, customer-centric development (often pairing with clients), make it easy to do the right thing.
If you are familiar with Pivotal Labs, agile & XP - that's the style.

These are some of the upcoming use-cases for the workshop, and my own personal "IT backlog":

- Create a static "one pager" personal/professional website
- Set up a blog / static site generator (Pelican), create a simple but stylish theme
- Create a simple web app / backend API (FastAPI) tool - form-based calculator, convert X data to PDFs, etc.
- Figure out how to have SyncThing autosync the home folder of 3 Linux computers in the house
- Backup & archive the photos & video from my iPhone

// Tech stack I am currently using:

- Operating system: Linux Mint Debian (LMDE)
- Editor: VSCodium
- Code: Python, HTML/CSS
- Server platform: Amazon AWS

I am guessing that most workshop participants will be using MacBooks & Windows computers - but a few are on Linux, as I recently did a "Linux install party".

I haven't used any "AI harnesses", agents or anything like that - but curious what's a good starting point to take best advantage of these tools.

Thanks for sharing the knowledge!

// JRO

**coffeecoders** (9 days ago)

I've shifted to a "slow code" approach with AI, treating it more like a design partner than a code generator.

I mostly do TDD with TypeScript. I write the test, write the code myself (sometimes with the help of an LLM), and then hand it to the LLM. Instead of asking it to write things for me, I use it to find edge cases, check if it's leak-proof, and verify efficiency.

For architecture questions, I debate with it for a while. I almost never ask for code without conversing 4-5 times first to push back on its assumptions. It's the best rubber-ducking partner I've had.

Personal plug: I wrote more about why/how I use AI to write slow, better code on my blog: https://nabraj.com/blog/ai-write-slow-better-code

**sermakarevich** (9 days ago)

I have been using a Spec Driven Development approach, implemented as a Claude Code plugin, since Feb for all mid + size tasks.
The idea is to write detailed specs first using agent help doing research and interviewing, decompose the task into smaller subtasks, write a detailed spec for each task, and implement each task separately. You can restart the session after every step in the workflow and after each subtask implementation since all requirements are materialized in specs. This helps to keep session context focused on a single task at a time, improve adherence, reduce cost and allow implementing bigger tasks that are hard to implement with pure plan + code.

Discussion on hn: https://news.ycombinator.com/item?id=48231575

Repo: https://github.com/sermakarevich/sddw

Slides: https://docs.google.com/presentation/d/1SjKXF7hkoqyiN9-3tBGY...

**vladgur** (8 days ago, reply)

I'm overwhelmed with the variety of SDD tools out there.

Github spec kit, spec-kitty, symphony, GSD, this.

How do people decide on the framework other than trying them all?

p.s. found this mind-boggling list of them all https://github.com/cameronsjo/spec-compare

**sermakarevich** (8 days ago, nested reply)

I looked at most of those, including kiro and tessl. I was an early user of GSD when it was suitable for mid+ size projects. Over time GSD has grown into a beast which is suitable for huge + size projects only, producing gigantic specs and burning too many tokens for most tasks. So I decided to create my own, with the set of steps I need and the specs I want.

After a few presentations of sddw to different companies, the most important conclusion was that the sdd plugin should be customizable.
It should fit the typical size of tasks/features you are working on, specs should fit your requirements, and the set of steps can be different.

So I created claude code workflow (ccw), which allows compiling a custom version of the workflow on top of the sdd approach: https://github.com/sermakarevich/ccw

**dostick** (7 days ago, nested reply)

I think any add-on SDD dilutes the context; you should use anything that is built-in as much as possible rather than an external skill. When Claude (or any other LM) begins to ignore the rules, ask it and it will tell you that having too much in context will make it go bad faster.

**bohdanstefaniuk** (8 days ago, reply)

Genuine question, I'm trying to adopt specs and AI DLC in my team so we can use it as an enhancement and improve our development, and the biggest pain right now for us is managing all those md artifacts.

I'm curious how you manage them. Do you preserve them for the future or delete them as soon as the task is accomplished? If you're deleting those artifacts after the job is done - do you summarize those specs into the Jira ticket or whatever system you use?

**johntash** (8 days ago, nested reply)

If you already use jira, maybe try putting the specs directly in the jira ticket? And/or in a confluence page?

I'm not sold on SDD yet, but part of that is because we already have decently detailed tickets and designs.

**tracker1** (9 days ago, reply)

Similar with a todo.md in the project which outlines work to be done.. this gets combined with developer and/or user documentation which outlines features and how they're expected to work. I'll iterate with the agent on the planning and documentation several times until the documentation and plan look good.
The only gotcha I've had a couple times is I'll have the testing and spec before implementation and sometimes the agent will try to edit tests rather than making the implementation match spec/tests.

I'm definitely babysitting the process more than vibe coding, and review each cycle's results. As for languages, mostly TS/JS and Rust with a bit of C# here and there depending on what I need. Claude Code's Opus does a pretty good job with Rust, so for anything personal, I've just gone with it.

Work has been limited to working out specific problems, or a small utility/library that I can pull in, but on my own system, separate from work resources.

**sermakarevich** (9 days ago, nested reply)

One additional benefit that we get from sddw is that the agent drives the spec creation using the scenario we put into the command/skill. It does the research local/web, it asks the operator questions and later asks for confirmation of each block in the spec.

**alok-g** (8 days ago, reply)

I would love to see examples of the various types of specs written, as in the actual texts of the specs.

It would also help to understand how changes to specs are handled. Is the agent given both old and new versions to figure out the updates needed?

Thanks.

**sermakarevich** (8 days ago, nested reply)

Here are some - I used sddw to create:

- chunker - app to get smart slices from text and organize them in a hierarchical LLM/Obsidian wiki.
There were two features implemented using sddw and 15 subtasks:
  - https://github.com/sermakarevich/chunker/blob/master/.sddw/c...
  - https://github.com/sermakarevich/chunker/blob/master/.sddw/m...
- ccw (claude code workflow - plugin to compile generic claude code workflows based on sdd approach) - https://github.com/sermakarevich/ccw/blob/main/.sddw/claude_...

Btw you can use ccw to create your own custom version of sddw quite fast - with the specs format and sequence of steps that suit you best.

**madarco** (8 days ago, reply)

I do the same, and faced the issue that claude/codex lose context when doing subtasks (and subagents don't have plan mode).

So I've built Agentbox to be able to launch from claude/codex multiple VMs with claude/codex (can also mix). The parent agent watches for prompts and questions, enforces /review and /simplify, and checks that the sub agents file a PR and wait for bugbot comments etc.

This way the parent agent running in a /goal doesn't lose context, enforces a good workflow, manages the backlog and parallelizes/merges back the work on the main repo.

https://github.com/madarco/agentbox MIT license

**madduci** (9 days ago, reply)

Naive question: how much time do you spend doing so vs. doing the actual work yourself?

**sermakarevich** (9 days ago, nested reply)

I have been building AI agents full time since Nov 2024. I stopped coding completely around mid summer 2025, using Cursor at that time. When you build a platform-like application, and have a few plugins already, an AI coder can create the next one in a way you won't recognize which one was written by you.

At the end of 2025 I switched to Claude Code. Compared to Cursor this opened a different level of automation, including for example the possibility of running swarms of agents: https://news.ycombinator.com/item?id=48407998 using subscription limits.

So I spend all my time understanding how to squeeze everything possible from AI rather than from myself.
AI scales, I do not.

**madduci** (8 days ago, nested reply)

What about quality? How do you assure that everything works without mishaps?

**sermakarevich** (8 days ago, nested reply)

TDD and specs help

**aabdi** (9 days ago)

There's lots of ways. You have to upskill through the stages IMO. Write code, write w/ agent, write w/ multi agents, write w/ orchestrators.

My way is to just run a giant AI agent factory engine and make the agents full flow do everything. (plan long term, write prd, task, review).

Here's ~4000 commits in last month as an example, i have about ~10k ish including private/work stuff? https://github.com/portpowered/you-agent-factory/commits/mai...

The premise when you get to full automation generally is you go full industrial engineering:

1. watch overall flow, improve process via continuous improvement
2. work via checklists and gates.
3. replace process with mechanisms as much as possible (code > agents)
4. optimal throughput is continual testing and iteration (CI, CD), coverage, full e2e tests, mock everything, general best practices really.

decent blog: https://openai.com/index/harness-engineering/

general points:

- build lots of linters
- document literally everything (arch, prd, best practices in repo)
- too many agents at the same time makes lots of code conflicts, so need to consider architecture of code how to maximize concurrency.

**alex_c** (9 days ago, reply)

Genuinely curious - in your case, where do the requirements for what needs to be built come from?

In every project I've touched, business requirements are always the bottleneck - so I've never been able to wrap my head around what kind of requirements can be fed into a setup like this at high enough volume to justify it.

**aabdi** (8 days ago, nested reply)

myself?
senior engineer role is basically figuring out the business problem and getting funding.

IDK i've always been able to get senior managers/managers to give me lots of leeway to do wtv i want and figure out problems. At least in most places i've worked at.

i work on platform stuff mostly, so there's always a large need for stuff. the backlog alone b4 all the agentic stuff was roughly 20-30x the capacity of any team (per year). 80% of requests were usually SVP goals and we'd just outright drop them due to lack of capacity or request HC transfer/away teams.

i.e. internal improvements alone were always massive (not gonna talk about prod code/cross team organization).

1. we need better test coverage for x,y,z
2. we need to be able to eval the long term costs (XX growth YOY, how to reduce)
3. the internal system streaming is inefficient, need to eval the alternative systems
4. we need better ops handling/management automation for issues and sev3s/sev2s. i.e. scaling, anomaly analysis, bugs introduced, improved metrics, dashboards.
5. DX stuff needs better handling, people keep confusing themselves on how to onboard. better docs on how to onboard, automation,
6. teams x,y,z are fighting with each other bcz they don't have a good grasp on systems, improve internal docs on arch and interop
7. we need automation to be able to more easily test our systems in an adhoc fashion
8. there's no linters for API platform, leading to bad results and inconsistency.
9. we're seeing bugs in the code, but aren't appropriately manually testing after deployments. spawn 100 agents to do it, compile the results. do it every X days, feed the bugs back into the system.

i could go on and on. and this is one service, usually you own quite a few, and each one has their unique set of challenges.

**ianm218** (9 days ago, reply)

Have you been able to build anything substantial with AI factory itself?
I have done some of these experiments myself and found they often ended up being less effective than using the latest tools in harnesses like claude code.

But curious if you've found it to be a big unlock. I have been doing some of this industrial engineering myself.

**aabdi** (8 days ago, nested reply)

well work wise it's usually for adjacent tooling. It unblocks other things, but like actual prod code, i'm always a little skeptical.

For example we had this problem where we had to take in customer inputs for requests and calculate out the projected downstream TPS. This is fairly complex since we run a query parser/orchestrator.

This is expensive to write myself or to have engi's do it, but the scaling algorithms are all there and we have excel sheets for spreading out overall costs.

so then all that was needed was basically to write out a big spec of the reqs - give it the docs/parser code/excel sheets, then just have it span out the pieces as a sequential checklist. 1. CI/OPS 2. docs 3. test infra 4. incremental build out in phases to chain it all together.

**dempedempe** (9 days ago)

I'm a contractor (AWS and web apps), so I get a lot of sometimes-ambiguous requests. I have a five-part workflow via Claude/Codex skills: discovery->implementation planning->implementation->verification->review

Each phase writes to `./.agents/plans/{plan-name}/` in the project root. All in Markdown. That way, the flow is agent-agnostic. Each phase artifact is immutable after being written.

More details:

First, I put all the information that I have (documents, client statements, any code, my own summary, etc.) into a document, which I pass to the discovery planning skill.

The discovery phase more formally defines the project in terms of functional requirements, non-functional requirements, constraints, risks, and assumptions.
This might take a few passes to get everything nailed down.

After that, I begin an implementation planning phase using the discovery artifact (`discovery.md`). We define the work in terms of phases, where each phase has various tasks associated with it (all checkboxes). Again, this usually requires a few passes.

After that, I have a clear idea of the work needed and can send an estimate to the client. Or, if it's a personal project, get started actually building it. I have another phase for actual implementation.

Verification and review are similarly defined. They can be done by any agent.

**lagrange77** (8 days ago)

I'm using VSCode with Github Copilot (Business) in Agent and Ask mode with varying LLMs, depending on the complexity of the task. For a specific task, i create a markdown file with the requirements in tandem with the Agent, and manually edit it where convenient. And then i let the Agent implement one feature or work unit after another, while micro managing it and making sure that i understand what it has written (not for really trivial stuff, where i don't care). This gives me a huge productivity boost, while the level of being in the loop is still bearable for me.

TBH, i'm wondering why i'm the only one saying he's using VSCode with GH Copilot. Isn't this the most frictionless tooling for an 'agentic engineer'? I get state-of-the-art LLMs while it's fully integrated into my IDE.

I still don't fully get what Claude Code or GH Copilot CLI would bring beyond that, since the Copilot plugin does also have CLI access.

**outside1234** (8 days ago, reply)

You aren't the only one. There are many people using Github Copilot.

Github Copilot CLI is for automation. For example, in a Rust project, I use it to audit for security, documentation gaps, and test issues crate by crate.
This can take an hour and I look at the suggested stories it writes afterwards to triage for implementation.

**lagrange77** (8 days ago, nested reply)

And does Copilot CLI make that more convenient than letting the Copilot Plugin in Agent mode do it? Maybe because of swarms of agents?

**Charlieholtz** (8 days ago)

I'm biased (I'm the creator) but I use Conductor every day. I've recently switched to Opus 4.8 (fast mode always on) as my default model but swap in GPT-5.5 quite a bit for reviewing Opus's work.

My flow is something like:

- Create a new workspace for a specific bug/feature
- Ramble into the input box. I use a goose neck microphone and Spokenly (with Parakeet as the model) for local speech-to-text
- Hit enter! I don't use plan mode.
- Ask for a review from a different model (⌘⇧R)
- Create a PR and run a /babysit loop
- Run a local version of the app and click around, do a human review. If the LOC are negative we don't pay much attention to the code. If it's positive we do
- Merge!

I often have 3-5 workspaces running like this. There's lots of room for improvement but it's been working quite well for me.

**phildenhoff** (8 days ago, reply)

Hey -- we've loved using Conductor here at Digits. I've been using it since February and only recently swapped to using Orca for Remote SSH and perf improvements. Looking forward to your general release of Conductor Cloud!

I've been wondering, since you're building desktop software, how do you get AI to test your changes? Boot the whole app? Run the frontend/UI with a mock backend?

**browningstreet** (8 days ago)

I’m a solopreneur working on a fairly large number of independent projects.

I use Claude Code to initiate a project using Sahil’s ME skill pack and write a high-level spec to a Linear ticket. If/when I’m ready to work on that idea, I convert it to a project, decompose the top ticket into more issues.
I also have Claude Code add deepseek, sonnet or opus tags to each issue based on which is most appropriate for the issue.

Then I fire up opencode and go through each ticket. Plan, then build. Every N issues I switch to opus and have it review the work done.

Enhancements and bug reports get filed in the project. Repeat as necessary. I work pretty sequentially. I’m quite happy with the operational success of my projects. They’re all being used.

I’ve expanded this process a few times but this baseline is where things shrink to. Sometimes I use open chamber but opencode cli works for me.

**alok-g** (8 days ago, reply)

Sahil’s ME skill pack seems to be this: https://github.com/slavingia/skills

**athrowaway3z** (8 days ago)

Assuming you have a SOTA model - the thing I'd teach them is minimalism.

- Minimal tooling
- Minimal system prompt
- Folders + files + text

AI driven development has turned the whole development job into knowing what questions to ask + complexity reduction.

First ask the model how to do something / what options there are to do something - not just to do something. Creating moments to teach that is a challenge in itself.

After it's answered go tell it to do the thing.

If they're serious though, the next step is to teach them to always ask if there is a simpler alternative with fewer dependencies.

Anything with a too magical UI is going to give them the wrong 'model' in their mind on how to think about the tool.

A bit of a hidden aspect many people seem to miss: the tone you take with the model is absolutely critical. Asking a bunch of psychology questions before having it write javascript or propose a tech stack is going to get you different results.

Finally, the semi obvious hack (which something like claude will do automatically when in team mode) - have the model talk to another instance of itself.
The model can translate your ramblings into coherent specs in the right tone, and feeding that back into itself in a new session gets you the good results. It's also part of why "first write a plan" works: it fills the context with the right tone and clear instructions.

**RivoLink** (9 days ago)

I use Claude Code, flow for reusable skills/prompts, and leaf for reading Markdown comfortably in the terminal.

- Claude Code
- flow: https://github.com/RivoLink/flow
- leaf: https://github.com/RivoLink/leaf
- GNOME Terminal

It's a pretty terminal-first workflow.

**sajithdilshan** (9 days ago)

At the moment I predominantly work with Python and hence PyCharm as the main IDE. However, I've built this plugin https://plugins.jetbrains.com/plugin/31117-agent-cli to render agentic CLIs as an editor tab in PyCharm and also some notification hooks so I don't have to switch windows and it's easy to jump around the code while the agent is doing its work.

Besides that I have a collection of custom skills (plan for JIRA tickets, github PR creation, code review, etc), a set of MCPs (most are for internal tooling) and most of the time I use Claude Code.

**dansult** (6 days ago, reply)

Will try this out, I'm just running claude in terminal panes stacked on the right hand side. Cheers!

**vb-8448** (8 days ago, reply)

Thanks man, I was looking for something like this for a while!

**sajithdilshan** (8 days ago, nested reply)

You’re welcome

**nimonian** (8 days ago)

Ghostty with Claude Code. That's pretty much it.

For each new feature, I open a worktree, spar with Claude to work up a gherkin spec with @todo on each story. Each agent pushes commits to a WIP PR in GitHub where I review and leave comments or questions. Once the spec is done we mainly interact on the PR. @todo becomes @wip and @done as the agent progresses.
I really like gherkin for agentic engineering, it's very clarifying.

I have about 2-4 agents running at a time. Large test suite, linters and formatters enforced on push.

**MaieuticMD26** (8 days ago, reply)

same

**moezd** (8 days ago)

1. Slow code. Let the agent(s) discover and plan, then launch the swarm on the confirmed implementation steps.
2. Use LSP. If nothing works, usually you can connect it via MCP. I think all coding agents support this by now.
3. Add hooks if you want to stop the coding agent from doing something nasty, or hallucinate and give incomplete output. TDD and any verification tool you can think of are your friends.
4. Skills have been a bit of hit and miss for me, especially with less capable models. So are plugins. If you know how they work, please explain to me.

That way the model doesn't go about "let me grep this specific pattern across a million files again and again" loop and burn your entire weekly budget by Monday at noon.

I'm also curious if anyone has done something cool with memory and context management that doesn't require a custom llama.cpp implementation. I also don't have the heart to let the swarm do it end to end, because LLM generated code with less capable models really does smell, no amount of spec driven or Claude.md filled style guidelines seem to fix it.

**c0rruptbytes** (8 days ago, reply)

There's `honcho` for memory, i'm starting to play with it now, but I feel like I've seen a lot of projects pop up for it

**delduca** (9 days ago)

MacOS, Ghostty, Tmux, Neovim, Workmux[1], OpenCode/Claude Code, and lots of markdowns.

1 - https://github.com/raine/workmux

**era86** (9 days ago, reply)

Throw tmux in and this is my exact setup as well.
I'm looking into finding ways to orchestrate entire workflows across multiple agents in a harness-agnostic way (locally at first).

**delduca** (9 days ago, nested reply)

I forgot to say, I use Tmux + Workmux

https://github.com/raine/workmux

**altuzar** (8 days ago, nested reply)

Cmux is rather fancy if you still like a UI. Used to love iTerm but Cmux was perfect for tons of ClaudeCode/Codex instances.

**RivoLink** (9 days ago, reply)

Maybe you're interested in https://github.com/RivoLink/leaf. If so, I'd appreciate your feedback.

**yogibear678142** (9 days ago)

I type in a text box and tell the AI what to do. Yea my tooling is just a text box. Like Google search is just a text box.

**nimblist** (4 days ago)

Claude Code running on a dedicated debian lxc with tmux so I can pick sessions up from any machine.

The key thing for me is to ensure that claude is constantly maintaining its own project documentation so that it's not reliant on any one session context. I've also set up custom commands to set up a clearly defined plan md file and to incrementally step through it updating completion as it goes.

From my perspective it's important to treat it as another potentially fallible team member. I have extensive unit tests, automation tests and SonarQube carries out analysis on all code merged to main.

Finally github actions to provide a proper ALMS with a non production environment for e2e testing.

**jpeeler** (8 days ago)

I'm currently using herdr[1] to handle/supervise multiple agents (with some patches I need to try to upstream) along with Nono[2] for sandboxing. This sandboxing approach avoids use of a microVM, which lets me use tooling I already have installed inside the sandbox.
The downside is getting all the policies correct as it seems every project needs some new type of access, though Nono does try to make policy writing easy.

I've been considering switching my approach to using a microVM through microsandbox[3]. The pro of this approach is you can essentially skip the policies and rely on the security of the VM boundary. The negative is that now you've lost all your installed tools, so you need to either provision at runtime or build something (like an image) beforehand to match your dev environment.

I still don't know which is less maintenance. And while I think herdr is pretty well thought out, I do think something that works outside the terminal may be nicer.

[1] https://github.com/ogulcancelik/herdr

[2] https://github.com/always-further/nono

[3] https://github.com/superradcompany/microsandbox

**amdolan** (8 days ago, reply)

Thanks for sharing. I recently started using my homemade “competitor” to herdr, so it’s nice to compare against prior art.

What do you think could be nicer with a native app? More mouse or visual interactions? Modern design and gui?

**jpeeler** (7 days ago, nested reply)

The kitty developer is philosophically against multiplexers, but he also has concerns about performance (which I share somewhat as well). Herdr is not quite as fully featured as zellij/tmux, so that partially is biasing me. I think that a native app could be potentially nicer to avoid all those concerns, along with my setup (only in Linux really) of a tiling manager handling the windowing instead of that being part of the solution. A big part of the value add of Herdr is actually the monitoring of the Claude session and not so much the multiplexing reimplementation. One of my patches allows jumping to the next blocked/finished session so I can quickly give feedback or observe different agents.
I can envision a GUI solution for that workflow being something better than what a terminal easily allows.

**AndrewKemendo** (9 days ago)

I’m already doing this with my school (givedirection.com) and you’re gonna have a hard time nailing this down because no two setups are similar.

Especially along the range of newbie to expert it’s extremely variable and you’re not gonna be able to pick one that rules them all.

I would suggest you revamp your approach and have different courses for different types of people. I had to split my course into a basic and an advanced one and they are extremely different.

Even within the advanced course fairly simple stuff like hosting your own LLMs seems to really be a stretch for a lot of people.

**avgDev** (8 days ago)

All I use is codex right now.

I brainstorm with it, create documentation, and generate code. Then review, test and profit.

**abelkhadir** (7 days ago)

I use linux + claude code as the main driver.

Before i start a project, i first write scripts manually to give claude code an idea of what i'm trying to do.

I then use Claude design to design the full frontend in one shot. then use it manually to see bugs/stuff i don't like and re-prompt it to modify. i also give it a description of the scripts i wrote so it has a better idea of what i want.

Then i use claude code for the backend. depends on task, i mostly use python, golang or typescript.

No IDE. Terminal only. git + editor + agent is enough.

i used this to code my entire project kyroai.dev. it turned out impressive, in under a week.

I also believe you should be the number one user of your app to actually make it bug free.

Before this i had to type code similar to you, and i say ai code is way faster, for example kyro would have taken me a month for the mvp. but with ai i just prompt. while it's working, i browse the app, find newer bugs/changes, re-prompt again, and in under a week the project is done.

All of this with a 20$ claude sub.
not even a 100$ plan

**Greenpants** (8 days ago)

I've dabbled a bit in GitHub Copilot using Claude Opus and Sonnet models via work, but I couldn't shake the thought that we weren't allowed to use this on any of our clients' codebases. Having been a fan of Ollama, I wanted to try something truly local.

First I tried OpenCode but they unexpectedly make external requests (!) even when using Ollama (I noticed when Ollama wasn't properly connected and I still got a title generated).

So I settled for Pi, but I strongly disliked the idea that the agent could, at any point, decide to delete files or exfiltrate .env secrets. So I created Picosa (https://github.com/GreenpantsDeveloper/Picosa), containerizing and sandboxing Pi, with firewall rules such that it could only ever reach the local network (for Ollama), scoped by just the current working directory, and nothing else. Combined with Qwen3.6:35b, it works surprisingly well, and I could ask it to improve itself when run on its own repository.

**mg** (9 days ago)

I wrote my own tooling around the raw LLMs:

I can tick files in Vim, those get concatenated into a prompt. Along with a feature request. Plus an instructions file that tells the LLM how to reply. Plus my general "rules for good code" file, plus one "rules for good code" file per language involved, plus a project specific overview file. The LLM then answers with a list of changes it wants to make to the code. My tooling then applies those changes and I look at them via "git diff". If I like it, I commit. If not, I change one of the prompts and start the process again.

Instead of replying with code changes, the LLM can also decide to request more files. I wrote a little DSL for that.

I described the beginnings of this workflow last July:

https://www.gibney.org/prompt_coding

Feels like an eternity ago.
I think I will write a new blog post this July and describe how the workflow has evolved over the past year.

stavros, 9 days ago:

I use OpenCode with a three agent combo (architect, developer, reviewer), as I've found it's crucial that different models write the code vs review it.

More details here:

https://www.stavros.io/posts/how-i-write-software-with-llms/

calvinmorrison, 8 days ago:

I am working on a project/essay/thoughtsphere that is beautifully illustrated by this thread. My project is to help automatically take your patches/workflows and package and rebase on top of upstream using quilt so I can get the latest greatest fixes while keeping my notafork. Forks are expensive, patches are easy.

I think we're going to see much, much more variation in design, software and interfaces as the labor to produce them becomes trivial. Everyone can patch software to do what they want. Yesterday I had Claude rewrite xrdp to allow me to remote into my desktop session without having to deal with x11vnc. It lets me drop in, pick :0 or :1, auths with PAM and gets me in. What I have always wanted with xrdp that never worked quite right. I have patches for i3, and for vim, and for xpdf, and bash, and mocp, and all sorts of tools and scripts I wrote.

Anyway, here's the site essay I am working up but yeah:

Right now, programming is rapidly becoming not expert work. Soon we could all be running (I think this unironically) practically our own distros if we want. Total customization of the stack.

I really feel that one positive thing AI can do is drive labor costs down enough to allow personal choice in the software we use. We have open source software, but it's channelized and controlled by a few companies who fund projects! That might change too!

AI can simply one-shot a lot of small problems I have. Like reading unfamiliar codebases, finding the relevant function, and writing the delta.
The gap between "I want bash to do X" and "here's a patch" is shrinking fast. When that gap closes, a lot more people are going to start customizing their software, but we don't have a great wrapper for it yet.

The part that doesn't get easier is everything after. How many 'forks' exist on GitHub that people haven't had time to maintain, or worse, are being used in production with bugs? How much code have we lost out on because of that? Do forks really help us? I don't know. Does everyone want to use shitlab? I don't know.

Building the package. Getting it on your machine or out to the fleet. Keeping it there when upstream ships a security fix.

That's an infrastructure problem, not an AI problem. I needed a way to solve it now.

________ is that little bit of software infrastructure I need, built now, for the world where I am right about my bet.

Kuyawa, 9 days ago:

DeepSeek and Mecha-AI as CLI coding agent for general architecture [1]

Sublime Text and a DeepSeek plugin for file by file cosmetic fixes

Nothing else. With these tools I am building apps like never before, in minutes instead of months.

[1] https://www.npmjs.com/package/mecha-ai

blfr, 8 days ago:

I banged out a simple FastAPI endpoint/tool (along with dockerization and deployment) and a media-heavy Astro website (along with Cloudflare Pages publishing) in Google's Antigravity2.

https://antigravity.google/product/antigravity-2

Not really a recommendation since I don't have a good benchmark of these tools, but Antigravity's /grill-me feature, where it asks you a bunch of questions like a system/business analyst and gives you an implementation plan for review (and can actually change it further), is pretty cool, and it is certainly fit for what you intend.

Heard also good things about Zed and am testing it right now. So far I managed to...
edit a json.

https://zed.dev/

solumos, 9 days ago:

Something different that other folks might not have thought of: robust multi-environment infra deploy scripts that leverage terraform + AWS SSO.

I've found that converting stuff that's previously been very ops-cli heavy into very detailed skills has worked really really well.

I use Claude Opus 4.8 + Conductor as my daily driver.

gavinh, 5 days ago:

Even when I write a detailed specification and review the resulting diff line-by-line, I am unsatisfied with my comprehension and recall of the changes. I don't understand my systems as well as I used to (I don't think this is surprising; see the "generation effect"). I have been experimenting with extending code review with some new exercises intended to improve comprehension, at the cost of a little friction.

ramoz, 8 days ago:

Claude Code, Codex, Pi CLIs, all for varying levels of work. VS Code when needed.

I review agent messages, some specs/plans, and conduct local code reviews with Plannotator [1].

For skills, I have a bunch of custom ones for my own workflow, and for public skills I really only use the interrogate skill from cursor's lauren [2].

Key workflow stuff:

- Almost all work I do gets done in a git worktree.
- ghostty + Mac OS gives me all the organization I need for multi-agenting
- turn off all agent memory, this has only ever caused problems for me.

[1] https://plannotator.ai

[2] https://github.com/cursor/plugins/blob/main/pstack/skills/in...

gottagocode, 9 days ago:

Lead Dev for a security company with a very strict AI policy.

Mostly hand coded, using an agent in the browser (Claude / corporate ChatGPT account) when necessary. I am aware we will fall behind using this methodology and have advocated for change, but I suppose it comes with the territory.

ChrisLTD, 9 days ago (reply):

I don't think it's clear you'll fall behind.
Your competitors could very well be vibe coding themselves into messes they will never recover from.

c-smile, 8 days ago:

I've spent 2 weeks (2-4h per day) making a D language [1] version of the Sciter SDK [2].

Choice of AI "tooling" was by accident: I typed something like "how to define copy constructor in D for custom structure" in Microsoft's Copilot in the Edge browser, which gives context for AI.

The answer was good enough for me and so I went with it further.

[1] D language HQ: https://dlang.org/

[2] AI-Assisted Development with D Language, Creating Sciter SDK: https://terrainformatica.com/2026/06/05/ai-assisted-developm...

jjcm, 8 days ago:

> Setup a Blog / Static site generator (Pelican), create a simple but stylish theme

RE this one, I highly recommend doing image->code as the flow here. Codex's sites feature is doing this under the hood: it's rendering an image first with gpt-image-2, then building from it as a reference.

You can use gpt-image-2 directly for this, though if I can plug my own stuff, diffui.ai is exactly what I made this for. It'll make it easier to do multi-page flows with the same style, then you can hand off the designs to your agent, i.e. https://image.non.io/6e1f98ad-4c79-4735-9932-b0d5cca9be98.we...

Galanwe, 9 days ago:

I have a vibe coded script which creates a git worktree + zellij pane with a specific layout + a virtualenv per feature. "tmuxinator" style.

The zellij layout includes panes for OpenCode, a shell, a neovim, inotify tests, etc.

I cycle through the zellij sessions during agent prefills.

pss314, 9 days ago:

Stanford University offered the course "CS146S: The Modern Software Developer" in Fall 2025. Check it out if interested.
https://themodernsoftware.dev/

DenisM, 9 days ago (reply):

What is your impression of the course?

verdverm, 9 days ago:

OpenCode + their Go subscription.

Start with a nice batteries-included setup, read Anthropic's knowledge share, play and iterate, stay human in the loop.

Check out Dax Raad (behind OC) on the Pragmatic Engineer podcast. I think you will like his philosophies, I sure do.

kaycey2022, 8 days ago:

I just ask whatever model I'm using, mostly GPT these days, to do the thing. First I discuss a plan of action and then go back and forth with it to settle on one. This is usually a long process and for complex features may even take a day or two. And then once I am satisfied, I ask it to implement. I skim over the code; it often generates too much code for my liking. I sometimes ask it to change things that I don't like. This is almost always to do with bad tests or not using existing code paths. I just work on one thing at a time.

throwaway888abc, 8 days ago:

Recently ditched VSCODE completely and switched from development on local machine to remote "vps" cloud.

Current setup:

Zed + Terminal threads (love this!) + Remote machine

Devcontainers + Claude + Pi

[1] Zed https://zed.dev/

[2] Terminal threads https://zed.dev/blog/terminal-threads

[3] Pi https://pi.dev/

As sort of a byproduct, also replaced Alacritty + Zellij (I just don't have the need to use more, 3 weeks into the new setup).

anioko1, 5 days ago:

There are many good answers here. I created microcodegen, which can help scaffold the entire architecture for you to get started. Here it is: https://github.com/Anioko/spec-driven-development

dv35z, 8 days ago:

Thanks for the great discussion. I went through all the comments, and identified common tools (counting the references in the comments).
If you are interested in seeing the summary of this thread, check it out here:

https://codeberg.org/jro/Knowledge/src/branch/main/2026-06-0...

Contributions & improvements are invited and welcome - thank you!

mkw5053, 9 days ago:

Claude Code + very opinionated TypeScript. Try to push as much as possible as far left in the SDLC (types -> lint rules -> tests -> md) and try to improve the dev ex after every single PR.

moltar, 8 days ago (reply):

Please share your opinions. Thank you.

mkw5053, 9 days ago (reply):

I use these exact principles (which change):

1. No overengineering. Minuscule complexity. Always pick the smallest implementation that works. No speculative features, no defensive code for impossible cases, no premature abstraction.
2. Lean on tools, libraries, vendors, and existing internal patterns so we maintain the least complexity ourselves. Before any real implementation decision, default to discovery (official docs, recent trusted expert sources, etc).
3. Shift left in the SDLC. For every check, pick the cheapest deterministic mechanism upstream. Hierarchy: types, then lint, then DB constraints, then build-time checks, then deterministic CI, then tests we maintain.
4. Tests are not exempt from minimum-complexity discipline. Don't write tests for the sake of coverage. Tests must be few, complementary, and valuable. If a type, lint rule, DB constraint, or build-time check already proves something, a test that re-asserts the same guarantee is duplication: extra source to maintain, drifts from reality, adds CI latency.
5. Compound engineering. When you teach me a rule, build the prevention into artifacts (AGENTS.md, lint, hooks, reviewer prompts) so it applies automatically going forward. Don't rely on memory.
6. Prefer functional, pure, immutable. Mutation is a smell unless inside a contained scope with a clear reason. Arrow functions plus reduce/map/filter over for-loops with let.
7.
Parse, don't validate. Boundary inputs become typed values via Zod (or similar); downstream code carries the type. No scattered re-validation.
8. Functional core, imperative shell. I/O at the edges; domain logic pure.
9. Keep going. Don't stall on ceremony. Make forward progress.

ahriad, 9 days ago:

Like you, I was late to the AI party, and I still find it hard to give up on coding and let the AI do everything. However, I have learned to trust the AI a little in the past few months.

desipenguin, 9 days ago:

I did a similar workshop between Feb-April (1 hour Zoom call on Wednesday, 3 hour hands-on in person every week).

Most of the participants had Windows laptops (except one with a Mac).

We had suggested Linux on WSL2 and VSCode (`uv` for Python package management).

But we realized that we were spending a LOT of time fighting the tools/combination. WSL2 + Windows filesystem + uv did not work well together.

For the person on macOS, it was smooth sailing.

If I do another batch, we'll use native `pip` and Python (not uv), and I think then we won't need WSL2.

itake, 9 days ago:

virgin project:

1/ spec driven dev (https://github.com/github/spec-kit)

2/ then degrade to multiple sessions (no worktrees) debugging various problems until it's done

On UI Design (MacOS, Web):

1/ AI does a first pass. Try to give it style guidance on my own (colors, style, etc).

2/ Prompt ChatGPT.com with screenshots and ask for recommendations on how to make it better.

3/ Codex the changes (with minor edits)

4/ loop 2-3, ask Gemini for feedback too

Havoc, 8 days ago:

Opencode & mixed LLMs

1) Write half pager of markdown by hand - tech, architecture, features

2) Ask 2-3 LLMs from different companies to review for gaps & problems

3) Make LLM turn it into implementation plan with emphasis on modular phases

4) Repeat step 2 but on the implementation plan.
Usually the 3rd LLM just goes yeah that looks fine

5) Walk through phases individually, sometimes multiple in one shot depending on vibes. Sprinkle in more checks by 2-3 other LLMs in between, again depending on vibes & judged difficulty

zackify, 9 days ago:

Self made TUI that just lists LXC containers.

I have a base container.

"A" to make a new instance.

Pi.dev when I hit enter on any container. Hot swap Anthropic enterprise and OpenAI and OpenRouter as needed.

Every container has the dev env already running for my current projects. Iterate, rarely use vim when needed, spec driven, and have the LLM draft PRs for me, then I review.

I know the codebase in and out, so what I want done is on bypass mode, and then I review closer at the draft PR step before marking ready for the team.

papersail, 8 days ago:

My usual workflow is GPT-5.5 for planning, DeepSeek V4 Flash for milestones implementation, then GPT-5.5 again for review. It has worked pretty well so far.

lordluca, 7 days ago:

Mostly I'm using VSCode with Claude Code. I tried Claude Code directly before, but when the convo is too long I need to open a new session. I tried Codex also; it's nice to have when I'm outside as it's connected to my Mac mini. I also get Codex to review Claude Code most of the time.

rurban, 8 days ago:

It changes every 2 weeks.

Right now it is oh-my-pi cli with all models (mostly switching between sonnet-4.6 and Deepseek V4 Pro).

Before it was codewhale with Deepseek V4 Pro/Flash

Before it was kimi-k2.6

Before it was opencode

Before it was Claude

Before it was codex

Before it was trivial github copilot auto completion in emacs.

Before it was web searches and ChatGPT web UI.

Mostly for C, but also C++, perl and python.
Python success being the worst by far.

c0rruptbytes, 8 days ago:

I like Zed...

but AI dev workflows get complicated fast

you start with claude code or codex and it's cute, but then you realize - hmm configuration is cheap, the AI can do it!

then you start looking into MCPs and skills, fuck it, oh-my-pi looks awesome!

wait a second? I can just have AI make my own personal AI harness! Next thing you know, you're writing the 5th version of "little-coder" or similar using the Pi library

ahh shit, you just read an article that `tools` are actually crazy important for AIs, using `sed` is dumb when `hashline` + ASTs are way better, let's just start writing our own tools!!

...anyway I just use Zed, simple agent on the left, code on the right

I have some pretty complicated automated workflows that use `linear` + an orchestrator -> implementer -> reviewer -> releaser workflow, but it's less a dev stack and more an AI factory

d0100, 8 days ago:

My workflow for production code:

1. Pick the most complete project boilerplate (fullstack JS can easily introduce security bugs, SPA + API is best as cheap linting solves most problems)
2. Project skills (how to CRUD without mess)
3. Use worktrees for concurrent features, local session for conflicts
4.
Local session for QA and refinement

I use Copilot and GPT 5.4

Managed to shorten pre-AI priced ongoing projects to 2 weeks or a month

Freedumbs, 8 days ago:

Use whatever terminal you want.

Use claude or codex cli as architect.

Use !architect-model as actor.

Use local model for scout, extraction, etc.

Read recent AI research in arxiv.

Build a system that suits your workflow based on these principles: executable verification is king, independence is mandatory, a zero-failure report is a claim to be audited, not a result to be celebrated.

Now AI Just Works™

tmaly, 8 days ago:

I would not try to mix newbies in with experienced software developers.

Pick one audience at a time and approach it that way.

For a newbie, something like the Replit free tier might be the way, as there is little cognitive overhead to getting set up.

For an experienced developer, have them get a $20 sub and work with one of the popular agent harnesses.

emehex, 9 days ago:

Claude Code and/or Codex from Ghostty/Terminal. You don't need to complicate it.

AndreVitorio, 9 days ago:

If you are teaching newbies, just get them into the Claude Code or Codex desktop apps.

For devs:

Claude, Codex and Cursor. All on the $20 subscription.

Then use

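> The post is long, so only the portion displayed on this site has been exported; please read the original for the full content.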