I only disagree a little. It's that sometimes there is a discussion about AI itself where "I prompted X with Y and it output Z" can add to the convo.
But those are pretty specific cases (For example, discussing AI in healthcare). That's about the only time where I think it's reasonable to post the AI output so it can be analyzed/criticized.
What's not helpful is when users don't disclose that they're just using AI. It takes a few back-and-forths before I realize I'm talking to a bot, which is annoying.
Here is where I'd like to push back just a little.
Not all AI prompting is expanding the prompt.
What if the original prompt is 1000 words, includes 10 scientific articles by reference (boosting it up to 10000), and the AI helps to boil it down to 100 words instead?
I'd argue that this is probably a rather more responsible usage of the tools. And rather more pleasant to read besides.
Whether it meets the criterion is another thing. But at least don't assume that the original prompt is always better or shorter!
Use your brain and summarize the article yourself if it's of such great importance. Why should I care to read it if you can't be bothered to actually write it?
Actually, I'd like to expand a wee bit. Don't know if you've ever done a scientific library usage course or the like. It's one of those things you tend to forget are important.
One of the most important lessons is not to read as many papers as possible. It's weeding out as many as possible so you can spend your limited grey matter reading the ones that actually matter.
And that's where the LLM comes in handy, especially if it's of decent quality. It's a Large Language Model: chewing through language and finding issues and discrepancies, or simply checking whether a paper matches your ultimate query, is trivial for them.
You know, I probably have standing to argue that people who use the web are just as lazy ;-)
I'm just old enough that I was in the middle of the transition from paper (in primary school in the 80s) to online (starting late 90s)
I say this somewhat tongue in cheek, but obviously people should drive to 3 different libraries across 3 countries and read the journals in their own binders (in at least 3 different languages)
In reality: full-text online is convenient. Having an LLM assist with search and filtering is convenient.
I could go back to the old ways. Would you like me to reply in pen? My handwriting is atrocious.
I really prefer modern tools, though. Not everything older is better. Whether you want to read what I write is up to you.
(edit: Not hyperbole. I live in a small country, and am old enough to still remember the 80's as a kid.)
Push the idea past a single comment. Someone decides they have a great method for getting summaries, and adds it as a comment to every post they look at. Other people have similar ideas. Is that fine? It doesn't take a lot for the whole site to feel like useless spam.
It'd be far better to just have a thread about the best way to get good summaries.
Probably not. A typical S/N ratio, as a rule of thumb, is about 1:10; Sturgeon's law says "ninety percent of everything is crap."
You shouldn't just dump a big pile of slop on someone's plate: the actual trick is to filter it down to the bit that counts. Usually when posting, you should do that for the reader. It's only polite.
So, if we filter out the noise, that leaves you with 100 words and 1 link to a reference. Which is actually about right for a typical HN reply. (run this through wc ;-))
Would prompts really be interesting or thought-provoking, though?
I don't expect AI HN responders to out themselves by sharing, but I would be curious to learn if people are prompting anything more involved than just "respond to this on HN: <link>", or running agents that do the same.
I often edit my comments rather manically, get into discussions, and sometimes have email exchanges with other HNers. I also often use Claude, Kimi, and Gemini to check my comments for tone, adherence to HN rules, etc. I probably spend way too much time.
So technically the prompts involved might expand into megabytes all told. And in the end I formulate a post by myself (to adhere to HN rules), but the prompting can be many many many megabytes and include PDFs, images, blocks of text from multiple sources, and ... you know. Just Doing The Work.
I think this is valid. Previously I would have (and have, and still do) searched Google, Wikipedia, PubMed, the scientific literature, etc. Not for everything, but often. And AI tooling just allows me to do that faster, and keep all my notes in one place besides.
Again, the final edit is typically 90-100% me. (The 10% is if the AI comes up with a really good suggestion.) But my homework? Yes, AI is involved these days.
This should be ok. I'm adhering to the letter and the spirit. My post is me.
"Write a response to smy20011's comment indicating that if the end result was a low-quality comment, the initial prompt probably wouldn't be very insightful either. Make it snarky."
Disagree. The prompt holds no information at all. The answer actually discovers information, organizes it, presents it in a way that's easy to read.
Example: "write me an article about hidden settings in SSH". You get back more information than most of HN's previous posts about SSH, in a fraction of the text, and more readable.
Actually, screw it, we should just make a new version of HN that has useful articles written by AI. The human written articles are terrible.
It's not just AI-generated articles -- it's the other things that we delve into as a result. Listicles. Comments. Posts. It's what it means to be human, and honestly? That's rare.
An outage could cost Amazon ~millions to tens of millions. Most of the time, we want the junior to learn from the outage and fix the process. With an AI agent, we can only update agent.md and hope it never happens again.
It's interesting to see the eval sets becoming more and more expensive. Previously we just needed to evaluate one test set; now we need to create a lot of diffs and run a lot of tests.
I think the good thing about it is that if you're given a good specification, you're likely to get a good result. Writing a C compiler is nothing new, but this will be great for all the porting projects.
I miss entering flow state when coding. When vibe coding, you're constantly interrupted and only think very shallowly. I've never seen anyone enter flow state while vibe coding.
The two ways I get into flow state these days are in setting up agentic loops, so I can get out of the way by letting AI check the results for itself, and by doing more things. I've got ~4 Claude Code instances working on problems, per project, and I've got multiple projects I'm working on at the same time.
I think it's not because the AI is working toward "misaligned" goals. The user never specifies the goal clearly enough for the AI system to work with.
However, I think producing a detailed enough specification requires the same or an even larger amount of work than writing the code. We write a rough specification and clarify it during the process of coding. There is a minimum effort required to produce such a specification, and AI will not help you speed that up.
That makes me wonder about the "higher and higher-level language" escalator. When you're writing in assembly, is it more work to write the code than the spec? And the reverse is true if you can code up your system in Ruby? If so, does that imply anything about the "spec driven" workflow people are using with AIs? Are we right on the cusp where writing natural language specs and writing high level code are comparably productive?
Programming languages can be a thinking tool for a lot of tasks, much like other notations such as sheet music and map drawing. A condensed and somewhat formal way of describing ideas can increase communication speed. It may lack nuance, but in some cases, nuance is harmful.
The nice thing about code compared to other notation is that it's useful on its own. You describe an algorithm and the machine can then solve the problem ad infinitum. It's one step instead of the two steps of writing a spec and having an LLM translate it, then having to verify the output and alter it.
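The one-step property is easy to see in miniature: once the description is code, it both documents and performs the computation, with no translation or verification pass in between. A trivial sketch (the function and its name are just an illustration, not anything from the thread):

```python
# The "spec" and the executable artifact are one and the same:
# this function both describes the algorithm and runs it forever after.
def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

print(moving_average([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

Compare the two-step route: write "average each run of three consecutive values" as prose, have an LLM translate it, then check whether the translation handles the window boundaries the way you meant.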
Assembly and high-level languages are equivalent in terms of semantics. The latter help in managing complexity, by reducing harmful possibilities (manual memory management, off-by-one errors) and presenting common patterns (iterators/collections, structs and other data structures, ...) so that whole categories of problems are easily solved. There's no higher level of computing model unlocked. Just a faster level of productivity, unlocked by following proven patterns.
Spec driven workflow is a mirage, because even the best specs will leave a lot of unspecified details. Which are crucial as most of programming is making the computer not do the various things it can do.
I believe that the issue right now is that we're using languages designed for human creation in an AI context. I think we probably want languages that are optimized for AI written but human read code, so the surface texture is a lot different.
My particular hypothesis on this is something that feels a little bit like python and ruby, but has an absolutely insane overkill type system to help guide the AI. I also threw in a little lispiness on my draft: https://github.com/jaggederest/locque/
I don't know, LLMs thrive on human text, so I would wager that a language designed for humans would quite closely match an ideal one for LLMs. Probably the only difference is that LLMs are not "lazy": they better tolerate boilerplate, and lower-complexity structures likely fit them better. (E.g. they can't really one-shot understand some imported custom operator that is not very common in their training data.)
Also, they rely surprisingly closely on "good" code patterns, like comments and naming conventions.
So if anything, a managed language [1] with a decent type system and not a lot of features would be the best, especially if it has a lot of code in its training data. So I would rather vote for Java, or something close.
[1] reasoning about lifetimes, even if aided by the compiler, is a global property, and LLMs are not particularly good at that
But that is less fundamental than you make it sound. LLMs work well with human language because that's all they are trained on. So what else _could_ an ideal language possibly look like?
On the other hand: the usefulness of LLMs will always be gated by their interface to the human world. So even if their internal communication might be superseded at some point, their contact surface can only evolve as fast as their partners/subjects/masters can interface with it.
When I think of the effect of a single word on Agent behavior - I wonder if a 'compiler' for the human prompt isn't something that would benefit the engineer.
I've had comical instances where asking an agent to "perform the refactor within somespec.md" results in it ... refactoring the spec as opposed to performing a refactor of the code mentioned in the spec. If I say "Implement the refactor within somespec.md" it's never misunderstood.
With LLMs _so_ strongly aligned on language and having deep semantic links, a hypothetical prompt compiler could ensure that your intent converts into the strongest weighted individual words to ensure maximal direction following and outcome.
Intent classification (task frame) -> Reference binding (inputs vs. targets) -> High-leverage word selection -> ... -> Constraints(?) = <optimal prompt>
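That pipeline could be sketched as a few passes over a structured task frame. Everything below is hypothetical (the names, the frame fields, the verb table); it's just to make the stages concrete:

```python
from dataclasses import dataclass, field

# Hypothetical "prompt compiler" passes, mirroring the pipeline above:
# intent classification (task frame) -> reference binding (inputs vs.
# targets) -> high-leverage word selection -> constraints.

@dataclass
class TaskFrame:
    verb: str                                    # what the user asked for
    target: str                                  # what should be changed
    inputs: list = field(default_factory=list)   # read-only references
    constraints: list = field(default_factory=list)

# "High-leverage word selection": swap weakly-directive verbs for ones
# agents follow more reliably ("perform" got misread; "implement" didn't).
STRONG_VERBS = {"perform": "implement", "do": "implement", "apply": "implement"}

def compile_prompt(frame: TaskFrame) -> str:
    verb = STRONG_VERBS.get(frame.verb.lower(), frame.verb)
    parts = [f"{verb.capitalize()} {frame.target}."]
    # Reference binding: make inputs vs. targets explicit, so the agent
    # doesn't refactor the spec instead of the code.
    for ref in frame.inputs:
        parts.append(f"Treat {ref} as read-only input; do not modify it.")
    parts.extend(frame.constraints)
    return " ".join(parts)

prompt = compile_prompt(TaskFrame(
    verb="perform",
    target="the refactor described in somespec.md",
    inputs=["somespec.md"],
))
print(prompt)
# Implement the refactor described in somespec.md. Treat somespec.md as
# read-only input; do not modify it.
```

The point isn't this particular rule table; it's that verb choice and input/target binding are exactly the kind of mechanical rewrites a pre-pass could do before the prompt ever reaches the agent.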
If you are on the same wavelength as someone, you don't need to produce a full spec. You can trust that the other person has the same vision as you and will pick reasonable ways to implement things. This is one reason why personalized AI agents are important.
> I think producing detailed enough specification requires same or even larger amount of work than writing code
Our team has started dedicating much more time to writing documentation for our SaaS app. No one seems to want to do it naturally, but there is very large potential in opening your system to machine automation, not just for coding but for customer-facing tooling. I saw a preview of that possible future in NewRelic, where they have an AI chat use their existing SQL-like query language to build tables and charts from natural-language queries right in the web app. Theirs kinda sucks, but there's so much potential there that it is very likely going to change how we build UIs and software interfaces.
Plus, having lots of documentation on how stuff works also helps sales, support, and SEO.
Detailed specification also helps root out conflicting design requirements and points at the desired behavior when bugs are actually found. It also helps when other stakeholders can read it and see misalignment with what their users/customers actually need.
As of today though, that doesn't work. Even straightforward tasks that are perfectly spec-ed can't be reliably done with agents, at least in my experience.
I recently used Claude for a refactor. I had an exact list of call sites, with positions etc. The model had to add .foo to a bunch of builders that were either at that position or slightly before (the code position was for .result() or whatever.) I gave it the file and the instruction, and it mostly did it, but it also took the opportunity to "fix" similar builders near those I specified.
That is after iterating a few times on the prompt (first time it didn't want to do it because it was too much work, second time it tried to do it via regex, etc.)
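For contrast, the fully constrained version of that edit is small enough to do deterministically: touch only the listed call sites and nothing "similar" nearby. A sketch, with made-up line numbers and `.foo()`/`.result()` standing in for my actual call-site list:

```python
# Hypothetical deterministic pass: insert `.foo()` into builder chains
# only at the explicitly listed lines, never at similar-looking sites.
def add_foo_at(lines, call_sites):
    """lines: file contents as a list of strings;
    call_sites: 1-based line numbers whose builder chain should
    gain a .foo() right before its .result() call."""
    out = list(lines)  # leave the input untouched
    for n in call_sites:
        i = n - 1
        # Rewrite only the exact line we were told about.
        out[i] = out[i].replace(".result()", ".foo().result()", 1)
    return out

src = [
    "a = Builder().bar().result()",
    "b = Builder().bar().result()",  # similar, but NOT in the list
]
print(add_foo_at(src, [1]))
# ['a = Builder().bar().foo().result()', 'b = Builder().bar().result()']
```

A script like this can't "helpfully" fix the neighbors, which is exactly the property the agent lacked.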
My thought too. To extend this: coding agents will make code cheap and specifications cheaper, but they may also invert the relative opportunity cost of not writing a good spec.
> The user never specifies the goal clearly enough for the AI system to work with.
This is sort of a fundamental problem with all AI. If you tell a robot assistant to "make a cup of tea", how's it supposed to know that that implies "don't break the priceless vase in the kitchen" and "don't step on the cat's tail", et cetera. You're never going to align it well enough with "human values" to be safe. Even just defining in human-understandable terms what those values are is a deep existential question of philosophy, let alone specifying it for a machine that's capable of acting in the world independently.