
I find people over-rotate on whether we should be reviewing AI-produced code. "What if bad code gets into production!" some programmers gasp, as if they themselves have never pushed bad code, or had coworkers do the same.

I've worked at places where I've trusted everyone on my team to the extent that most PRs got only a quick glance before getting an "LGTM". On the flip side, I've also worked on teams where every person was a different kind of liability with the code they pushed, and for those teams I implemented every linting / pre-commit / testing tool possible, all of which needed to pass (including human review) before any code reached production.

A year ago, AI was like that latter team I mentioned -- something I had to check, double check, and correct until I was happy with what it produced. Over the past 6 months, it's gotten closer to (though still fairly far from) the former team -- I have to correct it about 10% of the time, and it gets most things right.

The fact that AI produces a much _larger_ volume of code than the average engineer is perhaps slightly concerning, but I don't see it much differently than code at large companies. Does every Facebook engineer review every junior engineer's pull request to make sure bad code doesn't slip in?

That isn't to say I'm for letting AI go wild with code -- but I think if, at worst, we treat AI as a junior engineer we need to rein in with static analysis tools / linters / tests, etc., we will probably be able to mitigate a lot of the downside.


At least when a human pushed bad code in the past, they could be held accountable.


There are two opposite answers here, and I feel like I could argue either one:

1) Humans were never held accountable, really

Outside of a few regulated industries, the worst that happens to an engineer who pushes negligent code is that they get fired. But after that happens, what actually changes? The organizational structure of the company that allowed the employee to push bad code still exists.

2) Humans will still be held accountable

If a human (managing a fleet of AI agents, let's say) ends up deploying bad code to production, they won't be able to point to the AI agent and say "it was them that did it!" -- it will still be the human at the end of the line that is held responsible.


Do you not review junior developers' code? I don't understand your point.


Your comment seems to imply AI is currently at a junior developer's level -- 12 months ago I would have agreed (like I mentioned in my parent comment, both near the end and about the "latter" team I was a part of), but it's gotten quite good over the past few months.

When even Linus Torvalds compliments AI code (ref: https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2Fa...), I think it's safe to say he wouldn't have said the same about any junior engineer.

That's not to say it won't ship bugs, but so does any engineer (junior or senior). It's up to you what level of tooling you surround the AI with (automated testing / linting / etc.), but at the very least it doesn't hurt to have that set up anyway (automated tests have helped prevent senior devs from shipping bad code too).


Ok but are you arguing against code reviews of AI generated code?


Many of today's news websites (tech or otherwise) cashed in their goodwill / reputation / page rank to sell ads.

The first shoe dropped when news websites realized they weren't generating content fast enough. Hard, in-depth journalism takes time, but when people want to know something that happened _today_, they don't want to wait a week for all the facts to come out, and so the major websites started losing traffic to websites that churned out articles fast.

The additional benefit of churning out articles was that you could match against more and more long-tail keywords, which led to more traffic and more opportunities to sell ads. To keep up, many websites traded quality for speed, and consumers noticed.

The second shoe to drop was affiliate marketing -- articles on CNET / Wirecutter etc. were already ranking and rating products, so they figured "[...] why shouldn't we get a cut if someone ends up buying a product we recommend?" The challenge then became that consumers couldn't tell the difference between a product recommended because it was good and one recommended because it gave the biggest "kickback" to the website via the affiliate link. Thus, people who gave "honest" opinions on products (e.g. people asking on Reddit, at least for a while, as the article suggests) became the new source of truth.

The result is that these days, if you read a lot of articles on the major tech websites, they feel like they've been optimized for speed (e.g. churning out an article fast) and SEO, and not much else. Many people have talked about how recipe websites are now short story generators more than food instructions, but it's been common for a while now for me to go to a tech website to read about something I specifically Googled, only for it to feel like it was written _specifically_ to capture traffic for a keyword rather than to actually solve the issue or answer the question I came into the website with.

The cherry on top is that AI has none of these problems (so far) -- yes, there's some movement on trying to do SEO for AI, and of course ads will eventually come to AI as they have to everything else, but currently, you can get the answers you want, described to you exactly how you'd like to hear them -- who wouldn't want that?


Why do you think they are pushing so many ads? Is it because they have too much money? Most sites are struggling to pay the few employees they have. Fewer ads aren't going to lead to better reporting. Would you be willing to pay a subscription to the website? Probably not.


> you can get the answers you want, described to you exactly how you'd like to hear it

I thought we wanted the truth.


Some of us do but many people do not. Source: Married for 22 years.


Stated and revealed preferences


You can’t handle the truth


I'm curious -- are there any stories of projects that launched on Hacker News, Hacker News loved it, and it ended up _also_ being a big success?

E.g. we have stories like Dropbox where HN seemed dismissive only to be proven wrong, and there are numerous launches where HN was dismissive and proven right, but I'd be more curious about the times the HN crowd got it right in a positive way.


If we assume token providers are becoming more and more of a commodity service these days, it seems telling that OpenAI specifically decided to carve out consumer hardware.

Perhaps their big bet is that their partnership with Jony Ive will create the first post-phone hardware device that consumers attach themselves to, and then they'll build an ecosystem around that?


This would be an incredibly tough play. We've seen few success stories, and even when the product is good, building the business around it has often failed. Most of the consumer plays are terrible products with weak execution and no real market. I have no doubt they could supplement lots of consumer experiences, but I'm not sure how they are more than a commodity component in that model. I'm a die-hard engineer, but equating the success of the iPhone to Ive's design is like saying the reason there were so many Apple IIs in '80s homes and classrooms was because of Woz's amazing design.


I’m glad someone called this out. “Let’s just use vanilla Rails” — sure, except basically every version of Rails for the past 5 years has decided to completely change how it does JS.

So many gems are also still built on Sprockets — even when you want to use the “Rails” way, you’re now stuck with a hodgepodge of JS anyway.

It’s a mess — maybe one day we’ll get it fixed, but don’t pretend it’s not partially Rails’ fault as well.
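To make the churn concrete, here’s a rough sketch of what the default JS setup has looked like by Rails era (from memory — exact defaults varied by point release, so treat this as illustrative):

  # Gemfile, by era (a sketch, not exhaustive):
  gem "sprockets-rails"   # Rails 4-5: the Sprockets asset pipeline
  gem "webpacker"         # Rails 6: webpack wrapped in a gem
  gem "importmap-rails"   # Rails 7 default: browser import maps, no bundler
  gem "jsbundling-rails"  # Rails 7 alternative: esbuild/rollup/webpack

Each switch meant migrating manifests, build configuration, and any gems that hooked into the previous pipeline.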


Another potential self-selection bias -- if people know they are signing up to have a conversation with a stranger, perhaps they are already predisposed to be more "pleasant" in conversations, versus a potential curmudgeon who never wants to speak to anyone, even for money.


There's also the magnitude of a negative interaction to consider.

If I have 99 great interactions with someone, but one REALLY bad interaction (they insult me deeply, or say something irredeemable), that can also sour the whole relationship.

It would be interesting to research commonalities amongst bad interactions -- are there patterns that emerge from certain personality types, politics, etc.? What about a few "sour" people who will take any interaction and make it bad regardless of matchup -- if we removed them from the interaction pool, do the stats adjust quickly?

In my mind this would have big implications for social media sites -- not that all bad interactions need to be quelled, but if you are trying to keep conversations civil, you could implement strategy X or strategy Y accordingly.


Limiting by referrer seems strange — if you know a normal user makes 10-20 requests (let’s assume per minute), can’t you just rate limit requests to 100 requests per minute per IP (5x the average load) and still block the majority of these cases?

Or, if it’s just a few bad actors, block based on JA4/JA3 fingerprint?
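As a sketch of what that could look like with Rack::Attack (a minimal example assuming a Ruby stack, purely for illustration; the 100/minute threshold is the hypothetical number above, and the fingerprint header is an assumption about what a TLS-terminating proxy might forward):

  # config/initializers/rack_attack.rb (assumes the rack-attack gem)
  # Throttle each IP to 100 requests per minute (~5x assumed normal load)
  Rack::Attack.throttle("req/ip", limit: 100, period: 60) do |req|
    req.ip
  end

  # Hypothetical: block known-bad TLS fingerprints, assuming the proxy
  # forwards one in a header such as X-JA4-Fingerprint
  BAD_FINGERPRINTS = ["example-ja4-hash"].freeze
  Rack::Attack.blocklist("bad/fingerprint") do |req|
    BAD_FINGERPRINTS.include?(req.env["HTTP_X_JA4_FINGERPRINT"])
  end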


What if one user really wants to browse around the world and explore the map? I remember spending half an hour in Google Earth desktop, just exploring interesting places.

I think referer-based limits are better; that way I can ask heavy users to please choose self-hosting instead of the public instance.


Limiting by referrer is probably the right first step (along with changing the front page text).

You want to track usage by site, not by person, because you can ask a site to change its usage patterns in a way you can't really ask a site's users. Maybe a per-IP limit makes sense too, but you wouldn't want it low enough to be effective for something like this.


Thanks Zak, appreciate your thoughts.

I hope to have some of the follow-up posts soon, although you are right that my idea is based around a centralized platform with ID verification.

RE: How to solve for enshittification, I'd mention two things:

1. I think a good product can _stay_ good over time with strong centralized leadership, aka a "benevolent dictator". Think Steve Jobs at Apple, DHH at 37signals, etc.

  - Once that power structure changes, however (new leader, etc), that can quickly fall apart, so it's definitely not a bulletproof solution.
2. If, from the start, your platform's incentives make the "user" your biggest customer, those incentives will ensure you keep those users happy.

  - If you have to choose between customers who give you $0/month and advertisers who will give you $1000/month, you'll eventually choose the advertisers to the detriment of the users.


> [...] Its 100% about getting people to stay on your platform as long as possible and engage with your content. Usually that means creating content that gets people to negatively engage with your content. So much so, its now referred to as "rage bait" where Only Fans women purposely post content that gets men to engage with their posts in order to make more money. Political posts are made to inflame either side and get more shares and upvotes.

I touch upon this in https://www.scottgoci.com/social-media-platforms-whats-wrong... and https://www.scottgoci.com/social-media-platforms-whats-wrong... -- but as you mention, this is a result of engagement being a core metric of social media platforms, and users attempting to game the platform's algorithm for their own purposes.

An easy way to solve for this is customization -- if no two users have the same "algorithm" powering their feed, it becomes hard for anyone to do this, because perhaps one user's algorithm filters out anything tagged with politics, or with a low Flesch–Kincaid score, or non-text posts, etc.
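As a toy sketch of the idea (all names, fields, and thresholds here are hypothetical):

  # Each user carries their own filter set, so there's no single global
  # "algorithm" for creators to optimize against.
  UserPrefs = Struct.new(:blocked_tags, :min_reading_ease, :text_only)

  def feed_for(prefs, posts)
    posts.select do |post|
      (post[:tags] & prefs.blocked_tags).empty? &&
        post[:flesch_kincaid] >= prefs.min_reading_ease &&
        (!prefs.text_only || post[:type] == :text)
    end
  end

  posts = [
    { tags: ["politics"], flesch_kincaid: 70, type: :text },
    { tags: ["cooking"],  flesch_kincaid: 75, type: :text },
    { tags: ["cooking"],  flesch_kincaid: 40, type: :video },
  ]
  feed_for(UserPrefs.new(["politics"], 60, true), posts)
  # => only the second post survives this user's filters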


> An easy way to solve for this is customization -- if no two users have the same "algorithm" powering their feed, it becomes hard for anyone to do this, because perhaps one user's algorithm filters out anything tagged with politics, or with a low Flesch–Kincaid score, or non-text posts, etc.

The problem, and where I strongly agree with the parent's statement that "I feel like social media has changed human behavior", is that the users themselves seek the engagement. Content creators want feedback about their content. You can codify that as "views", "likes", or whatever, but the whole problem here is fundamentally that most creators pursue strategies that increase whichever metric they are tracking to get value out of their posting.

I watched Bluesky grow up and become a "real network", and once it hit a certain scaling point it became exactly the same as all the other algorithmic, engagement-optimized sites. Posters started posting snippy, sneery comments because it made the Like count go up.

> perhaps one user's algorithm filters out anything tagged with politics, or with a low Flesch–Kincaid score, or non-text posts, etc.

Zuck talked about how Threads specifically filtered out political content [1] and how that decision was reversed [2]. It turns out users didn't like having political content filtered out, even though, as most of us know, it tends to turn into dunking competitions online.

So I largely agree with what the parent said. The expectations in the game have changed. Content creators want the big number to go up. People want to dunk on each other because it's fun and feels righteous. No algorithms or manifestos seem to undo this fundamental shift in the way folks post and engage with social media. Maybe a protracted education campaign can, though.

[1]: https://www.npr.org/2024/03/26/1240737627/meta-limit-politic... [2]: https://www.techpolicy.press/transcript-mark-zuckerberg-anno...

