
it's not, though, if you're working in a massive codebase or on a distributed system that has many interconnected parts.

skills that teach the agent how to pipe data, build requests, trace them through a system and its data sources, then update code based on those results, are a step-function improvement in development.

ai has fundamentally changed how productive i am working on a 10m line codebase, and i'd guess less than 5% of that is due to code gen that's intended to go to prod. Nearly all of it is the ability to rapidly build tools and toolchains to test and verify what i'm doing.


But... plain Claude does that. At least for my codebase, which is nowhere close to your 10m line. But we do processing on lots of data (~100TB) and Claude definitely builds one-off tools and scripts to analyze it, which works pretty great in my experience.

What sort of skills are you referring to?
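The kind of one-off analysis script being described can be sketched minimally. This is an illustrative example, not anyone's actual tooling; the newline-delimited JSON format and the `status` field name are assumptions:

```python
import json
from collections import Counter

def tally_statuses(lines):
    """Tally a field across newline-delimited JSON records.

    The "status" field name is a placeholder; a throwaway script like
    this gets rewritten to match whatever the data actually looks like.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        counts[json.loads(line).get("status", "unknown")] += 1
    return counts

# The same streaming loop works on a 100TB log stream or a three-line sample:
sample = ['{"status": "ok"}', '{"status": "ok"}', '{"status": "err"}']
print(tally_statuses(sample).most_common())
```

The value is less in the script itself than in having the agent generate, run, and discard dozens of these while investigating.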


I think people are looking at skills the wrong way. It's not like they give it some kind of superpowers it wouldn't have otherwise. Ideally you'll have Claude write the skills anyway. It's just a shortcut so you don't have to keep rewriting the same prompt over and over and/or have Claude keep figuring out how to do the same thing repeatedly. You can save lots of time, tokens, and manual guidance by having well-thought-out skills. Some people use these to "larp" different job roles, and I don't think that's a productive use of skills unless the prompts are truly exceptional.

At work I use skills to maintain code consistency. We implemented a solid "model-view-viewmodel" architecture for a front-end app, because without any guard rails it was doing redundant data fetching and type casts and was just messy overall. Having an "mvvm" rule and skill that defines the boundaries keeps the llm from writing a bunch of nonsense code that happens to work.
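The boundary being enforced can be shown with a toy sketch. This is not the commenter's front-end stack (which is presumably not Python); all names here are hypothetical. The point is that the view only ever sees typed view-state produced by the viewmodel, so it can't reach into the data layer and trigger redundant fetches or ad-hoc casts:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserModel:          # model: raw domain data
    id: int
    name: str

@dataclass(frozen=True)
class UserViewState:      # the only shape the view is allowed to render
    display_name: str

class UserViewModel:
    def __init__(self, fetch_user):
        self._fetch_user = fetch_user   # data access is injected, not imported
        self._cache = {}

    def view_state(self, user_id: int) -> UserViewState:
        if user_id not in self._cache:  # one fetch per user, not per render
            self._cache[user_id] = self._fetch_user(user_id)
        user = self._cache[user_id]
        return UserViewState(display_name=user.name.title())
```

A skill then only needs to state the rule ("views render `ViewState` types, never models; fetching lives in viewmodels") for the llm to follow this shape.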

This sounds great - skills to ensure that the code maintains proper separation of concerns and is packaged properly.

I'd love to know how this skill was phrased.


Honestly I started with Obra superpowers and worked with my boss to brainstorm the best way to keep separation of concerns, and we just stepped on rakes as we developed and had Obra superpowers suggest updates to our rules/skills.

It's certainly an iterative process but it gets better every iteration.


Thank you. I've never heard of Obra Superpowers, I'm looking at it now.

A deterministic layer linter would be better for this.

Possibly, and we do use linters, but linters don't stop LLMs from going off the rails. It does end up fixing itself because of the linter, but then the results are only as good as the linter itself.

I have sometimes found "LARPing job roles" to be useful for expectations for the codebase.

Claude is kind of decent at doing "when in Rome" sort of stuff with your codebase, but it's nice to reinforce, and remind it how to deploy, what testing should be done before a PR, etc.


If you build up and save some of those scripts, skills help Claude remember how and when to use them.

Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.
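As an example of the kind of reusable debug script such a library might contain (the structured-log format and the `request_id`/`ts`/`service`/`msg` field names are illustrative, not from any real project):

```python
import json

def trace_request(lines, request_id):
    """Collect every structured-log event for one request id, in order.

    A skill can just say "run the trace script with the request id"
    instead of having the agent re-derive this logic every session.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("request_id") == request_id:
            events.append((record.get("ts"), record.get("service"), record.get("msg")))
    return events

logs = [
    '{"request_id": "r1", "ts": 1, "service": "api", "msg": "received"}',
    '{"request_id": "r2", "ts": 2, "service": "db", "msg": "query"}',
    '{"request_id": "r1", "ts": 3, "service": "db", "msg": "query"}',
]
print(trace_request(logs, "r1"))   # only the two r1 events, in order
```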


Even the most complex distributed systems can be understood with the context windows we have. Short of 1M+ loc, and even then you could use documentation to get a more succinct view of the whole thing.

This really doesn’t pan out in practice if you work a lot with these models

And we also know why: effective context depends on input and task complexity. Our best guess right now is that effective context length is often between 100k and 200k tokens for frontier models that advertise 1M on needle-in-a-haystack-style tests.


Agreed, not to mention the additional cost of chats with more context.

there's a famous painting about this: https://en.wikipedia.org/wiki/Saturn_Devouring_His_Son


That, but in Corporate Memphis tech-company art style

https://jemima.design.blog/2021/02/08/generic-tech-company-a...


It’s beautiful


“Corporate Memphis tech style” is funny because it’s colloquially known as “globohomo”. Not homophobic, btw, think “homogenous”


in a competitive market they would have been unified a long time ago. google has been taking slow steps toward doing this; apple won't until google does.


all of apple’s devices with displays, down to the watch, run OS X with a form-factor-appropriate UI layer on top. iphone and mac are more unified than google’s android/chromeos.

Tahoe made all the touch targets on macOS bigger, we may get a touch macbook pro this year.


for cerebras, can we even call them chips? you're no longer breaking the wafer; we should call them slabs.


They're still slices of a silicon ingot.

Just like potato chips are slices from a potato.


Macrochips


same thing that happened during the industrial revolution, you pay enough of them to 'protect the law' vs the rest.


i never really understood the billionaire yacht hate.

Once you buy a yacht, 450 million dollars of ownership in a company you held goes to the people who built a beautiful thing that exists in the real world, and you're on the hook for employing a lot of people to maintain it.

I take a lot more issue with accumulation and hoarding of wealth than the spending of it.


An economy that wastes resources building mega-yachts for billionaires is more unequal than one that builds cruise ships that high-income families can take a holiday on.

https://scottsumner.substack.com/p/imagine-130000000-washing...


> i never really understood the billionaire yacht hate.

Once someone reaches that level of fame and fortune it's almost a requirement if they want to travel or have some sort of 'vacation'. Don't get me wrong, it's definitely a great problem to have, but it's one of the only ways to find privacy at that level of wealth.

If I'm ever super wealthy, I hope I can also stay somewhat anonymous so that I can walk down the street like any other person.


Holding shares in a company (or dollar bills) is not depriving others of something. The fisherman will go catch fish tomorrow, the wheat in the fields will keep growing, the builder will build a house.

If someone starts paying the fisherman, farmer, builder, more to stop doing what they are doing and start building mega yachts, then there will be less fish, bread, and houses for others.

That said, I assume it's much simpler than that and it's just about the hypocrisy of the climate change billionaires to be bellowing out carbon while demanding the selfish greedy commoners cut our emissions.


the us is 16% of global manufacturing by value with 4% of the population; i don't know why this fact isn't brought up more.

on a population weighted basis the us manufactures more by value than china.

A bunch of people are salivating for a return to the world where the us was 52% of world gdp, not because it was great, but because the rest of the world was ash.


Whilst this is true, there is some distortion in that statement from measuring by value. If I produce a screw for the US military (a scenario where supply chains are highly regulated, so it may be impossible to buy cheap from a foreign country) and sell it for $1, I have produced a dollar of manufacturing by value; but if I produce exactly the same product in China for $0.10, I've only made 10 cents by value, despite having made exactly the same product.

There is a reason why, for instance, ship and raw-material output is measured in tonnage: that is the actual thing produced, and the value is secondary. That is, you would want to measure the actual amount of goods produced rather than what they sold for, obviously only amongst comparable categories.
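The screw example above as plain arithmetic (the $1 vs $0.10 prices are from the comment; the quantity is made up):

```python
# Identical physical output, very different "manufacturing by value".
screws = 1_000                  # same number of screws in both countries
us_value = screws * 1.00        # $1.00 per screw -> $1000 "by value"
cn_value = screws * 0.10        # $0.10 per screw -> $100 "by value"

assert screws == screws             # same real output either way
assert us_value == 10 * cn_value    # but 10x the apparent value
```

Value-based statistics fold price levels into the measure of output, which is exactly the distortion the comment is pointing at.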


Also US unemployment has been low. The idea that Americans need more jobs just doesn't fit the numbers. There are plenty of good paying non-backbreaking jobs for Americans but they just don't seem to believe it.


> There are plenty of good paying non-backbreaking jobs for Americans but they just don't seem to believe it.

Where at?


you can get a job as a long-haul truck driver in texas with no education and paid-for training, making 80-100k, and live in an area where houses cost 300k. within 5 years, starting from zero with no education, you can own a home and have a nest egg big enough to become an owner-operator or invest in a small business.

so that's the floor for anyone willing to put in a few years of work.


It says non-backbreaking. Truck driving is one of the highest-risk jobs for one's back.


Also, US manufacturing already struggles to find workers.

The problem, though, is that 70% of US manufacturing happens in small town/rural areas, which is not where the people looking for jobs are found, so you get this curious disconnect.


i think it's just that it's a new year, and "year of the linux desktop" is a meme (in the actual definition of the word) and the meme is growing over time.


Simpler - it is now objectively a much better OS and it’s free


the 'slop' is generally at either extreme of video length: either shorts or multiple-hour videos.

shorts get paid by the view; people put on long videos to fall asleep to, and youtube premium does a rev share based on the premium user's watch time.

this is why you have like 10 hour playlists and white noise videos.
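The mechanism being claimed can be sketched as a proportional split. To be clear, this models the comment's description, not YouTube's documented payout formula, and all the numbers are made up:

```python
def premium_split(fee, watch_minutes):
    """Split one subscriber's fee across channels by share of watch time."""
    total = sum(watch_minutes.values())
    return {channel: fee * minutes / total
            for channel, minutes in watch_minutes.items()}

# A 10-hour white-noise video left on overnight dwarfs a few
# normal-length views, so it captures almost the whole share.
split = premium_split(10.0, {"white_noise_10h": 600, "tutorial": 20, "vlog": 10})
print(split)
```

Under a watch-time split, maximizing playback duration (not views) is the winning strategy, which is consistent with the 10-hour playlists.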


isn't gemini 3 flash already model shrinkage that does well in coding?


Xiaomi, Nvidia Nemotron, Minimax, lots of other smaller ones too. There are massive economic incentives to shrink models because they can be provided faster and at lower cost.

I think even with the money going in, there has to be some revenue supporting that development somewhere. And users are now looking at the cost. I have been using Anthropic Max for most of this year after checking out some of these other models, it is clearly overpriced (I would also say their moat of Claude Code has been breached). And Anthropic's API pricing is completely crazy when you use some of the paradigms that they suggest (agents/commands/etc) i.e. token usage is going up so efficient models are driving growth.


Smaller open-weights models are also improving noticeably (like Qwen3 Coder 30B), the improvements are happening at all sizes.


Devstral Small 24b looks promising as something I want to try fine tuning on DSLs, etc. and then embedding in tooling.


I haven't tried it yet, but yes. Qwen3 Next 80B works decently in my testing, and fast. I had mixed results with the new Nemotron, but it and the new Qwen models are both very fast to run.


Same experience: on my old M2 Mac with just 32GB of memory, both Qwen 3 30B and the new Nemotron models are very useful for coding if I prepare a one-shot prompt with directions and relevant code. I don’t like them for agentic coding tools. I have mentioned this elsewhere: it is deeply satisfying to mix local model use with commercial APIs and services.


How many billion parameters is gemini 3 flash? I can't seem to find info about it online.

