
The harness matters far more than most people think. This post about the CORE benchmark shows Opus's score almost doubling when they switched from their own harness to Claude Code. https://x.com/sayashk/status/1996334941832089732


Mario, the creator of the Pi terminal agent, has a great blog post[0]. He talks about how TerminalBench's highest scores come from using the Terminus 2 harness, which uses tmux under the hood.

When I was reading the Opus 4.6 launch post, they mentioned the same thing: their TerminalBench score was based on Terminus 2, not CC.

0. https://mariozechner.at/posts/2025-11-30-pi-coding-agent/
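For context on what "tmux under the hood" means: a tmux-driven harness boils down to typing into a detached session and scraping the pane text back as the model's observation. A minimal sketch of that pattern (helper names and the session name are hypothetical; the tmux subcommands and flags are real):

```python
import shlex

SESSION = "agent"  # hypothetical session name

def tmux_start():
    # Create a detached tmux session the agent can type into.
    return ["tmux", "new-session", "-d", "-s", SESSION]

def tmux_type(command):
    # Send a shell command to the session, followed by Enter.
    return ["tmux", "send-keys", "-t", SESSION, command, "Enter"]

def tmux_read():
    # Capture the visible pane contents as plain text.
    return ["tmux", "capture-pane", "-t", SESSION, "-p"]

# A harness would subprocess.run() these in sequence: start once,
# then alternate type/read, feeding the captured pane text back to
# the model as its view of the terminal.
print(shlex.join(tmux_type("ls -la")))
```

The upside of this design is that the model sees exactly what a human would see in the terminal (TUIs, partial output, prompts), rather than a sanitized stdout capture.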


Which, IMHO, is why we should be able to swap harnesses freely or build our own. Being locked into a specific harness because you pay 20 bucks per month vs. pay-per-use ... is kinda dumb.


The reason Anthropic is pushing the closed harness is that they're not confident in their ability to win on model quality long term, so they're trying to build lock-in. They can also capture some additional telemetry by owning the harness, but given the amount of data the agent loop already transmits, that borders on unethical spyware (which might be part of the reason they're afraid to open source it).

Ultimately the market is going to force them to open up and let people flex their subs.


> Being locked into a specific harness because you pay 20 bucks per month vs. pay-per-use ... is kinda dumb.

I’ll probably get downvoted for this, but am I the only one who thinks it’s kind of wild how much anger is generated by these companies offering discounted plans for use with their tools?

At this point, there would be less anger and outrage on HN if they all just charged us the same high per-token rate and offered no discounts or flat rate plans.


Okay, but why on earth should I as an OpenCode user accept that limitation when OpenAI explicitly supports 3rd party clients? That's how competition works in a healthy market.

I certainly haven't built up enough brand loyalty to tolerate Anthropic's behavior as they tightened usage quotas on the Pro plan to the point of becoming unusable for actual development.

(And sure, they probably don't care because they're losing money on that plan but again, OpenAI offers a plan at the same price point with vastly superior usage limits so I just canceled Claude, subscribed to Codex and moved on with my life. Anthropic's profit margin or lack thereof isn't my problem as a consumer when alternatives exist.)


> Anthropic's profit margin or lack thereof isn't my problem as a consumer when alternatives exist.

For now. When a company is deliberately trying to be profitable and not subsidized by VC money, I'm more likely to buy their product. I have no desire to live further in a world run by monopolies.


But Anthropic very much is subsidized by VC money, just like OpenAI. They just raised another $30 billion this week. I'm not sure how Anthropic managed to position themselves as the "good guy" in many people's minds, but from my vantage they're much more similar to OpenAI than they are different.

So for the time being I'll stick with the option that isn't trying to profit from long-term lock-in vs. competing on the technical merits of their product. The great thing about basing a workflow on a tool like OpenCode is that if OpenAI enshittifies Codex, I don't have to worry about being trapped and can easily pivot to an open source model, or Anthropic via the API, etc., depending on how the future turns out.

I've been paying for the ultra-cheap Z.ai plan as my fallback for a while now anyway, and I can also use my GitHub Copilot or Gemini AI Pro plan via OpenCode integrations (though the Gemini integration is probably the least stable of the four), so I certainly won't hesitate to drop OpenAI too if they give me sufficient cause.


No, you're not the only one. The outraged entitlement is pretty funny tbh. How dare they dictate that they'll only subsidize your usage if you use their software!!


I'm not outraged, but the dynamic creates a tension that prevents me from building brand loyalty.


Yes, I'm entitled because I didn't stick around paying for a subpar plan compared to their direct competitor OpenAI who supports my use case at the same price point.

Any reasonable person should be thanking Dario for lock-in that protects us from nefarious alternative clients and pledging to pay even more for the privilege of lower usage limits!


This is also another place where having the harness change out from underneath you can drastically alter the quality of your work in unexpected ways.

Like most things, assume the "20/100/200" dollar deals that are great now are going to go down the enshittification route very rapidly.

Even if the "limits" on them stay generous, the product will start shifting to prioritize things the user doesn't want.

Tool recommendations are my immediate and near term fear - paid placement for dev tools both at the model level and the harness level seem inevitable.

---

The right route is open models and open harnesses, ideally on local hardware.


At this point subsidizing Chinese open-weights vendors by paying for them is just the right thing to do. Maybe they too might go closed-weights when they become SotA, but they're now pretty close and haven't done it.


I am wondering what kinds of harness are best for GLM, Deepseek, Qwen, Kimi.


OpenCode is great in general. At least one of the models is specifically trained on CC - I think it was Qwen - so for that one CC should give the best results.


Claude Code has been better than OpenCode for GLM models for me.


OpenCode with Kimi has been great for me.


> Like most things, assume the "20/100/200" dollar deals that are great now are going to go down the enshittification route very rapidly.

I don’t assume this at all. In fact, the opposite has been happening in my experience: I try multiple providers at the same time and the $20/month plans have only been getting better with the model improvements and changes. The current ChatGPT $20/month plan goes a very long way even when I set it to “Extra High” whereas just 6 months ago I felt like the $20/month plans from major providers were an exercise in bouncing off rate limits for anything non-trivial.

Inference costs are only going to go down from here and models will only improve. I’ve been reading these warnings about the coming demise of AI plans for 1-2 years now, but the opposite keeps happening.


> Inference costs are only going to go down from here and models will only improve. I’ve been reading these warnings about the coming demise of AI plans for 1-2 years now, but the opposite keeps happening.

This period also coincides with the frontier labs raising ever larger rounds. If Anthropic IPOs (which I honestly doubt they will), we may get a better sense of actual market prices, as it's unlikely the markets will keep letting them spend more money each year without a return.


> The current ChatGPT $20/month plan goes a very long way

It sure does and Codex is great, but do you think they'll maintain the current prices after/if it eventually dominates Claude Code in terms of marketshare and mindshare?


I think we'll always have multiple options providing similar levels of service, like we do with Uber and Lyft.

Unlike with Uber and Lyft, the price of inference continues to go down as datacenter capacity comes online and compute hardware gets more powerful.

So I think we'll always have affordable LLM services.

I do think the obsession with prices of the entry-level plans is a little odd. $20/month is nothing relative to the salaries people using these tools receive. HN is full of warnings that prices are going to go up in the future, but what's that going to change for software developers? Okay, so my $20/month plan goes to $40/month? $60/month? That's still less than I pay for internet access at home.



