Hacker News

In case it's helpful to anyone: https://openrouter.ai/google/gemini-3-pro-preview is worth knowing about.

Adding another layer on top of Google's own APIs adds latency, lowers reliability, and (AFAIK) doesn't allow batch mode - but if that's tolerable, it avoids the mess that is Google Service Account JSON and Cloud Billing.
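For anyone curious what this looks like in practice, here's a minimal sketch of hitting that model through OpenRouter's OpenAI-compatible chat completions endpoint. It only builds the request; the actual send is shown as a comment. The `OPENROUTER_API_KEY` env var name is the conventional one, and the model slug is taken from the link above.

```python
import json
import os

# Sketch: call Gemini through OpenRouter's OpenAI-compatible endpoint,
# assuming your key is in the OPENROUTER_API_KEY environment variable.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str) -> tuple[str, dict, dict]:
    """Return (url, headers, JSON payload) for one chat completion."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "google/gemini-3-pro-preview",  # slug from the link above
        "messages": [{"role": "user", "content": prompt}],
    }
    return OPENROUTER_URL, headers, payload

url, headers, payload = build_request("Say hello in one word.")
print(json.dumps(payload, indent=2))
# To actually send it (needs the `requests` package and a valid key):
#   resp = requests.post(url, headers=headers, json=payload, timeout=60)
#   print(resp.json()["choices"][0]["message"]["content"])
```

No Google Cloud project, service account JSON, or billing setup involved; the only credential is the OpenRouter key.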



(I work at OpenRouter.) We add about 15 ms of latency once the cache is warm (i.e. on subsequent requests). If there are reliability problems, please let us know! OpenRouter should actually be more reliable, since we load-balance and fall back between different Gemini endpoints.


Is batch mode on the roadmap? As the frontier model providers focus increasingly on profitability, and prices and latencies rise as a result, I can see batching becoming necessary for many use cases.

Would love to know that we can build against the OpenAI batch API today and (soon?) have a path toward being model-agnostic.




