(Tedious disclaimer: my opinion only, not speaking for anybody else. I'm an SRE at Google.)
Performance. gRPC is basically the most recent version of Stubby, and at the kind of scale we use Stubby, it achieves shockingly good RPC performance - call latency is orders of magnitude better than any form of http-rpc. This transforms the way you build applications, because you stop caring about the costs of RPCs, and start wanting to split your application into pieces separated by RPC boundaries so that you can run lots of copies of each piece.
I cannot sufficiently explain how critical this is to the way we build applications that scale.
I'm a former Google engineer working at another company now, and we use HTTP/JSON RPC here. That RPC layer is the single highest consumer of CPU in our clusters, and our scale isn't all that large. I'm moving over to gRPC asap, for performance reasons.
Performance and versioning are two large benefits.
The performance benefit comes from the fact that the schema is defined on each side (typically server-to-server), so you only send the information bytes. With a good RPC system you can also access specific fields of your structure without unpacking (or with very fast unpacking, depending on which RPC system you're using).
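To make the "only the information bytes" point concrete, here's a rough stdlib-only Python sketch of how protobuf packs a single integer field (tag number plus varint value) versus the equivalent JSON. The function names and the `id` field are mine, for illustration; the field name never goes on the wire, only the tag number does:

```python
import json

def encode_varint(n):
    """Encode a non-negative int as a protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_int_field(field_number, value):
    """Encode one varint-typed field: tag byte(s), then the value."""
    tag = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(tag) + encode_varint(value)

wire = encode_int_field(1, 150)
print(wire.hex())                    # 089601 -- 3 bytes on the wire
print(len(json.dumps({"id": 150})))  # 11 bytes for the same datum as JSON
```

The receiver needs the schema to know that tag 1 means `id`, which is exactly the trade: bytes saved on the wire in exchange for agreement on a schema up front.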
Comparable systems use other IDLs, e.g. Facebook uses Apache Thrift.
Versioning is easier because your fields are defined explicitly, so you can ignore a client sending an old field with no ill effect (again, you can access individual fields without unpacking), whereas with JSON you have to deserialize first. Also, if your schema changes and you're using JSON without protobufs, either the client or the server may end up making wrong assumptions about the input data. There's no such ambiguity with protobufs: future changes to a proto message add new fields, and old fields can just be marked deprecated.
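A toy illustration of that JSON failure mode (the field names here are hypothetical; the point is that a schema change fails silently, far away from where the change was made):

```python
import json

# An old client still sends the field under its original name...
old_client_payload = json.dumps({"user_name": "ada"})

# ...while the server has since been updated to expect a renamed field.
data = json.loads(old_client_payload)
name = data.get("full_name")  # no error, just a silently missing value

assert name is None  # the mismatch surfaces wherever `name` is next used
```

With an explicit schema, a rename like this is a visible, reviewable change to the message definition rather than an implicit assumption buried in two codebases.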
RPC is preferable to JSON for server-to-server communication, but client-server still often uses json just because it's often easier for your client app to interpret json. Systems like gRPC allow servers to emit json as well: https://developers.google.com/protocol-buffers/docs/proto3#j...
I've heard of performance-oriented web apps using protobufs on both client and server.
I'd really love to see something like gRPC implemented over CBOR.
CBOR -- Concise Binary Object Representation -- http://cbor.io/ -- has all the performance of a binary protocol, with semantics basically identical to JSON.
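As a rough illustration of how close the data model is to JSON's (and how compact the encoding gets), here's a stdlib-only Python sketch of a deliberately partial CBOR encoder. Real code should use an existing CBOR library; this handles only small non-negative ints, strings, lists, dicts, and bools:

```python
import json

def _head(major, arg):
    """CBOR initial byte(s): 3-bit major type plus a length/value argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    if arg < 256:
        return bytes([(major << 5) | 24, arg])
    raise ValueError("longer arguments omitted in this sketch")

def cbor_encode(obj):
    if isinstance(obj, bool):  # check bool before int: bool subclasses int
        return b"\xf5" if obj else b"\xf4"
    if isinstance(obj, int) and obj >= 0:
        return _head(0, obj)                 # major type 0: unsigned int
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return _head(3, len(data)) + data    # major type 3: text string
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(map(cbor_encode, obj))
    if isinstance(obj, dict):                # major type 5: map of pairs
        return _head(5, len(obj)) + b"".join(
            cbor_encode(k) + cbor_encode(v) for k, v in obj.items())
    raise TypeError(f"unsupported type in this sketch: {type(obj)!r}")

doc = {"a": 1}
print(cbor_encode(doc).hex())   # a1616101 -- 4 bytes
print(len(json.dumps(doc)))     # 8 bytes as JSON
```

Because the model is the same maps/arrays/strings/numbers shape as JSON, any value that round-trips through one encoding round-trips through the other, which is what makes the "flip a bit to switch formats" workflow possible.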
I appreciate some of the things protobuf does to help you version, but I do not appreciate the protobuf compiler as a dependency and a hurdle for contributors, or for wire debugging. CBOR has libraries in every major (and most minor) language and works without a fuss with tiny overhead, both at runtime and in dependency size. It's pretty pleasant to work with.
There are two reasons why I think simpler binary packing libraries like CBOR, MsgPack, or BSON can't really match what Protobuf gives you:
- Assistance with schema evolution and versioning is one of the best parts about using Protobuf in an API. It is really like the best parts of XML and XML Schema (validation, documentation, interoperability) without any of the bloat.
- Code generation can be a pain to set up initially, but is very friendly once you're actually using real objects in code. There is no need to think about the on-the-wire representation... everything 'just works'. There is no need to ensure you don't accidentally serialize fields in the wrong order, or worry about encodings, etc.
Also, there is a binary decoder included with protoc that can print a debug decoding of any protobuf binary message, including integer tags for different fields. Wouldn't you have pretty much the same problems with dissection and debugging on-the-wire in CBOR?
It is really quite pleasant to use Protobuf for an API, I can see why Google is opinionated in including it as the only option with gRPC.
I don't find the utility to outweigh the PITA. I've been on both sides of the fence, and maintained large projects with heavy protobuf use.
I don't find the schema validation powerful enough. You still have to write correct code to migrate semantics. Avoiding incorrect reuse of field names is... nice, but also the most trivial of the problems.
(I do like schemas in theory. It's possible I haven't worked with good enough tooling around protobufs to really experience joy resulting from those schemas. The protos themselves certainly aren't guaranteed to be self-documenting, in some of my experiences.)
I don't find code generation results to be smooth in many languages. At $company, we switched through no less than three different final-stage code generators in search of a smooth experience. Not all of this was misplaced perfectionism: in some cases, it was driven by the sheer absurd method-count verbosity of one code generator basically making it impossible to ship an application. (You've perhaps heard of the method count limits in early Android? Bam.)
I don't think the debug decoding tools for protobuf are directly comparable to CBOR's interoperability with JSON. CBOR and JSON are literally interchangeable. That means I can flip a bit in my program config and start using one instead of the other. Config files? Flip the bit. Network traffic not meant for humans? Flip the bit. Need to debug on the network? Flip the bit. Want to pipe it into other tools like `jq`? No problem. There's a whole ecosystem around JSON, and I like that ecosystem. Working with CBOR, I can have the best of both worlds.
Sometimes opinionated is good and powerful and really makes things streamlined in a way that contributes to the overall utility of the system. I don't think this is one of them. Almost every major feature of gRPC I'm interested in -- multiplexing, deadlines and cancellations, standard definitions for bidi streaming, etc -- has nothing to do with protobufs.
Natch, I used "I" heavily in this comment, because I realize these are heavily subjective statements, and not everyone shares these opinions :)
I think of gRPC as HTTP/JSON with most of the gotchas fixed for me. HTTP/2 makes things like concurrent requests easy. Protobuf fixes schema problems (and I think is technically replaceable if you love your JSON). From the gRPC/protobuf definitions you can easily generate clients for many languages.
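For reference, the kind of service definition those clients get generated from looks roughly like this (this is the stock proto3 "hello world" shape from the gRPC docs, not anything specific to the parent's setup):

```proto
syntax = "proto3";

service Greeter {
  // A single unary RPC; protoc + the gRPC plugin emit client
  // stubs and server skeletons for this in each target language.
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
```

One file like this is the contract for every language's client and server, which is the part that's genuinely more work to reproduce with HTTP/JSON plus Swagger.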
I'm sure it's possible to create the same thing with HTTP/JSON (and swagger?), but I find it to be more work.
If you do JSON RPC over HTTP/2, the performance will actually not be that different. The serialization layer is different, and for sure protobuf will be faster to (de)serialize than JSON, but for most applications serialization is not the bottleneck (you can lose a lot more performance in the HTTP implementation, for example).
Let's turn the question around: when should you prefer HTTP/JSON?
At the moment as long as you want to use the endpoint directly from a browser. gRPC uses some HTTP features that are not [yet] available from within browser JS APIs.
Perf is one reason indeed; however the integration with protobuf and the tools to generate stubs for various languages from that interface are also very helpful. gRPC is what SOAP should have been, IMO.
Primarily performance: HTTP head-of-line blocking [1] kills latencies and wastes a lot of resources, especially with microservices fanning requests out recursively to multiple services.
We have had to force keep-alive on, and even forcefully turn it off in some cases :(.
A single point of integration for all of your microservice necessities (load balancing, circuit breaking, load shedding, distributed tracing, metrics, logging). It's really interesting that no one talks about this, but once you get relatively large, the lack of this integration point is the source of a lot of pain.
Mostly. I can't speak to protobuf's serialization format personally, but similar binary serializations like Thrift's TCompactProtocol or TDenseProtocol outperform JSON, and you get schemas for "free*" (as in, baked into your producer's and consumer's glue code by virtue of codegen, not bolted on as an afterthought).
The IDL and codegen are a big part of gRPC and Thrift. The rest is opinionatedness and less cognitive effort -- the format is non-human-readable anyway, so there's less propensity for bikeshedding about cosmetic stuff than with JSON.