Making HTTP realtime with HTTP 2.0 (docs.google.com)
142 points by bpierre on Oct 20, 2013 | hide | past | favorite | 45 comments


I just feel like the basic idea of HTTP 2.0 misses the point of a stateless, text-based protocol. Also, muxing and flow control? These are not things application-level protocols should care about.


It does miss the point of a stateless, text-based protocol, but only because those properties are core to the problems with HTTP/1.1 that are identified near the start of the presentation.

The nice thing about this approach is that it is entirely contained such that the web application in the browser doesn't know (or even need to know) the difference, in the same way that current web applications don't need to worry about whether the request data was compressed or not.


Certainly, there are plenty of things in HTTP 2.0 that make sense to be there, such as server push, specialized header compression, etc.

However, some things, such as avoiding making multiple connections due to congestion and due to slow-start being slow, flow control, and arguably encryption, seem like they would be better addressed in TCP. And in fact, Google is trying to do that with QUIC.

The problem is that if (when) those things get standardized in HTTP/2.0, they need to be supported forever, even if an improved transport layer protocol makes them obsolete in relatively short order.


You can't address these things in TCP. There is so much network infrastructure deployed that trying to push any non-backward-compatible change to TCP is futile. Google did the right thing by addressing the problems at layer 5.


Then just tunnel on top of UDP, like QUIC currently does. Doesn't mean it has to be specific to one application.


Your post made me reflect a bit and I agree with you whole-heartedly. I wouldn't have such a visceral dislike of HTTP/2.0 if it didn't have the TCP-like features in it.


One wonders why the hell they're trying to fix flow control, traffic priority, encryption, etc. in a standard meant for just moving documents around.

It reeks of overengineering; might as well spend that energy improving TCP or putting together a better secure sockets layer.

Most of the problems they cite are the result of people shoving too much shit (ads, trackers, social media, etc.) into their sites anyway.


It reeks of overengineering because you're not properly understanding the problem.

Flow control is necessary because it's a multiplexed protocol. Without your own flow control commands, flow control is applied over the entire TCP stream, not on your substreams. That doesn't do what you want. Try it: implement any kind of multiplexed protocol yourself, and you will quickly see why it is necessary.
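A minimal sketch of why per-stream flow control matters in a multiplexed protocol. The `Stream` class, window sizes, and `window_update` naming mirror the HTTP/2 concepts but are illustrative, not the actual SPDY/HTTP2 framing:

```python
# Toy per-stream flow control: each stream has its own window of credit,
# so one slow consumer cannot stall the other streams sharing the connection.

class Stream:
    def __init__(self, stream_id, initial_window=4):
        self.stream_id = stream_id
        self.window = initial_window  # bytes the peer will accept right now
        self.pending = b""            # data waiting for window credit

    def send(self, data):
        """Queue data; only as much as the window allows goes on the wire."""
        self.pending += data
        sendable = self.pending[:self.window]
        self.pending = self.pending[self.window:]
        self.window -= len(sendable)
        return sendable  # what actually gets written to the shared TCP connection

    def window_update(self, increment):
        """Peer consumed data and granted more credit (a WINDOW_UPDATE)."""
        self.window += increment
        return self.send(b"")  # flush anything that was blocked

# Two streams share one connection; stream 1 exhausting its window
# does not block stream 3, because each has its own window.
s1, s3 = Stream(1), Stream(3)
assert s1.send(b"AAAAAAAA") == b"AAAA"   # only 4 bytes allowed out; rest waits
assert s1.pending == b"AAAA"
assert s3.send(b"BB") == b"BB"           # stream 3 is unaffected by stream 1's stall
assert s1.window_update(4) == b"AAAA"    # credit arrives, blocked data flushes
```

With only TCP's connection-wide flow control, the same backpressure would have throttled stream 3 as well.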

Encryption: some of it might be for security, but a lot of it is also for backward compatibility. SPDY/HTTP2 is a different protocol than HTTP1, and there are a lot of broken routers out there that will break things unless you encrypt the entire connection as an SSL session.

Those who do not understand SPDY are doomed to complain about it on HN, and then one day maybe reinvent it, poorly.


> Flow control is necessary because it's a multiplexed protocol. Without your own flow control commands, flow control is applied over the entire TCP stream, not on your substreams. That doesn't do what you want. Try it: implement any kind of multiplexed protocol yourself, and you will quickly see why it is necessary.

Why should HTTP be a multiplexed protocol, though? That's the issue.

> Encryption: some of it might be for security, but a lot of it is also for backward compatibility. SPDY/HTTP2 is a different protocol than HTTP1, and there are a lot of broken routers out there that will break things unless you encrypt the entire connection as an SSL session.

So why not just use SSL/TLS? Why wrap it into the HTTP protocol?

> Those who do not understand SPDY are doomed to complain about it on HN, and then one day maybe reinvent it, poorly.

Those who can't see the benefits of and reasons for HTTP/1 are doomed to make a mess of it by "improving" (or "working around") it.


> Why should HTTP be a multiplexed protocol, though? That's the issue.

Because HTTP is inefficient due to the TCP three-way handshake, and because pipelining has many problems. These are fixed by multiplexing.

> So why not just use SSL/TLS? Why wrap it into the HTTP protocol?

They are using TLS.


> Because HTTP is inefficient due to the TCP three-way handshake, and because pipelining has many problems. These are fixed by multiplexing.

So why should HTTP care about this?

> They are using TLS.

But it's specified as part of the HTTP protocol.


HTTP 2.0 is a bet that those ideals don't matter anymore. If the users of HTTP disagree, it just won't be adopted, simple as that.


Until bosses without a low-level IT background show up and ask you why you're not expanding their buzzword bingo options.


it fixes what google cares about


Could anybody explain to me why multiplexing is necessary? It definitely won't speed things up, and I don't see why different resources couldn't be loaded sequentially (instead of concurrently) over the same TCP/TLS connection. Multiplexing increases complexity and adds overhead by introducing frame headers.

Flow control seems weird to me too. Maybe it's just that I've failed to see a scenario where flow control at the HTTP layer is really useful.

Binary headers and streaming (a long-lived TCP connection) are definitely interesting. In fact, I would be really happy if I could get a connection, keep sending binary-format HTTP requests, and have the server respond with the resources I requested, in the same order.

EDIT: typos


Flow control can definitely be useful when streaming audio/video files. A browser only needs a few megabytes of buffer to display 1080p content properly. Everyday example: a user starts watching a YouTube video, gets interrupted and pauses it, then closes the browser because they got distracted. The entire video will have been downloaded and put in the cache even though only a few megabytes were needed/watched. That's quite a waste of bandwidth at 1080p.

Multiplexing allows you to have the same level of concurrency as you currently get with domain sharding, without requiring multiple TCP connections (and TLS contexts). Starting a TCP connection and getting it up to speed takes a while, especially on congested links (packet loss, spurious retransmits, slow starts, etc.), and TLS makes the problem worse by requiring multiple round trips to set up encryption. It also takes quite a bit of memory on servers to maintain hundreds of thousands of TCP socket and TLS states, and a lot of CPU to set up TLS contexts (Diffie-Hellman can be quite expensive CPU-wise). Then there are flow-based routers, load balancers, stateful firewalls, and other stateful network equipment. We'll get greater performance out of them by using fewer concurrent connections.
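The per-frame overhead that multiplexing adds is small: each frame carries a short binary header identifying its stream. A hedged sketch of the 9-octet frame header shape HTTP/2 eventually standardized in RFC 7540 (the 2013 drafts differed in detail):

```python
import struct

# Pack/unpack the HTTP/2 frame header as standardized in RFC 7540:
# 24-bit payload length, 8-bit type, 8-bit flags, 1 reserved bit + 31-bit stream id.

def pack_frame_header(length, ftype, flags, stream_id):
    return struct.pack(">BHBBI", (length >> 16) & 0xFF, length & 0xFFFF,
                       ftype, flags, stream_id & 0x7FFFFFFF)

def unpack_frame_header(header):
    hi, lo, ftype, flags, stream_id = struct.unpack(">BHBBI", header)
    return ((hi << 16) | lo, ftype, flags, stream_id & 0x7FFFFFFF)

# A 16 KB DATA frame (type 0x0) with END_STREAM (flag 0x1) on stream 5
# costs 9 bytes of header -- far cheaper than a fresh TCP+TLS connection.
hdr = pack_frame_header(length=16384, ftype=0x0, flags=0x1, stream_id=5)
assert len(hdr) == 9
assert unpack_frame_header(hdr) == (16384, 0x0, 0x1, 5)
```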

I think moving to a binary protocol and reducing the number of TCP/TLS connections is a very good thing, long overdue IMHO.

EDIT: typos :)


The flow control one does make sense. As for multiplexing, I had missed the fact that multiple tabs need to load simultaneously. It makes more sense to let requests to the same host load concurrently over one single connection than to keep one connection per tab.

Thanks for the explanation.


TLS session resumption could go a long way toward helping with the problems of TLS.

I'm still not sold on why a Layer 7 protocol should be doing things that a Layer 3/4 protocol should be worried about.


fuck binary.


I expected this to get downvoted. Damn! Turns out I'm not the only one thinking so then ;-)


Thanks!


Multiplexing is necessary because the remote/local port of an open TCP connection acts like a temporary cookie, uniquely identifying a browsing session. The longer you can keep a connection open, and the more requests you send over it, the more tracking you can do. "Ideally" you'd have one connection to google-analytics and keep it open for many minutes for better user tracking.


Stop letting your paranoia get in the way of reality.


TCP is great for long-lasting connections / large downloads, and HTTP/1.x, with its connection-per-request, small-resources model, uses TCP inefficiently.

Keep-alive reduces some of this inefficiency by re-using connections, but each connection still has to go through the three-way handshake and grow its congestion window; multiplexing requests over a single connection reduces these inefficiencies further.

Multiplexing also lets the browser communicate the priority of downloads. With HTTP/1.x, once all the connections are in use, the only way to alter priority is to cancel a connection and start a new one (expensive); with HTTP/2.0 the browser communicates the priority with the request, so the server can pause the download of lower-priority resources.
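A sketch of priority-aware scheduling over one multiplexed connection: the server drains whichever stream currently has the highest priority, so a reprioritized resource jumps the queue without any connection teardown. The `Scheduler` class and the numeric weights here are illustrative, not the spec's actual priority scheme:

```python
import heapq

# The server picks the next stream to service from a priority queue
# instead of whichever TCP connection happens to be free.

class Scheduler:
    def __init__(self):
        self._heap = []   # (negative priority, seq, stream_id): max-heap behavior
        self._seq = 0     # tie-breaker preserving arrival order

    def enqueue(self, stream_id, priority):
        heapq.heappush(self._heap, (-priority, self._seq, stream_id))
        self._seq += 1

    def next_stream(self):
        return heapq.heappop(self._heap)[2]

sched = Scheduler()
sched.enqueue(stream_id=1, priority=10)  # the HTML document
sched.enqueue(stream_id=3, priority=1)   # a below-the-fold image
sched.enqueue(stream_id=5, priority=5)   # a render-blocking stylesheet
assert [sched.next_stream() for _ in range(3)] == [1, 5, 3]
```

With HTTP/1.x and six connections per host, the image could monopolize a connection that the stylesheet then has to wait for; here the server simply serves the stylesheet's frames first.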


I wasn't convinced until I saw an example of multiplexed progressive images:

http://www.youtube.com/watch?v=zCDcmit5-fE&feature=player_de...


Flow control is necessary for any multiplexed protocol. Try implementing a multiplexed protocol yourself, and you will notice that without flow control commands, it is very easy for someone to DoS you by overflowing your buffers.


Ilya Grigorik, the presentation author, has also written a book called High Performance Browser Networking that I highly recommend if you are interested in the subject.

There is a free online version available at http://chimera.labs.oreilly.com/books/1230000000545/index.ht...


I'm a bit confused that multiplexing over one TCP connection is somehow seen as a strength of this new protocol. OK, I see how muxing streams over a TCP session theoretically lets the TCP session "fill" the available bandwidth for a longer period of time. But it also means that every stream in the session suffers from a few packet drops (and the resulting "sputtering" slow starts over lossy physical media, i.e. the mobile use case).

As for why flow control is being pushed into the app layer, the answer again comes down to multiplexing of multiple streams over one TCP connection (since without stream-level flow control, one slow end-point for a stream can potentially block the progress of every other stream in that session... see discussion here: https://groups.google.com/forum/#!topic/spdy-dev/g4PiZBTW-34)

I guess in the end, only real measurements with a mature implementation will answer the question. As it turned out with SPDY (1), the results of all this work might still not be enough to overcome the basic problems with TCP.

(1) http://www.guypo.com/technical/not-as-spdy-as-you-thought/


Multiplexing is something that many application protocols could make use of, and it should be added between the application protocol and TCP, not engineered into every application protocol. Plus, HTTP is supposed to be a simple text-based protocol that I can drive by typing into a telnet window. That doesn't seem possible with HTTP/2.0, though :-/
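The contrast can be made concrete. An HTTP/1.1 request is CRLF-delimited ASCII you can type into telnet by hand, while an HTTP/2 connection opens with a fixed 24-octet binary preface followed by framed, header-compressed data (preface as later standardized in RFC 7540; example.com below is a placeholder host):

```python
# An HTTP/1.1 request is plain, typeable text:
request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"
)
request.encode("ascii")  # the whole request is plain ASCII; no error raised

# HTTP/2's client connection preface, by contrast, is a fixed magic string
# followed by binary SETTINGS frames (RFC 7540, section 3.5) -- nothing a
# human can realistically type, compress, and frame interactively:
preface = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
assert len(preface) == 24
```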



Will this let us do SRP over HTTP?


HTTP has now been essentially split into a lower transport layer and an upper semantics layer. If someone created an SRP extension for HTTP (see https://bugzilla.mozilla.org/show_bug.cgi?id=356855 ) it would apply equally to HTTP 1.1 and 2.0. Or you could use SRP with TLS ( http://tools.ietf.org/html/rfc5054 ), but this has the same UX problems as client certs.


Looks really complicated. I don't like it.


Why is non-hypertext being delivered via a protocol for hypertext?


What is content-type for then?


To specify the format/encoding of the hypertext?

HTTP was around before I was, hence the question.

If we just want to transfer generic files, maybe we should create a file transfer protocol.


We could call it FTP!

In all seriousness, HTTP is a good example of scope creep. The type and volume of content sent in a typical session are far different from what was common a decade ago.


Nonsense. HTTP is a good example of an enduring protocol design, under an enduring architecture (The Web combination of URI+MIME+HTTP). As the payloads have changed, it has evolved very little. HTTP/2.0 is an optimization to this architecture. It's unlikely it will ever fully replace HTTP/1.0.

In 2003, the type of content was exactly the same: HTML, Images, CSS, JavaScript, Audio and Video. The formats weren't all that different. Nowadays the main difference is you see a lot more JSON and/or XML (mostly RSS).

In 1995-1996, the formats and codecs were a bit more primitive, and the net was smaller, but it was the same sort of content: AVIs, MOVs, GIFs, JPGs, HTML, and early 1.0 JavaScript.

HTTP+URI is also generally a much better protocol for transferring state than FTP. It's both simpler AND more general.


I was joking about FTP. I was trying to say that HTTP has stood the test of time much better, despite the fact that what we have asked of it has changed quite a bit.

> In 1995-1996, the formats and codecs were a bit more primitive, and the net was smaller, but it was the same sort of content: AVIs, MOVs, GIFs, JPGs, HTML, and early 1.0 JavaScript.

There is a difference in magnitude between the present-day web session and the typical 1995 one. The number of resources needed to properly render many pages is much, much larger. HTTP 1.0 and 1.1 simply weren't designed with this requirement in mind. They still do a pretty good job handling it, but it's not hard to argue that a protocol designed around improving parallelism will have better performance characteristics.


> I was joking about FTP.

Sorry then :)

> There is a difference in magnitude between the present day web session and the typical 1995 one.

I agree with that, just misinterpreted your post as a pile on against HTTP. Certainly HTTP/2.0 and SPDY are necessary to keep up with the complexity of today's pages.


HTTP was intended to be a uniform state-transfer protocol (and that includes files) dating back to the 1.0 days, just as URLs/URIs were intended to be uniform identifiers for any source of state.

It just took until 2000 for Roy to call this "REST".


I feel the presentation would have had a lot more weight and clarity if not for the memes in the bottom right corner. As it stands, they just make it another fantasy creation, except that the writer has an email address @google.com.


Ilya Grigorik knows his stuff incredibly well when it comes to networking. I encourage you to read his blog sometime. You'll find it very informative. Personally it is one of my favorite blogs.

http://igvita.com


Then again, that's why I like to read things without having a preconceived idea of the author. It's the best way to get a fresh view.

People write some good stuff and some bad stuff. Not because they're not smart, in general; more because of pressure from various sources, or because they lost interest, or whatnot.

Personally, every time I see a "not so much of a win" followed by memes meant to say "look, we rock!", it makes me feel pity for our whole industry.


I found the presentation deck very disjointed; presumably his speaking notes would have drawn it together.

Here's an alternative deck by Mark Nottingham:

http://www.mnot.net/talks/http2-wtf/#/



