
30 Network: From HTTP/1 to HTTP/3, What You Need to Understand #

Hello, I’m Ishikawa.

When it comes to HTTP, you are probably no stranger to it, but are you sure you really understand it? Or perhaps you think understanding it isn't important for frontend development. In reality, understanding HTTP helps us formulate better strategies for application-layer performance optimization on the frontend.

So today, we will take a look at different versions of HTTP. We will discuss the optimizations we can make for the shortcomings of HTTP/1, how we can leverage the advantages of HTTP/2, and what expectations we can have for HTTP/3. Now, let’s start with the history of HTTP.

HTTP/1.0 #

The earliest version of HTTP was defined in 1991 by Tim Berners-Lee, the father of the World Wide Web. This version, HTTP/0.9, was so simple that its description fit on a single page, and a request consisted of a single line of text. HTTP/0.9 supported only the GET method, allowing clients to retrieve HTML documents from servers; it supported no other file formats and no way to upload information.

HTTP/1.0, drafted starting in 1992 and finalized in 1996, introduced elements and features now familiar to us, such as headers, error codes, and redirects. However, it still had several core problems: first, there was no way to keep a connection open across requests; second, there was no support for virtual hosting, so multiple websites couldn't share the same IP address; third, caching options were lacking.

About half a year later, HTTP/1.1 was introduced, addressing these issues. Yet even as the number of internet users and web applications kept climbing, the HTTP standard remained largely unchanged for almost two decades, and during this period several performance bottlenecks, left unaddressed, became increasingly prominent.

These performance bottlenecks can be categorized into several aspects:

  • Latency: the time it takes an IP packet to travel from one point to another. The round-trip time (RTT) is roughly twice the one-way latency, and an HTTP request may involve multiple round trips, compounding the delay.
  • Bandwidth: like the roads we drive on every day, the narrower the lanes and the heavier the traffic, the more likely congestion becomes; multi-lane roads typically shorten the commute.
  • Connection time: establishing a TCP connection requires a three-way handshake. The client sends a synchronization (SYN) request to the server, the server acknowledges it (SYN-ACK), and the client acknowledges back (ACK).
  • TLS negotiation time: the additional round trips needed on top of the TCP handshake to establish a secure HTTPS connection (a back-of-the-envelope model of these costs follows this list).
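To see how these costs compound, here is the promised rough sketch in TypeScript. The RTT value and the number of round trips per step are illustrative assumptions, not measurements:

```ts
// Rough model of time-to-first-byte over a fresh HTTPS connection.
// All numbers are illustrative assumptions.
const rttMs = 50;                 // one round trip: ~2x one-way latency

const dnsLookup = rttMs;          // resolve the hostname (often cached)
const tcpHandshake = rttMs;       // SYN -> SYN-ACK -> ACK: one full RTT before any data
const tlsNegotiation = 2 * rttMs; // TLS 1.2 needs ~2 RTTs (TLS 1.3 needs 1)
const httpRequest = rttMs;        // request out, first response byte back

const total = dnsLookup + tcpHandshake + tlsNegotiation + httpRequest;
console.log(`~${total} ms before the first byte arrives`); // ~250 ms
```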

Meanwhile, developers came up with a series of “workarounds” to address these issues. Some of these solutions included using HTTP pipelining, domain sharding, bundling resources to reduce HTTP requests, and inlining smaller resources. Let’s take a look at some of these performance optimization techniques used during this period.

Persistent HTTP with Pipelining #

First, let's examine persistent HTTP with pipelining. In the example below, if the frontend requests HTML and CSS resources one after another, two connections must be established, each with its own three-way handshake, six handshake steps in total, which is clearly time-consuming.

[Figure: separate connections for the HTML and CSS requests, each with its own TCP handshake]

Persistent HTTP (keep-alive) lets us reuse an existing connection across multiple application requests, cutting out the repeated handshakes. However, this is still not optimal, because the client must strictly follow first-in-first-out (FIFO) order: it sends one request, waits for the complete response, and only then initiates the next. The left side of the diagram below shows this flow.

[Figure: persistent HTTP (left) vs. HTTP pipelining (right)]

To optimize further, HTTP pipelining builds on persistent HTTP by moving the FIFO queue from the client (request queue) to the server (response queue), as shown on the right side of the diagram above. Compared with the persistent-HTTP example on the left, where the CSS request must wait for the HTML response to return before it can proceed, in the pipelining example both requests are in flight at once, further reducing the overall duration.

However, when using persistent HTTP with pipelining, special attention must be paid to security, which in practice means using HTTPS connections. Apple's iTunes is a real-world case where this approach was used to improve application performance.
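Connection reuse is easy to observe from application code today. Below is a minimal Node.js sketch, with `example.com` as a placeholder host, using an `http.Agent` with `keepAlive` enabled so that sequential requests share one TCP connection instead of paying the handshake cost each time:

```ts
import http from "node:http";

// One agent, reused sockets: subsequent requests skip the TCP handshake.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

function get(path: string): Promise<number> {
  return new Promise((resolve, reject) => {
    const req = http.get(
      { host: "example.com", path, agent }, // example.com is a placeholder
      (res) => {
        res.resume(); // drain the body so the socket can be reused
        res.on("end", () => resolve(res.statusCode ?? 0));
      }
    );
    req.on("error", reject);
  });
}

// Both requests travel over the same persistent connection.
// (Run as an ES module for top-level await.)
await get("/index.html");
await get("/style.css");
```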

Domain Sharding #

Another very common performance optimization in the HTTP/1.1 era was domain sharding, which involved serving resources from additional subdomains. Why was this done?

Take a highway as an analogy. Under HTTP/1.1, browsers typically allow only six TCP connections per host, the equivalent of six lanes. To load more resources simultaneously, we can introduce more hosts: with three subdomains at six connections each, we get 18 lanes. This solution is not free, though. Each new hostname requires an extra DNS lookup, each additional socket consumes resources, and developers must manually manage where and how resources are sharded. So it has to be weighed as a whole.
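As an illustration only (the shard hostnames are hypothetical), sharding in practice often boiled down to deterministically mapping each asset to one of several asset subdomains:

```ts
// Deterministically map an asset path to one of N shard hostnames,
// so the same asset always resolves to the same subdomain (cache-friendly).
const SHARDS = ["assets1.example.com", "assets2.example.com", "assets3.example.com"];

function shardUrl(path: string): string {
  let hash = 0;
  for (const ch of path) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return `https://${SHARDS[hash % SHARDS.length]}${path}`;
}

console.log(shardUrl("/img/logo.png")); // e.g. https://assets2.example.com/img/logo.png
```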

Bundling Resources to Reduce HTTP Requests #

Another technique in the HTTP/1.1 era was resource bundling. This involved using CSS sprites for images and bundling JS or CSS files. CSS sprites involved consolidating different image elements into one image and controlling their display positions using CSS. Bundling JS or CSS files involved combining different CSS and JS files into one file to reduce requests.

Similarly, these methods have pros and cons. While they reduce the number of requests and responses, they also present some problems. For example, bundled resources may not be useful for every page, resulting in resource waste. Therefore, when using this optimization method, practical considerations should be taken into account.
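Conceptually, bundling was just concatenation. Here is a naive sketch, with hypothetical file names, of what early build scripts did before dedicated bundlers existed:

```ts
import fs from "node:fs";

// Concatenate several JS files into one bundle: N requests become 1.
// The file names are assumptions for illustration.
const files = ["vendor.js", "utils.js", "app.js"];

const bundle = files
  .map((f) => `/* ${f} */\n${fs.readFileSync(f, "utf8")}`)
  .join("\n;\n"); // defensive semicolon between files

fs.writeFileSync("bundle.js", bundle);
```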

Inlining Smaller Resources #

Lastly, let's look at inlining. Using data URIs, we can embed smaller resources directly in a page, reducing the number of requests. But this method is a double-edged sword: baking all resources into the page increases the cost of the initial load. Best practice is to inline resources of roughly 1-2 KB; it is not suitable for larger resources.
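A quick sketch of producing such a data URI in Node.js; the file name and size are assumptions for illustration:

```ts
import fs from "node:fs";

// Inline a small image as a data URI: no extra request, but the bytes
// are duplicated on every page that embeds them and cannot be cached alone.
const png = fs.readFileSync("icon.png"); // assumed to be a small (~1-2 KB) file
const dataUri = `data:image/png;base64,${png.toString("base64")}`;

const html = `<img src="${dataUri}" alt="icon" />`;
console.log(html);
```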

HTTP/2.0 #

After discussing the history and performance optimization of HTTP/1.1, let’s take a look at the era of HTTP/2.0. You may wonder why there haven’t been many changes to HTTP after version 1.1. This is because the cost of upgrading is too high and requires coordination among browsers, servers, proxies, and other middleware, which means compatibility issues can lead to service interruptions. So the industry lacked the motivation to drive change.

However, in 2009, two Google engineers, Mike Belshe and Roberto Peon, proposed an alternative to HTTP/1.1 called SPDY (pronounced "speedy"). SPDY was not initially positioned as an official upgrade to HTTP/1.1, but because it suited the needs of modern web applications so well, by 2012 it had gained support from mainstream browsers like Chrome, Firefox, and Yandex. Around the same time, internet companies like Google, Facebook, and Twitter began supporting SPDY on their backend servers and proxies.

At this point, the IETF, which maintains the HTTP standard, could no longer sit still, and between 2012 and 2015 it defined the series of improvements that became HTTP/2: reducing latency through full request and response multiplexing, minimizing protocol overhead by compressing HTTP header fields, and adding support for request prioritization and server push.

After HTTP/2 was officially released in 2015, SPDY stepped aside. Of course, SPDY and HTTP/2 are not in competition; instead, SPDY played the role of a pioneer and “guinea pig” throughout the process, testing the performance and effectiveness of each optimization concept through numerous experiments. With an open attitude, the IETF incorporated many elements of SPDY into the final standard. So what changes did HTTP/2 make to achieve the aforementioned features? Let’s take a closer look.

Characteristics of HTTP/2.0 #

In HTTP/2, the most fundamental concept is the binary framing layer, which specifies how HTTP messages are encapsulated and transmitted between clients and servers.

Before diving in, we need to be familiar with a few concepts: streams, messages, and frames. A stream is a bidirectional flow of bytes within an established connection and can carry one or more messages. A message is a complete sequence of frames that maps to a logical request or response. A frame is the smallest unit of communication in HTTP/2; each frame contains a header identifying the stream it belongs to.

[Figure: streams, messages, and frames within a single HTTP/2 connection]

In short, the "layer" in "binary framing layer" refers to the part between the socket and the HTTP interface exposed to applications. HTTP/2 replaces the newline-delimited plaintext of the HTTP/1.x protocol with binary-encoded frames, which are then mapped to messages belonging to particular streams, all of which can be multiplexed within a single TCP connection. This is the foundation of every other feature and performance optimization the HTTP/2 protocol provides.
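To make the framing concrete, here is a sketch of parsing the fixed 9-byte HTTP/2 frame header as laid out in RFC 7540; production clients and servers naturally rely on hardened implementations rather than code like this:

```ts
// Parse the fixed 9-byte HTTP/2 frame header (RFC 7540, section 4.1).
interface FrameHeader {
  length: number;   // 24-bit payload length
  type: number;     // e.g. 0x0 = DATA, 0x1 = HEADERS
  flags: number;    // 8 flag bits; their meaning depends on the frame type
  streamId: number; // 31 bits identifying which stream this frame belongs to
}

function parseFrameHeader(buf: Buffer): FrameHeader {
  return {
    length: buf.readUIntBE(0, 3),
    type: buf.readUInt8(3),
    flags: buf.readUInt8(4),
    streamId: buf.readUInt32BE(5) & 0x7fffffff, // top bit is reserved
  };
}
```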

[Figure: the binary framing layer sitting between the socket and the exposed HTTP interface]

This design requires both clients and servers to speak the new binary encoding: an HTTP/1.x client cannot understand an HTTP/2-only server, and vice versa. This kind of incompatibility is one of the reasons HTTP stagnated at version 1.1 for so long. Now let's take a closer look at HTTP/2's new features.

Request and Response Multiplexing #

In HTTP/1.x, each connection can deliver only one response at a time. If the client wants to make multiple parallel requests to improve performance, it must open multiple TCP connections. This causes head-of-line blocking at the application layer as well as inefficient use of the underlying TCP connections.

[Figure: frames from streams 1, 3, and 5 interleaved on a single HTTP/2 connection]

The new binary framing layer in HTTP/2 removes these restrictions. Since HTTP messages are broken into individual frames, frames can be interleaved in transit and reassembled on the other end, achieving full request and response multiplexing. In the diagram above, multiple streams share the same connection: the client is transmitting DATA frames for stream 5 to the server, while the server transmits interleaved frames for streams 1 and 3 to the client, so three streams are in flight in parallel. (A short code sketch after the list below shows what this looks like from the application side.)

This approach brings several benefits:

  • It avoids blocking and allows multiple requests and responses to proceed in parallel.
  • It can transmit multiple requests and responses in parallel using a single connection, eliminating unnecessary HTTP/1.x workarounds such as domain sharding, image sprites, and file concatenation.
  • By eliminating unnecessary latency and improving the utilization of available network capacity, it reduces page load time.
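Here is the sketch promised above: a minimal HTTP/2 client using Node.js's built-in `http2` module, with `example.com` as a placeholder origin. All three requests ride the same connection as interleaved frames:

```ts
import http2 from "node:http2";

// One TCP connection, many concurrent streams.
const session = http2.connect("https://example.com"); // placeholder origin

function fetchPath(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const stream = session.request({ ":path": path });
    let body = "";
    stream.setEncoding("utf8");
    stream.on("data", (chunk) => (body += chunk));
    stream.on("end", () => resolve(body));
    stream.on("error", reject);
  });
}

// All three requests are multiplexed over the same connection.
// (Run as an ES module for top-level await.)
const [html, css, js] = await Promise.all([
  fetchPath("/index.html"),
  fetchPath("/style.css"),
  fetchPath("/app.js"),
]);
console.log(html.length, css.length, js.length);
session.close();
```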

Header Compression #

Every HTTP transfer carries a set of headers describing the transferred resource and its attributes. In HTTP/1.x, this metadata is sent as plain text and adds 500 to 800 bytes of overhead to each transfer, sometimes thousands of bytes once HTTP cookies are involved.

To reduce this overhead and improve performance, HTTP/2 uses the HPACK compression format to compress the data of request and response headers. This format utilizes two seemingly simple but powerful techniques: it allows encoding headers using static Huffman codes that reduce their individual transmission sizes, and it requires clients and servers to maintain and update an indexed list of previously seen header fields (i.e., building a shared compression context) for more efficient encoding.

[Figure: HPACK header compression using static and dynamic tables]

HPACK uses two tables: a static table containing commonly used header elements, and a dynamic table that starts empty and gets populated based on actual requests. For a deeper understanding of the Huffman algorithm, you can refer to Huang Qinghao's algorithm course next door: How does HTTP/2 transfer the protocol header faster with Huffman encoding?.
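To illustrate the shared-index idea (this is a toy, not real HPACK, which also applies Huffman coding, a fixed static table, and size-based eviction), consider the following sketch:

```ts
// Toy illustration of HPACK-style indexing.
type Header = [name: string, value: string];

class DynamicTable {
  private entries: Header[] = [];

  // Returns a small index if the header was seen before; otherwise
  // stores it (as both endpoints would, identically) and returns the
  // full literal that must be transmitted this first time.
  encode(h: Header): number | Header {
    const i = this.entries.findIndex(([n, v]) => n === h[0] && v === h[1]);
    if (i !== -1) return i;  // send just the index: a few bits on the wire
    this.entries.unshift(h); // both sides update their table the same way
    return h;                // first occurrence: send the literal
  }
}

const table = new DynamicTable();
console.log(table.encode([":authority", "example.com"])); // literal on the first request
console.log(table.encode([":authority", "example.com"])); // small index on the next
```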

Request Prioritization #

As we saw earlier, in HTTP/2 the transmitted information can be split into individual frames, and frames from multiple streams can be multiplexed. Since the order in which the client and server interleave and deliver these frames matters, the HTTP/2 standard allows each stream to be given a weight between 1 and 256 and a dependency on another stream.

Stream dependencies and weights together form a prioritization tree expressing how the client would like to receive responses. The server, in turn, can use this information to prioritize stream processing by controlling the allocation of CPU, memory, and other resources, and, once response data is available, allocate bandwidth so that high-priority responses reach the client optimally.

For example, in example 1 above, streams A and B are at the same level; A has a weight of 8 and B a weight of 4, so A should receive two-thirds of the available resources and B the remaining third. In example 2, C depends on D, so D should receive its full resource allocation before C. Following the same logic in example 3, D is served fully before C, C fully before A and B, and then A receives two-thirds of the available resources with B getting the remaining third.
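The arithmetic in these examples is easy to check; here is a minimal sketch computing how sibling weights translate into resource shares:

```ts
// Resource share of sibling streams: weight / sum of sibling weights.
function shares(weights: Record<string, number>): Record<string, number> {
  const total = Object.values(weights).reduce((a, b) => a + b, 0);
  return Object.fromEntries(
    Object.entries(weights).map(([id, w]) => [id, w / total])
  );
}

console.log(shares({ A: 8, B: 4 })); // { A: 0.666..., B: 0.333... }
```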

Server Push #

Another powerful new feature of HTTP/2 is the server's ability to send multiple responses to a single client request. That is, beyond the response to the original request, the server can push additional resources to the client without the client having to request each one explicitly.

Why would a browser need such a mechanism? A typical web application is composed of many resources, so pushing them to the client before it even asks eliminates additional latency.

In fact, manually inlining CSS or JavaScript into the document, as mentioned earlier, achieves a result similar to server push, but server push comes with several additional advantages:

  • The client can cache pushed resources;
  • Pushed resources can be reused across different pages;
  • Pushed resources can be multiplexed alongside other resources;
  • Pushed resources can be prioritized by the server;
  • The client can also decline pushed resources.

When using server push, pay attention to the browser's security restrictions: pushed resources must obey the same-origin policy, which here means the server must be authoritative for the content it provides.
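Node.js exposes server push through its `http2` module. A minimal sketch follows, assuming local `key.pem`/`cert.pem` files and an inline stylesheet purely for illustration:

```ts
import http2 from "node:http2";
import fs from "node:fs";

// TLS key/cert paths are assumptions for this sketch.
const server = http2.createSecureServer({
  key: fs.readFileSync("key.pem"),
  cert: fs.readFileSync("cert.pem"),
});

server.on("stream", (stream, headers) => {
  if (headers[":path"] === "/") {
    // Push /style.css before the client asks for it.
    stream.pushStream({ ":path": "/style.css" }, (err, pushStream) => {
      if (err) return; // the client may have declined push entirely
      pushStream.respond({ ":status": 200, "content-type": "text/css" });
      pushStream.end("body { margin: 0; }");
    });
    stream.respond({ ":status": 200, "content-type": "text/html" });
    stream.end('<link rel="stylesheet" href="/style.css"><h1>Hello</h1>');
  }
});

server.listen(8443);
```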

HTTP/2.0 Optimizations #

Given HTTP/2's advantages, what we need is less a new set of optimizations than the retirement of the old ones: domain sharding, resource bundling, CSS sprites, and inlining, all mentioned above, can be dropped. That does not mean no optimization is needed at all.

Some optimizations are not tied to any particular HTTP version. From an application-implementation standpoint, they include client-side caching, resource compression, trimming unnecessary request bytes, and processing requests and responses in parallel. Given their importance and impact, let's focus on caching and compression.

Client-Side Caching #

First, the fastest request is the one never made. By caching previously downloaded data, the client can serve subsequent visits from a local copy and eliminate the request entirely. The Cache-Control header specifies a resource's cache lifetime, while the Last-Modified and ETag headers provide validation mechanisms.
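Here is a minimal Node.js sketch of both mechanisms, freshness via Cache-Control and revalidation via ETag, with the payload and lifetime chosen arbitrarily for illustration:

```ts
import http from "node:http";
import crypto from "node:crypto";

const body = JSON.stringify({ hello: "world" });
const etag = `"${crypto.createHash("sha1").update(body).digest("hex")}"`;

http.createServer((req, res) => {
  // Revalidation: if the client's cached copy is still current, answer 304
  // with no body at all.
  if (req.headers["if-none-match"] === etag) {
    res.writeHead(304);
    return res.end();
  }
  res.writeHead(200, {
    "Content-Type": "application/json",
    "Cache-Control": "max-age=3600", // fresh for one hour: no request needed
    "ETag": etag,                    // validator for after it goes stale
  });
  res.end(body);
}).listen(8080);
```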

Resource Compression #

Although a local cache lets the client avoid re-fetching identical content on every request, when a resource must be fetched, whether because it has expired, is new, or cannot be cached, it should be transferred in as few bytes as possible. The common compression methods are Gzip and Brotli.

The Gzip compression format is nearly 30 years old, and virtually all major browsers support it. It is a lossless format based on the Deflate algorithm, which processes blocks of the input data stream with a combination of the LZ77 algorithm and Huffman coding. LZ77 identifies repeated strings and replaces them with back-references, pointers to an earlier occurrence followed by the length of the match. Huffman coding then assigns shorter bit sequences to common references and longer ones to rare references.

In 2012, Google released the Zopfli compression algorithm, which produces smaller Gzip-compatible files. It compresses more slowly than Deflate/Gzip, however, making it better suited to static, ahead-of-time compression.

In 2015, Google followed up with the Brotli compression algorithm and data format. Like Gzip, Brotli is a lossless algorithm based on LZ77 and Huffman coding. In addition, Brotli uses second-order context modeling, which achieves denser compression at comparable speed: context modeling allows multiple Huffman trees to encode the same alphabet within a single block. Brotli also supports a larger back-reference window and ships with a built-in static dictionary. Together these features improve compression efficiency. Brotli is now supported by major servers, browsers, hosting providers, and middleware such as Alibaba Cloud and AWS.
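Both formats are available out of the box in Node.js's zlib module, which makes it easy to compare them; the sample input below is arbitrary:

```ts
import zlib from "node:zlib";

// Arbitrary compressible sample input.
const input = Buffer.from("<html>".padEnd(4096, " lorem ipsum dolor sit amet"));

// Gzip (Deflate-based): the widely supported baseline.
const gzipped = zlib.gzipSync(input, { level: 9 });

// Brotli: usually smaller output at comparable decompression speed.
const brotli = zlib.brotliCompressSync(input, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 },
});

console.log(
  `original: ${input.length} B, gzip: ${gzipped.length} B, brotli: ${brotli.length} B`
);
```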

HTTP/3 #

After discussing HTTP/1 and HTTP/2, let’s take a look at HTTP/3. The predecessor of HTTP/3, Quick UDP Internet Connections (QUIC), was introduced by Google in 2012. Instead of using TCP, QUIC uses the User Datagram Protocol (UDP) for transmission.

As a transport-layer protocol, UDP is simpler and faster than TCP, but on its own it is unreliable, offering no delivery guarantees. QUIC builds reliability on top of UDP, achieving reliable delivery comparable to TCP. In terms of what is transmitted, QUIC transports packets and frames, with multiple frames contained in a packet. On the security side, QUIC builds the TLS protocol directly into the transport, so no separate negotiation layer is needed. Another change is the upgrade of the header compression algorithm from HTTP/2's HPACK to HTTP/3's QPACK, which addresses head-of-line blocking in header compression.

Google adopted QUIC for sites like YouTube. Although QUIC was not yet widely used, it paved the way for the HTTP/3 standards committee, guiding the committee toward UDP as the underlying transport and, by building SSL/TLS security into QUIC itself, influencing the inclusion of security in HTTP/3. During HTTP/3's development, other companies began implementing the QUIC protocol as well, Cloudflare being a notable example. Ultimately, HTTP/3 was officially standardized in June 2022.

Summary #

In this lecture, we have looked at the past and present of HTTP and its impact on frontend development along the way. We saw that adoption of HTTP/2 after its release was not high, and the recently finalized HTTP/3 still needs validation from the market, which leads many developers to feel that these technologies have not yet made a meaningful difference.

I believe some technologies need time to accumulate and mature. We can already see big companies like Apple and Google taking the lead in implementing them, mainly because their businesses operate at a scale where these performance optimizations bring considerable impact. For smaller applications, the effects may be less obvious.

But I believe this will change with time. It's a bit like 5G: it hasn't seemed to gain momentum with consumers, yet 6G standards are already being drafted, and although 5G is still rare in the consumer domain, it already has a significant market in B2B. Some technologies only reach economies of scale once supporting factors such as infrastructure, computing power, and user habits catch up.

With the rise of concepts such as streaming media, the metaverse, and Web3.0, our demands on the network will only grow. So, together with the developments in concurrency and parallelism we discussed earlier, these technologies will one day take off. If we cannot endure the quiet period before that takeoff and give up on understanding and experimenting with them, we may lose the ability to seize the opportunity when it arrives. Although HTTP/2 and HTTP/3 may not be technologies frontend developers use directly in code, they profoundly shape frontend development and server-side development with Node.js alike, and they are well worth learning.

Thought Question #

Earlier, we discussed how HTTP/3 uses QPACK instead of HPACK in HTTP/2 to achieve performance optimization. Do you know which algorithms it improves to achieve this performance optimization?

Feel free to share your answers, exchange learning experiences, or ask questions in the comments section. If you find it helpful, you are also welcome to share today’s content with more friends. See you in the next class!