IAP - Why HTTP2 and WebSockets Are Not Enough
IAP is an application level protocol that is intended to replace HTTP. During the early days of the web, when browsers were the primary clients of backend services, HTTP was sufficient. In fact, HTTP was pretty much your only choice, unless you wanted to jump through hoops with Flash, Applets etc.
With the advent of mobile apps, that changed. First of all, mobile apps are free to choose which network protocol they use to communicate with their backend. HTTP is no longer a must.
Second, mobile apps are locally installed apps which sometimes run for a very long time without being shut down. Such apps often need a permanent connection open to the backend so they can communicate with it quickly.
Third, mobile apps often need data pushed out from the server on the server's initiative. Modern web apps need both of these features (permanent connections and server push) too. HTTP was never designed for that.
Fifth, because HTTP 1.1 responses cannot be broken down into smaller chunks that are interleaved with other responses, if one of the resources requested by an app has to be fetched from a slow backend system (e.g. a database), the HTTP connection might sit idle while the response is being obtained from the backend system. This idle time could have been used to send smaller files back to the browser. This is known as head-of-line blocking. The slow resource blocks the HTTP connection for other resources which could have utilized the connection while it was idle.
Sixth, since HTTP 1.1 is a stateless protocol, the many different HTTP headers have to be sent along with every single HTTP request. Many of these headers are exactly the same in every single request (e.g. the "Accept" header, which contains a long string of mime types the client can understand). The repeated HTTP headers waste bandwidth.
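To get a feeling for the repeated header overhead, here is a minimal Python sketch. The host name, the paths and the header values are made up for illustration; real browsers typically send more and longer headers than this.

```python
# Hypothetical header set; the values are illustrative, not taken from a real client.
headers = {
    "Host": "example.com",
    "User-Agent": "ExampleClient/1.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}

# The header block exactly as it appears on the wire in every request.
header_block = "".join(f"{name}: {value}\r\n" for name, value in headers.items()).encode("ascii")

def request_bytes(path):
    """Serialize a body-less HTTP 1.1 GET request for the given path."""
    request_line = f"GET {path} HTTP/1.1\r\n".encode("ascii")
    return request_line + header_block + b"\r\n"

paths = ["/index.html", "/style.css", "/app.js"]
requests = [request_bytes(path) for path in paths]

total_bytes = sum(len(request) for request in requests)
# Every request after the first one repeats a header block the server has already seen.
repeated_bytes = len(header_block) * (len(requests) - 1)

print(f"{repeated_bytes} of {total_bytes} request bytes are repeated headers")
```

Only the request line differs between the three requests. Everything else is sent again unchanged.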
Seventh, a lot of backend services running internally in big enterprise systems communicate via HTTP, even when another protocol would suit their needs better. This typically happens because developers are familiar with HTTP and web servers, the administrators understand the standard web servers and it's easy to just keep port 80 open and close all others for security reasons.
WebSockets and HTTP2
Both the WebSockets and HTTP 2 protocols are attempts to fix the problems of HTTP 1.1 mentioned above. First of all, both HTTP2 and WebSocket connections can be long lived. Second, the server is allowed to push data out to the client without the client first asking for it.
Third, both WebSockets and HTTP 2 communicate via frames which are blocks of data exchanged between client and server. These frames are messages which can contain a full or partial resource (e.g. a part of a file). Frames make it possible to interleave (multiplex) resources on the same HTTP connection. This makes it possible to utilize the connection better - to avoid the head-of-line blocking.
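The idea of interleaving frames can be sketched in a few lines of Python. This is a simplified model only: the (stream id, chunk) tuples and the function names are assumptions made for illustration and do not match the actual frame layout of either WebSockets or HTTP 2.

```python
from itertools import zip_longest

def to_frames(stream_id, payload, frame_size):
    """Split one resource into (stream_id, chunk) frames of at most frame_size bytes."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]

def interleave(*frame_lists):
    """Put frames from several resources onto one connection, round-robin."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire

def reassemble(wire):
    """Receiver side: rebuild each resource by grouping chunks per stream id."""
    resources = {}
    for stream_id, chunk in wire:
        resources[stream_id] = resources.get(stream_id, b"") + chunk
    return resources

big = b"A" * 10      # a large resource, sent as stream 1
small = b"bb"        # a small resource, sent as stream 2

wire = interleave(to_frames(1, big, 4), to_frames(2, small, 4))
print(wire)
print(reassemble(wire))
```

The small resource is fully sent as the second frame on the wire, instead of having to wait until all frames of the large resource have been sent.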
Fourth, browsers are now allowed to send multiple requests to the server without waiting for responses first. Furthermore, the server is allowed to answer these requests out of order. This too makes it possible to utilize the connection better and avoid the head-of-line blocking problem.
Fifth, HTTP 2 compresses HTTP headers, because many headers were exchanged redundantly over and over again between client and server even though they never changed (e.g. the "Accept" header).
Sixth, HTTP 2 allows the server to push files out to the client for caching (pre-emptive caching).
What is Wrong with WebSockets and HTTP 2?
If WebSockets and HTTP 2 solve all these problems, what is then wrong with these protocols?
The problems with WebSockets and HTTP 2 fall into these categories:
- Odd protocol stack
- Narrow semantic focus
- Not ambitious enough
Each of these problems will be explained in the following sections.
Odd Protocol Stack
Obviously both WebSockets and HTTP 2 are improvements compared to HTTP 1.1. There is no doubt about that. What is wrong with these two protocols is how they approach the problem.
When you take a step back and look at what these protocols actually do, you realize that something is odd about them. Well, not about these protocols by themselves, but the protocol stacks as a whole. Here is what the protocol stacks look like roughly:
- WebSockets / HTTP 2
- TCP
- IP
At the bottom we have an unreliable packet oriented protocol (IP). IP sends packets from A to B, but makes no guarantee about their arrival. If routers along the way are overloaded, they are allowed to drop packets.
On top of IP we have a stream based reliable protocol (TCP). TCP breaks a stream of data up into TCP segments which can fit into IP packets and sends them via IP to the receiver. At the other end TCP reassembles the segments into a stream of data.
What WebSockets and HTTP 2 do is to break a stream of data into frames and send them over TCP. In other words, they create a packet oriented protocol on top of a stream based protocol on top of a packet oriented protocol. Obviously, something is not perfect here.
The problem is that TCP does not really fit into the picture. TCP packets may arrive out of order. If two packets arrive out of order and these two packets belong to different resources (e.g. different files being transferred), these packets could actually be passed on to the application out of order too. Their mutual order would be insignificant.
However, TCP assembles the received TCP packets in the same order they were sent. If a later packet arrives before an earlier packet in the stream, the later packet is not delivered to the application until the earlier packet has also arrived. For singular, coherent data streams, this is fine. But for packets that can potentially belong to independent resources, this results in a head-of-line blocking problem.
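The difference can be illustrated with a small Python simulation. It is a toy model and not real TCP: each packet is just a sequence number, "time" is the position in the arrival order, and both function names are made up for this sketch.

```python
def release_times_single_stream(arrival_order):
    """One ordered stream: a packet is released to the application only
    after all packets with lower sequence numbers have arrived."""
    arrived, released, next_seq = set(), {}, 0
    for tick, seq in enumerate(arrival_order):
        arrived.add(seq)
        while next_seq in arrived:
            released[next_seq] = tick
            next_seq += 1
    return released

def release_times_per_resource(sent, arrival_order):
    """Ordering per resource: a packet only waits for earlier packets
    that belong to the same resource."""
    pending = {}  # resource -> sequence numbers not yet released, in send order
    for seq, resource in enumerate(sent):
        pending.setdefault(resource, []).append(seq)
    arrived, released = set(), {}
    for tick, seq in enumerate(arrival_order):
        arrived.add(seq)
        queue = pending[sent[seq]]
        while queue and queue[0] in arrived:
            released[queue.pop(0)] = tick
    return released

# Packets 0..3 carry parts of resources A and B. Packet 0 is delayed and arrives last.
sent = ["A", "B", "A", "B"]
arrival_order = [1, 3, 2, 0]

single = release_times_single_stream(arrival_order)
per_resource = release_times_per_resource(sent, arrival_order)
print(single)        # sequence number -> tick at which it reaches the application
print(per_resource)
```

With a single ordered stream, every packet is held back until tick 3, when the delayed packet 0 finally arrives. With ordering per resource, both packets of resource B are released as soon as they arrive (ticks 0 and 1), and only resource A has to wait.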
The solution is to base WebSockets and HTTP 2 on UDP instead of TCP. In fact, Google has done exactly that with their QUIC protocol.
IAP is protocol agnostic. It needs to be able to run on top of both TCP and UDP.
Narrow Semantic Focus
WebSockets and HTTP 2 basically focus on two narrow scenarios:
- Bidirectional streams
- Document exchange
While you can certainly model a lot of communication on top of streams and document exchange, you are still left with the task of defining any communication that does not fit strictly into these two scenarios. The semantics do not go deeper than document exchange.
Today's backend services could already benefit from more varied protocols than HTTP2 and WebSockets. Not every type of backend service maps to a stream communication or document exchange.
Additionally the Internet of Things (IoT) is lurking in a not so distant future. IoT combined with all the devices and applications is sometimes also referred to as the Internet of Everything (IoE). If all these new devices, and the applications that need to communicate with them, are to be Internet plug-and-play ready, they need to communicate via standard protocols.
The Internet of Everything needs a standard internet protocol. A standard internet protocol can do for IoT what USB did for PC peripheral devices (printers, keyboards, mice, hard disks, cameras etc.).
Obviously it is not possible to predict what all future devices will need from a network protocol. Therefore a standard internet protocol will need to be designed from the beginning to be extensible so different needs can be served by semantic protocols tailored to those needs.
Neither WebSockets nor HTTP 2 addresses the problem of semantic protocols for scenarios other than streams and document exchange. IAP, on the other hand, is designed from the beginning to support different semantic protocols. The core features of IAP are themselves broken into a set of semantic protocols, and it is easy to add new semantic protocols to IAP in the future.
You can even design your own semantic protocol and plug it into IAP. Both client APIs and servers designed to handle IAP will have no problem handling a custom semantic protocol.
Not Ambitious Enough
With their narrow semantic focus, WebSockets and HTTP 2 are not ambitious enough. But the narrow semantic focus is not their only problem.
Neither WebSockets nor HTTP 2 defines any standard data formats via which clients and servers can exchange data. Either the data exchanged is opaque streams, or opaque files with a mime type attached.
The lack of a standard data format forces developers to decide on a data format for all services that do not just return files. Over the years attempts have been made to come up with such standard data formats. SOAP (XML) is a well-known example which despite all its hype never really caught on. XML is just not a good general purpose data format, and the SOAP protocol itself got lost in a jungle of XML Schema definitions.
Today a popular data format is JSON. But JSON has its own problems. First of all JSON is a textual format. That means that JSON is a pretty verbose way to send numbers. It also means JSON is not a good format for raw binary data. Raw bytes must be Base64 or Hex encoded and transferred as strings. Base64 encoding increases the size of the encoded data to 4/3 of the raw size, and Hex encoding increases the size to double the raw size.
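The size overhead is easy to check with the Python standard library. The sample bytes and the sample number below are arbitrary values chosen for illustration.

```python
import base64
import json
import struct

raw = bytes(range(256)) * 3            # 768 arbitrary raw bytes

b64_text = base64.b64encode(raw).decode("ascii")
hex_text = raw.hex()

print(len(raw), len(b64_text), len(hex_text))   # raw size, Base64 size, Hex size

# Embedding the bytes in JSON also adds the surrounding string quotes.
print(len(json.dumps(b64_text)))

# A number sent as text versus the same number as a 32 bit binary integer.
number = 1234567890
as_text = json.dumps(number)
as_binary = struct.pack("<i", number)
print(len(as_text), len(as_binary))
```

For the 768 raw bytes, the Base64 text is 1024 characters long (4/3 of the raw size) and the Hex text is 1536 characters long (double the raw size). The number takes 10 characters as JSON text, but only 4 bytes as a 32 bit binary integer.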
Second, JSON is not good at modelling all types of data structures. JSON is especially weak at modelling tables of similar data with rows and columns (like CSV files). JSON would encode such tabular data as arrays of objects, meaning the column name would be repeated for every single object (row) in the table. This is a clear waste of data.
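The repetition can be made visible with a short Python sketch. The table contents are made up, and the "columns plus rows" layout is only a hypothetical workaround used here for comparison. It is not a standard JSON convention, and it is not the ION format.

```python
import json

columns = ["id", "name", "price"]
rows = [[i, f"item{i}", i * 10] for i in range(100)]   # made-up sample table

# The usual JSON encoding: one object per row, column names repeated in every object.
as_objects = json.dumps([dict(zip(columns, row)) for row in rows])

# Hypothetical alternative: column names stated once, rows as plain arrays.
as_table = json.dumps({"columns": columns, "rows": rows})

print(as_objects.count('"name"'), len(as_objects))
print(as_table.count('"name"'), len(as_table))
```

In the array of objects the column name "name" occurs 100 times, once per row. In the alternative layout it occurs once, and the encoded text is correspondingly shorter. The price is that the receiver must know this custom convention.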
Being both textual and verbose, JSON is not the fastest data format to read or write. Being verbose it is also slower to transfer, especially for devices with limited bandwidth like small IoT devices, mobile phones on weak connections or ships floating far from the coast.
If the Internet of Everything is to become truly plug-and-play, we cannot have every single OEM or app developer decide on their own data format, just because JSON isn't suitable.
A standard internet protocol should define a general purpose data format which can model most standard data structures well. This data format should be compact, fast and easy to handle for both small devices and big servers.
IAP comes with such a data format called IAP Object Notation (ION). Of course ION isn't the best format for everything, but it is pretty good for most standard data exchange (much better than JSON and XML), and it also allows you to include raw bytes which can then be encoded using a tailored encoding in case the standard ION data types do not suffice.