HTML5 Web Socket in Essence

HTML5 WebSocket defines a bi-directional, full-duplex communication channel that operates through a single TCP connection. This article discusses its performance, the principle of the WebSocket protocol and its handshake mechanism, and develops a WebSocket application in action (Team Poker).


Table of Contents

  1. Introduction
  2. Background
  3. WebSocket In Essence
  4. Experimental Demos
  5. Browser/Server Support
  6. WebSocket JavaScript API
  7. Develop WebSocket In Action - Team Poker Demo
  8. Open Issues
  9. Conclusion
  10. References & Resources

Introduction

HTML5 WebSocket defines a bi-directional, full-duplex communication channel that operates through a single TCP socket over the Web. It provides an efficient, low-latency and low-cost connection between web client and server, and based on WebSocket developers can build scalable real-time web applications. The section below is the official definition of WebSocket, copied from the IETF WebSocket protocol page:

The WebSocket protocol enables two-way communication between a user agent running untrusted code running in a controlled environment to a remote host that has opted-in to communications from that code.  The security model used for this is the Origin-based security model commonly used by Web browsers.  The protocol consists of an initial handshake followed by basic message framing, layered over TCP.  The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with servers that does not rely on opening multiple HTTP connections (e.g. using XMLHttpRequest or <iframe>s and long polling).

This article goes through the basic concept of WebSocket and the problems it is meant to solve, explains it in essence, looks at some experimental demos, develops a simple WebSocket application in action (Team Poker), and describes the current open issues of WebSocket. I sincerely hope it is systematic and easy to understand, moving from the surface to the depths, so that readers not only learn what WebSocket is at a high level but also understand it in depth! Any thoughts, suggestions or criticism you may have after reading this article will help me improve in the future, and I would appreciate it if you could leave a comment.

Background

In traditional web applications, in order to achieve some real-time interaction with the server, developers had to employ tricky techniques such as Ajax polling and Comet (a.k.a. Ajax push, Full Duplex Ajax, HTTP Streaming, etc.). Those technologies either periodically fire HTTP requests to the server or hold the HTTP connection open for a long time, which "contain lots of additional, unnecessary header data and introduce latency" and result in "an outrageously high price tag". websocket.org explained the problems exhaustively and compared the performance of Ajax polling and WebSocket in detail. They built two simple web pages: one periodically communicated with the server using traditional HTTP, and the other used HTML5 WebSocket. In the test, each HTTP request/response header was approximately 871 bytes, while the overhead of a WebSocket message was much smaller: 2 bytes once the connection was established. As the number of clients grows, the results become:

Traditional HTTP Request 

  • Use case A: 1,000 clients polling every second: Network throughput is (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (6.6 Mbps)

  • Use case B: 10,000 clients polling every second: Network throughput is (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (66 Mbps)

  • Use case C: 100,000 clients polling every 1 second: Network throughput is (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (665 Mbps)

HTML5 WebSocket

  • Use case A: 1,000 clients receive 1 message per second: Network throughput is (2 x 1,000) = 2,000 bytes = 16,000 bits per second (0.015 Mbps)

  • Use case B: 10,000 clients receive 1 message per second: Network throughput is (2 x 10,000) = 20,000 bytes = 160,000 bits per second (0.153 Mbps)

  • Use case C: 100,000 clients receive 1 message per second: Network throughput is (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (1.526 Mbps)
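
The arithmetic in the two lists above can be reproduced with a few lines of JavaScript. This is only a sketch: the function name throughput is made up for illustration, and the Mbps figure divides by 1,048,576 bits, which is what the rounded values quoted above appear to imply.

```javascript
// Hypothetical helper, not part of the websocket.org test pages.
// bytesPerMessage: per-message overhead in bytes (871 for HTTP polling, 2 for WebSocket).
// clients: number of clients exchanging one message per second.
function throughput(bytesPerMessage, clients) {
    var bytesPerSecond = bytesPerMessage * clients;
    var bitsPerSecond = bytesPerSecond * 8;
    return {
        bytesPerSecond: bytesPerSecond,
        bitsPerSecond: bitsPerSecond,
        // Assumption: 1 Mbps = 1,048,576 bits per second, matching the figures above.
        mbps: bitsPerSecond / (1024 * 1024)
    };
}

[1000, 10000, 100000].forEach(function (clients) {
    var polling = throughput(871, clients);
    var socket = throughput(2, clients);
    console.log(clients + ' clients: polling ' + polling.mbps.toFixed(3) +
        ' Mbps, WebSocket ' + socket.mbps.toFixed(3) + ' Mbps');
});
```

Whatever the number of clients, the ratio between the two stays at 871:2, because only the per-message overhead differs.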

Finally a more readable diagram:

Polling VS WebSocket

 "HTML5 Web Sockets can provide a 500:1 or — depending on the size of the HTTP headers — even a 1000:1 reduction in unnecessary HTTP header traffic and 3:1 reduction in latency".  --WebSocket.org

WebSocket In Essence

The motivation for creating WebSocket is to replace polling and long polling (Comet), and to give HTML5 web applications the ability to communicate in real time. A browser-based web application can fire a WebSocket connection request through the JavaScript API, and then transfer data with the server over a single TCP connection.

This is achieved by a new protocol, The WebSocket Protocol, which is essentially an independent TCP-based protocol. To establish a WebSocket connection, the client/browser forms an HTTP request with an "Upgrade: websocket" header, which indicates a protocol upgrade request. The handshake key(s) are interpreted by the HTTP server and a handshake response is returned (the detailed handshake mechanism is described below). Afterwards the connection is established (figuratively speaking, the 'sockets' have been plugged in at both the client and server ends), and both sides can send and receive data independently with no more redundant header information. The connection won't be closed until one side sends a close signal; that is why WebSocket is bi-directional and full duplex. In addition, compared with the request/response paradigm of HTTP, WebSocket layers a framing mechanism on top of TCP, and each data frame carries as little as 2 bytes of framing overhead.
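
To make the "2 bytes" figure concrete, here is a minimal sketch of how an unmasked text frame is laid out under RFC 6455 when the payload is shorter than 126 bytes. The function name encodeSmallTextFrame is an assumption for illustration; it deliberately ignores longer payloads, fragmentation, and the masking that frames sent by a client must apply.

```javascript
// Illustrative sketch only (Node.js); not taken from any WebSocket library.
function encodeSmallTextFrame(text) {
    var payload = Buffer.from(text, 'utf8');
    if (payload.length > 125) {
        throw new RangeError('This sketch only handles payloads up to 125 bytes');
    }
    var header = Buffer.alloc(2);
    header[0] = 0x80 | 0x1;      // FIN bit set, opcode 0x1 = text frame
    header[1] = payload.length;  // MASK bit clear, 7-bit payload length
    return Buffer.concat([header, payload]);
}

var frame = encodeSmallTextFrame('Hi');
console.log(frame.length);           // 2 header bytes + 2 payload bytes
console.log(frame.toString('hex'));  // 81024869
```

Frames sent from a browser to a server additionally carry a 4-byte masking key, so the overhead in that direction is larger than the 2-byte minimum.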

Now it is time for us to delve deep into this protocol. Let's start with RFC6455 - The WebSocket Protocol, which is now supported in all popular browsers and many WebSocket servers (please refer to the Browser/Server Support section below for details). A typical WebSocket request/response example is shown below:

For comparison, in the earlier draft-hixie-thewebsocketprotocol-76 handshake the entire process could be described as: the client raises a "special" HTTP request that asks to "Upgrade" the connection protocol to "WebSocket", on domain "example.com" with path "/demo", with three "handshake" fields: Sec-WebSocket-Key1, Sec-WebSocket-Key2 and 8 bytes ({^n:ds[4U}) after the headers. These are random tokens which the WebSocket server uses to construct a 16-byte security hash at the end of its handshake to prove that it has read the client's handshake. RFC6455 replaces them with a single Sec-WebSocket-Key header.

The WebSocket protocol is currently published as a "PROPOSED STANDARD". At the time I wrote this article, the latest WebSocket version is "RFC6455 - The WebSocket Protocol", last updated by Ian Fette in December 2011.

WebSocket request/response in the latest RFC6455 - The WebSocket Protocol

Request
GET /demo HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: V2ViU29ja2V0IHJvY2tzIQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

Response
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: VAuGgaNDB/reVQpGfDF8KXeZx5o=
Sec-WebSocket-Protocol: chat

The Sec-WebSocket-Key is a base64-encoded, randomly generated 16-byte value; in the case above it is "WebSocket rocks!". The server reads the key, concatenates it with the magic GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to get "V2ViU29ja2V0IHJvY2tzIQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11", then computes its SHA1 hash, which gives "540b8681a34307fade550a467c317c297799c79a", and finally base64-encodes the hash and returns the value in the "Sec-WebSocket-Accept" header.

I've written the C# code below to demonstrate how to compute Sec-WebSocket-Accept conforming to draft-ietf-hybi-thewebsocketprotocol-08 (the computation is the same in RFC6455):

using System;
using System.Security.Cryptography;
using System.Text;

/// <summary>
/// Computes server security hash for HTML5 WebSocket, handshake mechanism
/// based on draft-ietf-hybi-thewebsocketprotocol-08
/// http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-08#page-27
/// </summary>
/// <param name="secWebSocketKey">The handshake key "Sec-WebSocket-Key" from client.</param>
/// <returns>The computed security hash to fill "Sec-WebSocket-Accept" header.</returns>
public static String ComputeWebSocketHandshakeSecurityHash08(String secWebSocketKey)
{
    const String MagicKEY = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
    String secWebSocketAccept = String.Empty;

    // 1. Combine the request Sec-WebSocket-Key with magic key.
    String ret = secWebSocketKey + MagicKEY;

    // 2. Compute the SHA1 hash (the hash object is disposed when done)
    byte[] sha1Hash;
    using (SHA1 sha = SHA1.Create())
    {
        sha1Hash = sha.ComputeHash(Encoding.UTF8.GetBytes(ret));
    }

    // 3. Base64 encode the hash
    secWebSocketAccept = Convert.ToBase64String(sha1Hash);

    return secWebSocketAccept;
}

Unit test code:

String secWebSocketKey = Convert.ToBase64String(Encoding.UTF8.GetBytes("WebSocket rocks!"));
Console.WriteLine("Sec-WebSocket-Key: {0}", secWebSocketKey);

String secWebSocketAccept = ComputeWebSocketHandshakeSecurityHash08(secWebSocketKey);
Console.WriteLine("Sec-WebSocket-Accept: " + secWebSocketAccept);

Running the code above prints:

Sec-WebSocket-Key: V2ViU29ja2V0IHJvY2tzIQ==
Sec-WebSocket-Accept: VAuGgaNDB/reVQpGfDF8KXeZx5o=
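
For comparison, the same computation can be sketched in JavaScript for Node.js, which is also what the Team Poker server later in this article runs on. The function name computeAccept is a placeholder chosen for illustration; crypto.createHash is Node's built-in hashing API. The sample key used here is the one from the handshake example in RFC 6455.

```javascript
var crypto = require('crypto');

// GUID fixed by the WebSocket protocol specification.
var MAGIC_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: returns the value for the Sec-WebSocket-Accept header.
function computeAccept(secWebSocketKey) {
    return crypto.createHash('sha1')
        .update(secWebSocketKey + MAGIC_GUID)
        .digest('base64');
}

// Sample key from the handshake example in RFC 6455.
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));  // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Note that the key is used exactly as received, as a base64 string; it is not decoded before being concatenated with the GUID.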

Experimental Demos

So far there are already many experimental WebSocket demos built on RFC6455 - The WebSocket Protocol.

http://rumpetroll.com/ 
Play as a tadpole in a canvas and chat with other tadpoles; tadpole locations and chat messages can be seen by everyone in real time.

http://html5labs.interoperabilitybridges.com/prototypes/websockets/websockets/info 
WebSocket Demos in Microsoft HTML5 Labs

http://html5demos.com/web-socket
A very simple "Chat room" demo using WebSocket.

http://kaazing.me/
Refreshes stocks, weather, news and tweets in real time, powered by Kaazing's real-time message delivery network solution.

Mr. Doob's Multiuser Sketchpad
In this multi-user <canvas> drawing application, Web Sockets are used to pass along the coordinates of the lines that other users draw back to each client as they happen.

And so on. You will also see a simple demo developed by me below.

Browser/Server Support

WebSocket is not only designed for browser/server communication; client applications can also use it. At the time I wrote this article, the WebSocket protocol is supported in all popular browsers.

The awesome "Can I use" site tracks support for new HTML5 features in all popular browsers; the screenshot below shows WebSocket support:

WebSocket Browser Support Table

There are also a number of WebSocket servers available:

http://socket.io - Provides seamless support for a variety of transports (WebSocket, WebSocket over Flash, XHR polling, JSONP polling, etc.) intended for real-time communication, developed by Guillermo Rauch.

node.ws.js - A simple WebSocket server (support both draft-hixie-thewebsocketprotocol-75 and draft-hixie-thewebsocketprotocol-76) developed based on node.websocket.js.

web-socket-js - HTML5 Web Socket implementation powered by Flash

http://nugget.codeplex.com - A web socket server implemented in C#.

jWebSocket.org - The Open Source Java WebSocket Server

phpwebsocket - PHP version of WebSocket server.

WebSocket JavaScript API

The W3C defines the WebSocket interface as below:

[Constructor(in DOMString url, in optional DOMString protocols)]
[Constructor(in DOMString url, in optional DOMString[] protocols)]
interface WebSocket {
  readonly attribute DOMString url;

  // ready state
  const unsigned short CONNECTING = 0;
  const unsigned short OPEN = 1;
  const unsigned short CLOSING = 2;
  const unsigned short CLOSED = 3;
  readonly attribute unsigned short readyState;
  readonly attribute unsigned long bufferedAmount;

  // networking
  attribute Function onopen;
  attribute Function onmessage;
  attribute Function onerror;
  attribute Function onclose;
  readonly attribute DOMString protocol;
  void send(in DOMString data);
  void close();
};
WebSocket implements EventTarget;

The url attribute is the WebSocket server URL; its scheme is usually "ws" (for unencrypted plain-text WebSocket) or "wss" (for secure WebSocket). The send method sends data to the server once connected, and close sends the close signal. Besides these there are four important events: onopen, onmessage, onerror and onclose. I borrowed the nice picture below from nettuts.

WebSocket Events

  • onopen: Fired when a socket has opened, i.e. after the TCP three-way handshake and the WebSocket handshake.
  • onmessage: Fired when a message has been received from the WebSocket server.
  • onerror: Fired when an error has occurred.
  • onclose: Fired when a socket has been closed.

The JavaScript code below sets up a WebSocket connection and retrieves data:

var wsUrl = 'ws://localhost:8888/DummyPath';
var websocket = new WebSocket(wsUrl);
websocket.onopen = function (evt) { onOpen(evt); };
websocket.onclose = function (evt) { onClose(evt); };
websocket.onmessage = function (evt) { onMessage(evt); };
websocket.onerror = function (evt) { onError(evt); };

function onOpen(evt) {
    console.log("Connected to WebSocket server.");

    websocket.send("HTML5 WebSocket rocks!");
}
function onClose(evt) { console.log("Disconnected"); }
function onMessage(evt) {
    console.log('Retrieved data from server: ' + evt.data);

    // Update UI...
}
// The error event carries no data payload, so only log that it happened.
function onError(evt) { console.log('Error occurred.'); }
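
One detail worth keeping in mind with this API is that data can only be delivered while the connection is in the OPEN state. A small guard along the following lines can help. This is a sketch: sendIfOpen is a hypothetical helper, the constant 1 is the OPEN value from the interface listing above, and the stand-in socket object exists only so the snippet can run outside a browser.

```javascript
// Hypothetical helper: sends data only when the socket is in the OPEN state.
var OPEN = 1; // WebSocket.OPEN in the W3C interface

function sendIfOpen(socket, data) {
    if (socket.readyState !== OPEN) {
        return false; // still connecting, closing or closed
    }
    socket.send(data);
    return true;
}

// Usage with a stand-in object; in a browser, pass a real WebSocket instead.
var sent = [];
var fakeSocket = { readyState: 0, send: function (d) { sent.push(d); } };
console.log(sendIfOpen(fakeSocket, 'too early'));              // false: readyState is CONNECTING
fakeSocket.readyState = OPEN;
console.log(sendIfOpen(fakeSocket, 'HTML5 WebSocket rocks!')); // true
```

Sending from inside the onopen handler, as the code above does, avoids the problem for the first message; the guard is mainly useful for messages triggered later by user actions.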

Develop WebSocket In Action - Team Poker Demo 

Estimating user story effort with Planning Poker cards is well known and widely used in Agile/Scrum development. The Program Manager/Scrum Master prepares user stories beforehand, holds a meeting with stakeholders and has them play poker cards to represent each person's estimate for each story. The higher the card value, the harder the story is to implement; conversely, the lower the value, the easier it is. "0" indicates "no effort" or "has been done", and "?" indicates "mission impossible" or "unclear requirement".

Actually there is a website, http://pokerplanning.com, that does exactly the work described above. My co-workers and I used it several times; however, we found it got slower and slower as more team members joined the game or after several rounds of voting, and we did experience the worst result: no one could vote any more because everyone's voting page got stuck. I strongly suspected that the major reason was its Ajax polling strategy, used to ensure everyone got the real-time voting status. After tracking its network activity, I guess I was right.

Ajax polling in http://pokerplanning.com:
Ajax Polling

I believe HTML5 WebSocket will solve the problem! So I developed a simple demo (I named it Team Poker), which currently has only the limited, basic functionality described below:

  1. A user can log in to the poker room after entering his/her nickname.
  2. Everyone gets notified when a user plays a card.
  3. Everyone gets notified when new player joins.
  4. Newly joined player can see the current participants and voting status.

The login screenshot for requirement #1:

TeamPoker - Login

Status update screenshot showing new participant(s) and newly played card(s), for requirements #2 and #3 (please click on the image to enlarge):

TeamPoker - Voting

Newly joined participant(s) can see the current game status, requirement #4:

TeamPoker-Vote Status

Please note that the Team Poker demo concentrates on demonstrating the power of WebSocket. In the real world, a player shouldn't see the card values played by other players; the demo also has no functionality such as a moderator customizing user stories or storing game status on the server side, and there is no fancy UI/animation. However, I've shared all the source code at the beginning of this article, and in addition I've uploaded it to GitHub: https://github.com/WayneYe/TeamPoker. I hope some people will make it better and production-ready. Will you fork it with me, dear reader? :)

OK, now it's coding time. Since all clients need to get notified about other clients' changes (a new player joining or a new card played), and in addition a newly joined player needs to know the current status, I defined two communication contracts:

  • ClientMessage represents a message sent from a client. It contains a Type property that takes a value from the MessageType enumeration (NewParticipaint or NewVoteInfo) and a Data property to store the data.
  • ServerStatus stores the current players as well as the current voting status, a hashtable [{PlayerName}, {VoteValue}], and is broadcast to all clients whenever a new client message is received.

var TeamPoker = TeamPoker || function () { };

TeamPoker.VoteInfo = function (playerName, voteValue) {
    this.PlayerName = playerName;
    this.VoteValue = voteValue;
}
TeamPoker.MessageType = {
    NewParticipaint: 'NewParticipaint',
    NewVoteInfo: 'NewVoteInfo'
};
TeamPoker.ClientMessage = function (type, data) {
    this.Type = type;
    this.Data = data;
};
TeamPoker.ServerStatus = function () {
    this.Players = [];
    this.VoteStatus = [];
};

On the client side, a WebSocket connection is established after the user clicks the "Login" button, and the nickname is sent to the WebSocket server running on Node.js. The core client code is shown below:

TeamPoker.connectToWsServer = function () {
    // Init Web Socket connect
    var WSURI = "ws://192.168.1.6:8888";
    TeamPoker.WsClient = new WebSocket(WSURI);

    TeamPoker.WsClient.onopen = function (evt) {
        console.log('Successfully connected to WebSocket server.');
        TeamPoker.joinGame();
    };
    TeamPoker.WsClient.onclose = function (evt) {
        console.log('Connection closed.');
    };
    TeamPoker.WsClient.onmessage = function (evt) {
        console.log('Retrieved msg from server: ' + evt.data);
        TeamPoker.updateGameStatus(evt.data);
    };
    TeamPoker.WsClient.onerror = function (evt) {
        // The error event carries no data payload.
        console.log('An error occurred.');
    };
};

TeamPoker.joinGame = function () {
    var joinGameMsg = new TeamPoker.ClientMessage(TeamPoker.MessageType.NewParticipaint, TeamPoker.CurrentPlayerName);

    TeamPoker.WsClient.send(JSON.stringify(joinGameMsg));
}
TeamPoker.updateGameStatus = function (data) {
    // Format/fill the data from server side to HTML
}
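
The body of updateGameStatus is left out above. As a rough idea of what its parsing half might look like, the sketch below dispatches on the Type field of the server message. The message types and Data payloads mirror the server code shown later in this article; the state object and the function name applyServerMessage are assumptions made for illustration, not the code from the Team Poker repository.

```javascript
// Sketch only: one possible shape for the parsing half of updateGameStatus.
function applyServerMessage(state, rawData) {
    var msg = JSON.parse(rawData);
    switch (msg.Type) {
        case 'NewParticipaint':       // Data: nickname of the new player
            state.players.push(msg.Data);
            break;
        case 'NewVoteInfo':           // Data: nickname of the player who voted
            state.votedPlayers.push(msg.Data);
            break;
        case 'NotifyCurrentStatus':   // Data: { Players: [...], VotedPlayers: [...] }
            state.players = msg.Data.Players.slice();
            state.votedPlayers = msg.Data.VotedPlayers.slice();
            break;
        case 'ViewVoteResult':        // Data: array of { PlayerName, VoteValue }
            state.voteResult = msg.Data;
            break;
    }
    return state;
}

var state = { players: [], votedPlayers: [], voteResult: null };
applyServerMessage(state, JSON.stringify({ Type: 'NewParticipaint', Data: 'Wayne' }));
applyServerMessage(state, JSON.stringify({ Type: 'NewVoteInfo', Data: 'Wayne' }));
console.log(state.players, state.votedPlayers);
```

Keeping the state update separate from the HTML rendering makes this part easy to test without a browser.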

On the server side, one important task is to maintain all active client WebSocket connections so that the server can "broadcast" messages to every client, and to remove closed clients to avoid sending messages to "dead" clients. Other than this, the logic is very simple: validate the message type sent from the client, update the players/vote status repository, and then broadcast to all clients:

/*
WebSocket server based on
https://github.com/ncr/node.ws.js
Written By Wayne Ye 6/4/2011
http://wayneye.com
*/

var sys = require("sys"),
    ws = require("./ws");

var wsClients = [], players = [], votedPlayers = [], voteStatus = [];

ws.createServer(function (websocket) {
    websocket.addListener("connect", function (resource) {
        // emitted after handshake
        sys.debug("Client connected on path: " + resource);

        // # Add to our list of wsClients
        wsClients.push(websocket);

        //sys.debug(traverseObj(websocket));

    }).addListener("data", function (data) {
        var clinetMsg = JSON.parse(data);

        switch (clinetMsg.Type) {
            case ClientMsgType.NewParticipaint:
                var newPlayer = clinetMsg.Data;
                sys.debug('New Participaint: ' + newPlayer);
                players.push(newPlayer);

                var gameStatus = new GameStatus();
                gameStatus.Players = players;
                gameStatus.VotedPlayers = votedPlayers;

                var serverMsg = new ServerMessage(ServerMsgType.NewParticipaint, newPlayer);
                broadCast(JSON.stringify(serverMsg));

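                // Notify the new client of the current game status. Reply on this
                // connection rather than on the last entry of wsClients, which may
                // belong to another client that connected in the meantime.
                var notifyCurrentStatus = new ServerMessage(ServerMsgType.NotifyCurrentStatus, gameStatus);
                websocket.write(JSON.stringify(notifyCurrentStatus));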

                break;
            case ClientMsgType.NewVoteInfo:
                var newVoteInfo = clinetMsg.Data;
                sys.debug('New VoteInfo: ' + newVoteInfo.PlayerName + ' voted ' + newVoteInfo.VoteValue);

                votedPlayers.push(newVoteInfo.PlayerName);
                voteStatus.push(new VoteInfo(newVoteInfo.PlayerName, newVoteInfo.VoteValue));

                var notifyCurrentStatus = new ServerMessage(ServerMsgType.NewVoteInfo, newVoteInfo.PlayerName);
                broadCast(JSON.stringify(notifyCurrentStatus));
                break;
            case ClientMsgType.ViewVoteResult:
                sys.debug('Broadcast vote result to client(s)..');
                var viewVoteResultMsg = new ServerMessage(ServerMsgType.ViewVoteResult, voteStatus);
                broadCast(JSON.stringify(viewVoteResultMsg));
                break;
            default:
                break;
        }

    }).addListener("close", function () {
        // emitted when server or client closes connection

        for (var i = 0; i < wsClients.length; i++) {
            // # Remove from our connections list so we don't send
            // # to a dead socket
            if (wsClients[i] == websocket) {
                sys.debug("close with client: " + websocket);
                // Remove only this client; splice(i) alone would also drop every client after it
                wsClients.splice(i, 1);
                break;
            }
        }
    });
}).listen(8888);

function broadCast(msg) {
    sys.debug('Broadcast server status to all wsClients: ' + msg);
    for (var i = 0; i < wsClients.length; i++)
        wsClients[i].write(msg);
}

var ClientMsgType = {
    NewParticipaint: 'NewParticipaint',
    NewVoteInfo: 'NewVoteInfo',
    ViewVoteResult: 'ViewVoteResult'
};
function ClientMessage(type, data) {
    this.Type = type;
    this.Data = data;
};
var ServerMsgType = {
    NewParticipaint: 'NewParticipaint',
    NewVoteInfo: 'NewVoteInfo',
    NotifyCurrentStatus: 'NotifyCurrentStatus',
    ViewVoteResult: 'ViewVoteResult'
};
function ServerMessage(type, data) {
    this.Type = type;
    this.Data = data;
};
function VoteInfo(playerName, voteValue) {
    this.PlayerName = playerName;
    this.VoteValue = voteValue;
}
function GameStatus() {
    this.Players = [];
    this.VotedPlayers = [];
};

The complete source code can be found on GitHub: https://github.com/WayneYe/TeamPoker.
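
The connection bookkeeping is the part of the server that is easiest to get subtly wrong, so here it is again in isolation as a small sketch. It uses plain arrays and stand-in socket objects (anything with a write method) so it can run without the ws module; the function names removeClient, broadcast and makeFakeSocket are assumptions for illustration.

```javascript
// Removes one client from the list; other entries are left untouched.
function removeClient(clients, socket) {
    var index = clients.indexOf(socket);
    if (index !== -1) {
        clients.splice(index, 1); // the count of 1 removes exactly one entry
    }
    return clients;
}

// Sends the same message to every client still in the list.
function broadcast(clients, msg) {
    clients.forEach(function (client) {
        client.write(msg);
    });
}

// Stand-in for a real connection: records what was written to it.
function makeFakeSocket(name) {
    return { name: name, received: [], write: function (m) { this.received.push(m); } };
}

var a = makeFakeSocket('a'), b = makeFakeSocket('b'), c = makeFakeSocket('c');
var clients = [a, b, c];
removeClient(clients, b);          // b disconnected
broadcast(clients, 'hello');
console.log(clients.length, a.received, b.received, c.received);
```

Calling splice with only a start index removes every element from that index to the end of the array, which is why the second argument matters when only one client has disconnected.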

After going through the code, let's see what happens underneath. The screenshot below was taken while I was developing the Team Poker WebSocket demo, and it recorded the entire process of the WebSocket communication. In this picture, 192.168.1.2 is the host of the TeamPoker page which fires the WebSocket request, and 192.168.1.6 is the Node.js-based WebSocket server, which exposes port 8888 and runs on Ubuntu 11.04.

All packets behind WebSocket connection:
WebSocketProtocol.png

WebSocket request & response headers:
WebSocketStream.png

So, see the power of WebSocket?

  1. Data transfer is done within one TCP connection lifecycle.
  2. No extra headers after the handshake. You might notice that the "length" column shows each packet's size; it is less than 100 bytes on average in my case and depends only on the size of the data actually transferred.

In Ajax polling or Comet, HTTP requests/responses carrying header information cannot achieve the same level of performance as WebSocket. Both of them create new HTTP (TCP) connections to transfer data, and each exchange is relatively larger than a WebSocket message, especially when there are cookies stored in the headers or long headers such as "User-Agent", "If-Modified-Since", "If-Match", "X-Powered-By", etc.

One thing that deserves mention is the TCP keep-alive signals: we should consider closing the WebSocket connection as soon as we don't need it any more, otherwise bandwidth will be wasted.

Open Issues

Adam Barth and his co-workers found a security vulnerability in WebSocket. He pointed out that many proxies do not recognize the HTTP "Upgrade" mechanism and treat the WebSocket packets after the handshake as subsequent HTTP packets; as a result, attackers can poison the proxy's HTTP cache (you can refer to their exhaustive description). They suggest using a CONNECT-based handshake, since most proxies appear to understand the semantics of CONNECT requests better than the semantics of the Upgrade mechanism, and after simulating a CONNECT-based handshake they found no way to poison the proxy's HTTP cache.

Because of this security issue, Firefox 4.0 and Opera 11 disabled WebSocket by default; it can be enabled in about:config. Please refer to more details here and here.

Conclusion

WebSocket is a revolutionary feature in HTML5. It defines a full-duplex communication channel that operates through a single socket over the Web, and real-time data transfer has never been so easy and efficient, with relatively low bandwidth and server cost compared to Ajax polling or Comet. It is not yet a full standard (RFC6455 is a "PROPOSED STANDARD") and has the security issue mentioned in the section above, so at this time it is not recommended for enterprise solutions or data-sensitive applications, but developers should learn it and watch it. The only thing that never changes is change: the WebSocket protocol draft version numbers change fast, as you might have noticed while reading my article. I hope it becomes normative and standardized soon!

Are you plugged? If you are, happy WebSocketing! 

References & Resources

HTML5 Web Sockets: A Quantum Leap in Scalability for the Web
http://websocket.org/quantum.html

Wikipedia: Web Socket
http://en.wikipedia.org/wiki/WebSockets 

The Web Socket Protocol
http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-07

The WebSocket API
http://www.w3.org/TR/websockets/

Web Applications 1.0 - Web sockets
http://www.whatwg.org/specs/web-apps/current-work/complete/network.html#network

Introducing Web Sockets
http://dev.opera.com/articles/view/introducing-web-sockets/

WebSockets - MDC Docs
https://developer.mozilla.org/en/WebSockets

Stackoverflow - What are good resources for learning HTML 5 WebSockets?
http://stackoverflow.com/questions/4262543/what-are-good-resources-for-learning-html-5-websockets

HTML Labs - WebSocket
http://html5labs.interoperabilitybridges.com/prototypes/websockets/websockets/info

Start using HTML5 WebSocket today
http://net.tutsplus.com/tutorials/javascript-ajax/start-using-html5-websockets-today/

HTML 5 Web Sockets vs. Comet and Ajax 
http://www.infoq.com/news/2008/12/websockets-vs-comet-ajax

Internet Socket
http://en.wikipedia.org/wiki/Internet_socket

Real-time web test – does html5 websockets work for you?
http://websocketstest.com/ 
