
An Introduction to WebSockets with Ballerina

18 Dec 2020 12:53pm, by Anjana Fernando

WSO2 sponsored this post.

WebSocket is a communication protocol used for efficient full-duplex communication between web browsers and servers over TCP. In this article, we will take a look at the history of the technologies used in dynamic websites. Then, we will introduce WebSockets as the modern approach to meeting these requirements while fixing the shortcomings of earlier techniques.

We will use the Ballerina language to demonstrate how you can effectively use WebSocket features.

The Dynamic Web: Looking Back

Anjana Fernando
Anjana is Director of Developer Relations at WSO2. His latest venture is his role in the Ballerina project, where he has been involved extensively in the design and implementation of the language and its runtime, and now primarily works on its ecosystem engineering and evangelism activities.

HTTP is commonly used for a typical request/response scenario. In JavaScript, the XMLHttpRequest object and the Fetch API let the client send requests to servers in the background. This allows us to execute data operations without refreshing or loading another web page. However, this doesn’t support server push scenarios, where communication is initiated by the server and sent to the client. So people came up with workarounds to make it possible. Two of those options are polling and long polling.

Regular polling works by creating a new HTTP connection that sends a request to the server looking for new updates. If there is any communication that needs to be done from the server to the client, the server will at this point return the message to the client. In the event there is nothing new, the server will reply saying so. Following the response from the server, the connection will be closed. Figure 1 shows the high-level operations of polling.

Figure 1: HTTP Regular Polling

Here, after each poll request, we wait for a specific interval to limit the number of requests sent to the server. However, this interval sets the maximum delay in delivering a message from the server to the client: if the server has data to send during the interval, the client must wait until the next poll cycle to pick it up. Long polling avoids this delay.
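To make the delay concrete, here is a minimal Python sketch of the arithmetic. It assumes polls fire at fixed multiples of the interval and that each round trip is instantaneous; the function name and the numbers are illustrative only.

```python
import math


def delivery_delay(message_ready_at: float, poll_interval: float) -> float:
    """Seconds a message waits on the server before the next poll picks it up.

    Assumes polls fire at t = 0, interval, 2 * interval, ... and that each
    request/response round trip takes negligible time.
    """
    if poll_interval <= 0:
        raise ValueError("poll_interval must be positive")
    next_poll = math.ceil(message_ready_at / poll_interval) * poll_interval
    return next_poll - message_ready_at


# With a 5-second interval, a message that is ready 0.5 s after a poll
# waits 4.5 s; one that is ready exactly at a poll waits 0 s.
print(delivery_delay(message_ready_at=5.5, poll_interval=5.0))   # 4.5
print(delivery_delay(message_ready_at=10.0, poll_interval=5.0))  # 0.0
```

The worst case approaches the full interval, which is why shortening the interval trades latency for more requests.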

In long polling, we follow a similar approach to regular polling; but rather than the server immediately returning a response with the client’s request, it blocks the request until it is ready to send some data to the client. This process is shown below in Figure 2.

Figure 2: HTTP Long Polling

In this approach, the client initiates a request to the server and the server holds on to the request until it has any data to be communicated to the client. The interval in our regular polling mechanism is now on the server-side, so that it can immediately contact the client when needed. Once the server has sent a response to the client, the client initiates another request immediately and repeats the same flow. Also, for long polling we will use a persistent (keep-alive) HTTP connection; there is no need to close it, because we are always in contact with the server. Compared to regular polling, long polling is a much better approach for real-time communication, since it allows instant communication from the server to the client. However, we need to keep a dedicated HTTP connection active from the client to the server, which is inevitable for this type of communication.
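The hold-then-respond flow can be imitated in a single Python process. This is only a toy sketch: the LongPollServer class, its method names and the one-second hold time are assumptions made for illustration, and a queue stands in for the real HTTP connection.

```python
import queue
import threading
from typing import List, Optional


class LongPollServer:
    """Toy in-process stand-in for a server that holds poll requests open."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[str]" = queue.Queue()

    def publish(self, message: str) -> None:
        # Called when the server has something to push to the client.
        self._pending.put(message)

    def handle_poll(self, hold_seconds: float) -> Optional[str]:
        # Block the "request" until data is ready or the hold time expires.
        try:
            return self._pending.get(timeout=hold_seconds)
        except queue.Empty:
            return None  # Nothing to send; the client simply polls again.


def client_loop(server: LongPollServer, wanted: int) -> List[str]:
    received: List[str] = []
    while len(received) < wanted:
        reply = server.handle_poll(hold_seconds=1.0)
        if reply is not None:
            received.append(reply)
        # No sleep here: the next request is issued immediately.
    return received


server = LongPollServer()
# Publish two messages shortly after the client starts waiting.
threading.Timer(0.05, server.publish, args=("price: 42",)).start()
threading.Timer(0.10, server.publish, args=("price: 43",)).start()
print(client_loop(server, wanted=2))
```

Note that the waiting now happens inside handle_poll on the server side, so a message is returned as soon as it is published rather than at the next client-side tick.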

We have now seen how to use an HTTP-based API’s request/response flow, which is half-duplex, to emulate a full-duplex communication channel. Client libraries such as Socket.IO do exactly this, abstracting away the internal details and providing the user with an easy-to-use API. So, if we can use long polling for our operations, what more do we need to improve? Answer: communication data efficiency and the processing overhead the servers incur. Every HTTP request carries a set of header values, which becomes a data overhead for clients that perform many requests with small payloads. The solution for this is the WebSocket transport.

HTTP to WebSocket

WebSocket provides a low-latency communication protocol based on TCP. The protocol works outside of HTTP and adds only minimal framing when sending and receiving messages. WebSocket traffic is also handled by the same HTTP servers, reusing the connection that was originally opened for HTTP. This has the added advantage of being more compatible with infrastructure components, such as proxies and firewalls, that are already configured to allow the ports HTTP uses.
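To give a feel for how small the framing is, the following Python sketch encodes a single unfragmented text frame following the layout in RFC 6455. The function name is our own, and this is a simplified illustration rather than a complete implementation (no fragmentation or control frames).

```python
import struct

OPCODE_TEXT = 0x1


def encode_text_frame(text: str, mask_key: bytes = b"") -> bytes:
    """Encode one unfragmented text frame as laid out in RFC 6455.

    Pass a 4-byte mask_key for client-to-server frames; leave it empty for
    server-to-client frames, which are sent unmasked.
    """
    payload = text.encode("utf-8")
    if mask_key and len(mask_key) != 4:
        raise ValueError("mask_key must be exactly 4 bytes")

    first = 0x80 | OPCODE_TEXT  # FIN bit set + text opcode
    mask_bit = 0x80 if mask_key else 0x00
    length = len(payload)
    if length < 126:
        header = bytes([first, mask_bit | length])
    elif length < 1 << 16:
        header = bytes([first, mask_bit | 126]) + struct.pack("!H", length)
    else:
        header = bytes([first, mask_bit | 127]) + struct.pack("!Q", length)

    if mask_key:
        # XOR every payload byte with the repeating 4-byte key.
        payload = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + payload


frame = encode_text_frame("Hello")
print(frame.hex(" "))             # 81 05 48 65 6c 6c 6f
print(len(frame) - len("Hello"))  # 2
```

An unmasked five-byte message costs two bytes of framing, compared with the full set of headers that every HTTP request repeats.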

Let’s take a look at how a WebSocket connection is created via the WebSocket handshake. The HTTP protocol’s upgrade feature will be used to do this.

A sample WebSocket client handshake request is shown below:

The corresponding server handshake response is shown below:

At this point, the TCP connection is not working with the HTTP protocol anymore, but rather it has switched to communicating with the WebSocket protocol.
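As a rough illustration of the handshake, the Python sketch below builds the upgrade request and the 101 response, and derives the Sec-WebSocket-Accept value from the client key as RFC 6455 specifies. The host, path and key are the sample values used in the RFC; the function names are our own, and a real client would generate a fresh random key for every connection.

```python
import base64
import hashlib
from typing import Sequence

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(client_key: str) -> str:
    """Base64 of SHA-1(client key + GUID), proving the server understood the upgrade."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_request(host: str, path: str, client_key: str,
                      subprotocols: Sequence[str] = ()) -> str:
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {client_key}",
        "Sec-WebSocket-Version: 13",
    ]
    if subprotocols:
        # Sub-protocols are listed in the client's priority order.
        lines.append("Sec-WebSocket-Protocol: " + ", ".join(subprotocols))
    return "\r\n".join(lines) + "\r\n\r\n"


def handshake_response(client_key: str) -> str:
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Accept: {accept_value(client_key)}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample key from RFC 6455
print(handshake_request("server.example.com", "/chat", key, ["xml", "json"]))
print(handshake_response(key))
print(accept_value(key))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```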

Now that we know the basics of how the WebSocket protocol works, let’s take a look at how to write applications using it.

Creating WebSocket Services

In this section, we will take a look at how to implement WebSocket-based services using the Ballerina programming language. Ballerina provides an easy-to-use services abstraction as a first-class language concept.

In WebSocket services, the user needs to be aware of the following primary events:

  • Connection creation
  • Data message
  • Connection error
  • Connection close

Each of the events above is delivered to the user through its own resource function in a Ballerina service.

Connection Creation

This state is achieved when the WebSocket client successfully establishes a connection after a successful handshake operation. At this moment, the following resource function is called if available in the service.

This resource function provides us with an instance of a WebSocketCaller object, which can be used to communicate back with the WebSocket client. A general pattern is to save the caller object when the connection is created; whenever the application wants to send messages to the connected clients, it can use the stored caller objects for the communication.
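The save-the-caller pattern is independent of the language. As a language-neutral sketch in Python (this is not the Ballerina API; ConnectionRegistry, its method names and the plain send functions standing in for caller objects are all hypothetical), it might look like this:

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a caller object: a function that sends text.
SendFn = Callable[[str], None]


class ConnectionRegistry:
    """Remembers one send function per open connection."""

    def __init__(self) -> None:
        self._callers: Dict[str, SendFn] = {}

    def on_open(self, connection_id: str, send: SendFn) -> None:
        # Store the caller as soon as the connection is established.
        self._callers[connection_id] = send

    def on_close(self, connection_id: str) -> None:
        # Forget the caller so we never write to a closed connection.
        self._callers.pop(connection_id, None)

    def broadcast(self, message: str) -> int:
        """Send to every stored caller; returns how many were contacted."""
        callers = list(self._callers.values())
        for send in callers:
            send(message)
        return len(callers)


# Two lists act as the outboxes of two connected clients.
alice: List[str] = []
bob: List[str] = []
registry = ConnectionRegistry()
registry.on_open("alice", alice.append)
registry.on_open("bob", bob.append)
registry.broadcast("server update")
registry.on_close("bob")
registry.broadcast("second update")
print(alice)  # ['server update', 'second update']
print(bob)    # ['server update']
```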

Sub-Protocol Handling

When a WebSocket connection is created, we can provide a list of sub-protocols that the client can handle, in priority order. This is done in the following manner when the WebSocket client is created.

The sub-protocols are given in the second parameter of the WebSocket constructor, which can be either a single string or an array of strings. In the statement above, we are requesting either “xml” or “json” to be used as the protocol.

The server side can be configured to handle zero or more sub-protocols. When the client requests specific protocols, the server walks the client’s list in priority order and checks each entry against the sub-protocols supported by the given service. If it finds a match, it returns that single first-matched protocol to the client.
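The selection rule described above fits in a few lines. The following Python sketch uses a hypothetical function name and is meant only to illustrate the first-match logic, not any particular server’s implementation.

```python
from typing import Optional, Sequence


def negotiate_subprotocol(client_offered: Sequence[str],
                          server_supported: Sequence[str]) -> Optional[str]:
    """Return the first client-preferred protocol the server supports."""
    supported = set(server_supported)
    for protocol in client_offered:  # already in the client's priority order
        if protocol in supported:
            return protocol
    return None  # No overlap: the connection proceeds without a sub-protocol.


print(negotiate_subprotocol(["xml", "json"], ["json", "mqtt"]))  # json
print(negotiate_subprotocol(["xml", "json"], ["json", "xml"]))   # xml
print(negotiate_subprotocol(["xml"], []))                        # None
```

The order of the server’s own list does not matter here; only the client’s ordering decides which match wins.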

The server-side configuration of sub-protocols is done using the WebSocketServiceConfig annotation, using its “subProtocols” field. An example of this usage is shown below, where we update our earlier “subscriber” service to negotiate a sub-protocol and print the selected one.

Data Message

A data message is received when a WebSocket client sends either a text or a binary message to a WebSocket service. The following resource functions are called, if available, in the service to handle text and binary messages respectively.
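At the protocol level, the two kinds of data message are told apart by the frame opcode (0x1 for text, 0x2 for binary, per RFC 6455). The Python sketch below shows one way a runtime could route them to separate handlers; the RecordingHandlers class and the dispatch function are hypothetical and are not Ballerina’s resource functions.

```python
from typing import List, Tuple, Union

OPCODE_TEXT = 0x1
OPCODE_BINARY = 0x2


class RecordingHandlers:
    """Hypothetical service that records what each handler receives."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, Union[str, bytes]]] = []

    def on_text(self, text: str) -> None:
        self.log.append(("text", text))

    def on_binary(self, data: bytes) -> None:
        self.log.append(("binary", data))


def dispatch_data_message(opcode: int, payload: bytes,
                          handlers: RecordingHandlers) -> None:
    if opcode == OPCODE_TEXT:
        # Text payloads are UTF-8 by definition of the protocol.
        handlers.on_text(payload.decode("utf-8"))
    elif opcode == OPCODE_BINARY:
        handlers.on_binary(payload)
    else:
        raise ValueError(f"not a data opcode: {opcode:#x}")


service = RecordingHandlers()
dispatch_data_message(OPCODE_TEXT, "hello".encode("utf-8"), service)
dispatch_data_message(OPCODE_BINARY, b"\x00\x01\x02", service)
print(service.log)  # [('text', 'hello'), ('binary', b'\x00\x01\x02')]
```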

A full Ballerina example of handling text and binary messages using WebSocket can be found here.

Connection Error

In the event of an error in the WebSocket connection, the connection will be automatically closed by generating the required connection close frame. The following resource function can be implemented in the service to receive the notification that this is going to happen and perform any possible cleanup or custom logging operations.

Connection Close

In the event the connection is closed from the client-side, the service will be notified by calling the resource function below.
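Both the error and the close case end with a close frame, whose payload (per RFC 6455) is an optional two-byte status code followed by a UTF-8 reason. The following Python sketch, with function names of our own choosing, shows how such a payload could be built and read back when logging why a connection ended.

```python
import struct
from typing import Tuple

NORMAL_CLOSURE = 1000  # RFC 6455 status code for a clean shutdown
NO_STATUS = 1005       # reserved value meaning "no status code was present"


def build_close_payload(code: int, reason: str = "") -> bytes:
    """Two-byte big-endian status code followed by a UTF-8 reason."""
    return struct.pack("!H", code) + reason.encode("utf-8")


def parse_close_payload(payload: bytes) -> Tuple[int, str]:
    if len(payload) == 0:
        return NO_STATUS, ""  # The peer closed without giving a code.
    if len(payload) == 1:
        raise ValueError("close payload cannot be a single byte")
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")


payload = build_close_payload(NORMAL_CLOSURE, "client leaving")
print(parse_close_payload(payload))  # (1000, 'client leaving')
print(parse_close_payload(b""))      # (1005, '')
```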

Securing WebSocket Communication

Whenever possible, we should use WebSocket over TLS. This makes sure that our data communication is secure through the network. In our WebSocket client, we can use the “wss” protocol scheme to connect to a secure WebSocket server. Refer to the example below.

For our WebSocket service to be compatible with this approach, we configure a secure socket for our HTTP listener. This HTTP listener is the one used in the WebSocket upgrade, so it will be upgrading a TCP connection with TLS.
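On the client side, the scheme decides whether TLS is layered under the WebSocket connection. As a small Python sketch (the URL is a made-up example and the helper names are ours), the following shows how a client might derive the connection parameters from a “ws” or “wss” URL and prepare a certificate-verifying TLS context using the standard library.

```python
import ssl
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the two WebSocket URL schemes.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def connection_params(url: str) -> Tuple[Optional[str], int, bool]:
    """Return (host, port, use_tls) for a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URL: {url}")
    use_tls = parts.scheme == "wss"
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, use_tls


def client_tls_context() -> ssl.SSLContext:
    # The default context verifies the server certificate and its hostname.
    return ssl.create_default_context()


print(connection_params("wss://example.com/ws/subscribe"))  # ('example.com', 443, True)
print(connection_params("ws://localhost:9090/ws"))           # ('localhost', 9090, False)
```

The TLS session is established first, and the HTTP upgrade then runs inside it, so the handshake and all later frames are encrypted.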

Summary

In this article, we looked at the historical techniques used to implement a dynamic experience for web pages, and introduced WebSockets as a modern approach for full-duplex communication between web pages and servers. We provided an overview of the Ballerina language and platform support for WebSockets, where the language’s services abstraction maps intuitively onto the events of a WebSocket connection.

For more information on Ballerina and WebSocket support, you can refer to the following resources:
