Technical Overview

This document provides a high-level overview of Jabber technologies.

Introduction

The term "Jabber" is widely used to refer to a set of open protocols for streaming XML elements between any two points on a network, and to the technologies built using those protocols. Although Jabber is best known as an XML-based instant messaging and presence platform (similar to legacy IM systems such as AIM, ICQ, MSN, and Yahoo), the core protocols provide an XML streaming infrastructure that has been used to build a wide variety of real-time communications systems.

This document provides a high-level overview of the architecture of Jabber instant messaging and presence technologies, although many of the principles discussed here apply more generally. For information regarding the Jabber protocols, refer to http://www.jabber.org/protocol/.

A Quick Example

The architecture of Jabber IM systems is extremely similar to that of the most time-tested messaging system on the planet: email. While there are some key differences, if you think of Jabber as "instant email" you won't go far wrong. So how does it really work? To understand how, let's first look at a quick example: Romeo and Juliet in the famous balcony scene by Shakespeare.

Juliet doesn't send a message directly ("peer to peer") to Romeo, at least not in the Jabber world. Juliet has an account on a Jabber server, and her Jabber Identifier (or "JID") looks a lot like an email address. Since Juliet is a Capulet, she registers the username "juliet" with the Jabber server running at capulet.com, so her JID is juliet@capulet.com. Similarly, Romeo has an account on his family's server and his JID is romeo@montague.net.

Once Juliet has logged into the capulet.com server, she can send messages to her sweetie. To be precise, here is what happens when Juliet sends a message from the client on her Windows laptop out on the balcony:

  1. Juliet sends a message addressed to romeo@montague.net
  2. The message is handled by the Jabber server at capulet.com
  3. The capulet.com server opens a connection to montague.net if one is not already open
  4. Assuming that the family elders have not disabled server-to-server communications between capulet.com and montague.net, Juliet's message is routed to the Jabber server at montague.net
  5. The server at montague.net sees that the message is addressed to a user named "romeo" and delivers it to the Jabber client running on Romeo's PDA in the Capulets' orchard
  6. The message appears on the PDA screen, and Romeo swoons
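
The routing steps above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the Server class, its send and deliver methods, and the shared network registry are hypothetical names invented here, and no real sockets or XML are involved.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A toy Jabber server: one domain, local inboxes, optional s2s links."""
    domain: str
    network: dict                      # shared registry: domain -> Server (hypothetical)
    s2s_enabled: bool = True           # the switch the "family elders" control
    inboxes: dict = field(default_factory=dict)  # username -> [(sender, body)]
    links: set = field(default_factory=set)      # remote domains with an open connection

    def __post_init__(self):
        self.network[self.domain] = self

    def send(self, sender, to, body):
        """Steps 1-4: the sender's home server handles and routes the message."""
        user, _, domain = to.partition("@")
        if domain == self.domain:
            return self.deliver(sender, user, body)   # same server, no s2s hop
        peer = self.network.get(domain)
        if peer is None or not (self.s2s_enabled and peer.s2s_enabled):
            return False                              # s2s unavailable or disabled
        self.links.add(domain)                        # step 3: open a link if needed
        return peer.deliver(sender, user, body)       # step 4: hand off to remote server

    def deliver(self, sender, user, body):
        """Steps 5-6: the recipient's server hands the message to the local user."""
        self.inboxes.setdefault(user, []).append((sender, body))
        return True


network = {}
capulet = Server("capulet.com", network)
montague = Server("montague.net", network)
capulet.send("juliet@capulet.com", "romeo@montague.net", "Hello from the balcony")
```

After the final call, montague.inboxes["romeo"] holds the message and capulet.links records the open connection to montague.net. Setting s2s_enabled to False on either server makes send return False, which mirrors step 4.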

There are a lot of pieces here: clients running on different operating systems, multiple servers, a communication channel between the servers, and two star-crossed lovers. Jabber handles everything but the last part. To imprint this process, let's visualize it with a picture:

Architectural Foundations

That picture probably looks familiar, because it also captures the architecture of email. Communications both in email and in Jabber are made possible by a distributed network of servers that use a common protocol. Specialized clients connect to servers in order to receive messages from other users and send messages to users on the same server or any other server that is connected to the network.

However, whereas email is a store-and-forward system, Jabber servers deliver messages in close to real time. Real-time delivery is made possible because your Jabber server knows when you are online. Your contacts also know when you are online if you grant permission for them to do so. This knowledge of availability is called presence, and it is the key insight that enables messaging to be instant.

Jabber combines these standard IM characteristics with three additional features that make Jabber technologies unique. The first is a set of open, well-documented, easy-to-understand protocols. The second is the fact that the Jabber protocols are 100% XML, which enables structured, intelligent messaging between human users and also between software applications. The third is that Jabber uses addresses that are based on DNS and recognized URI schemes, resulting in addresses of the same form as those used in email (user@host).

Each of these key features is described in more detail below.

Client/Server

Jabber technologies use a client-server architecture, not a direct peer-to-peer architecture as some other messaging systems do. This means that all Jabber data sent from one client to another must pass through at least one Jabber server. (Actually, Jabber clients are free to negotiate direct connections, for example to transfer files, but those "out-of-band" connections are first negotiated within the context of the client-server framework.)

A Jabber client connects to a Jabber server on a TCP socket over port 5222 (servers connect to each other over port 5269). This connection is "always-on" for the life of the client's session on the server, which means the client does not have to poll for messages as an email client does. Rather, any message intended for delivery to you is immediately pushed out to your client as long as you are connected. The server keeps track of whether you are online or not, and when you go offline it stores any messages sent to you for delivery when you connect again.
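
The push-or-store behavior described above might look roughly like the following sketch. The SessionManager class and its method names are assumptions made for illustration, and a plain callback stands in for the always-on TCP connection; only the two port numbers come from the text.

```python
from collections import defaultdict, deque

CLIENT_PORT = 5222   # client-to-server port named in the text
SERVER_PORT = 5269   # server-to-server port named in the text


class SessionManager:
    """Tracks who is online and stores messages for users who are not."""

    def __init__(self):
        self.online = {}                    # user -> callable that pushes to the client
        self.offline = defaultdict(deque)   # user -> messages awaiting the next login

    def connect(self, user, push):
        """Register the client's push callback and flush stored messages in order."""
        self.online[user] = push
        queue = self.offline.pop(user, deque())
        while queue:
            push(queue.popleft())

    def disconnect(self, user):
        self.online.pop(user, None)

    def deliver(self, user, message):
        """Push immediately if the user is connected, otherwise store for later."""
        push = self.online.get(user)
        if push is not None:
            push(message)
            return "pushed"
        self.offline[user].append(message)
        return "stored"
```

The client never polls in this model: the server calls the push callback as soon as a message arrives, and stored messages are replayed when connect is called.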

Open Protocols

Jabber technologies started in the open-source community with the jabberd server and clients for Windows, MacOS, and Linux. As part of its work, the original Jabber team defined an open protocol for streaming XML over the wire. This protocol continues to grow in depth and breadth. The depth comes mainly from work completed by the XMPP Working Group within the Internet Engineering Task Force (IETF); this group has formalized the core XML streaming protocols under the name "Extensible Messaging and Presence Protocol", and the IETF has approved them as RFC 3920 and RFC 3921. The breadth comes mainly from work by the Jabber Software Foundation in defining extensions to the core protocols for a wide variety of features, including groupchat, file transfer, service discovery, avatars, and much more.

Because Jabber technologies all use an open protocol, anyone may implement the protocols, and they may use any code license. This has produced an explosion of Jabber software, including completely open-source servers and clients as well as commercial software.

XML Data Format

XML is an integral part of Jabber technologies. Why? Because it makes them fundamentally extensible and able to express almost any structured data. When a client connects to a server, it opens a one-way XML stream from the client to the server, and the server responds with a one-way XML stream from the server to the client. Thus each session involves two XML streams. All communication between the client and the server happens over these streams, in the form of small snippets or "stanzas" of XML, such as the following message from Juliet to Romeo:

Example 1. A Simple Message

<message from='juliet@capulet.com' to='romeo@montague.net'>
  <body>How are you, Romeo?</body>
</message>

While many Jabber stanzas are that simple, Jabber's XML format can also be extended through official XML namespaces (managed by the Jabber Software Foundation) and custom namespaces for specialized applications. This makes Jabber a powerful platform for transferring any structured data, including things like XML-RPC and SOAP procedure calls, RSS syndication feeds, and SVG graphics.
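
As a rough sketch of namespace-based extension, the following uses Python's standard xml.etree.ElementTree to build a message stanza that carries an extra payload in a custom namespace. The namespace URI, the "mood" element, and both function names are hypothetical, chosen only for illustration; they are not official Jabber extensions.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom namespace, for illustration only.
EXAMPLE_NS = "http://example.org/protocol/mood"


def build_message(sender, recipient, body, mood=None):
    """Build a <message/> stanza, optionally with a namespaced extension child."""
    message = ET.Element("message", {"from": sender, "to": recipient})
    ET.SubElement(message, "body").text = body
    if mood is not None:
        # The extension lives in its own namespace, so a client that does not
        # understand it can skip it and still read <body/>.
        ET.SubElement(message, f"{{{EXAMPLE_NS}}}mood").text = mood
    return message


def read_mood(message):
    """Return the extension payload, or None if the stanza does not carry one."""
    element = message.find(f"{{{EXAMPLE_NS}}}mood")
    return None if element is None else element.text


stanza = build_message("juliet@capulet.com", "romeo@montague.net", "Hello", mood="happy")
wire = ET.tostring(stanza, encoding="unicode")   # the text that would go over the stream
```

A receiver that parses the serialized text with ET.fromstring can read the body as usual and call read_mood only if it cares about the extension.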

Distributed Network

As we have seen, Jabber's architecture is modeled after that of e-mail. Each user connects to a "home" server, which receives information for them, and the servers transfer data among themselves on behalf of users. Thus any domain can run a Jabber server. Each server functions independently of the others, and maintains its own user list. In addition, any Jabber server can talk to any other Jabber server that is accessible via the Internet (if server-to-server communications are enabled). A particular user is associated with a specific server (either through registration with a service provider or administrative setup within an enterprise), and Jabber addresses are of the same form as email addresses. The result is a flexible, controllable network of servers, which can scale much higher than the monolithic, centralized services run by legacy IM vendors such as AOL, Microsoft, and Yahoo.

Modular Servers

A Jabber server plays three primary roles:

   * Handling client connections and communicating directly with Jabber clients.
   * Communicating with other Jabber servers.
   * Coordinating the various server components associated with the server.

Jabber servers are designed to be modular, with specific internal code packages that handle functionality such as registration, authentication, presence, contact lists, offline message storage, and the like. In addition, Jabber servers can be extended with external components, which enable server administrators to supplement the core server with additional services such as gateways to other messaging systems. Such components can introduce further complexity into a Jabber deployment without sacrificing the simplicity of the core server and without requiring that such components be blessed by the core server team. Once again, flexibility is a key consideration in the Jabber community.

Simple Clients

One of the design criteria for Jabber instant messaging systems was that it must be easy to write a client (e.g., even something as simple as a telnet connection). Indeed, the Jabber architecture imposes very few restrictions on clients. The only things a Jabber client must do are:

   * Communicate with the Jabber server over TCP sockets.
   * Parse and interpret well-formed XML "stanzas" over an XML stream.
   * Understand the core Jabber data types (message, presence, and iq).
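
To illustrate the second and third requirements, here is a minimal sketch that reads stanzas from a stream arriving in chunks, using the standard library's xml.etree.ElementTree.XMLPullParser. The StanzaReader class and the kind helper are hypothetical names, and the "stream" root element in the usage lines is a simplified stand-in for the real stream header.

```python
import xml.etree.ElementTree as ET

CORE_STANZAS = {"message", "presence", "iq"}


class StanzaReader:
    """Collects complete top-level stanzas from XML text that arrives in pieces."""

    def __init__(self):
        self._parser = ET.XMLPullParser(events=("start", "end"))
        self._depth = 0

    def feed(self, chunk):
        """Feed text read from the socket; return the stanzas completed by it."""
        self._parser.feed(chunk)
        stanzas = []
        for event, element in self._parser.read_events():
            if event == "start":
                self._depth += 1
            else:
                self._depth -= 1
                # Back at depth 1 means a direct child of the stream root closed.
                if self._depth == 1:
                    stanzas.append(element)
        return stanzas


def kind(stanza):
    """Return the core stanza type, or "unknown" for anything else."""
    return stanza.tag if stanza.tag in CORE_STANZAS else "unknown"


reader = StanzaReader()
first = reader.feed("<stream><message to='romeo@montague.net'>")
second = reader.feed("<body>Hello</body></message><presence/>")
```

The first call returns an empty list because the message element is still open; the second returns the completed message and presence stanzas, which a client would then dispatch by kind.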

The preference in Jabber is to move complexity from clients to the server. This makes it relatively easy to write clients (as witnessed by the wide variety of Jabber clients available today) as well as to update the functionality of the system (i.e., without forcing users to download new clients). In practice, many of the low-level functions of the client (e.g., parsing XML and understanding the core Jabber data types) are handled by Jabber client libraries, enabling client developers to focus on the user interface.

Standards-Based Addressing

Within the Jabber network, there are many different entities that need to communicate with each other. These entities can represent servers, gateways, groupchat rooms, a single Jabber user, etc. Jabber IDs are used both externally and internally to express ownership or routing information. Key characteristics of Jabber IDs include:

   * They uniquely identify individual objects or entities for communicating instant messages and presence information.
   * They are easy for users to remember and express in the real world.
   * They are flexible enough to enable the inclusion of other IM and presence schemes.

Each Jabber Identifier (or "JID") contains a set of ordered elements. The JIDs are formed of a domain, node, and resource in the following format:

   [node@]domain[/resource]

The JID elements are defined as follows:

   * The Domain Identifier is the primary identifier. It represents the Jabber server to which the entity connects. Every usable Jabber domain should resolve to a Fully Qualified Domain Name.
   * The Node Identifier is the secondary identifier. It represents the "user". All Nodes live within a specific Domain. However, the Node Identifier is optional, and a specific Domain (e.g., conference.jabber.org) is a valid Jabber ID.
   * The Resource Identifier is an optional third identifier. All Resources belong to a Node. Within Jabber the Resource Identifier is used to identify specific objects that belong to a user, such as devices or locations. Resources enable a single user to maintain several simultaneous connections to the same Jabber Server; examples might be juliet@capulet.com/balcony vs. juliet@capulet.com/chamber.

A Jabber user always connects to a server by means of a particular resource and therefore has an address of the form node@domain/resource while connected (e.g., juliet@capulet.com/balcony). However, since the resource is session-specific, the user's address can be communicated as node@domain (e.g., juliet@capulet.com), which is familiar to people since it is of the same form as email addresses.
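
The [node@]domain[/resource] format can be split with a few lines of code. The sketch below is deliberately simplified: the JID tuple, its bare property, and parse_jid are names assumed for this example, and it performs none of the character normalization or length checks that the formal specifications add.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    node: Optional[str]
    domain: str
    resource: Optional[str]

    @property
    def bare(self):
        """The session-independent address: node@domain, or just the domain."""
        return f"{self.node}@{self.domain}" if self.node else self.domain


def parse_jid(text):
    """Split [node@]domain[/resource] into its parts (no normalization)."""
    # Everything after the first "/" is the resource.
    rest, slash, resource = text.partition("/")
    node, at, domain = rest.partition("@")
    if not at:
        node, domain = None, rest      # no "@": the JID is a bare domain
    if not domain or (at and not node) or (slash and not resource):
        raise ValueError(f"not a valid JID: {text!r}")
    return JID(node, domain, resource if slash else None)


full = parse_jid("juliet@capulet.com/balcony")
room_service = parse_jid("conference.jabber.org")
```

Here full.bare gives juliet@capulet.com, the form a user would share with contacts, while room_service has neither a node nor a resource.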

Conclusion

Jabber protocols and technologies provide a true open alternative to closed, proprietary services such as AIM and MSN that are offered by legacy IM vendors. Jabber's IETF pedigree and XML foundation enable developers to create robust, near-real-time messaging and presence solutions for IM and beyond. To join the conversation, visit www.jabber.org.