Facebook Rolls Its Own High-Precision Commodity Time Servers

Unsatisfied with the accuracy of off-the-shelf time synchronization, Facebook has created, and released as open source, the specifications for its own picosecond-precise timekeeping device, the Open Compute Time Appliance. It uses a PCI Express (PCIe) card to turn a commodity x86 server into an appliance that provides ultra-accurate time, explained Facebook engineers Ahmad Byagowi and Oleg Obleukhov in a blog entry posted Wednesday.
The work started last year as the social media giant looked for ways to improve the timekeeping in its services. Building on the venerable Network Time Protocol (NTP), it wanted to improve accuracy from the 10-millisecond range to the 100-microsecond range. “Accurate timekeeping enables more advanced infrastructure management across our data centers, as well as faster performance of distributed databases,” the team wrote. Congestion control, load balancing and enhanced security are some of the other Facebook tasks that would benefit from precise time control.
The original design relies on a set of “Stratum 1” servers, which, in turn, are linked to an “authoritative source of time,” such as a signal from a global navigation satellite system (GNSS), for example the U.S. government’s GPS. This dependency on external timekeeping, however accurate, has its drawbacks, the researchers explained. If connectivity to this external source of time is lost, the time kept by the dependent system may drift from accuracy.
Facebook’s PCIe card design comes with a built-in miniaturized atomic clock (MAC), which provides an additional level of accuracy and can continue to keep accurate time when GNSS connectivity is lost. If the server loses its GNSS signal, the time card will stay within 1 microsecond of accuracy for 24 hours.
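To put the holdover figure in perspective, here is a rough back-of-the-envelope sketch of what staying within 1 microsecond for 24 hours implies about the oscillator. It assumes the simplest possible model, a constant frequency error with no aging or temperature effects, and the 1 part-per-million comparison oscillator is a hypothetical value chosen for illustration, not a figure from Facebook.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def max_fractional_frequency_error(max_time_error_s, holdover_s):
    """Largest constant frequency error that keeps drift within the budget."""
    return max_time_error_s / holdover_s

def accumulated_time_error(fractional_frequency_error, elapsed_s):
    """Time error built up by a constant frequency error over elapsed_s."""
    return fractional_frequency_error * elapsed_s

# Budget from the article: 1 microsecond over 24 hours of lost GNSS signal.
budget = max_fractional_frequency_error(1e-6, SECONDS_PER_DAY)
print(f"required stability: {budget:.3e}")  # required stability: 1.157e-11

# Hypothetical comparison: an oscillator that is off by 1 part per million.
drift = accumulated_time_error(1e-6, SECONDS_PER_DAY)
print(f"1 ppm drift in a day: {drift:.4f} s")  # 1 ppm drift in a day: 0.0864 s
```

Under this simplified model, the oscillator must hold its frequency to roughly one part in 100 billion, while the hypothetical 1 ppm oscillator would wander by tens of milliseconds over the same day.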
Of course, there are commercially available time-keeping servers, though these have drawbacks. Facebook worries about getting timely software updates to them, as well as the ability to service them itself when required. Commercial products are also more expensive, not a trivial issue when you have as many servers as Facebook does.
The card carries a GNSS receiver that provides the time of day (ToD) as well as a one pulse-per-second (PPS) signal. These readings are accurate to within tens of nanoseconds, but ongoing synchronization, or calibration, against a built-in high-stability oscillator (such as an onboard atomic clock or an “oven-controlled crystal oscillator”) sharpens the time measurement to within a 10-picosecond range, about 1,000 times more precise than the satellite reading itself.
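The following toy simulation may help show why comparing many noisy PPS readings against a stable local oscillator can yield a finer estimate than any single reading. It is only an illustration of the general principle, not the Time Card's actual FPGA logic: the noise level, the frequency error, the window length and the function names are all assumptions made for this sketch.

```python
import random

def simulate_pps_offsets(freq_error, noise_ns, seconds, seed=0):
    """Measured offset (ns) between a local clock and the GNSS PPS edge,
    once per second. The local clock drifts at a constant rate and each
    PPS reading carries Gaussian noise (assumed model)."""
    rng = random.Random(seed)
    return [freq_error * 1e9 * t + rng.gauss(0.0, noise_ns)
            for t in range(seconds)]

def fit_slope(ys):
    """Least-squares slope of ys against the sample index 0..n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# Assumed values: the local oscillator runs 5 ns/s fast, and each PPS
# reading is noisy at the 20 ns level ("tens of nanoseconds").
true_freq_error = 5e-9
offsets = simulate_pps_offsets(true_freq_error, noise_ns=20.0, seconds=1000)

# The fitted slope is in ns per second; convert to a fractional error.
estimated_freq_error = fit_slope(offsets) * 1e-9
print(f"true: {true_freq_error:.3e}  estimated: {estimated_freq_error:.3e}")
```

Because the oscillator is stable over the window, the noise in individual readings largely averages out, so the fitted drift rate is known far more tightly than the 20 ns scatter of any one pulse would suggest.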
The processing logic, including filtering, synchronization, error checking and time-stamping, runs on a field-programmable gate array (FPGA). The team used the Nvidia Mellanox ConnectX-6 Dx as the base for the initial appliance, because it supports PPS in/out and hardware time stamping of packets. The card can support NTP, PTP, SyncE, and other time synchronization protocols.
In effect, any server with this card installed becomes a time server.
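For readers wondering why hardware time stamping of packets matters, here is a minimal sketch of the standard four-timestamp offset and delay calculation that NTP clients perform. The scenario numbers (a 500-microsecond clock offset, 100-microsecond one-way delays, a 30-microsecond late software timestamp) are invented for illustration, and the calculation assumes the network path is symmetric.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client send time (client clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: client receive time (client clock)
    Returns (clock offset, round-trip delay) in the timestamps' units.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# All values in nanoseconds. Assumed scenario: the server clock is
# 500,000 ns ahead, each network leg takes 100,000 ns, and the server
# spends 20,000 ns processing the request.
t1, t2, t3, t4 = 0, 600_000, 620_000, 220_000
exact = offset_and_delay(t1, t2, t3, t4)  # (500000.0, 200000)

# If the client records t4 30,000 ns late (for example, the timestamp is
# taken in software after the packet has waited in a queue), half of that
# lateness shows up as an error in the computed offset.
late = offset_and_delay(t1, t2, t3, t4 + 30_000)  # (485000.0, 230000)
print(exact, late)
```

Timestamps taken in hardware at the moment a packet crosses the network interface remove much of this kind of error, which is why support for it was a selection criterion.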
The source for the Time Appliance Project, managed by the Open Compute Project Foundation, can be found on GitHub. More details on how to build the server itself, which includes the card, can be found here.