Paving with Good Intentions: The Attempt to Rescue the Network Time Protocol
After the Heartbleed bug revealed in April 2014 how understaffed and under-funded the OpenSSL project was, the Network Time Foundation was discovered to be one of several projects in a similar condition. Unfortunately, thanks to a project fork, the efforts to lend NTP support have only divided the development community and created two projects scrambling for funds where originally there was only one.
The Network Time Protocol (NTP) is an Internet Engineering Task Force (IETF) protocol. It is responsible for syncing time between servers, using a variety of reference clocks as a guide. To the average user, this function may seem inconsequential, but knowing the time to one-thousandth of a second or less is essential for countless computer functions. Exactly when an electronic stock trade was made, for example, can mean a difference of millions of dollars. Similarly, in the nine seconds before the 2003 blackout in the northeastern United States and Canada, ten thousand events were recorded, but, because they were not accurately time-stamped, it proved impossible to document exactly what happened or to learn much from it.
As the Network Time Foundation site observes, “Knowing the time isn’t important. Until it is.”
NTP was originally written by David L. Mills in 1985. For years, the project has been managed by Harlan Stenn, whose hours developing the protocol have regularly exceeded funding, and were volunteered largely at the expense of his own consulting business. Currently, the project has four main contributors, one of whom is on sabbatical. The project is part of the Network Time Foundation, but other contributors are working on related projects, all of which are just as understaffed.
The effort to rescue NTP started becoming complicated when Stenn approached the Internet Civil Engineering Institute (ICEI) for funding and ended up attempting to collaborate with ICEI representatives Eric S. Raymond and Susan Sons. Accounts differ about exactly what happened, but the collaboration was unsuccessful.
In an interview, Mark Atwood, who later worked with Raymond and Sons, said the main divisions were technical differences about “speed of development, about proper crediting of contributors, about updates and fixes to the build tooling, and about policies regarding how long to sit on and privately circulate security issues.”
However, Atwood said that the main problem was that Stenn preferred to use BitKeeper for version control rather than Git. “At the time,” Atwood said, “BitKeeper was still proprietary, and had license terms that prevented any Bitkeeper user from making contributions to any other competing open source control tool, so most open source developers refused to touch it.”
Stenn, when interviewed, told a different story. According to Stenn, he did briefly maintain a Git repository, mainly because doing so was a requirement for funding from the Linux Foundation’s Core Infrastructure Initiative (CII), partly on the recommendation of Atwood, who was then a CII advisor. Stenn claimed that the breaking point came when Sons received a small U.S. National Science Foundation grant to fix NTP in a few months’ time, a schedule he viewed as unrealistic.
Stenn said, “She pushed hard to get me and the Network Time Foundation to turn over the NTP Project to her for rescue. They’d fix the problems, get rid of me, kill off NTF, and then turn NTP over to a maintenance team … I repeatedly told Susan that unless she shifted her approach from being a rescuer to being a cooperative and collaborative partner, we would not be working together.” Stenn charges that his refusal to do things Sons’s way resulted in her grant being halved, and, with efforts to work together failing, Sons and Raymond announced a fork called NTP Security (NTPsec).
Both projects received partial funding from the CII in 2016, but, with priorities shifting from direct funding, neither grant appears to have been renewed.
A Clash of Personalities
Atwood stated that “We have no personal animus to the NTF or to Harlan Stenn,” and called for collaboration between NTP and NTPsec. However, opinions differ among NTPsec contributors. For example, Daniel Franke, NTPsec’s security officer, refers to NTPsec as “a hostile fork.”
Sons has been especially outspoken in her comments about NTP. In a video interview with Mac Slocum of O’Reilly, she said that while trying to work with the NTF, she started to realize that “the internet is going to fall down if I don’t fix this” — a statement that substantiates Stenn’s description of why the collaboration attempt failed.
During the interview, Sons also claimed that NTP’s build system is on a server whose root password had been lost, that the code is so out of date that “there’s no way to even audit using modern tools,” and that the release of security patches was unduly delayed.
Stenn, though, denies these claims. He also pointed out that NTP is “a complete reference implementation” of the protocol, and must support older systems so long as the demand exists. “We build on many dozens of different versions of different operating systems, on a wide variety of hardware architectures,” Stenn said. “If there was a significant problem, why hasn’t somebody reported it to us?” He added that the project released five major patches in 2016, and, between 2009 and 2014, averaged 85 software updates per year.
Nor does Sons stop there. Although both she and Stenn would agree that NTP needs new volunteers, Sons describes the problem by saying that the developers in core infrastructure projects like NTP often “are older than my father” and “not always up to date,” and suggesting that they “should be retired.”
Two Approaches to Coding
Strictly speaking, NTP’s and NTPsec’s code are not comparable. On the one hand, Atwood explained that “We have been very careful not to touch Dr. [Mills’s] algorithms and the underlying math,” working instead on “what is wrapped around it” such as utilities. On the other hand, the NTP project considers that it has an obligation to support older machines and that continuing to support even flawed code is preferable to having no support at all.
These differences may explain why NTPsec continually presents removal of some 170,000 lines of code as an accomplishment. The rationale is that less code means fewer bugs — Raymond, for example, explained in an email that, by reducing the code base, NTPsec in many cases “has removed the attack surface before the vulnerability became known.”
However, at least in places, this accomplishment may have been at the cost of functionality. Stenn alleged that NTPsec “has removed lots of stuff that has had zero reported bugs in it, like sntp, the ntpsnmpd code, and various refclocks,” simply because it was not immediately useful. Similarly, while the Windows portion of the code may be overdue for rewriting, NTPsec’s removal of it before a replacement is ready seems a security solution at least as troublesome as the problem.
Other NTPsec accomplishments, such as rewriting some of the associated tools with Python instead of Perl, seem as much a matter of preference as an actual improvement. Still others, as Stenn pointed out, are open to doubt, such as whether NTPsec could have fixed all compiler warnings when “quieting a compiler warning on one version of a C compiler causes new warnings with a different version of the C compiler on the same operating system.” In the same way, Stenn questions NTPsec’s claim to have improved the accuracy of NTP timekeeping by a factor of ten, because “the internal clock precision of operating systems is generally nowhere near that good.” As for making build recipes faster, Stenn points out that, when code is being removed, compiling is naturally going to be faster.
NTPsec can certainly claim some useful accomplishments, such as the creation of clear and concise documentation, including instructions by Raymond about adding a time server to a Raspberry Pi. However, many of the accomplishments it claims sound like public relations whose first goal is belittling its rival’s code by implication.
When examined, many of these claims create the impression that NTPsec is doing too much too fast, and perhaps with too little knowledge of a large code base with a history of more than thirty years. Despite his own experience, Stenn admitted that his own understanding of the code has sometimes changed dramatically, even about subjects he believed he understood. “If you don’t understand exactly how everything works and where it fits into place, when things get busy, horrible things happen. Even a few years is not enough to grasp everything.”
As Atwood observed, the difference in the two projects’ approaches is that, to NTP, developing the protocol is a scientific process, where precision counts and caution is a virtue. But, to NTPsec, the protocol is simply another open source project, in which mistakes can be corrected in the next point release. These contrasting views lead to very different priorities.
To NTPsec, NTP is conservative and overly cautious about retaining code. In return, to NTP, NTPsec seems uninformed and reckless about making changes. No one needs to take sides to realize that the two perspectives are difficult to reconcile.
The Failure of Good Intentions
Sons has announced that she has “moved on” and is no longer actively involved with NTPsec, so perhaps an opportunity exists for the two projects to reconcile while pursuing their own goals. As temporary project leader, Atwood appears in conversation to have a less confrontational approach than Sons, as well as a genuine respect for the work that Stenn has done over the years, which could mean that collaboration may slowly become possible.
The problem is that Sons has been so successful in publicizing her version of events that the animosity between the two projects continues even after her departure.
Meanwhile, programmers whose efforts might be better pooled are in opposing camps, and are competing for already scarce funding, often from the same sources. No matter which project you favor, this result can hardly be considered progress.
When all the personalities and technical issues are set aside, the fact remains that, after two years of attention, the future of the Network Time Protocol remains overly reliant on a few overworked and underpaid volunteers. In the end, the problems appear no closer to resolution today than a few years ago when few users had heard of the protocol.
Feature image: Late 17th century German gilt bronze clock, the New York Metropolitan Museum of Art, public domain.