Conway’s Law and Internal APIs at The New York Times
Yesterday we talked about how “The New York Times” is dogfooding its dozen external application programming interfaces (APIs) as a way to drive efficiency and innovation for its more than 150 internal APIs. Today we get into how the organization is designing its APIs so they are not just a reflection of its internal organization, a curse known as Conway’s Law.
For The Times, revamping internal API structure is more of a cultural and organizational change — much of which is reflected in the sections of the paper itself — with at least 150 internal APIs making about eight billion calls a day, owned by many different teams and stakeholders.
“Internally when I’m providing a critical service to another team and my experience causes something to break or it doesn’t work that well, it can cause something to not launch,” said The Times’ API architect Scott Feinberg.
But this isn’t new. The publishing industry is still based around print, and even many online publications are built print first. And don’t be fooled into thinking that everyone’s getting their news online: The Times still has a print circulation of more than 1.1 million. Its way of releasing news still reflects that, which seems less and less optimal.
“The way we think about information even for new publications that can come out, it’s still a print mentality in a lot of ways,” he said.
But that’s how they’ve been doing it for more than 164 years, so why would that matter? Because of Conway’s Law, which holds that the systems an organization builds end up mirroring its own structure, and vice versa.
For example, this November, U.S. Supreme Court Justice Ruth Bader Ginsburg and women’s rights leader Gloria Steinem sat down for lunch with writer Philip Galanes in the monthly feature “Table for Three.” This incredible article with these feminist pioneers got missed by a lot of readers because “when we published it on our platform at the top of the page it said ‘Fashion and Style’ section, and the reason is that this came from ‘Fashion and Style’; ‘Table for Three’ is on there. It’s an antiquated delineation between teams but it’s an organization thing.” No one recognized that an epic piece like this might look out of place on the Fashion page; it ran there simply because that is the desk it came from.
Another time, a great national or even international news piece could come from the Metro Desk and only be read by those interested in the tristate area. Which desk writes something shouldn’t necessarily dictate where the story appears, but, because of The Times’ technical and cultural architecture, that’s just what happens.
“We forget sometimes that our organizational structure shouldn’t be reflected in how we put content out,” Feinberg said.
Of course, that’s easier said than done. When you’re talking about one of the world’s largest newsrooms with 1,300 journalists, you certainly need some sort of more rigid organization because there’s no way each of these teams could know what the other is doing. And this organizational problem has to be brought into our digital world.
Since the sixties, The Times has had a quote of the day in Section A. Feinberg was playing around with building an API for this and found that half the time it shows up under Corrections and the other half under Summaries, which probably makes sense for how it appears in print but not on the digital side of things.
“A 164 years’ worth of content is a challenge and it’s also one of our greatest strengths,” he said.
In a lot of ways, The Times is a big data company. All daily articles dating back to around 1851 are archived and stored as something similar to giant, searchable PDFs and are cut using a geospatial data analysis tool, like those cartographers use, to locate each article on the pages.
All of this data is stored in a system that is internally API-accessible, including full article text in the searchable database accessible to all subscribers. By going back through the TimesMachine, you can witness how, in the life of The Times, some sections have lived while others have died.
For example, The Times has more than 17,000 recipes in its archives. “We just own that content and a couple years ago we were like ‘Oh maybe we should start thinking about monetizing those articles because we probably have more recipes than the Food Network’.” This resulted in the NYT Cooking app.
Changes this year had to be made when The Times cut its 44-year-old Bridge column and the revolutionary-in-2007 City Room blog.
“We’re changing faster than we ever had before but as we change we see a lot of the resistance to change that we’ve seen throughout our history,” Feinberg said. And with these changes, there need to be structural changes as well.
“We’re still creating the most quality information, but we’re just thinking of new ways of doing this,” including new ways to manage, share, reuse and distribute it onto different platforms and devices. “We’re doing the same thing with print and we’re just rediscovering how to do that within the context of the Web and mobile and social and trying to do that while still giving people not only the information they want but the information they need,” Feinberg said.
The first website for The Times was launched in 1996. It was literally a duplication of the offline content, which was published once a day as an image. Since then, they’ve obviously gone through a lot of changes, which resulted in a huge number of services (some of which can be considered monoliths themselves) that make up a service-oriented architecture.
Over the last four years, The Times has dramatically increased its investment in its technology staff. With a team of several hundred engineers, Feinberg admits that it’d be almost impossible to develop if they all shared the same interface. He also explained that they don’t use large enterprise tooling, but rather are standardizing on GitHub, where The Times now has about 600 code repositories, about a third of its total repos.
“Our services are definitely getting smaller. The microservices term has a lot of history with it, but we are moving more towards that,” Feinberg said. “Internally we are moving to a world where all our APIs will be documented with Swagger or something similar. With the new OADF initiative [Open API Definition Format], it definitely has the most mindshare and from our standpoint it has the most tooling, where all of our APIs will be documented at least the same way, but it goes beyond that as we’re figuring out how our contract tests work.”
He says that they need to be able to make sure not only that their APIs are documented but that they are being tested in a consistent way. For public APIs, they’re using Runscope to do contract tests, calling endpoints every half hour. He admits that a half-hour interval is far too long to yield useful analytics, but “it confirms that our API is conforming to a standard that we’ve agreed on.”
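To make the idea of a contract test concrete, here is a minimal sketch in Python. It is not how The Times or Runscope implements these checks; the contract, the field names, and the sample payloads are all hypothetical, chosen only to show what "conforming to a standard that we’ve agreed on" might look like in code.

```python
from typing import Any

# Hypothetical contract: the fields a response must carry and their types.
# These names are illustrative and not taken from any real Times API.
ARTICLE_CONTRACT: dict[str, type] = {
    "headline": str,
    "section": str,
    "published": str,
    "word_count": int,
}


def contract_violations(payload: dict[str, Any],
                        contract: dict[str, type]) -> list[str]:
    """Return a list of ways the payload breaks the contract (empty if none)."""
    problems = []
    for field, expected in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            actual = type(payload[field]).__name__
            problems.append(
                f"{field}: expected {expected.__name__}, got {actual}"
            )
    return problems


# Made-up sample responses standing in for what an endpoint might return.
good = {"headline": "Example", "section": "Metro",
        "published": "2015-11-01", "word_count": 900}
bad = {"headline": "Example", "section": "Metro", "word_count": "900"}

print(contract_violations(good, ARTICLE_CONTRACT))
print(contract_violations(bad, ARTICLE_CONTRACT))
```

A scheduler could run a check like this against each endpoint at a fixed interval and raise an alert whenever the list of violations is not empty.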
Feinberg goes on to say that they’re rolling these practices out across teams and adding more, improving how they define APIs and how resources must be defined within an API.
The API team is also moving toward using similar vocabulary in all APIs across departments, which Feinberg says makes a huge difference. The Times has a canonical store of article data with a consistent vocabulary, which allows them to have a lot of different ways of showing article content, but he says that “at the end of the day, it’s all going back to the same couple of systems. We definitely have a lot of work to do in standardizing.”
In addition, they’re moving closer to a spec-first development strategy for new APIs, getting feedback before even starting to write code, with people interacting with the APIs even before they decide which data source to use, as well as working to improve cross-team communication even before the tech is built.
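One way to picture spec-first development is a stub that answers requests straight from the API description, before any data source has been chosen. The sketch below is an assumption-laden illustration in Python, not The Times’ tooling: the spec is a small dictionary loosely modeled on the Swagger 2.0 layout, and the `/quote-of-the-day` path and its example payload are hypothetical placeholders.

```python
from typing import Any

# Hypothetical API description, loosely following the Swagger 2.0 shape
# (paths -> method -> responses -> examples keyed by media type).
SPEC: dict[str, Any] = {
    "paths": {
        "/quote-of-the-day": {
            "get": {
                "responses": {
                    "200": {
                        "examples": {
                            "application/json": {
                                "quote": "Placeholder quote text",
                                "speaker": "Placeholder speaker",
                            }
                        }
                    }
                }
            }
        }
    }
}


def stub_response(spec: dict[str, Any], path: str,
                  method: str = "get", status: str = "200") -> tuple[int, Any]:
    """Answer a request using only the example recorded in the spec."""
    try:
        response = spec["paths"][path][method]["responses"][status]
    except KeyError:
        # The spec says nothing about this request, so the stub rejects it.
        return 404, {"error": f"{method.upper()} {path} is not in the spec"}
    example = response.get("examples", {}).get("application/json", {})
    return int(status), example


print(stub_response(SPEC, "/quote-of-the-day"))
print(stub_response(SPEC, "/recipes"))
```

Because the stub depends only on the spec, prospective consumers can try the interface and give feedback while the team is still deciding where the data will come from.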
For now, “The New York Times” will continue to look for the balance among what you see when you open the paper or the website or the app or your watch, what its API partners want, and what makes for a stable and consistent API architecture.
Feature Image: A geographic visualization of tweets about New York Times articles, by Justin Blinder.