A lot of little tweaks can yield big improvements. That’s the major takeaway from eBay’s year-long initiative to improve the site’s performance across all platforms — iOS, Android and the web.
“Don’t think there’s one thing that’s going to drastically move the needle. … A lot of things add up. Don’t ignore the small things, saying, ‘This is only going to save me a few milliseconds and I don’t care about it,’” said Senthil Padmanabhan, vice president and eBay technical fellow, in an interview with The New Stack.
While speed had long been part of the company’s focus, Padmanabhan was among those telling senior leadership there was serious work to be done.
“Speed used to be a very critical factor in terms of performance. We wanted to deliver an experience that was fast for our end users. That has always been the case in our company,” he said. “In 2018, we had a lot of focus on a lot of product deliverables. Speed became constant, meaning it didn’t improve, it didn’t decrease. But by the end of 2018, it reflected to us that being constant was not acceptable.”
One reason was that as global internet connectivity improves, a site whose speed stays constant is effectively delivering a worse experience.
The second reason was that people don’t shop at just one site; they shop at several and compare the experiences.
“In Silicon Valley, we live in a bubble here. We have the best speeds, hardware and devices. But that’s not true of most of the rest of the world. We did a study on that and found it’s true: A lot of users across the globe didn’t have the same quality experience that a user in North America gets. So we started talking about it,” Padmanabhan said.
“So we made a case to senior leadership saying, ‘We have focused on speed in the past. We need to bring attention to it again, do a big initiative to move speed to the way we want to do it, then eventually make it a periodic thing where we keep on where we make speed a key part of the experience.’”
The initiative, simply called Speed, took place throughout 2019. Padmanabhan said it was a top-down as well as a bottom-up project.
“The engineers wanted to do it and the senior leadership wanted to do it. So it was perfect alignment,” he said.
Focusing on the right metrics would determine the success of the project. “How fast is fast?” became the question.
For the web, eBay decided to focus on:
- TTFB (Time To First Byte) — the time that it takes for a user’s browser to receive the first byte of page content.
- TATF (Time to Above The Fold) — which for desktop users is reached once the sixth item image has loaded.
- E2E (end-to-end) — the time it takes to load the entire page.
For iOS and Android:
- VVC (Virtual Visual Complete) — the TATF equivalent for native apps.
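As a rough illustration of how the three web metrics above relate to one another, the sketch below derives them from a set of raw timestamps. The `PageTimings` record and its field names are hypothetical, and treating TATF as the moment the slowest of the first six item images finishes is one reading of the definition, not a description of eBay’s instrumentation.

```python
from dataclasses import dataclass
from typing import Dict, Sequence

# Per the article, desktop TATF is reached after the sixth item image loads.
ABOVE_THE_FOLD_IMAGES = 6


@dataclass(frozen=True)
class PageTimings:
    """Hypothetical record of timestamps, in ms since navigation start."""

    first_byte_ms: float
    item_image_loaded_ms: Sequence[float]  # page order, one entry per item image
    page_loaded_ms: float


def web_metrics(
    t: PageTimings, fold_images: int = ABOVE_THE_FOLD_IMAGES
) -> Dict[str, float]:
    """Derive TTFB, TATF and E2E from raw timestamps."""
    if len(t.item_image_loaded_ms) < fold_images:
        raise ValueError("not enough item images to determine above-the-fold time")
    # Images can finish out of order, so take the slowest of the first N.
    tatf = max(t.item_image_loaded_ms[:fold_images])
    return {"TTFB": t.first_byte_ms, "TATF": tatf, "E2E": t.page_loaded_ms}


# Made-up sample values, for illustration only.
sample = PageTimings(
    first_byte_ms=120.0,
    item_image_loaded_ms=[300.0, 420.0, 380.0, 450.0, 500.0, 470.0, 900.0],
    page_loaded_ms=1500.0,
)
print(web_metrics(sample))
```

In this sample the seventh image, which loads at 900 ms, sits below the fold and does not count toward TATF.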
“We used to do a lot of experiments on speed before 2018. We have a machine learning model where we collect all this data and do correlation among various engagement metrics,” he said.
“So of all the metrics we had, the above-the-fold and page-load times were ones that correlated to engagement. So we had some proof that improving these were going to improve customer satisfaction.”
Using a competitive study based on the Chrome User Experience Report, eBay set speed budgets, measured in milliseconds, against these metrics for the homepage, search results page, and item page. Separate performance budgets were set for synthetic testing and real user monitoring (RUM) environments on each of the platforms (iOS, Android, desktop, and mobile web).
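A speed budget of this kind is easy to check mechanically. The sketch below compares measured values against per-page budgets and reports any overruns. The budget figures are invented for illustration, since the article does not publish eBay’s numbers, and the `budget_overruns` helper is hypothetical.

```python
from typing import Dict

# Hypothetical budgets in milliseconds; the article does not give real figures.
BUDGETS_MS: Dict[str, Dict[str, float]] = {
    "homepage": {"TTFB": 200, "TATF": 1000, "E2E": 2500},
    "search": {"TTFB": 300, "TATF": 1200, "E2E": 3000},
    "item": {"TTFB": 300, "TATF": 1100, "E2E": 2800},
}


def budget_overruns(
    page: str,
    measured_ms: Dict[str, float],
    budgets: Dict[str, Dict[str, float]] = BUDGETS_MS,
) -> Dict[str, float]:
    """Return {metric: ms over budget} for each metric that exceeds its budget."""
    return {
        metric: measured_ms[metric] - limit
        for metric, limit in budgets[page].items()
        if metric in measured_ms and measured_ms[metric] > limit
    }


print(budget_overruns("search", {"TTFB": 250, "TATF": 900, "E2E": 3100}))
```

Keeping separate budgets for synthetic and RUM data, or for each platform, would simply mean keying the table on more than the page name.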
In his post entitled “Speed By A Thousand Cuts,” Padmanabhan outlines a hefty list of optimizations, or “cuts,” including:
- Native app parsing improvements — using an efficient parsing algorithm that optimizes for content that needs to be displayed immediately.
- Critical path optimization for services — to get above-the-fold content loaded quickly.
- Image optimizations — standardizing on the WebP image format for search results and applying the same rigor in size and format to curated images, such as banners and logos.
- Startup time improvements for native apps — reducing the initialization time at the application level when doing a cold start.
- Item prefetch — prefetching items from search, which reduces either server processing time or network time, depending on where the item is cached.
- Search images eager download — sending the first 10 item images to the browser in a chunk along with the header, so the downloads can start before the rest of the information arrives, reducing retrieval time.
- Autosuggest edge caching — autosuggestions served from a CDN cache globally for native apps and non-US markets for the web, reducing network latency and server processing time.
- Homepage unrecognized users edge caching — caching content (HTML) for new or unrecognized users from the edge network for a short period so they get content from a server near them, instead of from a data center.
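The eager image download in the list above depends on flushing an early chunk of the response before the full page is ready. One way to sketch the idea is a generator that first yields the document head with preload hints for the first 10 images and only then renders the body. Whether eBay uses `<link rel="preload">` hints or some other mechanism is not stated in the article, and `stream_search_page` and `render_body` are hypothetical names.

```python
from html import escape
from typing import Callable, Iterator, Sequence


def stream_search_page(
    image_urls: Sequence[str],
    render_body: Callable[[], str],
    eager_count: int = 10,
) -> Iterator[str]:
    """Yield the page in chunks so image downloads can start early."""
    head = ["<!doctype html><html><head>"]
    for url in image_urls[:eager_count]:
        # A preload hint lets the browser fetch the image before the
        # matching <img> tag arrives in a later chunk.
        head.append(
            f'<link rel="preload" as="image" href="{escape(url, quote=True)}">'
        )
    head.append("</head><body>")
    yield "".join(head)  # first chunk: flushed before the slow work below

    yield render_body()  # slower: search results markup
    yield "</body></html>"
```

Because the generator is lazy, the body is not rendered until the first chunk has been handed to the server for sending, which is what gives the browser a head start.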
Not all the cuts saved time equally. For instance, item prefetch affects only users who come from search and click one of the first 10 items, but for those users it saved a substantial number of milliseconds. Reducing payload, by contrast, affects 100% of traffic, but its savings were smaller. Yet the overall impact of the two ended up being roughly the same.
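The arithmetic behind that comparison is simple: the average saving across all traffic is the share of traffic affected multiplied by the time saved for each affected request. The figures below are hypothetical, chosen only to show how a narrow, deep cut and a broad, shallow one can come out even.

```python
def average_saving_ms(reach_percent: float, saving_ms: float) -> float:
    """Average time saved per page view across all traffic."""
    return reach_percent * saving_ms / 100


# Hypothetical figures, not eBay's measurements.
prefetch = average_saving_ms(reach_percent=10, saving_ms=500)
payload = average_saving_ms(reach_percent=100, saving_ms=50)
print(prefetch, payload)
```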
Instrumentation was key to reducing payload, he said, by highlighting unnecessary code.
“If we sent a file and found that 60% of the code is being executed and 40% was not being used, we looked at that code and found some of these things were never even used. Those features are being replicated. Product keeps evolving and we keep replicating them. So we started reducing it. Similarly with payloads on our APIs,” he said.
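The kind of measurement Padmanabhan describes can be approximated from a code coverage report. The sketch below assumes a hypothetical report shape, a file size plus a list of executed byte ranges, and computes the share of the file that never ran. It is not a description of eBay’s tooling.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # half-open [start, end) byte offsets that executed


def merge_ranges(ranges: Iterable[Range]) -> List[Range]:
    """Combine overlapping or touching ranges so no byte is counted twice."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def unused_fraction(total_bytes: int, executed: Iterable[Range]) -> float:
    """Fraction of a file's bytes that never executed."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used = sum(end - start for start, end in merge_ranges(executed))
    return 1 - used / total_bytes


# A 1,000-byte file in which 600 bytes executed, as in the 60/40 example.
print(unused_fraction(1000, [(0, 400), (300, 600)]))
```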
Item prefetch was one of the most challenging cuts, Padmanabhan said, because of the nuance involved in prefetching items that match the criteria.
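A stripped-down version of the idea might look like the sketch below, which warms a short-lived cache with the first few search results and falls back to a normal fetch on a miss. The `ItemPrefetcher` class, its parameters and the 60-second lifetime are all hypothetical. A production version would fetch asynchronously and apply the matching criteria Padmanabhan alludes to, which the article does not detail.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


class ItemPrefetcher:
    """Warm a short-lived cache with the top search results (hypothetical design)."""

    def __init__(
        self,
        fetch_item: Callable[[str], dict],
        top_n: int = 10,  # the article's prefetch scope: the first 10 items
        ttl_seconds: float = 60.0,  # assumed lifetime, not from the article
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_item = fetch_item
        self._top_n = top_n
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def on_search_results(self, item_ids: Iterable[str]) -> None:
        """Prefetch only the first top_n results of a search."""
        for item_id in list(item_ids)[: self._top_n]:
            expires_at = self._clock() + self._ttl
            self._cache[item_id] = (expires_at, self._fetch_item(item_id))

    def get_item(self, item_id: str) -> dict:
        """Serve a fresh prefetched item, else fall back to a normal fetch."""
        entry = self._cache.pop(item_id, None)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        return self._fetch_item(item_id)
```

The cache lookup is what removes server processing or network time from the click path. Items outside the first `top_n`, or entries that have expired, cost the same as before.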
“Autosuggest edge caching — that our customers love, but that was low-hanging fruit to me. It was a simple thing to do, but immediately had some nice impact. It was something we should have done a long time back, but just didn’t look into it,” he said.
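Edge caching suits autosuggest because many users type the same prefixes. The toy model below stands in for a CDN node: it normalizes the typed prefix into a cache key and calls the origin service only on a miss. The class, the key format and the absence of expiry are simplifications for illustration, not eBay’s or any CDN’s actual behavior.

```python
from typing import Callable, Dict, List


def cache_key(market: str, prefix: str) -> str:
    """Normalize case and whitespace so equivalent prefixes share one entry."""
    return f"{market.lower()}:{' '.join(prefix.lower().split())}"


class EdgeSuggestCache:
    """Toy stand-in for a CDN node in front of an autosuggest service."""

    def __init__(self, origin: Callable[[str, str], List[str]]) -> None:
        self._origin = origin
        self._store: Dict[str, List[str]] = {}
        self.origin_calls = 0  # counts cache misses that reached the origin

    def suggest(self, market: str, prefix: str) -> List[str]:
        key = cache_key(market, prefix)
        if key not in self._store:
            # Miss: pay the network and server cost once, then reuse the result.
            self.origin_calls += 1
            self._store[key] = self._origin(market, prefix)
        return self._store[key]
```

A real deployment would also expire entries after a set lifetime so that suggestions stay current.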
Because eBay, like all e-commerce sites, is so image-intensive (it has more than 1 billion listing images), finding even small reductions there provided big improvements for both its users and its infrastructure, Padmanabhan said.
The WebP format offers image quality comparable to JPEG at a smaller file size, he said, so standardizing on that format paid off.
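Serving WebP usually involves checking what the client says it can display. The sketch below picks a variant from the request’s `Accept` header and adds `Vary: Accept` so that shared caches keep the variants apart. The JPEG fallback and the `choose_image_variant` helper are assumptions, since the article does not say how eBay handles clients without WebP support, and the parsing ignores quality values for brevity.

```python
from typing import Dict, Tuple


def choose_image_variant(
    accept_header: str, base_name: str
) -> Tuple[str, Dict[str, str]]:
    """Pick WebP when the client advertises support, else fall back to JPEG."""
    accepted = {
        part.split(";")[0].strip().lower() for part in accept_header.split(",")
    }
    if "image/webp" in accepted:
        filename, content_type = f"{base_name}.webp", "image/webp"
    else:
        filename, content_type = f"{base_name}.jpg", "image/jpeg"
    # Vary: Accept tells shared caches to store the two variants separately.
    return filename, {"Content-Type": content_type, "Vary": "Accept"}


print(choose_image_variant("image/avif,image/webp,*/*;q=0.8", "listing-123"))
```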
Making all these changes also meant tweaking its infrastructure.
“We have points of presence globally distributed, so we needed to go in and tweak our infrastructure logic on the edge in order to manage some of the caching: auto-specific caching or homepage caching, pre-fetch. We also had to work with the CDN provider on some logic for that,” he said.
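The edge logic for homepage caching has to tell recognized users from unrecognized ones, since only the latter can safely share a cached page. A minimal sketch of that decision follows. The cookie name and the 60-second lifetime are hypothetical, and the real rules eBay worked out with its CDN provider are not described in the article.

```python
from http.cookies import SimpleCookie
from typing import Dict

RECOGNITION_COOKIE = "user_id"  # hypothetical cookie name
EDGE_TTL_SECONDS = 60  # hypothetical "short period"


def homepage_cache_headers(cookie_header: str) -> Dict[str, str]:
    """Let the edge cache the homepage only for unrecognized visitors."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    if RECOGNITION_COOKIE in cookies:
        # Personalized page: it must not be stored or shared by the edge.
        return {"Cache-Control": "private, no-store"}
    # s-maxage applies to shared caches such as a CDN, not to the browser.
    return {"Cache-Control": f"public, s-maxage={EDGE_TTL_SECONDS}"}


print(homepage_cache_headers("theme=dark"))
```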
While the year-long initiative has ended, there will be new projects for 2020 with additional metrics, Padmanabhan said.
“Whenever you’re adding a new product, you’re adding more code, more images, which is going to increase the payload. And you still have to maintain the [performance] budget. So you have to offset that by doing some optimization. So with a lot of the new products, we are including the optimizations in the plan itself,” he said.
“I would say the biggest thing is empathy for the customer. I think that’s ingrained now across all our teams. Before we start any new project, we talk about performance, which I would say is the biggest win now.”