How Instagram Prepared for High Definition Video
Social media outlet Instagram diverted compute spend from its basic video encodings to more advanced encodings. The end result is higher video quality for users and more efficient use of compute resources, according to the company.
A blog post written by Instagram Staff Software Engineer Ryan Peterman and Instagram Software Engineer Haixia Shi explains that, by Instagram's own projections in early 2021, the company had less than a year before running out of compute capacity to process video uploads for all users. To continue to scale while prioritizing compute efficiency, Instagram engineers did a deep dive into where compute resources were being spent and identified significant ways to save.
By repurposing its most expensive minimum functionality encodings, the engineering team cut the compute spend on those encodings by 94%, allowing Instagram to continue scaling on the machines it currently has. Beyond the saved compute resources, more instances of advanced encodings reached end users.
With Instagram heavily encouraging its 2 billion users to record more video, the work of these engineers might turn out to be vital to the platform’s future success.
How Instagram Encodes Video
Instagram generates two types of video encoding.
Minimum functionality encodings are compatible with all Instagram clients. The lower-efficiency compression is easier for older devices to decode and play.
Advanced encodings, as the name suggests, use newer compression technologies for higher-quality playback.
The image above illustrates the sharper detail with fewer bits.
The problem may sound surprising: Instagram was spending over 80% of its video-processing resources on minimum functionality encodings. On that trajectory, minimum functionality encodings would monopolize its resources within a year. In the best case, videos would take longer to publish; in the worst case, failure rates would climb significantly.
Advanced encodings covered only 15% of total watch time, and Instagram projected that they would soon be squeezed out entirely as minimum functionality encodings demanded more resources.
Why Was Minimum Functionality So Expensive?
Instagram created two classes of minimum functionality encoding for every video.
Basic adaptive bitrate (ABR) encodings were their most-watched minimum functionality type. These use a technique called adaptive bitrate streaming, which allows clients to select the version that best fits their connection. It helps prevent stalling caused by changes in bandwidth and usually offers the steadiest playback.
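To make the idea concrete, here is a minimal sketch of how a client might pick an ABR rendition. The rendition ladder, bitrates, and safety factor are hypothetical illustrations, not Instagram's actual values:

```python
# Minimal sketch of client-side adaptive bitrate (ABR) rendition selection.
# The rendition ladder and bitrates below are hypothetical, not Instagram's.

RENDITIONS = [  # ordered from highest to lowest bitrate
    {"name": "1080p", "bitrate_kbps": 4500},
    {"name": "720p", "bitrate_kbps": 2500},
    {"name": "480p", "bitrate_kbps": 1200},
    {"name": "240p", "bitrate_kbps": 400},
]

def pick_rendition(measured_bandwidth_kbps, safety_factor=0.8):
    """Pick the highest-bitrate rendition that fits the measured bandwidth,
    leaving headroom so short bandwidth dips don't stall playback."""
    budget = measured_bandwidth_kbps * safety_factor
    for rendition in RENDITIONS:
        if rendition["bitrate_kbps"] <= budget:
            return rendition
    return RENDITIONS[-1]  # fall back to the lowest rendition

print(pick_rendition(3500)["name"])  # on a ~3.5 Mbps link, picks 720p
```

Because the client re-evaluates this choice as bandwidth changes, playback can step down to a cheaper rendition instead of stalling.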
The second type, progressive encodings, was rarely delivered, but Instagram continued to produce them to maintain compatibility with old versions of the Instagram app that don't support ABR playback. The standard workflow was to create both ABR and progressive encodings from the original file.
But 86.17 seconds to transcode a 23-second, 720p video does leave room for improvement.
A deep dive into the two sets of encodings revealed that they were largely similar, differing only in encoding profile and preset, with the progressive encodings being of somewhat lower quality.
A solution revealed itself: they could replace the basic ABR encodings' video frames with the progressive encodings' frames by repackaging them into an ABR-capable file structure.
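As an illustration of why repackaging is so cheap, the sketch below builds a minimal HLS-style master playlist that simply points at already-encoded progressive files: no frames are re-encoded, only a small text manifest is generated. The filenames, bitrates, and the choice of HLS format here are assumptions for illustration, not details from Instagram's pipeline:

```python
# Illustrative sketch: repackaging means generating a manifest that references
# existing progressive encodings, rather than transcoding the video again.
# Filenames and bitrates are hypothetical.

def build_master_playlist(renditions):
    """renditions: list of (uri, bandwidth_bps, resolution) tuples."""
    lines = ["#EXTM3U"]
    for uri, bandwidth, resolution in renditions:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(uri)
    return "\n".join(lines) + "\n"

manifest = build_master_playlist([
    ("progressive_720p.mp4", 2500000, "1280x720"),
    ("progressive_480p.mp4", 1200000, "854x480"),
])
print(manifest)
```

Writing a few lines of text is orders of magnitude cheaper than re-running a video encoder over every frame.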
For the same input video, it took just 0.36 seconds to generate a manifest file and repackage the video frames into an ABR-capable file structure. The cost of generating the basic ABR encodings was virtually eliminated.
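A quick back-of-the-envelope check using the two timings reported above shows how dramatic the per-video change is (the 94% figure reported later is the overall cost reduction across the pipeline, not this single comparison):

```python
# Back-of-the-envelope arithmetic from the timings in the text:
# ~86.17 s to transcode the basic ABR encodings vs. ~0.36 s to repackage.
transcode_s = 86.17
repackage_s = 0.36

speedup = transcode_s / repackage_s
reduction_pct = (1 - repackage_s / transcode_s) * 100

print(f"~{speedup:.0f}x faster per video")    # ~239x
print(f"~{reduction_pct:.1f}% less compute")  # ~99.6% for this step
```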
Compute was freed up for advanced encoding production, but at the expense of the compression efficiency of the basic ABR encodings. Instagram theorized that generating a greater number of advanced encodings would be a net positive for its users.
Testing the Theory
The change had to be proven before it could ship to production. Simply comparing the basic ABR encodings before and after would only show regressions, so that wasn't the way to go; Instagram needed to measure the net effect of delivering more advanced encodings.
The diagram outlines their hypothesis: more advanced video encodings would more than make up for the lower quality of the minimum functionality encodings.
The test was performed via a testing framework built by the engineers, which replicated a small percentage of traffic across a test pool and a control pool, each with equal processing power. The encodings from each pool were saved to different namespaces so they could later be identified as part of the control or test catalog of videos.
At delivery time, users saw the encodings from one catalog or the other. This allowed Instagram to measure whether the new encoding scheme was better for users.
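One common way to implement this kind of split is deterministic hash-based bucketing, sketched below. The bucketing scheme, function names, and namespace labels are assumptions for illustration; the article does not describe Instagram's actual assignment logic:

```python
# Sketch of deterministic test/control assignment for an A/B test, assuming a
# hash-based split. Scheme and namespace names are hypothetical, not Instagram's.
import hashlib

def assign_catalog(video_id: str, test_fraction: float = 0.5) -> str:
    """Deterministically map a video to the 'test' or 'control' namespace."""
    digest = hashlib.sha256(video_id.encode()).hexdigest()
    bucket = int(digest, 16) % 10_000  # stable bucket in [0, 10000)
    return "test" if bucket < test_fraction * 10_000 else "control"

print(assign_catalog("video_12345"))
```

Because the assignment is a pure function of the ID, the same video always lands in the same catalog, which keeps the experiment's measurements consistent across requests.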
The test proved that although the minimum functionality encodings were degraded, the higher watch time on advanced encodings was a net positive.
Pushing to Production
After launching the optimization, they saw major gains in compute savings and higher advanced encoding watch times. The new encoding scheme reduced the cost of generating their basic ABR encodings by 94%. With more resources available, the company was able to increase the overall watch time coverage of advanced encodings by 33%.