Docker Hub Limits: What They Are and How to Route Around Them
Earlier this year, Docker announced that it would be implementing new restrictions on the use of its Docker Hub container image repository. The move was necessary to manage outlying use cases that go beyond what it is willing to continue providing as a free service, the company claimed.
At the time, those restrictions covered two things: the retention of images that sit idle for extended periods, and the rate limiting of image pulls. Both were to take effect on November 2nd. While the limitation regarding idle repositories was delayed, the pull rate limiting went into effect earlier this month, capping anonymous and free-tier users at 100 and 200 image pulls per six hours, respectively. Paid users, however, enjoy unlimited image pulls.
“We’ve got now more than 12 million developers on there, but what we discovered was that an extremely small percentage were making very, very heavy use of the Hub. And so, as part of this, we looked at that and we said, we’ve got to make sure that this is sustainable, really, for both sides,” said Donnie Berkholz, vice president of products at Docker, in an interview. “It affects a very small percentage, on the order of 1.5% or less of the user base, and so there’s a lot of noise out there, but I think it’s important to emphasize that we’re talking about more than 98% of our users that see no effect and keep on as they were, happily using Docker Hub.”
According to Berkholz, the company found that this very small percentage of users was responsible for 30 percent of Docker Hub’s traffic.
“You can’t keep the unlimited, all-you-can-eat buffet going forever for everybody. You’ve got to figure out how you get to a point where people can take the right amount of portions. When we worked through that, we wanted to make sure that, for the vast majority of developers, they would never see this, they would never run into it,” explained Berkholz. “For a small population of users who are making extremely heavy use, those are the ones where we want to make sure that we’re able to provide them with value and make sure that we’re able to bring them into a place where we understand their use cases, and we’re able to provide for those use cases.”
Nonetheless, several companies have addressed the changes in blog posts offering their own solutions for developers, and OpenFaaS founder Alex Ellis says that the issue is one many users, especially those running Kubernetes, need to prepare for now rather than later.
“I think a lot of people are sort of complacent about it, or they haven’t hit the issues that they’re going to hit yet. When you think about CI products, they’re using shared IP addresses for all of the activities on those nodes,” said Ellis.
In a blog post detailing how to prepare for Docker Hub rate limits, Ellis writes that “Kubernetes users will be most affected since it’s very common to push and pull images during development many times with each revision of a container. Even bootstrapping a cluster with 10 nodes, each of which needs 10 containers just for its control-plane could exhaust the unauthenticated limit before you’ve even started getting to the real work.”
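The arithmetic behind Ellis’s example is worth spelling out. The short Python sketch below uses the limits reported in this article (100 pulls per six hours for anonymous users, 200 for free accounts) and assumes, purely for illustration, that every control-plane container is a separate image pull from Docker Hub and that all nodes share one public IP address. The function and constant names are hypothetical.

```python
# Pull limits per six-hour window, as reported in this article.
ANONYMOUS_LIMIT = 100
FREE_ACCOUNT_LIMIT = 200


def bootstrap_pulls(nodes: int, containers_per_node: int) -> int:
    """Count image pulls, assuming one Docker Hub pull per container."""
    return nodes * containers_per_node


def remaining_pulls(limit: int, pulls: int) -> int:
    """Pulls left in the current window; never below zero."""
    return max(limit - pulls, 0)


if __name__ == "__main__":
    pulls = bootstrap_pulls(nodes=10, containers_per_node=10)
    print(f"pulls needed to bootstrap: {pulls}")
    print(f"left for anonymous users: {remaining_pulls(ANONYMOUS_LIMIT, pulls)}")
    print(f"left for free accounts: {remaining_pulls(FREE_ACCOUNT_LIMIT, pulls)}")
```

Under those assumptions, the 10-node cluster uses up the entire anonymous allowance before any application image is pulled, and half of the free-account allowance.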
“It’s not a problem to pay. It’s a problem in that those rate limits, like if it had been 1,000 to 5,000, per six hours, we probably all would have got on absolutely fine, paid them and got our registry secrets in where we needed them.” — Alex Ellis
Of course, these are exactly the scenarios in which Docker urges users to get a paid account: many users sharing the same IP address, or a workflow that triggers more image pulls than the current rate limits allow. For individual users, the subscription is $5 per month, while team subscriptions start at $25 per month for five users. Even for those who pay, however, Ellis contends that the new limits will place an undue burden on novice Kubernetes users.
“For Kubernetes learners, whichever solution you go for (including paying for a Docker Hub account), this is going to be an additional step,” writes Ellis. “The learning curve is steep enough already, but now rather than installing Kubernetes and getting on with things, a suitable workaround will need to be deployed on every new cluster.”
For users looking to circumvent the new limits, Ellis offers a number of solutions: hosting a local mirror of Docker Hub, using a public mirror of Docker Hub, publishing your own images to another registry, or, if still using Docker Hub, configuring an image pull secret to authenticate with Docker and receive either the higher limit of 200 pulls per six hours or, with a paid account, unlimited pulls.
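For readers unfamiliar with the last option, an image pull secret is a Kubernetes Secret of type `kubernetes.io/dockerconfigjson` that holds registry credentials in Docker’s config format. The Python sketch below builds such a manifest as JSON. It is a minimal illustration, not taken from Ellis’s post: the function name, the secret name `dockerhub-creds`, and the credentials are placeholders.

```python
import base64
import json


def docker_hub_pull_secret(name: str, namespace: str, username: str, token: str) -> dict:
    """Build a Secret manifest of type kubernetes.io/dockerconfigjson."""
    # Docker's config format stores "username:password" base64-encoded under "auth".
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    docker_config = {
        "auths": {
            "https://index.docker.io/v1/": {
                "username": username,
                "password": token,
                "auth": auth,
            }
        }
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {
            # Values under "data" must themselves be base64-encoded.
            ".dockerconfigjson": base64.b64encode(
                json.dumps(docker_config).encode()
            ).decode()
        },
    }


if __name__ == "__main__":
    # Placeholder credentials; substitute a real Docker Hub user and access token.
    manifest = docker_hub_pull_secret(
        "dockerhub-creds", "default", "example-user", "example-token"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, then referenced by name under `imagePullSecrets` in a Pod spec or service account.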
To that end, Ellis has created registry-creds, an open source operator that propagates a single ImagePullSecret to all namespaces within a cluster so that images can be pulled with authentication, making it easier for Kubernetes users to consume images from Docker Hub. While the tool is intended to “ease the every-day lives of developers and new-comers to Kubernetes,” he also suggests using something like Argo, Flux, or Terraform for managing secrets across namespaces in production.
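The idea of propagating one secret can be sketched in a few lines. The following Python is not how registry-creds is implemented; it is a hypothetical illustration that copies an existing Secret manifest into a list of namespaces and builds a patch that attaches the secret to a service account. The helper names, the secret name, and the namespace list are all assumptions.

```python
import copy
import json
from typing import List


def copy_secret_to_namespaces(secret: dict, namespaces: List[str]) -> List[dict]:
    """Return one copy of the Secret manifest per target namespace."""
    copies = []
    for ns in namespaces:
        clone = copy.deepcopy(secret)  # leave the original manifest untouched
        clone["metadata"]["namespace"] = ns
        copies.append(clone)
    return copies


def service_account_patch(secret_name: str) -> dict:
    """Patch body that makes a service account use the pull secret."""
    return {"imagePullSecrets": [{"name": secret_name}]}


if __name__ == "__main__":
    # Hypothetical seed secret; the data value is a placeholder, not real credentials.
    seed = {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "dockerhub-creds", "namespace": "default"},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": "<base64-encoded Docker config>"},
    }
    for manifest in copy_secret_to_namespaces(seed, ["dev", "staging"]):
        print(json.dumps(manifest))
    print(json.dumps(service_account_patch("dockerhub-creds")))
```

Each copied manifest could be applied with `kubectl apply -f`, and the patch passed to `kubectl patch serviceaccount default -n <namespace> -p '<json>'` so that Pods in that namespace pull with the shared credentials.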
In terms of alternatives, there are several, though each should be considered according to its own terms and limitations. Currently, GitHub offers unlimited pulls of public images at its GitHub Container Registry, Google offers cached Docker Hub images on its own mirrors, and AWS offers private hosting for mirroring and says it plans to launch a public container registry “within weeks.” VMware, meanwhile, contends that Harbor “can help you mitigate the effects of the upcoming Docker Hub limits via both replication capabilities and a proxy cache feature,” and GitLab has offered a guide to its users on how to “reduce the number of calls to DockerHub from your CI/CD infrastructure.” GitLab has also open sourced its Dependency Proxy, which will be free for GitLab Core users “for proxying and caching images from Docker Hub or packages from any of the supported public repositories” as of November 22, 2020.
Some users who might feel the effects of these rate limit changes most acutely are open source projects, which are often already strapped for both cash and time, and Docker has offered unlimited image pulls for those projects that qualify. Requirements include that all the repos within the publisher’s Docker namespace must meet the Open Source Initiative’s (OSI) definition of “open source,” distribute images under an OSI-approved open source license, and be “public and non-commercial.” Projects have to submit an application yearly, and those approved must commit to participating in a list of “joint promotional programs,” including “blogs, webinars, solutions briefs and other collateral.”
Berkholz contends that the requirements are reasonable and add up to a quid pro quo of value for all involved.
“Hopefully, it’s not too much to ask that, as we’re providing them things for free on our behalf, because we care about sustaining that community, that we’re able to make that something that makes a lot of sense for everybody involved and not go over the top and asking people to come out here and write 70 blog posts and do a webinar every week or anything,” said Berkholz. “We’re able to take the kinds of things that we’re trying to make available to help out the open source community, and get a little bit of that benefit ourselves and make it a fair trade of value, because that’s what’s going to enable the sustainability of that program on our end.”
For Ellis, the open source requirements, much like the rate limits themselves, appear a touch too aggressive. He encourages everyone to pay Docker, but fears that the end result won’t be that Docker gets paid and that the entire affair will instead end in fragmentation.
“The end result is going to be a lot of fragmented solutions. Just like we had 20 serverless projects in the CNCF landscape, and then a few died out, we’re going to get the same thing; everyone’s going to be building the public container registry, everyone’s going to be trying to solve this problem and get a portion of those customers,” said Ellis. “Where it’s really going to affect people is the developer experience. If you’re a new developer trying to learn Kubernetes, it’s going to cause friction for you.”
The Cloud Native Computing Foundation and GitLab are sponsors of The New Stack.