In addition to my primary role as the head of the Data team at LinkedIn, I oversee the coordination of our company’s open source efforts. To date, LinkedIn engineers have released more than 75 projects as open source with the company’s support. Recently, I was asked how we decide which projects to open source and when to stop investing in a project.
Let me start by saying that no company can truly “own” an open source project. When the decision is made to open source a project, it should be understood that the code truly is being “shared” with a larger community outside of your company’s walls. That community now “owns” the project. You may be part of it; you may even be its biggest contributor and its current custodian. But it is no longer yours, and with time, you may become a minority stakeholder. Even if you decide to discontinue using an open sourced project internally and no longer contribute to it, it does not mean that the project has come to an end.
In exchange for giving over ownership, open sourced projects have residual benefits for the company that supports them. They help develop your “engineering brand.” They provide an explicit sign from an organization to prospective engineers that they’ll not only be able to develop their craft if they work there, but they will be able to do so in a meaningful way on an open platform. If their project becomes open sourced, it is even a platform that they’ll be able to personally leverage on their next play (instead of having to reinvent it). Supporting an open source project pays a company back in the long run.
When to Cut Bait
Software is constantly evolving, and sometimes situations can arise where a company no longer sees value in investing in an open source project that they previously supported. From a purely corporate perspective, I have found that there are three reasons why companies typically stop investing in a project they open sourced.
The first reason is very prosaic: a project simply isn’t delivering enough value. This could mean that the given software is no longer delivering the benefits it used to. It could mean there are now better alternatives available, either open source or commercial. It could also mean that new efforts or business goals are demanding engineers’ time, resources, and attention. This is also what happens when a project is open sourced for the wrong reasons, for instance, when a company hopes that getting the open source community involved will drastically cut development time for new features.
The second reason that a company might stop investing in a project has to do with “scope creep” and open source community dynamics. Sometimes a project grows so much that it becomes a monolith, too big and heavy to fit into its niche within your ecosystem. It becomes much bigger than what you were using it for, and it isn’t easy to extricate or use the specific components that you need. In these cases, you may no longer be able to accommodate it as part of your software stack. There are things you can do to help prevent this kind of issue from occurring in the first place. Make sure you have a clear definition of what your project is intended to do, and proactively engage with project community members about your concerns. Note that it may make sense for the project itself to mutate into that big thing. When that happens, celebrate it, even if it means your company no longer uses it.
The final reason is related to the pace of change and frequency of releases for an open source project. Imagine that you have a project that is core to your business and actively runs in your production environment. You’ve invested significantly in operating and improving this project. But what if your bug fixes and patches are not being incorporated fast enough by the community? What if your changes, which depend on your production environment, are not wanted by the community? It can take real effort to fight the natural encroachment of these discrepancies into a project that you use extensively in production. When the pain is too high, it is natural to fork off of the open source project. You’ll patch some updates back and forth for a while, but it will get harder and harder until you reach the point where you question why you are doing it and eventually stop.
Note that I described forking as a way of discontinuing investment in an open source project. Essentially, what you end up with is a new project that is developed primarily by your engineers. A good example of the practical considerations that can lead to the decision to fork a project is explained in this Quora post about Facebook’s introduction of Corona: differing project goals, significant investment in a system, and significant discrepancies with your internal production code can all lead to scenarios where forking a project makes sense. The need to fork a project can, at times, be avoided by working on extensions or enhancements to a project within a separate code base or companion project (similar to what we did at LinkedIn as we developed Burrow).
Turn Off the Light
As I mentioned earlier, a company may find itself having to stop investing in a project entirely, because there is no longer an internal use for that project. Being a good citizen in this situation, and especially if you were the main driver of the open source project, implies that you don’t just close the door, turn off the light, and throw away the key. In that situation, there are two general ways of concluding your support for a project: end-of-life or simply abandoning it.
End-of-life for an open source project often occurs because there is a better, suggested alternative available in the open source community. This is the best case for all involved, as you can present the community with suggested alternatives or a migration path. For example, one of our teams recently discontinued an open source project called Camus that was used as our pipeline to pull data from Kafka into HDFS. Rather than simply abandoning the project, we took the extra step of providing a migration path so that Camus users could easily adopt Gobblin, our new data ingestion framework. This shows that you understand the needs of the community and are taking the time to be that “good citizen” within it.
An open source project is often abandoned because it has not gained traction with the community at large and your organization has moved away from it. If it had gained traction in the community, you would have simply found a new maintainer to carry on the work. This situation can arise because a standalone open source project is not as useful to others as its creators thought; perhaps it should have been folded into an existing open source effort. The source code stays publicly available, but the project is dormant. That dormancy is visible to the community through the timeline graph of commits (or, even better, the project is relegated to an “attic” and marked as such).
For example, one of the first open source projects at LinkedIn consisted of search components that we built separately from our main search architecture, leveraging Lucene. Unfortunately, at the time, we didn’t align closely with Lucene or its existing developer community, so most of these extensions never gained traction within the Lucene community. Later, it made sense to develop these components directly within the core framework instead of having to constantly keep them alive with every major update of Lucene.
To ensure your organization is seen as a positive contributor to the open source community, you need to pay careful attention to the entire project lifecycle, including how the project ends. Successful open source projects may be purely time-constrained and require a significant investment of time to develop, monitor, and nurture.