Kubernetes Community: A Guide to Open Source Localization

English continues to be a huge barrier to entry in open source communities. Learn how Kubernetes handled localization into 14 languages.
May 9th, 2023 5:00am
One of the most consistent barriers to contributing to open source is the dominance of the English language. These highly distributed, remote teams rely on asynchronous communication for code, documentation, Slack channels, mailing lists and more, making it hard to even get started if you aren’t confident in your English.

Since open source contribution relies on a lot of unpaid labor, many continue to volunteer because they feel part of a community. Those with a higher level of English language confidence are proven to feel a greater sense of belonging and thus have a higher intrinsic motivation.

This means those who don’t read, write or speak English proficiently are cut off from the opportunities open source offers, from technical and project management experience to networking to work requirements. It also means open source communities reach far fewer users and contributors. A direct result is that open source contributors reside predominantly in the Global North, so those who could potentially benefit the most from reliable and free software are kept out of it.

And, with tech being the future, and open source making up about 70% of stacks, the predominance of the English language means whole countries are being locked out of participating in building our shared future.

Thankfully, some communities are starting to understand this risk to open source sustainability and are making an effort to translate their documentation. But again this is a time-consuming, largely voluntary effort.

The Cloud Native Computing Foundation’s largest project, Kubernetes, has successfully translated its core docs into 14 languages, with at least three more in the works. At the recent KubeCon+CloudNativeCon Europe, Divya Mohan and Natali Vlatko, two of the three co-chairs of the massive documentation special interest group, or SIG Docs, outlined the process of dismantling this inclusion hurdle, while of course encouraging others to contribute to localization.

What Is Localization vs. Translation?

Mohan and Vlatko, along with Rey Lejano, are in charge of setting up the procedural, administration and technical approvals that are required around the documentation, which includes the whole Kubernetes website, reference docs, the blog and localization.

“We talk about translation, but it’s really more than that,” Vlatko underscored. “Localization is the act of translating and maintaining Kubernetes documentation into your native language.”

There’s an emphasis on “native” here because, she continued, “We really do rely on contributors who know how to translate a term that may actually have many words that could be used, many phrases that could be used in a certain translation. And we want our docs, which are used by people all around the world to actually learn about and use Kubernetes. We actually need them to be as technically accurate and then language-wise accurate as possible.”

That makes this a global project requiring a community of native speakers who understand the technology.

“Localization is not just about translation. It’s about community. It’s about doing a lot of work. But then it’s also about helping users adopt and welcoming them into your native community as well,” Vlatko continued.

It All Starts with a Community

The first step in open source localization is finding your community. “We need folks who are not only going to work together but actually approve each other’s stuff,” Vlatko said. With this in mind, Kubernetes SIG Docs requires a minimum of two owners for a localization to launch, which guards from the outset against the fragility of open source projects that have a single maintainer.

Then, to further reduce loneliness and to increase support, SIG Docs has created the Localization subgroup, which runs across languages and writing systems. Each localization subproject is then able to organize itself as it sees fit, in a way that’s most welcoming in its culture.

“So each of these subprojects has a different way of functioning,” Mohan later told The New Stack. “In turn, this also cascades to the various translations within the localization subproject as well. Each translation has a team and contributors that have different processes and meetings.”

With all languages, including English, Vlatko noted that building the community is not only the first step but the most challenging. After all, like all things open source, it relies on unpaid volunteers.

The localization subproject meets at 3 p.m. UTC on the first Monday of each month. Notably, it follows a remote work best practice by scheduling in UTC, which is universal and doesn’t change with the seasons. The group also has asynchronous communication staples, including a mailing list and a Slack community. The SIG also has an open agenda policy to allow for a more open, questioning culture.
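As a rough illustration of the schedule above, the following Python sketch computes the first Monday of a month at 3 p.m. UTC and converts it to a contributor's local time. The function name `first_monday_meeting` and the fixed India Standard Time offset are assumptions made for this example; they are not part of any Kubernetes tooling.

```python
from datetime import datetime, timedelta, timezone


def first_monday_meeting(year: int, month: int) -> datetime:
    """Return 15:00 UTC on the first Monday of the given month."""
    first_day = datetime(year, month, 1, 15, 0, tzinfo=timezone.utc)
    # weekday() returns 0 for Monday, so this is the gap to the next Monday
    # (zero when the month already starts on a Monday).
    days_until_monday = (7 - first_day.weekday()) % 7
    return first_day + timedelta(days=days_until_monday)


# Assumption for illustration: a contributor in India (UTC+5:30, no DST).
# For zones with daylight saving time, zoneinfo.ZoneInfo is the safer choice.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")

meeting = first_monday_meeting(2023, 6)
print("UTC:  ", meeting.isoformat())
print("Local:", meeting.astimezone(IST).isoformat())
```

Because the meeting is pinned to UTC, only the local conversion shifts when a contributor's region changes its clocks; the UTC time itself stays the same all year.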

What Are the Requirements to Get Started with Localization?

Community is the first but not the only requirement. You also have to be an active Kubernetes organization member before you can start your own community. That means you already understand and are committed to the project, and are logistically able to review pull requests and take ownership of the work. That prior involvement could be a technical contribution or a contribution to another localization project, including the English documentation.

Then once these standards of interest, community, and existing involvement are met, you can launch your localization. First, find your ISO 639-1 two-letter language code, which is used for the repository branch creation and naming your Slack channel.
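To make the role of the language code concrete, here is a minimal Python sketch that looks up the ISO 639-1 code for a language and drops it into a naming template. The helper names and the `docs-{code}` template are hypothetical placeholders chosen for this example; the article does not spell out the project's actual branch or channel naming pattern, so check the SIG Docs localization guide for the real one.

```python
import re

# ISO 639-1 codes for some of the languages mentioned in this article.
LANGUAGE_CODES = {
    "French": "fr",
    "German": "de",
    "Hindi": "hi",
    "Italian": "it",
    "Japanese": "ja",
    "Korean": "ko",
    "Polish": "pl",
    "Russian": "ru",
    "Spanish": "es",
    "Ukrainian": "uk",
    "Vietnamese": "vi",
}


def looks_like_iso_639_1(code: str) -> bool:
    """Check only the shape (two lowercase ASCII letters), not registration."""
    return re.fullmatch(r"[a-z]{2}", code) is not None


def name_from_template(code: str, template: str) -> str:
    """Fill a caller-supplied naming template with a language code."""
    if not looks_like_iso_639_1(code):
        raise ValueError(f"not a two-letter language code: {code!r}")
    return template.format(code=code)


# "docs-{code}" is a hypothetical pattern used purely for illustration.
print(name_from_template(LANGUAGE_CODES["Hindi"], "docs-{code}"))
```

Validating the shape before building a name catches simple mistakes such as passing a three-letter ISO 639-2 code like `hin` where the two-letter `hi` is expected.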

Then, create a localization Slack channel so you are able to organize in your native language. The first thing to collaborate on and localize is the Kubernetes community code of conduct.

There are also other minimum required pieces of Kubernetes content for translation before the release of the project:

  • Home page
  • Setup
  • Tutorials, both Kubernetes Basics and Hello Minikube
  • All site strings
  • Releases

What Localizations Exist and How You Can Contribute Today

Currently, the Kubernetes documentation is available in 15 languages:

  • English
  • French
  • German
  • Hindi
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Mandarin
  • Polish
  • Brazilian Portuguese
  • Russian
  • Spanish
  • Ukrainian
  • Vietnamese

There are also existing subgroups in Bulgarian, Tamil, and Turkish, which hope to release localized docs and websites in their languages too.

So as not to treat English as the default language, Mohan pointed out that English is also a localization. “If you are already familiar with Kubernetes or if you’re even just getting started, we really appreciate your points of view on how we could make the documentation better, whether that’s clarifying how a particular concept is explained or making a minor typo edit, it’s highly appreciated,” she said.

To get started contributing, you must sign the Kubernetes contributor license agreement. Then, you can join a community, best kicked off via the respective Slack channels, for both existing and upcoming localizations.

Then join both the SIG Docs and SIG Docs localization mailing lists. And attend those monthly meetings.

“Those are really good avenues to clarify your doubts. Because it’s a group of folks who are already working on the same stuff,” Mohan remarked.

This is a massive project, she commented, so start by posting on the SIG Docs Localizations Slack channel. Just don’t ask your questions in private, she recommended, as you run the risk of inundating the localization leads while also denying everyone else the opportunity to respond. Plus, others may share the same doubts.

Beyond the mailing list, SIG Docs holds a bi-weekly Tuesday meeting, as well as an APAC meeting on the fourth Wednesday of every month.

Each localization project has about 25 to 50 contributors, Mohan estimates, but some languages manage to function with fewer.

“Most projects I know need help,” she said. If you’re interested in Kubernetes localization for your native language, or benefit from Kubernetes in some way, you are encouraged to volunteer time and give back.

Hindi: A Kubernetes Localization Case Study

“Finding your community. Finding your tribe to build a localization is one of the most challenging aspects,” Mohan said, having learned it the hard way kicking off Hindi, the newest completed localization. This is the first localization in the Devanagari script, which is used for Sanskrit, Hindi and other Indian languages. The Hindi localization team now has six leaders, including two who have been regular contributors over the past year.

The Hindi effort kicked off in late 2021 and launched at the end of August 2022. There are currently 245 Hindi speakers active in the respective Slack channel.

Localization into Hindi means opening up the Kubernetes docs to speakers of a language used by more than half a billion people.

“They’re still ongoing because docs are never done. They update every release cycle, and tracking those changes is a lot of manual effort currently. The people leading the localization are required to actively track the docs that change per cycle and put out issues for them, ask contributors to come and chip in, and this is an ongoing effort that doesn’t stop at the point the localization goes live,” commented Mohan.

TNS owner Insight Partners is an investor in: The New Stack, Pragma.