AI and Open Source: Expanding Apache Airflow’s Global Impact Through Collaboration

By Jarek Potiuk, Shahar Epstein, Elad Kalif, Brent Bovenzi, Pierre Jeambrun

AI and Open Source? Does it even work?

In the fast-evolving landscape of technology, the conversation around Artificial Intelligence often centers on its potential for automation and efficiency gains. But what about AI’s role within the collaborative world of open source? Can AI truly integrate with the community-driven spirit that defines projects like Apache Airflow? We believe the answer is a resounding “yes,” and our recent experience translating the Airflow UI into multiple languages stands as a testament. This isn’t a story of AI replacing human effort, but rather of AI empowering and amplifying it, all while strengthening the very fabric of our open source community.

The Internationalization Challenge: Connecting a Global Community

Apache Airflow is an open source platform for programmatically authoring, scheduling, and monitoring workflows, used by individuals and organizations across the world. Since the project was established more than 10 years ago, our user interface (UI) has been available only in English. This presented a significant barrier for many users who prefer to interact with software in their native language.

We faced a classic internationalization (i18n) challenge: how do we translate over 560 phrases in the Airflow UI into multiple languages when, by our working assumption, only a few active contributors are proficient in any given language and able to maintain it in the long term? The sheer volume of text, coupled with the nuances of different languages, Airflow-specific context, and the collaborative nature of open source, made this a daunting task. Every new feature adds phrases that need translating into each supported language, so all existing translations have to be kept up to date as the UI evolves.

We needed a solution that would not only deliver accurate translations but also foster engagement and build a sustainable process for future updates.

This is what our UI looks like now in different languages:

Navigating the Path Forward: Balancing Efficiency and Open Source Spirit

When considering solutions, several options came to mind. Fully automated machine translation was an obvious contender, promising rapid results. However, simply plugging our UI into a translation API felt antithetical to the open source ethos. We risked losing the human touch, the cultural context, and the invaluable feedback loop that direct contributor involvement provides. Moreover, we still needed proficient humans in the loop to ensure that translations are clear for native speakers, accurately reflect the original intent, and fit within the UI constraints, since AI currently cannot reliably assess whether a translation is too long or too short for the interface.

How could we leverage the power of AI while still nurturing the community, encouraging collaboration, and maintaining the high quality that comes from human expertise? Our goal wasn’t just to get translations done, but to do so in a way that empowered our contributors and strengthened the project. We wanted to build relationships, not just translate strings.

The Collaborative AI-Powered Process and Tooling

Our solution centered on a “people-first, AI-assisted” approach. Instead of a top-down, AI-only translation, we built a process that empowered individual contributors, leveraging AI as a supportive tool rather than a wholesale replacement. 

Here’s how it worked:

  • Contributor Ownership: We called upon our community to “adopt” a language. Each volunteer became the primary owner and reviewer for their chosen language, fostering a sense of responsibility and dedication. Moreover, we asked for more than one person per language to step up, and in many cases we got two or three people proficient in their native language who could collaborate and agree on the right translation.
  • Purpose-Built Tooling: We developed custom tooling that made it easier to apply everyday tools, such as LLM-based coding assistants, to the translation work.
  • Iterative Review and Refinement: Once a contributor had worked through the AI suggestions, their translations were then subject to peer review within the community. This human-to-human interaction was vital for catching nuances, ensuring cultural appropriateness, and maintaining consistency.
  • Simplified Contribution Flow: We streamlined the process for submitting translations, making it as easy as possible for contributors to get involved, even those new to open source contributions.

This approach allowed us to rapidly generate a first pass of translations while keeping human intelligence and collaboration at the forefront.

People and Collaboration First: The Heart of Sustainable Translation

The success of this initiative wasn’t just about the technology; it was fundamentally about the people. By putting collaboration and individual ownership at the core, we achieved something remarkable.

Here’s why this people-centric approach, augmented by AI, was so effective:

  • Empowering Contributors: Instead of feeling like they were starting from scratch, our contributors were able to leverage AI to kickstart their work, making the task feel less daunting and more achievable. This increased efficiency significantly, especially for the “boring” and repetitive parts of the translation process.
  • Building Stronger Bonds: The shared goal of translating Airflow fostered direct interaction and collaboration among contributors. We saw lively discussions, helpful suggestions, and a true sense of camaraderie develop as people worked together on their languages and even helped each other across languages. For instance, a contributor working on Spanish might offer insights that help a Portuguese translator, or vice-versa, due to linguistic similarities.
  • Quality Through Collaboration: While the AI provided a great starting point, the human element was crucial for quality. There were numerous instances where AI-generated translations, while grammatically correct, missed the specific context or tone required for the Airflow UI. A collaborative review process ensured that these nuances were captured. For example, a seemingly straightforward term might have a specific technical meaning in the Airflow context that only a human translator familiar with the project could accurately convey. In another instance, cultural sensitivity might dictate a different phrasing entirely.
  • Sustainable Engagement: By making contributors responsible for their languages, we’ve established a sustainable model for ongoing translation efforts. As Airflow evolves and new UI elements are introduced, these dedicated language owners will be crucial for maintaining up-to-date translations. This distributed ownership prevents bottlenecks and ensures the long-term viability of the internationalization effort. 
  • Human-driven, AI-friendly tooling: The tooling is designed to make it as easy as possible to see when new translations are needed and to generate incremental updates as “TODO:” entries. These entries act as instructions for LLM agents, which apply automated translations for the driving human to review and accept. With the tooling in place and most translations already done, we can be more confident that incremental AI-assisted translations are good, because we can ask AI agents to follow the translations that already exist. A single person who adds new English phrases can then open PRs for the language owners to review.
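
To make the “TODO:” idea from the last point more concrete, here is a minimal sketch in Python of how such placeholders could be generated. It is not the actual Airflow tooling: the nested-dictionary layout, the marker text in TODO_PREFIX, and the function name add_todo_entries are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# Hypothetical marker format; the real tooling may word its TODO entries differently.
TODO_PREFIX = "TODO: translate: "


def add_todo_entries(
    english: dict[str, Any], translated: dict[str, Any], path: str = ""
) -> tuple[dict[str, Any], list[str]]:
    """Return a copy of `translated` that has a TODO placeholder for every
    English phrase without a finished translation, plus the dotted keys of
    those phrases."""
    merged: dict[str, Any] = {}
    missing: list[str] = []
    for key, source in english.items():
        dotted = f"{path}.{key}" if path else key
        existing = translated.get(key)
        if isinstance(source, dict):
            # Recurse into nested groups of phrases.
            child, child_missing = add_todo_entries(
                source, existing if isinstance(existing, dict) else {}, dotted
            )
            merged[key] = child
            missing.extend(child_missing)
        elif isinstance(existing, str) and existing and not existing.startswith(TODO_PREFIX):
            # A finished translation already exists: keep it untouched.
            merged[key] = existing
        else:
            # Missing, empty, or still a placeholder: (re)write the TODO entry.
            merged[key] = TODO_PREFIX + source
            missing.append(dotted)
    return merged, missing


# Example data (made up for this sketch).
english = {"dag": {"run": "Run", "pause": "Pause"}, "save": "Save"}
polish = {"dag": {"run": "Uruchom"}}

merged, missing = add_todo_entries(english, polish)
print(missing)  # ['dag.pause', 'save']
```

In a flow like this, the placeholders would be committed to the language file, an LLM agent would be asked to replace them while following the translations already present, and the language owner would review the result. Running the function again on its own output is harmless, because unfinished placeholders are simply reported again.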

Interestingly, the tooling itself, like the first draft of this article, was vibe-coded with AI as well. We simply described what we wanted the tool to do, and AI generated the tool scripts in Python for us. Similarly, we described to AI what the article was about, and it wrote the first version. We iterated heavily on both, using manual review as well as AI, and used the time we saved to think about and discuss how we wanted to approach both the process and the article.

In a matter of weeks, we successfully translated over 560 UI phrases into eight languages. This rapid progress was a direct result of combining smart tooling with an empowered, collaborative community.

Here is an example output of our tooling that shows the status of translation in Polish and Taiwan-Chinese, where the translations are now complete, reviewed, and agreed upon as “good” by the “owners” of the translations.

Responsible AI in Open Source: A Blueprint for the Future

Our journey with Apache Airflow’s UI translation offers a compelling answer to the question: “How can we leverage AI while preserving the open source spirit and collaboration?” Instead of a “top-down” AI approach that might automate everything, we opted for a model where AI enabled open source maintainers and collaborators to increase their efficiency, while fiercely protecting and fostering the core principles of open source collaboration.

This experience highlights a critical distinction: AI in open source shouldn’t just be about efficiency gains, as is often the focus in corporate AI adoption. While efficiency is a welcome byproduct, the true power of AI in open source lies in its ability to improve collaboration, build relationships, and empower individuals.

The Apache Software Foundation, with its guiding principle of “community over code,” provides the perfect environment for this responsible and human-centric application of AI. It’s a place where newcomers can genuinely interact with and learn from experienced contributors. That opportunity is valuable and is becoming increasingly scarce in a corporate world where entry-level roles and hands-on experience are harder to come by, as junior positions are often replaced by AI.

Our Airflow translation story is a vivid example of how AI, when used thoughtfully and with a commitment to community, can not only accelerate progress but also strengthen the bonds that make open source truly special. Beyond Airflow, this example is applicable to many other open source projects, and demonstrates that the future of open source, augmented by AI, is not one of automation replacing people, but of AI empowering people to build, collaborate, and innovate together.

We strongly believe that this approach, along with its patterns and its philosophy of embedding Gen-AI workflows in the day-to-day work of an open-source project, is widely applicable, not only to many open-source projects but also to a wide range of tasks. Many other tasks, such as refactoring or applying new patterns to the code, can benefit from a similar approach, which also strengthens collaboration, builds relationships, and empowers individuals.

This people-first, AI-assisted model for internationalization offers a valuable blueprint for other open source projects facing similar challenges. By prioritizing human collaboration and ownership, and strategically applying AI to accelerate repetitive tasks, projects can achieve widespread translation without sacrificing community engagement or quality. This approach emphasizes building purpose-built tooling that supports human review and iteration, fostering a sense of shared responsibility among contributors, and streamlining the contribution flow to make it accessible for newcomers. Ultimately, the success lies in viewing AI not as a replacement for human effort, but as a powerful enabler that amplifies the impact of a dedicated and collaborative community.
