Meet Apache InLong, a new Top-Level Project

Apache InLong is now a Top-Level Project (TLP). This means that the project has graduated from the Apache Incubator. Created in 2002, the incubator helps projects entering the Apache Software Foundation (called ‘podlings’) adopt the Apache style of governance and operation, and guides them to the ASF services available to Apache projects. Podlings also benefit from designated mentors who help them navigate the ASF teams and facilitate their growth and operation.

We spoke to Charles Zhang, vice president of Apache InLong, to learn more about the project, its technology, community goals, and plans.

Congratulations on graduating to Top-Level Project status! Briefly, how would you describe your project so that it would be easily understood by the average user?

Apache InLong is a one-stop integration project for massive data. It can help you easily handle scenarios such as data collection, data synchronization, and centralizing data from multiple sources using ETL (Extract, Transform, Load).

Apache InLong is not a tool but a complete solution. 

Imagine that your company has over 100,000 employees, and each employee generates daily data, such as spreadsheets and documents. If we want to collect this data together for statistics and analysis, how should we do it? InLong is designed to help you solve this problem. InLong helps you collect scattered data together, no matter what type it is, no matter how large the amount of data. 

Let’s expand on that description some more. Who is your audience, and what key features of the technology do you believe will excite people?

We would say that all companies that want to drive business growth through data are InLong’s audience. InLong can provide them with a steady stream of data.

InLong has many remarkable and impressive technical features. First, it has a plug-in design, which is reflected in the growing number of data connectors and in the addition of new message queues, monitoring platforms, and dashboard modules.

Second, InLong has an architecture built for massive data processing, which means it scales up and down rapidly, delivers high data processing performance, and offers high system stability. In addition, the one-stop design of InLong provides all the capabilities related to data integration. This includes audit reconciliation, data monitoring, metadata management, authentication, and more, so you can solve these problems in one place.

Is there an origin story behind the project? Tell us how you got to this point in the project’s history.

Contributors familiar with InLong should know that the project was renamed from TubeMQ to InLong during incubation. We also changed the project’s direction from a message queue service to providing data integration solutions around message queues. 

There are two main reasons why we made such a significant change. First, we found that many message queue services already existed in the community. Compared with popular projects such as Apache Kafka and Apache Pulsar, the original TubeMQ had no particularly obvious advantages and received little attention. Therefore, we considered making some changes.

Second, from community user feedback, it was clear that users wanted a complete ecosystem around TubeMQ that would help them adopt it quickly. Tencent had implemented a complete data synchronization solution based on TubeMQ, so we used that as a prototype, renamed the project to InLong, and released it as open source software.

What has been your experience growing the community?

For incubating projects, growing the community is very important, and we have accumulated a lot of experience in this area.

First, we believe a project needs to be friendly to new users by providing complete documentation and guidelines, answering questions patiently, and even helping users solve problems remotely.

Second, we learned that the project needed strict specifications and tools, as these help developers learn and get involved quickly, which is vital for the project’s long-term development. In addition, we recognized that for a project to be successful, it needs to be visible and to communicate with its potential audience as much as possible, wherever they are likely to be. To achieve this during the incubation period of InLong, we participated in many technical summits and community activities, and wrote technical blog posts.

Using these outreach methods enabled us to promote the project to more people and create more face-to-face opportunities with potential developers and users.

What are the next steps? What does the future hold for the project?

Let us share some exciting news first. We formulated this year’s roadmap at the beginning of the year, and we have already completed all of its tasks in less than half a year, which shows that our community is very active.

The next goal of the InLong community is to cover more use cases, attract more users, and further expand and strengthen the community. The roadmap for the second half of 2022 has also been formulated, and interested developers are welcome to check it directly on GitHub.

What’s the best way to learn about the project and try it out?

For InLong users and developers, the best way to learn InLong is to understand what it does and then experience it through a quick start. The InLong community provides complete and rich documentation. Beginners can get the required information step by step from the documentation on the official website and work through all the example experiments. Of course, if you encounter problems, you can contact the community through email, Slack, the WeChat group, and other channels.

Head to the official Apache InLong website to learn more.

The first technical meetup for Apache InLong during incubation.