Confluent created a way to analyze real-time data. Now it’s geared for an IPO

There is a wave of data-first companies that recently went public, including the notable Snowflake and Databricks. I wrote short essays on both of them. Both excel at storing data in cloud warehouses and let analysts and machine learning engineers build data-first products for their enterprises, but they lack a way to connect to and process data from all the real-time systems, and that is the gap Confluent fills.

As you know, data is being generated at an exponential pace. For example, when you scroll on LinkedIn, every like, every comment, and even the time you spend on a post is data. That doesn’t even include the other messages exchanged between your Chrome browser and the LinkedIn application. In 2019, 7 trillion such messages were captured per day.


A few more cases where data is exploding:

Snowflake provides different connectors to get data from various sources into the cloud. But how do you connect all the data sources (apps) to all the receivers (databases)? Kafka, an open-source system, set out to solve this problem by acting as an interface between them.
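The idea behind that interface can be sketched in a few lines. Kafka’s core abstraction is an append-only log per topic: producers append messages, and each consumer keeps its own read offset, so many apps can write and many receivers can read independently. The toy Python below illustrates that idea only; the class names and methods are my own, not Kafka’s actual API.

```python
# Toy sketch of Kafka's core abstraction (illustrative, NOT the real Kafka API):
# a topic is an append-only log; each consumer tracks its own read offset.

class Topic:
    def __init__(self, name):
        self.name = name
        self.log = []              # append-only list of messages

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1   # offset of the newly appended message

class Consumer:
    def __init__(self, topic):
        self.topic = topic
        self.offset = 0            # position of the next unread message

    def poll(self):
        """Return every message published since the last poll."""
        new = self.topic.log[self.offset:]
        self.offset = len(self.topic.log)
        return new

# Many sources write to one topic; many receivers read it independently.
clicks = Topic("page-clicks")
analytics = Consumer(clicks)
warehouse = Consumer(clicks)

clicks.append({"user": "alice", "action": "like"})
clicks.append({"user": "bob", "action": "comment"})

print(analytics.poll())  # each consumer sees the full stream
print(warehouse.poll())  # at its own pace, without interfering
```

The point of the decoupling is that adding a new receiver (say, a fraud detector) means creating one more consumer, not rewiring every app-to-database connection.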

Realizing the need to build data-intensive applications, LinkedIn open-sourced a project called Apache Kafka in 2011. Of course, you might have guessed the trend by now.


The founders added community and commercial features on top of the open-source version and created a company called Confluent, valued at $4.5B as of 2020. Confluent’s success is rooted in the following: