
For data to be useful in a modern enterprise, it must be collected and centralized from various sources, processed across a growing ecosystem of tools, and fed to systems across an organization in a way that’s consumable across teams. This data orchestration, weaving business logic through the data stack for everything from dashboards to personalization algorithms, requires hundreds, if not thousands, of data pipelines.

Data orchestration is needed across all industries, in organizations of all sizes. With more than 2,200 contributors and over 12M monthly downloads, Apache Airflow has emerged as the open source standard for programmatically authoring, scheduling, and monitoring data pipelines. Data practitioners love Airflow because of its community, its flexibility, and its ability to provide a central view of a data ecosystem. However, data teams naturally need more than open source Airflow on its own: they need test pipelines to ensure data quality, SDKs to make data practitioners productive, and observability plus lineage for the underlying data, even as they strive to minimize operational overhead.

Data lineage provides the full context of the data by capturing in greater detail the relationships between data sources, where the data originated, and how it gets transformed and converged through the data lifecycle.

Pete DeJoy, product manager; Viraj Parekh, field CTO; Paola Peraza Calderon, product

With more than 350 employees and a globally distributed team, both Astronomer and its customer base have grown quickly. “It started with people running open source Airflow and asking us for help with managing the infrastructure behind that,” Pete says. “Now that we’ve solved infrastructure management, we’re focused on the broader set of capabilities needed to take Airflow and use it as the foundation for a complete orchestration platform.”

Building and scaling on AWS

The market’s need for Astronomer products, as well as the company’s potential for success, was evident early on. Viraj laughs as he shares a story about their early days. “We were all hands on deck for a proof-of-concept with a large gaming company. The company relied on Astronomer to orchestrate the flow of data for its biggest launch of the year. The morning after the launch, there were no support tickets,” says Viraj. “And I thought, ‘Oh no, did something go wrong?’ Turns out, something went right. We were handling 100% of the data ingest that was coming from one of this company’s biggest launches, and everything ran without a hitch.”

Why did Astronomer build its startup on AWS? “I can’t say it was a decision. It was the obvious choice. AWS has been the cornerstone of our cloud strategy,” says Paola.
