Hudi Target

The Hudi Target Connector integrates Popsink with Apache Hudi (Hadoop Upserts Deletes and Incrementals), an open-source data management framework for incrementally processing and managing large datasets. The connector streams data from Popsink directly into Hudi datasets and supports both batch and near real-time ingestion. It handles upserts, deletes, and incremental processing, and it manages schema evolution and data consistency automatically.

The connector is particularly useful for organizations building large-scale data lakes or a lakehouse architecture, because Hudi combines the flexibility of a data lake with database-like ACID transactions and incremental processing. Data teams can use it to implement change data capture (CDC), run time travel queries, and retrieve data efficiently through snapshot and incremental queries.

It suits use cases that require strict data quality and consistency, such as regulatory compliance, audit trails, and scalable real-time data pipelines that retain historical versions of the data.
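To make the terms upsert, delete, snapshot query, time travel, and incremental query concrete, here is a minimal in-memory sketch in plain Python. It is only a toy model of the semantics: it does not use the Popsink or Hudi APIs, and it does not reflect how Hudi stores data. The names `ChangeEvent`, `ToyTable`, `commit`, `snapshot`, and `incremental` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """A hypothetical CDC event: an upsert or a delete for one record key."""
    op: str                                # "upsert" or "delete"
    key: str                               # record key
    value: Optional[Dict[str, Any]] = None  # row payload (unused for deletes)


@dataclass
class ToyTable:
    """Toy table that keeps every committed batch so past states can be rebuilt."""
    commits: List[List[ChangeEvent]] = field(default_factory=list)

    def commit(self, events: List[ChangeEvent]) -> int:
        """Append a batch atomically and return its commit number (1-based)."""
        self.commits.append(list(events))
        return len(self.commits)

    def snapshot(self, as_of: Optional[int] = None) -> Dict[str, Dict[str, Any]]:
        """Latest state, or the state as of a given commit (time travel)."""
        upto = len(self.commits) if as_of is None else as_of
        state: Dict[str, Dict[str, Any]] = {}
        for batch in self.commits[:upto]:
            for ev in batch:
                if ev.op == "delete":
                    state.pop(ev.key, None)   # delete: drop the key if present
                else:
                    state[ev.key] = ev.value  # upsert: insert or overwrite
        return state

    def incremental(self, since: int) -> List[ChangeEvent]:
        """Only the changes committed after commit number `since`."""
        return [ev for batch in self.commits[since:] for ev in batch]


table = ToyTable()
c1 = table.commit([
    ChangeEvent("upsert", "u1", {"name": "Ada"}),
    ChangeEvent("upsert", "u2", {"name": "Bob"}),
])
c2 = table.commit([
    ChangeEvent("upsert", "u1", {"name": "Ada L."}),  # updates the existing key
    ChangeEvent("delete", "u2"),                       # removes the key
])

latest = table.snapshot()             # snapshot query
as_of_c1 = table.snapshot(as_of=c1)   # time travel query
changes = table.incremental(since=c1)  # incremental query
```

In this sketch, `latest` contains only the updated `u1` row, `as_of_c1` still contains both original rows, and `changes` holds just the two events from the second commit. A downstream consumer would read `changes` instead of rescanning the whole table, which is the idea behind incremental retrieval.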