Real-Time Data Pipeline to Search

This pattern combines DataWorks, EventBridge, and OpenSearch so that batch data and real-time events land in shared OSS/MaxCompute storage and can be searched and analyzed together.

Products involved

DataWorks, EventBridge, and OpenSearch, with OSS and MaxCompute as the shared storage layer.

Scenario

The pipeline has three parts. DataWorks orchestrates batch ETL jobs that land transformed data in OSS or MaxCompute. EventBridge streams real-time events into the same OSS landing zone. OpenSearch is then configured to ingest from those OSS and MaxCompute sources, which gives unified search and analytics over both batch and streaming data.
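For batch output and streamed events to share one OSS landing zone, both writers need to agree on an object-key convention. The sketch below shows one possible convention in plain Python. The prefix `landing`, the `dt=` date partition, the `LandingObject` class, and the `ingest_prefix` helper are all assumptions made for illustration; none of them come from DataWorks, EventBridge, or OpenSearch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical root prefix of the shared landing zone (illustrative only).
LANDING_PREFIX = "landing"


@dataclass(frozen=True)
class LandingObject:
    dataset: str          # logical dataset name, e.g. "orders"
    origin: str           # "batch" (ETL job output) or "stream" (event)
    event_time: datetime  # when the data was produced
    object_id: str        # job run ID or event ID

    def key(self) -> str:
        """Build an object key partitioned by dataset and UTC date."""
        if self.origin not in ("batch", "stream"):
            raise ValueError(f"unknown origin: {self.origin!r}")
        ts = self.event_time.astimezone(timezone.utc)
        return (
            f"{LANDING_PREFIX}/{self.dataset}/dt={ts:%Y-%m-%d}/"
            f"{self.origin}-{self.object_id}.json"
        )


def ingest_prefix(dataset: str) -> str:
    """Prefix that a search ingestion source would be pointed at."""
    return f"{LANDING_PREFIX}/{dataset}/"
```

Because batch and stream objects differ only in the file-name part of the key, a single ingestion source pointed at `ingest_prefix("orders")` would see both kinds of data under this assumed layout.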

How the products combine

  1. dataworks · dataworks-onboard - DataWorks: Module Selection Routing
     See dataworks/dataworks-onboard.

  2. eb · eb-configure-streaming - EventBridge: Configure real-time event streaming
     See eb/eb-configure-streaming.

  3. opensearch · opensearch-manage-sources - OpenSearch: Manage data sources for ingestion
     See opensearch/opensearch-manage-sources.
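The three steps above can be read as a small dataflow: two writers and one reader over shared storage. The following sketch models that wiring in plain Python and checks that every store written is also ingested. The product and step identifiers are taken from the list above; the `Stage` class, the store labels `"oss"` and `"maxcompute"`, and both helper functions are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    product: str
    step: str
    writes: set = field(default_factory=set)  # stores this stage lands data in
    reads: set = field(default_factory=set)   # stores this stage ingests from


PIPELINE = [
    Stage("dataworks", "dataworks-onboard", writes={"oss", "maxcompute"}),
    Stage("eb", "eb-configure-streaming", writes={"oss"}),
    Stage("opensearch", "opensearch-manage-sources", reads={"oss", "maxcompute"}),
]


def unconsumed_stores(stages):
    """Stores that some stage writes to but no stage reads from."""
    written = set().union(*(s.writes for s in stages))
    read = set().union(*(s.reads for s in stages))
    return written - read


def unfed_stores(stages):
    """Stores that some stage reads from but no stage writes to."""
    written = set().union(*(s.writes for s in stages))
    read = set().union(*(s.reads for s in stages))
    return read - written
```

An empty result from both helpers means the batch and streaming paths both end in storage that the search stage ingests, which is the property this combination depends on.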

FAQ

Q: How do I build a real-time data pipeline that streams events to OSS and searches them with OpenSearch?

A: Use DataWorks to orchestrate batch ETL jobs into OSS or MaxCompute, configure EventBridge to stream real-time events to the same OSS location, and configure OpenSearch to ingest from those sources. This combination draws on DataWorks onboarding, EventBridge streaming configuration, and OpenSearch source management, so a single search and analytics layer covers both batch and streaming data.