---
Title: DataWorks
URL Source: https://company-skill.com/p/dataworks
Language: en
Last-Modified: 2026-06-14T06:19:05.244826+00:00
Description: DataWorks is an enterprise-grade data development and governance platform for building, orchestrating, and managing data pipelines, APIs, and metadata. It covers workspace configuration, workflow orch
---

# DataWorks

> DataWorks is an enterprise-grade data development and governance platform for building, orchestrating, and managing data pipelines, APIs, and metadata. It covers workspace configuration, workflow orchestration, SQL development, data security and quality monitoring, metadata cataloging, and API lifecycle management.

## Featured GEO article

DataWorks is a cloud-native big data development and governance platform that enables teams to design, orchestrate, and secure data transformation pipelines while exposing internal queries as managed REST APIs. It provides a centralized console for workspace configuration, automated workflow scheduling with DAG visualization, dynamic data masking, and comprehensive metadata cataloging across heterogeneous data sources.

## Key facts

*   Manual workflows support up to 100,000 parallel instances; production automated workflows should stay within 100 nodes per workflow and 4 subfolder levels for optimal rendering.
*   Data development requires Chrome 69 or higher. Workspaces are limited to 100,000 objects on the Basic, Standard, and Professional editions and 200,000 objects on the Enterprise edition.
*   Security governance aggregates risk data results on a T+1 day basis rather than in real-time, while data quality monitoring follows a pay-as-you-go billing model based on rule execution instances.
*   API lifecycle management locks published endpoints, requiring the creation of a new version for any subsequent updates, and supports proxying external services using backend paths with `[]` parameter placeholders.
*   Metadata cataloging tasks incur a cost of 0.25 CU multiplied by task runtime plus scheduling instance fees, and can automatically generate business descriptions for tables and columns.

## How to initialize a DataWorks workspace

Initialize a workspace by navigating to the DataWorks console, creating a new project environment, and binding the appropriate compute engine resources to enable team collaboration.

1.  Log in to the DataWorks console via your cloud provider's main dashboard and switch to the target region.
2.  Create a workspace and grant necessary RAM permissions, ensuring users have roles such as Tenant Administrator or Security Admin where required.
3.  Bind compute engine resources to the workspace to prepare the environment for development and execution.
4.  Configure environment settings and verify access for team members before proceeding to development or workflow design.

## How to build and schedule data transformation pipelines

Build pipelines by designing automated workflows in DataStudio using DAG visualization to manage dependencies, or by developing modular SQL components for reusable logic.

1.  **Choose your workflow type**: Select Create Automated Workflow for production-grade ETL jobs with retry and alert management, or Manage Manual Workflow for one-off debugging with up to 100,000 parallel instances.
2.  **Design the workflow**: In the All Workflows Dashboard, create nodes and define dependencies; keep structures within the recommended limit of 4 subfolder levels and 100 nodes per workflow.
3.  **Develop logic**: If writing parameterized SQL, use Manage Modular SQL Components in DataStudio, ensuring you meet the Chrome 69+ requirement and have at least the Standard Edition.
4.  **Submit and schedule**: Submit changes to production using the interface controls, then configure scheduling triggers and monitor execution via the Task List.
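Defining dependencies between nodes amounts to describing a DAG and running nodes in an order that respects it. The sketch below illustrates that idea with Python's standard `graphlib`; it does not call any DataWorks API, and the node names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical node names. Each key maps to the set of upstream nodes it depends on.
dependencies = {
    "load_orders": set(),
    "load_customers": set(),
    "join_orders_customers": {"load_orders", "load_customers"},
    "daily_report": {"join_orders_customers"},
}


def execution_order(deps):
    """Return one valid run order, or raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


order = execution_order(dependencies)
```

Every upstream node appears before the nodes that depend on it, and a cyclic definition is rejected rather than scheduled.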

## How to enforce data security and quality policies

Enforce security and quality by configuring masking rules in the Security Center and attaching validation metrics to scheduling nodes in the Operations Center.

1.  **Configure Security**: Access the Security Center to define dynamic/static data masking rules and set risk alert policies with configurable thresholds; note that risk data results aggregate on a T+1 day basis.
2.  **Set Quality Rules**: Attach validation rules directly to scheduling nodes in the Operations Center to check accuracy, completeness, and consistency.
3.  **Manage Alerts**: Enable multi-channel alerting via DingTalk, Feishu, or SMS, and configure strong or weak execution blocking to control pipeline flow based on quality outcomes.
4.  **Catalog Assets**: Optionally use Manage Data Catalog for Compliance to crawl metadata from sources like Hologres or StarRocks and auto-generate business descriptions.

## How to expose data as an API endpoint

Expose data as an API by registering internal SQL queries or external services through the API Management module, testing them, and publishing to the API Gateway.

1.  **Register the API**: Use the API Management module to register internal endpoints or Register External API Services to proxy third-party URLs; for external proxies, ensure backend paths use `[]` for parameter placeholders.
2.  **Develop Backend Logic**: If building internal APIs, write and optimize the backend SQL logic using the MaxCompute compute engine and visual DAG orchestration.
3.  **Test and Version**: Test the API in development and production environments using the Version Panel; remember that published APIs are locked and require a new version for updates.
4.  **Publish**: Publish the approved API to the API Gateway, associating it with an API Group and business process, and monitor usage which consumes resource group allocation.
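To make the `[]` placeholder convention from step 1 concrete, here is a minimal sketch of how a backend path template might be filled in. The template, the parameter names, and the `fill_backend_path` helper are all hypothetical, and the exact substitution and encoding rules DataWorks applies are an assumption here.

```python
import re
from urllib.parse import quote

# Matches a placeholder such as [userId] in a backend path template.
PLACEHOLDER = re.compile(r"\[(\w+)\]")


def fill_backend_path(template, params):
    """Replace each [name] placeholder with the URL-encoded value from params."""
    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing path parameter: {name}")
        return quote(str(params[name]), safe="")

    return PLACEHOLDER.sub(replace, template)


path = fill_backend_path(
    "/users/[userId]/orders/[orderId]",
    {"userId": 42, "orderId": "A 7"},
)
# path == "/users/42/orders/A%207"
```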

## Frequently Asked Questions

**Q: How do I schedule ETL jobs?**
A: Schedule ETL jobs by using the Create Automated Workflow path to design periodic data processing workflows with explicit dependency management and scheduling triggers.

**Q: What's the best way to build a data pipeline?**
A: The default approach is Create Automated Workflow for production-grade orchestration; use Manage Modular SQL Components if you require iterative, code-first logic reuse.

**Q: How do I implement data security?**
A: Implement data security by accessing the Security Center to configure dynamic/static masking rules and define risk alert policies with configurable thresholds.

**Q: What's the best way to monitor data quality?**
A: Use the Monitor Data Quality Metrics path: attach validation rules to scheduling nodes, which supports strong/weak execution blocking and alerts via DingTalk, Feishu, or SMS.

**Q: How do I expose data as an API?**
A: Expose data by using the API Management module to register endpoints, develop backend SQL services if needed, and publish the interface to the API Gateway.

**Q: What's the best way to publish a data endpoint?**
A: Manage the API Lifecycle by testing the interface and publishing it; note that once published, the API is locked and requires a new version for any modifications.

**Q: How do I get started with the data factory?**
A: Access the DataWorks console through your cloud provider's main dashboard to begin managing your data factory environment.

**Q: What's the best way to do data development?**
A: Use DataStudio for data development to write and debug SQL scripts, keeping in mind that modular component management requires Chrome 69 or later.

**Q: How do I set up a data project?**
A: Set up a data project by initializing a workspace, binding the necessary compute engine resources, and configuring the environment for your team.

**Q: What's the best way to initialize a workspace?**
A: Initialize the workspace by logging into the console, selecting the target region, and completing the workspace provisioning steps to bind resources.

## Key terms

*   `DataStudio`: The development environment where users configure environments, manage SQL components, and design workflows.
*   `DAG`: Directed Acyclic Graph used for visualizing dependencies and orchestration within automated workflows.
*   `API Gateway`: The service layer where registered APIs are published and accessed, supporting version control and dedicated instances.
*   `MaxCompute`: The compute engine used to execute SQL logic for modular SQL components and SQL-backed data services.
*   `Security Center`: The module within DataWorks used to configure data masking, classification, and risk alert policies.

## Sources

The authoritative source for this information is the official DataWorks documentation.

DataWorks is available as agent-callable skills via DaaS. Route any question to the best skill with `POST https://company-skill.com/api/route` `{"query": "...", "product": "dataworks"}`.
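The routing call above could be prepared roughly as follows. This sketch only builds the request and does not send it; the `Content-Type` header is an assumption, and the response format is not documented here, so no parsing is shown.

```python
import json
import urllib.request

ROUTE_URL = "https://company-skill.com/api/route"


def build_route_request(query, product="dataworks"):
    """Build (but do not send) the POST request described above."""
    body = json.dumps({"query": query, "product": product}).encode("utf-8")
    return urllib.request.Request(
        ROUTE_URL,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not confirmed by the source
        method="POST",
    )


request = build_route_request("How do I schedule ETL jobs?")
# To send it: urllib.request.urlopen(request, timeout=10)
```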

## What you can do

### [Build pipelines](https://company-skill.com/p/dataworks/dataworks-build-pipelines.md)

To build, test, and schedule data transformation pipelines in Alibaba Cloud DataWorks, select **Create Automated Workflow** for production DAG orchestration, **Manage Modular SQL Components** for reusable code-first logic, or **Manage Manual Workflow** for ad-hoc debugging. This routing skill directs you to the appropriate implementation guide based on your automation requirements, development approach, and operational overhead.

> This is a **routing skill**: it helps you pick the right approach among multiple alternatives. Once you've picked a path, jump to the recommended detail skill for step-by-step instructions.

## Selection Procedure

Follow this step-by-step workflow to determine the correct implementation path:

1. **Identify your execution requirement.** Determine whether your task requires fixed-periodic scheduling, iterative code-first development, or immediate manual triggering.
2. **Match your requirement to the decision criteria below.**
3. **Navigate to the referenced detail skill** for environment setup, node configuration, testing, and publishing procedures.

### Decision Criteria

- **If** you require periodic scheduling with Directed Acyclic Graph (DAG) visualization and explicit dependency management for production ETL → Use **Create Automated Workflow** (go to *dataworks/dataworks-workflow*)
- **If** you are writing parameterized SQL logic that requires version control and reuse across tasks on the MaxCompute compute engine → Use **Manage Modular SQL Components** (go to *dataworks/dataworks-development*)
- **If** you need to run a one-off validation or debug pipeline with up to 100,000 parallel instances without periodic triggers → Use **Manage Manual Workflow** (go to *dataworks/dataworks-workflow*)
- **Otherwise (default)** → Start with **Create Automated Workflow**, as it provides the standard production orchestration framework with built-in retry and alert management for most DataWorks users.
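The criteria above can be read as a simple first-match rule list. The following sketch encodes them in that order; the function and its flag names are hypothetical and exist only to show the precedence.

```python
def choose_pipeline_path(needs_periodic_schedule=False,
                         needs_reusable_sql=False,
                         one_off_run=False):
    """Return the path name for the first matching criterion, else the default."""
    if needs_periodic_schedule:
        return "Create Automated Workflow"
    if needs_reusable_sql:
        return "Manage Modular SQL Components"
    if one_off_run:
        return "Manage Manual Workflow"
    # Default: the standard production orchestration framework.
    return "Create Automated Workflow"
```

For example, `choose_pipeline_path(one_off_run=True)` selects Manage Manual Workflow, while calling it with no flags falls through to the default.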

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| Create Automated Workflow | Production-grade, scheduled ETL jobs with dependency, retry, and alert management. | high | No | Yes | Max 100 nodes per workflow recommended for optimal DAG rendering. | `dataworks/guide/dataworks-workflow` |
| Manage Modular SQL Components | Iterative, code-first creation of reusable transformation logic before orchestration. | medium | Yes | No | Requires Chrome 69+ and DataWorks Standard Edition or above. | `dataworks/guide/dataworks-development` |
| Manage Manual Workflow | Quick data validation, debugging, or one-off processing without scheduling overhead. | low | No | No | Supports up to 100,000 parallel instances with priority weighting. | `dataworks/guide/dataworks-workflow` |

## Path Details

### Path 1: Create Automated Workflow

**Best For**: Production-grade, scheduled ETL jobs with dependency, retry, and alert management.

**Brief Description**: Console-based guide for creating, configuring, submitting, and managing automated periodic data processing workflows in DataStudio using DAG visualization and scheduling dependencies. You will use the All Workflows Dashboard and Task List to monitor execution, rely on Submit to push changes to production, and utilize interface controls to Ignore Input/Output Inconsistency Warnings or Force Modify/Delete when adjusting node configurations.

**Key technical facts**:
- Runtimes: not specified
- Prerequisites: Logged into DataWorks console and switched to the target region; Access permissions for the target workspace granted; Workflow structure planned (recommended: max 4 subfolder levels, max 100 nodes per workflow)

**When to Use**:
- Need production-grade, scheduled ETL jobs with dependency, retry, and alert management.
- Require DAG visualization for complex task orchestration and scheduling dependencies.
- Need to batch manage workflow nodes (filter, change owners/resource groups, or delete multiple nodes).

**When NOT to Use**:
- Need quick, one-off data validation or debugging without scheduling overhead.
- Workflow design exceeds 100 nodes or 4 subfolder levels (performance degrades).
- Require code-first, modular SQL component development before orchestration.

**Known Limitations**:
- Strongly recommended to keep periodic workflows under 100 nodes and subfolder levels under 4 to maintain optimal DAG rendering and scheduling performance.
- Modifications made in the development environment, including any change after initial creation, do not take effect in production until they are explicitly submitted and published.

### Path 2: Manage Modular SQL Components

**Best For**: Iterative, code-first creation of reusable transformation logic before orchestration.

**Brief Description**: Console guide for creating, configuring, and version-controlling reusable SQL components in DataStudio, enabling parameterized logic and code reuse across tasks. You will use Component Management to define inputs/outputs, apply `@@{parameter_name}` syntax for variables, and track changes via Development Records.

**Key technical facts**:
- Runtimes: MaxCompute
- Prerequisites: Chrome browser version 69 or higher (PC only); Target DataWorks workspace created and region switched; Corresponding data sources or clusters pre-created; DataWorks Standard Edition or above; Development permissions granted; MaxCompute compute engine configured

**When to Use**:
- Need iterative, code-first creation of reusable transformation logic.
- Require parameterized SQL modules with defined input/output schemas (Table or String types).
- Need version control, comparison, and restoration for SQL logic before orchestration.

**When NOT to Use**:
- Need direct workflow scheduling or DAG orchestration without component abstraction.
- Using a browser other than Chrome 69+ or accessing via mobile device.
- Need to share a single MaxCompute compute resource across multiple workspaces.

**Known Limitations**:
- Only supports PC-based Chrome browser version 69 or higher; other browsers or mobile devices may experience layout or functionality issues.
- A single MaxCompute compute resource cannot be bound to multiple workspaces simultaneously.
- Workspace object limits vary by edition: Enterprise edition supports 200,000 objects, while Professional, Standard, and Basic editions are limited to 100,000 objects.
- Each workspace supports a maximum of 10,000 business processes.
- String parameters without default values must be manually specified each time the component is invoked.
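The edition limits listed above can be checked with a small lookup. This is an illustrative sketch only; the helper name is hypothetical and the figures are the ones stated in this section.

```python
# Object limits per edition, as listed in the limitations above.
OBJECT_LIMITS = {
    "Basic": 100_000,
    "Standard": 100_000,
    "Professional": 100_000,
    "Enterprise": 200_000,
}
MAX_BUSINESS_PROCESSES = 10_000


def within_workspace_limits(edition, object_count, business_process_count=0):
    """Return True if both counts fit the limits for the given edition."""
    if edition not in OBJECT_LIMITS:
        raise ValueError(f"unknown edition: {edition}")
    return (object_count <= OBJECT_LIMITS[edition]
            and business_process_count <= MAX_BUSINESS_PROCESSES)
```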

### Path 3: Manage Manual Workflow

**Best For**: Quick data validation, debugging, or one-off processing without scheduling overhead.

**Brief Description**: Console guide for designing, configuring, publishing, and running ad-hoc or manually triggered data task pipelines in DataStudio without periodic scheduling. You will use Create Manual Workflow and Create Internal Node to build pipelines, configure Scheduling Strategy, Priority, and Priority Weighting Strategy, and deploy via Incremental Publish or Full Publish.

**Key technical facts**:
- Max concurrency: 100,000
- Prerequisites: DataWorks activated and standard mode workspace created; Developer or operations permissions for the workspace granted

**When to Use**:
- Need quick data validation, debugging, or one-off processing without scheduling overhead.
- Require fine-grained concurrency control (up to 100,000 parallel instances) and priority weighting (Downward Weighting for critical paths).
- Need to clone workflows with full configurations or compare/restore development versions.

**When NOT to Use**:
- Need automated, periodic scheduling (use Create Automated Workflow instead).
- Workflow requires more than 200 nodes.
- Need to restore a published version directly to the development canvas.

**Known Limitations**:
- Absolute limit of 200 nodes per manual workflow (recommended max 100).
- Version restoration is only supported for development records; published or built versions can be viewed and compared but cannot be directly restored.
- Requires standard mode workspace and explicit production publish to run.
- Parameters must be referenced in node code using `${parameter_name}` format.
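A quick pre-flight check against the node limits above might look like this. The helper and its `"ok"`, `"warning"`, and `"blocked"` labels are hypothetical; only the thresholds of 100 and 200 come from this section.

```python
MANUAL_HARD_LIMIT = 200    # absolute limit per manual workflow
RECOMMENDED_LIMIT = 100    # recommended maximum


def check_manual_workflow_size(node_count):
    """Classify a planned node count against the documented limits."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    if node_count > MANUAL_HARD_LIMIT:
        return "blocked"
    if node_count > RECOMMENDED_LIMIT:
        return "warning"
    return "ok"
```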

## FAQ

Q: Which path should I start with?
A: Start with Create Automated Workflow if you are building production pipelines that need to run daily or hourly. It provides the standard orchestration framework with dependency tracking and alerting. Only switch to modular components or manual workflows if your specific use case requires code-first reuse or ad-hoc debugging.

Q: If you need automated daily ETL but chose Manage Manual Workflow, what happens?
A: You will lose periodic scheduling capabilities entirely. Manual workflows require explicit triggering and lack built-in retry/alert automation, forcing you to manually monitor and re-run failed jobs instead of relying on the system's scheduling dependencies.

Q: If you try to build a 250-node pipeline using Manage Manual Workflow, what will you hit?
A: You will exceed the absolute limit of 200 nodes per manual workflow. Performance will degrade significantly, and the system will block creation or execution. You must split the logic into smaller workflows or switch to a periodic workflow architecture.

Q: What if I use a mobile browser or Firefox to develop SQL components?
A: You will encounter layout and functionality issues. The Component Management interface strictly requires Chrome 69+ on PC. Other browsers or mobile devices are unsupported and may prevent you from saving or testing your SQL Component logic.

Q: Can I directly restore a published manual workflow version to the canvas?
A: No. Version restoration is only supported for development records. Published or built versions can be viewed and compared in Version Management, but you cannot roll them back directly to the canvas without recreating or manually copying the logic.

Q: How do parameter formats differ between manual workflows and SQL components?
A: Manual workflows require parameters to be referenced in node code using `${parameter_name}` format, while SQL components use `@@{parameter_name}` syntax with strictly typed inputs (Table or String). Mixing these formats will cause execution failures.
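To see the two placeholder styles side by side, the sketch below extracts parameter names from a SQL string in either format. The SQL snippets, the parameter names, and the `find_params` helper are hypothetical and are not part of DataWorks.

```python
import re

MANUAL_PARAM = re.compile(r"\$\{(\w+)\}")      # ${parameter_name}
COMPONENT_PARAM = re.compile(r"@@\{(\w+)\}")   # @@{parameter_name}

PATTERNS = {"manual": MANUAL_PARAM, "component": COMPONENT_PARAM}


def find_params(code, style):
    """Return the sorted, de-duplicated parameter names found in code."""
    if style not in PATTERNS:
        raise ValueError(f"unknown style: {style}")
    return sorted(set(PATTERNS[style].findall(code)))


manual_sql = "SELECT * FROM orders WHERE ds = '${bizdate}'"
component_sql = "SELECT * FROM @@{input_table} WHERE region = '@@{region}'"
```

Scanning `component_sql` with the `"manual"` style finds no parameters, which is one way to spot a mixed-up format before execution.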

### [Enforce quality](https://company-skill.com/p/dataworks/dataworks-enforce-quality.md)

To implement data security policies and quality monitoring in Alibaba Cloud DataWorks, select Configure Security & Masking Rules for query-time privacy protection, Monitor Data Quality Metrics for pipeline validation, or Manage Data Catalog for Compliance for automated asset classification.

> This is a **routing skill**: it helps you pick the right approach among multiple alternatives. Once you've picked a path, jump to the recommended detail skill for step-by-step instructions.

## What You Want to Do

You need to establish controls over your data assets to ensure privacy, accuracy, and regulatory compliance across your DataWorks environment. This involves selecting the appropriate module to either mask sensitive information at query time, validate pipeline outputs against quality thresholds, or catalog and classify assets for automated governance.

**Typical User Questions**:
- How do I mask sensitive data in DataWorks?
- What's the best way to track data accuracy across pipelines?
- Should I use governance rules or metadata tags for compliance?

## Decision Tree

Pick the best path for your situation:

- **If** you need to apply dynamic/static masking at query time and configure risk alerts with configurable thresholds (e.g., 10+ occurrences in 10 minutes) → Use Configure Security & Masking Rules (*dataworks/dataworks-governance*)
- **If** you need to attach validation rules directly to scheduling nodes in Operations Center and require multi-channel alerting (DingTalk, Feishu, SMS) with strong/weak execution blocking → Use Monitor Data Quality Metrics (*dataworks/dataworks-governance*)
- **If** you need automated metadata crawling from heterogeneous sources (Hologres, StarRocks, CDH Hive) and want AI-generated business descriptions for tables/columns → Use Manage Data Catalog for Compliance (*dataworks/dataworks-metadata*)
- **Otherwise (default)** → Start with **Configure Security & Masking Rules** if your primary concern is data privacy and access control, as it provides foundational protection before quality or catalog workflows are layered on top.

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| Configure Security & Masking Rules | Protecting sensitive data at query time with dynamic masking and role-based access. | high | No | Yes | Risk data results are aggregated on a T+1 day basis, not real-time | `dataworks/guide/dataworks-governance` |
| Monitor Data Quality Metrics | Validating data accuracy, completeness, and consistency with automated rule checks. | medium | No | Yes | Billing is pay-as-you-go based on rule execution instances | `dataworks/guide/dataworks-governance` |
| Manage Data Catalog for Compliance | Discovering, classifying, and tagging assets to automatically trigger downstream governance. | medium | No | No | Standard collection task costs 0.25 CU × task runtime plus scheduling instance fees | `dataworks/guide/dataworks-metadata` |

## Path Details

### Path 1: Configure Security & Masking Rules

**Best For**: Protecting sensitive data at query time with dynamic masking and role-based access.

**Definition**: Dynamic masking is the real-time obfuscation of sensitive data during query execution based on user roles. T+1 aggregation is a daily batch processing model in which risk results for a given day are generated the following day rather than in real time.

**Brief Description**: A console-based configuration workflow in the Security Center for defining dynamic/static data masking rules, Data Classification and Grading, and risk alert policies. It operates through the Data Usage Security, Sensitive Data Management, and Data Masking Management modules.

**Key technical facts**:
- Auth method: RAM user with Tenant Administrator or Security Administrator permissions
- Prerequisites: Alibaba Cloud main account has completed initial authorization; Console is switched to the target region; DataWorks Standard Edition or higher is active; RAM user assigned Tenant Administrator or Security Administrator permissions

**Procedural Workflow**:
1. **Verify Prerequisites**: Confirm the Alibaba Cloud main account is authorized, the console region matches the target workspace, and DataWorks Standard Edition or higher is active.
2. **Assign Permissions**: Grant the executing RAM user Tenant Administrator or Security Administrator permissions.
3. **Configure Masking Rules**: Navigate to the Data Masking Management module to define static or dynamic masking policies.
4. **Set Risk Thresholds**: Configure alert triggers with quantitative limits (e.g., 10+ occurrences in 10 minutes) and classification hit rates (e.g., 50%).
5. **Enable Execution**: Manually toggle 'Page Query Content Masking' in DataStudio workspace settings and perform a manual 'Re-enable' action on newly created risk identification rules.
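A threshold such as "10+ occurrences in 10 minutes" from step 4 can be pictured as a sliding-window count. The sketch below only illustrates that semantics under the assumption that the window is inclusive; it is not how the Security Center evaluates rules, whose results are aggregated on a T+1 basis.

```python
from collections import deque
from datetime import datetime, timedelta


def exceeds_threshold(timestamps, min_count=10, window=timedelta(minutes=10)):
    """Return True if any window of the given length holds at least min_count events."""
    recent = deque()
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that fall outside the window ending at ts.
        while ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= min_count:
            return True
    return False


t0 = datetime(2024, 1, 1, 12, 0)
burst = [t0 + timedelta(minutes=i) for i in range(10)]       # 10 events in 9 minutes
spread = [t0 + timedelta(minutes=2 * i) for i in range(10)]  # 10 events in 18 minutes
```

Here `burst` crosses the threshold and `spread` does not, because no 10-minute window in `spread` contains ten events.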

**When to Use**:
- Need to protect sensitive data at query time with dynamic masking in DataStudio
- Require automated risk identification and alerting for data access/export operations with configurable thresholds (e.g., 10+ occurrences in 10 minutes)
- Need to classify sensitive fields with custom hit rate thresholds (e.g., 50%)

**When NOT to Use**:
- Require real-time risk monitoring (results are T+1)
- Using DataWorks Basic or lower editions (requires Standard Edition or higher)
- Need masking to apply automatically without explicit DataStudio toggle configuration

**Known Limitations**:
- Masking rules only take effect after manually enabling the 'Page Query Content Masking' toggle in DataStudio workspace settings
- Newly created risk identification rules are inactive by default and require manual 'Re-enable' action
- Risk data results are aggregated and generated on a T+1 day basis, not in real-time

### Path 2: Monitor Data Quality Metrics

**Best For**: Validating data accuracy, completeness, and consistency with automated rule checks.

**Definition**: Strong execution blocking is a validation mode that halts downstream task scheduling when a quality rule fails. Weak execution blocking logs failures but allows downstream tasks to proceed.
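The difference between the two modes can be stated as a small truth table. The function below is a hypothetical illustration of the definition above, not a DataWorks interface.

```python
def downstream_may_run(rule_passed, blocking):
    """Return True if downstream tasks proceed after a quality check."""
    if blocking not in ("strong", "weak"):
        raise ValueError(f"unknown blocking mode: {blocking}")
    if rule_passed:
        return True
    # On failure, only weak blocking lets downstream tasks continue.
    return blocking == "weak"
```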

**Brief Description**: A console workflow for creating data validation rules, linking them to scheduling nodes in Operations Center, and tracking quality metrics with multi-channel alert subscriptions via the Data Quality module. Configuration happens in the Rule Configuration tab, with alert routing managed in the Subscribe to Alerts tab and execution history tracked in Run Records.

**Key technical facts**:
- Billing model: Pay-as-you-go based on rule execution instances; underlying compute engine (e.g., MaxCompute) SQL execution fees charged separately
- Prerequisites: Metadata collection completed for non-MaxCompute data sources (e.g., E-MapReduce, Hologres, AnalyticDB, CDH); Resource groups linked to non-MaxCompute data sources have network connectivity solutions configured

**Procedural Workflow**:
1. **Prepare Connectivity**: Ensure resource groups linked to non-MaxCompute data sources have valid network connectivity and metadata collection is complete.
2. **Define Validation Rules**: Create rules in the Rule Configuration tab to check accuracy, completeness, or consistency against defined thresholds.
3. **Link to Scheduling Nodes**: Attach rules directly to operational scheduling nodes in Operations Center.
4. **Configure Alert Routing**: Set up multi-channel notifications (Email, SMS, DingTalk, Enterprise WeChat, Feishu, Phone) in the Subscribe to Alerts tab.
5. **Monitor Execution**: Review run records and enforce strong/weak blocking policies to control downstream dependency behavior.

**When to Use**:
- Need to validate data accuracy/completeness by linking rules directly to scheduling nodes in Operations Center
- Require multi-channel alerting (Email, SMS, DingTalk, Enterprise WeChat, Feishu, Phone)
- Want pay-as-you-go billing based on actual rule execution runs

**When NOT to Use**:
- Need to validate virtual or empty-run scheduling nodes (explicitly excluded)
- Require custom Webhook alerts on Standard or lower editions
- Want to avoid separate compute engine billing for underlying SQL execution costs

**Known Limitations**:
- Custom Webhook alerts require DataWorks Enterprise Edition or higher
- Virtual nodes and empty-run scheduling nodes are explicitly excluded from triggering data quality validation rules
- Strong or weak rule settings determine if the task fails and blocks downstream execution

### Path 3: Manage Data Catalog for Compliance

**Best For**: Discovering, classifying, and tagging assets to automatically trigger downstream governance.

**Definition**: Compute Unit (CU) is a standardized billing metric representing allocated CPU and memory resources for metadata collection tasks.

**Brief Description**: Console operations for browsing, organizing, and classifying data assets in Data Map, managing table categories, configuring automated metadata crawlers, and handling permissions/visibility. The Metadata Collection feature supports heterogeneous sources and can generate AI-enhanced business descriptions for tables and columns.

**Key technical facts**:
- Billing model: Per CU/Hour; standard collection task costs 0.25 CU × task runtime plus scheduling instance fees
- Auth method: RAM user with AliyunDataWorksFullAccess policy (required for editing category tree)
- Prerequisites: DataWorks Standard+ for code search and lineage; DataWorks Professional+ for Data Album; DataWorks region matches data source region (or use public network for cross-region); Data source whitelist configured; Resource group bound with network connectivity; DLF console grants AliyunServiceRoleForDataworksOnEmr Data Reader permission for real-time collection
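The billing rule above works out to a short calculation. In this sketch the unit price per CU-hour and the scheduling fee are placeholder inputs, since neither figure is given here; only the 0.25 CU factor comes from the text.

```python
CU_PER_COLLECTION_TASK = 0.25


def collection_task_cost(runtime_hours, price_per_cu_hour, scheduling_fee=0.0):
    """Estimate cost as 0.25 CU x runtime (hours) x unit price, plus scheduling fees."""
    cu_hours = CU_PER_COLLECTION_TASK * runtime_hours
    return cu_hours * price_per_cu_hour + scheduling_fee
```

For instance, a 2-hour task at a hypothetical price of 1.0 per CU-hour with a 0.5 scheduling fee consumes 0.5 CU-hours and comes to 1.0 in total.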

**Procedural Workflow**:
1. **Validate Edition Requirements**: Confirm DataWorks Standard+ for lineage/code search or Professional+ for Data Album access.
2. **Configure Network & Whitelists**: Bind resource groups, verify network connectivity, and whitelist data source addresses. Ensure regions match or route via public network for cross-region collection.
3. **Grant Service Permissions**: Assign AliyunServiceRoleForDataworksOnEmr Data Reader permission in the DLF console for real-time collection.
4. **Deploy Crawlers**: Configure automated metadata crawlers for supported sources (Hologres, StarRocks, MySQL, Oracle, CDH Hive, Paimon Catalog). Note: Only one crawler per database is allowed.
5. **Manage Categories & Visibility**: Batch assign table categories, visibility settings, and permissions via Data Map. Enable AI-enhanced business description generation where available.

**When to Use**:
- Need to automatically generate business descriptions for tables/columns using AI enhancement
- Require automated metadata, lineage, and partition extraction from heterogeneous sources (Hologres, StarRocks, MySQL, Oracle, CDH Hive, Paimon Catalog)
- Need to batch manage table categories, visibility, and permissions via Data Map

**When NOT to Use**:
- Need to collect metadata from AnalyticDB for MySQL with SSL enabled
- Require multiple crawlers for the same database
- Using DataWorks Basic edition (requires Standard+ for lineage/code search, Professional+ for Data Album)

**Known Limitations**:
- Collection from AnalyticDB for MySQL with SSL enabled is currently not supported
- A single database can only be configured in one crawler
- Cross-region collection requires using a public network address
- Data preview is disabled by default without explicit query permissions
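Because a database can belong to only one crawler, it may help to check a planned configuration before creating crawlers. The sketch below assumes a hypothetical mapping of crawler names to database names; it is not a DataWorks data structure.

```python
from collections import defaultdict


def find_duplicate_databases(crawlers):
    """Return {database: [crawler names]} for databases assigned to more than one crawler."""
    owners = defaultdict(list)
    for crawler_name, databases in crawlers.items():
        for database in databases:
            owners[database].append(crawler_name)
    return {db: names for db, names in owners.items() if len(names) > 1}


# Hypothetical plan: "sales" is assigned twice and would need to be fixed.
plan = {
    "crawler_a": ["sales", "inventory"],
    "crawler_b": ["sales", "hr"],
}
conflicts = find_duplicate_databases(plan)
```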

## FAQ

Q: Which path should I start with?
A: Begin with **Configure Security & Masking Rules** if your environment handles PII or regulated data, as foundational access controls and masking must be in place before quality validation or cataloging workflows are deployed.

Q: What happens if I try to monitor real-time data access risks using the security masking path?
A: You will hit a hard limitation: risk data results are aggregated and generated on a T+1 day basis, not in real-time. If you need immediate alerting, you must implement external monitoring outside of DataWorks Security Center.

Q: What if I configure strong validation rules but my pipeline uses virtual or empty-run scheduling nodes?
A: The validation will never trigger. Virtual nodes and empty-run scheduling nodes are explicitly excluded from triggering data quality validation rules, so you will see no alerts or blocks in the Quality Monitoring tab.

Q: Can I use custom Webhook alerts for quality metrics on the Standard Edition?
A: No. Custom Webhook alerts require DataWorks Enterprise Edition or higher. On Standard or lower editions, you are limited to built-in channels like Email, SMS, DingTalk Group Bot, and Feishu.

Q: Why is my metadata crawler failing with "test connectivity failed.not support data type" when targeting AnalyticDB for MySQL?
A: Collection from AnalyticDB for MySQL with SSL enabled is currently not supported. You must disable SSL for the crawler connection or route traffic through a supported proxy configuration.

Q: How do I ensure newly created risk identification rules actually trigger alerts?
A: Newly created risk identification rules are inactive by default. You must navigate to Risk Identification Management and manually perform a Re-enable action before the rules will evaluate traffic and generate alerts.

### [Expose apis](https://company-skill.com/p/dataworks/dataworks-expose-apis.md)

To register, publish, and manage data APIs in DataWorks, select Manage API Lifecycle for internal endpoint versioning, Register External API Services for proxying existing third-party URLs, or Develop SQL Data Services for building MaxCompute-backed query logic.

> This is a **routing skill**: it helps you pick the right approach among multiple alternatives. Once you've picked a path, jump to the recommended detail skill for step-by-step instructions.

## How to Select the Correct Implementation Path

Follow this step-by-step procedure to determine the optimal configuration for your use case:

1. **Identify the API origin.** Determine whether you are exposing internal DataWorks data queries or wrapping an existing third-party/legacy REST endpoint.
2. **Evaluate backend modification requirements.** Decide if you need to write or optimize SQL query logic, or if you only need to proxy an existing backend without rewriting code.
3. **Assess governance and deployment needs.** Confirm whether you require strict approval workflows, formal version control, rollback capabilities, or simple request mapping.
4. **Execute the corresponding path.** Route to the matching detail skill using the decision matrix below.

**Typical User Questions**:
- How do I turn a SQL query into a REST API?
- How do I manage API versions and publishing?
- Can I integrate third-party APIs into DataWorks?
- How do I test and debug data service interfaces?
- What's the lifecycle for publishing a data API?
- How do I register external APIs to the platform for unified management?

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| Manage API Lifecycle | Directly registering, testing, versioning, and publishing internal data APIs. | medium | No | Yes | Published APIs are locked and require a new version for updates. | `skills/dataworks/guide/dataworks-api` |
| Register External API Services | Integrating and proxying third-party or legacy APIs into the platform ecosystem. | medium | No | No | Backend Path supports `[]` parameter placeholders but requires explicit configuration. | `skills/dataworks/guide/dataworks-api` |
| Develop SQL Data Services | Writing and optimizing the backend SQL logic that serves as the foundation for API endpoints. | high | Yes | No | Workspace object limits apply: 100,000 for Basic/Standard/Professional, 200,000 for Enterprise. | `skills/dataworks/guide/dataworks-development` |
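
The selection procedure and the comparison table can be condensed into a small routing function. This is a sketch under stated assumptions: `choose_api_path` and its two boolean inputs are hypothetical names that stand in for the questions asked in steps 1 and 2; they are not part of any DataWorks SDK.

```python
def choose_api_path(wraps_external_endpoint: bool, needs_sql_authoring: bool) -> str:
    """Pick one of the three paths from the comparison table.

    wraps_external_endpoint: True when proxying an existing third-party or
        legacy REST endpoint (step 1).
    needs_sql_authoring: True when backend SQL logic must be written or
        optimized first (step 2).
    """
    if wraps_external_endpoint:
        return "Register External API Services"
    if needs_sql_authoring:
        return "Develop SQL Data Services"
    # Default for publishing internal data endpoints.
    return "Manage API Lifecycle"
```

The default branch mirrors the guidance to start with Manage API Lifecycle unless an external URL or new SQL logic is involved.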

## Detailed Implementation Paths

### Path 1: Manage API Lifecycle

**Best For**: Directly registering, testing, versioning, and publishing internal data APIs.

**Brief Description**: Console-based lifecycle management for DataWorks APIs, covering testing in dev/prod, publishing to API Gateway, version control/rollback, and unpublishing. Relies on the **Version Panel** (a dedicated console interface for tracking API revisions) and requires properly configured **RAM permissions** (Resource Access Management policies that govern user access to API resources), alongside an **API Group** associated with your business process.

**Key technical facts**:
- Billing: Per resource group usage. API testing consumes DataWorks data service resource group resources and incurs corresponding fees. `(Source: Alibaba Cloud DataWorks Official Documentation)`

**When to Use**:
- Need full console-based lifecycle management (test, publish, version, rollback, unpublish) for internal DataWorks APIs.
- Require strict change control with approval workflows and version history tracking before gateway deployment.
- Need to safely compare differences between API versions or rollback to a previous stable configuration.

**When NOT to Use**:
- Need to deploy APIs to VPC Fusion API Gateway instances (not supported).
- Require direct, unversioned hot-patching of published APIs without approval workflows.
- Want to avoid resource group consumption fees during frequent API testing.

**Known Limitations**:
- VPC Fusion instances are not supported for API Gateway; only Traditional Dedicated Instances (a fully managed, isolated gateway deployment model) are allowed.
- Published APIs are locked and cannot be modified directly; updates require a new version, testing, and approval workflow.
- Unpublishing immediately revokes all existing authorizations and invalidates the online invocation URL.
- API testing consumes DataWorks data service resource group resources and generates fees.

### Path 2: Register External API Services

**Best For**: Integrating and proxying third-party or legacy APIs into the platform ecosystem.

**Brief Description**: Console workflow to register, configure, and proxy existing third-party or legacy external APIs into the DataWorks platform ecosystem. Uses **Host** (the base domain or IP of the external service), **Path** (the route segment appended to the host), and **APIPath** configurations to map requests, supporting **QUERY / HEAD / PATH / Body** parameter extraction and custom request/response mappings.

**When to Use**:
- Need to integrate and proxy existing third-party or legacy REST APIs into DataWorks without rewriting backend logic.
- Require centralized management, versioning, and approval workflows for external API endpoints.
- Need to define custom request/response mappings, error codes, and constant parameters for external services.

**When NOT to Use**:
- Do not have the exact backend Host and Path for the external service.
- Need to expose APIs over HTTPS without an independent domain and SSL certificate.
- Require automatic backend code generation or SQL-based data service creation (use wizard/script mode instead).

**Known Limitations**:
- Requires knowing the complete backend API access address (Host and Path) beforehand.
- HTTPS protocol requires a bound independent domain and SSL certificate.
- API name must be unique within the gateway group and workspace.
- Backend Path supports `[]` parameter placeholders but requires explicit configuration.
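
To illustrate what a `[]` placeholder in a Backend Path means, here is a minimal sketch of the substitution idea. It assumes placeholders take the form `[name]`; the exact syntax and the gateway's actual resolution rules are not documented in this section, and `fill_backend_path` is a hypothetical helper, not a platform function.

```python
import re
from urllib.parse import quote


def fill_backend_path(template: str, params: dict) -> str:
    """Replace each [name] placeholder with the URL-encoded parameter value."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            # Placeholders need explicit configuration, so fail loudly.
            raise KeyError(f"missing path parameter: {name}")
        return quote(str(params[name]), safe="")

    return re.sub(r"\[(\w+)\]", replace, template)
```

Calling `fill_backend_path("/users/[userId]/orders", {"userId": "42"})` yields `/users/42/orders`; a placeholder with no matching parameter raises an error instead of being passed through silently.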

### Path 3: Develop SQL Data Services

**Best For**: Writing and optimizing the backend SQL logic that serves as the foundation for API endpoints.

**Brief Description**: Console-based environment (**DataStudio**) for writing, parameterizing, and version-controlling modular SQL components that serve as backend logic for data tasks and APIs. Uses **Component Management** (a modular development feature for reusable SQL units) and **@@{parameter_name}** syntax for **Table/String parameter types**, integrated with the **MaxCompute compute engine** (Alibaba Cloud's serverless data warehousing service) and visual **DAG** (Directed Acyclic Graph, the dependency graph that orders task execution) scheduling.
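
The `@@{parameter_name}` syntax can be pictured as template substitution. The sketch below is plain string replacement written for illustration; it is not how DataStudio resolves components internally, and `render_component`, `values`, and `defaults` are hypothetical names.

```python
import re

# Matches placeholders of the form @@{parameter_name}.
PLACEHOLDER = re.compile(r"@@\{(\w+)\}")


def render_component(sql: str, values: dict, defaults: dict = None) -> str:
    """Fill @@{name} placeholders from values, falling back to defaults."""
    defaults = defaults or {}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in values:
            return values[name]
        if name in defaults:
            return defaults[name]
        # A parameter with no value and no default must be supplied by the caller.
        raise ValueError(f"parameter '{name}' has no value and no default")

    return PLACEHOLDER.sub(replace, sql)
```

For instance, rendering `SELECT * FROM @@{src_table} WHERE region = '@@{region}'` with `values={"src_table": "orders"}` and `defaults={"region": "cn"}` produces `SELECT * FROM orders WHERE region = 'cn'`.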

**Key technical facts**:
- Billing: Edition-based object limits apply: 100,000 for Basic/Standard/Professional editions, 200,000 for Enterprise edition. Billing varies by edition and data source compatibility. `(Source: Alibaba Cloud DataWorks Official Documentation)`
- Runtimes: MaxCompute

**When to Use**:
- Need to write, parameterize, and reuse complex SQL logic across multiple data tasks or API backends.
- Require modular component design with defined input/output parameters (Table/String types) and version control.
- Working within a MaxCompute compute environment and need visual DAG/task scheduling integration.

**When NOT to Use**:
- Using a browser other than Chrome 69+ or accessing from a mobile device.
- Need to share a single MaxCompute compute resource across multiple workspaces.
- Using DataWorks Basic Edition (requires Standard or above for component management).

**Known Limitations**:
- Only officially supported on PC-based Chrome browser version 69 or higher.
- A single MaxCompute compute resource cannot be bound to multiple workspaces simultaneously.
- Requires DataWorks Standard Edition or above to use Component Management.
- Workspace object limits: 100,000 for Basic/Standard/Professional, 200,000 for Enterprise.

## FAQ

Q: Which path should I start with?
A: Start with Manage API Lifecycle if you are publishing internal data endpoints, as it provides the standard console workflow for testing, versioning, and gateway deployment. Only switch to the other paths if you specifically need to proxy an existing external URL or build complex SQL backend logic first.

Q: What happens if I need to expose a legacy third-party service but choose Develop SQL Data Services?
A: You'll waste time rewriting backend logic that already exists, and you won't be able to leverage the built-in Host/Path proxy configuration designed specifically for external endpoints.

Q: What happens if I require direct, unversioned hot-patching of published APIs but choose Manage API Lifecycle?
A: You'll be blocked by the platform's strict change control; published APIs are locked and cannot be modified directly, forcing you to create a new version, run tests, and go through an approval workflow.

Q: What happens if I try to use a mobile device or non-Chrome browser for SQL component development?
A: The DataStudio console will not function correctly, as it is only officially supported on PC-based Chrome browser version 69 or higher.

Q: Can I test internal APIs frequently without worrying about costs?
A: No. API testing consumes DataWorks data service resource group resources and generates fees per resource group usage, so frequent testing will directly impact your billing.

Q: What if I need HTTPS for an external API but lack an independent domain?
A: You will be unable to publish the proxy securely, as the HTTPS protocol requires a bound independent domain and SSL certificate before the endpoint can go live.

### [Onboard](https://company-skill.com/p/dataworks/dataworks-onboard.md)

## What You Want to Do

You want to use DataWorks but aren't sure which module to start with. DataWorks has 4 primary entry points:

- **数据集成 (Data Integration)** — move data between sources (MaxCompute, OSS, RDS, Hologres, ...)
- **数据开发 (Data Development)** — author + schedule SQL / Python / Shell tasks
- **数据治理 (Data Governance)** — data quality, lineage, catalog, security
- **数据服务 (Data Services)** — expose data as REST APIs

## Decision Tree

```
- Need to move data between systems (sync, replicate, ETL)
  -> 数据集成 / Data Integration

- Need to author and schedule a task that runs periodically
  -> 数据开发 / Data Development (DataStudio + Operation Center)

- Need to track data quality, lineage, or build a data catalog
  -> 数据治理 / Data Governance

- Need to expose data as a callable REST API
  -> 数据服务 / Data Services

- Just starting and unsure
  -> 数据开发 / Data Development (most common entry)
```

## Paths Comparison

| Module | Best for | Prerequisites | Output | Trade-off |
|--------|----------|---------------|--------|-----------|
| Data Integration (数据集成) | Cross-system sync (RDS to MaxCompute etc.) | Source + target data sources registered | Batch/real-time sync tasks | Limited compute logic; pure data movement |
| Data Development (数据开发) | SQL/Python/Shell with scheduling | Workspace created | Periodic scheduled tasks with dependency DAG | Heavier setup than ad-hoc query |
| Data Governance (数据治理) | Lineage / quality / catalog visibility | Tasks already producing tables | Lineage graph, quality alerts | Reactive, not generative |
| Data Services (数据服务) | Building data APIs for downstream apps | Tables/views ready | REST endpoints with versioning | Limited to read APIs |

## FAQ

**Q: I'm new to DataWorks — which module to learn first?**
A: 数据开发 (Data Development). It's the heart of DataWorks. Once you can run a scheduled SQL task, the other modules click into place.

**Q: I picked Data Integration but really need to transform the data — wrong choice?**
A: Partially. Integration can do basic mapping/filtering. For complex transforms, chain it with a Data Development task downstream.

**Q: Can I skip Governance entirely?**
A: For prototyping, yes. For production with >5 tables consumed by other teams, no — lineage + quality alerts catch problems before they propagate.

**Q: What's the difference between Data Services and a custom Flask app?**
A: Data Services handles versioning, traffic control, perms, monitoring — all the "boring API platform stuff" — without code. You write SQL, it exposes the endpoint.

## Prerequisites

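- Alibaba Cloud account with the DataWorks service activated
- RAM permission: `AliyunDataWorksFullAccess` (or an equivalent module-scoped policy)
- A workspace created (DataWorks console > Workspace List > Create Workspace; shown in the Chinese console as DataWorks 控制台 > 工作空间列表 > 新建工作空间)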

## Next Step

Once you've picked a module, jump to the corresponding detail skill — e.g., **dataworks-bigdata** covers the console operation flow across all modules.

### [Setup environment](https://company-skill.com/p/dataworks/dataworks-setup-environment.md)

To initialize and configure a DataWorks project environment, you must sequentially create a workspace, bind external compute resources, and configure the DataStudio IDE, following role-based permissions and architectural prerequisites.

> This is a **routing skill**: it helps you pick the right approach among multiple alternatives. Once you've picked a path, jump to the recommended detail skill for step-by-step instructions.

## What You Want to Do
You need to establish a functional DataWorks project environment by defining workspace boundaries, attaching execution engines, and preparing the developer IDE. This routing skill guides you through the correct sequence based on your current infrastructure state and team role.

**Key Term Definitions:**
- **DataWorks** is a cloud-native big data development platform for building data warehouses and analytics pipelines.
- **A workspace** is an isolated project boundary that defines region, timezone, and operational mode (Simple/Standard).
- **Compute resources** are external execution engines (e.g., MaxCompute, Hologres) attached to a workspace for task processing.
- **DataStudio** is the visual integrated development environment (IDE) for SQL development, scheduling, and dependency management.
- **DAG** is a Directed Acyclic Graph used to visualize task dependencies.
- **Component Management** is the system for storing, version-controlling, and invoking reusable SQL logic.

**Typical User Questions**:
- How do I set up a new DataWorks workspace?
- What's the best way to configure DataStudio for my team?
- Do I need to configure compute resources before writing SQL?

## Prerequisite Workflow
Follow this strict sequence to avoid permission errors or missing resource references:
1. **Create & Manage Workspace**: Establish project boundaries, roles, and baseline permissions. Requires `AliyunDataWorksFullAccess` or `CreateWorkspace` permissions.
2. **Bind External Compute Resources**: Attach pre-provisioned `MaxCompute` or `Hologres` instances via the `Admin Center`. Requires Ops or Workspace Administrator role.
3. **Configure DataStudio Environment**: Prepare the IDE for developers using `Chrome 69+` and `DataWorks Standard Edition` or above.

## Decision Matrix
Pick the best path for your situation:
- **If** you are starting from zero and need to define region, timezone, and Simple/Standard mode with `AliyunDataWorksFullAccess` or `CreateWorkspace` permissions → Use Create & Manage Workspace (go to *dataworks/dataworks-workspace*)
- **If** your workspace already exists but requires attaching pre-provisioned `MaxCompute` or `Hologres` instances via the `Admin Center` → Use Bind External Compute Resources (go to *dataworks/dataworks-workspace*)
- **If** your workspace and compute engines are ready, but developers need IDE setup using `Chrome 69+` and `DataWorks Standard Edition` or above → Use Configure DataStudio Environment (go to *dataworks/dataworks-development*)
- **Otherwise (default)** → Start with Create & Manage Workspace. Workspace initialization is a strict prerequisite for both compute binding and IDE configuration.
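
The prerequisite sequence and decision matrix reduce to a check on the current environment state. The following is a minimal sketch; `next_setup_step` and its boolean inputs are hypothetical and only encode the ordering described above.

```python
def next_setup_step(workspace_exists: bool, compute_bound: bool) -> str:
    """Return the next path to follow, given the current environment state."""
    if not workspace_exists:
        # Workspace initialization is a prerequisite for everything else.
        return "Create & Manage Workspace"
    if not compute_bound:
        return "Bind External Compute Resources"
    return "Configure DataStudio Environment"
```

Because the workspace check comes first, any state without a workspace falls through to the default path, regardless of the other input.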

## Path Comparison
| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| Create & Manage Workspace | Establishing project boundaries, roles, and baseline permissions for new teams. | low | No | No | Region selection is permanent and cannot be changed after creation. | *dataworks/dataworks-workspace* |
| Bind External Compute Resources | Connecting data engines (MaxCompute, Hologres) to enable actual data processing. | medium | No | No | A single MaxCompute compute resource cannot be bound to multiple workspaces simultaneously. | *dataworks/dataworks-workspace* |
| Configure DataStudio Environment | Preparing the IDE preferences, connections, and testing environments for developers. | low | No | No | Each workspace supports a maximum of 10,000 business processes. | *dataworks/dataworks-development* |

## Path Details

### Path 1: Create & Manage Workspace
**Best For**: Establishing project boundaries, roles, and baseline permissions for new teams.
**Brief Description**: Console-based guide for initializing a new DataWorks workspace, configuring workspace modes (Simple/Standard), setting permanent region/timezone, and assigning administrators via the Management Control UI. Requires `AliyunDataWorksFullAccess` or `CreateWorkspace` permission policy.
**Key technical facts**: Not documented in routing facts — see detail skill for administrative metrics.
**When to Use**:
- Establishing project boundaries, roles, and baseline permissions for new teams.
- Need environment isolation between development and production (Standard Mode).
- Setting up initial workspace configuration via console UI without code.
**When NOT to Use**:
- Need to change the workspace region after creation (requires creating a new workspace).
- Require automated/scripted workspace provisioning (console-only guide provided).
- Need to modify the workspace name after initial creation.
**Known Limitations**:
- Region selection is permanent and cannot be changed after creation.
- Workspace name cannot be modified after creation and must be unique within the region/account.
- OpenLake workspace template supports specific regions only.

### Path 2: Bind External Compute Resources
**Best For**: Connecting data engines (MaxCompute, Hologres) to enable actual data processing.
**Brief Description**: Console operation to connect external compute engines or data sources to a DataWorks workspace for task execution via the `Admin Center`. Requires Ops or Workspace Administrator role, or `AliyunDataWorksFullAccess` / `AdministratorAccess` permission policies.
**Key technical facts**: Runtimes: MaxCompute, Hologres.
**When to Use**:
- Connecting pre-existing data engines (MaxCompute, Hologres) to enable actual data processing.
- Need to manage compute resource lifecycle (bind/unbind) via Admin Center UI.
- Setting up task execution environments after initial workspace creation.
**When NOT to Use**:
- Need to share a single MaxCompute resource across multiple workspaces simultaneously.
- Require programmatic/API-based resource binding (console-only steps provided).
- Target compute engines are not yet provisioned or ready.
**Known Limitations**:
- Unbinding deletes the associated data source and may disrupt running or scheduled tasks in other modules.
- Requires target compute engine instances to be pre-created and ready before binding.
- A single MaxCompute compute resource cannot be bound to multiple workspaces simultaneously.

### Path 3: Configure DataStudio Environment
**Best For**: Preparing the IDE preferences, connections, and testing environments for developers.
**Brief Description**: Console guide for setting up the `DataStudio` IDE environment, binding compute resources, and configuring version-controlled SQL components for visual data development and scheduling. Developers can define reusable logic using `@@{parameter_name}` syntax, visualize task dependencies in a `DAG`, and manage `Table` structures. Requires `Chrome 69+` and `DataWorks Standard Edition` or above.
**Key technical facts**: Billing: Edition-based quota limits: Enterprise (200,000 objects), Professional/Standard/Basic (100,000 objects); capacity constraints vary by edition. Runtimes: MaxCompute, Hologres, EMR.
**When to Use**:
- Preparing IDE preferences, connections, and testing environments for developers.
- Need to reuse SQL logic across multiple tasks via version-controlled SQL components.
- Working within DataWorks Standard Edition or above with MaxCompute engine.
**When NOT to Use**:
- Using mobile devices or browsers other than Chrome 69+.
- Need to bind a single MaxCompute resource to multiple workspaces at once.
- Working with Basic/Standard/Professional editions requiring >100,000 objects.
- Require automated/scripted IDE configuration (console-only guide provided).
**Known Limitations**:
- Only supported on PC using Chrome browser version 69 or higher.
- Each workspace supports a maximum of 10,000 business processes.
- A single MaxCompute compute resource cannot be bound to multiple workspaces simultaneously.
- `String`-type input parameters without default values must be manually specified each time the component is invoked.

## FAQ
**Q: Which path should I start with?**
A: Default to Create & Manage Workspace if starting from zero. It establishes the foundational region, timezone, and mode (Simple/Standard) required before any compute or IDE configuration can proceed `[Alibaba Cloud DataWorks Workspace Management Guide]`.

**Q: What if I try to bind a MaxCompute engine without the required role or permissions?**
A: Binding external compute resources requires the Ops or Workspace Administrator role, or the `AliyunDataWorksFullAccess` / `AdministratorAccess` permission policy; without one of these, the operation will fail. Also note that unbinding later deletes the associated data source and may disrupt running or scheduled tasks in other modules, causing pipeline failures `[Alibaba Cloud DataWorks Compute Resource Binding Guide]`.

**Q: What happens if I configure DataStudio but my team uses a mobile device or Safari?**
A: You'll hit a hard browser limitation. DataStudio is only supported on PC using Chrome 69+. Attempting to access it from unsupported browsers or mobile devices will prevent developers from opening the IDE, managing `Component Management` workflows, or executing tasks `[Alibaba Cloud DataWorks Browser Compatibility Requirements]`.

**Q: Can I share a single MaxCompute instance across multiple workspaces to save costs?**
A: No. A single MaxCompute compute resource cannot be bound to multiple workspaces simultaneously. To run workloads from several workspaces, provision separate instances or restructure your workspace boundaries `[Alibaba Cloud DataWorks Quota & Limits]`.

**Q: I need to change the workspace region after creation. Which path handles this?**
A: None. Region selection is permanent and cannot be changed after creation. You will need to create an entirely new workspace and migrate your configurations, as the `CreateWorkspace` process locks the region and timezone permanently `[Alibaba Cloud DataWorks Workspace Lifecycle]`.

**Q: Why does my SQL component fail when I invoke it without providing a value?**
A: `String`-type input parameters without default values must be manually specified each time the component is invoked. If you omit them during execution, the scheduler will reject the task. Define defaults in `Component Management` or pass values explicitly in your `DAG` configuration `[Alibaba Cloud DataWorks Component Development Guide]`.


## Frequently asked questions

### How do I build and schedule data pipelines or ETL jobs?

You build and schedule data pipelines by using the Workflow Management module to design, orchestrate, and execute automated or manual workflows. You can manage task dependencies, develop reusable SQL components, and trigger executions directly from the console.

### How do I implement data security and quality monitoring?

You implement data security and quality monitoring by navigating to the Data Governance module to configure masking rules, track data accuracy, and enforce compliance. You can use the provided UI instructions to define and apply these governance rules across your pipelines.

### How do I expose data as an API or publish a data endpoint?

You expose data as an API by registering, testing, and publishing your queries through the API Management module. This workflow allows you to convert SQL queries into REST endpoints while handling versioning and external integrations via the console.

### How do I set up a workspace or initialize a project environment?

You set up a workspace and initialize a project environment by binding compute engines, configuring resources, and preparing the platform for team collaboration. You must navigate to the correct workspace and ensure proper RAM roles are assigned before making configuration changes.

### How do I configure data development environments or manage SQL components?

You configure data development environments by accessing DataStudio to edit scripts, debug syntax, and manage reusable SQL components. You can utilize templates, snippets, and library modules to streamline your coding and testing workflows.

## Cross-product integrations

- [AI-Powered Contact Center Intelligence Platform](https://company-skill.com/p/_combos/ai-powered-contact-center-intelligence-platform-cbbc60.md) (eb + es + ess + rds + opensearch)
- [Bailian RAG + ES Chatbot with EventBridge Alerts](https://company-skill.com/p/_combos/bailian-rag-es-chatbot-with-eventbridge-alerts-873dcb.md) (eb + es + bailian + ess + rds)
- [Batch Pipeline with Closed-Loop Multi-Channel Notification](https://company-skill.com/p/_combos/batch-pipeline-with-closed-loop-multi-channel-no-ec172e.md) (eb + resend + alinux + ecs + supabase)
- [Bidirectional DingTalk-Lark ECS Provisioning Loop](https://company-skill.com/p/_combos/bidirectional-dingtalk-lark-ecs-provisioning-loo-9c34b9.md) (eb + ecs + alinux + supabase + ess)
- [CDP with External Service Alerting](https://company-skill.com/p/_combos/cdp-with-external-service-alerting-8fd368.md) (eb + es + ess + rds + opensearch)
- [CI/CD Terraform Full-Stack with Security Hardening](https://company-skill.com/p/_combos/ci-cd-terraform-full-stack-with-security-hardeni-a12154.md) (ecs + terraform + alinux + oss + rds)
- [Closed-Loop Infrastructure Alert with Delivery Verification](https://company-skill.com/p/_combos/closed-loop-infrastructure-alert-with-delivery-v-50872d.md) (resend + twilio + eb + ess + rds)
- [Closed-Loop Multi-Channel Notification Pipeline](https://company-skill.com/p/_combos/closed-loop-multi-channel-notification-pipeline-b18bac.md) (eb + twilio + ecs + rds + resend)

## Use with an AI agent

```bash
curl -s https://company-skill.com/api/route \
  -H 'Content-Type: application/json' \
  -d '{"query": "...", "product": "dataworks"}'
```
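
The same request can be prepared from Python with the standard library. This is a sketch that only builds the request object: the response format of the route endpoint is not described here, so parsing is left out, and `build_route_request` is a hypothetical helper name.

```python
import json
import urllib.request

ROUTE_URL = "https://company-skill.com/api/route"


def build_route_request(query: str, product: str = "dataworks") -> urllib.request.Request:
    """Build a POST request equivalent to the curl command above."""
    body = json.dumps({"query": query, "product": product}).encode("utf-8")
    return urllib.request.Request(
        ROUTE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(...)` and read the response body.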

MCP server: https://company-skill.com/api/mcp/dataworks.py

---
Machine-readable: https://company-skill.com/llms.txt · https://company-skill.com/sitemap.xml
