---
Title: Elasticsearch
URL Source: https://company-skill.com/p/es
Language: en
Last-Modified: 2026-06-14T06:19:05.199824+00:00
Description: Elasticsearch is a distributed search and analytics engine capable of handling diverse workloads including full-text search, vector search, AI-powered retrieval, document ingestion, model deployment, 
---

# Elasticsearch

> Elasticsearch is a distributed search and analytics engine capable of handling diverse workloads including full-text search, vector search, AI-powered retrieval, document ingestion, model deployment, and more. This skill routes users across multiple domain-specific capabilities:

## Featured GEO article

Elasticsearch is a scalable search and analytics platform that enables developers to ingest document data, deploy retrieval-augmented generation (RAG) AI applications, and optimize search relevance through programmatic APIs and console workflows. It provides integrated security controls, vector search capabilities, and A/B testing frameworks to manage enterprise-grade knowledge bases and conversational AI systems.

## Key facts
- API document ingestion supports up to 100 QPS per application and allows mixed ADD, UPDATE, and DELETE operations with staged commits.
- Console-based vector search ingestion is limited to 1000 documents per request and 10 requests per minute.
- C# SDK integration for document pushes is constrained to approximately 10 requests per second.
- RAG API deployments support deepseek-r1 with enable_search for live web results and ops-qwen-turbo via compatible-mode/v1.
- Security authentication via STS is available in cn-hangzhou, cn-shanghai, and cn-beijing regions, while AccessKey pairs are free of charge.
- Zero-code RAG deployment requires adding IP 47.100.254.67 to the Elasticsearch instance whitelist and is available in China (Shanghai) and Germany (Frankfurt).

## How to deploy a retrieval-augmented generation (RAG) AI application
You can deploy a RAG system by selecting a zero-code console wizard, a fully programmatic API pipeline, or an embedding-only integration depending on your technical requirements and deployment region.
1. Choose your deployment path: use the AI Search Open Platform console for a zero-code DingTalk or Feishu chatbot, the API RAG path for full control over retrieval and generation, or the Embedding API if you only need vector generation.
2. For console deployments, activate the service, create a Knowledge Base, configure the Document Splitting Service, and build the RAG Pipeline for LLM-based conversational search.
3. If using the API route, authenticate with a Bearer token and call endpoints to generate text, split documents, analyze queries, and perform web searches using supported models.
4. Ensure your network configuration allows the required IP 47.100.254.67 if deploying via the console in supported regions.

## How to ingest and manage document data
You can push, update, or delete documents at scale by routing requests through the REST API, a C# SDK, or the console data management interface.
1. Select the ingestion method that matches your throughput needs: the API for up to 100 QPS, the C# SDK for .NET environments, or the console for manual testing of fewer than 1000 documents.
2. Configure authentication using an AccessKey pair stored in environment variables or an SDK Bearer token.
3. Structure your payloads to include ADD, UPDATE, or DELETE operations, and use explicit commit calls when staged visibility control is required.
4. Monitor request rates to stay within platform limits, ensuring console uploads do not exceed 10 requests per minute and SDK calls remain near 10 requests per second.

## How to manage access control and security settings
You secure your Elasticsearch instance by implementing API keys, temporary STS tokens, or RAM policies that enforce least-privilege access across development and production environments.
1. Determine your credential strategy: use STS temporary credentials for cloud-hosted applications on ECS or Function Compute, long-term AccessKeys for local debugging, or RAM policies for enterprise role-based access.
2. For programmatic access, attach a Bearer Token to the Authorization header or configure environment variables with your AccessKey ID and secret.
3. In enterprise setups, navigate to the console to create a RAM user, enable programmatic access, and assign policies such as AliyunOpenSearchFullAccess to specific roles.
4. Validate that API keys are explicitly linked to RAM policies granting only the necessary actions, such as opensearch:Search, before deploying to production.

## How to optimize search result relevance
You improve ranking quality by configuring rerankers, intervention dictionaries, and fine-sort expressions through the search relevance API or console text analysis tools.
1. Access the relevance optimization interface to define custom analyzers, manage synonym or named entity recognition dictionaries, and adjust query processors.
2. Apply reranking logic to reorder initial search results based on domain-specific scoring models or fine-sort parameters.
3. Integrate intervention dictionaries to manually boost, block, or pin specific documents for targeted queries.
4. Test ranking adjustments against baseline queries to verify improved precision and recall before applying changes globally.

## How to run A/B tests for search algorithms
You evaluate different ranking strategies by creating controlled experiments that split traffic across multiple algorithm configurations and measure performance metrics.
1. Initialize an A/B testing experiment through the API or console, defining distinct groups and assigning specific search scenes to each variant.
2. Configure the traffic distribution rules to route a percentage of user queries to each ranking strategy or relevance model.
3. Monitor experiment metrics such as click-through rates, conversion signals, and relevance scores to identify the superior configuration.
4. Promote the winning algorithm to production and archive the test groups once statistical significance is achieved.

## Frequently Asked Questions
**Q: how do I deploy a retrieval-augmented generation (rag) ai application**
A: Select a deployment path based on your technical needs: use the console wizard for zero-code DingTalk/Feishu bots, the API RAG route for full control over retrieval and generation, or the Embedding API if you only require vector generation.

**Q: what's the best way to deploy rag app**
A: The API RAG path is best for maximum flexibility and production control, while the console wizard is optimal for non-technical users in China (Shanghai) who need rapid, zero-code deployment.

**Q: how do I ingest and manage document data in**
A: Route document operations through the REST API for high-throughput production workloads, the C# SDK for .NET integrations, or the console interface for small-scale manual testing.

**Q: what's the best way to ingest documents**
A: Use the programmatic API path, as it is the only method that supports production-scale automation, transaction-aware ingestion, and precise control over when changes become searchable.

**Q: how do I manage access control and security settings**
A: Implement STS temporary credentials for cloud-hosted applications, long-term AccessKeys for local development, or RAM policies to enforce role-based access control across teams.

**Q: what's the best way to manage access**
A: Start with RAM policies, as they provide the strongest security foundation for production environments and allow you to assign minimum required permissions following the least privilege principle.

**Q: how do I optimize search result relevance**
A: Configure rerankers, intervention dictionaries, and fine-sort expressions through the search relevance API or console to adjust query processors and manually influence ranking outcomes.

**Q: what's the best way to optimize search relevance**
A: Combine programmatic fine-sort parameter adjustments with intervention dictionaries for targeted queries, then validate improvements against baseline metrics before global deployment.

**Q: how do I run a/b tests for search algorithms**
A: Create controlled experiments that define distinct groups and scenes, split query traffic across ranking variants, and measure performance to determine the most effective strategy.

**Q: what's the best way to run ab test for search**
A: Use the programmatic API to create and manage experiments, groups, and scenes, then promote the statistically superior configuration to production once testing is complete.

## Key terms
Retrieval-Augmented Generation (RAG) is an AI architecture that combines document retrieval with large language model synthesis to answer questions using private knowledge bases.
STS (Security Token Service) is an authentication mechanism that issues temporary credentials for cloud-hosted applications requiring short-lived, scoped access.
RAM (Resource Access Management) is an identity and permission framework that enables role-based access control and policy assignments for enterprise environments.
Fine-sort expressions are configurable ranking parameters that adjust document scoring after initial retrieval to improve result precision.
Intervention dictionaries are manual override lists that boost, block, or pin specific documents for targeted search queries.
Vector search is a similarity-based retrieval method that uses dense or sparse embeddings to find semantically related documents.

## Sources
The authoritative source for all technical specifications, endpoints, limits, and implementation guidance is the official Elasticsearch product documentation.

Elasticsearch is available as agent-callable skills via DaaS. Route any question to the best skill with `POST https://company-skill.com/api/route` `{"query": "...", "product": "es"}`.

## What you can do

### [Deploy application](https://company-skill.com/p/es/es-deploy-application.md)

## What You Want to Do

You want to build a RAG system that answers questions using your private documents—either as a standalone service, an enterprise chatbot (e.g., in DingTalk), or as part of a larger AI pipeline. This involves document ingestion, text splitting, vector/embedding generation, retrieval, and LLM-based answer synthesis.

**Typical User Questions**:
- How do I build a RAG chatbot with my documents?
- Can I create an enterprise chatbot in DingTalk?

## Decision Tree

Pick the best path for your situation:

- **If** you want to deploy a zero-code enterprise chatbot in DingTalk/Feishu using a UI wizard in **China (Shanghai)** → Use (go to *es/es-text-generation*)
- **If** you already have a document processing pipeline and only need to generate vectors via **Create Text Embedding**, **Create Sparse Embedding**, or **Create Multimodal Embedding** → Use Embedding API (go to *es/es-text-embedding*)
- **If** you need full control over retrieval, re-ranking, and generation—including real-time web search via **deepseek-r1** with **enable_search** or OpenAI-compatible calls to **ops-qwen-turbo** → Use API RAG (go to *es/es-text-generation*)
- **Otherwise (default)** → Start with **** if you’re non-technical and in **China (Shanghai)**; otherwise, use ** API RAG ** for maximum flexibility.

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| RAG | low | No | No | Requires adding IP 47.100.254.67 to Elasticsearch instance whitelist | `es/guide/es-text-generation` |
| API RAG | high | Yes | Yes | Supports **deepseek-r1** with **enable_search** and **ops-qwen-turbo** via **compatible-mode/v1** | `es/api/es-text-generation` |
| Embedding API | medium | Yes | Yes | Supports **ops-text-embedding-001**, **ops-text-sparse-embedding-001**, and **ops-m2-encoder** with **dimension** control | `es/api/es-text-embedding` |

## Path Details

### Path 1: Console / Dashboard
**Best For**: RAG 

**Brief Description**: This path uses the **AI Search Open Platform** console to activate services, create a **Knowledge Base**, configure the **Document Splitting Service**, and **Create RAG Pipeline** for **LLM-Based Conversational Search**. It also supports deploying enterprise chatbots to DingTalk or Feishu and includes built-in evaluation tasks.

**Key technical facts**:
- Billing: 
- Regions: China (Shanghai), Germany (Frankfurt)

- RAG Faithfulness Context Recall

- cn-hangzhou

- IP47.100.254.67 Elasticsearch 

- 7 20 MB

### Path 2: API RAG 

**Brief Description**: This path uses the **Elasticsearch AI and RAG API** to call **Generate Text**, **Document Split**, **Analyze Query**, and **Perform Web Search**. It supports models like **qwen3-235b-a22b**, **ops-qwen-turbo**, and **deepseek-r1**, with **enable_search** for live web results. Authentication uses **Authorization: Bearer**, and **compatible-mode/v1** enables OpenAI SDK compatibility.

**Key technical facts**:
- Billing: 
- Regions: cn-hangzhou, cn-shanghai, cn-beijing
- Auth method: Authorization: Bearer

**When to Use**:
- deepseek-r1 
- OpenAI SDK ops-qwen-turbo 

- temperaturetop_pmax_tokens

### Path 3: Embedding API 

**Brief Description**: This path uses the **Elasticsearch Embedding API** to call **Create Text Embedding**, **Create Sparse Embedding**, or **Create Multimodal Embedding** with models like **ops-text-embedding-001**, **ops-text-sparse-embedding-001**, and **ops-m2-encoder**. It supports **input_type** and **dimension** parameters and uses **Authorization: Bearer** for auth.

**Key technical facts**:
- Billing: 
- Regions: cn-hangzhou, cn-shanghai, cn-beijing
- Auth method: Authorization: Bearer

**When to Use**:
- OpenAI API

## FAQ

Q: Which path should I start with?
A: If you’re a non-technical user in **China (Shanghai)** and want a DingTalk bot, start with the console. Otherwise, if you’re an engineer needing integration or live web search, use the **Generate Text** API path.

Q: What if I need to process files larger than 20 MB but used the console path?
A: You’ll hit the 20 MB file upload limit in the **AI Search Open Platform** experience center, and files are auto-deleted after 7 days—making it unsuitable for production document pipelines.

Q: What if I chose the Embedding API path but actually need a full RAG chatbot?
A: You’ll get vector generation (**Create Text Embedding**) but miss **LLM-Based Conversational Search**, **Document Splitting Service**, and chatbot deployment—you’d still need to build retrieval and generation yourself.

Q: Can I use **deepseek-r1** with **enable_search** in the console path?
A: No—**enable_search** and **deepseek-r1** are only available via the **Generate Text** API in the custom RAG flow. The console uses fixed models without live web search.

Q: Does the Embedding API support controlling output vector size?
A: Yes—you can use the **dimension** parameter with **ops-text-embedding-001** to customize vector length, which is useful for indexing efficiency.

Q: What happens if I’m in **cn-hangzhou** and try to use the console path?
A: The console path only works directly in **China (Shanghai)** and Frankfurt—you’d need VPC peering or proxy setups, making the API paths (which support **cn-hangzhou**) a better choice.

Q: Can I automate the console-based RAG pipeline in CI/CD?
A: No—the console path is **non-automation friendly** and relies on manual UI steps like **Create RAG Pipeline**, so it cannot be scripted or integrated into deployment pipelines.

### [Ingest documents](https://company-skill.com/p/es/es-ingest-documents.md)

## What You Want to Do

You need to import, update, or delete structured or vector-based documents in an Elasticsearch (OpenSearch) index, possibly at scale, with or without code, and potentially with transaction-like control over when changes become visible.

- How do I push documents to an OpenSearch index?
- Can I stage changes before committing?

## Decision Tree

Pick the best path for your situation:

- **If** you require high-throughput writes (up to **100 QPS**) or need to mix **ADD**, **UPDATE**, and **DELETE** operations with precise control → Use API (go to *es/es-document*)
- **If** you are developing in **C#** and want a working **C# SDK** example with guidance from the **console** under **Apps > Tables > Push Documents** → Use SDK C# (go to *es/es-document*)
- **If** you are using **Vector Search Edition** and only need to manually insert **<1000 documents** via **Data Management > Insert Data** in the **console** for testing → Use (go to *es/es-vector-search*)
- **Otherwise (default)** → Start with ** API ** — it’s the only path that supports production-scale, automated, and transaction-aware ingestion.

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| API | high | Yes | Yes | Supports up to 100 QPS per application | `es/api/es-document` |
| SDK C# | SDK | medium | Yes | Yes | Limited to ~10 requests/second via C# SDK | `es/guide/es-document` |
| Console / Dashboard | low | No | No | Max 1000 docs per request, 10 requests/minute | `es/guide/es-vector-search` |

## Path Details

### Path 1: API 

**Brief Description**: This approach uses Elasticsearch/OpenSearch REST APIs or SDKs to perform **push**, **add**, **update**, or **delete** operations programmatically. It supports batching and staged commits (e.g., **add** followed by explicit **commit**), giving fine-grained control over when changes become searchable.

**Key technical facts**:
- Billing: API 
- Auth method: AccessKey ID Secret or SDK Authorization: Bearer $DASHSCOPE_API_KEY
- Regions available: cn-hangzhou, cn-shanghai, ap-southeast-1
- Prerequisites: OpenSearch , RAM AliyunServiceRoleForOpenSearch , ALIBABA_CLOUD_ACCESS_KEY_ID ALIBABA_CLOUD_ACCESS_KEY_SECRET 

**When to Use**:
- 100 QPS

### Path 2: SDK C# 

**Best For**: C#SDK 

**Brief Description**: Navigate via the **console** to **OpenSearch > Apps > [App Name] > Tables > Push Documents** to access context and a **C# SDK** integration guide. You must configure **AccessKey** credentials via **environment variables** and use the SDK to send document payloads.

**Key technical facts**:
- Billing: 
- Auth method: AccessKey ID Secret 
- Prerequisites: OpenSearch , C# , ALIBABA_CLOUD_ACCESS_KEY_ID ALIBABA_CLOUD_ACCESS_KEY_SECRET , RAM 

**When to Use**:
- C#SDK 
- C# 

**When NOT to Use**:
- C# 
- >10 QPS

### Path 3: Console / Dashboard
**Brief Description**: In **Vector Search Edition**, use the **console** to go to **Data Management**, then click **Insert Data**. You can enter values in **Table Fields** mode or switch to **Developer Mode** to paste raw JSON with explicit **Document ID**. Click **Submit** to insert.

**Key technical facts**:
- Billing: 
- Auth method: SSO 
- Prerequisites: OpenSearch Vector Search , 

**When NOT to Use**:
- >10 /

## FAQ

Q: Which path should I start with?
A: If you're building a production system or need automation, start with ** API **. If you're just testing vector search with a few samples, use the **console Data Management** path.

Q: What if I need to update existing documents frequently but chose the console path?
A: You’ll hit a major limitation: the **Vector Search Edition console does not support direct UPDATE**—you must **delete** the old document and **re-insert** it with the same **Document ID**, which is error-prone and inefficient at scale.

Q: What if I’m using Python (not C#) but followed the C# SDK guide?
A: You’ll find the **C# SDK** examples unusable, and the **Push Documents** link in the **console** won’t provide Python code. You’d be better off using the generic **API path** with a Python OpenSearch client.

Q: Can I use the console path for production data ingestion?
A: No—you’re capped at **10 Insert Data requests per minute**, and there’s **no automation or rollback**. Attempting this in production will cause severe throughput bottlenecks.

Q: Does the API path support all regions?
A: It’s confirmed available in **cn-hangzhou, cn-shanghai, and ap-southeast-1**. Other regions may vary—check your instance’s region compatibility.

Q: What happens if I exceed the 100 QPS limit in the API path?
A: Requests will be throttled. Design your ingestion pipeline with retries and backoff, or shard across multiple applications if needed.

Q: Do I need to set environment variables for all paths?
A: Only the **API** and **C# SDK** paths require **ALIBABA_CLOUD_ACCESS_KEY_ID** and **ALIBABA_CLOUD_ACCESS_KEY_SECRET** as **environment variables**. The **console** path uses SSO login and needs no credential setup.

### [Manage access](https://company-skill.com/p/es/es-manage-access.md)

## What You Want to Do

You want to securely authenticate and authorize access to your Alibaba Cloud Elasticsearch (OpenSearch) service, whether through code, CLI, or console, using appropriate credential types and permission models.

- How do I secure API calls to OpenSearch?

- Can I manage credentials via the console?
- RAM Elasticsearch 

## Decision Tree

Pick the best path for your situation:

- **If** your application runs on Alibaba Cloud ECS or Function Compute and requires **temporary credentials** via STS → Use API STS (go to *es/es-security*)
- **If** you are a solo developer needing a **long-term AccessKey** for local debugging or simple scripts → Use AccessKey (go to *es/es-security*)
- **If** you operate in an enterprise environment and must assign **minimum required permissions** to different team roles following the **least privilege principle** → Use RAM (go to *es/es-instance*)
- **Otherwise (default)** → Start with ** RAM **, as it provides the strongest security foundation for production environments and supports RBAC via **RAM policy** assignments.

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| API STS | medium | Yes | Yes | Auth uses Bearer Token or **temporary credentials**; available in cn-hangzhou, cn-shanghai, cn-beijing | `es/api/es-security` |
| AccessKey | low | No | No | Uses **AccessKey** pair stored in **environment variables**; free of charge | `es/guide/es-security` |
| RAM | medium | No | No | Requires **Create User**, **Programmatic Access**, and **Add Permissions** with policies like **AliyunOpenSearchFullAccess** | `es/guide/es-instance` |

## Path Details

### Path 1: API STS 

**Brief Description**: This approach uses the Elasticsearch Security API with **Bearer Token** in the Authorization header or integrates **temporary credentials** from Alibaba Cloud STS. It requires a **RAM user** with appropriate **RAM policy** permissions (e.g., `opensearch:Search`) and, for Java, the `open-search-sdk>=1.0.0`.

**Key technical facts**:
- Billing: API calls are billed per request; free tier resets monthly.
- Regions available: cn-hangzhou, cn-shanghai, cn-beijing

**Known Limitations**:
- Requires code implementation for authentication handling
- STS tokens are temporary and require AssumeRole operation to refresh
- API keys must be associated with RAM policies granting specific Elasticsearch actions (e.g., opensearch:Search)
- No console-based management of credentials in this path

### Path 2: AccessKey 

**Brief Description**: This method guides you through the Alibaba Cloud **console** to navigate to **Users**, select a **RAM user**, and use **Create AccessKey** to generate a key pair. The secret is stored in **environment variables** (e.g., `ALIBABA_CLOUD_ACCESS_KEY_ID`) for local use.

**Key technical facts**:
- Billing: Creating AccessKeys and using RAM users is free of charge.

### Path 3: RAM 

**Brief Description**: Using the Alibaba Cloud **console**, you perform **Create User**, set **User Name** and **Access Type** to **Programmatic Access**, then use **Add Permissions** to attach policies such as **AliyunOpenSearchFullAccess** or custom policies defining **minimum required permissions**.

**Key technical facts**:
- Billing: RAM user management and AccessKey creation are free of charge.

- 1,000RAM

## FAQ

Q: Which path should I start with?
A: If you're in a team or production setting, start with ** RAM ** to enforce the **least privilege principle**. Solo developers testing locally can begin with ** AccessKey **.

Q: What if I need to run my app on Function Compute but used the AccessKey console method?
A: You’ll be forced to embed long-term secrets in your function code or config, violating security best practices. Instead, use **temporary credentials** via the API/STS path, which integrates natively with Function Compute’s execution role.

Q: What if I’m in an enterprise team but chose the simple AccessKey method?
A: You’ll lack role separation—everyone shares the same **RAM user**’s full permissions, making audit and least-privilege enforcement impossible. You’ll also hit the 2-**AccessKey** limit per **RAM user** quickly.

Q: Can I use AliyunOpenSearchFullAccess for all users?
A: While **AliyunOpenSearchFullAccess** simplifies setup, it violates the **minimum required permissions** principle. For production, define custom **RAM policy** documents granting only necessary actions like `opensearch:Search` or `opensearch:Write`.

Q: Do all paths support the same regions?
A: Only the API/STS path explicitly lists supported regions (**cn-hangzhou, cn-shanghai, cn-beijing**). Console-based paths (AccessKey and RAM user creation) are global Alibaba Cloud features and work in all regions—but always verify OpenSearch instance availability separately.

Q: Is there a way to automate RAM user creation with permissions?
A: Not via these console paths—they are manual. For automation, combine the **API/STS path** with Infrastructure-as-Code (e.g., Terraform) to provision **RAM user** and **RAM policy** resources programmatically.

### [Optimize results](https://company-skill.com/p/es/es-optimize-results.md)

## What You Want to Do

You want to improve how well your Elasticsearch (or OpenSearch) search results match user intent—using techniques like neural reranking, synonym expansion, spelling correction, or custom ranking models.

**Typical User Questions**:
- How can I improve the relevance of my Elasticsearch search results?
- Can I use neural reranking in OpenSearch?
- How do I configure an intervention dictionary to correct search results?
- What's the best way to fine-tune search ranking?
- How can I use A/B testing to validate relevance improvements?
- Is there a console UI for relevance tuning?

## Decision Tree

Pick the best path for your situation:

- **If** you need to embed relevance tuning into CI/CD pipelines, automate at scale, or programmatically manage models/dictionaries using REST APIs → Use ** API ** (go to `skills/es/api/es-search-relevance`)
- **If** your primary goal is to configure synonym dictionaries, spelling correction, or custom analyzers via UI without neural reranking → Use **/** (go to `skills/es/guide/es-text_analysis`)
- **If** you want a unified console experience to configure both neural reranking (via AI Search Open Platform), NL2SQL, and intervention dictionaries visually → Use **** (go to `skills/es/guide/es-search-relevance`)
- **Otherwise (default)** → Start with **** if you’re using OpenSearch Advanced Edition with HA3 engine and want rapid visual iteration; otherwise, verify your instance type first.

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| API | CI/CD | high | Yes | Yes | Requires AI Search Open Platform; QPS limit = 20 | `es/api/es-search-relevance` |
| Console / Dashboard | low | No | No | Supports NL2SQL Service Configuration and Tailored Retrieval Models (HA3 only) | `es/guide/es-search-relevance` |
| / | medium | No | No | Max 20 dictionaries (spelling/synonyms); requires reindexing for Analyzer changes | `es/guide/es-text_analysis` |

## Path Details

### Path 1: API 

**Best For**: CI/CD 

**Brief Description**: This approach uses REST APIs from the **AI Search Open Platform** to programmatically manage neural reranking (with models like `ops-bge-reranker-larger` or `ops-qwen3-reranker-0.6b`), create intervention dictionaries, configure fine sort logic, and run A/B tests. Key components include **Manage Query Processors**, **Create Intervention Dictionary**, and **Create AB Test Group**.

**Key technical facts**:
- Billing: API0.002 CNY/1k tokens0.0001 CNY/call
- Cold start: 19latency
- Auth method: Bearer Token（Authorization: Bearer <your-api-key>）
- Regions available: cn-hangzhou, cn-shanghai, cn-beijing

**When to Use**:
- Need CI/CD integration or large-scale automated tuning
- Require precise control over neural reranker model selection (e.g., BGE vs Qwen3)
- Must implement A/B testing experiment management
- Need programmatic dictionary or query processor management

**When NOT to Use**:
- Prefer no-code, quick experimentation
- Only need basic synonyms/spelling correction (no neural reranking)
- Lack API key or cannot configure Bearer Token auth
- Anticipate >20 QPS (shared across all RAM users)

**Known Limitations**:
- Only supported on **AI Search Open Platform** or OpenSearch High-Performance Edition (not standard Elasticsearch)
- QPS capped at 20 for all users under an Alibaba Cloud account
- Request body ≤ 8 MB
- BGE models limited to 512 tokens (query+doc); Qwen3 supports 32k tokens
- Multi-modal reranking max 100 docs per request

### Path 2: Console / Dashboard
**Brief Description**: This path uses the OpenSearch console’s **Search Algorithm Center** to visually configure **AI Search Open Platform** integrations, manage **Dictionary Management**, define **Tailored Retrieval Models**, set up **NL2SQL Service Configuration**, and build **Query Analysis Rule Management**—all without writing code.

**Key technical facts**:
- Billing: NL2SQL0.0001 CNY/
- Auth method: SSO
- Prerequisites: OpenSearch Advanced Edition instance with dedicated environment and HA3 engine

**When to Use**:
- Want fast, visual iteration without coding
- Need to configure **NL2SQL Service Configuration** (natural language to SQL)
- Require **Tailored Retrieval Models** trained on domain data
- Prefer centralized UI in **Search Algorithm Center**

**When NOT to Use**:
- Require programmatic control or CI/CD integration
- Using standard Elasticsearch engine (not HA3 Advanced Edition)
- Need immediate model updates (training takes 1–2 days)
- Require >5 custom retrieval models per instance

**Known Limitations**:
- Actual reranking still requires API/SDK calls; console only configures API keys
- **Tailored Retrieval Models** only work with HA3 engine Advanced Edition apps
- Max 5 custom models per instance
- Model retraining needed for any change (1–2 day delay)
- Console does not support direct A/B testing configuration

### Path 3: /

**Brief Description**: This approach focuses on text preprocessing via **Dictionary Management** (for synonyms, spelling correction, NER, stop words), **Analyzer Management** (custom tokenizers for e-commerce/IT domains), and **Query Analysis Rule Management**—all managed through the OpenSearch console.

**Key technical facts**:
- Billing: 0.0001 CNY/
- Auth method: SSO

**When to Use**:
- Focus on query understanding (spelling correction, synonym expansion)
- Need to manage multiple dictionary types (NER, stop words, synonyms)
- Require industry-specific **Analyzer Management** (e.g.,)
- Want visual testing of text analysis effects

**When NOT to Use**:
- Need neural reranking or AI-based relevance models
- Require bulk programmatic dictionary updates
- Need >20 intervention dictionaries
- Cannot tolerate reindexing delays (required for analyzer changes)

**Known Limitations**:
- Max 10 category prediction dictionaries; max 20 for spelling/synonyms/stop words
- Each dictionary supports 500–1000 entries (type-dependent)
- Dictionary type cannot be changed after creation
- **Query Analysis Rule Management** must be set as default in Index Orientation to take effect
- **Analyzer Management** changes require full reindexing

## FAQ

Q: Which path should I start with?
A: If you’re using OpenSearch Advanced Edition with HA3 engine and want quick visual tuning, start with ****. If you only need synonyms/spelling correction, use ****. For automation or A/B testing, choose the API path—but confirm you have **AI Search Open Platform** access first.

Q: What if I need neural reranking but chose /?
A: You’ll hit a hard limitation: that path only supports text analysis (synonyms, spelling) and **does not support neural reranking, BGE models, or Qwen3 rerankers**—you’ll have to switch paths later.

Q: What if I’m using a standard Elasticsearch instance but chose ?
A: You’ll find **Tailored Retrieval Models** and **NL2SQL Service Configuration** are grayed out or unavailable—these features require OpenSearch Advanced Edition with HA3 engine, not standard Elasticsearch.

Q: Can I use A/B testing with the console paths?
A: No. Only the ** API ** path supports A/B testing via the **Create AB Test Group** API. Console paths lack built-in experiment management.

Q: Do I need to reindex when updating synonyms in the console?
A: For basic dictionary updates (e.g., adding a synonym), no reindex is needed. But if you modify **Analyzer Management** (e.g., change tokenizer rules), you **must reindex all data**—this applies only to the text analysis path.

Q: What’s the token limit if I use Qwen3 reranker via API?
A: The Qwen3 model supports up to **32k tokens** for the combined query and document text—much higher than BGE’s 512-token limit. This is only available in the API path.

Q: Can I exceed 20 intervention dictionaries if I really need to?
A: No. The **** path enforces hard limits: max 20 for spelling/synonyms/stop words. If you need more, you’d have to consolidate entries or consider programmatic management (though even the API path doesn’t relax this underlying platform limit).

### [Run search](https://company-skill.com/p/es/es-run-search.md)

## What You Want to Do

You want to evaluate two or more search ranking strategies by splitting live traffic and measuring performance differences. This involves defining **A/B test scenes** with distinct **values** (e.g., different ranking parameters) and managing their **status** (active/inactive).

**Typical User Questions**:
- How do I compare two relevance configurations?
- Can I run A/B tests on my Elasticsearch ranking models?

## Decision Tree

Pick the best path for your situation:

- **If** you are using **standard Elasticsearch instances** (not OpenSearch ) → Use API A/B (go to *es/es-ab-test*)
- **If** you need to combine A/B testing with **neural rerankers** like `ops-bge-reranker-larger`, `ops-qwen3-reranker-0.6b`, or `ops-mm-reranker-001` → Use Relevance API A/B (go to *es/es-search-relevance*)
- **If** your experiment requires **query processors**, **intervention dictionary**, or **fine sort** logic tightly coupled with variant definition → Use Relevance API A/B (go to *es/es-search-relevance*)
- **Otherwise (default)** → Use API A/B — it’s simpler and sufficient for basic scene management without advanced relevance features.

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| API A/B | medium | Yes | Yes | Uses endpoint `api/v1/abtest/scenes` with Bearer Token auth | `es/api/es-ab-test` |
| Relevance API A/B | high | Yes | Yes | Requires OpenSearch ; supports `AB test group` and `AB test scene` with neural rerankers | `es/api/es-search-relevance` |

## Path Details

### Path 1: API A/B 

**Brief Description**: This approach uses the Elasticsearch A/B Testing API, a synchronous HTTP service that lets you create and manage **A/B test scene** definitions via POST requests to `api/v1/abtest/scenes`. Each scene includes **values** (the experimental variants) and a **status** field to control activation. Authentication uses **Bearer Token** with the **DASHSCOPE_API_KEY** environment variable.

**Key technical facts**:
- Billing: per-request model
- Regions: cn-hangzhou, cn-shanghai, cn-beijing
- Auth: Bearer Token authentication

**When to Use**:
- Need to programmatically create multiple **A/B test scene** entries via REST
- Integrating with an existing internal experimentation platform
- Only require basic variant definition without neural reranking or query processing

**When NOT to Use**:
- Need to define test variants **during** document reranking (use Relevance API instead)
- Require end-to-end relevance experiments involving **fine sort** or **intervention dictionary**

**Known Limitations**:
- Cannot update an existing **A/B test scene** — must delete and recreate
- No documented limit on number of **values** per scene; system capacity may impose implicit bounds
- Only supports synchronous request-response; no async or streaming

### Path 2: Relevance API A/B 

**Brief Description**: The Elasticsearch Relevance Optimization API enables full lifecycle management of **AB test group** and **AB test scene** entities while supporting advanced features like neural reranking with models such as `ops-bge-reranker-larger`, `ops-qwen3-reranker-0.6b`, and `ops-mm-reranker-001`. It also integrates **query processors**, **intervention dictionary**, and **fine sort** configuration into the experiment workflow. Requests are sent to `opensearch.{region}.aliyuncs.com` using **Bearer Token** auth with **DASHSCOPE_API_KEY**.

**Key technical facts**:
- Billing: Per-request billing for all operations
- Regions: cn-hangzhou, cn-shanghai, cn-beijing
- Auth: Bearer Token authentication
- Prerequisites: AI Search Open Platform or OpenSearch ; alibabacloud_searchplat20240529 SDK

**When to Use**:
- Running A/B tests that include **neural rerankers** (`ops-bge-reranker-larger`, etc.)
- Building end-to-end relevance pipelines combining **query processors**, **intervention dictionary**, and **fine sort**
- Need to define variants contextually during the reranking phase

**When NOT to Use**:
- Using standard Elasticsearch instances (this path **only works on OpenSearch **)
- Only need to create static **A/B test scene** definitions without any reranking logic

**Known Limitations**:
- Not compatible with standard Elasticsearch — requires **OpenSearch **
- Rerank APIs have a shared 20 QPS rate limit; high-concurrency scenarios need retry logic
- Request body capped at 8 MB; multimodal reranking limited to 100 documents per call

## FAQ

Q: Which path should I start with?
A: Start with ** API A/B ** unless you’re using OpenSearch and need neural reranking, **fine sort**, or **query processors** — then use the Relevance API.

Q: What if I’m using a standard Elasticsearch cluster but chose the Relevance API path?
A: You’ll hit a compatibility error — the Relevance API **only works on AI Search Open Platform or OpenSearch **, not standard Elasticsearch instances.

Q: What if I need to test a `ops-mm-reranker-001` model but used the basic A/B test API?
A: You won’t be able to associate the reranker with your **A/B test scene** — the basic API doesn’t support embedding neural models or **intervention dictionary** logic into variants.

Q: Can I update an active A/B test scene after creation?
A: With the basic API, **no** — documentation implies you must delete and recreate the scene. The Relevance API may offer more flexibility, but check its detail skill.

Q: Are there payload size limits I should know about?
A: Yes — the Relevance API enforces an **8 MB request limit** and caps multimodal reranking at **100 documents**. The basic API has no documented size limit, but large payloads may time out.

Q: Do both paths use the same authentication?
A: Yes — both require **DASHSCOPE_API_KEY** and **Bearer Token** authentication against endpoints like `elasticsearch.{region}.aliyuncs.com` (basic) or `opensearch.{region}.aliyuncs.com` (Relevance).

Q: Can I define traffic splits (e.g., 90/10) in both paths?
A: The fact cards don’t specify traffic allocation mechanics — this detail is likely handled in the **values** parameter or downstream routing. Consult the respective detail skills for split configuration.


## Frequently asked questions

### Should I use the API or the console for managing my Elasticsearch instance?

Use the **console** for initial setup, visual configuration, and one-off tasks. Use the **API/SDK** for automation, integration into applications, or bulk operations.

### How do I get started with secure API access?

First, create an AccessKey in the console (`es-security` guide). Then initialize your SDK client with the key and secret. For enhanced security, use STS temporary tokens (`es-security` API).

### My search results aren’t relevant—where should I start?

Begin with the intent skill **“Optimize search result relevance”**, which routes you to relevance tuning via reranking, intervention dictionaries, or fine-sort expressions.

### I’m getting a 403 error when calling the API—what’s wrong?

This usually indicates missing or incorrect permissions. Check your RAM user policies and ensure your AccessKey has the required actions. See `es-troubleshooting` for detailed error diagnostics.

### Can I deploy a RAG chatbot without writing code?

Yes—the **AI and RAG guide** (`es-text-generation`) includes step-by-step instructions to build knowledge-base Q&A systems and deploy chatbots in DingTalk/Lark via the console.

### How do I deploy a retrieval-augmented generation (RAG) AI application?

You can deploy a retrieval-augmented generation (RAG) AI application by building knowledge-base Q&A systems or enterprise chatbots via the console or API. This process supports generating text with LLMs and optionally integrating live or web-augmented search.

### How do I ingest and manage document data in Elasticsearch?

You can ingest and manage document data by uploading, batch-pushing, staging, or committing documents directly into Elasticsearch indices. These operations are supported through both programmatic API calls and console-based UI guides.

### How do I manage access control and security settings?

You can manage access control and security settings by securing your instance with API keys, RAM policies, STS tokens, or OAuth authentication. Credentials can be generated and managed through the console or configured programmatically.

### How do I optimize search result relevance?

You can optimize search result relevance by improving ranking quality with rerankers, intervention dictionaries, or fine-sort expressions. These settings can be configured via the API or set up in the console using tailored retrieval models and query processors.

## Cross-product integrations

- [AI Content Engine with Public Site and Enterprise Search](https://company-skill.com/p/_combos/ai-content-engine-with-public-site-and-enterpris-9db7c8.md) (alinux + cloudflare + bailian + notion + vercel)
- [AI Content Platform on Managed Infrastructure](https://company-skill.com/p/_combos/ai-content-platform-on-managed-infrastructure-265158.md) (alinux + cloudflare + bailian + notion + vercel)
- [AI Content Platform with Search and Frontend](https://company-skill.com/p/_combos/ai-content-platform-with-search-and-frontend-d3ca31.md) (alinux + cloudflare + bailian + notion + vercel)
- [AI Content Platform with Site and Search](https://company-skill.com/p/_combos/ai-content-platform-with-site-and-search-7bf25b.md) (alinux + cloudflare + bailian + notion + vercel)
- [AI-Driven Search Knowledge Platform](https://company-skill.com/p/_combos/ai-driven-search-knowledge-platform-803ad0.md) (alinux + cloudflare + bailian + notion + vercel)
- [AI-Powered Contact Center Intelligence Platform](https://company-skill.com/p/_combos/ai-powered-contact-center-intelligence-platform-cbbc60.md) (eb + dataworks + ess + rds + opensearch)
- [AI Recommendation Platform with RAG Explanations](https://company-skill.com/p/_combos/ai-recommendation-platform-with-rag-explanations-8803cd.md) (airec + alinux + opensearch + bailian + pai)
- [AIRec with Custom Models and Semantic Search](https://company-skill.com/p/_combos/airec-with-custom-models-and-semantic-search-fe8869.md) (airec + alinux + opensearch + cloudflare + pai)

## Use with an AI agent

```bash
curl -s https://company-skill.com/api/route \
  -H 'Content-Type: application/json' \
  -d '{"query": "...", "product": "es"}'
```

MCP server: https://company-skill.com/api/mcp/es.py

---
Machine-readable: https://company-skill.com/llms.txt · https://company-skill.com/sitemap.xml