---
Title: Deploy model
URL Source: https://company-skill.com/p/opensearch/opensearch-deploy-model
Language: en
Description: You want to deploy a custom or pre-trained embedding model (for text, image, or multimodal inputs) in OpenSearch so it can generate vector embeddings via an API or managed service. This enables use…
---

# Deploy model

Part of **OpenSearch**. Route queries via `POST https://company-skill.com/api/route`.

## What You Want to Do

You want to deploy a custom or pre-trained embedding model (for text, image, or multimodal inputs) in OpenSearch so it can generate vector embeddings via an API or managed service. This enables use cases like semantic search, RAG pipelines, or Agentic Search.

**Typical User Questions**:
- How to deploy an embedding model in OpenSearch?
- Can I deploy my own embedding model?

## Decision Tree

Pick the best path for your situation:

- **If** you already have a trained model file and need to integrate deployment into CI/CD or automation scripts → Use API (go to *opensearch/opensearch-text*)
- **If** you prefer using a graphical interface to upload models, configure services like NL2SQL or Agentic Search, and test without coding → Use AI (go to *opensearch/opensearch-text*)
- **If** your target deployment region is cn-hangzhou, cn-shanghai, or cn-beijing and you require high-throughput batch embedding (up to 32 texts per request) → Use API
- **Otherwise (default)** → Use AI. It is the safer choice for first-time users who lack API keys or service deployment tokens and want immediate visual feedback via the Experience Center.
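The rules above are evaluated top to bottom, so the first match wins. A minimal sketch of that ordering follows; the function name `choose_path` and its parameter names are hypothetical and exist only to illustrate the tree, they are not part of any OpenSearch API.

```python
from typing import Optional

# Regions the decision tree names for high-throughput batch embedding.
API_BATCH_REGIONS = {"cn-hangzhou", "cn-shanghai", "cn-beijing"}


def choose_path(
    has_model_file: bool = False,
    needs_automation: bool = False,
    prefers_gui: bool = False,
    region: Optional[str] = None,
    needs_batch_embedding: bool = False,
) -> str:
    """Return "API" or "AI" by checking the decision-tree rules in order."""
    # Rule 1: trained model file plus CI/CD or scripted deployment.
    if has_model_file and needs_automation:
        return "API"
    # Rule 2: graphical, no-code workflow.
    if prefers_gui:
        return "AI"
    # Rule 3: listed region plus high-throughput batch embedding.
    if region in API_BATCH_REGIONS and needs_batch_embedding:
        return "API"
    # Default: the console path.
    return "AI"
```

For example, `choose_path(region="cn-beijing", needs_batch_embedding=True)` returns `"API"`, while calling it with no arguments falls through to the default `"AI"`.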

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|-------------|
| API | CI/CD pipelines and automation scripts | high | Yes | Yes | Text and multimodal embedding billed per input token; custom deployments billed per request | `opensearch/api/opensearch-text` |
| AI | No-code upload, configuration, and testing in the console | medium | No | No | Model customization billed per compute unit (CU); NL2SQL and Agentic Search billed per request or token | `opensearch/guide/opensearch-text` |

## Path Details

### Path 1: API

**Best For**: CI/CD pipelines and automation scripts that deploy an already trained model file.

**Brief Description**: The OpenSearch Embedding API provides a synchronous HTTP endpoint for generating dense or sparse vector embeddings from text, images, or multimodal inputs. It supports both prebuilt models (via the `text-embedding` and `multi-modal-embedding` endpoints) and custom deployments, which additionally require a service deployment token. Authentication uses a workspace API key sent as a Bearer token in the `Authorization` header.

**Key technical facts**:
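- Billing: per input token for text and multimodal embedding; per request for custom deployments
- Regions available: cn-hangzhou, cn-shanghai, cn-beijing
- Auth method: `Authorization: Bearer <your_workspace_api_key>` header, plus a `Token: <your_service_deployment_token>` header for custom deployments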

- SDK compatibility: OpenAI SDK
- QPS limit: `ops-text-embedding-001`, 50 QPS
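The two-header authentication scheme can be sketched as a small request builder. This is a minimal sketch, not an official client: the function name `build_embedding_request` and the `input` payload field are assumptions for illustration, and only the `Authorization` and `Token` header names come from this page. Nothing is sent over the network, so the result can be passed to whichever HTTP client you use.

```python
from typing import Dict, List, Optional


def build_embedding_request(
    texts: List[str],
    workspace_api_key: str,
    service_deployment_token: Optional[str] = None,
) -> Dict[str, dict]:
    """Assemble headers and a JSON body for an embedding call.

    The body shape ({"input": [...]}) is an assumption, not a documented schema.
    """
    headers = {
        # Workspace API key, sent as a Bearer token.
        "Authorization": f"Bearer {workspace_api_key}",
        "Content-Type": "application/json",
    }
    # Custom deployments additionally need the service deployment token.
    if service_deployment_token is not None:
        headers["Token"] = service_deployment_token
    return {"headers": headers, "json": {"input": texts}}
```

Leaving `service_deployment_token` unset models a call to a prebuilt model; passing it models a call to a custom deployment.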

### Path 2: AI

**Brief Description**: The OpenSearch Model and AI Services console offers a no-code experience through components like **Service Plaza**, **Deploy Service**, and **Create Model**. Users can activate AI Search Open Platform, create a **workspace**, upload models from MaxCompute or OSS, configure **NL2SQL** or **Agentic Search**, and test services instantly in the **Experience Center**—all without writing code.

**Key technical facts**:
- Billing: per compute unit (CU) for model customization; per request or token for NL2SQL and Agentic Search
- Regions available: China (Shanghai), Germany (Frankfurt)
- Auth method: SSO
- Prerequisites: RAM users need the Model Service-Service Deployment permission

- Testing: Experience Center
- Configurable services: NL2SQL, Agentic Search

## FAQ

Q: Which path should I start with?
A: Start with AI if you’re new to OpenSearch AI services, don’t have a `workspace API key`, or want to test models instantly in the **Experience Center**. Switch to the API path only if you need automation or are deploying in cn-hangzhou/cn-beijing.

Q: What if I need to deploy in cn-hangzhou but chose the console path?
A: You’ll hit a hard limitation: the console only supports **China (Shanghai)** and **Germany (Frankfurt)** for model service deployment. cn-hangzhou is only available via the API path.
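A quick way to catch this mismatch before you start is to check which paths cover a region. The sketch below is illustrative only: the helper name `paths_for_region` is hypothetical, and mapping the console's display names to the region IDs `cn-shanghai` and `eu-central-1` is an assumption based on the usual Alibaba Cloud region IDs.

```python
from typing import List

# Regions listed for the API path on this page.
API_REGIONS = {"cn-hangzhou", "cn-shanghai", "cn-beijing"}
# Console regions: China (Shanghai) and Germany (Frankfurt).
# The IDs below are assumed equivalents of those display names.
CONSOLE_REGIONS = {"cn-shanghai", "eu-central-1"}


def paths_for_region(region_id: str) -> List[str]:
    """Return the deployment paths ("API", "AI") that cover a region ID."""
    paths = []
    if region_id in API_REGIONS:
        paths.append("API")
    if region_id in CONSOLE_REGIONS:
        paths.append("AI")
    return paths


print(paths_for_region("cn-hangzhou"))  # ['API']
print(paths_for_region("cn-shanghai"))  # ['API', 'AI']
```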

Q: What if I have a large batch of 50 texts to embed but used the API path with a custom deployment?
A: You’ll exceed the limit: custom deployments via API allow only **16 strings per request**. You’d need to split your batch or switch to standard `text-embedding` (max 32) if your model supports it.
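Splitting the batch is straightforward. The following is a minimal sketch; the helper name `split_batches` and the constant names are hypothetical, and only the limits of 16 and 32 come from the answer above.

```python
from typing import Iterator, List

CUSTOM_DEPLOYMENT_MAX = 16  # strings per request for custom deployments
TEXT_EMBEDDING_MAX = 32     # strings per request for standard text-embedding


def split_batches(texts: List[str], max_per_request: int) -> Iterator[List[str]]:
    """Yield consecutive slices of texts, each no longer than max_per_request."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(texts), max_per_request):
        yield texts[start:start + max_per_request]


texts = [f"doc {i}" for i in range(50)]
sizes = [len(batch) for batch in split_batches(texts, CUSTOM_DEPLOYMENT_MAX)]
print(sizes)  # [16, 16, 16, 2]
```

With the standard limit of 32, the same 50 texts fit in two requests of 32 and 18 strings.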

Q: Can I use the console to deploy a model that requires a `service deployment token`?
A: Yes—but you must first create the service in the console (**Deploy Service** → **Create Model**), which generates the token. You cannot use the token without going through the console first, even for API calls.

Q: Does the API path support **NL2SQL** or **Agentic Search** configuration?
A: No. These advanced features are only configurable via the **Service Plaza** and **RAG Model Service Configuration** in the console. The API only handles raw embedding generation (`text-embedding`, `multi-modal-embedding`) and **dimensionality reduction**.

Q: What happens if I’m a RAM user without “Model Service-Service Deployment” permission and try the console path?
A: You’ll be blocked from completing **service deployment**—the console will show a permissions error during the **Create Model** step. Only account owners or properly authorized RAM users can proceed.

## Related queries

deploy embedding model, deploy model, model deployment, serve embeddings, model serving, publish embedding model, deploy ML model, how to deploy embedding model, can I deploy my own model, where to deploy vector model, what is model deployment, how do I serve embeddings, text-embedding API, multimod

---
Part of [OpenSearch](https://company-skill.com/p/opensearch.md) · https://company-skill.com/llms.txt