Vector Search RAG Pipeline on Alibaba Cloud

A developer uploads raw documents to OSS, deploys an embedding model through OpenSearch to generate vector embeddings, creates and manages vector indexes in OSS, and then ingests the embedding-enriched documents into Elasticsearch for hybrid keyword-and-vector search. Together, these steps form a complete Retrieval-Augmented Generation (RAG) pipeline.

How the products combine

oss · oss-manage-objects — Object Storage Service — Manage storage objects (upload, download, copy, etc.)

See oss/oss-manage-objects.

opensearch · opensearch-deploy-model — OpenSearch — Deploy embedding model for inference

See opensearch/opensearch-deploy-model.

oss · oss-manage-data — Object Storage Service — Manage vector data and indexes

See oss/oss-manage-data.

es · es-ingest-documents — Elasticsearch — Ingest and manage document data in Elasticsearch

See es/es-ingest-documents.
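
The four steps above can be sketched as a single loop. This is a minimal illustration of the ordering only, not an official SDK workflow: the names `Document`, `run_pipeline`, `upload`, `embed`, `store_vector`, and `ingest` are hypothetical, and each callable is a stand-in that you would replace with real calls to OSS, the model deployed through OpenSearch, and Elasticsearch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    key: str                                   # object key the raw file would have in OSS
    text: str                                  # extracted text content
    embedding: List[float] = field(default_factory=list)


def run_pipeline(
    docs: List[Document],
    upload: Callable[[str, str], None],                 # hypothetical stand-in: OSS object upload
    embed: Callable[[str], List[float]],                # hypothetical stand-in: deployed embedding model
    store_vector: Callable[[str, List[float]], None],   # hypothetical stand-in: OSS vector index write
    ingest: Callable[[Dict], None],                     # hypothetical stand-in: Elasticsearch ingestion
) -> int:
    """Run the four stages in order and return the number of documents ingested."""
    for doc in docs:
        upload(doc.key, doc.text)                # 1. raw document goes to OSS
        doc.embedding = embed(doc.text)          # 2. the model produces the embedding
        store_vector(doc.key, doc.embedding)     # 3. the vector is written to the OSS vector index
        ingest({"id": doc.key, "text": doc.text, "embedding": doc.embedding})  # 4. enriched doc to ES
    return len(docs)
```

Passing the stages in as callables keeps the ordering testable with in-memory fakes before any cloud resources exist.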

Typical questions

build RAG pipeline on Alibaba Cloud
vector search with document storage
upload documents and create embeddings
build a RAG vector retrieval pipeline
store vectorized documents in Elasticsearch
deploy embedding model and index vectors
end-to-end semantic search pipeline
store docs in OSS and search in ES

FAQ

Q: How do I build a RAG pipeline on Alibaba Cloud? A: You can construct a complete Retrieval-Augmented Generation pipeline by integrating Object Storage Service, OpenSearch, and Elasticsearch. The process requires uploading raw files to OSS, deploying an embedding model via OpenSearch, managing vector indexes in OSS, and ingesting the enriched data into Elasticsearch for hybrid search.

Q: How do I upload documents and create vector embeddings? A: You upload raw documents to Object Storage Service and deploy an embedding model via OpenSearch to generate vector embeddings. You can then create and manage the resulting vector indexes directly within OSS.
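
As a rough sketch of the embedding step, the code below splits a document into overlapping chunks and attaches one vector to each chunk. Everything here is an assumption made for illustration: the chunk sizes, the record layout, and especially `toy_embed`, which is a deterministic placeholder and not a real model. In practice, the `embed` argument would call the model you deployed through OpenSearch.

```python
import hashlib
from typing import Callable, Dict, List


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(text: str, dims: int = 8) -> List[float]:
    """Placeholder only (NOT a real model): maps SHA-256 bytes to values in [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dims)]


def build_records(key: str, text: str, embed: Callable[[str], List[float]]) -> List[Dict]:
    """Produce one record per chunk, keyed by the source object key and chunk number."""
    records = []
    for n, chunk in enumerate(chunk_text(text)):
        records.append({
            "id": f"{key}#{n}",        # hypothetical ID scheme: <object key>#<chunk index>
            "source": key,
            "text": chunk,
            "embedding": embed(chunk),
        })
    return records
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text.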

Q: How do I store vectorized documents and perform vector search? A: You ingest the enriched documents with their embeddings into Elasticsearch to perform hybrid keyword-and-vector search. This step completes the pipeline after the initial document upload and embedding generation.
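
The request bodies for this step might look like the sketch below. It assumes an Elasticsearch 8.x-style API in which a `dense_vector` field is indexed for kNN and a search request can carry a top-level `knn` clause alongside `query`; check which version your Alibaba Cloud Elasticsearch instance runs before relying on this. The field names `text` and `embedding` and the helper names are hypothetical. The code only builds JSON payloads and sends nothing over the network.

```python
import json
from typing import Dict, List


def index_mapping(dims: int) -> Dict:
    """Index body with a text field for keyword search and a vector field for kNN."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,              # must match the embedding model's output size
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def bulk_ndjson(index: str, records: List[Dict]) -> str:
    """Serialize records into the newline-delimited body expected by the _bulk API."""
    lines = []
    for rec in records:
        lines.append(json.dumps({"index": {"_index": index, "_id": rec["id"]}}))
        lines.append(json.dumps({"text": rec["text"], "embedding": rec["embedding"]}))
    return "\n".join(lines) + "\n"             # the bulk body must end with a newline


def hybrid_search_body(query_text: str, query_vector: List[float], k: int = 10) -> Dict:
    """Combine a keyword match with a kNN clause in one search request."""
    return {
        "query": {"match": {"text": query_text}},
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": max(10 * k, 100),  # illustrative choice, always >= k
        },
        "size": k,
    }
```

The query vector must come from the same embedding model used at ingestion time. Otherwise the similarity scores are not meaningful.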