---
Title: Fine model
URL Source: https://company-skill.com/p/bailian/bailian-fine-model
Language: en
Description: You want to customize a foundational large language model (LLM) or multimodal model (like Qwen or wan2.6-i2v) using your own proprietary data. This process, known as fine-tuning, adapts the model's…
---

# Fine model

Part of **Bailian (Alibaba Cloud Model Studio)**. Route queries via `POST https://company-skill.com/api/route`.

## What You Want to Do

You want to customize a foundational large language model (LLM) or multimodal model (like Qwen or wan2.6-i2v) using your own proprietary data. This process, known as fine-tuning, adapts the model's weights to improve performance on specific downstream tasks, adopt a specific tone, or learn new domain-specific knowledge. 

Depending on your workflow and technical resources, you can either script this process for automated pipelines or use a visual interface for hands-on data preparation and monitoring. Alibaba Cloud Model Studio (Bailian) supports multiple fine-tuning paradigms: Supervised Fine-Tuning (SFT), Continual Pre-Training (CPT), Direct Preference Optimization (DPO), and LoRA-based efficient training (`efficient_sft`) for lower-cost adaptation.

**Typical User Questions**:
- How do I fine-tune Qwen?
- Can I automate model training via API?
- Fine-tuning best practices for Qwen

## Decision Tree

Pick the best path for your situation:

- **If** you need to integrate fine-tuning into CI/CD pipelines, execute asynchronous batch tasks, or manage datasets programmatically using OpenAI-compatible SDKs → Use Programmatic Fine-Tuning via API (go to *bailian/api/bailian-model*)
- **If** you require visual Data Cleansing (e.g., Sensitive Data Masking), Data Augmentation, or subscription-based training units for billing → Use Console-based Visual Fine-Tuning (go to *bailian/guide/bailian-model*)
- **If** you are training video models like wan2.6-i2v and want to launch the efficient_sft (LoRA) training type from automated scripts → Use Programmatic Fine-Tuning via API (go to *bailian/api/bailian-model*)
- **Otherwise (default)** → Console-based Visual Fine-Tuning (safest for interactive hyperparameter tuning, visual log monitoring, and one-click model publishing without writing polling scripts).
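
The rules above can be sketched as a small routing function. This is only an illustration of the decision order, not part of any Bailian SDK: the `Requirements` class and its field names are hypothetical, and the rules are evaluated top to bottom as listed.

```python
from dataclasses import dataclass

API_PATH = "Programmatic Fine-Tuning via API"
CONSOLE_PATH = "Console-based Visual Fine-Tuning"


@dataclass
class Requirements:
    """Hypothetical flags describing what the user needs (names are illustrative)."""
    needs_ci_cd_or_batch: bool = False      # CI/CD, async batch jobs, programmatic datasets
    needs_visual_data_tools: bool = False   # Data Cleansing, masking, Data Augmentation
    needs_training_units: bool = False      # subscription-based billing
    scripted_video_lora: bool = False       # wan2.6-i2v with efficient_sft from scripts


def choose_path(req: Requirements) -> str:
    """Apply the decision-tree rules in the order they are listed."""
    if req.needs_ci_cd_or_batch:
        return API_PATH
    if req.needs_visual_data_tools or req.needs_training_units:
        return CONSOLE_PATH
    if req.scripted_video_lora:
        return API_PATH
    return CONSOLE_PATH  # default branch


print(choose_path(Requirements(needs_training_units=True)))
```

Because the first matching rule wins, a user who needs both CI/CD automation and training units is routed to the API path here; in practice that combination is a conflict, since API-created jobs support only token-based billing.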

## Path Comparison

| Path | Best For | Complexity | Code Required | Automation | Key Fact | Detail Skill |
|------|----------|------------|---------------|------------|----------|--------------|
| Programmatic Fine-Tuning via API | Automating training pipelines, managing datasets programmatically, and integrating into CI/CD workflows. | high | Yes | Yes | Billed per token consumed; max 20 concurrent or succeeded jobs per user. | `bailian/api/bailian-model` |
| Console-based Visual Fine-Tuning | Interactive data preparation, visual hyperparameter tuning, and monitoring training progress via the UI. | low | No | No | Supports subscription-based training units and includes 5 hours free training per month. | `bailian/guide/bailian-model` |

## Path Details

### Path 1: Programmatic Fine-Tuning via API

**Best For**: Automating training pipelines, managing datasets programmatically, and integrating into CI/CD workflows.

**Brief Description**: 
A stateless HTTP API and SDK-based workflow for creating and managing fine-tuning jobs on Alibaba Cloud Model Studio. You submit training file IDs and hyperparameters directly to the DashScope endpoint. This approach is ideal for developers who want to treat model training as an infrastructure-as-code component, executing asynchronous batch training tasks and polling job status via the returned job_id.

**Core Workflow Concepts**:
- **File Management**: Training data must first be uploaded via the File Management API to generate file IDs.
- **Job Submission**: Hyperparameters and file IDs are submitted to the DashScope endpoint to create the job.
- **Status Polling**: Because jobs run asynchronously and the API keeps no session state between requests, you must poll with the returned job_id to track training progress and retrieve the final model checkpoint.
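
The status-polling step might look like the sketch below. It deliberately does not call any real endpoint: `fetch_status` is a hypothetical callable you would implement with your SDK or HTTP client of choice, and the terminal status names are assumptions that should be checked against the values the API actually returns.

```python
import time
from typing import Callable

# Assumed terminal status names; verify against the real job-status values.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED"}


def wait_for_job(
    fetch_status: Callable[[str], str],
    job_id: str,
    interval_s: float = 30.0,
    timeout_s: float = 6 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status(job_id) until a terminal status or the timeout is reached."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status(job_id)
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated job that succeeds on the third poll.
    fake = iter(["PENDING", "RUNNING", "SUCCEEDED"])
    print(wait_for_job(lambda _job_id: next(fake), "job-123", interval_s=0))
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access; a fixed interval is the simplest choice, and a backoff schedule could replace it for long-running jobs.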

**Key technical facts**:
- Billing: Billed per token consumed during training (total tokens × epochs × unit price). API-created jobs support only token-based billing.
- Max concurrency: 20 fine-tune jobs running or succeeded per user
- Regions available: China (Standard), International (Standard)
- Prerequisites: `DASHSCOPE_API_KEY` environment variable; training dataset uploaded via the File Management API to obtain file IDs
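
As a worked example of the billing formula (total tokens × epochs × unit price), the sketch below estimates a job's cost. The unit price used here is a made-up placeholder, and treating it as a per-token price is an assumption; substitute the published rate and its unit for your model and region.

```python
def estimate_training_cost(total_tokens: int, epochs: int, unit_price_per_token: float) -> float:
    """Return total tokens x epochs x unit price, as stated in the billing rule."""
    if total_tokens < 0 or epochs < 1 or unit_price_per_token < 0:
        raise ValueError("tokens and price must be non-negative; epochs must be at least 1")
    return total_tokens * epochs * unit_price_per_token


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real pricing.
    cost = estimate_training_cost(total_tokens=2_000_000, epochs=3, unit_price_per_token=0.000002)
    print(f"Estimated cost: {cost:.2f}")
```

With these placeholder values, a 2,000,000-token dataset trained for 3 epochs is billed for 6,000,000 tokens, which comes to 12 in whatever currency the price is quoted in. Doubling the epochs doubles the bill.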

**When to Use**:
- Automating training pipelines and integrating fine-tuning into CI/CD workflows.
- Managing datasets and launching jobs programmatically using OpenAI-compatible SDKs or DashScope SDK.
- Executing asynchronous batch training tasks and polling job status via job_id.

**When NOT to Use**:
- User requires subscription-based billing (training units) instead of pay-as-you-go token billing.
- User needs visual data cleansing, sensitive data masking, or data augmentation workflows before training.
- User wants to visually monitor training logs and metrics without writing polling scripts.

**Known Limitations**:
- Maximum of 20 concurrent or succeeded fine-tune jobs per user.
- Maximum file size for training data is 1 GB per file.
- Video models like wan2.6-i2v currently only support the efficient_sft (LoRA) training type.
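
A pipeline can check these limits locally before submitting anything. The following is a minimal pre-flight sketch: the function and parameter names are hypothetical, "1 GB" is interpreted as 1 GiB (an assumption), and the current job count is expected to come from your own bookkeeping or a job-listing call.

```python
import os
from typing import Mapping

MAX_FILE_BYTES = 1024**3             # "1 GB" read as 1 GiB; this reading is an assumption
MAX_JOBS = 20                        # running or succeeded jobs per user
LORA_ONLY_MODELS = {"wan2.6-i2v"}    # video models limited to efficient_sft


def preflight_errors(
    file_size_bytes: int,
    running_or_succeeded_jobs: int,
    model: str,
    training_type: str,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means no known blockers."""
    errors = []
    if not env.get("DASHSCOPE_API_KEY"):
        errors.append("DASHSCOPE_API_KEY is not set")
    if file_size_bytes > MAX_FILE_BYTES:
        errors.append("training file exceeds the 1 GB per-file limit")
    if running_or_succeeded_jobs >= MAX_JOBS:
        errors.append("job quota of 20 reached; delete old succeeded jobs first")
    if model in LORA_ONLY_MODELS and training_type != "efficient_sft":
        errors.append(f"{model} supports only the efficient_sft training type")
    return errors


if __name__ == "__main__":
    for problem in preflight_errors(2 * 1024**3, 20, "wan2.6-i2v", "sft"):
        print("blocked:", problem)
```

Returning every problem at once, rather than raising on the first, lets a CI job report all blockers in a single run.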

### Path 2: Console-based Visual Fine-Tuning

**Best For**: Interactive data preparation, visual hyperparameter tuning, and monitoring training progress via the UI.

**Brief Description**: 
A visual web interface in Alibaba Cloud Model Studio for preparing datasets, configuring hyperparameters, and launching SFT, CPT, or DPO fine-tuning jobs. It provides built-in tools such as Data Stream and Data Cleansing, lets you select the Training Method, and uses Platform Storage to manage datasets. The whole process starts from the Create Training Task UI and requires no code, which suits non-engineers and exploratory training.

**Core Workflow Concepts**:
- **Data Preparation**: Upload data to Platform Storage, use Data Stream to inspect it, and apply Data Cleansing to remove PII or format errors.
- **Task Configuration**: Use the Create Training Task wizard to select your base model, Training Method (SFT, CPT, DPO), and hyperparameters.
- **Monitoring & Publishing**: Watch visual training logs and loss curves, then use one-click publishing to deploy the model to an endpoint.

**Key technical facts**:
- Billing: Supports both per-token pay-as-you-go billing and subscription-based training units. Includes 5 hours free training per month.
- Regions available: China (Beijing), US (Virginia), Singapore, Germany (Frankfurt), China (Hong Kong)
- Prerequisites: Active Alibaba Cloud account; dataset in 'Published' status

**When to Use**:
- User needs to perform visual data cleansing (e.g., Sensitive Data Masking) and data augmentation before training.
- User wants to use subscription-based training units for billing instead of token-based pay-as-you-go.
- User prefers interactive hyperparameter tuning, visual log monitoring, and one-click model publishing without writing code.

**When NOT to Use**:
- User needs to automate recurring fine-tuning jobs via CI/CD pipelines or external scripts.
- User wants to manage training files programmatically using OpenAI-compatible file APIs.

**Known Limitations**:
- Datasets must be in 'Published' status to be used in training jobs; draft datasets are not supported.
- CPT and Image-to-Video training sets do not support draft status and must be published immediately upon creation.
- Platform Storage for datasets is free and unlimited, but relies on console UI rather than programmatic OSS mounting for training data.

## FAQ

Q: Which path should I start with?
A: Start with Console-based Visual Fine-Tuning if you are doing this manually for the first time, as it provides visual log monitoring, interactive hyperparameter tuning, and 5 hours of free training per month. Choose the API path only if you are building an automated CI/CD pipeline or need to manage hundreds of datasets programmatically.

Q: What if I need subscription-based billing but chose the API path?
A: API-created training jobs support only pay-as-you-go token-based billing, so training units cannot be applied to them. Use the Console to access subscription-based training units and to claim the monthly free training hours.

Q: What if I want to automate CI/CD pipelines but chose the Console path?
A: The Console is not designed for automation. Platform Storage is managed through the console UI rather than programmatic OSS mounting, and the Console flow does not expose the OpenAI-compatible file APIs used for automated dataset management. For recurring or CI/CD-driven jobs, switch to the API path.

Q: Can I use draft datasets for training in the Console?
A: No. Datasets must be in 'Published' status to be used in training jobs. CPT and Image-to-Video training sets have no draft status at all and must be published immediately upon creation. If your data is still in draft, the Create Training Task flow will not let you select it.

Q: What is the maximum file size and concurrency limit for the API approach?
A: The maximum file size for training data is 1 GB per file. There is also a limit of 20 concurrent or succeeded fine-tune jobs per user. Once you reach it, you must delete old succeeded jobs before launching new ones via the API.

Q: What happens if I try to train a video model like wan2.6-i2v using an unsupported method?
A: Video models like wan2.6-i2v currently only support the efficient_sft (LoRA) training type. If you attempt to use a different Training Method via the API or Console, the job will fail or be rejected during the validation phase.

Q: How do the regions differ between the API and Console paths?
A: The API path is broadly available in China (Standard) and International (Standard) regions. The Console path is available in specific regional hubs: China (Beijing), US (Virginia), Singapore, Germany (Frankfurt), and China (Hong Kong). Ensure your Alibaba Cloud account is provisioned in a supported region before starting.

Q: What if my training dataset contains sensitive PII (Personally Identifiable Information)?
A: If you use the API path, you must handle PII masking externally before uploading the file. If you use the Console path, you can leverage the built-in visual Data Cleansing tools, which include Sensitive Data Masking features to automatically redact PII before the training job begins.
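
For the API path, external masking could be as simple as the sketch below, run over each text field before upload. It is a rough illustration using two regular expressions and is not equivalent to the Console's Sensitive Data Masking: the patterns, the placeholder tokens, and the `mask_pii` name are all assumptions, and real datasets usually need a dedicated PII detection tool.

```python
import re

# Deliberately simple patterns; they will miss some PII and may over-match long numeric IDs.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


if __name__ == "__main__":
    print(mask_pii("Contact jane.doe@example.com or +1 415-555-0100."))
    # prints: Contact [EMAIL] or [PHONE].
```

Emails are masked before phone numbers so that digits inside an address are not partially replaced first.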

## Related queries

fine-tune model, model fine-tuning, train custom model, customize LLM, fine tune multimodal model, model training, how to fine-tune, how to train model, can I fine-tune, where to train model, DashScope fine-tuning, Bailian training, Qwen fine-tune, LoRA training, SFT training, DPO training, finetune

