# SeeGen.AI API Documentation

> Full interactive docs: https://seegen.ai/api-docs
> Base URL: https://seegen.ai/api/v1
> Authentication: Bearer token via Authorization header
> Models: sd2 (Seedance2-Pro, high quality video), sd2-fast (Seedance2-Fast, lower cost video), happyhorse (HappyHorse 1.0 by Alibaba, 720P/1080P text-to-video, first-frame image-to-video, and reference-to-video with up to 9 reference images), gpt-image-2 (OpenAI GPT Image 2 for text-to-image and image-to-image editing, supports batch generation up to 4)

## Overview

Seedance 2.0 is built for cinematic video generation with flexible multimodal inputs and controllable workflows, including Text to Video, Image to Video, First & Last Frame to Video, and Multi-Ref / Omni Reference.

Key advantages:
- Human reference support (review-based)
- Production-ready API, auto-scalable concurrency (240+)
- Low & stable latency, 720P/1080P native + 2K/4K upscale resolutions
- Fast mode from $0.114/sec
- No subscription required — pay as you go
- 24/7 customer support

Note: Real human image and video assets require official review, usually completed within seconds. Once approved, they can be used directly as references. Without review, generation may fail.

## Authentication

All requests require a Bearer token:

```
Authorization: Bearer YOUR_API_KEY
```

Create API keys at https://seegen.ai/account (max 10 per account).

## Endpoints

### POST /api/v1/jobs/createTask

Create a generation task (video, or image when the model is `gpt-image-2`). Returns `{ "taskId": "..." }`.

**Request body:**
```json
{
  "model": "sd2",
  "inputs": { ... },
  "callBackUrl": "https://your-server.com/webhook"  // optional
}
```
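As a rough illustration, the request above could be assembled in Python with only the standard library. The helper name `build_create_task_request` is hypothetical; the endpoint, header, and field names are the ones given in this document. The sketch builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://seegen.ai/api/v1"


def build_create_task_request(api_key, model, inputs, callback_url=None):
    """Build (but do not send) a POST request for /jobs/createTask."""
    body = {"model": model, "inputs": inputs}
    if callback_url is not None:
        body["callBackUrl"] = callback_url  # optional top-level field
    return urllib.request.Request(
        f"{BASE_URL}/jobs/createTask",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_task_request(
    "YOUR_API_KEY",
    "sd2",
    {"prompt": "A golden retriever running on the beach at sunset", "duration": "5s"},
)
# To actually send it: json.load(urllib.request.urlopen(req))["taskId"]
```

Keeping construction separate from sending makes the payload easy to inspect or log before any credits are spent.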

### GET /api/v1/jobs/queryTask?taskId={taskId}

Poll task status. Returns a task object whose `status` is one of PENDING, PROCESSING, COMPLETED, or FAILED.

**Response:**
```json
{
  "taskId": "task_abc123",
  "model": "sd2",
  "status": "COMPLETED",
  "creditsUsed": 200,
  "output": [{ "url": "https://...", "width": 1280, "height": 720 }],
  "error": null,
  "createTime": 1711234567890,
  "completeTime": 1711234612345
}
```
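A polling loop around this endpoint might look like the sketch below. `fetch_status` is a hypothetical caller-supplied function that performs the GET and returns the parsed JSON; the 5-second interval and 30-minute timeout are assumptions, not documented limits.

```python
import time


def wait_for_task(fetch_status, task_id, interval=5.0, timeout=1800.0, sleep=time.sleep):
    """Poll until the task is COMPLETED or FAILED, or the timeout elapses.

    fetch_status(task_id) is assumed to call
    GET /api/v1/jobs/queryTask?taskId=... and return the response as a dict.
    """
    waited = 0.0
    while True:
        task = fetch_status(task_id)
        if task["status"] == "COMPLETED":
            return [item["url"] for item in task["output"]]
        if task["status"] == "FAILED":
            raise RuntimeError(f"task {task_id} failed: {task.get('error')}")
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {task['status']} after {timeout}s")
        sleep(interval)
        waited += interval
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and lets it be exercised without network access.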

### GET /api/v1/account/credits

Check credits balance. Returns `{ "credits": 5000, "availableCredits": 4800 }`.

### POST /api/v1/upscale/create

Standalone Video Upscaler — independent product, NOT a step inside the video generation pipeline. Submit a video and choose a target resolution; the system probes duration / source resolution, deducts credits proportional to the actual duration, then runs the upscale job.

**Pricing** (credits per billable second; 5s minimum):
- 1080p: 25/s
- 2K: 38/s
- 4K: 50/s

**Constraints**:
- Source: https URL or your previously uploaded R2 URL. http URLs and private/internal IPs are rejected.
- Duration: up to 600 seconds. Clips shorter than 5 s are accepted but billed as 5 s.
- Size: ≤ 200 MB.
- Format: MP4 / MOV / WebM.
- Source resolution must be lower than the target (e.g., a 1080p source can go to 2K or 4K, but not 1080p).
- On any failure, credits are fully refunded.
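The pricing and constraints above can be pre-checked on the client before submitting. The following is a sketch under stated assumptions: the function names are hypothetical, fractional durations are assumed to round up to whole billable seconds, and 200 MB is taken as 200 × 1024 × 1024 bytes. The server remains the source of truth.

```python
import math
from urllib.parse import urlparse

UPSCALE_RATES = {"1080p": 25, "2k": 38, "4k": 50}  # credits per billable second
MIN_BILLABLE_SECONDS = 5
MAX_DURATION_SECONDS = 600
MAX_SIZE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def estimate_upscale_credits(duration_seconds, target_resolution):
    """Estimate credits; rounding up to whole seconds is an assumption."""
    if not 0 < duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError("duration must be greater than 0 and at most 600 seconds")
    billable = max(MIN_BILLABLE_SECONDS, math.ceil(duration_seconds))
    return billable * UPSCALE_RATES[target_resolution]


def check_upscale_source(url, size_bytes):
    """Return a list of problems found by client-side pre-checks."""
    problems = []
    if urlparse(url).scheme != "https":
        problems.append("source URL must use https")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file exceeds 200 MB")
    return problems
```

For instance, a 3-second clip upscaled to 1080p is billed as 5 seconds, so the estimate is 5 × 25 = 125 credits.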

**Request body:**
```json
{
  "source": { "type": "url",      "url":   "https://..." },
  "targetResolution": "1080p" | "2k" | "4k",
  "callBackUrl": "https://your-server.com/webhook"
}
```

Or with an asset already uploaded to our R2:
```json
{
  "source": { "type": "uploadId", "r2Url": "https://static.seedance2-pro.com/..." },
  "targetResolution": "4k"
}
```

Returns `{ "taskId": "...", "orderId": "...", "status": "validating" }`.

### GET /api/v1/upscale/query?taskId={taskId}

Poll a standalone upscale task. Status progresses validating → processing → completed | failed.

**Response (completed):**
```json
{
  "taskId": "...",
  "status": "completed",
  "targetResolution": "4k",
  "creditsConsumed": 1500,
  "result": {
    "url": "https://static.seedance2-pro.com/standalone-upscale/results/...mp4",
    "probedDurationSeconds": 30,
    "probedSourceResolution": "1280x720"
  },
  "error": null,
  "createdAt": "2026-04-28T01:23:45.000Z",
  "finishedAt": "2026-04-28T01:35:01.000Z"
}
```

**Response (failed):**
```json
{
  "taskId": "...",
  "status": "failed",
  "creditsConsumed": null,
  "result": null,
  "error": {
    "code": "DURATION_EXCEEDS_LIMIT" | "SOURCE_RESOLUTION_TOO_HIGH" | "FILE_TOO_LARGE" | "UNSUPPORTED_MEDIA_TYPE" | "URL_NOT_REACHABLE" | "URL_RESOLVES_TO_PRIVATE_IP" | "PROBE_FAILED" | "UPSTREAM_FAILED" | "INSUFFICIENT_CREDITS" | "SERVICE_DISABLED",
    "message": "..."
  }
}
```

The optional `callBackUrl` receives a POST with the same response shape on terminal state.

## Generation Modes

### 1. Text to Video

No images required. Prompt only.

```json
{
  "model": "sd2",
  "inputs": {
    "prompt": "A golden retriever running on the beach at sunset",
    "duration": "5s",
    "resolution": "1280x720"
  }
}
```

### 2. Image to Video

Animate a single image.

```json
{
  "model": "sd2",
  "inputs": {
    "urls": ["https://example.com/photo.jpg"],
    "prompt": "The woman slowly turns her head and smiles",
    "duration": "5s"
  }
}
```

### 3. First & Last Frame (Keyframe)

Provide start and end frames; the model generates the transition between them.

```json
{
  "model": "sd2",
  "inputs": {
    "urls": ["https://example.com/first.jpg", "https://example.com/last.jpg"],
    "prompt": "Smooth camera transition from day to night",
    "duration": "5s",
    "videoInputMode": "keyframe"
  }
}
```

### 4. Multi-Reference

Use multiple images, videos, and audio files as references.

```json
{
  "model": "sd2",
  "inputs": {
    "urls": ["https://example.com/ref1.jpg", "https://example.com/ref2.jpg"],
    "videoUrls": ["asset://asset-20260326-abc123"],
    "audioUrls": ["https://example.com/audio.mp3"],
    "prompt": "Character walks through a garden",
    "duration": "5s",
    "videoInputMode": "reference",
    "resolution": "1280x720"
  }
}
```

**Reference constraints:** max 9 images, 3 videos, 3 audio files; max 12 total; each video/audio ≤ 15s; images ≥ 400px shortest side.
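The count limits above lend themselves to a quick client-side check. This is a minimal sketch with a hypothetical function name; it covers only the counts (plus the rule from the Parameters Reference that audio cannot be the only reference), not durations or image dimensions.

```python
def check_reference_counts(inputs):
    """Return a list of violations of the documented reference limits."""
    images = len(inputs.get("urls", []))
    videos = len(inputs.get("videoUrls", []))
    audios = len(inputs.get("audioUrls", []))
    problems = []
    if images > 9:
        problems.append("at most 9 reference images")
    if videos > 3:
        problems.append("at most 3 reference videos")
    if audios > 3:
        problems.append("at most 3 reference audio files")
    if images + videos + audios > 12:
        problems.append("at most 12 references in total")
    if audios and not (images or videos):
        problems.append("audio cannot be the only reference")
    return problems
```

Note that the per-type maxima add up to 15, so the total cap of 12 can be hit even when every individual limit is respected.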

### 5. 1080P Native Output (sd2 only)

Set `outputResolution: "1080p"` for native 1080p rendering. Available only for model `sd2`; sd2-fast rejects this value and should use `upscaleResolution` (`"1080p"`, `"2k"`, or `"4k"`) instead.

```json
{
  "model": "sd2",
  "inputs": {
    "urls": ["https://example.com/scene.jpg"],
    "videoUrls": ["asset://asset-20260326-abc123"],
    "prompt": "Character moves smoothly matching the reference video",
    "duration": "5s",
    "outputResolution": "1080p",
    "videoInputMode": "reference"
  }
}
```

Note: 1080p generation takes noticeably longer than 720p (typically 10+ minutes). Poll patiently or rely on the callback webhook. Face-containing 1080p outputs may be subject to additional safety review.

### 6. HappyHorse 1.0 (Alibaba)

An alternative video generation model from Alibaba DashScope. Three modes:
1. **Text-to-video** — prompt only
2. **Image-to-video (first-frame)** — exactly 1 image, animate from that frame
3. **Reference-to-video (multi-image refs)** — 1–9 reference images fused per prompt; reference subjects with `character1`, `character2`, … in the prompt to indicate which reference image is which

The mode is auto-detected from the inputs: no `urls` → text-to-video; exactly one entry in `urls` → image-to-video (first-frame); `videoWorkflowTab: "multi-reference"` with one or more entries in `urls` → reference-to-video. HappyHorse 1.0 does **not** support last-frame, reference video, or reference audio.
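The detection rule can be mirrored on the client to predict which mode a payload will select. This is a sketch of the rule as described here, not the server's implementation; the function name is hypothetical, and the behavior for two or more `urls` without `videoWorkflowTab` is not documented, so the sketch treats it as an error.

```python
def detect_happyhorse_mode(inputs):
    """Predict the HappyHorse mode from a request's inputs dict."""
    urls = inputs.get("urls", [])
    if not urls:
        return "text-to-video"
    if inputs.get("videoWorkflowTab") == "multi-reference":
        if len(urls) > 9:
            raise ValueError("reference-to-video accepts 1-9 images")
        return "reference-to-video"
    if len(urls) == 1:
        return "image-to-video"
    # Assumption: the document does not say what happens in this case.
    raise ValueError("2+ urls require videoWorkflowTab='multi-reference'")
```

A single image therefore maps to either first-frame animation or reference fusion, depending only on whether `videoWorkflowTab` is set.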

No asset review needed — you can use public HTTPS image URLs or `asset://` references from the asset library (resolved back to their R2 URL automatically).

Model name: `happyhorse` (alias of `happyhorse-video-generation`).

Text-to-video:
```json
{
  "model": "happyhorse",
  "inputs": {
    "prompt": "A panda DJ at a beach party",
    "duration": "5s",
    "outputResolution": "720p",
    "ratio": "16:9"
  }
}
```

Image-to-video (single first frame):
```json
{
  "model": "happyhorse",
  "inputs": {
    "urls": ["https://example.com/panda.jpg"],
    "prompt": "The panda blinks and smiles",
    "duration": "5s",
    "outputResolution": "1080p"
  }
}
```

Reference-to-video (1–9 reference images):
```json
{
  "model": "happyhorse",
  "inputs": {
    "videoWorkflowTab": "multi-reference",
    "urls": [
      "https://example.com/character.jpg",
      "https://example.com/folding-fan.jpg",
      "https://example.com/earrings.jpg"
    ],
    "prompt": "A woman in red character1 opening folding fan character2, with tassel earrings character3 swinging",
    "duration": "5s",
    "outputResolution": "720p",
    "ratio": "16:9"
  }
}
```

Parameters (happyhorse only):
| Parameter | Type | Required | Notes |
|-----------|------|----------|-------|
| prompt | string | Yes (t2v / r2v) / optional (i2v) | ≤ 5000 non-CJK chars or ≤ 2500 CJK chars (upstream auto-truncates beyond); in r2v use `character1`/`character2`/… to refer to the N-th url |
| urls | string[] | Optional | t2v: omit; i2v: exactly 1; r2v: 1–9 |
| videoWorkflowTab | string | No | Set to `"multi-reference"` to switch image-to-video into reference-to-video (r2v) mode |
| duration | string | No | 3 to 15 seconds, whole seconds only (default 5); the examples in this section pass it with an `s` suffix, e.g. "5s" |
| outputResolution | string | No | "720p" or "1080p" (default "720p") |
| ratio | string | No | "16:9" / "9:16" / "1:1" / "4:3" / "3:4" (t2v + r2v only; i2v aspect is inferred from the first frame) |
| seed | int | No | 0–2147483647 |

Image requirements: shortest side ≥ 300px, aspect ratio 1:2.5–2.5:1, JPEG/JPG/PNG/BMP/WEBP, ≤ 10MB. Pricing: 720P = 32 credits/sec, 1080P = 60 credits/sec; r2v / i2v / t2v share the same per-second rate (reference images are not billed).
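The image requirements can be checked locally once the image's dimensions, size, and format are known. A minimal sketch, assuming 10MB means 10 × 1024 × 1024 bytes; the function name is hypothetical and obtaining the width and height (for example with an imaging library) is left to the caller.

```python
ALLOWED_FORMATS = {"JPEG", "JPG", "PNG", "BMP", "WEBP"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_happyhorse_image(width, height, size_bytes, fmt):
    """Return a list of violations of the HappyHorse image requirements."""
    problems = []
    short, long_ = min(width, height), max(width, height)
    if short < 300:
        problems.append("shortest side must be at least 300px")
    # long / short <= 2.5, written with integers to avoid float rounding
    if 2 * long_ > 5 * short:
        problems.append("aspect ratio must be between 1:2.5 and 2.5:1")
    if fmt.upper() not in ALLOWED_FORMATS:
        problems.append("unsupported format")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file exceeds 10MB")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem with an image at once.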

## GPT Image 2 (Image Generation)

OpenAI's GPT Image 2 for high-quality text-to-image and image-to-image (editing). One workflow supports both modes — pass `urls` to switch into image-to-image mode automatically. Output is delivered via R2 in the format you request (PNG / JPEG / WEBP).

Model name: `gpt-image-2`.

The mode is auto-detected from inputs: 0 `urls` → text-to-image, 1–10 `urls` → image-to-image (editing).

Text-to-image:
```json
{
  "model": "gpt-image-2",
  "inputs": {
    "prompt": "A neon-lit cyberpunk alley at midnight, photoreal",
    "quality": "medium",
    "resolution": "2k",
    "aspectRatio": "16:9"
  }
}
```

Image-to-image (editing, 1–10 reference images):
```json
{
  "model": "gpt-image-2",
  "inputs": {
    "urls": ["https://example.com/portrait.jpg"],
    "prompt": "Restyle as oil painting",
    "quality": "medium",
    "resolution": "1k",
    "aspectRatio": "1:1"
  }
}
```

Parameters (gpt-image-2 only):
| Parameter | Type | Required | Notes |
|-----------|------|----------|-------|
| prompt | string | Yes | Text description of the image, or the edit to apply when `urls` is provided |
| urls | string[] | Optional | 1–10 reference image URLs for image-to-image mode; omit for text-to-image |
| quality | string | No | "medium" / "high" (default "medium"). Credits scale by tier |
| resolution | string | No | "1k" / "2k" / "4k" (default "2k"). Credits scale by tier (see Pricing) |
| aspectRatio | string | No | "1:1" / "16:9" / "9:16" / "4:3" / "3:4" (default "1:1") |
| outputFormat | string | No | "png" / "jpeg" / "webp" (default "png") |

Input image requirements (image-to-image mode):
- Up to 10 reference images per task
- Max file size 50 MB per image
- Shortest side ≥ 256px
- Aspect ratio between 1:3 and 3:1
- Formats: JPEG, JPG, PNG, WEBP

Pricing: credits per image scale by resolution and quality — see the **Pricing** section for the full table.

## Parameters Reference

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| prompt | string | Conditional | Text description (required for text-to-video) |
| urls | string[] | Conditional | Image URLs or asset references (e.g. "asset://asset-20260326-abc123") |
| videoUrls | string[] | No | Video reference asset:// URIs (reference mode only, max 3). **Must be `asset://` URIs** — external URLs are rejected. Upload videos via /api/v1/assets/upload first. |
| audioUrls | string[] | No | Audio reference URLs (reference mode only, max 3). Cannot be the only reference — requires at least one image or video |
| duration | string | No | "4s" to "15s" (default: "5s") |
| resolution | string | No | Aspect ratio via pixel string: 720x720, 720x960, 960x720, 1280x720, 720x1280, 1280x540 |
| outputResolution | string | No | ARK-native output tier: "720p" (default) or "1080p". **1080p via this field is sd2 only** (for sd2-fast use upscaleResolution instead). |
| videoInputMode | string | No | "keyframe" or "reference" |
| upscaleResolution | string | No | "1080p", "2k", or "4k" — post-generation WaveSpeed upscale applied on top of 720p base output. Use "1080p" here for sd2-fast to get 1080p output (sd2-fast has no native 1080p). |

Top-level fields: model (required), inputs (required), callBackUrl (optional).
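Several rows of the table above can be validated before a request is sent. The sketch below uses a hypothetical function name, assumes durations are whole seconds with an `s` suffix, and covers only `duration`, `resolution`, `videoInputMode`, and the `asset://` rule for `videoUrls`.

```python
import re

SD2_RESOLUTIONS = {"720x720", "720x960", "960x720", "1280x720", "720x1280", "1280x540"}


def check_sd2_inputs(inputs):
    """Return a list of problems with an sd2 / sd2-fast inputs dict."""
    problems = []
    match = re.fullmatch(r"(\d+)s", inputs.get("duration", "5s"))
    if not match or not 4 <= int(match.group(1)) <= 15:
        problems.append("duration must be between 4s and 15s")
    if "resolution" in inputs and inputs["resolution"] not in SD2_RESOLUTIONS:
        problems.append("unsupported resolution string")
    if inputs.get("videoInputMode") not in (None, "keyframe", "reference"):
        problems.append("videoInputMode must be keyframe or reference")
    for uri in inputs.get("videoUrls", []):
        if not uri.startswith("asset://"):
            problems.append("videoUrls entries must be asset:// URIs")
            break
    return problems
```

Catching these locally avoids a round trip that would otherwise end in a 400 response.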

## Assets

Assets are images, videos, and audio files that go through a review process before use in tasks.

### POST /api/v1/assets/upload

Upload an asset for review. Provide a publicly accessible URL.

```json
{
  "url": "https://example.com/photo.jpg",
  "type": "IMAGE",
  "name": "my-photo"
}
```

Parameters: url (required, string), type (required, "IMAGE"|"AUDIO"|"VIDEO"), name (optional, max 64 chars).

Response:
```json
{
  "assetId": 123,
  "volcAssetId": "asset-20260326-abc123",
  "type": "IMAGE",
  "status": "PROCESSING",
  "failReason": null,
  "url": "https://example.com/photo.jpg",
  "name": "my-photo",
  "createdAt": 1711234567890
}
```

### GET /api/v1/assets/status?assetId={id}

Query asset review status. Accepts a numeric assetId or a volcAssetId string. When the status is PROCESSING, the endpoint automatically polls the review system.

Status values: PROCESSING (under review), ACTIVE (ready for use), FAILED (check failReason).

### GET /api/v1/assets/list

List your assets. Optional query params: type (IMAGE|AUDIO|VIDEO), status (PROCESSING|ACTIVE|FAILED), cursor (number), limit (1-50, default 20).

Response: `{ "items": [...], "nextCursor": 100 }`
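Cursor pagination over this endpoint might be wrapped in a generator like the one below. `fetch_page` is a hypothetical caller-supplied function that performs the GET with the given query parameters and returns the parsed JSON. The sketch assumes the last page has a missing or null `nextCursor`, which this document does not state.

```python
def iter_assets(fetch_page, **filters):
    """Yield every asset, following nextCursor until it is missing or null."""
    cursor = None
    while True:
        params = dict(filters)  # e.g. type="IMAGE", status="ACTIVE", limit=50
        if cursor is not None:
            params["cursor"] = cursor
        page = fetch_page(params)
        yield from page["items"]
        cursor = page.get("nextCursor")
        if cursor is None:  # assumption: this marks the final page
            break
```

Because it is a generator, a caller can stop iterating early without fetching the remaining pages.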

### Using assets in tasks

Once an asset is ACTIVE, use its volcAssetId with the asset:// protocol:
```json
{ "urls": ["asset://asset-20260326-abc123"] }
```

## Pricing

| Plan | Price | Credits |
|------|-------|---------|
| API Starter Pack (40% OFF) | $500 (~~$833~~) | 125,000 |
| API XL Pack (40% OFF) | $2,000 (~~$3,332~~) | 500,000 |

### Credits per second (per output-second *and* per ceiled-input-second)

| Model | 720P no video | 720P + video ref | 1080P native, no video | 1080P native, + video ref |
|-------|--------------:|-----------------:|-----------------------:|--------------------------:|
| sd2 (Pro) | 40 | 30 | 100 | 75 |
| sd2-fast (Fast) | 32 | 24 | N/A — use upscaleResolution="1080p" | N/A — use upscaleResolution="1080p" |

WaveSpeed upscale addon (applied on top of 720P base, flat per-output-second regardless of model or video ref): **1080P +20 credits/s · 2K +30 credits/s · 4K +40 credits/s**.

Example — sd2-fast with 1080p upscale, 5s output, no ref video: 5 × 32 (base) + 5 × 20 (upscale) = 260 credits.
Example — sd2 1080P native, 5s output + 5s ref video: (5+5) × 75 = 750 credits.

### Cost calculation

Without input videos:
`credits = output_seconds × rate`

With input videos:
`credits = (output_seconds + Σ ⌈input_video_duration_i⌉) × rate`

Where ⌈⌉ denotes rounding up to whole seconds and Σ sums over the reference videos (minimum ⌈2×output_seconds/3⌉). Each `input_video_duration_i` is the server-measured duration of the asset referenced by the i-th entry in `videoUrls` (recorded by ffprobe at upload time); clients cannot influence this value.
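The formulas and rate tables above can be combined into a small estimator. This is a sketch, not the billing implementation: the function name `estimate_credits` is hypothetical, and it assumes the documented minimum applies to the summed input seconds rather than to each reference video individually.

```python
import math

BASE_RATES = {  # credits per second: (model, output tier, has reference video)
    ("sd2", "720p", False): 40,
    ("sd2", "720p", True): 30,
    ("sd2", "1080p", False): 100,
    ("sd2", "1080p", True): 75,
    ("sd2-fast", "720p", False): 32,
    ("sd2-fast", "720p", True): 24,
}
UPSCALE_ADDON = {"1080p": 20, "2k": 30, "4k": 40}  # credits per output second


def estimate_credits(model, output_seconds, ref_video_seconds=(),
                     output_resolution="720p", upscale=None):
    """Estimate credits for an sd2 / sd2-fast task."""
    has_ref = len(ref_video_seconds) > 0
    # A KeyError here means the combination is not offered (e.g. sd2-fast at 1080p).
    rate = BASE_RATES[(model, output_resolution, has_ref)]
    billed = output_seconds
    if has_ref:
        input_seconds = sum(math.ceil(d) for d in ref_video_seconds)
        # Assumption: the minimum applies to the summed input seconds.
        billed += max(input_seconds, math.ceil(2 * output_seconds / 3))
    total = billed * rate
    if upscale is not None:
        if output_resolution != "720p":
            raise ValueError("the upscale addon is documented on top of the 720P base")
        total += output_seconds * UPSCALE_ADDON[upscale]
    return total
```

With these rates, `estimate_credits("sd2-fast", 5, upscale="1080p")` gives 5 × 32 + 5 × 20 = 260 and `estimate_credits("sd2", 5, [5], output_resolution="1080p")` gives (5 + 5) × 75 = 750, matching the two worked examples in this section.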

### Credits per task (5s output example)

| Config | sd2 (Pro) | sd2-fast (Fast) |
|--------|----------:|----------------:|
| 720P, no ref video | 200 | 160 |
| 720P, 5s ref video | 300 | 240 |
| 720P, 10s ref video | 450 | 360 |
| 1080P, no ref video | 500 | — |
| 1080P, 5s ref video | 750 | — |
| 1080P, 10s ref video | 1,125 | — |
| 720P + 2K upscale, no ref | 350 | 310 |
| 720P + 4K upscale, no ref | 400 | 360 |

HappyHorse 1.0 (per second, flat):

| Output | Rate |
|--------|------|
| 720P | 32 credits/sec |
| 1080P | 60 credits/sec |

5s examples: 720P = 160 credits, 1080P = 300 credits. 15s examples: 720P = 480 credits, 1080P = 900 credits. Reference-to-video and image-to-video share the same per-second rate as text-to-video — reference images are not billed.

### gpt-image-2 (per image, by resolution × quality)

| Resolution | Medium quality | High quality |
|------------|---------------:|-------------:|
| 1k | 15 | 70 |
| 2k (default) | 35 | 125 |
| 4k | 60 | 230 |

Each task produces 1 image. Submit N tasks for N variants. Failed tasks refund credits automatically.
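Since each variant is a separate task, the cost of a batch of variants is the per-image price times the number of tasks. A minimal lookup sketch with a hypothetical function name, using the table above:

```python
GPT_IMAGE_2_CREDITS = {  # (resolution, quality) -> credits per image
    ("1k", "medium"): 15, ("1k", "high"): 70,
    ("2k", "medium"): 35, ("2k", "high"): 125,
    ("4k", "medium"): 60, ("4k", "high"): 230,
}


def gpt_image_2_credits(resolution="2k", quality="medium", variants=1):
    """Credits for submitting `variants` tasks with the same settings."""
    if variants < 1:
        raise ValueError("variants must be at least 1")
    return variants * GPT_IMAGE_2_CREDITS[(resolution, quality)]
```

The defaults mirror the documented parameter defaults (`"2k"`, `"medium"`), so calling it with no arguments prices a single default task.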

Check balance: GET /api/v1/account/credits

## Webhook Callback

Provide `callBackUrl` in createTask to receive a POST when the task completes or fails. The payload has the same format as the queryTask response. Retry policy: up to 3 retries (after delays of 1s, 5s, and 30s) whenever your endpoint returns a non-2xx response.
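A receiver for these callbacks could be as small as the sketch below, built on Python's standard `http.server`. The names `handle_callback`, `CallbackHandler`, and `serve`, and the port, are hypothetical. The sketch does not authenticate the sender, which a production endpoint would likely need to address.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_callback(raw_body):
    """Process one callback body and return the HTTP status to reply with."""
    try:
        task = json.loads(raw_body)
        status = task["status"]
    except (ValueError, KeyError, TypeError):
        return 400  # non-2xx: the sender will retry
    if status == "COMPLETED":
        print("done:", [item.get("url") for item in task.get("output") or []])
    elif status == "FAILED":
        print("failed:", task.get("error"))
    return 200  # any 2xx stops further retries


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        code = handle_callback(self.rfile.read(length))
        self.send_response(code)
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Because a delivery may be retried, it is safer to make the handler idempotent, for example by keying any stored result on `taskId`.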

## Error Codes

| Code | Meaning |
|------|---------|
| 400 | Invalid parameters |
| 401 | Invalid or missing API key |
| 402 | Insufficient credits |
| 403 | Access denied (can only query own tasks) |
| 404 | Task not found |
| 429 | Concurrency limit reached |
| 500 | Internal server error |
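A client might map these codes to a message and a retry decision as sketched below. Treating 429 and 500 as retryable is an assumption about reasonable client policy, not something this API documents; the function name is hypothetical.

```python
ERROR_MEANINGS = {
    400: "Invalid parameters",
    401: "Invalid or missing API key",
    402: "Insufficient credits",
    403: "Access denied (can only query own tasks)",
    404: "Task not found",
    429: "Concurrency limit reached",
    500: "Internal server error",
}
RETRYABLE = {429, 500}  # assumption: client-side policy, not documented by the API


def describe_error(status_code):
    """Return (meaning, should_retry) for an HTTP status code."""
    meaning = ERROR_MEANINGS.get(status_code, "Unknown error")
    return meaning, status_code in RETRYABLE
```

Codes such as 400 or 402 will not succeed on a plain retry, so the sketch reports them as non-retryable until the request or the balance changes.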

## Quick Start (cURL)

```bash
# 1. Upload asset for review
ASSET=$(curl -s -X POST \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com/photo.jpg","type":"IMAGE"}' \
  https://seegen.ai/api/v1/assets/upload)
ASSET_ID=$(echo "$ASSET" | jq -r '.assetId')

# 2. Poll until asset is ACTIVE
while true; do
  STATUS=$(curl -s -H "Authorization: Bearer $API_KEY" \
    "https://seegen.ai/api/v1/assets/status?assetId=$ASSET_ID" | jq -r '.status')
  [ "$STATUS" = "ACTIVE" ] && break
  [ "$STATUS" = "FAILED" ] && echo "Review failed" && exit 1
  sleep 3
done
# volcAssetId is already part of the upload response from step 1
VOLC_ID=$(echo "$ASSET" | jq -r '.volcAssetId')

# 3. Create task with approved asset
TASK_ID=$(curl -s -X POST \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"sd2\",\"inputs\":{\"urls\":[\"asset://$VOLC_ID\"],\"prompt\":\"The person smiles\",\"duration\":\"5s\"}}" \
  https://seegen.ai/api/v1/jobs/createTask | jq -r '.taskId')

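# 4. Poll until the task reaches a terminal state, then print the output URL
while true; do
  RESULT=$(curl -s -H "Authorization: Bearer $API_KEY" \
    "https://seegen.ai/api/v1/jobs/queryTask?taskId=$TASK_ID")
  STATUS=$(echo "$RESULT" | jq -r '.status')
  [ "$STATUS" = "COMPLETED" ] && break
  [ "$STATUS" = "FAILED" ] && echo "Task failed" && exit 1
  sleep 5
done
echo "$RESULT" | jq -r '.output[0].url'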
```

For text-to-video (no asset upload needed):
```bash
curl -s -X POST \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"sd2","inputs":{"prompt":"A cat playing piano","duration":"5s"}}' \
  https://seegen.ai/api/v1/jobs/createTask
```

HappyHorse 1.0 example (no asset review required, public HTTPS URL works directly):
```bash
curl -s -X POST \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"happyhorse","inputs":{"urls":["https://example.com/photo.jpg"],"prompt":"Make it move","duration":"5s","outputResolution":"1080p"}}' \
  https://seegen.ai/api/v1/jobs/createTask
```

---

For full interactive documentation with code examples in cURL, JavaScript, and Python, visit: https://seegen.ai/api-docs