Algorithm Analysis Interface Specification Document
Document Properties
| Property | Details |
|---|---|
| Document Version | V1.6 |
| Creation Date | 2026-04-28 |
| Scope | Video Stream Detection Interface and VLM Algorithm Interface (currently supports prompt-driven video stream analysis and dynamic configuration updates) |
1. Overview
This document defines two categories of standardized interfaces for algorithm analysis:
- Video Stream Detection Interface: Currently defines five integration protocols for video streams: detection type query, start, stop, HTTP callback, and RTMP live streaming. The callback structure is explicitly specified for three types of anomaly detection results (fire_detection, bag_detection, no_helmet_detection), and `instanceId` is required to associate results with patrol task instances.
- VLM Algorithm Interface: Defines a prompt-driven video stream analysis protocol. The overall lifecycle, callback structure, and RTMP streaming conventions reuse those of the Video Stream Detection Interface. The differences are that the start interface additionally accepts a natural language prompt, dedicated VLM paths distinguish VLM tasks, and a dynamic configuration update interface is provided.
Both the Video Stream Detection Interface and the VLM Algorithm Interface specifications are finalized. The VLM interface shares the task control, callback, and streaming patterns of the Video Stream Detection Interface; it differs in the prompt parameter of the start request, the VLM-specific paths, and the dynamic configuration update interface.
2. General Specifications
- All HTTP interface requests and responses use JSON format with `Content-Type` fixed as `application/json`.
- The live video streaming link uses the RTMP protocol with FLV container format and H.264 video encoding.
- Interface base paths follow a unified module convention; all algorithm-related HTTP interfaces use the `analysis` module.
- HTTP write operation interfaces follow the "plural resource + action sub-path" style and uniformly use the `POST` method.
- HTTP interfaces requiring request tracing and result matching must include a globally unique `requestId`; UUID generation is recommended.
- The Video Stream Detection Interface consists of the detection type query interface, start interface, stop interface, HTTP callback interface, and RTMP live streaming. The legacy FTP/FTPS screenshot upload link has been deprecated and is not included in this version.
- Video stream detection tasks follow a lifecycle of "Start → Detect → Alarm Callback → Stop". The start, stop, and alarm callback operations must all carry `instanceId` to associate the current task instance.
- VLM algorithm interfaces follow the same "Start → Analyze → Callback → Stop" lifecycle as the Video Stream Detection Interface. The start, stop, and callback operations must all carry `instanceId`, and the start request must additionally include a `prompt` describing the analysis objective.
- The caller must provide an accessible callback/alarm receiving address; the algorithm service pushes analysis results to this address as specified. An inaccessible address is treated as a caller configuration error.
- Platform-side HTTP interface responses uniformly use a `Result<T>` structure with `code`, `message`, and `data` fields. On successful processing of a video stream detection callback, the platform returns `HTTP 200 + Result<null>`.
- On successful processing of a VLM callback, the platform similarly returns `HTTP 200 + Result<null>`; the failure response semantics are consistent with those of the video stream detection callback.
- Business status codes at the interface layer reuse existing definitions: `200` for success, `500` for failure.
- Algorithm analysis result codes are returned via the `resCode` field in the business data, strictly reusing the existing `AnalysisResultCodeEnum` enumeration. Custom codes are prohibited to ensure no changes to the system's parsing logic.
- All extension fields (`extParams` / `extInfo`) are optional and used to pass through personalized parameters without affecting core parsing logic.
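The unified response envelope described in the list above can be modeled in a few lines. The following is a minimal Python sketch, not the platform's implementation: the class name `Result` and the helpers `ok` and `fail` are hypothetical, and only the `code`, `message`, and `data` fields and the 200/500 business codes come from this document.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class Result:
    """Hypothetical model of the Result<T> envelope: code, message, data."""
    code: int
    message: str
    data: Optional[Any] = None

    def to_json(self) -> str:
        # Serialize the envelope as a JSON object with the three fields.
        return json.dumps(asdict(self))


def ok(data: Any = None) -> Result:
    # 200 = success, per the business status codes listed above.
    return Result(code=200, message="success", data=data)


def fail(message: str) -> Result:
    # 500 = failure, per the business status codes listed above.
    return Result(code=500, message=message, data=None)
```

Calling `ok().to_json()` yields the `Result<null>` body that the platform is expected to return for a successfully processed callback.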
3. Video Stream Detection Interface Definition
3.1 Interface Overview
| Configuration | Details |
|---|---|
| Interface Name | Video Stream Detection Interface |
| Interface Composition | Detection Type Query Interface + Start Interface + Stop Interface + HTTP Callback Interface + RTMP Live Streaming |
| Detection Type Query Path | /api/v1/analysis/video-stream-detect-types |
| Start Interface Path | /api/v1/analysis/video-stream-tasks/start |
| Stop Interface Path | /api/v1/analysis/video-stream-tasks/stop |
| HTTP Callback Path | /api/v1/analysis/video-stream-callbacks/create |
| HTTP Request Method | GET for detection type query; POST for all other HTTP interfaces |
| RTMP Push URL | Returned by the algorithm side upon successful start; format: rtmp://{host}:{port}/drtm/{instanceId} |
| Interface Description | The platform may first call the detection type query interface to retrieve the currently available video stream detection algorithm catalog, then call the start interface to submit the video stream URL and task instance ID. Upon successful acceptance, the algorithm side returns the RTMP push URL for the current task instance. Anomalies are reported via the fixed callback interface, and an annotated video stream is continuously pushed. When the task ends, the platform calls the stop interface to terminate detection. |
| Currently Available Detection Types | fire_detection, bag_detection, no_helmet_detection |
- Video stream detection tasks use `instanceId` as the primary business association key. The `instanceId` in start requests, stop requests, and alarm callbacks must be identical to accurately associate algorithm alerts with the current patrol task instance.
- Per API design conventions, task control interfaces use the `analysis` module with plural resource names and action sub-paths. The detection type query interface uses `GET` to retrieve the available catalog. The algorithm side should configure `reportUrl` as the full address of the fixed callback interface.
- If the platform has a unified gateway domain, the recommended full callback address is: `https://{domain}/api/v1/analysis/video-stream-callbacks/create`.
- The RTMP push URL is returned by the algorithm side in the successful start response. The `streamKey` is fixed to the current `instanceId`, meaning the push URL format is `rtmp://{host}:{port}/drtm/{instanceId}`.
- After a task ends, the platform must explicitly call the stop interface. After a successful stop, the algorithm side must not push new alerts or continue streaming for that `instanceId`.
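The push URL convention above (stream key fixed to the task instance ID) can be sketched as follows. This is an illustrative Python sketch only; the function names are hypothetical, and the rejection of `/` inside the instance ID is an assumption made here so that the ID stays a single path segment, not a rule stated in this document.

```python
def build_rtmp_push_url(host: str, port: int, instance_id: str) -> str:
    """Assemble rtmp://{host}:{port}/drtm/{instanceId}; the instanceId is the streamKey."""
    # Assumption: an instanceId containing '/' would break the single-segment stream key.
    if not instance_id or "/" in instance_id:
        raise ValueError("instanceId must be non-empty and must not contain '/'")
    return f"rtmp://{host}:{port}/drtm/{instance_id}"


def stream_key_of(rtmp_push_url: str) -> str:
    """Return the last path segment, which the convention fixes to the instanceId."""
    return rtmp_push_url.rsplit("/", 1)[-1]
```

A platform-side check could compare `stream_key_of(rtmpPushUrl)` against the `instanceId` it submitted in the start request.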
3.2 Video Stream Detection Type Query Interface Definition
3.2.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | Video Stream Detection Type Query Interface |
| Interface Path | /api/v1/analysis/video-stream-detect-types |
| Request Method | GET |
| Interface Description | Retrieves the currently available video stream detection algorithm type catalog, for use by the platform before configuring or initiating tasks |
- The path follows API design conventions: `/api/v1/{module}/...`, where `{module}=analysis` and the resource name uses the plural form `video-stream-detect-types`.
- The response uses a flat array structure where each element contains `item` and `itemDesc`, representing a detection type that can be directly submitted to a video stream detection task.
- The current version returns only published detection types; unreleased, in-beta, or internal test types are not returned.
3.2.2 Response Example
{
"code": 200,
"message": "success",
"data": [
{
"item": "fire_detection",
"itemDesc": "Fire Detection"
},
{
"item": "bag_detection",
"itemDesc": "Abandoned Object Detection"
},
{
"item": "no_helmet_detection",
"itemDesc": "No Helmet Detection"
}
]
}
3.2.3 Response Field Description
| Field | Type | Description |
|---|---|---|
| `code` | Integer | Unified business status code; 200 = query successful, 500 on failure |
| `message` | String | Unified response description; fixed as `success` on success, error details on failure |
| `data` | Array | List of currently available video stream detection types |
| `data.item` | String | Detection type code, used to identify a specific video stream detection capability |
| `data.itemDesc` | String | Human-readable description of the detection type, for frontend display or manual configuration |
3.2.4 Currently Available Detection Types
| `item` | `itemDesc` | Description |
|---|---|---|
| `fire_detection` | Fire Detection | Used to identify fire anomalies in video streams |
| `bag_detection` | Abandoned Bag Detection | Used to identify abandoned object anomalies in video streams |
| `no_helmet_detection` | No Helmet Detection | Used to identify human head targets not wearing safety helmets in video streams |
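A caller will typically turn the query response into a lookup table before configuring tasks. The sketch below shows one way to do that in Python; the function name `parse_detect_types` is hypothetical, and treating any `code` other than 200 as an error is an assumption based on the status codes described in this document.

```python
from typing import Dict


def parse_detect_types(response: dict) -> Dict[str, str]:
    """Map each `item` code to its `itemDesc` from a detection type query response."""
    # Assumption: any business code other than 200 is treated as a failed query.
    if response.get("code") != 200:
        raise RuntimeError(f"detection type query failed: {response.get('message')}")
    return {entry["item"]: entry["itemDesc"] for entry in response.get("data") or []}
```

The resulting dictionary can be used to check that a detection type is published before it is shown in a configuration page.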
3.3 Video Stream Detection Start Interface Definition
3.3.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | Video Stream Detection Start Interface |
| Interface Path | /api/v1/analysis/video-stream-tasks/start |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | The platform submits the video stream URL and starts the video stream detection task for the current task instance |
- The path follows API design conventions: `/api/v1/{module}/...`, where `{module}=analysis`, the resource name uses the plural form `video-stream-tasks`, and the action is `start`.
- The `instanceId` in the start request is the unique identifier of the current patrol task instance; the algorithm side must return this field unchanged in subsequent alarm callbacks.
- In the current version, one start request corresponds to one video stream detection task instance. To detect multiple video streams, a separate start request must be issued for each stream.
3.3.2 Request Example
{
"requestId": "550e8400-e29b-41d4-a716-446655440000",
"instanceId": "PATROL-INSTANCE-20260424-0001",
"videoStreamUrl": "rtsp://192.168.1.100/live/robot-camera-main"
}
3.3.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Unique identifier for the start request, used for request tracing; UUID is recommended |
| `instanceId` | String | Yes | Current task instance ID; algorithm alarm callbacks and stop requests both use this field to associate the same detection task |
| `videoStreamUrl` | String | Yes | URL of the video stream to be detected; may be RTSP/RTMP or other mutually agreed protocol URLs, must be accessible by the algorithm service |
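As an illustration of the field table above, the following Python sketch builds a start request body with a freshly generated UUID as `requestId`. The function name `build_start_request` is hypothetical, and rejecting empty strings is an assumption about how "required" would be enforced on the caller side.

```python
import uuid


def build_start_request(instance_id: str, video_stream_url: str) -> dict:
    """Build the JSON body for POST /api/v1/analysis/video-stream-tasks/start."""
    # Assumption: required fields must be non-empty strings.
    if not instance_id:
        raise ValueError("instanceId is required")
    if not video_stream_url:
        raise ValueError("videoStreamUrl is required")
    return {
        "requestId": str(uuid.uuid4()),  # UUID is the recommended requestId format
        "instanceId": instance_id,
        "videoStreamUrl": video_stream_url,
    }
```

One request body is built per video stream, since one start request corresponds to one task instance in the current version.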
3.3.4 Response Example
{
"code": 200,
"message": "success",
"data": {
"requestId": "550e8400-e29b-41d4-a716-446655440000",
"instanceId": "PATROL-INSTANCE-20260424-0001",
"rtmpPushUrl": "rtmp://10.10.10.20:1935/drtm/PATROL-INSTANCE-20260424-0001"
}
}
3.3.5 Response Field Description
| Field | Type | Description |
|---|---|---|
| `code` | Integer | Unified business status code; 200 = start request successfully accepted, 500 on failure |
| `message` | String | Unified response description; fixed as `success` on success, error details on failure |
| `data.requestId` | String | Identical to the `requestId` in the request, used for request tracing |
| `data.instanceId` | String | Identical to the `instanceId` in the request, representing the accepted detection task instance |
| `data.rtmpPushUrl` | String | RTMP push URL returned by the algorithm side; the `streamKey` is fixed to the current `instanceId` |
3.4 Video Stream Detection Stop Interface Definition
3.4.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | Video Stream Detection Stop Interface |
| Interface Path | /api/v1/analysis/video-stream-tasks/stop |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | Called by the platform when a video stream detection task ends, instructing the algorithm side to stop detection for the specified task instance |
- The stop interface uses `instanceId` as the primary stop condition. If multiple control requests exist for the same task, the state of the `instanceId` from the most recent valid start takes precedence.
- The platform must call the stop interface to release algorithm resources whenever a task completes, is cancelled, or is no longer needed.
3.4.2 Request Example
{
"requestId": "9e0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
"instanceId": "PATROL-INSTANCE-20260424-0001"
}
3.4.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Unique identifier for the stop request, used for request tracing; UUID is recommended |
| `instanceId` | String | Yes | Current task instance ID; the algorithm side uses this field to stop the corresponding detection task |
3.4.4 Response Example
{
"code": 200,
"message": "success",
"data": {
"requestId": "9e0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
"instanceId": "PATROL-INSTANCE-20260424-0001"
}
}
3.4.5 Response Field Description
| Field | Type | Description |
|---|---|---|
| `code` | Integer | Unified business status code; 200 = stop request successfully accepted, 500 on failure |
| `message` | String | Unified response description; fixed as `success` on success, error details on failure |
| `data.requestId` | String | Identical to the `requestId` in the request, used for request tracing |
| `data.instanceId` | String | Identical to the `instanceId` in the request, representing the detection task instance whose stop has been accepted |
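The start and stop interfaces together imply a simple per-instance state: a task is active between an accepted start and an accepted stop, and no alerts are expected for an instance after it has been stopped. The Python sketch below illustrates that bookkeeping with a hypothetical in-memory `TaskRegistry`; it is an assumption about how either side might track state, not part of the interface contract.

```python
class TaskRegistry:
    """Hypothetical in-memory tracker of active task instances, keyed by instanceId."""

    def __init__(self) -> None:
        self._active = set()

    def start(self, instance_id: str) -> None:
        # Record the instance once the start request has been accepted.
        self._active.add(instance_id)

    def stop(self, instance_id: str) -> bool:
        """Return True if the instance was active and has now been stopped."""
        if instance_id in self._active:
            self._active.remove(instance_id)
            return True
        return False

    def accepts_callback(self, instance_id: str) -> bool:
        # After a successful stop, no new alerts are expected for this instanceId.
        return instance_id in self._active
```

A production implementation would likely need persistence and concurrency control, which are outside the scope of this sketch.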
3.5 Video Stream Detection Callback Interface Definition
3.5.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | Video Stream Detection Callback Interface |
| Interface Path | /api/v1/analysis/video-stream-callbacks/create |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | Upon detecting an anomaly, the algorithm side invokes the callback in batches to report video stream detection results; the callback must include the task instance ID submitted at start time |
- The callback interface is provided by the platform side; the algorithm side configures the full callback address via `reportUrl`.
- The path follows API design conventions: `/api/v1/{module}/...`, where `{module}=analysis`, the resource name uses the plural form `video-stream-callbacks`, and write operations use `POST .../create`.
- All request body / response body fields use camelCase.
3.5.2 Request Example
{
"requestId": "550e8400-e29b-41d4-a716-446655440000",
"instanceId": "PATROL-INSTANCE-20260424-0001",
"resultsList": [
{
"objectId": "1212",
"results": [
{
"type": "fire_detection",
"value": "1",
"code": "2000",
"resImageUrl": "detect-results/result_1745318400.jpg",
"pos": [],
"conf": 1.0,
"desc": "fire_detection"
},
{
"type": "no_helmet_detection",
"value": "1",
"code": "2000",
"resImageUrl": "detect-results/result_1745318400.jpg",
"pos": [],
"conf": 1.0,
"desc": "no_helmet_detection"
}
]
}
]
}
3.5.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Start request unique identifier; recommended to match the `requestId` in the start interface for request tracing |
| `instanceId` | String | Yes | Current task instance ID; must be identical to the `instanceId` in the start interface to associate alarm results with the specific task instance |
| `resultsList` | Array | Yes | Result list; current implementation contains exactly one object |
| `resultsList.objectId` | String | Yes | Detection object ID; current implementation is fixed as `1212` |
| `resultsList.results` | Array | Yes | Detection result list; dynamically generated based on the actual detected anomaly types |
| `results.type` | String | Yes | Anomaly type; see "3.5.4 Detection Type Enumeration" for valid values |
| `results.value` | String | Yes | Detection value; current implementation is fixed as `1`, indicating an anomaly was detected in this round |
| `results.code` | String | Yes | Result code; current implementation is fixed as `2000`, strictly reusing `AnalysisResultCodeEnum` |
| `results.resImageUrl` | String | No | Alarm image path; represents the object path of the alarm image in object storage, using the S3 protocol |
| `results.pos` | Array | No | Detection bounding box position; current implementation returns an empty array |
| `results.conf` | Number | No | Confidence score; current implementation is fixed as `1.0` |
| `results.desc` | String | Yes | Description; current implementation matches `type` |
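The required/optional columns above translate directly into a payload check. The following Python sketch is one possible platform-side validator; the function name `validate_callback`, the constant names, and the error message wording are all hypothetical, while the field names and the three type values come from this document.

```python
KNOWN_TYPES = {"fire_detection", "bag_detection", "no_helmet_detection"}
REQUIRED_TOP = ("requestId", "instanceId", "resultsList")
REQUIRED_ITEM = ("type", "value", "code", "desc")


def validate_callback(payload: dict) -> list:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = [f"missing {name}" for name in REQUIRED_TOP if name not in payload]
    for i, obj in enumerate(payload.get("resultsList") or []):
        for name in ("objectId", "results"):
            if name not in obj:
                errors.append(f"resultsList[{i}]: missing {name}")
        for j, item in enumerate(obj.get("results") or []):
            where = f"resultsList[{i}].results[{j}]"
            # Only fields marked Required in the table are checked for presence.
            errors += [f"{where}: missing {name}" for name in REQUIRED_ITEM if name not in item]
            if "type" in item and item["type"] not in KNOWN_TYPES:
                errors.append(f"{where}: unsupported type {item['type']}")
    return errors
```

Optional fields such as `resImageUrl`, `pos`, and `conf` are deliberately not checked, so payloads that omit them still pass.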
3.5.4 Detection Type Enumeration
| `type` Value | Trigger Condition | Detection Logic |
|---|---|---|
| `fire_detection` | Fire detected | fire_bag model cls=0 |
| `bag_detection` | Abandoned bag detected | fire_bag model cls=1 |
| `no_helmet_detection` | Human head without helmet detected | head_aqm model cls=0 with a pedestrian bounding box present |
3.5.5 Response Convention
- Upon successful processing of a callback, the platform must return `HTTP 200` with a unified `Result<null>` response body, for example:
{
"code": 200,
"message": "success",
"data": null
}
- When the request body is missing required fields, contains invalid field formats, or includes unsupported enum values, it is recommended to return `HTTP 400`, for example:
{
"code": 40001,
"message": "Parameter validation failed",
"data": null
}
- When an internal exception occurs during platform processing, it is recommended to return `HTTP 500`, for example:
{
"code": 50000,
"message": "Internal server error",
"data": null
}
- The current implementation logs callback response content. Retry behavior is determined by the algorithm side according to its own retry policy.
- The HTTP callback call timeout is fixed at `10s`.
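The three response cases above can be sketched as a single handler function. This is a minimal, framework-free Python sketch: the function name `handle_callback` and the `process` parameter are hypothetical, the validation is reduced to a presence check on the top-level required fields, and the codes `40001` and `50000` are taken from the examples above.

```python
from typing import Callable, Tuple


def handle_callback(payload: dict, process: Callable[[dict], None]) -> Tuple[int, dict]:
    """Return (HTTP status, Result body) following the response convention above."""
    required = ("requestId", "instanceId", "resultsList")
    if any(name not in payload for name in required):
        return 400, {"code": 40001, "message": "Parameter validation failed", "data": None}
    try:
        # `process` stands in for the platform's own business handling.
        process(payload)
    except Exception:
        return 500, {"code": 50000, "message": "Internal server error", "data": None}
    return 200, {"code": 200, "message": "success", "data": None}
```

In a real service the returned tuple would be mapped onto the web framework's response object.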
3.5.6 Trigger Timing and Throttling Rules
- The platform receives alarm results via callbacks and does not depend on the algorithm's internal throttling parameters.
- The algorithm side may maintain throttling strategies as needed (e.g., alarm deduplication, time interval control), but these are internal implementation details and not part of the platform interface contract.
- The platform integration focus is: reliably receiving callback requests triggered when an anomaly is detected, and accurately attributing alarm results to the corresponding task instance based on `instanceId`.
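For readers on the algorithm side, time interval control of the kind mentioned above might look like the following Python sketch. It is purely illustrative and not part of the interface contract: the class `AlarmThrottle`, its parameters, and the choice to key on the pair of instance ID and alarm type are all assumptions.

```python
import time


class AlarmThrottle:
    """Hypothetical throttle: at most one report per (instanceId, type) per interval."""

    def __init__(self, min_interval_s: float, clock=time.monotonic) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._last_reported = {}

    def should_report(self, instance_id: str, alarm_type: str) -> bool:
        key = (instance_id, alarm_type)
        now = self._clock()
        last = self._last_reported.get(key)
        if last is not None and now - last < self._min_interval_s:
            return False  # still inside the quiet interval for this key
        self._last_reported[key] = now
        return True
```

Because the platform does not depend on these parameters, the interval can be tuned on the algorithm side without any interface change.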
3.6 RTMP Streaming Interface Definition
3.6.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | Video Stream RTMP Streaming Interface |
| Push URL | rtmp://{host}:{port}/drtm/{instanceId} |
| Protocol | RTMP |
| Video Encoding | H.264 (libx264) |
| Container Format | FLV |
| Interface Description | Upon successful acceptance of the start request, the algorithm side returns the corresponding RTMP URL and continuously pushes an AI-annotated video stream for real-time consumption by the backend or media server |
3.6.2 Streaming Parameters
| Parameter | Value | Description |
|---|---|---|
| Resolution | 1280×720 | Hard-coded in current implementation |
| Frame Rate | 15 fps | From fps configuration |
| Encoding | H.264 (libx264) | Software encoding |
| Bitrate | 1200k | Fixed bitrate |
| Pixel Format | yuv420p | Standard compatible format |
| Preset | superfast | Encoding speed priority |
| Tune | zerolatency | Low latency |
| GOP | 10 | Keyframe interval of 10 frames |
| Container Format | FLV | Standard RTMP container format |
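The table above maps closely onto standard FFmpeg command-line options. The Python sketch below builds such an argument list; it assumes, without confirmation from this document, that the algorithm side pipes raw BGR frames into an `ffmpeg` process on stdin, and the function name `build_ffmpeg_args` is hypothetical.

```python
def build_ffmpeg_args(rtmp_push_url: str, fps: int = 15) -> list:
    """Build an ffmpeg argument list matching the streaming parameters table."""
    return [
        "ffmpeg",
        "-f", "rawvideo",        # assumption: raw frames are piped on stdin
        "-pix_fmt", "bgr24",     # assumption: OpenCV-style BGR input frames
        "-s", "1280x720",        # resolution from the table
        "-r", str(fps),          # frame rate from the fps configuration
        "-i", "-",               # read input from stdin
        "-c:v", "libx264",       # H.264 software encoding
        "-b:v", "1200k",         # fixed bitrate
        "-pix_fmt", "yuv420p",   # output pixel format
        "-preset", "superfast",  # encoding speed priority
        "-tune", "zerolatency",  # low latency
        "-g", "10",              # keyframe interval of 10 frames
        "-f", "flv",             # FLV container for RTMP
        rtmp_push_url,
    ]
```

The list can be passed to `subprocess.Popen` with `stdin=subprocess.PIPE`, after which each annotated frame is written to the process's stdin.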
3.6.3 Streaming Content
- Raw video frames are output together with AI detection bounding box annotations.
- Current detection bounding box color mapping:
| Detection Target | Annotation Color |
|---|---|
| Fire | (0, 255, 0) |
| Abandoned Bag | (0, 0, 255) |
| Human Head | (255, 0, 0) |
| Safety Helmet | (0, 255, 255) |
3.6.4 Receiver Requirements
- The receiver must deploy an RTMP server (e.g., Nginx-RTMP, SRS, Red5) with the listening port configured per the actual deployment.
- The RTMP server must accept continuous streaming to the path `/drtm/{instanceId}`.
- `instanceId` is used directly as the `streamKey` for a single video stream. The algorithm side assembles the full `rtmpPushUrl` after a successful start and returns it to the platform.
- For the same `instanceId`, the streaming lifecycle must remain consistent with the detection task lifecycle; streaming must terminate when the task stops.
3.7 Result Code Enum Reuse Specification
Note: The enumeration in this section represents business result codes for video stream alarm result items (`results[].code`) and is not equivalent to HTTP-layer status codes.
| Enum Value | Code | Applicable Description for Video Stream Detection |
|---|---|---|
| `SUCCESS` | 2000 | All anomaly detection results reported in the current version use this code, indicating the anomaly has been successfully identified and written to `results[]` |
Note 1: The current version only reports alarm results when an anomaly is detected; no separate result item is defined for "no anomaly detected."
Note 2: If future versions need to report anomalous results such as frame extraction failures or analysis failures during video stream detection, existing result codes from the appendix must still be reused; adding custom codes is prohibited.
4. VLM Algorithm Interface Definition
4.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | VLM Algorithm Interface |
| Interface Composition | Start Interface + Stop Interface + HTTP Callback Interface + RTMP Live Streaming + Dynamic Configuration Update Interface |
| Start Interface Path | /api/v1/analysis/vlm-video-stream-tasks/start |
| Stop Interface Path | /api/v1/analysis/vlm-video-stream-tasks/stop |
| Configuration Update Path | /api/v1/analysis/vlm-video-stream-tasks/update |
| HTTP Callback Path | /api/v1/analysis/vlm-video-stream-callbacks/create |
| HTTP Request Method | POST |
| RTMP Push URL | Returned by the algorithm side upon successful start; format: rtmp://{host}:{port}/drtm/{instanceId} |
| Interface Description | The platform submits the video stream URL, task instance ID, and a natural language prompt. Upon successful acceptance, the algorithm side returns the RTMP push URL for the current task instance. When a target matching the prompt is identified, results are reported via the fixed callback interface, and an annotated video stream is continuously pushed. During task execution, the prompt and extension parameters can be updated online via the dynamic configuration update interface. When the task ends, the platform calls the stop interface to terminate analysis. |
- The VLM interface's overall interaction pattern, response structure, callback response convention, and RTMP streaming specification all reuse those defined in Section 3 (Video Stream Detection Interface).
- The main protocol difference from the Video Stream Detection Interface is that the VLM start interface must include a `prompt` field describing the analysis objective for this video stream session, e.g., "Check the video stream for a specific defect."
- To avoid mixing with rule-based video stream detection tasks, VLM interfaces use dedicated paths: `vlm-video-stream-tasks` and `vlm-video-stream-callbacks`.
- VLM tasks also use `instanceId` as the primary business association key. The `instanceId` in start requests, stop requests, configuration update requests, and callbacks must all be identical.
- During a running VLM task, the platform may update the `prompt` and other runtime parameters online via the dynamic configuration update interface without stopping and restarting the task.
4.2 VLM Analysis Start Interface Definition
4.2.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | VLM Analysis Start Interface |
| Interface Path | /api/v1/analysis/vlm-video-stream-tasks/start |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | The platform submits the video stream URL, current task instance ID, and prompt to start the VLM video stream analysis task for the current task instance |
- The path follows API design conventions: `/api/v1/{module}/...`, where `{module}=analysis`, the resource name uses the plural form `vlm-video-stream-tasks`, and the action is `start`.
- The `instanceId` in the start request is the unique identifier of the current patrol task instance; the algorithm side must return this field unchanged in subsequent callbacks.
- `prompt` is the core control parameter for this VLM analysis session; clear, actionable, and observable natural language descriptions are recommended to avoid ambiguity.
- In the current version, one start request corresponds to one VLM video stream analysis task instance. To analyze multiple video streams simultaneously, a separate start request must be issued for each stream.
4.2.2 Request Example
{
"requestId": "650e8400-e29b-41d4-a716-446655440000",
"instanceId": "PATROL-INSTANCE-20260427-0001",
"videoStreamUrl": "rtsp://192.168.1.100/live/robot-camera-main",
"prompt": "Check the video stream for a specific defect"
}
4.2.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Unique identifier for the start request, used for request tracing; UUID is recommended |
| `instanceId` | String | Yes | Current task instance ID; algorithm callbacks and stop requests both use this field to associate the same analysis task |
| `videoStreamUrl` | String | Yes | URL of the video stream to be analyzed; may be RTSP/RTMP or other mutually agreed protocol URLs, must be accessible by the algorithm service |
| `prompt` | String | Yes | VLM prompt, describing the analysis objective for this video stream session, e.g., "Check the video stream for a specific defect" |
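Compared with the rule-based start request, the only additional field is `prompt`. The Python sketch below builds a VLM start request body; the function name `build_vlm_start_request` is hypothetical, and rejecting a blank prompt and trimming surrounding whitespace are assumptions about caller-side validation rather than rules stated here.

```python
import uuid


def build_vlm_start_request(instance_id: str, video_stream_url: str, prompt: str) -> dict:
    """Build the JSON body for POST /api/v1/analysis/vlm-video-stream-tasks/start."""
    # Assumption: all four fields are required and a whitespace-only prompt is rejected.
    if not instance_id or not video_stream_url:
        raise ValueError("instanceId and videoStreamUrl are required")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required and must not be blank")
    return {
        "requestId": str(uuid.uuid4()),  # UUID is the recommended requestId format
        "instanceId": instance_id,
        "videoStreamUrl": video_stream_url,
        "prompt": prompt.strip(),
    }
```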
4.2.4 Response Example
{
"code": 200,
"message": "success",
"data": {
"requestId": "650e8400-e29b-41d4-a716-446655440000",
"instanceId": "PATROL-INSTANCE-20260427-0001",
"rtmpPushUrl": "rtmp://10.10.10.20:1935/drtm/PATROL-INSTANCE-20260427-0001"
}
}
4.2.5 Response Field Description
| Field | Type | Description |
|---|---|---|
| `code` | Integer | Unified business status code; 200 = start request successfully accepted, 500 on failure |
| `message` | String | Unified response description; fixed as `success` on success, error details on failure |
| `data.requestId` | String | Identical to the `requestId` in the request, used for request tracing |
| `data.instanceId` | String | Identical to the `instanceId` in the request, representing the accepted VLM analysis task instance |
| `data.rtmpPushUrl` | String | RTMP push URL returned by the algorithm side; the `streamKey` is fixed to the current `instanceId` |
4.3 VLM Analysis Stop Interface Definition
4.3.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | VLM Analysis Stop Interface |
| Interface Path | /api/v1/analysis/vlm-video-stream-tasks/stop |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | Called by the platform when a VLM video stream analysis task ends, instructing the algorithm side to stop VLM analysis for the specified task instance |
- The stop interface uses `instanceId` as the primary stop condition. If multiple control requests exist for the same task, the state of the `instanceId` from the most recent valid start takes precedence.
- The platform must call the stop interface to release algorithm resources whenever a task completes, is cancelled, or analysis is no longer needed.
4.3.2 Request Example
{
"requestId": "7c0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
"instanceId": "PATROL-INSTANCE-20260427-0001"
}
4.3.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Unique identifier for the stop request, used for request tracing; UUID is recommended |
| `instanceId` | String | Yes | Current task instance ID; the algorithm side uses this field to stop the corresponding VLM analysis task |
4.3.4 Response Example
{
"code": 200,
"message": "success",
"data": {
"requestId": "7c0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
"instanceId": "PATROL-INSTANCE-20260427-0001"
}
}
4.3.5 Response Field Description
| Field | Type | Description |
|---|---|---|
| `code` | Integer | Unified business status code; 200 = stop request successfully accepted, 500 on failure |
| `message` | String | Unified response description; fixed as `success` on success, error details on failure |
| `data.requestId` | String | Identical to the `requestId` in the request, used for request tracing |
| `data.instanceId` | String | Identical to the `instanceId` in the request, representing the VLM analysis task instance whose stop has been accepted |
4.4 VLM Analysis Callback Interface Definition
4.4.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | VLM Analysis Callback Interface |
| Interface Path | /api/v1/analysis/vlm-video-stream-callbacks/create |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | Upon identifying a target matching the prompt description, the algorithm side invokes the callback in batches to report VLM video stream analysis results; the callback must include the task instance ID submitted at start time |
- The callback interface is provided by the platform side; the algorithm side configures the full callback address via `reportUrl`.
- The path follows API design conventions: `/api/v1/{module}/...`, where `{module}=analysis`, the resource name uses the plural form `vlm-video-stream-callbacks`, and write operations use `POST .../create`.
- All request body / response body fields use camelCase. Except for the path and business semantics, the overall structure is consistent with the Video Stream Detection Callback Interface.
4.4.2 Request Example
{
"requestId": "650e8400-e29b-41d4-a716-446655440000",
"instanceId": "PATROL-INSTANCE-20260427-0001",
"resultsList": [
{
"objectId": "1212",
"results": [
{
"type": "vlm_detection",
"value": "1",
"code": "2000",
"resImageUrl": "detect-results/result_1745318400.jpg",
"pos": [],
"conf": 1.0,
"desc": "Check the video stream for a specific defect"
}
]
}
]
}
4.4.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Start request unique identifier; recommended to match the `requestId` in the start interface for request tracing |
| `instanceId` | String | Yes | Current task instance ID; must be identical to the `instanceId` in the start interface to associate analysis results with the specific task instance |
| `resultsList` | Array | Yes | Result list; current implementation contains exactly one object |
| `resultsList.objectId` | String | Yes | Detection object ID; current implementation is fixed as `1212` |
| `resultsList.results` | Array | Yes | Analysis result list; dynamically generated based on the actual prompt-matching targets identified |
| `results.type` | String | Yes | Result type; fixed as `vlm_detection` in VLM scenarios |
| `results.value` | String | Yes | Detection value; current implementation is fixed as `1`, indicating a target matching the prompt was identified |
| `results.code` | String | Yes | Result code; current implementation is fixed as `2000`, strictly reusing `AnalysisResultCodeEnum` |
| `results.resImageUrl` | String | No | Alarm image path; represents the object path of the alarm image in object storage, using the S3 protocol |
| `results.pos` | Array | No | Detection bounding box position; current implementation returns an empty array |
| `results.conf` | Number | No | Confidence score; current implementation is fixed as `1.0` |
| `results.desc` | String | Yes | Description; recommended to return the prompt from the start interface unchanged or as an equivalent normalized description |
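On the platform side, a VLM callback is usually flattened into per-hit records attributed to the task instance. The Python sketch below shows one way to do this; the function name `extract_vlm_hits` and the tuple layout are hypothetical, and raising on a `type` other than `vlm_detection` is an assumption derived from the "fixed as `vlm_detection`" rule in the table.

```python
from typing import List, Tuple


def extract_vlm_hits(payload: dict) -> List[Tuple[str, str, str]]:
    """Return (instanceId, desc, resImageUrl) for each vlm_detection result item."""
    instance_id = payload["instanceId"]
    hits = []
    for obj in payload.get("resultsList", []):
        for res in obj.get("results", []):
            if res.get("type") != "vlm_detection":
                raise ValueError(f"unexpected result type: {res.get('type')!r}")
            # resImageUrl is optional in the field table, so default to an empty string.
            hits.append((instance_id, res["desc"], res.get("resImageUrl", "")))
    return hits
```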
4.4.4 Response Convention
- Upon successful processing of a callback, the platform must return `HTTP 200` with a unified `Result<null>` response body, for example:
```json
{
  "code": 200,
  "message": "success",
  "data": null
}
```
- When the request body is missing required fields, contains invalid field formats, or includes unsupported enum values, it is recommended to return `HTTP 400`, for example:
```json
{
  "code": 40001,
  "message": "Parameter validation failed",
  "data": null
}
```
- When an internal exception occurs during platform processing, it is recommended to return `HTTP 500`, for example:
```json
{
  "code": 50000,
  "message": "Internal server error",
  "data": null
}
```
- The current implementation logs callback response content. Retry behavior is determined by the algorithm side according to its own retry policy.
- The HTTP callback timeout is fixed at `10s`.
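The three response cases above can be sketched as a framework-independent handler on the platform side. This is an illustration under assumptions: the function names, the `process` callable, and the exact validation rules are hypothetical, and only the result fields from the callback field table are checked here; a real implementation would also validate the top-level fields.

```python
def make_result(code, message, data=None):
    """Build the unified Result body used in the callback responses."""
    return {"code": code, "message": message, "data": data}


# Required fields of each result item, per the callback field table.
REQUIRED_ITEM_FIELDS = ("type", "value", "code", "desc")


def is_valid_callback(body):
    """Return True if the resultsList part of the body looks well-formed."""
    if not isinstance(body, dict):
        return False
    results_list = body.get("resultsList")
    if not isinstance(results_list, list) or not results_list:
        return False
    for obj in results_list:
        if not isinstance(obj, dict) or not isinstance(obj.get("objectId"), str):
            return False
        items = obj.get("results")
        if not isinstance(items, list):
            return False
        for item in items:
            if not isinstance(item, dict):
                return False
            if any(field not in item for field in REQUIRED_ITEM_FIELDS):
                return False
            if item["type"] != "vlm_detection":  # unsupported enum value
                return False
    return True


def handle_callback(body, process):
    """Return (http_status, response_body) for one callback request.

    `process` is a hypothetical callable that stores or forwards the result.
    """
    if not is_valid_callback(body):
        return 400, make_result(40001, "Parameter validation failed")
    try:
        process(body)
    except Exception:
        return 500, make_result(50000, "Internal server error")
    return 200, make_result(200, "success")
```

Keeping validation outside the `try` block means a malformed request maps to `HTTP 400`, while only failures in platform processing map to `HTTP 500`.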
4.4.5 Trigger Timing and Throttling Rules
- The platform receives analysis results via callbacks and does not depend on the algorithm's internal throttling parameters.
- The algorithm side may maintain throttling strategies as needed (e.g., deduplication of identical prompt results, time interval control), but these are internal implementation details and not part of the platform interface contract.
- The platform integration focus is: reliably receiving callback requests triggered when a target matching the prompt description is identified, and accurately attributing results to the corresponding task instance based on `instanceId`.
4.5 VLM RTMP Streaming Interface Definition
4.5.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | VLM Video Stream RTMP Streaming Interface |
| Push URL | rtmp://{host}:{port}/drtm/{instanceId} |
| Protocol | RTMP |
| Video Encoding | H.264 (libx264) |
| Container Format | FLV |
| Interface Description | Upon successful acceptance of the start request, the algorithm side returns the corresponding RTMP URL and continuously pushes a VLM-annotated video stream for real-time consumption by the backend or media server |
4.5.2 Streaming Parameters
| Parameter | Value | Description |
|---|---|---|
| Resolution | 1280×720 | Hard-coded in current implementation |
| Frame Rate | 15 fps | From fps configuration |
| Encoding | H.264 (libx264) | Software encoding |
| Bitrate | 1200k | Fixed bitrate |
| Pixel Format | yuv420p | Standard compatible format |
| Preset | superfast | Encoding speed priority |
| Tune | zerolatency | Low latency |
| GOP | 10 | Keyframe interval of 10 frames |
| Container Format | FLV | Standard RTMP container format |
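As an illustration, the parameters in the table map onto standard FFmpeg command-line options roughly as follows. This is a sketch, not the algorithm side's actual command: the function name is hypothetical, and the input side (raw BGR frames piped on stdin) is an assumption that the specification does not state.

```python
def build_ffmpeg_push_args(rtmp_url, fps=15):
    """Return an FFmpeg argument list matching the streaming parameter table.

    Assumption: annotated frames arrive as raw 1280x720 BGR frames on stdin.
    """
    return [
        "ffmpeg",
        # Input side (assumed): raw frames read from stdin.
        "-f", "rawvideo", "-pix_fmt", "bgr24",
        "-s", "1280x720", "-r", str(fps), "-i", "-",
        # Output side: values taken from the parameter table.
        "-c:v", "libx264",        # H.264 software encoding
        "-b:v", "1200k",          # fixed bitrate
        "-pix_fmt", "yuv420p",    # standard compatible pixel format
        "-preset", "superfast",   # encoding speed priority
        "-tune", "zerolatency",   # low latency
        "-g", "10",               # keyframe interval of 10 frames
        "-f", "flv",              # RTMP container format
        rtmp_url,
    ]
```

The returned list can be passed to `subprocess.Popen(args, stdin=subprocess.PIPE)`, with each annotated frame written to the process's stdin.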
4.5.3 Streaming Content
- Raw video frames are output together with VLM analysis annotations.
- Annotation content is determined by the targets matching the prompt and the algorithm implementation. It is recommended to display the matching bounding box, prompt summary, and relevant confidence information in the frame.
4.5.4 Receiver Requirements
- The receiver must deploy an RTMP server (e.g., Nginx-RTMP, SRS, Red5) with the listening port configured per the actual deployment.
- The RTMP server must accept continuous streaming to the path `/drtm/{instanceId}`.
- `instanceId` is used directly as the `streamKey` for a single video stream. The algorithm side assembles the full `rtmpPushUrl` after a successful start and returns it to the platform.
- For the same `instanceId`, the streaming lifecycle must remain consistent with the VLM analysis task lifecycle; streaming must terminate when the task stops.
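The URL convention above can be sketched with two small helpers. Both function names are hypothetical, and rejecting an `instanceId` that contains `/` is an added safety assumption rather than a rule stated in this document.

```python
from urllib.parse import urlparse

STREAM_PATH_PREFIX = "/drtm/"


def build_rtmp_push_url(host, port, instance_id):
    """Assemble rtmpPushUrl using instanceId directly as the streamKey."""
    if not instance_id or "/" in instance_id:
        # Assumption: a streamKey must be a single non-empty path segment.
        raise ValueError("instanceId must be a non-empty single path segment")
    return f"rtmp://{host}:{port}{STREAM_PATH_PREFIX}{instance_id}"


def stream_key_from_url(rtmp_push_url):
    """Recover the instanceId (streamKey) from a push URL."""
    path = urlparse(rtmp_push_url).path
    if not path.startswith(STREAM_PATH_PREFIX):
        raise ValueError("push URL does not use the /drtm/ path")
    return path[len(STREAM_PATH_PREFIX):]
```

For example, with a hypothetical media server at `media.example.com` on port 1935, `build_rtmp_push_url("media.example.com", 1935, "PATROL-INSTANCE-20260427-0001")` returns `rtmp://media.example.com:1935/drtm/PATROL-INSTANCE-20260427-0001`.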
4.6 Result Code Enum Reuse Specification
Note: The enumeration in this section represents business result codes for VLM analysis result items (`results[].code`) and is not equivalent to HTTP-layer status codes.
| Enum Value | Code | Applicable Description for VLM Scenarios |
|---|---|---|
| `SUCCESS` | 2000 | All VLM prompt-matching results reported in the current version use this code, indicating the anomaly or item of interest has been successfully identified and written to `results[]` |
Note 1: The current version only reports results when a target matching the prompt description is detected; no separate result item is defined for "prompt not matched."
Note 2: If future versions need to report anomalous results such as frame extraction failures or analysis failures during VLM analysis, existing result codes from the appendix must still be reused; adding custom codes is prohibited.
4.7 VLM Dynamic Configuration Update Interface Definition
4.7.1 Interface Basic Information
| Configuration | Details |
|---|---|
| Interface Name | VLM Dynamic Configuration Update Interface |
| Interface Path | /api/v1/analysis/vlm-video-stream-tasks/update |
| Request Method | POST |
| `Content-Type` | application/json |
| Interface Description | During a running VLM task, the platform may use this interface to update the prompt or extension configuration parameters for the current task instance online. The algorithm side applies the update immediately upon receipt without stopping and restarting the task. |
- The path follows API design conventions: `/api/v1/{module}/...`, where `{module}` = `analysis`, the resource name uses the plural form `vlm-video-stream-tasks`, and the action is `update`.
- The `instanceId` in the update request must be the ID of a currently running VLM task instance. The algorithm side verifies that the task exists and is in an updatable state before returning an acceptance result.
- The current version only supports calling this interface while a task is running. Calling it when the task is stopped or not yet started will return a failed acceptance response.
- The update operation does not affect the current RTMP streaming link; the `streamKey` and push URL remain the same as when the task was started.
4.7.2 Request Example
```json
{
  "requestId": "8d1e7b2e-4f65-4c82-8f63-0bc4f9d3857f",
  "instanceId": "PATROL-INSTANCE-20260427-0001",
  "prompt": "Updated prompt description, e.g., check if smoke appears in the video stream",
  "report_interval": 5
}
```
4.7.3 Field Description
| Field | Type | Required | Description |
|---|---|---|---|
| `requestId` | String | Yes | Unique identifier for the update request, used for request tracing; UUID is recommended |
| `instanceId` | String | Yes | Current task instance ID; must be identical to the `instanceId` in the start interface; the algorithm side uses this field to locate the running VLM analysis task |
| `prompt` | String | No | Updated VLM prompt, used to replace the analysis objective description of the currently running task; if not provided, the prompt is not updated |
| `report_interval` | Integer | No | Reporting interval (seconds), controlling the minimum time interval between VLM analysis result callbacks; if not provided, the current effective value is retained |
Note: At least one of `prompt` or `report_interval` must be provided; otherwise the request is treated as an invalid update request.
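The field rules above, including the at-least-one constraint, can be sketched as a validator. The function name and error messages are hypothetical. The check only enforces what the field table states (types, required fields, and the at-least-one rule); it does not assume a valid range for `report_interval`, since none is specified.

```python
def validate_update_request(body):
    """Return a list of validation errors; an empty list means the request is valid."""
    errors = []
    if not isinstance(body, dict):
        return ["request body must be a JSON object"]

    # requestId and instanceId are required, non-empty strings.
    for field in ("requestId", "instanceId"):
        value = body.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"{field} is required and must be a non-empty string")

    has_prompt = "prompt" in body
    has_interval = "report_interval" in body

    if has_prompt and not isinstance(body["prompt"], str):
        errors.append("prompt must be a string")
    if has_interval:
        interval = body["report_interval"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(interval, bool) or not isinstance(interval, int):
            errors.append("report_interval must be an integer")

    # At least one of prompt or report_interval must be provided.
    if not (has_prompt or has_interval):
        errors.append("at least one of prompt or report_interval must be provided")
    return errors
```

A request carrying only `requestId` and `instanceId` returns a single error for the at-least-one rule, while the request example in 4.7.2 returns an empty list.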
4.7.4 Response Example
```json
{
  "code": 200,
  "message": "success",
  "data": {
    "requestId": "8d1e7b2e-4f65-4c82-8f63-0bc4f9d3857f",
    "instanceId": "PATROL-INSTANCE-20260427-0001",
    "report_interval": 5
  }
}
```
4.7.5 Response Field Description
| Field | Type | Description |
|---|---|---|
| `code` | Integer | Unified business status code; `200` when the configuration update is successfully accepted, `500` on failure |
| `message` | String | Unified response description; fixed as `success` on success, error details on failure |
| `data.requestId` | String | Identical to the `requestId` in the request, used for request tracing |
| `data.instanceId` | String | Identical to the `instanceId` in the request, representing the VLM analysis task instance whose configuration update has been accepted |
| `data.report_interval` | Integer | The reporting interval (seconds) effective after this update; echoes the value from the request or the current effective value |
4.7.6 Effective Timing and Notes
- The algorithm side must apply the new configuration immediately upon receiving a valid update request. After a prompt update, subsequent video frame analysis proceeds using the new prompt.
- Callback results generated before the configuration update are not affected. For new callback results generated after the update, the `desc` field should reflect the updated prompt content.
- When the update interface is called multiple times for the same task instance, the most recently successfully accepted configuration takes precedence.
- If the task has already stopped or the `instanceId` does not exist at the time of the update, the algorithm side must return a failed acceptance response (`code=500`) with a clear error reason in the `message` field.
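The rules above can be sketched as an in-memory task registry on the algorithm side. This is an illustration only: the class and method names are hypothetical, returning `data: null` on failure is an assumption (the specification defines only `code` and `message` for failures), and a real implementation would also need locking if updates and frame analysis run on different threads.

```python
class VlmTaskRegistry:
    """Hypothetical in-memory registry of VLM analysis tasks."""

    def __init__(self):
        self._tasks = {}

    def start(self, instance_id, prompt, report_interval):
        self._tasks[instance_id] = {
            "prompt": prompt,
            "report_interval": report_interval,
            "running": True,
        }

    def stop(self, instance_id):
        if instance_id in self._tasks:
            self._tasks[instance_id]["running"] = False

    def get(self, instance_id):
        return self._tasks.get(instance_id)

    @staticmethod
    def _fail(message):
        # Assumption: data is null on failure.
        return {"code": 500, "message": message, "data": None}

    def update(self, request):
        """Apply an update request immediately and return the response body."""
        instance_id = request.get("instanceId")
        task = self._tasks.get(instance_id)
        if task is None:
            return self._fail(f"instanceId {instance_id} does not exist")
        if not task["running"]:
            return self._fail(f"task {instance_id} is not running")
        if "prompt" not in request and "report_interval" not in request:
            return self._fail("at least one of prompt or report_interval must be provided")

        # The most recently accepted configuration takes precedence.
        if "prompt" in request:
            task["prompt"] = request["prompt"]
        if "report_interval" in request:
            task["report_interval"] = request["report_interval"]

        return {
            "code": 200,
            "message": "success",
            "data": {
                "requestId": request.get("requestId"),
                "instanceId": instance_id,
                # Echoes the request value or the current effective value.
                "report_interval": task["report_interval"],
            },
        }
```

Because the analysis loop reads `task["prompt"]` for each new frame, callbacks generated after a successful update carry the new prompt in `desc`, while callbacks already sent are unaffected.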
5. Appendix: Complete Algorithm Result Code Enumeration
| Enum Value | Code | Description |
|---|---|---|
| `SUCCESS` | 2000 | Success |
| `GET_IMAGE_ERROR` | 2001 | Failed to obtain image data |
| `ANALYSIS_IMAGE_ERROR` | 2002 | Error occurred during algorithm analysis |
| `IMAGE_NO_METER` | 2003 | No meter found in the image |
| `IMAGE_NO_TARGET_DEVICE` | 2004 | No corresponding state recognition device found in the image |