Algorithm Analysis Interface Specification Document

Document Properties

Property	Details
Document Version	V1.6
Creation Date	2026-04-28
Scope	Video Stream Detection Interface and VLM Algorithm Interface (currently supports prompt-driven video stream analysis and dynamic configuration updates)

1. Overview

This document defines two categories of standardized interfaces for algorithm analysis:

Video Stream Detection Interface: Currently defines five types of video stream integration protocols: detection type query, start, stop, HTTP callback, and RTMP live streaming. The callback structure is explicitly specified for three types of anomaly detection results (fire_detection, bag_detection, and no_helmet_detection), and instanceId is required to associate results with patrol task instances.
VLM Algorithm Interface: Defines a prompt-driven video stream analysis protocol. The overall lifecycle, callback structure, and RTMP streaming conventions reuse those of the Video Stream Detection Interface. The differences are that the start interface additionally accepts a natural language prompt, dedicated paths distinguish VLM tasks, and a dynamic configuration update interface allows the prompt to be changed while a task is running.

The Video Stream Detection Interface and VLM Algorithm Interface specifications are both finalized. The VLM interface shares the same task control, callback, and streaming patterns as the Video Stream Detection Interface; it differs in the prompt parameter of the start request, the VLM-specific paths, and the dynamic configuration update interface.

2. General Specifications

All HTTP interface requests and responses use JSON format with Content-Type fixed as application/json.
The live video streaming link uses the RTMP protocol with FLV container format and H.264 video encoding.
Interface base paths follow a unified module convention; all algorithm-related HTTP interfaces use the analysis module.
HTTP write operation interfaces follow the "plural resource + action sub-path" style and uniformly use the POST method.
HTTP interfaces requiring request tracing and result matching must include a globally unique requestId; UUID generation is recommended.
The Video Stream Detection Interface consists of the detection type query interface, start interface, stop interface, HTTP callback interface, and RTMP live streaming. The legacy FTP/FTPS screenshot upload link has been deprecated and is not included in this version.
Video stream detection tasks follow a lifecycle of "Start → Detect → Alarm Callback → Stop". The start, stop, and alarm callback operations must all carry instanceId to associate the current task instance.
VLM algorithm interfaces follow the same "Start → Analyze → Callback → Stop" lifecycle as the Video Stream Detection Interface. The start, stop, and callback operations must all carry instanceId, and the start request must additionally include a prompt describing the analysis objective.
The caller must provide an accessible callback/alarm receiving address; the algorithm service pushes analysis results to this address as specified. An inaccessible address is treated as a caller configuration error.
Platform-side HTTP interface responses uniformly use a Result<T> structure with code, message, and data fields. On successful processing of a video stream detection callback, the platform returns HTTP 200 + Result<null>.
On successful processing of a VLM callback, the platform similarly returns HTTP 200 + Result<null>; the failure response semantics are consistent with those of the video stream detection callback.
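Business status codes at the interface layer reuse existing definitions: 200 for success, 500 for failure. The callback response examples in sections 3.5.5 and 4.4.4 additionally use the recommended codes 40001 (parameter validation failure) and 50000 (internal error).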
Algorithm analysis result codes are returned via the resCode field in the business data, strictly reusing the existing AnalysisResultCodeEnum enumeration. Custom codes are prohibited to ensure no changes to the system's parsing logic.
All extension fields (extParams / extInfo) are optional and used to pass through personalized parameters without affecting core parsing logic.
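
The unified response envelope described above can be sketched as a small generic type. This is only an illustration: the class name `Result` mirrors the `Result<T>` structure named in this section, while the helper names `ok` and `fail` are hypothetical and not part of the specification.

```python
import json
from dataclasses import asdict, dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Result(Generic[T]):
    """Envelope with the three fields named in the specification."""
    code: int
    message: str
    data: Optional[T] = None

    def to_json(self) -> str:
        # Serialize to the JSON body sent with Content-Type application/json.
        return json.dumps(asdict(self))


def ok(data=None) -> Result:
    # Hypothetical helper: success uses business status code 200.
    return Result(code=200, message="success", data=data)


def fail(message: str) -> Result:
    # Hypothetical helper: failure uses business status code 500.
    return Result(code=500, message=message, data=None)
```

Calling `ok().to_json()` yields a body equivalent to the `Result<null>` success response returned for callbacks.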

3. Video Stream Detection Interface Definition

3.1 Interface Overview

Configuration	Details
Interface Name	Video Stream Detection Interface
Interface Composition	Detection Type Query Interface + Start Interface + Stop Interface + HTTP Callback Interface + RTMP Live Streaming
Detection Type Query Path	`/api/v1/analysis/video-stream-detect-types`
Start Interface Path	`/api/v1/analysis/video-stream-tasks/start`
Stop Interface Path	`/api/v1/analysis/video-stream-tasks/stop`
HTTP Callback Path	`/api/v1/analysis/video-stream-callbacks/create`
HTTP Request Method	`GET` for detection type query; `POST` for all other HTTP interfaces
RTMP Push URL	Returned by the algorithm side upon successful start; format: `rtmp://{host}:{port}/drtm/{instanceId}`
Interface Description	The platform may first call the detection type query interface to retrieve the currently available video stream detection algorithm catalog, then call the start interface to submit the video stream URL and task instance ID. Upon successful acceptance, the algorithm side returns the RTMP push URL for the current task instance. Anomalies are reported via the fixed callback interface, and an annotated video stream is continuously pushed. When the task ends, the platform calls the stop interface to terminate detection.
Currently Available Detection Types	`fire_detection`, `bag_detection`, `no_helmet_detection`

Video stream detection tasks use instanceId as the primary business association key. The instanceId in start requests, stop requests, and alarm callbacks must be identical to accurately associate algorithm alerts with the current patrol task instance.
Per API design conventions, task control interfaces use the analysis module with plural resource names and action sub-paths. The detection type query interface uses GET to retrieve the available catalog. The algorithm side should configure reportUrl as the full address of the fixed callback interface.
If the platform has a unified gateway domain, the recommended full callback address is: https://{domain}/api/v1/analysis/video-stream-callbacks/create.
The RTMP push URL is returned by the algorithm side in the successful start response. The streamKey is fixed to the current instanceId, meaning the push URL format is rtmp://{host}:{port}/drtm/{instanceId}.
After a task ends, the platform must explicitly call the stop interface. After a successful stop, the algorithm side must not push new alerts or continue streaming for that instanceId.
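
As an illustration of the push URL convention above, the following sketch assembles the URL from its three parts. The function name `build_rtmp_push_url` is hypothetical, and rejecting a `/` inside instanceId is an assumption made here so that the stream key stays a single path segment; the specification itself does not state this rule.

```python
def build_rtmp_push_url(host: str, port: int, instance_id: str) -> str:
    """Assemble rtmp://{host}:{port}/drtm/{instanceId}."""
    if not instance_id:
        raise ValueError("instanceId is required")
    if "/" in instance_id:
        # Assumption: the stream key must remain one path segment.
        raise ValueError("instanceId must not contain '/'")
    return f"rtmp://{host}:{port}/drtm/{instance_id}"
```

With the values from the start response example, `build_rtmp_push_url("10.10.10.20", 1935, "PATROL-INSTANCE-20260424-0001")` returns the `rtmpPushUrl` shown there.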

3.2 Video Stream Detection Type Query Interface Definition

3.2.1 Interface Basic Information

Configuration	Details
Interface Name	Video Stream Detection Type Query Interface
Interface Path	`/api/v1/analysis/video-stream-detect-types`
Request Method	`GET`
Interface Description	Retrieves the currently available video stream detection algorithm type catalog, for use by the platform before configuring or initiating tasks

The path follows API design conventions: /api/v1/{module}/..., where {module}=analysis and the resource name uses the plural form video-stream-detect-types.
The response uses a flat array structure where each element contains item and itemDesc, representing a detection type that can be directly submitted to a video stream detection task.
The current version returns only published detection types; unreleased, in-beta, or internal test types are not returned.

3.2.2 Response Example

json

{
    "code": 200,
    "message": "success",
    "data": [
        {
            "item": "fire_detection",
            "itemDesc": "Fire Detection"
        },
        {
            "item": "bag_detection",
            "itemDesc": "Abandoned Object Detection"
        },
        {
            "item": "no_helmet_detection",
            "itemDesc": "No Helmet Detection"
        }
    ]
}

3.2.3 Response Field Description

Field	Type	Description
`code`	Integer	Unified business status code; `200` = query successful, `500` on failure
`message`	String	Unified response description; fixed as `success` on success, error details on failure
`data`	Array	List of currently available video stream detection types
`data.item`	String	Detection type code, used to identify a specific video stream detection capability
`data.itemDesc`	String	Human-readable description of the detection type, for frontend display or manual configuration

3.2.4 Currently Available Detection Types

`item`	`itemDesc`	Description
`fire_detection`	Fire Detection	Used to identify fire anomalies in video streams
`bag_detection`	Abandoned Object Detection	Used to identify abandoned object anomalies in video streams
`no_helmet_detection`	No Helmet Detection	Used to identify human head targets not wearing safety helmets in video streams
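
A platform-side caller might turn the catalog response into a lookup table before configuring tasks. The sketch below assumes the response body has already been received as a JSON string; the function name `parse_detect_types` is hypothetical.

```python
import json
from typing import Dict


def parse_detect_types(body: str) -> Dict[str, str]:
    """Map each detection type code (item) to its description (itemDesc)."""
    payload = json.loads(body)
    if payload.get("code") != 200:
        # Any code other than 200 is treated as a failed query.
        raise RuntimeError(payload.get("message", "detection type query failed"))
    return {entry["item"]: entry["itemDesc"] for entry in payload.get("data") or []}
```

The returned mapping can be used to check that a detection type is published before a task is configured.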

3.3 Video Stream Detection Start Interface Definition

3.3.1 Interface Basic Information

Configuration	Details
Interface Name	Video Stream Detection Start Interface
Interface Path	`/api/v1/analysis/video-stream-tasks/start`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	The platform submits the video stream URL and starts the video stream detection task for the current task instance

The path follows API design conventions: /api/v1/{module}/..., where {module}=analysis, the resource name uses the plural form video-stream-tasks, and the action is start.
The instanceId in the start request is the unique identifier of the current patrol task instance; the algorithm side must return this field unchanged in subsequent alarm callbacks.
In the current version, one start request corresponds to one video stream detection task instance. To detect multiple video streams, a separate start request must be issued for each stream.

3.3.2 Request Example

json

{
    "requestId": "550e8400-e29b-41d4-a716-446655440000",
    "instanceId": "PATROL-INSTANCE-20260424-0001",
    "videoStreamUrl": "rtsp://192.168.1.100/live/robot-camera-main"
}

3.3.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Unique identifier for the start request, used for request tracing; UUID is recommended
`instanceId`	String	Yes	Current task instance ID; algorithm alarm callbacks and stop requests both use this field to associate the same detection task
`videoStreamUrl`	String	Yes	URL of the video stream to be detected; may be RTSP/RTMP or other mutually agreed protocol URLs, must be accessible by the algorithm service
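
The three required fields can be assembled as follows. This is a minimal sketch: the function name `build_start_request` is hypothetical, and generating the `requestId` with UUID version 4 is one way to follow the UUID recommendation, not a requirement of the specification.

```python
import uuid
from typing import Optional


def build_start_request(instance_id: str, video_stream_url: str,
                        request_id: Optional[str] = None) -> dict:
    """Build the JSON body for POST /api/v1/analysis/video-stream-tasks/start."""
    if not instance_id:
        raise ValueError("instanceId is required")
    if not video_stream_url:
        raise ValueError("videoStreamUrl is required")
    return {
        # A fresh UUID is generated unless the caller supplies its own ID.
        "requestId": request_id or str(uuid.uuid4()),
        "instanceId": instance_id,
        "videoStreamUrl": video_stream_url,
    }
```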

3.3.4 Response Example

json

{
    "code": 200,
    "message": "success",
    "data": {
        "requestId": "550e8400-e29b-41d4-a716-446655440000",
        "instanceId": "PATROL-INSTANCE-20260424-0001",
        "rtmpPushUrl": "rtmp://10.10.10.20:1935/drtm/PATROL-INSTANCE-20260424-0001"
    }
}

3.3.5 Response Field Description

Field	Type	Description
`code`	Integer	Unified business status code; `200` = start request successfully accepted, `500` on failure
`message`	String	Unified response description; fixed as `success` on success, error details on failure
`data.requestId`	String	Identical to the `requestId` in the request, used for request tracing
`data.instanceId`	String	Identical to the `instanceId` in the request, representing the accepted detection task instance
`data.rtmpPushUrl`	String	RTMP push URL returned by the algorithm side; the `streamKey` is fixed to the current `instanceId`
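
Because the response echoes `requestId` and `instanceId` and fixes the stream key to the `instanceId`, the platform can check a start response against the request it sent. The sketch below is illustrative only; the function name `check_start_response` is hypothetical, and the specification does not require the platform to perform these checks.

```python
def check_start_response(request: dict, response: dict) -> str:
    """Return rtmpPushUrl if the start response is consistent with the request."""
    if response.get("code") != 200:
        raise RuntimeError(response.get("message", "start request failed"))
    data = response.get("data") or {}
    for key in ("requestId", "instanceId"):
        if data.get(key) != request[key]:
            raise ValueError(f"{key} in response does not match the request")
    url = data.get("rtmpPushUrl") or ""
    # The stream key is fixed to the instanceId, so the URL must end with it.
    if not url.endswith("/drtm/" + request["instanceId"]):
        raise ValueError("rtmpPushUrl does not end with /drtm/{instanceId}")
    return url
```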

3.4 Video Stream Detection Stop Interface Definition

3.4.1 Interface Basic Information

Configuration	Details
Interface Name	Video Stream Detection Stop Interface
Interface Path	`/api/v1/analysis/video-stream-tasks/stop`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	Called by the platform when a video stream detection task ends, instructing the algorithm side to stop detection for the specified task instance

The stop interface uses instanceId as the primary stop condition. If multiple control requests exist for the same task, the state of the instanceId from the most recent valid start takes precedence.
The platform must call the stop interface to release algorithm resources whenever a task completes, is cancelled, or is no longer needed.

3.4.2 Request Example

json

{
    "requestId": "9e0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
    "instanceId": "PATROL-INSTANCE-20260424-0001"
}

3.4.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Unique identifier for the stop request, used for request tracing; UUID is recommended
`instanceId`	String	Yes	Current task instance ID; the algorithm side uses this field to stop the corresponding detection task

3.4.4 Response Example

json

{
    "code": 200,
    "message": "success",
    "data": {
        "requestId": "9e0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
        "instanceId": "PATROL-INSTANCE-20260424-0001"
    }
}

3.4.5 Response Field Description

Field	Type	Description
`code`	Integer	Unified business status code; `200` = stop request successfully accepted, `500` on failure
`message`	String	Unified response description; fixed as `success` on success, error details on failure
`data.requestId`	String	Identical to the `requestId` in the request, used for request tracing
`data.instanceId`	String	Identical to the `instanceId` in the request, representing the detection task instance whose stop has been accepted

3.5 Video Stream Detection Callback Interface Definition

3.5.1 Interface Basic Information

Configuration	Details
Interface Name	Video Stream Detection Callback Interface
Interface Path	`/api/v1/analysis/video-stream-callbacks/create`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	Upon detecting an anomaly, the algorithm side invokes the callback in batches to report video stream detection results; the callback must include the task instance ID submitted at start time

The callback interface is provided by the platform side; the algorithm side configures the full callback address via reportUrl.
The path follows API design conventions: /api/v1/{module}/..., where {module}=analysis, the resource name uses the plural form video-stream-callbacks, and write operations use POST .../create.
All request body / response body fields use camelCase.

3.5.2 Request Example

json

{
    "requestId": "550e8400-e29b-41d4-a716-446655440000",
    "instanceId": "PATROL-INSTANCE-20260424-0001",
    "resultsList": [
        {
            "objectId": "1212",
            "results": [
                {
                    "type": "fire_detection",
                    "value": "1",
                    "code": "2000",
                    "resImageUrl": "detect-results/result_1745318400.jpg",
                    "pos": [],
                    "conf": 1.0,
                    "desc": "fire_detection"
                },
                {
                    "type": "no_helmet_detection",
                    "value": "1",
                    "code": "2000",
                    "resImageUrl": "detect-results/result_1745318400.jpg",
                    "pos": [],
                    "conf": 1.0,
                    "desc": "no_helmet_detection"
                }
            ]
        }
    ]
}

3.5.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Request identifier for tracing; it is recommended to reuse the `requestId` from the start request
`instanceId`	String	Yes	Current task instance ID; must be identical to the `instanceId` in the start interface to associate alarm results with the specific task instance
`resultsList`	Array	Yes	Result list; current implementation contains exactly one object
`resultsList.objectId`	String	Yes	Detection object ID; current implementation is fixed as `1212`
`resultsList.results`	Array	Yes	Detection result list; dynamically generated based on the actual detected anomaly types
`results.type`	String	Yes	Anomaly type; see "3.5.4 Detection Type Enumeration" for valid values
`results.value`	String	Yes	Detection value; current implementation is fixed as `1`, indicating an anomaly was detected in this round
`results.code`	String	Yes	Result code; current implementation is fixed as `2000`, strictly reusing `AnalysisResultCodeEnum`
`results.resImageUrl`	String	No	Alarm image path; represents the object path of the alarm image in object storage, using the S3 protocol
`results.pos`	Array	No	Detection bounding box position; current implementation returns an empty array
`results.conf`	Number	No	Confidence score; current implementation is fixed as `1.0`
`results.desc`	String	Yes	Description; current implementation matches `type`

3.5.4 Detection Type Enumeration

`type` Value	Trigger Condition	Detection Logic
`fire_detection`	Fire detected	`fire_bag` model `cls=0`
`bag_detection`	Abandoned bag detected	`fire_bag` model `cls=1`
`no_helmet_detection`	Human head without helmet detected	`head_aqm` model `cls=0` with a pedestrian bounding box present
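
The required fields and the `type` enumeration above translate into a straightforward validation pass on the platform side. The following is a sketch under assumptions: the function name `validate_callback` and the wording of the error messages are hypothetical, and only the fields marked as required in 3.5.3 are checked.

```python
from typing import List

ALLOWED_TYPES = {"fire_detection", "bag_detection", "no_helmet_detection"}
REQUIRED_RESULT_FIELDS = ("type", "value", "code", "desc")


def validate_callback(payload: dict) -> List[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors: List[str] = []
    for key in ("requestId", "instanceId"):
        if not isinstance(payload.get(key), str) or not payload[key]:
            errors.append(f"{key} must be a non-empty string")
    results_list = payload.get("resultsList")
    if not isinstance(results_list, list):
        errors.append("resultsList must be an array")
        return errors
    for i, obj in enumerate(results_list):
        if not isinstance(obj, dict) or not isinstance(obj.get("objectId"), str):
            errors.append(f"resultsList[{i}].objectId must be a string")
            continue
        results = obj.get("results")
        if not isinstance(results, list):
            errors.append(f"resultsList[{i}].results must be an array")
            continue
        for j, item in enumerate(results):
            for field in REQUIRED_RESULT_FIELDS:
                if field not in item:
                    errors.append(f"results[{j}].{field} is required")
            if "type" in item and item["type"] not in ALLOWED_TYPES:
                errors.append(f"results[{j}].type {item['type']!r} is not supported")
    return errors
```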

3.5.5 Response Convention

Upon successful processing of a callback, the platform must return HTTP 200 with a unified Result<null> response body, for example:

json

{
    "code": 200,
    "message": "success",
    "data": null
}

When the request body is missing required fields, contains invalid field formats, or includes unsupported enum values, it is recommended to return HTTP 400, for example:

json

{
    "code": 40001,
    "message": "Parameter validation failed",
    "data": null
}

When an internal exception occurs during platform processing, it is recommended to return HTTP 500, for example:

json

{
    "code": 50000,
    "message": "Internal server error",
    "data": null
}

The current implementation logs callback response content. Retry behavior is determined by the algorithm side according to its own retry policy.
The HTTP callback call timeout is fixed at 10s.
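
The three response cases above can be sketched as a framework-independent handler that returns an HTTP status together with the response body. The names `handle_callback` and `process` are hypothetical; `process` stands for whatever platform logic stores the alarm, and the validation shown checks only for the presence of the top-level required fields.

```python
from typing import Callable, Tuple


def handle_callback(payload: dict, process: Callable[[dict], None]) -> Tuple[int, dict]:
    """Return (HTTP status, response body) following the response convention."""
    missing = [k for k in ("requestId", "instanceId", "resultsList") if k not in payload]
    if missing:
        # Missing required fields: HTTP 400 is the recommended response.
        return 400, {"code": 40001, "message": "Parameter validation failed", "data": None}
    try:
        process(payload)
    except Exception:
        # Internal exception during platform processing: HTTP 500 is recommended.
        return 500, {"code": 50000, "message": "Internal server error", "data": None}
    return 200, {"code": 200, "message": "success", "data": None}
```

Keeping the handler independent of any web framework makes the status mapping easy to test in isolation.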

3.5.6 Trigger Timing and Throttling Rules

The platform receives alarm results via callbacks and does not depend on the algorithm's internal throttling parameters.
The algorithm side may maintain throttling strategies as needed (e.g., alarm deduplication, time interval control), but these are internal implementation details and not part of the platform interface contract.
The platform integration focus is: reliably receiving callback requests triggered when an anomaly is detected, and accurately attributing alarm results to the corresponding task instance based on instanceId.

3.6 RTMP Streaming Interface Definition

3.6.1 Interface Basic Information

Configuration	Details
Interface Name	Video Stream RTMP Streaming Interface
Push URL	`rtmp://{host}:{port}/drtm/{instanceId}`
Protocol	`RTMP`
Video Encoding	`H.264 (libx264)`
Container Format	`FLV`
Interface Description	Upon successful acceptance of the start request, the algorithm side returns the corresponding RTMP URL and continuously pushes an AI-annotated video stream for real-time consumption by the backend or media server

3.6.2 Streaming Parameters

Parameter	Value	Description
Resolution	`1280×720`	Hard-coded in current implementation
Frame Rate	`15 fps`	From `fps` configuration
Encoding	`H.264 (libx264)`	Software encoding
Bitrate	`1200k`	Fixed bitrate
Pixel Format	`yuv420p`	Standard compatible format
Preset	`superfast`	Encoding speed priority
Tune	`zerolatency`	Low latency
GOP	`10`	Keyframe interval of 10 frames
Container Format	`FLV`	Standard RTMP container format
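
For readers who want to reproduce a comparable stream, the parameters above map onto standard FFmpeg command-line options. This is a sketch only: the specification does not state that the algorithm side uses the `ffmpeg` command-line tool, and the function name `build_ffmpeg_args` and the choice of reading from an input URL are assumptions.

```python
from typing import List


def build_ffmpeg_args(input_url: str, rtmp_push_url: str, fps: int = 15) -> List[str]:
    """Build an ffmpeg argument list matching the streaming parameters table."""
    return [
        "ffmpeg",
        "-i", input_url,          # assumption: source frames come from a URL
        "-s", "1280x720",         # resolution
        "-r", str(fps),           # frame rate
        "-c:v", "libx264",        # H.264 software encoding
        "-b:v", "1200k",          # bitrate
        "-pix_fmt", "yuv420p",    # pixel format
        "-preset", "superfast",   # encoding speed priority
        "-tune", "zerolatency",   # low latency
        "-g", "10",               # keyframe interval of 10 frames
        "-f", "flv",              # FLV container for RTMP
        rtmp_push_url,
    ]
```

The list form can be passed directly to `subprocess.run` without shell quoting.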

3.6.3 Streaming Content

Raw video frames are output together with AI detection bounding box annotations.
Current detection bounding box color mapping:

Detection Target	Annotation Color
Fire	`(0, 255, 0)`
Abandoned Bag	`(0, 0, 255)`
Human Head	`(255, 0, 0)`
Safety Helmet	`(0, 255, 255)`

3.6.4 Receiver Requirements

The receiver must deploy an RTMP server (e.g., Nginx-RTMP, SRS, Red5) with the listening port configured per the actual deployment.
The RTMP server must accept continuous streaming to the path /drtm/{instanceId}.
instanceId is used directly as the streamKey for a single video stream. The algorithm side assembles the full rtmpPushUrl after a successful start and returns it to the platform.
For the same instanceId, the streaming lifecycle must remain consistent with the detection task lifecycle; streaming must terminate when the task stops.
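
On the receiving side, the instanceId can be recovered from an incoming push URL because it is used directly as the stream key. The sketch below assumes the URL follows the `/drtm/{instanceId}` layout exactly; the function name `stream_key_from_push_url` is hypothetical.

```python
from urllib.parse import urlsplit


def stream_key_from_push_url(url: str) -> str:
    """Extract the instanceId (stream key) from rtmp://{host}:{port}/drtm/{instanceId}."""
    parts = urlsplit(url)
    if parts.scheme != "rtmp":
        raise ValueError("push URL must use the rtmp scheme")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2 or segments[0] != "drtm":
        raise ValueError("push URL path must be /drtm/{instanceId}")
    return segments[1]
```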

3.7 Result Code Enum Reuse Specification

Note: The enumeration in this section represents business result codes for video stream alarm result items results[].code and is not equivalent to HTTP-layer status codes.

Enum Value	Code	Applicable Description for Video Stream Detection
`SUCCESS`	`2000`	All anomaly detection results reported in the current version use this code, indicating the anomaly has been successfully identified and written to `results[]`

Note 1: The current version only reports alarm results when an anomaly is detected; no separate result item is defined for "no anomaly detected."

Note 2: If future versions need to report anomalous results such as frame extraction failures or analysis failures during video stream detection, existing result codes from the appendix must still be reused; adding custom codes is prohibited.

4. VLM Algorithm Interface Definition

4.1 Interface Basic Information

Configuration	Details
Interface Name	VLM Algorithm Interface
Interface Composition	Start Interface + Stop Interface + HTTP Callback Interface + RTMP Live Streaming + Dynamic Configuration Update Interface
Start Interface Path	`/api/v1/analysis/vlm-video-stream-tasks/start`
Stop Interface Path	`/api/v1/analysis/vlm-video-stream-tasks/stop`
Configuration Update Path	`/api/v1/analysis/vlm-video-stream-tasks/update`
HTTP Callback Path	`/api/v1/analysis/vlm-video-stream-callbacks/create`
HTTP Request Method	`POST`
RTMP Push URL	Returned by the algorithm side upon successful start; format: `rtmp://{host}:{port}/drtm/{instanceId}`
Interface Description	The platform submits the video stream URL, task instance ID, and a natural language prompt. Upon successful acceptance, the algorithm side returns the RTMP push URL for the current task instance. When a target matching the prompt is identified, results are reported via the fixed callback interface, and an annotated video stream is continuously pushed. During task execution, the prompt and extension parameters can be updated online via the dynamic configuration update interface. When the task ends, the platform calls the stop interface to terminate analysis.

The VLM interface's overall interaction pattern, response structure, callback response convention, and RTMP streaming specification all reuse those defined in Section 3 (Video Stream Detection Interface).
Apart from the dedicated paths and the dynamic configuration update interface, the main protocol difference from the Video Stream Detection Interface is that the VLM start interface must include a prompt field describing the analysis objective for this video stream session, e.g., "Check the video stream for a specific defect."
To avoid mixing with rule-based video stream detection tasks, VLM interfaces use dedicated paths: vlm-video-stream-tasks and vlm-video-stream-callbacks.
VLM tasks also use instanceId as the primary business association key. The instanceId in start requests, stop requests, configuration update requests, and callbacks must all be identical.
During a running VLM task, the platform may update the prompt and other runtime parameters online via the dynamic configuration update interface without stopping and restarting the task.

4.2 VLM Analysis Start Interface Definition

4.2.1 Interface Basic Information

Configuration	Details
Interface Name	VLM Analysis Start Interface
Interface Path	`/api/v1/analysis/vlm-video-stream-tasks/start`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	The platform submits the video stream URL, current task instance ID, and prompt to start the VLM video stream analysis task for the current task instance

The path follows API design conventions: /api/v1/{module}/..., where {module}=analysis, the resource name uses the plural form vlm-video-stream-tasks, and the action is start.
The instanceId in the start request is the unique identifier of the current patrol task instance; the algorithm side must return this field unchanged in subsequent callbacks.
prompt is the core control parameter for this VLM analysis session; clear, actionable, and observable natural language descriptions are recommended to avoid ambiguity.
In the current version, one start request corresponds to one VLM video stream analysis task instance. To analyze multiple video streams simultaneously, a separate start request must be issued for each stream.

4.2.2 Request Example

json

{
    "requestId": "650e8400-e29b-41d4-a716-446655440000",
    "instanceId": "PATROL-INSTANCE-20260427-0001",
    "videoStreamUrl": "rtsp://192.168.1.100/live/robot-camera-main",
    "prompt": "Check the video stream for a specific defect"
}

4.2.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Unique identifier for the start request, used for request tracing; UUID is recommended
`instanceId`	String	Yes	Current task instance ID; algorithm callbacks, stop requests, and configuration update requests all use this field to associate the same analysis task
`videoStreamUrl`	String	Yes	URL of the video stream to be analyzed; may be RTSP/RTMP or other mutually agreed protocol URLs, must be accessible by the algorithm service
`prompt`	String	Yes	VLM prompt, describing the analysis objective for this video stream session, e.g., "Check the video stream for a specific defect"
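
A VLM start request differs from the detection start request by the required `prompt`. The sketch below shows one way to build the body; the function name `build_vlm_start_request` is hypothetical, and trimming surrounding whitespace from the prompt is an assumption, not a rule from the specification.

```python
import uuid


def build_vlm_start_request(instance_id: str, video_stream_url: str, prompt: str) -> dict:
    """Build the JSON body for POST /api/v1/analysis/vlm-video-stream-tasks/start."""
    if not instance_id or not video_stream_url:
        raise ValueError("instanceId and videoStreamUrl are required")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required for VLM start requests")
    return {
        "requestId": str(uuid.uuid4()),
        "instanceId": instance_id,
        "videoStreamUrl": video_stream_url,
        # Assumption: surrounding whitespace carries no meaning and is removed.
        "prompt": prompt.strip(),
    }
```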

4.2.4 Response Example

json

{
    "code": 200,
    "message": "success",
    "data": {
        "requestId": "650e8400-e29b-41d4-a716-446655440000",
        "instanceId": "PATROL-INSTANCE-20260427-0001",
        "rtmpPushUrl": "rtmp://10.10.10.20:1935/drtm/PATROL-INSTANCE-20260427-0001"
    }
}

4.2.5 Response Field Description

Field	Type	Description
`code`	Integer	Unified business status code; `200` = start request successfully accepted, `500` on failure
`message`	String	Unified response description; fixed as `success` on success, error details on failure
`data.requestId`	String	Identical to the `requestId` in the request, used for request tracing
`data.instanceId`	String	Identical to the `instanceId` in the request, representing the accepted VLM analysis task instance
`data.rtmpPushUrl`	String	RTMP push URL returned by the algorithm side; the `streamKey` is fixed to the current `instanceId`

4.3 VLM Analysis Stop Interface Definition

4.3.1 Interface Basic Information

Configuration	Details
Interface Name	VLM Analysis Stop Interface
Interface Path	`/api/v1/analysis/vlm-video-stream-tasks/stop`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	Called by the platform when a VLM video stream analysis task ends, instructing the algorithm side to stop VLM analysis for the specified task instance

The stop interface uses instanceId as the primary stop condition. If multiple control requests exist for the same task, the state of the instanceId from the most recent valid start takes precedence.
The platform must call the stop interface to release algorithm resources whenever a task completes, is cancelled, or analysis is no longer needed.

4.3.2 Request Example

json

{
    "requestId": "7c0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
    "instanceId": "PATROL-INSTANCE-20260427-0001"
}

4.3.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Unique identifier for the stop request, used for request tracing; UUID is recommended
`instanceId`	String	Yes	Current task instance ID; the algorithm side uses this field to stop the corresponding VLM analysis task

4.3.4 Response Example

json

{
    "code": 200,
    "message": "success",
    "data": {
        "requestId": "7c0c7b2e-4f65-4c82-8f63-0bc4f9d3857f",
        "instanceId": "PATROL-INSTANCE-20260427-0001"
    }
}

4.3.5 Response Field Description

Field	Type	Description
`code`	Integer	Unified business status code; `200` = stop request successfully accepted, `500` on failure
`message`	String	Unified response description; fixed as `success` on success, error details on failure
`data.requestId`	String	Identical to the `requestId` in the request, used for request tracing
`data.instanceId`	String	Identical to the `instanceId` in the request, representing the VLM analysis task instance whose stop has been accepted

4.4 VLM Analysis Callback Interface Definition

4.4.1 Interface Basic Information

Configuration	Details
Interface Name	VLM Analysis Callback Interface
Interface Path	`/api/v1/analysis/vlm-video-stream-callbacks/create`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	Upon identifying a target matching the prompt description, the algorithm side invokes the callback in batches to report VLM video stream analysis results; the callback must include the task instance ID submitted at start time

The callback interface is provided by the platform side; the algorithm side configures the full callback address via reportUrl.
The path follows API design conventions: /api/v1/{module}/..., where {module}=analysis, the resource name uses the plural form vlm-video-stream-callbacks, and write operations use POST .../create.
All request body / response body fields use camelCase. Except for the path and business semantics, the overall structure is consistent with the Video Stream Detection Callback Interface.

4.4.2 Request Example

json

{
    "requestId": "650e8400-e29b-41d4-a716-446655440000",
    "instanceId": "PATROL-INSTANCE-20260427-0001",
    "resultsList": [
        {
            "objectId": "1212",
            "results": [
                {
                    "type": "vlm_detection",
                    "value": "1",
                    "code": "2000",
                    "resImageUrl": "detect-results/result_1745318400.jpg",
                    "pos": [],
                    "conf": 1.0,
                    "desc": "Check the video stream for a specific defect"
                }
            ]
        }
    ]
}

4.4.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Request identifier for tracing; it is recommended to reuse the `requestId` from the start request
`instanceId`	String	Yes	Current task instance ID; must be identical to the `instanceId` in the start interface to associate analysis results with the specific task instance
`resultsList`	Array	Yes	Result list; current implementation contains exactly one object
`resultsList.objectId`	String	Yes	Detection object ID; current implementation is fixed as `1212`
`resultsList.results`	Array	Yes	Analysis result list; dynamically generated based on the actual prompt-matching targets identified
`results.type`	String	Yes	Result type; fixed as `vlm_detection` in VLM scenarios
`results.value`	String	Yes	Detection value; current implementation is fixed as `1`, indicating a target matching the prompt was identified
`results.code`	String	Yes	Result code; current implementation is fixed as `2000`, strictly reusing `AnalysisResultCodeEnum`
`results.resImageUrl`	String	No	Alarm image path; represents the object path of the alarm image in object storage, using the S3 protocol
`results.pos`	Array	No	Detection bounding box position; current implementation returns an empty array
`results.conf`	Number	No	Confidence score; current implementation is fixed as `1.0`
`results.desc`	String	Yes	Description; recommended to return the `prompt` from the start interface unchanged or as an equivalent normalized description
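
Since `results.type` is fixed as `vlm_detection` and the `instanceId` must match the running task, a receiver can apply two VLM-specific checks on top of the generic field validation. This is an illustrative sketch; the function name `check_vlm_results` and the error wording are hypothetical.

```python
from typing import List


def check_vlm_results(payload: dict, expected_instance_id: str) -> List[str]:
    """Return VLM-specific validation errors for a callback payload."""
    errors: List[str] = []
    if payload.get("instanceId") != expected_instance_id:
        errors.append("instanceId does not match the running task")
    for obj in payload.get("resultsList") or []:
        for item in obj.get("results") or []:
            if item.get("type") != "vlm_detection":
                errors.append(f"unexpected type {item.get('type')!r}")
            if not item.get("desc"):
                errors.append("desc is required")
    return errors
```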

4.4.4 Response Convention

Upon successful processing of a callback, the platform must return HTTP 200 with a unified Result<null> response body, for example:

json

{
    "code": 200,
    "message": "success",
    "data": null
}

When the request body is missing required fields, contains invalid field formats, or includes unsupported enum values, it is recommended to return HTTP 400, for example:

json

{
    "code": 40001,
    "message": "Parameter validation failed",
    "data": null
}

When an internal exception occurs during platform processing, it is recommended to return HTTP 500, for example:

json

{
    "code": 50000,
    "message": "Internal server error",
    "data": null
}

The current implementation logs the content of each callback response. Whether to retry a failed callback is decided by the algorithm side according to its own retry policy.
The timeout for each HTTP callback request is fixed at 10 seconds.
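
On the algorithm side, sending a callback with the 10-second timeout might look like the sketch below, which uses only the Python standard library. The names `build_callback_request` and `send_callback` are hypothetical, the callback URL is whatever the platform supplied, and no retry policy is shown because that is left to the algorithm side.

```python
import json
import logging
import urllib.request
from typing import Any, Dict

CALLBACK_TIMEOUT_S = 10  # timeout value stated in this section

logger = logging.getLogger("vlm.callback")


def build_callback_request(callback_url: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request for the callback (hypothetical helper)."""
    return urllib.request.Request(
        callback_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_callback(callback_url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST the payload and return the decoded response body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError (or a timeout error) on network failure; the
    caller decides whether to retry.
    """
    request = build_callback_request(callback_url, payload)
    with urllib.request.urlopen(request, timeout=CALLBACK_TIMEOUT_S) as response:
        body = response.read().decode("utf-8")
    logger.info("callback response: %s", body)  # log the response content
    return json.loads(body)
```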

4.4.5 Trigger Timing and Throttling Rules

The platform receives analysis results via callbacks and does not depend on the algorithm's internal throttling parameters.
The algorithm side may maintain throttling strategies as needed (e.g., deduplication of identical prompt results, time interval control), but these are internal implementation details and not part of the platform interface contract.
The platform's integration focus is twofold: reliably receiving the callback requests triggered when a target matching the prompt description is identified, and accurately attributing each result to the corresponding task instance based on instanceId.
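
For illustration only, an algorithm-side throttle combining deduplication of identical prompt results with time-interval control could be sketched as follows. This is not part of the interface contract; the class name `CallbackThrottle` and the choice of `(instanceId, prompt)` as the deduplication key are assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class CallbackThrottle:
    """Suppress repeated callbacks for the same (instanceId, prompt) within a time window."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_send(self, instance_id: str, prompt: str) -> bool:
        """Return True and record the send time if enough time has passed for this key."""
        key = (instance_id, prompt)
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._min_interval_s:
            return False
        self._last_sent[key] = now
        return True
```

Because the platform does not depend on these parameters, the interval and key can be changed freely without affecting the callback contract.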

4.5 VLM RTMP Streaming Interface Definition

4.5.1 Interface Basic Information

Configuration	Details
Interface Name	VLM Video Stream RTMP Streaming Interface
Push URL	`rtmp://{host}:{port}/drtm/{instanceId}`
Protocol	`RTMP`
Video Encoding	`H.264 (libx264)`
Container Format	`FLV`
Interface Description	Upon successful acceptance of the start request, the algorithm side returns the corresponding RTMP URL and continuously pushes a VLM-annotated video stream for real-time consumption by the backend or media server

4.5.2 Streaming Parameters

Parameter	Value	Description
Resolution	`1280×720`	Hard-coded in current implementation
Frame Rate	`15 fps`	From `fps` configuration
Encoding	`H.264 (libx264)`	Software encoding
Bitrate	`1200k`	Fixed bitrate
Pixel Format	`yuv420p`	Standard compatible format
Preset	`superfast`	Encoding speed priority
Tune	`zerolatency`	Low latency
GOP	`10`	Keyframe interval of 10 frames
Container Format	`FLV`	Standard RTMP container format
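
One way to realize these parameters is to drive the `ffmpeg` command-line tool from the algorithm process. The sketch below only builds the argument list. The function name `build_ffmpeg_push_command` is hypothetical, and the input side (raw BGR frames piped on standard input) is an assumption, since this specification defines the output stream only.

```python
from typing import List


def build_ffmpeg_push_command(rtmp_push_url: str, fps: int = 15) -> List[str]:
    """Build an ffmpeg argument list matching the streaming parameters table."""
    return [
        "ffmpeg",
        # Input side (assumption): raw annotated frames written to stdin
        "-f", "rawvideo",
        "-pix_fmt", "bgr24",
        "-s", "1280x720",        # resolution
        "-r", str(fps),          # frame rate, from the fps configuration
        "-i", "-",
        # Output side: values taken from the table above
        "-c:v", "libx264",       # H.264 software encoding
        "-b:v", "1200k",         # bitrate
        "-pix_fmt", "yuv420p",   # pixel format
        "-preset", "superfast",  # encoding speed priority
        "-tune", "zerolatency",  # low latency
        "-g", "10",              # keyframe interval of 10 frames
        "-f", "flv",             # RTMP container format
        rtmp_push_url,
    ]
```

The resulting list can be passed to `subprocess.Popen(cmd, stdin=subprocess.PIPE)`, with each annotated frame written to the process's standard input.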

4.5.3 Streaming Content

Raw video frames are output together with VLM analysis annotations.
Annotation content is determined by the targets matching the prompt and the algorithm implementation. It is recommended to display the matching bounding box, prompt summary, and relevant confidence information in the frame.

4.5.4 Receiver Requirements

The receiver must deploy an RTMP server (e.g., Nginx-RTMP, SRS, Red5) with the listening port configured per the actual deployment.
The RTMP server must accept continuous streaming to the path /drtm/{instanceId}.
instanceId is used directly as the streamKey for a single video stream. The algorithm side assembles the full rtmpPushUrl after a successful start and returns it to the platform.
For the same instanceId, the streaming lifecycle must remain consistent with the VLM analysis task lifecycle; streaming must terminate when the task stops.
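
The relationship between instanceId, the streamKey, and the push URL can be sketched as a pair of helpers. Both function names are hypothetical, and the host and port in the usage note are placeholder values for the actual deployment.

```python
from urllib.parse import urlparse


def build_rtmp_push_url(host: str, port: int, instance_id: str) -> str:
    """Assemble the push URL; instanceId is used directly as the streamKey."""
    return f"rtmp://{host}:{port}/drtm/{instance_id}"


def extract_instance_id(rtmp_push_url: str) -> str:
    """Recover the instanceId (streamKey) from a push URL under the drtm application."""
    parsed = urlparse(rtmp_push_url)
    app, _, stream_key = parsed.path.lstrip("/").partition("/")
    if parsed.scheme != "rtmp" or app != "drtm" or not stream_key:
        raise ValueError(f"not a valid drtm push URL: {rtmp_push_url!r}")
    return stream_key
```

For example, `build_rtmp_push_url("media.example.internal", 1935, "PATROL-INSTANCE-20260427-0001")` yields a URL from which `extract_instance_id` recovers the same instanceId, which is how a receiver could map an incoming stream back to its task instance.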

4.6 Result Code Enum Reuse Specification

Note: The enumeration in this section defines the business result codes carried in results[].code of VLM analysis result items; these are not HTTP-layer status codes.

Enum Value	Code	Applicable Description for VLM Scenarios
`SUCCESS`	`2000`	All VLM prompt-matching results reported in the current version use this code, indicating the anomaly or item of interest has been successfully identified and written to `results[]`

Note 1: The current version only reports results when a target matching the prompt description is detected; no separate result item is defined for "prompt not matched."

Note 2: If future versions need to report anomalous results such as frame extraction failures or analysis failures during VLM analysis, existing result codes from the appendix must still be reused; adding custom codes is prohibited.

4.7 VLM Dynamic Configuration Update Interface Definition

4.7.1 Interface Basic Information

Configuration	Details
Interface Name	VLM Dynamic Configuration Update Interface
Interface Path	`/api/v1/analysis/vlm-video-stream-tasks/update`
Request Method	`POST`
`Content-Type`	`application/json`
Interface Description	While a VLM task is running, the platform may call this interface to update the prompt or extended configuration parameters of the current task instance online. The algorithm side applies the update immediately upon receipt, without stopping and restarting the task.

The path follows API design conventions: /api/v1/{module}/..., where {module}=analysis, the resource name uses the plural form vlm-video-stream-tasks, and the action is update.
The instanceId in the update request must be the ID of a currently running VLM task instance. The algorithm side verifies that the task exists and is in an updatable state before returning an acceptance result.
The current version only supports calling this interface while a task is running. Calling it when the task is stopped or not yet started will return a failed acceptance response.
The update operation does not affect the current RTMP streaming link; the streamKey and push URL remain the same as when the task was started.

4.7.2 Request Example

json

{
    "requestId": "8d1e7b2e-4f65-4c82-8f63-0bc4f9d3857f",
    "instanceId": "PATROL-INSTANCE-20260427-0001",
    "prompt": "Updated prompt description, e.g., check if smoke appears in the video stream",
    "report_interval": 5
}

4.7.3 Field Description

Field	Type	Required	Description
`requestId`	String	Yes	Unique identifier for the update request, used for request tracing; UUID is recommended
`instanceId`	String	Yes	Current task instance ID; must be identical to the `instanceId` in the start interface; the algorithm side uses this field to locate the running VLM analysis task
`prompt`	String	No	Updated VLM prompt, used to replace the analysis objective description of the currently running task; if not provided, the prompt is not updated
`report_interval`	Integer	No	Reporting interval (seconds), controlling the minimum time interval between VLM analysis result callbacks; if not provided, the current effective value is retained

Note: At least one of prompt or report_interval must be provided; otherwise the request is treated as an invalid update request.
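
The field rules above, including the "at least one of" constraint, can be checked with a small validator. This is a minimal sketch: `validate_update_request` is a hypothetical name, and treating empty strings as missing and JSON `null` as "not provided" are assumptions that this specification does not state.

```python
from typing import Any, Dict, List


def validate_update_request(body: Dict[str, Any]) -> List[str]:
    """Return a list of validation problems; an empty list means the request is acceptable."""
    errors: List[str] = []
    for field in ("requestId", "instanceId"):
        # Assumption: an empty string is treated the same as a missing field
        if not isinstance(body.get(field), str) or not body[field]:
            errors.append(f"{field} is required and must be a non-empty string")

    # Assumption: a JSON null is treated as "not provided"
    has_prompt = body.get("prompt") is not None
    has_interval = body.get("report_interval") is not None

    if has_prompt and not isinstance(body["prompt"], str):
        errors.append("prompt must be a string")
    if has_interval:
        value = body["report_interval"]
        # bool is a subclass of int in Python, so it is excluded explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append("report_interval must be an integer")
    if not has_prompt and not has_interval:
        errors.append("at least one of prompt or report_interval must be provided")
    return errors
```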

4.7.4 Response Example

json

{
    "code": 200,
    "message": "success",
    "data": {
        "requestId": "8d1e7b2e-4f65-4c82-8f63-0bc4f9d3857f",
        "instanceId": "PATROL-INSTANCE-20260427-0001",
        "report_interval": 5
    }
}

4.7.5 Response Field Description

Field	Type	Description
`code`	Integer	Unified business status code; `200` = configuration update successfully accepted, `500` on failure
`message`	String	Unified response description; fixed as `success` on success, error details on failure
`data.requestId`	String	Identical to the `requestId` in the request, used for request tracing
`data.instanceId`	String	Identical to the `instanceId` in the request, representing the VLM analysis task instance whose configuration update has been accepted
`data.report_interval`	Integer	The reporting interval (seconds) in effect after this update; echoes the value from the request, or the current effective value if the request did not provide one

4.7.6 Effective Timing and Notes

The algorithm side must apply the new configuration immediately upon receiving a valid update request. After a prompt update, subsequent video frame analysis proceeds using the new prompt.
Callback results generated before the configuration update are not affected. For new callback results generated after the update, the desc field should reflect the updated prompt content.
When the update interface is called multiple times for the same task instance, the most recently successfully accepted configuration takes precedence.
If the task has already stopped or the instanceId does not exist at the time of the update, the algorithm side must return a failed acceptance response (code=500) with a clear error reason in the message field.
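
The rules above (apply immediately, last accepted update wins, fail with code=500 for stopped or unknown instances) could be sketched on the algorithm side as an in-memory task registry. All names here (`VlmTask`, `VlmTaskRegistry`) are hypothetical, and returning `"data": None` on failure is an assumption, since the failure response body is not shown in this specification.

```python
import threading
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class VlmTask:
    prompt: str
    report_interval: int
    running: bool = True


class VlmTaskRegistry:
    """Holds running VLM tasks keyed by instanceId (hypothetical structure)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tasks: Dict[str, VlmTask] = {}

    def add(self, instance_id: str, task: VlmTask) -> None:
        with self._lock:
            self._tasks[instance_id] = task

    def update(
        self,
        request_id: str,
        instance_id: str,
        prompt: Optional[str] = None,
        report_interval: Optional[int] = None,
    ) -> Dict[str, Any]:
        """Apply an update in place and return the response body."""
        with self._lock:
            task = self._tasks.get(instance_id)
            if task is None or not task.running:
                # Assumption: data is null on failure
                return {
                    "code": 500,
                    "message": f"task {instance_id} does not exist or is not running",
                    "data": None,
                }
            # Later calls overwrite earlier ones, so the most recent accepted update wins
            if prompt is not None:
                task.prompt = prompt
            if report_interval is not None:
                task.report_interval = report_interval
            return {
                "code": 200,
                "message": "success",
                "data": {
                    "requestId": request_id,
                    "instanceId": instance_id,
                    "report_interval": task.report_interval,
                },
            }
```

The analysis loop would read `task.prompt` for each new frame, so frames analyzed after the update use the new prompt while callbacks already sent are unaffected.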

5. Appendix: Complete Algorithm Result Code Enumeration

Enum Value	Code	Description
`SUCCESS`	`2000`	Success
`GET_IMAGE_ERROR`	`2001`	Failed to obtain image data
`ANALYSIS_IMAGE_ERROR`	`2002`	Error occurred during algorithm analysis
`IMAGE_NO_METER`	`2003`	No meter found in the image
`IMAGE_NO_TARGET_DEVICE`	`2004`	No corresponding state recognition device found in the image
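
For consumers that want to work with these codes programmatically, the table can be mirrored as an enumeration. The sketch below is a Python rendering for illustration, not the platform's own `AnalysisResultCodeEnum` class; the `from_code` helper is a hypothetical addition.

```python
from enum import Enum


class AnalysisResultCodeEnum(Enum):
    """Python mirror of the result code table above (illustrative only)."""

    SUCCESS = ("2000", "Success")
    GET_IMAGE_ERROR = ("2001", "Failed to obtain image data")
    ANALYSIS_IMAGE_ERROR = ("2002", "Error occurred during algorithm analysis")
    IMAGE_NO_METER = ("2003", "No meter found in the image")
    IMAGE_NO_TARGET_DEVICE = (
        "2004",
        "No corresponding state recognition device found in the image",
    )

    def __init__(self, code: str, description: str) -> None:
        # Codes are kept as strings because results.code is a String field
        self.code = code
        self.description = description

    @classmethod
    def from_code(cls, code: str) -> "AnalysisResultCodeEnum":
        """Look up a member by its code string; raise ValueError for unknown codes."""
        for member in cls:
            if member.code == code:
                return member
        raise ValueError(f"unknown result code: {code!r}")
```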
