OpenAI API Reference

Server events

These are events emitted from the OpenAI Realtime WebSocket server to the client.

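error

Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session will stay open, so implementers are advised to monitor and log error messages by default.

event_id

string

The unique ID of the server event.

type

string

The event type, must be error.

error

object

Details of the error.

OBJECT error
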
{
    "event_id": "event_890",
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "code": "invalid_event",
        "message": "The 'type' field is missing.",
        "param": null,
        "event_id": "event_567"
    }
}
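
Since most errors leave the session open, a client will usually log them rather than disconnect. The following is a minimal sketch of such a logging helper, not part of the API itself. The function name summarize_error is hypothetical, and the sketch assumes that the nested error.event_id identifies the client event that triggered the error.

```python
from typing import Optional


def summarize_error(event: dict) -> Optional[str]:
    """Return a one-line summary for an `error` server event, else None."""
    if event.get("type") != "error":
        return None
    err = event.get("error") or {}
    summary = f"{err.get('type')}/{err.get('code')}: {err.get('message')}"
    # Assumption: the nested event_id names the client event that caused the error.
    if err.get("event_id"):
        summary += f" (client event {err['event_id']})"
    return summary
```

The returned string can be passed to whatever logger the application already uses; events of any other type yield None and can be routed elsewhere.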

session.created

Returned when a Session is created. Emitted automatically as the first server event when a new connection is established. This event will contain the default Session configuration.

event_id

string

The unique ID of the server event.

type

string

The event type, must be session.created.

session

object

Realtime session object configuration.

OBJECT session.created

{
    "event_id": "event_1234",
    "type": "session.created",
    "session": {
        "id": "sess_001",
        "object": "realtime.session",
        "model": "gpt-4o-realtime-preview-2024-10-01",
        "modalities": ["text", "audio"],
        "instructions": "...model instructions here...",
        "voice": "sage",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm16",
        "input_audio_transcription": null,
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 300,
            "silence_duration_ms": 200
        },
        "tools": [],
        "tool_choice": "auto",
        "temperature": 0.8,
        "max_response_output_tokens": "inf"
    }
}

session.updated

Returned when a session is updated with a session.update event, unless there is an error.

event_id

string

The unique ID of the server event.

type

string

The event type, must be session.updated.

session

object

Realtime session object configuration.

OBJECT session.updated

{
    "event_id": "event_5678",
    "type": "session.updated",
    "session": {
        "id": "sess_001",
        "object": "realtime.session",
        "model": "gpt-4o-realtime-preview-2024-10-01",
        "modalities": ["text"],
        "instructions": "New instructions",
        "voice": "sage",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm16",
        "input_audio_transcription": {
            "model": "whisper-1"
        },
        "turn_detection": null,
        "tools": [],
        "tool_choice": "none",
        "temperature": 0.7,
        "max_response_output_tokens": 200
    }
}

conversation.created

Returned when a conversation is created. Emitted right after session creation.

event_id

string

The unique ID of the server event.

type

string

The event type, must be conversation.created.

conversation

object

The conversation resource.

OBJECT conversation.created

{
    "event_id": "event_9101",
    "type": "conversation.created",
    "conversation": {
        "id": "conv_001",
        "object": "realtime.conversation"
    }
}

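conversation.item.created

Returned when a conversation item is created. Several scenarios produce this event:

The server is generating a Response, which if successful will produce one or two Items, of type message (role assistant) or type function_call.
The input audio buffer has been committed, either by the client or by the server (in server_vad mode). The server takes the content of the input audio buffer and adds it to a new user message Item.
The client has sent a conversation.item.create event to add a new Item to the Conversation.

event_id

string

The unique ID of the server event.

type

string

The event type, must be conversation.item.created.

previous_item_id

string

The ID of the preceding item in the Conversation context, which allows the client to understand the order of the conversation.

item

object

The item to add to the conversation.

OBJECT conversation.item.created
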
{
    "event_id": "event_1920",
    "type": "conversation.item.created",
    "previous_item_id": "msg_002",
    "item": {
        "id": "msg_003",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "user",
        "content": [
            {
                "type": "input_audio",
                "transcript": "hello how are you",
                "audio": "base64encodedaudio=="
            }
        ]
    }
}
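
Because previous_item_id describes where an item sits relative to its predecessor, a client can rebuild the conversation order locally. The following is a minimal sketch under stated assumptions: the helper name insert_item is hypothetical, and appending when the predecessor is missing or unknown is a choice made here, not documented behavior.

```python
from typing import List, Optional


def insert_item(order: List[str], item_id: str, previous_item_id: Optional[str]) -> None:
    """Place item_id directly after previous_item_id in the local ordering.

    Assumption: if the predecessor is None or not known locally, append at the end.
    """
    if previous_item_id in order:
        order.insert(order.index(previous_item_id) + 1, item_id)
    else:
        order.append(item_id)
```

For example, receiving msg_001 (no predecessor), then msg_003 after msg_001, then msg_002 after msg_001 leaves the local order as msg_001, msg_002, msg_003.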

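conversation.item.input_audio_transcription.completed

This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or the server (in server_vad mode). Transcription runs asynchronously with Response creation, so this event may arrive before or after the Response events.

Realtime API models accept audio natively, so input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model, currently always whisper-1. The transcript may therefore diverge somewhat from the model's interpretation and should be treated as a rough guide.

event_id

string

The unique ID of the server event.

type

string

The event type, must be conversation.item.input_audio_transcription.completed.

item_id

string

The ID of the user message item containing the audio.

content_index

integer

The index of the content part containing the audio.

transcript

string

The transcribed text.

OBJECT conversation.item.input_audio_transcription.completed
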
{
    "event_id": "event_2122",
    "type": "conversation.item.input_audio_transcription.completed",
    "item_id": "msg_003",
    "content_index": 0,
    "transcript": "Hello, how are you?"
}
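
Since the transcript can arrive before or after the Response events, one way to handle it is to store transcripts keyed by item and content part and look them up when needed. This is a sketch only; the names transcripts and record_transcript are hypothetical.

```python
from typing import Dict, Tuple

# Maps (item_id, content_index) to the transcribed text.
transcripts: Dict[Tuple[str, int], str] = {}


def record_transcript(event: dict) -> bool:
    """Store the transcript of a completed transcription event.

    Returns True if the event was a transcription result, False otherwise.
    """
    if event.get("type") != "conversation.item.input_audio_transcription.completed":
        return False
    transcripts[(event["item_id"], event["content_index"])] = event["transcript"]
    return True
```

Keying on both item_id and content_index keeps the lookup independent of the order in which events arrive.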

conversation.item.input_audio_transcription.failed

Returned when input audio transcription is configured and a transcription request for a user message failed. These events are separate from other error events so that the client can identify the related Item.

event_id

string

The unique ID of the server event.

type

string

The event type, must be conversation.item.input_audio_transcription.failed.

item_id

string

The ID of the user message item.

content_index

integer

The index of the content part containing the audio.

error

object

Details of the transcription error.

OBJECT conversation.item.input_audio_transcription.failed

{
    "event_id": "event_2324",
    "type": "conversation.item.input_audio_transcription.failed",
    "item_id": "msg_003",
    "content_index": 0,
    "error": {
        "type": "transcription_error",
        "code": "audio_unintelligible",
        "message": "The audio could not be transcribed.",
        "param": null
    }
}

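conversation.item.truncated

Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event. This event is used to synchronize the server's understanding of the audio with the client's playback.

This action truncates the audio and removes the server-side text transcript, so that the context contains no text the user has not heard.

event_id

string

The unique ID of the server event.

type

string

The event type, must be conversation.item.truncated.

item_id

string

The ID of the assistant message item that was truncated.

content_index

integer

The index of the content part that was truncated.

audio_end_ms

integer

The duration up to which the audio was truncated, in milliseconds.

OBJECT conversation.item.truncated
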
{
    "event_id": "event_2526",
    "type": "conversation.item.truncated",
    "item_id": "msg_004",
    "content_index": 0,
    "audio_end_ms": 1500
}
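
A client that keeps its own copy of the assistant audio may want to trim it to the same point. The following sketch converts audio_end_ms to a byte offset. It assumes pcm16 audio is 16-bit mono at 24 kHz, which this page does not state, so the sample rate is exposed as a parameter; truncate_pcm16 is a hypothetical helper name.

```python
ASSUMED_SAMPLE_RATE_HZ = 24_000  # assumption: not stated in this reference
BYTES_PER_SAMPLE = 2             # 16-bit samples, single channel assumed


def truncate_pcm16(audio: bytes, audio_end_ms: int,
                   sample_rate_hz: int = ASSUMED_SAMPLE_RATE_HZ) -> bytes:
    """Keep only the first audio_end_ms milliseconds of raw PCM16 audio."""
    n_samples = audio_end_ms * sample_rate_hz // 1000
    # Slicing on a whole-sample boundary avoids cutting a sample in half.
    return audio[: n_samples * BYTES_PER_SAMPLE]
```

With the assumed rate, audio_end_ms of 1500 keeps 36,000 samples, that is 72,000 bytes; a buffer shorter than that is returned unchanged.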

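conversation.item.deleted

Returned when an item in the conversation is deleted by the client with a conversation.item.delete event. This event is used to synchronize the server's understanding of the conversation history with the client's view.

event_id

string

The unique ID of the server event.

type

string

The event type, must be conversation.item.deleted.

item_id

string

The ID of the item that was deleted.

OBJECT conversation.item.deleted
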
{
    "event_id": "event_2728",
    "type": "conversation.item.deleted",
    "item_id": "msg_005"
}

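input_audio_buffer.committed

Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The item_id property is the ID of the user message item that will be created, so a conversation.item.created event will also be sent to the client.

event_id

string

The unique ID of the server event.

type

string

The event type, must be input_audio_buffer.committed.

previous_item_id

string

The ID of the preceding item, after which the new item will be inserted.

item_id

string

The ID of the user message item that will be created.

OBJECT input_audio_buffer.committed
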
{
    "event_id": "event_1121",
    "type": "input_audio_buffer.committed",
    "previous_item_id": "msg_001",
    "item_id": "msg_002"
}

input_audio_buffer.cleared

Returned when the input audio buffer is cleared by the client with an input_audio_buffer.clear event.

event_id

string

The unique ID of the server event.

type

string

The event type, must be input_audio_buffer.cleared.

OBJECT input_audio_buffer.cleared

{
    "event_id": "event_1314",
    "type": "input_audio_buffer.cleared"
}

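input_audio_buffer.speech_started

Sent by the server in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer (unless speech is already detected). The client may want to use this event to interrupt audio playback or provide visual feedback to the user.

The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops; it will also be included in the input_audio_buffer.speech_stopped event (unless the client manually commits the audio buffer during VAD activation).

event_id

string

The unique ID of the server event.

type

string

The event type, must be input_audio_buffer.speech_started.

audio_start_ms

integer

Milliseconds since the session started when speech was first detected. This corresponds to the beginning of the audio sent to the model, and thus includes the prefix_padding_ms configured in the Session.

item_id

string

The ID of the user message item that will be created when speech stops.

OBJECT input_audio_buffer.speech_started
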
{
    "event_id": "event_1516",
    "type": "input_audio_buffer.speech_started",
    "audio_start_ms": 1000,
    "item_id": "msg_003"
}

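input_audio_buffer.speech_stopped

Returned in server_vad mode when the server detects the end of speech in the audio buffer. The server will also send a conversation.item.created event with the user message item that is created from the audio buffer.

event_id

string

The unique ID of the server event.

type

string

The event type, must be input_audio_buffer.speech_stopped.

audio_end_ms

integer

Milliseconds since the session started when speech stopped. This corresponds to the end of the audio sent to the model, and thus includes the min_silence_duration_ms configured in the Session.

item_id

string

The ID of the user message item that will be created.

OBJECT input_audio_buffer.speech_stopped
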
{
    "event_id": "event_1718",
    "type": "input_audio_buffer.speech_stopped",
    "audio_end_ms": 2000,
    "item_id": "msg_003"
}
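
The speech_started and speech_stopped events share an item_id, so a client can pair them to measure how long each detected utterance lasted. The following is a small sketch of that pairing; the class name SpeechTracker is hypothetical, and the measured span includes whatever padding the server applied.

```python
from typing import Dict, Optional


class SpeechTracker:
    """Pairs speech_started and speech_stopped events by item_id."""

    def __init__(self) -> None:
        self._starts: Dict[str, int] = {}

    def on_event(self, event: dict) -> Optional[int]:
        """Return the utterance length in ms when a matching stop arrives."""
        kind = event.get("type")
        if kind == "input_audio_buffer.speech_started":
            self._starts[event["item_id"]] = event["audio_start_ms"]
        elif kind == "input_audio_buffer.speech_stopped":
            start = self._starts.pop(event["item_id"], None)
            if start is not None:
                return event["audio_end_ms"] - start
        return None
```

Fed the two sample events from this page (audio_start_ms 1000 and audio_end_ms 2000 for msg_003), the tracker reports 1000 ms.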

response.created

Returned when a new Response is created. This is the first event of response creation, where the response is in the initial state in_progress.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.created.

response

object

The response resource.

OBJECT response.created

{
    "event_id": "event_2930",
    "type": "response.created",
    "response": {
        "id": "resp_001",
        "object": "realtime.response",
        "status": "in_progress",
        "status_details": null,
        "output": [],
        "usage": null
    }
}

response.done

Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the response.done event will include all output Items in the Response but will omit the raw audio data.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.done.

response

object

The response resource.

OBJECT response.done

{
    "event_id": "event_3132",
    "type": "response.done",
    "response": {
        "id": "resp_001",
        "object": "realtime.response",
        "status": "completed",
        "status_details": null,
        "output": [
            {
                "id": "msg_006",
                "object": "realtime.item",
                "type": "message",
                "status": "completed",
                "role": "assistant",
                "content": [
                    {
                        "type": "text",
                        "text": "Sure, how can I assist you today?"
                    }
                ]
            }
        ],
        "usage": {
            "total_tokens":275,
            "input_tokens":127,
            "output_tokens":148,
            "input_token_details": {
                "cached_tokens":384,
                "text_tokens":119,
                "audio_tokens":8,
                "cached_tokens_details": {
                    "text_tokens": 128,
                    "audio_tokens": 256
                }
            },
            "output_token_details": {
              "text_tokens":36,
              "audio_tokens":112
            }
        }
    }
}
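
The usage object in response.done breaks token counts down by modality, which is convenient for cost tracking. The following sketch sums text and audio tokens across input and output. The helper name summarize_usage is hypothetical, and it treats a null usage (as in the response.created sample) as zero.

```python
def summarize_usage(response: dict) -> dict:
    """Aggregate token counts from a Response resource by modality."""
    usage = response.get("usage") or {}
    inp = usage.get("input_token_details") or {}
    out = usage.get("output_token_details") or {}
    return {
        "total": usage.get("total_tokens", 0),
        "text": inp.get("text_tokens", 0) + out.get("text_tokens", 0),
        "audio": inp.get("audio_tokens", 0) + out.get("audio_tokens", 0),
    }
```

For the sample above this gives 155 text tokens (119 input plus 36 output) and 120 audio tokens (8 input plus 112 output), which together match total_tokens of 275.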

response.output_item.added

Returned when a new Item is created during Response generation.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.output_item.added.

response_id

string

The ID of the Response to which the item belongs.

output_index

integer

The index of the output item in the Response.

item

object

The item to add to the conversation.

OBJECT response.output_item.added

{
    "event_id": "event_3334",
    "type": "response.output_item.added",
    "response_id": "resp_001",
    "output_index": 0,
    "item": {
        "id": "msg_007",
        "object": "realtime.item",
        "type": "message",
        "status": "in_progress",
        "role": "assistant",
        "content": []
    }
}

response.output_item.done

Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.output_item.done.

response_id

string

The ID of the Response to which the item belongs.

output_index

integer

The index of the output item in the Response.

item

object

The item to add to the conversation.

OBJECT response.output_item.done

{
    "event_id": "event_3536",
    "type": "response.output_item.done",
    "response_id": "resp_001",
    "output_index": 0,
    "item": {
        "id": "msg_007",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
            {
                "type": "text",
                "text": "Sure, I can help with that."
            }
        ]
    }
}

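response.content_part.added

Returned when a new content part is added to an assistant message item during response generation.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.content_part.added.

response_id

string

The ID of the response.

item_id

string

The ID of the item to which the content part was added.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

part

object

The content part that was added.

OBJECT response.content_part.added
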
{
    "event_id": "event_3738",
    "type": "response.content_part.added",
    "response_id": "resp_001",
    "item_id": "msg_007",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "text",
        "text": ""
    }
}

response.content_part.done

Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.content_part.done.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

part

object

The content part that is done.

OBJECT response.content_part.done

{
    "event_id": "event_3940",
    "type": "response.content_part.done",
    "response_id": "resp_001",
    "item_id": "msg_007",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "text",
        "text": "Sure, I can help with that."
    }
}

response.text.delta

Returned when the text value of a "text" content part is updated.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.text.delta.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

delta

string

The text delta.

OBJECT response.text.delta

{
    "event_id": "event_4142",
    "type": "response.text.delta",
    "response_id": "resp_001",
    "item_id": "msg_007",
    "output_index": 0,
    "content_index": 0,
    "delta": "Sure, I can h"
}

response.text.done

Returned when the text value of a "text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.text.done.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

text

string

The final text content.

OBJECT response.text.done

{
    "event_id": "event_4344",
    "type": "response.text.done",
    "response_id": "resp_001",
    "item_id": "msg_007",
    "output_index": 0,
    "content_index": 0,
    "text": "Sure, I can help with that."
}
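
The response.text.delta and response.text.done events can be combined to show text as it streams and then settle on the final value. The following is a minimal sketch; the class name TextAccumulator is hypothetical, and it assumes deltas for one content part arrive in order.

```python
from collections import defaultdict
from typing import Optional


class TextAccumulator:
    """Collects text deltas per (item_id, content_index)."""

    def __init__(self) -> None:
        self._parts = defaultdict(list)

    def on_event(self, event: dict) -> Optional[str]:
        """Return the final text on response.text.done, otherwise None."""
        key = (event.get("item_id"), event.get("content_index"))
        if event.get("type") == "response.text.delta":
            self._parts[key].append(event["delta"])
        elif event.get("type") == "response.text.done":
            # The done event carries the full text, so buffered deltas can be dropped.
            self._parts.pop(key, None)
            return event["text"]
        return None

    def partial(self, item_id: str, content_index: int) -> str:
        """Text streamed so far for a content part that is not yet done."""
        return "".join(self._parts.get((item_id, content_index), []))
```

Calling partial after each delta gives the text to display so far; the value returned for the done event is the one to keep.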

response.audio_transcript.delta

Returned when the model-generated transcription of audio output is updated.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.audio_transcript.delta.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

delta

string

The transcript delta.

OBJECT response.audio_transcript.delta

{
    "event_id": "event_4546",
    "type": "response.audio_transcript.delta",
    "response_id": "resp_001",
    "item_id": "msg_008",
    "output_index": 0,
    "content_index": 0,
    "delta": "Hello, how can I a"
}

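response.audio_transcript.done

Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.audio_transcript.done.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

transcript

string

The final transcript of the audio.

OBJECT response.audio_transcript.done
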
{
    "event_id": "event_4748",
    "type": "response.audio_transcript.done",
    "response_id": "resp_001",
    "item_id": "msg_008",
    "output_index": 0,
    "content_index": 0,
    "transcript": "Hello, how can I assist you today?"
}

response.audio.delta

Returned when the model-generated audio is updated.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.audio.delta.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

delta

string

Base64-encoded audio data delta.

OBJECT response.audio.delta

{
    "event_id": "event_4950",
    "type": "response.audio.delta",
    "response_id": "resp_001",
    "item_id": "msg_008",
    "output_index": 0,
    "content_index": 0,
    "delta": "Base64EncodedAudioDelta"
}
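
Each audio delta is a Base64 string, so it has to be decoded before playback or storage. The following sketch appends decoded chunks to a per-item buffer; append_audio_delta is a hypothetical helper, and a real client would typically hand the bytes to an audio player instead of keeping them all in memory.

```python
import base64
from typing import Dict


def append_audio_delta(buffers: Dict[str, bytearray], event: dict) -> int:
    """Decode a response.audio.delta payload and append it to the item's buffer.

    Returns the number of bytes buffered for that item so far.
    """
    chunk = base64.b64decode(event["delta"])
    buf = buffers.setdefault(event["item_id"], bytearray())
    buf.extend(chunk)
    return len(buf)
```

The sample value "Base64EncodedAudioDelta" shown above is a placeholder, not real audio data.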

response.audio.done

Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.audio.done.

response_id

string

The ID of the response.

item_id

string

The ID of the item.

output_index

integer

The index of the output item in the response.

content_index

integer

The index of the content part in the item's content array.

OBJECT response.audio.done

{
    "event_id": "event_5152",
    "type": "response.audio.done",
    "response_id": "resp_001",
    "item_id": "msg_008",
    "output_index": 0,
    "content_index": 0
}

response.function_call_arguments.delta

Returned when the model-generated function call arguments are updated.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.function_call_arguments.delta.

response_id

string

The ID of the response.

item_id

string

The ID of the function call item.

output_index

integer

The index of the output item in the response.

call_id

string

The ID of the function call.

delta

string

The arguments delta as a JSON string.

OBJECT response.function_call_arguments.delta

{
    "event_id": "event_5354",
    "type": "response.function_call_arguments.delta",
    "response_id": "resp_002",
    "item_id": "fc_001",
    "output_index": 0,
    "call_id": "call_001",
    "delta": "{\"location\": \"San\""
}

response.function_call_arguments.done

Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

event_id

string

The unique ID of the server event.

type

string

The event type, must be response.function_call_arguments.done.

response_id

string

The ID of the response.

item_id

string

The ID of the function call item.

output_index

integer

The index of the output item in the response.

call_id

string

The ID of the function call.

arguments

string

The final arguments as a JSON string.

OBJECT response.function_call_arguments.done

{
    "event_id": "event_5556",
    "type": "response.function_call_arguments.done",
    "response_id": "resp_002",
    "item_id": "fc_001",
    "output_index": 0,
    "call_id": "call_001",
    "arguments": "{\"location\": \"San Francisco\"}"
}
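
The arguments field is a JSON document encoded as a string, so it needs a second parsing step after the event itself is decoded. An individual delta is generally only a fragment and not valid JSON on its own, so it is safer to parse the done event. The following is a minimal sketch; parse_function_call is a hypothetical helper name.

```python
import json


def parse_function_call(event: dict) -> dict:
    """Decode the arguments of a response.function_call_arguments.done event."""
    if event.get("type") != "response.function_call_arguments.done":
        raise ValueError("expected response.function_call_arguments.done")
    return {
        "call_id": event["call_id"],
        # arguments is a JSON string, so it is parsed separately from the event.
        "arguments": json.loads(event["arguments"]),
    }
```

For the sample event above, the parsed arguments are a dictionary with the single key location set to San Francisco.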

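rate_limits.updated

Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created, some tokens are "reserved" for the output tokens. The rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.

event_id

string

The unique ID of the server event.

type

string

The event type, must be rate_limits.updated.

rate_limits

array

List of rate limit information.

OBJECT rate_limits.updated
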
{
    "event_id": "event_5758",
    "type": "rate_limits.updated",
    "rate_limits": [
        {
            "name": "requests",
            "limit": 1000,
            "remaining": 999,
            "reset_seconds": 60
        },
        {
            "name": "tokens",
            "limit": 50000,
            "remaining": 49950,
            "reset_seconds": 60
        }
    ]
}
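
A client can use this event to slow down before it hits a limit. The following sketch indexes the rate_limits array by name and flags a limit whose remaining share falls below a threshold. The function names and the 10% default threshold are illustrative choices, not values from the API.

```python
from typing import Dict


def rate_limits_by_name(event: dict) -> Dict[str, dict]:
    """Index the rate_limits array of a rate_limits.updated event by name."""
    return {rl["name"]: rl for rl in event.get("rate_limits", [])}


def is_near_limit(event: dict, name: str, min_fraction: float = 0.1) -> bool:
    """True when the remaining share of the named limit is below min_fraction."""
    rl = rate_limits_by_name(event).get(name)
    if rl is None or rl["limit"] <= 0:
        return False
    return rl["remaining"] / rl["limit"] < min_fraction
```

Because the reported values already include the output reservation described above, the remaining count may rise again once the Response completes.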