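Server events
These are the events sent from the OpenAI Realtime WebSocket server to the client.
error
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session stays open, so implementors are advised to monitor and log error messages by default.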
{
"event_id": "event_890",
"type": "error",
"error": {
"type": "invalid_request_error",
"code": "invalid_event",
"message": "The 'type' field is missing.",
"param": null,
"event_id": "event_567"
}
}
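Since errors are usually recoverable, a client can log them and keep the connection open. The following is a minimal sketch of that idea, not an official client: the function name log_if_error and the logger name are hypothetical, and it assumes the nested error.event_id identifies the client event that triggered the error.

```python
import json
import logging

logger = logging.getLogger("realtime.events")  # hypothetical logger name


def log_if_error(raw_message: str) -> bool:
    """Parse one server message; log it and return True if it is an error event."""
    event = json.loads(raw_message)
    if event.get("type") != "error":
        return False
    details = event.get("error") or {}
    logger.warning(
        "server error %s: type=%s code=%s message=%s related_event=%s",
        event.get("event_id"),
        details.get("type"),
        details.get("code"),
        details.get("message"),
        details.get("event_id"),  # assumed: the client event that caused the error
    )
    return True


# The sample payload mirrors the example event shown above.
sample = json.dumps({
    "event_id": "event_890",
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "code": "invalid_event",
        "message": "The 'type' field is missing.",
        "param": None,
        "event_id": "event_567",
    },
})
print(log_if_error(sample))  # True
```

The function only reports the error and leaves the decision about closing the session to the caller, which matches the guidance that the session normally stays open.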
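session.created
Returned when a Session is created. Emitted automatically as the first server event when a new connection is established. This event contains the default Session configuration.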
{
"event_id": "event_1234",
"type": "session.created",
"session": {
"id": "sess_001",
"object": "realtime.session",
"model": "gpt-4o-realtime-preview-2024-10-01",
"modalities": ["text", "audio"],
"instructions": "...model instructions here...",
"voice": "sage",
"input_audio_format": "pcm16",
"output_audio_format": "pcm16",
"input_audio_transcription": null,
"turn_detection": {
"type": "server_vad",
"threshold": 0.5,
"prefix_padding_ms": 300,
"silence_duration_ms": 200
},
"tools": [],
"tool_choice": "auto",
"temperature": 0.8,
"max_response_output_tokens": "inf"
}
}
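Because session.created arrives first and carries the default configuration, a client can capture those defaults before sending anything. The helper below is a rough sketch; session_from_first_event is a hypothetical name and the event is assumed to be already parsed into a dict.

```python
def session_from_first_event(event: dict) -> dict:
    """Return the default session configuration carried by session.created."""
    if event.get("type") != "session.created":
        raise ValueError(f"expected session.created, got {event.get('type')!r}")
    return event["session"]


# A trimmed copy of the example event shown above.
first_event = {
    "event_id": "event_1234",
    "type": "session.created",
    "session": {
        "id": "sess_001",
        "modalities": ["text", "audio"],
        "voice": "sage",
        "turn_detection": {"type": "server_vad", "threshold": 0.5},
    },
}

session = session_from_first_event(first_event)
# turn_detection may be null, so guard before reading its type.
uses_server_vad = (session.get("turn_detection") or {}).get("type") == "server_vad"
print(session["id"], uses_server_vad)
```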
session.updated
{
"event_id": "event_5678",
"type": "session.updated",
"session": {
"id": "sess_001",
"object": "realtime.session",
"model": "gpt-4o-realtime-preview-2024-10-01",
"modalities": ["text"],
"instructions": "New instructions",
"voice": "sage",
"input_audio_format": "pcm16",
"output_audio_format": "pcm16",
"input_audio_transcription": {
"model": "whisper-1"
},
"turn_detection": null,
"tools": [],
"tool_choice": "none",
"temperature": 0.7,
"max_response_output_tokens": 200
}
}
conversation.created
conversation.item.created
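Returned when a conversation item is created. Several scenarios produce this event:
- The server is generating a Response, which, if successful, produces one or two Items of type message (role assistant) or type function_call.
- The input audio buffer has been committed, either by the client or by the server (in server_vad mode). The server takes the content of the input audio buffer and adds it to a new user message Item.
- The client has sent a conversation.item.create event to add a new Item to the Conversation.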
{
"event_id": "event_1920",
"type": "conversation.item.created",
"previous_item_id": "msg_002",
"item": {
"id": "msg_003",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "user",
"content": [
{
"type": "input_audio",
"transcript": "hello how are you",
"audio": "base64encodedaudio=="
}
]
}
}
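A client that mirrors the conversation locally can use these events to keep its own item list. The sketch below assumes that previous_item_id names the item after which the new one is placed, which the text above does not state explicitly; ConversationLog is a hypothetical helper.

```python
class ConversationLog:
    """Keeps a local, ordered copy of conversation items (illustrative only)."""

    def __init__(self) -> None:
        self.order: list = []   # item ids in conversation order
        self.items: dict = {}   # item id -> item payload

    def on_item_created(self, event: dict) -> None:
        item = event["item"]
        previous = event.get("previous_item_id")
        self.items[item["id"]] = item
        if previous in self.order:
            # Assumption: the new item goes directly after previous_item_id.
            self.order.insert(self.order.index(previous) + 1, item["id"])
        else:
            # Unknown or missing predecessor: append at the end.
            self.order.append(item["id"])


log = ConversationLog()
log.on_item_created({"previous_item_id": None, "item": {"id": "msg_002", "type": "message"}})
log.on_item_created({"previous_item_id": "msg_002", "item": {"id": "msg_004", "type": "message"}})
log.on_item_created({"previous_item_id": "msg_002", "item": {"id": "msg_003", "type": "message"}})
print(log.order)
```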
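conversation.item.input_audio_transcription.completed
This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or by the server (in server_vad mode). Transcription runs asynchronously with Response creation, so this event may arrive before or after the Response events.
Realtime API models accept audio natively, so input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model, currently always whisper-1. The transcript may therefore diverge somewhat from the model's interpretation and should be treated as a rough guide.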
{
"event_id": "event_2122",
"type": "conversation.item.input_audio_transcription.completed",
"item_id": "msg_003",
"content_index": 0,
"transcript": "Hello, how are you?"
}
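Because transcription is asynchronous, the transcript usually has to be attached to an item that was stored earlier. A minimal sketch, assuming the client keeps its items in a dict keyed by item id (the items store and the apply_transcript name are hypothetical):

```python
COMPLETED = "conversation.item.input_audio_transcription.completed"


def apply_transcript(items: dict, event: dict) -> bool:
    """Attach a finished transcript to the stored item; return True if applied."""
    if event.get("type") != COMPLETED:
        return False
    item = items.get(event["item_id"])
    if item is None:
        return False  # the item is not in the local store
    # content_index selects which content part of the item was transcribed.
    part = item["content"][event["content_index"]]
    part["transcript"] = event["transcript"]
    return True


items = {
    "msg_003": {
        "id": "msg_003",
        "role": "user",
        "content": [{"type": "input_audio", "transcript": None}],
    }
}
applied = apply_transcript(items, {
    "event_id": "event_2122",
    "type": COMPLETED,
    "item_id": "msg_003",
    "content_index": 0,
    "transcript": "Hello, how are you?",
})
print(applied, items["msg_003"]["content"][0]["transcript"])
```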
conversation.item.input_audio_transcription.failed
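Returned when input audio transcription is configured and a transcription request for a user message fails. These events are kept separate from other error events so that the client can identify the related Item.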
{
"event_id": "event_2324",
"type": "conversation.item.input_audio_transcription.failed",
"item_id": "msg_003",
"content_index": 0,
"error": {
"type": "transcription_error",
"code": "audio_unintelligible",
"message": "The audio could not be transcribed.",
"param": null
}
}
conversation.item.truncated
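Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event. This event is used to synchronize the server's understanding of the audio with the client's playback.
This action truncates the audio and removes the server-side text transcript, so that the context contains no text the user has not heard.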
{
"event_id": "event_2526",
"type": "conversation.item.truncated",
"item_id": "msg_004",
"content_index": 0,
"audio_end_ms": 1500
}
conversation.item.deleted
input_audio_buffer.committed
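Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The item_id property is the ID of the user message item that will be created, so a conversation.item.created event is also sent to the client.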
{
"event_id": "event_1121",
"type": "input_audio_buffer.committed",
"previous_item_id": "msg_001",
"item_id": "msg_002"
}
input_audio_buffer.cleared
input_audio_buffer.speech_started
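Sent by the server in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer (unless speech has already been detected). The client may want to use this event to interrupt audio playback or to provide visual feedback to the user.
The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops, and it is also included in the input_audio_buffer.speech_stopped event (unless the client manually commits the audio buffer during VAD activation).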
{
"event_id": "event_1516",
"type": "input_audio_buffer.speech_started",
"audio_start_ms": 1000,
"item_id": "msg_003"
}
input_audio_buffer.speech_stopped
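Returned in server_vad mode when the server detects the end of speech in the audio buffer. The server also sends a conversation.item.created event with the user message item created from the audio buffer.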
{
"event_id": "event_1718",
"type": "input_audio_buffer.speech_stopped",
"audio_end_ms": 2000,
"item_id": "msg_003"
}
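The speech_started and speech_stopped events share an item_id, so a client can pair them to measure how long the user spoke. The sketch below is one possible approach; SpeechTracker is a hypothetical name, and it assumes audio_start_ms and audio_end_ms are measured on the same clock.

```python
from typing import Optional


class SpeechTracker:
    """Pairs speech_started / speech_stopped events by item_id (illustrative)."""

    def __init__(self) -> None:
        self._starts: dict = {}  # item_id -> audio_start_ms

    def on_event(self, event: dict) -> Optional[int]:
        """Return the speech duration in ms when a turn ends, otherwise None."""
        kind = event.get("type")
        if kind == "input_audio_buffer.speech_started":
            self._starts[event["item_id"]] = event["audio_start_ms"]
            return None
        if kind == "input_audio_buffer.speech_stopped":
            start = self._starts.pop(event["item_id"], None)
            if start is None:
                return None  # no matching start was seen for this item
            return event["audio_end_ms"] - start
        return None


tracker = SpeechTracker()
tracker.on_event({"type": "input_audio_buffer.speech_started",
                  "audio_start_ms": 1000, "item_id": "msg_003"})
duration = tracker.on_event({"type": "input_audio_buffer.speech_stopped",
                             "audio_end_ms": 2000, "item_id": "msg_003"})
print(duration)  # 1000
```

A real client would typically also stop local playback when speech_started arrives, as the description above suggests.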
response.created
Returned when a new Response is created. This is the first event of response creation, where the response is in its initial state of in_progress.
{
"event_id": "event_2930",
"type": "response.created",
"response": {
"id": "resp_001",
"object": "realtime.response",
"status": "in_progress",
"status_details": null,
"output": [],
"usage": null
}
}
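response.done
Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the response.done event contains all output Items in the Response but omits the raw audio data.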
{
"event_id": "event_3132",
"type": "response.done",
"response": {
"id": "resp_001",
"object": "realtime.response",
"status": "completed",
"status_details": null,
"output": [
{
"id": "msg_006",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Sure, how can I assist you today?"
}
]
}
],
"usage": {
"total_tokens":275,
"input_tokens":127,
"output_tokens":148,
"input_token_details": {
"cached_tokens":384,
"text_tokens":119,
"audio_tokens":8,
"cached_tokens_details": {
"text_tokens": 128,
"audio_tokens": 256
}
},
"output_token_details": {
"text_tokens":36,
"audio_tokens":112
}
}
}
}
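Since response.done is always emitted and carries the usage block, it is a convenient place to record token counts. The following sketch reads the fields shown in the example above; summarize_usage is a hypothetical helper, and the consistency check (total equals input plus output) is an assumption based on that example rather than a documented guarantee.

```python
def summarize_usage(event: dict) -> dict:
    """Extract token counts from a response.done event (fields as in the example)."""
    usage = event["response"].get("usage") or {}
    input_tokens = usage.get("input_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    total_tokens = usage.get("total_tokens", 0)
    output_details = usage.get("output_token_details") or {}
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
        "output_audio_tokens": output_details.get("audio_tokens", 0),
        # Assumed relationship; it holds for the example payload above.
        "consistent": total_tokens == input_tokens + output_tokens,
    }


done_event = {
    "type": "response.done",
    "response": {
        "id": "resp_001",
        "status": "completed",
        "usage": {
            "total_tokens": 275,
            "input_tokens": 127,
            "output_tokens": 148,
            "output_token_details": {"text_tokens": 36, "audio_tokens": 112},
        },
    },
}
summary = summarize_usage(done_event)
print(summary["total_tokens"], summary["consistent"])  # 275 True
```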
response.output_item.added
Returned when a new Item is created during Response generation.
{
"event_id": "event_3334",
"type": "response.output_item.added",
"response_id": "resp_001",
"output_index": 0,
"item": {
"id": "msg_007",
"object": "realtime.item",
"type": "message",
"status": "in_progress",
"role": "assistant",
"content": []
}
}
response.output_item.done
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
{
"event_id": "event_3536",
"type": "response.output_item.done",
"response_id": "resp_001",
"output_index": 0,
"item": {
"id": "msg_007",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Sure, I can help with that."
}
]
}
}
response.content_part.added
Returned when a new content part is added to an assistant message item during response generation.
{
"event_id": "event_3738",
"type": "response.content_part.added",
"response_id": "resp_001",
"item_id": "msg_007",
"output_index": 0,
"content_index": 0,
"part": {
"type": "text",
"text": ""
}
}
response.content_part.done
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.
{
"event_id": "event_3940",
"type": "response.content_part.done",
"response_id": "resp_001",
"item_id": "msg_007",
"output_index": 0,
"content_index": 0,
"part": {
"type": "text",
"text": "Sure, I can help with that."
}
}
response.text.delta
Returned when the text value of a "text" content part is updated.
{
"event_id": "event_4142",
"type": "response.text.delta",
"response_id": "resp_001",
"item_id": "msg_007",
"output_index": 0,
"content_index": 0,
"delta": "Sure, I can h"
}
response.text.done
Returned when the text value of a "text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
{
"event_id": "event_4344",
"type": "response.text.done",
"response_id": "resp_001",
"item_id": "msg_007",
"output_index": 0,
"content_index": 0,
"text": "Sure, I can help with that."
}
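The delta and done events for text can be combined into a small accumulator: append each delta as it arrives, then take the text from the done event as the final value. This is only a sketch; TextAssembler is a hypothetical name, and keying the buffer by response_id, item_id, and content_index is a design choice rather than something the events require.

```python
from collections import defaultdict
from typing import Optional


class TextAssembler:
    """Accumulates response.text.delta events until response.text.done arrives."""

    def __init__(self) -> None:
        self._parts = defaultdict(str)
        self.mismatches = 0  # times the streamed text differed from the final text

    @staticmethod
    def _key(event: dict) -> tuple:
        return (event["response_id"], event["item_id"], event["content_index"])

    def on_event(self, event: dict) -> Optional[str]:
        """Return the final text on response.text.done, otherwise None."""
        kind = event.get("type")
        if kind == "response.text.delta":
            self._parts[self._key(event)] += event["delta"]
            return None
        if kind == "response.text.done":
            streamed = self._parts.pop(self._key(event), "")
            if streamed != event["text"]:
                self.mismatches += 1
            return event["text"]  # the done event carries the complete text
        return None


base = {"response_id": "resp_001", "item_id": "msg_007", "output_index": 0, "content_index": 0}
assembler = TextAssembler()
assembler.on_event({**base, "type": "response.text.delta", "delta": "Sure, I can h"})
assembler.on_event({**base, "type": "response.text.delta", "delta": "elp with that."})
final_text = assembler.on_event({**base, "type": "response.text.done",
                                 "text": "Sure, I can help with that."})
print(final_text, assembler.mismatches)
```

The same pattern applies to the audio transcript events, with the transcript field taking the place of text.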
response.audio_transcript.delta
Returned when the model-generated transcription of audio output is updated.
{
"event_id": "event_4546",
"type": "response.audio_transcript.delta",
"response_id": "resp_001",
"item_id": "msg_008",
"output_index": 0,
"content_index": 0,
"delta": "Hello, how can I a"
}
response.audio_transcript.done
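Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.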
{
"event_id": "event_4748",
"type": "response.audio_transcript.done",
"response_id": "resp_001",
"item_id": "msg_008",
"output_index": 0,
"content_index": 0,
"transcript": "Hello, how can I assist you today?"
}
response.audio.delta
Returned when the model-generated audio is updated.
{
"event_id": "event_4950",
"type": "response.audio.delta",
"response_id": "resp_001",
"item_id": "msg_008",
"output_index": 0,
"content_index": 0,
"delta": "Base64EncodedAudioDelta"
}
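Audio deltas can be collected per item until response.audio.done arrives. The sketch below assumes, as the placeholder value in the example suggests, that delta is a base64-encoded chunk of audio bytes in the session's output format; AudioCollector is a hypothetical helper and the sample bytes are made up for illustration.

```python
import base64
from typing import Optional


class AudioCollector:
    """Decodes and concatenates audio deltas per item (illustrative sketch)."""

    def __init__(self) -> None:
        self._chunks: dict = {}  # item_id -> bytearray of decoded audio

    def on_event(self, event: dict) -> Optional[bytes]:
        """Return the full audio for an item on response.audio.done, else None."""
        kind = event.get("type")
        if kind == "response.audio.delta":
            buffer = self._chunks.setdefault(event["item_id"], bytearray())
            # Assumption: delta holds base64-encoded audio bytes.
            buffer.extend(base64.b64decode(event["delta"]))
            return None
        if kind == "response.audio.done":
            return bytes(self._chunks.pop(event["item_id"], bytearray()))
        return None


collector = AudioCollector()
for chunk in (b"\x01\x02", b"\x03\x04"):  # made-up sample bytes
    collector.on_event({
        "type": "response.audio.delta",
        "item_id": "msg_008",
        "delta": base64.b64encode(chunk).decode("ascii"),
    })
audio = collector.on_event({"type": "response.audio.done", "item_id": "msg_008"})
print(len(audio))  # 4
```

A latency-sensitive client would more likely hand each decoded chunk to the audio output as it arrives instead of waiting for the done event.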
response.audio.done
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.
{
"event_id": "event_5152",
"type": "response.audio.done",
"response_id": "resp_001",
"item_id": "msg_008",
"output_index": 0,
"content_index": 0
}
response.function_call_arguments.delta
Returned when the model-generated function call arguments are updated.
{
"event_id": "event_5354",
"type": "response.function_call_arguments.delta",
"response_id": "resp_002",
"item_id": "fc_001",
"output_index": 0,
"call_id": "call_001",
"delta": "{\"location\": \"San\""
}
response.function_call_arguments.done
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
{
"event_id": "event_5556",
"type": "response.function_call_arguments.done",
"response_id": "resp_002",
"item_id": "fc_001",
"output_index": 0,
"call_id": "call_001",
"arguments": "{\"location\": \"San Francisco\"}"
}
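Function call arguments arrive as fragments of a JSON string, so they are only safe to parse once the done event has been received. A minimal sketch, assuming arguments is a JSON-encoded object as in the example above (FunctionCallCollector is a hypothetical name):

```python
import json
from typing import Optional


class FunctionCallCollector:
    """Buffers argument deltas per call_id and parses them on completion."""

    def __init__(self) -> None:
        self._buffers: dict = {}  # call_id -> partial JSON text

    def partial(self, call_id: str) -> str:
        """Return the raw text received so far (not necessarily valid JSON)."""
        return self._buffers.get(call_id, "")

    def on_event(self, event: dict) -> Optional[dict]:
        """Return parsed arguments on the done event, otherwise None."""
        kind = event.get("type")
        if kind == "response.function_call_arguments.delta":
            call_id = event["call_id"]
            self._buffers[call_id] = self._buffers.get(call_id, "") + event["delta"]
            return None
        if kind == "response.function_call_arguments.done":
            self._buffers.pop(event["call_id"], None)
            # The done event carries the complete argument string.
            return json.loads(event["arguments"])
        return None


calls = FunctionCallCollector()
calls.on_event({"type": "response.function_call_arguments.delta",
                "call_id": "call_001", "delta": "{\"location\": \"San"})
calls.on_event({"type": "response.function_call_arguments.delta",
                "call_id": "call_001", "delta": " Francisco\"}"})
streamed = calls.partial("call_001")
arguments = calls.on_event({"type": "response.function_call_arguments.done",
                            "call_id": "call_001",
                            "arguments": "{\"location\": \"San Francisco\"}"})
print(arguments["location"])  # San Francisco
```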
rate_limits.updated
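Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created, some tokens are "reserved" for the output tokens; the rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.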
{
"event_id": "event_5758",
"type": "rate_limits.updated",
"rate_limits": [
{
"name": "requests",
"limit": 1000,
"remaining": 999,
"reset_seconds": 60
},
{
"name": "tokens",
"limit": 50000,
"remaining": 49950,
"reset_seconds": 60
}
]
}
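A client can index the rate_limits array by name and flag limits that are close to exhaustion. The sketch below uses only the fields shown in the example; the helper names and the 10% threshold are hypothetical choices, not part of the API.

```python
def rate_limits_by_name(event: dict) -> dict:
    """Index the rate_limits entries of a rate_limits.updated event by name."""
    return {limit["name"]: limit for limit in event.get("rate_limits", [])}


def nearly_exhausted(event: dict, min_fraction: float = 0.1) -> list:
    """Return names of limits whose remaining share is below min_fraction."""
    names = []
    for limit in event.get("rate_limits", []):
        if limit["limit"] > 0 and limit["remaining"] / limit["limit"] < min_fraction:
            names.append(limit["name"])
    return names


limits_event = {
    "event_id": "event_5758",
    "type": "rate_limits.updated",
    "rate_limits": [
        {"name": "requests", "limit": 1000, "remaining": 999, "reset_seconds": 60},
        {"name": "tokens", "limit": 50000, "remaining": 49950, "reset_seconds": 60},
    ],
}
by_name = rate_limits_by_name(limits_event)
print(by_name["tokens"]["remaining"], nearly_exhausted(limits_event))  # 49950 []
```

Because the values include the reservation for output tokens, remaining may rise again after the Response completes, so a single low reading is better treated as a hint than as a hard stop.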