OpenAI API Reference

The transcription object (Verbose JSON)

Represents a verbose JSON transcription response returned by the model, based on the provided input.

language

string

The language of the input audio.

duration

string

The duration of the input audio.

text

string

The transcribed text.

words

array

Extracted words and their corresponding timestamps.

segments

array

Segments of the transcribed text and their corresponding details.
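
The fields listed above can be mirrored in a small typed container on the client side. The following is a minimal sketch, not part of any official SDK: the class name `VerboseTranscription` and its `from_dict` helper are hypothetical. It assumes `duration` may arrive either as a string (as documented) or as a number (as in the example response), so it is coerced to `float`, and it treats `words` and `segments` as optional lists of plain dictionaries.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VerboseTranscription:
    """Hypothetical container for the documented top-level fields."""

    language: str
    duration: float  # coerced from string or number; see from_dict
    text: str
    words: List[Dict[str, Any]] = field(default_factory=list)
    segments: List[Dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_dict(cls, payload: Dict[str, Any]) -> "VerboseTranscription":
        """Build the container from an already-decoded JSON object."""
        return cls(
            language=payload["language"],
            # Assumption: accept "8.47" as well as 8.47.
            duration=float(payload["duration"]),
            text=payload["text"],
            # Assumption: either array may be absent from a response.
            words=payload.get("words") or [],
            segments=payload.get("segments") or [],
        )
```

Calling `VerboseTranscription.from_dict(json.loads(body))` on a decoded response body would then give attribute access such as `result.text` and `result.segments`.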

OBJECT The transcription object (Verbose JSON)

{
  "task": "transcribe",
  "language": "english",
  "duration": 8.470000267028809,
  "text": "The beach was a popular spot on a hot summer day. People were swimming in the ocean, building sandcastles, and playing beach volleyball.",
  "segments": [
    {
      "id": 0,
      "seek": 0,
      "start": 0.0,
      "end": 3.319999933242798,
      "text": " The beach was a popular spot on a hot summer day.",
      "tokens": [
        50364, 440, 7534, 390, 257, 3743, 4008, 322, 257, 2368, 4266, 786, 13, 50530
      ],
      "temperature": 0.0,
      "avg_logprob": -0.2860786020755768,
      "compression_ratio": 1.2363636493682861,
      "no_speech_prob": 0.00985979475080967
    },
    ...
  ]
}
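
A response shaped like the example above could be consumed as in the following sketch, which prints one timestamped line per segment. The helper names `format_timestamp` and `segment_lines` are illustrative, and the `0.5` cutoff on `no_speech_prob` is an arbitrary assumption rather than a documented recommendation. The embedded JSON is a trimmed copy of the example; the elided segments are omitted, not reconstructed.

```python
import json
from typing import Any, Dict, List

# Trimmed copy of the example object; only the first segment is included.
RAW = """
{
  "task": "transcribe",
  "language": "english",
  "duration": 8.470000267028809,
  "text": "The beach was a popular spot on a hot summer day. People were swimming in the ocean, building sandcastles, and playing beach volleyball.",
  "segments": [
    {
      "id": 0,
      "seek": 0,
      "start": 0.0,
      "end": 3.319999933242798,
      "text": " The beach was a popular spot on a hot summer day.",
      "no_speech_prob": 0.00985979475080967
    }
  ]
}
"""


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rem_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rem_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(payload: Dict[str, Any], max_no_speech_prob: float = 0.5) -> List[str]:
    """Return one formatted line per segment, skipping likely non-speech segments."""
    lines = []
    for seg in payload.get("segments", []):
        # Assumption: segments above the cutoff are treated as silence or noise.
        if seg.get("no_speech_prob", 0.0) > max_no_speech_prob:
            continue
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Segment text carries a leading space in the example, so strip it.
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines


transcription = json.loads(RAW)
for line in segment_lines(transcription):
    print(line)
# prints: [00:00.000 - 00:03.320] The beach was a popular spot on a hot summer day.
```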