Streaming
Stream Chat Completions in real time. Receive chunks of completions returned from the model using server-sent events (SSE).
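Each server-sent event carries one chunk as a `data:` line, and the stream is terminated by a `data: [DONE]` sentinel. The following is a minimal sketch of how a client might parse those lines and concatenate `delta.content` fragments into the full message; the raw event lines and the `parse_sse_stream` helper are illustrative, not captured from a live response.

```python
import json

def parse_sse_stream(lines):
    """Yield parsed chat.completion.chunk payloads from raw SSE data lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines, comments, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":  # sentinel marking the end of the stream
            return
        yield json.loads(payload)

# Illustrative SSE lines; shapes follow the chunk object described below.
raw = [
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

# Concatenate delta.content fragments into the complete message text.
text = "".join(
    chunk["choices"][0]["delta"].get("content") or ""
    for chunk in parse_sse_stream(raw)
    if chunk["choices"]
)
print(text)  # -> Hello world
```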
The chat completion chunk object
Represents a streamed chunk of a chat completion response returned by the model, based on the provided input.
choices
array
A list of chat completion choices. Can be empty for the last chunk if you set `stream_options: {"include_usage": true}`.
delta
object
A chat completion delta generated by streamed model responses.
content string or null
The content of the chunk message.
refusal string or null
The refusal message generated by the model.
role string
The role of the author of this message.
tool_calls array
The tool calls generated by the model.
index integer
The index of the tool call in the list.
function object
Details of the function to call.
arguments string
Arguments for the function call, generated by the model in JSON format.
Note: the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
name string
The name of the function to call.
id string
The ID of the tool call.
type string
The type of the tool. Currently, only `function` is supported.
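Streamed tool calls arrive as fragments spread across successive chunks: `index` identifies which tool call a fragment belongs to, the `id`, `type`, and function `name` typically arrive in the first fragment, and the `arguments` string is built up by concatenation. A sketch of merging these fragments, with illustrative fragment shapes and a hypothetical `merge_tool_call_deltas` helper:

```python
import json

def merge_tool_call_deltas(deltas):
    """Merge streamed tool_calls fragments into complete tool calls, keyed by index."""
    calls = {}
    for fragment in deltas:
        idx = fragment["index"]
        call = calls.setdefault(idx, {"id": None, "type": None,
                                      "function": {"name": "", "arguments": ""}})
        if fragment.get("id"):
            call["id"] = fragment["id"]
        if fragment.get("type"):
            call["type"] = fragment["type"]
        fn = fragment.get("function") or {}
        if fn.get("name"):
            call["function"]["name"] += fn["name"]
        if fn.get("arguments"):
            call["function"]["arguments"] += fn["arguments"]
    return [calls[i] for i in sorted(calls)]

# Fragments as they might appear across successive delta.tool_calls entries.
deltas = [
    {"index": 0, "id": "call_abc", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"city": '}},
    {"index": 0, "function": {"arguments": '"Paris"}'}},
]

(call,) = merge_tool_call_deltas(deltas)
args = json.loads(call["function"]["arguments"])  # validate before use
print(call["function"]["name"], args)  # -> get_weather {'city': 'Paris'}
```

Parsing the accumulated `arguments` with `json.loads` only after the stream finishes doubles as the validation step recommended above.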
finish_reason
string or null
The reason the model stopped generating tokens, or `null` for chunks where generation is still in progress:
`stop`: the model reached a natural stopping point or a provided stop sequence.
`length`: the maximum number of tokens specified in the request was reached.
`content_filter`: content was omitted due to a content filter.
`tool_calls`: the model called a tool.
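In practice, a client watches `finish_reason` on each choice to decide what to do when the stream ends. A small dispatch sketch; the action labels are hypothetical, chosen only to illustrate the branches:

```python
def handle_finish(finish_reason):
    """Map a choice's finish_reason to a follow-up action (illustrative labels)."""
    if finish_reason is None:
        return "streaming"          # generation still in progress
    return {
        "stop": "complete",         # natural stop or stop sequence reached
        "length": "truncated",      # token limit hit; output may be cut off
        "content_filter": "filtered",
        "tool_calls": "run_tools",  # execute the requested tools, then continue
    }.get(finish_reason, "unknown")

print(handle_finish(None))      # -> streaming
print(handle_finish("length"))  # -> truncated
```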
index
integer
The index of the choice in the list of choices.
created
integer
The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp.
id
string
A unique identifier for the chat completion. Each chunk has the same ID.
model
string
The model used to generate the completion.
object
string
The object type, which is always `chat.completion.chunk`.
system_fingerprint
string
This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.
usage
object
Usage statistics for the completion request. Only present when you set `stream_options: {"include_usage": true}`; it is `null` on every chunk except the last.
completion_tokens integer
Number of tokens in the generated completion.
prompt_tokens integer
Number of tokens in the prompt.
total_tokens integer
Total number of tokens used in the request (prompt + completion).
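With `stream_options: {"include_usage": true}`, the token counts arrive only on the final chunk, whose `choices` list is empty. A sketch of extracting them, assuming chunks already decoded into dicts (the sample data is illustrative):

```python
def final_usage(chunks):
    """Return the usage object from the last chunk that carries one.

    With include_usage enabled, every chunk has usage=None except the
    final chunk, whose choices list is empty and whose usage is populated.
    """
    usage = None
    for chunk in chunks:
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return usage

# Illustrative chunks, reduced to the fields relevant here.
chunks = [
    {"choices": [{"index": 0, "delta": {"content": "Hi"}}], "usage": None},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 2,
                              "total_tokens": 11}},
]

u = final_usage(chunks)
assert u["total_tokens"] == u["prompt_tokens"] + u["completion_tokens"]
print(u["total_tokens"])  # -> 11
```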