Streaming
Stream Chat Completions in real time. Receive chunks of completions returned from the model using server-sent events (SSE).
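Each server-sent event carries one chunk as a `data:` line, and the stream is terminated by a `data: [DONE]` sentinel. The following is a minimal sketch of how a client might parse those lines and concatenate `delta.content` fragments into the full message; the raw event lines and the `parse_sse_stream` helper are illustrative, not captured from a live response.

```python
import json

def parse_sse_stream(lines):
    """Yield parsed chat.completion.chunk payloads from raw SSE data lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines, comments, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":  # sentinel marking the end of the stream
            return
        yield json.loads(payload)

# Illustrative SSE lines; shapes follow the chunk object described below.
raw = [
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

# Concatenate delta.content fragments into the complete message text.
text = "".join(
    chunk["choices"][0]["delta"].get("content") or ""
    for chunk in parse_sse_stream(raw)
    if chunk["choices"]
)
print(text)  # -> Hello world
```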
The chat completion chunk object
Represents a streamed chunk of a chat completion response returned by the model, based on the provided input.
choices
array
A list of chat completion choices. Can be empty for the last chunk if you set `stream_options: {"include_usage": true}`.
delta
object
A chat completion delta generated by streamed model responses.
content string or null
The content of the chunk message.
refusal string or null
The refusal message generated by the model.
role string
The role of the author of this message.
tool_calls array
The tool calls generated by the model.
index integer
The index of the tool call in the list.
function object
Details of the function to call.
arguments string
Arguments for the function call, generated by the model in JSON format.
Note: the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
name string
The name of the function to call.
id string
The ID of the tool call.
type string
The type of the tool. Currently, only `function` is supported.
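Streamed tool calls arrive as fragments spread across successive chunks: `index` identifies which tool call a fragment belongs to, the `id`, `type`, and function `name` typically arrive in the first fragment, and the `arguments` string is built up by concatenation. A sketch of merging these fragments, with illustrative fragment shapes and a hypothetical `merge_tool_call_deltas` helper:

```python
import json

def merge_tool_call_deltas(deltas):
    """Merge streamed tool_calls fragments into complete tool calls, keyed by index."""
    calls = {}
    for fragment in deltas:
        idx = fragment["index"]
        call = calls.setdefault(idx, {"id": None, "type": None,
                                      "function": {"name": "", "arguments": ""}})
        if fragment.get("id"):
            call["id"] = fragment["id"]
        if fragment.get("type"):
            call["type"] = fragment["type"]
        fn = fragment.get("function") or {}
        if fn.get("name"):
            call["function"]["name"] += fn["name"]
        if fn.get("arguments"):
            call["function"]["arguments"] += fn["arguments"]
    return [calls[i] for i in sorted(calls)]

# Fragments as they might appear across successive delta.tool_calls entries.
deltas = [
    {"index": 0, "id": "call_abc", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"city": '}},
    {"index": 0, "function": {"arguments": '"Paris"}'}},
]

(call,) = merge_tool_call_deltas(deltas)
args = json.loads(call["function"]["arguments"])  # validate before use
print(call["function"]["name"], args)  # -> get_weather {'city': 'Paris'}
```

Parsing the accumulated `arguments` with `json.loads` only after the stream finishes doubles as the validation step recommended above.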
finish_reason
string or null
The reason the model stopped generating tokens, or `null` for chunks where generation is still in progress:
`stop`: the model reached a natural stopping point or a provided stop sequence.
`length`: the maximum number of tokens specified in the request was reached.
`content_filter`: content was omitted due to a content filter.
`tool_calls`: the model called a tool.
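In practice, a client watches `finish_reason` on each choice to decide what to do when the stream ends. A small dispatch sketch; the action labels are hypothetical, chosen only to illustrate the branches:

```python
def handle_finish(finish_reason):
    """Map a choice's finish_reason to a follow-up action (illustrative labels)."""
    if finish_reason is None:
        return "streaming"          # generation still in progress
    return {
        "stop": "complete",         # natural stop or stop sequence reached
        "length": "truncated",      # token limit hit; output may be cut off
        "content_filter": "filtered",
        "tool_calls": "run_tools",  # execute the requested tools, then continue
    }.get(finish_reason, "unknown")

print(handle_finish(None))      # -> streaming
print(handle_finish("length"))  # -> truncated
```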
index
integer
The index of the choice in the list of choices.
created
integer
The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp.
id
string
A unique identifier for the chat completion. Each chunk has the same ID.
model
string
The model used to generate the completion.
object
string
The object type, which is always `chat.completion.chunk`.
system_fingerprint
string
This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.
usage
object
Usage statistics for the completion request. Only present when you set `stream_options: {"include_usage": true}`; it is `null` on every chunk except the last.
completion_tokens integer
Number of tokens in the generated completion.
prompt_tokens integer
Number of tokens in the prompt.
total_tokens integer
Total number of tokens used in the request (prompt + completion).
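With `stream_options: {"include_usage": true}`, the token counts arrive only on the final chunk, whose `choices` list is empty. A sketch of extracting them, assuming chunks already decoded into dicts (the sample data is illustrative):

```python
def final_usage(chunks):
    """Return the usage object from the last chunk that carries one.

    With include_usage enabled, every chunk has usage=None except the
    final chunk, whose choices list is empty and whose usage is populated.
    """
    usage = None
    for chunk in chunks:
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return usage

# Illustrative chunks, reduced to the fields relevant here.
chunks = [
    {"choices": [{"index": 0, "delta": {"content": "Hi"}}], "usage": None},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 2,
                              "total_tokens": 11}},
]

u = final_usage(chunks)
assert u["total_tokens"] == u["prompt_tokens"] + u["completion_tokens"]
print(u["total_tokens"])  # -> 11
```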