Multimodal Input
Some models, such as image recognition models, accept multimodal input, so a single message can combine text with media files. The following example sends two images together with a text prompt:
POST /v1/chat/completions
# Each "image_url" entry accepts an image URL or a base64 Data URL; a message may contain several.
curl https://api-platform.ope.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $YOUR_API_KEY" \
-d '{
"model": "$MODEL_ID",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {"url": "$URL"}
},
{
"type": "image_url",
"image_url": {"url": "$URL"}
},
{
"type": "text",
"text": "Explain the meaning of these images."
}
]
}
]
}'
# First, install the OpenAI library:
# pip install openai
from openai import OpenAI
client = OpenAI(
api_key="$YOUR_API_KEY",
base_url="https://api-platform.ope.ai/v1/"
)
completion = client.chat.completions.create(
model="$MODEL_ID",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": [
{
"type": "image_url",
"image_url": {"url": "$URL"} # Image URL or Data URL (base64)
},
{
"type": "image_url",
"image_url": {"url": "$URL"} # Supports multiple inputs
},
{
"type": "text",
"text": "Explain the meaning of these images."
}
]}
]
)
print(completion.choices[0].message)
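
The examples above accept either an image URL or a base64 Data URL in the "url" field. The following is a minimal sketch of how a local image file might be turned into a Data URL and how the "content" list could be assembled for several images. The helper names `to_data_url` and `build_user_content` are hypothetical and not part of the API or the OpenAI library, and which image formats or sizes the service accepts is not stated here, so treat those as assumptions to verify.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 Data URL (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. "image/png" for .png.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_user_content(image_urls: list[str], prompt: str) -> list[dict]:
    """Build the "content" list: one image_url part per image, then the text part."""
    parts = [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    parts.append({"type": "text", "text": prompt})
    return parts
```

The list returned by `build_user_content` could then be passed as the "content" value of the user message in the request shown above, mixing remote URLs and Data URLs as needed.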