Request Body

model
string
default:"claude-3-5-sonnet-20241022"
Model to use.

messages
array
A list of messages comprising the conversation so far. Each message object has a role and a content field.

stream
boolean
If set, partial message deltas will be sent as they are generated.

temperature
number
Sampling temperature between 0 and 2. Required range: 0 <= x <= 2.

max_tokens
integer
The maximum number of tokens to generate.

top_p
number
Nucleus sampling parameter.

n
integer
How many completions to generate.

stop
string or array
Up to 4 sequences where the API will stop generating.

presence_penalty
number
Penalize new tokens based on their presence in the text so far. Required range: -2 <= x <= 2.

frequency_penalty
number
Penalize new tokens based on their frequency in the text so far. Required range: -2 <= x <= 2.
Example Request
curl https://api.xtrix.workers.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
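The same request with several of the optional parameters filled in (the values here are purely illustrative):

{
  "model": "claude-3-5-sonnet-20241022",
  "messages": [
    { "role": "user", "content": "Tell me a story" }
  ],
  "temperature": 0.7,
  "max_tokens": 256,
  "stop": ["THE END"],
  "stream": false
}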
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "claude-3-5-sonnet-20241022",
  "system_fingerprint": "fp_44709d6f",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
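The same fields can be read programmatically. A minimal non-streaming sketch, assuming the OpenAI Node SDK is pointed at this endpoint (client setup as shown in the Streaming section below):

const completion = await openai.chat.completions.create({
  model: 'claude-3-5-sonnet-20241022',
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
});

// The reply text lives in choices[0].message.content
console.log(completion.choices[0].message.content);

// Token accounting comes back in the usage object
console.log(`Total tokens: ${completion.usage?.total_tokens}`);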
Streaming
Set stream: true to receive responses as they’re generated:
import OpenAI from 'openai';

// Point the official OpenAI SDK at this endpoint
const openai = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.xtrix.workers.dev/v1',
});

const stream = await openai.chat.completions.create({
  model: 'claude-3-5-sonnet-20241022',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
Each chunk in the stream follows this format:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1677858242,
  "model": "claude-3-5-sonnet-20241022",
  "system_fingerprint": "fp_44709d6f",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "The"
      },
      "logprobs": null,
      "finish_reason": null
    }
  ]
}
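Because each chunk carries only an incremental delta, a client that wants the complete message concatenates delta.content across chunks. A minimal variant of the loop above, assuming the same stream object:

let fullText = '';
for await (const chunk of stream) {
  // delta.content is absent on the final chunk, hence the fallback to ''
  fullText += chunk.choices[0]?.delta?.content ?? '';
}
console.log(fullText);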
The final chunk includes usage information:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1677858242,
  "model": "claude-3-5-sonnet-20241022",
  "system_fingerprint": "fp_44709d6f",
  "choices": [
    {
      "index": 0,
      "delta": {},
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}
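Since usage appears only on that final chunk, token accounting from a stream means checking each chunk for the field. A sketch, again assuming the same stream object:

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
  // Only the final chunk carries usage, per the format above
  if (chunk.usage) {
    console.log(`\nTotal tokens: ${chunk.usage.total_tokens}`);
  }
}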
The stream ends with:
data: [DONE]

Response Fields

id
string
Unique identifier for the completion.

object
string
Available options: chat.completion (chat.completion.chunk for streamed chunks).

created
integer
Unix timestamp of creation.

model
string
Model used for completion.
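If you are not using the SDK, the stream can be consumed directly as server-sent events. The sketch below assumes standard OpenAI-style SSE framing, where each event arrives as a data: line and the stream terminates with the [DONE] sentinel shown above:

const res = await fetch('https://api.xtrix.workers.dev/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY',
  },
  body: JSON.stringify({
    model: 'claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: 'Tell me a story' }],
    stream: true,
  }),
});

const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let finished = false;

while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are newline-delimited; keep any partial line for the next read
  const lines = buffer.split('\n');
  buffer = lines.pop() ?? '';
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice(6);
    if (payload === '[DONE]') { finished = true; break; }
    const chunk = JSON.parse(payload);
    process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
  }
}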