API Endpoint
Endpoint
The endpoint is OpenAI-compatible. If you are using the OpenAI client, simply switch the base URL to
https://api.flock.io/v1
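For example, with the OpenAI Python client the only change is the base URL and the API key. A minimal sketch; the model ID and key are the placeholders from the example request below, and sending the key both as the client's api_key and as the x-litellm-api-key header is an assumption about what the gateway accepts:

from openai import OpenAI

# Point the standard OpenAI client at the FLock endpoint.
client = OpenAI(
    base_url="https://api.flock.io/v1",
    api_key="sk-your-api-key",
    # Also send the header used in the curl example below, in case the
    # gateway expects it rather than the standard Authorization header.
    default_headers={"x-litellm-api-key": "sk-your-api-key"},
)

response = client.chat.completions.create(
    model="qwen3-30b-a3b-instruct-2507",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, can you help me?"},
    ],
)
print(response.choices[0].message.content)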
Example Request
curl -X POST 'https://api.flock.io/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-api-key: sk-your-api-key' \
  -d '{
    "model": "qwen3-30b-a3b-instruct-2507",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, can you help me?"}
    ]
  }'
Request Body Parameters
Required
model
string
The ID of the model to use. Use the List Models API to see all available models.
messages
array
A list of messages making up the conversation so far. Each message is an object with a role (system, user, or assistant) and its content, as shown in the example request above.
Optional
best_of
integer
Defaults to 1. Generates multiple completions server-side and returns the one with the highest log probability. Not supported with stream.
frequency_penalty
number
Defaults to 0. Range: -2.0 to 2.0. Positive values penalize repetition.
logit_bias
map
Defaults to null. Adjusts the likelihood of specific tokens. Example: {"50256": -100} prevents the <|endoftext|> token.
logprobs
integer
Defaults to null. Returns log probabilities for the top n tokens. Maximum: 5.
max_tokens
integer
Defaults to 16. Maximum number of tokens to generate in the completion.
n
integer
Defaults to 1. Number of completions to generate.
presence_penalty
number
Defaults to 0. Range: -2.0 to 2.0. Positive values encourage new topics.
seed
integer
If provided, makes sampling deterministic when possible.
stop
string | array
Defaults to null. Up to 4 sequences that will stop token generation.
stream
boolean
Defaults to false. If true, responses are streamed as server-sent events (see the streaming sketch in the Response section below).
stream_options
object
Only used when stream: true.
temperature
number
Defaults to 1. Range: 0–2. Higher values produce more random output.
top_p
number
Defaults to 1. Nucleus sampling alternative to temperature.
user
string
Unique identifier for the end-user.
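As an illustration, several of the optional parameters above can be combined in a single request. A sketch using the OpenAI Python client; the parameter values are illustrative, not recommendations:

from openai import OpenAI

client = OpenAI(base_url="https://api.flock.io/v1", api_key="sk-your-api-key")

response = client.chat.completions.create(
    model="qwen3-30b-a3b-instruct-2507",
    messages=[{"role": "user", "content": "Explain nucleus sampling in one sentence."}],
    temperature=0.7,       # 0-2; higher values give more random output
    top_p=1,               # nucleus sampling alternative to temperature
    max_tokens=128,        # cap the completion length
    stop=["\n\n"],         # up to 4 stop sequences
    presence_penalty=0.5,  # positive values encourage new topics
    n=1,                   # number of completions to generate
    user="end-user-1234",  # unique identifier for the end-user
)
print(response.choices[0].message.content)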
Response
Returns a completion object, or a sequence of completion objects if streaming is enabled.
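When stream is true, the completion arrives as a sequence of chunks rather than a single object. A sketch of consuming the stream with the OpenAI Python client (the chunk layout follows that client's streaming interface):

from openai import OpenAI

client = OpenAI(base_url="https://api.flock.io/v1", api_key="sk-your-api-key")

stream = client.chat.completions.create(
    model="qwen3-30b-a3b-instruct-2507",
    messages=[{"role": "user", "content": "Hello, can you help me?"}],
    stream=True,
)
# Each server-sent event becomes a chunk; print the incremental text as it arrives.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()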
Notes
Use max_tokens and stop wisely to control output length.
For deterministic results, combine seed with fixed parameters.
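For example, a reproducibility sketch that combines seed with fixed sampling parameters (values are illustrative; determinism is best-effort, as noted for seed above):

from openai import OpenAI

client = OpenAI(base_url="https://api.flock.io/v1", api_key="sk-your-api-key")

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="qwen3-30b-a3b-instruct-2507",
        messages=[{"role": "user", "content": prompt}],
        seed=42,        # fixed seed for repeatable sampling when possible
        temperature=0,  # remove sampling randomness
        max_tokens=64,  # bound output length
    )
    return response.choices[0].message.content

# Identical calls should return the same text when the backend can honour the seed.
print(ask("Name one prime number."))
print(ask("Name one prime number."))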