API Endpoint
Endpoint
The endpoint is OpenAI-compatible: if you are using the OpenAI client, simply switch the base URL to
https://api.flock.io/v1
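Because the endpoint mirrors the OpenAI API, any OpenAI-compatible client works. The sketch below uses only the Python standard library to build the same request as the curl example that follows; the key, model name, and messages are the placeholders from that example.

```python
import json
import urllib.request

API_BASE = "https://api.flock.io/v1"

def build_chat_request(api_key, model, messages, **params):
    """Build a POST request for /chat/completions with the documented headers."""
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "x-litellm-api-key": api_key,
        },
        method="POST",
    )

req = build_chat_request(
    "sk-your-api-key",
    "qwen3-30b-a3b-instruct-2507",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, can you help me?"},
    ],
)
# with urllib.request.urlopen(req) as resp:   # uncomment to actually send
#     print(json.load(resp))
```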
Example Request
curl -X POST 'https://api.flock.io/v1/chat/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'x-litellm-api-key: sk-your-api-key' \
-d '{
"model": "qwen3-30b-a3b-instruct-2507",
"stream": true,
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello, can you help me?"}
]
}'
Request Body Parameters
Required
model
string. The ID of the model to use. Use the List Models API to see all available models.
messages
array. The list of messages comprising the conversation so far. Each message is an object with a role (system, user, or assistant) and content, as in the example above.
Optional
best_of
integer. Defaults to 1. Generates multiple completions server-side and returns the one with the highest log probability. Not supported with stream.
frequency_penalty
number. Defaults to 0. Range: -2.0 to 2.0. Positive values penalize repetition.
logit_bias
map. Defaults to null. Adjusts the likelihood of specific tokens. Example: {"50256": -100} prevents <|endoftext|> from being generated.
logprobs
integer. Defaults to null. Returns log probabilities for the top n tokens. Max: 5.
max_tokens
integer. Defaults to 16. Maximum number of tokens to generate in the completion.
n
integer. Defaults to 1. Number of completions to generate.
presence_penalty
number. Defaults to 0. Range: -2.0 to 2.0. Positive values encourage the model to introduce new topics.
seed
integer. If provided, makes sampling deterministic when possible.
stop
string | array. Defaults to null. Up to 4 sequences at which the model will stop generating tokens.
stream
boolean. Defaults to false. If true, responses are streamed as server-sent events.
stream_options
object. Only used when stream: true.
temperature
number. Defaults to 1. Range: 0–2. Higher values produce more random output.
top_p
number. Defaults to 1. Nucleus sampling; an alternative to temperature.
user
string. A unique identifier for the end user.
Response
Returns a completion object, or a sequence of completion objects if streaming is enabled.
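When streaming is enabled, each chunk arrives as a "data:" line terminated by "data: [DONE]". A minimal parser sketch, assuming the stream follows the standard OpenAI server-sent-events chunk shape:

```python
import json

def collect_stream(lines):
    """Concatenate the content deltas from an SSE chat-completion stream."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Example with two synthetic chunks:
fake = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream(fake))  # -> Hello!
```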
Notes
Use max_tokens and stop wisely to control output length. For deterministic results, combine seed with fixed parameters.
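As an illustration of the notes above, a request body that bounds output length and pins the sampling parameters for reproducibility (the stop sequence and parameter values are illustrative, not recommendations):

```python
import json

body = {
    "model": "qwen3-30b-a3b-instruct-2507",
    "messages": [{"role": "user", "content": "Name three primary colors."}],
    "max_tokens": 64,   # hard cap on completion length
    "stop": ["\n\n"],   # also stop at the first blank line
    "seed": 42,         # fixed seed for determinism when possible
    "temperature": 0,   # remove sampling randomness
}
payload = json.dumps(body)
```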