API Endpoint

Endpoint

The endpoint is OpenAI-compatible. If you are using the OpenAI client, simply switch the base URL to

https://api.flock.io/v1
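
With the official OpenAI Python SDK, nothing else needs to change. A minimal sketch; the key value is a placeholder, and the x-litellm-api-key header mirrors the curl example below:

from openai import OpenAI

# Point the standard OpenAI client at the FLock endpoint. The curl example
# below authenticates via the x-litellm-api-key header, so we pass it
# explicitly in addition to the SDK's required api_key field.
client = OpenAI(
    base_url="https://api.flock.io/v1",
    api_key="sk-your-api-key",
    default_headers={"x-litellm-api-key": "sk-your-api-key"},
)

response = client.chat.completions.create(
    model="qwen3-30b-a3b-instruct-2507",
    messages=[{"role": "user", "content": "Hello, can you help me?"}],
)
print(response.choices[0].message.content)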


Example Request

curl -X POST 'https://api.flock.io/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-api-key: sk-your-api-key' \
  -d '{
    "model": "qwen3-30b-a3b-instruct-2507",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, can you help me?"}
    ]
  }'
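
The stream: true flag above asks for server-sent events. With the client configured as in the earlier sketch, the same request can be consumed chunk by chunk; the delta fields follow the standard OpenAI streaming schema:

# Reuses the `client` from the setup sketch above. Each chunk carries a
# delta with the next fragment of the assistant's reply.
stream = client.chat.completions.create(
    model="qwen3-30b-a3b-instruct-2507",
    stream=True,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, can you help me?"},
    ],
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)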

Request Body Parameters

Required

  • model string

    • The ID of the model to use.

    • Use the [List Models API] to see all available models.

  • messages array

    • A list of messages comprising the conversation so far.

    • Each message is an object with a role (system, user, or assistant) and content, as in the example request above; see the sketch after this list for a multi-turn conversation.
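
For illustration, a multi-turn conversation is just a longer messages array; this sketch alternates user and assistant turns:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is nucleus sampling?"},
    {"role": "assistant", "content": "It samples only from the smallest set of tokens whose cumulative probability exceeds top_p."},
    {"role": "user", "content": "How does that differ from temperature?"},
]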

Optional

  • best_of integer

    • Defaults to 1.

    • Generates multiple completions server-side and returns the one with the highest log probability.

    • Not supported with stream.

  • frequency_penalty number

    • Defaults to 0.

    • Range: -2.0 to 2.0.

    • Positive values penalize tokens in proportion to how often they have already appeared, reducing verbatim repetition.

  • logit_bias map

    • Defaults to null.

    • Adjust likelihood of specific tokens.

    • Example: {"50256": -100} prevents <|endoftext|>.

  • logprobs integer

    • Defaults to null.

    • Returns log probabilities for top n tokens.

    • Max: 5.

  • max_tokens integer

    • Defaults to 16.

    • Maximum tokens to generate in the completion.

  • n integer

    • Defaults to 1.

    • Number of completions to generate.

  • presence_penalty number

    • Defaults to 0.

    • Range: -2.0 to 2.0.

    • Positive values penalize tokens that have already appeared, encouraging the model to move on to new topics.

  • seed integer

    • If provided, makes sampling deterministic when possible, so repeated requests with identical parameters return the same output (see the sketch after this list).

  • stop string | array

    • Defaults to null.

    • Up to 4 sequences that will stop token generation.

  • stream boolean

    • Defaults to false.

    • If true, partial responses are streamed back as server-sent events, as in the example request above.

  • stream_options object

    • Additional options for streaming (for example, include_usage to return token counts in the final chunk). Only set when stream: true.

  • temperature number

    • Defaults to 1.

    • Range: 0–2.

    • Higher values make output more random; lower values make it more focused and deterministic.

  • top_p number

    • Defaults to 1.

    • Nucleus sampling: the model considers only the tokens comprising the top_p probability mass. An alternative to temperature; alter one or the other, generally not both.

  • user string

    • Unique identifier for the end-user.
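
As referenced in the seed entry above, here is a sketch combining several of these options; the parameter values are illustrative, and `client` is the one configured in the setup sketch:

# Exercise several optional parameters in one request.
response = client.chat.completions.create(
    model="qwen3-30b-a3b-instruct-2507",
    messages=[{"role": "user", "content": "Name three sorting algorithms."}],
    temperature=0.2,   # low randomness
    max_tokens=128,    # cap completion length
    stop=["\n\n"],     # stop at the first blank line
    seed=42,           # best-effort determinism across runs
)
print(response.choices[0].message.content)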


Response

Returns a completion object, or a sequence of completion objects if streaming is enabled.
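
A non-streaming completion object follows the standard OpenAI chat completion shape. A representative, illustrative example; exact IDs and token counts will differ:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1720000000,
  "model": "qwen3-30b-a3b-instruct-2507",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 18, "completion_tokens": 9, "total_tokens": 27}
}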


Notes

  • Use max_tokens and stop wisely to control output length.

  • For deterministic results, combine seed with fixed parameters.
