# API Endpoint

### Endpoint

The endpoint is OpenAI-compatible: if you are using the OpenAI client, simply switch the base URL to

`https://api.flock.io/v1`
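
Because the endpoint is OpenAI-compatible, any HTTP client works. A minimal stdlib-only sketch that assembles (but does not send) a chat completion request; the `x-litellm-api-key` header name is taken from the curl example below:

```python
import json
import urllib.request

BASE_URL = "https://api.flock.io/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Assemble an OpenAI-compatible chat completion request (not sent yet)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-litellm-api-key": api_key,  # header name from the curl example
        },
        method="POST",
    )

req = build_chat_request(
    "sk-your-api-key",
    "qwen3-30b-a3b-instruct-2507",
    [{"role": "user", "content": "Hello, can you help me?"}],
)
# urllib.request.urlopen(req) would send it; omitted here to keep the sketch offline.
```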

***

### Example Request

```bash
curl -X POST 'https://api.flock.io/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-api-key: sk-your-api-key' \
  -d '{
    "model": "qwen3-30b-a3b-instruct-2507",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, can you help me?"}
    ]
  }'
```
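
With `"stream": true`, the body arrives as server-sent events. Assuming the standard OpenAI streaming format (`data: {json}` lines terminated by a `data: [DONE]` sentinel), a minimal line parser sketch:

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line into a chunk dict; None for blanks, comments, or [DONE]."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(payload)

# Example chunk in the OpenAI streaming shape (assumed, not confirmed for this API):
chunk = parse_sse_line('data: {"choices": [{"delta": {"content": "Hi"}}]}')
```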

***

### Request Body Parameters

#### Required

* **model** `string`
  * The ID of the model to use.
  * Use the List Models API to see all available models.
* **messages** `array`
  * The list of messages comprising the conversation so far.
  * Each message is an object with a `role` (`system`, `user`, or `assistant`) and a `content` string, as in the example above.

#### Optional

* **best\_of** `integer`
  * Defaults to `1`.
  * Generates multiple completions server-side and returns the one with the highest log probability.
  * Not supported with `stream`.
* **frequency\_penalty** `number`
  * Defaults to `0`.
  * Range: -2.0 to 2.0.
  * Positive values penalize repetition.
* **logit\_bias** `map`
  * Defaults to `null`.
  * Adjust likelihood of specific tokens.
  * Example: `{"50256": -100}` prevents `<|endoftext|>`.
* **logprobs** `integer`
  * Defaults to `null`.
  * Returns log probabilities for top `n` tokens.
  * Max: `5`.
* **max\_tokens** `integer`
  * Defaults to `16`.
  * Maximum tokens to generate in the completion.
* **n** `integer`
  * Defaults to `1`.
  * Number of completions to generate.
* **presence\_penalty** `number`
  * Defaults to `0`.
  * Range: -2.0 to 2.0.
  * Positive values encourage new topics.
* **seed** `integer`
  * If provided, makes sampling deterministic when possible.
* **stop** `string | array`
  * Defaults to `null`.
  * Up to 4 sequences that will stop token generation.
* **stream** `boolean`
  * Defaults to `false`.
  * If `true`, responses are streamed as server-sent events.
* **stream\_options** `object`
  * Only used when `stream: true`.
  * For example, `{"include_usage": true}` adds token usage to the final chunk.
* **temperature** `number`
  * Defaults to `1`.
  * Range: 0–2.
  * Higher values = more random output.
* **top\_p** `number`
  * Defaults to `1`.
  * Nucleus sampling alternative to `temperature`.
* **user** `string`
  * Unique identifier for the end-user.
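
The documented ranges above can be checked client-side before a request is sent. A small helper sketch; the validation logic is illustrative, not part of the API:

```python
def validate_params(params: dict) -> list:
    """Collect violations of the documented parameter ranges; empty list means OK."""
    errors = []
    t = params.get("temperature")
    if t is not None and not 0 <= t <= 2:
        errors.append("temperature must be in [0, 2]")
    for key in ("frequency_penalty", "presence_penalty"):
        v = params.get(key)
        if v is not None and not -2.0 <= v <= 2.0:
            errors.append(f"{key} must be in [-2.0, 2.0]")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        errors.append("stop accepts at most 4 sequences")
    return errors
```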

***

### Response

Returns a **chat completion object**, or a stream of chat completion chunk objects if streaming is enabled.
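
In the OpenAI-compatible shape, the non-streaming reply text typically sits at `choices[0].message.content`. A minimal accessor sketch; the field names are assumed from the OpenAI format, not confirmed for this API:

```python
def extract_text(completion: dict) -> str:
    """Pull the assistant reply out of a chat completion object."""
    return completion["choices"][0]["message"]["content"]

# Abbreviated sample in the assumed shape:
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ]
}
```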

***

### Notes

* Use `max_tokens` and `stop` wisely to control output length.
* For deterministic results, combine `seed` with fixed parameters.
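
Putting the two notes together, a request body aimed at reproducible, bounded output might look like this (all values here are illustrative):

```python
# Fixed seed plus temperature 0 for determinism; max_tokens and stop bound the length.
deterministic_body = {
    "model": "qwen3-30b-a3b-instruct-2507",
    "messages": [{"role": "user", "content": "List three prime numbers."}],
    "seed": 42,
    "temperature": 0,
    "max_tokens": 64,
    "stop": ["\n\n"],
}
```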
