# API Endpoint

### Endpoint

The endpoint is OpenAI-compatible. If you are using an OpenAI client, simply switch the base URL to

`https://api.flock.io/v1`
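As a minimal sketch of what "OpenAI-compatible" means in practice, the request below is built with only the Python standard library against this base URL; it mirrors the curl example on this page, passing the key in the `x-litellm-api-key` header. (The helper name `build_chat_request` is ours, not part of the API.)

```python
import json
import urllib.request

BASE_URL = "https://api.flock.io/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions route."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "x-litellm-api-key": api_key,  # key header as used in the curl example
        },
        method="POST",
    )

# Sending it once built:
# with urllib.request.urlopen(build_chat_request(...)) as resp:
#     print(json.load(resp))
```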

***

### Example Request

```bash
curl -X POST 'https://api.flock.io/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-api-key: sk-your-api-key' \
  -d '{
    "model": "qwen3-30b-a3b-instruct-2507",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, can you help me?"}
    ]
  }'
```
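Because the example sets `stream: true`, the response arrives as server-sent events: each event line is prefixed with `data: ` and carries a JSON chunk, and the stream ends with a `data: [DONE]` sentinel. This is the standard OpenAI streaming format; the exact chunk fields (`choices[0].delta.content`) are an assumption from that schema, not confirmed by this page. A sketch of collecting the streamed text:

```python
import json

def collect_stream(lines):
    """Concatenate content deltas from OpenAI-style SSE lines."""
    out = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        out.append(delta.get("content", ""))  # first chunk may carry only the role
    return "".join(out)
```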

***

### Request Body Parameters

#### Required

* **model** `string`
  * The ID of the model to use.
  * Use the List Models API to see all available models.
* **messages** `array`
  * A list of messages comprising the conversation so far.
  * Each message is an object with a `role` (`system`, `user`, or `assistant`) and `content`, as shown in the example request above.

#### Optional

* **best\_of** `integer`
  * Defaults to `1`.
  * Generates multiple completions server-side and returns the one with the highest log probability.
  * Not supported with `stream`.
* **frequency\_penalty** `number`
  * Defaults to `0`.
  * Range: -2.0 to 2.0.
  * Positive values penalize repetition.
* **logit\_bias** `map`
  * Defaults to `null`.
  * Adjust likelihood of specific tokens.
  * Example: `{"50256": -100}` prevents `<|endoftext|>`.
* **logprobs** `integer`
  * Defaults to `null`.
  * Returns log probabilities for top `n` tokens.
  * Max: `5`.
* **max\_tokens** `integer`
  * Defaults to `16`.
  * Maximum tokens to generate in the completion.
* **n** `integer`
  * Defaults to `1`.
  * Number of completions to generate.
* **presence\_penalty** `number`
  * Defaults to `0`.
  * Range: -2.0 to 2.0.
  * Positive values encourage new topics.
* **seed** `integer`
  * If provided, makes sampling deterministic when possible.
* **stop** `string | array`
  * Defaults to `null`.
  * Up to 4 sequences that will stop token generation.
* **stream** `boolean`
  * Defaults to `false`.
  * If `true`, responses are streamed as server-sent events.
* **stream\_options** `object`
  * Only used when `stream: true`.
* **temperature** `number`
  * Defaults to `1`.
  * Range: 0–2.
  * Higher values = more random output.
* **top\_p** `number`
  * Defaults to `1`.
  * Nucleus sampling alternative to `temperature`.
* **user** `string`
  * Unique identifier for the end-user.
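The documented ranges above can be checked client-side before a request is sent. The sketch below simply encodes the constraints from this list; the helper name is ours, not part of the API:

```python
def validate_sampling(params: dict) -> dict:
    """Check optional sampling params against the documented ranges."""
    checks = {
        "frequency_penalty": lambda v: -2.0 <= v <= 2.0,
        "presence_penalty": lambda v: -2.0 <= v <= 2.0,
        "temperature": lambda v: 0 <= v <= 2,
        "logprobs": lambda v: 0 <= v <= 5,  # max 5 per the docs
    }
    for name, ok in checks.items():
        if name in params and not ok(params[name]):
            raise ValueError(f"{name}={params[name]} is out of range")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")
    return params
```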

***

### Response

Returns a **completion object**, or a sequence of completion objects if streaming is enabled.
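Assuming the standard OpenAI response shape (a `choices` array whose items carry a `message`), the assistant's reply is typically read like this; the field names come from the OpenAI schema and are not confirmed by this page:

```python
def first_reply(completion: dict) -> str:
    """Pull the assistant text out of a non-streamed completion object."""
    return completion["choices"][0]["message"]["content"]
```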

***

### Notes

* Use `max_tokens` and `stop` wisely to control output length.
* For deterministic results, combine `seed` with fixed parameters.


***

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.flock.io/flock-products/api-platform/api-endpoint.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
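Building the `ask` URL requires percent-encoding the question; with the standard library:

```python
from urllib.parse import urlencode

PAGE_URL = "https://docs.flock.io/flock-products/api-platform/api-endpoint.md"

def ask_url(question: str) -> str:
    """Return the page URL with the question attached as the `ask` parameter."""
    return f"{PAGE_URL}?{urlencode({'ask': question})}"
```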
