# Task Lifecycle Deep-dive

<figure><img src="https://742781353-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F1RpcbvTSHzzPwOSUvgKU%2Fuploads%2FcUILw5P8yTjd5Gykj3mc%2Fai-arena-fig3.png?alt=media&#x26;token=661a74b7-368a-480d-8ebf-1634f5342afd" alt=""><figcaption><p>Figure 3. Workflow of an AI Arena Task.</p></figcaption></figure>

## 1. Task Creation

Task creation is the first stage of the training cycle. Task creators define the desired models and submit tasks to the platform.

To qualify as a task creator, users must meet one or more of the following criteria:

- Stake a sufficient amount of FLOCK
- Have successfully trained or validated a task previously, as evidenced by on-chain records
- Possess a reputation in the ML space or be recognised as a domain expert in relevant fields, as verified by the FLock community

## 2. Training Node and Validator Selection

Each participant must stake FLOCK to take part as either a training node or a validator. In addition, rate limiting determines how many times a participant can be eligible as a validator for a given task. Essentially, the likelihood of a participant being selected to validate a task submission increases with their stake, but the rate at which validation frequency grows diminishes as the staking amount increases.
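
The documentation does not specify the exact weighting curve, so the snippet below is a minimal sketch that assumes selection probability proportional to a concave function of stake (a square root here), which reproduces the diminishing returns described above. The `selection_probabilities` function and the square-root choice are illustrative assumptions, not FLock's actual mechanism.

```python
import numpy as np

def selection_probabilities(stakes: np.ndarray) -> np.ndarray:
    """Toy stake-weighted validator selection with diminishing returns.

    A concave weight (square root) means doubling a stake less than
    doubles the selection probability -- an illustrative choice,
    not FLock's actual curve.
    """
    weights = np.sqrt(stakes)
    return weights / weights.sum()

stakes = np.array([100.0, 400.0, 1600.0])  # hypothetical FLOCK stakes
print(selection_probabilities(stakes))     # [0.143 0.286 0.571]: 16x the stake -> only 4x the probability
```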

## 3. Training

Each training node is given $$\mathcal{D}_{\text{local}}$$, which contains locally sourced data samples comprising a feature set $$X$$ and a label set $$Y$$, with each sample $$x_i \in X$$ corresponding to a label $$y_i \in Y$$. The goal of training is to define a predictive model $$f$$ that learns patterns within $$\mathcal{D}_{\text{local}}$$ such that $$f(x_i) \approx y_i$$.

To quantify the success (i.e. the predictive ability) of the model $$f$$, we introduce a loss function $$L(f(x_i), y_i)$$ that assesses the discrepancy between predictions $$f(x_i)$$ and actual labels $$y_i$$. A generic expression for this function is $$L = \frac{1}{N} \sum_{i=1}^{N} l(f(x_i), y_i)$$, where $$N$$ denotes the total sample count and $$l$$ is a problem-specific loss function, e.g. mean squared error or cross-entropy loss.

Ultimately, the optimisation goal of training is to adjust the model parameters $$\theta$$ to minimise $$L$$, typically through algorithms such as gradient descent.
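
As a minimal sketch of this optimisation loop, the example below fits a one-parameter linear model by gradient descent on the mean-squared-error instance of $$L$$. The dataset, learning rate, and iteration count are illustrative values, not part of the protocol.

```python
import numpy as np

# Toy local dataset D_local: features X and labels Y (roughly y = 3x).
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.array([0.1, 2.9, 6.2, 8.8])

theta = 0.0  # single parameter of the linear model f(x) = theta * x
lr = 0.01    # learning rate (illustrative value)

for _ in range(500):
    preds = theta * X                    # f(x_i)
    loss = np.mean((preds - Y) ** 2)     # L = (1/N) * sum of l(f(x_i), y_i) with l = squared error
    grad = np.mean(2 * (preds - Y) * X)  # dL/dtheta
    theta -= lr * grad                   # gradient descent update

print(theta)  # converges near 3, minimising L
```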

## 4. Validation

After the training node produces a trained model $$\theta^{task}_p$$, a selected group of validators, denoted $$V_j \in V$$ and each equipped with the evaluation dataset $$\mathcal{D}_{\text{eval}}$$ from the task creator, will validate the model. The dataset consists of pairs $$(x_i, y_i)$$, where $$x_i$$ represents the features of the $$i$$-th sample and $$y_i$$ is the corresponding true label.

To assess the performance of the trained model, we use a general evaluation metric (accuracy here), calculated as follows:

$$
\mathrm{eval}(\theta^{task}_p, \mathcal{D}_{\text{eval}}) = \frac{1}{|\mathcal{D}_{\text{eval}}|} \sum_{(x_i, y_i) \in \mathcal{D}_{\text{eval}}} \mathbf{1}(\hat{y}_i = y_i)
$$

Here, $$\mathbf{1}$$ represents the indicator function, which returns 1 if the predicted label $$\hat{y}_i$$ matches the true label $$y_i$$ and $$0$$ otherwise, while $$|\mathcal{D}_{\text{eval}}|$$ denotes the total number of samples within the evaluation dataset.

Each predicted label $$\hat{y}_i$$ from the model $$\theta^{task}_p$$ is compared against its corresponding true label $$y_i$$ within the dataset $$\mathcal{D}_{\text{eval}}$$. The calculated metric result (accuracy here) serves as a quantifiable measure of $$\theta^{task}_p$$'s effectiveness at label prediction across the evaluation dataset.
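
A minimal sketch of this comparison, mirroring the indicator-function average above; the predictions and labels are hypothetical placeholders.

```python
import numpy as np

def evaluate(y_hat: np.ndarray, y_true: np.ndarray) -> float:
    """Accuracy over D_eval: the mean of the indicator 1(y_hat_i == y_i)."""
    return float(np.mean(y_hat == y_true))

# Hypothetical predictions from theta^task_p against the true labels.
y_hat = np.array([1, 0, 1, 1, 0])
y_true = np.array([1, 0, 0, 1, 0])
print(evaluate(y_hat, y_true))  # 0.8 -- 4 of 5 predictions match
```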


