# Task Lifecycle Deep-dive

<figure><img src="https://742781353-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F1RpcbvTSHzzPwOSUvgKU%2Fuploads%2FcUILw5P8yTjd5Gykj3mc%2Fai-arena-fig3.png?alt=media&#x26;token=661a74b7-368a-480d-8ebf-1634f5342afd" alt=""><figcaption><p>Figure 3. Workflow of an AI Arena task.</p></figcaption></figure>

## 1. Task Creation

Task creation is the first stage of the training cycle. Task creators define the desired models and submit tasks to the platform.

To qualify as a task creator, users must meet one or more of the following criteria:

* Stake a sufficient amount of FLOCK
* Have successfully trained or validated a task previously, as evidenced by on-chain records
* Possess a reputation in the ML space or be recognised as a domain expert in relevant fields, as verified by the FLock community

## 2. Training Node and Validator Selection

Each participant must stake FLOCK in order to join a task, either as a training node or as a validator. In addition, rate limiting caps how often a participant can be selected as a validator for a given task. The likelihood of being selected to validate a task submission increases with the participant's stake, but the marginal gain in validation frequency diminishes as the stake grows; a sketch of such a selection rule follows below.
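The exact weighting curve is protocol-defined and not specified here; as a minimal sketch, the snippet below assumes a square-root weighting to model the diminishing returns. The names `selection_weight` and `select_validators` are illustrative, not part of the protocol.

```python
import math
import random

def selection_weight(stake: float) -> float:
    """Concave weighting: selection probability grows with stake,
    but with diminishing returns (square root is one illustrative choice)."""
    return math.sqrt(stake)

def select_validators(stakes: dict, k: int, seed=None) -> list:
    """Sample k distinct validators, weighted by selection_weight(stake)."""
    rng = random.Random(seed)
    pool = dict(stakes)
    chosen = []
    for _ in range(min(k, len(pool))):
        nodes = list(pool)
        weights = [selection_weight(pool[n]) for n in nodes]
        pick = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen

# A 9x larger stake yields only a 3x larger selection weight.
print(select_validators({"v1": 100.0, "v2": 400.0, "v3": 900.0}, k=2, seed=42))
```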

## 3. Training

Each training node is given $$\mathcal{D}_{\text{local}}$$, which contains locally sourced data samples comprising a feature set $$X$$ and a label set $$Y$$, with each sample $$x_i \in X$$ corresponding to a label $$y_i \in Y$$. The goal of training is to learn a predictive model $$f$$ that captures patterns within $$\mathcal{D}_{\text{local}}$$ such that $$f(x_i) \approx y_i$$.

To quantify the success (i.e. the ability to predict) of the predictive model $$f$$, we introduce a loss function $$L(f(x_i), y_i)$$, assessing the discrepancy between predictions $$f(x_i)$$ and actual labels $$y_i$$. A generic expression for this function is:

$$
L = \frac{1}{N} \sum_{i=1}^{N} l(f(x_i), y_i)
$$

where $$N$$ denotes the total sample count, and $$l$$ signifies a problem-specific per-sample loss, e.g., squared error or cross-entropy loss.
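As a concrete illustration, the formula transcribes directly to code. The helper names below are our own, and the two per-sample losses shown are just the examples mentioned above:

```python
import numpy as np

def mean_loss(predictions, labels, l) -> float:
    """L = (1/N) * sum_i l(f(x_i), y_i): per-sample losses averaged over N samples."""
    return float(np.mean([l(p, y) for p, y in zip(predictions, labels)]))

def squared_error(pred, true):
    """Per-sample loss for regression."""
    return (pred - true) ** 2

def binary_cross_entropy(pred_prob, true, eps=1e-12):
    """Per-sample loss for binary classification; pred_prob is P(class = 1)."""
    p = min(max(pred_prob, eps), 1 - eps)  # clip to avoid log(0)
    return -(true * np.log(p) + (1 - true) * np.log(1 - p))

print(mean_loss([0.9, 0.2, 0.7], [1, 0, 1], binary_cross_entropy))
```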

Ultimately, the optimisation goal of training is to adjust the model parameters $$\theta$$ to minimise $$L$$, typically through algorithms such as gradient descent.
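A self-contained toy run of this optimisation, assuming a linear model and a squared-error loss purely for illustration (real training nodes would use task-specific models and optimisers):

```python
import numpy as np

# Fit a linear model f(x) = x @ theta by gradient descent on the MSE loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))             # feature set X
true_theta = np.array([2.0, -1.0, 0.5])
Y = X @ true_theta                        # label set Y

theta = np.zeros(3)                       # model parameters theta
lr = 0.1                                  # learning rate
for _ in range(200):
    residual = X @ theta - Y              # f(x_i) - y_i for every sample
    grad = (2 / len(X)) * X.T @ residual  # gradient of L w.r.t. theta
    theta -= lr * grad                    # gradient descent update

print(theta)  # converges towards [2.0, -1.0, 0.5]
```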

## 4. Validation

After the training node produces a trained model $$\theta^{task}_p$$, a selected group of validators, denoted $$V_j \in V$$, each equipped with the evaluation dataset $$\mathcal{D}_{\text{eval}}$$ from the task creator, will validate the model. The dataset consists of pairs $$(x_i, y_i)$$, where $$x_i$$ represents the features of the $$i$$-th sample and $$y_i$$ is the corresponding true label.

To assess the performance of the trained model, we use a general evaluation metric, calculated as follows:

$$
\text{eval}(\theta^{task}_p, \mathcal{D}_{\text{eval}}) = \frac{1}{|\mathcal{D}_{\text{eval}}|} \sum_{(x_i, y_i) \in \mathcal{D}_{\text{eval}}} \mathbf{1}(\hat{y}_i = y_i)
$$

Here, $$\mathbf{1}$$ represents the indicator function that returns $$1$$ if the predicted label $$\hat{y}_i$$ matches the true label $$y_i$$, and $$0$$ otherwise, while $$|\mathcal{D}_{\text{eval}}|$$ denotes the total number of samples within the evaluation dataset.

Each predicted label $$\hat{y}_i$$ from the model $$\theta^{task}_p$$ is compared against its corresponding true label $$y_i$$ within the dataset $$\mathcal{D}_{\text{eval}}$$. The resulting metric (accuracy, in this case) serves as a quantifiable measure of $$\theta^{task}_p$$'s effectiveness at label prediction across the evaluation dataset.
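A minimal sketch of this accuracy computation, with `model` standing in for $$\theta^{task}_p$$ and `eval_set` for $$\mathcal{D}_{\text{eval}}$$ (both names are illustrative):

```python
def evaluate(model, eval_set) -> float:
    """Accuracy over D_eval: the fraction of samples whose predicted
    label y_hat_i equals the true label y_i (the indicator 1(.))."""
    correct = sum(1 for x, y in eval_set if model(x) == y)
    return correct / len(eval_set)

# Toy check: a threshold classifier standing in for the trained model.
eval_set = [([0.2], 0), ([0.9], 1), ([0.4], 0)]
threshold_model = lambda x: int(x[0] > 0.5)
print(evaluate(threshold_model, eval_set))  # prints 1.0
```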
