Task Lifecycle Deep-dive
This section offers a deep-dive into the lifecycle of an AI Arena task.
Task creation is the first stage of the training cycle. Task creators define the desired models and submit tasks to the platform.
To qualify as a task creator, users must meet one or more of the following criteria:
• Stake a sufficient amount of $FML
• Have successfully trained or validated a task previously, as evidenced by on-chain records
• Possess a reputation in the ML space or be recognised as a domain expert in relevant fields, as verified by the FLock community
Every participant must stake $FML to take part in a task, either as a training node or as a validator. Rate limiting determines how many times a participant can be selected as a validator for a given task: the likelihood of being chosen to validate a submission increases with the participant's stake, but the marginal gain in validation frequency diminishes as the staked amount grows, as illustrated in the sketch below.
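As a rough illustration, the sketch below samples a validator with probability proportional to a sub-linear function of its stake. The square-root weighting and the function names are assumptions chosen for illustration, not the protocol's actual selection curve.

```python
# Illustrative only: stake-weighted validator selection with diminishing returns.
# The square-root weighting is an assumed curve, not FLock's actual formula.
import math
import random

def selection_weights(stakes):
    """Map each participant's stake to a selection weight (sqrt(stake) here)."""
    return {participant: math.sqrt(stake) for participant, stake in stakes.items()}

def pick_validator(stakes):
    """Sample one validator with probability proportional to its weight."""
    weights = selection_weights(stakes)
    participants = list(weights)
    return random.choices(participants, weights=[weights[p] for p in participants], k=1)[0]

# Quadrupling a stake from 100 to 400 only doubles the selection weight,
# so validation frequency grows sub-linearly with the staked amount.
stakes = {"validator_a": 100, "validator_b": 400}
print(pick_validator(stakes))
```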
Each training node $k$ is given a local dataset $D_k$, which contains locally sourced data samples comprising a feature set $X_k$ and a label set $Y_k$, with each sample $x_i \in X_k$ corresponding to a label $y_i \in Y_k$. The goal of training is to define a predictive model $f_k: X_k \rightarrow Y_k$, which learns patterns within $D_k$ such that $f_k(x_i) \approx y_i$.
To quantify the success (i.e. the ability to predict) of the predictive model $f_k$, we introduce a loss function $L(f_k)$, assessing the discrepancy between predictions $f_k(x_i)$ and actual labels $y_i$. A generic expression for this function is:

$$L(f_k) = \frac{1}{n_k} \sum_{i=1}^{n_k} \ell\big(f_k(x_i), y_i\big),$$

where $n_k$ denotes the total sample count in $D_k$, and $\ell$ signifies a problem-specific loss function, e.g., mean squared error or cross-entropy loss.
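The sketch below is a minimal, illustrative computation of $L(f_k)$ over a node's local dataset, assuming the model is a plain Python callable and the per-sample loss $\ell$ is passed in; the function names are hypothetical.

```python
# A minimal sketch of the generic loss L(f_k); function names are illustrative.
def total_loss(model, dataset, per_sample_loss):
    """L(f_k) = (1/n_k) * sum_i per_sample_loss(f_k(x_i), y_i) over D_k."""
    n_k = len(dataset)
    return sum(per_sample_loss(model(x_i), y_i) for x_i, y_i in dataset) / n_k

def squared_error(prediction, label):
    """A problem-specific loss ell: squared error for a scalar target."""
    return (prediction - label) ** 2
```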
Ultimately, the optimisation goal of training is to adjust the model parameters $\theta$ to minimise $L(f_k)$, typically through algorithms such as gradient descent.
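For concreteness, the following sketch shows one gradient-descent update for a linear model trained under mean squared error; the linear form and the learning rate are assumptions made only to keep the update explicit, not the platform's prescribed training procedure.

```python
# Illustrative gradient-descent step for a linear model f(x) = x . theta under MSE.
import numpy as np

def mse_loss(theta, X, y):
    """L(theta) = (1/n) * sum_i (x_i . theta - y_i)^2."""
    return np.mean((X @ theta - y) ** 2)

def gradient_step(theta, X, y, learning_rate=0.01):
    """Single update: theta <- theta - lr * dL/dtheta."""
    n = X.shape[0]
    grad = (2.0 / n) * X.T @ (X @ theta - y)
    return theta - learning_rate * grad
```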
After the training node produces a trained model $f_k$, a selected group of validators, denoted as $V$, each equipped with the evaluation dataset $D_{\text{eval}}$ from the task creator, will validate the model. The dataset consists of pairs $(x_j, y_j)$, where $x_j$ represents the features of the $j$-th sample and $y_j$ is the corresponding true label.
To assess the performance of the trained model, we use a general evaluation metric, accuracy, which is calculated as follows:

$$\text{Acc}(f_k) = \frac{1}{|D_{\text{eval}}|} \sum_{(x_j, y_j) \in D_{\text{eval}}} \mathbb{1}\big[f_k(x_j) = y_j\big]$$

Here, $\mathbb{1}[\cdot]$ represents the indicator function, which returns 1 if the predicted label $f_k(x_j)$ matches the true label $y_j$ and 0 otherwise, and $|D_{\text{eval}}|$ denotes the total number of samples within the evaluation dataset.
Each predicted label $f_k(x_j)$ from the model is compared against its corresponding true label $y_j$ within the dataset $D_{\text{eval}}$. The calculated metric result (accuracy here) serves as a quantifiable measure of $f_k$'s effectiveness at label prediction across the evaluation dataset.
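A minimal sketch of how a validator might compute this accuracy, assuming the evaluation set is a list of $(x_j, y_j)$ pairs and the trained model is a callable:

```python
# Illustrative accuracy computation over the evaluation dataset D_eval.
def accuracy(model, eval_dataset):
    """Acc = (1/|D_eval|) * sum_j 1[model(x_j) == y_j]."""
    correct = sum(1 for x_j, y_j in eval_dataset if model(x_j) == y_j)
    return correct / len(eval_dataset)
```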