Task Lifecycle Deep-dive

This section offers a deep dive into the lifecycle of an AI Arena task.

1. Task Creation

Task creation is the first stage of the training cycle. Task creators define the desired models and submit tasks to the platform.

To qualify as a task creator, users must meet one or more of the following criteria:

• Stake a sufficient amount of $FML

• Have successfully trained or validated a task previously, as evidenced by on-chain records

• Possess a reputation in the ML space or be recognised as a domain expert in relevant fields, as verified by the FLock community

2. Training Node and Validator Selection

Each participant must stake in order to take part as either a training node or a validator. In addition, rate limiting caps the number of times a participant can be selected as a validator for a given task. The likelihood of a participant being selected to validate a task submission increases with their stake, but the rate at which validation frequency grows diminishes as the staked amount increases.
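To illustrate the diminishing-returns relationship between stake and selection frequency, here is a minimal sketch. The concave weighting (square root of stake), the `max_validations` rate limit, and the `pick_validators` helper are illustrative assumptions, not the actual AI Arena selection formula.

```python
import math
import random

def selection_weight(stake: float) -> float:
    """Illustrative concave weighting: the weight grows with stake,
    but with diminishing returns (sqrt is an assumption)."""
    return math.sqrt(stake)

def pick_validators(stakes: dict[str, float], k: int,
                    validations_used: dict[str, int], max_validations: int) -> list[str]:
    """Select up to k validators for one task submission.
    Participants that hit the (hypothetical) per-task rate limit are excluded."""
    eligible = [p for p in stakes if validations_used.get(p, 0) < max_validations]
    weights = [selection_weight(stakes[p]) for p in eligible]
    chosen = []
    # Weighted sampling without replacement.
    while eligible and len(chosen) < k:
        p = random.choices(eligible, weights=weights, k=1)[0]
        i = eligible.index(p)
        eligible.pop(i)
        weights.pop(i)
        chosen.append(p)
    return chosen

# Example: doubling the stake from 100 to 200 raises the weight by only ~41%,
# reflecting the diminishing-returns behaviour described above.
stakes = {"alice": 100.0, "bob": 200.0, "carol": 400.0}
print(pick_validators(stakes, k=2, validations_used={}, max_validations=3))
```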

3. Training

Each training node is given $\mathcal{D}_{\text{local}}$, which contains locally sourced data samples, comprising feature set $X$ and label set $Y$, with each sample $x_i \in X$ corresponding to a label $y_i \in Y$. The goal of training is to define a predictive model $f$, which learns patterns within $\mathcal{D}_{\text{local}}$ such that $f(x_i) \approx y_i$.

To quantify the success (i.e. the ability to predict) of the predictive model $f$, we introduce a loss function $L(f(x_i), y_i)$, assessing the discrepancy between predictions $f(x_i)$ and actual labels $y_i$. A generic expression for this function is:

\begin{equation*} L = \frac{1}{N} \sum_{i=1}^{N} l(f(x_i), y_i) \end{equation*}

where $N$ denotes the total sample count, and $l$ signifies a problem-specific loss function, e.g., mean squared error or cross-entropy loss.

Ultimately, the optimisation goal of training is to adjust the model parameters $\theta$ to minimise $L$, typically through algorithms such as gradient descent.
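As a concrete illustration of this objective, the sketch below minimises the average loss over $\mathcal{D}_{\text{local}}$ with plain gradient descent. The linear model, squared-error loss, learning rate, and synthetic dataset are all illustrative assumptions; a real training node would run its own training stack.

```python
import numpy as np

# Illustrative local dataset D_local: feature set X and label set Y.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                 # N = 64 samples, 3 features
true_theta = np.array([1.5, -2.0, 0.5])
Y = X @ true_theta + rng.normal(scale=0.1, size=64)

theta = np.zeros(3)                          # model parameters θ
lr = 0.1                                     # learning rate (assumed)

def loss(theta: np.ndarray) -> float:
    """L = (1/N) Σ l(f(x_i), y_i) with l = squared error and f linear."""
    preds = X @ theta
    return float(np.mean((preds - Y) ** 2))

for step in range(200):
    preds = X @ theta
    grad = 2.0 * X.T @ (preds - Y) / len(Y)  # ∇L for mean squared error
    theta -= lr * grad                       # gradient descent update

print(f"final loss: {loss(theta):.4f}")      # should approach the noise level (~0.01)
```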

4. Validation

After the training node produces a trained model $\theta^{task}_p$, a selected group of validators, denoted as $V_j \in V$, each equipped with the evaluation dataset $\mathcal{D}_{\text{eval}}$ from the task creator, will validate the model. The dataset consists of pairs $(x_i, y_i)$, where $x_i$ represents the features of the $i$-th sample, and $y_i$ is the corresponding true label.

To assess the performance of the trained model, we use a general evaluation metric, calculated as follows:

\begin{equation*} eval(\theta^{task}_p, \mathcal{D}_{\text{eval}}) = \frac{1}{|\mathcal{D}_{\text{eval}}|} \sum_{(x_i, y_i) \in \mathcal{D}_{\text{eval}}} \mathbf{1}(\hat{y}_i = y_i) \end{equation*}

Here, $\mathbf{1}$ represents the indicator function that returns $1$ if the predicted label $\hat{y}_i$ matches the true label $y_i$, and $0$ otherwise. $|\mathcal{D}_{\text{eval}}|$ denotes the total number of samples within the evaluation dataset.

Each predicted label $\hat{y}_i$ from the model $\theta^{task}_p$ is compared against its corresponding true label $y_i$ within the dataset $\mathcal{D}_{\text{eval}}$. The calculated metric result (accuracy here) serves as a quantifiable measure of $\theta^{task}_p$'s effectiveness at label prediction across the evaluation dataset.
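A minimal sketch of this accuracy computation follows. It assumes the trained model $\theta^{task}_p$ is exposed as a simple `predict` callable and that $\mathcal{D}_{\text{eval}}$ is an iterable of $(x_i, y_i)$ pairs; both the interface and the toy data are illustrative assumptions.

```python
from typing import Callable, Iterable, Tuple

def evaluate(predict: Callable[[object], object],
             d_eval: Iterable[Tuple[object, object]]) -> float:
    """Accuracy of the trained model over D_eval:
    the fraction of samples where the predicted label ŷ_i equals y_i."""
    pairs = list(d_eval)
    correct = sum(1 for x_i, y_i in pairs if predict(x_i) == y_i)
    return correct / len(pairs)

# Example with a toy rule-based "model" (purely illustrative).
d_eval = [(0.2, 0), (0.8, 1), (0.6, 1), (0.1, 0)]
predict = lambda x: int(x > 0.5)
print(evaluate(predict, d_eval))  # 1.0: all four labels are predicted correctly
```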
