Training Node

Training nodes are responsible for training and fine-tuning models for the AI tasks initiated by task creators. To become a training node, a user must pay a registration fee and stake $FML. Because nodes have a vested interest through staking, this mechanism helps protect the integrity and health of the ecosystem. In return, nodes are rewarded in proportion to their contributions.

For each task, the rewards for training nodes are divided into two parts: Reward A, which is emitted daily, and Reward B, which is vested and emitted when the task is finished.

  • On a given day, consider the situation where there are $M$ active AI Arena tasks with total staking amounts of $(S_1, \dots, S_M)$.

  • Assume the following notation:

    • $M$ is the total number of tasks

    • $E$ is the daily emission of $FML

    • $S_i$ is the total stake amount of a particular task

    • $\delta$ is the share of a training node’s reward paid out as Reward A, with the remaining $1-\delta$ paid out as Reward B (initially set at 0.1)

    • $\beta$ is the parameter that determines how sensitive rewards are to stake amount (the larger $\beta$, the greater the impact of stake on rewards; initially set to 1)

    • $\sigma$ is the reward share ratio set by the training node itself, which determines how rewards are shared between the training node and its delegators

    • $C$ is the score that a particular training node receives based on its rank relative to all other nodes in the same task, as well as the training node’s stake in the task

    *Note that apart from $C$, all other computations in the reward calculation are done on-chain. Note also that $\delta$ and $\beta$ are system parameters which can be determined via FLock's DAO, whilst $\sigma$ is determined by training nodes and validators themselves (they are allowed to change this ratio once a month).

  • For a given AI Arena task with a total staking amount of $S_i$, its total daily reward is:

$$E \cdot \frac{S^\beta_i}{\sum_{k=1}^{M}{S^\beta_k}}$$
  • Suppose the total stake of training nodes (including stake delegated to training nodes) in the task is $S_{tnd}$, and the total stake of validators (and their delegators) in the task is $S_{vd}$. Then the daily reward for a training node with score $C$, together with its delegators, is:

$$r_{tnd} = E \cdot \frac{S^\beta_i}{\sum_{k=1}^{M}{S^\beta_k}} \cdot \frac{S_{tnd}^\beta}{S_{tnd}^\beta+S_{vd}^\beta} \cdot C$$
  • Here, $s_{tn}$ is the training node’s own stake, and $s_d$ is the total stake delegated to this training node by its delegators. With $\sigma$ as the reward share ratio set by the training node, which determines how rewards are split between the training node and its delegators, the actual reward for the training node is:

$$r_{tn} = r_{tnd} \cdot \left(\sigma + (1-\sigma) \cdot \frac{s_{tn}}{s_{tn}+s_d}\right)$$
  • Then, Reward A for a given training node is:

$$r_{tnA} = r_{tn} \cdot \delta$$
  • And Reward B for the same given training node will be:

$$r_{tnB} = r_{tn} \cdot (1-\delta)$$
  • Assuming $D_i$ is the duration of a task in days, its task yield (the sum of daily returns) is:

$$y_t = \frac{r_{tn}}{s_{tn}}\cdot D_i \cdot 100 \%$$
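The chain of formulas above can be sketched in Python as follows. This is a minimal illustration rather than the on-chain implementation: the function names are hypothetical, the score $C$ is passed in as a plain number because it is computed off-chain, and the inputs at the bottom are made-up values chosen to keep the arithmetic easy to follow.

```python
from typing import Sequence, Tuple


def task_daily_reward(emission: float, task_stakes: Sequence[float],
                      task_index: int, beta: float = 1.0) -> float:
    """Daily reward of one task: E * S_i^beta / sum_k S_k^beta."""
    weights = [stake ** beta for stake in task_stakes]
    return emission * weights[task_index] / sum(weights)


def node_and_delegators_reward(task_reward: float, stake_tnd: float,
                               stake_vd: float, score: float,
                               beta: float = 1.0) -> float:
    """r_tnd: training-side share of the task reward, scaled by the score C."""
    training_side = stake_tnd ** beta
    validator_side = stake_vd ** beta
    return task_reward * training_side / (training_side + validator_side) * score


def node_reward(r_tnd: float, sigma: float, own_stake: float,
                delegated_stake: float) -> float:
    """r_tn = r_tnd * (sigma + (1 - sigma) * s_tn / (s_tn + s_d))."""
    own_share = own_stake / (own_stake + delegated_stake)
    return r_tnd * (sigma + (1.0 - sigma) * own_share)


def split_reward(r_tn: float, delta: float = 0.1) -> Tuple[float, float]:
    """Return (Reward A, Reward B) = (r_tn * delta, r_tn * (1 - delta))."""
    return r_tn * delta, r_tn * (1.0 - delta)


def task_yield_percent(r_tn: float, own_stake: float, duration_days: float) -> float:
    """y_t = r_tn / s_tn * D_i * 100%."""
    return r_tn / own_stake * duration_days * 100.0


# Illustrative inputs only (assumed values, not taken from the documentation).
task_reward = task_daily_reward(1000.0, [1000.0, 500.0, 500.0], task_index=0)
r_tnd = node_and_delegators_reward(task_reward, stake_tnd=600.0,
                                   stake_vd=400.0, score=0.5)
r_tn = node_reward(r_tnd, sigma=0.2, own_stake=100.0, delegated_stake=100.0)
reward_a, reward_b = split_reward(r_tn)
yield_percent = task_yield_percent(r_tn, own_stake=100.0, duration_days=30)

print(f"task reward: {task_reward:.2f}")
print(f"r_tnd: {r_tnd:.2f}, r_tn: {r_tn:.2f}")
print(f"Reward A: {reward_a:.2f}, Reward B: {reward_b:.2f}")
print(f"task yield: {yield_percent:.2f}%")
```

With these assumed inputs the task receives half of the emission (500), the node and its delegators receive 150, and the node itself keeps 90, of which 9 is Reward A and 81 is Reward B.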

Put simply, a training node’s daily return from a task depends on three factors:

(1) the stake of this task relative to all tasks, meaning a training node’s stake in a particular task indirectly affects its rewards from that task; and

(2) the training node’s stake in this task, as well as the stake delegated to this training node; and

(3) the quality of the node’s submission, as reflected by the node’s relative ranking. Specifically, the score follows a geometric series over the ranking, multiplied by the relative stake of this task.
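To make factors (1) and (2) more concrete, the sketch below varies $\beta$ and $\sigma$ and prints the resulting shares. It is only an illustration under assumed inputs: the helper names `task_share` and `node_fraction` are hypothetical, and the stake values are examples rather than protocol data.

```python
from typing import Sequence


def task_share(task_stakes: Sequence[float], task_index: int, beta: float) -> float:
    """Share of the daily emission going to one task: S_i^beta / sum_k S_k^beta."""
    weights = [stake ** beta for stake in task_stakes]
    return weights[task_index] / sum(weights)


def node_fraction(sigma: float, own_stake: float, delegated_stake: float) -> float:
    """Fraction of r_tnd kept by the node: sigma + (1 - sigma) * s_tn / (s_tn + s_d)."""
    return sigma + (1.0 - sigma) * own_stake / (own_stake + delegated_stake)


# Assumed example stakes for three tasks; index 0 is the largest task.
task_stakes = [1100.0, 500.0, 850.0]

# Factor (1): a larger beta shifts more of the emission to the largest task.
shares_by_beta = {beta: task_share(task_stakes, 0, beta) for beta in (0.5, 1.0, 2.0)}
for beta, share in shares_by_beta.items():
    print(f"beta={beta}: largest task receives {share:.4f} of the emission")

# Factor (2): a node with stake 100 and delegated stake 300 (assumed values).
fractions_by_sigma = {sigma: node_fraction(sigma, 100.0, 300.0)
                      for sigma in (0.0, 0.5, 1.0)}
for sigma, fraction in fractions_by_sigma.items():
    print(f"sigma={sigma}: node keeps {fraction:.4f} of r_tnd")
```

At $\sigma = 0$ the node receives only its pro-rata share by stake, and at $\sigma = 1$ it keeps the entire $r_{tnd}$.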

It’s important to note that rank, rather than other metrics such as absolute scores, is used to measure the quality of a training node’s work, primarily because scores can be very close between nodes. This design decision is believed to make reward calculation fairer and easier.
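As a rough illustration of why rank separates nodes more clearly than raw scores, the sketch below assigns normalized geometric weights by rank. Everything here is an assumption made for illustration: the function names, the common ratio of 0.5, the sample scores, and the convention that a higher score is better. The actual score $C$ is computed off-chain, also accounts for stake, and is not reproduced here.

```python
from typing import Dict, List


def geometric_rank_weights(num_nodes: int, ratio: float) -> List[float]:
    """Normalized geometric weights: rank 1 gets 1, rank 2 gets ratio, and so on."""
    raw = [ratio ** position for position in range(num_nodes)]
    total = sum(raw)
    return [value / total for value in raw]


def weights_by_node(scores: Dict[str, float], ratio: float) -> Dict[str, float]:
    """Rank nodes by score (higher is better, an assumption) and attach weights."""
    ranked = sorted(scores, key=lambda node: scores[node], reverse=True)
    weights = geometric_rank_weights(len(ranked), ratio)
    return dict(zip(ranked, weights))


# Hypothetical validation scores that differ only in the fourth decimal place.
close_scores = {"a": 0.9012, "b": 0.9011, "c": 0.9010}
rank_weights = weights_by_node(close_scores, ratio=0.5)

for node, weight in rank_weights.items():
    print(f"node {node}: score {close_scores[node]:.4f}, rank weight {weight:.4f}")
```

Although the raw scores are nearly identical, the rank-based weights are clearly separated, which is the property the paragraph above relies on.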

Example

Let’s say the total stake of all tasks for a particular day is 2,450, the daily emission is 1,074, and there are 3 training nodes and 1 validator for task A. Nodes a, b, and c staked 100, 200, and 300 respectively, and they rank first, second, and third. The validator staked 500, meaning the total stake for this task is 1,100.

Imagine there are also tasks B and C, with total stakes of 500 and 850 respectively. In this example, we only illustrate the reward distribution for task A. We further assume that the score $C$ from the off-chain calculation is 0.3886.

Thus, the daily reward for node a, which staked 100 and ranked first, is:

$$\begin{align*}r_t^a &= 1074 \cdot \frac{1100}{2450} \cdot 0.1667 \cdot 0.3886 \\r_t^a &\approx 31.24\end{align*}$$

Assuming the task duration is 30 days, the task yield for this node is:

$$\begin{align*}y_t^a &= \frac{31.24}{100} \cdot 30 \cdot 100\% \\y_t^a &\approx 937.2\%\end{align*}$$
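The arithmetic of this example can be checked with a short Python sketch. The variable names are hypothetical, $\beta = 1$ is assumed, and the factors 0.1667 and 0.3886 are taken as given from the example rather than derived. Results are printed rather than hard-coded so the numbers can be verified directly.

```python
# Inputs taken from the example.
daily_emission = 1074.0
total_stake_all_tasks = 2450.0
task_a_stake = 1100.0
stake_factor = 0.1667   # stake-related factor as used in the example
score_c = 0.3886        # off-chain score C for node a
node_a_stake = 100.0
duration_days = 30

# Task A's share of the daily emission, with beta = 1 (assumed).
task_a_reward = daily_emission * task_a_stake / total_stake_all_tasks

# Daily reward for node a.
node_a_daily_reward = task_a_reward * stake_factor * score_c

# Task yield over the full duration, as a percentage of node a's stake.
node_a_yield_percent = node_a_daily_reward / node_a_stake * duration_days * 100.0

print(f"task A daily reward: {task_a_reward:.2f}")
print(f"node a daily reward: {node_a_daily_reward:.2f}")
print(f"node a task yield:   {node_a_yield_percent:.1f}%")
```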
