Training Node
Training nodes are responsible for training and fine-tuning the AI tasks initiated by task creators. To become a training node, a user has to pay a registration fee and stake $FML. Because nodes have a vested interest via staking, this mechanism helps protect the integrity and health of the ecosystem. In return, nodes are rewarded in proportion to their contributions.
For each task, the rewards for training nodes are divided into two parts: Reward A, which is emitted daily, and Reward B, which is vested and emitted when the task is finished.
On any given day, there may be several active AI Arena tasks, each with its own total staking amount.
The reward calculation uses the following quantities:
the number of tasks in total
the daily emission of $FML
the total stake amount of a particular task
the split between Reward A and Reward B (initially set at 0.1)
the parameter which determines how sensitive rewards are to stake amount (the larger this parameter, the more impact stake has on rewards; initially set to 1)
the reward share ratio, set by the training node itself, which determines the ratio of rewards shared between the training node and its respective delegators
C, the score that a particular training node receives in accordance with its relative rank against all other nodes for the same task, as well as the training node’s stake in the task
*Note that apart from the score C, all other computations in the reward calculation are done on-chain. Note also that the split between Reward A and Reward B and the stake sensitivity parameter are system parameters which can be determined via FLock's DAO, whilst the reward share ratio is determined by training nodes and validators themselves (they are allowed to change this ratio once a month).
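The split between Reward A and Reward B described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the split parameter (0.1 initially) is the fraction paid out daily as Reward A, with the remainder accruing as Reward B until the task ends. The function names `split_daily_reward` and `simulate_task` are hypothetical and do not come from FLock's contracts.

```python
def split_daily_reward(daily_reward: float, split: float = 0.1) -> tuple[float, float]:
    """Split one day's reward into (Reward A, Reward B).

    Assumption: `split` is the share emitted daily as Reward A;
    the remainder is vested as Reward B.
    """
    if not 0.0 <= split <= 1.0:
        raise ValueError("split must be between 0 and 1")
    reward_a = split * daily_reward
    reward_b = daily_reward - reward_a
    return reward_a, reward_b


def simulate_task(daily_rewards: list[float], split: float = 0.1) -> tuple[list[float], float]:
    """Return the daily Reward A payouts and the Reward B total released at task end."""
    paid_daily = []
    vested = 0.0
    for reward in daily_rewards:
        reward_a, reward_b = split_daily_reward(reward, split)
        paid_daily.append(reward_a)  # emitted on the day it is earned
        vested += reward_b           # held back until the task finishes
    return paid_daily, vested
```

For a hypothetical node earning 100 per day over a 30-day task, `simulate_task([100.0] * 30)` would pay out about 10 per day and release about 2,700 when the task finishes.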
For a given AI Arena task with a given total staking amount, its daily total rewards are:
Consider the total stake of training nodes in the task (including stake delegated to training nodes) and the total stake of validators in the task (including stake from their delegators). Then the daily rewards for training nodes and their delegators are:
Then, Reward A for a given training node is:
And Reward B for the same given training node will be:
Put simply, a training node’s daily return from a task depends on three factors:
(1) the relative staking amount of this task against all tasks, meaning a training node’s stake in a particular task will indirectly affect its rewards from that task;
(2) a training node’s stake in this task, as well as the stake delegated to this training node; and
(3) the quality of the node’s submission, as shown by the node’s relative ranking. Specifically, the rank-based score follows a geometric series in the node’s ranking, multiplied by the relative stake of this task.
It’s important to note that rank, rather than other metrics such as absolute scores, is used here to determine the quality of the training node’s work, primarily because scores can be very close between nodes. This design decision is intended to make reward calculation fairer and simpler.
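To make the third factor more concrete, here is a rough sketch of how a rank-based geometric weight could be combined with stake. It is not the off-chain scoring formula, which is not given here: the common ratio `ratio`, the normalisation across nodes, and the function name `rank_stake_shares` are all assumptions made for illustration. The `alpha` argument stands in for the stake sensitivity parameter (initially 1).

```python
def rank_stake_shares(
    stakes: list[float],
    ranks: list[int],
    ratio: float = 0.5,   # hypothetical common ratio of the geometric series
    alpha: float = 1.0,   # stake sensitivity; larger values give stake more impact
) -> list[float]:
    """Return each node's share of a pool from its rank and stake.

    Assumed form: weight = ratio ** (rank - 1) * stake ** alpha,
    normalised so that all shares sum to 1.
    """
    if len(stakes) != len(ranks):
        raise ValueError("stakes and ranks must have the same length")
    weights = [
        (ratio ** (rank - 1)) * (stake ** alpha)
        for stake, rank in zip(stakes, ranks)
    ]
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return [w / total for w in weights]
```

With `alpha = 0` the shares depend on rank alone, which shows how the sensitivity parameter controls the impact of stake. Ranking by position rather than raw score also means two nodes with nearly identical scores still receive clearly separated weights.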
Let’s say the total stake of all tasks for a particular day is 2,450, the daily emission is 1,074, and there are 3 training nodes and 1 validator for task A. Nodes a, b, and c staked 100, 200, and 300, and they rank first, second, and third respectively. The validator staked 500, meaning the total stake for this task is 1,100.
Imagine there are also tasks B and C, and their total stakes are 500 and 850 respectively. In this example, we only illustrate the reward distribution for task A. We further assume that the score C from the off-chain calculation (not to be confused with task C) is 0.3886.
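The first two steps of this example can be checked with a short Python sketch. It assumes that rewards are split in direct proportion to stake, which corresponds to the stake sensitivity parameter at its initial value of 1: first across tasks, then between training nodes and validators within task A. The helper `stake_share` and the variable names are hypothetical.

```python
def stake_share(part: float, whole: float) -> float:
    """Fraction of `whole` represented by `part` (assumes a linear, stake-proportional split)."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


daily_emission = 1074.0
task_stakes = {"A": 1100.0, "B": 500.0, "C": 850.0}

# Task A: nodes a, b, c stake 100, 200, 300; the validator stakes 500.
training_stake_a = 100.0 + 200.0 + 300.0
validator_stake_a = 500.0

total_stake = sum(task_stakes.values())  # 2,450 across all tasks

# Step 1: task A's share of the daily emission.
task_a_reward = daily_emission * stake_share(task_stakes["A"], total_stake)

# Step 2: the part of task A's reward assigned to training nodes and their delegators.
training_pool_a = task_a_reward * stake_share(
    training_stake_a, training_stake_a + validator_stake_a
)

print(f"{task_a_reward:.2f}")    # 482.20
print(f"{training_pool_a:.2f}")  # 263.02
```

Under these assumptions, task A receives roughly 482.20 of the daily emission, of which roughly 263.02 goes to its training nodes and their delegators before the score and the Reward A / Reward B split are applied.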
Thus, the daily reward for node a, who staked 100 and ranked first, is:
Assuming the task duration is 30 days, the task yields for this node are:
Here, a training node’s total stake is the sum of the stake from the training node itself and the stake delegated to this training node by delegators. The reward share ratio, set by the training node itself, determines the ratio of rewards shared between the training node and its respective delegators. Then, the actual reward for the training node is:
Given the duration of a task, its task yield (or sum of daily returns) is: