Attributes
- Author(s): @JackALaing @luyzdeleon @andrew @varoten
- Parameter: MaxValidators
- Current Value: 5,000
- New Value: 50,000
Summary
Update the MaxValidators parameter from 5,000 to 50,000 to work around a bug that causes Tendermint and Pocket Core to treat nodes differently if they are outside of the MaxValidators limit. We recently crossed 5,000 unjailed nodes, which is why the bug has now become apparent. Increasing the value of the parameter will postpone the effects of the bug until we release our next protocol upgrade (0.7). It will also have the added benefits of enabling the long tail of nodes to participate in proposing blocks and ensuring that all service nodes can continue to use jailing as a graceful method of removing themselves from service.
Motivation
We discovered an error in the logic of Pocket Core, wherein it tried to re-jail an already jailed validator:
cannot jail already jailed validator
After some investigation, we discovered that this is caused by a bug which results in Tendermint and Pocket Core treating nodes differently if they fall outside of the MaxValidators limit, which is currently set to 5,000. Nodes whose stake is outside of the 5,000 largest stakes are jailed in the eyes of Pocket Core, whereas Tendermint thinks the node is still active in the validator set and therefore obligated to cast a vote on each block. However, when that “blank vote” comes through Pocket Core, Pocket Core still treats the node as jailed and puts it through the normal lifecycle of punishing a validator that didn’t cast a vote in time, triggering a single slash (0.00001%) every 10 blocks even after the validator has been jailed.
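The mechanism above can be sketched as a toy model in Python. This is only an illustration, not Pocket Core's or Tendermint's actual code: the `Node` type, the function names, and the assumption that each slash compounds on the remaining stake are hypothetical. Only the 0.00001% slash and the 10-block interval come from the description above.

```python
from dataclasses import dataclass

SLASH_FRACTION = 0.00001 / 100  # 0.00001% expressed as a fraction (from the text)
SLASH_INTERVAL_BLOCKS = 10      # one slash every 10 blocks (from the text)


@dataclass
class Node:
    address: str   # hypothetical identifier
    stake: float   # staked amount, in arbitrary units


def split_by_max_validators(nodes, max_validators):
    """Rank nodes by stake (largest first) and cut the list at max_validators.

    Returns (inside, outside). In the bug described above, the 'outside'
    nodes are the ones the two layers disagree about.
    """
    ranked = sorted(nodes, key=lambda n: n.stake, reverse=True)
    return ranked[:max_validators], ranked[max_validators:]


def stake_after_missed_blocks(stake, blocks):
    """Apply one slash per SLASH_INTERVAL_BLOCKS elapsed blocks.

    Assumption: each slash is taken from the remaining stake (compounding).
    """
    slashes = blocks // SLASH_INTERVAL_BLOCKS
    return stake * (1 - SLASH_FRACTION) ** slashes


if __name__ == "__main__":
    # Hypothetical three-node network with a limit of two validators.
    nodes = [Node("a", 30.0), Node("b", 10.0), Node("c", 20.0)]
    inside, outside = split_by_max_validators(nodes, max_validators=2)
    for node in outside:
        remaining = stake_after_missed_blocks(node.stake, blocks=1_000)
        print(node.address, "is outside the limit; stake after 1,000 blocks:", remaining)
```

Each individual slash is tiny, but in this model it repeats for as long as a node stays outside the limit.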
Outside of this bug, there are other reasons to increase the parameter. We no longer have the scalability limitations that were referenced originally in PUP-4. In fact, thanks to @BenVan’s peering enhancement efforts, we are finding that the resources of service nodes beyond 5k are equivalent to the resources of validators within 5k, and block times are remaining healthy. Scalability no longer being a concern, increasing the parameter would provide the following benefits:
- Providing more nodes, particularly the long-tail of node runners, with the opportunity to participate in block validation and earn proposer block rewards.
- Maintaining jailing as a graceful exit option for all nodes, thus optimizing quality of service (when separation of validation/servicing is in effect, jailing is not an option for nodes outside of the MaxValidators limit, which means they have no graceful way to remove themselves from the service cycle).
Rationale
We ran two different tests on an internal test network, and both successfully replicated the mainnet behaviour described above. These tests confirmed to us that raising the max_validators param to a higher number will make the problem self-heal by the same logic we use to coordinate the current set of 5,000 nodes. We’re recommending 50,000 as the new value to postpone the effects of the bug until we release 0.7 (the next protocol upgrade).
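As a rough illustration of why a higher limit postpones the problem, the short Python sketch below counts how many unjailed nodes would sit outside the validator set under each value. The node count of 5,200 is a made-up example (the proposal only states that the network recently crossed 5,000 unjailed nodes), and `nodes_outside_limit` is a hypothetical helper, not part of Pocket Core.

```python
def nodes_outside_limit(unjailed_nodes: int, max_validators: int) -> int:
    """Number of unjailed nodes that do not fit within max_validators."""
    return max(0, unjailed_nodes - max_validators)


# Hypothetical node count, just above the current limit.
example_unjailed_nodes = 5_200

for limit in (5_000, 50_000):
    affected = nodes_outside_limit(example_unjailed_nodes, limit)
    print(f"MaxValidators={limit}: {affected} node(s) outside the set")
```

Under these assumptions, no node is exposed to the bug until the number of unjailed nodes exceeds the new limit.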
Dissenting Opinions
Can’t we ask altruistic node runners to bring us down below 5k?
This is a solution if nodes within the top 5k are the ones doing the jailing/unstaking and if no one else subsequently spins up nodes. However, it would be infeasible to convince the collective node runner community not to spin more nodes up until the bug is patched, especially since they’d be able to avoid slashing penalties for themselves if they stake within the 5k. The more reliable solution is to change the parameter, especially given we no longer have scalability concerns, as explained above.
Analyst(s)
Pocket Network Inc’s blockchain devs - Luis, Andrew & Otto
Copyright
Copyright and related rights waived via CC0.