PUP-28: Increase MaxValidators - Take Two

Attributes:

  • Author(s): @msa6867

  • Parameter: MaxValidators

  • Current Value: 1000

  • New Value: 2000

Summary:

Vote on PUP-14 (Increase MaxValidators to Improve economic security) was halted on July 3 to allow for more consensus building on the parameter value. The main area of concern expressed regarding the proposed value of 5k in PUP-14 was blocksize bloat caused by adding 4k new signatures to the block.
Moving MaxValidators to 2000 rather than to the value of 5000 proposed in PUP-14 adds only one quarter as many new signatures to the block. This level of “block bloat” is manageable and worth the increase to network security that comes with doubling the tokens staked to Pocket Network validators.

Abstract:

The recent implementation of PUP-19 and PIP-22 caused a 4-fold increase in POKT-denominated network securitization compared to June 2022 levels. However, due to market conditions, the US dollar-denominated vulnerability to the Pocket Network remains an ongoing concern and has not changed much from June 2022 levels.

Vote on PUP-14 was deferred, in part, to give time for the system to absorb PUP-19 and PIP-22/PUP-21, as well as to work out the engineering concerns related to raising MaxValidators. Those concerns centered on blocksize bloat due to added signatures. The system has had several months to absorb PUP-19 and PIP-22, and it is now time to revisit the question of raising MaxValidators.
Max blocksize in v0 is 4 MB. At the time of PUP-14 submittal, average blocksize was approximately 2.6 MB. Adding 4k validators as per PUP-14 would have added approximately 880 kB (~220 B per signature) to the block, which equates to an encroachment of roughly two-thirds of the headroom between the current average blocksize of 2.6 MB and the maximum blocksize of 4 MB. On the other hand, raising MaxValidators to 2000 adds only ~220 kB to the block and thus takes up only about 15% of the headroom between the current average and the maximum size.
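The headroom arithmetic above can be checked with a short sketch. It uses only the figures quoted in this proposal and assumes decimal units (1 kB = 1000 B, 1 MB = 1000 kB); the function and constant names are illustrative, not taken from any Pocket codebase.

```python
# Figures quoted in the proposal text; decimal units are an assumption.
MAX_BLOCK_KB = 4000        # 4 MB maximum block size in v0
AVG_BLOCK_KB = 2600        # ~2.6 MB average block size at the time of PUP-14
BYTES_PER_SIGNATURE = 220  # ~220 B per validator signature


def headroom_used(extra_validators: int) -> float:
    """Fraction of the average-to-max headroom consumed by the extra signatures."""
    added_kb = extra_validators * BYTES_PER_SIGNATURE / 1000
    headroom_kb = MAX_BLOCK_KB - AVG_BLOCK_KB
    return added_kb / headroom_kb


pup14 = headroom_used(4000)  # 1000 -> 5000 validators
pup28 = headroom_used(1000)  # 1000 -> 2000 validators
print(f"PUP-14: {pup14:.1%} of headroom, PUP-28: {pup28:.1%} of headroom")
```

With these inputs the PUP-14 case lands a little under two-thirds of the headroom and the PUP-28 case at roughly 15 to 16%, in line with the figures quoted above.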

Furthermore, it was shown in the June time period that while blocksize did occasionally hit the 4 MB limit, this was due to spiky outliers rather than due to a Gaussian distribution of block sizes. Thus raising average block size by 220 kB should hardly change the probability of a block bumping against the max block size.

It is felt therefore that raising MaxValidators from 1000 to 2000 will not have negative impact on the system. This can be confirmed by checking that current distribution of blocksize has not changed substantially from the June time period.

Note that raising MaxValidators from 1000 to 2000 will have a dilutive effect on per-validator rewards, causing the probability of a validator being selected as a proposer to be cut in half, and thus the expected monthly proposer rewards to be cut in half. Raising ProposerAllocation to a value greater than 5% could ameliorate this reduction at the expense of servicer rewards, but the author declines to suggest raising ProposerAllocation as part of this proposal, as it is felt that further reduction to servicer rewards cannot be justified at this time.
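The dilution can be illustrated with a minimal sketch. It assumes every validator is equally likely to be selected as proposer, which is a simplification (selection may in practice depend on stake), and the blocks-per-month figure is a hypothetical placeholder rather than a measured value.

```python
def expected_proposals(blocks: int, num_validators: int) -> float:
    """Expected number of blocks one validator proposes, assuming each
    validator is equally likely to be chosen (a simplifying assumption)."""
    return blocks / num_validators


BLOCKS_PER_MONTH = 2880  # hypothetical placeholder, not a measured value

before = expected_proposals(BLOCKS_PER_MONTH, 1000)  # MaxValidators = 1000
after = expected_proposals(BLOCKS_PER_MONTH, 2000)   # MaxValidators = 2000
ratio = after / before
print(f"per-validator proposals: {before:.2f} -> {after:.2f} (ratio {ratio:.2f})")
```

Under this assumption the ratio does not depend on the number of blocks per month, so the halving holds for any block rate.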

Motivation:

By itself, raising MaxValidators from 1000 to 2000 will approximately double the network security in terms of the USD/POKT it will take to launch a DoS or Byzantine attack on the network. In conjunction with PUP-27, network securitization would be increased by a factor of approximately 2.3. While this may seem small compared to the 10x or more increase in securitization called for by many in the community, it is a good start and it is low-hanging fruit that can be achieved without distracting the development team from other duties related to the upcoming transition to v1.

Rationale:

Setting MaxValidators to 2000 strikes a good balance, giving a significant boost to network security without triggering the major engineering concerns associated with raising it further.

Dissenting Opinions:

Copyright

Copyright and related rights waived via CC0.

I share the need to increase the Pocket Network security. Changing the number of MaxValidators might seem a valid way of doing so, however, the current blockchain conditions do not allow this.


TL-DR:

I do not support this proposal due to (please read justifications below):

  1. While the block size seems to be low enough, in the case of an increase in TXs, there could be problems with lost TXs. There is no room to increase the number of validators if we want to stay in a 96% confidence zone.
  2. Consensus will be more difficult to reach.
  3. Validator node requirements will increase due to gossip overhead, and the new 1000 validators probably won't be able to handle this (as they did not choose to be validators; this has happened in the past with PIP-22 up-staking)
  4. Increased blockchain disk size (increases costs, disk is very expensive on cloud instances).
  5. The number of validators is already outside the Tendermint normal number of validators (1000 of Pocket vs 150 in Cosmos).
  6. Increasing validator number will reduce the average validator rewards. Reducing the rewards of servicers to remedy this is something that I find unfair (just an opinion, since it is not part of the proposal).

Justification

1 - Block Size

I want to give some insights on block size calculation and current status. This will help us understand where we are. The data was obtained from POKTscan, where you can find the historical block size. At first glance you will see that we are really close to 4 MB:
[image: historical block size chart from POKTscan]
Fear not, this is not RAW block size that @msa6867 is writing about. The actual raw size is approximately:

RawBlockSize = BlockSize - 800 KB

Using this corrected value we analyzed the current proposal. You will find the data here.

Currently the block size limit (formally introduced in this PR) is set to 4 MB.
Since 2022-10-01, the size of the blocks has oscillated far from this limit:

The actual block size distribution during the observed time period is:

The block size is not near the given limit. It is outside the 96% region (if we assume Gaussian distribution).
If we increase the number of validators to 2000 (adding 1000 more), we can expect an additional 212 KB of block size. According to PIP-7, the footprint of validators on the block size is calculated as:
ValidatorsBlockSize = NumOfValidators * 223 Bytes
Then:
ExtraValidatorsBlockSize = ExtraNumOfValidators * 223 Bytes = 1000 * 223 Bytes = 223,000 Bytes

If we introduce this change, the new block size distribution will look like this:

In this scenario, the 4 MB block size is still outside the 96% region.
To recap, these are the expected values for current and proposed scenario:

Scenario   Mean Size   High Size (96%)   Free Space (mean)   Free Space (high)
Current    2.93 MB     3.52 MB           1.07 MB             491 KB
Proposed   3.15 MB     3.74 MB           870 KB              266 KB

You will notice that I included two new columns, Free Space (mean) and Free Space (high). These columns represent how much room is left within the current 4 MB block size limit. This is important since this Free Space is in fact used to store the TXs. During validation, the validator will grab as many TXs as it can until the block is full. If we add too many validators, we will not have space to include additional TXs. Moreover, the TXs that are not included in a block are only kept in the mempool for two blocks:
Mempool Tx Eviction, TxmaxLife = 2
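To make the eviction effect concrete, here is a toy model of a mempool with a two-block lifetime. It is only a sketch: `simulate`, `capacity` and `max_life` are hypothetical names chosen to mirror the TxmaxLife setting, and this is not the actual Pocket or Tendermint mempool logic.

```python
def simulate(arrivals, capacity, max_life=2):
    """Toy model: each block includes up to `capacity` pending TXs, oldest
    first; a TX passed over in `max_life` blocks is evicted (lost)."""
    pending = []  # list of [blocks_waited, tx_count], oldest first
    included = lost = 0
    for new_txs in arrivals:
        pending.append([0, new_txs])
        room = capacity
        for bucket in pending:  # fill the block from the oldest TXs
            take = min(room, bucket[1])
            bucket[1] -= take
            room -= take
            included += take
        survivors = []
        for waited, count in pending:
            if count == 0:
                continue
            if waited + 1 >= max_life:
                lost += count  # evicted after max_life blocks
            else:
                survivors.append([waited + 1, count])
        pending = survivors
    return included, lost


steady = simulate([100, 100, 100], capacity=100)  # demand matches capacity
burst = simulate([300, 0, 0], capacity=100)       # one burst of 3x capacity
print("steady:", steady, "burst:", burst)
```

In this model a steady load that fits the free space loses nothing, while a burst larger than two blocks' worth of free space loses the excess. Shrinking `capacity` (the free space) lowers the burst size at which TXs start being lost.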

In conclusion, yes, we can increase the number of validators in terms of block size. But we are reducing the space left in the block for new transactions. If the number of TXs increases we will be too close to the 4 MB limit, and TXs will start to be pushed to other blocks or lost.
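As a rough check on the table above, the sketch below shifts the current mean and high block sizes by the 223 Bytes-per-validator footprint and recomputes the free space. It assumes binary units (1 MB = 1024 KB), which is what the table's free-space figures appear to use; the function name is illustrative. Because the inputs are the rounded table values, the results only land within a few KB of the table.

```python
BLOCK_LIMIT_MB = 4.0
BYTES_PER_VALIDATOR = 223  # per PIP-7, as quoted above
KB_PER_MB = 1024           # assumption: binary units, matching the table


def scenario(mean_mb, high_mb, extra_validators=0):
    """Shift mean/high block size by the extra validators' footprint and
    report the free space left under the block size limit."""
    extra_mb = extra_validators * BYTES_PER_VALIDATOR / (KB_PER_MB * 1024)
    mean = mean_mb + extra_mb
    high = high_mb + extra_mb
    return {
        "mean_mb": mean,
        "high_mb": high,
        "free_mean_kb": (BLOCK_LIMIT_MB - mean) * KB_PER_MB,
        "free_high_kb": (BLOCK_LIMIT_MB - high) * KB_PER_MB,
    }


current = scenario(2.93, 3.52)
proposed = scenario(2.93, 3.52, extra_validators=1000)
for name, row in (("Current", current), ("Proposed", proposed)):
    print(name, {k: round(v, 2) for k, v in row.items()})
```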

2 - Consensus Problems

Increasing the number of validators will create consensus problems. We had our share of problems with only 1000 validators trying to push new features (such as non-custodial). Also, it will create even more problems in cases of chain halts (god forbid!), where reaching a rapid consensus is critical.

3 - Gossip and Peering

The Pocket Blockchain is based on Tendermint for its consensus mechanism. It is known that its peering is not efficient. Increasing the number of validators will create even more problems here.

Also, the gossip of the increased number of validators will require more computing power from the validators. When increasing from 1000 validators to 2000 validators, many nodes that are probably not prepared to be validators will suddenly become one. This can cause instabilities in the network.

4 - Disk Size

More validators mean more disk usage for the blockchain. This detail should not be overlooked.

5 - Validator number out of specification

The Tendermint team only has 150 validators on its Cosmos SDK. It is hard to find any Tendermint application with more than 300.
I think that we are currently running Tendermint with a number of validators that is way outside its specification. This can be risky and stability is crucial for our network. We are not only providing a token, we are running a service that must be always online.


P.S.: Thanks to the people that helped me gather this info, it was quite difficult!


Interesting to read the counter-points.

Can I ask: would you (or another node runner) be able to approximate the extra costs for running a Validator at 2k max Validators? So the increased cost, let's say per month, for extra compute/disk space?

I think it would help to try to quantify these issues where we can.

Thank you @RawthiL for a detailed response. You say you share the need to increase POKT network security but do not think changing max validators is the way to do it. You have also come out against raising the stake-weight ceiling as a mechanism (PUP-27). What then do you propose? Raising MinStake to 150k or 300k would do the trick and might not be a bad idea, but both you and I have expressed concern in the past over forcing the “little guy” out of the network. There are changes to the validator incentive structure that I have looked at that might incentivize staking top-heavy rather than bottom-heavy. Also, @poktblade has been exploring delegation ideas. But both of these would involve consensus changes and require much more dev work than the modest parameter change being proposed. What other knobs do we have in the current time frame to increase security? Increasing network security is needed not just for its own sake, but also for marketing purposes to attract new DApp developers to use Pocket Network. This is a point brought up by @srndptme (“Vitality- Linen Wallets”), representing the DApp point of view, several times in discussions on “The Den.”

Going through your individual objections:

  1. Thank you for your analysis. There is only a small state window where it can be argued that 2k validators could be problematic for the network while 1k validators would not be. The truth is we are currently far from that state. Nor are we trending in the direction of reaching that state. And on the other side of that small window of state is the vastly larger state space where, if TXs were to balloon, there would be a need for action to accommodate the increase even if max validators is kept at 1k. It does not make sense to continue holding to 1k validators just to accommodate a potential future small sliver of state space. I would be interested to get others' take on this issue. If 2k is problematic, we could explore 1500, which adds only half the bytes of the current proposal.

  2. I would argue just the opposite… that consensus problems will be alleviated, not exacerbated, by setting max validators to 2000. Why? Actors update validator nodes in blocks. What is important in time needed and efficiency of reaching consensus is the number of actors needed to reach consensus, not the number of nodes. The top dozen or so actors can be expected to be available near real time in case of emergency (eg. chain halt) (with at least one notable exception during the recent chain halt), while timely coordinating of the actors operating only 1 or a handful of validator nodes can be expected to be more difficult to achieve. If max validators were increased from 1k to 2k, the top dozen actors would be expected to increase their validator count, on average, more than small actors, thus decreasing reliance on the harder-to-reach small actors to reach 67% consensus.

  3. [This is two objections rolled into one; I will address each separately]

3a - It is acknowledged that the Tendermint consensus mechanism is not efficient, hence the modest increase being proposed compared to the original PUP-14 proposal of raising to 5k. The move from 1k to 2k should not cause a significant increase in computing power burden. See @Cryptocorn's question to current validators. See also #5 below.

3b (unintended validators). This is a good point. A quick analysis of staked POKT per node shows that there are at least 1800 nodes that intentionally vied for validator slots within the last several months, indicating that these should be fully prepared to intentionally take on validator duties if a version of this proposal passes. Between 1800 and 2000 is a suspect zone that may include nodes that inefficiently overstaked to bin 4 without actual intention to validate. If the proposed MaxValidators value were set to 1800 from 2000, it would eliminate this concern while retaining most of the benefit of increase to network security.

  4. The economics of node running are such that most, if not all, validators are also servicers. As such, validator disk space, whether for 1k nodes or 2k nodes, is a small concern compared to the disk space required for the DApp chains. See @Cryptocorn's question to current validators. See also #5 below.

  5. I understand the Tendermint concern and the desire for caution, but this proposal is hardly paving new territory. To put it into perspective, at the time that PIP-7 was passed there were over 13k validators. Let me repeat: One year ago Pocket Network had over thirteen thousand validators! Almost an order of magnitude more peering, gossip, processing, disk space etc etc one year ago compared to the 2k value being proposed herein.

To Recap:
The risks of the proposed change to 2k (or 1.8k if that is preferred) are small. We have ample room in the block to accommodate the extra validator count. Reaching consensus is not a concern when considering time to reach consensus by number of actors, not number of nodes. The increase to gossip/peering/processing/disk storage is relatively small and manageable - especially compared to the 13k validator count the network was at one year ago.

We can discuss any method to increase the network security, but probably the correct ones will be those that require more work (like delegation). If we are to discuss other methods it should be done in a separate topic that we can create on this forum.
Also, having a knob does not mean that we can move it around without taking into account the rest of the ecosystem. Here we are discussing increasing the number of validators, which will increase the network security, but the negative effects of this change outweigh the positive ones (IMHO).

This means that you expect to reduce the effective network security?
If the effective number of validator actors is reduced, so is the network security. We will have less decentralization among our validator nodes. This is even worse.

The Pocket Network of one year ago, with 13K validators, is hardly comparable to current network. In any aspect, client software, number of relays, QoS, etc.

Validators are also servicers but they are not the same. While a servicer can be put into a LeanPocket cluster, the validators cannot be staked like that. It is not recommended to run validators on LeanPocket. Then, for each validator you will need new hardware (u$d ~100), while for each servicer you only need to add it to a LeanPocket cluster (u$d 0). There is an effective cost in running 1000 extra validators that goes beyond the increased disk requirements.

I was not able to respond to @Cryptocorn since validator node running is not our main activity and we are not tracking disk usage and costs as closely as we would need to answer that question with data. I only know that the cost will rise, not sure how much tho. I’m open to see any analysis from node runners that are more into validators.

Agreed that other ideas should be discussed in other forums; I am hopeful that @poktblade will start a delegation thread soon. (My other idea is not ready yet for discussion and I hesitate pursuing it at all as it would require consensus-breaking code change. )

This is faulty logic.

I may be wrong about assumptions, but this is roughly what I expect:

  • the number of validator actors to go up slightly - primarily in filling out the last 10-20% or so of validator slots
  • the number of validator actors needed to reach 67% consensus to go down slightly
  • the amount of POKT needed to launch a DoS attack to almost double
  • the amount of POKT needed to launch a Byzantine attack to almost double
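The "number of actors needed to reach 67%" expectation can be explored with a small sketch. The two distributions below are entirely hypothetical (they are not network data) and encode the assumption that large actors take most of the new slots; with a different assumed distribution the result can go the other way. The function name is illustrative.

```python
def actors_needed(nodes_per_actor, threshold_pct=67):
    """Smallest number of actors (largest first) whose validator nodes
    together reach `threshold_pct` percent of all validator nodes."""
    total = sum(nodes_per_actor)
    running = 0
    for i, nodes in enumerate(sorted(nodes_per_actor, reverse=True), start=1):
        running += nodes
        if running * 100 >= threshold_pct * total:  # integer math, no rounding
            return i
    return len(nodes_per_actor)


# Hypothetical distributions: five large actors plus single-node actors.
before = [150, 120, 100, 80, 50] + [1] * 500   # 1000 validator nodes
after = [350, 300, 250, 200, 150] + [1] * 750  # 2000 validator nodes

needed_before = actors_needed(before)
needed_after = actors_needed(after)
print(f"actors needed for 67%: {needed_before} -> {needed_after}")
```

The output depends only on the assumed split of slots between large and small actors, so real staking data would be needed to test the expectation.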

While decentralization is a noble goal, network security rests primarily with the POKT needed to take over the network, not on the number of benevolent actors needed to reach consensus. I am not that concerned with our present set of large actors. They have proven to have interests aligned with the best interest of the ecosystem and have each voluntarily limited themselves to less than 33% of validator slots. The truth is that right now 1 or 2 current actors could launch a Byzantine attack if they really wanted to and maybe half a dozen could launch a DoS attack if they were suddenly to go rogue. Both those numbers go down if this proposal passes (with the proposal, no current actor could launch a Byzantine attack without acquiring new POKT and only 1 or 2 could single-handedly launch a DoS attack).

The main threat to the ecosystem, however, comes from new actors with malicious intent. E.g., a competitor could easily take out our network if it decided we posed too great competition to their business model. Why is it not a priority to do what we can to increase the security? The costs and risks are small compared to the security gained.

Relays per block are about 5x today what they were in October of last year. Agreed that this must be taken into account - especially re block size. The other changes are mostly servicer related items, not validator (QoS, client software changes etc). As a starting point I would expect system state this time last year [13k validators; 2k relays] to be more taxing in terms of peering, gossip, processing, etc than the state being proposed herein [2k validators, 10k relays]. Hopefully we can get some input here from node runners for whom running validators is a core business.

Thanks @RawthiL for the cost estimates that you could provide, and a very good point that Validators can’t/shouldn’t run LeanPokt.

So this proposal would add the cost of running 1k Validators as an ‘additional’ cost. So at $100 * 1000 = $100,000 more expenses, monthly.

Assuming the above is correct, my feel would be that while I am slightly in favour of a move from 1k → 2k Validators, ceteris paribus, if the cost to the network to do so would be ~$100k/mo, I think that’s too high a cost for the benefits the move would bring.

As always, happy to be corrected on my assumptions above!

I think @RawthiL was estimating a one-time fixed cost for upgrading hardware of $100, not a recurring monthly cost. I presume that was mainly to account for disk storage. My counterpoint is that most validator/servicers (hopefully not LC as pointed out) are at 2TB+ ultra-fast storage to support DApp chains and that the 250GB or so of POKT-chain storage (that doesn't even need to be ultra fast) is rather in the noise if it increases, e.g., to 300 or 350GB as a result of this proposal.

I don't understand how this is compatible with this:

(random numbers) Suppose that there are 10 validator actors, each holding 10% of the validators. Then, the total number of validator nodes is increased, and the number of validator actors increases by 20%. The new number of validator actors is 12. How can the original 10 validator actors increase their validator share (that was 10%) and accommodate the new 2 actors?

This is not logical unless you are expecting some actors to increase their validator share over the others. This is very specific, it means that some validator actors want to have more validator slots but do not want to enter the current validator staking war. Passing this proposal will enable them to do that without the uncertainty of having to push out other validators.
Currently pushing out validators is risky since you don’t know the POKT they have available and how committed they are to stay as validators. The staking war could lead the validator price to places where the investment is no longer profitable. This is bad from a validator agent perspective but good from the network security perspective.

Please remember that you don’t need them to go rogue, you just need a security breach. It must be taken into account that if the cost of buying the POKT becomes higher than breaching one or two actors, you will have a problem.
Also, having most validating power on only two actors makes the network more susceptible to legislation problems (at most two jurisdictions need to pass some crazy law).
There is no gain in reducing the number of validating actors besides the faster coordination.

I have to disagree, decentralization is an essential part of the blockchain concept, not a noble goal.
If you want fast coordination and high security you can achieve that with a centralized system.

Security is not purely economical. The network should stay secure from breaches, legislation and stability also. These are my concerns.

It is a monthly cost. Just to recap what was discussed on TG:
This cost was an optimistic estimate of the cost of running a validator. Our services are currently much higher for a validator ($180, including customer service and maintenance). The hardware we need to run them is high both in disk and RAM.

Other node runners (@poktblade ) stated that the cost is lower, around u$d 20 to u$d 50. This could be true, but we are unable to reach such costs, I can only talk given our experience.
It would be great if more validator node runners can share their costs to have a better estimation.
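To put the cost figures quoted in this thread side by side, the sketch below multiplies each per-validator monthly estimate by the 1000 extra validators. The per-validator numbers are the ones mentioned above; treating every extra validator as a net new monthly cost is an assumption, since some of these nodes may already be running as servicers.

```python
EXTRA_VALIDATORS = 1000  # MaxValidators 1000 -> 2000

# Per-validator monthly cost estimates (USD) quoted in this thread.
per_validator_usd = {
    "low estimate (@poktblade)": 20,
    "high estimate (@poktblade)": 50,
    "optimistic estimate (@RawthiL)": 100,
    "full service (@RawthiL)": 180,
}

# Assumption: every extra validator is a net new monthly cost.
monthly_usd = {
    label: cost * EXTRA_VALIDATORS for label, cost in per_validator_usd.items()
}

for label, total in monthly_usd.items():
    print(f"{label}: ${total:,} per month")
```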