Survival of the Fittest Node
Created: September 2021
Audience: Node Runners, Community Members, Hardware Hobbyists
The Pocket Network blockchain has grown significantly in the past few months. Hardware that was sufficient to run a Pocket Node six months ago is no longer sufficient. In July, the Node Runner community watched one of its largest members suffer a significant number of jailings and pause its participation in the network. As jailings rise due to insufficiently resourced nodes, I’ve identified trends in behavior that make node runners vulnerable. I offer this information for the benefit of the Node Runner community.
I run Pocket Nodes on a wide variety of hardware and in a wide variety of environments:
- Home office
- Threadripper 3990x, 64 cores, PCIe 4 NVMes in RAID 0
- Cloud provider AWS
- C5a instances with GP3 SSD block storage
- C5ad instances with attached NVMes in RAID 0
- Dedicated server provider US
- Dual CPUs with 4 x SATA SSD RAID 0
- Dual CPUs with 2 x NVMe in RAID 0
- Dedicated server provider EU
- Dual CPUs with PCIe 4 NVMes in RAID 0
- Single CPUs with PCIe 4 NVMe
- Single CPUs with PCIe 3 NVMe
I believe this gives me broad exposure to how nodes are performing in the current situation.
I also believe the fastest nodes are causing the slowest nodes to drop off the network in what is a form of natural selection. A recent Node Runner dropped service altogether in July of this year. Why?
Stacking nodes can be fraught in the Pocket Node ecosystem. A previous Node Runner ran multiple Pocket nodes on one large instance and divided the instance’s resources via Kubernetes. I also run stacked nodes, but I use docker-compose. The method of stacking doesn’t really matter.
No other large node runner stacks nodes like this. Every other one runs individual small instances for each Pocket node. I know this to be true for BenVan, Blockspaces, and C0d3r.
So why is stacking nodes flawed? The likely culprits are disk IOPS and iowait%.
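To make the iowait% point concrete, here is a minimal Python sketch that computes the iowait percentage from two `/proc/stat` CPU samples. The field layout follows the proc(5) man page; the sample counter values below are invented for illustration. A node starved for disk shows a high and climbing iowait%:

```python
# Rough sketch: computing iowait% from two /proc/stat "cpu" samples.
# Field order after "cpu" is: user, nice, system, idle, iowait, ...
# The sample values here are made up for illustration.

def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent waiting on I/O between two samples."""
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    iowait = deltas[4]  # 5th field after "cpu" is iowait
    return 100.0 * iowait / total

sample_before = "cpu 1000 0 500 8000 200 0 0 0 0 0"
sample_after  = "cpu 1100 0 550 8500 700 0 0 0 0 0"
print(f"{iowait_percent(sample_before, sample_after):.1f}% iowait")
# -> 43.5% iowait: a node spending that much time blocked on disk
#    cannot validate blocks at the pace of the fastest nodes.
```

In practice you would take the two samples a few seconds apart on a live node; anything persistently high here points at the storage layer rather than CPU or network.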
I’ve recently seen a clear hierarchy in how stacked nodes perform based on the quality of the underlying storage hardware. The catch is that the existing hardware used to be fine: when there were fewer nodes and fewer transactions per block, dividing an instance worked. Recent increases in traffic now overwhelm that division.
Dividing an instance is no longer possible on cloud block storage
It just isn’t fast enough. Even at the maximum speed of available block storage on AWS – GP3 SSD with 16,000 IOPS – if you stack too many Pocket nodes on an instance using compose or Kubernetes, those nodes are likely to fall behind and get jailed. I’m seeing this behavior now on the AWS instances I manage that use block storage.
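As a back-of-the-envelope illustration, here is a short Python sketch dividing GP3’s 16,000 IOPS ceiling evenly across stacked nodes. The per-node IOPS demand figure is an assumption for illustration only, not a measured Pocket requirement:

```python
# Back-of-the-envelope sketch: how a GP3 volume's 16,000 IOPS ceiling
# divides up when Pocket nodes are stacked on one instance.

GP3_MAX_IOPS = 16_000

def per_node_iops(total_iops: int, stacked_nodes: int) -> float:
    """IOPS available to each node when they share one volume evenly."""
    return total_iops / stacked_nodes

# Hypothetical assumption: each node needs ~2,500 IOPS at peak.
NEEDED_PER_NODE = 2_500

for n in (4, 6, 8, 10):
    share = per_node_iops(GP3_MAX_IOPS, n)
    status = "ok" if share >= NEEDED_PER_NODE else "falls behind"
    print(f"{n:2d} nodes -> {share:,.0f} IOPS each ({status})")
```

Under that assumption the arithmetic turns over somewhere between six and eight stacked nodes; whatever the real per-node demand is, the ceiling is fixed and the share only shrinks as you stack.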
Instances using attached NVMes – the *d classes – are still fine. Very few people use these because the storage size is not elastic.
One method to distribute the disk speed requirements is to give each Pocket node its own volume rather than sharing the IOPS of one. Make sure to use a volume type, like GP3, where you can specify the IOPS of the volume.
The hierarchy of nodes: the ALPHA NODE
In all distributed systems attempting to reach consensus, the pace of the network is set by the fastest nodes. In the past with Pocket, the slowest nodes were still able to stay within a certain delta of the fastest nodes. Let’s call that the blockValidationTimeDelta. Back then, the gap between when the fastest and slowest nodes voted was a small delta of xx.
That is no longer the case. The required blockValidationTimeDelta hasn’t changed – the core protocol still requires all nodes to vote within a certain time frame. However, the ability of the slowest nodes to keep up with that delta has diminished.
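The idea can be sketched as a simple check: a node stays in consensus only while its validation time lands within the delta of the fastest (alpha) node. All numbers below are invented for illustration:

```python
# Minimal sketch of the idea behind blockValidationTimeDelta: a node
# stays healthy only while its block validation time lands within a
# fixed window of the fastest (alpha) node. All numbers are invented.

def within_delta(node_seconds: float, alpha_seconds: float,
                 max_delta_seconds: float) -> bool:
    """True if this node votes close enough behind the fastest node."""
    return node_seconds - alpha_seconds <= max_delta_seconds

alpha = 2.0       # hypothetical fastest-node validation time
max_delta = 5.0   # hypothetical protocol window

for node in (3.5, 6.0, 9.0):
    verdict = ("in consensus" if within_delta(node, alpha, max_delta)
               else "at risk of jailing")
    print(f"node at {node:.1f}s: {verdict}")
```

Note what the sketch makes explicit: the window is anchored to the alpha node, so when the fastest nodes get faster, a slow node can fall out of consensus without getting any slower itself.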
You can see the evidence of this variation in the volatility of the jailing rates of the largest node providers.
The jailing rate tracks the underlying resources given to each node. The best-performing nodes with the greatest hardware resources sit at a 0 to 1% jailing rate, while nodes given fewer resources experience correspondingly higher rates.
Nuanced differences in hardware
In my Nachoracks range of nodes, my dedicated server provider unfortunately gave me a mix of SSDs even though the machines were meant to be identical. A subset of the SSD nodes kept getting jailed while the others were very profitable. It was inexplicable, since this was all supposed to be identical hardware sharing the same racks and set up the exact same way using Ansible.
I finally found the difference using hwinfo: it was not the brand of the SSD but the model. All servers using the consumer-level Samsung 970 EVO SSD were unable to keep up. All servers using the much better provisioned Samsung 980 Pro were stable and profitable.
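Here is a quick Python sketch of the kind of inventory check that surfaces this split: group servers by the SSD model string that tools like hwinfo report. The hostnames and model lines below are invented for illustration:

```python
# Sketch: grouping servers by SSD model to spot a 970 EVO / 980 Pro
# split like the one described above. The inventory lines mimic the
# "Model" field hwinfo reports; hostnames and values are invented.

inventory = {
    "rack-01": 'Model: "Samsung SSD 980 PRO 1TB"',
    "rack-02": 'Model: "Samsung SSD 970 EVO 1TB"',
    "rack-03": 'Model: "Samsung SSD 980 PRO 1TB"',
    "rack-04": 'Model: "Samsung SSD 970 EVO 1TB"',
}

def group_by_model(inv: dict) -> dict:
    """Map each SSD model string to the hosts that carry it."""
    groups: dict = {}
    for host, line in inv.items():
        model = line.split('"')[1]  # text between the quotes
        groups.setdefault(model, []).append(host)
    return groups

for model, hosts in sorted(group_by_model(inventory).items()):
    print(f"{model}: {', '.join(sorted(hosts))}")
```

Cross-referencing a grouping like this against jailing rates is what turned "inexplicable" into "obvious" in my case.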
What can be done by node runners
Node Runners need to get back within the delta required by the alpha nodes, or they will continue to be jailed. For the nodes I manage for the Pocket Foundation, this has meant getting all of the nodes off of AWS block storage and onto rack providers with NVMes. This is especially important for the Foundation, as it maintains the nodes that power the Portal API’s dispatch.
I urge Node Runners to examine their underlying disk speed and storage hardware, and not to assume that the fastest storage a cloud provider can offer is enough.
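As a starting point for that examination, here is a rough Python sketch that estimates sustained write IOPS using fsync’d 4 KiB writes. It is a crude stand-in for a proper benchmark tool such as fio; treat its numbers as ballpark only:

```python
# Very rough sketch: estimate sustained write IOPS with fsync'd
# 4 KiB writes to a temp file. Not a substitute for fio, but enough
# to expose an order-of-magnitude gap between two storage tiers.

import os
import tempfile
import time

def rough_write_iops(seconds: float = 1.0) -> float:
    """Count fsync'd 4 KiB writes completed per second in a temp file."""
    block = b"\0" * 4096
    ops = 0
    with tempfile.NamedTemporaryFile() as f:
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the write through to the device
            ops += 1
    return ops / seconds

print(f"~{rough_write_iops(0.5):,.0f} fsync'd write IOPS")
```

Run it on the volume your Pocket data directory lives on. The absolute number matters less than the comparison: the same sketch on a consumer SSD versus a well-provisioned NVMe will make the hierarchy described above visible in seconds.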