Attributes
-
Author(s):
@addison Thunderhead Founder (longtime community member, ex-OTC exchange, institutional and liquid staking)
@Poktblade (Erick Ho) Lead Software Engineer. Contributor to Pocket’s ecosystem through PoktFund.
@Poktdachi (Tommy Ho) Software Engineer with experience building infrastructure across big tech. Now contributing through PoktFund. Last seen at Infracon in the pool.
@pierre - The dev behind pokt.watch and thunderpokt.fi
-
Recipient(s): The funds would be dispersed as follows upon completion of a production ready client.
47.5% Thunderhead
47.5% PoktFund
5% to the protocol team as they are starting to assist with QA
-
Category: Reimbursement
-
Fulfills:
There are several posts on Discord and Telegram expressing concerns about resource utilization as well.
-
Asking Amount: We are asking for $2m USD in Pocket upon completion. The funds would be staked with a 3-month cliff and a 15-month linear vest, as we are committed to the growth of Pocket.
We understand the community may think this is an expensive proposition; however, the break-even after adoption of this client is ~1 month.
Summary
A major topic of discussion lately has been the cost of the network, and ways to optimize it.
Our two teams, PoktFund and Thunderhead, have been working independently on client optimizations since February and have created a client that could cut network costs by an order of magnitude. (See: our original idea). Although the protocol team's outlook on the optimization was grim, both teams decided it was feasible and worth the research. In May, we joined efforts and collaborated to deliver an optimization, now called LeanPocket, that will reduce infrastructure costs by tens of millions of USD per annum. At Infracon, we recently revealed a first glimpse of our progress, and this proposal now serves as a discussion ground for community members to review the work and voice their opinions on retroactive funding of this innovation. We are continuing development and working towards a complete, production-ready client.
Abstract
After the two teams identified and researched potential solutions individually in February, we decided to team up and tackle this effort in early May. The core team told us this implementation would be near-impossible, so we decided it would be best to join forces to reduce duplication and drive a community collaboration. We’ve been working hand in hand every day since then.
How it works
Context: The current state of Pocket requires you to spin up one Pocket Core Client per servicer or validator. This client is in charge of syncing the entire blockchain (currently ~240-300GB), validating blocks and transactions, listening to validator gossip, block syncs, and much more. As the network continued to grow, more Pocket Core Clients were spun up. This dramatically increased the network’s infrastructure cost unnecessarily because servicers do not require the same amount of hardware resources as a validator.
As outlined in v0 optimization, we saw an opportunity to improve the client’s code and significantly decrease infra costs. There are many possible architectural designs for improving the client, and we brainstormed several of them, as shown here in our initial design doc. We ultimately chose a minimally invasive change, where servicers use the existing Pocket Core functions to handle relays and to submit claims and proofs. All users need to do to get started is add the servicers’ private keys to a file. LeanPocket runs in exactly the same way as the existing client, just with an additional keyfile.
The light client is a large optimization to the Pocket Core client code that allows multiple servicers to use one full node. Servicers can now share the same state cache and blockchain data, and block validation no longer has to be repeated for every servicer as the number of servicers grows. This reduces the resources needed for “n” nodes to a constant: from O(N) to O(1) for memory, storage, I/O, and networking. Essentially, the resources needed for a servicer converge to nothing more than a reverse proxy server sending requests to a chain node. The light client does not require any modifications to the economics of the protocol. Immediately after adoption, the network becomes orders of magnitude more efficient and cheaper to run.
Since multiple servicers can run on one LeanPocket client, there is no need to scale horizontally. You need one copy of the blockchain for n servicers.
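The storage side of this claim can be illustrated with a rough model. This is only a sketch: the function names are hypothetical, and the 300GB figure is simply the upper end of the chain size quoted above, not a measured value for any particular deployment.

```python
# Hypothetical storage model, for illustration only.
# 300 GB is the upper end of the ~240-300GB chain size quoted in this post.
CHAIN_DATA_GB = 300


def storage_one_client_per_servicer(n_servicers: int) -> int:
    """Each servicer runs its own full client, so chain data is stored n times."""
    return n_servicers * CHAIN_DATA_GB


def storage_shared_full_node(n_servicers: int) -> int:
    """All servicers share one full node, so chain data is stored once."""
    return CHAIN_DATA_GB if n_servicers > 0 else 0


for n in (1, 50, 400):
    print(
        f"{n:>3} servicers: "
        f"{storage_one_client_per_servicer(n):>7} GB separate vs "
        f"{storage_shared_full_node(n)} GB shared"
    )
```

The first function grows linearly with the number of servicers, while the second stays constant, which is the O(N) to O(1) change described above.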
Adding servicers to the LeanPocket client increases RAM usage only marginally.
This is seen on mainnet as well. The following chart shows the memory usage of a LeanPocket client with 5 servicers running on it (red) and a full node (blue). There is no demonstrable difference in memory usage between the two.
The following chart shows the CPU usage of LeanPocket with 5 servicers (red) and a normal full node (blue). There is no demonstrable difference. We will publish more comprehensive data as we produce it.
This optimization required significant research and innovative thinking into the current architecture of the v0 client, which was not easy to do given the lack of supportive resources for external contributors.
Motivation
The network currently spends a huge amount of money to function. At the time of writing, there are 48.5k nodes on the network. At a very conservative estimate of $80 per node per month (Blockspaces and C0d3r, 43% of the network, both charge more than $200 per node per month), the network costs $3.88m a month to function. Between now and v1, the current network will conservatively spend $46-92m to function.
PoktBlade’s benchmarks are not perfect, since there are differences on mainnet, yet they are still very indicative of high performance and demonstrate the possibility of putting hundreds of servicers on a single full node. This light client significantly reduces costs for node runners, which benefits token holders and decreases the financial risk of running nodes. This is something that benefits the entire ecosystem.
Based on an AX101 server that costs ~$100 USD and allocating 20 RPS per servicer, our localnet benchmark hints that we can run approximately 50 servicers per full node at a cost of 12-13% CPU, 300GB of SSD, and 21GB of RAM. This puts us at ~200-400 servicers per server and a cost of ~$0.25 to $0.50 per node, under the worst-case scenario in which all nodes enter a session and the server can still handle all the requests (LOCAL NET TESTING). In comparison, the normal Pocket Core client would have required ~60TB to ~120TB of storage, multiple SSD drives (more IO), 4TB to 8TB of RAM, and many more CPU cores. (There are differences between a localnet and the mainnet, but this is still very good.)
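The per-node cost range above follows from dividing the server cost by the number of servicers it hosts. Here is a minimal sketch of that arithmetic. The function name is hypothetical, and reading the ~$100 server cost as a monthly figure is an assumption, made for comparison with the $80 per node per month estimate.

```python
# Illustrative arithmetic only; the inputs are the localnet estimates quoted above.
SERVER_COST_USD = 100  # approximate AX101 cost; assumed here to be per month


def cost_per_servicer(servicers_per_server: int,
                      server_cost_usd: float = SERVER_COST_USD) -> float:
    """Server cost split evenly across the servicers hosted on it."""
    return server_cost_usd / servicers_per_server


for servicers in (200, 400):
    print(f"{servicers} servicers per server: "
          f"${cost_per_servicer(servicers):.2f} per servicer")
```

As the post notes, these are localnet figures, so mainnet overhead could move the real numbers.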
Here are 10 servicers on one full node handling a total of 1k RPS on localnet.
There are other variable factors, such as server hosting costs, mainnet overhead, etc., that could skew these numbers. We are working towards gathering more production-related statistics and will post them as part of this discussion as soon as they are ready.
At our ask of $2m USD, with a conservative cost reduction of 50% and the estimate of network spend above, the ROI of this proposal is 11.5-23x. The network would break even in a little more than a month: 48.5k × $80 × 1.04 months × 0.5 ≈ $2.02m of savings. Node runners are approaching their break-even points, and this client would ensure that people never need to unstake because they can’t pay infra costs. We are approaching a critical point, and this client prevents any unstaking spiral.
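The break-even and ROI figures can be reproduced with a few lines of arithmetic. This is a sketch using only the numbers stated in this post (48.5k nodes, $80 per node per month, a 50% reduction, a $2m ask, and the $46-92m spend estimate). The variable and function names are illustrative.

```python
# Inputs taken from the figures stated in this post.
NODES = 48_500
COST_PER_NODE_USD = 80     # conservative monthly cost per node
ASK_USD = 2_000_000
REDUCTION = 0.5            # conservative cost reduction

monthly_cost = NODES * COST_PER_NODE_USD       # current network spend per month
monthly_savings = monthly_cost * REDUCTION     # spend avoided per month
break_even_months = ASK_USD / monthly_savings  # months of savings needed to cover the ask


def roi_multiple(total_spend_usd: float) -> float:
    """Savings over the period divided by the ask."""
    return total_spend_usd * REDUCTION / ASK_USD


print(f"Monthly network cost: ${monthly_cost:,}")
print(f"Break-even: {break_even_months:.2f} months")
print(f"ROI range: {roi_multiple(46_000_000):.1f}x to {roi_multiple(92_000_000):.1f}x")
```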
This graph shows the cumulative spend until v1 (18 months) with a current network cost of $80 per node per month and 48.5k nodes on the network, alongside a conservative light client cost reduction of 50% ($35m saved), a probable cost reduction of 75% ($52m saved), and an aggressive cost reduction of 90% ($63m saved).
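For readers who want to check the three scenarios, the sketch below recomputes them from the stated inputs (48.5k nodes, $80 per node per month, 18 months until v1). It assumes the node count and per-node cost stay flat over the whole period, which is a simplification, and the names are illustrative.

```python
# Inputs from the graph description; flat node count and cost are assumed.
NODES = 48_500
COST_PER_NODE_USD = 80
MONTHS_UNTIL_V1 = 18

SCENARIOS = {"conservative": 0.50, "probable": 0.75, "aggressive": 0.90}


def cumulative_savings_usd(reduction: float) -> float:
    """Baseline spend over the period multiplied by the cost reduction."""
    baseline_spend = NODES * COST_PER_NODE_USD * MONTHS_UNTIL_V1
    return baseline_spend * reduction


for name, reduction in SCENARIOS.items():
    savings_m = cumulative_savings_usd(reduction) / 1e6
    print(f"{name} ({reduction:.0%}): about ${savings_m:.1f}m saved")
```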
We will continually provide new data points as the development progresses.
Budget
The DAO administrator would disburse funds to each party after a production-ready light client is delivered. Said administrator will coordinate with each team lead for their portion of the reimbursement.
The teams will observe a 3-month vesting cliff during which all funds ($2M) will be staked on nodes.
Following the cliff, ~6.66% ($133k) will become eligible for unstaking every month for a period of 15 months (linear vest). We are committed to the success of the network, and are here for the long haul.
A breakdown of this would be as follows:
Funds are split and disbursed to each team upon completion of the light client.
Months 1 through 3: all funds are staked into nodes.
Month 4: 6.66% becomes available to be unstaked (though teams may opt to keep it in).
Months 5 through 17: same as above.
Month 18: teams have the option to unstake all funds.
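The schedule can also be written as a small function. This sketch takes the stated terms literally (a 3-month cliff followed by 15 equal monthly tranches of the $2m). Under that reading, the first tranche unlocks in month 4 and the last in month 18. The function name and the month numbering (month 1 is the first month after disbursement) are assumptions.

```python
# Vesting terms as stated in the Budget section.
TOTAL_USD = 2_000_000
CLIFF_MONTHS = 3
VEST_MONTHS = 15


def unlocked_usd(month: int) -> float:
    """Cumulative amount eligible to be unstaked by the given month."""
    if month <= CLIFF_MONTHS:
        return 0.0  # everything stays staked during the cliff
    tranches_vested = min(month - CLIFF_MONTHS, VEST_MONTHS)
    return TOTAL_USD * tranches_vested / VEST_MONTHS


for month in (3, 4, 10, 18):
    print(f"Month {month:>2}: ${unlocked_usd(month):,.0f} eligible to unstake")
```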
Rationale
Alongside the financial breakdown in the “Motivation” section, it’s important to highlight that the light client is a non-consensus-breaking solution and a very non-intrusive way for the network to be massively optimized. Other solutions, such as stake weighting or stake-minimum, have significant friction associated with node consolidation since they require an unstake. Although other solutions can be applied, the light client’s effect on the symptoms should be observed first before applying those proposals. This client leads to a drastic network cost reduction immediately after being implemented, and other economic proposals are supplementary.
Without the talent brought onto this optimization (@poktblade is a beast), this idea would never have come to light before v1. Even after Thunderhead/PoktFund identified it as a potential solution for v0, many thought it was not feasible and that we were better off waiting for v1, which is over a year out. The combined team saw things differently and, fast forward to today, has accomplished the inconceivable. We want this proposal to serve as inspiration to other engineers to collaborate, join minds, and create waves of innovation. Likewise, we would like to set the precedent that ingenuity is valued and impact is rewarded.
We are building the impossible.
Dissenting Opinions
- This is too much money for a few months of development time spent.
While this is in fact a large ask relative to other proposals, it’s important to keep that sense of relativity when we think about the impact it will have. The savings from this optimization make the reimbursement amount look small in comparison. As mentioned previously, the ROI of this endeavor is tremendous, and the impact will be immediate.
@olshansky recommended the following general flow for this development:
- The DAO provides a small grant (e.g. 1%-5%) to fund the research for this proposal. The deliverable for this grant will be a design document and QA plan.
- The community and core protocol team members review and provide feedback on the proposed changes and testing plan. This can help scope the level of changes, highlight missing gaps, and give insight into the feasibility of the proposed design.
- Assuming the review step goes well, the DAO could provide another small grant (e.g. another 1%-5%) to fund the development of a prototype.
We have completed over 66% of these recommended steps and are progressing through the last step right now.
Furthermore, we are asking for the funding to happen retroactively. We are going to get this done, and going to save the network a great deal of money in the process. The funds will also be locked for 3 months and vested linearly for another 1.25 years.
Other centralized blockchain-as-a-service providers (Infura, QuickNode, etc.) are backed by VC money and are bringing in primo developers. We need to be competitive and to encourage and incentivize more innovators to join our ecosystem.
This is what the DAO’s treasury is built for. If this software was used privately, this would cause a large imbalance to the node running ecosystem increasing centralization as some node providers would be able to offer insanely cheap prices, while others could not. This optimization could have instead been sold to a large node runner. A node provider with 10k nodes paying $80 a month would save $2m from this client in 3.3 months, making purchasing this client a very good investment for them. However, we would much rather give this to the community and allow everyone to benefit.
- This will deplete the DAO treasury too much.
We can adjust DAO allocation if this is a concern. This would push the cost back on node runners who are now saving a significant percentage of their costs from this change. With this adjustment, the DAO treasury would have earned the cost of this proposal back before the stake cliff even ended. It is possible to increase the DaoAllocation so that node runners subsidize this rather than the DAO. Node runners will have a small decrease in rewards, but a massive decrease in infrastructure costs.
- This can cause a threat to the network.
Our primary goal is safety. We are doing extensive QA to ensure that it goes smoothly and that there are no negative externalities of running this light client. The core team will support us in this QA testing. One of the ways we are doing this is working with large and small scale node runners for beta testing. We are already testing the client in both test and production environments. We are also making sure that the light client works for validation as well.
- This light client won’t actually lead to a cost reduction since there is no guarantee that providers will pass their savings onto clients.
Providers that do not pass savings onto their customers will be undercut by providers that will lower their prices drastically in order to pick up market share.
Deliverable(s)
We are ready to open source the client; however, the core team recommended that we open source it once validator functionality has been implemented. We await their approval.
Stay tuned
We already have a proof of concept running successfully on mainnet and have a localnet version capable of handling hundreds of servicers. The following is what we have planned.
In Progress right now ordered by priority
- Increasing RPS on one full node (est. 1 week)
  - Currently, the RPC of Pocket is rate limited on mainnet, which limits the number of servicers on each node. We are fixing this so that any number of servicers can be added to a full node.
- Validator functionality - allowing multiple validators on one client (est. 0.5 - 1 week)
  - We want to make sure the light client is safe for the network under any circumstance, so we are implementing validator functionality for the extra servicers. The core team gave us this idea and suggested we do this.
- QA Testing (Timeline TBD)
  - We are going to undergo extensive QA testing to make sure there are no negative externalities.
Completed
- Proof of concept
  - We’ve built a client that allows for multiple servicers under one full node, and it is running successfully on mainnet. We are ready to open source it.
- Validator design doc
- Localnet testing with 30 servicers on one full node.
  - We’ve completed localnet benchmarking with 30 servicers on one full node (~95% cost reduction), with indications that it’s possible to handle 10x that.
Final Deliverable: Finalized light client (The three month cliff would only start here, upon completion)
- Finalized client with multiple servicers under one full node
- Validator functionality. The additional servicers will also be able to act as validators.
- Open source with documentation
- Extensive QA and testing
Community involvement
- Commitment to support until V1
- Technical talks
- Design documents
- Inspire/encourage other collaborations
We’re going to finish building this, and so we are asking for all funding retroactively.
We love Pocket, and this is just the first of many contributions coming from partnerships between PF/TH and other community members.
Copyright
Copyright and related rights waived via CC0.