A Polite Conversation about Chain Node Pooling

Following my DAN proposal and the announcement of Community Chains, there has been a lot of conversation about chain node pooling. To get into the specifics of the pros and cons of chain node pooling, I wanted to create a dedicated topic and provide the history of the concept as I’ve experienced it since joining POKT in 2019.

However, with the announcement of CC, there has been a lot of FUD going around about chain node pooling (and especially the existence of CC). While all comments and feedback are welcome, let us not stoop to projecting nefarious motives on folks and instead focus on a productive discussion. I’ve been rather shocked with the lengths to which folks have been willing to project on our CC solution, when its development has been openly talked about since November in Node Runner Community Calls.

I believe most of the aggression is unfounded as many don’t understand the history behind the concept, so I wanted to share what I’ve experienced and open up a space for all to discuss.

Background

Chain node sharing has always been a topic POKT folks discuss. The idea of reducing costs makes sense in any business or ecosystem, and pooling chain nodes to reduce costs is just basic logic with POKT economics.

Since the launch of mainnet in August 2020 and up until Q3 2021, POKT primarily had traffic on ETH as few chains were actively supported. Chain infrastructure costs were very low for providers. The introduction of Settlers of New Chains opened the door to adding new chains and incentivizing node runners to spin up chain nodes with a reasonable assurance they would get some rewards.

3x Infrastructure Costs

With the increase in new chains being added to POKT, the cost of running chain infrastructure began to increase. Despite the increasing chain infrastructure costs, at the time you could easily run POKT nodes at a profit. However, even at that time I still heard of folks who had begun chain node sharing.

In Q3 2022 the Reward/Cost ratio started to fall dramatically as some providers created their own closed-source clients to do geo-meshing. Fortunately POKTScan released an open-source version that enabled anyone to participate (see the history of that here), but that meant that in order to have network average rewards, you now have to deploy chain infrastructure in 3 regions.

The math is simple in regards to what is required to be profitable:

Q2 2022 = 1 POKT node required access to 1x chain nodes

Q3 2022 = 1 POKT node requires access to 3x chain nodes (as it needs chain node access in each region)
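
As a rough illustration of the arithmetic above, the sketch below counts the chain nodes one POKT node needs access to before and after geo-meshing. The function name and the per-node monthly cost are hypothetical placeholders, not figures from this post; only the 1x versus 3x ratio comes from the text.

```python
def chain_nodes_required(chains_served: int, regions: int) -> int:
    """Chain nodes one POKT node needs access to: one per chain per region."""
    if chains_served < 0 or regions < 1:
        raise ValueError("chains_served must be >= 0 and regions must be >= 1")
    return chains_served * regions


# Hypothetical monthly cost of one chain node in USD (assumption, illustration only).
COST_PER_CHAIN_NODE = 100.0

before = chain_nodes_required(chains_served=1, regions=1)  # Q2 2022: one region
after = chain_nodes_required(chains_served=1, regions=3)   # Q3 2022: three regions

# The per-node cost cancels out, so the multiplier depends only on the region count.
cost_multiplier = (after * COST_PER_CHAIN_NODE) / (before * COST_PER_CHAIN_NODE)
print(before, after, cost_multiplier)
```

Whatever a single chain node actually costs, the multiplier tracks the number of regions, which is where the 3x comes from.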

That 3x in infra costs for 1 POKT node crushed the independent node runner market. No one today can profitably run nodes without having chain infrastructure in 3 regions. POKT has all but lost its independent node running community (folks running nodes with their own POKT, not a provider running other folks’ nodes) with this extreme rise in infra costs.

LeanPOKT reduced the cost of POKT nodes dramatically (kudos to POKTFund and ThunderHead), while at the same time we saw a 3x increase in chain infrastructure costs with the introduction of closed-source geo-meshing (kudos to those providers earning 2x to 3x the rewards last summer). Had geo-meshing been openly discussed prior to development, then there could have been region locking brought on-chain to protect against increasing chain infrastructure costs (as @RawthiL suggested for v1), or it could have been EASILY enacted via the Portal (which could still be done today, but I think “some” providers may not like that :stuck_out_tongue_winking_eye: ).

NOTE: This post IS NOT about debating the ethics or economics of geo-meshing, or trying to directly reverse its impact. This post is in fact about chain node pooling ONLY and I mention geo-meshing only for the purpose of showing its impact on infrastructure costs. Please direct comments about geo-meshing history to another post. Thank-you :pray:

Current Over-Provisioning

With 3x the infrastructure being deployed on POKT came extreme over-provisioning just to be profitable. As I mentioned in DAN, only 6 well-run Ethereum Erigon nodes are technically needed to handle all of POKT’s ETH traffic, yet every node runner needs Ethereum infra in every region, adding a LOT of chain over-provisioning. How many ETH nodes do folks think are within the POKT ecosystem? 50+ makes sense to me when you look at all the providers.

These nodes require POKT liquidation to pay for the infrastructure. I don’t know one bare-metal hosting company that accepts POKT as a form of payment.

Chain Node Pooling Today

PNI first started talking about chain node pooling at the POKT State of Union this past November. They explained it as a way to reduce infrastructure costs, allow there to be more POKT node owners, and ensure the best network-wide QoS. It was referenced and mentioned by 3 or 4 PNI presenters and was said to be a big focus for 2023. More recently Michael talked extensively about measures they are considering to reduce chain nodes on POKT.

Beyond that, chain node pooling is already happening today, though it is not being transparently broadcast. I’m aware of a number of providers that either share chain nodes, or provide chain node access to other select providers. It’s a closed club.

It is also frequently talked about in Node Runner Community Calls as a way to enable more folks to be POKT node owners. The topic really picked up in Q3 2022 once folks heard about geo-meshing from POKTScan. We in fact started building CC in October and first shared about it in November. Since then we have been giving updates on most of the Node Runners Community Calls.

Who’s More Dangerous… POKT Providers or Chain Pooling :crossed_swords: ?


OK, let’s address the FUD that has been going around with logic :point_up:

The $500k Problem

FACT: POKT is dominated by hosting providers and it is nearly impossible for anyone with less than $500k of POKT to generate any kind of “profit” on POKT as an independent node runner.

Recently in a Node Runner’s Community Call, a successful node runner mentioned that their infra costs are around $10k a month. They run all bare-metal and their costs do track with other providers. Sure, I do know there are some VERY SAVVY node runners out there that run for less, but those are the outliers and they are dedicated to their trade (kudos to them :saluting_face:).

Using the $10k expense, here is a calculation of the amount of POKT one must stake to break even, and the amount required to make 5% APY under today’s conditions (not accounting for SER reducing rewards).

TLDR: Someone would need to stake $631,512 worth of POKT to get 5% APY :man_facepalming:
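
The calculation itself was shared as an image, so the sketch below only reconstructs the shape of the arithmetic. It assumes a fixed infra cost of $10k per month and a gross staking reward rate of about 24% of staked value per year. That rate is not stated anywhere in this post; it is simply the value that makes break-even land near $500k and 5% APY land near the figure quoted above.

```python
def required_stake(monthly_cost_usd: float,
                   gross_reward_rate: float,
                   target_net_apy: float = 0.0) -> float:
    """USD value that must be staked so that rewards minus infra cost hit the target.

    Solves  gross_reward_rate * stake - annual_cost = target_net_apy * stake
    for stake, i.e.  stake = annual_cost / (gross_reward_rate - target_net_apy).
    """
    annual_cost = monthly_cost_usd * 12
    margin = gross_reward_rate - target_net_apy
    if margin <= 0:
        raise ValueError("target APY must be below the gross reward rate")
    return annual_cost / margin


MONTHLY_COST = 10_000       # from the post
ASSUMED_GROSS_RATE = 0.24   # assumption, back-derived; not a figure from the post

break_even = required_stake(MONTHLY_COST, ASSUMED_GROSS_RATE)
five_pct = required_stake(MONTHLY_COST, ASSUMED_GROSS_RATE, target_net_apy=0.05)

print(f"break even: ${break_even:,.0f}")
print(f"5% APY:     ${five_pct:,.0f}")
```

With these assumed inputs the result is roughly $500k to break even and roughly $632k for 5% APY, close to but not identical to the $631,512 above, so the original inputs presumably differ slightly.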

Because of this, this is today’s reality:

POKT = POKT Hosting Providers and POKT Hosting Providers = POKT

These services are the ONLY way to participate in POKT if you do NOT have $631k + a high level of technical knowledge and a healthy appetite for pain.

So what options should there be until v1? Currently there are only POKT Hosting Providers, and CC is now the first to offer distributed chain node pooling. However some folks seem intent on sowing doubt, so let’s get down to the details.

Centralized POKT Nodes or Centralized Chain Nodes?

How do they compare?

| Features | POKT Provider | Chain Node Pooling |
| --- | --- | --- |
| Own chain nodes | :x: | :x: |
| Own POKT nodes | :x: | :white_check_mark: |
| Can deploy your own chains | :x: | :white_check_mark: |
| Choose chains to serve | :x: | :white_check_mark: |
| Validator voting | :x: | :white_check_mark: |

Chain node pooling clearly has advantages, and in every way aligns more with POKT’s ethos. It aligns with POKT ethically and economically. While I have been perfectly content with the existence of POKT Providers, I do not get the arguments that call chain node pooling “dangerous”.

Legit Conversation

Now, let us have a legit conversation. I have responded to every meaningful remark in the DAN proposal, and yet the same straw man arguments keep coming up. Since the FUD is being directed at CC and not at the chain node pooling concept itself, I’d like to separate the two and hear some logical arguments about their pros and cons. This way we can know as a community if folks have an issue with chain node pooling itself.

I’ve been making free node software for the POKT community since 2020, and will be happy to lend my experience in the independent node running space to this conversation :point_down:

4 Likes

After much discussion in Discord (in the #node-chat channel), I decided to do some napkin math to show what the savings could be for small to medium size node runners.

As you can see, the economics of chain node pooling isn’t just about saving costs for independent node runners, it helps everyone, depending on how they choose to do it.

Small node runners lose money on basically every chain. However, once you get to medium and large node runners, there is more profit in running some RelayChainIDs and pooling for others.
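
The napkin math was posted as an image, so here is a minimal sketch of the per-chain comparison it describes. Every number and chain name below is hypothetical. The 15% pooling fee is the starting figure discussed later in this thread, and the sketch assumes pooling earns the same rewards as self-hosting, which may not hold in practice.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    relay_chain_id: str               # placeholder label, not a real RelayChainID
    monthly_rewards_usd: float        # hypothetical rewards earned on this chain
    monthly_self_host_cost_usd: float # hypothetical cost of running it yourself


def plan(chains, pooling_fee_rate=0.15):
    """Pick, per chain, whichever option leaves more monthly profit."""
    decisions = {}
    for c in chains:
        self_host_profit = c.monthly_rewards_usd - c.monthly_self_host_cost_usd
        pool_profit = c.monthly_rewards_usd * (1 - pooling_fee_rate)
        decisions[c.relay_chain_id] = (
            "self-host" if self_host_profit > pool_profit else "pool"
        )
    return decisions


# Same chains and costs, different reward volumes (all hypothetical).
small_runner = [Chain("chain-a", 200, 300), Chain("chain-b", 50, 300)]
large_runner = [Chain("chain-a", 5000, 300), Chain("chain-b", 500, 300)]

print(plan(small_runner))
print(plan(large_runner))
```

In this toy setup the small runner pools everything, while the large runner self-hosts the high-volume chain and pools the low-volume one, which is the pattern described above.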

2 Likes

So we would be saving $2k. Thank you for putting this together.

2 Likes

Great post, I agree that there is too much FUD around this issue. Pocket should be strong enough to beat any kind of competition, even if CC becomes a competitor (which I don’t think it will be, I view them as different things).

Could you share the sheets used for calculations? (I’m interested in looking closer at those numbers)

5 Likes

This was always my understanding of the intention behind CC. I would even argue that these conversations started as early as Infracon in May 2022. I spoke to many node providers about the general concept as it related to the scalability of the TriForce program.

Thanks for adding more context about the thinking behind CC and appreciate you taking the time to separate the conversation from DAN.

4 Likes

Thanks @shane for putting this thread together. I can’t speak to the other example given, but seeing @poktblade being cited as a source of FUD is rather surprising. He is pretty much the last person in the ecosystem I would ever accuse of spreading FUD.

It will be interesting to see if any arguments against the chain node pooling concept itself are teased out of the woodwork by this thread. I have never once seen or heard a single argument proffered by anyone in the community against chain pooling per se. As you say, all the comments in the DAN proposal were regarding using DAO funds to compensate DA/CC as the sole or preferred solution for altruist/rare chains. As such, they were completely on topic, and do not constitute FUD. They do not even criticize the CC solution; if anything they praise it. They simply cast doubt on whether it is prudent for DAO to fund it. I’m sure the same type of comments would have materialized if a single node provider had proposed to be the sole or preferred recipient to receive DAO funding to provide altruist and/or rare-chain support. I’m not taking sides or saying I agree with the arguments… just clarifying that the comments, IMO, were on topic to the DAN proposal, not FUD, and had nothing to do with chain node pooling per se.

That being said, since you bring up the topic of chain node pooling…

I cannot separate the discussion of chain pooling from the discussion of geo-mesh and lean-pocket. All three go hand in hand. I am fundamentally against all three. All three are examples of the way network suppliers have manipulated weaknesses in the protocol to gamify rewards away from the original intention of rewards proportional to the infrastructure contributed to the network. All three are viruses that threatened the survival of Pocket Network and for which POKT had to adapt/evolve in order to survive. Network providers did nothing malicious in this gamification of rewards. The game always proceeds according to the incentive structure of the actual ruleset of the game, not according to the original intent of the creators of the game.

The human system is wonderfully adept at surviving the onslaught of viruses, using a two-pronged approach: (1) fight against and eradicate the virus, if possible. Many viruses can be eradicated before they can take over the body. Chocolate Rain is an example of this. Hackathons and bug bounties are focused on this prong of the fight against viruses. (2) for viruses that cannot be eradicated, the body shifts to a strategy of “if you cannot beat them, join them”. Code from the virus gets added to the human DNA and shifts from being a “disease” to making the body more robust. Likewise, the only way to survive an unintended gamification of rewards that cannot be prevented through a security patch is to incorporate it as a full-fledged feature of the game.

Thus:
While I am against sw mirrors of full POKT nodes that manipulate the portal into directing many multiples more than “fair share” of servicing opportunities to the full node, I am a huge fan of making this open source so as to level the playing field for everyone. Without open sourcing this, the survival of a decentralized network was at stake. This is why I am a big advocate of making sure poktfund and TH get full reimbursement for their work on Lean Pocket. The reimbursement is in line with bug bounties for high/critical-level bugs… but where the solution was not a security patch but an incorporation of the “virus” and making it a feature of the game.

Ditto with geo-mesh. Which is again why I am a big advocate of making sure poktscan gets full reimbursement for making this open source.

With chain pooling, it is a little bit different, in that “leveling the playing field” is not a matter of open-sourcing some software. Other means are needed to level the playing field. Which is why I really like what DA has done with CC. I can best state it like this: I am against chain pooling, but given intra-provider chain pooling, I am a fan of inter-provider chain pooling. At the end of the day, this last sentence is just a different way of saying what @shane is trying to communicate in his comparison table above.

Where does this leave us: let the game proceed as is through the duration of v0. However, the lessons learned during v0 must be incorporated now into v1 planning. We must be mindful and anticipatory of what behavior the v1 ruleset incentivizes, and then tweak the v1 ruleset as needed to align anticipated likely behavior with desired behavior. This should be done proactively. This is one of my major research areas during this upcoming season, and I know poktscan also has already put considerable thought into this as well, some of which they capture in their research thread:

3 Likes

Yeah, this is correct. The quote is missing context, as I was being the devil’s advocate and bringing in scenarios for the DAO to consider what a product like CC providing service to the DAO actually means, and what the different scenarios can be (e.g., replace CC with other providers).

I actually end the quoted screenshot with:

This service has the opportunity to drive revenue in so many different directions. In fact, in a free market, there is not much we can do about it, and I have nothing against it.

The ideal scenario IMO is that this isn’t funded by the DAO given the nature of the product but rather funded by PNI. This also removes any restrictions from your service to monetize as it sees fit.

Anyhow, just clearing my name of any misconceptions. Community chains is welcomed in my book.

3 Likes

The ideal scenario IMO is that this isn’t funded by the DAO given the nature of the product but rather funded by PNI. This also removes any restrictions from your service to monetize as it sees fit.

Key point not included that should be reiterated in the body of this post IMO. The dialogues about “bypassing the network” have muddled this important question resurfaced by @msa6867: who is responsible for funding the altruist network?

1 Like

With all due respect, this question would be best suited for DAN, as that is where it has been widely debated :slightly_smiling_face:

I appreciate focusing this conversation on chain node pooling specifically, completely separate from the altruist network.

If there was a misconception, it has unfortunately been over a week without a clarification, so I leaned on the natural interpretation. Thank-you for now clarifying that you were being devil’s advocate and not representing your own concern as the comment stated.

I appreciate that. Thank-you.

Focusing on chain node pooling:

Much thanks :slightly_smiling_face:

Completely agree. I’m excited about the conversation @RawthiL has started regarding v1 staking design.

2 Likes

Could you share the sheets used for calculations? (I’m interested in looking closer at those numbers)

I’m preparing a public version :+1:

1 Like

I will refrain from commenting on the subject itself as I am not qualified to do so.

I want to highlight that @shane’s professionalism and his demeanour in the forum are unmatched.

Those should be embraced by and captured in Pocket DNA.

I don’t know @shane at all and have not DM’d even once.

This is purely based on my observation in forum.

Cc @b3n

6 Likes

That is a compliment of the highest order. Thank-you.

3 Likes

(Apologies, my username appears to be out of sync here on the forums, I’m ‘call_me_al’ in Discord)

I’m bringing some of my points from Discord here as it seems the more appropriate venue for discussion. I’m very interested in seeing the public version of these tables, and will absolutely read any new information if it’s presented there, but after looking at that screenshot a little more, I have fundamental disagreements with the assumptions made to generate those costs, and thus the conclusions that they drive.

Mainly - none of those costs include labor. Please correct me if I’m wrong, but the “Tri Region cost per month” seems like exclusively server hardware cost - leasing or amortized purchase cost+colo. First of all, if that’s the case, it is fairly aggressive for three regions for N+1 (I hope this represents at least 6 total sources for each chain), and there’s 0 accounting for ancillary costs (non-chain infrastructure, like automation servers, monitoring servers, log aggregation servers, communication platforms, etc.). I operate managed backend chain nodes professionally, and I can tell you that doing it right involves more hard costs than just chain servers, and the nature of blockchains means you need more redundancy than you might in other workloads (to provide a service others can rely on).

Secondly, my main point - labor costs. Setting up servers and chains can be helped by modern automation tools (that take time to set up initially and update over time), but ongoing care and feeding is absolutely a requirement that takes (expensive) engineer hours. As anyone who has run a blockchain node where the performance and sync status is being monitored can tell you, nothing is “set it and forget it”. Again, tools and modern systems can go a long way to get rid of the repetitive work, but the nature of these kinds of operations means there will always be some amount of toil required.

For example, if you are providing a chain node service professionally, and you have everything all set up and running and as lean as possible, you still need an engineer on-call 24x7x365. You can get away with some on-call deadspots in the middle of the night if you architect your system with more redundancy (so definitely n+1, probably n+2 for some chains), but you still need a skilled human ready to respond to alerts (from the monitoring system you set up with time and keep running with money). Chains never stop, there is no “downtime”, and realistically you need multiple resources to split because human beings are not good at being “on” all the time.

So basically, I am arguing that the costs presented are not accurate, and taking it a step further, my thesis is that the price is low enough that it will create an obvious financial decision where it makes more sense for ALL large runners to use this service rather than run chains themselves. I’m going to copy my comments from the Discord thread here because they are relevant:

I’d like to discuss the ‘it will cost 15% of rewards generated’ cost of the service and learn more about how that number was arrived at. Rather than dance around my point, here’s the fear - this service is priced so far below the actual cost-per-relay of some specific chains, that the smart economic choice will be to ALWAYS use CC. For large pokt staking providers in a post-lean world, backend chain hardware and management is probably the single biggest expense, followed by human operations, and only distantly followed by the cost of running the pokt daemon and infrastructure itself. Given that, if an option exists where they can generate at least 85% of the relays as they would running it themselves, and they know that their total chain running costs are >15% of their relay-revenue, why wouldn’t they make the rational choice of using CC for all traffic? So arguably this is a good thing and the idea, removing waste and opening up opportunities to runners who may not be able to justify it themselves. My nightmare scenario comes in here when the large providers also start doing the math and jumping on board. If one of them decides that the costs of running their own outweigh the 15% ‘opportunity cost’ of running it themselves, their rational choice will be to use CC as well. In the current economic and tokenomic climate, everyone is looking to optimize expenses to make the most out of the revenue they are pulling in, and this seems like an inevitable choice that will be made.

Now - we’ve got a ton of community traffic riding on the CC infrastructure, everyone is pulling in relays, and the people providing the backends to CC will be doing the same arithmetic in their heads. They may make the decision to remove themselves from CC if the additional traffic incurs additional costs above and beyond the 15% (which is always a risk with a community/open model, but not insurmountable), but more worryingly, what if the providers also realize that it will be overall more profitable for them to stop providing and start consuming CC? We might put ourselves into short term infrastructure spirals where we as a network hyper-consolidate in different regions for different chains down to JUST the remaining CC providers, only to have to rapidly re-expand if there is a capacity or availability issue, all while showing ‘healthy’ stake numbers from a network-perspective.

The crux of my argument is this - “The price per relay should be managed such that it provides room for running chains yourself (when taking into account your own time), and not low enough that it is impossible to match, even with self-hosted, $0/hr labor, bare metal.” My opinion is that 15% is too low - speaking as a node runner myself, my own expenses are MUCH more lopsided to the backend chains themselves, and if I was exclusively trying to optimize operations for profit, I would never choose to run any backend chains myself and always use CC or a CC-like service.
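
To make the pricing argument concrete, here is a small sketch of the break-even fee: the share of relay revenue at which paying a pooling service costs the same as running the chains yourself. All dollar amounts, hours, and parameter names are hypothetical; only the 15% figure comes from this thread.

```python
def break_even_fee_rate(monthly_relay_revenue_usd: float,
                        hardware_usd: float,
                        ancillary_usd: float = 0.0,
                        labor_hours: float = 0.0,
                        labor_rate_usd_per_hour: float = 0.0) -> float:
    """Fraction of relay revenue spent on self-hosting chains.

    A pooling fee below this fraction makes pooling the cheaper option.
    """
    if monthly_relay_revenue_usd <= 0:
        raise ValueError("monthly_relay_revenue_usd must be positive")
    total_cost = (hardware_usd + ancillary_usd
                  + labor_hours * labor_rate_usd_per_hour)
    return total_cost / monthly_relay_revenue_usd


POOLING_FEE = 0.15  # starting figure discussed in this thread

# Hypothetical runner: $10k/month relay revenue, $2k/month chain hardware.
hardware_only = break_even_fee_rate(10_000, hardware_usd=2_000)

# Same runner, now counting monitoring/automation servers and engineer time.
fully_loaded = break_even_fee_rate(10_000, hardware_usd=2_000, ancillary_usd=500,
                                   labor_hours=40, labor_rate_usd_per_hour=50)

print(hardware_only, fully_loaded, POOLING_FEE < hardware_only)
```

For this made-up runner the fee is already below the hardware-only break-even, and the gap widens once labor and ancillary costs are counted, which is the scenario the argument above is worried about.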

7 Likes

This is in fact my fear too. I could not word it better and the cost argument is really accurate.
I like the general idea of CC, it is great, but this centralization spiral that @aos_bs talks about is something that we should watch closely.

4 Likes

Here is a public version of the spreadsheet with all the info to get an idea of cost saving with the 15% we are looking to launch with, though nothing with the price is locked in, other than what the DAO will pay through DAN.

CC @aos_bs @RawthiL

CC includes a number of providers. I’ve talked to all provider heads and 15% was established simply as a starting point to get this off the ground. It is by no means locked in :sweat_smile:

You make great points regarding the factors that play into pricing. It’s in the best interest of everyone involved in CC to make sure that the core economics are not neglected, leading to the spiral you mentioned. As the developers, DA works closely with providers and it is in our best interest to ensure they are profitable partners.

Just like all product launches, we have a starting point with the top 14 chains as an MVP, and will ensure the economics for all parties are addressed. We are in this for long term sustainability :+1:

4 Likes

Thank you for providing that spreadsheet.

That seems to line up with what I thought before - none of the costs that went into this pricing model have labor or ancillary running costs included. Is that an accurate assessment?

3 Likes

You are correct that we haven’t quantified labor specifically, though that is hard to fully quantify as right now it would all be in theoretical terms. We plan to use the market itself to help define the value that accounts for the ancillary costs which are more challenging to quantify. The market itself will be a better quantifier of value IMO and we are going to be quick to make adjustments.

The prime objective is to ensure sustainable economics, so you can be sure we will be vigilant to ensure our node operators are taken care of :+1:

2 Likes

I have a question about relay handling.

Suppose that there is some chain that only two small node runners support, these NRs are also connected to CC, which provides service for all other nodes that want to consume it.
Now, suppose that chain starts to have more demand. More nodes join through CC to get access to this chain.
The two small node runners have a very low chance of being selected in the Pocket Network (since they are small). Most of their gains will start to flow from CC.
This is where it can get problematic. I can think of a couple situations:

  1. Both node runners earn more and they are happy about it :rainbow:
  2. One of those node runners starts to earn more than the other, which could be the effect of the CC selection mechanism (not the Cherry Picker). How clear is this mechanism? As this is now their main source of income, it will be looked at very closely. I don’t know if you have shared this, but it is very important to be very clear about it.
  3. Both of them talk to each other and decide to unplug CC and take all the traffic they can. If all other nodes fail (and they will without CC’s support), their nodes will absorb all the traffic.
    Syncing big blockchains can be a very slow process and surges in the Pocket ecosystems sometimes last only a few days. This will be very logical from a pure-greed mentality and will leave the network without fallback nodes. Is there any mechanism to counter this scenario?

I’m a little concerned about the edge cases where CC starts to run too much traffic for a given chain. I don’t believe that this can happen for the major chains tho, meaning that most Pocket traffic should continue as always, but I would like to know your opinion on this (the stated questions specifically).

3 Likes

This concern is misplaced. As I stated before, I have talked extensively with the providers involved in CC. CC doesn’t have a cherry picker-like gateway architecture with different levels to control traffic… All our providers are aware we use standard DNS round robin.

We chose trusted partners intentionally, one of them being POKTScan. For your scenario to play out, POKTScan would have to be a part of the sabotage, plus our three other providers. Beyond that, if one provider decided to stay on CC, they would make significantly more POKT than those that left, and they wouldn’t destroy their business’s reputation like the other providers.

Providers that act in this way would all be ruining their futures with the POKT ecosystem. If POKTScan were to be a part of this sabotage, would POKTScan customers be happy with having their POKT staked with a company like that?
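
For readers unfamiliar with the term, round robin simply hands successive requests to each backend in turn, with no scoring or tiers. The sketch below is a simplified model of that selection, not CC’s actual DNS configuration; the provider names are placeholders.

```python
from collections import Counter
from itertools import cycle, islice

# Placeholder backends (assumption); four are used only to mirror the
# "POKTScan plus three other providers" mentioned in this thread.
providers = ["provider-a", "provider-b", "provider-c", "provider-d"]


def round_robin(backends, n_requests):
    """Assign n_requests to backends in strict rotation."""
    return list(islice(cycle(backends), n_requests))


assignments = round_robin(providers, 8)
counts = Counter(assignments)

print(assignments)
print(counts)
```

Each backend ends up with the same share of requests regardless of its size or latency, which is the key difference from a quality-weighted selector like the Cherry Picker.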

You can apply this conspiratorial scenario to PNI operating the Portal or chain access services that currently operate in the background already. Both of those only require 1 party to make huge network sabotages. The difference with CC is it requires a secret cabal of MANY ecosystem partners, which provides significantly more protection than other aspects of POKT.

1 Like

This is what I was looking for (it was not stated in this thread).

So, as part of your on-boarding process you will need to know who is behind the blockchain nodes?
Otherwise you could add 3 different anon providers that are in fact the same person, and CC will be exposed to this “coordinated” attack.

We do not run all the chains that you will be supporting, so this scenario can happen without us being part of it.

I would really like to think that all customers weigh reputation and community involvement when choosing a provider, but that’s not always the case.

Not sure what you mean with PNI, I don’t think that it is in their interest to act against Pocket Network…
My conspiratorial scenario is only around the providers that CC will have and for edge cases. It is not about CC being a bad actor. It’s in fact a problem of the centralization that CC could create around small and difficult chains.
Anyway, if DA is going to do the job of knowing who they are plugging into their network, this all goes away (to some extent).

Just to clarify, I’m not against CC, it is the right tool to enable growth at this moment. I just wanted more information and this thread seems to be the right place to get it.

3 Likes