POKTscan’s Geo-Mesh

POKTscan is happy to share the POKTscan Geo-Mesh with the Pocket community. This client allows all node runners to serve relay traffic in locations worldwide without running a full Pocket node at each of those locations. Implementing it will improve the overall QoS of the Pocket Network while giving all node runners, small and large, the ability to compete for relays.

Technical competitive advantages are temporary, especially when you have talented engineers. This community has a tremendous amount of talent and coming together with the intent of solving problems will transform Pocket into the most important Web3 infrastructure product. Advancing thoughts and trusting data are key to our success as members of this community.

Below you will find a document created by our team with explanations and links to our repository. There are other initiatives we will be sharing soon. Stay tuned.

Cheers!

Michael (Sr.)
@Michael O’Rourke#2188 (Discord) - @michaelaorourke (Forum)
michael@poktscan.com


Greetings Pocket Network community,

At POKTscan, our mission is to follow the data on the blockchain and provide node runners with the most accurate and user-friendly (yes, we try to!) source of data. While analyzing and tracking the network’s Quality of Service (QoS), we noticed interesting behavior from a few small providers: their nodes were responding to sessions with very low response times in many locations at the same time. Here is a brief excerpt of the Cherry Picker (CP) data on one of these nodes:

Looking at the Weighted Success Latency (WSL), these values are not achievable with current Pocket node technology. A Pocket node can only exist in a single location: if a node with address XXXXX runs on a server in the USA, there cannot be another node with address XXXXX in Europe. Activating two nodes with the same address would lead to double claims and other kinds of errors. We checked these nodes for errors and found no evidence of misbehavior, which meant that only one node was making the claims. But the response times do not add up. A single node cannot respond to AWS ap-southeast-1 within 28 milliseconds and also respond to AWS eu-central-1 within 49 milliseconds. According to the Cloudping grid, it is not possible to reach ap-southeast-1 from eu-central-1 in less than approx. 150 milliseconds.
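
The argument above can be checked with a rough physical bound. The sketch below is only an illustration and rests on assumptions that are not in the original post: approximate city coordinates for Frankfurt (eu-central-1) and Singapore (ap-southeast-1), and the common approximation that light in optical fiber travels at about 200,000 km/s. By the triangle inequality, a node in any single location would need its two round-trip times to add up to at least the round-trip time between the two regions.

```python
import math

# Approximate city coordinates (assumed values, not from the original post):
# eu-central-1 is hosted in Frankfurt, ap-southeast-1 in Singapore.
FRANKFURT = (50.11, 8.68)
SINGAPORE = (1.35, 103.82)

EARTH_RADIUS_KM = 6371.0
# Light in optical fiber travels at roughly 2/3 of c (common approximation).
FIBER_SPEED_KM_PER_S = 200_000.0


def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_round_trip_ms(distance_km):
    """Lower bound on the round-trip time over fiber for a given path length."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


distance = great_circle_km(FRANKFURT, SINGAPORE)
bound_ms = min_round_trip_ms(distance)

# For a node at any single location X, the triangle inequality gives
# RTT(X, Singapore) + RTT(X, Frankfurt) >= RTT(Singapore, Frankfurt).
observed_sum_ms = 28 + 49  # response times quoted in the post
single_location_possible = observed_sum_ms >= bound_ms

print(f"distance: {distance:.0f} km, fiber RTT bound: {bound_ms:.0f} ms")
print(f"single location could explain the data: {single_location_possible}")
```

Real routes are longer than the great-circle path, which is consistent with the measured Cloudping figure being higher than this idealized bound.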

This triggered our curiosity and we started to work on some concepts to implement a new pocket client that is able to reach such response times. This was not only a great step for any node runner, but also for the Pocket Network.

A few days after this initial discovery a new major node runner joined the group of node runners that were using this new technique (not our client). Now the improvement in the network response times was obvious. Using data from 15/09/2022 we recreated the QoS distribution that we presented in our report on PIP-22, on 29/07/2022:

This graphic is a histogram of the total nodes in the network arranged by their mean QoS (response time against all gateways). The colors represent the different QoS tiers, from best to worst: green, yellow, orange, and red. The faded colors show the distribution on 29/07/2022 and the solid colors show the current distribution. It is worth noting that the previous green section of the histogram ended close to 0.19 seconds, while the new one reaches values below 0.05 seconds.

The increase in QoS is outstanding; see the changes in the groups:

| Node Tier | Nodes on 29/07/2022 | Nodes on 15/09/2022 | Change |
|---|---|---|---|
| A – MSL < 0.225 | 2212 | 10971 | +395% |
| B – 0.225 < MSL < 0.325 | 8480 | 6751 | -20% |
| C – 0.325 < MSL < 0.5 | 20151 | 6363 | -68% |
| D – 0.5 < MSL | 3719 | 1097 | -70% |
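
For readers who want to recompute the figures, here is a minimal sketch that derives the per-tier changes, the total node counts, and the share of tier A nodes from the table above. The node counts are copied from the table; the variable and function names are only illustrative.

```python
# Node counts per tier copied from the table above: (29/07/2022, 15/09/2022).
TIERS = {
    "A": (2212, 10971),
    "B": (8480, 6751),
    "C": (20151, 6363),
    "D": (3719, 1097),
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


changes = {tier: percent_change(b, a) for tier, (b, a) in TIERS.items()}
total_before = sum(b for b, _ in TIERS.values())
total_after = sum(a for _, a in TIERS.values())
tier_a_share_before = TIERS["A"][0] / total_before * 100
tier_a_share_after = TIERS["A"][1] / total_after * 100

for tier, change in changes.items():
    print(f"Tier {tier}: {change:+.1f}%")
print(f"Total nodes: {total_before} -> {total_after}")
print(f"Tier A share: {tier_a_share_before:.1f}% -> {tier_a_share_after:.1f}%")
```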

Without going into details (like the reduction in the total number of nodes), the increase in nodes in tier A speaks for itself. Formerly, this tier was composed only of high-end node providers.
Currently, not all high-quality node runners have nodes capable of serving relays from different locations. In fact, only a small number of large providers are capable of this (using whichever solution they have, unknown to us):

| Domain | Nodes |
|---|---|
| C0d3r.org | 7764 |
| nodies.org | 766 |
| blockhub.org | 469 |
| Node-river-001.com | 387 |
| poktschool.com | 125 |
| goodpokt.com | 117 |
| nodeworld.net | 75 |
| Total | 9703 |

These node runners make up approximately 30% of the total network. We can only imagine how much the QoS of the network could increase if every node runner had access to this new technology.

Our contribution

After some weeks of non-stop development and testing, we created a working pocket node that can delegate relay responses to other lighter clients. These clients can be deployed anywhere, making the geolocation of the main pocket node irrelevant for a quick response in the network.

The concept of this new client is the following:

The Pocket Application communicates with the gateways, and these in turn send the relays to the node (normal interaction).
In the picture, we show a single Servicer Node with two associated Mesh Nodes. Each node (the base Servicer Node and the two Mesh Nodes) has its own local blockchains.
The nodes are all behind a Global DNS Load Balancer (not part of our contribution). This Global DNS holds the addresses of all the nodes (Mesh or Servicer). When a Pocket Gateway requests the address of the Servicer Node, the Global DNS responds with the address of the closest available node (Servicer or Mesh).
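
The routing decision described above can be sketched in a few lines. This is only a model of the decision, not a DNS implementation and not part of the Geo-Mesh client: the `Endpoint` type, the example hostnames, and the latency figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    address: str       # hypothetical address of a Servicer or Mesh Node
    kind: str          # "servicer" or "mesh"
    latency_ms: dict   # hypothetical latency to each gateway region
    healthy: bool = True


def resolve(gateway_region, endpoints):
    """Return the healthy endpoint with the lowest latency to the gateway region."""
    candidates = [e for e in endpoints
                  if e.healthy and gateway_region in e.latency_ms]
    if not candidates:
        raise LookupError(f"no healthy endpoint for {gateway_region}")
    return min(candidates, key=lambda e: e.latency_ms[gateway_region])


# Hypothetical deployment: one Servicer Node in the USA, one Mesh Node in Europe.
endpoints = [
    Endpoint("servicer.example.com", "servicer", {"us-east-1": 10, "eu-central-1": 95}),
    Endpoint("mesh-eu.example.com", "mesh", {"us-east-1": 95, "eu-central-1": 8}),
]

print(resolve("eu-central-1", endpoints).kind)
print(resolve("us-east-1", endpoints).kind)
```

If the Mesh Node is marked unhealthy, the same call falls back to the Servicer Node, which mirrors the "closest available" behavior described above.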

Then, when the gateway in AZ 1 sends a relay to the Servicer Node, it is in fact calling the Mesh Node. This Mesh Node takes over the relay, performs the minimum session validation steps, and serves it as fast as possible using its local chains. It then reports its work to the Servicer Node so that the claims can be tracked. This process repeats for every other Mesh Node. If a relay is sent directly to the Servicer Node, it is handled as always. At the end of the session, the Servicer Node posts a single claim that accounts for its own work and the work of all its associated Mesh Nodes.
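
The flow above can be sketched as a simple accounting model. This is a deliberately simplified illustration, not the client's actual code: every class, method, and field name is hypothetical, the report carries only a relay count, and session validation is reduced to a pluggable callback.

```python
from collections import Counter


class ServicerNode:
    """Tracks relays per session, served locally or reported by Mesh Nodes."""

    def __init__(self, address):
        self.address = address
        self.relays = Counter()  # session id -> relay count

    def serve_relay(self, session_id):
        # Relay sent directly to the Servicer Node: handled as always.
        self.relays[session_id] += 1

    def receive_mesh_report(self, session_id, relay_count):
        # Work reported by a Mesh Node is added to the same session tally.
        self.relays[session_id] += relay_count

    def build_claim(self, session_id):
        # A single claim covers the Servicer Node and all its Mesh Nodes.
        return {"servicer": self.address, "session": session_id,
                "total_relays": self.relays[session_id]}


class MeshNode:
    def __init__(self, servicer, session_is_valid):
        self.servicer = servicer
        self.session_is_valid = session_is_valid  # hypothetical validation hook
        self.pending = Counter()

    def handle_relay(self, session_id, payload, local_chain):
        if not self.session_is_valid(session_id):
            raise PermissionError("session validation failed")
        response = local_chain(payload)  # serve from the local chain
        self.pending[session_id] += 1
        return response

    def flush_reports(self):
        # Report the accumulated work back to the Servicer Node.
        for session_id, count in self.pending.items():
            self.servicer.receive_mesh_report(session_id, count)
        self.pending.clear()


servicer = ServicerNode("XXXXX")
mesh_eu = MeshNode(servicer, session_is_valid=lambda s: True)
mesh_ap = MeshNode(servicer, session_is_valid=lambda s: True)

mesh_eu.handle_relay("session-1", "request", local_chain=str.upper)
mesh_eu.handle_relay("session-1", "request", local_chain=str.upper)
mesh_ap.handle_relay("session-1", "request", local_chain=str.upper)
servicer.serve_relay("session-1")

mesh_eu.flush_reports()
mesh_ap.flush_reports()
print(servicer.build_claim("session-1"))
```

The point of the design is that only the Servicer Node ever claims, so the network still sees one node with one address, which avoids the double-claim problem described earlier.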

One of the great things about this client is that it does not require significant changes from the current Pocket Node. The main changes are:

  • New configs in “config.json” to enable mesh node support.
  • Health endpoint on Servicer Node to check the height and sync status.
  • A new endpoint on the Servicer Node to receive the Mesh Node’s report (secured using the “auth.json”).

The process is seamless to the Pocket Network. We have been testing it for a few days, using some external nodes under the domain overlordyorch.com, with great results. The Servicer Nodes were placed in the USA and the Mesh Nodes were placed in Europe. The CP-reported times are 69 milliseconds for us-east-1 and 66 milliseconds for eu-central-1, which is in line with the observed response times of other node runners. Moreover, all these nodes are running the Lean Node client!

At POKTscan we have always believed in the community and the value of its support. Making this new technology open to everyone has massive positive effects. Just a few that we can think of are:

  1. Pocket Network region quality increases at a fraction of the cost. No need to deploy a full pocket node on every location, only the served chains.
  2. More high-quality sessions for the Pocket Apps. If the quality of the average node reaches the level of node tier B (see image), the Apps will not have to be served by bad nodes when the limit of high-quality nodes within a session is reached.
  3. As sessions fill with nodes of similar, high QoS, modeling the CP effects becomes simpler. Second-order effects tend to disappear because the group is homogeneous.
  4. The hard limit on the number of relays will be harder to reach, as a competitive environment equalizes the rewards per session across nodes.

We work closely with many active members of the community, and they help us shape POKTscan.com. We are open-sourcing this client in the hope of engaging the community in improving the Pocket Network ecosystem.

This client is not production ready (but it is really close!). We need other node runners who are willing to take a little risk in order to help the network improve. We also want to invite node runners who have already been using this technology to jump in and share their know-how and experience running these geo-distributed clients. We have people working 24/7 on this project, so if you can provide feedback or suggestions, do not hesitate!

Links Here!

You can find the code in this repository. The usage instructions can be found here.
We are also releasing a Docker image and a local development repository to help potential contributors.

If you want to contribute, if you want to report any bugs, or just want to thank our devs for being so great, just drop by our Discord channel. We created a specific channel for the Mesh Client.

Please do not use this thread (or private messages to our developers) to discuss installation procedures or report bugs; that is why we created a specific channel for that.
If you have any comments on the project or non-technical issues/suggestions feel free to post!

Disclaimer of Warranty: Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
16 Likes

This is awesome. Better for QoS network-wide. This will help us be competitive against globally load-balanced centralized providers and upcoming decentralized providers in the near future. Large kudos to PoktScan for stepping up to provide this to the entire network and level the playing field.

It will be interesting to see how this affects node running operations, as small node runners can’t afford to keep up chains in multiple regions. This might be another signal to move forward with chain node sharing sooner rather than later.

8 Likes

This looks really awesome - I can’t wait to start dabbling with it. Is a similar solution on the roadmap for chain nodes? :)

2 Likes

Amazing! Super excited to dig into the client!

Will this client work with the Lean client tech once it’s production ready? I believe that is what pocket-core v0.9.2 will be.

2 Likes

Hi Steve, yes there is.

2 Likes

Yes, fully compatible. It is built using the Lean Node branch as its base.

4 Likes

This is amazing work. We definitely would (will be) testing this. We’ve been noticing the network dynamics shifting quite a bit. Thank you for this contribution, BlockSpaces will do what we can to contribute.

3 Likes

Just to be clear…there has been a group of node runners that have been running a project to test a new client that gives superior QoS and thus increased averages for just those node runners? Knowing that over the past few months would for sure have saved me what seems like years of stress in my life…we would have loved to be included in that beta group. Was there a public call for participants or was this more of an “invite only” kind of thing? Thanks for sure for all the hard work here. Just trying to understand the process!

Can’t wait to look into this more. Looks awesome. Great job!

2 Likes

Hey @Rosa , there was no call for participants and if it was invite only, we at Poktscan were not invited. We do not know if the group of domains above are even working together or not. We just found out, through data, what was going on and embarked on creating a solution to level the playing field.

1 Like

Just to complete @Jeff’s answer.
We don’t know if the node runners that we mention are all even using the same client or if they have collaborated in any way. We do not know anything else about their operations.
Regarding our contribution, we did not perform any kind of private/invite-only beta testing. Open-sourcing is our beta testing!

3 Likes

Great work here guys! Going to have a dig through the code now and hopefully will play with some of our regions this week :).

Might want to make this slightly clearer in the OP. Like @Rosa, I was also confused. The OP sounds like the 10k nodes are running your version.

2 Likes

Noted. Will see how to better phrase this. Is there an exact part which made you think this? Was it:

Thanks for the feedback.

I also initially read this as multiple node runners working together or being part of a joint test. After re-reading, I understand that what you’re saying is that multiple node runners have figured out how to have a single node respond with low latency from any region, but they may or may not have worked together to figure this out. I believe the use of the term “new technology” might read as though they are using your new client. I’m not sure there is any single “technology”; rather, multiple node runners have come up with a solution that allows one node to perform well in any region. Is that correct?

3 Likes

We have now updated the post to better clarify that our client is not the one the other node providers are using.

@steve

This is correct.

I would say it’s a new “technology/technique”, but there are probably multiple different clients/implementations/solutions.

1 Like

Is that ‘new’ to Pocket or new in general? Like having Pocket servers behind a proxy that routes geographically might be new in terms of how most node runners have been setting up Pocket nodes, but the technology is not new. Not trying to be pedantic here - just trying to make it clear for anyone who might not be familiar with the technical side of how nodes are run.

No problem. You are right, the global DNS already existed.
What is “new” is the Mesh Nodes, which give the ability to “clone/replicate/mirror” a light version of one node in different geographical locations. The Global DNS enables this to work by calling the closest node (the original or one of its mesh nodes).

1 Like

Thank you much for the clarification! We for sure had already been researching this very specific thing and had been reaching out to see if others were noticing what we were noticing. I commend other providers who have figured this out as well, and totally agree that network wide QoS is the most important goal. The bigger picture seems to be (or I would like to believe it is) that while one provider over the other might be a short term win, the long term vision and the “real competition” at hand is not with POKT node providers, but the larger landscape of decentralized Web3 infrastructure of which POKT is but one choice…it seems like we are a bit early in the game to break up the band and dilute the fan base. (my music analogy)…so again we do appreciate the insights and for sure would like to collaborate and combine our collective efforts with those who feel the same!

8 Likes

Great job! Thanks for your contribution and selfless sharing.

3 Likes

Hello everyone, we are very excited to announce that our POKTscan Geo-Mesh is growing and now supports multiple servicers on the same mesh node process (like Lean Pocket).

Details:

  • Added Multi Servicer support (like Lean Pocket)
  • Added Mesh node health check endpoint - /v1/mesh/health
  • Added a servicer connectivity check so that, at startup, the user can verify that the mesh node is able to reach the servicer at the provided URL
  • Fixed an issue with the hot reload of the chains. Now you can just update your chains.json and they will be updated in memory.
  • Fixed endpoint /v1/private/mesh/chains to return the chains used by the mesh node correctly.
  • Enhanced the shutdown process to stop all workers and properly close the session and relays database, avoiding the need to re-index them at startup time.
  • Enhanced logs (every log)
  • Renamed our docker tag to show that our Geo-Mesh is built on top of a non-released branch of Pocket Core. So the new tag name is poktscan/pocket-core:MESH-0.0.2-ALPHA-0.9.2
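
As an illustration of how the new health endpoint might be used for monitoring, here is a minimal sketch. The path `/v1/mesh/health` comes from the list above; everything else is an assumption, including the use of a plain GET request, the idea that an HTTP 200 means "healthy", and the example base URL. Check mesh.md in the repository for the actual behavior.

```python
import urllib.error
import urllib.request

MESH_HEALTH_PATH = "/v1/mesh/health"  # path taken from the release notes above


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def mesh_is_healthy(base_url, fetch_status=http_status):
    """Assumption: a 200 response on the health path means the mesh node is up."""
    return fetch_status(base_url.rstrip("/") + MESH_HEALTH_PATH) == 200
```

A call such as `mesh_is_healthy("http://mesh.example.com:8081")` (hypothetical host and port) could then be wired into whatever alerting a node runner already uses. The `fetch_status` parameter exists only so the check can be exercised without a live node.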

Notes:

  • Check mesh.md on the repository to see how to configure it (so easy btw) pocket-core/mesh.md at 577be8de2b0f2942e2384cc5821fa884ebe216e1 · pokt-scan/pocket-core · GitHub
  • The mesh node does not use a lot of resources, but we are still testing how much. Remember to tune max_workers, max_workers_capacity & workers_idle_timeout to speed up the notifications.
  • Please update workers_idle_timeout to something greater than 5000 (the default in the library); otherwise it will keep the CPU under high load.
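
A small pre-flight check can catch the idle-timeout issue before the mesh node starts. The sketch below is an assumption-heavy illustration: only the key name `workers_idle_timeout` and the threshold of 5000 come from the notes above, while the nesting of the key inside the config file is unknown here, so the code simply searches the whole JSON document for it. The example config layout is hypothetical.

```python
import json

# Library default mentioned in the notes above; its unit is not stated there.
MIN_IDLE_TIMEOUT = 5000


def find_key(node, key):
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def idle_timeout_warnings(config_text):
    """Return human-readable warnings about workers_idle_timeout settings."""
    config = json.loads(config_text)
    values = list(find_key(config, "workers_idle_timeout"))
    if not values:
        return ["workers_idle_timeout is not set; the library default applies"]
    return [f"workers_idle_timeout={v} should be greater than {MIN_IDLE_TIMEOUT}"
            for v in values if v <= MIN_IDLE_TIMEOUT]


# Hypothetical config layout, for illustration only.
example = '{"mesh": {"max_workers": 10, "workers_idle_timeout": 5000}}'
for warning in idle_timeout_warnings(example):
    print(warning)
```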
2 Likes