We are about to start onboarding numerous chains and whitelisting them in the SupportedBlockchains parameter, to support the new multi-chain era.
Before we do this, we still have a chance to enforce some order on our RelayChainID naming scheme. At the moment, only 0001 (Pocket mainnet) and 0021 (Ethereum mainnet full node) are locked-in, which means it may still be possible to come up with a naming scheme that is compatible with them.
Therefore I would like to break down the problem and propose a solution. I have no doubt that there will be gaps in my knowledge, meaning this won’t be a perfect solution. The goal is to start a discussion.
Defining the Problem
Limitations:
- 4 characters
- 16 symbols (hexadecimal): 0-9, a-f
- 0001 (Pocket) and 0021 (Ethereum Mainnet Full Node) are locked-in
Characteristics we need to consider in the naming scheme, in order of priority:
- Project: Pocket, Ethereum, Bitcoin, Avalanche, etc. Top priority because this has the largest set.
- Network Type: mainnet, testnet, canary, …
- Node Type: full, archival, light, sentry, witness, …
- Shards: subsets of a project set, of which there may be many
- Subgraphs: a different project type than chains, so might be worth having a distinct naming scheme
- Duplicate network type indicators: e.g. Ethereum testnet - Rinkeby, Ropsten, Kovan, Görli. Lowest priority, because few projects have multiple versions of the same network type.
Proposed Solution
Proposition Pt. 1: Use the 4th column to define network/node types, with numbers for mainnets and letters for testnets:
- 1: mainnet full
- 2: mainnet archival
- 3-9: mainnet spillover, for different node types or duplicate network types. No consistent logic because it will depend on the project.
- a: testnet full
- b: testnet archival
- c-f: testnet spillover, for different node types or duplicate network types. No consistent logic because it will depend on the project.
Proposition Pt. 2: Use columns 1-3 to define different projects in numerical terms, from 000 to 999. Pocket would be 000X, Ethereum would be 002X.
Proposition Pt. 3: Use letters in place of the 0s before a project number to identify shards of the project. Ethereum Shard #1 (full node) would be AA21, Shard 2 would be AB21, and so on.
Proposition Pt. 4: Use letters-only in columns 1-4 to define subgraphs, from AAAA to FFFF. This ensures no confusion between chains and subgraphs.
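To make the four propositions concrete, here is a minimal sketch of how an ID could be read under them. The names `classify`, `describe_type`, and `TYPE_CODES` are made up for illustration and are not part of any Pocket tooling. It also assumes IDs are case-insensitive and that any letters in front of the project digits are shard letters, which the proposal doesn't spell out.

```python
import re

# Column-4 meanings fixed by Proposition Pt. 1; other codes are project-specific.
TYPE_CODES = {
    "1": "mainnet full",
    "2": "mainnet archival",
    "a": "testnet full",
    "b": "testnet archival",
}


def describe_type(code: str) -> str:
    """Map the 4th-column symbol to a network/node type label."""
    if code in TYPE_CODES:
        return TYPE_CODES[code]
    if code in "3456789":
        return "mainnet spillover"
    if code in "cdef":
        return "testnet spillover"
    return "unassigned"  # "0" is not given a meaning in the proposal


def classify(relay_chain_id: str) -> dict:
    """Interpret a 4-character hex ID under Propositions Pt. 1-4."""
    rid = relay_chain_id.lower()  # assumption: IDs are case-insensitive
    if re.fullmatch(r"[0-9a-f]{4}", rid) is None:
        raise ValueError(f"not a 4-character hex ID: {relay_chain_id!r}")
    # Pt. 4: letters in all four columns mark a subgraph.
    if re.fullmatch(r"[a-f]{4}", rid):
        return {"kind": "subgraph", "id": rid}
    # Pt. 2/3: optional shard letters, then the project digits, in columns 1-3.
    match = re.fullmatch(r"([a-f]*)([0-9]+)", rid[:3])
    if match is None:
        return {"kind": "outside scheme", "id": rid}
    shard, digits = match.groups()
    return {
        "kind": "chain",
        "project": int(digits),
        "shard": shard or None,
        "type": describe_type(rid[3]),
    }


print(classify("0021"))
print(classify("AA21"))
```

IDs that mix digits and letters in an order the propositions don't describe (for example a digit followed by a letter in columns 1-3) come back as "outside scheme" rather than raising an error.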
Evaluating the Solution
Benefit of Propositions Pt. 1/2: The locked-in IDs (0001 and 0021) remain compatible with the naming scheme. For example:
- POKT mainnet: 0001 (locked-in)
- POKT testnet: 000a
- ETH mainnet full: 0021 (locked-in)
- ETH mainnet archival: 0022
- ETH testnet full: 002a
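The examples above can be reproduced with a small helper that builds an ID from its parts. This is only a sketch: `build_id` is a hypothetical name, and it assumes shard letters must replace every leading zero of the project number (so project 2 takes exactly two shard letters).

```python
TYPE_SYMBOLS = "123456789abcdef"  # 4th-column symbols used by Proposition Pt. 1
SHARD_LETTERS = "abcdef"


def build_id(project: int, type_code: str, shard: str = "") -> str:
    """Compose a 4-character ID from project number, type code, and shard letters."""
    if not 0 <= project <= 999:
        raise ValueError("project number must be between 0 and 999")
    if len(type_code) != 1 or type_code not in TYPE_SYMBOLS:
        raise ValueError("type code must be one of 1-9 or a-f")
    if any(ch not in SHARD_LETTERS for ch in shard):
        raise ValueError("shard prefix may only contain a-f")
    if not shard:
        # No shard: zero-pad the project number to fill columns 1-3.
        return f"{project:03d}{type_code}"
    digits = str(project)
    if len(shard) + len(digits) != 3:
        # Assumption: shard letters replace every leading zero, no more, no fewer.
        raise ValueError("shard letters must replace every leading zero")
    return shard + digits + type_code


print(build_id(0, "1"))        # POKT mainnet
print(build_id(2, "2"))        # ETH mainnet archival
print(build_id(2, "1", "aa"))  # ETH shard with prefix "aa", full node
```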
Downside of Proposition Pt. 1: This naming scheme doesn’t support every Ethereum testnet for both full and archival nodes: four testnets (Rinkeby, Ropsten, Kovan, Görli) times two node types would need 8 testnet IDs, but a-f provides only 6. If this is a dealbreaker, we could swap to a-f for mainnets and 1-9 for testnets, but then this becomes incompatible with the locked-in 0001 and 0021.
Benefit of Proposition Pt. 3: Ties sharding to project IDs, making it easier to understand shard IDs at a glance.
Downside of Proposition Pt. 3: It’s not actually that scalable. The first 10 projects (000X-009X) will support up to 36 shards each (AA-FF), but Ethereum is anticipated to have 64 shards already in Phase 1. And if sharding becomes prevalent we have a problem, because projects 010X to 099X would only support 6 shards each, and projects 100X to 999X would have no leading zeros left to replace. If it turned out that all shards are going to have the same network/node type, we could use letters in the 4th column too and increase the scale for projects 000-009 to 216 shards.
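The shard limits follow directly from how many leading zeros a project number leaves free. A rough sketch of that arithmetic, where `shard_capacity` is a hypothetical helper and `letters_in_type_column` models the "letters in the 4th column too" variant:

```python
LETTERS = 6  # a-f


def shard_capacity(project: int, letters_in_type_column: bool = False) -> int:
    """Number of shard prefixes available to a project under Proposition Pt. 3."""
    if not 0 <= project <= 999:
        raise ValueError("project number must be between 0 and 999")
    leading_zeros = 3 - len(str(project))
    if leading_zeros == 0:
        return 0  # three-digit projects leave no column for shard letters
    capacity = LETTERS ** leading_zeros
    if letters_in_type_column:
        capacity *= LETTERS  # the 4th column also carries a shard letter
    return capacity


print(shard_capacity(2))        # 36, short of the 64 shards expected for Ethereum
print(shard_capacity(21))       # 6
print(shard_capacity(2, True))  # 216
```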
Benefit of Proposition Pt. 4: Logically separates subgraphs from chains and supports up to 1296 subgraphs.
Note: we could technically use numbers in the 4th column and still maintain the logical separation while extending the possible scale to 3240 subgraphs (6 × 6 × 6 × 15, using 1-9 and a-f in the 4th column).
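The subgraph counts can be checked by enumeration. In this sketch `count_subgraph_ids` is a hypothetical helper, and I'm assuming the extended figure keeps the 4th column to 1-9 and a-f (15 symbols, as in Proposition Pt. 1); allowing 0 as well would give a slightly larger number.

```python
from itertools import product

LETTERS = "abcdef"
DIGITS_NO_ZERO = "123456789"


def count_subgraph_ids(fourth_column: str) -> int:
    """Count IDs with letters in columns 1-3 and the given symbols in column 4."""
    return sum(1 for _ in product(LETTERS, LETTERS, LETTERS, fourth_column))


print(count_subgraph_ids(LETTERS))                         # 1296: letters only
print(count_subgraph_ids(LETTERS + DIGITS_NO_ZERO))        # 3240: plus 1-9
print(count_subgraph_ids(LETTERS + "0" + DIGITS_NO_ZERO))  # 3456: if 0 is allowed too
```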
Permutation Efficiency
One way we can evaluate the efficiency of the naming scheme is to consider how many permutations may be locked out in the following scenarios:
(a) the project didn’t make full use of their available IDs
(b) the permutation wasn’t included in the logic of the scheme
Scenario (a)
A hypothetical project 003 with no shards and only 2 network types (mainnet full and testnet full) would use only the following IDs: 0031 and 003a. That leaves [A-F][A-F]3[2-9,B-F] unused, which is 468 unused permutations (6 × 6 × 13).
Already that seems like a lot of waste with only scenario (a) applying for only 1 project.
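Counts like this are easy to get wrong by hand, so here is a small brute-force check that walks all 65,536 IDs and counts those matching a pattern. `count_matching` is a hypothetical helper, and the pattern is written as a real regular expression in lowercase, with the comma dropped from the character class.

```python
import re
from itertools import product

HEX = "0123456789abcdef"


def count_matching(pattern: str) -> int:
    """Count the 4-character hex IDs that fully match a regular expression."""
    regex = re.compile(pattern)
    return sum(
        1
        for symbols in product(HEX, repeat=4)
        if regex.fullmatch("".join(symbols))
    )


# Shard-prefixed IDs of project 003, excluding type codes 1 and a.
print(count_matching(r"[a-f][a-f]3[2-9b-f]"))
```

The same helper can be pointed at any other pattern to see how many IDs a rule reserves or excludes.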
Scenario (b)
Scenario (b) is harder to evaluate, but I’ll give it a shot.
Because we’re only applying letters before project numbers (which increment from right to left in columns 1-3) or exclusively in the form of subgraphs (AAAA-FFFF), but not after numbers located in columns 1-2, we’re missing out on the following permutations: [0-9][0-F][A-F][0-F]. That’s 15,360 permutations (10 × 16 × 6 × 16), which is about 23% of the total possible permutations (65,536).
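The size of a column-by-column pattern is just the product of the number of symbols allowed in each column. A minimal sketch of that calculation for the pattern above, with the column alphabets written out explicitly (variable names are mine):

```python
from math import prod

DIGITS = "0123456789"
LETTERS = "abcdef"
HEX = DIGITS + LETTERS

# One alphabet per column for the pattern [0-9][0-F][A-F][0-F].
columns = [DIGITS, HEX, LETTERS, HEX]

excluded = prod(len(alphabet) for alphabet in columns)
total = len(HEX) ** 4

print(excluded, total, f"{excluded / total:.1%}")
```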
There are probably other missing permutations that I’m overlooking in scenario (b), but hopefully this will help others to assemble a more comprehensive critique.
Open Questions
These questions represent gaps in my knowledge, which may influence the appropriateness of the naming scheme:
- If we have 4 characters then upgrade to 5 characters, what happens to the existing IDs? Do we simply add a column in front of the ID?
- How do we handle forks? Which chain keeps the existing ID? Should we link the IDs of forks to the original ID?
- Does it matter which client a node is using? Could a Pocket node using the Ethereum RelayChainID 0021 choose freely between Geth, OpenEthereum, Nethermind, and Besu?
- Could a beacon chain be categorized within the 4th column?