POKT Network & AI: First Principles

Question 1: What specific problem is POKT Network best placed to solve within AI, and for whom?
A more general answer perhaps, but the way I would position this is that you're helping teams de-risk their AI/LLM backends. This is how Katara AI (my new project) would like to leverage POKT Network. Most teams are either building a wrapper around a hosted foundation LLM-as-a-Service (e.g. OpenAI) or attempting to host/run their own infra. We are doing both as a way to diversify the backend, so we would like to contribute to POKT as a node runner (reducing our infra cost) and leverage POKT for access to other models.

Question 2: What strengths can we draw on to solve this problem effectively for them?
Well, I think the community's experience running blockchain infra is super relevant here, and POKT has already proven they can build the protocol to solve for the trilemma on the web3 side, so it's not a huge leap to suggest you can solve for this in Web2/3 AI. The pattern for a developer looks the same - we don't want a heavy dependency on a centralized LLM-as-a-Service provider, nor do we want to operate a bunch of infra ourselves if we don't have to.

Question 3: What do we need in order to have a solution ready for testing?
You need test/prod endpoints, monitoring, and docs to get teams started. Some early latency numbers would also be helpful, as would anything else that could be a potential restriction.

Question 4: Who are we competing with in this specific problem space?
You're going to be competing with a few emerging web2 platforms who have already identified the problem but are trying to solve it as web2 shops - teams like https://vectara.com/, who are building a RAG-as-a-Service platform, or https://www.neutrinoapp.com/, who are building centralized LLM routing/model-merging solutions. Both will have a few advantages due to centralization, but that's obviously a spot POKT can differentiate on.

Question 5: What is the customer (from Q1) looking for in a solution?
My new project is the customer from Q1 and what we would be looking for at first pass is the following:

  1. De-risk our backend via a decentralized provider of diverse model endpoints
  2. Expand our product offerings with access to additional models / functionality via a decentralized source
  3. Potentially interested in model-merging endpoints that are backed by a decentralized source

Let me qualify the above - there are two main techniques for model diversification. One is called an ensemble technique, in which one operates several models - submitting the same input to each, checking results, and then using the best answer. This is great for some architectures as it helps spread risk across multiple pieces of infra (particularly if the family of models is a mix of as-a-service and self-hosted). The other is model merging, in which multiple models are combined into a single endpoint, essentially "extending" model features/functionality beyond any one model's capabilities. (This is also rather interesting when considering how POKT might impact AI architectures.)
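A minimal sketch of the ensemble pattern, assuming hypothetical OpenAI-style endpoints and a caller-supplied scoring function - nothing here is POKT-specific, just the fan-out-and-pick-best flow:

```python
from concurrent import futures
from typing import Callable

import requests

# Hypothetical OpenAI-style endpoints. In practice this could mix a hosted
# LLM-as-a-Service, self-hosted infra, and a POKT gateway URL.
ENDPOINTS = [
    "https://api.hosted-llm.example/v1/chat/completions",
    "https://self-hosted.internal.example/v1/chat/completions",
    "https://pokt-gateway.example/v1/chat/completions",
]


def query(endpoint: str, prompt: str) -> str:
    """Submit the same prompt to one backend and return its completion."""
    resp = requests.post(
        endpoint,
        json={"model": "default", "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


def ensemble(prompt: str, score: Callable[[str], float]) -> str:
    """Fan the prompt out to every backend and keep the best-scoring answer.

    A failed backend simply drops out of the ensemble, which is the
    risk-spreading property described above.
    """
    with futures.ThreadPoolExecutor() as pool:
        pending = [pool.submit(query, ep, prompt) for ep in ENDPOINTS]
        answers = []
        for f in futures.as_completed(pending):
            try:
                answers.append(f.result())
            except requests.RequestException:
                pass
    if not answers:
        raise RuntimeError("all backends failed")
    return max(answers, key=score)
```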

BONUS ROUND: What could go wrong?
QoS could be really bad, additional latency, etc. The architecture also opens itself up to new attack vectors - prompt injection, for example. We're working with https://www.lakera.ai/ and I'm not sure how using AI security services like this would work in relation to POKT's plans for decentralizing access. Finally, someone mentioned decentralized RAG, and I'm curious what those pipelines would look like. Teams building off foundation models provide the most value at the RAG/pipeline layer - meaning the way I architect a RAG/prompt pipeline is my differentiator when the backend foundation model is the same - so what exactly will POKT be offering? We wouldn't give up our entire RAG pipeline to POKT.

5 Likes

The answers to these questions can get REALLY long (i.e. a separate blog post for each question), so I kept it short, to the point, and focused on what's needed for a pilot.

Question 1: What specific problem is POKT Network best placed to solve within AI, and for whom?

  • POKT Network: Performant and Cost-effective LLM inference endpoints across various open-source models.
  • POKT Network Gateways (e.g. Grove): Reliable access and routing to the inference endpoints on the network.
  • For whom: Cost-sensitive developers/enthusiasts looking to experiment with various open-source models.

Question 2: What strengths can we draw on to solve this problem effectively for them?

What is our “right to win” / what makes us credible / what is our competitive edge.

  • Right to win: Years of experience; supporting a diverse ecosystem of open-source LLMs is the same as supporting a diverse ecosystem of open-source blockchains.
  • Competitive Edge (POKT): Ability to attract, aggregate, incentivize and verify the Supply side.
  • Competitive Edge Gateways (e.g. Grove): Attract & support demand/users by providing QoS guarantees.

Question 3: What do we need in order to have a solution ready for testing?

What resources do we need, what do we need to build, what kind of partners do we need… etc

  • Protocol: New Chain IDs for LLMs and updated RTTM; lots of technical detail here that I’m omitting.
  • Supply: Hardware operators, preferably with GPUs.
  • Demand: Early adopters willing to use inference endpoints that may be lower quality (at first) in exchange for cheaper access and a more diverse set of models.
  • Gateways: Minimum Viable (Experimental) QoS for LLMs.

Question 4: Who are we competing with in this specific problem space?

Please just note any you’re immediately aware of and what they’re doing.

Web2: together.ai, octo.ai, anyscale.com, tromero.ai, etc…

Web3: morpheus.network, tromero.ai, etc…

Question 5: What is the customer (from Q1) looking for in a solution?

Please indicate what your answer is based on - opinion is valuable, but it would be good to flag it as such.

  • Early adopters (e.g. looking to experiment)
  • Developers (e.g. hands-on teams)
  • Looking to trade off some QoS (e.g. speed) in exchange for hands-on support, cost, and potentially access to a wider array of models

BONUS ROUND: What could go wrong?

The upside feels very clear and compelling so I want to create space to flag any concerns.

Too much focus / discussion revolving around how we solve Quality of Service rather than just shipping something that’s “good enough” and being the #1 player in the space.

5 Likes

Hey @Matteo, to clarify, what I was suggesting is that Pocket could be used as a RAG datastore solution. Imagine a decentralized version of Pinecone. So, I was not suggesting that anyone should turn their RAG pipeline over to Pocket or make them public. On the contrary, I see this as a way to monetize RAG pipelines while also keeping them private. For example, Pocket could be used to relay RAG requests similar to how RPC requests for different chains are relayed. So, RAG data could be kept private but made accessible for rewards/fees. Does that make sense?
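To make that concrete, here is a rough sketch of what relaying a RAG query could look like, by analogy with how RPC requests are relayed today. The gateway URL, ServiceID, and request shape are all hypothetical:

```python
import requests

# Hypothetical gateway URL and ServiceID. Today's relays carry JSON-RPC for
# blockchains; a RAG relay is imagined here with the same shape: an opaque
# payload in, an opaque payload out, with the datastore itself kept private
# by the node runner, who earns rewards per request.
GATEWAY_URL = "https://gateway.example/v1/relay"


def rag_query(embedding: list[float], top_k: int = 5) -> list[dict]:
    """Relay a vector-similarity query to a privately hosted RAG datastore."""
    relay = {
        "service_id": "rag-store-0001",     # hypothetical RAG ServiceID
        "payload": {
            "method": "similarity_search",  # hypothetical datastore method
            "params": {"vector": embedding, "top_k": top_k},
        },
    }
    resp = requests.post(GATEWAY_URL, json=relay, timeout=30)
    resp.raise_for_status()
    return resp.json()["matches"]
```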

1 Like

@Steve, nice - I like it.

3 Likes

@Adrienne, @Olshansky, @RawthiL , @Jinx , @shane , @bulutcambazi - After listening to the full transcript from the last call, I thought it would be helpful for me to expand on why I don't think running open-source LLMs (AI inference) is the best fit for Pocket. In short, I just can't come up with a good answer to the question: What is the value proposition?

Is it price? Performance? Flexibility? Privacy? Simplicity? Something else? When I consider each relative to alternative options like AWS Bedrock, Cloudflare Workers AI, HuggingFace Inference API, or the increasing number of options from smaller players like Runpod, I can't see any strong value proposition that would make a Pocket LLM inference offering competitive. I'd love to be convinced otherwise, but here's my current logic.

Price

Could Pocket provide access to open-source LLMs for less than a competitive provider? If the assumption is 'yes', how? Someone needs to pay for the infrastructure necessary to make the models available. Would the current base of node runners, or new node runners, be able to do this more efficiently than AWS, Cloudflare, etc.? Again, how? By spending less? By a willingness to make less? I know for sure that some existing providers are willing (and able) to lose money providing access to open-source models because the prompt data is valuable for fine-tuning. Will Pocket node runners be willing to operate at a loss?

Performance

Could open-source LLMs run on Pocket outperform the other options? Would latency be reduced? Would model completion quality increase? No, and no. The additional node coordination/protocol layer will add overhead and increase latency. Model completion quality will likely be worse because some independent node runners will try to game the system by misrepresenting which models they are hosting (smaller models are less expensive to run). And this will be impossible to monitor for.

Flexibility

One of the main reasons to use open-source LLMs is not because they are cheaper than using OpenAI, Google Gemini, etc. It’s because they can be modified and retrained. This value proposition doesn’t exist when accessing open-source models via a 3rd party host.

Privacy

Another key selling point for using open-source LLMs is data privacy. The option to self-host open-source LLMs enables the prompts sent to the models and the resulting completions to stay within organizational boundaries, keeping the data private. However, this won't be the case when using Pocket or any other 3rd party provider.

Simplicity

Would it be easier to use open-source LLMs through Pocket? Relative to running the LLMs yourself - yes. Relative to accessing them through some other API - no.

So, that’s my short logic - without getting into the impact of hardware innovations like new chip architectures or other emerging open-source LLM packaging options like Nvidia Inference Microservices that could change everything even before Pocket gets the supply side figured out.

My “day job” is architecting enterprise AI systems, and has been for the past 10+ years. I’m also a long-time supporter of Pocket and have run hundreds of Pocket nodes. So I feel like I have a good understanding of both sides of this.

I’m not trying to be the devil’s advocate here. I’m genuinely interested in better AI solutions and an increased demand for POKT. I have a strong incentive to want both. I’m just not seeing either here.

But if I'm missing something, please, please set me straight - I'd love to be wrong.

4 Likes

I don't think I'll "prove you wrong", but I do think there are other factors at play that support continued research on this front. Have you read @Olshansky's blog post about this? I think it sets the stage for what we're trying to accomplish:

I’ve generally agreed that I don’t think POKT will be in the business of running LLMs directly; data relaying is what we’re good at, and I think that will be the biggest use case of POKT in the AI space. And I think that comes in two forms:

  • relaying actual inference requests to DecAI systems, especially when they are blockchain-based systems. This would be ongoing, fairly continuous traffic.

  • relaying data for training models in parallel, when the training data benefits from mass parallelization (i.e. when the size of the model crosses the ratio of third-party latency for distribution versus processing time in a dedicated environment with a small number of chips). This would be burst traffic for the duration of training.

Both of these are theoretical, based on the work that others are doing, and require additional development in both the protocol and the gateways wanting to support that use case.

I do want to run a Llama2 instance or similar on some of our POKT nodes but only to generate network traffic to see where we can fit in, and with what architecture. My expectation is that we end up being to Gensyn or Bittensor what we are to Ethereum: a decentralized way to access data being generated by other projects.

2 Likes

Thanks for the reply @Jinx

I have read @Olshansky's post. It's excellent and I agree 100% - no one wants to host their own LLM. I could take that one step further and say no one wants to write their own software. Both are costly and time-consuming.

But the reasons to use open-source LLMs have less to do with costs and effort and more to do with the "lack of moat" that @Olshansky mentions in his post. I might not "want" to host my own LLM, but if I want to fine-tune it in a way that provides a competitive advantage, or if I want to keep my prompts and completions private, delegating the hosting isn't an option. So, yes, I don't "want" to host it but…

What we (Pocket) want and what the AI community needs must be aligned for Pocket to significantly benefit from the demand distributed AI will create for blockchain projects. Thankfully, there are lots of possibilities - this just isn’t close to the best one in my opinion.

3 Likes

I somewhat agree with what you say; plain traffic handling will not give us enough of an edge.
I do believe that we can do inference cheaper than centralized services if we use the spare capacity of already-deployed models.

Also, regarding privacy, what do you prefer: a single entity knowing all your interactions, linked to your profile and possibly a KYC policy, or a network of independent inference providers who would need to do a lot of work to doxx the origin of an inference and are not in the data-selling business?

One upside of plain inference is that our models won't be censored by any entity, as nodes can be run from multiple geopolitical regions and staked permissionlessly.

Regarding the API, I will advocate for full OpenAI compatibility, so there will be no friction on that side - only whatever the gateways might want to add in Morse.
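To illustrate, with full compatibility a developer would only need to point the stock openai Python client at a gateway. The base URL, API key, and model name below are placeholders:

```python
from openai import OpenAI

# With full OpenAI API compatibility, switching to a POKT gateway would only
# mean changing the base URL. The gateway URL, key, and model name below are
# placeholders, not real endpoints.
client = OpenAI(
    base_url="https://pokt-gateway.example/v1",
    api_key="YOUR_GATEWAY_KEY",  # issued by the gateway, not by OpenAI
)

completion = client.chat.completions.create(
    model="llama-2-70b-chat",  # an open-source model served by the network
    messages=[{"role": "user", "content": "Hello from POKT!"}],
)
print(completion.choices[0].message.content)
```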

Thanks for saying this. I’ve been repeating that since day zero.
While I agree that this is a current weakness, I think that solving for this will be our biggest edge in the future. My long-term vision with LLMs is to implement a model leaderboard (like Hugging Face's) using the Watcher abilities of Shannon. We are currently developing this as an off-chain service in our Socket repo and aiming to have it running shortly after LLM inference begins on mainnet.
I think that an evolving and immutable leaderboard of models is what's missing in the AI ecosystem. The only metrics that you have for the APIs you use are given by the API provider, or obtained indirectly by using the HF leaderboard and trusting that the actual model behind the API is the one being publicized (which is often not the case; see this paper on ChatGPT performance evolution).

3 Likes

Thanks for your comments @RawthiL.

So you are convinced that we can provide this for free in some cases, and run nodes more efficiently than AWS, Cloudflare, etc.? I would love to better understand your thinking here.

If the "single entity" is my company, I would choose the single entity every time. This is the point here, I believe: should I host it myself or delegate to a 3rd party? The details are nuances that will play very little into the decision-making process, or will matter only to a small percentage of the market.

Will the leaderboard include metrics from off-chain competitors? That’s who Pocket will be competing with.

2 Likes

Cost is not price; the costs of AWS/GCP, etc. may well be lower, but what they are charging has nothing to do with that.
What's the cost if you already have a model running locally (for other business) and you can just connect it to POKT to earn some extra coins? None; it's all upside. Pocket won't care if your model is fine-tuned or not; we will measure it and assign work if we find it worthy.

Sadly I don’t own OpenAI… :smiling_face_with_tear:
Jokes aside, hosting models with enough quality is expensive, so the default way is to go for a paid API. I don't see hosting a model as a comparable option to using an API; they are different use cases. Unless you are fine-tuning or you have large volumes of requests, the API is cheaper.

Sure, we can use DAO funding to pay for endpoints; it won't be too expensive and is part of the plan. Only for reference models of course, like OpenAI, Gemini, etc. Smaller companies will be welcome to stake a node and publicize their address.
Language models are black boxes for us; an off-chain model is just a service pointing to an API.

3 Likes

Thanks again for the comments @RawthiL

Is a strategy to compete with AWS/GCP based on undercutting their price the best position for Pocket? A large majority of the Pocket nodes are hosted by providers who don’t own any hardware and are reselling AWS/GCP and other cloud offerings. This doesn’t seem like a great strategy to me.

This is true if you have that supply available. But who does? Are you referring to Poktscan’s extra capacity or are you assuming there is a broader market for supply? How do we find them? How much can they make? What is the effort to get those nodes in place (if you’re not just referring to the 3-4 big existing node runners)?

You are right, API access is cheaper. In fact, it's free for most low-volume usage - that's one of the big reasons I'm concerned. But hosting models yourself is getting easier and less expensive. I'm sure you've seen and likely used ollama - it's a lot easier than setting up a node.

From a user perspective: if there are many free/inexpensive off-chain API options, why choose Pocket? Do you expect that the dashboard will show that Pocket LLM endpoints are better performing and more cost-effective than off-chain API alternatives?

Again, thank you for the feedback. I hope I’m not coming off as sarcastic or argumentative - I just don’t see strong benefits on either side here.

2 Likes

Hugely appreciate the comments - thank you @steve.

Could you expand a little on the use case that you do see more potential in? I would love to hear more detail if you're willing to share.

Thank you!

Adrienne

2 Likes

Thanks @Adrienne. Yes, I can and will share details about what I’m thinking. I’m working on that this week.

1 Like

I think that after the 2023 bear market, that's not the case anymore - no proof though…

You still require a lot of hardware for big models, and ollama is not a good option if you want inference speed (which everybody wants).

In time, yes. For Morse, I just expect a dashboard that shows that we are cheaper and have quality stuff running.

The same questions could have been posed for blockchain nodes:
why would someone spin up blockchain nodes when they are easy to deploy, there are APIs available, and those who run them might not be willing to connect to Pocket?
We don't know why, but Pocket is alive, and now even Infura has partnered with us.
With ML endpoints, LLMs in particular, we have the same doubts; but if we build the correct incentives, they will come, and I think that we know how to build incentives.

I don't want to deflect the questions. I believe that you have valid concerns, and I won't be able to give you straight answers. Strong benefits are hard to see; in fact, the only real one in v0/Morse is censorship resistance.
However, we are in a great position to be the first doing this, and we will learn along the way - we have done it before.

3 Likes

You are not wrong. Due to flawed tokenomics in Morse, node running is only sustainable for large providers, since it requires substantial hardware renting. This is not a sustainable future for POKT. With Shannon, you won't have to be a multi-continental, hardware-renting titan to be a supplier, which opens the door to different types of supplier growth.

So I wouldn't look at Morse's current supplier typology to gauge what the supplier ecosystem could look like in new markets, like AI.

This is a good summary of what market research is required. POKT should make sense in a market before diving into it. My gut has been that POKT does make sense in AI, as I see POKT as a data protocol that specializes in heavy data sources. Blockchain nodes are a good example of sources that require a lot more compute than folks want to run for their business. To me, LLMs are similar.

Just like POKT's RPC market revolves around serving gateways (which are essentially UX businesses), there is real potential for LLM gateways - gateways that specialize in specific user experiences. Some will be enterprise, and some will be consumer-facing.

I think it is worth understanding what the general LLM market will be, where users/businesses don't need specialty LLMs. POKT would first be able to provide access to general LLM deployments… so I'm curious what those kinds of markets will be like.

The LLM API ecosystem is still young and not fully realized, so I definitely see a place for POKT.

2 Likes

Beyond general inferencing, there is definitely an opportunity to connect AIs to existing POKT sources (like blockchain nodes). Think of it like putting an AI layer on top of POKT's current data sources.

Example: POKT currently serves blockchain data, so what about creating an AI ServiceID which has access to that blockchain data? For any data source on POKT, there could be a complementing AI-powered ServiceID for that data.

Basically a regular RPC ServiceID and an AI powered ServiceID for each data source (or chain).
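A rough sketch of how that pairing could look from a consumer's side - both endpoint URLs and the AI request/response shape are purely illustrative:

```python
import requests

# Hypothetical endpoints pairing a regular RPC ServiceID with an AI-powered
# ServiceID for the same chain; both URLs and the AI request shape are
# illustrative only.
RPC_URL = "https://gateway.example/v1/eth-mainnet"     # existing RPC ServiceID
AI_URL = "https://gateway.example/v1/eth-mainnet-ai"   # imagined AI ServiceID

# Regular relay: raw JSON-RPC, and the caller interprets the result.
block_number = requests.post(
    RPC_URL,
    json={"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []},
    timeout=30,
).json()["result"]

# AI-powered relay: a small, specialized model behind the ServiceID turns a
# natural-language question into RPC calls against the same data source and
# returns a plain-language answer.
answer = requests.post(
    AI_URL,
    json={"question": "How many transactions were in the latest block?"},
    timeout=30,
).json()["answer"]
```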

That kind of use case wouldn't require heavy models, but could be run more efficiently with small, specialized models that build on existing data sources on POKT. I think this provides some of the most direct utility for POKT's current market.

Instead of trying to compete with big AI, it makes sense to think of AI layers on top of POKT… where access to heavy data sources is the AI's strength.

That is the kind of market research I think would be worth trying to explore.

4 Likes

@shane This is precisely the opportunity I'm imagining also! Pocket is uniquely positioned to become the standard for enabling trustless collaboration between AI systems/agents. I've started working on a whitepaper detailing what I'm referring to as Trustless Agent Orchestration, one of the biggest challenges facing multi-agent AI system builders. It's also one of the most significant opportunities for blockchains, and Pocket could begin offering TAO services today, even before Shannon or any new development is completed. [EDIT: @Jinx mentioned that TAO is not the best name since it's Bittensor's token. It's also AI-related, so I agree we need a better name. But it's the general positioning that's key.]

Pocket’s already-supported blockchains provide so much potential utility, and AI system builders are beginning to consider how they fit into the mix. Mostly, it’s about figuring out how AI agents will collaborate across organizational and system boundaries in a trustless way, which is obviously what blockchains are all about.

Projects like LangChain, AutoGPT, AutoGen, BabyAGI, and CrewAI are rapidly gaining popularity because multi-agent orchestration is the path to more capable AI systems. Also, emerging standards like Agent Protocol are quickly gaining traction, accelerating the development of multi-agent systems that cross organizational and platform boundaries.

^^Yes, this is it!

I’m working on some concrete examples of exactly how I think a TAO offering could be positioned. I hope to have details to share in the coming days.

2 Likes

Can you expand, or share more details on how Agents would work on Pocket?

Agents are basically heuristics mounted on language model engines, which may or may not have access to several data sources.
As I see it, Pocket will only offer access to these LM engines and data sources (blockchains only right now). So the Agents will be outside the Pocket Network, built by companies that want to use our services (devs, gateways, etc).
Agents can be part of the narrative, but they are not part of the tech stack of Pocket Network right now; this should be really clear.

2 Likes

I'm not saying agents will run on Pocket. I'm using the term AI Agents to describe AI systems that can work together, use tools, and learn from one another. Here is an older OpenAI post and related white paper that describes the general concept. ChatGPT (and similar systems) are much more than just LLMs. They are multi-agent systems that provide functionality by combining multiple language models, general models, coded systems, runtime environments, etc. But it's their ability to use 'tools' that I see as the opportunity for Pocket.

I’m working on a more detailed write-up that I hope to share in the coming days. But here is the abstract I’ve written that hopefully provides a decent TLDR in the meantime.

Abstract

Today, if you ask ChatGPT or Gemini for the current temperature in a given city, you'll get an accurate answer. But if you ask for the current balance of a public cryptocurrency address, you'll be told to go elsewhere. The reason is that there is no go-to tool that AI systems can use to access blockchain data. Pocket could be that tool without any new Pocket development efforts.
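As a sketch of what that tool could look like: the gateway URL below is hypothetical, but the JSON-RPC call is the standard eth_getBalance that Pocket-served Ethereum endpoints already answer.

```python
import requests

# An agent-side "tool" whose implementation is just an eth_getBalance relay
# through a Pocket-served endpoint. The gateway URL is hypothetical; the
# JSON-RPC method is standard Ethereum.
POKT_ETH_URL = "https://gateway.example/v1/eth-mainnet"


def get_balance(address: str) -> float:
    """Return the current ETH balance of a public address."""
    resp = requests.post(
        POKT_ETH_URL,
        json={
            "jsonrpc": "2.0",
            "id": 1,
            "method": "eth_getBalance",
            "params": [address, "latest"],
        },
        timeout=30,
    )
    resp.raise_for_status()
    wei = int(resp.json()["result"], 16)
    return wei / 1e18  # wei -> ETH


# Exposed to the model as an OpenAI-style function-calling schema, so a
# ChatGPT-like system can decide on its own when to invoke it:
TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_balance",
        "description": "Get the current ETH balance of a public address.",
        "parameters": {
            "type": "object",
            "properties": {"address": {"type": "string"}},
            "required": ["address"],
        },
    },
}
```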

5 Likes

Next steps following the call last week

Thanks everyone for a great discussion last Wednesday - see here for the recording and here for a summary that Adz put together.

POKT Network has clear strengths to build on in the worlds of AI inference, RAG and acting as an API for AI agents more generally. And it’s great to see builders like C0d3r, Grove, Matteo, and POKTScan working to turn this potential into a reality already.

While R&D continues in parallel, we (PNF) are proposing some next steps to develop a litepaper for POKT Network in AI that will provide focus as we move forward at pace.

We propose an empowered working group to execute on this.

THE WORKING GROUP

An agile and informed AI Working Group made up of experts in AI, with a strong understanding of the protocol and the market, reporting to the PNF board.

PNF proposes: Olshansky (lead), Bowen, and Ramiro.

  • Olshansky has over 6 years of experience doing AI/ML at Magic Leap and Waymo and unparalleled knowledge of our protocol.

  • Bowen previously led a team that bootstrapped and built out a foundation model and LLM inference platform for Apple and is now a principal engineer at EigenLayer.

  • Ramiro is an ML scientist who has already deployed an LLM to support the POKT ecosystem and has a deep understanding of the protocol and its economics thanks to his work at POKTscan and as a community contributor.

Additional expertise in the community will be consulted and involved on an as-needed basis, including but not limited to Steve, Mike, C0d3r, Shane, Jinx, Gabi, Matteo, and our broader network of advisers, investors, contributors and supporters.

WG OBJECTIVES AND PURPOSE

  • Produce a litepaper that articulates a clear vision for POKT Network's AI potential, including a high-level roadmap and necessary future R&D. The target date for publication is 17 May, but the consultation period will start much earlier than this to allow considerable time for input from all relevant stakeholders and experts.
  • PNF will use this litepaper to develop more accessible communications and documentation for a broader audience, as well as to work with all interested parties on unlocking funding from the DAO towards the opportunities highlighted within.

8 Likes