POKT Network & AI: First Principles

Hi all,

We need your help to shape the story for POKT Network <> AI.

The goal here is to get to something that is as robust as possible, as quickly as possible.

@Jinx kindly agreed to let us take a chunk of the Ecosystem call next week (13th Mar at 12pm EST in Discord) to facilitate a discussion. Please join if you can. Otherwise there will be recordings available and we’ll update here afterwards.

In advance of that session, we’re asking you to please add your thoughts in the comments against some initial, foundational questions.

Here goes…

Question 1: What specific problem is POKT Network best placed to solve within AI, and for whom?
Please be as specific as possible.

Question 2: What strengths can we draw on to solve this problem effectively for them?
What is our “right to win” / what makes us credible / what is our competitive edge.

Question 3: What do we need in order to have a solution ready for testing?
What resources do we need, what do we need to build, what kind of partners do we need… etc

Question 4: Who are we competing with in this specific problem space?
Please just note any you’re immediately aware of and what they’re doing.

Question 5: What is the customer (from Q1) looking for in a solution?
Please indicate what your answer is based on - opinion is valuable, but it would be good to flag it as such.

BONUS ROUND: What could go wrong?
The upside feels very clear and compelling so I want to create space to flag any concerns.

There will be many more questions as we continue down this road, but this should get the ball rolling.

Tagging @o_rourke @Olshansky @shane @bulutcambazi @RawthiL @Jinx @gabalab for visibility. Please excuse the oversight if I’ve missed anyone who has shown an interest - it’s not intentional!

Excited to see what ideas are out there!

Chat soon!

Adz

4 Likes

Hi @Adrienne, I will try to give my input on those.

I’ll leave out my long term vision here, because I believe that we can do much more, but we need to start strong and realistic.


Question 1: What specific problem is POKT Network best placed to solve within AI, and for whom?

Model inference. Running models is not easy: paying someone else to do it (APIs) is expensive, and hosting your own is also expensive since GPUs are almost always needed (for larger models such as LLMs or diffusers).
The consumers of this service would be developers who need API access. Instead of dealing with your own deployment or paying for expensive APIs, you could just use Pokt and access a series of models that solve many different tasks, all behind a single endpoint.

Some examples could be chatbots, summarization, question answering, image generation, aided graphic design, segmentation, sentiment analysis, etc.
We can also serve much easier tasks such as embeddings. While they are cheap to deploy on CPU, I don’t see why we should not host them too (they are also very simple to measure).
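
To make that “simple to measure” point concrete, here is a minimal sketch of how a checker could verify an embedding node. It is not an actual POKT interface: the endpoint URLs and model ID are hypothetical, and it assumes both services expose an OpenAI-compatible embeddings API. The same input should yield nearly the same vector as a trusted reference deployment, so a cosine-similarity threshold catches nodes serving the wrong model.

```python
# Hypothetical sketch: verify an embedding node against a trusted reference.
# Endpoint URLs and the model ID are invented for illustration.
import numpy as np
from openai import OpenAI

PROBE = "Pocket Network relays RPC requests."
MODEL = "bge-small-en"  # hypothetical service ID for an open-source embedding model

reference = OpenAI(base_url="https://trusted-reference.example/v1", api_key="...")
candidate = OpenAI(base_url="https://node-under-test.example/v1", api_key="...")

ref = np.array(reference.embeddings.create(model=MODEL, input=PROBE).data[0].embedding)
out = np.array(candidate.embeddings.create(model=MODEL, input=PROBE).data[0].embedding)

# Embeddings are (near-)deterministic, so an honest node should score ~1.0.
cosine = ref @ out / (np.linalg.norm(ref) * np.linalg.norm(out))
print("looks honest" if cosine > 0.99 else "suspect", f"(cosine={cosine:.4f})")
```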

Question 2: What strengths can we draw on to solve this problem effectively for them?

We have a working network and tokenomics that encourage node runners (in this case model runners) to provide high-quality services while always aiming to maximize their productivity.
If we set the incentives right (aka metrics or ““QoS””), the node runners will do the work of finding the best models for each task, and the end user (developer) will never have to deal with model updates or changing from one API standard to another.
Node runners have already done this for blockchains, why not also for ML models?

Question 3: What do we need in order to have a solution ready for testing?

We don’t need much to get this up and running in Morse, only to be sure of what we will be offering and how we will be testing it. The missing piece that we need to build is a credible way to show that our endpoints are not returning trash.
Besides that, we can set up our nodes and gateways and start selling API access to known large language models (LLMs), diffusion models (text-to-image) and others today; the network already supports that.
Since we are not trying to bring inference onto the blockchain or deal with distributed model training (two open problems in my opinion), we can ship a real service faster than any other player in the crypto+AI world.

We need partners in the AI world, both to provide inference (spare capacity, at low token sell pressure) and to guide us on what exactly is needed (a target use case would be wonderful). We can adapt from there; we have the tools and expertise.

Question 4: Who are we competing with in this specific problem space?

From the crypto+AI world, I don’t believe there is a real competitor; most projects are trying to solve other things, like distributed training/inference or GPU computing.
In the AI world we are competing with every other API service out there. But remember, we were also competing against Infura at the beginning, so competition for Pokt is only temporary :wink:

Question 5: What is the customer (from Q1) looking for in a solution?

This is only opinion, but from my experience with model usage:

  • Ease of use and migration. When I build a solution that includes ML and then change the API or the underlying model, it is a pain to re-adapt code for something that should be transparent. This is reflected in the LLM community, where self-hosting solutions like vLLM mimic the OpenAI API standard to make it seamless to change the back-end (see the sketch after this list).
  • Model specialization. Some tasks can be handled by cheaper (smaller) models or even better solved by specialized ones. With an API you are married to a single model (or a few options at best), and with your own hardware, well, you run out of GPUs… Pocket could enable you to do model mixing on the fly at no cost.
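
To illustrate the migration point, here is a minimal sketch assuming every backend exposes the OpenAI-compatible API that vLLM already mimics. The POKT gateway URL and its model ID are hypothetical; the point is that switching providers reduces to changing `base_url`.

```python
# Sketch of a "transparent backend": the calling code never changes,
# only the base_url does. The POKT gateway URL below is hypothetical.
from openai import OpenAI

def ask(base_url: str, api_key: str, model: str, prompt: str) -> str:
    client = OpenAI(base_url=base_url, api_key=api_key)
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Same calling code, three interchangeable backends:
# ask("https://api.openai.com/v1", key, "gpt-4", "hello")                            # hosted API
# ask("http://localhost:8000/v1", "none", "meta-llama/Llama-2-7b-chat-hf", "hello")  # self-hosted vLLM
# ask("https://pokt-gateway.example/v1", key, "llama-2-7b", "hello")                 # hypothetical POKT gateway
```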

BONUS ROUND: What could go wrong?

We are truly permissionless, so we need to be able to assess that our service is real. As opposed to blockchain nodes, ML models are countless and highly sensitive to deployment choices, meaning that trivial tests like majority voting will fail.
We need to find the balance between vagueness and over-specialization. If we only allow a single service per model, we will have >30000 new services with a single node in each; if we are too vague, customers won’t know how to get what they need.

In my experience, the AI community is not crypto-native and won’t be lured by simply saying that we are a permissionless service, that we have a token, and long live crypto-anarchy. We need to convince them that we provide real value, that we play on their terms and that we speak their language. If we nail that, they won’t care that we are crypto; they will stay for the service.

7 Likes

Thank you for going first @RawthiL !

Hope you can join the ecosystem call next Wednesday?

2 Likes

Sure, I’ll be there!

2 Likes

For starters, I’m super excited to see this post. I’ve been thinking a lot about the intersection of AI and blockchain, and where Pocket Network fits. There are so many possibilities. But here are the top three in my opinion:

  1. Decentralized RAG (Retrieval-Augmented Generation)

  2. AI Agent Coordination and Consensus

  3. Decentralized Fine-Tuning Data Sets

Multi-agent AI systems are where everything is moving. They represent the biggest opportunity for blockchain in the coming months/years. While distributed compute for running or training models is interesting, I don’t realistically see Pocket Network as a viable solution for that. Not anytime soon at least. Maybe for very small models or as a gateway to GPU nodes. But still, I see a bigger opportunity in the areas of agent coordination and consensus and/or attribution/ownership verification of the private data sets that AI systems need for RAG and fine-tuning.

I’ll try to make the call but at the moment I have a conflict. I will also try to expand on the above when I have a bit more time.

7 Likes

Steve! Given your history in this space, glad to see you here.

4 Likes

Thank you, @steve! If you’re not able to make the ecosystem call, perhaps we could grab 30 mins 1:1 afterwards? Would love any additional thoughts you can share - especially on strengths we can leverage and gaps we’d need to close.

One question in the meantime: within RAG, where would you see the most compelling role for POKT Network? Hosting the LLMs and running inference on them, connecting LLMs to external data sources (would these be on-chain / any open-source database / other?), or would we need to provide both aspects?

I found this Nvidia article for anyone that is new to RAG: What Is Retrieval-Augmented Generation aka RAG | NVIDIA Blogs

If there is any specific context you recommend we read up on, please let us know. I expect we’ll have a variety of experience levels on the call, from enthusiast to expert.

Thanks again and hope you can make it on Wednesday! Hugely appreciate your input!

Adz

2 Likes

Hey @Adrienne! I’m going to try to make the call. I’m working on rescheduling a conflict. Regardless, I’m happy to set up a time to meet 1:1. Just DM me and we can find a time that works.

On your RAG application question. The short version. A blockchain like Pocket Network could be used as a decentralized vector database for AI agents. This provides a way to share RAG data across AI agents with the added benefit of being able to verify the data’s source and integrity. Data integrity and attribution will become increasingly important as AI agents become more autonomous and begin collaborating with other AI agents.

I’m working on an expanded version of my thoughts on the importance of decentralization/blockchain in the future of AI. Here is a link to Part 1 of a multi-part post I’m working on that discusses AI orchestration in general: AI orchestration and AI decentralization - Part 1.

As I mentioned in the article, more capable AI can only be achieved by scaling up or out. The natural and most viable path is to scale out. This will drive the intersection of AI and blockchain technologies. Understanding the high-level requirements for multi-agent orchestration and AI swarm intelligence will help connect the dots between the two technologies. I mention a few notable projects in the article that are focused on the creation, management, or orchestration of multi-agent AI systems. Looking into those projects might help you understand the potential roles of Pocket in the decentralized AI ecosystem.

I hope this is helpful. Let me know if you have any questions. I’m happy to help if I can.

6 Likes

Question 1: What specific problem is POKT Network best placed to solve within AI, and for whom?
A more general answer perhaps, but the way I would position this is that you’re helping teams de-risk their AI/LLM backends. This is how Katara AI (my new project) would like to leverage POKT Network. Most teams are either building a wrapper around a hosted foundational LLM-as-a-Service (e.g. OpenAI) or attempting to host/run their own infra. We are doing both as a way to diversify the backend, so we would like to both contribute to POKT as a node runner (reducing our infra cost) and leverage POKT for access to other models.

Question 2: What strengths can we draw on to solve this problem effectively for them?
Well, I think the community’s experience running blockchain infra is super relevant here, and POKT has already proven it can build the protocol to solve for the trilemma on the web3 side, so it’s not a huge leap to suggest you can solve for this in Web2/3 AI. The pattern for a developer looks the same: we don’t want a heavy dependency on a centralized LLM-as-a-Service provider, nor do we want to operate a bunch of infra if we don’t have to run everything ourselves.

Question 3: What do we need in order to have a solution ready for testing?
You need test/prod endpoints, monitoring, and docs to get teams started. Some early latency numbers would also be helpful, along with anything else that could be a potential restriction.

Question 4: Who are we competing with in this specific problem space?
You’re going to be competing with a few emerging web2 platforms that have already identified the problem but are trying to solve it as web2 shops. Teams like https://vectara.com/, who are building a RAG-as-a-Service platform, or https://www.neutrinoapp.com/, who are building centralized LLM routing/model-merging solutions. Both will have a few advantages due to centralization, but that’s obviously a spot POKT can differentiate on.

Question 5: What is the customer (from Q1) looking for in a solution?
My new project is the customer from Q1 and what we would be looking for at first pass is the following:

  1. De-risk our backend via a decentralized provider of diverse model endpoints
  2. Expand our product offerings with access to additional models / functionality via a decentralized source
  3. Potentially interested in model-merging endpoints that are backed up by a decentralized source

Let me qualify the above. There are two main techniques for model diversification. One is the ensemble technique, in which one operates several models, submitting each the same input, checking results, and then using the best answer. This is great for some architectures as it helps spread risk across multiple pieces of infra (particularly if the family of models is a mix of as-a-service and self-hosted). The other is model merging, in which multiple models are combined into a single endpoint, essentially “extending” model features/functionality beyond any one model’s capabilities. (This is also rather interesting when considering how POKT might impact AI architectures.)
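
For concreteness, here is a rough sketch of the ensemble side. The endpoint URLs and model IDs are hypothetical, and the “best answer” selection is a trivial placeholder for a real scoring step (a judge model, a task metric, voting, etc.).

```python
# Ensemble sketch: fan the same prompt out to several backends and keep the
# best-scoring answer. All URLs/model IDs are illustrative only.
from openai import OpenAI

BACKENDS = [
    ("https://api.openai.com/v1", "gpt-4"),                         # as-a-service
    ("http://localhost:8000/v1", "meta-llama/Llama-2-7b-chat-hf"),  # self-hosted vLLM
    ("https://pokt-gateway.example/v1", "llama-2-7b"),              # hypothetical POKT endpoint
]

def ensemble_ask(prompt: str) -> str:
    answers = []
    for base_url, model in BACKENDS:
        client = OpenAI(base_url=base_url, api_key="...")
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        answers.append(resp.choices[0].message.content)
    return max(answers, key=len)  # placeholder: swap in a real quality metric
```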

BONUS ROUND: What could go wrong?
QoS could be really bad, additional latency, etc. The architecture also opens itself up to new attack vectors, prompt injection for example. We’re working with https://www.lakera.ai/ and I’m not sure how using AI security services like this would work in relation to POKT’s plans for decentralizing access. Finally, someone mentioned decentralized RAG; I’m curious what those pipelines would look like, as teams building off foundation models provide the most value at the RAG/pipeline layer. Meaning: the way I architect a RAG/prompt pipeline is my differentiator when the backend foundational model is the same, so what exactly will POKT be offering? We wouldn’t give up our entire RAG pipeline to POKT.

5 Likes

The answers to these questions can get REALLY long (i.e. a separate blog post for each question), so I kept it short, to the point, and focused on what’s needed for a pilot.

Question 1: What specific problem is POKT Network best placed to solve within AI, and for whom?

  • POKT Network: Performant and Cost-effective LLM inference endpoints across various open-source models.
  • POKT Network Gateways (e.g. Grove): Reliable access and routing to the inference endpoints on the network.
  • For whom: Cost-sensitive developers/enthusiasts looking to experiment with various open-source models.

Question 2: What strengths can we draw on to solve this problem effectively for them?

What is our “right to win” / what makes us credible / what is our competitive edge.

  • Right to win: Years of experience; supporting a diverse ecosystem of open-source LLMs is the same as supporting a diverse ecosystem of open-source blockchains.
  • Competitive Edge (POKT): Ability to attract, aggregate, incentivize and verify the Supply side.
  • Competitive Edge Gateways (e.g. Grove): Attract & support demand/users by providing QoS guarantees.

Question 3: What do we need in order to have a solution ready for testing?

What resources do we need, what do we need to build, what kind of partners do we need… etc

  • Protocol: New Chain IDs for LLMs and updated RTTM; lots of technical detail here that I’m omitting.
  • Supply: Hardware operators, preferably with GPUs.
  • Demand: Early adopters willing to use inference endpoints that may be lower quality (at first) in exchange for cheaper access and a more diverse set of models.
  • Gateways: Minimum Viable (Experimental) QoS for LLMs.

Question 4: Who are we competing with in this specific problem space?

Please just note any you’re immediately aware of and what they’re doing.

Web2: together.ai, octo.ai, anyscale.com, tromero.ai, etc…

Web3: morpheus.network, tromero.ai, etc…

Question 5: What is the customer (from Q1) looking for in a solution?

Please indicate what your answer is based on - opinion is valuable, but it would be good to flag it as such.

  • Early adopters (e.g. looking to experiment)
  • Developers (e.g. hands-on teams)
  • Looking to trade off some QoS (e.g. speed) in exchange for hands-on support, lower cost, and potentially access to a wider array of models

BONUS ROUND: What could go wrong?

The upside feels very clear and compelling so I want to create space to flag any concerns.

Too much focus / discussion revolving around how we solve Quality of Service rather than just shipping something that’s “good enough” and being the #1 player in the space.

5 Likes

Hey @Matteo, to clarify, what I was suggesting is that Pocket could be used as a RAG datastore solution. Imagine a decentralized version of Pinecone. So, I was not suggesting that anyone should turn their RAG pipeline over to Pocket or make them public. On the contrary, I see this as a way to monetize RAG pipelines while also keeping them private. For example, Pocket could be used to relay RAG requests similar to how RPC requests for different chains are relayed. So, RAG data could be kept private but made accessible for rewards/fees. Does that make sense?
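
To make the relay analogy concrete, here is a rough sketch of what a RAG query relayed like an RPC request might look like. Every URL, method name, and field is invented for illustration; the point is only the shape: the querier gets matches (ideally with integrity proofs) while the datastore owner keeps the index, and the pipeline around it, private.

```python
# Hypothetical sketch: a vector-store query relayed through a gateway the way
# chain RPC requests are today. Method name, URL, and response fields are
# placeholders, not a real POKT interface.
import requests

def relay_rag_query(gateway_url: str, app_key: str, query_vector: list[float], top_k: int = 5):
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "vector_query",  # hypothetical relay method
        "params": {"vector": query_vector, "top_k": top_k},
    }
    resp = requests.post(gateway_url, json=payload, headers={"Authorization": app_key})
    resp.raise_for_status()
    # The datastore owner returns matches (plus, ideally, a proof of source and
    # integrity) without ever exposing the underlying index or pipeline.
    return resp.json()["result"]["matches"]
```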

1 Like

@Steve, nice - I like it.

3 Likes

@Adrienne, @Olshansky, @RawthiL, @Jinx, @shane, @bulutcambazi - After listening to the full transcript from the last call, I thought it would be helpful for me to expand on why I don’t think running open-source LLMs (AI inference) is the best fit for Pocket. In short, I just can’t come up with a good answer to the question: what is the value proposition?

Is it price? Performance? Flexibility? Privacy? Simplicity? Something else? When I consider each relative to alternative options like AWS Bedrock, CloudFlare AI Workers, HuggingFace Inference API, or the increasing number of options from smaller players like Runpod - I can’t see any strong value proposition that would make a Pocket LLM inference offering competitive. I’d love to be convinced otherwise but here’s my current logic.

Price

Could Pocket provide access to open-source LLMs for less than a competitive provider? If the assumption is ‘yes’, how? Someone needs to pay for the infrastructure necessary to make the models available. Would the current base of node runners, or new node runners, be able to do this more efficiently than AWS, CloudFlare, etc.? Again, how? By spending less? By a willingness to make less? I know for sure that some existing providers are willing (and able) to lose money providing access to open-source models because the prompt data is valuable for fine-tuning. Will Pocket node runners be willing to operate at a loss?

Performance

Could open-source LLMs run on Pocket outperform the other options? Would latency be reduced? Would model completion quality increase? No, and no. The additional node coordination/protocol layer will add overhead and increase latency. Model completion quality will likely be worse because some independent node runners will likely try to game the system by misrepresenting what models they are hosting (because smaller models are less expensive to run). And this will be impossible to monitor for.

Flexibility

One of the main reasons to use open-source LLMs is not because they are cheaper than using OpenAI, Google Gemini, etc. It’s because they can be modified and retrained. This value proposition doesn’t exist when accessing open-source models via a 3rd party host.

Privacy

Another key selling point for using open-source LLMs is data privacy. The option to self-host open-source LLMs enables the prompts sent to the models and the resulting completions to stay within organizational boundaries, keeping the data private. However, this won’t be the case when using Pocket or any other 3rd-party provider.

Simplicity

Would it be easier to use open-source LLMs through Pocket? Relative to running the LLMs yourself - yes. Relative to accessing them through some other API - no.

So, that’s my short logic - without getting into the impact of hardware innovations like new chip architectures or other emerging open-source LLM packaging options like Nvidia Inference Microservices that could change everything even before Pocket gets the supply side figured out.

My “day job” is architecting enterprise AI systems, and has been for the past 10+ years. I’m also a long-time supporter of Pocket and have run hundreds of Pocket nodes. So I feel like I have a good understanding of both sides of this.

I’m not trying to be the devil’s advocate here. I’m genuinely interested in better AI solutions and an increased demand for POKT. I have a strong incentive to want both. I’m just not seeing either here.

But if I’m missing something please, please, set me straight - I’d love to be wrong or missing something.

4 Likes

I don’t think I’ll “prove you wrong”, but I do think there are other factors at play that support continued research on this front. Have you read @Olshansky’s blog post about this? I think it sets the stage for what we’re trying to accomplish:

I’ve generally agreed that I don’t think POKT will be in the business of running LLMs directly; data relaying is what we’re good at, and I think that will be the biggest use case of POKT in the AI space. And I think that comes in two forms:

  • relaying actual inference requests to DecAI systems, especially when they are blockchain-based systems. This would be ongoing, fairly continuous traffic.

  • relaying data for training models in parallel when the training data benefits from mass parallelization (i.e. when the size of the model crosses the ratio of third-party latency for distribution versus processing time in a dedicated environment with a small number of chips). This would be burst traffic for the duration of training.

Both of these are theoretical, based on the work that others are doing, and require additional development in both the protocol and the gateways wanting to support that use case.

I do want to run a Llama2 instance or similar on some of our POKT nodes but only to generate network traffic to see where we can fit in, and with what architecture. My expectation is that we end up being to Gensyn or Bittensor what we are to Ethereum: a decentralized way to access data being generated by other projects.

2 Likes

Thanks for the reply @Jinx

I have read @Olshansky’s post. It’s excellent and I agree 100% - no one wants to host their own LLM. I could take that one step further and say no one wants to write their own software. Both are costly and time-consuming.

But the reasons to use open-source LLMs have less to do with costs and effort and more to do with the “lack of moat” that @Olshansky mentions in his post. I might not “want” to host my own LLM, but if I want to fine-tune it in a way that provides a competitive advantage, or if I want to keep my prompts and completions private, delegating the hosting isn’t an option. So, yes, I don’t “want” to host it but…

What we (Pocket) want and what the AI community needs must be aligned for Pocket to significantly benefit from the demand distributed AI will create for blockchain projects. Thankfully, there are lots of possibilities - this just isn’t close to the best one in my opinion.

3 Likes

I somewhat agree with what you say; plain traffic handling will not give us enough edge.
I do believe that we can do inference cheaper than centralized services if we use the idle time of already-deployed models.

Also, regarding privacy, which do you prefer: a single entity that knows all your interactions, linked to your profile and possibly a KYC policy, or a network of independent inference providers who would need to do a lot of work to doxx the inference origins and are not in the data-selling business?

One upside of plain inference is that our models won’t be censored by any entity, as nodes can be run from multiple geopolitical regions and staked permissionlessly.

Regarding the API, I will advocate for full OpenAI compatibility, so there will be no friction on that side. Only what the gateways might want to add in Morse.

Thanks for saying this. I’ve been repeating that since day zero.
While I agree that this is a current weakness, I think that solving for it will be our biggest edge in the future. My long-term vision with LLMs is to implement a model leaderboard (like Hugging Face’s) using the Watcher abilities of Shannon. We are currently developing this as an off-chain service in our Socket repo and aiming to have it running shortly after LLM inference begins on mainnet.
I think that an evolving and immutable leaderboard of models is what’s missing in the AI ecosystem. The only metrics you have for the APIs you use are given by the API provider or obtained indirectly by using the HF leaderboard and trusting that the actual model behind the API is the one being publicized (which is often not the case; see this paper on ChatGPT performance evolution).
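
To sketch what such a watcher could look like: purely illustrative, not the actual Socket/Shannon design. The model ID, benchmark format, and scoring are placeholders; a real service would use proper benchmark suites and record results immutably.

```python
# Rough illustration of an off-chain watcher building a model leaderboard:
# probe each staked endpoint with benchmark prompts, score the replies, rank.
# All names and the scoring rule are hypothetical placeholders.
from dataclasses import dataclass
from openai import OpenAI

@dataclass
class Score:
    node_url: str
    accuracy: float

def query_node(node_url: str, prompt: str) -> str:
    client = OpenAI(base_url=node_url, api_key="...")
    resp = client.chat.completions.create(
        model="llama-2-7b",  # hypothetical service ID
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def score_node(node_url: str, benchmark: list[tuple[str, str]]) -> float:
    """Fraction of benchmark prompts whose reply contains the expected answer."""
    hits = sum(expected.lower() in query_node(node_url, prompt).lower()
               for prompt, expected in benchmark)
    return hits / len(benchmark)

def leaderboard(node_urls: list[str], benchmark: list[tuple[str, str]]) -> list[Score]:
    scores = [Score(url, score_node(url, benchmark)) for url in node_urls]
    return sorted(scores, key=lambda s: s.accuracy, reverse=True)
```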

3 Likes

Thanks for your comments @RawthiL.

You are convinced that we can provide this for free in some cases? And run nodes more efficiently than AWS, CloudFlare, etc? I would love to better understand your thinking here.

If the “single entity” is my company, I would choose the single entity every time. That is the point here, I believe: should I host it myself or delegate to a 3rd party? The details are nuances that will play very little into the decision-making process, or matter only to a small percentage of the market.

Will the leaderboard include metrics from off-chain competitors? That’s who Pocket will be competing with.

2 Likes

Cost is not price: the costs of AWS/GCP, etc. may be lower, but what they charge has nothing to do with that.
What’s the cost if you already have a model running locally (for other business) and can just connect it to POKT to earn some extra coins? None, it’s all upside. Pocket won’t care whether your model is fine-tuned or not; we will measure it and assign work if we find it worthy.

Sadly I don’t own OpenAI… :smiling_face_with_tear:
Joke aside, hosting models with enough quality is expensive; the default is to go for a paid API. I don’t see hosting a model as a comparable option to using an API; they are different use cases. Unless you are fine-tuning or have large volumes of requests, the API is cheaper.

Sure, we can use DAO funding to pay for endpoints; it won’t be too expensive and it’s part of the plan. Only for reference models of course, like OpenAI, Gemini, etc. Smaller companies will be welcome to stake a node and publicize their address.
Language models are black boxes to us; an off-chain model is just a service pointing to an API.

3 Likes

Thanks again for the comments @RawthiL

Is a strategy to compete with AWS/GCP based on undercutting their price the best position for Pocket? A large majority of the Pocket nodes are hosted by providers who don’t own any hardware and are reselling AWS/GCP and other cloud offerings. This doesn’t seem like a great strategy to me.

This is true if you have that supply available. But who does? Are you referring to Poktscan’s extra capacity or are you assuming there is a broader market for supply? How do we find them? How much can they make? What is the effort to get those nodes in place (if you’re not just referring to the 3-4 big existing node runners)?

You are right, API access is cheaper. In fact, it’s free for most low-volume usage - that’s one of the big reasons that I’m concerned. But hosting them yourself is getting easier and less expensive. I’m sure you’ve seen and likely used ollama - it’s a lot easier than setting up a node.

From a user perspective: if there are many free/inexpensive off-chain API options, why choose Pocket? Do you expect that the dashboard will show that Pocket LLM endpoints are better performing and more cost-effective than off-chain API alternatives?

Again, thank you for the feedback. I hope I’m not coming off as sarcastic or argumentative - I just don’t see strong benefits on either side here.

2 Likes

Hugely appreciate the comments - thank you @steve.

Could you expand a little on the use cases where you do see more potential? I would love to hear more detail if you’re willing to share.

Thank you!

Adrienne

2 Likes