Thanks @bulutcambazi for putting this information up, there is a lot to discuss around this subject. We are starting a Socket in March to treat this subject (and some other related ones).
We will post a longer comment on the subject as soon as possible, but there are a few points we want to clarify now.
Models
In order to whitelist a model we need a way to test whether a model is of a given type. Having a service for "Llama-2 13B" is too ambiguous, as we explained in our Socket presentation: there is no easy way to verify that a model is of a given kind, and there is also no reason to separate language models of one kind from others. Moreover, the set of models that can be staked is large and grows every day. Setting a service per model sub-class like "Mistral" or "Llama-2" will either bloat the blockchain or create friction for users who want to be able to get the best possible results. The case with diffusion models is similar.
Pricing
We have not yet dug into the pricing subject, but I would like to make the ecosystem friendly to independent node runners. I was thinking of using PoW mining rewards as a starting point for pricing: miners already have GPUs that they could connect to POKT instead, if we set a fair price. With some luck this will be much cheaper than other services, and we would bring in people who already have the hardware.
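To make the idea concrete, here is a back-of-envelope sketch of what "mining parity" pricing could look like. All the numbers (dollars per day from mining, tokens per second, utilization) are hypothetical placeholders for illustration, not POKT parameters or a proposal:

```python
# Hypothetical sketch: derive a floor price per 1k tokens such that a GPU
# serving inference on POKT earns roughly what it would earn mining PoW.
# Every input value here is a made-up placeholder.

def price_per_1k_tokens(mining_usd_per_day: float,
                        tokens_per_second: float,
                        utilization: float = 0.5) -> float:
    """USD per 1k tokens that matches the GPU's mining revenue."""
    tokens_per_day = tokens_per_second * utilization * 86_400
    return mining_usd_per_day / tokens_per_day * 1_000

# e.g. a GPU earning $2/day mining, serving ~30 tok/s at 50% utilization
print(round(price_per_1k_tokens(2.0, 30.0), 6))
```

The point of the sketch is only that the break-even price falls out of two numbers miners already know about their hardware, which is what would make switching over easy to reason about.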
Regarding counting tokens, I would rather not rely on gateways for that. Also, what happens if you need more tokens? Models can easily take 4K tokens, so why restrict them? This would be a pain point for users who know that context size is critical (and often scarce).
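To illustrate why the count could live on the node side rather than at the gateway, here is a minimal sketch of request-side context checking. The 4-characters-per-token rule is a common rough approximation, not a real tokenizer; an actual node would load the model's own BPE/SentencePiece vocabulary, and the 4K limit is just an example:

```python
# Hypothetical sketch: a node checks whether a request fits its model's
# context window, without trusting a gateway's count. The heuristic of
# ~4 characters per token is a rough English-text approximation only.

CONTEXT_LIMIT = 4096  # example: a 4K-context model

def approx_tokens(text: str) -> int:
    # crude stand-in for a real tokenizer
    return max(1, len(text) // 4)

def fits_context(prompt: str, max_new_tokens: int) -> bool:
    return approx_tokens(prompt) + max_new_tokens <= CONTEXT_LIMIT

print(fits_context("hello " * 100, 512))   # small prompt fits comfortably
print(fits_context("x" * 20_000, 512))     # oversized prompt is rejected
```

The check is per-model, which is exactly why a fixed protocol-level token cap feels wrong: the scarce resource is each model's own context window, not a global number.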
We will try to get a document ready as soon as possible to cover all these subjects, but we need to be clear that running machine learning models is not like running blockchain nodes, and machine learning users/devs are not like blockchain users/devs.