Microsoft researchers developed a hyper-efficient AI model that can run on CPUs
hu3 4 days
Repo with demo video and benchmark:

https://github.com/microsoft/BitNet

"...It matches the full-precision (i.e., FP16 or BF16) Transformer LLM with the same model size and training tokens in terms of both perplexity and end-task performance, while being significantly more cost-effective in terms of latency, memory, throughput, and energy consumption..."

https://arxiv.org/abs/2402.17764
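
For context, the "1.58-bit" in the name refers to ternary weights {-1, 0, +1} (log2 3 ≈ 1.58 bits per weight). A minimal NumPy sketch of the absmean quantization scheme the paper describes (illustrative only; function names are mine):

  import numpy as np

  def absmean_ternary_quantize(w: np.ndarray):
      """Quantize weights to {-1, 0, +1} with the absmean scheme
      described in the BitNet b1.58 paper (arXiv:2402.17764)."""
      gamma = np.abs(w).mean() + 1e-8            # per-tensor scale
      w_ternary = np.clip(np.round(w / gamma), -1.0, 1.0)
      return w_ternary, gamma

  # A matmul against ternary weights needs no multiplications:
  # each weight only adds, subtracts, or skips an activation.
  w = np.random.randn(4, 8).astype(np.float32)
  x = np.random.randn(8).astype(np.float32)
  wq, gamma = absmean_ternary_quantize(w)
  y_approx = (wq @ x) * gamma                    # dequantized output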


ilrwbwrkhv 3 days
This will happen more and more. This is why Nvidia is rushing to make CUDA a software-level lock-in; otherwise its stock will go the way of Zoom's.

zamadatix 3 days
"Parameter count" is the "GHz" of AI models: the number you're most likely to see but least likely to need. All of the models compared (in the table on the huggingface link) are 1-2 billion parameters but the models range in actual size by more than a factor of 10.

Jedd 3 days
I think almost all the free LLMs (not AI) that you find on Hugging Face can 'run on CPUs'.

The claim here seems to be that it runs usefully fast on CPU.

We're not sure how accurate this claim is, because we don't know how fast this model runs on a GPU:

  > Absent from the list of supported chips are GPUs [...]

And TFA doesn't really quantify anything, just offers:

  > Perhaps more impressively, BitNet b1.58 2B4T is speedier than other models of its size — in some cases, twice the speed — while using a fraction of the memory.

The model they link to is just over 1GB in size, and there are plenty of existing 1-2GB models that are quite serviceable on even a mildly modern CPU-only rig.
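
If you want to quantify "usefully fast" yourself, throughput is easy to measure. A rough harness, where `generate` is a hypothetical stand-in for whatever CPU inference runtime you use (bitnet.cpp, llama.cpp bindings, etc.):

  import time

  def tokens_per_second(generate, prompt, n_tokens=128):
      # `generate` is a hypothetical callable wrapping your inference
      # runtime; it should emit exactly n_tokens tokens for the prompt.
      start = time.perf_counter()
      generate(prompt, max_tokens=n_tokens)
      return n_tokens / (time.perf_counter() - start)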

ein0p 3 days
This is over a year old. The sky did not fall, and nobody switched to this in spite of the "advantages". If you look into why, you'll see that the quantization does, in fact, affect the metrics, some more than others, and there is no silver bullet.