I had been wondering whether it would be possible to run some local language models on the uConsole, but I never thought it would work well: I had tried Llama 2 7B on an Intel N95 device last semester for a school project, and it was not ideal.
Luckily, I came across this video comparing the Pi 4 with the Pi 5 running Llama and realized that the newer Llama 3.2:1b requires significantly fewer resources to run.
Then I tried it, and it works: about 2 to 3 tokens per second, which is usable.
But the CM5 would be so much better, so I am looking forward to it down the road.
I also tried Phi3 and Llama 3.2:3b. They are usable as well, but you have to wait significantly longer, at around 1 to 1.5 tokens per second.
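For anyone curious how to check the token rate themselves, here is a minimal sketch, assuming the models are served by Ollama (whose `llama3.2:1b` tag format I used above) through its local HTTP API on the default port; the prompt is just a placeholder.

```python
import json
import urllib.request

# Assumes a local Ollama server on its default port (11434)
# with the model already pulled via `ollama pull llama3.2:1b`.
URL = "http://localhost:11434/api/generate"

payload = {
    "model": "llama3.2:1b",
    "prompt": "Explain what the uConsole is in one sentence.",
    "stream": False,  # return one JSON object that includes timing stats
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Ollama reports eval_count (tokens generated) and
# eval_duration (nanoseconds spent generating them).
tokens_per_second = result["eval_count"] / result["eval_duration"] * 1e9
print(f"{tokens_per_second:.2f} tokens/s")
```

You can also see the same eval rate interactively by running `ollama run llama3.2:1b --verbose`, which prints timing stats after each response.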