I’m running Ollama (and OpenWebUI) in an Incus OCI container, using the Docker image directly. I’ve configured the container to use both GPUs, so it has two GPU devices, gpu0 and gpu1, and I’ve set nvidia.driver.capabilities: all and nvidia.runtime: "true". It does work, as you can see, but it isn’t using the second GPU for calculations. What could be happening?
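For reference, the setup above looks roughly like this (a sketch; the container name `ollama` and selecting GPUs by `id` are assumptions, adjust to your actual names and devices):

```shell
# Attach both GPUs as separate devices (container name "ollama" is hypothetical)
incus config device add ollama gpu0 gpu id=0
incus config device add ollama gpu1 gpu id=1

# Enable the NVIDIA runtime and expose all driver capabilities
incus config set ollama nvidia.runtime=true
incus config set ollama nvidia.driver.capabilities=all

# Check that both GPUs are visible inside the container
incus exec ollama -- nvidia-smi
```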
incus --version says 6.4 on Rocky Linux. I can update to 6.5-1 if something newer could fix this (and of course only if it is an Incus thing at all).
I’ve posted this in r/Ollama too, just in case.
EDIT: maybe the lack of NVLink on the 2060 has something to do with this?