LLM Hardware Chronicles: The Unexpected Journey from NVIDIA to AMD and Back
After months of experimenting with local Large Language Models (LLMs) via Ollama, I’ve learned one fundamental truth: You can never have enough VRAM.
This journey from consumer GPUs to datacenter cards has been filled with interesting discoveries about the balance between hardware capabilities, software ecosystems, and practical limitations.
The Initial Setup

My journey with the Derr AI system began in the wild west of models, experimenting with everything I could get my hands on to understand what would run locally. After much tinkering, I settled on running Mistral Small (22B parameters) on an NVIDIA RTX 3060 with 12GB of VRAM. While the 3060 is a solid card for many tasks, it quickly hit its limits with larger language models.
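To see why 12GB gets tight so fast, here's the kind of back-of-the-envelope math I ended up doing. This is a rough sketch: the bytes-per-parameter figures and the 20% overhead factor for KV cache and runtime buffers are my own assumptions, not Ollama's exact accounting.

```python
# Rough VRAM estimate for a 22B-parameter model at common quantizations.
# Quantization sizes and the overhead factor are illustrative assumptions.

BYTES_PER_PARAM = {
    "fp16": 2.0,      # full half-precision weights
    "q8_0": 1.0,      # ~8 bits per weight
    "q4_K_M": 0.56,   # ~4.5 bits per weight, a common Ollama default
}

def estimate_vram_gb(params_billions: float, quant: str,
                     overhead: float = 1.2) -> float:
    """Weights in GB, plus ~20% (assumed) for KV cache,
    activations, and runtime buffers."""
    weights_gb = params_billions * BYTES_PER_PARAM[quant]
    return weights_gb * overhead

for quant in BYTES_PER_PARAM:
    need = estimate_vram_gb(22, quant)
    verdict = "fits" if need <= 12 else "spills past a 12GB card"
    print(f"22B @ {quant}: ~{need:.1f} GB -> {verdict}")
```

Even at a 4-bit quant, a 22B model wants roughly 14 to 15GB once you account for overhead, which is exactly where the 3060 starts offloading layers to system RAM and everything slows to a crawl.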