How well you can run models depends on your graphics card: the more VRAM, the better. I'm running a 2080 Ti, so I have 11 GB of GDDR6, which is just barely enough for low-to-mid-size models. A 7B model with a 7 GB download was using about 9 GB of my VRAM, and a 2 GB model was taking about 4 GB. Bigger models (13B and up) need a minimum of about 12 GB of VRAM, which is why so many people want a 4090, since that has 24 GB.

"Am I dumb, or are the offline models just slow as hell? Like, am I doing something wrong, or is my computer just doodoo and AI is beyond it?"
So yeah, if the model is bigger than what fits on your video card, it spills over and bogs down to where a reply takes a really long time.
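If you want a rough feel for whether a model fits, here's a minimal back-of-the-envelope sketch. The formula (parameter count times bytes per weight, plus a flat overhead for activations and the KV cache) and the 1.5 GB overhead figure are my own illustrative assumptions, not exact numbers; real usage varies by runtime and context length.

```python
def estimate_vram_gb(num_params_billion, bits_per_weight=16, overhead_gb=1.5):
    """Rough VRAM estimate in GB: model weights plus a fixed
    overhead guess for activations and the KV cache."""
    weight_gb = num_params_billion * (bits_per_weight / 8)
    return weight_gb + overhead_gb

# A 7B model quantized to 8 bits: ~7 GB of weights plus overhead,
# which lines up with the ~9 GB I saw on my 2080 Ti.
print(round(estimate_vram_gb(7, bits_per_weight=8), 1))

# A 13B model at 8 bits already pushes past 11 GB, hence the 12 GB+ advice.
print(round(estimate_vram_gb(13, bits_per_weight=8), 1))
```

If the estimate comes out above your card's VRAM, expect the slowdown described above.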