Topic: Hardware shopping tips for SD and other AI uses?

Posted under Off Topic

I'd love to hear some informed opinions.

My rig is five years old now. I still have an RTX 2070 Super with 8 GB of VRAM. It's not ideal for SD, but it could be worse.
I've also been experimenting with locally run LLMs/chatbots for the first time. I'm still having a blast with all the possibilities, but it's obvious that my 8 GB is hopelessly inadequate if I want to tap their full potential.
I was gonna wait for the next RTX generation to be released and then maybe get one of the 4000 series when prices drop, but it looks like that may be another year from now and I'm getting itchy fingers...

Even if I got a new PC with a 16 GB card now, it would basically be insta-obsolete for LLMs and maybe other AI-driven applications, too - and even more so in another year or two.
I've got this feeling that high-VRAM cards will become more common and more affordable in the years to come, now that demand is growing with AI developments.
So... Apart from having itchy fingers, it does seem like a bad time to spend a lot of money. Maybe there is an upgradable, future-proof solution? I'm not so up-to-date with current hardware and developments.
My budget would be something between 1500 and 2000€ for the whole system.

Thanks for all replies. The more, the better.

Updated

With NVIDIA holding a near-monopoly on AI generation, don't expect much generosity from them in the near future - not until AMD becomes popular for AI generation too. The cheapest NVIDIA option with the maximum possible VRAM is still the 3090.

Updated

AMD GPUs are useless for this right now, so either wait until AMD releases its UDNA architecture or buy a used GPU such as a 3080/Ti. I have a 1080 Ti and I'm perfectly fine with it.

AMD hardware is, unfortunately, almost useless for this, so you are stuck with NVIDIA.
For image generation you'll want 12 GB of VRAM to get comfortable. The cheapest option is the 3060 12GB. You can go up or down a GPU generation, just make sure you have at least 11-12 GB of VRAM, not less. It can be a slower GPU; the amount of VRAM is what you care about.
For LLMs, I don't think anything below 24 GB of VRAM will really cut it. You can run smaller or more heavily quantized models with less VRAM, but output quality suffers to the point where the whole thing becomes useless. So yeah, you're pretty much limited to the top NVIDIA GPUs then.
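For a rough sense of why 24 GB is the threshold, here's a back-of-the-envelope sketch of weight sizes at different parameter counts and quantization levels (ballpark numbers only; the KV cache, activations and runtime overhead come on top):

```python
# Back-of-the-envelope size of LLM weights alone (ballpark; KV cache,
# activations and runtime overhead add several more GB on top).

def weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GiB for a given parameter count and precision."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1024**3

for params in (7, 13, 34, 70):
    for bits in (16, 8, 4):  # fp16, int8, 4-bit quantization
        print(f"{params:>3}B @ {bits:>2}-bit ≈ {weights_gib(params, bits):5.1f} GiB")
```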

As an experiment, you could try an Apple Silicon machine with unified memory. It can get you above 24 GB of GPU-addressable memory cheaper than the alternatives. Although I've personally never tried it, so consider it experimental. Otherwise you could try getting a used Quadro GPU, but those usually require modifications for desktop PCs (most of them have no fans).
New NVIDIA 5XXX GPUs are expected to drop at the end of 2024 or the beginning of 2025. But the prices won't be pretty.

Here's a huge resource on the topic: https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
And here's your TL;DR: https://timdettmers.com/wp-content/uploads/2023/01/GPUs_Ada_performance_per_dollar6.png
This chart is useful too: https://i0.wp.com/timdettmers.com/wp-content/uploads/2023/01/gpu_recommendations.png?ssl=1

Updated

Also note that Stable Diffusion WebUI Forge can detect a low-memory environment and will use a combination of GPU memory, system RAM, and the pagefile. I tried generating an image with a Flux model on a notebook with a 1650 (4 GB) + 16 GB of system RAM + a 64 GB pagefile on an SSD. It works. Slowly, but the image was generated successfully. 1024x1024, 10 steps took around 10 minutes.
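To make the idea a bit more concrete, here's a toy sketch of the general tiering concept (this is not Forge's actual code, just an illustration of why generation still completes but slows down at each tier): whatever doesn't fit in VRAM spills into system RAM, and whatever doesn't fit there ends up in the pagefile.

```python
# Toy illustration of tiered placement (NOT Forge's actual logic): fill VRAM
# first, spill to system RAM, and let the pagefile absorb the rest. All sizes
# here are made-up round numbers just to show the idea.

def place_layers(layer_sizes_gb, vram_gb, ram_gb):
    """Greedily assign each layer to VRAM, then RAM; the rest goes to the pagefile."""
    placement, vram_used, ram_used = [], 0.0, 0.0
    for i, size in enumerate(layer_sizes_gb):
        if vram_used + size <= vram_gb:
            placement.append((i, "VRAM")); vram_used += size
        elif ram_used + size <= ram_gb:
            placement.append((i, "RAM")); ram_used += size
        else:
            placement.append((i, "pagefile (SSD)"))
    return placement

# ~12 GB of weights split into 24 layers, on a 4 GB GPU with 16 GB of RAM
# (of which only part is actually free for the model).
for idx, tier in place_layers([0.5] * 24, vram_gb=3.0, ram_gb=8.0):
    print(f"layer {idx:2d} -> {tier}")
```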

yetanotheraiuser said:
Also note that Stable Diffusion WebUI Forge can detect a low-memory environment and will use a combination of GPU memory, system RAM, and the pagefile. I tried generating an image with a Flux model on a notebook with a 1650 (4 GB) + 16 GB of system RAM + a 64 GB pagefile on an SSD. It works. Slowly, but the image was generated successfully. 1024x1024, 10 steps took around 10 minutes.

Yes, this works, but if you do that too often, it will ruin your SSD!

silvicultor said:
Yes, this works, but if you do that too often, it will ruin your SSD!

It won't ruin your SSD, and neither will using an SSD for the pagefile. That's a myth.
Your SSD will become obsolete (either in size or in speed) long before it degrades.

yetanotheraiuser said:
Also note that Stable Diffusion WebUI Forge can detect a low-memory environment and will use a combination of GPU memory, system RAM, and the pagefile. I tried generating an image with a Flux model on a notebook with a 1650 (4 GB) + 16 GB of system RAM + a 64 GB pagefile on an SSD. It works. Slowly, but the image was generated successfully. 1024x1024, 10 steps took around 10 minutes.

It will be so slow that it becomes pretty much useless. You're always better off using tiled processes and lower resolutions than waiting on RAM and swap.
I once ran a 4x upscale on a 2560px image by accident. I only realized the process was still running (and stuck at 10%) 30 minutes later, because the videos I was watching started to lag. To put it into perspective, I'd upscale that image 4x with a tiled upscaler in, like, 5 minutes.
If you're running out of VRAM, rethink what you're doing and find a better way. VRAM swapping to RAM is simply there to prevent OOM crashes. It's not meant to actually get work done... Slowly...
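To illustrate what a tiled process buys you, here's a minimal sketch with Pillow. The upscale_tile function is a stand-in: a real tiled upscaler (an "Ultimate SD Upscale"-style extension) runs each tile through an upscaling model or an img2img pass and blends overlapping seams, but the memory story is the same.

```python
# Minimal sketch of tiled upscaling using Pillow. upscale_tile() is a
# placeholder: a real tiled upscaler would run each tile through an
# ESRGAN/img2img pass and blend overlapping seams. The point is that the
# model only ever sees one small tile at a time instead of the full image,
# which keeps VRAM usage flat.

from PIL import Image

def upscale_tile(tile: Image.Image, factor: int) -> Image.Image:
    # Stand-in for the actual model call.
    return tile.resize((tile.width * factor, tile.height * factor), Image.LANCZOS)

def tiled_upscale(img: Image.Image, factor: int = 4, tile: int = 512) -> Image.Image:
    out = Image.new("RGB", (img.width * factor, img.height * factor))
    for y in range(0, img.height, tile):
        for x in range(0, img.width, tile):
            box = (x, y, min(x + tile, img.width), min(y + tile, img.height))
            out.paste(upscale_tile(img.crop(box), factor), (x * factor, y * factor))
    return out

result = tiled_upscale(Image.open("input.png").convert("RGB"))  # hypothetical input file
result.save("output_4x.png")
```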

Thanks for all replies. I will definitely look into the links.

A 3060 wouldn't be enough of an upgrade compared to my 2070 Super, in my opinion. Maybe a 3090, not sure yet... Anything under 16 GB doesn't really seem worthwhile.
If I should get a new system soon, I guess I could always consider replacing the graphics card again in a few years, if high-VRAM cards ever become standard and affordable.

Is SLI still a thing? Could I tandem my old 2070 Super with a new 16GB card for a total of 24GB? And/or could I get a second 16GB card later (maybe the exact same) for that purpose?
As you can see, I'm not that up to date anymore... I wish I could just buy an external 32GB chunk of VRAM and plug it into a USB port. xD

argon-42 said:
Is SLI still a thing? Could I tandem my old 2070 Super with a new 16GB card for a total of 24GB? And/or could I get a second 16GB card later (maybe the exact same) for that purpose?
As you can see, I'm not that up to date anymore... I wish I could just buy an external 32GB chunk of VRAM and plug it into a USB port. xD

SLI is dead, although the 3090 got NVLink. None of the 4XXX GPUs have it anymore, and future ones probably won't either.
NVLink should be capable of merging the VRAM of multiple cards into one pool, but you'd better research whether (and how well) it works for your applications. Some actually get slower (Blender, for example). Some might not support it at all, but I'm very fuzzy on that; I haven't touched multi-GPU since the 1080 Ti.

I wish I could just buy an external 32GB chunk of VRAM and plug it into a USB port. xD

The main reason ML uses VRAM is how fast it is compared to any other memory, and how low the latency is. For that, it needs to sit as close to the GPU die as possible. Anything external wouldn't work, the same way swapping into RAM is so slow it's easier to just stop. Imagine how painful anything slower than RAM would be. :)
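Some ballpark bandwidth numbers (rough, typical figures, not measured) make the gap obvious: every step down the hierarchy multiplies the time it takes to move the model's weights around.

```python
# Rough, typical bandwidth figures (not measured) for each tier of the memory
# hierarchy, and how long a single full pass over 24 GB of weights would take.
bandwidths_gb_per_s = {
    "GDDR6 VRAM (RTX 3060 class)": 360.0,
    "DDR4 RAM (dual channel)": 50.0,
    "NVMe SSD": 5.0,
    "SATA SSD": 0.55,
    "USB 3.0 (5 Gbit/s)": 0.6,
}

weights_gb = 24.0  # e.g. a 24 GB model streamed once
for name, bw in bandwidths_gb_per_s.items():
    print(f"{name:30s} {bw:7.2f} GB/s -> {weights_gb / bw:7.1f} s per pass")
```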
