
Why Your "Senior ML Engineer" Can't Deploy a 70B Model

TL;DR: Small models (≤30B) and large models (100B+) require fundamentally different infrastructure skills. Small models are an inference optimization problem—make one GPU go fast. Large models are a distributed systems problem—coordinate a cluster, manage memory as the primary constraint, and plan for multi-minute failure recovery. The threshold is around 70B parameters. Most ML engineers are trained for the first problem, not the second.

Here's something companies learn after burning through 6 figures in cloud credits: the skills for small models and large models are completely different. And most of your existing infra people can't do both.

Once you cross ~70B parameters, your job description flips. You're not doing inference optimization anymore. You're doing distributed resource management. Also known as: the nightmare.
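A back-of-the-envelope sketch makes the threshold concrete. The numbers below are assumptions, not figures from the post: fp16 weights at 2 bytes per parameter, 80 GB cards, and a flat 20% allowance for KV cache, activations, and runtime overhead.

```python
import math

def gpus_needed(params_billions: float, bytes_per_param: int = 2,
                gpu_mem_gb: int = 80, overhead: float = 0.20) -> int:
    """Minimum GPU count just to hold the weights plus a rough runtime overhead."""
    weights_gb = params_billions * bytes_per_param   # 1e9 params * bytes / 1e9 bytes-per-GB
    total_gb = weights_gb * (1 + overhead)           # KV cache, activations, runtime
    return max(1, math.ceil(total_gb / gpu_mem_gb))

for size in (7, 13, 30, 70, 180):
    print(f"{size:>4}B params -> ~{gpus_needed(size)} x 80 GB GPU(s)")
```

Under those assumptions a 30B model still squeezes onto a single card, while 70B needs three or more, and suddenly tensor parallelism, interconnect bandwidth, and multi-node failure recovery are your problems.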

Solving GPU Passthrough Memory Addressing in OpenStack

Delivering Accelerator-enabled Developer Cloud Functionality on Rackspace OpenStack Flex.

When AMD launched the AMD Developer Cloud, we took notice. Here was a streamlined platform giving developers instant access to high-performance MI300X GPUs, complete with pre-configured containers, Jupyter environments, and pay-as-you-go pricing. The offering resonated with the AI/ML community because it eliminated friction: spin up a GPU instance, start training, destroy it when done.
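That spin-up / train / destroy loop maps directly onto the OpenStack SDK. A minimal sketch, assuming a clouds.yaml entry for Flex plus placeholder flavor, image, network, and keypair names (none of these identifiers come from the post):

```python
import openstack

conn = openstack.connect(cloud="rackspace-flex")       # clouds.yaml entry (assumed name)

image = conn.image.find_image("ubuntu-24.04")          # placeholder image name
flavor = conn.compute.find_flavor("gpu.mi300x.1")      # placeholder GPU flavor
network = conn.network.find_network("tenant-net")      # placeholder network

server = conn.compute.create_server(
    name="training-box",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
    key_name="my-keypair",                             # placeholder keypair
)
server = conn.compute.wait_for_server(server, wait=600)  # block until ACTIVE
print(f"SSH to {server.access_ipv4} and start training")

# ...run the job, then pay-as-you-go means cleaning up promptly:
conn.compute.delete_server(server)
```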

Getting Started with GPU Compute on Rackspace OpenStack Flex

Your instance is up, your GPU is attached, and now you're staring at a blank terminal wondering what's next. Time to get Clouding.

Rackspace OpenStack Flex delivers GPU-enabled compute instances with real hardware acceleration, not the "GPU-adjacent" experience you might get elsewhere. Whether you're running inference workloads, training models, or just need raw parallel compute power, getting your instance properly configured is the difference between frustration and actual productivity.
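Before anything else, it's worth confirming the instance really sees the hardware. A quick sanity check in Python, assuming pciutils is installed and the PyTorch build matches your GPU stack (CUDA for NVIDIA, ROCm for MI300X):

```python
import subprocess

import torch

# 1. Is a GPU on the PCI bus at all? (confirms passthrough actually attached it)
#    The exact device description depends on the local pci.ids database.
pci = subprocess.run(["lspci"], capture_output=True, text=True).stdout
gpu_lines = [line for line in pci.splitlines()
             if "NVIDIA" in line or "AMD/ATI" in line]
print(f"PCI devices that look like GPUs: {len(gpu_lines)}")
for line in gpu_lines:
    print("  ", line)

# 2. Can the ML framework actually drive it? torch.cuda covers both CUDA
#    (NVIDIA) and ROCm (AMD) builds of PyTorch.
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(f"device {i}: {torch.cuda.get_device_name(i)}")
    x = torch.randn(1024, 1024, device="cuda")  # tiny matmul; raises if the driver stack is broken
    print("matmul ok:", torch.isfinite((x @ x).sum()).item())
else:
    print("No usable GPU from PyTorch -- check drivers and container runtime.")
```

If the device shows up on the PCI bus but the framework can't see it, the gap is almost always drivers or the container runtime, not the instance itself.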