Live vCPU and RAM resize is now generally available
Resize production VPS instances with sub-second pauses. Here's how we built it.
By Vintony Engineering
Live resize on VPS instances graduated from beta to general availability last week. You can now change vCPU count and RAM allocation, and grow the primary disk, from the dashboard or API at any time. The pause during a resize is consistently under one second in our measurements: short enough that most TCP connections stay alive, and short enough that you can resize during the working day without telling anyone.
The underlying machinery is QEMU's CPU hotplug for vCPU changes, virtio-mem for RAM, and online thinpool growth for storage. None of these are new technologies; the trick was integrating them with our control plane so that a customer can request 'go from 4 GB to 8 GB' without a human in the loop.
Going up is the easy direction. Adding a vCPU is a single QEMU command; adding RAM is a virtio-mem 'plug' message. The kernel notices the new resources and the workload starts using them within a few seconds. Going down is harder, because you have to convince the guest to release pages before you can pull them. We use memory ballooning with a small kernel module installed on our default images that politely asks the workload to release cold pages before the host yanks them.
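To make "a single QEMU command" and "a virtio-mem 'plug' message" concrete, here is a minimal sketch of the two QMP messages involved. It is an illustration under stated assumptions, not the control-plane code described in this post. `device_add` and `qom-set` with the `requested-size` property are standard QEMU interfaces; the driver name, device IDs, topology values, and the 2 MiB block size are hypothetical placeholders that depend on how a given VM was started.

```python
import json

GIB = 1024 ** 3
MIB = 1024 ** 2


def vcpu_hotplug_command(driver, cpu_id, socket_id, core_id, thread_id):
    """Build the QMP `device_add` message that hot-adds one vCPU.

    The driver name and the free topology slot are assumed to come from
    QEMU's `query-hotpluggable-cpus` output for the running machine.
    """
    return {
        "execute": "device_add",
        "arguments": {
            "driver": driver,
            "id": cpu_id,
            "socket-id": socket_id,
            "core-id": core_id,
            "thread-id": thread_id,
        },
    }


def virtio_mem_resize_command(device_path, requested_bytes, block_bytes):
    """Build the QMP `qom-set` message that changes a virtio-mem device's
    requested size. The size must be a multiple of the device block size."""
    if requested_bytes < 0 or requested_bytes % block_bytes != 0:
        raise ValueError("requested size must be a non-negative multiple of the block size")
    return {
        "execute": "qom-set",
        "arguments": {
            "path": device_path,
            "property": "requested-size",
            "value": requested_bytes,
        },
    }


if __name__ == "__main__":
    # Hypothetical VM: 4 GiB of boot memory plus a virtio-mem device "vmem0".
    # Reaching 8 GiB in total means asking the device to provide 4 GiB.
    print(json.dumps(vcpu_hotplug_command("host-x86_64-cpu", "cpu4", 0, 4, 0)))
    print(json.dumps(virtio_mem_resize_command("vmem0", 4 * GIB, 2 * MIB)))
```

The sketch only builds the messages; sending them over the QMP socket and waiting for the guest to accept the new resources is left out.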
Storage resize is the most boring of the three. LVM thinpool growth happens online; the filesystem extension is also online for ext4 and xfs. We do not currently support online shrink because every filesystem that supports it has at least one corner case that ate a customer's data, and we like our customers having data.
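As a rough illustration of the grow-only storage path, the sketch below builds the usual command sequence: extend the logical volume, then extend the filesystem. It is a simplification under stated assumptions, not the production workflow. The volume group, volume name, and mount point are hypothetical, and the sketch treats the volume and filesystem as if they lived on one machine, whereas in a hosted setup the filesystem step would run inside the guest. `lvextend`, `resize2fs`, and `xfs_growfs` are the standard tools for these steps.

```python
def grow_commands(vg, lv, add_gib, fs_type, mount_point):
    """Return the commands that grow a logical volume and its filesystem online.

    Only growth is allowed; a zero or negative size is rejected, mirroring
    the no-online-shrink policy described above.
    """
    if add_gib <= 0:
        raise ValueError("only growth is supported; shrink is refused")

    device = f"/dev/{vg}/{lv}"
    extend_volume = ["lvextend", "-L", f"+{add_gib}G", f"{vg}/{lv}"]

    if fs_type == "ext4":
        # resize2fs grows a mounted ext4 filesystem when given its device.
        extend_fs = ["resize2fs", device]
    elif fs_type == "xfs":
        # xfs_growfs operates on the mount point of the mounted filesystem.
        extend_fs = ["xfs_growfs", mount_point]
    else:
        raise ValueError(f"no online grow path sketched for {fs_type!r}")

    return [extend_volume, extend_fs]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for command in grow_commands("vg0", "customer-disk", 10, "ext4", "/"):
        print(" ".join(command))
```

Returning the commands instead of running them keeps the sketch safe to execute anywhere and makes the ordering (volume first, filesystem second) easy to check.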
Take a snapshot before two kinds of change: cross-tier moves that change the underlying CPU family (e.g. AMD to Intel), and any storage shrink you perform yourself, because we will not do it for you. Live resize within the same plan family is genuinely safe. We still recommend a snapshot, because the cost is two seconds and the value of an undo button is incalculable.
API reference is on the docs site; dashboard support is under /dashboard/services → manage → resize. If your workload sees a pause of more than one second during a resize, email us with the instance ID. That's a regression and we want to hear about it.
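If you want a number to include in that email, one simple approach is to probe the instance at a fixed interval during the resize and look for gaps in the response timestamps. The sketch below is a minimal, hypothetical helper for the analysis step only. It assumes the timestamps were recorded by an external observer, since a guest's own clock may not reflect time it spent paused, and the interval and threshold are whatever values you choose.

```python
def find_pauses(timestamps, interval, threshold):
    """Return (start, stall) pairs for every gap between consecutive samples
    that exceeds the expected probe interval by more than `threshold`.

    All values are in seconds. `start` is the timestamp of the last sample
    seen before the stall.
    """
    pauses = []
    for previous, current in zip(timestamps, timestamps[1:]):
        stall = (current - previous) - interval
        if stall > threshold:
            pauses.append((previous, stall))
    return pauses


if __name__ == "__main__":
    # Synthetic samples taken every 50 ms, with one long gap after t=0.10.
    samples = [0.0, 0.05, 0.10, 0.75, 0.80]
    for start, stall in find_pauses(samples, interval=0.05, threshold=0.2):
        print(f"stall of about {stall:.2f} s after t={start:.2f}")
```

Probe jitter and network latency both add noise, so a threshold well above the probe interval gives fewer false positives.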