NVMe everywhere — what it means for your p95
We benchmarked every storage tier across our regions. Here's what the data says — and what to do about it.
By Vintony Engineering
When we say 'NVMe everywhere' we mean it literally. There is no SATA SSD tier hiding in any region of the Vintony fleet. The first reason is cost — datacenter-grade NVMe has been cheaper per usable IOPS than SATA for two years now. The second reason is the one that actually matters: p95 latency on a busy database goes from 'sometimes weird' to 'boring'.
We ran the same fio script — 4K random reads at queue depth 32 — across every storage tier in every region, three times a day for a fortnight. The graphs are all roughly the same shape: a median of ~150 microseconds, a p95 that sits comfortably under 500 microseconds, and a p99 that occasionally pokes above a millisecond when a neighbour does something silly. The variance is small enough that you can size storage by capacity, not by performance.
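For readers who want to repeat this kind of measurement, here is a minimal sketch of the benchmark described above. It is not our production script. The function names, the 60-second runtime, and the choice of the libaio engine with direct I/O are assumptions for illustration; only the 4K random-read pattern and queue depth 32 come from the text. It builds an fio command line and pulls the completion-latency percentiles out of fio's JSON report.

```python
import json


def build_fio_command(target, runtime_s=60):
    """Return an fio argv for 4K random reads at queue depth 32.

    `target` is assumed to be an existing test file or block device;
    the runtime, I/O engine, and direct-I/O setting are illustrative choices.
    """
    return [
        "fio",
        "--name=randread-4k-qd32",
        f"--filename={target}",
        "--rw=randread",
        "--bs=4k",
        "--iodepth=32",
        "--ioengine=libaio",
        "--direct=1",
        "--time_based",
        f"--runtime={runtime_s}",
        "--percentile_list=50:95:99",
        "--output-format=json",
    ]


def read_latency_us(fio_json):
    """Extract read completion-latency percentiles, in microseconds, for the first job."""
    report = json.loads(fio_json)
    pct_ns = report["jobs"][0]["read"]["clat_ns"]["percentile"]
    # fio keys percentiles as strings such as "95.000000" and reports nanoseconds.
    return {f"p{float(key):g}": value / 1000.0 for key, value in pct_ns.items()}
```

To use it, pass the command to `subprocess.run(..., capture_output=True, text=True, check=True)` and feed the resulting `stdout` to `read_latency_us`. Point it at a scratch file or a spare device, never at a volume holding data you care about.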
The practical consequence is that Postgres on a Vintony VPS behaves the way you think Postgres should behave. EXPLAIN ANALYZE numbers stay stable across the workday. Snapshot creation and restores run at line rate. Application engineers spend less time arguing about whether 'the database is slow' or 'the disk is slow' because, frankly, neither of them is the slow part.
We run RAID 10 across NVMe modules in our VPS and managed cloud tiers. RAID 10 sounds quaint in 2026 but it gives us the cleanest failure model: a single drive failure is invisible to the workload, and losing both drives of a mirror pair is a known-bad state with a runbook. RAID 5/6 over NVMe is faster on paper for sequential writes, but the rebuild storms add operational complexity that is not worth it for our customers.
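To make the RAID 10 failure model concrete, the sketch below enumerates every possible two-drive failure and counts how many of them take out both halves of one mirror pair. The function name and the pairing layout (drives 0 and 1 form a pair, 2 and 3 the next, and so on) are assumptions for illustration, not a description of our controller configuration.

```python
from itertools import combinations


def raid10_double_failure_outcomes(n_drives):
    """Count two-drive failures that lose data in a RAID 10 of mirrored pairs.

    Assumes drives (0, 1), (2, 3), ... form the mirror pairs.
    Returns (fatal_combinations, total_combinations).
    """
    if n_drives < 4 or n_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    pairs = list(combinations(range(n_drives), 2))
    # A combination is fatal only when both failed drives share a mirror pair.
    fatal = [p for p in pairs if p[0] // 2 == p[1] // 2]
    return len(fatal), len(pairs)


fatal, total = raid10_double_failure_outcomes(4)
print(f"{fatal} of {total} two-drive failures hit one mirror pair")
# prints: 2 of 6 two-drive failures hit one mirror pair
```

In a four-drive array, then, only a third of the possible double failures are the known-bad state. The rest leave every mirror pair with one healthy member.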
Dedicated servers ship with at least two NVMe modules. The Pro and High-Perf tiers use four-drive RAID 10 with hot spares, plus an option for separate write-intent journalling on Optane Persistent Memory if you really, really need it. That last bit is a niche feature; most customers do not need it, and we will gently push back if you ask without a benchmark.
What this does not solve: cold-start latency for genuinely large datasets that exceed RAM, and write amplification on workloads that overwrite the same blocks in a tight loop. We have specific tunings for those — feel free to email engineering@vintonyhost.com with your workload pattern and we will point you at the right knobs.
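If you want to put a number on write amplification before you email us, a rough sketch follows. The write amplification factor is the bytes physically written to flash divided by the bytes the host asked to write. The helper names are hypothetical. The host-side figure can be derived from the "Data Units Written" field of the NVMe SMART log, which counts units of 1000 × 512 bytes. The flash-side counter is vendor-specific, so where you read it from is an assumption you will need to fill in for your drive.

```python
NVME_DATA_UNIT_BYTES = 1000 * 512  # one NVMe SMART "data unit" is 1000 sectors of 512 bytes


def host_bytes_from_data_units(data_units_written):
    """Convert the NVMe SMART 'Data Units Written' counter to bytes."""
    return data_units_written * NVME_DATA_UNIT_BYTES


def write_amplification(nand_bytes_written, host_bytes_written):
    """Return flash bytes written per host byte written over the same interval.

    `nand_bytes_written` is assumed to come from a vendor-specific counter.
    """
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written
```

Take both counters at the start and end of a representative window and pass in the differences. Lifetime totals will hide what the current workload is doing.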
TL;DR: we picked NVMe because it makes the boring tier of your stack stay boring. Boring storage is a feature.