The Intel Optane SSD DC P4800X (375GB) Review: Testing 3D XPoint Performance
by Billy Tallis on April 20, 2017 12:00 PM EST
Mixed Read/Write Performance
Workloads consisting of a mix of reads and writes can be particularly challenging for flash-based SSDs. When a write operation interrupts a string of reads, it will block access to at least one flash chip for a period of time that is substantially longer than a read operation takes. This hurts the latency of any read operations that were waiting on that chip, and with enough write operations, throughput can be severely impacted. If the write command triggers an erase operation on one or more flash chips, the traffic jam is many times worse.
The occasional read interrupting a string of write commands doesn't necessarily cause much of a backlog, because writes are usually buffered by the controller anyway. But depending on how much unwritten data the controller is willing to buffer and for how long, a burst of reads could force the drive to begin flushing outstanding writes before they've all been coalesced into optimally sized writes.
The blocking effect of a write still applies to the Optane SSD's 3D XPoint memory, but with greatly reduced severity. Whether a burst of incoming reads affects write performance depends on how the Optane SSD's controller manages the 3D XPoint memory.
Queue Depth 16
Our first mixed workload test is an extension of what Intel describes in their specifications for throughput of mixed workloads. A total queue depth of 16 is achieved using four worker threads, each performing a mix of random reads and random writes. Instead of just testing a 70% read mixture, the full range from pure reads to pure writes is tested at 10% increments.
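The test matrix described above can be sketched in a few lines of Python. This is only an illustration of the parameter sweep: the use of fio, the 4kB block size, the libaio engine and the job names are all assumptions that the text does not confirm. Only the four threads, the total queue depth of 16 and the 10% steps come from the description.

```python
# Sketch of the mixed-workload sweep. The thread count, total queue depth and
# 10% step come from the text; fio, the block size and job names are assumed.
THREADS = 4
TOTAL_QD = 16
PER_THREAD_QD = TOTAL_QD // THREADS  # 4 outstanding I/Os per worker thread


def mixed_workload_matrix(step=10):
    """Return one parameter dict per mix, from pure reads down to pure writes."""
    return [
        {
            "read_pct": read_pct,
            "write_pct": 100 - read_pct,
            "threads": THREADS,
            "iodepth": PER_THREAD_QD,
        }
        for read_pct in range(100, -1, -step)
    ]


def fio_args(point, block_size="4k"):
    """Render one test point as fio arguments (hypothetical job layout)."""
    return [
        f"--name=mixed_r{point['read_pct']}",
        "--rw=randrw",                        # random reads and writes mixed
        f"--rwmixread={point['read_pct']}",   # percentage of reads in the mix
        f"--bs={block_size}",                 # assumed; not stated in the text
        f"--iodepth={point['iodepth']}",
        f"--numjobs={point['threads']}",
        "--ioengine=libaio",                  # assumed asynchronous engine
        "--direct=1",                         # bypass the page cache
        "--group_reporting",
    ]


matrix = mixed_workload_matrix()
```

Intel's 70% read specification point is just one of the eleven entries in `matrix`; the sweep places it in the context of the rest of the curve.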
[Chart: mixed random read/write throughput, vertical axis in IOPS or MB/s]
The Optane SSD's throughput does indeed show the bathtub curve shape that is common for this sort of mixed workload test, but the sides are quite shallow and the minimum (at 40% reads/60% writes) is still 83% of the peak throughput (which occurs with the all-reads workload). While the Optane SSD is operating near 2GB/s, the flash SSDs spend most of the test only slightly above 500MB/s. When the portion of writes increases to 70%, the two flash SSDs begin to diverge: the Intel P3700 loses almost half its throughput and recovers only a little of it during the remainder of the test, while the Micron 9100 begins to accelerate and comes much closer to the Optane SSD's level of performance.
[Chart: mixed random read/write latency, shown as mean, median, 99th percentile and 99.999th percentile]
The median latency curves for the two flash SSDs show a substantial drop when the median operation switches from a read to a cacheable write. The P3700's median latency even briefly drops below that of the Optane SSD, but then the Optane SSD is handling several times the throughput. The 99th and 99.999th percentile latencies for the Optane SSD are relatively flat after jumping a bit when writes are first introduced to the mix. The flash SSDs have far higher 99th and 99.999th percentile latencies through the middle of the test, but far fewer outliers during the pure read and pure write phases.
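For readers less familiar with the four latency statistics used here, the following Python sketch shows one way they could be computed from a list of per-I/O completion latencies. The nearest-rank percentile method and the function names are assumptions made for illustration; the review does not say how its percentiles were calculated.

```python
import math
import statistics


def nearest_rank(sorted_samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    n = len(sorted_samples)
    rank = max(1, math.ceil(pct * n / 100))  # 1-based rank
    return sorted_samples[rank - 1]


def latency_summary(samples_us):
    """Summarize latencies (in microseconds) with the four statistics charted above."""
    ordered = sorted(samples_us)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p99": nearest_rank(ordered, 99),
        "p99.999": nearest_rank(ordered, 99.999),
    }
```

The 99.999th percentile only becomes distinct from the maximum once a run contains more than 100,000 operations, which is why a few slow operations can dominate that curve while leaving the median untouched.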
Adding Writes to a Drive that is Reading
The next mixed workload test takes a different approach and is loosely based on the Aerospike Certification Tool. The read workload is constant throughout the test: a single thread performing 4kB random reads at QD1. Threads performing 4kB random writes at QD1 and throttled to 100MB/s are added to the mix until the drive's throughput is saturated. As the write workload gets heavier, the random read throughput will drop and the read latency will increase.
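The ramp described above can be sketched as a simple schedule in Python. The 4kB transfers, QD1 and 100MB/s throttle come from the text; the fio-style option names in the job dictionaries and the `saturation_mbps` input are assumptions for illustration, since a real test finds the saturation point empirically rather than taking it as a parameter.

```python
WRITER_RATE_MBPS = 100  # per-thread write throttle stated in the text

# Fixed reader and one throttled writer, using fio-style option names (assumed).
READER_JOB = {"rw": "randread", "bs": "4k", "iodepth": 1}
WRITER_JOB = {"rw": "randwrite", "bs": "4k", "iodepth": 1, "rate": "100m"}


def write_load_steps(saturation_mbps):
    """Return (writer_threads, offered_write_MBps) steps up to saturation.

    The first step is the reader alone; each later step adds one throttled
    writer until the offered write load reaches saturation_mbps.
    """
    steps = [(0, 0)]
    writers = 0
    while writers * WRITER_RATE_MBPS < saturation_mbps:
        writers += 1
        steps.append((writers, writers * WRITER_RATE_MBPS))
    return steps
```

With a hypothetical saturation point of 400MB/s, `write_load_steps(400)` gives the reader-only baseline plus four steps at 100, 200, 300 and 400MB/s of offered write load.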
The three SSDs have very different capacities for random write throughput: the Intel P3700 tops out around 400MB/s, the Micron 9100 can sustain 1GB/s, and the Intel Optane SSD DC P4800X can sustain almost 2GB/s. The Optane SSD's average read latency increases by a factor of 5, but that is still enough to provide about 25k read IOPS. The flash SSDs both experience read latency growing by an order of magnitude as write throughput approaches saturation. Even though the Intel P3700 has a much lower capacity for random writes, it provides slightly lower random read latency at its saturation point than the Micron 9100. When comparing the two flash SSDs with the same write load, the Micron 9100 provides far more random read throughput.
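Because the read thread keeps only one request outstanding, its throughput and mean latency are reciprocals of each other, so the two Optane figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes the reader is never idle between requests; the unloaded latency it derives is implied by the quoted numbers, not a measurement from the review.

```python
US_PER_SECOND = 1_000_000


def qd1_latency_us(iops):
    """Mean latency in microseconds for a single thread at queue depth 1."""
    return US_PER_SECOND / iops


def qd1_iops(latency_us):
    """IOPS a single QD1 thread can reach at the given mean latency."""
    return US_PER_SECOND / latency_us


loaded_latency_us = qd1_latency_us(25_000)       # about 25k read IOPS under write load
implied_idle_latency_us = loaded_latency_us / 5  # latency grew by a factor of 5
implied_idle_iops = qd1_iops(implied_idle_latency_us)
```

Under these assumptions, 25k IOPS corresponds to a mean read latency of 40µs under full write load, which implies roughly 8µs for the read thread running alone.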
117 Comments
Ninhalem - Thursday, April 20, 2017 - link
At last, this is the start of transitioning from hard drive/memory to just memory.
ATC9001 - Thursday, April 20, 2017 - link
This is still significantly slower than RAM....maybe for some typical consumer workloads it can take over as an all in one storage solution, but for servers and power users, we'll still need RAM as we know it today...and the fastest "RAM" if you will is on die L1 cache...which has physical limits to its speed and size based on speed of light!
I can see SSD's going away depending on manufacturing costs but so many computers are shipping with spinning disks still I'd say it's well over a decade before we see SSD's become the replacement for all spinning disk consumer products.
Intel is pricing this right between SSD's and RAM which makes sense, I just hope this will help the industry start to drive down prices of SSD's!
DanNeely - Thursday, April 20, 2017 - link
Estimates from about 2 years back had the cost/GB price of SSDs undercutting that of HDDs in the early 2020's. AFAIK those were business as usual projections, but I wouldn't be surprised to see it happen a bit sooner as HDD makers pull the plug on R&D for the generation that would otherwise be overtaken due to sales projections falling below the minimums needed to justify the cost of bringing it to market with its useful lifespan cut significantly short.
Guspaz - Saturday, April 22, 2017 - link
Hard drive storage cost has not changed significantly in at least half a decade, while ssd prices have continued to fall (albeit at a much slower rate than in the past). This bodes well for the crossover.
Santoval - Tuesday, June 6, 2017 - link
Actually it has, unless you regard HDDs with double density at the same price every 2 - 2.5 years as not an actual falling cost. $ per GB is what matters, and that is falling steadily, for both HDDs and SSDs (although the latter have lately spiked in price due to flash shortage).
bcronce - Thursday, April 20, 2017 - link
The latency specs include PCIe and controller overhead. Get rid of those by dropping this memory in a DIMM slot and it'll be much faster. Still not as fast as current memory, but it's going to be getting close. Normal system memory is in the range of 0.5us. 60us is getting very close.
tuxRoller - Friday, April 21, 2017 - link
They also include context switching, isr (pretty board specific), and block layer abstraction overheads.
ddriver - Friday, April 21, 2017 - link
PCIE latency is below 1 us. I don't see how subtracting less than 1 from 60 gets you anywhere near 0.5.
All in all, if you want the best value for your money and the best performance, that money is better spent on 128 gigs of ecc memory.
Sure, xpoint is non volatile, but so what? It is not like servers run on the grid and reboot every time the power flickers LOL. Servers have at the very least several minutes of backup power before they shut down, which is more than enough to flush memory.
Despite intel's BS PR claims, this thing is tremendously slower than RAM, meaning that if you use it for working memory, it will massacre your performance. Also, working memory is much more write intensive, so you are looking at your money investment crapping out potentially in a matter of months. Whereas RAM will be much, much faster and work for years.
4 fast NVME SSDs will give you like 12 GB/s bandwidth, meaning that in the case of an imminent shutdown, you can flush and restore the entire content of those 128 gigs of ram in like 10 seconds or less. Totally acceptable trade-off for tremendously better performance and endurance.
There is only one single, very narrow niche where this purchase could make sense. Database usage, for databases with frequent low queue access. This is an extremely rare and atypical application scenario, probably less than 1/1000 in server use. Which is why this review doesn't feature any actual real life workloads, because it is impossible to make this product look good in anything other than synthetic benches. Especially if used as working memory rather than storage.
IntelUser2000 - Friday, April 21, 2017 - link
ddriver: Do you work for the memory industry? Or hold a stock in them? You have a personal gripe about the company that goes beyond logic.
PCI Express latency is far higher than 1us. There are unavoidable costs of implementing a controller on the interface and there's also software related latency.
ddriver - Friday, April 21, 2017 - link
I have a personal gripe with lying. Which is what intel has been doing ever since it announced hypetane. If you find having a problem with lying a problem with logic, I'd say logic ain't your strong point.
Lying is also what you do. PCIE latency is around 0.5 us. We are talking PHY here. Controller and software overhead affect equally every communication protocol.
Xpoint will see only minuscule latency improvements from moving to dram slots. Even if PCIE has about 10 times the latency of dram, we are still talking ns, while xpoint is far slower in the realm of us. And it ain't no dram either, so the actual latency improvement will be nowhere near the approx 450 ns.
It *could* however see significant bandwidth improvements, as the dram interface is much wider, however that will require significantly increased level of parallelism and a controller that can handle it, and clearly, the current one cannot even saturate a pcie x4 link. More bandwidth could help mitigate the high latency by masking it through buffering, but it will still come nowhere near to replacing dram without a tremendous performance hit.