Homemade Hybrid Disk Drives Part-3
Something old, something new, something fast, nothing blew. Whew!
This article starts out with a rant.
The Real Speed Limits
Perhaps you’re wondering why I’ve been spelling out GBytes per second and Gbits per second, and so on. It’s because the world is a very sloppy place when it comes to capitalization of data-rate units and calculated throughput maxima. People barely know the difference between bits and bytes as it is, and the capitalization problems make it much worse than it needs to be.
Serial communications are generally denoted in bits per second with a lowercase ‘b’, while parallel and disk drive throughput is usually specified in bytes per second using an uppercase ‘B’. You’d never know that while shopping for hard drives. You’ll very frequently be battered by listings describing the 6 Gbits/sec SATA-III interface of SSD’s and HDD’s as delivering 6GB (as in six gigabytes) per second of throughput.
That’s wrong in two ways. First, it’s six gigabits per second, not gigabytes. Second, you’ll never get the value of all six gigabits, because the SATA interface and the USB 3.0 interface both use 8b/10b encoding. Each 8-bit byte is sent as a unique 10-bit symbol, so there’s an inherent 20-percent ‘tax’ on the interface. The 6 Gbits/sec SATA interface can transport 4.8 Gbits/sec of data, for a theoretical maximum of 600 MBytes/sec and a typical practical maximum of ~550 MBytes/sec. The now-common 5 Gbits/sec USB 3.0 link can carry about 4 Gbits/sec, giving at most 500 MBytes/sec; with protocol overhead and a few other hassles, the practical maximum is typically in the ~430 MBytes/sec range. Things did change somewhat with USB 3.1-gen2, which uses 128b/132b encoding, cutting the overhead to about 3-percent.
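The line-rate arithmetic above is easy to sanity-check yourself. Here’s a minimal Python sketch of it (the function name and the round-number line rates are mine, not from any vendor spec sheet):

```python
# Theoretical payload throughput for an encoded serial link.
# 8b/10b carries 8 data bits per 10 line bits; 128b/132b carries 128 per 132.

def payload_mbytes_per_sec(line_rate_gbits, bits_in, bits_out):
    """Theoretical payload throughput in MBytes/sec for a given line encoding."""
    # Multiply before dividing to keep the arithmetic exact:
    # Gbits -> Mbits is x1000, bits -> bytes is /8, so the net factor is x125.
    return line_rate_gbits * bits_in * 125 / bits_out

# SATA-III: 6 Gbits/sec with 8b/10b -> 600 MBytes/sec theoretical ceiling
print(payload_mbytes_per_sec(6.0, 8, 10))    # 600.0
# USB 3.0: 5 Gbits/sec with 8b/10b -> 500 MBytes/sec
print(payload_mbytes_per_sec(5.0, 8, 10))    # 500.0
# USB 3.1-gen2: 10 Gbits/sec with 128b/132b -> ~1212 MBytes/sec
print(round(payload_mbytes_per_sec(10.0, 128, 132)))  # 1212
```

Note that these are ceilings before protocol overhead; the practical numbers quoted above land below them.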
Media Speed and Endurance
Every HDD manufacturer will list the SATA interface speed, but very few will tell you up front what the platter or media speed is. Media speed is the actual sustained data rate that the drive can deliver after its buffer is depleted. The sustained data rate is highest near the outer cylinders, where the larger radius puts more bits per second under the read/write head. As the drive fills and moves the head assembly nearer the center, the linear bits-per-second rate diminishes to 60-percent of the outer-edge value. To say it in math terms: the angular velocity remains the same, but the linear velocity is proportional to the distance from the center of the platter.
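That proportionality is simple enough to model. A quick sketch, with hypothetical round-number radii rather than measurements from any particular drive:

```python
# Constant-angular-velocity model: at fixed RPM and bit density, the
# sustained rate scales linearly with the radius of the track being read.

def media_speed(outer_speed_mb_s, radius_mm, outer_radius_mm):
    """Sustained media rate at a given radius, scaled from the outer-edge rate."""
    return outer_speed_mb_s * radius_mm / outer_radius_mm

OUTER_R = 46.0   # hypothetical outermost data radius, mm
outer = media_speed(430.0, OUTER_R, OUTER_R)        # outer-edge rate
inner = media_speed(430.0, OUTER_R * 0.6, OUTER_R)  # innermost cylinder at ~60% radius
print(round(outer, 1))  # 430.0
print(round(inner, 1))  # 258.0
```

That 60-percent falloff is why partitioning the fast outer cylinders separately from the slow inner ones, as described below, pays off.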
Even the enterprise-grade drives don’t always document their media speed, and there have been some egregious errors (or convenient prevarications) posted by various HDD makers. For my purposes, media speed is everything. I specifically went through the HGST/Western Digital and Seagate/Samsung product lines looking for the highest media speed and capacity at the lowest cost that I could find.
By combining two Western Digital Gold 6TB Enterprise drives, I was able to buy and build a 12TB RAID-0 pair with a 430 MBytes/sec throughput at $0.021/GB (yes, two-point-one cents) with an MTBF reliability figure of 2.5 million hours each. These drives were manufactured in July 2017, and came directly from a Western Digital refurbishment center (via eBay). They have a 550 TBW per year endurance figure, yielding a 2750 TBW spec over the 5-year warranty period. That number makes consumer-grade SSD’s seem as fragile as a Number 9B graphite pencil, while simultaneously highlighting how awesome the Hitachi SS200 SSD is at a 13600 TBW specification.
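For the skeptical, here’s the back-of-envelope arithmetic behind those numbers; the prices and specs are the ones quoted above, and this sketch just does the multiplication:

```python
# Cost and endurance math for the WD Gold RAID-0 pair described above.

pair_capacity_gb = 12_000   # two 6TB drives striped as RAID-0
cost_per_gb = 0.021         # dollars per gigabyte, as quoted
pair_cost = pair_capacity_gb * cost_per_gb
print(f"${pair_cost:.0f} for the pair")   # $252 for the pair

tbw_per_year = 550          # per-drive endurance rating
warranty_years = 5
total_tbw = tbw_per_year * warranty_years
print(f"{total_tbw} TBW over the warranty")   # 2750 TBW over the warranty
```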
Let’s give this WD Gold dancing pair scores of {7.5 ~ 7.5 ~ 7.2}. The reliability figures are as good as or better than consumer-grade SSD’s, the throughput is more than 80-percent of a good-running SATA-III SSD, the cost per gigabyte is 10-times better than currently available SSD’s, and we’re solidly past 10TB capacity.
Caching Strategy
While the outer 7TB of cylinders will serve macOS as a 430 MBytes/sec media pool, the partitioned inner 5TB of cylinders, which run at a mere 320 MBytes/sec, will be assisted by approximately 8GB of read/write cache somewhere in the range of 11,000 MBytes/sec, and further supported by a substantial chunk of the 167GB RAID-0 SSD pool allocated for read cache. That moves the solution scores to {8.7 ~ 7.5 ~ 7.2}, a very satisfying result.
Packaging and Drive Specifications
I set out to use the beautiful G-Tech enclosures (vintage 2004) that I acquired as part of a larger equipment transaction last year. They’re all aluminum, with a style that matches the classic Mac Pro ‘cheese grater’ tower. That same basic box is available newly manufactured today with a USB 3.0 connection (USB 3.1-gen1) and/or a Thunderbolt connection, in a choice of 4TB, 6TB and 8TB capacities.
As you can see in the pictures, I also abandoned a plan to use a couple of 2x 2.5-inch HDD brackets to enhance the twin 3.5-inch mounting scheme with four little HDD’s. Two were Western Digital Black 1.0TB 7200 RPM drives, and the other two were Western Digital Red NAS 1.0TB 5400 RPM drives. Unfortunately, one of the refurbished Red drives was defective, so I returned both of them. Each Red drive was good for about 105 MBytes/sec at its outer media cylinders; the Black drives each achieve about 135 MBytes/sec.
The Mac Pro tower has 7 SSD’s. There are two 250GB Crucial BX100’s as a RAID-0 pair, and two 525GB Crucial MX300’s as a RAID-0 pair. Win-7 boots from an Intel S3500 Enterprise 480GB, Win-10 boots from a 500GB Crucial MX500, and macOS Sierra boots from a 500GB Crucial MX200.
While most TBW scores move upward with increasing drive size, Crucial advertises the same 72 TBW for all sizes of the BX100’s and that same figure for all but the 64GB size of the older M4 series. The 525GB MX300’s and the 500GB MX200 each are listed at 160 TBW, while the enterprise-grade Intel SSD offers 275 TBW in its 480GB capacity.
Trade-offs
My Crucial MX300’s don’t read as well as they write, a mystery that Crucial has apparently stonewalled. So why do I keep these under-performing MX300’s in service when I should protest, boycott, and get rid of them? The answer lies in a quirk related to the layout of 384 Gbit 3D NAND. While the 1050GB (1TB) model of the MX300 has only 6GB of over-provisioning, the 525GB model has 26GB of over-provisioning. Yes, that’s quite a lot, and it helps my implementation in quite a powerful way.
I have the two 525GB MX300’s configured as a hardware RAID-0 pair under my NewerTech MaxPower RAID controller. They’re in the ATTO Benchmark picture as drive ‘Qmedia-Q’, formatted HFS+. Being under the RAID controller, they’re effectively insulated from the operating system and invisible to the Crucial Storage Executive software in Windows. Therefore the drives are on their own with regard to TRIM from the operating systems; they depend entirely on the efficiency of their internal controller’s ‘garbage collection’ algorithm to keep them healthy for writing. Having 52GB of extra NAND around for the 1050GB pair is very handy for making sure all is well without overt manual maintenance.
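To see why the 525GB model is the sweet spot, it helps to look at the over-provisioning as a percentage of user-visible capacity. This is just arithmetic on the figures quoted above, not data from Crucial:

```python
# Over-provisioning as a percentage of usable capacity, for the
# MX300 figures discussed above.

def op_percent(spare_gb, usable_gb):
    """Spare NAND as a percentage of the user-visible capacity."""
    return spare_gb / usable_gb * 100

print(round(op_percent(26, 525), 1))   # 5.0 -- the roomy 525GB model
print(round(op_percent(6, 1050), 1))   # 0.6 -- the tight 1050GB model
print(round(op_percent(52, 1050), 1))  # 5.0 -- the 2x 525GB RAID-0 pair keeps the ratio
```

Roughly 5-percent spare area for garbage collection, versus barely half a percent on the 1TB model, is exactly the cushion a TRIM-less drive behind a RAID controller wants.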
Farewell and a Warning
So, all of this is set up and working now. The remaining big questions are about long-term stability, and any surprises that Win-10 may have in store for the PrimoCache product. I’ll update the article if anything weird happens.
For now, it’s just amazing to see how normal a fast computer will seem to be after just a few hours of usage. Beware, this can happen to you too! All that sweat, planning, and configuring comes down to ‘just another day at the office’.
- Ted Gary of TedLand
May 17, 2018