Solid State Drive lifetime

Discussion in 'Tech Talk' started by EniGmA1987, Jul 27, 2011.

  1. EniGmA1987
    Veteran Staff Member Xenforcer

    Joined:
    Aug 25, 2010
    Messages:
    4,778
    Likes Received:
    34
    As many of you know, an SSD is not rated by the same standards as old mechanical drives. Whereas a mechanical drive's lifetime was rated in mean time between failures, solid state drives are rated with a specific write endurance. Each cell in the NAND chip has a certain endurance, and once that endurance is used up, you can no longer write to that cell.

    The first drives used SLC (single-level cell) NAND chips. This meant that each cell stored a single bit, on or off (1 or 0). These chips were rated for between 100,000 and 200,000 program/erase cycles.

    Modern drives in the mainstream market are made with what is known as MLC, or multi-level cell. This stores two bits of data per cell, so the cell can read 00, 01, 10, or 11. This allows drive sizes to be larger, as you can store twice as much data in the same size of chip.
    This causes two problems. The first is that the endurance is MUCH lower. Instead of 100,000 we are anywhere between 3,000 and 20,000 cycles per cell. As NAND chips are made on smaller process nodes, the endurance drops further. Current drives are normally rated between 3,000 and 5,000 cycles. This issue will compound itself as we go to smaller and smaller manufacturing nodes. It can be somewhat counteracted by extremely advanced error correction and wear leveling algorithms coupled with a very low write amplification factor.
    The second problem with MLC compared to SLC is that when you write to a cell, the drive must first check what the cell currently holds, erase that data, and then write the new value containing both the first and second bit. This means there is extra work compared to SLC, which can simply erase and write the new bit. It also means that since a cell holds two bits of information, it is used more often than an SLC cell would be, compounding the problem of its already lower endurance.
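
    To make the effect of the cycle rating and write amplification concrete, here is a rough Python sketch. It assumes perfect wear leveling, and the function name and the 1.5 write amplification figure are made up purely for illustration; real drives will differ.

```python
def host_write_budget_tb(capacity_gb, pe_cycles, write_amplification=1.0):
    """Rough total host writes in TB (1 TB = 1024 GB) before the NAND wears out.

    Assumes every cell is worn evenly (perfect wear leveling).
    """
    nand_writes_gb = capacity_gb * pe_cycles  # total data the NAND can absorb
    return nand_writes_gb / write_amplification / 1024


slc = host_write_budget_tb(64, 100_000)  # SLC-class rating from the post
mlc = host_write_budget_tb(64, 3_000)    # MLC-class rating from the post
mlc_waf = host_write_budget_tb(64, 3_000, write_amplification=1.5)  # hypothetical WAF

print(f"64GB SLC at 100,000 cycles: {slc:.1f} TB")
print(f"64GB MLC at 3,000 cycles:   {mlc:.1f} TB")
print(f"64GB MLC with WAF 1.5:      {mlc_waf:.1f} TB")
```

    The point is only that the write budget scales directly with the cycle rating and shrinks as write amplification grows.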



    Solid state drives are expensive pieces of technology. When you buy one, you want to make sure it will last you for at least a few years. Testing has been under way to ascertain the total write lifetime of modern SSDs, so that we may find the true lifetime potential of the drives and know which are the best.
    This doesn't include the chance of controller failure, just the total endurance of the NAND chips' ability to write/store data. Testing is still underway, but I thought I would share the current results. I will update periodically as testing progresses.
    The drives tested are the last-gen Crucial drive, the C300; the second-gen Crucial drive, the M4; a second-gen Intel drive, the X25-V; a third-gen Intel drive, the 320 series; and a second-gen Samsung drive, the 470 series. Additionally, the Sandforce-based Vertex 2 was also tested; however, when a problem built into the controller chip was discovered, testing stopped on that drive because it could not be properly tested. I will get into that more later on. Now, on to the current results:


    TOTAL DATA WRITTEN TO DRIVE:
    [​IMG]


    TOTAL NUMBER OF P/E CYCLES TO DRIVE:
    [​IMG]




    As you can see in this picture, the total data written to the drives is quite large, in the terabyte range. You may also notice something called the MWI. This stands for Media Wear Index, and is the SMART attribute for a solid state drive associated with the NAND chips' lifetime. When the MWI reaches 0, the NAND chips are supposed to have no endurance left. As we can see, this is not always the case. It seems that the MWI is either read differently for each drive, or is programmed differently in each drive's firmware. Either way, MWI is not a reliable indicator of the drive's true lifetime potential.

    As I said earlier, testing is still going on and I will update this chart as it progresses. However, as we can already see, a significant amount of data can be written to each drive, far surpassing the rated limit given by the companies that manufacture these SSDs.

    Currently the Samsung drive is winning, with a massive amount of data successfully written and the drive still working fine, but the Intel 320 series is a very close second. What is more impressive, though, is that relative to its size the Intel is actually doing slightly better. Although less data has been written in total, the Intel drive is about 1/3 smaller than the Samsung, meaning about 1/3 fewer cells available to write data to.
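
    One way to compare drives of different sizes is to count how many times each drive's full capacity has been rewritten. The sketch below is only illustrative: it uses the totals quoted in the math post further down this thread (247.4TB for the 64GB Samsung, 241TB for the 40GB Intel), which may not match the latest charts, and the function name is my own.

```python
def full_drive_writes(total_written_tb, capacity_gb):
    """How many times the drive's whole capacity has been rewritten (1 TB = 1024 GB)."""
    return total_written_tb * 1024 / capacity_gb


samsung = full_drive_writes(247.4, 64)  # Samsung 470, 64GB
intel = full_drive_writes(241, 40)      # Intel 320, 40GB

print(f"Samsung 470 64GB: {samsung:.1f} full-drive writes")
print(f"Intel 320 40GB:   {intel:.1f} full-drive writes")
```

    Counted this way, the smaller drive comes out ahead even though its raw total is lower.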

    If anyone is in the market for a solid state drive, I highly recommend either the Samsung 470 series or the Intel 320 series. However, I must urge caution with the 320 series. A bug was recently discovered where, if the drive is put through too many power cycles (turning the computer on and off), there is a chance the drive can fail and read as only 8MB in size. Intel is working on a firmware fix, but that fix is not out yet. I own a 320 series and have had no issues, but the small chance is still there until the firmware is fixed.







    Now on to the whole Sandforce thing. Sandforce has a feature built into the chip that allows a manufacturer to set the warranty period. When this is set, the drive forces the end user (you) to adhere to that warranty so that the company does not have to replace drives that wear out too soon. If you are writing too much data to your solid state drive, and the drive decides that at your usage level it will wear out before the warranty period is up, then it will limit its write speed to make it through the warranty period. So although the drive is rated for something like 285MB/s, if you actually write that fast too often then the drive will slow itself down as much as is needed to preserve its life. During testing for NAND chip endurance, the Sandforce-based drives slowed themselves down to 0.5-1MB/s in write speed to maintain a 3 year lifetime. This made testing these drives impossible, and the drives are no longer usable, as 1MB/s is incredibly slow. You can't do anything at that speed.
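
    To give a feel for how this kind of throttling could work, here is a purely hypothetical Python sketch. It is NOT Sandforce's actual firmware logic, which is not public as far as I know; the function name and the 50TB / 1000 day inputs are assumptions picked for illustration. The idea is simply to cap the write speed so that the remaining write budget lasts until the warranty ends.

```python
def allowed_write_rate_mb_s(remaining_writes_tb, remaining_warranty_days, rated_mb_s=285.0):
    """Hypothetical life-protection throttle (not the real controller algorithm)."""
    if remaining_warranty_days <= 0:
        return rated_mb_s  # warranty is over, nothing left to protect
    seconds_left = remaining_warranty_days * 24 * 60 * 60
    # Spread the remaining budget (TB -> MB) evenly over the time left.
    sustainable_mb_s = remaining_writes_tb * 1024 * 1024 / seconds_left
    return min(rated_mb_s, sustainable_mb_s)


# Made-up state of a heavily written drive: 50TB of budget left, 1000 days of warranty left.
rate = allowed_write_rate_mb_s(50, 1000)
print(f"Throttled write speed: {rate:.2f} MB/s")
```

    With those made-up inputs the cap lands in the same 0.5-1MB/s range seen in testing, which shows how a drive hammered by an endurance test could end up that slow.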



    Additionally, I would like to leave one small final note about controller chips. The main controllers are:
    Sandforce
    Intel
    Marvell (used in Crucial drives)
    Indilinx (used in older drives from all main manufacturers, now everyone uses Sandforce instead)
    Samsung

    Of these controllers, here they are sorted by reported failure rate, highest to lowest:
    Sandforce, 2.5-3.5%
    Marvell, 2%
    Indilinx, 1.5-2.5% (Barefoot controller only)
    Intel, 0.5-1.5%
    Samsung, 0.5-1% (1st gen controller only)

    It is still too early to tell what the failure rates of the current 470 series controller are, but I would guess it will be pretty good once again. Samsung has a habit of making very reliable computer parts, as they mainly sell to enterprise and high end OEM clients, not the mass market.

    Indilinx is now owned by OCZ, and is releasing its new controller chip very soon. It looks to be pretty good from the marketing slides I saw on it, but we will have to wait and see once it comes out.
     
    Last edited: Aug 16, 2011
  2. EniGmA1987
    Now for the math examples of how long these drives will last:



    Let's take a 64GB drive as an example. With 64GB we have only half that in physical cells, since each cell holds two bits. So to start, we divide by two to get 32GB. Now we multiply by 1024 to get the MB total: 32,768 MB. Multiply again by the same amount to go from MB to KB: 33,554,432 KB. And again to go from KB to B: 34,359,738,368 bytes. Now multiply by 8 to get the total bits: 274,877,906,944. Now that we have the total number of cells in the drive, we multiply by the cell's endurance rating; we will use a common one, which is 3,000:

    824,633,720,832,000. <--- That is the theoretical total number of cell writes (each storing two bits) available on a modern 64GB solid state drive with 25 nanometer NAND chips.
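
    For anyone who wants to check the multiplication, it can be reproduced in a few lines of Python. This is just a sketch of the arithmetic above: the function name is my own, and the last step (converting cell writes back into a data volume, ignoring write amplification) is an extra added for perspective.

```python
GIB = 1024 ** 3  # bytes per GB, using the 1024-based units from the post


def total_cell_writes(capacity_gb, bits_per_cell=2, pe_cycles=3_000):
    """Return (number of cells, total program/erase operations across all cells)."""
    total_bits = capacity_gb * GIB * 8
    cells = total_bits // bits_per_cell
    return cells, cells * pe_cycles


cells, cell_writes = total_cell_writes(64)
# Each cell write stores bits_per_cell (2) bits, so convert back to TB of data.
data_tb = cell_writes * 2 / 8 / 1024 ** 4

print(f"Cells:       {cells:,}")
print(f"Cell writes: {cell_writes:,}")
print(f"Data volume: {data_tb} TB")
```

    In other words, 3,000 cycles on a 64GB drive works out to 64GB x 3,000, or 187.5TB of raw NAND writes.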




    Now that we have drooled over an impossibly large number that doesn't mean anything to our brains, we can move on to relevant things. Let's take the Samsung 470 64GB drive as our example. So far we can see it has had a total of 247.4TB of data written to it. Let's put this into practical terms. Say, for example, that I only care about my drive lasting 5 years. There are 365 days in a year (366 in a leap year, which likely occurs once during this timeframe). So let's start by calculating (365 x 5) + 1: 1826 days. Now we take the total data written and multiply by 1024 to get the gigabyte total: 247.4 x 1024: 253,337.6 GB. Now we divide that gigabyte total by the number of days we want the drive to last: 253,337.6 / 1826: 138.7 GB. This means that on the Samsung 470 series 64GB SSD we are capable of writing at least 138GB of data EVERY DAY for five years straight before the drive wears out.





    Now let's do the math for that 40GB Intel 320 SSD:

    241 x 1024: 246,784 GB
    (365 x 5) + 1: 1826
    GB / day: 135.15 GB per day for 5 years straight
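
    The same per-day math for both drives, as a small Python sketch (the function name is my own, and the single leap day follows the assumption used above):

```python
def gb_per_day(total_written_tb, years=5, leap_days=1):
    """Daily write allowance in GB that would use up total_written_tb over the period."""
    days = 365 * years + leap_days      # (365 x 5) + 1 = 1826
    total_gb = total_written_tb * 1024  # TB -> GB
    return total_gb / days


samsung = gb_per_day(247.4)  # Samsung 470, 64GB
intel = gb_per_day(241)      # Intel 320, 40GB

print(f"Samsung 470 64GB: {samsung:.2f} GB per day")
print(f"Intel 320 40GB:   {intel:.2f} GB per day")
```

    Change the years argument to see how the daily allowance moves if you want the drive to last longer or shorter than 5 years.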
     
    Last edited: Jul 27, 2011
  3. Roxanne
    Veteran

    Joined:
    Jul 17, 2011
    Messages:
    30
    Likes Received:
    0
    Gender:
    Female
    Location:
    California
    Thanks for the informative post :D

    <- Considering I have the crucial m4
     
  4. Sogetsu
    Veteran

    Joined:
    Jul 27, 2009
    Messages:
    7,511
    Likes Received:
    3
    Occupation:
    Logistics
    Location:
    Atlanta, GA
    Excellent avatar, btw.
     
  5. Roxanne
    Thank you ^_^

    Nyan nyan cat ftw XD
     
    Last edited: Jun 1, 2012
  6. Raelinoith
    Veteran

    Joined:
    Jun 17, 2010
    Messages:
    812
    Likes Received:
    5
    Occupation:
    Living
    Location:
    BFNW, New Jersey
    That's pretty freaking awesome, considering some of my more moved around drives go in like 3 years. But what about data recovery? It's already a bitch now on regular drives, would the method for recovery be the same? And then of course drive formatting counts towards the data written to it, correct?
     
  7. EniGmA1987
    You never do anything besides a quick format with these drives. A standard format will use up quite a lot of program/erase cycles on the drive and slow the entire drive down because it will write data to every cell on the drive. So never EVER do anything besides a quick format.

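    TRIM is how the operating system tells the drive which blocks no longer hold valid data. The drive's garbage collection can then move the remaining data together and do a proper erase of the freed cells to leave them empty.
    The whole drive can also be reset to this empty state by a "secure erase" program designed for SSDs.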





    One good thing about an SSD is that even after its cycles are used up and it can't be written to anymore, you are still able to read from those cells. So if all goes well you should be able to easily pull your data off of it at the end of the drive's lifetime.
    However, that is a best case scenario. If the drive has been written to so much that the cycles are all used up, there is a chance it was writing corrupted data for a little while, so some of the data you try to copy off may be corrupted.
    I do not know if standard data recovery methods can be used with an SSD, but I would guess the answer is probably no. I do not know a lot about the data recovery procedure, but I was under the impression it was done by using a machine to read the magnetic disk and collect data off of it, and SSDs don't have a magnetic disk. However, I do not know if that method is still used today, so maybe it is different.


    The most likely ways for an SSD to fail and lose your data are actually a firmware bug making the drive unreadable, the controller failing from heavy use, a power surge killing the drive, and probably a couple of other random scenarios. With this generation of drives, the likelihood of an SSD failing because its total writes are used up is incredibly small, and I don't think anyone here could do it.
     
  8. Raelinoith
    Obviously someone like myself couldn't, but I was thinking big business use where large amounts of data are kept/stored and accessed/changed.

    I actually wouldn't mind investing in an SSD, but I'm waiting for the storage capacities to rise and the prices to lower a little bit ;) So, I'm gonna be waiting a bit, but they seem worth it.
     
  9. EniGmA1987
    Unfortunately, prices fall because the process node is being shrunk, so endurance will be lower than it currently is. The shrink will be one of the factors behind rising capacities as well.

    Another thing that will lead to cheap high capacity SSDs, and that you all need to be aware of and stay away from, is TLC, which stands for triple-level cell. It is kinda like MLC but it stores three bits per cell. The endurance on these is incredibly low and speed is also lower. Drives with TLC NAND are going to start coming out very soon; stay away from them unless you only want to use the SSD as a storage drive for data that is not accessed very often.
     
  10. Lime
    Veteran

    Joined:
    May 4, 2009
    Messages:
    1,777
    Likes Received:
    37
    Gender:
    Male
    Occupation:
    Network Engineering
    Location:
    No Where Important
    Goddammit Soggy...
     
  11. Deadend
    Veteran Crowfall Member

    Joined:
    Jun 22, 2008
    Messages:
    1,449
    Likes Received:
    14
    Occupation:
    Monkey.
    Reliability review over at tom's
    http://www.tomshardware.com/reviews/ssd-reliability-failure-rate,2923.html

    What I take from what Enigma has said and what this article says is that the average consumer should be looking for the longest warranty they can find and then look around for reported firmware bugs, since as far as we are concerned SSD failure is going to be pretty random anyway. And back up anything you don't want to lose on a separate drive/disk, whatever (like you should be doing anyway).
     
  12. EniGmA1987
    Interesting article, I hadn't read that. Published two days after I typed all this out too. Interesting that so many are suddenly interested in SSD lifetime and failure rates. Hmm. I guess it is mostly due to all the failures and bugs of OCZ/Corsair/GSkill/etc drives with the Sandforce chips.
     
  13. ss_hype
    Veteran

    Joined:
    Jun 23, 2008
    Messages:
    856
    Likes Received:
    1
    Occupation:
    Systems Administrator
    Location:
    Mass.
    I have a Sandforce Corsair P128 120GB that I use as the drive for my O/S and 2-3 games at a time. The performance seems to be degrading over time; it's over a year old now. But I recently did some tests and the r/w speeds are still at 220ms, so maybe I'm just imagining it.
     
  14. EniGmA1987
    Charts updated with newest info. Also added another chart showing total number of writes to the drive as that is very good info as well. Many of those drives are only supposed to last between 3000-5000 writes.