Security flaw found in solid-state drive design

Cal Jeffrey

Hard disk drives (HDDs) have not yet gone the way of the horse and buggy, but with the rise in popularity of the solid-state drive (SSD), it may be only a matter of time before SSDs become the standard and HDDs are phased out. This shift is entirely understandable considering the speed and reliability advantages that solid-state drives offer, not to mention the recent reductions in price.

However, researchers at Carnegie Mellon University have discovered a flaw in SSD design that makes them vulnerable to a particular type of attack that can cause premature failure and data corruption. The details of the flaw are highly technical, but I will attempt to make some sense of it here without getting too convoluted.

Apparently, the problem applies only to multi-level cell (MLC) drives. Single-level cell (SLC) drives are not vulnerable, but since MLC SSDs have become more popular because of their lower cost per gigabyte, the risk applies to many more devices. While the study did not address triple-level cell (TLC) SSDs, ExtremeTech points out that TLC drives are likely vulnerable since they use the same type of multi-stage programming cycle as MLC drives.

The vulnerability comes from how MLCs are programmed. Unlike single-level cell SSDs, MLC drives program each cell in more than one step. During a later step, the flash chip reads the partially programmed data back from the flash cell into a buffer rather than receiving it again from the SSD’s flash controller. By interfering with the cell between those steps, an attacker can corrupt the data being written. The obvious result is data corruption in stored memory, but it can also cause damage to the SSD, reducing its lifespan.

This explanation is highly simplified, but if you are fluent in technical jargon, you can read the researchers’ full paper titled “Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques” at Semantic Scholar.

Resolving the problem is a more straightforward affair. Manufacturers would simply have to run data through the flash controller instead, just like with SLC. However, this increases latency by about 5 percent, a performance cost that manufacturers will be reluctant to accept.
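
To make that concrete, here is a minimal Python sketch of a multi-step MLC write, under heavy simplifying assumptions: a single cell, a 2-bit value, and a hypothetical `disturb` flag standing in for whatever interference flips the partially programmed bit between the two steps. The function and parameter names are purely illustrative and come from neither the paper nor any real firmware.

```python
def program_mlc_cell(lsb, msb, disturb=False, lsb_from_controller=False):
    """Toy model of two-step MLC programming for a single cell.

    Returns the (lsb, msb) pair that ends up stored in the cell.
    """
    # Step 1: the controller sends the first bit and the cell is
    # partially programmed.
    cell_lsb = lsb

    # Between the steps, interference (hypothetical here) may flip the
    # bit that is already sitting in the cell.
    if disturb:
        cell_lsb ^= 1

    # Step 2: the second bit arrives. The final state needs both bits.
    if lsb_from_controller:
        # Proposed fix: the controller supplies its own buffered copy.
        final_lsb = lsb
    else:
        # Vulnerable path: the chip reads the first bit back from the cell.
        final_lsb = cell_lsb
    return final_lsb, msb


print(program_mlc_cell(1, 0))                                          # (1, 0)
print(program_mlc_cell(1, 0, disturb=True))                            # (0, 0)
print(program_mlc_cell(1, 0, disturb=True, lsb_from_controller=True))  # (1, 0)
```

In this toy model the corrupted bit survives only when the second step trusts the cell's own contents; routing the data through the controller again avoids that, at the cost of the extra transfer that shows up as added latency.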

If Carnegie Mellon has figured this out, hackers assuredly have as well. If they had not already known about this flaw, they do now. We have yet to see a reported attack that has exploited this vulnerability, and certainly, SSD manufacturers are already at work trying to find a way to plug the hole without compromising speed.

Even if they do figure out how to fix the flaw in new drives, what of the drives already in consumer devices? Is there a software fix for this issue? Are there virus definitions that can be written to detect if a program or app is written to exploit this vulnerability? If you are a security expert, please share your thoughts in the comments.


 
Carnegie Mellon was once a great repository of research, knowledge and ingenuity.

This tripe is pretty much Capt. Obvious and dumbed down in its own 'write'.
 
Also constructive feedback:
The details of the flaw are highly technical, but I will attempt to make some sense of it here without getting too convoluted.
This is redundant; we are just reading the news, there is no need to go full technical on it, and we all know it :)

Jeez, this sounds bad, as in massively bad. The companies should probably release new firmware for the affected drives, but that sounds like one too many bricked devices to me. I look at this as an extremely complicated scenario. Maybe it's not thaaat bad, but a firmware update always has its possible complications.
 
Also constructive feedback: This is redundant, we are just reading the news, there is no need to go full technical on it and we all know it :)
Again, thank you so much for the feedback. I do read comments as much as I can and am always open to constructive criticism. Of course, if I had not included the disclaimer that you find redundant (which I agree with, mind you), I suspect there would have been plenty of flames about how it sounded like I didn't know what the heck I was talking about. lol There is plenty of that on simple topics. Didn't feel it was necessary to call out the trolls. :D I love you too, trolls! :p
 
".....discovered a flaw in SSD design that makes them vulnerable to a particular type of attack that can cause premature failure and data corruption."
yes, hit it with a hammer and you get premature failure and data corruption, done
 
The abstract at Semantic Scholar is missing the letters "fl" wherever the word "flash" appears, making it into "ash", which makes it difficult to understand. Other letters are missing from a few other words, and then there's the mystery word "eeects", whose actual spelling is "effects". Fortunately the original PDF that's linked to above the abstract is accurate.
 
Which is why every time you buy a new solid-state drive, you must update the firmware on it first.

I suspect this will be fixed with a firmware update too. Shrugs!

Although it says "premature failure." Does this mean data corruption or hardware failure? If it's data corruption, whatever! Probably was time to clean out that porn anyway. Hehe
 
Sorry, this is almost funny to read. Let me try to translate this into English. SLC cells are programmed to a single bit per cell, so each physical page corresponds to one virtual page. MLC is programmed to 2 bits per cell, so each page of, e.g., 4096 cells (plus parity information) corresponds to two virtual pages called the upper and the lower page. Both pages need to be programmed at the same time because it is a single charge level (for simplicity let's use 0-3, or in binary 00, 01, 10 or 11) that gives you the upper and the lower bit. In reality, the 0 or 2 level is programmed first, and then there is the "fine tuning" that either leaves the charge level of the floating gate as is or brings it up to 1 or 3. Sounds a bit complicated, but in reality it is really simple. Now, what the authors claim is that while the cells are waiting for the second programming pulse, they are vulnerable to any near-field effect... well, yea... that is how they are programmed .... DUH *scratching my head*
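
For anyone who prefers code to prose, the two-pass scheme described in this post can be sketched in a few lines of Python. This is only an illustration of the simplified 0-3 picture used above (real NAND typically uses Gray-coded levels and far more involved voltage stepping), and the names `program_page` and `read_page` are made up for the example.

```python
def program_page(first_bits, second_bits):
    """Two-pass programming of one physical page of MLC cells.

    first_bits and second_bits are the two virtual pages, one bit per cell.
    Returns the resulting charge level (0-3) of every cell.
    """
    if len(first_bits) != len(second_bits):
        raise ValueError("both virtual pages must cover the same cells")
    # Pass 1: coarse step, each cell goes to level 0 or level 2.
    levels = [2 * bit for bit in first_bits]
    # Pass 2: fine tuning, each cell stays put or moves up by one level.
    levels = [level + bit for level, bit in zip(levels, second_bits)]
    return levels


def read_page(levels):
    """Recover the two virtual pages from the stored charge levels."""
    first_bits = [level >> 1 for level in levels]
    second_bits = [level & 1 for level in levels]
    return first_bits, second_bits


levels = program_page([0, 1, 1, 0], [1, 0, 1, 0])
print(levels)             # [1, 2, 3, 0]
print(read_page(levels))  # ([0, 1, 1, 0], [1, 0, 1, 0])
```

The window the paper is concerned with sits between the two passes in `program_page`, when every cell is parked at level 0 or 2 and waiting for its fine-tuning step.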
 
Now, what the authors claim is that while the cells are waiting for the second programming pulse, they are vulnerable to any near field effect... well, yea... that is how they are programmed .... DUH *scratching my head*
...well, yea... that is why it's a problem apparently, it's vulnerable.
 
The author correctly identifies this subject matter as extremely technical, and in this case that has resulted in some naive research being incorrectly interpreted in several articles, including this one.

First of all, the sources of error, including nearest-neighbor effects and read disturb, are accounted for in the error budget of the device, so these apparent susceptibilities are covered. There is conceivably a case where repeated reads to a page of a block containing a partially programmed page could inject an error, but only if a read scrub algorithm is improperly designed (the same would not be true for a completed page). I find that the base article is pointing out a susceptibility that is actually a design consideration, and that the suggested remedies are unnecessary.

The articles that have then picked up on this research overblow it as a flaw in the drives (which it is not) and an opportunity for hackers (which it is not). Please get a better understanding of issues before sensationalizing them for a headline.
 
Now, what the authors claim is that while the cells are waiting for the second programming pulse, they are vulnerable to any near field effect... well, yea... that is how they are programmed .... DUH *scratching my head*
...well, yea... that is why it's a problem apparently, it's vulnerable.

No, it is not really vulnerable. Any modern NAND flash controller will buffer several pages in the page buffer and then do a single sweep write of typically 4 pages. This is done as an additional layer of security / data integrity, where the data for the upper and lower pages are written in interleaved mode to two different physical pages. The authors' assumption completely disregards the advancements in flash technology and essentially describes technology from before 2010. Keep in mind that there is no longer any difference between SLC and MLC flash. It is all the same; only the programming speed for SLC "mode" is faster because the required granularity is lower, and the mode can be changed on the fly depending on the wear and target usage. Everything else applies to TLC / QLC etc. as well.
 
In addition to this, the program disturb attack is predicated on being able to defeat an LFSR scrambler to write a worst case pattern to the adjacent wordline. Most SSDs encrypt before writing to Flash these days (with salting schemes), so the encryption would also need to be defeated. The read disturb attack is easily mitigated with firmware as mentioned in another reply. It's too bad this paper is getting sensationalized.
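
For readers unfamiliar with the scrambler point, here is a rough Python sketch of the idea, not any vendor's actual design. It assumes a 16-bit Fibonacci LFSR with a common textbook tap choice and a seed the attacker somehow knows; the names `lfsr_stream` and `scramble` are invented for the illustration. Because scrambling is just an XOR with a deterministic keystream, anyone who knows the seed can pre-scramble the pattern they want and have it land on flash unchanged.

```python
def lfsr_stream(seed, nbytes):
    """Generate nbytes of keystream from a 16-bit Fibonacci LFSR."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = bytearray()
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            # Feedback taps for x^16 + x^14 + x^13 + x^11 + 1.
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            byte = (byte << 1) | (state & 1)
        out.append(byte)
    return bytes(out)


def scramble(data, seed):
    """XOR data with the keystream; applying it twice undoes it."""
    return bytes(d ^ k for d, k in zip(data, lfsr_stream(seed, len(data))))


SEED = 0xACE1                  # hypothetical seed, assumed known to the attacker
target = b"\xff" * 8           # stand-in for a worst-case pattern
crafted = scramble(target, SEED)     # what the attacker asks the drive to write
on_flash = scramble(crafted, SEED)   # what the scrambler actually stores
print(on_flash == target)            # True
```

With encryption and a per-drive secret in front of the flash, the attacker can no longer compute `crafted`, which is the point made above.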
 
This security flaw as explained, if the explanation is in fact accurate, is a low-priority threat. To justify that statement, you have to consider the end result of corrupting data: no readable data on the drive, and the drive is down. Now, how does a hacker profit from this? And if hackers are simply being malicious, how do they interfere with data as it's being written? With software on the local machine, which an antivirus would remove. Just don't forget your antivirus if you are going to make such an enemy of hackers that they will shut your data farm down!
 