Japanese university loses 77TB of research data following a buggy software update

Humza

In brief: Losing data to a backup error can mean fretting over years’ worth of personal photos or, in the case of Japan’s Kyoto University, losing 77TB of critical research data. The university’s supercomputer received a faulty software update for its backup system, which accidentally wiped 34 million files over a two-day period.

The culprit was a faulty script, part of a software update, that was meant to delete old, unnecessary log files from Kyoto University’s Cray/HPE supercomputer. Instead, it deleted 77TB of research data from the computer’s high-capacity /LARGE0 backup disk between Dec 14 and Dec 16, 2021.

The university initially estimated losing up to 100TB of data after the buggy update wiped nearly all files older than 10 days. The 77TB of research data that was actually deleted comprised 34 million files and affected 14 research groups. Although Kyoto University didn’t reveal the nature or details of the wiped research data, it noted (Japanese) that files belonging to four groups were irrecoverable.

The university’s supercomputer supplier, Hewlett Packard Japan (HPE), admitted 100 percent responsibility for the incident and issued a letter of apology later published by the university. HPE said a modified script was issued in its update to “improve visibility and readability,” as The Stack reports.

However, HPE said it wasn’t aware of a side effect of releasing the update: the modified shell script was reloaded in the middle of execution, which left it running with “undefined variables” and led it to delete files on the supercomputer's /LARGE0 backup disk instead of the old logs.
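To illustrate the failure mode described above, here is a minimal Python sketch of a cleanup routine that refuses to run when its target directory is empty or unset. It is not HPE's script, which was a shell script and is not reproduced in this article; the function name `find_old_logs`, its parameters, and the 10-day default are assumptions made for illustration only.

```python
import time
from pathlib import Path


def find_old_logs(log_dir: str, max_age_days: int = 10) -> list[Path]:
    """Return files under log_dir last modified more than max_age_days ago.

    Hypothetical helper: it only lists candidates, so a caller can review
    them before deleting anything.
    """
    # An undefined variable typically arrives here as an empty string.
    # Path("") resolves to the current directory, so reject it up front.
    if not log_dir or not log_dir.strip():
        raise ValueError("log_dir is empty; refusing to scan")

    root = Path(log_dir).resolve()

    # Also refuse the filesystem root, where a recursive cleanup would
    # reach every mounted volume.
    if root == Path(root.anchor):
        raise ValueError(f"log_dir resolves to {root}; refusing to scan")

    cutoff = time.time() - max_age_days * 86400
    return [
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    ]
```

Separating the listing step from the deletion step is a design choice in this sketch, not something the article attributes to HPE or the university: it gives an operator a chance to notice that the candidate list contains research data rather than logs.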

Kyoto University has since suspended the backup process, as it looks to make improvements and add preventive measures to deal with such incidents in the future. In addition to mirror backups, the university also plans to maintain incremental backups once it resumes the backup program later this month.


 
Uh, they didn't lose data, they lost backups, didn't they?
Yep, except for four of the fourteen groups where the backup data was apparently needed for later:
The 77TB of research data that was actually deleted comprised 34 million files and affected 14 research groups. Although Kyoto University didn’t reveal the nature or details of the wiped research data, it noted (Japanese) that files belonging to four groups were irrecoverable.
 
Is it really lost? Can't they undelete data with special tools? Maybe some malware already stole a copy of the data, they can look on the dark web too...
 
Is it really lost? Can't they undelete data with special tools? Maybe some malware already stole a copy of the data, they can look on the dark web too...
Depends. If said data was encrypted and overwritten, you'd need tools that typically cost $15,000 per hard drive to recover, and it's still not a guarantee.
 
I lost 2.7 Mb of DNA sequencing data in 1985 when one of my post-doctoral fellows decided to reformat our IBM 5160’s 45Mb HDD. Took me 3 months to re-scan all the gel films one base at a time. It was a lot of G-A-T & C’s.
 
I lost 2.7 Mb of DNA sequencing data in 1985 when one of my post-doctoral fellows decided to reformat our IBM 5160’s 45Mb HDD. Took me 3 months to re-scan all the gel films one base at a time. It was a lot of G-A-T & C’s.
IIRC, I worked with someone, a "fellow programmer" actually, once a long time ago who, in the days of Windows NT, thought it was a good idea to delete the file "bootsect.dat" from his system drive. The result was his computer no longer booting. :rolleyes:

I was going to say that people who do not know what they are doing should not be allowed to do things like reformatting a hard drive without talking to someone else about the impact. However, even people who should know what they are doing, as in the case of my "fellow programmer", sometimes do stuff they should never do.
 
"The university’s supercomputer supplier, Hewlett Packard Japan (HPE), admitted 100 percent responsibility for the incident and issued a letter of apology later published by the university."
I'm not sure that the apology is worth 77TB of research data. I wonder how much HP owes Kyoto University for this mess-up. I also wonder how badly this will affect HP's business going forward.
 
All of this is a smoke screen! It wasn’t an accident, Aliens did it! The researchers were probably working on something that would get Aliens exposed so they got rid of all the data to cover up their real target. I’ve seen countless movies based on that scenario..!
 
The good news (and the bad news) is that 77TB of data isn't even that much. It's fairly trivial to back up this quantity of data in an entirely different data center. But since they lost it all, hopefully the data is easily reproducible from running code.
 
The current level of programming and the level of the average programmer says that NFC (your and our money) and other important data is better kept away from *****s.
 
If the data is on spinning disks, the data can be recovered even if it was deleted.
This is valid. It's expensive to get proper data recovery done, but HP should foot the bill on this one. At 77TB, I imagine it's on some sort of SCSI or SAS drives. Solid State would be spendy at that capacity.
 
This is valid. It's expensive to get proper data recovery done, but HP should foot the bill on this one. At 77TB, I imagine it's on some sort of SCSI or SAS drives. Solid State would be spendy at that capacity.
I suspect they already tried this, which is why the article mentions that 14 groups were affected, but files for 4 groups were unrecoverable.
 