Complex workload bug found to crash Intel Skylake CPUs

Shawn Knight

Intel has identified an issue with its sixth-generation Skylake processors that may result in chips hanging or causing other unpredictable system behavior.

The bug was reportedly discovered by hardwareluxx.de and then passed on to Intel and to mathematicians with the Great Internet Mersenne Prime Search (GIMPS). The latter group is responsible for Prime95, an application that multiplies very large numbers using fast Fourier transforms.

Aside from finding record prime numbers, the software has been used by enthusiasts for years to help benchmark and stress-test hardware.

The issue rears its ugly head when performing complex workloads like those used in Prime95. Specifically, the exponent 14,942,209 has been singled out as a source of crashing.
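
For context, GIMPS software has traditionally checked whether a Mersenne number 2^p - 1 is prime with the Lucas-Lehmer test, where p is the exponent. What follows is only a minimal Python sketch of that test for small exponents. It is not Prime95's code, which relies on heavily optimized FFT-based multiplication, and running it will not reproduce the Skylake hang. The function name is illustrative.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    Assumes p is a prime exponent; plain Python integers are used here,
    whereas real GIMPS software uses FFT-based multiplication.
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1          # the Mersenne number being tested
    s = 4                     # starting value of the Lucas-Lehmer sequence
    for _ in range(p - 2):
        s = (s * s - 2) % m   # square, subtract 2, reduce modulo 2**p - 1
    return s == 0


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, lucas_lehmer(p))
```

An exponent the size of 14,942,209 means squaring numbers millions of digits long, millions of times over, which is why this kind of workload stresses a CPU so heavily.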

Exactly why the bug occurs has not been made public. We do know that it affects both Linux and Windows-based systems and that it occurs regardless of underclocking or overclocking. Because it only surfaces under extremely complex workloads, it's unlikely to affect the vast majority of users.

Regardless, Intel said it has identified and released a fix and is working with motherboard makers to get the patch out to end-users via BIOS update. No word yet on how long it'll take manufacturers to get the updates out to their users.

 
That's all beyond me, but I am fascinated to think about what kind of software or hardware issues could possibly cause that.
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.
 
14942209 = 0xE40001 = 0b'111001000000000000000001
24 bits = 3 bytes wide
not sure what might be going on here
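
A quick way to double-check the conversions above, as a small Python sketch (the variable names are just for illustration; it only confirms the arithmetic and says nothing about the cause of the hang):

```python
EXPONENT = 14_942_209  # the exponent singled out in the article

hex_form = hex(EXPONENT)               # hexadecimal representation
bin_form = bin(EXPONENT)               # binary representation
bits = EXPONENT.bit_length()           # number of significant bits
byte_width = (bits + 7) // 8           # bits rounded up to whole bytes
set_bits = bin_form.count("1")         # how many bits are 1

print(hex_form)    # 0xe40001
print(bin_form)    # 0b111001000000000000000001
print(bits)        # 24
print(byte_width)  # 3
print(set_bits)    # 5
```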
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.

Microprocessors have firmware on them that can be replaced by applying a high voltage on an "enable pin" and then inputting the correct sequence of data to allow for flashing. After the processor gives the ok signal to allow flashing, the new firmware is loaded and then given a checksum or full integrity check of the new firmware.
 
In all honesty, microprocessor design isn't that "difficult" these days. It's just really hard to figure out how to actually produce the designs that becomes an issue. It's mostly a giant bank of transistors.
 
In all honesty, microprocessor design isn't that "difficult" these days. It's just really hard to figure out how to actually produce the designs that becomes an issue. It's mostly a giant bank of transistors.

Could you elaborate? What are you comparing now with the past? What comes to my mind is the word "complex", but it isn't the same as "difficult", so I'm interested.
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.

I have a friend who works at Intel. They have some funny stories about the Pentium 2s when they came out and had some weird bug. Also, the P4s are considered the black sheep of the community within Intel.
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.

Microprocessors have firmware on them that can be replaced by applying a high voltage on an "enable pin" and then inputting the correct sequence of data to allow for flashing. After the processor gives the ok signal to allow flashing, the new firmware is loaded and then given a checksum or full integrity check of the new firmware.

this all sounds like rocket science to me.
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.

I have a friend who works at Intel. They have some funny stories about the Pentium 2s when they came out and had some weird bug. Also, the P4s are considered the black sheep of the community within Intel.

Haha. I work at Intel, and let me tell you: P4s are considered the black sheep outside and inside Intel -pun intended. I had a P4 long before I ever thought of working here or what career I would choose, and God I hated that thing, hottest CPU I've ever had -lots of trips back to the electronics store to get more fans to cool it down.

I moved from a Coppermine @533 MHz to a Prescott @3.4 GHz; tripling the system memory and moving to DDR2. I can tell you, subjectively in certain scenarios I felt it as much as twice the performance/responsiveness, a far cry from the expected ~7 times. Now I know it was due to the pipeline depths of both and basically the Netburst architecture was pure marketing (the GHz race); that's the one big mistake I've made in PC builds during my life and the thing I would change being able to go back then.

Disclaimer: what was written above is my personal opinion and doesn't represent Intel's.
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.

I have a friend who works at Intel. They have some funny stories about the Pentium 2s when they came out and had some weird bug. Also, the P4s are considered the black sheep of the community within Intel.

Haha. I work at Intel, and let me tell you: P4s are considered the black sheep outside and inside Intel -pun intended. I had a P4 long before I ever thought of working here or what career I would choose, and God I hated that thing, hottest CPU I've ever had -lots of trips back to the electronics store to get more fans to cool it down.

I moved from a Coppermine @533 MHz to a Prescott @3.4 GHz; tripling the system memory and moving to DDR2. I can tell you, subjectively in certain scenarios I felt it as much as twice the performance/responsiveness, a far cry from the expected ~7 times. Now I know it was due to the pipeline depths of both and basically the Netburst architecture was pure marketing (the GHz race); that's the one big mistake I've made in PC builds during my life and the thing I would change being able to go back then.

Disclaimer: what was written above is my personal opinion and doesn't represent Intel's.


lol brings back so many bad memories.

Lucky I skipped all of the P4 generation and was on AMD socket 939 until Nehalem.
 
In all honesty, microprocessor design isn't that "difficult" these days ... It's mostly a giant bank of transistors.

Good grief -- not that hard? REALLY??? Between precisely-organized logic that intelligently looks ahead at instructions to optimize things, to very intelligent caching of data and instructions, to the physical aspects of reducing RF interference when tiny things are clocked in the gigahertz range ... making a processor absolutely takes the best minds in the industry. Enormously difficult work.
 
Well, if I worked at Intel, I would say thank God we don't have any competition these days, so no one really cares about news like this one... we'll patch it in a BIOS update, whatever...
 
I'd love to be a fly on the wall at Intel when they discuss this. I bet there's some funny stories to hear from complex problems such as these. It's also interesting how they can just patch the problem out of existence through a motherboard firmware update.

I have a friend who works at Intel. They have some funny stories about the Pentium 2s when they came out and had some weird bug. Also, the P4s are considered the black sheep of the community within Intel.
And the rest of the world...
 
In all honesty, microprocessor design isn't that "difficult" these days. It's just really hard to figure out how to actually produce the designs that becomes an issue. It's mostly a giant bank of transistors.
And I'm sure after you've drawn a schematic with a billion or so transistors for a workable, bug free CPU, you'll get back to us to tell us what a breeze that was...
 
I would be very leery of BIOS updates. How are you sure the OEM would have applied the BIOS upgrade? Even if they have, it is much easier to turn off a feature to prevent the CPU from doing certain routines than it is to correct the faulty routine in the first place. Surely that routine would be used in other complex math calculations as well, not just the prime number one. Lucky they caught it through a freeze and not through a mistaken calculation, which might not have been caught. Do you trust Intel that their BIOS fix isn't just disabling part of the CPU in some way instead of being a real fix? Reminds me of when Nissan disabled their crankshaft position sensor in 2007 in their Altimas and Sentras rather than replace 800,000 of them on a recall. Until Intel comes clean on exactly what the flaw was, I would stay away from Skylake processors. Shades of the FDIV bug!
 
I would be very leery of BIOS updates. How are you sure the OEM would have applied the BIOS upgrade?
I'm certain anyone capable of knowing they have this bug will be capable of knowing they have a fix. Anyone who has to wonder whether they have the update never had to worry about the issue in the first place.

Long story short, I'm not going to worry about this problem. Why? Because I'm not one that would ever face the problem. Those that do will know what they have to do to work around it. Even if that means they have to use an older system.
 