Endymio
"There is no need to list CPU's that support SSE. First, it can be easily checked and since compiler creates additional non-SSE codepath when SSE is used, there is no harm creating SSE codepath for CPU that even doesn't support it."
There are several fundamental errors here. First of all, if one attempts to execute SSE code on a chip that doesn't support it, there is indeed a great deal of harm done -- a hard stop from an illegal instruction error, in fact. That's the entire reason compilers must emit code to check for these extended instruction sets. That "non-SSE codepath" you mention only gets executed if this check fails.
Now, how does one "easily check" for the presence of SSE? You execute a cpuid opcode and check whether a particular bit is set in the returned feature flags (for SSE, a bit in the EDX register). But here's the problem. That bit was defined by Intel, and it was defined only when Intel created the new instructions. Since the bit was meaningless before this time, some preexisting non-Intel CPUs were inconsistent in their handling of it. Which is why, when the original SSE(1) instructions were introduced, some programs (compiled on certain non-Intel compilers and run on certain non-Intel CPUs) would fault. There was a somewhat similar early compiler issue between MMX and AMD's 3D-Now instruction set, which I won't get into, but the critical point is that the only way to be 100% sure was to validate the feature bit against the ManufacturerID string. (It's also possible to validate against the highest calling parameter as well -- essentially the "class" of the CPU within its manufacturer.)
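To make the check concrete, here is a rough sketch in C, not any particular compiler's actual startup code. It relies on two documented facts: cpuid leaf 0 returns the highest calling parameter in EAX and the 12-character vendor string in EBX, EDX, ECX, and leaf 1 reports SSE as bit 25 of EDX. The helper names (unpack_reg, vendor_string, sse_confirmed, host_sse_confirmed) and the caller-supplied "trusted" vendor list are made up for illustration. The live query uses GCC/Clang's <cpuid.h> and is compiled only on x86.

```c
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>   /* GCC/Clang wrapper around the cpuid instruction */
#define HAVE_CPUID_H 1
#endif

/* CPUID leaf 1 feature flags, returned in EDX. */
#define CPUID_EDX_SSE  (UINT32_C(1) << 25)
#define CPUID_EDX_SSE2 (UINT32_C(1) << 26)

/* Copy the four bytes of one register into dst, lowest byte first. */
static void unpack_reg(char *dst, uint32_t reg) {
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((reg >> (8 * i)) & 0xFF);
}

/* Leaf 0 returns the 12-character vendor string in EBX, EDX, ECX order. */
static void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13]) {
    unpack_reg(out, ebx);
    unpack_reg(out + 4, edx);
    unpack_reg(out + 8, ecx);
    out[12] = '\0';
}

/* Conservative check (hypothetical policy): leaf 1 must exist, the vendor
   must be on the caller's trusted list, and the SSE bit must be set. */
static int sse_confirmed(const char *vendor, uint32_t max_leaf, uint32_t leaf1_edx,
                         const char *const *trusted, size_t n_trusted) {
    if (max_leaf < 1) return 0;
    for (size_t i = 0; i < n_trusted; i++)
        if (strcmp(vendor, trusted[i]) == 0)
            return (leaf1_edx & CPUID_EDX_SSE) != 0;
    return 0;
}

#ifdef HAVE_CPUID_H
/* Query the CPU this code is running on. Returns 1 if SSE is confirmed. */
static int host_sse_confirmed(const char *const *trusted, size_t n_trusted) {
    unsigned int eax, ebx, ecx, edx;
    char vendor[13];
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return 0;
    unsigned int max_leaf = eax;            /* highest calling parameter */
    vendor_string(ebx, edx, ecx, vendor);
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return sse_confirmed(vendor, max_leaf, edx, trusted, n_trusted);
}
#endif
```

Feeding vendor_string the leaf-0 values 0x756e6547, 0x49656e69, 0x6c65746e yields "GenuineIntel". Keeping the decoding separate from the live cpuid call means the validation logic can be exercised with register values from any CPU, not just the one the code happens to run on.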
With me so far? Now, compilers that are interested in the highest possible performance on all possible CPUs will validate all possible combinations known to support the instruction set in question. An Intel compiler written by Intel-paid developers isn't (or wasn't then, at least) going to go to such great lengths to ensure it would emit proper code. If the SSE (and later, the SSE2) bit was set and the CPU was "GenuineIntel", take the SSE path. Otherwise, play it safe. You'll rightly point out that it's very little work to optimize for AMD as well. But when this code was first written, this wasn't possible, nor even beneficial (AMD, after all, did not support the instructions then). Could Intel have spent a trivial amount of time to update their compiler later? Sure. But again, why would they? Why spend money to help a competitor? Why should they be forced to? In closing, I'll point out that Intel's compiler was hardly the only one around, even then. There were plenty of compilers that quickly adapted to AMD's processors, and optimized accordingly.
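For illustration only, the two dispatch policies being contrasted might be sketched like this in C. This is not Intel's actual dispatcher: the function and enum names are hypothetical, and the only hard fact used is that SSE is reported as bit 25 of EDX by cpuid leaf 1.

```c
#include <stdint.h>
#include <string.h>

#define CPUID_EDX_SSE (UINT32_C(1) << 25)   /* cpuid leaf 1, EDX bit 25 */

typedef enum { PATH_BASELINE, PATH_SSE } code_path;

/* Vendor-gated policy: take the SSE path only when the vendor string is
   "GenuineIntel" and the SSE bit is set. */
static code_path pick_vendor_gated(const char *vendor, uint32_t leaf1_edx) {
    if (strcmp(vendor, "GenuineIntel") == 0 && (leaf1_edx & CPUID_EDX_SSE))
        return PATH_SSE;
    return PATH_BASELINE;   /* otherwise, play it safe */
}

/* Feature-only policy: trust the SSE bit regardless of who made the CPU. */
static code_path pick_feature_only(uint32_t leaf1_edx) {
    return (leaf1_edx & CPUID_EDX_SSE) ? PATH_SSE : PATH_BASELINE;
}
```

With the SSE bit set, pick_vendor_gated("AuthenticAMD", ...) returns PATH_BASELINE while pick_feature_only(...) returns PATH_SSE. That difference on the same hardware is the whole of the argument.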