The article talks about "defects", which do occur and can disable parts of a die, but more often the performance variation is driven by process variation. Modern chip manufacturing requires hundreds of process steps, which deposit, pattern, and etch various materials. Ideally, these processes would be perfectly uniform across the entire wafer, but in reality there is always some level of variation. As a result, a layer may be thicker or thinner at the wafer edge, or perhaps in the "donut" region at mid-radius. Or, during processing, portions of the wafer may run slightly hotter or colder, leading to different levels of dopant diffusion, which then drives variation in electrical performance.
Each process has its own typical "signature", and when you combine all of these individual signatures across hundreds of steps, you get regions of the wafer that produce more high-performing die, while other regions, especially at the wafer edge, are more likely to yield slower-performing duds. Then, in addition to these within-wafer signatures, you have run-to-run variation (e.g. wafer-to-wafer variation), as well as lot-to-lot variation (wafers processed together in one lot tend to perform more alike than wafers from different lots), and tool-to-tool variation, driven by differences in equipment age, performance, etc.
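To make the stacking of signatures concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the two signature shapes (`edge_rolloff`, `donut`), their magnitudes, and the idea that the contributions simply add are all assumptions, not real process data.

```python
import math

# Hypothetical within-wafer signatures. r is the normalized radius:
# 0.0 at the wafer center, 1.0 at the wafer edge.
def edge_rolloff(r):
    # Assumed step whose result degrades toward the wafer edge.
    return -0.08 * r ** 4

def donut(r):
    # Assumed step with a dip in a mid-radius "donut" region.
    return -0.03 * math.exp(-((r - 0.5) / 0.1) ** 2)

def die_speed(r, signatures, lot_offset=0.0, wafer_offset=0.0, tool_offset=0.0):
    """Relative speed of one die (1.0 = nominal), assuming effects add linearly."""
    within_wafer = sum(sig(r) for sig in signatures)
    return 1.0 + within_wafer + lot_offset + wafer_offset + tool_offset

signatures = [edge_rolloff, donut]
for r in (0.0, 0.5, 1.0):
    print(f"r={r:.1f}  relative speed={die_speed(r, signatures):.3f}")
```

With these invented numbers the center die comes out fastest, the mid-radius die loses a little to the donut, and the edge die is slowest. A slow lot or an older tool would shift all of them down through the offset arguments.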
All of these variations combine to cause quite a lot of die-to-die variation. For cheaper, lower-performance chips, say in a cell phone, the manufacturer will just set a minimum passing criterion, and everything that passes is binned the same. But for CPUs, GPUs, and other high-performance chips, they will bin the chips by performance and charge a premium for the better ones.
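The difference between the two binning approaches can be sketched in a few lines of Python. The bin labels, the frequency thresholds, and the use of maximum stable frequency as the only metric are assumptions for illustration, not any manufacturer's actual scheme.

```python
def bin_die(max_freq_ghz, bins):
    """Return the label of the highest bin this die clears, or None if it fails.

    bins is a list of (label, minimum frequency in GHz) pairs.
    """
    # Check the most demanding bin first.
    for label, floor in sorted(bins, key=lambda b: b[1], reverse=True):
        if max_freq_ghz >= floor:
            return label
    return None  # below even the lowest floor: reject

# Hypothetical single pass/fail criterion for a cheaper part.
PHONE_BINS = [("pass", 2.0)]

# Hypothetical performance bins for a high-end CPU.
CPU_BINS = [("base", 3.0), ("mid", 3.4), ("premium", 3.8)]

print(bin_die(2.5, PHONE_BINS))  # pass
print(bin_die(3.5, CPU_BINS))    # mid
print(bin_die(3.9, CPU_BINS))    # premium
```

With a single bin, a die that barely passes and a die with lots of headroom are sold as the same part. With several bins, the same spread of die ends up as differently priced products.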