"I strongly believe 1 core is capable of putting out what a now day i7 is scoring"
Assuming you are talking about the P4 here, since that was Intel's last mainstream desktop processor with a single physical core before the Pentium D arrived.
Now that that's out of the way: the P4 (Prescott and later) had a very deep pipeline, 31 stages if I remember correctly. Such a long pipeline makes it easier to jack up clock speeds, but the downside is that accurate branch prediction becomes extremely critical, because whenever a misprediction occurs the processor has to flush the whole pipeline and start over. One other important milestone of the P4 (Northwood onward) was the introduction of HyperThreading, Intel's implementation of Simultaneous Multi-Threading (SMT).
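To get a feel for why mispredictions hurt a deep pipeline so much, here is a rough back-of-the-envelope model. All the numbers (branch frequency, misprediction rate, flush cost equal to pipeline depth) are my own illustrative assumptions, not measured Prescott figures:

```python
# Toy model: how branch misprediction stalls degrade effective IPC.
# The inputs are illustrative assumptions, not real Prescott measurements.

def effective_ipc(base_ipc, branch_freq, mispredict_rate, flush_penalty):
    """Average instructions per cycle once flush stalls are amortized in."""
    base_cpi = 1.0 / base_ipc                               # cycles/instr with no stalls
    stall_cpi = branch_freq * mispredict_rate * flush_penalty  # extra cycles/instr lost
    return 1.0 / (base_cpi + stall_cpi)

# Assume ~20% of instructions are branches and 5% of those mispredict;
# a flush costs roughly one pipeline's worth of cycles.
short_pipe = effective_ipc(1.0, 0.20, 0.05, 14)  # shallower 14-stage design
long_pipe  = effective_ipc(1.0, 0.20, 0.05, 31)  # Prescott-like 31-stage depth

print(f"14-stage: {short_pipe:.2f} IPC, 31-stage: {long_pipe:.2f} IPC")
# -> 14-stage: 0.88 IPC, 31-stage: 0.76 IPC
```

Same workload, same predictor accuracy, but the deeper pipeline loses noticeably more throughput per misprediction, which is why the P4 leaned so hard on its branch predictor.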
Nehalem, by contrast, has a very wide pipeline: six execution units capable of executing three memory operations and three calculation operations. If the execution engine can't find enough instruction-level parallelism to keep them all busy, "bubbles" (lost cycles) appear in the pipeline.
To address this issue, SMT looks for instruction-level parallelism across two threads instead of just one, with the goal of leaving as few units idle as possible. This can be extremely effective when the two threads stress different parts of the core. On the other hand, two threads that are both doing intensive calculation will only increase the pressure on the same execution units, putting them in competition with each other for access to the cache. In that kind of situation SMT buys you nothing and can even hurt performance.
Generally, the impact of SMT on performance is positive most of the time and its cost in resources is fairly limited, which explains why Intel brought it back with the i7. However, programmers do need to pay attention, because on Nehalem (and its later incarnations) not all threads are created equal. To help solve this conundrum, Intel provided a way to precisely determine the processor's topology (the number of physical and logical processors), and programs can then use the OS affinity mechanism to assign each thread to a particular processor.
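As a small sketch of that affinity mechanism from the OS side: on Linux, Python exposes it through os.sched_setaffinity. Which logical CPUs are SMT siblings of the same physical core is platform-specific, so the choice of CPU 0 below is purely a hypothetical example:

```python
# Sketch: pinning the current process to a chosen logical CPU so the
# scheduler can't move it onto an SMT sibling of a busy core.
# os.sched_setaffinity is Linux-only, hence the hasattr guard; which
# logical CPU numbers share a physical core is an assumption here.
import os

logical_cpus = os.cpu_count()
print(f"{logical_cpus} logical processors visible to the OS")

if hasattr(os, "sched_setaffinity"):
    # Restrict this process (pid 0 = ourselves) to logical CPU 0 only.
    os.sched_setaffinity(0, {0})
    print("now allowed to run on:", sorted(os.sched_getaffinity(0)))
```

Real code would first query the topology (e.g. via CPUID on x86, or /sys/devices/system/cpu on Linux) to learn which logical processors are siblings before deciding where to pin.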
Since SMT puts a heavier load on the out-of-order execution engine, Intel also made significant improvements in this area on the i7, increasing the size of certain internal buffers so they don't become bottlenecks. The reorder buffer, which keeps track of all in-flight instructions so they can be retired in order, grew from 96 entries on Core 2 to 128 entries on Nehalem. In practice, since this buffer is statically partitioned to keep any one thread from monopolizing it, with SMT active each thread gets 64 entries.
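The static-partitioning point is just arithmetic; here it is spelled out with the entry counts from the text:

```python
# Static partitioning of the reorder buffer (ROB) under SMT, using the
# Nehalem entry count mentioned above.
NEHALEM_ROB_ENTRIES = 128

def rob_entries_per_thread(total_entries, active_threads):
    """Each active hardware thread gets a fixed, equal share of the ROB."""
    return total_entries // active_threads

print(rob_entries_per_thread(NEHALEM_ROB_ENTRIES, 1))  # one thread: all 128
print(rob_entries_per_thread(NEHALEM_ROB_ENTRIES, 2))  # SMT active: 64 each
```

So a single thread sharing a core under SMT sees an out-of-order window smaller than even Core 2's 96 entries, which is one concrete way "not all threads are created equal."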
Obviously, when only a single thread is executing it has access to all the entries, so there shouldn't be any situation where Nehalem turns out to perform worse than its predecessor.
Therefore, you can't really claim that a core from generation X will perform at the same level as a core from generation Y.
Sources: Tom's Hardware for the i7/Nehalem details; I forget where I read the architectural details about Prescott, since that was ages ago.