Architecture is transistors, though, so it should be a bit of both. You obviously want to add transistors, because that is the whole point of shrinking them. But you have to add them only in the smartest places to improve IPC and keep power consumption under control.
In a GPU, which is highly parallel, you can just add a load more of the same processing clusters to make the chip faster. In a CPU, where the workload is much less parallel, deciding where best to add transistors is probably more difficult.
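To put rough numbers on that, here is a minimal sketch using Amdahl's law, which is my own choice of model here and not anything specific to a real chip. The function name and the parallel fractions (0.99 for a GPU-style workload, 0.5 for a CPU-style one) are illustrative assumptions, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Ideal speedup from Amdahl's law.

    parallel_fraction: share of the work that can be spread across units (0..1).
    units: number of identical processing units (hypothetical clusters or cores).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at the same speed no matter how many units exist.
    return 1.0 / (serial_fraction + parallel_fraction / units)


if __name__ == "__main__":
    # Assumed, illustrative workloads: 99% parallel vs 50% parallel.
    for label, fraction in (("GPU-like", 0.99), ("CPU-like", 0.5)):
        for units in (2, 8, 64):
            print(f"{label:8s} units={units:3d} speedup={amdahl_speedup(fraction, units):.2f}x")
```

With those assumed fractions, 64 units gives about 39x for the mostly parallel workload but just under 2x for the half-serial one, which is why simply copying more of the same block pays off on a GPU and much less so on a CPU.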
Everything about chip design is difficult, clearly. You have so many trade-offs to balance: performance, die size, leakage, temperatures, power consumption, process yields, attainable clock speeds, etc.
Do you add more transistors to an existing design? If you do that over successive generations, does the chip end up bloated, lopsided and inefficient? Would it be better to just scrap it and start again with a clean sheet, even though that means years of work?
Plenty of examples like this, particularly the Pentium 4, which reached an evolutionary dead end. Intel ended up binning the whole thing and moving on with Core 2, which built on the Pentium M line rather than NetBurst. Good decision. Same with AMD's K10 nonsense.