Not fair not to include Radeon IC bandwidth
Nvidia hasn't disclosed the cache data path widths in Lovelace, so it's not possible to directly compare bandwidths. Both architectures have the same size L1 cache (128 kB). For RDNA 2, the read data width is 128 bytes per clock and the write width is 32 bytes per clock. In Nvidia's Ampere architecture, it's 128 bytes per clock for both reads and writes, and it's reasonable to assume that Lovelace is the same, as nothing else appears to have changed with the L1 cache.
A Navi 21 full chip has 4MB of 16-way L2 cache, with a data path width of 128 bytes per clock (read and write) per L2 partition. The 4090 has 72MB of L2 cache but an unknown path width; for Ampere, the partition data path is 128 bytes per clock, so again, Lovelace will be at least that.
And finally, RDNA 2 has the Infinity Cache, up to 128MB of 16-way L3, with a path width of 64 bytes per clock per partition. Lovelace obviously doesn't have any L3 cache.
The highest level of cache is particularly significant because that's what is paired with each memory controller. So for the Navi 21, each MC has 16MB of cache associated with it. In Ampere, it was just 0.5MB, whereas in Lovelace, it's now 6MB for the 4090 and 8MB for the full AD102 chip.
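Those per-controller figures are simply the total last-level cache divided by the number of memory controllers. Here's a quick sketch of that arithmetic. The controller counts (8 for Navi 21, 12 for GA102 and AD102, i.e. 32-bit controllers on 256-bit and 384-bit buses) and the 6MB and 96MB totals for GA102 and the full AD102 are my own inputs, not figures stated above, so treat them as assumptions.

```python
# Last-level cache per memory controller, in MB.
# Assumption: controller counts and the GA102 / full AD102 totals are
# supplied for illustration and are not taken from the post itself.
CHIPS = {
    "Navi 21 (Infinity Cache)": (128, 8),
    "GA102 (L2)": (6, 12),
    "RTX 4090 (L2)": (72, 12),
    "Full AD102 (L2)": (96, 12),
}


def cache_per_controller(total_mb, controllers):
    """Split the total last-level cache evenly across memory controllers."""
    return total_mb / controllers


per_mc = {name: cache_per_controller(*spec) for name, spec in CHIPS.items()}

for name, mb in per_mc.items():
    print(f"{name}: {mb:g} MB per controller")
```

The even split is itself a simplification; it just reproduces the numbers quoted in the post.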
AMD obviously has a capacity advantage there, but the data paths to those partitions are relatively narrow at 64 bytes per clock. If Lovelace has the same L2 cache structure as Ampere, then not only is it clocked faster, the data path is also wider.
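To see how width and clock combine, peak bandwidth for one partition is just bytes per clock multiplied by clock speed. In this minimal sketch the clock values are placeholders picked purely for illustration, not published or measured figures, and the 128-byte Lovelace width is the Ampere-based assumption, not a disclosed spec.

```python
def partition_bandwidth_gbs(bytes_per_clock, clock_ghz):
    """Peak bandwidth of a single cache partition in GB/s.

    bytes/clock * (1e9 clocks/s) = 1e9 bytes/s, i.e. GB/s.
    """
    return bytes_per_clock * clock_ghz


# Hypothetical clocks, for illustration only.
infinity_cache = partition_bandwidth_gbs(64, 2.0)   # 64 B/clk path
lovelace_l2 = partition_bandwidth_gbs(128, 2.5)     # assumed 128 B/clk path

print(f"Infinity Cache partition: {infinity_cache:g} GB/s")
print(f"Lovelace L2 partition:    {lovelace_l2:g} GB/s")
print(f"Ratio: {lovelace_l2 / infinity_cache:g}x")
```

Swap in real clocks and partition counts if Nvidia ever publishes them; the point is only that a wider path and a higher clock multiply together.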
It will be interesting to see what AMD does with RDNA 3, where the general expectation is the L3 cache and MCs are on separate chiplets, surrounding the rest of the GPU core die.