Some really interesting results here, especially if one compares the 3060 Ti, 3070, and 2080 Super against each other.
The above figures ignore various changes between Ampere and Turing (such as the increase in L1 cache size, L1 cache bandwidth, RT cores operating independently, and so on), and the table uses the reference boost clocks for the rates. On paper, the 3060 Ti and 3070 are better than the 2080 Super in only three areas: outright FP32 throughput, pixel output rate, and RT core triangle-intersection test rate.
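As a sanity check, here's a rough sketch of how two of those on-paper rates can be derived. The shader counts, ROP counts, and reference boost clocks are Nvidia's published figures rather than values taken from the table above, so treat this as back-of-envelope only:

```python
# Theoretical peak rates from published reference specs (not from the table above).
# FP32: 2 FLOPs per shader per clock (one FMA); pixels: one per ROP per clock.
specs = {
    "2080 Super": {"fp32_cores": 3072, "rops": 64, "boost_mhz": 1815},
    "3060 Ti":    {"fp32_cores": 4864, "rops": 80, "boost_mhz": 1665},
    "3070":       {"fp32_cores": 5888, "rops": 96, "boost_mhz": 1725},
}

for name, s in specs.items():
    tflops = s["fp32_cores"] * 2 * s["boost_mhz"] * 1e6 / 1e12
    gpix = s["rops"] * s["boost_mhz"] * 1e6 / 1e9
    print(f"{name:11s}  {tflops:5.1f} TFLOPS   {gpix:6.1f} Gpixel/s")

# 2080 Super   11.2 TFLOPS    116.2 Gpixel/s
# 3060 Ti      16.2 TFLOPS    133.2 Gpixel/s
# 3070         20.3 TFLOPS    165.6 Gpixel/s
```

Both Ampere cards clear the 2080 Super on both counts, which lines up with the "better in only three areas" summary.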
The 'No RT' results show that, for this particular game at these settings, performance is dominated by FP32 throughput, pixel output, or a combination of the two. There are no other factors (bar the ignored aspects) that would let a 3070 be roughly 9% down on the 2080 Super yet perform 5 to 7% better. Interestingly, though, the 3070 isn't as far ahead of the 3060 Ti as it should be - I doubt the game is raster-limited, so either the 3060 Ti is maintaining its clocks better than the 3070, or the game is heavily VRAM bandwidth limited.
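The bandwidth hypothesis is at least consistent with the published memory configurations - the 3060 Ti and 3070 share an identical memory subsystem, so a bandwidth-bound workload would pull them together. A quick sketch:

```python
# Peak VRAM bandwidth from the published memory configs (GDDR6 on all three):
# bus width in bits, effective data rate in Gbps.
mem = {
    "2080 Super": {"bus_bits": 256, "gbps": 15.5},
    "3060 Ti":    {"bus_bits": 256, "gbps": 14.0},
    "3070":       {"bus_bits": 256, "gbps": 14.0},
}

for name, m in mem.items():
    gb_s = m["bus_bits"] / 8 * m["gbps"]  # bytes/s across the full bus
    print(f"{name:11s}  {gb_s:5.1f} GB/s")

# 2080 Super  496.0 GB/s
# 3060 Ti     448.0 GB/s
# 3070        448.0 GB/s
```

If the game really is chewing through bandwidth, the 3070's extra shader throughput has nowhere to go, which would explain the compressed gap.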
Once RT is enabled, one can see Ampere's improvements in this area show up, but again, not by the margin the raw figures would suggest. My main take from this is just how much better value for money the 3060 Ti is compared to the 3070 - the latter has an MSRP of $499 versus the former's $399, a difference of 25%. That's quite a bit more cost for just 10% better performance (not that one should ever choose a GPU on the basis of one game).
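For what it's worth, here's that value arithmetic spelled out, using the MSRPs above and treating the rough 10% delta from this one game as the performance figure (a big assumption, as noted):

```python
# Value-for-money sketch: MSRPs from the text; the ~10% performance edge for
# the 3070 is assumed from this single game, with the 3060 Ti normalised to 1.0.
cards = {
    "3060 Ti": {"msrp": 399, "rel_perf": 1.00},
    "3070":    {"msrp": 499, "rel_perf": 1.10},  # assumed ~10% faster
}

base = cards["3060 Ti"]
for name, c in cards.items():
    extra_cost = c["msrp"] / base["msrp"] - 1
    rel_value = (c["rel_perf"] / c["msrp"]) * base["msrp"]
    print(f"{name:8s}  +{extra_cost:5.1%} cost   {rel_value:.2f}x perf-per-dollar")

# 3060 Ti  + 0.0% cost   1.00x perf-per-dollar
# 3070     +25.1% cost   0.88x perf-per-dollar
```

By that crude measure the 3070 delivers roughly 12% less performance per dollar here.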