The HBM modules on the R9 Fury were clocked to 1 Gbps per pin, whereas the GTX 980's GDDR5 ran at 7 Gbps - so although the former's combined bus width was indeed an enormous 4096 bits, giving a peak bandwidth of 512 GB/s, the comparatively slower access speed didn't help matters on a per-ROP/MC basis (and in the case of Fiji, it was two controllers per HBM stack). So if data in the same page is being requested, HBM was slower than GDDR5, although such cases are far less frequent than normal usage.
I have two Sapphire R9 Furies with 4GB of HBM1 but I'd much rather they had 6 or 8GB of GDDR5. Let's remember that HBM1 had an astonishing 4096-bit bus. That is 16x the bus width of today's RX 6800 XT and roughly 13x that of the 10GB RTX 3080. That didn't stop the GTX 980 from out-performing it with regular GDDR5.
Of course, all the peak DRAM bandwidth in the world isn't going to matter very much if the chip isn't requesting the data very often, and this was GCN's Achilles' heel. This is why the likes of the RX 590 fare pretty well against the R9 Fury, despite having half of everything (except clock speed).
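For anyone wanting to check the peak-bandwidth figures quoted in this thread, here is a rough sketch of the arithmetic: bus width in bits times the per-pin data rate, divided by 8 to get bytes. The function name `peak_bandwidth_gbs` is made up for illustration, and the GTX 980's 256-bit bus is my assumption, since the thread only gives its 7 Gbps data rate.

```python
def peak_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte.
    """
    return bus_width_bits * pin_rate_gbps / 8


# R9 Fury figures come from the post above.
# The GTX 980's 256-bit bus width is an assumption, not stated in the thread.
cards = {
    "R9 Fury (HBM)": (4096, 1.0),
    "GTX 980 (GDDR5)": (256, 7.0),
}

for name, (width_bits, rate_gbps) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(width_bits, rate_gbps):.0f} GB/s")
```

This only gives the theoretical ceiling. It says nothing about how often the chip actually requests data, which is the point being made about GCN.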