You mixed up the numbers: 32 GB/s is the figure quoted for x16 Gen 4, not Gen 3.
But those numbers are just straight conversions from the Gbit/s figures and still include the encoding overhead. They are theoretical numbers that can never be reached. You are better off with the numbers on Wikipedia.
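To make the difference concrete, here is a rough sketch of the conversion. It assumes the standard per-lane rates (8 GT/s for Gen 3, 16 GT/s for Gen 4) and 128b/130b line encoding; the function and table names are made up for this example. It only removes the encoding overhead, not packet or protocol overhead, so real-world throughput is lower still.

```python
# Per-lane transfer rate in GT/s and line-encoding efficiency
# (payload bits / transmitted bits). Gen 3 and Gen 4 both use 128b/130b.
PCIE_GENERATIONS = {
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def pcie_bandwidth_gbytes(gen, lanes):
    """Return (raw, after_encoding) bandwidth in GB/s, one direction."""
    rate_gt, efficiency = PCIE_GENERATIONS[gen]
    raw = rate_gt * lanes / 8  # straight Gbit/s -> GB/s conversion
    return raw, raw * efficiency  # encoding overhead removed, nothing else


for gen, lanes in [(3, 16), (4, 16), (3, 4), (4, 4)]:
    raw, encoded = pcie_bandwidth_gbytes(gen, lanes)
    print(f"Gen {gen} x{lanes}: raw {raw:.2f} GB/s, "
          f"after encoding {encoded:.2f} GB/s")
```

For x16 Gen 4 this gives the 32 GB/s "raw" figure and roughly 31.5 GB/s once the encoding is accounted for.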
The TB devices available today mostly use TB3 controllers and/or only have x4 Gen 3 slots for components like NVMe drives and GPUs.
The theoretical max speed for TB3/USB4 with x4 Gen 3 is around 3.31-3.4 GB/s once all the overheads I know of are subtracted. In practice, NVMe drives can reach about 3.1 GB/s of usable speed over that link; GPUs achieve less (more in the neighborhood of 2.6 GB/s), probably for reasons having to do with the protocol and latency.
The new ASM2464 controllers do support x4 Gen 4 on the port (the raw number is 64G), which is already more than USB4 40G can transfer. The USB4 40G link is then the limit, which comes out to ~3.9-4 GB/s of theoretical max bandwidth; NVMe enclosures have shown 3.7 GB/s in real-world use.
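As a quick sanity check on where the limit sits, the sketch below takes only the figures quoted in this post and expresses them as a share of the raw 40G link rate. The variable names are made up for the example, and the GB/s values are the estimates from above, not measurements of mine.

```python
# Raw link rates in Gbit/s, as quoted above.
USB4_40G_RAW_GBIT = 40
PCIE_X4_GEN4_RAW_GBIT = 64

# The slower of the two links is the bottleneck for tunneled PCIe traffic.
bottleneck_gbit = min(USB4_40G_RAW_GBIT, PCIE_X4_GEN4_RAW_GBIT)
bottleneck_gbyte = bottleneck_gbit / 8  # straight conversion, no overhead removed

# GB/s figures quoted in this post for the ASM2464 case.
quoted = {
    "theoretical max (low estimate)": 3.9,
    "theoretical max (high estimate)": 4.0,
    "NVMe enclosure, real-world": 3.7,
}

for label, gbyte in quoted.items():
    share = gbyte / bottleneck_gbyte
    print(f"{label}: {gbyte} GB/s, {share:.0%} of the raw {bottleneck_gbit}G link")
```

The gap between the raw 5 GB/s and those figures is the encoding, tunneling and protocol overhead.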
Anything more would require newer USB4 versions with some fixes and the 80G or asymmetric 120G modes. So you can expect a further doubling of the bandwidth, and a bit more, over the next few generations.