Would be nice to see TPU2/TPU3/GPU comparison chart supplemented with transistor counts, wattage.
Those numbers are not public, but the only reason we are doing TPUs is that they offer a huge datacenter density and power advantage (in one word: cost) over GPUs for ML workloads.
Thanks, great information. TensorFlow 2.0 isn't mentioned, though. I'd like to know about the 2.0 implementation challenges, timelines, and anticipated performance differences.
Keras/TPU support is planned for TF 2.1, and TF 2.1 RC0 has just been published!
Not sure why TPUs don't support training multiple models on the same instance. This is quite inefficient for small models...
Compute cost is an important cloud metric, but what about build-your-own (GPU) system costs? A slide comparing BYO (GPU) costs with GPU/TPU cloud costs would aid decision making.
The model I trained on stage on 128 TPUv3 cores cost a little under $50. A TPUv3-128 is roughly equivalent to 150 powerful GPUs on this workload. You can look up the market price of GPUs plus interconnect. I hope this helps you make the buy-on-site vs. rent-from-cloud decision.
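As a rough sketch of that buy-vs-rent arithmetic: the $50-per-run and 150-GPU-equivalent figures come from the comment above, but the GPU and interconnect prices below are hypothetical placeholders, not quotes. Substitute current market prices before drawing conclusions.

```python
# Rough buy-vs-rent break-even sketch.
# Hardware prices below are assumed placeholders, not real quotes.

GPU_UNIT_PRICE = 8_000.0       # assumed price of one high-end GPU, USD
NUM_GPUS = 150                 # rough GPU-equivalent of a TPUv3-128 (per the comment above)
INTERCONNECT_COST = 100_000.0  # assumed cost of fast interconnect for the cluster

CLOUD_RUN_COST = 50.0          # cost of one training run on a TPUv3-128 (from the talk)

# Upfront capital cost of the on-site GPU cluster.
capex = GPU_UNIT_PRICE * NUM_GPUS + INTERCONNECT_COST

# Number of cloud training runs you could buy before on-site hardware
# pays for itself (ignoring power, cooling, staffing, and depreciation,
# all of which push the break-even point further out).
break_even_runs = capex / CLOUD_RUN_COST
print(f"Break-even after ~{break_even_runs:,.0f} cloud training runs")
```

Under these placeholder numbers the cloud only loses once you need tens of thousands of comparable training runs, which is the point the reply is making.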
Usually 5 V100s will be in one node. I don't believe that one TPUv2 is as fast as 5 V100s, not by a long shot.
8:20 watch
Nvidia Tensor Core vs Google's TPU
NVIDIA!!!
We need TPUv4!
io 2020