How to bridge the energy/bandwidth wall in DRAM-centric AI architectures