Nov 28, 2022
Speaker · 0 followers
Spatial-wise dynamic convolution has become a promising approach to improving the inference efficiency of deep networks. By allocating more computation to the most informative feature pixels, this adaptive inference paradigm alleviates the spatial redundancy in image features and avoids a considerable amount of unnecessary computation. However, the theoretical efficiency achieved by previous methods rarely translates into realistic speedup, especially on multi-core processors (e.g., GPUs). The key challenge is that existing literature has focused only on designing algorithms with minimal computation, ignoring the fact that practical latency is also influenced by scheduling strategies and hardware properties. To bridge the gap between theoretical computation and practical efficiency, we propose a latency-aware spatial-wise dynamic network (LASNet), which performs coarse-grained spatially adaptive inference under the guidance of a novel latency prediction model. This latency prediction model efficiently estimates the inference latency of dynamic networks by simultaneously considering the algorithms, the scheduling strategies, and the hardware properties. We use the latency predictor to guide both the algorithm design and the scheduling optimization on various hardware platforms. Experiments on image classification demonstrate that the proposed framework significantly improves the trade-off between the accuracy and the inference efficiency of deep networks. For example, the average latency of a ResNet-101 on the ImageNet validation set could be reduced by 23…
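To make "coarse-grained spatially adaptive inference" concrete, here is a minimal pure-Python sketch. It is an illustration only, not the authors' implementation: the helper names `coarse_mask_to_pixels` and `active_ratio`, the `granularity` parameter, and the toy mask are all assumptions. It expands a patch-level 0/1 mask to pixel resolution and reports the fraction of pixels that would be computed.

```python
from typing import List


def coarse_mask_to_pixels(patch_mask: List[List[int]], granularity: int) -> List[List[int]]:
    """Expand a patch-level 0/1 mask to pixel resolution (hypothetical helper).

    Each entry of patch_mask decides whether a whole
    granularity x granularity block of pixels is computed (1) or skipped (0).
    """
    pixel_mask: List[List[int]] = []
    for row in patch_mask:
        # Repeat every patch decision `granularity` times along the width.
        expanded_row = [value for value in row for _ in range(granularity)]
        # Repeat the expanded row `granularity` times along the height.
        for _ in range(granularity):
            pixel_mask.append(list(expanded_row))
    return pixel_mask


def active_ratio(pixel_mask: List[List[int]]) -> float:
    """Fraction of pixels marked for computation (hypothetical helper)."""
    total = sum(len(row) for row in pixel_mask)
    active = sum(sum(row) for row in pixel_mask)
    return active / total


# Toy example: a 2x2 patch mask with one active patch, granularity 2.
patch_mask = [[1, 0], [0, 0]]
pixel_mask = coarse_mask_to_pixels(patch_mask, granularity=2)
ratio = active_ratio(pixel_mask)
print(pixel_mask)
print(ratio)
```

The ratio printed here measures only theoretical computation. As the abstract argues, it does not by itself determine practical latency, which also depends on scheduling strategies and hardware properties.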