Kernel and Graph Optimization for DL Model Execution

December 13, 2019

About the presentation

There is increasing demand to deploy diverse deep learning models on edge devices. However, fully optimizing the execution of such models on resource-constrained hardware (e.g., CPU, DSP, NPU) is intrinsically challenging and often requires significant manual effort. In this talk, we introduce our Morpheus team’s efforts to address these challenges. First, we optimize the performance of DL model execution at the kernel level (e.g., a convolution operator). From a large number of possible kernel configurations (e.g., tiling, unrolling, vectorization), the fastest kernel is quickly identified by machine learning algorithms we developed, while binary code is generated automatically by the TVM or Halide compilers. Second, we further optimize the performance of DL model execution at the graph level (e.g., an end-to-end network). Since the kernels or operators in a deep learning model are connected as a graph, the compute schedule of that graph significantly affects end-to-end performance, especially memory I/O. We solve two problems in this context. First, for potentially complex topologies on edge devices with limited total memory, we solve the minimum memory usage problem, thus characterizing and enabling deployment of all feasible networks on a given device. Second, for any hardware that combines Tightly Coupled Memory (TCM) with more expensive external memory (e.g., DRAM), we solve the minimum external memory access problem, which optimizes hardware usage efficiency in I/O-bound conditions. For both problems we present efficient algorithms that are complete solutions and show improved results over heuristic methods. Finally, we will discuss our future directions for optimizing deep learning model execution.
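
To make the minimum memory usage problem concrete, here is a minimal sketch in Python. It is not the algorithm presented in the talk: it simply enumerates every valid execution order of a tiny graph and keeps the one with the lowest peak memory, which is only practical for a handful of operators. The graph, the tensor sizes, the function names, and the memory model (an output is allocated while its inputs are still live, and a tensor is freed once its last consumer has run) are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical toy graph: size of each operator's output tensor (arbitrary units).
SIZES = {"a": 4, "b": 8, "c": 2, "d": 8, "e": 1, "f": 1}
# Hypothetical edges: each producer maps to the operators that consume its output.
CONSUMERS = {"a": ["b", "d"], "b": ["c"], "c": ["f"],
             "d": ["e"], "e": ["f"], "f": []}


def is_topological(order, consumers):
    """Return True if every producer appears before all of its consumers."""
    pos = {node: i for i, node in enumerate(order)}
    return all(pos[p] < pos[c] for p, cs in consumers.items() for c in cs)


def peak_memory(order, sizes, consumers):
    """Peak live memory for one execution order under the assumed model."""
    remaining = {node: len(cs) for node, cs in consumers.items()}
    producers = {node: [p for p, cs in consumers.items() if node in cs]
                 for node in consumers}
    live = 0
    peak = 0
    for node in order:
        live += sizes[node]          # allocate the output; inputs are still live
        peak = max(peak, live)
        for p in producers[node]:    # free inputs whose last consumer just ran
            remaining[p] -= 1
            if remaining[p] == 0:
                live -= sizes[p]
    return peak


def min_peak_schedule(sizes, consumers):
    """Brute-force search over all topological orders (exponential cost)."""
    best = None
    for order in permutations(sizes):
        if is_topological(order, consumers):
            peak = peak_memory(order, sizes, consumers)
            if best is None or peak < best[0]:
                best = (peak, order)
    return best


if __name__ == "__main__":
    print(min_peak_schedule(SIZES, CONSUMERS))
```

On this toy graph the two branches after "a" each produce a large intermediate tensor. Finishing one branch before starting the other keeps only one of them live at a time, while interleaving the branches holds both. This shows, under the assumed model, why the compute schedule alone can change the memory footprint of the same network.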

About the organizer (NIPS 2019)

Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.
