Nov 28, 2022
We study the problem of training deep networks while enforcing quantization and precision constraints on their parameters, a setting which can reduce the energy consumption and inference time of deployed models. Unlike previous works, we propose a method that assigns different precisions (numbers of bits) to the weights in a neural network, yielding a heterogeneous allocation of bits across parameters. Our method is derived from a novel framework in which the intractability of optimizing discrete precisions is approximated by training per-parameter noise magnitudes. Empirical evaluations show that our approach is capable of finding highly heterogeneous precision assignments for CNNs trained on CIFAR and ImageNet, improving upon the previous state of the art and offering a theoretical foundation for the design of new quantization methods.
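The abstract's core idea can be illustrated with a toy sketch: a uniform quantizer's rounding error is bounded by half a step, so additive noise with a trainable magnitude can stand in for discrete quantization during training, and the learned magnitude maps back to a bit-width. This is only a minimal illustration of that correspondence, not the paper's actual algorithm; the function names (`quantize`, `noise_surrogate`, `bits_from_sigma`) and the symmetric-range assumption are ours.

```python
import math
import random

def quantize(w, bits, w_max=1.0):
    # Toy symmetric uniform quantizer on [-w_max, w_max] with 2**bits - 1 steps;
    # the result is clamped to the representable range.
    step = 2 * w_max / (2 ** bits - 1)
    q = round(w / step) * step
    return max(-w_max, min(w_max, q))

def noise_surrogate(w, sigma, rng):
    # Continuous surrogate for quantization: additive uniform noise whose
    # magnitude sigma plays the role of half a quantization step.
    return w + rng.uniform(-sigma, sigma)

def bits_from_sigma(sigma, w_max=1.0):
    # Recover an (illustrative) bit-width from a noise magnitude via
    # sigma = step / 2 = w_max / (2**bits - 1)  =>  bits = log2(w_max / sigma + 1).
    return math.log2(w_max / sigma + 1)

rng = random.Random(0)
print(bits_from_sigma(1.0))  # → 1.0 (a noise magnitude as large as w_max ~ 1 bit)
```

Under this toy mapping, training a per-parameter sigma and reading off `bits_from_sigma(sigma)` afterwards yields a different precision for each weight, which is the heterogeneous allocation the abstract describes.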