Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study

Dec 5, 2023

About

ML-powered code generation aims to help developers write code more productively by intelligently generating code blocks from natural language prompts. Recently, large pretrained deep learning models have pushed the boundary of code generation and achieved impressive performance. However, the huge number of model parameters poses a significant challenge to their adoption in a typical software development environment, where a developer might use a standard laptop or mid-size server to develop code. Such large models cost significant resources in terms of memory, latency, dollars, and carbon footprint. Model compression is a promising approach to address these challenges. Among the many compression techniques, we have identified quantization as the most applicable to the code generation task, as it does not require significant retraining cost. Because quantization represents model parameters with lower-bit integers (e.g., int8), both model size and runtime latency benefit. We empirically evaluate quantized models on code generation tasks across different dimensions: (i) resource usage and carbon footprint, (ii) accuracy, and (iii) robustness. Through systematic experiments we find a code-aware quantization recipe that can run even a 6-billion-parameter model on a regular laptop without significant accuracy or robustness degradation. We find that the recipe is readily applicable to the code summarization task as well.
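To make the idea of post-training quantization concrete, the sketch below shows one common way to quantize a code generation model's linear layers to int8 using PyTorch's dynamic quantization API. This is an illustrative example, not the authors' exact recipe; the model checkpoint name is only an example, and details such as which layers to quantize are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint for illustration only; the talk's recipe may use a
# different model and quantization configuration.
model_name = "Salesforce/codegen-350M-mono"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Post-training dynamic quantization: Linear layer weights are stored as
# int8, while activations are quantized on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Generate code from a natural language prompt with the quantized model.
prompt = "# Return the factorial of n\ndef factorial(n):"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = quantized_model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the int8 weights occupy roughly a quarter of the memory of float32 weights, this kind of quantization shrinks the model footprint substantially, which is what makes running multi-billion-parameter models on a laptop plausible.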
