Developing a Model of Loop Actions by Mining Loop Characteristics from a Large Code Corpus

Author : Wang, Xiaoran; Pollock, Lori; Vijay-Shanker, K.
Booktitle : International Conference on Software Maintenance and Evolution (ICSME)
Date : Sep 2015
Publisher : IEEE
Keyword(s) : abstraction, mining code patterns, documentation generation
Document Type : In Conference Proceedings

Abstract :

Some high-level algorithmic steps require more than one statement to implement, but are not large enough to be a method on their own. Specifically, many algorithmic steps (e.g., count, compare pairs of elements, find the maximum) are implemented as loop structures, which lack a higher-level abstraction of the action being performed and can negatively affect both human readers and automatic tools. Additionally, in a study of 14,317 projects, we found that fewer than 20% of loops are documented to help readers. In this paper, we present a novel automatic approach to identify the high-level action implemented by a given loop. We leverage the large available body of high-quality open-source projects to mine loop characteristics and develop an action identification model. We use the model and feature vectors extracted from loop code to automatically identify the high-level actions implemented by loops. We have evaluated the accuracy of the loop action identification and the coverage of the model over 7,159 open-source programs. The results show great promise for this approach to automatically insert internal comments and provide additional higher-level naming for loop actions to be used by tools such as code search.
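
To make the idea of mapping a loop's feature vector to a high-level action concrete, here is a minimal sketch. It is not the model from the paper: the four features, the hand-written lookup table, and the names `loop_features`, `identify_action`, and `ACTION_TABLE` are all assumptions chosen for illustration, and the sketch analyzes Python loops with the standard `ast` module only for convenience. The paper instead mines loop characteristics from a large corpus to build its model.

```python
import ast

# Hypothetical, hand-written table standing in for a mined model.
# Key: (has_if, has_increment, has_ordering_compare, has_plain_assign)
ACTION_TABLE = {
    (True, True, False, False): "count",
    (True, False, True, True): "find maximum or minimum",
}

ORDERING_OPS = (ast.Lt, ast.LtE, ast.Gt, ast.GtE)


def loop_features(source):
    """Return a toy feature vector for the first for-loop in `source`.

    Assumes `source` is valid Python containing at least one for-loop.
    """
    tree = ast.parse(source)
    loop = next(n for n in ast.walk(tree) if isinstance(n, ast.For))
    nodes = list(ast.walk(loop))  # the loop node and all of its descendants
    has_if = any(isinstance(n, ast.If) for n in nodes)
    has_increment = any(
        isinstance(n, ast.AugAssign) and isinstance(n.op, ast.Add) for n in nodes
    )
    has_ordering_compare = any(
        isinstance(n, ast.Compare)
        and any(isinstance(op, ORDERING_OPS) for op in n.ops)
        for n in nodes
    )
    has_plain_assign = any(isinstance(n, ast.Assign) for n in nodes)
    return (has_if, has_increment, has_ordering_compare, has_plain_assign)


def identify_action(source):
    """Look up the toy feature vector; fall back to 'unknown'."""
    return ACTION_TABLE.get(loop_features(source), "unknown")


COUNT_LOOP = """
for x in xs:
    if x == 0:
        n += 1
"""

MAX_LOOP = """
for x in xs:
    if x > best:
        best = x
"""

if __name__ == "__main__":
    print(identify_action(COUNT_LOOP))
    print(identify_action(MAX_LOOP))
```

In this sketch the counting loop maps to "count" and the comparison loop maps to "find maximum or minimum", while any loop whose feature vector is not in the table is reported as "unknown". A model mined from real code would replace the hand-written table with associations learned from many loops.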
