Exploring Action Unit Granularity Of Source Code For Supporting Software Maintenance

Authors: Xiaoran Wang
Booktitle: University of Delaware, PhD Thesis
Date: Spring, 2017
Publisher: University of Delaware

Abstract:

Because resources for today’s software are used primarily for maintenance and evolution, researchers are striving to make software engineers more efficient through automation. Programmers now use integrated development environments (IDEs), debuggers, and tools for code search, testing, and program understanding to reduce tedious, error-prone tasks. A key component of these tools is analyzing source code and gathering information for software developers. Most analyses treat a method as a set of individual statements or a bag of words, and so do not leverage information at levels of abstraction between the individual statement and the whole method. However, a method normally contains multiple high-level steps that together achieve a certain function or execute an algorithm, and each step is expressed by a sequence of statements rather than a single statement. In this dissertation, I have explored the feasibility of automatically identifying these high-level actions toward improving software maintenance tools and program understanding.

Specifically, methods can often be viewed as a sequence of blocks that correspond to high-level actions. We define an action unit as a code block that consists of a sequence of consecutive statements that logically implement a high-level action. Unlike the lower-level actions represented by individual statements, an action unit represents a higher-level action, for example, “initializing a collection” or “setting up a GUI component”. Action units are intermediary steps of an algorithm or sub-actions of a bigger and more general action. In this dissertation, I (1) introduce the notion of action units and define the kinds of action units, (2) develop techniques to automatically identify actions for loop-based action units, (3) automatically generate natural language descriptions for object-related action units, and (4) automatically insert blank lines into methods based on action units to improve source code readability.
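
To make the notion of an action unit concrete, the sketch below shows a small method whose body can be read as three action units, separated by blank lines. It is an illustration only and is not taken from the dissertation: the function `collect_long_words`, its parameters, and the comments naming each unit are hypothetical, and the boundaries were chosen by hand rather than by any of the automated techniques the abstract describes.

```python
def collect_long_words(text, min_length):
    """Hypothetical method whose body splits into three action units."""
    # Action unit 1: initialize the collections used below.
    words = text.split()
    seen = set()
    result = []

    # Action unit 2 (loop-based): collect each distinct word that is
    # at least min_length characters long, ignoring case and punctuation.
    for word in words:
        cleaned = word.strip(".,;:!?").lower()
        if len(cleaned) >= min_length and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)

    # Action unit 3: order the collected words and hand them back.
    result.sort()
    return result


if __name__ == "__main__":
    sample = "Action units group statements; units describe actions."
    print(collect_long_words(sample, 6))
```

Each commented block is a run of consecutive statements that together carry out one higher-level step, which is the granularity between a single statement and the whole method that the abstract refers to. The blank lines between the blocks mirror the readability use case in item (4).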
