Analysing Source Code: Looking for Useful Verb-Direct Object Pairs in all the Right Places

Author : Fry, Zachary P.; Shepherd, David; Hill, Emily; Pollock, Lori; Vijay-Shanker, K.
Date : Feb 2008
Journal : IET Software Special Issue on Natural Language in Software Development
Volume : 2
Issue : 1
Pages : 27–36
Document Type : Article

Abstract :

The large time and effort devoted to software maintenance can be reduced by providing software engineers with software tools that automate tedious, error-prone tasks. However, despite the prevalence of tools such as IDEs, which automatically provide program information and automated support to the developer, there is considerable room for improvement in the existing software tools. The authors’ previous work has demonstrated that using natural language information embedded in a program can significantly improve the effectiveness of various software maintenance tools. In particular, precise verb information from source code analysis is useful in improving tools for comprehension, maintenance and evolution of object-oriented code, by aiding in the discovery of scattered, action-oriented concerns. However, the precision of the extraction analysis can greatly affect the utility of the natural language information. The approach to automatically extracting precise natural language clues from source code in the form of verb–direct object (DO) pairs is described. The extraction process, the set of extraction rules and an empirical evaluation of the effectiveness of the automatic verb–DO pair extractor for Java source code are described.
