Learning from Examples to Find Fully Qualified Names of API Elements in Code Snippets

C M Khaled Saifullah, Muhammad Asaduzzaman, Chanchal K. Roy


Abstract
Developers often reuse code snippets from online forums, such as Stack Overflow, to learn API usages of software frameworks or libraries. These code snippets often contain ambiguous undeclared external references. Such external references make it difficult to learn and use those APIs correctly. In particular, reusing code snippets containing such ambiguous undeclared external references requires significant manual efforts and expertise to resolve them. Manually resolving fully qualified names (FQN) of API elements is a non-trivial task. In this paper, we propose a novel context-sensitive technique, called COSTER, to resolve FQNs of API elements in such code snippets. The proposed technique collects locally specific source code elements as well as globally related tokens as the context of FQNs, calculates likelihood scores, and builds an occurrence likelihood dictionary (OLD). Given an API element as a query, COSTER captures the context of the query API element, matches that with the FQNs of API elements stored in the OLD, and rank those matched FQNs leveraging three different scores: likelihood, context similarity, and name similarity scores. Evaluation with more than 600K code examples collected from GitHub and two different Stack Overflow datasets shows that our proposed technique improves precision by 4-6% and recall by 3-22% compared to state-of-the-art techniques. The proposed technique significantly reduces the training time compared to the StatType, a state-of-the-art technique, without sacrificing accuracy. Extensive analyses on results demonstrate the robustness of the proposed technique.
Cite:
C M Khaled Saifullah, Muhammad Asaduzzaman, and Chanchal K. Roy. 2019. Learning from Examples to Find Fully Qualified Names of API Elements in Code Snippets. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
Copy Citation: