Automatically Generating Release Notes with Content Classification Models

Sristy Sumana Nath, Banani Roy


Abstract
Release notes are recognized as an essential technical document in software maintenance. They summarize the main changes, such as bug fixes and new features, made to the software since the previous release. Producing release notes manually is time-consuming and challenging, so developers sometimes neglect to write them. For example, in data we collected from GitHub covering over 1,900 releases, 37% of the release notes are empty. To mitigate this problem, we propose an automatic release note generation approach based on a text summarization technique, TextRank. To improve the keyword extraction method of traditional TextRank, we integrate the GloVe word embedding technique with TextRank. After generating release notes automatically, we apply machine learning algorithms to classify the release note contents (sentences) into six categories, such as bug fixes and performance improvements, so that the release notes are better presented to users. We evaluate the automatically generated release notes with the ROUGE metric. We also compare the performance of our technique with two popular extractive algorithms, namely Luhn's algorithm and latent semantic analysis (LSA). Our evaluation results show that the improved TextRank method outperforms both algorithms.
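To illustrate the kind of extractive ranking that TextRank performs, the following is a minimal sketch and not the authors' implementation. It assumes the word-overlap sentence similarity of the original TextRank formulation rather than the GloVe-based improvement described above, and the function names, parameters, and sample commit messages are all hypothetical.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Word-overlap similarity (an assumption; the paper uses GloVe embeddings)."""
    wa, wb = set(tokenize(a)), set(tokenize(b))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical commit messages standing in for the input to a release note.
commits = [
    "Fix crash when parsing empty config file",
    "Fix crash in config file loader on Windows",
    "Update README badges",
    "Add retry logic to config file loader",
]
print(summarize(commits, k=2))
```

In this sketch, a sentence that shares no words with any other receives only the base score of `1 - damping`, so isolated messages are ranked last. Replacing `similarity` with a cosine similarity over averaged word vectors would be one way to bring embeddings into the same ranking loop.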
Cite:
Sristy Sumana Nath and Banani Roy. 2021. Automatically Generating Release Notes with Content Classification Models. International Journal of Software Engineering and Knowledge Engineering, 31(11n12):1721–1740.