Master's thesis
CheckMerge: A System for Risk Assessment of Code Merges

Thursday April 26, 2018

Research into indicating whether a code merge could result in broken code.

The thesis was graded 9/10 by the University of Twente.

Abstract

When working on large software projects using version control systems, merges are not always trivial to execute. Even when no git-like merge conflicts exist, the resulting program is not necessarily correct. In this study, a number of categories of changes that may cause issues during merges are identified. Two new language-independent algorithms have been developed to detect changes from three of these categories. These algorithms operate on the abstract syntax trees (ASTs) of the compared program versions and require the differences between these versions to be calculated beforehand. A prototype system has been designed and implemented for the C programming language. The newly developed algorithms perform well in detecting the problematic changes, although one of them does so at the cost of false positives. The prototype demonstrates the feasibility of such a system but is not suitable for production use. All in all, the analysis of source code merges is a promising area of research, and with some effort a tool for practical code merge analysis could be produced.
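To make the central claim concrete, here is a minimal sketch of a merge that has no textual conflict but still produces broken code. It is not one of the thesis's algorithms, and it uses Python's standard `ast` module for brevity, whereas the prototype targets C. The example sources, the `MERGED` text (assumed to be what a line-based three-way merge would produce, since the two edits touch separate regions) and the helper `undefined_calls` are all hypothetical.

```python
import ast
import builtins

# Common ancestor of both branches.
BASE = """\
def area(w, h):
    return w * h


def perimeter(w, h):
    return 2 * (w + h)
"""

# Branch A renames area() to rect_area(); only the first line changes.
OURS = BASE.replace("def area(", "def rect_area(")

# Branch B appends a new function that still calls the old name.
NEW_FUNCTION = """

def double_area(w, h):
    return 2 * area(w, h)
"""
THEIRS = BASE + NEW_FUNCTION

# Assumed merge result: both edits applied, no textual conflict.
MERGED = OURS + NEW_FUNCTION


def undefined_calls(source):
    """Return names called as plain functions but never defined in source."""
    tree = ast.parse(source)
    defined = {
        node.name for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }
    called = {
        node.func.id for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    # Ignore built-ins such as print() or len().
    return sorted(called - defined - set(dir(builtins)))


if __name__ == "__main__":
    for label, source in (("ours", OURS), ("theirs", THEIRS), ("merged", MERGED)):
        print(label, undefined_calls(source))
```

Each branch is consistent on its own, but the merged version calls `area`, which no longer exists. This very simple check only looks at direct calls to top-level names; it is meant to illustrate the kind of problem an AST-based analysis can surface, not to stand in for the detection algorithms described in the thesis.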

Research questions

  • Which kinds of changes are likely to cause problems in code merges?
  • Which algorithms are best suited to compare code versions to find changes?
  • Which techniques exist to detect these merge problems from the changes between versions?
