New machine learning system Prophet can fix up to 10 times as many code errors


A new machine learning system developed by researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) can fix up to ten times as many errors in bug-riddled code as other machine learning systems currently in use.

A paper titled “Automatic Patch Generation by Learning Correct Code,” by Fan Long and Martin Rinard, presents Prophet, a new patch generation system that learns to fix code from a set of successful patches obtained from open source software repositories.

“One of the most intriguing aspects of this research is that we’ve found that there are indeed universal properties of correct code that you can learn from one set of applications and apply to another set of applications,” said paper co-author and professor of electrical engineering and computer science Martin Rinard. “If you can recognize correct code, that has enormous implications across all software engineering. This is just the first application of what we hope will be a brand-new, fabulous technique,” he added.

The research, supported by the Defense Advanced Research Projects Agency (DARPA), showed that Prophet can fix more bugs by analyzing old patches, which helps the new patch generation system learn from previous mistakes and improve its abilities.

“A key challenge for Prophet is to identify, learn, and exploit universal properties of correct code. Many surface syntactic elements of the correct patches in the Prophet training set (such as variable names and types) tie the patches to their specific applications and prevent the patches from directly generalizing to other applications,” according to the paper.
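To make the point about surface syntax concrete, here is a minimal sketch of how two patches from different applications can share one shape once their variable names are abstracted away. It is an illustration only, not Prophet's actual feature extraction: the function `abstract_shape`, the regex tokenizer, and both example patch lines are hypothetical.

```python
import re

# Hypothetical illustration: this is NOT how Prophet computes its features.
# It only shows why raw identifiers block generalization across applications.
KEYWORDS = {"if", "else", "return", "while", "for", "NULL"}
IDENT = re.compile(r"[A-Za-z_]\w*")
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\S")


def abstract_shape(patch_line: str) -> str:
    """Replace application-specific identifiers with positional placeholders."""
    names = {}
    out = []
    for tok in TOKEN.findall(patch_line):
        if IDENT.fullmatch(tok) and tok not in KEYWORDS:
            # The first distinct identifier becomes v0, the next v1, and so on.
            names.setdefault(tok, f"v{len(names)}")
            out.append(names[tok])
        else:
            out.append(tok)
    return " ".join(out)


# Two made-up one-line patches from two different (hypothetical) code bases.
patch_a = "if (zval_ptr == NULL) return;"
patch_b = "if (tif_dir == NULL) return;"

print(abstract_shape(patch_a))
print(abstract_shape(patch_b))
```

Both lines reduce to the same abstract form, a null check followed by an early return, even though the original patches share no variable names.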

In an experiment, the researchers tested Prophet on 69 real-world errors selected from eight open-source applications. On this benchmark set, Prophet fixed more errors than previous patch generation systems.

“We evaluate Prophet on 69 real world defects and 36 functionality changes in eight large open source applications: libtiff, lighttpd, the PHP interpreter, gmp, gzip, python, wireshark, and fbc. This is the same benchmark set used to evaluate SPR [18], Kali [27], GenProg [15], and AE [35]. For each defect, the benchmark set contains a test suite with positive test cases for which the unpatched program produces correct outputs and at least one negative test case for which the unpatched program produces incorrect output,” according to Rinard and Long.
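The benchmark structure described in the quote can be sketched in a few lines. This is a toy example under stated assumptions, not the benchmark's real harness: the names `passes` and `validates`, the two toy programs, and the test data are all hypothetical. A candidate patch counts as validated here only if it keeps the positive tests passing and also fixes the negative one.

```python
from typing import Callable, Iterable, Tuple

# A test case is an (input, expected_output) pair. All names are hypothetical.
TestCase = Tuple[int, int]
Program = Callable[[int], int]


def passes(program: Program, tests: Iterable[TestCase]) -> bool:
    """Return True if the program produces the expected output on every test."""
    return all(program(arg) == expected for arg, expected in tests)


def validates(candidate: Program,
              positive: Iterable[TestCase],
              negative: Iterable[TestCase]) -> bool:
    """A candidate patch must preserve positive tests and fix negative ones."""
    return passes(candidate, positive) and passes(candidate, negative)


# Toy defect: the function should clamp negative inputs to zero.
def unpatched(x: int) -> int:
    return x  # bug: negative inputs are returned unchanged


def patched(x: int) -> int:
    return x if x >= 0 else 0  # candidate patch adds the missing check


positive_tests = [(1, 1), (5, 5)]  # unpatched program is already correct here
negative_tests = [(-3, 0)]         # unpatched program produces incorrect output

print(passes(unpatched, positive_tests))
print(passes(unpatched, negative_tests))
print(validates(patched, positive_tests, negative_tests))
```

Passing the test suite in this sense does not prove a patch is correct, only that it agrees with the tests, which is why how a system chooses among the candidates that pass matters.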
