Learning to have near-perfect software means asking:
- Which parts of the code being merged into the main branch should receive additional review?
- What is the root cause of a bug being filed?
- How can bug triaging and assignment of a bug (to the right developer) be automated?
- How can test cases be driven by risk, not coverage? Which test cases should be prioritized because of their impact?
- What are the most used and least used features in my software?
- What are the most liked/least liked features in my software?
- Where are the anomalous behaviors happening in the customer experience?
Rooting out bugs must match the speed of software deliveries, which can be daily or even every minute. It also requires test paradigms that fit web-scale deployments and architectural complexity. To get to near-perfect code, companies are thoughtfully applying AI to help answer the questions above and make a difference in the customer experience:
- Root Cause Analytics - During bug resolution, identifying the root cause and assigning the bug are challenging tasks that require manual effort, and a bug is often tossed between developers several times before it is correctly assigned. Root Cause Analytics automates bug analysis and bug assignment using ML techniques trained on historical data.
- Code Defect Detection Analytics - Shift-left testing so that issues can be caught earlier, during development. The code review process is error prone because it depends on the skill level of the reviewers; this approach addresses that by providing a statistical prediction of the quality of the modified code.
- Dynamic Risk Based Testing - Once a project implements risk-based testing (RBT), project risk changes over time as software updates are made and bugs are found during testing and in the field. Dynamic RBT eliminates the manual activities required to respond to these ever-changing risk scenarios.
- Test Chatbot - Chatbots extend chat tools, such as Slack, that testers use to collaborate. A chatbot integrated with the chat tool provides a communication channel for large-group collaboration and handles both user-initiated and system-initiated distribution of information such as hyperlinks, screen captures, and status updates.
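As a rough illustration of the Dynamic Risk Based Testing item above, the sketch below recomputes a risk score for each test whenever new change and bug data arrives, then reorders the suite. The scoring formula (likelihood times impact), the field names, and the weights are assumptions made for illustration; they are not taken from any specific RBT tool.

```python
from dataclasses import dataclass


@dataclass
class SuiteEntry:
    name: str
    component: str
    impact: int  # business impact of a failure, 1 (low) to 5 (high); assumed scale


def likelihood(changes: int, bugs: int) -> float:
    """Hypothetical failure likelihood in [0, 1] from recent activity.

    Bugs are weighted twice as heavily as code changes; the weights and
    the cap at 10 activity points are illustrative assumptions.
    """
    return min(1.0, (changes + 2 * bugs) / 10)


def prioritize(entries, recent_changes, recent_bugs):
    """Order tests by risk = likelihood * impact, highest risk first."""
    def risk(entry):
        changes = recent_changes.get(entry.component, 0)
        bugs = recent_bugs.get(entry.component, 0)
        return likelihood(changes, bugs) * entry.impact

    return sorted(entries, key=risk, reverse=True)


suite = [
    SuiteEntry("test_login", "auth", impact=5),
    SuiteEntry("test_report_export", "reports", impact=2),
    SuiteEntry("test_checkout", "payments", impact=5),
]

# Risk picture after one release: the payments component changed heavily.
order_before = prioritize(suite, {"payments": 6, "auth": 1}, {"payments": 1})
# New field bugs arrive against auth, so the ranking shifts without manual rework.
order_after = prioritize(suite, {"payments": 6, "auth": 1}, {"payments": 1, "auth": 5})

print([entry.name for entry in order_before])
print([entry.name for entry in order_after])
```

With this made-up data, the checkout test runs first until the field bugs against the auth component arrive, after which the login test moves to the top.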
Digging deeper into code defect detection analytics
Coders usually use static code analyzers as part of active software development and maintenance. Static code analyzers are rule-based engines that are good at detecting coding errors such as null pointer dereferences, memory leaks, SQL injections, and hard-coded passwords. After running all tests and before merging the code to the main branch, coders ask the code reviewers for a “Looks Good to Me” approval to proceed. Code review is an error-prone process because it depends on the skill of the reviewers and on other factors such as time constraints and work pressure.
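To make "rule based" concrete, here is a toy check in the spirit of a static analyzer rule for hard-coded passwords. The regular expression and the function name `find_hardcoded_passwords` are illustrative assumptions; real analyzers work on parsed code rather than raw text and apply far richer rules.

```python
import re

# Hypothetical rule: a name containing "password", "passwd" or "pwd"
# that is assigned a non-empty string literal.
HARDCODED_PASSWORD = re.compile(
    r"""\b\w*(password|passwd|pwd)\w*\s*=\s*(['"])[^'"]+\2""",
    re.IGNORECASE,
)


def find_hardcoded_passwords(source: str):
    """Return (line_number, line) pairs that match the rule."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if HARDCODED_PASSWORD.search(line):
            findings.append((number, line.strip()))
    return findings


sample = '''
db_password = "hunter2"
password = os.environ["DB_PASSWORD"]
timeout = 30
'''

for number, line in find_hardcoded_passwords(sample):
    print(f"line {number}: possible hard-coded password: {line}")
```

A rule like this only flags the patterns it was written for, which is the gap that reviewers, and the prediction approach discussed in this section, are meant to cover.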
Machine learning based code defect detection analytics fills the gap between static code analyzers and code reviewers. It uses the project’s historical data to predict whether “Something Smells Funny” in the new code. It is an early warning system that helps catch issues before they escape to the field, thereby reducing the cost of fixing them and improving code quality, especially in historically problematic areas.
We recently investigated how effectively such an ML-based system can help code reviewers pay more attention to code changes that are predicted to be defect prone. The exploration was carried out on the Eclipse open source project. One of the main challenges was generating a labelled dataset for model training: the version control system (git) does not contain bug information, and the bug tracking system (Bugzilla) does not contain code check-in information. Custom scripts were used to identify the bug fixes from the git log messages of the initial training data set. 50% of the effort was spent on data preparation. 21 features were identified as significant, and four machine learning algorithms (Naïve Bayes, Maximum Entropy, Decision Trees, and Support Vector Machines) were evaluated for precision and recall. The Support Vector Machine (SVM) outperformed Naïve Bayes marginally. This technique is applicable to medium to large software projects where historical data is available for training ML models. Based on our investigations, we’d recommend using SVM for implementing machine learning based code defect detection analytics.
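The labelling step and the evaluation metrics described above can be sketched roughly as follows. These are not the scripts used in the investigation: the keyword pattern, the sample commit messages, and the predictions are made up for illustration, and a real pipeline would read messages from `git log` and cross-check them against Bugzilla IDs. For simplicity the sketch treats the bug-fix flag itself as the label; how fixes are traced back to the defective check-ins is left out.

```python
import re

# Assumed heuristic: a commit is a bug fix if its message mentions a fix
# keyword or a bug reference such as "Bug 12345".
BUG_FIX_PATTERN = re.compile(r"\b(fix(es|ed)?|bug\s*#?\d+|defect)\b", re.IGNORECASE)


def is_bug_fix(message: str) -> bool:
    return bool(BUG_FIX_PATTERN.search(message))


def precision_recall(actual, predicted):
    """Precision and recall for the positive class."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p)
    false_pos = sum(1 for a, p in zip(actual, predicted) if not a and p)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up commit messages standing in for `git log` output.
messages = [
    "Bug 402311 - NPE when opening editor",
    "Add dark theme preference page",
    "Fixed memory leak in resource listener",
    "Update copyright headers",
    "Refactor parser; fixes wrong offset on CRLF files",
]
labels = [is_bug_fix(m) for m in messages]

# Hypothetical classifier output for the same five commits.
predicted = [True, True, False, False, True]

precision, recall = precision_recall(labels, predicted)
print(labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision says how many flagged commits were truly positive, and recall says how many positive commits were flagged. Comparing classifiers on both, as the evaluation above does, guards against a model that looks good only because it flags almost everything or almost nothing.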
What's next? Focus on What Matters Most
1. High delivery velocity and software complexity. An oft-quoted metric on accelerated software delivery and scale of deployment comes from Amazon, which in May 2011 reported a mean time between deployments to production systems of 11.6 seconds, with a maximum of 1,079 deployments in an hour. The deployments affected up to 30,000 hosts, with 10,000 hosts on average. What is less often highlighted is that one out of 100,000 deployments caused an outage. Though Amazon has an elegant strategy for dealing with outage-causing deployments, at 11.6 seconds per deployment this still amounts to a software delivery causing an outage roughly every 322 hours (about 13 days) on average. Netflix, one of the prominent proponents of microservices-driven architecture, has 500+ microservices deployed over tens of thousands of VMs. With time, applications will deploy an increasing number of microservices that are loosely coupled through an API gateway. The delivery scale and speed will overwhelm what humans can handle when diagnosing and correcting software malfunctions.
2. Legacy code and dwindling support. The evolution of software-defined networks and hardware obsolescence is slowly turning a large amount of telecommunication software into legacy code. Legacy telecom software, in spite of its obsolescence, provides a competitive differentiator through its historical domain knowledge and continues to bring in revenue. However, dwindling revenues mean that the project team size is dictated primarily by commercial considerations rather than by the healthy skill distribution needed to maintain the software well. In such a scenario, maintaining and upgrading legacy telecommunication systems is a challenging task. Despite a large automated regression suite, bugs are leaking to the field. Product managers are struggling to keep the functionality in one piece while dealing with bug fixes and feature upgrades.
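The outage interval discussed in the first point is simple arithmetic: the mean time between deployments multiplied by the number of deployments per outage. A small sketch, assuming deployments are evenly spaced at the 11.6 second mean (the function name is hypothetical), shows how strongly the result depends on the assumed failure rate:

```python
from datetime import timedelta

# Mean time between deployments quoted in the text.
MEAN_BETWEEN_DEPLOYMENTS = timedelta(seconds=11.6)


def mean_time_between_outages(deployments_per_outage: int) -> timedelta:
    """Expected time between outages if one in N deployments causes one."""
    return MEAN_BETWEEN_DEPLOYMENTS * deployments_per_outage


for n in (10_000, 100_000):
    interval = mean_time_between_outages(n)
    hours = interval.total_seconds() / 3600
    print(f"1 in {n:,} deployments: {interval} (about {hours:.1f} hours)")
```

At one outage per 10,000 deployments the interval is about 32 hours; at one per 100,000 it is about 322 hours, or roughly 13 days. Either way, the failure rate per deployment looks tiny while the absolute number of outages stays noticeable at this delivery speed.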