Can we use Chat-Bots for Software Quality Assurance?

Since OpenAI released ChatGPT, speculation about upcoming advances in artificial intelligence has been everywhere. In this article, I want to discuss whether and how we can use chatbots for software QA and test automation, using ChatGPT as the example. The tool is designed to give detailed answers to questions, drawing on multiple sources, and to summarize complex topics.

Before discussing how we can use this technology for software QA, I want to explain how ChatGPT works. I am not an expert in the field, so I had to ask someone who knows exactly what’s going on behind the scenes. I asked the only real expert in the field that I know, and that is (you guessed it) ChatGPT itself.

When I asked how ChatGPT works, I got a response that was a bit too long, so I asked it to sum it up in two sentences. Here’s the response:

“ChatGPT is a transformer-based language generation model trained on a large corpus of text data to generate human-like responses to text inputs. It predicts the next word in a sequence by generating a probability distribution over the vocabulary, given the context of the previous words.”
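To make the second sentence of that answer more concrete, here is a minimal sketch of what "a probability distribution over the vocabulary" can look like. The three-word vocabulary and the scores are made up for illustration, and the softmax function shown is a common way to turn scores into probabilities; none of this is taken from ChatGPT's actual implementation.

```csharp
using System;
using System.Linq;

public static class NextWordSketch
{
    // Turns raw scores (logits) into probabilities that sum to 1.
    public static double[] Softmax(double[] logits)
    {
        // Subtracting the maximum keeps Math.Exp numerically stable.
        double max = logits.Max();
        double[] exps = logits.Select(x => Math.Exp(x - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    public static void Main()
    {
        // Hypothetical toy vocabulary and invented scores for the context "1 + 2 =".
        string[] vocabulary = { "3", "4", "banana" };
        double[] logits = { 4.0, 1.5, -2.0 };

        double[] probabilities = Softmax(logits);

        // Print the probability assigned to each candidate word.
        for (int i = 0; i < vocabulary.Length; i++)
        {
            Console.WriteLine($"{vocabulary[i]}: {probabilities[i]:F3}");
        }

        // Greedy choice: take the word with the highest probability.
        int best = Array.IndexOf(probabilities, probabilities.Max());
        Console.WriteLine($"Predicted next word: {vocabulary[best]}");
    }
}
```

A real model computes the scores from the preceding words with a neural network and works with a vocabulary of many thousands of tokens, but the final step has the same shape: scores in, probabilities out, one word chosen.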

The explicit focus on language is good news for test automation, because test scripts are written in a programming language, which is, after all, a kind of language. I therefore expected that ChatGPT would be able to generate proper test scripts, and I decided to test this myself. To get a first impression of how well ChatGPT understands human language, I gave it a clear instruction, or at least one that would have been clear to a human reader. Here’s my little conversation with ChatGPT:

Me: “ChatGPT, please write a C# xUnit script that tests ChatGPT’s ability to calculate on a first-grade level. Test case: 1 + 2 = 3.”

ChatGPT: “Here's an example of a simple xUnit test case in C# to test ChatGPT's ability to perform first-grade level math:

    using Xunit;

    namespace ChatGPTTests
    {
        public class FirstGradeMathTests
        {
            [Fact]
            public void OnePlusTwoEqualsThree()
            {
                // The sum is computed by the test itself, not by ChatGPT.
                int result = 1 + 2;
                Assert.Equal(3, result);
            }
        }
    }
”

Me: “This does not test ChatGPT's calculation skills. If you have some trouble with mathematical problems, I won't judge you. So please give me a proper test script.”

ChatGPT: “I apologize for the confusion earlier. Here's a corrected example that tests ChatGPT's ability to perform arithmetic operations, rather than relying on hardcoded values:…”

…followed by a script that tests different mathematical operations, but NOT ChatGPT at all…

Me: “OK, next try... Please write a C# xUnit script that navigates to ChatGPT, asks it to calculate 1+2 and checks if the result is 3.”

ChatGPT: “Here's an example of how you might write a C# xUnit test to interact with ChatGPT and check if it returns the correct result for the calculation:…”

The code that followed goes in the right direction. It uses a placeholder for an API endpoint that would return a response, and ChatGPT explains that this placeholder has to be replaced by the “actual API endpoint”.
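ChatGPT's actual script is not reproduced here, but a test of this kind could look roughly like the following sketch. Everything in it is an assumption made for illustration: the endpoint URL is a placeholder, the plain-text request format is invented, and the `CannedAnswerHandler` class is a hypothetical stand-in that returns a fixed answer so the test can run without a network connection.

```csharp
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Xunit;

namespace ChatGPTTests
{
    // Hypothetical stand-in for the real service: it never touches the
    // network and always answers with the same text.
    public class CannedAnswerHandler : HttpMessageHandler
    {
        protected override Task<HttpResponseMessage> SendAsync(
            HttpRequestMessage request, CancellationToken cancellationToken)
        {
            var response = new HttpResponseMessage(HttpStatusCode.OK)
            {
                Content = new StringContent("1 + 2 = 3")
            };
            return Task.FromResult(response);
        }
    }

    public class ChatBotMathTests
    {
        // Placeholder only; it has to be replaced by the actual API endpoint.
        private const string ApiEndpoint = "https://example.invalid/chat";

        [Fact]
        public async Task OnePlusTwoIsAnsweredWithThree()
        {
            // Remove the handler argument to send the request to a real service.
            using var client = new HttpClient(new CannedAnswerHandler());
            using var question = new StringContent(
                "What is 1 + 2?", Encoding.UTF8, "text/plain");

            HttpResponseMessage response = await client.PostAsync(ApiEndpoint, question);
            response.EnsureSuccessStatusCode();
            string answer = await response.Content.ReadAsStringAsync();

            // The answer is free text, so the test only checks that it mentions 3.
            Assert.Contains("3", answer);
        }
    }
}
```

A test against a real chatbot service would also need authentication and the request and response format that the service defines, which is exactly the part ChatGPT left as a placeholder.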

Let’s recap at this point. I formulated my question in a fairly naive way to find out whether ChatGPT could translate human language into a language that a machine understands. This didn’t work out completely. Although ChatGPT was able to write a script that would compile, the script did not do exactly what it was supposed to do. The more precise my instructions got, the better the result was. This shows that using a chatbot is a skill you can improve. It may be worthwhile to practice using these tools, especially given the progress expected in this technology.

Assuming that this progress will affect not only chatbots but the entire field of AI, other tools could benefit from these advances as well. Chatbots are not designed specifically for software QA, but other AI-based tools are built to support quality in the software development process. So let’s do a little research into which tools those are. But how can we do this kind of research? Well, this is exactly what ChatGPT is designed for, so let’s use it for this purpose:

Me: “ChatGPT, which Test-Automation Tools are using AI?”

ChatGPT: “There are several test automation tools that incorporate artificial intelligence (AI) in various ways. Here are some examples:

Testim: Testim uses AI to generate tests, identify and fix test failures, and optimize test runs. It uses machine learning algorithms to analyze application behavior and detect patterns to generate test cases.
Functionize: Functionize is an AI-powered testing tool that automates test case creation and maintenance, and uses machine learning algorithms to detect and prioritize test cases based on their importance.
Applitools: Applitools uses AI to perform visual testing by comparing screenshots of the application across different devices and browsers. It uses machine learning algorithms to identify visual bugs and detect changes in the application's UI.
Mabl: Mabl is a test automation tool that uses AI to automate end-to-end testing. It uses machine learning algorithms to create and maintain tests, detect and isolate bugs, and optimize test execution.
Test.ai: Test.ai uses AI to generate and execute tests, and to identify and report bugs. It uses machine learning algorithms to analyze the application's behavior and detect patterns to create test cases.
Eggplant: Eggplant uses AI to automate functional testing and performance testing. It uses machine learning algorithms to analyze the application's behavior and detect patterns to generate test cases, and to optimize test runs.

These are just a few examples of test automation tools that incorporate AI. As AI continues to evolve and improve, it is likely that more test automation tools will incorporate AI in innovative ways.”

This is a very good result. If I were planning to introduce AI-powered tools into my testing infrastructure, I would now have a short overview of existing solutions. As we have seen above, the more precise my requests are, the more specific the answers become. If I were interested in a specific tool, I could continue my research by asking ChatGPT for further information.

Let’s try it with the code analysis tool CodeScene:

Me: “What benefit does CodeScene have for my software development project?”

ChatGPT: “CodeScene is a tool that provides several benefits for software development projects, including:

Improved code quality: CodeScene uses machine learning algorithms to analyze your codebase and identify areas that may benefit from improvements in terms of maintainability, readability, and complexity. It can help you identify code smells, anti-patterns, and other issues that can negatively impact the quality of your code.
A better understanding of code changes: CodeScene provides detailed insights into how code changes over time, including the impact of code changes on the overall codebase. This can help you understand how changes in one part of the code may affect other parts of the system.
Team productivity: CodeScene provides data-driven insights to help you manage your development team more effectively. For example, it can help you identify areas where team members may be struggling or areas where additional training may be necessary.
Early detection of technical debt: CodeScene can help you identify technical debt early in the development process before it becomes a significant problem. This can help you avoid costly rework and delays later in the development process.

Overall, CodeScene can help you optimize your codebase for maintainability, readability, and team productivity, which can ultimately lead to a better end product.”

Me: “And how does it work?”

ChatGPT: “CodeScene works by analyzing the history of your codebase and using machine learning algorithms to identify patterns and trends in the code. Here are the steps that CodeScene typically follows:

Data collection: CodeScene collects data from your code repository, including information about commits, files, and code changes. It also collects information about code metrics such as code complexity, coupling, and cohesion.
Analysis: CodeScene analyzes the data to identify patterns and trends in the codebase. It uses machine learning algorithms to identify areas that may benefit from improvement, such as areas with high code complexity or areas with a high number of code smells.
Visualization: CodeScene presents the analysis results in an intuitive, interactive visualization that helps you understand the structure and health of your codebase. This includes visualizations of the codebase architecture, heatmaps of code complexity and churn, and graphs of code changes over time.
Actionable insights: CodeScene provides actionable insights that you can use to improve your codebase. For example, it may suggest areas of the codebase that could benefit from refactoring, or areas where additional testing may be necessary.

CodeScene can be used as part of a continuous integration and delivery (CI/CD) pipeline, to ensure that code quality is continuously monitored and improved over time. It integrates with a variety of code repositories, including Git, GitHub, and Bitbucket, and can be used with a variety of programming languages, including Java, C#, and Python.”

Now I have an idea of what benefits the tool can bring to my project and how it works. This can serve as a basis for deciding whether or not to try the tool. If I’m still unsure, I can keep asking. If this leads me to a tool that helps me handle my issues, ChatGPT will have had a positive influence on the quality of my project. That is likely because I used it for a purpose it was designed for, and used that way, it is very easy to work with. Even though it’s possible to have ChatGPT do things it was not designed for, like writing scripts, I would prefer to use it mainly for research. That research can include looking for AI-powered tools that can improve my work in software quality assurance.

It seems like 2023 is the start of a new AI era. So let’s prepare ourselves for the future and learn to use the new generation of software QA tools! And of course, chatbots can help us find the right tool for our needs.

Good Luck! And reach out if we can help you in any way!

Christoph Evers

Quality Assurance consultant at System Verification
