GitHub - AtomicJar/genai-test

Requirements

Java 21 (install it with SDKMAN!; the repository ships a .sdkmanrc)
Testcontainers Cloud (TCC) with a GPU-enabled zone, or Ollama running locally
An OpenAI API key, either set as openai.api-key in src/main/resources/application.properties or provided through the OPENAI_API_KEY environment variable

Description

This project serves as an example of how to test GenAI applications by using a Large Language Model (LLM).

The main challenge in verifying answers from LLMs is that the responses are written in natural language and are non-deterministic, so traditional testing methods, which rely on predictable outcomes, are unsuitable. The proposed solution is to use one LLM to assess the adequacy of another LLM's responses: define detailed validation criteria, then employ an LLM as a 'Validator Agent' that checks whether each response meets them. This method can be used to validate answers that require both general and specialized knowledge:

    String question = "Does 'good' have the same meaning as 'bad'?";
    String reference = "good is the opposite of bad";

    @Test
    void verifyValidatorDetectsWrongAnswer() {
        String answer = "Yes";
        ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
        assertThat(validate.response()).isEqualTo("no");
    }

    @Test
    void verifyValidatorDetectsGoodAnswer() {
        String answer = "No";
        ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
        assertThat(validate.response()).isEqualTo("yes");
    }

The ValidatorAgent is an AI Service that validates answers: it checks whether a given answer is correct by comparing it against the reference provided.
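The repository's actual ValidatorAgent is not reproduced in this README. As a rough illustration of the idea only, the sketch below assumes the validator builds a prompt from the question, the answer and the reference, sends it to some model, and reduces the reply to a "yes" or "no" verdict. ValidatorSketch, buildPrompt, parse, the prompt wording and the reason field are all hypothetical names and choices made for this example, and the model is represented by a plain String-to-String function rather than any real client library.

```java
import java.util.Locale;
import java.util.function.Function;

/** Sketch of an LLM-as-validator flow; every name here is hypothetical. */
final class ValidatorSketch {

    /** Verdict: response is "yes" or "no"; reason is the model's explanation (assumed field). */
    record ValidatorResponse(String response, String reason) {}

    /** Builds the validation prompt from the question, the answer under test, and the reference. */
    static String buildPrompt(String question, String answer, String reference) {
        return """
                You are a validator. Decide whether the ANSWER to the QUESTION is correct,
                judging only against the REFERENCE.
                Reply on the first line with exactly 'yes' or 'no', then give a short reason.
                QUESTION: %s
                ANSWER: %s
                REFERENCE: %s
                """.formatted(question, answer, reference);
    }

    /** Parses the raw reply; anything that does not start with "yes" is treated as "no". */
    static ValidatorResponse parse(String raw) {
        String[] lines = raw.strip().split("\\R", 2);
        String first = lines[0].strip().toLowerCase(Locale.ROOT);
        String verdict = first.startsWith("yes") ? "yes" : "no";
        String reason = lines.length > 1 ? lines[1].strip() : "";
        return new ValidatorResponse(verdict, reason);
    }

    /** Sends the prompt to the model (any String -> String function) and parses the reply. */
    static ValidatorResponse validate(Function<String, String> model,
                                      String question, String answer, String reference) {
        return parse(model.apply(buildPrompt(question, answer, reference)));
    }
}
```

In a real setup the model function would wrap a call to an LLM client; passing it in as a parameter keeps the prompt building and the parsing testable with a stub. Defaulting to "no" on an unexpected reply is a conservative design choice for this sketch, so a malformed validator response fails the test rather than passing it silently.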

Learn more about the details in the blog post: A Promising Methodology for Testing GenAI Applications in Java

How to run tests

    ./gradlew test

How to run backend

    ./gradlew run

How to run frontend

    cd frontend
    npm install
    npm start
