
NBS Test Cases #123

Open · cbrinson-rise8 wants to merge 11 commits into main from feat/add-nbs-test-suite
Conversation

@cbrinson-rise8 cbrinson-rise8 (Collaborator) commented Nov 8, 2024

Description

NBS provided us a spreadsheet that defines an existing state and then a set of tests with expected results. Architect a testing script that can parse a similar spreadsheet (a CSV is fine) containing seeding data and iterate through the tests, comparing actual results against the expectations.

Related Issues

closes #110

Additional Notes

For a full explanation of run instructions, directory structure, and file uses, check out the README.md.

The key inputs to these tests will be the following:

  • a CSV file that can be used to seed the MPI
  • a CSV with test cases (basically the same format as the above), plus a column indicating whether each record should match and to whom
  • a JSON file with an algorithm config to use

The setup should make it easy to edit any of those three files, run the test cases, and examine the results; a sketch of how a runner might wire them together follows.
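
A minimal sketch of such a runner, assuming hypothetical flags, default paths, and API URL (the committed scripts' actual interface may differ):

```python
# Hypothetical runner sketch; flags, defaults, and the API URL are
# illustrative, not the committed scripts' interface.
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(description="Run NBS match test cases")
    parser.add_argument("--seed-csv", default="data/seed.csv",
                        help="CSV used to seed the MPI")
    parser.add_argument("--test-csv", default="data/test_cases.csv",
                        help="CSV of test cases with expected-match columns")
    parser.add_argument("--algorithm-config", default="configs/algorithm.json",
                        help="JSON algorithm configuration to use")
    parser.add_argument("--api-url", default="http://localhost:8080",
                        help="Base URL of the record linkage API")
    args = parser.parse_args()

    # The PR's scripts would plug in here: seed the MPI from --seed-csv,
    # send each row of --test-csv to the API, and write results/output.csv.
    print(f"seed={args.seed_csv} tests={args.test_csv} "
          f"config={args.algorithm_config} api={args.api_url}")


if __name__ == "__main__":
    main()
```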

<--------------------- REMOVE THE LINES BELOW BEFORE MERGING --------------------->

Checklist

Please review and complete the following checklist before submitting your pull request:

  • I have ensured that the pull request is of a manageable size, allowing it to be reviewed within a single session.
  • I have reviewed my changes to ensure they are clear, concise, and well-documented.
  • I have updated the documentation, if applicable.
  • I have added or updated test cases to cover my changes, if applicable.
  • I have minimized the number of reviewers to include only those essential for the review.

Checklist for Reviewers

Please review and complete the following checklist during the review process:

  • The code follows best practices and conventions.
  • The changes implement the desired functionality or fix the reported issue.
  • The tests cover the new changes and pass successfully.
  • Any potential edge cases or error scenarios have been considered.

@cbrinson-rise8 cbrinson-rise8 self-assigned this Nov 8, 2024
@cbrinson-rise8 cbrinson-rise8 added the qa Technical improvements to increase code quality label Nov 8, 2024
codecov bot commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.43%. Comparing base (3182485) to head (f8d811b).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #123      +/-   ##
==========================================
- Coverage   95.47%   95.43%   -0.04%     
==========================================
  Files          31       31              
  Lines        1414     1424      +10     
==========================================
+ Hits         1350     1359       +9     
- Misses         64       65       +1     


@cbrinson-rise8 cbrinson-rise8 force-pushed the feat/add-nbs-test-suite branch 2 times, most recently from 7377609 to aa66f56 on November 14, 2024 17:13
@cbrinson-rise8 cbrinson-rise8 force-pushed the feat/add-nbs-test-suite branch 2 times, most recently from c36a188 to 24e7026 on November 19, 2024 19:04
@cbrinson-rise8 cbrinson-rise8 force-pushed the feat/add-nbs-test-suite branch 2 times, most recently from 7f159f4 to 4038dd1 on November 20, 2024 15:08
@cbrinson-rise8 cbrinson-rise8 marked this pull request as ready for review November 21, 2024 15:59


def send_test_records(test_csv, algorithm_name, api_url):
    df = pd.read_csv(test_csv)
A collaborator commented:

Is pandas only necessary to read and write csv files? If so, there is a standard lib for that https://docs.python.org/3/library/csv.html

@cbrinson-rise8 cbrinson-rise8 (Author) commented Nov 26, 2024

Yeah. I will go ahead and switch it to that standard lib. Removing it should make the build faster too.
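
For reference, a minimal sketch of that swap; the helper names are hypothetical, and the output columns follow the README's results format:

```python
# Sketch of the pandas -> stdlib swap; helper names are hypothetical,
# and the output columns follow the README's results format.
import csv


def read_test_records(test_csv: str) -> list[dict[str, str]]:
    # Replacement for pd.read_csv: one dict per row, keyed by the header.
    with open(test_csv, newline="") as f:
        return list(csv.DictReader(f))


def write_results(rows: list[dict[str, str]], output_csv: str) -> None:
    # Replacement for DataFrame.to_csv, using csv.DictWriter.
    fieldnames = ["Test Case #", "Expected Result", "Match Result", "Error"]
    with open(output_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```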

@ericbuckley ericbuckley (Collaborator) commented

@cbrinson-rise8 I was going through the README and ran the default tests. I noticed that all of the expected results in the output were either a failure or a match, which is different from the results.csv file you committed. Any ideas why my results would have been different from the ones in the PR?

@cbrinson-rise8 cbrinson-rise8 (Author) commented

> @cbrinson-rise8 I was going through the README and ran the default tests. I noticed that all of the expected results in the output were either a failure or a match, which is different from the results.csv file you committed. Any ideas why my results would have been different from the ones in the PR?

Hmm, not sure, but the output file I committed might just be old. I'll rerun the tests and see what my new output file looks like.

@alhayward alhayward (Collaborator) left a comment

Thanks for this work! Excited that we'll have the functionality to test results, which paves the way for calculating match accuracy performance metrics!

Left several questions and a few suggestions.

@@ -0,0 +1,99 @@
# Record Linkage Algorithm Testing

This repository contains a project to test the effectiveness of the RecordLinker algorithm.

Suggested change:
- This repository contains a project to test the effectiveness of the RecordLinker algorithm.
+ This repository contains a project to evaluate the match accuracy performance of the RecordLinker algorithm.

- `results/`: Contains the results of the algorithm tests
- `scripts/`: Contains the scripts to run the algorithm tests

## Steup

Suggested change:
- ## Steup
+ ## Setup


The results of the algorithm tests will be available in the `results/output.csv` file.

The results will be in a csv formatted file withthe following columns: `Test Case #`, `Expected Result`, `Match Results`, and `Error`.

Suggested change:
- The results will be in a csv formatted file withthe following columns: `Test Case #`, `Expected Result`, `Match Results`, and `Error`.
+ The results will be in a CSV formatted file with the following columns: `Test Case #`, `Expected Result`, `Match Results`, and `Error`.

A collaborator commented:

We may want to edit these column names over time to align with the naming conventions used when evaluating classification models like the Record Linkage algorithm (3 classes: match, no_match, possible_match), e.g. True Label instead of Expected Result and Prediction instead of Match Results.

"evaluators": [
{
"feature": "FIRST_NAME",
"func": "func:recordlinker.linking.matchers.feature_match_fuzzy_string"
A collaborator commented:

If this gets merged after #137, you may need to pull the latest changes from main to pick up the new names of the feature matching/evaluator funcs.

@@ -0,0 +1,1001 @@
Match Id,ID,BIRTHDATE,FIRST,LAST,SUFFIX,MAIDEN,RACE,ETHNICITY,GENDER,ADDRESS,CITY,STATE,COUNTY,ZIP,SSN,
A collaborator commented:

Prob fine since this is synthetic data, but do we need approval from NBS to store this on our open-source GitHub repo? CC: @ericbuckley



def seed_database(csv_file, api_url):
    MAX_CLUSTERS = 100
A collaborator commented:

What is the purpose of constraining the number of clusters to 100?

def send_test_records(test_csv, algorithm_name, api_url):
    output_data = []

    print("Sending test records to the API...")
A collaborator commented:

I assume these messages get printed to the CLI? What's the difference between printing vs. logging in this case?
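
To make the comparison concrete, a short illustration of the difference (the logger name here is arbitrary): logging produces the same console output but adds levels, timestamps, and output control, at the cost of a one-time setup.

```python
# Same message via print vs. logging: logging adds levels, timestamps,
# and output control with a one-time setup. Logger name is arbitrary.
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("nbs_tests")

print("Sending test records to the API...")          # always prints to stdout
logger.info("Sending test records to the API...")    # timestamped, filterable
logger.debug("per-record detail, hidden unless level is DEBUG")
```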

output_row = {
    "Test Case #": match_info['test_case_number'],
    "Expected Result": match_info['should_match'],
    "Match Result": response['prediction'],
A collaborator commented:

Our valid prediction values of "match", "no_match", and "possible_match" don't exactly align with those in the Expected Result column of the NBS test cases. Is there a transformation somewhere, or are we expecting our valid values in this column? If the latter, is validation needed somewhere?

For context, having the same set of values in the Expected Result and Match Result columns of the output.csv is very helpful for calculating match accuracy performance metrics down the line (e.g., False Positives, False Negatives, Sensitivity, Specificity). If result transformations are needed, I think they should live in this testing framework, outside of the calculations for those metrics.

E.g., "Should Be a Match" in the NBS test cases vs. "match" in the Record Linkage algorithm prediction, "Should Be a Match?" vs. "possible_match", etc.

output_row = {
    "Test Case #": match_info['test_case_number'],
    "Expected Result": match_info['should_match'],
    "Match Result": "Failed",
A collaborator commented:

What would cause this "Failed" result?

Would we want to error out in the testing framework somewhere earlier on? Not sure how long this kind of testing takes to run, but I'm imagining the case where it runs for a while only to return many "Failed to link record" results due to a formatting error or something easily fixed; the user would only become aware of that after the testing finished running.
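
A sketch of that fail-fast idea: validate the whole test CSV up front, before anything is sent. The required-column subset is taken from the test CSV header above; validate_test_csv is a hypothetical helper.

```python
# Fail-fast sketch: validate the whole test CSV before sending anything,
# so formatting problems surface immediately. Column subset comes from
# the test CSV header above; validate_test_csv is a hypothetical helper.
import csv
import sys

REQUIRED_COLUMNS = {"Match Id", "ID", "BIRTHDATE", "FIRST", "LAST"}


def validate_test_csv(test_csv: str) -> None:
    with open(test_csv, newline="") as f:
        reader = csv.DictReader(f)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            sys.exit(f"{test_csv}: missing columns {sorted(missing)}")
        for lineno, row in enumerate(reader, start=2):  # header is line 1
            if not row["ID"].strip():
                sys.exit(f"{test_csv}:{lineno}: empty ID")
```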

Labels: qa (Technical improvements to increase code quality)
Projects: none yet
Development: successfully merging this pull request may close these issues: implement testing script for NBS test cases
3 participants