# NBS Test Cases #123

base: `main`
## Conversation
**Codecov Report:** All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #123      +/-   ##
==========================================
- Coverage   95.47%   95.43%   -0.04%
==========================================
  Files          31       31
  Lines        1414     1424      +10
==========================================
+ Hits         1350     1359       +9
- Misses         64       65       +1
```

☔ View full report in Codecov by Sentry.
```python
def send_test_records(test_csv, algorithm_name, api_url):
    df = pd.read_csv(test_csv)
```
Is pandas only necessary to read and write CSV files? If so, there is a standard library for that: https://docs.python.org/3/library/csv.html
Yeah, I will go ahead and switch it to that standard library. Removing pandas should make the build faster too.
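Roughly, the swap could look like this minimal sketch (the payload-building step is elided; only the function signature comes from this PR):

```python
import csv

def send_test_records(test_csv, algorithm_name, api_url):
    # csv.DictReader yields one dict per row, keyed by the header row,
    # so no pandas dependency is needed just to read the file.
    with open(test_csv, newline="") as f:
        for row in csv.DictReader(f):
            ...  # build the API payload from `row` and send it to api_url
```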
@cbrinson-rise8 I was going through the README and ran the default tests. I noticed that all of the expected results in the output were either a failure or a match, which is different from the results.csv file you committed. Any ideas why my results would have been different from the ones in the PR?
Hmm, not sure, but the output file I committed might just be old. I'll rerun the tests and see what my new output file looks like.
Thanks for this work! Excited that we'll have the functionality to test results, which paves the way for calculating match accuracy performance metrics!
Left several questions and a few suggestions.
```
@@ -0,0 +1,99 @@
# Record Linkage Algorithm Testing

This repository contains a project to test the effectiveness of the RecordLinker algorithm.
```
```diff
- This repository contains a project to test the effectiveness of the RecordLinker algorithm.
+ This repository contains a project to evaluate the match accuracy performance of the RecordLinker algorithm.
```
```
- `results/`: Contains the results of the algorithm tests
- `scripts/`: Contains the scripts to run the algorithm tests

## Steup
```
```diff
- ## Steup
+ ## Setup
```
```
The results of the algorithm tests will be available in the `results/output.csv` file.

The results will be in a csv formatted file withthe following columns: `Test Case #`, `Expected Result`, `Match Results`, and `Error`.
```
```diff
- The results will be in a csv formatted file withthe following columns: `Test Case #`, `Expected Result`, `Match Results`, and `Error`.
+ The results will be in a CSV formatted file with the following columns: `Test Case #`, `Expected Result`, `Match Results`, and `Error`.
```
We may want to edit these column names over time to align with the naming conventions of performance metrics used for evaluating classification models, like the Record Linkage algorithm (3 classes: `match`, `no_match`, `possible_match`). E.g. `True Label` instead of `Expected Result` and `Prediction` instead of `Match Results`.
"evaluators": [ | ||
{ | ||
"feature": "FIRST_NAME", | ||
"func": "func:recordlinker.linking.matchers.feature_match_fuzzy_string" |
Depending on when this gets merged, you may need to pull the latest changes from main with the new names of the feature matching/evaluator funcs, if it's after #137 is merged.
```
@@ -0,0 +1,1001 @@
Match Id,ID,BIRTHDATE,FIRST,LAST,SUFFIX,MAIDEN,RACE,ETHNICITY,GENDER,ADDRESS,CITY,STATE,COUNTY,ZIP,SSN,
```
Prob fine since this is synthetic data, but do we need approval from NBS to store this on our open-source GitHub repo? CC: @ericbuckley
```python
def seed_database(csv_file, api_url):
    MAX_CLUSTERS = 100
```
What is the purpose of constraining the number of clusters to 100?
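For illustration, here is a hypothetical sketch of how a cap like this might be applied while seeding, assuming records sharing a `Match Id` (a column from the committed CSV) form one cluster; the grouping logic is an assumption, not necessarily this PR's actual implementation:

```python
import csv
from itertools import islice

MAX_CLUSTERS = 100  # the cap under discussion

def seed_database(csv_file, api_url):
    clusters = {}
    with open(csv_file, newline="") as f:
        for row in csv.DictReader(f):
            # Group rows that share a Match Id into one cluster.
            clusters.setdefault(row["Match Id"], []).append(row)
    # Seed only the first MAX_CLUSTERS clusters.
    for match_id, records in islice(clusters.items(), MAX_CLUSTERS):
        ...  # POST `records` as one person cluster to api_url
```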
```python
def send_test_records(test_csv, algorithm_name, api_url):
    output_data = []

    print("Sending test records to the API...")
```
I assume these messages get printed to the CLI? What's the difference between printing vs. logging in this case?
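For comparison, a minimal sketch of the logging equivalent (the logger name and format string here are illustrative):

```python
import logging

# Unlike print(), logging adds timestamps and levels, and the output can
# be silenced or redirected to a file without touching the call sites.
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("nbs_tests")

logger.info("Sending test records to the API...")
```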
```python
output_row = {
    "Test Case #": match_info['test_case_number'],
    "Expected Result": match_info['should_match'],
    "Match Result": response['prediction'],
```
Our valid `prediction` values of `"match"`, `"no_match"`, and `"possible_match"` don't exactly align with those in the `Expected Result` column of the NBS test cases. Is there a transformation somewhere, or are we expecting our valid values in this column? If the latter, is validation needed somewhere?

For context, having the same set of values in the `Expected Result` and `Match Result` columns of the `output.csv` is very helpful for calculating match accuracy performance metrics down the line (e.g., False Positives, False Negatives, Sensitivity, Specificity, etc.). If transformations to the results are needed, I think they should live somewhere in this testing framework, outside of the calculations for these metrics.

E.g., `"Should Be a Match"` in the NBS test cases vs. `"match"` in the Record Linkage algorithm prediction, `"Should Be a Match?"` vs. `"possible_match"`, etc.
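A possible shape for that transformation (only the first two label pairs are quoted above; the third label is an assumption that would need confirming against the NBS spreadsheet):

```python
# Hypothetical mapping from NBS expected-result labels to the
# algorithm's prediction values; only the first two pairs are
# quoted in this thread.
NBS_TO_PREDICTION = {
    "Should Be a Match": "match",
    "Should Be a Match?": "possible_match",
    "Should Not Be a Match": "no_match",  # assumed label
}

def normalize_expected(nbs_label):
    # Fail loudly on unexpected labels so mismatches surface early.
    return NBS_TO_PREDICTION[nbs_label.strip()]
```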
```python
output_row = {
    "Test Case #": match_info['test_case_number'],
    "Expected Result": match_info['should_match'],
    "Match Result": "Failed",
```
What would cause this `"Failed"` result?

Would we want to error out in the testing framework somewhere earlier on? I'm not sure how long this kind of testing would take to run, but I'm imagining the case where it takes a while only to return many `"Failed to link record"` results due to a formatting error or something else easily fixed; the user would only become aware of that after the testing finished running.
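One fail-fast option would be to validate the test CSV up front, before any records are sent. A rough sketch, assuming the required columns match the output columns discussed above:

```python
import csv

def validate_test_csv(test_csv, required_columns=("Test Case #", "Expected Result")):
    """Raise immediately on malformed input instead of discovering
    formatting problems only after the whole run finishes."""
    with open(test_csv, newline="") as f:
        reader = csv.DictReader(f)
        missing = set(required_columns) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{test_csv} is missing columns: {sorted(missing)}")
        for lineno, row in enumerate(reader, start=2):  # header is line 1
            if any(row[c] in (None, "") for c in required_columns):
                raise ValueError(f"{test_csv}: line {lineno} has empty required fields")
```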
## Description

NBS provided us a spreadsheet that defines an existing state and then a set of tests with expected results. Architect a testing script of sorts that can parse a similar spreadsheet (it can just be a CSV) with seeding data and iterate through the tests, checking the expected results.

## Related Issues

Closes #110

## Additional Notes

For a full explanation of run instructions, directory structure, and file uses, check out the README.md.

The key inputs to these tests will be the following:

- the seeding data CSV that establishes the existing state
- the test cases CSV with the expected results
- the algorithm configuration

The setup should make it easy to edit any of those three files, run the test cases, and examine the results.