feat: initial commit
cbrinson-rise8 committed Nov 14, 2024
1 parent 6340323 commit 7377609
Showing 5 changed files with 129 additions and 0 deletions.
19 changes: 19 additions & 0 deletions compose.yml
@@ -76,3 +76,22 @@ services:
    depends_on:
      api:
        condition: service_healthy

  # TODO: fix this
  algo-test-runner:
    image: "python:3.11-slim"
    env_file:
      - tests/algorithm/algo.env
    environment:
      DB_URI: "postgresql+psycopg2://postgres:pw@db:5432/postgres"
      API_URL: "http://api:8080"
    # seed_db.py imports both pandas and requests, so install both
    command: sh -c "pip install pandas requests && python scripts/seed_db.py"
    volumes:
      - ./tests/algorithm/scripts:/scripts
    depends_on:
      db:
        condition: service_healthy
      api:
        condition: service_healthy
    profiles:
      - algo-test
45 changes: 45 additions & 0 deletions tests/algorithm/README.md
@@ -0,0 +1,45 @@
# Record Linkage Algorithm Testing

This repository contains a project to test the effectiveness of the RecordLinker algorithm.

## Prerequisites

Before getting started, ensure you have the following installed:

- [Docker](https://docs.docker.com/engine/install/)
- [Docker Compose](https://docs.docker.com/compose/install/)

## Setup

<!-- ## Setup

1. Build the Docker images:

    ```bash
    docker compose --profile algo-test build
    ```

2. Configure environment variables:

    ```bash
    edit tests/algorithm/algo.env
    ```

    Edit the environment variables to set the seed CSV file to use.

## Running Algorithm Tests

1. Run the Synthea tests:

    ```bash
    docker compose --profile algo-test run --rm algo-test-runner python scripts/seed_db.py # TODO: script name
    ```

2. Analyze the results:

    The results of the algorithm tests will be available in the `tmp/results` directory. -->

3 changes: 3 additions & 0 deletions tests/algorithm/algo.env
@@ -0,0 +1,3 @@
SEED_CSV_FILE=
DB_URI="postgresql+psycopg2://postgres:pw@db:5432/postgres"
API_URL="http://api:8080"
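
Review note: `seed_db.py` as committed does not read any of these variables; it takes the CSV path from `sys.argv` and hardcodes `http://localhost:8080/`. A minimal sketch of how a script could consume them follows. The `Settings` class and `load_settings` helper are hypothetical names, not part of this commit, and the fail-fast behavior on blank values is an assumption about what the author would want.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three variables in algo.env."""
    seed_csv_file: str
    db_uri: str
    api_url: str


def load_settings(environ=os.environ):
    """Read SEED_CSV_FILE, DB_URI and API_URL, rejecting blank values."""
    values = {}
    for name in ("SEED_CSV_FILE", "DB_URI", "API_URL"):
        value = environ.get(name, "").strip()
        if not value:
            raise ValueError(f"{name} must be set (see tests/algorithm/algo.env)")
        values[name] = value
    return Settings(
        seed_csv_file=values["SEED_CSV_FILE"],
        db_uri=values["DB_URI"],
        # Drop a trailing slash so paths can be appended consistently
        api_url=values["API_URL"].rstrip("/"),
    )
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise in a test; with `SEED_CSV_FILE` left blank, as in this commit, the call raises `ValueError` rather than failing later inside the CSV reader.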
61 changes: 61 additions & 0 deletions tests/algorithm/scripts/seed_db.py
@@ -0,0 +1,61 @@
import sys
import os
import requests
import pandas as pd


def seed_database(api_url, csv_file):
    # Load the CSV data
    df = pd.read_csv(csv_file)

    cluster_group = []

    for _, row in df.iterrows():
        # Convert the row to a dictionary
        record_data = row.to_dict()

        # convert row to a pii_record
        pii_record = {
            "external_id": record_data['ID'],
            "birth_date": record_data['BIRTHDATE'],
            "sex": record_data['GENDER'],
            "address": [
                {
                    "line": [record_data['ADDRESS']],
                    "city": record_data['CITY'],
                    "state": record_data['STATE'],
                    "county": record_data['COUNTY'],
                    "postal_code": str(record_data['ZIP'])
                }
            ],
            "name": [
                {
                    "given": [record_data['FIRST']],
                    "family": record_data['LAST']
                }
            ],
            "ssn": record_data['SSN'],
            "race": record_data['RACE']
        }

        # nesting for the seeding api request
        cluster = {"records": [pii_record]}
        cluster_group.append(cluster)

        # # make request to api to seed the db
        # try:
        #     response = requests.post(api_url, json=pii_record)
        #     response.raise_for_status()  # Raise an error for bad status codes
        #     print(f"Successfully posted record {pii_record['external_id']}: {response.status_code}")
        # except requests.exceptions.RequestException as e:
        #     print(f"Failed to post record {pii_record['external_id']}: {e}")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python seed_db.py <seedfile.csv>")
        sys.exit(1)

    csv_file = sys.argv[1]

    seed_database("http://localhost:8080/", csv_file)
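
Review note: with `pd.read_csv` defaults, a numeric-looking `ZIP` column is parsed as a number, so `str(record_data['ZIP'])` can lose leading zeros, and empty cells become `NaN`, which is not valid JSON. The sketch below shows one way to avoid both by keeping every field as text. It deliberately swaps pandas for the standard-library `csv` module, which is not what the commit uses; `load_rows`, `to_pii_record`, and the sample rows are hypothetical and made up for illustration.

```python
import csv
import io
import json


def load_rows(csv_source):
    """Yield one dict per CSV row, keeping every field as text.

    Empty cells become None so they serialize as JSON null.
    """
    for row in csv.DictReader(csv_source):
        yield {key: (value if value != "" else None) for key, value in row.items()}


def to_pii_record(row):
    """Map a row onto the payload shape built in seed_db.py."""
    return {
        "external_id": row["ID"],
        "birth_date": row["BIRTHDATE"],
        "sex": row["GENDER"],
        "address": [{
            "line": [row["ADDRESS"]],
            "city": row["CITY"],
            "state": row["STATE"],
            "county": row["COUNTY"],
            "postal_code": row["ZIP"],
        }],
        "name": [{"given": [row["FIRST"]], "family": row["LAST"]}],
        "ssn": row["SSN"],
        "race": row["RACE"],
    }


# Made-up sample data: one ZIP with a leading zero, one empty ZIP.
SAMPLE = (
    "ID,BIRTHDATE,GENDER,ADDRESS,CITY,STATE,COUNTY,ZIP,FIRST,LAST,SSN,RACE\n"
    "a1,1980-01-02,F,1 Main St,Boston,MA,Suffolk,02134,Ann,Lee,999-12-3456,white\n"
    "b2,1975-05-06,M,2 Oak Ave,Salem,MA,Essex,,Bob,Ray,999-65-4321,asian\n"
)

records = [to_pii_record(row) for row in load_rows(io.StringIO(SAMPLE))]

# allow_nan=False makes json.dumps raise if a NaN slipped through
payload = json.dumps(records, allow_nan=False)
```

If staying with pandas is preferred, reading the file with `dtype=str` is the closest equivalent for preserving the text of each cell; missing values would still need converting before the records are sent as JSON.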
1 change: 1 addition & 0 deletions tests/algorithm/scripts/test.py
@@ -0,0 +1 @@
print("Hello World")
