Building a Dataset
Hi! My name is Phoebe Young, and I am a neuroscience and cognitive studies double major at Vanderbilt University. I am currently a sophomore working in the Winder lab at the Vanderbilt Center for Addiction Research (VCAR) and participating in a micro-internship with Open Avenues Foundation and Tamr. My specialty is data analysis, which I have practiced in a cognitive studies lab and now at a software company.
This project began with the goal of creating a dataset for managing and asking questions about doctors involved in clinical trials research. We were tasked with cleaning multiple datasets and merging them into a single collection of information that could be used to identify the ideal doctor for a given task. Using Deepnote's servers and pandas, we cleaned the data, edited and dropped columns, merged the full datasets, and converted the result into a visually appealing, more easily interpreted chart. From this, we created an interactive notebook so that anyone can enter their criteria and find the best match for their clinical trial.
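The cleaning and merging steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual notebook: the tiny tables, the column names (`npi`, `name`, `state`, `notes`, `gender`, `primary_specialty`), and the helper `clean_trials` are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the clinical trial and Medicare tables.
# Every column name and value here is made up for illustration.
trials = pd.DataFrame({
    "npi": [1001, 1002, 1002, 1003],
    "name": [" Ann Lee", "Bo Chen ", "Bo Chen ", "Cy Diaz"],
    "state": ["tn", "CA", "CA", "ny"],
    "notes": ["x", "y", "y", "z"],
})
medicare = pd.DataFrame({
    "npi": [1001, 1002, 1004],
    "gender": ["F", "M", "F"],
    "primary_specialty": ["Neurology", "Cardiology", "Oncology"],
})


def clean_trials(df: pd.DataFrame) -> pd.DataFrame:
    """Tidy text columns, drop an unneeded column, and remove duplicate rows."""
    out = df.copy()
    out["name"] = out["name"].str.strip()    # trim stray whitespace
    out["state"] = out["state"].str.upper()  # normalize state codes
    out = out.drop(columns=["notes"])        # column assumed not to be needed
    return out.drop_duplicates().reset_index(drop=True)


cleaned = clean_trials(trials)

# A left merge keeps every trial doctor, even one with no Medicare match.
merged = cleaned.merge(medicare, on="npi", how="left")
print(merged)
```

A left merge is one reasonable choice here because it keeps doctors who appear in the trial data but not in the Medicare data; an inner merge would drop them instead.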
Merged dataset for clinical trial information:
Contains more clearly organized and succinct information to distinguish doctors by certain qualifications
Final merged dataset of Clinical Trial and Medicare information: contains a combination of Medicare information and information from the original dataset
Find doctors of a specific gender:
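A lookup like this can be written as a simple boolean filter in pandas. The sketch below assumes a `gender` column holding codes such as "F" and "M"; the sample rows and the helper name `doctors_by_gender` are hypothetical.

```python
import pandas as pd

# Hypothetical sample of the merged dataset.
doctors = pd.DataFrame({
    "name": ["Ann Lee", "Bo Chen", "Cy Diaz", "Di Park"],
    "gender": ["F", "M", "m", None],
})


def doctors_by_gender(df: pd.DataFrame, gender: str) -> pd.DataFrame:
    """Return the rows whose gender code matches, ignoring letter case."""
    # Missing values compare as False, so they are excluded from the result.
    mask = df["gender"].str.upper() == gender.upper()
    return df[mask]


male_doctors = doctors_by_gender(doctors, "M")
print(male_doctors)
```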
Gender Distribution of Doctors
Top 10 Most Popular Primary Specialties in the Dataset
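A ranking like this one can be computed by counting how often each specialty appears. This is a minimal sketch, assuming one row per doctor and a hypothetical `primary_specialty` column; the sample values are invented.

```python
import pandas as pd

# Hypothetical sample: one row per doctor.
doctors = pd.DataFrame({
    "primary_specialty": [
        "Cardiology", "Cardiology", "Cardiology",
        "Neurology", "Neurology",
        "Oncology",
    ],
})


def top_specialties(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """Count doctors per specialty and keep the n most common."""
    # value_counts sorts from most to least frequent by default.
    return df["primary_specialty"].value_counts().head(n)


top = top_specialties(doctors, n=2)
print(top)
```

The resulting Series can be passed to a plotting call such as `top.plot(kind="bar")` to draw the chart, provided matplotlib is installed.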
Find doctors in 10 states with the most clinical trials:
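One way to express this query is to find the most frequent states first and then keep only the rows from those states. The sketch assumes each row represents one clinical trial record with a hypothetical `state` column; the helper `doctors_in_top_states` and the sample data are illustrative only.

```python
import pandas as pd

# Hypothetical sample: one row per clinical trial record.
records = pd.DataFrame({
    "name": ["Ann Lee", "Bo Chen", "Cy Diaz", "Di Park", "Ed Ruiz", "Fay Wu"],
    "state": ["TN", "TN", "TN", "CA", "CA", "NY"],
})


def doctors_in_top_states(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Keep the rows from the n states with the most trial records."""
    top_states = df["state"].value_counts().head(n).index
    return df[df["state"].isin(top_states)]


in_top_two = doctors_in_top_states(records, n=2)
print(in_top_two)
```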