CV
Education
- B.A. in Data Science from the University of California, Berkeley (2023)
Industry work experience
- June 2023 - August 2023:
- New York City Department of City Planning
- Selected from an applicant pool of 2,500 students (5% acceptance rate) for the Coding it Forward Summer Fellowship, a program that empowers early-career technologists to explore careers in the federal government
- Duties:
- Built a new data product to track and analyze historical NYC spending, helping steer resources to underinvested areas of the city.
- Parsed the NYC Checkbook database, enriched it with spatial location information and visualized the results using pandas, re and seaborn.
- Maintained core data products and pipelines built by the Enterprise Data Management team.
- Supervisor: Amanda Doyle
- May 2022 - August 2022:
- HP Inc.
- Duties:
- Developed and tested data pipelines to build the core dataset of 40 million records used for business planning and sales forecasting by the Enterprise Business Planning Team.
- Migrated the existing pipeline, built locally in MS Access, to Python and SQL, reducing the code from 15,000 lines to 900 lines.
- Wrote and maintained documentation on the structure and relations of the datasets and the data pipelines.
- Supervisor: Pushon Mukherjee
Academic work experience
- August 2022-May 2023: Undergraduate Student Instructor
- University of California, Berkeley
- Fall 2022: Data C88S (Probability and Mathematical Statistics for Data Science); Spring 2023: Data C102 (Data, Inference and Decisions)
- Duties:
- Led discussion sections of 30 students in Data C88S and Data C102 and held weekly office hours to help students master the material.
- Created educational materials and exams in probability, Bayesian statistics, regression and inference.
- Supervisors: Ani Adhikari, PhD, and Ramesh Sridharan, PhD
- August 2022-May 2023: Human Context and Ethics Curriculum Developer
- University of California, Berkeley
- Duties:
- Created course content for UC Berkeley’s largest data science classes to promote critical thinking about social issues in technology.
- Worked with University of California, Berkeley instructors to integrate the Human Context and Ethics (HCE) curriculum into data science classes using Jupyter Notebooks, case studies and data sets.
- Supervisor: Ari Edmundson, PhD
- August 2020-December 2020: Research Assistant
- University of California, San Francisco
- Duties:
- Built random forest and logistic regression classifiers (with BERT and Word2Vec embeddings) to categorize clinical notes into medical conditions.
- Supervisor: Vivek Rudrapatna
Skills
- Programming Languages: Python, SQL, R
- Statistics: Machine Learning, Statistical Analysis, Causal Inference
- GIS tools: ArcGIS Pro, Carto, GeoPandas
- Data Visualization: seaborn, matplotlib, D3.js
- Data Engineering: Pipelining, Data Integration, Database Management
Teaching