CV
Profile
M.S. Biostatistics candidate (May 2026) with strong applied statistics and data skills for real-world, clinical-style datasets. Experienced in data integrity checks, exploratory analysis, regression/classification modeling, and translating findings into clear presentations and reports. Advanced proficiency in R, Python, and SAS; collaborative and comfortable in fast-paced environments.
Education
University of Michigan - Ann Arbor
Master of Science in Biostatistics, Aug 2024 - May 2026
University of Illinois Urbana-Champaign
Bachelor of Science in Statistics, Minor in Economics, Aug 2021 - May 2024
Technical Skills
Programming: R/Python/SAS (advanced), SQL, Git, Linux
Methods: Descriptive statistics, regression (linear/logistic), model selection, classification methods, time series/trend analysis, sensitivity/robustness checks
Data Integrity: Data cleaning/merging, validation checks (range/logic/missingness), reconciliation, documented/rerunnable pipelines
Reporting and Visualization: R Markdown/Quarto, Excel, Tableau; tables/figures, slide-ready summaries
Experience
Data Analyst Intern
Overseas Consulting Ltd., Beijing, China - May 2025 - Aug 2025
- Built analysis-ready longitudinal datasets by ingesting and merging multi-source data; standardized identifiers, created derived fields, and maintained clear documentation for reuse.
- Implemented data integrity checks (range/logic checks, missingness profiling, reconciliation across sources) and maintained issue logs to support reliable reporting.
- Delivered stakeholder-ready outputs (tables/figures/trend summaries) and communicated findings to cross-functional partners under tight timelines.
Research Biostatistician
Institute of Biophysics, Chinese Academy of Sciences, Beijing, China - May 2023 - Aug 2023
- Programmed and validated preprocessing pipelines in R; created reproducible QC summaries and figures to support review and interpretation.
- Conducted statistical analyses and produced clear tables/figures; communicated methods, results, and limitations to multidisciplinary collaborators.
- Supported iterative analyses based on stakeholder questions; maintained organized documentation for consistent reruns and presentations.
Credit Operations Intern
Hangzhou Bank Co., Ltd., Beijing, China - Jun 2021 - Aug 2021
- Assessed 20+ clients through industry analysis and prepared evaluation reports identifying high-value segments.
- Used SQL to clean and validate credit data, improving data accuracy by 10% and streamlining the reporting workflow.
- Analyzed 5-year income trends and identified distributional biases to support strategic decisions.
Projects
Pneumonia Patient Condition Classification (Clinical Imaging ML) (Aug 2024 - Dec 2024)
- Built a reproducible classification pipeline (preprocessing, train/validation splits, leakage checks) and generated standardized performance summaries (confusion matrix, class-wise metrics).
- Performed model comparison and error analysis; translated results into clear visuals and slide-ready summaries for non-technical stakeholders.
Population Health Survey Modeling: Subjective Probabilities (AFHS) (Jan 2026 - Present)
- Built analysis-ready survey datasets and documented variable definitions; ran QC checks and built reproducible reporting workflows in R (R Markdown/Quarto).
- Fit ordinal models for 0-10 subjective probability outcomes using survey weights; summarized results in tables/figures and translated findings into manuscript-style methods/results text.
Clinical-Style Data Programming: COVID-19 Trends and Excess Mortality (Jan 2025 - Apr 2025)
- Cleaned, transformed, and merged large public health datasets; produced descriptive summaries and trend analyses with documented assumptions.
- Created tables/figures and short written summaries suitable for cross-functional presentations and reporting.
Investigating the Association of Hypertension and Alcohol Intake (NHANES) (Nov 2024)
- Developed multivariable regression models with interaction terms on clinical-style data to evaluate the relationship between alcohol intake and systolic blood pressure.
- Performed diagnostics and sensitivity checks; documented clinically interpretable findings in a formal report.
Qualitative Analysis and Visualization Consulting (STATCOM) (Dec 2025)
- Conducted thematic analysis across multi-stakeholder feedback (students, mentors, families) and translated unstructured responses into structured evidence themes.
- Delivered leadership-ready visual summaries and recommendations to support program evaluation decisions.
Student Performance Analysis Using SAS (May 2024)
- Performed end-to-end SAS analysis (data cleaning, regression, clustering) with transparent model assumptions and reproducible outputs.
- Communicated key drivers through concise tables and visuals to support evidence-based interventions.
U.S. Election Clustering Analysis (May 2022)
- Applied K-means and hierarchical clustering to large county-level datasets and compared unsupervised model behavior using visualization diagnostics.
- Produced clear methodology summaries and plots for non-technical interpretation.
Billboard Top 100 Shiny Dashboard (Nov 2022)
- Built an interactive R Shiny dashboard with filtering and trend visualization modules to support exploratory analysis.
- Designed stakeholder-friendly UI components and reporting views to make insights easier to access.
Additional
Interests: Medical devices, manufacturing analytics, quality monitoring, applied statistics for patient-impacting products