
Data Science Engineer

Columbia University
United States, New York, New York
Mar 27, 2026

  • Job Type: Officer of Administration
  • Bargaining Unit:
  • Regular/Temporary: Regular
  • End Date if Temporary:
  • Hours Per Week: 35
  • Standard Work Schedule: Monday - Friday
  • Building: PH-20
  • Salary Range: 160,000 - 180,000


The salary of the finalist selected for this role will be set based on a variety of factors, including but not limited to departmental budgets, qualifications, experience, education, licenses, specialty, and training. The above hiring range represents the University's good faith and reasonable estimate of the range of possible compensation at the time of posting.

Position Summary

The Department of Biomedical Informatics at Columbia University is seeking a highly motivated data science engineer to support large-scale observational research within the OHDSI (Observational Health Data Sciences and Informatics) network. This role will focus on the design, implementation, and execution of distributed network studies using electronic health record (EHR) and administrative claims data to generate real-world evidence.

The successful candidate will contribute to characterization, population-level estimation (causal inference), and patient-level prediction analyses across multi-institutional data networks. This position offers a unique opportunity to work at the intersection of biomedical informatics, data science, and clinical research within a leading academic medical center.

This is a full-time, two-year position with the possibility of extension, contingent on available funding.

Responsibilities

Key Responsibilities



  • Design and implement observational network studies using distributed EHR and administrative claims data
  • Conduct large-scale characterization, comparative effectiveness and safety estimation, and patient-level prediction analyses
  • Develop reproducible analytic pipelines using R and SQL in relational database environments
  • Apply and evaluate methods from causal inference (e.g., confounding control, bias assessment, sensitivity analyses)
  • Apply machine learning approaches for predictive modeling using high-dimensional healthcare data
  • Work with standardized data representations, including the OMOP Common Data Model and standardized clinical vocabularies for conditions, drugs, procedures, and measurements
  • Collaborate with interdisciplinary teams including clinicians, statisticians, data engineers, and informaticians
  • Contribute to scholarly outputs including manuscripts, presentations, and open-source analytic tools
  • Support transparent, reproducible, and scalable research practices across distributed data networks


Minimum Qualifications

Master's degree in biostatistics, public health, epidemiology, informatics, computer science, or a related field, and/or equivalent in education and experience, with at least 2 years of related experience.



  • At least 1 year of relevant prior work experience in the healthcare industry within a health system, a pharmaceutical company, or an insurer
  • Strong programming experience in R and SQL
  • Experience working with relational databases and large-scale healthcare datasets
  • Demonstrated interest in observational research using real-world clinical or claims data
  • Ability to design, implement, and document reproducible analytic workflows
  • Strong written and verbal communication skills


Preferred Qualifications

PhD in biostatistics, public health, epidemiology, informatics, computer science, or a related field, and/or equivalent in education and experience, with at least 1 year of related work experience.

Other Requirements



  • Familiarity with the OMOP Common Data Model and standardized vocabularies (e.g., ICD, NDC, SNOMED, MedDRA, LOINC, CPT)
  • Knowledge of causal inference methods for observational studies
  • Experience with machine learning techniques for patient-level prediction
  • Prior experience working in distributed or federated data networks
  • Familiarity with open-source research ecosystems and collaborative scientific communities



Work Environment & Opportunities

This position is based at Columbia University's Department of Biomedical Informatics, a world leader in clinical research informatics and observational health data science. The role offers close collaboration with leading researchers, access to large-scale real-world data, and opportunities to contribute to impactful, open, and methodologically rigorous research that informs clinical and policy decision-making. The work location is New York, NY, with the option of a hybrid schedule of 3 days per week on site and 2 days per week remote.

Equal Opportunity Employer / Disability / Veteran

Columbia University is committed to the hiring of qualified local residents.
