What you'll do
- Develop and maintain data pipelines that process and transform data from heterogeneous sources, including hospitals, labs, and healthcare providers.
- Build robust systems for cleaning, normalizing, and integrating diverse datasets into formats suitable for machine learning and analytics.
- Collaborate closely with data scientists, clinicians, and product teams to understand data requirements and deliver impactful solutions.
- Ensure data quality by implementing automated validation, monitoring, and discrepancy resolution processes.
- Optimize data workflows to handle varying data formats, schemas, and structures efficiently.
- Contribute to architectural decisions so that systems are scalable, secure, and compliant with data privacy regulations such as GDPR and HIPAA.