Data Scientist - Research Informatics

University of Kansas Medical Center

Kansas City, KS

ID: 7277380
Posted: September 3, 2024
Application Deadline: Open Until Filled

Job Description

Job Description Summary:
This position is in Research Informatics (the Office of the Chief Research Informatics Officer) at the University of Kansas Medical Center, Kansas City, KS, USA. Research Informatics creates, maintains, and facilitates the use of technology to enable research, helping researchers manage, process, and analyze their data. We apply advanced data science methodologies to large-scale health data to advance precision medicine. The University of Kansas Medical Center invites applications for a Data Scientist position to support the Department of Pathology. This is a hybrid role requiring both data science and data engineering skills. The successful candidate will play a critical role in advancing research for the Department of Pathology by managing and optimizing the digitization process for pathology data, leveraging technologies such as OpenSlide, and facilitating the transition from the pathology information system CoPath while ensuring continued access to legacy data for research. Additionally, the Data Scientist will support analysis and modeling for the Department of Pathology using a combination of statistics and machine learning. The Data Scientist will join a dynamic team of pathologists, data scientists, informaticians, and clinicians, with unique opportunities to contribute to important scientific breakthroughs and to directly impact patients’ lives in a clinical setting.
Job Description:
Design, implement, and maintain infrastructure to create a data warehouse for legacy CoPath data.

Support the creation of a system for efficiently identifying relevant slides for research.

Use appropriate technologies to facilitate the digitization and retrieval of slides for research.

Develop robust data pipelines for ingesting, processing, and managing data for the Department of Pathology, ensuring scalability, reliability, and efficiency.

Implement quality assurance processes to validate the accuracy, completeness, and integrity of pathology data, identifying and resolving issues as needed.

Develop tools and APIs for accessing and retrieving pathology data in response to user requests, ensuring timely and efficient delivery of data to analysts, researchers, and clinicians.

Develop statistical or machine learning models to categorize disease, predict patient outcomes, identify biomarkers, and support other applications.

Evaluate model behavior and robustness and maximize model performance and generalizability.

Optimize data processing and retrieval workflows to minimize latency and maximize throughput, leveraging parallel processing, caching, and other optimization techniques.

Collaborate closely with pathologists, data scientists, software engineers, and other stakeholders to understand requirements, prioritize tasks, and ensure alignment with organizational goals.

Document workflows, procedures, and best practices.

Ensure compliance with relevant regulatory requirements (e.g., HIPAA).

Work Environment

Work is normally performed in a typical interior work environment that does not usually subject the employee to any unpleasant elements.

Required Qualifications

Education: BS and/or MS in a quantitative science-related field (e.g., data science, biomedical informatics, clinical informatics, machine learning, or biostatistics).

Work Experience:

Proven experience in data engineering, with a focus on managing and processing large-scale imaging data, preferably in the healthcare or life sciences domain.

Experience with machine learning techniques, ideally with published work and/or code available. Experience with deep learning frameworks (e.g., TensorFlow or PyTorch) is preferred.

Experience with programming and statistical software in Python and/or R.

Preferred Qualifications

Work Experience:

Experience with computational pathology, large vision models (CNNs, vision transformers), multi-GPU multi-node training, and federated learning is strongly encouraged.

Experience working with cloud platforms.

Experience with high performance computing.

Publication track record including conference papers and preprints.

Skills: Strong communication and presentation skills

This job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. It is only a summary of the typical functions of the job, not an exhaustive list of all possible job responsibilities, tasks, duties, and assignments. Furthermore, job duties, responsibilities and activities may change at any time with or without notice.