Job Description:
EDDC is a drug discovery organization committed to working with the SG ecosystem to translate scientific discoveries into life-changing therapeutics for patients. We aim to leverage state-of-the-art computational and data-driven approaches to uncover insights from large-scale, multidimensional data sources, and to tackle complex, unmet medical challenges in disease areas such as oncology, fibrosis, and infectious and autoimmune diseases.
We are seeking a highly motivated data engineer to develop and maintain a database for multidimensional data (e.g., high-throughput phenotypic and phenomic screens, and multi-omics). The successful candidate will work in a dynamic, interdisciplinary team to design, build, and maintain a scalable, FAIR-compliant database, and to develop a user interface that lets users search and query it. The candidate will also help develop and implement automated pipelines that use APIs across multiple applications to acquire, process, and transform both in-house generated data and externally acquired and collated data sources.
Responsibilities:
- Design, build, and maintain a scalable and FAIR-compliant database for large-scale and high-dimensional datasets
- Develop and implement workflows/pipelines for the acquisition, processing, and transformation of in-house generated and externally collated data sources, through the utilization of APIs across multiple applications
- Collaborate with bioinformaticians and data scientists to ensure high-quality data integration and analysis
- Develop and implement data quality control measures and data management protocols
- Implement and maintain data security and access control measures
- Develop a user interface to allow users to search and query the database
- Monitor and troubleshoot database performance issues
- Develop and maintain documentation for the database and related pipelines
- Debug and resolve technical problems that arise
- Recommend changes to existing infrastructure
- Train and mentor end-users to maximize value from the developed data engineering solutions
Job Requirements:
- Bachelor's or Master's degree in computer science, information systems, data science, or a related field
- 3+ years of experience in database administration, design, and engineering, preferably in an industry setting
- Proficiency in SQL and database management systems, such as MySQL, PostgreSQL, or Oracle
- Experience with cloud-based and on-premises database services
- Familiarity with data processing and storage frameworks, such as Hadoop and Spark
- Strong programming skills in Python or other languages commonly used in data engineering
- Experience developing user interfaces for databases or web applications
- Familiarity with data quality control and management protocols
- Strong organizational, communication, and collaboration skills
- Ability to work independently and as part of a team in a fast-paced, dynamic environment
- Willingness to learn new techniques and adapt to changing priorities
- Industry experience as a Data Engineer in pharmaceutical or biotech companies is an advantage
- Data analysis, UI development, and visualization experience with genomics, transcriptomics, and proteomics data is an advantage