Sungjoon Park

Bio

Sungjoon Park is an AI Research Scientist at the Bio Intelligence Lab within LG AI Research. He received his PhD in Computer Science and Engineering from Seoul National University in 2024. His research centers on the application of artificial intelligence, machine learning, and deep learning to biomedical challenges, including drug discovery, phenotype prediction, biomarker identification, and genetic data analysis. He is currently investigating predictive modeling for Alzheimer’s Disease as well as antibody binding affinity prediction to support antibody design. By integrating multi-omics datasets, molecular characteristics, and clinical health records, his work aims to advance computational methods that accelerate biomedical research while reducing cost and development time. His broader goal is to bridge the gap between AI innovation and translational healthcare applications, contributing to the development of data-driven strategies in precision medicine.

In addition to his expertise in artificial intelligence, Sungjoon has experience with cloud computing infrastructure, Linux server maintenance, and full-stack web development. These experiences outside of research have equipped him with strong communication and cooperation skills and an understanding of the needs of technicians, designers, promoters, and decision-makers.

Work Experience

  • Research Scientist

    Bio Intelligence Lab, LG AI Research
    Jul 2024–Present
    • Antibody-antigen binding affinity prediction

      Development of an antibody-antigen binding energy prediction model based on Protein Data Bank (PDB) structural and sequence data to facilitate antibody design

    • Alzheimer's Disease biomarker discovery

      Phenotype prediction model for Alzheimer’s Disease utilizing AD-BXD mouse models with human multi-omics data from The Religious Orders Study and Memory and Aging Project (ROSMAP) and Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohorts

  • Postdoctoral Researcher

    Bioinformatics Institute, Seoul National University
    May–Jun 2024
    • Drug side effect frequency prediction

      Prediction of drug side-effect frequency by mapping drugs and side effects onto a common embedding space using deep learning and ensemble methods

    • Intratumoral heterogeneity and pharmacogenomics

      Linking a large in vivo clinical database of The Cancer Genome Atlas (TCGA) to in vitro experiments from Cancer Cell Line Encyclopedia (CCLE) using matrix factorization on the cloud system to recommend personalized medicine

    • Course teaching assistant

      Lecturing on Python basics, dynamic programming and sequence alignment, classification, regression, and dimensionality reduction methods (M1429.000100)

  • Program Committee Member

    • The 5th International Workshop on Big Data & AI Tools, Models, and Use Cases for Innovative Scientific Discovery (BTSD) 2024

Education

  • Doctor of Philosophy

    in Computer Science and Engineering
    Feb 2024
  • Bachelor of Science

    in Computer Science and Engineering
    Aug 2017
    • Institution : Department of Computer Science and Engineering, Seoul National University

    • Major : Computer Science and Engineering

Publications

Journal Articles

  1. S Park, S Lee, M Pak, and S Kim. "Dual representation learning for predicting drug-side effect frequency using protein target information." Journal of Biomedical and Health Informatics (2024). doi:10.1109/jbhi.2024.3350083
  2. JK Yoon, S Park, KH Lee, D Jeong, J Woo, J Park, SM Yi, D Han, CG Yoo, S Kim, and CH Lee. "Machine Learning-Based Proteomics Reveals Ferroptosis in COPD Patient-Derived Airway Epithelial Cells Upon Smoking Exposure." Journal of Korean Medical Science 38.29 (2023). doi:10.3346/jkms.2023.38.e220
  3. S Park, D Lee, Y Kim, S Lim, H Chae, and S Kim. "BioVLAB-Cancer-Pharmacogenomics: tumor heterogeneity and pharmacogenomics analysis of multi-omics data from tumor on the cloud." Bioinformatics 38.1 (2022): 275-277. doi:10.1093/bioinformatics/btab478
  4. S Lim, Y Lu, CY Cho, Y Kim, S Park, and S Kim. "A review on compound-protein interaction prediction methods: data, format, representation and model." Computational and Structural Biotechnology Journal 19 (2021): 1541-1556. doi:10.1016/j.csbj.2021.03.004
  5. M Oh, S Park, S Kim, and H Chae. "Machine learning-based analysis of multi-omics data on the cloud for investigating gene regulations." Briefings in bioinformatics 22.1 (2021): 66-76. doi:10.1093/bib/bbaa032
  6. M Oh, S Park, S Lee, D Lee, S Lim, D Jeong, K Jo, I Jung, and S Kim. "DRIM: a web-based system for investigating drug response at the molecular level by condition-specific multi-omics data integration." Frontiers in Genetics 11 (2020): 564792. doi:10.3389/fgene.2020.564792
  7. S Park, M Kim, S Seo, S Hong, K Han, K Lee, JH Cheon, and S Kim. "A secure SNP panel scheme using homomorphically encrypted K-mers without SNP calling on the user side." BMC genomics 20 (2019): 163-174. doi:10.1186/s12864-019-5473-z

Conference Proceedings

  1. J Lee, S Park, K Lee, S Yim, D Hwang, D Kim, K Yoo, S Lee, and K Kim. "Structural MRI–Informed Multimodal Fusion for Robust Alzheimer’s Disease Prediction." 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop: The 3rd Workshop on Imageomics: Discovering Biological Knowledge from Images Using AI.
  2. S Yim, K Lee, D Kim, S Park, D Hwang, S Lee, A Dunn, D Gatti, E Chesler, K O'Connell, and K Kim. "Cell-Type-Aware Pooling for Robust Sample Classification in Single-Cell RNA-seq Data." Proceedings of the ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences.
  3. S Park, K Lee, S Yim, D Hwang, D Kim, S Lee, A Dunn, D Gatti, E Chesler, K O'Connell, and K Kim. "Robust Multi-Omics Integration from Incomplete Modalities Significantly Improves Prediction of Alzheimer's Disease." Proceedings of the ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences.
  4. TH Kwon, B Koo, S Park, T Southiratn, and S Kim. "Web-based Exploratory Data Mining System for Analyzing the Gene-level Relationship between Intratumoral Heterogeneity of Promoter DNA Methylation and Drug Response." Proceedings of the Korean Institute of Information Scientists and Engineers (KIISE) Conference (2024): 363-365. dbpia.co.kr
  5. M Pak, D Jeong, S Park, J Gu, S Lee, and S Kim. "ALPACA: A Visual Data Mining System for Subcellular Location-specific Knowledge Mining from Multi-Omics Data in Cancer." The 20th Asia Pacific Bioinformatics Conference (APBC 2020); (Publication in BMC Bioinformatics pending).

Abstracts

  1. J Lee, S Park, K Lee, S Yim, D Hwang, D Kim, K Yoo, S Lee, and K Kim. "Expanding Patient Coverage in Alzheimer's Disease Studies through Robust Multimodal Data Integration." (Publication in Journal of Prevention of Alzheimer’s Disease pending).
  2. K Kim, S Park, S Yim, D Kim, D Hwang, K Lee, D Gatti, A Dunn, E Chesler, K O'Connell, and S Lee. "Investigating Hyperactivity in Alzheimer's Disease: Genetic and Dietary Influences in AD-BXD Mice." (Publication in Alzheimer's & Dementia pending).
  3. S Yim, K Lee, D Kim, S Park, D Hwang, K Kim, A Dunn, D Gatti, E Chesler, K O'Connell, and S Lee. "Cell Type-Aware Multiple Instance Learning Improves Alzheimer’s Disease Prediction from snRNA-seq." (Publication in Alzheimer's & Dementia pending).
  4. S Park, K Lee, S Yim, D Hwang, K Kim, A Dunn, D Gatti, E Chesler, and K O'Connell. "Incomplete multi-modal learning of omics data for phenotype prediction and biomarker discovery." (Publication in Alzheimer's & Dementia pending).
  5. D Hwang, K Lee, S Park, S Yim, K Kim, A Dunn, D Gatti, E Chesler, and K O'Connell. "Integrating SNP Dimensionality Reduction and Bootstrapped k-NN Imputation for Cognitive Function Prediction in AD-BXD Mice." (Publication in Alzheimer's & Dementia pending).

Preprints

  1. Y Lu, S Lim, S Park, MG Choi, C Cho, S Kang, and S Kim. "EnsDTI-kinase: Web-server for Predicting Kinase-Inhibitor Interactions with Ensemble Computational Methods and Its Applications." bioRxiv (2023): 2023-01. doi:10.1101/2023.01.06.523052

Presentations

  1. 18th Clinical Trials on Alzheimer's Disease (CTAD 2025)

    "Expanding Patient Coverage in Alzheimer's Disease Studies through Robust Multimodal Data Integration." Dec 1–2, 2025. San Diego, CA, United States.

  2. Alzheimer’s Association International Conference (AAIC) 2025

    "Incomplete multi-modal learning of omics data for phenotype prediction and biomarker discovery." Jul 27–31, 2025. Virtual.

  3. ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences

    "Robust Multi-Omics Integration from Incomplete Modalities Significantly Improves Prediction of Alzheimer's Disease." Jul 19, 2025. Virtual.

  4. 2023 SNU Artificial Intelligence Institute Retreat

    "Dual representation learning for predicting drug-side effect frequency using protein target information." May 19, 2023. Seoul, Korea.

  5. AI for Drug Discovery Symposium, MOGAM Institute of Biomedical Research

    "BioVLAB-Cancer-Pharmacogenomics: Tumor heterogeneity and pharmacogenomics analysis of multi-omics data from tumor on the cloud." Jun 27, 2022. Seoul, Korea.

  6. The 6th SNU Bioinformatics Research Exchange Conference

    "Multi-omics integrative analysis pipelines of cancer pharmacogenomics." Feb 16, 2022. Virtual.

  7. ICGC ARGO 17th Scientific Workshop / 4th ARGO Meeting

    "BioVLAB-Cancer-Pharmacogenomics: Tumor heterogeneity and pharmacogenomics analysis of multi-omics data from tumor on the cloud." May 14–15, 2021. Virtual.

  8. The 17th Asia Pacific Bioinformatics Conference (APBC 2019)

    "A secure SNP panel scheme using homomorphically encrypted K-mers without SNP calling on the user side." Jan 14, 2019. Wuhan, China.

Toy Projects

  • Parade

    Implemented and deployed Naoki Homma's board game "Parade", accommodating 2-6 players per game. A round typically lasts 10-15 minutes. Currently, the interface and tutorials are available only in Korean.