Resources
HLA/C4 imputation reference panel (Annals of the Rheumatic Diseases 2025; recommended)
Access the Korean Imputation Server to use the HLA/C4 imputation reference panel:
- A novel MHC imputation reference panel was designed to overcome long-standing technical challenges in analyzing C4 CNVs. The genetic complexity of the C4 locus, characterized by its multi-allelic nature and linkage disequilibrium with other MHC variants (e.g., HLA classical alleles), has historically hindered high-throughput genotyping and robust association studies.
- By leveraging whole-genome sequencing data from large Korean cohorts (n ≈ 4,000), we devised a method that integrates haplotype clustering, deep learning models, and iterative optimization to infer haploid copy numbers of C4A, C4B, and HERV. The panel’s performance was assessed by leave-one-out cross-validation, whole-genome sequencing, and droplet digital PCR, and it consistently achieved high accuracy for both HLA variants and C4-related CNVs.
- This approach significantly enhances the accuracy of C4 imputation and holds promise for broader applications in complex CNV genotyping across diverse populations. We have made this imputation panel publicly available via an imputation server (https://coda.nih.go.kr/usab/kis/intro.do), providing researchers with a valuable tool to better understand causal variants in the MHC region, particularly for inflammatory diseases in East Asian populations.
Citation: Development of an MHC imputation panel highlights independent contributions of HLA amino acid residues and C4 copy number variations to SLE risk
Chae-Yeon Yu^, Dong Mun Shin^, Sung Min Kim, Yui Taek Lee, Sungwon Jeon, Sehwan Chun, So-Young Bang, Hye-Soon Lee, Xianyong Yin, Yong Cui, Xuejun Zhang, Jong Bhak, Soon Ji Yoo, Young Jin Kim*, Bong-Jo Kim*, Sang-Cheol Bae*, Kwangwoo Kim*.
Annals of the Rheumatic Diseases. in press. doi: 10.1016/j.ard.2025.06.2121. [July, 2025]
RA GWAS association summary statistics (Annals of the Rheumatic Diseases 2021)
The association summary statistics were generated from a genome-wide association study of rheumatoid arthritis in large Korean, Japanese, and European cohorts (311,292 individuals). See our research article for details.
Download the RA GWAS association summary statistics:
via DRYAD
Citation: Large-scale meta-analysis across East Asian and European populations updated genetic architecture and variant-driven biology of rheumatoid arthritis, identifying 11 novel susceptibility loci
Eunji Ha, Sang-Cheol Bae*, Kwangwoo Kim*.
Annals of the Rheumatic Diseases, 80(5):558-565. doi: 10.1136/annrheumdis-2020-219065. [May 2021]
RA CD4 multi-omics summary statistics (Annals of the Rheumatic Diseases 2021)
We profiled genome-wide variants, gene expression, and DNA methylation in CD4+ T cells from 82 RA patients and 40 healthy controls using high-throughput technologies. We identified differentially expressed genes (DEGs) and differentially methylated regions (DMRs) in RA, and mapped quantitative trait loci for expression (eQTLs) and methylation (meQTLs) by linear regression. See our research article for details.
Download the RA CD4 multi-omics summary statistics:
via DRYAD
Citation: Genetic variants shape rheumatoid arthritis-specific transcriptomic features in CD4+ T cells through differential DNA methylation, explaining a substantial proportion of heritability
Eunji Ha^, So-Young Bang^, Jiwoo Lim, Jun Ho Yun, Jeong-Min Kim, Jae-Bum Bae, Hye-Soon Lee, Bong-Jo Kim*, Kwangwoo Kim*, Sang-Cheol Bae*.
Annals of the Rheumatic Diseases. 80(7):876-883. doi: 10.1136/annrheumdis-2020-219152. [July 2021] [PDF]
[resource] DEG/DMR/QTL summary statistics: https://doi.org/10.5061/dryad.w0vt4b8pw
DISH: Direct Imputing Summary association statistics of HLA variants (Scientific Reports 2019)
Download the DISH (Direct Imputing Summary association statistics of HLA variants):
via GitHub
Usage:
Rscript DISH.r input_file input_type(T/P) hg_version(hg18/hg19) ethnicity(EUR/ASN) MAF_threshold stat_type(Z/T) output (lambda)
Rscript DISH.r EUR_sample.txt T hg18 EUR 0.005 T EUR_imputed.txt (lambda)
or
Rscript DISH.r EUR_sample P hg18 EUR 0.005 T EUR_imputed.txt (lambda)
---------------------------------- REQUIRED ----------------------------------
[input_file] must be a tab-delimited file. If [input_type] is T, include the file extension in [input_file]; if [input_type] is P, omit it.
[input_type] must be "T" or "P". A T-type input is a single file that contains both SNP information and statistics. A P-type input uses both PLINK's .frq file and a user-defined file (containing SNP positions and statistics).
[hg_version] must be "hg18" or "hg19".
[ethnicity] must be "EUR" or "ASN" (EUR: European ancestry; ASN: Asian ancestry).
[MAF_threshold] must be a numeric value >0 and <0.5.
[stat_type] must be "Z" or "T".
[output] is an output file prefix.
---------------------------------- OPTIONAL ----------------------------------
[lambda] must be a numeric value. For details, see our manuscript and related papers.
Citation: Understanding HLA associations from SNP summary association statistics
Jiwoo Lim, Kwangwoo Kim*.
Scientific Reports. 9(1):1337. doi: 10.1038/s41598-018-37840-9. [February 2019]
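The argument rules listed for DISH can be checked before launching a run. The following is a minimal shell sketch, not part of the DISH distribution: the function name `check_dish_args` is hypothetical, it covers only the enumerated values and the MAF range, and it does not verify the file-extension rule for T/P inputs or the optional lambda.

```shell
# Hypothetical pre-flight check for the seven required DISH arguments.
# Prints "ok" and returns 0 when the arguments satisfy the documented rules;
# otherwise prints a message to stderr and returns 1.
check_dish_args() {
  if [ "$#" -lt 7 ]; then
    echo "error: expected 7 required arguments, got $#" >&2
    return 1
  fi
  input_type=$2; hg_version=$3; ethnicity=$4; maf=$5; stat_type=$6

  case "$input_type" in
    T|P) ;;
    *) echo "error: input_type must be T or P" >&2; return 1 ;;
  esac
  case "$hg_version" in
    hg18|hg19) ;;
    *) echo "error: hg_version must be hg18 or hg19" >&2; return 1 ;;
  esac
  case "$ethnicity" in
    EUR|ASN) ;;
    *) echo "error: ethnicity must be EUR or ASN" >&2; return 1 ;;
  esac
  case "$stat_type" in
    Z|T) ;;
    *) echo "error: stat_type must be Z or T" >&2; return 1 ;;
  esac

  # MAF threshold: plain decimal number, strictly between 0 and 0.5.
  if ! printf '%s\n' "$maf" | grep -Eq '^[0-9]*\.?[0-9]+$'; then
    echo "error: MAF_threshold must be numeric" >&2
    return 1
  fi
  if ! awk -v m="$maf" 'BEGIN { exit !(m > 0 && m < 0.5) }'; then
    echo "error: MAF_threshold must be >0 and <0.5" >&2
    return 1
  fi
  echo "ok"
}
```

One possible use is to guard the real call, for example `check_dish_args EUR_sample.txt T hg18 EUR 0.005 T EUR_imputed.txt && Rscript DISH.r EUR_sample.txt T hg18 EUR 0.005 T EUR_imputed.txt`, so that a mistyped genome build or ethnicity code is caught before R starts.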
SLE ImmunoChip association summary statistics (Nature Genetics 2016)
Download the SLE genetic association summary statistics:
Download XLSX
Citation: High-density genotyping of immune-related loci identifies new SLE risk variants in individuals with Asian ancestry
Celi Sun^, Julio E. Molineros^, Loren L. Looger^, Xu-jie Zhou^, Kwangwoo Kim^, Yukinori Okada, Jianyang Ma, Yuan-yuan Qi, Xana Kim-Howard, Prasenjeet Motghare, Krishna Bhattarai, Adam Adler, So-Young Bang, Hye-Soon Lee, Tae-Hwan Kim, Young Mo Kang, Chang-Hee Suh, Won Tae Chung, Yong-Beom Park, Jung-Yoon Choe, Seung Cheol Shim, Yuta Kochi, Akari Suzuki, Michiaki Kubo, Takayuki Sumida, Kazuhiko Yamamoto, Shin-Seok Lee, Young Jin Kim, Bok-Ghee Han, Mikhail Dozmorov, Kenneth M. Kaufman, Jonathan D. Wren, John B. Harley, Nan Shen, Kek Heng Chua, Hong Zhang, Sang-Cheol Bae*, Swapan K. Nath*.
Nature Genetics. 48(3):323-30. doi: 10.1038/ng.3496. [March 2016]
HLA imputation reference panel V1.1 (PLoS ONE 2016; not recommended for most users)
Download the Korean HLA Reference Panel V1.1:
Download Panel (ZIP)
Readme.txt
###########################################################
# Korean Reference Panel v1.1 for imputing HLA variants
# Contact: Kwangwoo Kim (kkim@khu.ac.kr); Sang-Cheol Bae (scbae@hanyang.ac.kr)
# Revised on Aug 23, 2025
###########################################################
- Thank you for downloading the Korean reference panel.
- New features: The Korean HLA reference panel v1.0 included haplotype-level data of 2- and 4-digit classical alleles and amino acid residues of 6 HLA genes: HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DPB1, and HLA-DQB1, from 413 unrelated Korean individuals. For v1.1, we additionally merged the data for copy number, classical allele, and amino-acid residue of HLA-DRB3, HLA-DRB4, and HLA-DRB5 of the same 413 Korean subjects with the existing data in the previous HLA reference panel.
- The three alleles – HLA_DRB3_9999, HLA_DRB4_9999 and HLA_DRB5_9999 – indicate gene deletion of HLA-DRB3, HLA-DRB4 and HLA-DRB5, respectively.
- Similarly, the amino acid alleles – AA_DRB3_*_*_Z, AA_DRB4_*_*_Z and AA_DRB5_*_*_Z – indicate absence of the residue due to gene copy deletion.
- Gene positions:
- HLA-DRB5 = 32599737 (CDS from 32605979 to 32593494, NM_002125)
- HLA-DRB4 = 32619838 (arbitrarily defined; no location info on the reference genome)
- HLA-DRB3 = 32639940 (arbitrarily defined; no location info on the reference genome)
- Advantage: v1.1 includes copy number and variant information for HLA-DRB3, -DRB4, and -DRB5. However, overall imputation accuracy is slightly reduced, so unless you specifically need DRB3/DRB4/DRB5 data, we recommend HLA imputation panel v1.0 for most users.
- Citation:
- Kwangwoo Kim, So-Young Bang, Hye-Soon Lee, Sang-Cheol Bae. Construction and Application of a Korean Reference Panel for Imputing Classical Alleles and Amino Acids of Human Leukocyte Antigen Genes. PLoS ONE. 9(11), e112546. doi: 10.1371/journal.pone.0112546. [November 2014]
- Kwangwoo Kim, So-Young Bang, Dae Hyun Yoo, Soo-Kyung Cho, Chan-Bum Choi, Yoon-Kyoung Sung, Tae-Hwan Kim, Jae-Bum Jun, Young Mo Kang, Chang-Hee Suh, Seung-Cheol Shim, Shin-Seok Lee, Jisoo Lee, Won Tae Chung, Seong-Kyu Kim, Jung-Yoon Choe, Swapan K. Nath, Hye-Soon Lee, Sang-Cheol Bae. Imputing Variants in HLA-DR Beta Genes Reveals that HLA-DRB1 is Solely Associated with Rheumatoid Arthritis and Systemic Lupus Erythematosus. PLoS ONE. 11(2), e0150283. doi: 10.1371/journal.pone.0150283. [February 2016]
- Usage:
./SNP2HLA.csh DATA (.bed/.bim/.fam) REFERENCE (.bgl.phased/.markers) OUTPUT plink {optional: max_memory[mb] window_size}
Example:
./SNP2HLA.csh /path_to/KOR_REF_1.1/HapMap3_CHB_JPT/hapmap3_r2_b36_chr6.MHC._plink.id61 /path_to/KOR_REF_1.1/KOR_REF/Kim_KOR_HLA_v1.1 OUTPUT/imputed plink
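The v1.1 Readme encodes DRB3/DRB4/DRB5 gene deletions through marker names (HLA_DRBn_9999 and AA_DRBn_*_*_Z). As a rough sketch, these markers can be pulled out of a list of marker IDs with a pattern match. The helper name `list_drb_deletion_markers` is hypothetical, and the pattern assumes that each `*` in the Readme stands for exactly one underscore-free field.

```shell
# Hypothetical helper: read one marker ID per line on stdin and print only
# the IDs that, per the v1.1 Readme, denote a DRB3/DRB4/DRB5 deletion.
# Assumption: each "*" in AA_DRBn_*_*_Z is a single field without "_".
list_drb_deletion_markers() {
  grep -E '^(HLA_DRB[345]_9999|AA_DRB[345]_[^_]+_[^_]+_Z)$'
}
```

For a PLINK .bim file, where the second column holds the marker ID, one way to apply it would be `awk '{print $2}' OUTPUT/imputed.bim | list_drb_deletion_markers`; the file name here simply follows the output prefix used in the example command and is an assumption about what a given run produces.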
HLA imputation reference panel V1.0 (PLoS ONE 2014; recommended)
Download the Korean HLA Reference Panel V1.0:
Download Panel (ZIP)
Readme.txt
###########################################################
# Korean Reference Panel v1.0 for imputing HLA variants
# Contact: Kwangwoo Kim (kkim@khu.ac.kr); Sang-Cheol Bae (scbae@hanyang.ac.kr)
###########################################################
- Thank you for downloading the Korean imputation reference panel.
- Our HLA reference panel is designed for use with SNP2HLA. Please visit http://www.broadinstitute.org/mpg/snp2hla/ to learn about SNP2HLA.
- A total of 413 unrelated Korean subjects were analyzed for MHC SNPs within the extended MHC locus and classical alleles of six HLA genes: HLA-A, -B, -C, -DRB1, -DPB1, and -DQB1. The HLA reference panel was constructed by phasing 5,858 MHC SNPs, 233 classical HLA alleles, and 1,387 amino acid residue markers from 1,025 amino acid positions as binary variables.
- The Korean HLA reference panel is well suited to genome-wide array data from East Asian populations, including Han Chinese, Japanese, and Korean populations.
- Citation: Kwangwoo Kim, So-Young Bang, Hye-Soon Lee, Sang-Cheol Bae. (2014). Construction and Application of a Korean Reference Panel for Imputing Classical Alleles and Amino Acids of Human Leukocyte Antigen Genes. PLoS ONE. 9(11), e112546. doi: 10.1371/journal.pone.0112546. [November 2014]
- Usage:
./SNP2HLA.csh DATA (.bed/.bim/.fam) REFERENCE (.bgl.phased/.markers) OUTPUT plink {optional: max_memory[mb] window_size}
Example:
./SNP2HLA.csh /path_to/KOR_REF_1.0/HapMap3_CHB_JPT/hapmap3_r2_b36_chr6.MHC._plink.id61 /path_to/KOR_REF_1.0/KOR_REF/Kim_KOR_HLA OUTPUT/imputed plink
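The usage line names the file sets that the DATA and REFERENCE prefixes must resolve to (.bed/.bim/.fam and .bgl.phased/.markers). A small pre-flight check along these lines can catch a wrong prefix before a run starts. This is only a sketch: `check_snp2hla_inputs` is a hypothetical helper, not part of SNP2HLA, and it tests only that the files exist.

```shell
# Hypothetical pre-flight check: confirm that the DATA and REFERENCE
# prefixes point at the file sets named in the SNP2HLA usage line.
# Prints one "missing: <path>" line per absent file; returns 1 if any
# file is missing and 0 otherwise.
check_snp2hla_inputs() {
  data=$1
  ref=$2
  missing=0
  for f in "$data.bed" "$data.bim" "$data.fam" "$ref.bgl.phased" "$ref.markers"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}
```

With the v1.0 example paths, this could be chained as `check_snp2hla_inputs /path_to/KOR_REF_1.0/HapMap3_CHB_JPT/hapmap3_r2_b36_chr6.MHC._plink.id61 /path_to/KOR_REF_1.0/KOR_REF/Kim_KOR_HLA && ./SNP2HLA.csh ...`, with the remaining arguments as in the example above.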