Publications

My publications focus on three themes: transforming heterogeneous unstructured inputs into structured evidence, reliable long-context reasoning, and robustness diagnostics. Accepted and published work appears first; anonymous manuscripts under review and future-facing research directions are listed separately below.

Selected Publications

Top Conferences

HiKEY: Hierarchical Multimodal Retrieval for Open-Domain Document Question Answering

ACL 2026 (Main) · Mar 2026 · First Author · Accepted

Joongmin Shin*, Gyuho Shim, Jeongbae Park, Jaehyung Seo, Heuiseok Lim

Hierarchical retrieval for multimodal document QA with structured evidence assembly.

hierarchical retrieval · multimodal QA · evidence assembly

M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models

CVPR 2026 (Main) · Feb 2026 · First Author · Accepted

Joongmin Shin*, Jeongbae Park, Jaehyung Seo, Heuiseok Lim

LVLM-based dependency chunking that reconstructs cross-page structure for long-document retrieval and QA.

document structure · LVLM chunking · retrieval

MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents

EMNLP 2025 (Main) · Aug 2025 · First Author · Published

Joongmin Shin*, Chanjun Park, Jeongbae Park, Jaehyung Seo, Heuiseok Lim

A hierarchical multimodal chunking pipeline that preserves layout and improves evidence composition in industrial RAG.

hierarchical chunking · multimodal RAG · industrial

Intelligent Predictive Maintenance RAG framework for Power Plants: Enhancing QA with StyleDFS and Domain Specific Instruction Tuning

EMNLP 2024 (Industry Track) · Oct 2024 · Co-First Author · Accepted

Seongtae Hong*, Joongmin Shin*, Jaehyung Seo, Taemin Lee, Jeongbae Park, Heuiseok Lim

Domain-specific RAG framework for scientific and industrial QA; led to two technology transfers.

domain RAG · industrial QA · technology transfer

Journals

Distance Based Korean WordNet (alias. KorLex) Embedding Model

Applied Artificial Intelligence · Sep 2024 · Co-Author · Published

SeongReol Park*, Joongmin Shin, Sanghyun Cho, Hyuk-Chul Kwon, Jung-Hun Lee

Graph-aware lexical embedding model that injects structured knowledge into vector representations.

knowledge graph · embedding · Korean NLP

Multi-Paragraph Machine Reading Comprehension with Hybrid Reader over Tables and Text

Applied Artificial Intelligence · Jun 2024 · Co-Author · Published

Sanghyun Cho*, SeongReol Park, Hye-Lynn Kim, Jung-Hun Lee, Joongmin Shin, Hyuk-Chul Kwon

Hybrid reader model that jointly processes text and tables for multi-paragraph machine reading comprehension.

table QA · reading comprehension · hybrid model

Domestic Conferences & Theses

Search-Based Generation Techniques for Improving LLM Responses

KIICE 2023 · Oct 2023 · First Author · Published

Joongmin Shin*, Jungun Lee

Comparative analysis of zero-shot vs. RAG for GPT models, demonstrating benefits of evidence-grounded generation.

Comparative Analysis of Korean Quality in Large-Scale Language Models Based on Zero-Shot Learning

HCLT 2023 · Oct 2023 · Co-Author · Published

Yunah Huh, Aram So, Taemin Lee, Joongmin Shin, Heuiseok Lim

Investigated language-specific influences on LLM performance across Korean benchmarks.

QA Pair Passage RAG-based LLM Korean Chatbot Service

HCLT 2023 · Oct 2023 · First Author · Published

Joongmin Shin*, Jaewook Lee, Kyungmin Kim, Heuiseok Lim

QA-pair passage construction method for Korean RAG chatbots, reducing hallucination in domain-specific settings.

Neural Symbolic Models for Overcoming Deep Learning Limitations and Korean Dependency Parsing

Master's Thesis · Feb 2023 · First Author · Published

Joongmin Shin*

Neural-symbolic parser integrating linguistic constraints to overcome deep learning limitations in dependency parsing.

EDT5: Proposal of an Encoder-Decoder Structure Embedding Model for T5

KSC 2022 · Dec 2022 · First Author · Published

Joongmin Shin*, Joogyoung Jung, Junghoon Lee, Hyuk-Chul Kwon

Proposed encoder-decoder embedding architecture for T5, improving upon encoder-only approaches.

Evaluation of Korean Machine Reading Comprehension Generalization Performance

KSC 2022 · Dec 2022 · Co-Author · Published

Hyelin Kim*, Sanghyun Cho, Joongmin Shin, Hyuk-Chul Kwon

Identified domain generalization limitations in tabular MRC models through cross-validation analysis.

A Dependency Parsing Model Applying Enhanced Dominant-Dependent Constraint Rules

HCLT 2022 · Oct 2022 · First Author · Published

Joongmin Shin*, Hyuk-Chul Kwon

Expanded neural-symbolic constraint rules from 2 to 24, achieving state-of-the-art dependency parsing.

Rule-Based Dependency Parsing Using Artificial Neural Networks

KCC 2022 · Jun 2022 · First Author · Published

Joongmin Shin*, Sanghyun Cho, Bongwoo Nam, Hyuk-Chul Kwon

Transformer-augmented dependency parser with rule-based probability control for improved parsing accuracy.

Machine Reading Comprehension of Korean Using Continual Learning

HCLT 2021 · Oct 2021 · First Author · Published

Joongmin Shin*, Sanghyun Cho, Hyuk-Chul Kwon

Applied continual learning to Korean MRC, addressing catastrophic forgetting in sequential model training.

Manuscripts Under Review

Nine first-author and four co-author manuscripts are currently under peer review at top-tier venues; titles are withheld per double-blind submission policy.