Joongmin Shin

Multimodal Senior Researcher, Human-Inspired AI Research | Founder & Lead at KUDoc

I research structure-aware Multimodal Reasoning: methods that recover document structure (layout, tables, figures, hierarchy, cross-page dependencies, and provenance) to turn complex documents into auditable evidence for reliable retrieval, grounded generation, and long-context reasoning. My first- and co-first-author work spans hierarchical retrieval, multimodal dependency parsing, structure-preserving chunking, and domain-specific RAG. My current manuscripts extend this agenda toward evidence auditing, multimodal QA evaluation, and evidence-centric memory for reliable multimodal agents.

I completed my B.S. in Computer Science and Engineering (advised by Prof. Jeongu Kim) and my M.S. in Artificial Intelligence (advised by Prof. Hyuk-Chul Kwon) at Pusan National University, where I began NLP research in Prof. Kwon's AI Research Lab in 2020. I then joined Korea University's Human-Inspired AI Research Group, co-advised by Prof. Jaehyung Seo and Prof. Heuiseok Lim, where I founded and now lead KUDoc and was promoted to Senior Researcher.

Together, the KUDoc team and I have produced 13 publications and have 13 manuscripts under review at top-tier venues. These include 4 first-author papers at ACL, CVPR, and EMNLP, along with 5 patents, 4 industry projects with 3 technology transfers, and 7 awards.

I am seeking Ph.D. opportunities with advisors working on multimodal reasoning, structured memory, world models, and planning for future agents. If these directions resonate with yours, please feel free to reach out via email or LinkedIn.

ACL 2026 | CVPR 2026 | EMNLP 2025/2024 | 13 Under Review | 5 Patents | 4 Pilots | 7 Awards

Google Scholar | CV | Email | LinkedIn

Latest News

Apr 2026
HiKEY was accepted at ACL 2026 Main, Oral (First Author).
Feb 2026
M3DocDep was accepted at CVPR 2026 Main (First Author).
Aug 2025
MultiDocFusion was accepted at EMNLP 2025 Main (First Author).
Aug 2025
Appointed to the National Representative K-AI Research Team (with NC AI).
Oct 2024
StyleDFS was accepted at EMNLP 2024 Industry (Co-First Author).

Research Areas

Structure-Aware Multimodal Reasoning

Retrieval- and generation-oriented representations that preserve layout, section hierarchy, table–figure relations, cross-page dependencies, and provenance.

HiKEY (ACL 2026), MultiDocFusion (EMNLP 2025)

Document Structure Recovery

LVLM-based parsing and dependency modeling for long, noisy, multi-page documents.

M3DocDep (CVPR 2026), StyleDFS (EMNLP 2024)

Auditable Evidence Reasoning

Claim-to-evidence linking, evidence coverage analysis, and support-sensitive QA evaluation.

Evidence Auditing (TACL, under review), Error Propagation (TPAMI, under review)

Impact Snapshot

4 First-Author Papers

Top-Tier Venues

Across ACL, CVPR, and EMNLP (two papers).

5 Patents

IP Portfolio

Translated research ideas into protected and deployable AI methods.

16 Projects

Project Portfolio

Across multimodal reasoning, foundation models, and applied AI collaborations.

7 Awards

Recognition

Recognized across research, industry collaboration, and applied AI competitions.

Selected Project Contributions

5 Projects

Multimodal Reasoning & Document AI

Structure-aware retrieval, multimodal RAG, and production-facing document workflows.

6 Projects

Foundation Models and Adaptation

K-AI Research, Ko-Gemma, KULLM, PLC Assistant, Mi:deum K 1.0, Exobrain WiseQA

Korean and multilingual foundation-model adaptation, data pipelines, post-training, and evaluation.

5 Projects

Additional Projects

TTS Prosody (KT), TTS Pronunciation (KT), Navigation TTS, AI Online Judge, KIGAM Mineral

Applied AI collaborations spanning Korean speech, education, and geoscience.