The Institute of Electronics and Information Engineers (대한전자공학회)

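Academic Events: Conference & Workshop

2026 Image Processing Research Group + Virtual Convergence Research Group Summer School
Ewha Womans University, Hakgwan Building, Room 251 / 2026-08-19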

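Event Information

Research group event of the Institute of Electronics and Information Engineers (IEIE)

2026 Image Processing Research Group + Virtual Convergence Research Group Summer School
Wednesday, August 19, 2026 / Ewha Womans University, Hakgwan Building, Room 251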

* Pre-registration period: until 18:00 on Friday, August 14, 2026

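Invitation

AI technology has recently been advancing rapidly beyond understanding and generating text and images, toward comprehensively modeling video, 3D space, the physical world, and human behavior. In particular, progress in multimodal large language models (LLMs), video LLMs, world models, agentic AI, dynamic 4D world reconstruction, and video generation shows that AI is expanding from a tool for simple recognition and generation into an intelligence that understands complex environments, predicts the future, and interacts with the real world.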

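This training program looks at hallucination mitigation through an understanding of multimodal large language models, social world models based on crowd behavior modeling, and the direction of on-device multimodal agentic AI systems. It also covers next-generation video generation and dynamic 4D world reconstruction and generation, with the aim of broadly discussing how world model research can extend to applications such as autonomous driving, digital twins, virtual convergence, and Physical AI.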

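In addition, alongside Computational Photography and 3D Computer Vision, fields that have traditionally drawn steady interest, a special lecture introducing the latest technology trends in industry has been prepared. We hope this will be a rewarding time not only for graduate students but also for undergraduates considering further study or a career in these fields, and for industry researchers.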

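We hope this program gives graduate students, researchers, and industry professionals a meaningful opportunity to understand the latest technology trends systematically and to explore new research directions. We sincerely thank the speakers who agreed to give these valuable lectures despite their busy schedules, and everyone who takes part with interest.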

강제원, Chair, Image Processing Research Group
김학구, Virtual Convergence Research Group
박인규, President, AI Signal Processing Society

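Summer School Overview

o Event: 2026 Image Processing Research Group + Virtual Convergence Research Group Summer School
o Date and time: Wednesday, August 19, 2026, 09:00 ~ 18:00

o Venue: Ewha Womans University, Hakgwan Building, Room 251

o Hosted by: IEIE AI Signal Processing Society, Image Processing Research Group, Virtual Convergence Research Group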
-----------------------------------------------------------------------------------------
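2026 Image Processing Research Group + Virtual Convergence Research Group Summer School Steering Committee
o Organizing chairs: 강제원 (Ewha Womans University), Prof. 김학구 (Chung-Ang University)
o Program chair: Prof. 오지형 (Chung-Ang University)
o Organizing committee: Prof. 장부루 (Korea University), Prof. 배인환 (DGIST), Prof. 최성하 (Kyung Hee University), Prof. 어영정 (Yonsei University),
Dr. 김태경 (NAVER AI), Prof. 공경보 (Pusan National University), Dr. 김진화 (NAVER AI)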

Summer School Program (Detailed Schedule)

[ Wednesday, August 19, 2026. Ewha Womans University, Hakgwan Building (Room 251) ]

Time	Program	Speaker
09:20 – 10:00 (40’)	Registration	Registration desk
09:50 – 10:00 (10’)	Welcome and opening remarks	박인규, President, AI Signal Processing Society
Session 1
10:00 – 11:00 (60’)	Mitigating Hallucinations in Multimodal Large Language Models	Prof. 장부루 (Korea University)
11:00 – 12:00 (60’)	Toward Social World Models via Crowd Behavior Modeling	Prof. 배인환 (DGIST)
12:00 – 13:00 (60’)	Lunch	Designated venue
Session 2
13:00 – 14:00 (60’)	Toward On-Device Multimodal Agentic AI with Efficient Small Language Models	Prof. 최성하 (Kyung Hee University)
14:00 – 15:00 (60’)	From Diffusion Models to World Models: Principles and Trends in Video Generation	Prof. 어영정 (Yonsei University)
15:00 – 15:15 (15’)	Coffee Break	-
Session 3
15:15 – 16:15 (60’)	Tracing Information Flow of Modalities: Mechanistic Interpretability of LLMs and VLMs	Dr. 김태경 (NAVER AI Lab)
16:15 – 17:15 (60’)	Dynamic 4D World Reconstruction and Interactive Generation	Prof. 공경보 (Pusan National University)
17:15 – 18:15 (60’)	World Models Meet Real-World Space: Concepts, Trends, and the Seoul World Model	Dr. 김진화 (NAVER AI Lab)
18:15 – 18:25 (15’)	Prize drawing and closing	강제원, Chair, Image Processing Research Group (Ewha Womans University)

* The program is subject to partial change depending on the organizers' circumstances.

Summer School Registration

Registration Fees

Category	Student	General
Pre-registration	KRW 200,000	KRW 300,000
On-site registration	KRW 250,000	KRW 350,000

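o Registration period: until 18:00 on Friday, August 14, 2026

o Click Pre-registration below, then enter your pre-registration details and pay the registration fee.
o Once pre-registration (payment) is complete, you can print the credit card slip and transaction statement. (The certificate of participation can be printed online after the event ends.)

o If you cannot pay by card, please pay by bank transfer to the account below.
After the transfer, please email your deposit details and the information needed to issue an invoice (business registration certificate).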

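- Deposit account (Image Processing Research Group):
Suhyup Bank 1010-2143-7815 (account holder: (사)대한전자공학회)

o Copy of the IEIE business registration certificate (2026) (click to download, PDF)
o Copy of the Suhyup Bank passbook for registration fee deposits (Image Processing Research Group) (click to download, PDF)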

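Receipt and Invoice Issuance

Payment method	Card receipt	Invoice (electronic)	Transaction statement
Card payment	Available (printable online)	Not available (would be a duplicate issuance)	Issued by default
Bank transfer or direct deposit	Not available	Available (electronic, issued by email)	Issued by default

o Please request invoices online. Invoices cannot be issued for card payments.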

Contact

o Contact person: 배기동, General Manager, IEIE
o Phone: 02-553-0255 (ext. 4) / Email: biz@theieie.org


Image Processing Research Group + Virtual Convergence Research Group Summer School: Talk Summaries

Speaker	Talk Summary
Prof. 장부루 (Korea University)	Mitigating Hallucinations in Multimodal Large Language Models. With the emergence of Multimodal Large Language Models (MLLMs) that can process visual information alongside text, a wide range of applications integrating language and vision has become possible. However, these models still suffer from hallucinations, where generated responses are inconsistent with the actual visual content. This talk examines the characteristics of hallucinations in multimodal settings and reviews representative approaches proposed to mitigate them. It also discusses our ongoing research that extends these ideas to Video LLMs, addressing the challenges of reducing hallucinations in scenarios involving temporal visual information.
Prof. 배인환 (DGIST)	Toward Social World Models via Crowd Behavior Modeling. Modeling crowd behaviors is essential for building social world models that can understand and simulate human dynamics in real and virtual environments, including autonomous driving, virtual/augmented reality, digital twins, and crowd safety analysis. However, this remains challenging due to the diversity and uncertainty of human motion. This talk introduces a data-driven pipeline for crowd behavior modeling from three perspectives: representation, understanding, and generation. The first part discusses efficient representations for capturing interactions among pedestrians, surrounding objects, and environments, including group-level and motion-pattern-based approaches that reduce spatial and temporal complexity. The second part presents probabilistic and generative methods for understanding realistic crowd dynamics and predicting feasible future trajectories under uncertainty. The final part introduces techniques for generating diverse crowd behaviors in simulation spaces using learnable crowd emitters and purposive sampling strategies. Overall, this talk discusses how these approaches can contribute to realistic, scalable, and controllable social world models for virtual convergence applications.
Prof. 최성하 (Kyung Hee University)	Toward On-Device Multimodal Agentic AI with Efficient Small Language Models. Recent advances in agentic AI open up the possibility of always-on personalized agents that understand continuous multimodal streams on everyday devices. Realizing this vision requires efficient and private systems that can process personal data locally. However, current agentic pipelines often rely on repeated large language model (LLM) calls, leading to high cost and privacy risks. Small language models (SLMs) offer a practical alternative, but simply replacing LLMs with SLMs can cause significant performance gaps. In this talk, I present our recent efforts to bridge this gap through efficient structured reasoning, concept-aware fine-tuning, universal multimodal retrieval, and facility location-based visual token compression for long video understanding. Together, these works move toward practical, privacy-preserving, on-device multimodal agentic AI.
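Prof. 어영정 (Yonsei University)	From Diffusion Models to World Models: Principles and Trends in Video Generation. This talk surveys recent developments in video generation, centered on diffusion and flow matching based models. It first explains the foundations of early video models that grew out of image generation models and of more recent video models, then covers the various ways of describing the video to be generated. Finally, it introduces the latest research trends, up to interactive world models and cosmos3.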
Dr. 김태경 (NAVER AI Lab)	Tracing Information Flow of Modalities: Mechanistic Interpretability of LLMs and VLMs. Large-scale foundation models increasingly operate across language, vision, and video, yet our understanding of them is largely confined to their external behavior. Prediction results offer limited insight into the internal computations, such as how information is represented and propagated within a model. Mechanistic interpretability seeks to close this gap by scoping internal computation, offering a principled basis for explaining model behavior. In this presentation, we trace how information flows within modalities through three recent studies. We first map the layer-wise information flow that enables VideoLLMs to answer questions about video. We then show that scene text in VLMs forms a representational modality of its own, distinct from both visual objects and prompt text. Finally, we analyze how distinct reasoning operations are geometrically organized in the hidden representations of LLMs. These studies show that understanding internal computation is key to building reliable and controllable models.
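Prof. 공경보 (Pusan National University)	Dynamic 4D World Reconstruction and Interactive Generation. This talk introduces recent developments in 4D scene reconstruction and generation for building dynamic real-world scenes. It first looks at methods that recover spatial structure and motion over time from multi-view video, then covers approaches that generate new motion from a single image or sparse inputs. It also introduces the potential for extending this work to interactive 4D worlds that respond to environmental changes and physical interaction.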
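Dr. 김진화 (NAVER AI Lab)	World Models Meet Real-World Space: Concepts, Trends, and the Seoul World Model. A world model is an AI paradigm that learns the dynamics of an environment in order to predict and simulate the future, and it is emerging as a foundational technology for the Physical AI era. As the title suggests, this talk follows the arc of concepts, trends, and a NAVER case study. It first covers the basics of world models and sets three major lines of work side by side: reinforcement-learning-based (Dreamer), abstract-representation-prediction-based (LeCun's JEPA), and generation-based (Sora, Genie, NVIDIA Cosmos), working through the ideas, strengths, and weaknesses of each. It then looks at current points of debate, such as 'generation or abstraction', 'how to capture the laws of physics', and 'causal structure', along with the shift from imagined environments toward world models grounded in real geographic space. Finally, it introduces as a case study the Seoul World Model, a concrete outcome of that shift that was recently mentioned at NVIDIA GTC Taipei 2026, and discusses how it faithfully renders trajectories up to city scale from measured data such as street view imagery, and how NAVER's spatial intelligence could extend to virtual convergence applications such as autonomous driving and digital twins. The hope is that this session helps sketch both the big picture of world model research and the new research and industry opportunities opened up by world models that have met real-world space.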

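Issuing Transaction Statements and Certificates of Participation

o Transaction statement / certificate of participation: issued online
- Log in from the IEIE Pre-registration menu, then download and print.
* Available for printing once the pre-registration fee has been paid.
- The certificate of participation can be printed online after the event date.

o Credit card slip (print): log in from the event's Pre-registration menu to print the credit card slip
* Look up the payment, then print the credit card slip.

o Contact: 배기동, General Manager, IEIE / 02-553-0255 (ext. 5) / Email: biz@theieie.org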
