ICIP’26 Special Session: Generative Visual Coding: Emerging Paradigms for Future Communication


IEEE International Conference on Image Processing (ICIP) 2026

Special Session: Generative Visual Coding: Emerging Paradigms for Future Communication

https://floatbutterfly.github.io/ICIP2026-special-session-GVC/

Generative Visual Coding (GVC) is an emerging paradigm that explores how generative models and structured visual representations can redefine visual communication. By integrating generative capabilities into the coding process, GVC enables new forms of representation, transmission, and reconstruction that enhance perceptual and semantic fidelity while improving communication efficiency. Beyond human-centric reconstruction, GVC supports machine- and task-oriented communication, where compact and semantically meaningful representations benefit downstream analysis and decision-making.

The paradigm also motivates theoretical study on how generative priors interact with information constraints, optimization objectives, and emerging concepts in semantic communication. As generative processes gain prominence, principled evaluation becomes increasingly essential, encouraging advances in quality assessment, distortion modeling, and the development of benchmark datasets for generative and hybrid codec systems. Efficiency remains central to deployment, underscoring the importance of model design, complexity optimization, and computational scalability.
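
One well-known formalization of this interplay is the rate–distortion–perception function of Blau and Michaeli (ICML 2019), which augments the classical rate–distortion function with a constraint on the divergence between the source and reconstruction distributions:

    R(D, P) = \min_{p_{\hat{X} \mid X}} I(X; \hat{X})
    \quad \text{s.t.} \quad \mathbb{E}[\Delta(X, \hat{X})] \le D,
    \quad d(p_X, p_{\hat{X}}) \le P

Here \Delta is a distortion measure and d a divergence between distributions; generative decoders can be viewed as targeting the small-P regime, trading rate or distortion for distributional (perceptual) fidelity.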

GVC further extends to immersive and spatial communication, including three-dimensional and scene-level content. In these settings, generative models can infer geometry, semantics, and contextual relationships, enabling new modes of multi-view and interactive media delivery. Overall, GVC offers a unified framework that integrates generative modeling, visual coding, and intelligent communication, laying the groundwork for next-generation visual communication systems.

Scope / Topics

  • Generative foundation models, methodologies, frameworks, and analytical perspectives for visual coding and task-oriented communication
  • Theoretical modeling and rate–distortion perspectives for generative and semantic visual communication
  • Evaluation frameworks, quality assessment, and benchmark datasets for generative coding systems
  • Complexity optimization for generative visual communication
  • Generative coding use cases (e.g., Generative Face Video Coding)
  • Generative visual communication for immersive, three-dimensional, and spatially aware media

Submission: https://icip2026.exordo.com/

  • Paper Format: up to 5 pages + 1 page for references only (see the Author Kit)
  • Topic Selection: When submitting, select the Special Session ‘Generative Visual Coding: Emerging Paradigms for Future Communication’ as well as up to two additional regular topics (Step 5)

Important Dates

  • Special Session Submission Opens: January 7, 2026
  • Paper Submission Deadline: February 4, 2026 (Extended)
  • Notification of Acceptance: April 22, 2026
  • Camera-Ready Paper Due: May 13, 2026

Organizers

  • Jianhui Chang, China Telecom Cloud Computing Research Institute
  • Hadi Amirpour, University of Klagenfurt
  • Giuseppe Valenzise, Université Paris-Saclay

ICIP’26 Special Session: Visual Information Processing for Human-centered Immersive Experiences


IEEE International Conference on Image Processing (ICIP) 2026

Special Session: Visual Information Processing for Human-centered Immersive Experiences

https://medialab.dei.unipd.it/special-session-icip-2026/

Immersive systems such as Virtual and Extended Reality are becoming widespread thanks to the wide availability of relatively low-cost headsets and the heightened immersion and sense of presence they provide compared to their 2D counterparts. However, the novelty of the underlying technologies, the variety of available media types, and the broad range of applications pose numerous challenges for the research community. One key feature of immersive systems is that they inherently place users at the center of the experience, allowing them to actively explore, manipulate, and interact with content. As a result, immersive systems introduce new perceptual, behavioral, and interaction aspects that require dedicated investigation. This special session focuses on the role of visual information processing in enabling human-centered immersive experiences, offering complementary insights into how visual information can enhance effectiveness, comfort, usability, and perceptual quality in next-generation immersive applications.

Topics of interest

  • Visual attention mechanisms
  • Perceptual modelling
  • Emerging media formats (stereoscopic and omnidirectional imagery, light fields, point clouds, meshes, and Gaussian splats)
  • Multimodal immersive applications
  • Quality of Experience

Submission instructions

  • Submission site: https://icip2026.exordo.com/
  • Topic selection: when submitting your paper, you will be able to find the accepted special sessions as part of the list of topics (Step 5). Please make sure to select the Special Session ‘Visual information processing for human-centered immersive experiences’ as well as up to two additional regular topics, to assist in the review process and for program-building purposes.
  • Format: up to 5 pages + 1 page for references only (refer to the Author Kit)
  • Conference website: https://2026.ieeeicip.org/
  • Please note: special session papers will undergo the same rigorous peer-review process as regular papers.

Important dates and deadlines

  • Submission deadline: February 4, 2026 (AoE)
  • Notification of Acceptance: April 22, 2026
  • Camera-Ready Paper Due: May 13, 2026
  • Conference: 13–17 September 2026, Tampere, Finland

Organizers and contacts

  • Sara Baldoni, University of Padova 
  • Hadi Amirpour, University of Klagenfurt

Patent Approval for “Video encoding complexity predictor”

Video encoding complexity predictor

US Patent

[PDF]

Vignesh Menon (Alpen-Adria-Universität Klagenfurt, Austria), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

 

Abstract: Techniques for predicting video encoding complexity are described herein. A method for predicting video encoding complexity includes performing video complexity feature extraction on a video segment to extract low-complexity frame-based features, predicting video encoding complexity for the video segment using the low-complexity frame-based features, and outputting a predicted encoding bitrate and a predicted encoding time. An embodiment may implement a hybrid model using a CNN, in which a latent vector extracted from a frame of the video segment may also be used to predict video encoding complexity. The predicted encoding bitrates and encoding times may be provided to encoding infrastructure for use in optimizing a schedule of encodings.
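
As a rough illustration of the pipeline described in the abstract, the sketch below wires low-complexity frame-based features (here, hypothetical gradient and frame-difference energies) and an optional CNN latent vector into a stand-in linear regressor. All feature choices, shapes, and weights are illustrative assumptions, not the patented design.

    # Illustrative sketch only; features, shapes, and the regressor are
    # assumptions standing in for the trained models described in the patent.
    import numpy as np

    def extract_frame_features(frames: np.ndarray) -> np.ndarray:
        """Low-complexity per-frame features: spatial gradient energy and
        temporal frame-difference energy (hypothetical choices)."""
        spatial = (np.abs(np.diff(frames, axis=2)).mean(axis=(1, 2)) +
                   np.abs(np.diff(frames, axis=1)).mean(axis=(1, 2)))
        temporal = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
        temporal = np.concatenate([[0.0], temporal])  # pad the first frame
        return np.stack([spatial, temporal], axis=1)  # (num_frames, 2)

    def predict_complexity(frames, latent=None, w_bitrate=None, w_time=None):
        """Aggregate features over the segment, optionally append a CNN latent
        vector (hybrid model), and map to bitrate/time with a linear stand-in."""
        feats = extract_frame_features(frames).mean(axis=0)
        if latent is not None:
            feats = np.concatenate([feats, latent])  # hybrid: features + latent
        w_bitrate = np.ones_like(feats) if w_bitrate is None else w_bitrate
        w_time = np.ones_like(feats) if w_time is None else w_time
        return float(feats @ w_bitrate), float(feats @ w_time)

    segment = np.random.rand(30, 64, 64)             # toy 30-frame segment
    predicted_bitrate, predicted_time = predict_complexity(segment)

In a scheduling context, the two outputs could then be handed to the encoding infrastructure to order or batch encodes, as the abstract suggests.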


QoE- and Energy-aware Content Consumption for HTTP Adaptive Streaming

Klagenfurt, July 31, 2025

Congratulations to Dr. Daniele Lorenzi for successfully defending his dissertation on “QoE- and Energy-aware Content Consumption for HTTP Adaptive Streaming” at Universität Klagenfurt in the context of the Christian Doppler Laboratory ATHENA.

Abstract

HTTP Adaptive Streaming (HAS) has become the dominant paradigm for video delivery over the Internet, enabling scalable and flexible content consumption across heterogeneous networks and devices. The continuous growth of video traffic, coupled with the increasing complexity of multimedia content and the proliferation of resource-constrained devices, poses significant challenges for streaming systems. In particular, service providers and researchers must jointly address Quality of Experience (QoE), energy consumption, and emerging protocol and content technologies to meet user expectations while ensuring sustainable operation.

This dissertation investigates QoE- and energy-aware content consumption in HAS, with a primary focus on client-side adaptation mechanisms. Through a systematic analysis of existing approaches, the thesis identifies key limitations in current Adaptive Bitrate (ABR) algorithms, which often prioritize bitrate maximization without sufficiently considering perceptual quality, energy efficiency, codec diversity, or new networking capabilities. To address these challenges, the dissertation proposes a set of novel methodologies, algorithms, and datasets that jointly optimize QoE and energy consumption under realistic network and device constraints.

The first contribution explores the exploitation of emerging transport protocols, specifically HTTP/3 and QUIC, to enhance QoE in HAS. The proposed DoFP+ approach leverages advanced protocol features such as stream multiplexing, prioritization, and termination to upgrade previously downloaded low-quality segments during playback. Extensive experimental evaluations demonstrate significant QoE improvements, reduced stall events, and more efficient bandwidth utilization compared to state-of-the-art approaches.
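
The core decision can be pictured as follows: given the buffered segments and a throughput estimate, find the earliest low-quality segment that can still be re-downloaded at a higher bitrate before its playout deadline. The data structures, safety margin, and feasibility rule below are illustrative assumptions, not the published DoFP+ algorithm; in practice, HTTP/3 stream prioritization and termination let the client favor urgent requests and abort those that can no longer finish in time.

    # Illustrative sketch of a segment-upgrade decision (not the exact DoFP+
    # logic); all names and the safety factor are assumptions.
    from dataclasses import dataclass

    @dataclass
    class BufferedSegment:
        index: int
        bitrate_kbps: int      # quality currently sitting in the buffer
        duration_s: float
        playout_in_s: float    # time left until this segment plays out

    def pick_upgrade(buffer, ladder_kbps, throughput_kbps, safety=1.2):
        """Return (segment, target_bitrate) for the earliest buffered segment
        that can be upgraded before its deadline, or None if none fits."""
        for seg in sorted(buffer, key=lambda s: s.playout_in_s):
            for target in sorted((b for b in ladder_kbps
                                  if b > seg.bitrate_kbps), reverse=True):
                download_s = target * seg.duration_s / throughput_kbps
                if download_s * safety < seg.playout_in_s:
                    return seg, target   # feasible upgrade found
        return None                      # keep fetching new segments instead

    buffer = [BufferedSegment(7, 1500, 4.0, 6.0),
              BufferedSegment(8, 750, 4.0, 10.0)]
    print(pick_upgrade(buffer, [750, 1500, 3000, 6000], throughput_kbps=8000))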

As a second contribution, the dissertation addresses the limitations of single-codec streaming by introducing MEDUSA, a dynamic multi-codec ABR approach. MEDUSA enables per-segment codec selection based on content-aware perceptual quality and segment size information, allowing the system to adapt to varying content complexity over time. Results show that dynamic codec switching can substantially improve perceptual quality while reducing transmitted data volume, thereby benefiting both end users and streaming providers.
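
The per-segment choice can be sketched as a simple rule: among all (codec, rendition) options advertised for a segment, pick the one with the best perceptual quality whose size fits the current budget. The option table, quality scores, and fallback are illustrative assumptions, not the published MEDUSA logic.

    # Illustrative sketch of per-segment multi-codec selection; the scoring
    # rule and numbers are assumptions, not MEDUSA's published formulation.
    def select_rendition(options, budget_bytes):
        """Pick the highest-quality option that fits the byte budget,
        falling back to the smallest option if none fits."""
        feasible = [o for o in options if o["size_bytes"] <= budget_bytes]
        if feasible:
            return max(feasible, key=lambda o: o["quality"])
        return min(options, key=lambda o: o["size_bytes"])

    segment_options = [
        {"codec": "avc",  "size_bytes": 2_400_000, "quality": 88.0},
        {"codec": "hevc", "size_bytes": 1_500_000, "quality": 90.5},
        {"codec": "av1",  "size_bytes": 1_200_000, "quality": 91.0},
    ]
    print(select_rendition(segment_options, budget_bytes=1_600_000))

A real player would additionally account for the client's decoder capabilities and the cost of switching codecs mid-stream.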

The third contribution focuses on sustainable video streaming through energy-aware adaptation. The thesis introduces E-WISH, an energy-aware ABR algorithm that incorporates an explicit energy consumption model into the quality selection process, reducing playback stalls and lowering power usage without compromising QoE. To support systematic energy evaluations, the dissertation further presents COCONUT, a comprehensive dataset of fine-grained energy measurements collected from multiple client devices. This dataset enables in-depth analysis of the impact of video, device, and network parameters on energy consumption in HAS.
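
The energy-aware selection idea can be sketched as a scoring rule that trades expected QoE against an estimated per-rendition power draw on the client; the linear trade-off, weights, and power figures below are illustrative assumptions rather than the E-WISH formulation, with power estimates imagined as coming from measurements such as those in COCONUT.

    # Illustrative energy-aware bitrate selection; the utility function and
    # all numbers are assumptions, not E-WISH's published model.
    def select_bitrate(renditions, throughput_kbps, energy_weight=0.3):
        """renditions: (bitrate_kbps, qoe_score, power_watts) tuples, where
        power_watts is a per-device decode/display estimate."""
        best, best_score = None, float("-inf")
        for bitrate, qoe, power in renditions:
            if bitrate > throughput_kbps:        # would risk stalls; skip
                continue
            score = qoe - energy_weight * power  # trade QoE against energy
            if score > best_score:
                best, best_score = bitrate, score
        return best

    ladder = [(750, 60.0, 2.1), (1500, 75.0, 2.6),
              (3000, 85.0, 3.4), (6000, 92.0, 4.8)]
    print(select_bitrate(ladder, throughput_kbps=4000))   # -> 3000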

Finally, the dissertation investigates neural-enhanced streaming (NES), where client-side machine learning techniques are used to improve visual quality at the cost of additional computational overhead. To balance QoE gains and power consumption in heterogeneous client environments, the thesis proposes Receptive, a coordinated system that jointly optimizes ABR decisions and neural enhancement strategies across multiple users. Experimental results demonstrate that Receptive achieves substantial QoE improvements while significantly reducing energy consumption on NES-capable devices.
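
The coordination problem can be pictured as each client choosing between fetching a high bitrate directly or fetching a lower bitrate and enhancing it on-device, subject to a shared bandwidth budget and the client's power headroom. The greedy allocation and all numbers below are illustrative assumptions, not the Receptive system, which optimizes these decisions jointly across users.

    # Illustrative sketch of coordinating ABR and neural enhancement (NES)
    # across clients; the greedy pass and all numbers are assumptions.
    def coordinate(clients, bandwidth_kbps):
        """clients: dicts with 'options' = [(bitrate_kbps, qoe, power_watts)]
        and 'power_cap_watts'; assign each its best feasible option."""
        plan, remaining = {}, bandwidth_kbps
        for i, c in enumerate(clients):  # a real system would iterate/optimize
            feasible = [(b, q, p) for b, q, p in c["options"]
                        if b <= remaining and p <= c["power_cap_watts"]]
            if not feasible:
                plan[i] = None
                continue
            b, q, p = max(feasible, key=lambda o: o[1])   # best-QoE option
            plan[i] = (b, q, p)
            remaining -= b
        return plan

    clients = [
        {"options": [(3000, 85, 3.0), (1500, 88, 6.5)],   # 2nd: low rate + NES
         "power_cap_watts": 7.0},
        {"options": [(3000, 85, 3.0), (1500, 88, 6.5)],
         "power_cap_watts": 4.0},                         # cannot afford NES
    ]
    print(coordinate(clients, bandwidth_kbps=5000))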

Overall, this dissertation advances the state of the art in HTTP Adaptive Streaming by introducing protocol-aware, content-aware, and energy-aware adaptation mechanisms, complemented by realistic datasets and comprehensive evaluations. The presented contributions provide a solid foundation for future research and practical deployments aiming to deliver high-quality, energy-efficient, and sustainable video streaming services.

Slides are available here.


Visual Communication in the Age of AI: VCIP 2025 Highlights from Klagenfurt

VCIP 2025 in Klagenfurt: Advancing Sustainable and Trustworthy Visual Communications in the Age of AI

From December 1–4, 2025, the Department of Information Technology (ITEC) at the University of Klagenfurt (Austria) hosted the International Conference on Visual Communications and Image Processing (VCIP 2025), welcoming an international community of researchers, practitioners, and industry experts to discuss the future of visual communication and image processing. Under the theme “Sustainable and Trustworthy Visual Communications in the Age of AI,” the conference addressed some of the most pressing challenges and opportunities facing the field today.

A forum for cutting-edge research and dialogue

VCIP 2025 continued the long-standing tradition of the conference as a premier venue for both foundational and applied research. Over four days, the program offered a diverse range of tutorials, overview talks, keynotes, oral and poster sessions, and interactive formats. Discussions ranged from adaptive and low-latency streaming, source coding, and compressed-domain processing to volumetric media, computational vision, quality assessment, and AI-driven restoration and enhancement techniques.

Inspiring keynotes on trust, sustainability, and clarity

The three keynote talks were a particular highlight of VCIP 2025, setting the tone for the conference by connecting technical innovation with broader societal concerns. The speakers addressed trustworthy multimedia communication in the era of AI-generated content, the environmental impact and sustainability of visual technologies, and the role of visual analytics in making complex data understandable across time and space. Together, the keynotes sparked lively discussions that extended well beyond the conference halls.

Tutorials, overview talks, and emerging topics

Four half-day tutorials provided in-depth insights into current and emerging technologies, including generative face video coding for ultra-low bitrate communication, JPEG AI standardization and implementation, the convergence of low-level image processing and generative AI, and the past, present, and future of volumetric video. Complementing these, overview talks offered broader perspectives on emotion and quality estimation, AI-enabled video streaming efficiency, 3D scene capture and compression, and the use of large vision–language models for visual quality assessment.

Supporting early-career researchers and innovation

VCIP 2025 placed strong emphasis on nurturing the next generation of researchers. The doctoral symposium and the VSPC Rising Star session provided a platform for early-career scientists to present their work and engage with senior experts. Demo, open-source, and dataset sessions further highlighted the practical impact of research, showcasing tools, prototypes, and resources that bridge theory and application.

Exchange beyond the technical program

In addition to the scientific sessions, the conference offered a vibrant social program, including a welcome reception, a Glühwein gathering, a conference banquet, and a closing ceremony with awards. These events fostered informal exchange, strengthened international collaboration, and contributed to the open and collegial atmosphere that characterizes VCIP.

Best paper award: “AlignGS: Aligning Geometry and Semantics for Robust Indoor Reconstruction from Sparse Views”, Yijie Gao, Houqiang Zhong, Tianchi Zhu, Li Song, Zhengxue Cheng, Qiang Hu (Shanghai Jiao Tong University)

Rising Star: Heming Sun (Yokohama National University), “Traditional and Learned Image and Video Coding: From Algorithms to Implementations”

A successful event for Klagenfurt and the university

Hosting VCIP 2025 marked an important milestone for the Department of Information Technology (ITEC) at the University of Klagenfurt, reinforcing its international visibility in the fields of visual computing, multimedia systems, and artificial intelligence. By bringing together experts from academia and industry and opening selected sessions to the wider university community, the conference created valuable opportunities for interdisciplinary exchange and long-term collaboration.

VCIP 2025 concluded with a strong sense of momentum, underscoring the importance of responsible, transparent, and sustainable approaches to visual communication in an AI-driven world. The discussions and connections formed in Klagenfurt will continue to shape research and innovation in the field well beyond the conference itself.

Following a successful edition in Klagenfurt, VCIP 2026 will take place in Singapore from December 13–16, 2026, focusing on “Visual Communications and Image Processing at the Frontiers of Generative and Perceptual AI” (https://vcip-2026.org/).


Farzad Tashtarian completed his habilitation on Network-Assisted Adaptive Streaming: Toward Optimal QoE through System Collaboration

Network-Assisted Adaptive Streaming: Toward Optimal QoE through System Collaboration

Farzad Tashtarian

Habilitation

On November 14, 2025, Farzad Tashtarian defended his habilitation “Network-Assisted Adaptive Streaming: Toward Optimal QoE through System Collaboration” at the University of Klagenfurt, Austria.

Abstract: Providing seamless, low-latency, and energy-efficient video streaming experiences remains an ongoing challenge as content delivery infrastructures evolve to support higher resolutions, immersive formats, and heterogeneous networks. This talk explores an end-to-end perspective on network-assisted adaptive streaming, where close coordination between the player, network, and edge/cloud components enables data-driven and context-aware optimization. It discusses adaptive bitrate algorithm design, cost- and delay-conscious edge transcoding, and multi-objective optimization across the streaming pipeline. Emerging AI-based methods, such as reinforcement learning, generative modeling, and large language model (LLM) orchestration, are highlighted as key enablers for intelligent and self-adjusting video delivery. The talk concludes with a discussion of open challenges, scalability, and future research directions toward resilient, efficient, and user-centric streaming infrastructures.

Committee members: Prof. Martin Pinzger (Chairperson), Prof. Oliver Hohlfeld (external member), Prof. Bernhard Rinner, Prof. Angelika Wiegele, Prof. Chitchanok Chuengsatiansup, MSc Zoha Azimi Ourimi, Dr. Alice Tarzariol, Kateryna Taranov, and Gregor Lammer


Interactive Stereoscopic Videos On Head-mounted Displays

Interactive Stereoscopic Videos On Head-mounted Displays

IEEE VCIP 2025

December 1 – December 4, 2025

Klagenfurt, Austria

[PDF]

Afshin Gholami (AAU, Austria), Hadi Amirpour (AAU, Austria), and Christian Timmerer (AAU, Austria)

Abstract: This paper presents ISV-Demo, a novel system for delivering interactive stereoscopic video experiences on head-mounted displays in spatial environments. Unlike traditional flat or real-time VR media, our approach leverages pre-rendered, high-fidelity 3D video to enable immersive, cinematic storytelling enhanced by user agency. Viewers influence narrative direction by making real-time decisions at branching points within the stereoscopic scene. We propose two interaction models: a timeline-based model with embedded prompts, and a loop-based segmented model offering flexible timing and decision persistence. These models define a new paradigm for authored cinematic interaction in extended reality, addressing the gap between passive 3D video and dynamic VR content.
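
The timeline-based model can be pictured as pre-rendered clips carrying embedded decision prompts that map viewer choices to follow-up clips, with a default branch providing decision persistence when no choice is made. The structure below is an illustrative sketch (Python 3.10+), not the ISV-Demo implementation; names and fields are assumptions.

    # Illustrative branching structure for timeline-based interaction;
    # names and fields are assumptions, not ISV-Demo's actual design.
    from dataclasses import dataclass

    @dataclass
    class BranchPoint:
        at_s: float                  # timestamp of the embedded prompt
        prompt: str
        choices: dict[str, str]      # choice label -> next clip id
        default: str                 # clip used if the viewer does not decide

    @dataclass
    class Clip:
        clip_id: str
        uri: str                     # pre-rendered stereoscopic asset
        branch: BranchPoint | None = None

    def next_clip(clip: Clip, choice: str | None) -> str | None:
        """Resolve the follow-up clip after playback of `clip`."""
        if clip.branch is None:
            return None                          # linear segment or ending
        return clip.branch.choices.get(choice, clip.branch.default)

    intro = Clip("intro", "intro_sbs.mp4",
                 BranchPoint(42.0, "Follow the stranger?",
                             {"yes": "alley", "no": "plaza"}, default="plaza"))
    print(next_clip(intro, "yes"))               # -> "alley"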
