SEED: Energy and Emission Estimation Dataset for Adaptive Video Streaming

SEED: Energy and Emission Estimation Dataset for Adaptive Video Streaming

IEEE VCIP 2025

December 1 – December 4, 2025

Klagenfurt, Austria

[PDF]

Samira Afzal (Baylor University), Narges Mehran (Salzburg Research Forschungsgesellschaft mbH), Farzad Tashtarian (AAU, Austria), Andrew C. Freeman (Baylor University), Radu Prodan (University of Innsbruck), Christian Timmerer (AAU, Austria)

Abstract: The environmental impact of video streaming is gaining more attention due to its growing share in global internet traffic and energy consumption. To support accurate and transparent sustainability assessments, we present SEED (Streaming Energy and Emission Dataset): an open dataset for estimating energy usage and CO2 emissions in adaptive video streaming. SEED comprises over 500 video segments. It provides segment-level measurements of energy consumption and emissions for two primary stages: provisioning, which encompasses encoding and storage on cloud infrastructure, and end-user consumption, including network interface retrieval, video decoding, and display on end-user devices. The dataset covers multiple codecs (AVC, HEVC), resolutions, bitrates, cloud instance types, and geographic regions, reflecting real-world variations in computing efficiency and regional carbon intensity. By combining empirical benchmarks with component-level energy models, SEED enables detailed analysis and supports the development of energy- and emission-aware adaptive bitrate (ABR) algorithms. The dataset is publicly available at: https://github.com/cd-athena/SEED.

SEED is available at: https://github.com/cd-athena/SEED
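
As a minimal illustration of the kind of segment-level estimate SEED supports, the sketch below converts measured energy into grams of CO2 via the regional grid carbon intensity. The function name, field names, and sample values are hypothetical for illustration, not SEED's actual schema or measurements.

```python
# Sketch: segment emissions = measured energy * regional carbon intensity.
# Values below are assumed placeholders, not data from SEED.

def segment_emissions_g(energy_wh: float, intensity_g_per_kwh: float) -> float:
    """Convert a segment's measured energy (Wh) into grams of CO2,
    given the grid carbon intensity (gCO2/kWh) of the region."""
    return (energy_wh / 1000.0) * intensity_g_per_kwh

# Hypothetical 4-second segment: provisioning (encoding + cloud storage)
# plus end-user consumption (network retrieval + decoding + display).
provisioning_wh = 0.42   # assumed measurement
consumption_wh = 0.35    # assumed measurement
intensity = 120.0        # gCO2/kWh, e.g. a low-carbon grid region

total = segment_emissions_g(provisioning_wh + consumption_wh, intensity)
print(f"{total:.3f} gCO2 for this segment")
```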

Posted in ATHENA | Comments Off on SEED: Energy and Emission Estimation Dataset for Adaptive Video Streaming

Resolution vs Quantization: A Trade-Off for Hologram Compression

Resolution vs Quantization: A Trade-Off for Hologram Compression

Ayman Alkhateeb, Hadi Amirpour, Christian Timmerer

Alpen-Adria-Universität Klagenfurt, Austria

PCS 2025, Aachen, Germany

Holographic imaging offers a path to true three-dimensional visualization for applications such as augmented and virtual reality, but the immense data size of high-quality holograms prevents their practical adoption. This paper investigates a pre-processing strategy to improve the compression of holographic data using the High-Efficiency Video Coding (HEVC) standard. By downsampling the hologram before encoding and subsequently upsampling it after decoding, we demonstrate that it is possible to achieve better reconstruction quality at low bitrates compared to encoding the full-resolution data. This counterintuitive result stems from the reduction in spatial complexity, which allows the HEVC encoder to allocate more bits to preserving critical high-frequency information that would otherwise be lost. Although the hologram phase is highly sensitive to scaling, the overall perceptual quality improves at bitrates below 1 bpp, with gains of approximately 0.1 in SSIM and 0.015 in VIF. Our work underscores a critical principle in holographic codec design: optimizing the trade-off between spatial complexity and quantization error is key to maximizing reconstruction quality, especially in bandwidth-constrained environments.
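
As a rough illustration of the pre-/post-processing idea (not the authors' exact pipeline), the Python sketch below downsamples before HEVC encoding and upsamples after decoding, treating the hologram data as a video sequence and driving ffmpeg's libx265. The scale factor, QP, and file names are assumptions.

```python
# Sketch of the downsample -> HEVC encode -> decode -> upsample strategy
# described above, via ffmpeg (libx265). Scale factor and QP are
# illustrative assumptions, not the paper's configuration.
import subprocess

def encode_downsampled(src: str, dst: str, scale: float = 0.5, qp: int = 37) -> None:
    """Downsample before encoding so the encoder spends its bit budget
    on a spatially simpler signal."""
    subprocess.run([
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=iw*{scale}:ih*{scale}:flags=lanczos",
        "-c:v", "libx265", "-x265-params", f"qp={qp}",
        dst,
    ], check=True)

def decode_upsampled(src: str, dst: str, width: int, height: int) -> None:
    """Decode and upsample back to full resolution for reconstruction."""
    subprocess.run([
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale={width}:{height}:flags=lanczos",
        dst,
    ], check=True)

if __name__ == "__main__":
    encode_downsampled("hologram.mp4", "hologram_half.mp4")   # placeholder inputs
    decode_upsampled("hologram_half.mp4", "hologram_rec.mp4", 1920, 1080)
```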

Posted in HoloSense | Comments Off on Resolution vs Quantization: A Trade-Off for Hologram Compression

Multiview x265 vs. MV-HEVC: An Efficiency Analysis for Stereoscopic Videos

Multiview x265 vs. MV-HEVC:
An Efficiency Analysis for Stereoscopic Videos

PCS 2025

December 8 – December 11, 2025

Aachen, Germany

[PDF]

Kamran Qureshi, Hadi Amirpour, Christian Timmerer

Abstract: With the increasing demand for immersive video experiences, efficient compression of multiview content has become crucial for reducing storage and transmission costs. The introduction of stereoscopic video support on head-mounted displays, along with the emergence of smartphones capable of easily capturing stereoscopic videos, further highlights the need for optimized encoding solutions. Although the efficient but computationally intensive Multi-View High Efficiency Video Coding (MV-HEVC) standard has been available since 2014, only recently has x265—a real-time open-source HEVC encoder—introduced support for multiview encoding. This work (i) evaluates the encoding efficiency of multiview x265 across all presets, and compares it with MV-HEVC, (ii) proposes a perceptual quality–aware preset selection method, and (iii) conducts a comparative study on single-view and stereoscopic videos.
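
For readers who want to reproduce the flavor of the preset evaluation in (i), the sketch below sweeps all x265 presets on a single source and records encoding time and output size. It uses ffmpeg's libx265 front end; the multiview-specific configuration is omitted, and the file names and CRF value are placeholders.

```python
# Sketch of a preset sweep: encode the same source with every x265 preset
# and log time and size. Not the paper's exact setup.
import os
import subprocess
import time

PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
           "medium", "slow", "slower", "veryslow", "placebo"]

def sweep(src: str, crf: int = 28) -> None:
    for preset in PRESETS:
        out = f"out_{preset}.mp4"
        t0 = time.monotonic()
        subprocess.run(
            ["ffmpeg", "-y", "-i", src, "-c:v", "libx265",
             "-preset", preset, "-crf", str(crf), out],
            check=True, capture_output=True)
        dt = time.monotonic() - t0
        size_kib = os.path.getsize(out) / 1024
        print(f"{preset:>10}: {dt:6.1f} s, {size_kib:8.0f} KiB")

if __name__ == "__main__":
    sweep("source.mp4")  # placeholder input
```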

Posted in ATHENA | Comments Off on Multiview x265 vs. MV-HEVC: An Efficiency Analysis for Stereoscopic Videos

NeVES: Real-Time Neural Video Enhancement for HTTP Adaptive Streaming

NeVES: Real-Time Neural Video Enhancement for HTTP Adaptive Streaming

IEEE VCIP 2025

December 1 – December 4, 2025

Klagenfurt, Austria

[PDF]

Daniele Lorenzi, Farzad Tashtarian, Christian Timmerer

Abstract: Enhancing low-quality video content has attracted particular interest since recent advances in deep learning. Since most of the video content consumed worldwide is delivered over the Internet via HTTP Adaptive Streaming (HAS), implementing these techniques in web browsers would ease access to visually enhanced content on user devices.

In this paper, we present NeVES, a multimedia system capable of enhancing the quality of video content streamed through HAS in real time.

The demo is available at: https://github.com/cd-athena/NeVES.

Posted in ATHENA | Comments Off on NeVES: Real-Time Neural Video Enhancement for HTTP Adaptive Streaming

Perceptual Quality Assessment of Spatial Videos on Apple Vision Pro

Perceptual Quality Assessment of Spatial Videos on Apple Vision Pro

ACMMM IXR 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

Afshin Gholami, Sara Baldoni, Federica Battisti, Wei Zhou, Christian Timmerer, Hadi Amirpour

Abstract: Immersive stereoscopic/3D video experiences have entered a new era with the advent of smartphones capable of capturing spatial videos, advanced video codecs optimized for multiview content, and Head-Mounted Displays (HMDs) that natively support spatial video playback. In this work, we evaluate the quality of spatial videos encoded using optimized x265 software implementations of MV-HEVC on the Apple Vision Pro (AVP) and compare them with their corresponding 2D versions through a subjective test.

To support this study, we introduce SV-QoE, a novel dataset comprising video clips rendered with a twin-camera setup that replicates the human inter-pupillary distance. Our analysis reveals that spatial videos consistently deliver a superior Quality of Experience (QoE) when encoded at identical bitrates, with the benefits becoming more pronounced at higher bitrates. Additionally, renderings at closer distances exhibit significantly enhanced video quality and depth perception, highlighting the impact of spatial proximity on immersive viewing experiences.

We further analyze the impact of disparity on depth perception and examine the correlation between Mean Opinion Score (MOS) and established objective quality metrics such as PSNR, SSIM, MS-SSIM, VMAF, and AVQT. Finally, we explore how video quality and depth perception together influence overall quality judgments.
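
As a minimal sketch of the MOS-vs-metric correlation analysis mentioned above, the snippet below computes Pearson (linear) and Spearman (rank-order) coefficients. The scores are made-up placeholders, not values from the SV-QoE study.

```python
# Sketch: correlate subjective MOS with an objective metric (here VMAF).
# All values are illustrative placeholders.
from scipy.stats import pearsonr, spearmanr

mos = [2.1, 3.4, 3.9, 4.3, 4.6]         # hypothetical mean opinion scores
vmaf = [35.0, 58.0, 71.0, 82.0, 90.0]   # hypothetical VMAF scores

plcc, _ = pearsonr(vmaf, mos)    # linear correlation
srocc, _ = spearmanr(vmaf, mos)  # rank-order correlation
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```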


Posted in ATHENA | Comments Off on Perceptual Quality Assessment of Spatial Videos on Apple Vision Pro

ACM MM’25: SVD: Spatial Video Dataset

SVD: Spatial Video Dataset

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

MH Izadimehr, Milad Ghanbari, Guodong Chen, Wei Zhou, Xiaoshuai Hao, Mallesham Dasari, Christian Timmerer, Hadi Amirpour

Abstract: Stereoscopic video has long been the subject of research due to its ability to deliver immersive three-dimensional content to a wide range of applications, from virtual and augmented reality to advanced human–computer interaction. The dual-view format inherently provides binocular disparity cues that enhance depth perception and realism, making it indispensable for fields such as telepresence, 3D mapping, and robotic vision. Until recently, however, end-to-end pipelines for capturing, encoding, and viewing high-quality 3D video were neither widely accessible nor optimized for consumer-grade devices. Today’s smartphones, such as the iPhone Pro, and modern HMDs, such as the Apple Vision Pro (AVP), offer built-in support for stereoscopic video capture and hardware-accelerated encoding, with seamless playback on devices like the AVP and Meta Quest 3 requiring minimal user intervention. Apple refers to this streamlined workflow as Spatial Video. Making the full stereoscopic pipeline accessible to everyone has opened the door to new applications. Despite these advances, there remains a notable absence of publicly available datasets that cover the complete spatial video pipeline on consumer platforms, hindering reproducibility and comparative evaluation of emerging algorithms.

In this paper, we introduce SVD, a spatial video dataset comprising 300 five-second video sequences, i.e., 150 captured using an iPhone Pro and 150 with an AVP. Additionally, 10 longer videos with a minimum duration of 2 minutes have been recorded. The SVD is publicly released under an open-source license to facilitate research in codec performance evaluation, subjective and objective Quality of Experience assessment, depth-based computer vision, stereoscopic video streaming, and other emerging 3D applications such as neural rendering and volumetric capture. Link to the dataset: https://cd-athena.github.io/SVD/.


Posted in ATHENA | Comments Off on ACM MM’25: SVD: Spatial Video Dataset

ACM MM’25 BNI: GenStream: Semantic Streaming Framework for Generative Reconstruction of Human-centric Media

GenStream: Semantic Streaming Framework for Generative Reconstruction of Human-centric Media

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

Emanuele Artioli (AAU, Austria), Daniele Lorenzi (AAU, Austria), Shivi Vats (AAU, Austria), Farzad Tashtarian (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: Video streaming dominates global internet traffic, yet conventional pipelines remain inefficient for structured, human-centric content such as sports, performance, or interactive media. Standard codecs re-encode entire frames, foreground and background alike, treating all pixels uniformly and ignoring the semantic structure of the scene. This leads to significant bandwidth waste, particularly in scenarios where backgrounds are static and motion is constrained to a few salient actors. We introduce GenStream, a semantic streaming framework that replaces dense video frames with compact, structured metadata. Instead of transmitting pixels, GenStream encodes each scene as a combination of skeletal keypoints, camera viewpoint parameters, and a static 3D background model. These elements are transmitted to the client, where a generative model reconstructs photorealistic human figures and composites them into the 3D scene from the original viewpoint. This paradigm enables extreme compression, achieving over 99.9% bandwidth reduction compared to HEVC. We partially validate GenStream on Olympic figure skating footage and demonstrate its potential for high perceptual fidelity from minimal data. Looking forward, GenStream opens new directions in volumetric avatar synthesis, canonical 3D actor fusion across views, personalized and immersive viewing experiences at arbitrary viewpoints, and lightweight scene reconstruction, laying the groundwork for scalable, intelligent streaming in the post-codec era.
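
To make the "compact, structured metadata" idea concrete, the sketch below shows one plausible per-frame payload: skeletal keypoints per actor plus camera parameters, with the static 3D background sent once out of band. The schema and values are assumptions for illustration, not GenStream's actual wire format.

```python
# Sketch of a per-frame semantic payload (keypoints + camera pose) of the
# kind GenStream transmits instead of pixels. Schema is hypothetical.
from dataclasses import dataclass, asdict
import json

@dataclass
class Keypoint:
    x: float           # normalized image coordinates
    y: float
    confidence: float  # detector confidence

@dataclass
class FrameMetadata:
    timestamp_ms: int
    camera_pose: list            # e.g. 6-DoF position + orientation
    skeletons: list              # one keypoint list per actor

frame = FrameMetadata(
    timestamp_ms=40,
    camera_pose=[0.0, 1.6, 4.0, 0.0, 0.0, 0.0],
    skeletons=[[asdict(Keypoint(0.51, 0.32, 0.98)),
                asdict(Keypoint(0.53, 0.41, 0.95))]],
)
payload = json.dumps(asdict(frame))
# A few hundred bytes per frame, versus kilobytes for an encoded frame.
print(f"{len(payload)} bytes for this frame")
```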

Posted in ATHENA | Comments Off on ACM MM’25 BNI: GenStream: Semantic Streaming Framework for Generative Reconstruction of Human-centric Media