ICME’25 Tutorial: Video Coding Advancements in HTTP Adaptive Streaming

IEEE ICME 2025
June 30 – July 4, 2025

Nantes, France

https://www.2025.ieeeicme.org/tutorials/

Tutorial speakers:

  • Hadi Amirpour (University of Klagenfurt)
  • Christian Timmerer (University of Klagenfurt)

Tutorial description:

This tutorial provides a comprehensive exploration of the HTTP Adaptive Streaming (HAS) pipeline, covering advancements from content provisioning to content consumption. We begin by tracing the history of video streaming and the evolution of video coding technologies. Attendees will gain insights into the timeline of significant developments, from early proprietary solutions to modern adaptive streaming standards like HAS. A comparative analysis of video codecs is presented, highlighting milestones such as H.264, HEVC, and the latest standard, Versatile Video Coding (VVC), emphasizing their efficiency, adoption, and impact on streaming technologies. Additionally, new trends in video coding, including AI-based coding solutions, will be covered, showcasing their potential to transform video compression and streaming workflows.

Building on this foundation, we explore per-title encoding techniques, which dynamically tailor bitrate ladders to the specific characteristics of video content. These methods account for factors such as spatial resolution, frame rate, device compatibility, and energy efficiency, optimizing both Quality of Experience (QoE) and environmental sustainability. Next, we highlight cutting-edge advancements in live streaming, including novel approaches to optimizing bitrate ladders without introducing latency. Fast multi-rate encoding methods are also presented, showcasing how they significantly reduce encoding times and computational costs, effectively addressing scalability challenges for streaming providers.

The tutorial further delves into edge computing capabilities for video transcoding, emphasizing how edge-based architectures can streamline the processing and delivery of streaming content. These approaches reduce latency and enable efficient resource utilization, particularly in live and interactive streaming scenarios.

Finally, we discuss the QoE parameters that influence both streaming and coding pipelines, providing a holistic view of how QoE considerations guide decisions in codec selection, bitrate optimization, and delivery strategies. By combining historical context, theoretical foundations, and practical insights, this tutorial equips attendees with the knowledge to navigate and address the evolving challenges in video streaming applications.


Patent Approval for “Perceptually-aware Online Per-title Encoding for Live Video Streaming”

Perceptually-aware Online Per-title Encoding for Live Video Streaming

US Patent

[PDF]

Vignesh Menon (Alpen-Adria-Universität Klagenfurt, Austria), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)


Abstract: Techniques for implementing perceptually-aware per-title encoding may include receiving an input video, a set of resolutions, a maximum target bitrate, and a minimum target bitrate; extracting content-aware features for each segment of the input video; predicting a perceptually-aware bitrate-resolution pair for each segment using a model configured to optimize for a quality metric, with constants trained for each of the set of resolutions; generating a target encoding set including a set of perceptually-aware bitrate-resolution pairs; and encoding the target encoding set. The content-aware features may include a spatial energy feature and an average temporal energy. According to these methods, only a subset of bitrates and resolutions, less than the full set, is encoded to provide high-quality video content for streaming.
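
As an illustration of the claimed flow (not the patented implementation), the following Python sketch walks through the steps named in the abstract: extract per-segment features, predict a bitrate per resolution from constants trained for that resolution, and keep only the pairs inside the target bitrate range. The feature definitions, the linear model form, and all numbers are hypothetical placeholders.

```python
# Hypothetical sketch of the per-title flow described in the abstract;
# feature definitions, model form, and constants are placeholders.
from dataclasses import dataclass

@dataclass
class Segment:
    spatial_energy: float   # content-aware spatial (texture) energy feature
    temporal_energy: float  # average temporal energy between frames

RESOLUTIONS = [(3840, 2160), (2560, 1440), (1920, 1080), (1280, 720), (960, 540)]

def predict_bitrate(seg, resolution, constants):
    """Predict a perceptually-aware bitrate for one resolution, using
    constants trained per resolution to optimize a quality metric."""
    a, b, c = constants[resolution]   # hypothetical linear model form
    return a * seg.spatial_energy + b * seg.temporal_energy + c

def build_target_encoding_set(seg, constants, min_bitrate, max_bitrate):
    """Keep only bitrate-resolution pairs inside the target range, so a
    subset of the full ladder (not the full set) is encoded."""
    ladder = []
    for res in RESOLUTIONS:
        bitrate = predict_bitrate(seg, res, constants)
        if min_bitrate <= bitrate <= max_bitrate:
            ladder.append((res, bitrate))
    return ladder

# Made-up usage:
constants = {res: (0.5, 2.0, 300.0) for res in RESOLUTIONS}
print(build_target_encoding_set(Segment(45.0, 12.0), constants, 200, 8000))
```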


DORBINE Project

DORBINE Project Approved by FFG.

DORBINE is a project funded by the Austrian Research Promotion Agency (FFG).

For more information, please visit the DORBINE webpage.



Real-Time Quality- and Energy-Aware Bitrate Ladder Construction for Live Video Streaming

Real-Time Quality- and Energy-Aware Bitrate Ladder Construction for Live Video Streaming

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

[PDF]

Mohammad Ghasempour (AAU, Austria), Hadi Amirpour (AAU, Austria), and Christian Timmerer (AAU, Austria)

Abstract: Live video streaming’s growing demand for high-quality content has resulted in significant energy consumption, creating challenges for sustainable media delivery. Traditional adaptive video streaming approaches rely on the over-provisioning of resources, leading to a fixed bitrate ladder that is often inefficient for heterogeneous use cases and video content. Although dynamic approaches like per-title encoding optimize the bitrate ladder for each video, they mainly target video-on-demand to avoid latency and fail to address energy consumption. In this paper, we present LiveESTR, a method for building a quality- and energy-aware bitrate ladder for live video streaming. LiveESTR eliminates the need for exhaustive video encoding on the server side, ensuring that the bitrate ladder construction process is fast and energy-efficient. A lightweight multi-label classification model, along with a lookup table, is utilized to estimate the optimized resolution-bitrate pairs in the bitrate ladder. Both spatial and temporal resolutions are supported to achieve high energy savings while preserving compression efficiency, and a tunable parameter λ and a threshold τ are introduced to balance the trade-off between compression, quality, and energy efficiency. Experimental results show that LiveESTR reduces encoder and decoder energy consumption by 74.6% and 29.7%, respectively, with only a 2.1% increase in Bjøntegaard Delta Rate (BD-Rate) compared to traditional per-title encoding. Furthermore, increasing λ to prioritize video quality lets LiveESTR achieve 2.2% better compression efficiency in terms of BD-Rate while still reducing decoder energy consumption by 7.5%.
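
A rough Python sketch of the ladder-selection step described above is shown below; it assumes a precomputed lookup table of candidate spatial/temporal configurations and a classifier that outputs one confidence per entry. The score combining λ with an energy proxy, the role of τ as an inclusion threshold, and all table values are illustrative assumptions, not the paper’s exact formulation.

```python
# Illustrative sketch of quality/energy-aware ladder selection; the table,
# the energy proxy, and the scoring rule are assumptions for illustration.
import numpy as np

# Hypothetical lookup table of candidate spatial/temporal configurations.
LOOKUP = [
    {"res": (1920, 1080), "fps": 60, "bitrate_kbps": 8000},
    {"res": (1920, 1080), "fps": 30, "bitrate_kbps": 4500},
    {"res": (1280, 720),  "fps": 30, "bitrate_kbps": 2500},
    {"res": (960, 540),   "fps": 30, "bitrate_kbps": 1200},
]

def energy_proxy(entry):
    """Normalized energy stand-in: more pixels per second cost more."""
    w, h = entry["res"]
    return (w * h * entry["fps"]) / (1920 * 1080 * 60)

def select_ladder(probs, lam=0.5, tau=0.4):
    """probs: one classifier confidence per LOOKUP entry.
    lam (λ) trades quality against energy; tau (τ) gates inclusion."""
    ladder = []
    for entry, p in zip(LOOKUP, probs):
        if p < tau:
            continue                          # classifier not confident enough
        score = lam * p - (1 - lam) * energy_proxy(entry)
        if score > 0:                         # quality gain outweighs energy cost
            ladder.append(entry)
    return sorted(ladder, key=lambda e: e["bitrate_kbps"])

# Made-up classifier outputs:
print(select_ladder(np.array([0.9, 0.7, 0.8, 0.2]), lam=0.6))
```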


Multi-resolution Encoding for HTTP Adaptive Streaming using VVenC

Multi-resolution Encoding for HTTP Adaptive Streaming using VVenC

The IEEE International Symposium on Circuits and Systems (IEEE ISCAS 2025)

https://2025.ieee-iscas.org/

25–28 May 2025 // London, United Kingdom

[PDF]

Kamran Qureshi (AAU, Austria), Hadi Amirpour (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: HTTP Adaptive Streaming (HAS) is a widely adopted method for delivering video content over the Internet; it requires each video to be encoded at multiple bitrate-resolution pairs, known as representations, to adapt to various network conditions and device capabilities. This multi-bitrate encoding introduces significant challenges due to its computational and time-intensive nature. Conventional approaches often encode these representations independently, without leveraging similarities between different representations of the same input video. This paper proposes an accelerated multi-resolution encoding strategy that utilizes lower-resolution representations as references to speed up the encoding of higher resolutions with Versatile Video Coding (VVC), specifically VVenC, an optimized open-source software implementation. A mid-bitrate representation serves as the reference, and its encoded partition data, after interpolation, efficiently guides the partitioning process at higher resolutions. The proposed approach uses this shared encoding information to reduce redundant calculations, thereby optimizing partitioning decisions. Experimental results demonstrate that the proposed technique reduces encoding time by up to 17% compared to the medium preset across videos of varying complexities, with a minimal BDBR/BDT of 0.12 compared to the fast preset.
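
The sketch below illustrates the core idea only: upscale the partition map of the mid-bitrate reference and use it to bound the partitioning search at a higher resolution. VVenC does not expose such a hook publicly; the grid sizes, depth encoding, and search-range rule here are assumptions made for illustration.

```python
# Illustrative data flow only; not a real VVenC API. Assumes per-CTU split
# depths are available for the already-encoded mid-bitrate reference.
import numpy as np

def upscale_partition_map(ref_depths, scale):
    """Nearest-neighbor interpolation of the reference CTU split depths
    onto the higher-resolution CTU grid."""
    h, w = ref_depths.shape
    ys = np.minimum((np.arange(int(h * scale)) / scale).astype(int), h - 1)
    xs = np.minimum((np.arange(int(w * scale)) / scale).astype(int), w - 1)
    return ref_depths[np.ix_(ys, xs)]

def depth_search_range(ref_depth, margin=1):
    """Restrict the RD search to depths near the reference decision,
    skipping redundant partition evaluations."""
    return range(max(0, ref_depth - margin), ref_depth + margin + 1)

# Example: a 1080p reference (30x17 CTUs of 64 px) guiding a 2160p encode.
ref = np.random.randint(0, 4, size=(17, 30))       # final split depth per CTU
guide = upscale_partition_map(ref, 2.0)            # 34x60 CTU grid at 2160p
print(list(depth_search_range(int(guide[0, 0]))))
```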


Improving the Efficiency of VVC using Partitioning of Reference Frames

Improving the Efficiency of VVC using Partitioning of Reference Frames

The IEEE International Symposium on Circuits and Systems (IEEE ISCAS 2025)

https://2025.ieee-iscas.org/

25–28 May 2025 // London, United Kingdom

[PDF]

Kamran Qureshi (AAU, Austria), Hadi Amirpour (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: In response to the growing demand for high-quality videos, a new coding standard, Versatile Video Coding (VVC), was released in 2020. VVC is based on the same hybrid coding architecture as its predecessor, High-Efficiency Video Coding (HEVC), while providing a bitrate reduction of approximately 50% for the same subjective quality. VVC extends HEVC’s Coding Tree Unit (CTU) partitioning with more flexible block sizes, which increases its encoding complexity; optimization is therefore essential for the efficient use of VVC in practical applications. VVenC, an optimized open-source VVC encoder, offers multiple presets to address the trade-off between compression efficiency and encoder complexity. Although an optimized set of encoding tools has been selected for each preset, the rate-distortion (RD) search space within each preset still poses a challenge for efficient encoder implementations. This paper proposes Early Termination using Reference Frames (ETRF), which improves the trade-off between encoding efficiency and time complexity and positions itself as a new preset between the medium and fast presets. The CTU partitioning maps of reference frames in lower temporal layers are employed to accelerate the encoding of frames in higher temporal layers. The results show a reduction in encoding time of around 22% compared to the medium preset. Specifically, for videos with high spatial and temporal complexities, which typically require longer encoding times, the proposed method shows an improved BDBR/BDT compared to the fast preset.
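
A toy Python illustration of the early-termination rule follows: once the search reaches the split depth chosen for the co-located CTU in the lower-temporal-layer reference frame, deeper partitions are pruned. The depth bookkeeping and the margin parameter are assumptions for illustration, not VVenC internals.

```python
# Toy illustration of ETRF-style early termination; not VVenC internals.
MAX_DEPTH = 4  # assumed maximum CTU split depth

def depths_to_test(colocated_ref_depth, margin=0):
    """Depths evaluated for a CTU in a higher temporal layer when the
    co-located CTU in the reference frame stopped at `colocated_ref_depth`;
    anything deeper is pruned from the RD search."""
    return list(range(min(colocated_ref_depth + margin, MAX_DEPTH) + 1))

print(depths_to_test(2))            # [0, 1, 2]: depths 3 and 4 are skipped
print(depths_to_test(2, margin=1))  # [0, 1, 2, 3]: with a safety margin of one
```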


CLIP-DQA: Blindly Evaluating Dehazed Images from Global and Local Perspectives Using CLIP

CLIP-DQA: Blindly Evaluating Dehazed Images from Global and Local Perspectives Using CLIP

The IEEE International Symposium on Circuits and Systems (IEEE ISCAS 2025)

https://2025.ieee-iscas.org/

25–28 May 2025 // London, United Kingdom

[PDF]

Yirui Zeng (Cardiff University, UK), Jun Fu (Cardiff University), Hadi Amirpour (AAU, Austria), Huasheng Wang (Alibaba Group), Guanghui Yue (Shenzhen University, China), Hantao Liu (Cardiff University), Ying Chen (Alibaba Group), Wei Zhou (Cardiff University)

Abstract: Blind dehazed image quality assessment (BDQA), which aims to accurately predict the visual quality of dehazed images without any reference information, is essential for the evaluation, comparison, and optimization of image dehazing algorithms. Existing learning-based BDQA methods have achieved remarkable success, but the small scale of DQA datasets limits their performance. To address this issue, in this paper, we propose to adapt Contrastive Language-Image Pre-Training (CLIP), pre-trained on large-scale image-text pairs, to the BDQA task. Specifically, inspired by the fact that the human visual system understands images through hierarchical features, we feed both global and local information of the dehazed image into CLIP. To accurately map this hierarchical input to a quality score, we tune both the vision branch and the language branch of CLIP with prompt learning. Experimental results on two authentic DQA datasets demonstrate that our proposed approach, named CLIP-DQA, achieves more accurate quality predictions than existing BDQA methods.
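
For intuition, below is a minimal scoring sketch in the spirit of the paper, written against the public OpenAI CLIP package (pip install git+https://github.com/openai/CLIP). Fixed antonym prompts and simple corner/center crops stand in for the learned prompts and the paper’s global/local fusion, so this approximates the idea rather than reproducing CLIP-DQA; it also assumes input images larger than 224x224 pixels.

```python
# Approximation of CLIP-based dehazed-image quality scoring; the prompts,
# crops, and averaging are stand-ins for the paper's learned components.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Fixed antonym prompts; CLIP-DQA instead learns prompts for both branches.
prompts = clip.tokenize(["a high-quality dehazed photo",
                         "a low-quality dehazed photo"]).to(device)

def local_crops(img, size=224):
    """Corner and center crops as a stand-in for the paper's local inputs
    (assumes the image is larger than `size` in both dimensions)."""
    w, h = img.size
    boxes = [(0, 0), (w - size, 0), (0, h - size),
             (w - size, h - size), ((w - size) // 2, (h - size) // 2)]
    return [img.crop((x, y, x + size, y + size)) for x, y in boxes]

def quality_score(img):
    """Average, over the global view and local crops, of the probability
    that the image matches the 'high quality' prompt."""
    views = [img] + local_crops(img)
    batch = torch.stack([preprocess(v) for v in views]).to(device)
    with torch.no_grad():
        img_f = model.encode_image(batch)
        txt_f = model.encode_text(prompts)
        img_f = img_f / img_f.norm(dim=-1, keepdim=True)
        txt_f = txt_f / txt_f.norm(dim=-1, keepdim=True)
        probs = (100.0 * img_f @ txt_f.T).softmax(dim=-1)  # CLIP logit scale
    return probs[:, 0].mean().item()

# score = quality_score(Image.open("dehazed.png").convert("RGB"))
```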
