A Tutorial on Immersive Video Delivery: From Omnidirectional Video to Holography

IEEE Communications Surveys and Tutorials

[PDF]

Jeroen van der Hooft (Ghent University, Belgium), Hadi Amirpour (AAU, Austria), Maria Torres Vega (KU Leuven, Belgium), Yago Sanchez (Fraunhofer/HHI, Germany), Raimund Schatz (AIT, Austria), Thomas Schierl (Fraunhofer/HHI, Germany), and Christian Timmerer (AAU, Austria)

Abstract: Video services are evolving from traditional two-dimensional video to virtual reality and holograms, which offer six degrees of freedom to users, enabling them to freely move around in a scene and change focus as desired. However, this increase in freedom translates into stringent requirements in terms of ultra-high bandwidth (in the order of Gigabits per second) and minimal latency (in the order of milliseconds). To realize such immersive services, the network transport, as well as the video representation and encoding, have to be fundamentally enhanced. The purpose of this tutorial article is to provide an elaborate introduction to the creation, streaming, and evaluation of immersive video. Moreover, it aims to provide lessons learned and to point at promising research paths to enable truly interactive immersive video applications toward holography.

Keywords—Immersive video delivery, 3DoF, 6DoF, omnidirectional video, volumetric video, point clouds, meshes, light fields, holography, end-to-end systems

Posted in ATHENA | Comments Off on A Tutorial on Immersive Video Delivery: From Omnidirectional Video to Holography

Transcoding Quality Prediction for Adaptive Video Streaming

2023 ACM Mile High Video (MHV) 

May 7-10, 2023 | Denver, US

[PDF] [Slides]

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Reza Farahani (Alpen-Adria-Universität Klagenfurt), Prajit T Rajendran (Université Paris-Saclay), Mohammed Ghanbari (University of Essex), Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract:

In recent years, the proliferation of video streaming applications has increased the demand for Video Quality Assessment (VQA). Reduced reference video quality assessment (RR-VQA) is a category of VQA where certain features (e.g., texture, edges) of the original video are provided for quality assessment. It is a popular research area for various applications such as social media, online games, and video streaming. This paper introduces a reduced reference Transcoding Quality Prediction Model (TQPM) to determine the visual quality score of a video that may have been transcoded in multiple stages. The quality is predicted using Discrete Cosine Transform (DCT)-energy-based features of the video (i.e., the video’s brightness, spatial texture information, and temporal activity) and the target bitrate representation of each transcoding stage. To do that, the problem is formulated, and a Long Short-Term Memory (LSTM)-based quality prediction model is presented. Experimental results illustrate that, on average, TQPM yields PSNR, SSIM, and VMAF predictions with an R² score of 0.83, 0.85, and 0.87, respectively, and Mean Absolute Error (MAE) of 1.31 dB, 1.19 dB, and 3.01, respectively, for single-stage transcoding. Furthermore, an R² score of 0.84, 0.86, and 0.91, respectively, and MAE of 1.32 dB, 1.33 dB, and 3.25, respectively, are observed for a two-stage transcoding scenario. Moreover, the average processing time of TQPM for 4s segments is 0.328s, making it a practical VQA method in online streaming applications.
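The two accuracy figures reported in the abstract, the R² score and the Mean Absolute Error, can be computed as in the minimal sketch below. It only illustrates the textbook definitions of the two metrics; the function names and the sample VMAF values are hypothetical and are not taken from the paper or its evaluation.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between measured and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical VMAF scores for four segments (illustrative values only).
measured_vmaf = [80.0, 85.0, 90.0, 95.0]
predicted_vmaf = [82.0, 84.0, 91.0, 93.0]

mae = mean_absolute_error(measured_vmaf, predicted_vmaf)
r2 = r2_score(measured_vmaf, predicted_vmaf)
print(f"MAE = {mae:.2f}, R2 = {r2:.2f}")
```

An R² close to 1 means the predictions explain most of the variance of the measured scores, while the MAE is expressed in the unit of the metric itself (dB for PSNR, points for VMAF).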

An example scenario of VQA in adaptive streaming applications.


LLL-CAdViSE: Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation framework

IEEE Access, A Multidisciplinary, Open-access Journal of the IEEE

[PDF, GitHub, Slides, Video]

Babak Taraghi, Hermann Hellwagner, and Christian Timmerer
(AAU, Austria)

Low-latency live streaming by HTTP Chunked Transfer Encoding

Abstract: Live media streaming is a challenging task by itself, and the complexity multiplies for use cases that define low latency as a must. In a typical media streaming session, the main goal is to provide the highest possible Quality of Experience (QoE), which has proved to be measurable using quality models and various metrics. In a low-latency media streaming session, the requirements are to maintain the QoE while providing the lowest possible delay between the moment a frame of video is captured and the moment that frame is rendered on the client screen, also known as end-to-end (E2E) latency. As its primary contribution, this paper proposes a sophisticated cloud-based and open-source testbed that facilitates evaluating a low-latency live streaming session. The Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation (LLL-CAdViSE) framework can assess live streaming systems running on two major HTTP Adaptive Streaming (HAS) formats, Dynamic Adaptive Streaming over HTTP (MPEG-DASH) and HTTP Live Streaming (HLS). We use Chunked Transfer Encoding (CTE) to deliver Common Media Application Format (CMAF) chunks to the media players. Our testbed generates the test content (audiovisual streams). Therefore, no test sequence is required, and the encoding parameters (e.g., encoder, bitrate, resolution, latency) are defined separately for each experiment. We have integrated the ITU-T P.1203 quality model inside our testbed. As a secondary contribution, and to demonstrate the flexibility and power of LLL-CAdViSE, we have conducted a set of experiments with different network traces, media players, ABR algorithms, and various requirements (e.g., E2E latency (typical/reduced/low/ultra-low), diverse bitrate ladders, and catch-up logic), and we present the essential findings and the experimental results.
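To make the delivery mechanism more concrete, the sketch below frames a sequence of byte chunks with HTTP/1.1 chunked transfer coding: each chunk is sent as its size in hexadecimal, CRLF, the payload, and CRLF, and a zero-size chunk ends the body. This is a generic illustration of the wire format, not code from LLL-CAdViSE; the function names and the placeholder payloads are assumptions, and the decoder ignores chunk extensions and trailer fields.

```python
def encode_chunked(chunks):
    """Frame byte chunks using HTTP/1.1 chunked transfer coding."""
    body = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would prematurely end the body
        body += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    body += b"0\r\n\r\n"  # last-chunk marker followed by an empty trailer
    return bytes(body)


def decode_chunked(data):
    """Split a chunked body back into its chunks (no extensions/trailers)."""
    chunks = []
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            break
        chunks.append(data[pos:pos + size])
        pos += size + 2  # skip the payload and its trailing CRLF
    return chunks


# Placeholder payloads standing in for the parts of a CMAF chunk.
wire = encode_chunked([b"moof", b"mdat-payload"])
print(wire)
print(decode_chunked(wire))
```

Because every chunk carries its own length, a server can start sending a segment before the segment is complete, which is what allows a player to begin decoding before the full segment has been produced.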

Keywords: Live Streaming; Low-latency; HTTP Adaptive Streaming; Quality of Experience; Objective Evaluation; Open-source Testbed.


SPACE: Segment Prefetching and Caching at the Edge for Adaptive Video Streaming

IEEE Access

[PDF]

Jesús Aguilar Armijo (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt) and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Multi-access Edge Computing (MEC) is a new paradigm that brings storage and computing close to the clients. MEC enables the deployment of complex network-assisted mechanisms for video streaming that improve clients’ Quality of Experience (QoE). One of these mechanisms is segment prefetching, which transmits future video segments in advance to a location closer to the client so that content can be served with lower latency. In this work, for HAS-based (HTTP Adaptive Streaming) video streaming and specifically considering a cellular (e.g., 5G) network edge, we present our approach Segment Prefetching and Caching at the Edge for Adaptive Video Streaming (SPACE). We propose and analyze different segment prefetching policies that differ in resource utilization, player and radio metrics needed, and deployment complexity. This variety of policies can dynamically adapt to the network’s current conditions and the service provider’s needs. We present segment prefetching policies based on diverse approaches and techniques: past segment requests, segment transrating (i.e., reducing segment bitrate/quality), Markov prediction model, machine learning to predict future segment requests, and super-resolution. We study their performance and feasibility using metrics such as QoE characteristics, computing times, prefetching hits, and link bitrate consumption. We analyze and discuss which segment prefetching policy is better under which circumstances, as well as the influence of the client-side Adaptive Bit Rate (ABR) algorithm and the set of available representations (“bitrate ladder”) in segment prefetching. Moreover, we examine the impact on segment prefetching of different caching policies for (pre-)fetched segments, including Least Recently Used (LRU), Least Frequently Used (LFU), and our proposed popularity-based caching policy Least Popular Used (LPU).
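The difference between the two baseline caching policies named in the abstract, LRU and LFU, can be seen in a small sketch. The classes below are generic textbook versions written for illustration only; they are not the SPACE implementation, the proposed LPU policy is not modeled, and the segment keys are hypothetical. Both classes assume a capacity of at least one entry.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Evicts the entry that was accessed least recently."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry


class LFUCache:
    """Evicts the entry with the fewest accesses (ties: oldest insertion)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.hits = Counter()

    def get(self, key):
        if key not in self.items:
            return None
        self.hits[key] += 1
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim = min(self.items, key=lambda k: self.hits[k])
            del self.items[victim]
            del self.hits[victim]
        self.items[key] = value
        self.hits[key] += 1


# Same request pattern against both policies (hypothetical segment keys).
lru, lfu = LRUCache(2), LFUCache(2)
for cache in (lru, lfu):
    cache.put("seg1", b"a")
    cache.get("seg1")
    cache.get("seg1")
    cache.put("seg2", b"b")
    cache.get("seg2")
    cache.put("seg3", b"c")  # forces one eviction

print(sorted(lru.items))  # seg1 was used least recently and is evicted
print(sorted(lfu.items))  # seg2 was used least often and is evicted
```

With an identical request pattern, LRU drops the segment that was touched longest ago, whereas LFU keeps the segment that has been requested most often regardless of recency.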

Keywords: Adaptive video streaming, content delivery, HAS, edge computing, cellular network edge, MEC, segment prefetching, segment caching.


Farzad Tashtarian to give the initial talk of his habilitation at Klagenfurt University.

How to Optimize Dynamic Adaptive Video Streaming? Challenges and Solutions

Abstract: Empowered by today’s rich tools for media generation and collaborative production and convenient network access to the Internet, video streaming has become very popular. Dynamic adaptive video streaming is a technique used to deliver video content to users over the Internet, where the quality of the video adapts in real time based on the network conditions and the capabilities of the user’s device. HTTP Adaptive Streaming (HAS) has become the de-facto standard to provide a smooth and uninterrupted viewing experience, especially when network conditions frequently change. Improving the QoE of users concerning various applications’ requirements presents several challenges, such as network variability, limited resources, and device heterogeneity. For example, the available network bandwidth can vary over time, leading to frequent changes in the video quality. In addition, different users have different preferences and viewing habits, which can further complicate live streaming optimization. To address these challenges, researchers and engineers have developed various approaches to optimize dynamic adaptive streaming, such as QoE-driven adaptation, machine learning-based approaches, and multi-objective optimization. In this talk, we will give an introduction to the topic of video streaming and point out the significant challenges in the field. We will present a layered architecture for video streaming and then discuss a selection of approaches from our research addressing these challenges. For instance, we will present approaches to improve the QoE of clients in user-generated content applications in centralized and distributed fashions. Moreover, we will present a novel architecture for low-latency live streaming that is agnostic to the protocol and codecs and can work equally with existing HAS-based approaches.
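As a minimal illustration of how video quality can adapt to network conditions, the sketch below picks the highest representation from a bitrate ladder that fits within a fraction of the measured throughput. It is a generic rate-based heuristic and not one of the approaches presented in the talk; the function name, the ladder values, and the 0.8 safety factor are assumptions chosen for the example.

```python
def select_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a share of the throughput.

    Falls back to the lowest representation when nothing fits.
    """
    budget = throughput_kbps * safety
    candidates = [b for b in sorted(ladder_kbps) if b <= budget]
    return candidates[-1] if candidates else min(ladder_kbps)


# Hypothetical bitrate ladder in kbit/s.
ladder = [500, 1200, 2500, 5000]

for measured in (400, 1600, 3500, 9000):
    print(measured, "->", select_bitrate(ladder, measured))
```

The safety factor leaves headroom for throughput fluctuations; production players typically combine such a rule with buffer occupancy and smoothing of the throughput estimate.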


Patent approval for “Fast multi-rate encoding for adaptive HTTP streaming”

Fast multi-rate encoding for adaptive HTTP streaming

US Patent

[PDF]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

 

Abstract: According to embodiments of the disclosure, information of higher and lower quality encoded video segments is used to limit Rate-Distortion Optimization (RDO) for each Coding Tree Unit (CTU). A method first encodes the highest bit-rate segment and subsequently uses it to encode the lowest bit-rate video segment. The block structure and selected reference frame of both the highest and lowest bit-rate video segments are used to predict and shorten the RDO process for each CTU in middle bit-rates. The method delays just one frame using parallel processing. This approach provides time-complexity reduction compared to the reference software for middle bit-rates while degradation is negligible.
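One way to picture how two reference encodes can shorten the RDO search is sketched below. The abstract does not spell out the exact rule, so this sketch assumes, purely for illustration, that an intermediate bit-rate encode only evaluates block partition depths lying between the depths chosen for the co-located block in the lowest and highest bit-rate encodes. The function name, the depth range 0 to 3, and the example depths are all hypothetical.

```python
MAX_DEPTH = 3  # assumed deepest partition level for the example


def candidate_depths(depth_highest, depth_lowest):
    """Depths an intermediate encode would test under the stated assumption.

    The search is bounded by the depths selected in the two reference
    encodes instead of covering every depth from 0 to MAX_DEPTH.
    """
    low, high = sorted((depth_highest, depth_lowest))
    return list(range(low, high + 1))


full_search = list(range(MAX_DEPTH + 1))

# Hypothetical depths for one block in the two reference encodes.
bounded = candidate_depths(depth_highest=3, depth_lowest=2)
print("full search:", full_search)
print("bounded search:", bounded)
print("depth evaluations skipped:", len(full_search) - len(bounded))
```

When both reference encodes agree on a depth, the bounded search collapses to a single candidate, which is where most of the time saving in such a scheme would come from.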

 


VCA v2.0 released!

Another Valentine’s Day, another gift from ATHENA! We are delighted to share Video Complexity Analyzer (VCA) version 2.0. The new release adds benefits over our previous release, v1.5, such as support for analyzing High Dynamic Range (HDR) videos. This version is also faster, courtesy of the low-pass DCT optimizations. We expect this release of VCA to facilitate research on content-adaptive encoding, video quality assessment, and other research topics more than ever.

A command-line executable is provided to facilitate testing and development. VCA is available as an open-source library published under the GPLv3 license. For more details, please see the online software documentation. The source code is also available online.

Heatmap depiction of the luma texture information features extracted from the second frame of the CoverSong_1080P_0a86 video of the YouTube UGC Dataset.
