ATHENA Christian Doppler (CD) Laboratory

EUVIP 2022 Special Session: Machine Learning for Immersive Content Processing

Posted on May 9, 2022 by

EUVIP 2022 Special Session on Machine Learning for Immersive Content Processing

September, 2022, Lisbon, Portugal

Organizers:

Hadi Amirpour, University of Klagenfurt, Austria
Christine Guillemot, INSA, France
Christian Timmerer, University of Klagenfurt, Austria

Brief description:

Remote communication is becoming increasingly important, particularly after the COVID-19 crisis. However, a more realistic visual experience requires more than the traditional two-dimensional (2D) interfaces we know today. Immersive media such as 360-degree video, light fields, point clouds, ultra-high-definition, high dynamic range, etc. can fill this gap. These modalities, however, face several challenges from capture to display. Learning-based solutions show great promise in improving on traditional solutions to these challenges. In this special session, we will focus on research works aimed at extending and improving the use of learning-based architectures for immersive imaging technologies.

Important dates:

Paper Submissions: 6th June, 2022
Paper Notifications: 11th July, 2022

Posted in ATHENA | Comments Off

LiDeR: Lightweight Dense Residual Network for Video Super-Resolution on Mobile Devices

Posted on May 7, 2022 by

IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP 2022)

June 26-29, 2022 | Nafplio, Greece

Conference Website

[PDF][Slides][Video]

Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Minh Nguyen (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: Video is now an essential part of the Internet. The increasing popularity of video streaming on mobile devices and the improvement in mobile displays have made it challenging to meet user expectations. Deep neural networks (DNNs) have been applied successfully to several computer vision tasks such as super-resolution (SR). Although DNN-based SR methods significantly improve over traditional methods, their computational complexity makes them challenging to apply on devices with limited power, such as smartphones. With the improvement in mobile hardware, especially GPUs, it is now possible to use DNN-based solutions, though existing DNN-based SR solutions are still too complex. This paper proposes LiDeR, a lightweight video SR network specifically tailored toward mobile devices. Experimental results show that LiDeR can achieve competitive SR performance with state-of-the-art networks while improving the execution speed significantly, i.e., 267% for X4 upscaling and 353% for X2 upscaling compared to ESPCN.

Keywords: Super-resolution, Mobile machine learning, Video super-resolution.


OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming

Posted on May 7, 2022 by

2022 IEEE International Conference on Multimedia and Expo (ICME) Industry & Application Track

July 18-22, 2022 | Taipei, Taiwan

[PDF][Slides][Video]

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Feldmann (Bitmovin, Austria), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract:

In live streaming applications, typically a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is used during the entire streaming session in order to avoid the additional latency to find scene transitions and optimized bitrate-resolution pairs for every video content. However, an optimized bitrate ladder per scene may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces an Online Per-Scene Encoding (OPSE) scheme for adaptive HTTP live streaming applications. In this scheme, scene transitions and optimized bitrate-resolution pairs for every scene are predicted using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features. Experimental results show that, on average, OPSE yields bitrate savings of up to 48.88% in certain scenes to maintain the same VMAF, compared to the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional latency in streaming.
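To make the phrase "bitrate savings to maintain the same VMAF" concrete, here is a minimal sketch that compares two rate-quality curves at a single target quality using linear interpolation. It is a simplification for illustration, not the evaluation procedure of the paper (which is not described in this post). The function names and all data points are hypothetical.

```python
from bisect import bisect_left


def bitrate_at_quality(curve, target):
    """Linearly interpolate the bitrate (kbps) needed to reach `target` quality.

    `curve` is a list of (bitrate_kbps, quality) points with strictly
    increasing quality values.
    """
    qualities = [q for _, q in curve]
    if not qualities[0] <= target <= qualities[-1]:
        raise ValueError("target quality outside the measured range")
    i = bisect_left(qualities, target)
    if qualities[i] == target:
        return float(curve[i][0])
    (r0, q0), (r1, q1) = curve[i - 1], curve[i]
    return r0 + (r1 - r0) * (target - q0) / (q1 - q0)


def bitrate_saving_percent(reference, optimized, target):
    """Percentage of bitrate saved by `optimized` at the same target quality."""
    ref = bitrate_at_quality(reference, target)
    opt = bitrate_at_quality(optimized, target)
    return 100.0 * (ref - opt) / ref


# Hypothetical (bitrate_kbps, VMAF) points, made up for this example only.
reference = [(1000, 70.0), (2000, 80.0), (4000, 90.0)]
optimized = [(800, 70.0), (1500, 80.0), (3000, 90.0)]

saving = bitrate_saving_percent(reference, optimized, target=85.0)
print(f"{saving:.2f}% bitrate saved at VMAF 85")
```

With these made-up points, the reference ladder needs 3000 kbps and the optimized ladder 2250 kbps to reach VMAF 85, which is a saving of 25%.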

The bitrate ladder prediction envisioned using OPSE.

https://www.slideshare.net/christian.timmerer/opse-online-perscene-encoding-for-adaptive-http-live-streaming


Perceptually-aware Per-title Encoding for Adaptive Video Streaming

Posted on May 4, 2022 by

2022 IEEE International Conference on Multimedia and Expo (ICME)

July 18-22, 2022 | Taipei, Taiwan

Conference Website

[PDF][Slides][Video]

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract:

In live streaming applications, typically a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is used for simplicity and efficiency in order to avoid the additional encoding run-time required to find optimum resolution-bitrate pairs for every video content. However, an optimized bitrate ladder may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces a perceptually-aware per-title encoding (PPTE) scheme for video streaming applications. In this scheme, optimized bitrate-resolution pairs are predicted online based on Just Noticeable Difference (JND) in quality perception to avoid adding perceptually similar representations in the bitrate ladder. To this end, Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment are used. Experimental results show that, on average, PPTE yields bitrate savings of 16.47% and 27.02% to maintain the same PSNR and VMAF, respectively, compared to the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional latency in streaming, accompanied by a 30.69% cumulative decrease in storage space for various representations.
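The idea of leaving perceptually similar representations out of the ladder can be illustrated with a small sketch. It greedily keeps a representation only if its quality is at least one JND above the last kept one. This is an assumption made for illustration, not the prediction method of the paper. The field names, the quality scores, and the JND threshold of 6 quality points are all hypothetical.

```python
def prune_ladder(candidates, jnd):
    """Keep only representations at least `jnd` quality points above the last kept one.

    `candidates` is a list of dicts with "bitrate_kbps" and "quality" keys.
    The lowest-bitrate representation is always kept.
    """
    kept = []
    for rep in sorted(candidates, key=lambda r: r["bitrate_kbps"]):
        if not kept or rep["quality"] - kept[-1]["quality"] >= jnd:
            kept.append(rep)
    return kept


# Hypothetical ladder with made-up quality scores (not measured values).
candidates = [
    {"bitrate_kbps": 145, "quality": 30.0},
    {"bitrate_kbps": 365, "quality": 45.0},
    {"bitrate_kbps": 730, "quality": 58.0},
    {"bitrate_kbps": 1100, "quality": 62.0},
    {"bitrate_kbps": 2000, "quality": 75.0},
    {"bitrate_kbps": 3000, "quality": 79.0},
    {"bitrate_kbps": 4500, "quality": 88.0},
]

pruned = prune_ladder(candidates, jnd=6.0)
print([rep["bitrate_kbps"] for rep in pruned])
```

In this example the 1100 kbps and 3000 kbps representations are dropped because each is less than one assumed JND above the previously kept representation. Fewer stored representations is the kind of effect behind the storage reduction reported in the abstract.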

Architecture of PPTE

https://www.slideshare.net/christian.timmerer/perceptuallyaware-pertitle-encoding-for-adaptive-video-streaming


LFC-SASR: Light Field Coding Using Spatial and Angular Super-Resolution

Posted on April 30, 2022 by

ICME Workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience (ICMEW)

July 18-22, 2022 | Taipei, Taiwan

Conference Website

[PDF]

Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: Light field imaging enables post-capture actions such as refocusing and changing view perspective by capturing both spatial and angular information. However, capturing richer information about the 3D scene results in a huge amount of data. To improve the compression efficiency of the existing light field compression methods, we investigate the impact of light field super-resolution approaches (both spatial and angular super-resolution) on the compression efficiency. To this end, firstly, we downscale light field images over (i) spatial resolution, (ii) angular resolution, and (iii) spatial-angular resolution and encode them using Versatile Video Coding (VVC). We then apply a set of light field super-resolution deep neural networks to reconstruct light field images in their full spatial-angular resolution and compare their compression efficiency. Experimental results show that encoding the low angular resolution light field image and applying angular super-resolution yield bitrate savings of 51.16 % and 53.41 % to maintain the same PSNR and SSIM, respectively, compared to encoding the light field image in high-resolution.

Keywords: Light field, Compression, Super-resolution, VVC.


MPEG awarded a Technology & Engineering Emmy® Award for DASH

Posted on April 28, 2022 by

MPEG, specifically ISO/IEC JTC 1/SC 29/WG 3 (MPEG Systems), has just been awarded a Technology & Engineering Emmy® Award for its ground-breaking MPEG-DASH standard. Dynamic Adaptive Streaming over HTTP (DASH) is the first international de jure standard that enables efficient streaming of video over the Internet, and it has changed the entire video streaming industry, including (but not limited to) on-demand, live, and low-latency streaming, and even 5G and the next generation of hybrid broadcast-broadband. The first edition was published in April 2012, and MPEG is currently working towards publishing the 5th edition, demonstrating an active and lively ecosystem that is still being developed and improved to address the requirements and challenges of modern media transport applications and services.

This award belongs to 90+ researchers and engineers from around 60 companies all around the world who participated in the development of the MPEG-DASH standard for over 12 years.

From left to right: Kyung-mo Park, Cyril Concolato, Thomas Stockhammer, Yuriy Reznik, Alex Giladi, Mike Dolan, Iraj Sodagar, Ali Begen, Christian Timmerer, Gary Sullivan, Per Fröjdh, Young-Kwon Lim, Ye-Kui Wang. (Photo © Yuriy Reznik)

Christian Timmerer, director of the Christian Doppler Laboratory ATHENA, chaired the evaluation of responses to the call for proposals and has since served as MPEG-DASH Ad-hoc Group (AHG) / Break-out Group (BoG) co-chair as well as co-editor for Part 2 of the standard. For a more detailed history of the MPEG-DASH standard, the interested reader is referred to Christian Timmerer’s blog post “HTTP Streaming of MPEG Media” (capturing the development of the first edition) and Nicolas Weill’s blog post “MPEG-DASH: The ABR Esperanto” (DASH timeline).


Multi-Codec Ultra High Definition 8K MPEG-DASH Dataset

Posted on April 6, 2022 by

The 13th ACM Multimedia Systems Conference (ACM MMSys 2022) Open Dataset and Software (ODS) track

June 14–17, 2022 | Athlone, Ireland

[PDF][Video]

Babak Taraghi (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt).

Abstract: Many applications produce multimedia traffic over the Internet. Video streaming is among them, with a rapidly growing demand for more bandwidth to deliver higher resolutions such as Ultra High Definition (UHD) 8K content. The HTTP Adaptive Streaming (HAS) technique defines baselines for audio-visual content streaming to balance the delivered media quality and minimize streaming session defects. In addition, video codec development and standardization contribute to this goal by introducing efficient algorithms and technologies. Versatile Video Coding (VVC) is one of the latest advancements in this area that is still not fully optimized and supported on all platforms. Such optimization and broad platform support require years of research and development. This paper offers a dataset that facilitates the research and development of the aforementioned technologies. Our open-source dataset comprises Dynamic Adaptive Streaming over HTTP (MPEG-DASH) multimedia test assets of encoded Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), AOMedia Video 1 (AV1), and VVC content with resolutions of up to 7680×4320 or 8K. Our dataset has a maximum media duration of 322 seconds, and we offer our MPEG-DASH packaged content with two segment lengths, 4 and 8 seconds.
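As a quick worked example of what the two segment lengths imply, the sketch below counts the segments for a sequence of the maximum stated duration. It assumes a sequence of exactly 322 seconds and that the final segment simply holds the remainder. The actual packaging of the dataset may differ, and the function names are hypothetical.

```python
import math


def segment_count(duration_s, segment_length_s):
    """Number of segments needed to cover the whole duration."""
    return math.ceil(duration_s / segment_length_s)


def last_segment_duration(duration_s, segment_length_s):
    """Duration of the final segment, which may be shorter than the others."""
    remainder = duration_s % segment_length_s
    return remainder if remainder else segment_length_s


# 322 s is the maximum media duration stated in the abstract.
for length in (4, 8):
    print(
        f"{length}-second segments: {segment_count(322, length)} segments, "
        f"last one {last_segment_duration(322, length)} s"
    )
```

Under these assumptions, a 322-second sequence yields 81 segments at 4 seconds and 41 segments at 8 seconds, with a final segment of 2 seconds in both cases.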