We cordially invite you to join the Long Night of Research / Lange Nacht der Forschung at the University of Klagenfurt on May 20, 2022, 4pm-11pm.
ATHENA will be present with two booths as follows.
#1: L70: How does video streaming work? (Lakeside Park, B12b.1.1)
More than 60% of the internet data volume is video content consumed via streaming services like YouTube, Netflix, or Flimmit. We show how video streaming services work. It is essential that videos are played back at the best possible quality on a wide range of end devices. With us, you can gain hands-on experience and learn to recognize differences in perceived quality. In particular, we will show an animation of how video streaming actually works, and visitors will be able to compare videos of different qualities and run some exciting experiments.
#2: U25: How to make video streaming faster and better! (Main University Building; jointly with Bitmovin)
Bitmovin, founded by graduates and employees of the University of Klagenfurt, is a global leader in online video technology. The aim is to develop new technologies that will improve the video streaming experience in the future, for example, through smooth image quality. We will show you the latest achievements from ATHENA and how video streaming can become even more innovative in the future.
Optimizing QoE in Live Streaming over Wireless Networks using Machine Learning Techniques
CHIST-ERA Conference 2022 [Website][Abstract]
May 25, 2022, Edinburgh, UK
Empowered by today’s rich tools for media generation and collaborative production, and by convenient wireless access (e.g., WiFi and cellular networks) to the Internet, crowdsourced live streaming over wireless networks has become very popular. However, crowdsourced wireless live streaming presents unique video delivery challenges that force a difficult tradeoff among three core factors: bandwidth, computation/storage, and latency. The resources available to non-professional live streamers (e.g., radio channels and bandwidth) are limited and unstable, which can impair the streaming quality and the viewers’ experience. Moreover, the diverse live interactions among streamers and viewers can further worsen the problem. Recent technologies such as Software-Defined Networking (SDN), Network Function Virtualization (NFV), Mobile Edge Computing (MEC), and 5G facilitate providing crowdsourced live streaming applications to mobile users in wireless networks. However, some open issues remain. One of the most critical is how to allocate an optimal amount of resources, in terms of bandwidth, computation power, and storage, to meet the required latency while increasing the QoE perceived by end users. Due to the NP-complete nature of this problem, machine learning techniques have been employed to optimize various components on the streaming delivery path from the streamers to the end users joining the network through wireless links. Furthermore, to tackle the scalability issue, we need to push our solutions toward distributed machine learning techniques.
In this short talk, we will first introduce the main issues and challenges of current crowdsourced live streaming systems over wireless networks, and then highlight the opportunities for leveraging machine learning techniques in different parts of the video streaming path, ranging from the encoding and packaging algorithms at the streamers to the radio channel allocation at the viewers, to enhance the overall QoE at a reasonable resource cost.
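The flavor of the resource allocation problem sketched above can be illustrated with a deliberately simplified toy example of ours (not from the talk): splitting a shared uplink bandwidth budget across live streamers, with perceived quality modeled by a simple logarithmic utility — a common concave stand-in for QoE versus bitrate.

```python
import math

def allocate_bandwidth(demands_mbps, budget_mbps, step=0.1):
    """Greedy water-filling: repeatedly give `step` Mbps to the
    streamer with the largest marginal QoE gain, until the budget
    runs out or everyone's demand is met."""
    alloc = [0.0] * len(demands_mbps)
    remaining = budget_mbps
    while remaining >= step:
        # marginal gain of the log-utility, zero once a streamer's
        # demand (its encoding bitrate) is already covered
        gains = [
            math.log1p(a + step) - math.log1p(a) if a + step <= d else 0.0
            for a, d in zip(alloc, demands_mbps)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] == 0.0:
            break  # all streamers saturated
        alloc[best] += step
        remaining -= step
    return alloc

alloc = allocate_bandwidth([4.0, 8.0, 2.0], budget_mbps=10.0)
print([round(a, 1) for a in alloc])
```

Because the utility is concave and identical per streamer, the greedy loop equalizes allocations until a streamer's demand caps out; the real problem in the talk is far harder because bandwidth, computation, storage, and latency interact.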
CoPaM: Cost-aware VM Placement and Migration for Mobile Services in Multi-Cloudlet Environment: An SDN-based Approach
Elsevier Computer Communications Journal
Shirzad Shahryari (Ferdowsi University of Mashhad, Mashhad, Iran), Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt), Seyed-Amin Hosseini-Seno (Ferdowsi University of Mashhad, Mashhad, Iran).
Abstract: Edge Cloud Computing (ECC) is a new approach for bringing Mobile Cloud Computing (MCC) services closer to mobile users in order to facilitate the execution of complicated applications on resource-constrained mobile devices. The main objective of the ECC solution with the cloudlet approach is to mitigate latency and augment the available bandwidth. This is done by deploying servers (a.k.a. “cloudlets”) close to the user’s device at the edge of the cellular network. As user requests mount, the resource constraints of a single cloudlet lead to resource shortages. This challenge, however, can be overcome by networking cloudlets so that they share their resources. On the other hand, when considering users’ mobility along with the limited resources of the cloudlets serving them, the user-cloudlet communication may need to go through multiple hops, which can seriously affect the communication delay between them and the quality of service (QoS).
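As a rough illustration of the placement problem the paper addresses (a toy sketch of ours, not the CoPaM algorithm itself): a naive strategy assigns each user's VM to its home cloudlet while capacity lasts, then spills over to neighboring cloudlets — which is exactly the multi-hop case whose extra delay CoPaM aims to control.

```python
def place_vms(users, cloudlets, hop_delay_ms=5.0):
    """users: list of home-cloudlet ids; cloudlets: dict id -> capacity.
    Returns one (host, delay_ms) pair per user, where delay grows with
    the hop distance between home cloudlet and host (here approximated
    by the index gap in the cloudlet list)."""
    free = dict(cloudlets)
    ids = list(cloudlets)
    placement = []
    for home in users:
        # try the home cloudlet first, then neighbors by distance
        order = sorted(ids, key=lambda c: abs(ids.index(c) - ids.index(home)))
        for c in order:
            if free[c] > 0:
                free[c] -= 1
                hops = abs(ids.index(c) - ids.index(home))
                placement.append((c, hops * hop_delay_ms))
                break
        else:
            placement.append((None, float("inf")))  # no capacity anywhere
    return placement

# three users homed at cloudlet "A", which only has room for two VMs
print(place_vms(["A", "A", "A"], {"A": 2, "B": 1}))
# -> [('A', 0.0), ('A', 0.0), ('B', 5.0)]
```

The third user is pushed one hop away and pays the extra delay; CoPaM's contribution is deciding placement and migration while weighing such delays against migration and operating costs.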
EUVIP 2022 Special Session on
“Machine Learning for Immersive Content Processing”
September, 2022, Lisbon, Portugal
- Hadi Amirpour, University of Klagenfurt, Austria
- Christine Guillemot, INRIA, France
- Christian Timmerer, University of Klagenfurt, Austria
Remote communication has become increasingly important, particularly since the COVID-19 crisis. However, bringing a truly realistic visual experience requires more than the traditional two-dimensional (2D) interfaces we know today. Immersive media such as 360-degree video, light fields, point clouds, ultra-high definition, and high dynamic range can fill this gap. These modalities, however, face several challenges from capture to display. Learning-based solutions show great promise in addressing these challenges, often significantly outperforming traditional approaches. In this special session, we focus on research aimed at extending and improving the use of learning-based architectures for immersive imaging technologies.
Paper Submissions: 6th June, 2022
Paper Notifications: 11th July, 2022
2022 IEEE International Conference on Multimedia and Expo (ICME) Industry & Application Track
July 18-22, 2022 | Taipei, Taiwan
Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Feldmann (Bitmovin, Austria), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)
In live streaming applications, typically a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is used during the entire streaming session in order to avoid the additional latency of finding scene transitions and optimized bitrate-resolution pairs for every video content. However, a bitrate ladder optimized per scene may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces an Online Per-Scene Encoding (OPSE) scheme for adaptive HTTP live streaming applications. In this scheme, scene transitions and optimized bitrate-resolution pairs for every scene are predicted using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features. Experimental results show that, on average, OPSE yields bitrate savings of up to 48.88% in certain scenes to maintain the same VMAF, compared to the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional latency in streaming.
The bitrate ladder prediction envisioned using OPSE.
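What a DCT-energy-based complexity feature looks like can be sketched in a few lines (our own simplified illustration; OPSE's exact feature definitions and extraction differ in detail): spatial complexity as the average non-DC DCT energy of frame blocks, and temporal complexity as the same measure applied to the frame difference.

```python
import numpy as np

def dct2(block):
    """2-D DCT-II of a square block via the orthonormal DCT matrix."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m @ block @ m.T

def texture_energy(frame, block=32):
    """Average absolute DCT energy of non-DC coefficients -- a rough
    spatial-complexity (texture) feature for a grayscale frame."""
    h, w = frame.shape
    total, count = 0.0, 0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            c = dct2(frame[y:y + block, x:x + block].astype(float))
            c[0, 0] = 0.0  # drop the DC term (average brightness)
            total += np.abs(c).sum()
            count += 1
    return total / count

def temporal_energy(prev_frame, frame, block=32):
    """Temporal complexity: texture energy of the frame difference."""
    return texture_energy(np.abs(frame.astype(float) - prev_frame), block)
```

A flat frame scores zero while a noisy one scores high; a predictor can then map such cheap features to good bitrate-resolution pairs without running trial encodings.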
2022 IEEE International Conference on Multimedia and Expo (ICME)
July 18-22, 2022 | Taipei, Taiwan
Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)
In live streaming applications, typically a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is used for simplicity and efficiency in order to avoid the additional encoding run-time required to find optimum bitrate-resolution pairs for every video content. However, an optimized bitrate ladder may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces a perceptually-aware per-title encoding (PPTE) scheme for video streaming applications. In this scheme, optimized bitrate-resolution pairs are predicted online based on the Just Noticeable Difference (JND) in quality perception to avoid adding perceptually similar representations to the bitrate ladder. To this end, Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features are used for each video segment. Experimental results show that, on average, PPTE yields bitrate savings of 16.47% and 27.02% to maintain the same PSNR and VMAF, respectively, compared to the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional latency in streaming, accompanied by a 30.69% cumulative decrease in the storage space needed for the various representations.
Architecture of PPTE
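The JND-pruning idea behind PPTE can be illustrated with a toy sketch (our own; PPTE predicts quality with a learned model rather than using a fixed threshold, and the six-VMAF-point JND below is only a commonly cited rule of thumb): given a candidate ladder annotated with predicted quality scores, drop any representation that is less than one JND better than the last one kept.

```python
def prune_ladder(ladder, jnd=6.0):
    """ladder: list of (bitrate_kbps, predicted_vmaf), ascending bitrate.
    Keeps a representation only if it improves predicted quality by at
    least one JND over the previously kept one."""
    kept = [ladder[0]]
    for bitrate, vmaf in ladder[1:]:
        if vmaf - kept[-1][1] >= jnd:
            kept.append((bitrate, vmaf))
    return kept

# hypothetical candidate ladder with made-up predicted VMAF scores
ladder = [(145, 35.0), (300, 44.0), (600, 52.0), (900, 55.0),
          (1600, 63.0), (2400, 68.0), (3400, 74.0)]
print(prune_ladder(ladder))
# -> [(145, 35.0), (300, 44.0), (600, 52.0), (1600, 63.0), (3400, 74.0)]
```

Two of the seven candidates are perceptually redundant and never get encoded, which is where the storage savings reported in the abstract come from.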
ICME Workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience (ICMEW)
July 18-22, 2022 | Taipei, Taiwan
Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)
Abstract: Light field imaging enables post-capture actions such as refocusing and changing the view perspective by capturing both spatial and angular information. However, capturing this richer information about the 3D scene results in a huge amount of data. To improve the compression efficiency of existing light field compression methods, we investigate the impact of light field super-resolution approaches (both spatial and angular super-resolution) on compression efficiency. To this end, we first downscale light field images over (i) spatial resolution, (ii) angular resolution, and (iii) spatial-angular resolution, and encode them using Versatile Video Coding (VVC). We then apply a set of light field super-resolution deep neural networks to reconstruct the light field images at their full spatial-angular resolution and compare their compression efficiency. Experimental results show that encoding the low angular resolution light field image and applying angular super-resolution yields bitrate savings of 51.16% and 53.41% to maintain the same PSNR and SSIM, respectively, compared to encoding the light field image at high resolution.
Keywords: Light field, Compression, Super-resolution, VVC.
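The difference between the spatial and angular downscaling options compared in the paper is easy to see if the light field is held as a 4-D array (a sketch of ours with nearest-neighbour decimation for brevity; the paper's actual scaling filters, view counts, and the VVC encoding step are omitted):

```python
import numpy as np

# (u, v) index the angular views; (y, x) index the pixels of each view
lf = np.zeros((9, 9, 512, 512), dtype=np.uint8)  # 9x9 views of 512x512

spatial_down = lf[:, :, ::2, ::2]   # halve each view's pixel resolution
angular_down = lf[::2, ::2, :, :]   # keep every other view (5x5 grid)

print(spatial_down.shape)  # (9, 9, 256, 256)
print(angular_down.shape)  # (5, 5, 512, 512)
```

The encoder then sees far fewer samples in either case, and a super-resolution network is responsible for restoring the discarded dimension after decoding — angular restoration being the variant that performed best in the reported experiments.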