MMM’21: Towards Optimal Multirate Encoding for HTTP Adaptive Streaming

The International MultiMedia Modeling Conference (MMM)

25-27 January 2021, Prague, Czech Republic

https://mmm2021.cz

[PDF][Slides][Video]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: HTTP Adaptive Streaming (HAS) enables high-quality streaming of video content. In HAS, videos are divided into short intervals called segments, and each segment is encoded at various qualities/bitrates to adapt to the available bandwidth. Multiple encodings of the same content impose high costs on video content providers. To reduce the time-complexity of encoding multiple representations, state-of-the-art methods typically encode the highest quality representation first and reuse the information gathered during its encoding to accelerate the encoding of the remaining representations. As encoding the highest quality representation requires the highest time-complexity compared to the lower quality representations, it becomes a bottleneck in parallel encoding scenarios, and the overall time-complexity is limited to the time-complexity of the highest quality representation. To address this problem, in this paper we consider every representation, from the highest to the lowest quality, as a potential single reference to accelerate the encoding of the other, dependent representations. We formulate a set of encoding modes and assess their performance in terms of BD-Rate and time-complexity, using both VMAF and PSNR as objective metrics. Experimental results show that encoding a middle quality representation as a reference can significantly reduce the maximum encoding complexity and hence is an efficient way of encoding multiple representations in parallel. Based on this fact, a fast multirate encoding method is proposed which utilizes the depth and prediction mode of a middle quality representation to accelerate the encoding of the dependent representations.
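To make the idea concrete, below is a minimal Python sketch of how a reference representation's CU depth decisions could bound the depth search of dependent representations; the function name, the depth-window heuristic, and the depth limits are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal sketch (not the paper's exact algorithm): reuse CU depth
# decisions from a middle-quality reference encoding to narrow the
# depth search range when encoding dependent representations.

def depth_search_range(ref_depth: int, is_higher_quality: bool) -> range:
    """Heuristic: higher-quality dependents tend to split at least as
    deep as the reference; lower-quality dependents at most as deep."""
    MAX_DEPTH = 3  # HEVC CTU depths 0..3 (64x64 down to 8x8)
    if is_higher_quality:
        return range(ref_depth, MAX_DEPTH + 1)
    return range(0, ref_depth + 1)

# Example: the reference (middle quality) chose depth 2 for a CTU.
print(list(depth_search_range(2, is_higher_quality=True)))   # [2, 3]
print(list(depth_search_range(2, is_higher_quality=False)))  # [0, 1, 2]
```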

Keywords: HEVC, Video Encoding, Multirate Encoding, DASH

ISM’20: Dynamic Segment Repackaging at the Edge for HTTP Adaptive Streaming

IEEE International Symposium on Multimedia (ISM)

2-4 December 2020, Naples, Italy

https://www.ieee-ism.org/

[PDF][Slides][Video]

Jesús Aguilar Armijo (Alpen-Adria-Universität Klagenfurt), Babak Taraghi (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Adaptive video streaming systems typically support different media delivery formats, e.g., MPEG-DASH and HLS, replicating the same content multiple times into the network. Such a diversified system results in inefficient use of storage, caching, and bandwidth resources. The Common Media Application Format (CMAF) emerges to simplify HTTP Adaptive Streaming (HAS), providing a single encoding and packaging format of segmented media content and offering the opportunities of bandwidth savings, more cache hits, and less storage needed. However, CMAF is not yet supported by most devices. To solve this issue, we present a solution where we maintain the main advantages of CMAF while supporting heterogeneous devices using different media delivery formats. For that purpose, we propose to dynamically convert the content from CMAF to the desired media delivery format at an edge node. We study the bandwidth savings with our proposed approach using an analytical model and simulation, resulting in bandwidth savings of up to 20% with different media delivery format distributions. We analyze the runtime impact of the required operations on the segmented content in two scenarios: the classic one, with four different media delivery formats, and the proposed scenario, using CMAF-only delivery through the network. We compare both scenarios with different edge compute power assumptions. Finally, we perform experiments in a real video streaming testbed delivering MPEG-DASH using CMAF content to serve a DASH and an HLS client, performing the media conversion for the latter.
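For illustration, the following hypothetical Python sketch captures the gist of edge repackaging: the edge caches a single CMAF copy of each segment and converts it per client request. All names, the stub converter, and the in-memory cache are illustrative stand-ins, not the paper's implementation.

```python
# Hypothetical sketch of the edge-repackaging idea: cache one CMAF
# copy of each segment and convert it to the client's delivery
# format on request, instead of storing one copy per format.

def fetch_from_origin(url: str) -> bytes:
    return b"...cmaf (fmp4) segment bytes..."  # stub for illustration

def repackage_to_ts(segment: bytes) -> bytes:
    # Placeholder: a real edge node would remux fMP4 -> MPEG-TS here.
    return segment

CONVERTERS = {
    "cmaf": lambda seg: seg,  # pass-through, e.g., for a DASH client
    "ts": repackage_to_ts,    # e.g., for a legacy HLS client
}

cache: dict[str, bytes] = {}  # one CMAF copy per segment, not one per format

def serve_segment(url: str, client_format: str) -> bytes:
    if url not in cache:
        cache[url] = fetch_from_origin(url)  # fetch the CMAF original once
    return CONVERTERS[client_format](cache[url])

# Both clients are served from the same cached CMAF segment.
print(serve_segment("video/seg_001.m4s", "cmaf") == serve_segment("video/seg_001.m4s", "ts"))
```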

Keywords: CMAF, Edge Computing, HTTP Adaptive Streaming (HAS)

PCS’21 Special Session: Video encoding for large scale HAS deployments

Picture Coding Symposium (PCS)

29 June to 2 July 2021, Bristol, UK

https://pcs2021.org

Session organizers: Christian Timmerer (Bitmovin, Austria), Mohammad Ghanbari (University of Essex, UK), and Alex Giladi (Comcast, USA).

Abstract: Video accounts for the vast majority of today’s internet traffic, and video coding is vital for efficient distribution towards the end user. Software- and/or cloud-based video coding is becoming more and more attractive, specifically with the plethora of video codecs available right now (e.g., AVC, HEVC, VVC, VP9, AV1), as also reflected in the latest Bitmovin Video Developer Report 2020. Thus, improvements in video coding enabling efficient adaptive video streaming are a requirement for current and future video services. HTTP Adaptive Streaming (HAS) is now mainstream due to its simplicity, reliability, and standard support (e.g., MPEG-DASH). For HAS, the video is usually encoded in multiple versions (i.e., representations) of different resolutions, bitrates, codecs, etc., and each representation is divided into chunks (i.e., segments) of equal length (e.g., 2-10 sec) to enable dynamic, adaptive switching during streaming based on the user’s context conditions (e.g., network conditions, device characteristics, user preferences). In this context, most scientific papers in the literature target various improvements that are evaluated on open, standard test sequences. We argue that optimizing video encoding for large-scale HAS deployments is the next step in order to improve the Quality of Experience (QoE) while optimizing costs.
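For readers unfamiliar with HAS, a small illustrative Python snippet of such a representation set (a bitrate ladder) follows; the specific resolutions, bitrates, and segment length are assumed values for illustration, not prescribed by any standard.

```python
# Hypothetical HAS bitrate ladder: each representation is encoded
# separately and split into fixed-length segments for switching.

SEGMENT_LENGTH_SEC = 4  # typical values range from 2 to 10 seconds

BITRATE_LADDER = [
    # (width, height, bitrate_kbps)
    (3840, 2160, 16000),
    (1920, 1080,  6000),
    (1280,  720,  3000),
    ( 854,  480,  1500),
    ( 640,  360,   700),
]

def segment_count(duration_sec: float) -> int:
    # Number of segments per representation for a given video duration.
    return -(-int(duration_sec) // SEGMENT_LENGTH_SEC)  # ceiling division

# A 10-minute video yields 150 segments per representation,
# i.e., 750 encoded segments across this five-step ladder.
print(segment_count(600) * len(BITRATE_LADDER))
```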

IEEE Communications Magazine: From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom

Teaser: “Help me, Obi-Wan Kenobi. You’re my only hope,” said the hologram of Princess Leia in Star Wars: Episode IV – A New Hope (1977). This was the first time in cinematic history that the concept of holographic-type communication was illustrated. Almost five decades later, technological advancements are quickly moving this type of communication from science fiction to reality.

IEEE Communications Magazine

[PDF]

Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Tim Wauters (Ghent University), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), and Raimund Schatz (AIT Austrian Institute of Technology)

Abstract: Technological improvements are rapidly advancing holographic-type content distribution. Significant research efforts have been made to meet the low-latency and high-bandwidth requirements set forward by interactive applications such as remote surgery and virtual reality. Recent research made six degrees of freedom (6DoF) for immersive media possible, where users may both move their heads and change their position within a scene. In this article, we present the status and challenges of 6DoF applications based on volumetric media, focusing on the key aspects required to deliver such services. Furthermore, we present results from a subjective study to highlight relevant directions for future research.

MTAP paper: Automated Bank Cheque Verification Using Image Processing and Deep Learning Methods

Multimedia tools and applications (Springer Journal)

Prateek Agrawal (University of Klagenfurt, Austria), Deepak Chaudhary (Lovely Professional University, India), Vishu Madaan (Lovely Professional University, India), Anatoliy Zabrovskiy (University of Klagenfurt, Austria), Radu Prodan (University of Klagenfurt, Austria), Dragi Kimovski (University of Klagenfurt, Austria), Christian Timmerer (University of Klagenfurt, Austria)

Abstract: Automated bank cheque verification using image processing is an attempt to complement the present cheque truncation system, as well as to provide an alternative methodology for processing bank cheques with minimal human intervention. When it comes to the clearance of bank cheques and monetary transactions, the process should not only be reliable and robust but also save time, which is a major factor for countries with large populations. To perform the task of cheque verification, we developed a tool which acquires the key components of the cheque leaflet essential for cheque clearance, using image processing and deep learning methods. These components include the bank branch code, cheque number, legal as well as courtesy amount, account number, and signature patterns. Our innovation aims at benefiting the banking system by re-innovating cheque-based monetary transaction systems that require automated intervention. For this research, we used the Institute for Development and Research in Banking Technology (IDRBT) cheque dataset and deep-learning-based convolutional neural networks (CNNs), which gave us an accuracy of 99.14% for handwritten numeric character recognition, resulting in improved accuracy and a precise assessment of the handwritten components of bank cheques. For machine-printed script, we used MATLAB’s built-in OCR method, achieving a satisfactory accuracy of 97.7%. For signature verification, we used the Scale Invariant Feature Transform (SIFT) for feature extraction and a Support Vector Machine (SVM) as the classifier, achieving an accuracy of 98.10%.
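As a rough illustration of the signature-verification part, the following Python sketch combines OpenCV SIFT features with a scikit-learn SVM; the mean-pooling of descriptors into a fixed-length vector and the file names are simplifying assumptions, not the paper's exact pipeline.

```python
# Simplified sketch of the SIFT + SVM signature-verification idea,
# assuming OpenCV >= 4.4 (SIFT in the main module), scikit-learn,
# and that the listed image files exist. Mean-pooling SIFT
# descriptors into one 128-d vector is a simplification, not
# necessarily the paper's aggregation method.
import cv2
import numpy as np
from sklearn.svm import SVC

sift = cv2.SIFT_create()

def signature_features(image_path: str) -> np.ndarray:
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, descriptors = sift.detectAndCompute(img, None)
    if descriptors is None:  # blank image: no keypoints found
        return np.zeros(128)
    return descriptors.mean(axis=0)  # fixed-length summary vector

# Hypothetical training data: genuine (1) vs. forged (0) signatures.
paths = ["genuine_01.png", "genuine_02.png", "forged_01.png"]
labels = [1, 1, 0]
X = np.stack([signature_features(p) for p in paths])

clf = SVC(kernel="rbf").fit(X, labels)
query = signature_features("query_signature.png").reshape(1, -1)
print(clf.predict(query))  # 1 = genuine, 0 = forged (illustrative)
```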

Keywords: Cheque Truncation System, Image Segmentation, Bank Cheque Clearance, Image Feature Extraction, Convolutional Neural Network, Support Vector Machine, Scale Invariant Feature Transform.

Acknowledgment: This work has been partly supported by the European Union Horizon 2020 Research and Innovation Programme under the ARTICONF Project with grant agreement number 644179 and in part by the Austrian Research Promotion Agency (FFG) under the APOLLO project.

VCIP’20: FaME-ML: Fast Multirate Encoding for HTTP Adaptive Streaming Using Machine Learning

IEEE International Conference on Visual Communications and Image Processing (VCIP)

1-4 December 2020, Macau

http://www.vcip2020.org/

[PDF][Slides][Video]

Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: HTTP Adaptive Streaming (HAS) is the most common approach for delivering video content over the Internet. The requirement to encode the same content at different quality levels (i.e., representations) in HAS is a challenging problem for content providers. Fast multirate encoding approaches try to accelerate this process by reusing information from previously encoded representations. In this paper, we propose to use convolutional neural networks (CNNs) to speed up the encoding of multiple representations, with a specific focus on parallel encoding. In parallel encoding, the overall time-complexity is limited to the maximum time-complexity among the representations that are encoded in parallel. Therefore, instead of reducing the time-complexity for all representations, the highest time-complexities are reduced. Experimental results show that FaME-ML achieves significant time-complexity savings in parallel encoding scenarios (41% on average) with a slight increase in bitrate and quality degradation compared to the HEVC reference software.
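To give a flavor of the approach, below is a minimal PyTorch sketch of the kind of CNN that could predict a CTU split decision from a luma block; the architecture, input size, and output encoding are illustrative assumptions, not the actual FaME-ML network.

```python
# Illustrative PyTorch sketch (not the actual FaME-ML network):
# a small CNN that takes a 64x64 luma CTU and predicts whether to
# split it, so dependent representations can skip the RD search.
import torch
import torch.nn as nn

class SplitPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)  # split / no split

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# One random "CTU" as a smoke test: batch of 1, single luma channel.
logits = SplitPredictor()(torch.rand(1, 1, 64, 64))
print(logits.argmax(dim=1))  # 0 = no split, 1 = split (illustrative)
```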

Keywords: Video Coding, Convolutional Neural Networks, HEVC, HTTP Adaptive Streaming (HAS)

Interns at the ATHENA Christian Doppler Laboratory

In July 2020, the ATHENA Christian Doppler Laboratory hosted three interns working on the following topics:

  • Indira Pal: Quality of Experience for HTTP Adaptive Streaming based use cases and scenarios.
  • Ilja Pronegg: Machine learning methods for HTTP Adaptive Streaming.
  • Miriam Gütl: State-of-the-art video streaming technologies.

At the end of the internship, the interns presented their work and results in a presentation and a written report. We believe that the joint work was useful both for the laboratory and for the interns themselves. We would like to thank the interns for their productive work, useful results, and excellent feedback about our laboratory.

Indira Pal: “I liked the internship very much and felt like a part of the team very soon. This was most likely due to the awesome atmosphere at the Christian Doppler laboratory. In my brief time at ATHENA, I was able to learn a lot about video compression and the challenges associated with this fairly widespread topic. Hadi was an excellent supervisor because he was always at the office and let me work independently while still providing assistance when I needed it. I enjoyed getting to know a Computer Scientist’s/Engineer’s work and now have a much clearer view of what a PhD student’s or Postdoc’s work looks like. I am very glad that I applied for this internship as I believe that my work at ATHENA was truly meaningful and I gained a lot from this experience.”

Miriam Gütl: “I really enjoyed every day of my internship at ATHENA and I am actually quite sad that the four weeks passed by so quickly. Already after the first day I felt so integrated in the whole team, which was so great for the whole atmosphere. In general, the whole team was always so nice to us, which was a real inspiration. Also, I would say that I have never learned so much in such a short period of time. So I am really very thankful that I was taken for this job; it really taught me a lot and it was also extremely fun. When I started looking for an internship I wanted something like this, and I actually found it at ATHENA, so thank you.”

Ilja Pronegg: “The internship was awesome! Everyone was nice and the topics I worked on were very interesting. I would do a second internship here!”

We wish the interns every success in their journey through life, and we hope to see them back at the University of Klagenfurt and ATHENA soon.

ACM Multimedia’20: Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks

ACM International Conference on Multimedia 2020, Seattle, United States.
https://2020.acmmm.org

[PDF][Slides][Video]

Negin Ghamsarian (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Mario Taschwer (Alpen-Adria-Universität Klagenfurt), and Klaus Schöffmann (Alpen-Adria-Universität Klagenfurt)

Abstract: Recorded cataract surgery videos play a prominent role in training, in investigating the surgery, and in enhancing surgical outcomes. Due to storage limitations in hospitals, however, recorded cataract surgeries are deleted after a short time, and this precious source of information cannot be fully utilized. Lowering the quality to reduce the required storage space is not advisable, since the degraded visual quality results in the loss of relevant information that limits the usage of these videos. To address this problem, we propose a relevance-based compression technique consisting of two modules: (i) relevance detection, which uses neural networks for semantic segmentation and classification of the videos to detect relevant spatio-temporal information, and (ii) content-adaptive compression, which restricts the amount of distortion applied to the relevant content while allocating less bitrate to irrelevant content. The proposed relevance-based compression framework is implemented considering five scenarios based on the definition of relevant information from the target audience’s perspective. Experimental results demonstrate the capability of the proposed approach in relevance detection. We further show that the proposed approach can achieve high compression efficiency by abstracting substantial redundant information while retaining the high quality of the relevant content.
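The content-adaptive module can be pictured with a small Python sketch that turns a binary relevance mask into a per-block delta-QP map; the block size and QP offsets below are assumptions for illustration, not the paper's settings.

```python
# Illustrative sketch of content-adaptive compression: derive a
# per-block delta-QP map from a binary relevance mask so relevant
# regions are quantized more finely. Block size and QP offsets are
# assumptions for illustration, not the paper's exact values.
import numpy as np

BLOCK = 64           # map QP offsets per 64x64 block (CTU-sized)
DQP_RELEVANT = -4    # finer quantization for relevant content
DQP_IRRELEVANT = +6  # coarser quantization elsewhere

def delta_qp_map(relevance_mask: np.ndarray) -> np.ndarray:
    """relevance_mask: HxW array of 0/1 from the segmentation CNN."""
    h, w = relevance_mask.shape
    rows, cols = -(-h // BLOCK), -(-w // BLOCK)  # ceiling division
    dqp = np.full((rows, cols), DQP_IRRELEVANT, dtype=int)
    for r in range(rows):
        for c in range(cols):
            block = relevance_mask[r*BLOCK:(r+1)*BLOCK, c*BLOCK:(c+1)*BLOCK]
            if block.any():  # any relevant pixel marks the block relevant
                dqp[r, c] = DQP_RELEVANT
    return dqp

# Smoke test: a 1080p mask with one relevant region.
mask = np.zeros((1080, 1920), dtype=int)
mask[400:700, 800:1200] = 1
print(delta_qp_map(mask).shape)  # (17, 30) blocks
```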

Keywords: Video Coding, Convolutional Neural Networks, HEVC, ROI Detection, Medical Multimedia.

QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)

QUALINET announces its recent White Paper on Definitions of Immersive Media Experience (IMEx).

It is available online for free at https://arxiv.org/abs/2007.07032.

With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across different fields ranging from computer graphics to literary works. Immersion is oftentimes used interchangeably with presence as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions, scope, and constituents that are required to be addressed so that a coherent understanding of the concepts can be achieved. Such consensus is vital for paving the directionality of the future of immersive media experiences (IMEx) and all related matters.

The aim of this white paper is to provide a survey of definitions of immersion and presence which leads to a definition of immersive media experience (IMEx). The Quality of Experience (QoE) for immersive media is described by establishing a relationship between the concepts of QoE and IMEx followed by application areas of immersive media experience. Influencing factors on immersive media experience are elaborated as well as the assessment of immersive media experience. Finally, standardization activities related to IMEx are highlighted and the white paper is concluded with an outlook related to future developments.

This White Paper is a contribution by QUALINET, the European Network on Quality of Experience in Multimedia Systems and Services (http://www.qualinet.eu/) to the discussions related to Immersive Media Experience (IMEx). It is motivated by the need for definitions around this term to foster a deeper understanding of ideas and concepts originating from multidisciplinary groups but with a joint interest in multimedia experiences. Thus, this white paper has been created mainly with such multimedia experiences in mind but may be also used beyond.

The QUALINET community aims at extending the notion of network-centric Quality of Service (QoS) in multimedia systems, by relying on the concept of Quality of Experience (QoE). The main scientific objective is the development of methodologies for subjective and objective quality metrics taking into account current and new trends in multimedia communication systems as witnessed by the appearance of new types of content and interactions. QUALINET (2010-2014 as COST Action IC1003) meets once a year collocated with QoMEX (http://qomex.org/) to coordinate its activities around 4 Working Groups (WGs): (i) research, (ii) standardization, (iii) training, and (iv) innovation.

List of Authors and Contributors
Andrew Perkis (andrew.perkis@ntnu.no, editor), Christian Timmerer (christian.timmerer@itec.uni-klu.ac.at, editor), Sabina Baraković, Jasmina Baraković Husić, Søren Bech, Sebastian Bosse, Jean Botev, Kjell Brunnström, Luis Cruz, Katrien De Moor, Andrea de Polo Saibanti, Wouter Durnez, Sebastian Egger-Lampl, Ulrich Engelke, Tiago H. Falk, Asim Hameed, Andrew Hines, Tanja Kojic, Dragan Kukolj, Eirini Liotou, Dragorad Milovanovic, Sebastian Möller, Niall Murray, Babak Naderi, Manuela Pereira, Stuart Perry, Antonio Pinheiro, Andres Pinilla, Alexander Raake, Sarvesh Rajesh Agrawal, Ulrich Reiter, Rafael Rodrigues, Raimund Schatz, Peter Schelkens, Steven Schmidt, Saeed Shafiee Sabet, Ashutosh Singla, Lea Skorin-Kapov, Mirko Suznjevic, Stefan Uhrig, Sara Vlahović, Jan-Niklas Voigt-Antons, Saman Zadtootaghaj.

How to reference this white paper
Perkis, A., Timmerer, C., et al., “QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)”, European Network on Quality of Experience in Multimedia Systems and Services, 14th QUALINET meeting (online), May 25, 2020.

Alternatively, you may export the citation from arXiv: https://arxiv.org/abs/2007.07032.

BigMM’20: ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Sequences using Artificial Neural Network

The Sixth IEEE International Conference on Multimedia Big Data (BigMM 2020)

September 24-26, 2020, New Delhi. http://bigmm2020.org/

[PDF][Slides][Video]

Anatoliy Zabrovskiy (Alpen-Adria-Universität Klagenfurt), Prateek Agrawal (Alpen-Adria-Universität Klagenfurt, Lovely Professional University), Roland Mathá (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin) and Radu Prodan (Alpen-Adria-Universität Klagenfurt).

Abstract: HTTP Adaptive Streaming of video content is becoming an integral part of the Internet and accounts for the majority of today’s traffic. Although Internet bandwidth is constantly increasing, video compression technology plays an important role and the major challenge is to select and set up multiple video codecs, each with hundreds of transcoding parameters. Additionally, the transcoding speed depends directly on the selected transcoding parameters and the infrastructure used. Predicting transcoding time for multiple transcoding parameters with different codecs and processing units is a challenging task, as it depends on many factors. This paper provides a novel and considerably fast method for transcoding time prediction using video content classification and neural network prediction. Our artificial neural network (ANN) model predicts the transcoding times of video segments for state-of-the-art video codecs based on transcoding parameters and content complexity. We evaluated our method for two video codecs/implementations (AVC/x264 and HEVC/x265) as part of large-scale HTTP Adaptive Streaming services. The ANN model of our method is able to predict the transcoding time by minimizing the mean absolute error (MAE) to 1.37 and 2.67 for x264 and x265 codecs, respectively. For x264, this is an improvement of 22% compared to the state of the art.
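As a toy illustration of ANN-based transcoding time prediction, the following scikit-learn sketch trains a small MLP on synthetic data; the feature set, network size, and data are invented for illustration and do not reproduce the paper's model or results.

```python
# Illustrative sketch of ANN-based transcoding time prediction with
# scikit-learn. Features, network size, and the synthetic data are
# assumptions for illustration; the paper's model and feature set
# (content complexity class + transcoding parameters) may differ.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Hypothetical features: [complexity_class, bitrate_mbps, height, is_x265]
X = rng.uniform([1, 0.5, 360, 0], [5, 17, 2160, 1], size=(500, 4))
# Synthetic target: transcoding time grows with complexity/resolution.
y = 0.5 * X[:, 0] + 0.002 * X[:, 2] + 3.0 * X[:, 3] + rng.normal(0, 0.3, 500)

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                     random_state=0).fit(X[:400], y[:400])
print("MAE:", mean_absolute_error(y[400:], model.predict(X[400:])))
```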

Keywords: Transcoding time prediction, adaptive streaming, video transcoding, neural networks, video encoding, video complexity class, MPEG-DASH

Acknowledgment: This work has been supported in part by the Austrian Research Promotion Agency (FFG) under the APOLLO project and by the European Union Horizon 2020 Research and Innovation Programme under the ASPIDE Project with grant agreement number 801091.
