IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP)
14-19 April 2024, Seoul, Korea
Hadi Amirpour (AAU, Austria), Jingwen Zhu (University of Nantes, France), Patrick Le Callet (University of Nantes, France), and Christian Timmerer (AAU, Austria)
Abstract: In HTTP Adaptive Streaming (HAS), a video is encoded at multiple bitrate-resolution pairs, referred to as representations, which enables users to choose the most suitable representation based on their network connection. To optimize the set of bitrate-resolution pairs and improve the Quality of Experience (QoE) for users, it is of utmost importance to measure the quality of the representations. VMAF is a highly reliable metric used in HAS to assess the quality of representations. However, in practice, using it for optimization can be a very time-consuming process, and it is infeasible for live streaming applications. To tackle its high complexity, our paper introduces a new method called VQM4HAS, which extracts low-complexity features including (i) video complexity features, (ii) bitstream features logged during the encoding process, and (iii) basic video quality metrics. These extracted features are then fed into a regression model to predict VMAF.
Our experimental results demonstrate that VQM4HAS achieves a high Pearson correlation coefficient (PCC) with VMAF, ranging from 0.95 to 0.96 depending on the resolution, while exhibiting significantly lower complexity, which makes it suitable for live streaming scenarios.
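The prediction-and-evaluation idea in the abstract (fit a regression model on low-complexity features, then report the PCC between predicted and measured VMAF) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the regression model or the exact features, so the sketch uses a single hypothetical feature, ordinary least squares as a stand-in model, and made-up numbers that are not results from the paper.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b with a single feature.

    Stand-in model only; the paper's actual regression model is not
    described in the abstract.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def pearson(u, v):
    """Pearson correlation coefficient (PCC) between two equal-length lists."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    var_u = sum((ui - mean_u) ** 2 for ui in u)
    var_v = sum((vi - mean_v) ** 2 for vi in v)
    return cov / math.sqrt(var_u * var_v)


# Hypothetical data: one cheap per-representation feature (for example a
# basic quality metric rescaled to 0-100) and the measured VMAF scores.
# These values are invented for the example.
feature = [20.0, 35.0, 50.0, 65.0, 80.0]
vmaf = [25.0, 41.0, 58.0, 70.0, 88.0]

a, b = fit_line(feature, vmaf)
predicted = [a * f + b for f in feature]
pcc = pearson(predicted, vmaf)

print(f"slope={a:.3f} intercept={b:.3f} PCC={pcc:.4f}")
```

In a realistic setting the model would be fitted on one set of videos and the PCC computed on held-out videos; the sketch evaluates on the training data only to stay short.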