A challenge faced by many video providers is the heterogeneity of network conditions, user requirements, and content compression performance. A one-size-fits-all fixed bitrate ladder cannot ensure a high quality of user experience: it tends either to cause re-buffering or to introduce annoying compression artifacts. A content-tailored ladder avoids this, but building one by exhaustively encoding each sequence at every resolution and over a wide quality range is highly expensive in computational, financial, and energy terms. Motivated by this trade-off, we propose approaches that use machine learning to predict a content-optimized bitrate ladder. The methods extract spatio-temporal features from the uncompressed content, train machine-learning models to predict the parameters of the Pareto front, and use those predictions to build the ladder within a defined bitrate range. This significantly reduces the number of encodes required per sequence.
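As a rough illustration of the final step only, the sketch below builds a ladder from already-predicted parameters. It assumes, purely for illustration, that the predicted Pareto front parameters take the form of crossover bitrates at which the preferred resolution switches to the next higher one; the function name `build_ladder`, its arguments, and the geometric rung spacing are hypothetical and are not taken from the papers. Feature extraction and model training are outside the scope of this sketch.

```python
from bisect import bisect_right


def build_ladder(resolutions, crossovers_kbps, min_kbps, max_kbps, step=2.0):
    """Build (bitrate_kbps, resolution) rungs within [min_kbps, max_kbps].

    resolutions:     vertical resolutions in ascending order, e.g. [540, 1080, 2160]
    crossovers_kbps: ASSUMED form of the predicted parameters: ascending bitrates
                     at which the next higher resolution becomes preferable
                     (one fewer entry than resolutions)
    step:            ratio between consecutive rungs (illustrative choice)
    """
    if len(crossovers_kbps) != len(resolutions) - 1:
        raise ValueError("need exactly one crossover between adjacent resolutions")
    if min_kbps <= 0 or step <= 1.0:
        raise ValueError("min_kbps must be positive and step must exceed 1")

    ladder = []
    bitrate = float(min_kbps)
    while bitrate <= max_kbps:
        # Count how many crossovers lie at or below this bitrate;
        # that count is the index of the preferred resolution.
        idx = bisect_right(crossovers_kbps, bitrate)
        ladder.append((round(bitrate), resolutions[idx]))
        bitrate *= step
    return ladder


if __name__ == "__main__":
    # Hypothetical predicted crossovers for one sequence.
    for rung in build_ladder([540, 1080, 2160], [1200, 5000], 500, 8000):
        print(rung)
```

In this simplified setting, only the rungs that end up in the ladder would need to be encoded, rather than every resolution at every quality level.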
The papers listed below present the related research results.