Title :
Online learning for wireless video transmission with limited information
Author :
Zhang, Yu ; Fu, Fangwen ; Van der Schaar, Mihaela
Author_Institution :
Electr. Eng. Dept., UCLA, Los Angeles, CA, USA
Abstract :
In this paper, we address the problem of joint packet scheduling at the application layer and power and rate allocation at the physical layer for delay-sensitive video streaming over slowly varying flat-fading wireless channels. Our goal is to find the optimal cross-layer policy that maximizes the cumulative received video quality while minimizing the total transmission energy. We first formulate the cross-layer optimization using a systematic layered Markov Decision Process (MDP) framework and then propose a layered real-time dynamic programming (RTDP) algorithm that solves this cross-layer optimization problem by combining the policy update with real-time decision making. This approach reduces the high complexity of the conventionally used offline dynamic programming methods. Moreover, to accommodate cases in which the network environment dynamics (e.g., state transition probabilities) are unknown or non-stationary (e.g., state transition probabilities change over time), we further improve our RTDP method by collecting the required network information and estimating the dynamics online, using a model-free approach. Based on this information, a user (a transmitter-receiver pair) can adaptively change its policy to cope in real time with the experienced environment dynamics. We also prove the convergence of this RTDP method, which complies with the layered architecture of the OSI stack. Finally, our numerical experiments show that the proposed RTDP solutions outperform the conventional offline DP methods for real-time video streaming.
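The idea of interleaving the value update with real-time decisions, while estimating unknown transition probabilities online, can be illustrated with a minimal single-layer sketch. This is not the paper's layered algorithm: the toy two-state channel, the reward table, the count-based transition estimate, the epsilon-greedy exploration, and every name below (rtdp_online, true_P, reward) are assumptions made purely for illustration.

```python
import random

def rtdp_online(true_P, reward, gamma=0.9, steps=5000, eps=0.1, seed=0):
    """RTDP-style loop: back up only the visited state, using transition
    probabilities estimated from observed counts (hypothetical sketch)."""
    rng = random.Random(seed)
    n_s, n_a = len(true_P), len(true_P[0])
    # counts[s][a][s2] starts at 1 so every estimate is well defined.
    counts = [[[1] * n_s for _ in range(n_a)] for _ in range(n_s)]
    V = [0.0] * n_s

    def q(s, a):
        total = sum(counts[s][a])
        expected = sum(c / total * V[s2] for s2, c in enumerate(counts[s][a]))
        return reward[s][a] + gamma * expected

    s = 0
    for _ in range(steps):
        qs = [q(s, a) for a in range(n_a)]
        V[s] = max(qs)  # Bellman backup at the current state only
        # Act greedily most of the time, explore with probability eps.
        a = rng.randrange(n_a) if rng.random() < eps else qs.index(V[s])
        # The environment (unknown to the learner) draws the next state.
        s2 = rng.choices(range(n_s), weights=true_P[s][a])[0]
        counts[s][a][s2] += 1  # online update of the estimated dynamics
        s = s2

    policy = [max(range(n_a), key=lambda a: q(s, a)) for s in range(n_s)]
    return V, policy

# Toy setup (assumed): state 0 = bad channel, state 1 = good channel;
# action 0 = low power, action 1 = high power.
# reward[s][a] stands in for "video quality minus energy cost".
true_P = [[[0.8, 0.2], [0.8, 0.2]],
          [[0.3, 0.7], [0.3, 0.7]]]
reward = [[0.2, -0.5],
          [0.5, 1.0]]
V, policy = rtdp_online(true_P, reward)
print(policy, V)
```

In this toy setup the channel evolves independently of the chosen power level, so the learned policy should simply pick the action with the better immediate trade-off in each channel state. The point of the sketch is only the structure of the loop: one backup per visited state, one decision, one update of the estimated dynamics.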
Keywords :
Markov processes; computer networks; dynamic programming; fading channels; learning systems; video streaming; Markov decision process; application layer; delay sensitive video streaming; joint packet scheduling; limited information; online learning; optimal cross layer policy; power allocation; rate allocation; real time dynamic programming algorithm; slow varying flat fading wireless channels; wireless video transmission; Decision making; Delay; Dynamic programming; Heuristic algorithms; Open systems; Physical layer; Real time systems; Scheduling algorithm; State estimation; Streaming media; dynamic programming; layered markov decision process; online learning; real-time learning; wireless video transmission;
Conference_Title :
Packet Video Workshop, 2009. PV 2009. 17th International
Conference_Location :
Seattle, WA
Print_ISBN :
978-1-4244-4651-3
Electronic_ISBN :
978-1-4244-4652-0
DOI :
10.1109/PACKET.2009.5152166