We consider a system with N different service modes handling several traffic classes, where a scheduling agent decides which service option to use at each time slot. Each service mode provides service simultaneously to all the traffic classes, at different random rates (possibly zero). The scheduling agent can observe the global queue state, but does not have any advance knowledge of the instantaneous rates or service rate distributions associated to each of the service modes. We propose a threshold-based algorithm where the threshold value is compared to the sum of the square of the queue lengths at that time slot. If the threshold value is exceeded, then the scheduling agent switches the service mode, and it keeps using the same service mode otherwise. This sequential decision making process is related to the well-studied Multi-Armed Bandit problem or the Max-Weight algorithm but involves several major differences and additional challenges.
We aim to understand the performance of this threshold-based scheduling algorithm. In the first step, we assume that when the scheduling agent switches to a new service mode this is sampled uniformly at random among all possible options. We show that the scheduling algorithm achieves maximum stability and also analyze the mean responses time of the various traffic classes. In the second step, we extend the scheduling agent in order to learn which service mode to use at each queue state.
|Supervisors||Sem Borst (TU/e)|
|Location||Eindhoven University of Technology (TU/e)|
This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement Grant Agreement No 101034253.