At each sampling time instant, one observes system output and action to form discrete-time rewards. The sampled input-output data are collected along the trajectory of the dynamical system in ...
Max-plus linear systems represent a class of mathematical models where the conventional operations of addition and multiplication are replaced by maximisation and addition, respectively. This ...
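As a minimal illustrative sketch of the max-plus operations described above (not taken from the source), the Python snippet below evaluates an autonomous recursion x(k+1) = A ⊗ x(k), where (A ⊗ x)_i = max_j (A_ij + x_j). The names `EPS`, `maxplus_matvec`, and `simulate`, as well as the example matrix, are hypothetical and chosen only for illustration.

```python
# Max-plus "zero" element: the neutral element of max (assumed convention).
EPS = float("-inf")


def maxplus_matvec(A, x):
    """Max-plus matrix-vector product: (A ⊗ x)_i = max_j (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


def simulate(A, x0, steps):
    """Iterate x(k+1) = A ⊗ x(k) and return the list of visited states."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(maxplus_matvec(A, traj[-1]))
    return traj


# Hypothetical 2x2 system matrix; EPS marks "no connection" between states.
A = [[2.0, 5.0],
     [EPS, 3.0]]
x0 = [0.0, 0.0]

traj = simulate(A, x0, 3)
for k, x in enumerate(traj):
    print(k, x)
```

Conventional addition is replaced here by `max` and conventional multiplication by `+`, so the recursion is linear in the max-plus sense even though it is nonlinear in ordinary algebra.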