Dspace @ IIM Kozhikode

effSAMWMIX: An efficient Stochastic Multi-Armed Bandit Algorithm based on a Simulated Annealing with Multiplicative Weights

dc.contributor.author Villari, Boby Chaitanya
dc.contributor.author Abdulla, Mohammed Shahid
dc.date.accessioned 2017-05-18T10:39:48Z
dc.date.available 2017-05-18T10:39:48Z
dc.date.issued 2017-01
dc.identifier.uri http://hdl.handle.net/2259/935
dc.description.abstract SAMWMIX, a Stochastic Multi-Armed Bandit (SMAB) algorithm that obtains a regret of $O(\log T)$, where T is the number of steps in the time horizon, has been proposed in the literature. A variant, blind-SAMWMIX, incorporates an input parameter $\alpha$ and has better empirical performance, but obtains a regret of order $O(\log^{1+2\alpha} T)$. The current work proposes an efficient version of SAMWMIX, effSAMWMIX, which not only obtains a regret of $O(\log K)$ but also exhibits better performance. A proof of this regret bound is given in this work. The proposed effSAMWMIX algorithm is compared with the KL-UCB and Thompson Sampling (TS) algorithms over rewards that follow distributions such as the Exponential, Poisson, Bernoulli, Triangular and Truncated Normal, and a synthetic distribution designed to stress-test SMAB algorithms with closely spaced reward means. effSAMWMIX is shown to perform better than both KL-UCB and TS in both regret and execution time. en_US
dc.language.iso en en_US
dc.publisher Indian Institute of Management en_US
dc.subject stochastic multi-armed bandit en_US
dc.subject stochastic processes en_US
dc.subject reward distributions en_US
dc.subject optimization en_US
dc.title effSAMWMIX: An efficient Stochastic Multi-Armed Bandit Algorithm based on a Simulated Annealing with Multiplicative Weights en_US
dc.type Working Paper en_US