Reinforcement Learning - Bandit Algorithms (Slot Machine Probability Simulation, Bull Market Stock Selection)
Key Points
- This paper introduces reinforcement learning concepts and the multi-armed bandit problem, highlighting the critical balance between exploration and exploitation in decision-making.
- It demonstrates methods for estimating slot probabilities using recursive Q-value updates and epsilon-greedy policies, comparing their performance in both static and dynamically changing environments.
- The principles of Q-learning are then applied to a real-world financial scenario, identifying KOSPI stocks with a high probability of significant price increases based on a learned Q-value threshold.
The paper explores the application of reinforcement learning (RL) and multi-armed bandit (MAB) algorithms, primarily focusing on the epsilon-greedy strategy, for problems ranging from simulated slot machine optimization to identifying potentially rising stocks in the KOSPI market.
The core methodology begins with an introduction to Reinforcement Learning, defining an agent that interacts with an environment, selects actions, receives rewards, and learns an optimal policy to maximize cumulative rewards. The Multi-Armed Bandit (MAB) problem is presented as a fundamental RL scenario, where an agent must choose among several slot machines, each with a different, unknown reward probability, to maximize total reward over time. This introduces the critical dilemma of exploration (trying new, potentially better options) versus exploitation (choosing the currently known best option).
Two primary methods for estimating slot machine probabilities (Q-values) are discussed:
- Experimental Probability Estimation (Sample Average): This involves repeatedly playing a slot machine and calculating the empirical average of rewards. For a sequence of rewards $R_1, R_2, \dots, R_n$, the estimated probability is $Q_n = \frac{R_1 + R_2 + \dots + R_n}{n}$.
- Recursive Update (Incremental Sample Average Update): This method updates the estimated Q-value incrementally with each new reward. If $Q_{n-1}$ is the estimated mean after $n-1$ trials and $R_n$ is the reward from the $n$-th trial, the new estimate is updated as:

$$Q_n = Q_{n-1} + \frac{1}{n}\left(R_n - Q_{n-1}\right)$$
This method is more computationally efficient as it does not require storing all past rewards. It is also noted to be a form of Exponential Moving Average (EMA) when considering a constant learning rate.
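The incremental rule above can be sketched in a few lines; this is a minimal illustration of the update, not the paper's exact code:

```python
# Incremental sample-average update: equivalent to the full empirical mean,
# but needs only the running estimate and a trial counter.
def update_q(q: float, n: int, reward: float) -> float:
    """Return the updated estimate after the n-th reward (n >= 1)."""
    return q + (reward - q) / n

# The running estimate matches the batch average of the rewards seen so far.
rewards = [1, 0, 0, 1, 1]
q = 0.0
for n, r in enumerate(rewards, start=1):
    q = update_q(q, n, r)
print(q)  # ≈ 0.6, same as sum(rewards) / len(rewards)
```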
The paper then details the implementation of an epsilon-greedy agent for the MAB problem. A Game class simulates slot machines, each having a randomly assigned success rate. An Agent class embodies the epsilon-greedy policy:
- With probability $\epsilon$ (exploration), the agent chooses a random action (slot machine).
- With probability $1 - \epsilon$ (exploitation), the agent chooses the action (slot machine) that currently has the highest estimated Q-value, i.e., $a = \operatorname{argmax}_a Q(a)$.
The update method uses the recursive update rule mentioned above, Qs[action] += (reward - Qs[action]) / Ns[action], where Ns[action] is the count of times the action has been selected. The agent's performance is evaluated over multiple runs and steps, calculating total_reward and rates (average reward per step). Comparisons are made by varying the value of $\epsilon$ (e.g., 0.01, 0.1, 0.3) to demonstrate the trade-off between exploration and exploitation.

A more advanced algorithm with a constant learning rate (alpha-constant update) is then introduced to handle environments where slot probabilities change dynamically. A Game2 class is used, in which self.rates (the slot success probabilities) are perturbed with small random noise after each play. An Agent2 class is introduced with a self.alpha attribute (the learning rate). Its update method implements the Q-learning-style update rule:

$$Q \leftarrow Q + \alpha \left(R - Q\right)$$
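The epsilon-greedy agent and the stationary slot-machine environment described above can be sketched as follows. Class and attribute names (Game, Agent, Qs, Ns, total_reward) follow the paper's description, but the bodies are an assumed minimal implementation:

```python
import random

class Game:
    """Stationary bandit: each slot wins with a fixed, hidden probability."""
    def __init__(self, n_slots=10, seed=0):
        rng = random.Random(seed)
        self.rates = [rng.random() for _ in range(n_slots)]

    def play(self, action):
        # Reward is 1 with the slot's hidden success rate, else 0.
        return 1 if random.random() < self.rates[action] else 0

class Agent:
    """Epsilon-greedy agent with sample-average (1/N) Q-value updates."""
    def __init__(self, epsilon, n_actions=10):
        self.epsilon = epsilon
        self.Qs = [0.0] * n_actions
        self.Ns = [0] * n_actions

    def get_action(self):
        if random.random() < self.epsilon:  # explore: random slot
            return random.randrange(len(self.Qs))
        # exploit: slot with the highest current estimate
        return max(range(len(self.Qs)), key=lambda a: self.Qs[a])

    def update(self, action, reward):
        self.Ns[action] += 1
        self.Qs[action] += (reward - self.Qs[action]) / self.Ns[action]

random.seed(0)
game, agent = Game(), Agent(epsilon=0.1)
total_reward = 0
for step in range(1000):
    a = agent.get_action()
    r = game.play(a)
    agent.update(a, r)
    total_reward += r
print(total_reward / 1000)  # average reward per step ("rates")
```

Re-running this loop with epsilon set to 0.01, 0.1, and 0.3 reproduces the exploration-exploitation comparison described above.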
This formula adjusts the Q-value by the difference between the observed reward $R$ and the current estimate $Q$, scaled by the constant learning rate $\alpha$. Because the step size does not shrink over time, recent rewards carry more weight, which makes the approach suitable for non-stationary environments. The paper compares the performance of the "sample average" agent and the "alpha constant update" agent.
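The comparison can be sketched with a single helper that switches between the 1/N step size and a constant alpha. The drift model (Gaussian noise scaled by 0.1) is an assumption for illustration; the paper's exact noise is not reproduced here:

```python
import random

def run(alpha=None, epsilon=0.1, n_slots=10, steps=10_000, seed=1):
    """Epsilon-greedy on a drifting bandit.
    alpha=None -> sample-average (1/N) update; otherwise constant step size."""
    rng = random.Random(seed)
    rates = [rng.random() for _ in range(n_slots)]
    Qs, Ns, total = [0.0] * n_slots, [0] * n_slots, 0
    for _ in range(steps):
        a = (rng.randrange(n_slots) if rng.random() < epsilon
             else max(range(n_slots), key=lambda i: Qs[i]))
        r = 1 if rng.random() < rates[a] else 0
        Ns[a] += 1
        step = alpha if alpha is not None else 1 / Ns[a]
        Qs[a] += step * (r - Qs[a])  # Q <- Q + step * (R - Q)
        # Non-stationary environment: drift every slot's success rate
        # (assumed Gaussian noise, clipped to [0, 1]).
        rates = [min(1.0, max(0.0, p + 0.1 * rng.gauss(0, 1))) for p in rates]
        total += r
    return total / steps

print(run(alpha=None), run(alpha=0.8))
```

Because the 1/N step size shrinks toward zero, the sample-average agent eventually stops adapting, so the constant-alpha agent typically earns a higher average reward once the rates drift.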
Finally, the paper applies the Q-learning concept to identify potentially rising stocks in the KOSPI market. For each KOSPI stock, daily closing prices are analyzed. A reward is defined as 1 if the next day's closing price increases by 5% or more compared to today's, and 0 otherwise. A Q-value is maintained for each stock, representing its estimated "value" or potential for positive returns. The Q-value for each stock is updated daily using the Q-learning rule:
$$Q \leftarrow Q + \alpha \left(R - Q\right)$$

where $\alpha$ is a constant learning rate and $R$ is the day's reward. A specific condition, $Q > 0.9$, is used to filter for high-potential stocks. Only when a stock's Q-value exceeds 0.9 (indicating a high expected reward) are its subsequent rewards counted toward a "reward rate" (count / total_count), which tracks the frequency of positive returns among high-Q stocks. Stocks whose final Q-value exceeds 0.9 are then identified as "rising stocks."
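The stock filter described above can be sketched as follows. The price series is synthetic, and alpha=0.1 is an assumed learning rate (the paper's exact value is not given in this extract):

```python
def rising_stock_scan(closes, alpha=0.1, threshold=0.9):
    """closes: list of daily closing prices for one stock.
    Returns (final_q, reward_rate). reward_rate counts how often the
    next-day +5% event occurred on days where Q already exceeded threshold."""
    q, count, total_count = 0.0, 0, 0
    for today, tomorrow in zip(closes, closes[1:]):
        # Reward is 1 if tomorrow's close is >= 5% above today's, else 0.
        reward = 1 if tomorrow >= today * 1.05 else 0
        if q > threshold:          # only high-Q days feed the reward rate
            total_count += 1
            count += reward
        q += alpha * (reward - q)  # Q <- Q + alpha * (R - Q)
    rate = count / total_count if total_count else 0.0
    return q, rate

# A stock that gains 6% every day drives Q toward 1 and is flagged as rising.
prices = [100 * 1.06 ** t for t in range(60)]
q, rate = rising_stock_scan(prices)
print(q > 0.9, rate)  # True 1.0
```

In the paper's setting this scan would run once per KOSPI ticker over its daily closes, and tickers whose final Q-value exceeds 0.9 would be reported as rising stocks.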