[AI7-2-6] Reinforcement learning

Not in BoKV5

Outgoing relations