
Reinforcement Learning

Train AI agents to trade using PPO, A2C, DQN and other RL algorithms on a sub-millisecond event loop.

RlxEnv — Gym-Compatible

RLX provides RlxEnv — a high-performance trading environment fully compatible with the Gymnasium API (the successor to OpenAI Gym).

Environment Initialization
from rlxbt import RlxEnv, load_data

# 1. Load data
data = load_data("data/BTCUSDT_1h.csv")

# 2. Setup environment
env = RlxEnv(
    data=data,
    initial_capital=100000.0,
    window_size=32       # History window for the agent
)

# 3. Standard interface
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
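
A full episode can be driven with the standard Gymnasium rollout loop. Below is a minimal sketch using a random policy; the terminated/truncated semantics are assumed to follow the usual Gymnasium contract.

# Minimal rollout sketch with a random policy.
# Assumes the standard Gymnasium episode contract: the episode ends
# when terminated or truncated is True.
obs, info = env.reset()
total_reward = 0.0
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # random action: 0, 1, or 2
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
print(f"Random policy episode reward: {total_reward:.2f}")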

Discrete Action Space

0: Hold / Close
1: Long Position
2: Short Position
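
As an illustration of how these codes might be used, the sketch below opens a long position, holds it for a few bars, and then flattens. Whether action 0 only closes an open position or also acts as a plain hold is an assumption to verify against RlxEnv's documentation.

# Hypothetical usage of the action codes (semantics assumed, not confirmed):
# repeating action 1 is assumed to keep the long position open.
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(1)  # open a long
for _ in range(5):
    obs, reward, terminated, truncated, info = env.step(1)  # stay long
obs, reward, terminated, truncated, info = env.step(0)  # hold / close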

Observation Vector

Market State: a normalized sliding window of OHLCV data.

Account State: current equity, signed position size, and time-in-trade.

Total Vector Dimension: 163 for window_size=32 (32 bars × 5 OHLCV features = 160 market values, plus 3 account stats).
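
The dimension arithmetic can be sanity-checked against the environment's observation space. The sketch below assumes RlxEnv exposes a flat gymnasium.spaces.Box observation space, which is an assumption rather than documented behavior.

# Sanity check: window_size * 5 OHLCV features + 3 account stats = 163.
# Assumes a flat Box observation space (assumption, not documented).
window_size = 32
expected_dim = window_size * 5 + 3  # 160 market values + 3 account stats

env = RlxEnv(data=data, window_size=window_size)
assert env.observation_space.shape == (expected_dim,)
obs, info = env.reset()
assert obs.shape == (expected_dim,)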

PPO Training Example

Stable-Baselines3 Integration
from rlxbt import RlxEnv
from stable_baselines3 import PPO

# 1. Initialize the environment (SB3 wraps it in a VecEnv automatically)
env = RlxEnv(data=data, window_size=32)

# 2. Create the agent
model = PPO(
    "MlpPolicy",
    env,
    learning_rate=3e-4,
    verbose=1
)

# 3. Train the agent
model.learn(total_timesteps=100_000)
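
Once trained, the policy can be rolled out with SB3's model.predict(). A minimal evaluation sketch follows; what info contains at the end of an episode (e.g. final equity) is an assumption about RlxEnv.

# Roll out the trained policy for one episode with deterministic actions.
obs, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    action, _state = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(int(action))
print("Episode info:", info)  # contents (e.g. final equity) are an assumption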