Order Flow in Python
The signed volume of buy and sell orders hitting the market, used to infer the directional pressure and informed trading activity driving short-term price movements.
Definition
Order flow refers to the real-time stream of market orders, both aggressive buy orders that lift the ask and aggressive sell orders that hit the bid, that drives immediate price discovery. Unlike quotes, which represent passive intentions, order flow represents revealed, committed capital. Order Flow Imbalance (OFI), the net difference between buyer-initiated and seller-initiated volume, is among the most widely studied short-term predictors of price direction in microstructure research. The intuition is that a market flooded with buy market orders will exhaust available ask-side liquidity and push prices higher; persistent one-sided flow is therefore often read as a sign of informed trading activity.
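To make the liquidity-exhaustion intuition concrete, here is a minimal sketch of a buy market order sweeping the ask side of a book. The function name `sweep_ask_side` and the book levels are hypothetical, chosen only for illustration; real matching engines also handle time priority, hidden liquidity, and partial fills, none of which is modeled here.

```python
from typing import List, Tuple


def sweep_ask_side(asks: List[Tuple[float, int]], buy_qty: int):
    """Fill a buy market order against ask levels sorted by ascending price.

    Returns (average fill price, ask levels remaining after the fill).
    """
    remaining = buy_qty
    cost = 0.0
    book_after = []
    for price, size in asks:
        take = min(size, remaining)      # shares consumed at this level
        cost += take * price
        remaining -= take
        if size > take:                  # keep whatever liquidity is left
            book_after.append((price, size - take))
    filled = buy_qty - remaining
    avg_price = cost / filled if filled else float("nan")
    return avg_price, book_after


# Hypothetical ask side: (price, size), best ask first
asks = [(100.01, 300), (100.02, 500), (100.03, 400)]
avg_price, book_after = sweep_ask_side(asks, 600)
best_ask_before = asks[0][0]
best_ask_after = book_after[0][0]
```

In this toy book a 600-share buy order consumes the entire first level and part of the second, so the best ask moves up from 100.01 to 100.02. That is the mechanical channel through which one-sided order flow moves prices.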
Quantitative Formula
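$$\mathrm{OFI}_t = \sum_{i=1}^{N_t} V^{B}_{i,t}\,\mathbb{1}^{\text{buy}}_{i,t} \;-\; \sum_{i=1}^{N_t} V^{S}_{i,t}\,\mathbb{1}^{\text{sell}}_{i,t}$$

Where $V^{B}_{i,t}$ and $V^{S}_{i,t}$ are the volumes of the $i$-th buy and sell transactions at time $t$, $N_t$ is the number of transactions in the interval, $\mathbb{1}^{\text{buy}}_{i,t} = 1$ indicates a buyer-initiated trade (executed at the ask), and $\mathbb{1}^{\text{sell}}_{i,t} = 1$ indicates a seller-initiated trade (executed at the bid). In practice, the Lee-Ready (1991) algorithm is used to classify each trade as buyer- or seller-initiated when side information is unavailable: it compares the trade price to the prevailing quote midpoint and falls back to a tick test for trades at the midpoint.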
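The classification step and the formula can be sketched together as follows. This is a simplified illustration, not a reference implementation of Lee-Ready: the column names `bid` and `ask` (the quote prevailing at each trade) and the function name `classify_lee_ready` are assumptions, and alignment of quote timestamps with trades is left out.

```python
import numpy as np
import pandas as pd


def classify_lee_ready(trades: pd.DataFrame) -> pd.Series:
    """Return +1 (buyer-initiated), -1 (seller-initiated) or 0 (unclassified).

    Assumes columns 'price', 'bid', 'ask', where bid/ask are the quotes
    prevailing at the time of each trade (hypothetical layout).
    """
    mid = (trades["bid"] + trades["ask"]) / 2.0
    # Quote rule: above the midpoint -> buy, below the midpoint -> sell
    quote_side = np.sign(trades["price"] - mid)
    # Tick test: sign of the last non-zero trade-to-trade price change
    tick_side = np.sign(trades["price"].diff()).replace(0, np.nan).ffill()
    # Use the tick test only for trades exactly at the midpoint
    side = quote_side.where(quote_side != 0, tick_side)
    return side.fillna(0).astype(int)


# Hypothetical trades; prices chosen to be exactly representable as floats
trades = pd.DataFrame({
    "price":  [100.5, 100.0, 100.25, 100.25],
    "bid":    [100.0, 100.0, 100.0, 100.0],
    "ask":    [100.5, 100.5, 100.5, 100.5],
    "volume": [200, 150, 100, 50],
})
sides = classify_lee_ready(trades)
buy_volume = trades["volume"].where(sides == 1, 0).sum()
sell_volume = trades["volume"].where(sides == -1, 0).sum()
ofi = buy_volume - sell_volume
```

The first two trades are classified by the quote rule (at the ask and at the bid), while the last two print at the midpoint and inherit the sign of the most recent price change. The resulting OFI is buy volume minus sell volume over the four trades.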
Why It Matters in Backtesting
Order flow analysis is the domain where quantitative microstructure and high-frequency trading intersect. Without tick data classified by trade direction, order flow cannot be properly modeled in a backtest. Strategies that trade on order flow imbalance signals have holding periods measured in seconds to minutes, making slippage and infrastructure latency the dominant P&L drivers. For daily bar backtests, a pragmatic proxy is using the ratio of up-volume to down-volume (derived from intraday ticks) as an order flow imbalance estimate.
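One way the daily proxy could be computed is sketched below, assuming intraday bars with a `DatetimeIndex` and `close` and `volume` columns. The function name `daily_volume_imbalance` and the output column names are hypothetical. Volume is counted as up-volume when the bar closes above the previous bar of the same day, and as down-volume when it closes below.

```python
import pandas as pd


def daily_volume_imbalance(bars: pd.DataFrame) -> pd.DataFrame:
    """Aggregate intraday bars into daily up/down volume statistics.

    Assumes a DatetimeIndex and columns 'close' and 'volume' (hypothetical).
    """
    day = bars.index.normalize()
    # Price change relative to the previous bar within the same day
    change = bars["close"].groupby(day).diff()
    up = bars["volume"].where(change > 0, 0.0)
    down = bars["volume"].where(change < 0, 0.0)
    daily = pd.DataFrame({"up_volume": up, "down_volume": down}).groupby(day).sum()
    total = daily["up_volume"] + daily["down_volume"]
    # Ratio is undefined (NaN) on days without any down-volume
    daily["up_down_ratio"] = daily["up_volume"] / daily["down_volume"].where(daily["down_volume"] > 0)
    # Normalized imbalance in [-1, 1]; NaN if no bar moved during the day
    daily["imbalance"] = (daily["up_volume"] - daily["down_volume"]) / total.where(total > 0)
    return daily


# Hypothetical one-minute bars spanning two days
index = pd.to_datetime([
    "2024-01-02 09:30", "2024-01-02 09:31", "2024-01-02 09:32",
    "2024-01-03 09:30", "2024-01-03 09:31",
])
bars = pd.DataFrame(
    {"close": [10.0, 10.5, 10.25, 10.0, 9.5],
     "volume": [100, 300, 100, 200, 400]},
    index=index,
)
daily = daily_volume_imbalance(bars)
```

The normalized `imbalance` column is often easier to use as a feature than the raw ratio because it is bounded and symmetric. Either way, this is a coarse approximation of true signed order flow, since it classifies whole bars rather than individual trades.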
Python Implementation
import numpy as np
import pandas as pd


def calculate_order_flow_imbalance(tick_data: pd.DataFrame,
                                   window: int = 100) -> pd.DataFrame:
    """
    Computes Order Flow Imbalance (OFI) from classified tick data.

    tick_data: DataFrame with columns ['price', 'volume', 'side'] where
    side = 1 for buyer-initiated, -1 for seller-initiated.
    If the 'side' column is absent, falls back to the tick test (the
    quote-free component of Lee-Ready classification).
    """
    df = tick_data.copy()
    if "side" not in df.columns:
        # Tick test: sign of the last non-zero trade-to-trade price change.
        # Trades with no prior reference price default to buy (+1).
        df["side"] = np.sign(df["price"].diff()).replace(0, np.nan).ffill().fillna(1)

    df["signed_volume"] = df["volume"] * df["side"]
    # Rolling sums over the last `window` trades
    df["ofi"] = df["signed_volume"].rolling(window).sum()
    df["buy_volume"] = df["volume"].where(df["side"] == 1, 0).rolling(window).sum()
    df["sell_volume"] = df["volume"].where(df["side"] == -1, 0).rolling(window).sum()
    # Buy share of classified volume; 1e-9 guards against division by zero
    df["ofi_ratio"] = df["buy_volume"] / (df["buy_volume"] + df["sell_volume"] + 1e-9)

    # Price impact: correlation between OFI and the next trade-to-trade return.
    # future_return looks ahead by one trade, so use it for analysis only.
    df["future_return"] = df["price"].pct_change().shift(-1)
    ofi_price_corr = df["ofi"].corr(df["future_return"])

    result = df[["ofi", "ofi_ratio", "buy_volume", "sell_volume", "future_return"]].copy()
    # Attach summary statistics to the returned frame; 0.3 is a heuristic cutoff
    result.attrs["ofi_price_correlation"] = ofi_price_corr
    result.attrs["informed_trading_signal"] = bool(abs(ofi_price_corr) > 0.3)
    return result