Benchmark hardcoded to SPY breaks reflection alpha for non-US tickers #628

@adityarai7297

Problem

_fetch_returns() in tradingagents/graph/trading_graph.py:205 always pulls SPY as the benchmark when computing alpha for the reflection layer:

spy = yf.Ticker("SPY").history(start=trade_date, end=end_str)
...
alpha = raw - spy_ret

The reflection log line in tradingagents/graph/reflection.py:48 hardcodes the same assumption:

f"Alpha vs SPY: {alpha_return:+.1%}\n\n"

This is fine for US-listed tickers but produces misleading alpha for international ones. For example, analyzing RELIANCE.NS (NSE) against SPY compares a single-stock INR return with a USD index return, so the resulting "alpha" reflects FX and regional macro drift more than stock-specific performance. The same issue applies to .T (Japan), .HK (Hong Kong), .L (London), .TO (Toronto), etc.
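To make the distortion concrete, here is a small arithmetic sketch. The return figures are invented purely for illustration and are not taken from market data; the calculation follows the `alpha = raw - spy_ret` form quoted above.

```python
# Hypothetical returns over the same holding period (illustrative numbers only).
stock_ret_inr = 0.050   # RELIANCE.NS, measured in INR
nifty_ret_inr = 0.040   # ^NSEI (Nifty 50), measured in INR
spy_ret_usd = 0.010     # SPY, measured in USD

# Current behaviour: subtract a USD index return from an INR stock return.
alpha_vs_spy = stock_ret_inr - spy_ret_usd

# Local benchmark: both returns come from the same market and currency.
alpha_vs_nifty = stock_ret_inr - nifty_ret_inr

print(f"Alpha vs SPY:   {alpha_vs_spy:+.1%}")
print(f"Alpha vs ^NSEI: {alpha_vs_nifty:+.1%}")
```

With these made-up inputs the SPY-based figure is four times the local-benchmark figure, and the gap comes entirely from how the two indices moved rather than from anything the stock did.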

The framework already advertises international support: cli/utils.py:11 lists CNC.TO, 7203.T, 0700.HK as ticker examples, and agent_utils.py:37-43 prompts agents to preserve exchange suffixes. The reflection layer is the odd one out.

Proposed solution

Make the benchmark configurable in default_config.py, with optional auto-detection from the ticker's exchange suffix. Two shapes worth discussing:

Option A — single override:

"benchmark_ticker": "SPY"

Option B — suffix map with auto-detect (preferred):

"benchmark_ticker": None,  # None = auto-detect via suffix map
"benchmark_map": {
    ".NS": "^NSEI",    # Nifty 50
    ".BO": "^BSESN",   # Sensex
    ".T":  "^N225",    # Nikkei 225
    ".HK": "^HSI",     # Hang Seng
    ".L":  "^FTSE",    # FTSE 100
    ".TO": "^GSPTSE",  # TSX Composite
    "":    "SPY",      # default for US-listed
}

Option B keeps the zero-config experience for US users while making non-US analyses meaningful out of the box. An explicit benchmark_ticker value would still win over the map for advanced users.
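A minimal sketch of how the Option B lookup could work. The function name `resolve_benchmark` and its signature are hypothetical, not existing code in the repo. One detail worth handling explicitly: the `""` key cannot be matched with `str.endswith`, because every string ends with the empty string, so the sketch treats it as a fallback only.

```python
from typing import Mapping, Optional

# Mirrors the Option B map proposed in this issue.
BENCHMARK_MAP = {
    ".NS": "^NSEI",    # Nifty 50
    ".BO": "^BSESN",   # Sensex
    ".T":  "^N225",    # Nikkei 225
    ".HK": "^HSI",     # Hang Seng
    ".L":  "^FTSE",    # FTSE 100
    ".TO": "^GSPTSE",  # TSX Composite
    "":    "SPY",      # default for US-listed
}


def resolve_benchmark(
    ticker: str,
    benchmark_ticker: Optional[str] = None,
    benchmark_map: Mapping[str, str] = BENCHMARK_MAP,
) -> str:
    """Hypothetical helper: pick the benchmark symbol for a ticker."""
    # An explicit override always wins over the suffix map.
    if benchmark_ticker:
        return benchmark_ticker

    symbol = ticker.strip().upper()
    # Check non-empty suffixes, longest first, so a more specific suffix
    # is never shadowed by a shorter one.
    suffixes = sorted((s for s in benchmark_map if s), key=len, reverse=True)
    for suffix in suffixes:
        if symbol.endswith(suffix.upper()):
            return benchmark_map[suffix]

    # No suffix matched: fall back to the "" entry (US default).
    return benchmark_map.get("", "SPY")
```

Note that under this sketch a ticker with an unmapped suffix (for example a `.DE` listing) silently falls back to SPY. Whether that should instead warn or raise is a design choice for the maintainers.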

reflection.py would then format f"Alpha vs {benchmark}: ..." instead of hardcoding SPY.
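As a rough sketch of that change, the log line could take the benchmark as a parameter. `format_alpha_line` is a hypothetical helper name; the format string itself is the one quoted from reflection.py above, with SPY replaced by a placeholder.

```python
def format_alpha_line(alpha_return: float, benchmark: str = "SPY") -> str:
    """Hypothetical helper: build the reflection log line for any benchmark."""
    # Same "+.1%" formatting as the existing line; only the label is dynamic.
    return f"Alpha vs {benchmark}: {alpha_return:+.1%}\n\n"


print(format_alpha_line(0.012, "^NSEI"), end="")
```

Defaulting `benchmark` to "SPY" keeps the current output unchanged for any caller that does not pass a benchmark.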

Scope

Strictly the benchmark. Out of scope for this issue (each its own follow-up):

  • Currency normalization for TraderAction price targets
  • News source localization (Moneycontrol, Economic Times, etc.)
  • Region-specific insider/disclosure feeds (SEBI for India, etc.)

Willing to PR

Happy to send a PR against main once a maintainer confirms which shape (A vs B) is preferred. Existing tests in tests/test_memory_log.py already cover the _fetch_returns path and would be extended.
