Build a Portfolio Overlap Checker in Python

Many investors unknowingly hold the same stocks twice. If you own both QQQ and VGT, you have massive overlap in Apple, Microsoft, and NVIDIA. This tutorial builds a complete overlap detector using the free SecuritiesDB API — a programmable Morningstar X-Ray alternative.

~10 min read · Intermediate Python

Why Overlap Matters

Diversification is often called the only free lunch in investing, a line attributed to Harry Markowitz. ETF overlap quietly erodes it:

  • VTI + SPY: ~80% overlap in holdings — paying two expense ratios for nearly the same exposure
  • QQQ + VGT: ~45% overlap — heavy tech concentration you might not realize
  • SCHD + VYM: ~35% overlap — dividend strategies share many value stocks

A programmatic checker catches this before it costs you.
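
To make the percentages above concrete, here is a minimal sketch of one common way to measure overlap by weight: for every ticker held by both funds, take the smaller of the two weights and add them up. This is an assumption about the metric, not a statement of how the SecuritiesDB API computes `overlap_pct`, and the holdings below are made-up numbers.

```python
def overlap_pct(weights_a, weights_b):
    """Overlap by weight: sum of the smaller weight for each shared ticker.

    Both arguments map ticker -> weight in percent of the fund.
    """
    shared = weights_a.keys() & weights_b.keys()
    return sum(min(weights_a[t], weights_b[t]) for t in shared)


# Hypothetical holdings for illustration only, not real fund data.
fund_a = {"AAPL": 10.0, "MSFT": 8.0, "NVDA": 6.0, "AMZN": 5.0}
fund_b = {"AAPL": 15.0, "MSFT": 12.0, "NVDA": 4.0, "AVGO": 3.0}

print(f"Overlap: {overlap_pct(fund_a, fund_b):.1f}%")  # Overlap: 22.0%
```

Under this definition the measure is symmetric, and two funds with identical holdings and weights score 100%.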

Step 1: Fetch Overlap Data

The ETF Overlap API returns the intersection of two funds in a single request. No auth required.

import requests

def get_overlap(etf_a, etf_b):
    """Fetch overlap between two ETFs."""
    url = f"https://securitiesdb.com/api/v1/etfs/{etf_a}/overlap/{etf_b}"
    # Set a timeout so a stalled connection cannot hang the script
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    return r.json()["data"]

# Example
data = get_overlap("QQQ", "VGT")
print(f"Overlap: {data['overlap_pct']:.1f}%")
print(f"Shared holdings: {data['shared_count']}")
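
If you re-run the checker often, a small cache avoids repeating identical requests. The sketch below wraps any fetch function with an order-insensitive cache. `make_cached_overlap` and `fake_fetch` are hypothetical names introduced here, and the wrapper assumes the overlap of A with B equals the overlap of B with A, which the API may or may not guarantee.

```python
def make_cached_overlap(fetch):
    """Wrap a fetch(etf_a, etf_b) function with an order-insensitive cache."""
    cache = {}

    def cached(etf_a, etf_b):
        # Normalize so ("QQQ", "VGT") and ("vgt", "qqq") share one entry.
        key = tuple(sorted((etf_a.upper(), etf_b.upper())))
        if key not in cache:
            cache[key] = fetch(*key)
        return cache[key]

    return cached


calls = []

def fake_fetch(etf_a, etf_b):
    # Stand-in for get_overlap so the sketch runs offline; values are made up.
    calls.append((etf_a, etf_b))
    return {"overlap_pct": 12.5, "shared_count": 3}


cached_overlap = make_cached_overlap(fake_fetch)
cached_overlap("QQQ", "VGT")
cached_overlap("vgt", "qqq")  # served from the cache, no second fetch
print(len(calls))  # 1
```

In the tutorial you would pass `get_overlap` instead of `fake_fetch`.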

Step 2: Build a Portfolio Matrix

Check every pair of ETFs in your portfolio:

from itertools import combinations

portfolio = ["VTI", "QQQ", "VGT", "SCHD", "VYM"]

matrix = {}
for a, b in combinations(portfolio, 2):
    data = get_overlap(a, b)
    matrix[(a, b)] = data["overlap_pct"]
    print(f"  {a} ↔ {b}: {data['overlap_pct']:.1f}% overlap")

# Flag dangerous overlaps (>30%)
print("\n⚠️  High Overlap Pairs:")
for (a, b), pct in sorted(matrix.items(), key=lambda x: -x[1]):
    if pct > 30:
        print(f"  🔴 {a} ↔ {b}: {pct:.1f}% — consider replacing one")
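
The flagging logic can also be pulled into a pure function, which makes the threshold adjustable and lets you test it without touching the network. This is a sketch only: `high_overlap_pairs` is a hypothetical helper, and the sample values are invented.

```python
def high_overlap_pairs(matrix, threshold=30.0):
    """Return (pair, pct) tuples above threshold, highest overlap first."""
    flagged = [(pair, pct) for pair, pct in matrix.items() if pct > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up overlap values in the same shape as the matrix dict above.
sample = {("VTI", "QQQ"): 41.0, ("QQQ", "SCHD"): 9.5, ("QQQ", "VGT"): 46.0}

for (a, b), pct in high_overlap_pairs(sample):
    print(f"  {a} / {b}: {pct:.1f}%")
```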

Step 3: Visualize the Overlap

Create a heatmap with matplotlib:

import numpy as np
import matplotlib.pyplot as plt

n = len(portfolio)
heatmap = np.zeros((n, n))

for (a, b), pct in matrix.items():
    i, j = portfolio.index(a), portfolio.index(b)
    heatmap[i][j] = heatmap[j][i] = pct
np.fill_diagonal(heatmap, 100)

fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(heatmap, cmap="YlOrRd", vmin=0, vmax=100)
ax.set_xticks(range(n))
ax.set_xticklabels(portfolio)
ax.set_yticks(range(n))
ax.set_yticklabels(portfolio)

for i in range(n):
    for j in range(n):
        ax.text(j, i, f"{heatmap[i][j]:.0f}%",
                ha="center", va="center", fontsize=10)

plt.colorbar(im, label="Overlap %")
plt.title("ETF Portfolio Overlap Matrix")
plt.tight_layout()
plt.savefig("portfolio_overlap.png", dpi=150)
print("Saved portfolio_overlap.png")
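
For terminals or CI logs where an image is inconvenient, the same matrix can be rendered as plain text. A minimal standard-library sketch follows. `format_matrix` is a hypothetical helper, the sample values are made up, and pairs missing from the dict are shown as 0%, which is a simplification.

```python
def format_matrix(tickers, matrix):
    """Render pairwise overlaps as a fixed-width text table."""
    def lookup(a, b):
        if a == b:
            return 100.0
        # Each pair is stored once, so check both orderings.
        return matrix.get((a, b), matrix.get((b, a), 0.0))

    header = " " * 6 + "".join(f"{t:>6}" for t in tickers)
    rows = [
        f"{a:<6}" + "".join(f"{lookup(a, b):>5.0f}%" for b in tickers)
        for a in tickers
    ]
    return "\n".join([header, *rows])


# Made-up values in the same shape as the matrix dict from Step 2.
sample = {("QQQ", "VGT"): 46.0, ("QQQ", "SCHD"): 9.0, ("VGT", "SCHD"): 7.0}
print(format_matrix(["QQQ", "VGT", "SCHD"], sample))
```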

Step 4: Generate a Report

Combine the overlap matrix with the ETF X-Ray endpoint for a full portfolio audit:

for etf in portfolio:
    r = requests.get(
        f"https://securitiesdb.com/api/v1/etfs/{etf}/xray", timeout=10
    )
    r.raise_for_status()
    xray = r.json()["data"]
    # Guard against a missing or empty sectors list before indexing
    sectors = xray.get("sectors") or [{}]
    print(f"\n{etf}:")
    print(f"  Expense ratio: {xray.get('expense_ratio', 'N/A')}")
    print(f"  HHI concentration: {xray.get('hhi', 'N/A')}")
    print(f"  Top sector: {sectors[0].get('name', 'N/A')}")

# Final recommendation
max_pair = max(matrix.items(), key=lambda x: x[1])
print(f"\n📋 Recommendation: {max_pair[0][0]} and {max_pair[0][1]} "
      f"have {max_pair[1]:.0f}% overlap — consider consolidating.")
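
The report prints an HHI concentration value. As background, the Herfindahl-Hirschman Index is conventionally the sum of squared weights. The sketch below uses percentage weights, giving a 0 to 10,000 scale. Whether the API's `hhi` field uses this scale is an assumption, and the weights are invented.

```python
def hhi(weights_pct):
    """Herfindahl-Hirschman Index: sum of squared percentage weights."""
    return sum(w ** 2 for w in weights_pct)


# Hypothetical four-holding funds, weights in percent.
equal_weight = [25.0, 25.0, 25.0, 25.0]
top_heavy = [70.0, 10.0, 10.0, 10.0]

print(hhi(equal_weight))  # 2500.0
print(hhi(top_heavy))     # 5200.0
```

A higher value means the fund's weight sits in fewer holdings, so the second fund is more concentrated even though both hold four positions.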

What You Built

  • Pairwise overlap detection for any ETF portfolio
  • A visual heatmap highlighting concentration risk
  • An automated consolidation suggestion for the highest-overlap pair
  • Full portfolio X-Ray combining overlap + fund diagnostics

Total code: ~60 lines. No API key. No rate limiting for reasonable use. Compare that to a Morningstar Direct subscription at $15,000/year.

Related API Endpoints