The security conversation in 2026 is still mostly about the transaction. Crypto mixers. CoinJoin entropy. Stealth addresses. The assumption is that if you secure the ledger, you secure the person.
The research disagrees.
Tilburg University's Breadcrumbs in the Digital Forest (de Jong et al., 2026) is not about blockchain. It is about BitTorrent — a protocol most people filed under "solved privacy risk" in 2005 and never revisited. The paper documents how torrent swarm metadata is now a high-speed behavioral profiling engine. Your IP, your peer associations, the specific intersection of what you downloaded alongside whom — that pattern is more durable than a transaction signature and requires no cryptanalysis to read.
The Swarm as a Registration System
When you join a torrent swarm, you announce your presence to a tracker, over UDP or HTTP. The protocol is not optional about this. You send your IP, your port, the info hash of the content you want, and the tracker responds with a peer list. You now know who else is in the swarm. They know you're in it too. The tracker logged the transaction.
De Jong et al. collected over 60,000 unique IPs from a handful of targeted torrents. Not through intrusion. Not through a wiretap. Through normal protocol participation: the same announce request any client sends.
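On a UDP tracker, that exchange is two fixed-layout datagrams: a connect request, then an announce. What follows is a minimal sketch of the packet layouts, assuming the BEP 15 UDP tracker format. The function names are illustrative, and nothing here touches the network.

```python
import socket
import struct

PROTOCOL_ID = 0x41727101980  # fixed magic constant defined by BEP 15
ACTION_CONNECT = 0
ACTION_ANNOUNCE = 1


def build_connect_request(transaction_id: int) -> bytes:
    """16-byte connect request: protocol id, action, transaction id."""
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def build_announce_request(connection_id: int, transaction_id: int,
                           info_hash: bytes, peer_id: bytes, port: int,
                           left: int = 0) -> bytes:
    """98-byte announce request for one torrent."""
    return struct.pack(
        ">QII20s20sQQQIIIiH",
        connection_id, ACTION_ANNOUNCE, transaction_id,
        info_hash,  # 20-byte SHA-1 of the torrent's info dict
        peer_id,    # 20-byte client identifier
        0,          # downloaded
        left,       # bytes left to download
        0,          # uploaded
        0,          # event: 0 = none
        0,          # IP address: 0 = use the packet's source address
        0,          # key
        -1,         # num_want: -1 = tracker default
        port,
    )


def parse_announce_response(data: bytes) -> list[tuple[str, int]]:
    """Decode the peers that follow the 20-byte response header."""
    action, _txid, _interval, _leechers, _seeders = struct.unpack(">IIIII", data[:20])
    if action != ACTION_ANNOUNCE:
        raise ValueError(f"unexpected action {action}")
    peers = []
    # Each peer is 6 bytes: 4-byte IPv4 address + 2-byte big-endian port
    for offset in range(20, len(data) - 5, 6):
        ip = socket.inet_ntoa(data[offset:offset + 4])
        (port,) = struct.unpack(">H", data[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers
```

In practice the connect request goes out first over a datagram socket, and the connection id in the reply feeds the announce. The HTTP form of the same exchange is shorter to write: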
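import os
import socket
import struct

import requests


def scrape_http_tracker(tracker_url: str, info_hash_hex: str) -> list[tuple[str, int]]:
    """
    Harvest the peer list of a specific torrent from an HTTP tracker.
    Normal protocol behavior: no authentication, no intrusion.
    The tracker hands you the swarm because that is what trackers do.
    Peers come from the announce URL; /scrape returns only aggregate counts.
    """
    info_hash = bytes.fromhex(info_hash_hex)
    response = requests.get(
        tracker_url,
        params={
            'info_hash': info_hash,
            'peer_id': b'-XX0001-' + os.urandom(12),  # placeholder 20-byte client id
            'port': 6881,  # placeholder listening port
            'uploaded': 0, 'downloaded': 0, 'left': 1,
            'compact': 1,  # ask for the 6-byte-per-peer format
        },
        timeout=10
    )
    response.raise_for_status()
    # The reply is a bencoded dict; compact peers sit under the key "peers"
    # as "5:peers<length>:<bytes>". A full bencode decoder would be more robust.
    body = response.content
    key = body.find(b'5:peers')
    start = key + len(b'5:peers')
    colon = body.find(b':', start)
    if key == -1 or colon == -1 or not body[start:colon].isdigit():
        return []  # tracker error or non-compact reply
    peer_data = body[colon + 1:colon + 1 + int(body[start:colon])]
    peers = []
    # Compact peer format: 6 bytes per peer (4 IP + 2 port)
    for i in range(0, len(peer_data) - 5, 6):
        ip = socket.inet_ntoa(peer_data[i:i+4])
        port = struct.unpack('>H', peer_data[i+4:i+6])[0]
        peers.append((ip, port))
    return peers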


def enrich_peer(ip: str) -> dict:
    """
    Geolocation and ISP enrichment against a harvested peer IP.
    Flags known VPN exit nodes and hosting providers.
    """
    response = requests.get(f"https://ipinfo.io/{ip}/json", timeout=5)
    data = response.json()
    org = data.get('org', '')
    return {
        'ip': ip,
        'country': data.get('country'),
        'org': org,
        'is_vpn': any(kw in org.lower() for kw in ['vpn', 'hosting', 'datacenter', 'cloud'])
    }

Sixty thousand IPs. Enriched with geolocation and ISP data. Each one placed into a profile that updates every time that IP appears in a new swarm.
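A profile that updates on every sighting can be as small as a dictionary keyed by IP. The sketch below is one hypothetical shape for it. `PeerProfile` and `record_sighting` are illustrative names, not structures taken from the paper.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PeerProfile:
    ip: str
    first_seen: datetime
    last_seen: datetime
    enrichment: dict = field(default_factory=dict)
    # info_hash -> time this IP was first observed in that swarm
    swarms: dict = field(default_factory=dict)


def record_sighting(profiles: dict, ip: str, info_hash: str,
                    seen_at: datetime,
                    enrichment: Optional[dict] = None) -> PeerProfile:
    """Create or update the profile for `ip` after seeing it in one swarm."""
    profile = profiles.get(ip)
    if profile is None:
        profile = PeerProfile(ip=ip, first_seen=seen_at, last_seen=seen_at)
        profiles[ip] = profile
    profile.first_seen = min(profile.first_seen, seen_at)
    profile.last_seen = max(profile.last_seen, seen_at)
    # Keep the earliest sighting per swarm; repeat sightings do not overwrite it
    profile.swarms.setdefault(info_hash, seen_at)
    if enrichment:
        profile.enrichment.update(enrichment)
    return profile


# Usage: two sightings of the same IP in different swarms
profiles = {}
t1 = datetime(2026, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2026, 1, 2, tzinfo=timezone.utc)
record_sighting(profiles, "192.0.2.10", "aa" * 20, t1, {"country": "NL"})
record_sighting(profiles, "192.0.2.10", "bb" * 20, t2)
```

Nothing is ever deleted from the profile, which is the point: the record only grows.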
The Interest-Based Fingerprint
The individual IP is not the threat model. The co-download pattern is.
Anonymity depends on the size of the anonymity set: the number of people who could plausibly be you given the observed behavior. If a thousand people downloaded the same file, the investigator has a thousand candidates. If you are the only person who downloaded that specific obscure forensic examination guide and that specific privacy-hardened OS image in the same 72-hour window, your anonymity set has collapsed to one.
from collections import defaultdict


def build_interest_graph(swarm_records: dict[str, set[str]]) -> dict[str, list[str]]:
    """
    swarm_records: {info_hash: set of peer IPs that appeared in swarm}
    Returns: per-IP list of swarms they participated in.
    Cross-reference with content metadata and you have an interest graph.
    """
    ip_to_swarms = defaultdict(list)
    for swarm_id, peers in swarm_records.items():
        for ip in peers:
            ip_to_swarms[ip].append(swarm_id)
    return dict(ip_to_swarms)


def find_collapsed_anonymity_sets(
    ip_to_swarms: dict[str, list[str]],
    sensitive_swarms: set[str]
) -> list[dict]:
    """
    Flag IPs whose co-download pattern uniquely identifies them.
    sensitive_swarms: info hashes of content with low total peer counts.
    When the intersection of sensitive swarms narrows to a single IP,
    the anonymity set is one. No encryption broke. No warrant was served.
    """
    # Sensitive swarms each IP appeared in, deduplicated
    overlaps = {
        ip: frozenset(swarms) & sensitive_swarms
        for ip, swarms in ip_to_swarms.items()
    }
    candidates = []
    for ip, overlap in overlaps.items():
        if len(overlap) >= 2:
            # Anonymity set: every IP seen in all of the same sensitive swarms
            set_size = sum(1 for other in overlaps.values() if overlap <= other)
            candidates.append({
                'ip': ip,
                'sensitive_swarm_count': len(overlap),
                'anonymity_set_size': set_size,
                'swarms': sorted(overlap)
            })
    # Sort by most uniquely identifying profile: smallest anonymity set first,
    # then by number of sensitive swarms
    return sorted(
        candidates,
        key=lambda x: (x['anonymity_set_size'], -x['sensitive_swarm_count'])
    )
The investigator does not need your name. They need the intersection. If the intersection is unique, the investigation is over before it started.
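The 72-hour window matters as much as the intersection itself. A minimal sketch of a time-bounded intersection, assuming sightings arrive as hypothetical `(ip, info_hash, seen_at)` tuples and, as a simplification, that only the earliest sighting per IP and swarm counts:

```python
from datetime import datetime, timedelta


def anonymity_set(sightings: list, target_hashes: set,
                  window: timedelta) -> set:
    """
    Return the IPs seen in every target swarm, with their earliest
    sightings across those swarms falling inside one time window.
    sightings: iterable of (ip, info_hash, seen_at) tuples.
    """
    earliest = {}  # ip -> {info_hash: earliest seen_at}
    for ip, info_hash, seen_at in sightings:
        if info_hash not in target_hashes:
            continue
        by_hash = earliest.setdefault(ip, {})
        if info_hash not in by_hash or seen_at < by_hash[info_hash]:
            by_hash[info_hash] = seen_at

    matches = set()
    for ip, by_hash in earliest.items():
        if set(by_hash) != set(target_hashes):
            continue  # missing at least one target swarm
        times = list(by_hash.values())
        if max(times) - min(times) <= window:
            matches.add(ip)
    return matches


# Usage with placeholder swarm labels and documentation-range IPs
t0 = datetime(2026, 3, 1)
sightings = [
    ("192.0.2.1", "guide", t0),
    ("192.0.2.1", "os_image", t0 + timedelta(hours=30)),
    ("192.0.2.2", "guide", t0),
    ("192.0.2.2", "os_image", t0 + timedelta(days=10)),
    ("192.0.2.3", "guide", t0),
]
candidates = anonymity_set(sightings, {"guide", "os_image"}, timedelta(hours=72))
```

Three IPs fetched the guide, two fetched both files, and only one did so inside the window. Each added condition shrinks the set without any new data source.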
The VPN Fallacy
The 2026 research specifically addresses the obvious counter-move. Yes, a VPN hides your residential IP from the tracker. The peer list sees an exit node, not your home. Analysts now flag anonymization status as part of enrichment — residential IP versus known VPN exit node versus cloud/datacenter address.
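That three-way status flag amounts to a range lookup. The sketch below assumes the analyst holds lists of known VPN and datacenter prefixes. The two lists here are hypothetical placeholders built from reserved documentation ranges, and `classify_anonymization` is an illustrative name.

```python
import ipaddress

# Placeholder prefixes (reserved documentation ranges), standing in for
# whatever VPN and hosting-provider datasets an analyst actually uses.
VPN_EXIT_RANGES = [ipaddress.ip_network("198.51.100.0/24")]
DATACENTER_RANGES = [ipaddress.ip_network("203.0.113.0/24")]


def classify_anonymization(ip: str) -> str:
    """Label a harvested peer IP by its apparent anonymization status."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in VPN_EXIT_RANGES):
        return "vpn_exit"
    if any(addr in net for net in DATACENTER_RANGES):
        return "datacenter"
    # Not in any known range; treated as residential by default
    return "residential_or_unknown"
```

The default branch is the weak point of any such lookup: an address missing from the lists is labeled residential whether or not it is one.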
