How OpenScan finds your txs without an indexer #168

AugustoL (Maintainer) · 2026-01-21T12:24:23Z

The Problem

Every block explorer you've used — Etherscan, Blockscout, Arbiscan — relies on centralized indexers. These are databases that scan every block, extract transaction data, and store it for quick lookups.

When you search for an address, you're not querying the blockchain. You're querying their database.

This works great for speed, but it requires:

Trust in a third party
Infrastructure to run indexers
Dependency on external APIs

OpenScan takes a different approach. We query the blockchain directly through your configured RPC, using a binary search algorithm to find transactions efficiently.

The Key Insight

Two pieces of on-chain state change when an address transacts:

Nonce — Increments by 1 for every outgoing transaction
Balance — Changes when ETH is sent or received

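Here's the critical part: you can query these values at any historical block using standard RPC methods, provided your RPC endpoint still serves state for that block (for old blocks this generally means an archive node):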

eth_getTransactionCount(address, blockNumber) → nonce
eth_getBalance(address, blockNumber) → balance

If the nonce at block 1,000,000 is 5 and at block 2,000,000 is 8, we know 3 outgoing transactions occurred somewhere in that range.

If the balance changed but the nonce didn't, we know the address received ETH, either as a direct transfer or through a contract's internal call.
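To make the two lookups concrete, here is a minimal sketch of reading both values at one block. The `Send` transport, `getStateAt`, and `toBlockTag` are illustrative names, not OpenScan's actual code; the transport is injected so the sketch does not assume a particular HTTP client. The balance is kept as the hex string the node returns, since the search only needs to know whether it changed.

```typescript
// Hypothetical transport: takes a JSON-RPC request body and resolves to its `result` field.
// In a browser it could wrap fetch(); it is injected here so the sketch stays testable.
type Send = (body: {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: unknown[];
}) => Promise<string>;

interface AddressState {
  nonce: number;      // outgoing transaction count at that block
  balanceHex: string; // balance in wei, as the hex quantity returned by the node
}

// JSON-RPC block parameters are hex quantities, e.g. 1000000 -> "0xf4240".
function toBlockTag(blockNumber: number): string {
  return "0x" + blockNumber.toString(16);
}

async function getStateAt(
  send: Send,
  address: string,
  blockNumber: number,
): Promise<AddressState> {
  const tag = toBlockTag(blockNumber);
  // The two lookups are independent, so they are issued in parallel.
  const [nonceHex, balanceHex] = await Promise.all([
    send({ jsonrpc: "2.0", id: 1, method: "eth_getTransactionCount", params: [address, tag] }),
    send({ jsonrpc: "2.0", id: 2, method: "eth_getBalance", params: [address, tag] }),
  ]);
  return { nonce: parseInt(nonceHex, 16), balanceHex: balanceHex.toLowerCase() };
}
```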

The Algorithm

Phase 1: Initial State Comparison

First, we fetch the address state at two points:

Block 0 (or the start of our search range)
Latest block

Block 0:      nonce=0, balance=0
Block 18M:    nonce=47, balance=2.5 ETH

We now know: 47 outgoing transactions, and some incoming activity.

RPC Calls: 4 (2 parallel calls × 2 blocks)
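The comparison itself is simple. As a sketch (the `AddressState` and `StateDiff` shapes are assumptions made for illustration), the nonce delta gives the number of outgoing transactions and a balance mismatch signals that ETH moved:

```typescript
interface AddressState {
  nonce: number;
  balanceHex: string; // balance in wei as a hex quantity
}

interface StateDiff {
  outgoingTxs: number;     // nonce delta = transactions sent within the range
  balanceChanged: boolean; // true if the balance differs between the two blocks
}

function diffState(start: AddressState, end: AddressState): StateDiff {
  return {
    outgoingTxs: end.nonce - start.nonce,
    balanceChanged: start.balanceHex.toLowerCase() !== end.balanceHex.toLowerCase(),
  };
}

// The example above: block 0 vs block 18M (2.5 ETH = 0x22b1c8c1227a0000 wei).
const phase1 = diffState(
  { nonce: 0, balanceHex: "0x0" },
  { nonce: 47, balanceHex: "0x22b1c8c1227a0000" },
);
// phase1.outgoingTxs is 47 and phase1.balanceChanged is true
```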

Phase 2: Binary Search

Instead of scanning 18 million blocks, we binary search for state changes:

Search(block 0 → block 18M):
  ├─ Fetch state at midpoint (block 9M)
  │   └─ nonce=30, balance=1.2 ETH
  │
  ├─ Left half changed? (0→9M): nonce 0→30 ✓
  │   └─ Recurse into left half
  │
  └─ Right half changed? (9M→18M): nonce 30→47 ✓
      └─ Recurse into right half

We prioritize the right half first (newer blocks) so recent transactions appear in your UI faster.

When the range narrows to two adjacent blocks with different state, the later block is the one that contains the transaction(s).

RPC Calls per iteration: 2 (nonce + balance in parallel)
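The recursion can be sketched roughly as follows. This is an illustration of the idea rather than the code in `AddressTransactionSearch.ts`: `ReadState`, `findChangedBlocks`, and the `onBlock` callback are hypothetical names. The states at both ends of a range are passed down, so each step fetches only the midpoint.

```typescript
interface AddressState {
  nonce: number;
  balanceHex: string;
}

// Hypothetical reader: resolves nonce and balance for the address at a given block.
type ReadState = (block: number) => Promise<AddressState>;

function changed(a: AddressState, b: AddressState): boolean {
  return a.nonce !== b.nonce || a.balanceHex !== b.balanceHex;
}

// Reports every block whose state differs from the block before it,
// visiting newer ranges first.
async function findChangedBlocks(
  read: ReadState,
  lo: number,
  loState: AddressState,
  hi: number,
  hiState: AddressState,
  onBlock: (block: number) => void,
): Promise<void> {
  if (!changed(loState, hiState)) return; // no visible change in (lo, hi]
  if (hi - lo === 1) {
    onBlock(hi); // adjacent blocks: the change happened in `hi`
    return;
  }
  const mid = Math.floor((lo + hi) / 2);
  const midState = await read(mid);
  await findChangedBlocks(read, mid, midState, hi, hiState, onBlock); // right half first
  await findChangedBlocks(read, lo, loState, mid, midState, onBlock);
}
```

One consequence of this design is that a range whose endpoints have identical nonce and balance is skipped entirely, so activity that nets out to zero inside such a range would not be found by this sketch.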

Phase 3: Transaction Fetching

When an "important" block is identified:

Fetch the full block with eth_getBlockByNumber(block, true)
Filter transactions where address is from (sent) or to (received)
Batch fetch receipts in groups of 20 to avoid rate limits
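The filtering step might look like the sketch below. `BlockTx` lists only the fields used here, and the extra `"self"` direction for an address sending to itself is an assumption of this example. Addresses are compared case-insensitively because nodes and UIs differ in whether they return checksummed hex.

```typescript
// Minimal shape of a transaction object inside a full block (only the fields used here).
interface BlockTx {
  hash: string;
  from: string;
  to: string | null; // null for contract-creation transactions
}

type Direction = "sent" | "received" | "self";

function filterForAddress(
  txs: BlockTx[],
  address: string,
): { tx: BlockTx; direction: Direction }[] {
  const target = address.toLowerCase();
  const matches: { tx: BlockTx; direction: Direction }[] = [];
  for (const tx of txs) {
    const isFrom = tx.from.toLowerCase() === target;
    const isTo = tx.to !== null && tx.to.toLowerCase() === target;
    if (isFrom && isTo) matches.push({ tx, direction: "self" });
    else if (isFrom) matches.push({ tx, direction: "sent" });
    else if (isTo) matches.push({ tx, direction: "received" });
  }
  return matches;
}
```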

Phase 4: Internal Transaction Detection

If the balance changed but nonce didn't, and no direct transfer was found, we look for internal transactions:

Check calldata first — Scan transaction input for the address hex (free, no RPC)
Check logs if needed — Scan event logs for the address (requires receipt fetches)

This catches interactions with contracts like Uniswap, Disperse.app, and multisigs.
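Both checks come down to looking for the address's 40 hex characters inside other hex data. A minimal sketch, with the function names and the `Log` shape chosen for illustration; note that a raw substring match can in principle produce false positives, so treat a hit as a candidate rather than proof:

```typescript
// Only the log fields used here.
interface Log {
  address: string;
  topics: string[];
  data: string;
}

// Strip "0x" and lowercase so the address can be matched as a raw hex substring.
function addressNeedle(address: string): string {
  return address.toLowerCase().replace(/^0x/, "");
}

// Step 1: calldata is already part of the block response, so this costs no RPC call.
function calldataMentions(input: string, address: string): boolean {
  return input.toLowerCase().includes(addressNeedle(address));
}

// Step 2: logs come from receipts, so this is only worth doing when step 1 finds nothing.
function logsMention(logs: Log[], address: string): boolean {
  const needle = addressNeedle(address);
  return logs.some(
    (log) =>
      log.topics.some((topic) => topic.toLowerCase().includes(needle)) ||
      log.data.toLowerCase().includes(needle),
  );
}
```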

RPC Call Estimates

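Here's what to expect for different address histories on Ethereum mainnet. The figures use a round 20M blocks; mainnet passed that height in mid-2024, and because the search is logarithmic, the blocks added since then cost less than one extra iteration.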

Binary Search Phase

| Block Range | Binary Search Iterations | State Queries |
|---|---|---|
| ~220K blocks (1 month) | ~18 | ~36 calls |
| ~1.3M blocks (6 months) | ~20 | ~40 calls |
| ~5.3M blocks (2 years) | ~22 | ~44 calls |
| ~20M blocks (genesis) | ~24 | ~48 calls |

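The math: Binary search is O(log n), and log₂(20,000,000) ≈ 24.3, so isolating one state change takes about 24 iterations. Each iteration queries one midpoint = 2 RPC calls (nonce + balance in parallel). Every additional changed block adds its own search path below the point where the ranges diverge, which is why the totals further down grow with transaction count.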
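The figures in the table can be reproduced with a few lines. `estimateSearch` is a name made up for this sketch, and it rounds log₂ to the nearest integer to match the approximate values shown; a strict worst case would round up instead.

```typescript
// Approximate cost of narrowing one state change down to a single block.
function estimateSearch(blockRange: number): { iterations: number; stateQueries: number } {
  const iterations = Math.round(Math.log2(blockRange));
  // Each iteration reads nonce and balance at one midpoint.
  return { iterations, stateQueries: iterations * 2 };
}

for (const range of [220000, 1300000, 5300000, 20000000]) {
  const { iterations, stateQueries } = estimateSearch(range);
  console.log(`${range} blocks: ~${iterations} iterations, ~${stateQueries} calls`);
}
```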

Transaction Fetching Phase

For each block with transactions found:

1 call to fetch the full block (eth_getBlockByNumber)
1 call per transaction to fetch receipts (batched in groups of 20)

Total Estimates by Transaction Count

| Transactions Found | State Queries | Block Fetches | Receipt Fetches | Total |
|---|---|---|---|---|
| 5 txs | ~40 | ~5 | ~5 | ~50 calls |
| 20 txs | ~50 | ~15 | ~20 | ~85 calls |
| 50 txs | ~60 | ~30 | ~50 | ~140 calls |
| 100 txs | ~80 | ~50 | ~100 | ~230 calls |

Note: Actual numbers vary based on transaction distribution across blocks.

Compared to Naive Approach

Scanning all 20M blocks would require 20 million eth_getBlockByNumber calls. Binary search reduces this to roughly 50-250 calls, a reduction of more than 99.998%.

Optimizations We Use

1. Right-First Search

We search newer blocks first. Most users care about recent activity, so this gets results on screen faster.

2. Receipt Batching

Instead of fetching receipts one by one (which triggers rate limits), we batch them in groups of 20 with small delays between batches.
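A batching helper along these lines would do the job. This is a sketch under assumptions: `fetchInBatches` and `fetchOne` are hypothetical names, and the 100 ms default pause is a placeholder, since the post only says the delays are small.

```typescript
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Runs `fetchOne` over `items` in groups of `batchSize`, pausing between groups.
async function fetchInBatches<T, R>(
  items: T[],
  fetchOne: (item: T) => Promise<R>,
  batchSize = 20,
  delayMs = 100, // assumed value, not taken from OpenScan
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Requests within one batch go out in parallel.
    results.push(...(await Promise.all(batch.map(fetchOne))));
    // Pause only if another batch follows.
    if (i + batchSize < items.length) await sleep(delayMs);
  }
  return results;
}
```

Results come back in the same order as the input, which keeps receipts aligned with the transaction hashes that were requested.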

3. Calldata Scanning

Before making expensive receipt calls for internal transaction detection, we scan the transaction input data. This catches many contract interactions for free.

4. Session Caching

Within a single search session, we cache nonce/balance results. If the algorithm queries the same block twice, we don't make redundant RPC calls.
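One way to get this behavior is to wrap the state reader in a per-session memo. In the sketch below (`withSessionCache` and `ReadState` are illustrative names), promises are cached rather than resolved values, so two concurrent requests for the same block share a single RPC round trip:

```typescript
interface AddressState {
  nonce: number;
  balanceHex: string;
}

type ReadState = (block: number) => Promise<AddressState>;

// Wraps a state reader so each block is fetched at most once per session.
function withSessionCache(read: ReadState): ReadState {
  const cache = new Map<number, Promise<AddressState>>();
  return (block) => {
    let pending = cache.get(block);
    if (!pending) {
      pending = read(block);
      cache.set(block, pending);
      // Do not keep failures around: a later retry should hit the RPC again.
      pending.catch(() => {
        cache.delete(block);
      });
    }
    return pending;
  };
}
```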

5. Streaming Results

Transactions appear in your UI as they're found, not after the entire search completes.

Current Limitations

We believe in being transparent about trade-offs. Here's what we're still working on:

1. Speed vs Trust Trade-off

This approach is slower than centralized indexers. A full search can take 10-30 seconds for active addresses, while Etherscan returns results instantly.

But those instant results come from trusting their database. Our results come directly from the blockchain through your RPC.

2. Internal Transaction Detection

Detecting transactions where your address is only involved internally (not as from or to) is challenging:

Contract-to-contract transfers: If a contract sends you ETH, we detect the balance change, but finding the exact transaction requires log scanning
Pure internal calls: Some contract interactions don't emit events, making them nearly impossible to attribute

We detect most cases, but some edge cases slip through.

3. RPC Rate Limits

Public RPCs have rate limits. When searching old addresses with many transactions:

Requests may get throttled (HTTP 429)
Some providers temporarily block your IP
Search may fail partway through

Our solution: OpenScan lets you configure Infura and Alchemy API keys directly in Settings. These providers offer:

Higher rate limits than public RPCs
Lower latency for faster searches
More reliable connections

Just add your API key once, and OpenScan automatically uses it for all supported networks. You can also configure multiple RPC endpoints in fallback mode for redundancy.
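How fallback mode is implemented is not described here, but the simplest reading is "try each endpoint in order until one answers". A sketch under that assumption, with `Send` and `sendWithFallback` as hypothetical names:

```typescript
// Hypothetical per-endpoint transport: sends one JSON-RPC call and resolves to its result.
type Send = (method: string, params: unknown[]) => Promise<unknown>;

// Tries each endpoint in order and returns the first successful result.
async function sendWithFallback(
  endpoints: Send[],
  method: string,
  params: unknown[],
): Promise<unknown> {
  let lastError: unknown = new Error("no RPC endpoints configured");
  for (const send of endpoints) {
    try {
      return await send(method, params);
    } catch (err) {
      // e.g. HTTP 429 or a network failure: remember it and try the next endpoint
      lastError = err;
    }
  }
  throw lastError;
}
```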

4. Balance Ambiguity

If an address sends and receives ETH in the same block, we detect both transactions, but the algorithm can't distinguish which balance change came from which transaction without fetching all transactions in the block.

5. Memory Usage

All found transactions are held in memory during the search. For addresses with thousands of transactions, this can be significant. Results are cached to localStorage afterward (with 1-week expiry) to avoid re-searching.
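The expiry logic could be as small as storing a timestamp next to the results. In this sketch the `KeyValueStore` interface is a subset of the Web Storage API that `window.localStorage` satisfies; the function names and the stored JSON shape are assumptions for illustration.

```typescript
// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;

function saveResults<T>(store: KeyValueStore, key: string, value: T, now = Date.now()): void {
  store.setItem(key, JSON.stringify({ savedAt: now, value }));
}

function loadResults<T>(store: KeyValueStore, key: string, now = Date.now()): T | null {
  const raw = store.getItem(key);
  if (raw === null) return null;
  const entry = JSON.parse(raw) as { savedAt: number; value: T };
  if (now - entry.savedAt > ONE_WEEK_MS) {
    store.removeItem(key); // expired: drop it so the next visit searches again
    return null;
  }
  return entry.value;
}
```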

Why We Built This

OpenScan exists because we believe in trustless infrastructure.

We choose to build trustless systems even when it is harder.

Every transaction you see in OpenScan was fetched directly from the blockchain through your RPC. No middleman. No database we control. No trust required.

Is it slower? Yes. Is it worth it? We think so.

Try It Yourself

Go to any address page on openscan.eth.link
Click Search Transactions
Watch the binary search work in real-time
Results are cached — subsequent visits are instant

What's Next

We're exploring several improvements:

ERC20 transfer history: The same binary search approach works for tokens! By querying balanceOf(address) at different blocks, we can track all ERC20 transfers — including internal calls and contract interactions. Unlike ETH where internal transfers are hard to detect, token balance changes are fully visible on-chain.
Parallel binary search: Search multiple state ranges simultaneously
Smarter internal tx detection: Use trace APIs where available
Incremental caching: Cache intermediate search states, not just results
Retry logic: Automatic retry with exponential backoff on rate limits
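For the ERC20 idea, the historical query is an eth_call to the token contract at a chosen block. Since this feature is still being explored, the sketch below only shows how such a request could be built; `balanceOfCall` is a hypothetical helper. The selector `0x70a08231` is the standard one for `balanceOf(address)`.

```typescript
// First 4 bytes of keccak256("balanceOf(address)").
const BALANCE_OF_SELECTOR = "0x70a08231";

interface EthCallRequest {
  method: "eth_call";
  params: [{ to: string; data: string }, string];
}

// Builds the JSON-RPC method and params for reading a token balance at a past block.
function balanceOfCall(token: string, holder: string, blockNumber: number): EthCallRequest {
  const holderHex = holder.toLowerCase().replace(/^0x/, "");
  if (!/^[0-9a-f]{40}$/.test(holderHex)) {
    throw new Error("invalid address: " + holder);
  }
  // ABI encoding: the 20-byte address is left-padded to a 32-byte word.
  const data = BALANCE_OF_SELECTOR + holderHex.padStart(64, "0");
  return {
    method: "eth_call",
    params: [{ to: token, data }, "0x" + blockNumber.toString(16)],
  };
}
```

The returned value is a 32-byte hex word, so the same "did it change between these two blocks" comparison used for ETH balances applies unchanged.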

Have ideas? We'd love to hear them in the comments or on GitHub.

Technical Reference

For those who want to dig into the code:

Core algorithm: src/services/AddressTransactionSearch.ts
Integration layer: src/services/adapters/EVMAdapter/EVMAdapter.ts

The implementation is ~600 lines of TypeScript. PRs welcome.

The OpenScan Team
