Bitcoin Block Streaming: Why gRPC Outperforms JSON-RPC for Indexer Backfills

· 3 min read
OverBlock Team
OverBlock Engineering

Bitcoin has over 880,000 blocks. If you're building an indexer and need to backfill from genesis, your choice of transport protocol will determine whether that takes hours or weeks.

The JSON-RPC bottleneck

The standard approach to fetching Bitcoin blocks is getblock over JSON-RPC. It works fine for one-off queries. For bulk ingestion, it has a structural problem: every block is a separate HTTP request.

Each request adds:

  • TCP round-trip latency (typically 5-50ms depending on distance to node)
  • HTTP headers on every request and response
  • JSON encoding and decoding overhead

At 880,000 blocks with 20ms average round-trip, just the network overhead is 4.9 hours before you parse a single byte.
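That figure is straight multiplication; here is the arithmetic spelled out (880,000 sequential requests at 20 ms of round-trip each, counting only network wait):

```go
package main

import "fmt"

// overheadHours returns the cumulative network wait for strictly
// sequential requests: one round-trip per block, nothing else counted.
func overheadHours(blocks int, rttSeconds float64) float64 {
	return float64(blocks) * rttSeconds / 3600
}

func main() {
	fmt.Printf("%.1f hours\n", overheadHours(880_000, 0.020)) // prints "4.9 hours"
}
```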

In practice, you can parallelize - but then you're handling ordering yourself, managing concurrency limits, dealing with connection pools, and retrying failures. The code that was "just fetch a block" becomes infrastructure.
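Here is a minimal sketch of what that infrastructure looks like: a worker pool with a concurrency limit, where the slice indexing is the re-ordering logic you end up owning. fetchBlock is a placeholder for the actual getblock round-trip; real code would also need retries, rate limits, and connection pooling.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchBlock stands in for one blocking JSON-RPC getblock round-trip.
func fetchBlock(height int) []byte {
	return []byte(fmt.Sprintf("block-%d", height))
}

// backfill fetches heights [from, to) with `workers` concurrent requests
// and reassembles the results in height order -- the ordering and
// concurrency bookkeeping that an ordered stream makes unnecessary.
func backfill(from, to, workers int) [][]byte {
	results := make([][]byte, to-from)
	heights := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for h := range heights {
				results[h-from] = fetchBlock(h) // each height writes a distinct slot
			}
		}()
	}
	for h := from; h < to; h++ {
		heights <- h
	}
	close(heights)
	wg.Wait()
	return results
}

func main() {
	blocks := backfill(800_000, 800_008, 4)
	fmt.Println(string(blocks[0])) // prints "block-800000" -- order restored by hand
}
```

Even this stripped-down version carries a channel, a wait group, and index arithmetic that exist only to recover the ordering the transport threw away.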

gRPC streaming: a different model

gRPC server-side streaming uses a single persistent connection. The client sends one subscription request; the server sends blocks continuously, and the client reads at its own pace (HTTP/2 flow control provides backpressure).

What this eliminates:

  • Per-block connection overhead
  • HTTP header costs (HTTP/2 frames are smaller)
  • Ordering logic (stream guarantees sequence)

A managed gRPC stream for block ingestion can deliver thousands of blocks per second on a standard connection - versus tens to low hundreds for parallelized JSON-RPC.

MessagePack vs JSON

Wire format matters at scale.

A typical Bitcoin block as JSON: ~5-15 MB depending on transaction density. The same block as MessagePack: roughly 40-60% smaller.

Beyond size, MessagePack deserialization is faster than JSON parsing. For a pipeline ingesting millions of blocks, this difference accumulates.

When to use JSON: debugging, compatibility with tools that expect JSON, low-volume one-off queries. When to use MessagePack: production backfill pipelines, high-volume ingestion, when you control both sides.
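At full-chain scale those percentages are terabytes. A back-of-the-envelope sketch, taking a 10 MB JSON block and a 50% MessagePack reduction as illustrative midpoints of the ranges above (an upper bound - early-chain blocks are far smaller than recent ones):

```go
package main

import "fmt"

// chainTB returns total transfer volume in decimal terabytes for the
// whole chain at a given average per-block payload in megabytes.
func chainTB(blocks int, mbPerBlock float64) float64 {
	return float64(blocks) * mbPerBlock / 1e6
}

func main() {
	jsonTB := chainTB(880_000, 10.0) // illustrative midpoint of 5-15 MB
	fmt.Printf("JSON: %.1f TB, MessagePack: %.1f TB\n", jsonTB, jsonTB*0.5)
	// prints "JSON: 8.8 TB, MessagePack: 4.4 TB"
}
```

Halving the bytes on the wire halves transfer time on a bandwidth-bound link, before counting the parsing savings.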

What confirmation depth means for streaming

One nuance: a streaming service that delivers blocks as soon as they're mined exposes you to reorg risk. Bitcoin reorganizations do happen - the probability that a block is orphaned drops sharply with each block mined on top of it.

A streaming service that waits for ~6 confirmations before delivering a block trades a few minutes of latency for reorg safety. For an indexer building historical state, this is almost always the right trade - you don't want to process a block that will be rolled back.
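The cutoff rule is simple to state in code: a block is delivered only once it has 6 confirmations, i.e. five blocks mined on top of it (the block itself counts as the first). A sketch, with names that are illustrative rather than the stream-app API:

```go
package main

import "fmt"

const confirmationDepth = 6 // blocks buried this deep are very rarely reorged

// safeHeight returns the highest block a depth-gated stream would deliver:
// a block at height h has (tip - h + 1) confirmations, so it qualifies
// once h <= tip - (confirmationDepth - 1).
func safeHeight(tipHeight int) int {
	return tipHeight - (confirmationDepth - 1)
}

func main() {
	fmt.Println(safeHeight(880_000)) // at tip 880,000 the stream stops at 879,995
}
```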

When JSON-RPC is still the right choice

For single block lookups, spot checks, or fetching a specific transaction, JSON-RPC is fine. It's universally supported, easy to test with curl, and available from every node and RPC provider.

The case for gRPC streaming is specifically bulk sequential ingestion - which is exactly what indexer backfills require.

Practical implication

If you're building a Bitcoin indexer and your initial sync currently takes days via RPC polling, the bottleneck is almost certainly the transport layer. Switching to ordered gRPC streaming changes the equation.

Here is a minimal gRPC stream in Go using the stream-app API:

// assumes imports: context, crypto/tls, errors, io, log, os,
// google.golang.org/grpc, google.golang.org/grpc/credentials, google.golang.org/grpc/metadata
creds := credentials.NewTLS(&tls.Config{})
conn, err := grpc.Dial("stream.overblock.io:443", grpc.WithTransportCredentials(creds))
if err != nil {
    log.Fatal(err)
}
defer conn.Close()
client := pb.NewBlockStreamClient(conn)

md := metadata.Pairs("x-api-key", os.Getenv("OVERBLOCK_KEY"))
ctx := metadata.NewOutgoingContext(context.Background(), md)

stream, err := client.Subscribe(ctx, &pb.SubscribeRequest{
    StartHeight: 800000,
    Format:      pb.Format_MESSAGEPACK,
})
if err != nil {
    log.Fatal(err)
}

for {
    block, err := stream.Recv()
    if errors.Is(err, io.EOF) {
        break // server closed the stream
    }
    if err != nil {
        log.Fatal(err) // production code would reconnect and resume from the last height
    }
    processBlock(block)
}

stream-app provides this as a managed service: activate in the dashboard, get a token, open a stream from any block height. No node required.