pg_textsearch vs System X

Development Progress Report

Snapshot in time: This comparison reflects pg_textsearch as of March 11, 2026. The project is under active development with performance improvements shipping regularly. Check the benchmark dashboard for the latest numbers.

Dataset: MS MARCO 8.8M passages | Date: 2026-03-11 | Commit: 39c60d4 | System X: v0.21.6

Current Status

pg_textsearch today

  • 3.9x faster overall query throughput
  • Faster on all query lengths (1-8+ tokens)
  • Smaller index (no positions stored)*
  • Parallel index build (4 workers)
  • Native Postgres integration

System X v0.21.6

  • Faster index build (1.6x)
  • Phrase queries supported
  • Larger feature set (facets, etc.)

Recent Improvements

Index Size & Build Time

Metric       pg_textsearch   System X    Difference
Index Size   1,215 MB        1,503 MB    -19%
Build Time   233.5 sec       142.5 sec   +64%
Documents    8,841,823       -
Index size caveat: pg_textsearch does not store term positions, so it cannot support phrase queries like "quick brown fox". System X stores positions by default, which adds significant overhead but enables phrase search. This accounts for most of the index size difference—it's a feature tradeoff, not a compression advantage.
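To make the tradeoff concrete, here is a toy sketch (hypothetical data and helper names, not pg_textsearch or System X internals) of why phrase matching requires stored positions while plain term matching does not:

```python
from collections import defaultdict

docs = {1: "the quick brown fox", 2: "brown fox and quick dog"}

# Positional index: term -> {doc_id: [positions]}. The position lists are
# the extra payload that phrase search needs and that inflates the index.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for pos, term in enumerate(text.split()):
        index[term].setdefault(doc_id, []).append(pos)

def term_match(terms):
    """Docs containing all terms anywhere (no positions required)."""
    return sorted(set.intersection(*(set(index[t]) for t in terms)))

def phrase_match(terms):
    """Docs where the terms are adjacent, in order (positions required)."""
    hits = []
    for doc_id in term_match(terms):
        starts = index[terms[0]][doc_id]
        if any(all(p + i in index[t][doc_id] for i, t in enumerate(terms))
               for p in starts):
            hits.append(doc_id)
    return hits

print(term_match(["quick", "brown"]))    # both docs contain both terms
print(phrase_match(["quick", "brown"]))  # only doc 1 has them adjacent
```

Dropping the position lists (keeping only doc ids) leaves `term_match` working but makes `phrase_match` impossible, which is exactly the pg_textsearch tradeoff described above.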
Build time improvement: With the arena allocator rewrite (PR #231) and leader-only merge (PR #244), build time dropped from 270s to 234s. System X is 1.6x faster at index build.

Query Latency (p50)

Median latency in milliseconds. Lower is better.

Query Tokens   pg_textsearch   System X   Difference
1 token        0.70 ms         18.05 ms   -96%
2 tokens       1.31 ms         18.51 ms   -93%
3 tokens       2.44 ms         24.69 ms   -90%
4 tokens       3.73 ms         27.35 ms   -86%
5 tokens       6.07 ms         28.94 ms   -79%
6 tokens       8.76 ms         35.34 ms   -75%
7 tokens       13.05 ms        37.29 ms   -65%
8+ tokens      19.86 ms        44.98 ms   -56%

Query Latency (p95)

95th percentile latency in milliseconds. Lower is better.

Query Tokens   pg_textsearch   System X   Difference
1 token        1.62 ms         24.20 ms   -93%
2 tokens       3.62 ms         32.18 ms   -89%
3 tokens       6.97 ms         37.57 ms   -81%
4 tokens       10.86 ms        36.17 ms   -70%
5 tokens       17.76 ms        39.34 ms   -55%
6 tokens       21.24 ms        59.09 ms   -64%
7 tokens       32.55 ms        67.40 ms   -52%
8+ tokens      43.94 ms        70.92 ms   -38%

Throughput

Total time to execute 800 test queries sequentially.

Metric         pg_textsearch   System X    Difference
Total time     6.46 sec        25.20 sec   -74%
Avg ms/query   8.08 ms         31.50 ms    -74%
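The per-query averages and the headline speedup follow directly from the totals:

```python
queries = 800
pg_total_s, sysx_total_s = 6.46, 25.20   # totals from the table above

pg_avg_ms = pg_total_s / queries * 1000      # ~8.08 ms/query
sysx_avg_ms = sysx_total_s / queries * 1000  # ~31.50 ms/query
speedup = sysx_total_s / pg_total_s          # ~3.9x, the headline number

print(f"{pg_avg_ms:.2f} ms vs {sysx_avg_ms:.2f} ms, {speedup:.1f}x faster")
```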

Analysis

Query latency: pg_textsearch faster across all token counts

pg_textsearch is faster on all 8 token buckets at both p50 and p95, ranging from 26x faster on single-token queries to 2.3x faster on 8+ token queries at p50. SIMD-accelerated bitpack decoding (PR #250), stack-allocated decode buffers (PR #253), and BMW cache optimizations (PR #274) drove improvements across the board.

Overall throughput: pg_textsearch 3.9x faster

pg_textsearch completes 800 queries in 6.5s vs 25.2s for System X, a 3.9x throughput advantage. This is up from 3.2x on March 3, driven by continued scoring path optimizations.

Index build: System X 1.6x faster

System X builds its index in 143s vs 234s for pg_textsearch (1.6x faster). The arena allocator rewrite (PR #231) and leader-only merge (PR #244) previously cut build time from 270s to 234s.

Methodology

Both extensions benchmarked on identical GitHub Actions runners with the same Postgres configuration. See full methodology for details.

Note: Both extensions use default configurations. Results are from a single run; expect ~10% variance between runs. This page will be updated as optimizations land.

MS-MARCO v2 — 138M Passages

Large-Scale Benchmark

New experiment: This section uses the full MS-MARCO v2 passage collection (138M documents, ~16x larger than v1 above). The hardware and configuration differ from the 8.8M experiment — see details below. Numbers between the two sections are not directly comparable.

Dataset: MS MARCO v2 — 138,364,158 passages | Date: 2026-03-10 | pg_textsearch: v1.0.0-dev (main @ fb3b3b1) | System X: v0.21.6

Environment

Component    Specification
CPU          Intel Xeon Platinum 8375C @ 2.90 GHz, 8 cores / 16 threads
RAM          123 GB
Storage      NVMe SSD (885 GB)
Postgres     17.7, shared_buffers = 31 GB, data on NVMe
Table size   47 GB (87 GB with TOAST)

Current Status (138M)

pg_textsearch

  • 2.3x faster weighted p50 query latency
  • 4.7x higher concurrent throughput (16 clients)
  • Faster on all 8 token buckets at p50
  • 26% smaller index on disk
  • Block-Max WAND with cached skip entries
  • SIMD-accelerated bitpack decoding

System X v0.21.6

  • 1.9x faster index build
  • Phrase queries supported
  • Larger feature set (facets, etc.)

Index Build (138M)

Metric             pg_textsearch   System X     Difference
Build time         17 min 37 s     8 min 55 s   1.9x slower
Parallel workers   15              14           -
Index size         17 GB           23 GB        -26%
Documents          138,364,158     -
Unique terms       17,373,764      -            -
Build time note: pg_textsearch's parallel build has two phases: scan/tokenize (parallel) and merge (single-threaded I/O-bound). The merge phase dominates at this scale. System X uses a Tantivy-based backend with a different merge strategy.
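A minimal sketch of that two-phase shape (a thread pool stands in for Postgres parallel workers; all names are illustrative, not the extension's actual code):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def tokenize_slice(slice_docs):
    """Phase 1 (parallel): turn one slice of documents into a sorted run."""
    run = [(term, doc_id)
           for doc_id, text in slice_docs
           for term in text.split()]
    run.sort()
    return run

def build_index(docs, workers=4):
    # Each worker tokenizes a disjoint slice of the table.
    slices = [docs[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(tokenize_slice, slices))
    # Phase 2 (single-threaded): k-way merge of the sorted runs into the
    # final term -> [doc_id, ...] postings. At scale this merge is the
    # I/O-bound step that dominates, per the build-time note above.
    postings = {}
    for term, doc_id in heapq.merge(*runs):
        postings.setdefault(term, []).append(doc_id)
    return postings

docs = [(1, "fast text search"), (2, "search text"), (3, "fast search")]
print(build_index(docs, workers=2))
```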

Single-Client Query Latency (138M)

Top-10 results (LIMIT 10), Block-Max WAND (BMW) optimization enabled. 691 queries sampled across 8 token-count buckets.

Median Latency (p50)

Query Tokens   pg_textsearch   System X    Speedup
1 token        5.11 ms         59.83 ms    11.7x
2 tokens       9.14 ms         59.65 ms    6.5x
3 tokens       20.04 ms        77.62 ms    3.9x
4 tokens       41.92 ms        98.89 ms    2.4x
5 tokens       67.76 ms        125.38 ms   1.9x
6 tokens       102.82 ms       148.78 ms   1.4x
7 tokens       159.37 ms       169.65 ms   1.1x
8+ tokens      177.95 ms       190.47 ms   1.1x

95th Percentile Latency (p95)

Query Tokens   pg_textsearch   System X    Speedup
1 token        6.43 ms         68.34 ms    10.6x
2 tokens       32.63 ms        103.17 ms   3.2x
3 tokens       51.51 ms        114.79 ms   2.2x
4 tokens       124.17 ms       147.32 ms   1.2x
5 tokens       167.05 ms       190.07 ms   1.1x
6 tokens       262.07 ms       201.76 ms   0.77x
7 tokens       311.58 ms       291.09 ms   0.94x
8+ tokens      404.95 ms       310.68 ms   0.77x

Weighted-Average Latency

Weighted by observed query-length distribution from 1,010,916 MS-MARCO v1 Bing queries after English stopword removal and stemming (mean 3.7 lexemes, mode 3).

Query length distribution
MS-MARCO Query Lexeme Count Distribution (1,010,916 queries)
  Lexemes = distinct stems after English stopword removal

  lexemes   queries      %  distribution
  ───────  ────────  ─────  ──────────────────────────────────────────────────
        0        11   0.0%  ▏
        1    35,638   3.5%  ▓▓▓▓▓
        2   165,033  16.3%  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
        3   304,887  30.2%  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
        4   264,177  26.1%  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
        5   143,765  14.2%  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
        6    59,558   5.9%  ▓▓▓▓▓▓▓▓▓
        7    22,595   2.2%  ▓▓▓
        8     8,627   0.9%  ▓
        9     3,395   0.3%  ▏
       10     1,555   0.2%  ▏
       11       721   0.1%  ▏
       12       402   0.0%  ▏
       13       235   0.0%  ▏
       14       123   0.0%  ▏
       15+      193   0.0%  ▏

    Total: 1,010,916 queries
     Mean: 3.7 lexemes
     Mode: 3 lexemes (30.2%)

  72.6% of queries have 2-4 lexemes.
  96.2% of queries have 1-6 lexemes.

Benchmark buckets 1–7 contain 100 queries each; bucket 8+ contains 38 queries covering all lengths ≥8. Weights applied to each bucket match the distribution above.

Metric         pg_textsearch   System X    Speedup
Weighted p50   40.61 ms        94.36 ms    2.3x
Weighted avg   46.69 ms        101.66 ms   2.2x
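The weighted p50 figures can be reproduced from the bucket medians and the lexeme-count distribution above (the last bucket aggregates every count of 8 and higher; the 11 zero-lexeme queries are excluded):

```python
# Per-bucket query counts from the lexeme distribution above; the last
# entry sums the 8 through 15+ rows to form the "8+" bucket.
counts = [35_638, 165_033, 304_887, 264_177, 143_765, 59_558, 22_595,
          8_627 + 3_395 + 1_555 + 721 + 402 + 235 + 123 + 193]

# Bucket p50 latencies in ms, from the 138M median-latency table above.
pg_p50   = [5.11, 9.14, 20.04, 41.92, 67.76, 102.82, 159.37, 177.95]
sysx_p50 = [59.83, 59.65, 77.62, 98.89, 125.38, 148.78, 169.65, 190.47]

def weighted(latencies):
    """Average the bucket latencies, weighted by query frequency."""
    return sum(l * c for l, c in zip(latencies, counts)) / sum(counts)

# Reproduces the table: ~40.6 ms vs ~94.4 ms, a ~2.3x weighted advantage.
print(round(weighted(pg_p50), 2), round(weighted(sysx_p50), 2))
```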

Throughput (138M)

Single-Client Sequential

691 queries run 3 times; median iteration reported.

Metric                pg_textsearch   System X    Speedup
Avg ms/query          62.92 ms        106.53 ms   1.7x
Total (691 queries)   43.5 s          73.6 s      1.7x

Concurrent (pgbench, 16 clients, 60 s)

Metric                   pg_textsearch   System X   Ratio
Transactions/sec (TPS)   91.4            19.4       4.7x
Avg latency              175 ms          823 ms     4.7x
Transactions (60 s)      5,526           1,180      4.7x
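As a sanity check, the TPS and latency rows are mutually consistent: with a closed-loop driver like pgbench, average latency is approximately clients / TPS (Little's law):

```python
clients = 16   # pgbench -c 16

for name, tps, reported_ms in [("pg_textsearch", 91.4, 175),
                               ("System X", 19.4, 823)]:
    implied_ms = clients / tps * 1000   # closed loop: latency = clients/TPS
    # Implied and reported values agree to within a couple of ms.
    print(f"{name}: implied {implied_ms:.0f} ms, reported {reported_ms} ms")
```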

Analysis (138M)

Query latency: pg_textsearch faster across all token counts

pg_textsearch is faster on all 8 token buckets at p50, ranging from 11.7x faster on single-token queries to 1.1x on 8+ token queries. Cached skip entries and reusable decompression buffers (PR #274) reduced per-block overhead in the WAND inner loop by 20–25%, closing the gap on high-token queries. The weighted p50 advantage is 2.3x.
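A single-term simplification of the block-skipping idea (real Block-Max WAND handles multiple terms and pivot selection; this sketch only shows how a cached per-block maximum lets whole blocks be skipped without being decompressed):

```python
import heapq

def build_blocks(postings, block_size=4):
    """Split a postings list [(doc, score), ...] into fixed-size blocks,
    caching each block's maximum score (the cached 'skip entry')."""
    return [{"max": max(s for _, s in postings[i:i + block_size]),
             "postings": postings[i:i + block_size]}
            for i in range(0, len(postings), block_size)]

def top_k(blocks, k):
    heap, skipped = [], 0          # min-heap holds the best k (score, doc)
    for block in blocks:
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        if block["max"] <= threshold:
            skipped += 1           # the cached max proves nothing in this
            continue               # block can enter the top k: skip it
        for doc, score in block["postings"]:
            if len(heap) < k:
                heapq.heappush(heap, (score, doc))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc))
    return sorted(heap, reverse=True), skipped

postings = [(1, 0.2), (2, 0.3), (3, 0.1), (4, 0.2),     # low-scoring block
            (5, 0.9), (6, 0.8), (7, 0.7), (8, 0.6),     # high-scoring block
            (9, 0.1), (10, 0.2), (11, 0.1), (12, 0.1)]  # skippable block
top, skipped = top_k(build_blocks(postings), k=2)
print(top, skipped)   # the last block is pruned via its cached max
```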

Tail latency: improved but mixed at p95

pg_textsearch has tighter tail latency on 1–5 token queries at p95. On 6–8+ token queries, System X still has tighter tails. The p95 gap narrowed significantly with the cache optimizations (e.g., 5-token p95 went from 200ms to 167ms, now faster than System X's 190ms). Further tail latency optimization on long queries remains an active area of work.

Concurrent throughput: pg_textsearch 4.7x higher TPS

Under 16-client concurrent load, pg_textsearch achieves 91.4 TPS vs 19.4 TPS for System X — a 4.7x advantage. This is significantly wider than the 1.7x single-client gap, indicating that pg_textsearch scales much better under concurrency. pg_textsearch uses native Postgres buffer management and shared memory, avoiding the external process coordination overhead present in System X's architecture.

Index build: System X 1.9x faster

System X builds its index in 8 min 55 s vs 17 min 37 s for pg_textsearch (1.9x faster). pg_textsearch's parallel build uses 15 workers for the scan phase, but the subsequent merge phase is single-threaded and I/O-bound, accounting for the majority of the build time. Despite the slower build, pg_textsearch produces a 26% smaller index (17 GB vs 23 GB).

Methodology (138M)

Both extensions benchmarked on the same dedicated EC2 instance (c6i.4xlarge), same Postgres 17.7 installation, same dataset. The table was loaded once; each extension built its index from scratch with the page cache dropped before each build. Query benchmarks include warmup passes. The pgbench concurrency test uses -M prepared mode with random query selection from the 691 benchmark queries.

Note: Both extensions use default configurations. Results are from a single run on dedicated hardware; expect ~5-10% variance between runs.