pg_textsearch vs System X

Development Progress Report

Snapshot in time: This comparison reflects pg_textsearch as of January 10, 2026. The project is under active development with performance improvements shipping regularly. Check the benchmark dashboard for the latest numbers.

Dataset: MS MARCO 8.8M passages | Date: 2026-01-10 | Commit: 3eb3a6f

Current Status

pg_textsearch today

  • Smaller index (no positions stored)*
  • Faster median (p50) latency on short queries (1-6 tokens)
  • Better p95 latency for simple queries (1-3 tokens)
  • Native Postgres integration

System X (baseline)

  • Faster index build (for now)
  • Faster on complex queries (7+ tokens at p50, 4+ tokens at p95)
  • Phrase queries supported
  • Larger feature set (facets, etc.)

Coming Soon

Index Size & Build Time

Metric        pg_textsearch    System X     Difference
Index Size    1269 MB          1501 MB      -15%
Build Time    517.8 sec        135.8 sec    +281%
Documents     8,841,770        8,841,823    -

* Index size caveat: pg_textsearch does not store term positions, so it cannot support phrase queries like "quick brown fox". System X stores positions by default, which adds significant overhead but enables phrase search. This accounts for most of the index size difference: it's a feature tradeoff, not a compression advantage.

In progress: Parallel index build (PR #125) will significantly reduce build time. Currently pg_textsearch builds single-threaded.
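
The position tradeoff in the caveat above can be illustrated with a toy inverted index. This is a minimal sketch, not either system's storage format: the function names and the three sample documents are hypothetical, and storing a per-document term count in the position-free index is an assumption made for illustration.

```python
def build_index(docs, with_positions):
    """Toy inverted index: term -> {doc_id: positions} or term -> {doc_id: count}."""
    index = {}
    for doc_id, text in enumerate(docs):
        for pos, token in enumerate(text.lower().split()):
            postings = index.setdefault(token, {})
            if with_positions:
                postings.setdefault(doc_id, []).append(pos)  # extra data per occurrence
            else:
                postings[doc_id] = postings.get(doc_id, 0) + 1  # count only
    return index


def all_terms_docs(index, query):
    """Docs containing every query term, in any order (works without positions)."""
    doc_sets = [set(index.get(tok, {})) for tok in query.lower().split()]
    return sorted(set.intersection(*doc_sets)) if doc_sets else []


def phrase_docs(pos_index, phrase):
    """Docs where the terms appear consecutively (requires stored positions)."""
    tokens = phrase.lower().split()
    hits = []
    for doc_id, starts in pos_index.get(tokens[0], {}).items():
        for start in starts:
            if all(start + i in pos_index.get(tok, {}).get(doc_id, ())
                   for i, tok in enumerate(tokens[1:], start=1)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical sample documents
docs = ["the quick brown fox", "brown fox the quick", "quick fox brown"]
freq_index = build_index(docs, with_positions=False)
pos_index = build_index(docs, with_positions=True)

print(all_terms_docs(freq_index, "quick brown fox"))
print(phrase_docs(pos_index, "quick brown fox"))
```

Without positions, all three sample documents match the terms; only the positional index can tell that just the first one contains the exact phrase.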

Query Latency (p50)

Median latency in milliseconds. Lower is better.

Query Tokens    pg_textsearch    System X    Difference
1 token         12.34 ms         19.96 ms    -38%
2 tokens        12.91 ms         18.92 ms    -32%
3 tokens        15.72 ms         24.63 ms    -36%
4 tokens        20.97 ms         26.03 ms    -19%
5 tokens        27.50 ms         28.78 ms    -4%
6 tokens        33.84 ms         36.13 ms    -6%
7 tokens        43.41 ms         34.33 ms    +26%
8+ tokens       68.30 ms         41.74 ms    +64%
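
The Difference column is consistent with a relative difference against System X as the baseline, rounded to the nearest percent. The report does not state the formula, so the sketch below is an inference from the published values; the function name is hypothetical.

```python
def pct_difference(pg_value, baseline_value):
    """Relative difference of pg_textsearch vs the System X baseline, in whole percent.

    Negative means pg_textsearch is lower (faster or smaller).
    """
    return round((pg_value - baseline_value) / baseline_value * 100)


# A few rows from the p50 table: (pg_textsearch ms, System X ms)
rows = {
    "1 token": (12.34, 19.96),
    "7 tokens": (43.41, 34.33),
    "8+ tokens": (68.30, 41.74),
}
for label, (pg_ms, sx_ms) in rows.items():
    print(f"{label}: {pct_difference(pg_ms, sx_ms):+d}%")
```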

Query Latency (p95)

95th percentile latency in milliseconds. Lower is better.

Query Tokens    pg_textsearch    System X    Difference
1 token         15.13 ms         29.84 ms    -49%
2 tokens        17.31 ms         34.89 ms    -50%
3 tokens        28.82 ms         35.64 ms    -19%
4 tokens        48.37 ms         37.63 ms    +29%
5 tokens        60.52 ms         40.18 ms    +51%
6 tokens        72.74 ms         50.21 ms    +45%
7 tokens        81.19 ms         57.60 ms    +41%
8+ tokens       129.47 ms        62.62 ms    +107%
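
For readers comparing the p50 and p95 tables, here is a minimal sketch of how the two figures can be derived from raw per-query timings. The nearest-rank method and the sample latencies are assumptions; the report does not say which percentile definition the benchmark harness uses.

```python
def percentile_nearest_rank(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n) in sorted order.

    p is an integer percentage (e.g. 50 or 95).
    """
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceiling of p*n/100
    return ordered[rank - 1]


# Hypothetical latencies in ms for one query-length bucket, with one slow outlier
latencies_ms = [12.1, 12.4, 11.9, 13.0, 12.2, 12.6, 30.5, 12.3, 12.0, 12.5]

p50 = percentile_nearest_rank(latencies_ms, 50)
p95 = percentile_nearest_rank(latencies_ms, 95)
print(f"p50={p50} ms, p95={p95} ms")
```

A single slow query barely moves the median but dominates the 95th percentile, which is why the two tables can rank the systems differently for the same query length.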

Throughput

Total time to execute 800 test queries sequentially.

Metric          pg_textsearch    System X
Total time      27.07 sec        24.20 sec
Avg ms/query    33.84 ms         30.25 ms
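
The average row follows directly from the totals and the 800-query count. A small sketch of that arithmetic (the function name is hypothetical):

```python
N_QUERIES = 800  # number of test queries stated in the report


def avg_ms_per_query(total_sec, n_queries=N_QUERIES):
    """Convert a total sequential run time in seconds to average ms per query."""
    return total_sec * 1000.0 / n_queries


pg_avg = avg_ms_per_query(27.07)
sx_avg = avg_ms_per_query(24.20)
overall_pct = (27.07 - 24.20) / 24.20 * 100  # pg_textsearch vs System X total time

print(f"pg_textsearch: {pg_avg:.2f} ms/query, System X: {sx_avg:.2f} ms/query")
print(f"Total time difference: {overall_pct:+.0f}%")
```

Across the whole query mix, pg_textsearch takes roughly 12% more total time than System X, which is the per-bucket wins on short queries being outweighed by the slower long queries.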

Analysis

Short queries (1-6 tokens): pg_textsearch wins

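pg_textsearch's Block-Max WAND implementation with compressed posting lists (delta encoding + bitpacking) excels at pruning non-competitive documents early. The 15% smaller index also reduces I/O overhead. The advantage holds across 1-6 tokens at p50; at p95 it holds only for 1-3 tokens, and System X has lower tail latency from 4 tokens up.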
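
To make "delta encoding + bitpacking" concrete, the sketch below compresses one block of sorted document IDs. It is an illustration of the general technique only: the layout (a base ID plus fixed-width gaps packed little-endian) and all function names are assumptions, not pg_textsearch's actual on-disk format.

```python
def delta_encode(doc_ids):
    """Split sorted doc IDs into a base ID and the gaps between neighbours."""
    base = doc_ids[0]
    gaps = [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    return base, gaps


def delta_decode(base, gaps):
    """Rebuild the doc IDs by running-summing the gaps onto the base."""
    doc_ids = [base]
    for gap in gaps:
        doc_ids.append(doc_ids[-1] + gap)
    return doc_ids


def bitpack(values):
    """Pack non-negative ints using the minimum fixed bit width for the block."""
    width = max(1, max(values, default=0).bit_length())
    packed = 0
    for i, value in enumerate(values):
        packed |= value << (i * width)
    n_bytes = (width * len(values) + 7) // 8
    return width, packed.to_bytes(n_bytes, "little")


def bitunpack(width, data, count):
    """Inverse of bitpack: extract `count` values of `width` bits each."""
    packed = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


# Hypothetical block of sorted doc IDs
doc_ids = [1000, 1003, 1010, 1011, 1040]
base, gaps = delta_encode(doc_ids)
width, data = bitpack(gaps)
restored = delta_decode(base, bitunpack(width, data, len(gaps)))

print(f"gaps={gaps}, width={width} bits, packed into {len(data)} bytes")
```

Small gaps need far fewer bits than full document IDs, so the four gaps here fit in 3 bytes instead of four 32-bit integers.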

Long queries (7+ tokens): Work in progress

pg_textsearch and System X both use the Block-Max WAND algorithm, so the current gap on longer queries (+26% at 7 tokens and +64% at 8+ tokens at p50) reflects implementation maturity, not algorithmic limitations. This is an active optimization target.
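
For readers unfamiliar with the algorithm family, the sketch below shows the block-max upper-bound idea in isolation: each posting list keeps a maximum score per block, and a block is skipped when the summed maxima cannot beat the current top-k threshold. It is deliberately simplified (doc-ID-aligned blocks, no WAND pivot selection, precomputed per-term scores) and is not either system's implementation; all names and sample scores are hypothetical.

```python
import heapq


def build_block_max(posting, block_size):
    """Per-block maximum score for one term's posting list {doc_id: score}."""
    block_max = {}
    for doc_id, score in posting.items():
        block = doc_id // block_size
        block_max[block] = max(block_max.get(block, 0.0), score)
    return block_max


def top_k_with_block_skipping(postings, k, block_size=4):
    """Return (top-k [(score, doc_id)], number of documents fully scored)."""
    block_maxes = [build_block_max(p, block_size) for p in postings.values()]
    blocks = sorted({b for bm in block_maxes for b in bm})
    heap = []  # min-heap holding the current top-k as (score, doc_id)
    scored = 0
    for block in blocks:
        upper_bound = sum(bm.get(block, 0.0) for bm in block_maxes)
        if len(heap) == k and upper_bound <= heap[0][0]:
            continue  # no document in this block can enter the top-k
        lo, hi = block * block_size, (block + 1) * block_size
        docs = sorted({d for p in postings.values() for d in p if lo <= d < hi})
        for doc_id in docs:
            score = sum(p.get(doc_id, 0.0) for p in postings.values())
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), scored


# Hypothetical per-term scores: term -> {doc_id: score contribution}
postings = {
    "quick": {0: 2.0, 1: 0.5, 5: 0.25, 9: 0.125, 13: 0.125},
    "fox": {0: 1.5, 2: 0.25, 6: 0.125, 10: 0.125, 14: 0.25},
}
top, scored = top_k_with_block_skipping(postings, k=2)
print(top, f"scored {scored} of 9 documents")
```

In this sample, the first block fills the top-2, and every later block has an upper bound below the threshold, so only 3 of the 9 documents are scored.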

Methodology

Both extensions were benchmarked on identical GitHub Actions runners with the same Postgres configuration. See full methodology for details.

Note: Both extensions use default configurations. Results are from a single run; expect ~10% variance between runs. This page will be updated as optimizations land.
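
One way to read the tables in light of the ~10% run-to-run variance is to flag which differences fall inside that band. The sketch below treats 10% as a hard cutoff, which is an assumption made for illustration; the report gives it only as an approximate figure from a single run.

```python
NOISE_PCT = 10  # approximate run-to-run variance stated in the note above

# Difference column values (percent) from the p50 and p95 tables
p50_diff = {"1": -38, "2": -32, "3": -36, "4": -19, "5": -4, "6": -6, "7": 26, "8+": 64}
p95_diff = {"1": -49, "2": -50, "3": -19, "4": 29, "5": 51, "6": 45, "7": 41, "8+": 107}


def within_noise(diffs, noise_pct=NOISE_PCT):
    """Token buckets whose difference is no larger than the noise band."""
    return [tokens for tokens, pct in diffs.items() if abs(pct) <= noise_pct]


print("p50 buckets within noise:", within_noise(p50_diff))
print("p95 buckets within noise:", within_noise(p95_diff))
```

Under that assumption, the p50 results for 5 and 6 tokens are too close to call from a single run, while every p95 difference exceeds the band.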