Current Status
pg_textsearch today
- Smaller index (no positions stored)*
- Faster median latency on short queries (1-6 tokens)
- Better p95 latency on the shortest queries (1-3 tokens)
- Native Postgres integration
System X (baseline)
- Faster index build (for now)
- Faster on long queries (7+ tokens at p50, 4+ tokens at p95)
- Phrase queries supported
- Larger feature set (facets, etc.)
Coming Soon
- Parallel index build - Will close the build time gap (PR #125)
- Long query optimization - Tuning BMW for 7+ token queries
- Tail latency improvements - Reducing p95/p99 variance
Index Size & Build Time
| Metric | pg_textsearch | System X | Difference |
|---|---|---|---|
| Index Size | 1269 MB | 1501 MB | -15% |
| Build Time | 517.8 sec | 135.8 sec | +281% |
| Documents | 8,841,770 | 8,841,823 | - |
* pg_textsearch does not store term positions, so it cannot serve phrase
queries such as "quick brown fox". System X stores positions by default, which
adds significant overhead but enables phrase search. This accounts for most of
the index size difference; it's a feature tradeoff, not a compression advantage.
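The Difference column is not defined on this page. The figures are consistent with a relative difference that uses System X as the baseline, so the sketch below assumes that convention. The helper name `pct_difference` is illustrative only.

```python
def pct_difference(pg_textsearch: float, system_x: float) -> int:
    """Percent difference of pg_textsearch relative to System X, rounded.

    Assumption: System X is the baseline (denominator). This is inferred
    from the published numbers, not stated in the source.
    """
    return round((pg_textsearch - system_x) / system_x * 100)


index_size_diff = pct_difference(1269, 1501)    # index size in MB
build_time_diff = pct_difference(517.8, 135.8)  # build time in seconds

print(f"Index size: {index_size_diff:+d}%")
print(f"Build time: {build_time_diff:+d}%")
```

Under this convention a negative value means pg_textsearch is smaller or faster, and a positive value means it is larger or slower.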
Query Latency (p50)
Median latency in milliseconds. Lower is better.
| Query Tokens | pg_textsearch | System X | Difference |
|---|---|---|---|
| 1 token | 12.34 ms | 19.96 ms | -38% |
| 2 tokens | 12.91 ms | 18.92 ms | -32% |
| 3 tokens | 15.72 ms | 24.63 ms | -36% |
| 4 tokens | 20.97 ms | 26.03 ms | -19% |
| 5 tokens | 27.50 ms | 28.78 ms | -4% |
| 6 tokens | 33.84 ms | 36.13 ms | -6% |
| 7 tokens | 43.41 ms | 34.33 ms | +26% |
| 8+ tokens | 68.30 ms | 41.74 ms | +64% |
Query Latency (p95)
95th percentile latency in milliseconds. Lower is better.
| Query Tokens | pg_textsearch | System X | Difference |
|---|---|---|---|
| 1 token | 15.13 ms | 29.84 ms | -49% |
| 2 tokens | 17.31 ms | 34.89 ms | -50% |
| 3 tokens | 28.82 ms | 35.64 ms | -19% |
| 4 tokens | 48.37 ms | 37.63 ms | +29% |
| 5 tokens | 60.52 ms | 40.18 ms | +51% |
| 6 tokens | 72.74 ms | 50.21 ms | +45% |
| 7 tokens | 81.19 ms | 57.60 ms | +41% |
| 8+ tokens | 129.47 ms | 62.62 ms | +107% |
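For readers comparing the p50 and p95 tables, the following is a minimal sketch of how such percentiles can be derived from per-query timings. It assumes the nearest-rank method. The benchmark's actual percentile method is not stated here, and the sample latencies are made up for illustration.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms for one token bucket (not benchmark data).
latencies_ms = [12.1, 12.4, 11.9, 13.0, 12.2, 30.5, 12.3, 12.0, 12.6, 12.8]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"p50 = {p50} ms, p95 = {p95} ms")
```

In this made-up sample a single slow query leaves the median untouched but sets the p95, which is why the two tables can rank the systems differently for the same token count.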
Throughput
Total time to execute 800 test queries sequentially.
| Metric | pg_textsearch | System X |
|---|---|---|
| Total time | 27.07 sec | 24.20 sec |
| Avg ms/query | 33.84 ms | 30.25 ms |
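The average row follows directly from the total time and the query count. The short check below reproduces it and also derives the overall gap, which the table does not list. The relative-to-System-X convention for that gap is an assumption carried over from the Difference columns elsewhere on this page.

```python
QUERY_COUNT = 800  # number of test queries, from the description above


def avg_ms_per_query(total_sec: float, queries: int = QUERY_COUNT) -> float:
    """Average latency in milliseconds for a sequential run."""
    return round(total_sec * 1000 / queries, 2)


pg_avg = avg_ms_per_query(27.07)
sx_avg = avg_ms_per_query(24.20)

# Derived figure (not in the table): overall gap relative to System X.
overall_gap_pct = round((27.07 - 24.20) / 24.20 * 100)

print(f"pg_textsearch: {pg_avg} ms/query")
print(f"System X:      {sx_avg} ms/query")
print(f"Overall gap:   {overall_gap_pct:+d}%")
```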
Analysis
Short queries (1-6 tokens): pg_textsearch wins
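pg_textsearch's Block-Max WAND implementation with compressed posting lists (delta encoding + bitpacking) excels at pruning non-competitive documents early. The 15% smaller index size also reduces I/O overhead. The advantage is clearest at the median: at p50 it holds through 6-token queries, while at p95 it holds only through 3-token queries.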
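To illustrate what "delta encoding + bitpacking" means for a posting list, here is a minimal sketch of the general technique. It is not pg_textsearch's on-disk format: the block layout, the function names, and the choice to store the first doc ID separately are all assumptions made for the example.

```python
def encode_block(doc_ids: list[int]) -> tuple[int, int, int, bytes]:
    """Encode a sorted, non-empty block of doc IDs.

    Returns (base, bit_width, gap_count, payload): the first doc ID is kept
    as-is, and the gaps between consecutive IDs are packed at a fixed width.
    """
    base = doc_ids[0]
    gaps = [b - a for a, b in zip(doc_ids, doc_ids[1:])]  # delta encoding
    width = max(1, max(gaps, default=0).bit_length())     # bits per gap
    packed = 0
    for i, gap in enumerate(gaps):                         # bitpacking
        packed |= gap << (i * width)
    payload = packed.to_bytes((len(gaps) * width + 7) // 8, "little")
    return base, width, len(gaps), payload


def decode_block(base: int, width: int, count: int, payload: bytes) -> list[int]:
    """Inverse of encode_block: unpack the gaps and rebuild the doc IDs."""
    packed = int.from_bytes(payload, "little")
    mask = (1 << width) - 1
    doc_ids = [base]
    for i in range(count):
        doc_ids.append(doc_ids[-1] + ((packed >> (i * width)) & mask))
    return doc_ids


doc_ids = [1000, 1003, 1004, 1010, 1025, 1026, 1040, 1041]
base, width, count, payload = encode_block(doc_ids)
print(f"{width} bits per gap, {len(payload)} payload bytes")
assert decode_block(base, width, count, payload) == doc_ids
```

Because the IDs are sorted, the gaps are small and need far fewer bits than the raw IDs, which is the general reason this scheme shrinks posting lists.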
Long queries (7+ tokens): Work in progress
Both pg_textsearch and System X use the same Block-Max WAND algorithm. The current performance gap on longer queries reflects implementation maturity, not algorithmic limitations. This is an active optimization target.
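For readers unfamiliar with Block-Max WAND, the block-level pruning idea can be sketched as follows. This is a deliberately simplified illustration, not either system's implementation: it scans one pre-scored list of blocks and omits WAND's pivot selection across multiple term lists. The `Block` type and `top_k` function are hypothetical names.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Block:
    max_score: float                   # upper bound on any score in the block
    postings: list[tuple[int, float]]  # (doc_id, score) pairs


def top_k(blocks: list[Block], k: int) -> tuple[list[tuple[float, int]], int]:
    """Return the k best (score, doc_id) pairs and the number of skipped blocks."""
    heap: list[tuple[float, int]] = []  # min-heap holding the current top k
    skipped = 0
    for block in blocks:
        # The threshold is the k-th best score so far (none until the heap is full).
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        if block.max_score <= threshold:
            skipped += 1                # no document in this block can enter the top k
            continue
        for doc_id, score in block.postings:
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), skipped


blocks = [
    Block(5.0, [(1, 5.0), (2, 3.0)]),
    Block(2.5, [(10, 2.5), (11, 1.0)]),  # upper bound below the threshold: skipped
    Block(4.0, [(20, 4.0), (21, 0.5)]),
]
results, skipped = top_k(blocks, k=2)
print(results, skipped)
```

The stored per-block maximum lets the scan discard a whole block without decoding or scoring its postings once the top-k threshold exceeds that maximum.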
Methodology
Both extensions were benchmarked on identical GitHub Actions runners with the same Postgres configuration. See full methodology for details.