Benchmarks

PerfLocale overhead measured on our test setup. Real-world numbers depend on your hardware, caching, and other plugins.

~ 1.2 ms

PerfLocale overhead, typical (median) request

Measured across 216 requests: 2 site topologies (single-site + multisite) × 6 configuration combos × 18 page types, each recorded as the median of 10 warm samples. 100% of requests land under 5 ms overhead - single-site max 3.51 ms, multisite-subdir max 4.42 ms, zero outliers in either topology.

The distribution

We built a profiler (a WordPress mu-plugin that wraps every callback attached to a hook whose target class lives in the PerfLocale\ namespace) and measured the wall-clock time the plugin's code consumes on each request. We also report the profiler's own overhead in a separate header, so the raw cost (what the plugin ACTUALLY does) and the measurement artifact (what the profiler ADDS to the request) can be separated.
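The wrap-and-time idea can be sketched in a few lines. The actual profiler is a WordPress mu-plugin; the Python below is only a language-neutral illustration, and the names `HookProfiler`, `wrap`, and `in_perflocale_namespace` are hypothetical, not part of the plugin.

```python
import time
from typing import Callable


def in_perflocale_namespace(class_name: str) -> bool:
    """True when a fully qualified class name lives under PerfLocale\\."""
    return class_name.startswith("PerfLocale\\")


class HookProfiler:
    """Tracks time spent inside wrapped callbacks separately from the
    time the wrapper itself adds around each call."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self.clock = clock
        self.plugin_seconds = 0.0    # time inside the wrapped callbacks
        self.profiler_seconds = 0.0  # bookkeeping time added by the wrapper

    def wrap(self, callback: Callable) -> Callable:
        def timed(*args, **kwargs):
            enter = self.clock()   # outer timestamp: includes wrapper cost
            start = self.clock()   # inner timestamp: callback only
            try:
                return callback(*args, **kwargs)
            finally:
                stop = self.clock()
                self.plugin_seconds += stop - start
                leave = self.clock()
                # Approximation: whatever the outer span contains beyond
                # the callback is attributed to the profiler itself.
                self.profiler_seconds += (leave - enter) - (stop - start)
        return timed
```

Keeping the two counters apart is what lets the measurement artifact be reported on its own rather than folded into the plugin's cost.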

The tables below show the clean number - raw plugin wall-clock time with profiler self-time subtracted. Each table has 108 samples: 6 configs × 18 URLs, where each sample is the median of 10 warm requests (taken after 3 throwaway warmup hits per URL to eliminate cold-cache artifacts).
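The sampling protocol for one URL can be written down as a minimal sketch. It assumes a hypothetical `fetch` callable that returns the measured plugin time and the profiler self-time in milliseconds for one request; how those two values are read from the response is not shown here.

```python
from statistics import median
from typing import Callable, Tuple

WARMUP_HITS = 3    # throwaway requests per URL
WARM_SAMPLES = 10  # measured requests per URL

# Sample counts as described in the text.
CONFIGS, URLS, TOPOLOGIES = 6, 18, 2
SAMPLES_PER_TABLE = CONFIGS * URLS              # 108
TOTAL_SAMPLES = SAMPLES_PER_TABLE * TOPOLOGIES  # 216


def clean_overhead_ms(fetch: Callable[[str], Tuple[float, float]], url: str) -> float:
    """Return one table sample: the median of the warm, cleaned timings."""
    for _ in range(WARMUP_HITS):
        fetch(url)  # result discarded: only here to warm caches
    cleaned = []
    for _ in range(WARM_SAMPLES):
        measured_ms, profiler_ms = fetch(url)
        cleaned.append(measured_ms - profiler_ms)  # subtract profiler self-time
    return median(cleaned)
```

Each call issues 13 requests but contributes a single data point, which is why 6 × 18 URLs yields 108 samples per table.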

Normal WordPress (single-site install)

Site	Samples	Mean	p50	p95	Max	Over 5 ms
Single-site (standard WP install)	108	1.72 ms	1.44 ms	3.26 ms	3.51 ms	0 / 108 (100% under)

On a standard WordPress install, 100% of requests stay under 5 ms PerfLocale overhead. The slowest observation across 108 samples was 3.51 ms. p95 is 3.26 ms; typical requests land at ~1.4 ms. Confirmed stable across two back-to-back canonical benchmark runs.
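For readers who want to summarize their own measurements in the same columns, here is a minimal sketch. The percentile method used for the published tables is not stated; nearest-rank is an assumption made here, and `summarize` is a hypothetical helper.

```python
import math
from statistics import mean, median
from typing import Dict, List


def percentile_nearest_rank(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of
    the samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples_ms: List[float], budget_ms: float = 5.0) -> Dict[str, float]:
    """Produce the table columns: Samples, Mean, p50, p95, Max, Over budget."""
    return {
        "samples": len(samples_ms),
        "mean": mean(samples_ms),
        "p50": median(samples_ms),
        "p95": percentile_nearest_rank(samples_ms, 95),
        "max": max(samples_ms),
        "over_budget": sum(1 for s in samples_ms if s > budget_ms),
    }
```

Other percentile definitions (for example linear interpolation) give slightly different p95 values on small sample sets, so compare like with like.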

WordPress Multisite (subdirectory mode)

Multisite carries more inherent per-request work - network-level hooks, blog-switching, cross-blog plugin stacks. We benchmarked on a subdirectory-based multisite because that's the more common configuration; subdomain-based multisites produce similar numbers (within ~0.5 ms on p50, within ~1 ms on max).

Site	Samples	Mean	p50	p95	Max	Over 5 ms
Multisite (subdirectories)	108	1.23 ms	0.93 ms	3.71 ms	4.42 ms	0 / 108 (100% under)

On multisite-subdirectory, 100% of requests stay under 5 ms, with the slowest observation at 4.42 ms. p50 is sub-millisecond (0.93 ms), so the typical request is essentially free on top of WordPress's own hook chain. Before the deep-dive sweep (see v2/v3 baseline), this topology showed 6–7 requests per 108 over 5 ms; batched slug priming across all active languages and webhook class-load deferral were the two critical fixes that cleared the tail.

Tested across 6 configurations

Overhead does vary meaningfully by configuration. The big divide is between hide_default_prefix=off (fast) and hide_default_prefix=on (slower). Choose accordingly for your site. Numbers below are split by topology - single-site first, then multisite - so you can see what each config costs in your own environment.

Per-config p50 / p95 - Single-site

Combination	Prefix	Hide default	String mode	Fallback	p50	p95
`locale_prefix`	locale	off	files	off	1.35 ms	1.86 ms
`fallback_chain`	slug	off	files	on	1.38 ms	3.03 ms
`db_strings`	slug	off	database	off	1.39 ms	3.00 ms
`baseline`	slug	off	files	off	1.44 ms	3.12 ms
`db_hide_default`	slug	on	database	off	2.67 ms	3.14 ms
`hide_default`	slug	on	files	off	2.75 ms	3.50 ms

Per-config p50 / p95 - Multisite (subdirectories)

Combination	Prefix	Hide default	String mode	Fallback	p50	p95
`baseline`	slug	off	files	off	0.80 ms	3.07 ms
`fallback_chain`	slug	off	files	on	0.80 ms	3.38 ms
`db_strings`	slug	off	database	off	0.89 ms	3.37 ms
`locale_prefix`	locale	off	files	off	0.94 ms	1.25 ms
`hide_default`	slug	on	files	off	1.77 ms	3.33 ms
`db_hide_default`	slug	on	database	off	1.83 ms	3.20 ms

Takeaway: locale_prefix (always-prefixed URLs) has the tightest tail - p95 of 1.86 ms on single-site and 1.25 ms on multisite - and the lowest p50 on single-site; on multisite, baseline and fallback_chain have a slightly lower p50 (0.80 ms vs 0.94 ms). The hide_default_prefix=on family is roughly 1.0–1.3 ms slower at the median because every request has to canonicalize the URL: does it belong to the default language, or does the URL converter need to rewrite it? Both options are fine for real sites; just know the trade-off.
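The median gap between the two families can be checked directly from the p50 columns above. This sketch only averages the published per-config p50 values; the helper name `family_gap_ms` is made up for illustration.

```python
from statistics import mean
from typing import Dict, Tuple

# (hide_default setting, p50 in ms), copied from the per-config tables.
SINGLE_SITE: Dict[str, Tuple[str, float]] = {
    "locale_prefix": ("off", 1.35),
    "fallback_chain": ("off", 1.38),
    "db_strings": ("off", 1.39),
    "baseline": ("off", 1.44),
    "db_hide_default": ("on", 2.67),
    "hide_default": ("on", 2.75),
}
MULTISITE: Dict[str, Tuple[str, float]] = {
    "baseline": ("off", 0.80),
    "fallback_chain": ("off", 0.80),
    "db_strings": ("off", 0.89),
    "locale_prefix": ("off", 0.94),
    "hide_default": ("on", 1.77),
    "db_hide_default": ("on", 1.83),
}


def family_gap_ms(table: Dict[str, Tuple[str, float]]) -> float:
    """Mean p50 of hide_default=on configs minus mean p50 of the off configs."""
    on = [p50 for setting, p50 in table.values() if setting == "on"]
    off = [p50 for setting, p50 in table.values() if setting == "off"]
    return mean(on) - mean(off)


single_gap = family_gap_ms(SINGLE_SITE)
multi_gap = family_gap_ms(MULTISITE)
```

With the table values this gives a gap of about 1.32 ms on single-site and about 0.94 ms on multisite.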

Does it stay fast at scale?

We seeded 10,000 posts (5,000 English + 5,000 German) on each of the 3 test sites and re-measured. Internal hot-path operations stay sub-millisecond even at 10× the data volume:

Internal hot-path operation	Mean at 10K scale	DB queries	Notes
`detect_post_language()` warm	0.0001 ms	0	Static-cache hit.
`WP_Query(lang=de, per_page=20)`	0.07 ms	0	Archive listings are fully cached.
`prime_translations(50 IDs)`	0.013 ms	0	Batch API used by listing loops.
`StringTranslationRepository::get_map(200 IDs)`	~0.05 ms	1	One SELECT for 200 strings.
Frontend overhead at 10K posts	Same as baseline	-	Data volume doesn't move the needle.
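The "one SELECT for 200 strings" row describes a batched lookup. As a rough illustration of that pattern (not the plugin's actual code or schema), the sketch below uses Python's built-in `sqlite3` with a hypothetical `string_translations` table and counts the SELECT statements issued.

```python
import sqlite3
from typing import Dict, Iterable, Tuple


def get_map(conn: sqlite3.Connection, string_ids: Iterable[int], lang: str) -> Dict[int, str]:
    """Fetch translations for many string IDs with a single SELECT."""
    ids = list(string_ids)
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)  # one bound parameter per ID
    rows = conn.execute(
        "SELECT string_id, translation FROM string_translations "
        f"WHERE lang = ? AND string_id IN ({placeholders})",
        [lang, *ids],
    )
    return dict(rows.fetchall())


def demo() -> Tuple[int, int]:
    """Seed 200 strings, look them all up, and count SELECTs issued."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE string_translations "
        "(string_id INTEGER, lang TEXT, translation TEXT)"
    )
    conn.executemany(
        "INSERT INTO string_translations VALUES (?, ?, ?)",
        [(i, "de", f"text-{i}") for i in range(1, 201)],
    )
    statements = []
    conn.set_trace_callback(statements.append)  # record each executed statement
    result = get_map(conn, range(1, 201), "de")
    conn.set_trace_callback(None)
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    conn.close()
    return len(result), len(selects)
```

The alternative, one query per string, would issue 200 round-trips for the same result, which is the cost the batch API avoids.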

What this doesn't mean

This is the overhead PerfLocale adds to a request, not the total request time. WordPress itself, your theme, other plugins, database round-trips, PHP boot - all of that is still there.
Benchmarks on one hardware configuration don't transfer 1:1 to another. A host with slow MySQL or missing OPcache shows worse numbers across the board - for PerfLocale and everything else.
Real production sites keep caches warm all day (via CDN cache-tag purging), so the tail you see here (p95 of 3.26 ms on single-site and 3.71 ms on multisite, max 4.42 ms) is rarer in practice.