Laplace mechanism
anon.ldp_laplace adds zero-mean Laplace noise to a numeric value,
calibrated to the column's public range. The arguments lo and hi are
the public lower and upper bounds you commit to for the column — anything
the analyst could already say about the column without looking at the
data (e.g., a rating column is in $[1, 5]$, a wait-time column is in $[0, 600]$
seconds). They drive the sensitivity $\Delta = hi - lo$, so tighter bounds give less
noise. Each call gives an ε-DP release with no failure
parameter. anon.dp_laplace_avg is the central-DP counterpart, applied
to a mean rather than per-row.
Both extend the Adding Noise category in the masking-functions catalog.
Use as a masking rule
Attach ldp_laplace to a numeric column with a security label, as shown
in Declare Masking Rules:
SECURITY LABEL FOR anon ON COLUMN responses.rating
IS 'MASKED WITH FUNCTION anon.ldp_laplace(rating, 1.0, 1, 5)';
SECURITY LABEL FOR anon ON COLUMN responses.wait_seconds
IS 'MASKED WITH FUNCTION anon.ldp_laplace(wait_seconds, 1.0, 0, 600)';
Apply via static masking or an anonymous dump:
SELECT anon.anonymize_table('public.responses');
The original values are overwritten in place with fresh Laplace draws. See Security & limitations before applying this through dynamic masking.
When to use it
Laplace fits numeric columns: ratings, counts, prices, durations, any value bounded by a public range. Use it when you need pure ε-DP without a failure parameter. For categorical columns use GRRM.
The same noise distribution covers two very different setups. Choose
the per-row form when raw values can't be entrusted to a central party:
rows are noised independently before any aggregation. Choose
dp_laplace_avg when a curator can see raw values and only the
released aggregate is public; the noise is added once, to the
aggregate, and the released mean is roughly $\sqrt{n}$ times tighter at the same ε. See
Central DP for means below.
Per-row LDP
Per-row LDP applies the noise to each row at the trust boundary — either on insert (the database never stores raw values) or on release (raw values are stored but masked on each read). Sensitivity is the full public range, so the noise per row is large.
-- Inspect the function on a few rows:
SELECT anon.ldp_laplace(rating, 1.0, 1, 5) AS noisy_rating
FROM responses
LIMIT 5;
| call | sensitivity | scale |
|---|---|---|
| anon.ldp_laplace(value, ε, lo, hi) | $\Delta = hi - lo$ | $b = \Delta / \varepsilon$ |
The default behavior returns the raw noisy value. Pass clamp => true to
round and clip into $[lo, hi]$. Clamping helps when the
output feeds a typed column with a check constraint, but it biases values
near the boundary.
SELECT anon.ldp_laplace(rating, 1.0, 1, 5, clamp => true)
FROM responses;
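To make the per-row mechanics and the clamping bias concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the extension's implementation: the function name `ldp_laplace_sketch` is hypothetical, the noise is drawn as the difference of two exponentials, and clamping is assumed to mean "round to the nearest integer, then clip to the public range".

```python
import random

def ldp_laplace_sketch(value, epsilon, lo, hi, clamp=False, rng=random):
    """Hypothetical stand-in for a per-row Laplace release (not the extension's code)."""
    scale = (hi - lo) / epsilon  # b = sensitivity / epsilon, with sensitivity = hi - lo
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    noisy = value + noise
    if clamp:
        # Assumed clamp semantics: round, then clip into [lo, hi].
        noisy = min(hi, max(lo, round(noisy)))
    return noisy

rng = random.Random(0)  # fixed seed so the illustration is repeatable
raw = [ldp_laplace_sketch(5, 1.0, 1, 5, rng=rng) for _ in range(20_000)]
clamped = [ldp_laplace_sketch(5, 1.0, 1, 5, clamp=True, rng=rng) for _ in range(20_000)]

raw_mean = sum(raw) / len(raw)              # stays close to the true value 5
clamped_mean = sum(clamped) / len(clamped)  # pulled well below 5 by the upper bound
```

The true value sits on the upper bound, so every draw that would land above 5 is clipped back while draws below 5 are kept. That asymmetry is the boundary bias mentioned above.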
Central DP for means
dp_laplace_avg is a first prototype of the central-DP Laplace mechanism
specialized to one query: the arithmetic mean of a bounded numeric
column. Laplace generalizes to any query for which you can bound a
global sensitivity — counts, sums, quantiles, regression
coefficients. The mechanics are always the same (add $\mathrm{Lap}(\Delta/\varepsilon)$ noise); the hard part is deriving the sensitivity $\Delta$ for the
specific query. dp_laplace_avg does that derivation for the mean and
hands you the result.
Sensitivity of a mean over $n$ rows is $(hi - lo)/n$, so the scale is $n$ times smaller than per-row LDP at the same ε.
SELECT anon.dp_laplace_avg(
AVG(wait_seconds)::float8,
0.5, -- epsilon
0, 600, -- public range
COUNT(*)::int -- public n
) AS private_mean
FROM responses;
For a table of $n$ rows with ε=0.5 and range $[0, 600]$, the scale is $b = 600 / (0.5\,n) = 1200 / n$.
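The scale for the mean follows from a short sensitivity calculation. The sketch below assumes that neighboring datasets differ in the value of a single row and that $n$ is fixed and public; the notation $x \sim x'$ for neighboring datasets is introduced here for illustration.

```latex
\Delta_{\text{mean}}
  = \max_{x \sim x'} \left| \frac{1}{n}\sum_{i=1}^{n} x_i - \frac{1}{n}\sum_{i=1}^{n} x'_i \right|
  = \frac{hi - lo}{n},
\qquad
b = \frac{\Delta_{\text{mean}}}{\varepsilon} = \frac{hi - lo}{n\,\varepsilon}.
```

Changing one row moves the sum by at most $hi - lo$, and dividing by $n$ gives the bound. As an illustrative plug-in, $hi - lo = 600$, $\varepsilon = 0.5$ and $n = 1000$ give $b = 1.2$.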
dp_laplace_avg operates on an aggregate, not a column, so it does not
fit a per-column SECURITY LABEL. Wrap it in a masking view to publish
a private aggregate.
If the row count is itself sensitive, pass a public lower bound n_min
instead of COUNT(*):
SELECT anon.dp_laplace_avg(AVG(wait_seconds)::float8, 0.5, 0, 600,
n_min => 1000)
FROM responses;
Per-row LDP vs central DP
Both paths land on the same mean and spend the same ε. The noise scale is not the same.
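| Approach | Per-call scale | Std error of the released mean |
|---|---|---|
| Per-row LDP, then average | $\Delta / \varepsilon$ | $\sqrt{2}\,\Delta / (\varepsilon \sqrt{n})$ |
| Central DP on the mean | $\Delta / (n\,\varepsilon)$ | $\sqrt{2}\,\Delta / (n\,\varepsilon)$ |
Here $\Delta = hi - lo$ and $n$ is the row count. Central DP is tighter on the mean by a factor of $\sqrt{n}$; the per-row path pays for not trusting an intermediate aggregator.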
/amount-mean runs both paths on the same data at the same ε.
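The comparison can also be checked offline with a small simulation. The Python sketch below is illustrative only: the helper names, the uniform synthetic data, and the choice of $n = 400$ are assumptions made here, and the code does not call the extension. It estimates the standard error of each path empirically and sets it against the closed-form values $\sqrt{2}\,\Delta/(\varepsilon\sqrt{n})$ and $\sqrt{2}\,\Delta/(n\,\varepsilon)$.

```python
import math
import random
import statistics

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def ldp_mean(values, epsilon, lo, hi, rng):
    # Per-row path: noise every row with scale (hi - lo) / epsilon, then average.
    scale = (hi - lo) / epsilon
    return sum(v + laplace(scale, rng) for v in values) / len(values)

def central_mean(values, epsilon, lo, hi, rng):
    # Central path: noise the exact mean once with scale (hi - lo) / (n * epsilon).
    n = len(values)
    return sum(values) / n + laplace((hi - lo) / (n * epsilon), rng)

rng = random.Random(42)
lo, hi, epsilon, n = 0, 600, 0.5, 400      # illustrative values only
values = [rng.uniform(lo, hi) for _ in range(n)]

trials = 2000
ldp_se = statistics.pstdev([ldp_mean(values, epsilon, lo, hi, rng) for _ in range(trials)])
central_se = statistics.pstdev([central_mean(values, epsilon, lo, hi, rng) for _ in range(trials)])

# Closed-form standard errors contributed by the noise alone.
ldp_theory = math.sqrt(2) * (hi - lo) / (epsilon * math.sqrt(n))
central_theory = math.sqrt(2) * (hi - lo) / (epsilon * n)
```

With these settings the ratio `ldp_theory / central_theory` is $\sqrt{n} = 20$, which is the gap the per-row path pays at the same ε.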
Choosing parameters
- epsilon. Typical 0.1 to 1.0 for per-row LDP. ε=1.0 is reasonable for a one-shot central mean. Lower ε means heavier noise.
- lo, hi. Must be public. MIN(col) and MAX(col) are queries on the data and leak privacy. Hardcode the bounds, or store them in a public reference table. Tighter bounds give less noise.
- clamp. Off by default. Turn it on when downstream expects values inside $[lo, hi]$.
- n / n_min. Pass COUNT(*) if the row count is public, n_min if it is sensitive.
Security & limitations
- Averaging attack under dynamic masking. Each ldp_laplace call draws fresh noise, so averaging $k$ reads of the same row reconstructs the true value with std error $\sqrt{2}\,b/\sqrt{k}$, where $b = (hi - lo)/\varepsilon$. The same caveat applies to anon.noise() and is documented in the Adding Noise section of the masking-functions catalog. Apply ldp_laplace through static masking or anonymous dumps. Under dynamic masking, ε has to be budgeted across every query a single role issues against the column.
- Bounds must be public. MIN(col) and MAX(col) are queries on the data, not bounds. Hardcode lo, hi, or put them in a public reference table.
- Row count can leak. dp_laplace_avg uses n to compute scale (hi-lo)/(n·ε). Passing the true COUNT(*) makes the scale a public function of n. If n is itself sensitive, pass a public lower bound n_min instead: the released mean is still ε-DP, slightly noisier.
- Averaging LDP output is not central DP. AVG() over per-row Laplace outputs is consistent for the true mean, but the variance is worse than central DP. Use dp_laplace_avg if the trust model allows it.
- Memoization. PostgreSQL may memoize calls with identical arguments and return the same "random" output across rows. If you see identical noisy values where they should differ, run SET LOCAL enable_memoize = off; before the SELECT, inside the same transaction (SET LOCAL has no effect outside a transaction block).
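The averaging attack in the first item is easy to reproduce in a simulation. The Python sketch below is an illustration only: it models each dynamic-masking read as an independent Laplace draw on a hypothetical row whose true rating is 4, and it does not call the extension.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

rng = random.Random(7)
true_value, lo, hi, epsilon = 4, 1, 5, 1.0   # illustrative values only
scale = (hi - lo) / epsilon                  # b = 4 for this column

def attack(k):
    # Average k independent noisy reads of the same row.
    return sum(true_value + laplace(scale, rng) for _ in range(k)) / k

# Theory: the std error of the average is sqrt(2) * b / sqrt(k), about 0.018 for k = 100_000.
estimate = attack(100_000)
recovered = round(estimate)
```

A single read is dominated by noise of scale 4, but after enough repeated reads the average rounds back to the stored rating. This is why the per-call ε guarantee does not carry over to repeated dynamic reads.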
The math
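Sample noise $\eta$ from $\mathrm{Lap}(0, b)$ with $b = \Delta/\varepsilon$ and return $x + \eta$, where $\Delta$ is the sensitivity ($hi - lo$ per row, $(hi - lo)/n$ for the mean). The Laplace density has mean $0$, variance $2b^2$, and standard deviation $b\sqrt{2}$. The privacy proof is a ratio-of-densities argument (see Concepts): two neighboring datasets shift the density's center by at most $\Delta$, so the ratio at any output is bounded by $e^{\Delta/b} = e^{\varepsilon}$.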
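Written out, the ratio-of-densities argument takes one line. In the sketch below, $f(y \mid x)$ denotes the output density when the true value is $x$ (notation introduced here), $x$ and $x'$ are the values under two neighboring datasets, $\Delta$ is the sensitivity, and $b = \Delta/\varepsilon$ is the Laplace scale.

```latex
f(y \mid x) = \frac{1}{2b} \exp\!\left(-\frac{|y - x|}{b}\right),
\qquad
\frac{f(y \mid x)}{f(y \mid x')}
  = \exp\!\left(\frac{|y - x'| - |y - x|}{b}\right)
  \le \exp\!\left(\frac{|x - x'|}{b}\right)
  \le \exp\!\left(\frac{\Delta}{b}\right)
  = e^{\varepsilon}.
```

The first inequality is the triangle inequality; the second uses $|x - x'| \le \Delta$. The bound holds for every output $y$, which is why no failure parameter is needed.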
Try it live
- /amount-mean: per-row ldp_laplace (with and without clamping) compared against dp_laplace_avg on the same data at the same ε.