Gaussian mechanism

anon.ldp_gaussian(value, epsilon, lo, hi, delta) adds zero-mean Gaussian noise to a numeric value, calibrated to the column's public range and the target $(\varepsilon, \delta)$ guarantee. As with Laplace, lo and hi are the public lower and upper bounds you commit to for the column: values the analyst could already state without looking at the data. They determine the sensitivity, so tighter bounds give less noise. Each call is an $(\varepsilon, \delta)$-DP release. Gaussian noise has lighter tails than Laplace and composes more tightly under repeated releases, at the cost of a small $\delta$ failure probability.

The Gaussian mechanism extends the Adding Noise category of the masking-functions catalog.

Use as a masking rule

Attach ldp_gaussian to a numeric column with a security label, as shown in Declare Masking Rules:

SECURITY LABEL FOR anon ON COLUMN responses.rating
  IS 'MASKED WITH FUNCTION anon.ldp_gaussian(rating, 1.0, 1, 5, 1e-5)';

Apply via static masking or an anonymous dump:

SELECT anon.anonymize_table('public.responses');

See Security & limitations before applying this through dynamic masking.

When to use it

Gaussian fits the same numeric columns as Laplace but produces $(\varepsilon, \delta)$-DP releases instead of pure ε-DP. The trade-off: $\delta$ is a small failure probability, but Gaussian noise has lighter tails than Laplace and composes more tightly under repeated releases. It is also the natural fit for vector-output mechanisms such as the one-hot variants, because Gaussian noise is calibrated to L2 sensitivity. For pure ε-DP without $\delta$, use Laplace. For categorical values, use GRRM.

Per-row LDP

-- Inspect the function on a few rows:
SELECT anon.ldp_gaussian(wait_seconds, 1.0, 0, 600, 1e-5) AS noisy_wait
FROM   responses
LIMIT  5;

call	sensitivity	scale $\sigma$
`anon.ldp_gaussian(value, ε, lo, hi, δ)`	$\text{hi} - \text{lo}$	$(\text{hi} - \text{lo}) \cdot \sqrt{2 \ln(1.25/\delta)} / \varepsilon$
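
To get a feel for the scale, the formula in the table can be evaluated by hand. The query below is a minimal sketch that uses only core PostgreSQL functions (it does not call the extension) and plugs in the arguments of the two example calls on this page; the labels and the inline VALUES list are purely illustrative.

```sql
-- Recompute sigma = (hi - lo) * sqrt(2 * ln(1.25 / delta)) / epsilon
-- for the two example calls on this page, using core functions only.
SELECT label,
       hi - lo                                           AS sensitivity,
       (hi - lo) * sqrt(2 * ln(1.25 / delta)) / epsilon  AS sigma
FROM (VALUES
        ('rating',       1.0::float8, 1::float8, 5::float8,   1e-5::float8),
        ('wait_seconds', 1.0::float8, 0::float8, 600::float8, 1e-5::float8)
     ) AS params(label, epsilon, lo, hi, delta);
```

With these inputs sigma comes out at about 19.4 for rating and about 2907 for wait_seconds, several times the width of each range. A single noisy row is therefore mostly noise; the output becomes useful when many rows are aggregated.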

ldp_gaussian returns the raw noisy value, which can fall outside $[\text{lo}, \text{hi}]$. Pass clamp => true to round and clip into $[\text{lo}, \text{hi}]$. Clamping is post-processing (still $(\varepsilon, \delta)$-DP) but biases values near the boundary.
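
The effect of clamping can be reproduced outside the function. The sketch below assumes that "round and clip" means rounding to the nearest integer and then clipping to the bounds; the three noisy values are made up for illustration and the extension's exact behavior may differ in detail.

```sql
-- Post-process already-noised values: round, then clip into [lo, hi].
-- The input values are hypothetical; no extension function is called.
SELECT noisy,
       least(greatest(round(noisy), lo), hi) AS clamped
FROM (VALUES
        (-3.7::float8, 1::float8, 5::float8),
        ( 2.4::float8, 1::float8, 5::float8),
        ( 9.1::float8, 1::float8, 5::float8)
     ) AS t(noisy, lo, hi);
```

The three rows clamp to 1, 2, and 5. When the noise scale is much larger than the range, most clamped values land exactly on lo or hi, which is the boundary bias mentioned above.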

Parameter helper

anon.ldp_gaussian_sigma(epsilon float8, lo float8, hi float8, delta float8) -> float8

Returns the standard deviation of the noise the mechanism will draw. The result depends only on the public parameters, never on the data, so calling it has no privacy cost.
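
One use for the helper is to check whether an aggregate over the noisy column will be accurate enough. Because each row gets independent noise, the noise in the mean of n unclamped values has standard error sigma / sqrt(n). The sketch below assumes the responses table and the rating parameters from the masking rule earlier on this page.

```sql
-- Noise contribution to the mean of n independently noised ratings.
-- nullif avoids a division-by-zero error on an empty table.
SELECT count(*)                                   AS n,
       anon.ldp_gaussian_sigma(1.0, 1, 5, 1e-5)   AS sigma,
       anon.ldp_gaussian_sigma(1.0, 1, 5, 1e-5)
         / sqrt(nullif(count(*), 0)::float8)      AS noise_se_of_mean
FROM   responses;
```

If noise_se_of_mean is larger than the precision the analysis needs, the options are more rows, tighter bounds, or a larger epsilon.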

Choosing parameters

epsilon. Same range as Laplace: 0.1 to 1.0 typical. Lower means heavier noise.
delta. Small, on the order of $1/n^{1+c}$ for some $c > 0$, where $n$ is the number of rows. $\delta = 10^{-5}$ is a common default, but it only meets that rule for $n$ below $10^5$; choose a smaller $\delta$ for larger tables. Loosely, $\delta$ is the probability that the privacy guarantee fails entirely, so it has to be cryptographically small relative to the dataset size.
lo, hi. Public bounds; same constraint as Laplace. Tighter bounds give less noise.
clamp. Off by default. Turn it on when downstream expects values inside $[\text{lo}, \text{hi}]$.
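
The parameter guidance above can be checked numerically. This sketch tabulates the noise scale for a 1 to 5 column over a small grid of epsilon and delta values, using the textbook formula from The math and core PostgreSQL functions only; the grid values are arbitrary examples.

```sql
-- Sigma for bounds [1, 5] across a few epsilon and delta settings.
SELECT e.epsilon,
       d.delta,
       round(((5 - 1) * sqrt(2 * ln(1.25 / d.delta)) / e.epsilon)::numeric, 2) AS sigma
FROM   (VALUES (0.1::float8), (0.5::float8), (1.0::float8)) AS e(epsilon)
CROSS  JOIN (VALUES (1e-5::float8), (1e-8::float8))         AS d(delta)
ORDER  BY e.epsilon, d.delta DESC;
```

Sigma is inversely proportional to epsilon, so halving epsilon doubles the noise. Delta enters only through a square root of a logarithm: tightening it from 1e-5 to 1e-8 raises sigma by roughly a quarter, which is why a very small delta is affordable.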

Security & limitations

Averaging attack under dynamic masking. Each ldp_gaussian call draws fresh noise, so a reader who queries the same row $k$ times and averages the results recovers the true value with standard error $\sigma/\sqrt{k}$. The same caveat applies to anon.noise() and is documented in the Adding Noise section of the masking-functions catalog. Apply ldp_gaussian through static masking or anonymous dumps. Under dynamic masking, ε has to be budgeted across every query a single role issues against the column.
Bounds must be public. MIN(col) and MAX(col) are computed from the data, so using them as bounds leaks information. Hardcode lo and hi, or store them in a public reference table.
delta is a failure probability, not a knob. Raising $\delta$ from $10^{-5}$ to $0.01$ is not a harmless way to buy less noise; loosely, it means a 1% chance the privacy guarantee fails entirely. Keep $\delta$ cryptographically small relative to $n$.
Memoization. PostgreSQL may memoize calls with identical arguments and return the same "random" output across rows. If you see identical noisy values where they should differ, run SET LOCAL enable_memoize = off; before the SELECT, inside the same transaction (SET LOCAL has no effect outside a transaction block).
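
The averaging attack described in the first item is easy to simulate without touching real data. The sketch below does not call the extension: it generates Gaussian noise with a Box-Muller transform over core random(), using a hypothetical true value of 3 and a sigma of 19.38, which is approximately what the formula in The math gives for (1.0, 1, 5, 1e-5).

```sql
-- Simulate repeated reads of one row under fresh noise, then average them.
-- Box-Muller: sqrt(-2 ln U1) * cos(2 pi U2) is a standard normal draw.
WITH params AS (
    SELECT 3.0::float8   AS true_value,   -- hypothetical true rating
           19.38::float8 AS sigma         -- approximate sigma for (1.0, 1, 5, 1e-5)
),
draws AS (
    SELECT g.k,
           p.true_value
             + p.sigma * sqrt(-2 * ln(1 - random())) * cos(2 * pi() * random()) AS noisy
    FROM   params p
    CROSS  JOIN generate_series(1, 10000) AS g(k)
)
SELECT avg(noisy) FILTER (WHERE k <= 1)   AS avg_of_1_read,
       avg(noisy) FILTER (WHERE k <= 100) AS avg_of_100_reads,
       avg(noisy)                         AS avg_of_10000_reads
FROM   draws;
```

Results vary from run to run, but the standard error shrinks from about 19 for one read to about 1.9 for 100 reads and about 0.19 for 10000 reads, so the last column typically sits within a few tenths of the true value. Static masking avoids this because each row is noised exactly once.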

The math

For a query $f(D)$ with L2 sensitivity $\Delta f$, the Gaussian mechanism returns $f(D) + \mathcal{N}(0, \sigma^2)$ with $\sigma = \Delta f \cdot \sqrt{2 \ln(1.25/\delta)} / \varepsilon$. The result is $(\varepsilon, \delta)$-DP for any $\varepsilon \in (0, 1)$ and $\delta \in (0, 1)$. The classical proof covers only $\varepsilon < 1$; for larger $\varepsilon$ this formula can underestimate the noise required. The analytic Gaussian mechanism gives an exact calibration for any $\varepsilon > 0$ and a smaller $\sigma$ within the range above. The formula above is the standard textbook bound and what ldp_gaussian implements.

Try it live

/onehot-histogram: histogram estimation comparing Gaussian one-hot against scalar Gaussian on the same data.