stats

Probability distributions as objects (1.27.0). Each constructor returns a Distribution with a uniform method set: pdf, cdf, ppf, mean, variance, std, and sample.

import stats;

let d = stats.normal(0.0, 1.0);
d.pdf(0.0);               /* 0.3989... */
d.cdf(1.96);              /* 0.9750... */
d.ppf(0.975);             /* 1.9599... */
d.mean(); d.variance(); d.std();

d.sample(1000);                   /* 1-D ndarray of 1000 draws */
d.sample(1000, {"seed": 42});     /* reproducible draw */
d.sample();                       /* single scalar */

Distributions

Constructor Parameters Support
normal(mu, sigma) sigma > 0 (std dev) all reals
uniform(a, b) a < b [a, b]
exponential(rate) rate > 0 x >= 0
gamma(shape, scale) shape > 0, scale > 0 x > 0
beta(alpha, beta) alpha > 0, beta > 0 [0, 1]
chiSquared(df) df > 0 x >= 0
studentT(df) df > 0 all reals
f(d1, d2) d1 > 0, d2 > 0 x >= 0
lognormal(mu, sigma) sigma > 0 x > 0
weibull(shape, scale) shape > 0, scale > 0 x >= 0
binomial(n, p) n >= 0 (int), 0 <= p <= 1 {0, ..., n}
poisson(lambda) lambda > 0 {0, 1, 2, ...}

Moments

Distribution mean variance
normal mu sigma^2
uniform (a+b)/2 (b-a)^2/12
exponential 1/rate 1/rate^2
gamma shape*scale shape*scale^2
beta a/(a+b) ab/((a+b)^2(a+b+1))
chiSquared df 2*df
studentT 0 (df > 1), NaN otherwise df/(df-2) (df > 2), NaN otherwise
f d2/(d2-2) (d2 > 2), NaN otherwise standard closed form
lognormal exp(mu + sigma^2/2) standard closed form
weibull scale*gamma(1+1/shape) standard closed form
binomial n*p np(1-p)
poisson lambda lambda

mean() and variance() return NaN where the moment is undefined for the given parameters (for example, studentT mean when df <= 1). This follows the math module's NaN convention rather than throwing.

Methods

  • pdf(x) - probability density (continuous) or probability mass (discrete). For discrete distributions this is the mass function; pdf at a non-integer argument returns 0.0.
  • cdf(x) - cumulative distribution function.
  • ppf(p) - inverse CDF (quantile function); p must be in [0, 1].
  • mean(), variance(), std() - closed-form moments. std() is sqrt(variance()).
  • sample(n) - draw n variates, returned as a 1-D ndarray (float64 for continuous distributions, int64 for discrete).
  • sample(n, {"seed": k}) - reproducible draw; the same seed produces the same sequence on both the evaluator and the VM.
  • sample() - single scalar draw (float for continuous, int for discrete).

Reproducibility

The seed is local to the call: passing {"seed": k} does not affect the module's shared RNG used by unseeded calls. The same seed produces byte-identical results on both backends and across process restarts.

let a = stats.normal(0.0, 1.0).sample(5, {"seed": 1});
let b = stats.normal(0.0, 1.0).sample(5, {"seed": 1});
/* a and b are identical */

Error handling

  • Constructors throw RuntimeError if parameters are out of range (e.g. sigma <= 0, a >= b, df <= 0, p < 0 or p > 1, n < 0).
  • ppf(p) throws RuntimeError if p is outside [0, 1].
  • sample(n) throws RuntimeError if n < 0.
try {
    stats.normal(0.0, -1.0);
} catch (RuntimeError e) {
    io.println(e.message); /* sigma must be > 0 */
}

Example: large-sample mean convergence

import stats;

let d = stats.poisson(4.0);
let s = d.sample(10000, {"seed": 42});
io.println(s.mean()); /* close to 4.0 */

Hypothesis tests and confidence intervals

Tests return a dict with statistic, pvalue, and (where applicable) df. Confidence intervals return {low, high}.

import stats;

let a = [2.1, 2.4, 2.6, 2.8, 3.0];
let b = [1.9, 2.0, 2.2, 2.3, 2.5];

stats.tTestOneSample(a, 2.5);                            /* one-sample t vs mu=2.5 */
stats.tTestIndependent(a, b);                            /* pooled two-sample t */
stats.tTestIndependent(a, b, {"equalVar": false});       /* Welch variant */
stats.tTestPaired(a, b);                                 /* paired t */
stats.chiSquareTest([10, 20, 30], [20, 20, 20]);         /* goodness-of-fit */
stats.chiSquareIndependence([[10, 20], [30, 40]]);       /* independence */
stats.mannWhitneyU(a, b);                                /* Mann-Whitney U */
stats.ksTest(a, b);                                      /* Kolmogorov-Smirnov */
stats.confidenceIntervalMean(a, 0.95);                   /* CI for mean */
stats.confidenceIntervalProportion(40, 100, 0.95);       /* CI for proportion */
stats.confidenceIntervalDiffMeans(a, b, 0.95);           /* CI for difference of means */
Function Result keys
tTestOneSample(sample, mu, opts?) {statistic, pvalue, df}
tTestIndependent(a, b, opts?) {statistic, pvalue, df}
tTestPaired(a, b, opts?) {statistic, pvalue, df}
chiSquareTest(observed, expected?, opts?) {statistic, pvalue, df}
chiSquareIndependence(table) {statistic, pvalue, df, expected}
mannWhitneyU(a, b, opts?) {statistic, pvalue}
ksTest(a, b) {statistic, pvalue}
confidenceIntervalMean(sample, level?) {low, high}
confidenceIntervalProportion(successes, n, level?) {low, high}
confidenceIntervalDiffMeans(a, b, level?, opts?) {low, high}

opts keys:

  • alternative - "two-sided" (default), "less", or "greater".
  • equalVar - true (default, pooled) or false (Welch) for tTestIndependent and confidenceIntervalDiffMeans.
  • ddof - integer delta degrees of freedom for chiSquareTest (default 0).

chiSquareIndependence returns the matrix of expected cell counts in expected as a list of lists in addition to statistic, pvalue, and df.

The confidence level parameter defaults to 0.95 if omitted. All sample arguments must be lists of numbers; mismatched lengths, empty lists, or out-of-range level values raise RuntimeError.

Regression

import stats;

let xs = [1.0, 2.0, 3.0, 4.0, 5.0];
let ys = [2.1, 3.9, 6.0, 8.1, 9.8];

let fit = stats.linregress(xs, ys);   /* {slope, intercept, r, r2, pvalue, stderr} */
let c = stats.polyfit(xs, ys, 2);     /* coefficients, highest degree first */
stats.polyval(c, 3.0);                /* evaluate the polynomial at x=3.0 */
Function Result
linregress(x, y) {slope, intercept, r, r2, pvalue, stderr} (requires n >= 3)
polyfit(x, y, degree) list<float> of degree+1 coefficients, highest degree first (degree in [1, 10])
polyval(coeffs, x) float (coefficients highest degree first)

linregress fits a line y = slope*x + intercept using ordinary least squares. r is the Pearson correlation coefficient, r2 its square, pvalue the two-tailed p-value against slope=0 via the Student-t distribution, and stderr the standard error of the slope. Both x and y must have at least 3 elements.

polyfit(x, y, degree) fits a polynomial of the given degree (1 to 10) using normal equations. Coefficients are returned highest degree first, matching the convention used by polyval. A singular design matrix raises RuntimeError.

polyval(coeffs, x) evaluates a polynomial at a single point using Horner's method. coeffs is a list of coefficients highest degree first (matching polyfit output). An empty coeffs list raises a runtime error.

/* fit a quadratic and evaluate at new points */
let c = stats.polyfit([0.0, 1.0, 2.0, 3.0], [0.0, 1.1, 3.9, 9.1], 2);
io.println(stats.polyval(c, 4.0));   /* ~16.0 */

Descriptive extensions

import stats;

let xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
let ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];

stats.skewness(xs);           /* population skewness */
stats.kurtosis(xs);           /* population excess kurtosis (normal is 0) */
stats.covariance(xs, ys);     /* sample covariance (n-1 denominator) */
stats.corrcoef(xs, ys);       /* Pearson correlation coefficient */
Function Result
skewness(xs) population skewness (float)
kurtosis(xs) population excess kurtosis, normal is 0 (float)
covariance(xs, ys) sample covariance, n-1 denominator (float)
corrcoef(xs, ys) Pearson correlation coefficient (float)

skewness and kurtosis require at least 2 values and non-zero variance. covariance and corrcoef require equal-length samples of at least 2 elements; corrcoef additionally requires that neither input is constant (non-zero variance in both xs and ys). All functions raise RuntimeError when their preconditions are not met.