
Maximum Likelihood Estimate for Linear Gaussian Conditional Probability Distributions in Python

by Elias Hernandis • Published Feb. 6, 2023 • Tagged maths, probability, statistics

A random variable $X$ has a linear Gaussian conditional probability distribution on its set of parents $\{U_1, \ldots, U_k\}$ if

$$ P(X \mid U_1, \ldots, U_k) = \mathcal N (w_0 + w_1 U_1 + \ldots + w_k U_k,\ \sigma^2) $$

where $w_0, \ldots, w_k \in \mathbb R$ are generally known as weights.

Suppose we have some data where each tuple is an observation of $X$ and $U_1, \ldots, U_k$. Then we can estimate the parameters $w_0, \ldots, w_k, \sigma^2$ using maximum likelihood relatively easily.
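For instance, observations of this kind can be simulated directly from the definition. This is a minimal sketch; the parent distributions, the true weights $(1.0, 2.0, -0.5)$ and the variance $0.25$ are made up for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
M = 10_000

# parents U1 and U2, sampled from arbitrary (made-up) distributions
u1 = rng.normal(0.0, 1.0, size=M)
u2 = rng.normal(3.0, 2.0, size=M)

# X | U1, U2  ~  N(1.0 + 2.0*U1 - 0.5*U2, 0.25), i.e. sigma = 0.5
x = 1.0 + 2.0 * u1 - 0.5 * u2 + rng.normal(0.0, 0.5, size=M)

# one row per observation, one column per variable
data = pd.DataFrame({"X": x, "U1": u1, "U2": u2})
```

A DataFrame of this shape is exactly what the estimator below expects.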

The detailed mathematical explanation of how this is done can be found in Section 17.2.4 of Koller and Friedman. Here we just provide an implementation, which works by using the sufficient statistics outlined in that method (and hence is quite fast!).

from typing import Any, Sequence

import numpy as np
import pandas as pd


def mle(node: Any, parents: Sequence[Any], data: pd.DataFrame):
    '''
    Find the maximum likelihood estimate of the weights and variance of
    the linear Gaussian CPD of `node` given the `data` and the
    dependencies on the `parents`.

    Parameters
    ----------
    node : Any
        Name of the column holding the observations of `X`.
    parents : Sequence[Any]
        Names of the columns holding the observations of the parents
        `U_1, ..., U_k`.
    data : pandas.DataFrame
        A DataFrame with one row per observation and one column per
        variable.

    Returns
    -------
    A two-tuple where the first element is the vector of weights
    `(w_0, w_1, ..., w_k)` and the second is the variance.

    Notes
    -----
    Maximum likelihood estimation of parameters is computed using the
    *sufficient statistics*  approach described in section 17.2.4 of [1]_.

    References
    ----------

    .. [1] D. Koller and N. Friedman, Probabilistic graphical models:
            principles and techniques. Cambridge, MA: MIT Press, 2009.

    '''
    M = len(data)     # number of observations
    k = len(parents)  # number of parents
    # sufficient statistics: sums of X, of each parent, and of the
    # pairwise products X*U_i and U_i*U_j over the data
    x_sum = data[node].sum()
    u_sums = data[list(parents)].sum().to_numpy()
    xu_sums = [(data[node] * data[p]).sum() for p in parents]
    uu_sums = [[(data[ui] * data[uj]).sum() for uj in parents]
               for ui in parents]

    # solve the normal equations A @ beta = b for beta = (w_0, ..., w_k)
    A = np.block([[np.reshape([M], (1, 1)),
                   np.reshape(u_sums, (1, k))],
                  [np.reshape(u_sums, (k, 1)),
                   np.reshape(uu_sums, (k, k))]])
    b = [x_sum] + xu_sums
    beta = np.linalg.solve(A, b)

    # extract parameters: beta = (w_0, w_1, ..., w_k)
    mean, weights = beta[0], beta[1:]
    # the ML estimate of the variance uses the empirical (biased, ddof=0)
    # variances and covariances; pandas defaults to ddof=1
    x_var = data[node].var(ddof=0)
    cov_d = data[list(parents)].cov(ddof=0)
    # sigma^2 = Var[X] - sum_i sum_j w_i * w_j * Cov[U_i, U_j]
    var = x_var - sum(
        weights[i] * weights[j] * cov_d.loc[pi, pj]
        for i, pi in enumerate(parents)
        for j, pj in enumerate(parents)
    )
    return beta, var
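As a sanity check, the normal equations solved above are the same ones an ordinary least-squares fit solves, so the weights can be cross-checked with `np.linalg.lstsq` and the variance with the mean squared residual. The data below is synthetic and the true parameters $(1.0, 2.0, -0.5)$ and $\sigma^2 = 0.25$ are made up for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
M = 50_000
u1 = rng.normal(size=M)
u2 = rng.normal(size=M)
# true parameters: w0 = 1.0, w1 = 2.0, w2 = -0.5, sigma = 0.5
x = 1.0 + 2.0 * u1 - 0.5 * u2 + rng.normal(scale=0.5, size=M)
data = pd.DataFrame({"X": x, "U1": u1, "U2": u2})

# design matrix [1, U1, U2]; lstsq minimises ||A @ beta - x||^2,
# which yields the same solution as the sufficient-statistics solve
A = np.column_stack([np.ones(M), data["U1"], data["U2"]])
beta, *_ = np.linalg.lstsq(A, data["X"].to_numpy(), rcond=None)

# the ML variance estimate is the mean squared residual
resid_var = np.mean((data["X"].to_numpy() - A @ beta) ** 2)
```

With enough data both `beta` and `resid_var` should land close to the true parameters, which makes this a convenient test harness for the `mle` function above.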