Why is the rank of the covariance matrix at most $n-1$?

As stated in this question, the maximum rank of the covariance matrix is $n-1$, where $n$ is the sample size, so if the dimension of the covariance matrix equals the sample size, the matrix will be singular. I can't understand why we subtract $1$ from the maximum rank $n$ of the covariance matrix.

covariance-matrix linear-algebra user3070752

To get some intuition, think about $n=2$ points in 3D. What is the dimensionality of the subspace in which these points lie? Can you fit them on a single line (a 1D subspace)? Or do you need a plane (a 2D subspace)?

amoeba says Reinstate Monica

So you understand that $n=2$ leads to a covariance matrix of rank 1? Good, now let's take $n=3$ points. Can you see that you can always fit them on a 2D plane?

amoeba says Reinstate Monica

@amoeba your example was clear, but I can't see how fitting a hyperplane in your example relates to the covariance matrix.

user3070752

Sorry for the delay ;)

user3070752

Answers:

The unbiased estimator of the sample covariance matrix, given $n$ data points $\newcommand{\x}{\mathbf x}\x_i \in \mathbb R^d$, is

$$\mathbf C = \frac{1}{n-1}\sum_{i=1}^n (\x_i - \bar \x)(\x_i - \bar \x)^\top,$$

where $\bar \x = \sum \x_i /n$ is the average over all points. Let us denote $(\x_i-\bar \x)$ as $\newcommand{\z}{\mathbf z}\z_i$. The $\frac{1}{n-1}$ factor does not change the rank, and each term in the sum has (by definition) rank $1$, so the core of the question is as follows:

Why does $\sum \z_i\z_i^\top$ have rank $n-1$ and not rank $n$, as it would seem because we are summing $n$ rank-$1$ matrices?

The answer is that it happens because $\z_i$ are not independent. By construction, $\sum\z_i = 0$. So if you know $n-1$ of $\z_i$, then the last remaining $\z_n$ is completely determined; we are not summing $n$ independent rank-$1$ matrices, we are summing only $n-1$ independent rank-$1$ matrices and then adding one more rank-$1$ matrix that is fully linearly determined by the rest. This last addition does not change the overall rank.

We can see this directly if we rewrite $\sum\z_i = 0$ as

$$\z_n = -\sum_{i=1}^{n-1}\z_i,$$

and now plug it into the above expression:

$$\sum_{i=1}^n \z_i\z_i^\top = \sum_{i=1}^{n-1} \z_i\z_i^\top + \Big(-\sum_{i=1}^{n-1}\z_i\Big)\z_n^\top=\sum_{i=1}^{n-1} \z_i(\z_i-\z_n)^\top.$$

Now there are only $n-1$ terms left in the sum, each of rank at most $1$, and it becomes clear that the whole sum can have at most rank $n-1$.
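As a quick sanity check of the rank argument, here is a minimal sketch in plain Python that uses exact rational arithmetic, so no numerical tolerance is involved. The sample points and the helper names `matrix_rank` and `scatter_matrix` are made up for illustration; they are not part of the original answer.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def scatter_matrix(points, center=True):
    """Return sum_i z_i z_i^T, with z_i = x_i - mean if center else z_i = x_i."""
    n = len(points)
    d = len(points[0])
    if center:
        mean = [Fraction(sum(p[j] for p in points), n) for j in range(d)]
    else:
        mean = [Fraction(0)] * d
    z = [[p[j] - mean[j] for j in range(d)] for p in points]
    return [[sum(zi[a] * zi[b] for zi in z) for b in range(d)] for a in range(d)]


# Hypothetical data: n = 3 points in d = 4 dimensions.
points = [[1, 0, 2, 5], [0, 3, 1, 1], [4, 1, 0, 2]]

rank_raw = matrix_rank(scatter_matrix(points, center=False))
rank_centered = matrix_rank(scatter_matrix(points, center=True))
print("rank without centering:", rank_raw)
print("rank with centering:   ", rank_centered)
```

For these three points the uncentered sum has rank $3 = n$, while the centered sum has rank $2 = n-1$: centering is what removes one dimension.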

This result, by the way, hints at why the factor in the unbiased estimator of covariance is $\frac{1}{n-1}$ and not $\frac{1}{n}$.

The geometric intuition that I alluded to in the comments above is that one can always fit a 1D line to any two points in 2D and one can always fit a 2D plane to any three points in 3D, i.e. the dimensionality of the subspace is always $n-1$; this only works because we assume that this line (or plane) can be "moved around" in order to fit our points. "Positioning" this line (or plane) such that it passes through $\bar \x$ is equivalent to centering in the algebraic argument above.
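To connect the geometric picture with the algebra, the $n=2$ case can be worked out by hand. This is only an illustrative sketch using the definitions of $\bar{\mathbf x}$, $\mathbf z_i$ and $\mathbf C$ given earlier in this answer:

$$\bar{\mathbf x} = \tfrac{1}{2}(\mathbf x_1 + \mathbf x_2), \qquad \mathbf z_1 = \tfrac{1}{2}(\mathbf x_1 - \mathbf x_2), \qquad \mathbf z_2 = -\mathbf z_1,$$

$$\mathbf C = \frac{1}{2-1}\left(\mathbf z_1\mathbf z_1^\top + \mathbf z_2\mathbf z_2^\top\right) = 2\,\mathbf z_1\mathbf z_1^\top = \tfrac{1}{2}(\mathbf x_1 - \mathbf x_2)(\mathbf x_1 - \mathbf x_2)^\top.$$

So $\mathbf C$ is a single outer product: it has rank $1$ whenever $\mathbf x_1 \neq \mathbf x_2$, and its only direction of variation is the line through the two points.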

amoeba says Reinstate Monica

A somewhat shorter explanation, I believe, goes like this:

Let us define an $n \times m$ matrix $x$ of sample data points, where $n$ is the number of variables and $m$ is the number of samples for each variable. Let us assume that none of the variables are linearly dependent.

The rank of $x$ is $\min(n,m)$.

Let us define an $n \times m$ matrix $z$ of row-wise centered variables:

$z = x - E[x]$.

The rank of the centered data becomes $\min(n,m-1)$, because each data row is now subject to the constraint:

$\sum_{i=1}^{m}z_{*i} =0$.

It basically means we can recreate the entire $z$ matrix even if one of its columns is removed.

The equation for the sample covariance of $x$ becomes:

$\operatorname{cov}(x,x) = \frac{1}{m-1}zz^T$

Clearly, the rank of the covariance matrix is $\operatorname{rank}(zz^T)$.

By the rank-nullity theorem: $\operatorname{rank}(zz^T) = \operatorname{rank}(z) = \min(n,m-1)$.
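The chain $\operatorname{rank}(zz^T) = \operatorname{rank}(z) = \min(n, m-1)$ can be checked on small examples. The following is a minimal sketch in plain Python with exact rational arithmetic; the two data matrices and the helper names (`matrix_rank`, `center_rows`, `gram`) are assumptions made for illustration, with variables stored as rows as in this answer.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def center_rows(x):
    """Subtract each row's mean, so every row of the result sums to zero."""
    m = len(x[0])
    return [[Fraction(v) - Fraction(sum(row), m) for v in row] for row in x]


def gram(z):
    """Return z z^T for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in z] for r1 in z]


# Hypothetical data, variables as rows and samples as columns.
x_wide = [[1, 0, 4], [0, 3, 1], [2, 1, 0], [5, 1, 2]]  # n = 4, m = 3
x_tall = [[1, 2, 4, 7, 11], [3, 1, 4, 1, 5]]           # n = 2, m = 5

results = []
for x in (x_wide, x_tall):
    n, m = len(x), len(x[0])
    z = center_rows(x)
    results.append((matrix_rank(z), matrix_rank(gram(z)), min(n, m - 1)))
    print(f"n={n}, m={m}: rank(z)={results[-1][0]}, "
          f"rank(zz^T)={results[-1][1]}, min(n, m-1)={results[-1][2]}")
```

In both cases the three numbers agree. Note that $\min(n, m-1)$ is reached only for data in general position; degenerate data (for example, duplicated samples) can give a lower rank.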

Mikel