Normalization

We define two kinds of normalization: component and norm.

Definition

component

component normalization refers to tensors whose components are each of order 1. More precisely, the second moment of each component is 1:

\[\langle x_i^2 \rangle = 1\]

Examples:

  • [1.0, -1.0, -1.0, 1.0]

  • [1.0, 1.0, 1.0, 1.0] (the mean does not need to be zero)

  • [0.0, 2.0, 0.0, 0.0] (this is still fine because \(\|x\|^2 = n\))

>>> import torch
>>> torch.randn(10)
tensor([ 1.3721,  1.3019,  0.3710, -0.0964, -0.5026, -0.3084,  1.3420,  0.0690,
        -0.6135,  1.5337])
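
As a quick numerical check (a minimal sketch in plain PyTorch; the sample size is an arbitrary choice), the second moment of a standard-normal tensor can be estimated directly:

import torch

x = torch.randn(100_000)
print(x.pow(2).mean())  # estimate of <x_i^2>; close to 1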

norm

norm normalization refers to tensors whose norm is close to 1:

\[\|x\| \approx 1\]

Examples:

  • [0.5, -0.5, -0.5, 0.5]

  • [0.5, 0.5, 0.5, 0.5] (the mean does not need to be zero)

  • [0.0, 1.0, 0.0, 0.0]

>>> torch.randn(10) / 10**0.5
tensor([-0.2803, -0.0424,  0.1595, -0.1745, -0.4294,  0.0879,  0.0624,  0.0943,
        -0.3048, -0.0972])
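
As a similar sketch (sample size again arbitrary), dividing a standard-normal tensor by \(\sqrt{n}\) yields a norm close to 1:

import torch

n = 100_000
x = torch.randn(n) / n**0.5
print(x.norm())  # close to 1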

There is only a factor of \(\sqrt{n}\) between the two normalizations, where \(n\) is the number of components.

Motivation

Assume that the weights are independent of the input and that their distribution obeys

\[\langle w_i \rangle = 0, \qquad \langle w_i w_j \rangle = \sigma^2 \delta_{ij}\]

Then the first two moments of \(x \cdot w\) (and therefore its mean and variance) depend only on the second moment of \(x\):

\[\begin{aligned}
\langle x \cdot w \rangle &= \sum_i \langle x_i w_i \rangle = \sum_i \langle x_i \rangle \langle w_i \rangle = 0\\
\langle (x \cdot w)^2 \rangle &= \sum_{i} \sum_{j} \langle x_i w_i x_j w_j \rangle\\
&= \sum_{i} \sum_{j} \langle x_i x_j \rangle \langle w_i w_j \rangle\\
&= \sigma^2 \sum_{i} \langle x_i^2 \rangle
\end{aligned}\]
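
In particular, if \(x\) is component normalized, then \(\langle (x \cdot w)^2 \rangle = \sigma^2 n\), so choosing \(\sigma = 1/\sqrt{n}\) yields a component-normalized output. The identity above can be verified with a quick Monte Carlo estimate (a minimal sketch; the values of n, sigma and the sample size are arbitrary choices):

import torch

n, samples = 10, 100_000
sigma = 0.3

x = torch.randn(samples, n)          # component-normalized inputs
w = sigma * torch.randn(samples, n)  # weights with <w_i w_j> = sigma**2 * delta_ij

dots = (x * w).sum(dim=1)            # samples of x . w
print(dots.mean())                   # close to 0
print(dots.pow(2).mean())            # close to sigma**2 * n = 0.9
print(sigma**2 * n)                  # theoretical value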

Testing

You can use e3nn.util.test.assert_normalized to check whether a function or module is normalized at initialization:

from e3nn.util.test import assert_normalized
from e3nn import o3
assert_normalized(o3.Linear("10x0e", "10x0e"))