【Information Geometry and Its Applications】
I am reading Shun-ichi Amari's book, Information Geometry and Its Applications. This sequence of blog posts will record my thoughts and understanding derived from Amari's book.
One central concept in information geometry is the divergence:
Definition 1.1: A function $D[P:Q]$ of two points $P, Q$ on a manifold $M$ is called a divergence when it satisfies the following criteria:

1. $D[P:Q] \ge 0$;
2. $D[P:Q] = 0$ if and only if $P = Q$;
3. $D$ is a smooth function of $P$ and $Q$, and when $\xi_Q = \xi_P + d\xi$ is close to $\xi_P$ in a coordinate system $\xi$, the Taylor expansion
$$D[\xi_P : \xi_P + d\xi] = \frac{1}{2} \sum_{i,j} g_{ij}(\xi_P)\, d\xi^i d\xi^j + O(|d\xi|^3)$$
has a positive definite matrix $(g_{ij})$.
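As a concrete sanity check (my own illustration, not from the book), the Kullback-Leibler divergence between discrete distributions satisfies the nonnegativity and identity-of-indiscernibles criteria, while failing to be symmetric:

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D[p:q] between discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.4, 0.4, 0.2])

assert kl(p, q) >= 0           # criterion 1: nonnegativity
assert abs(kl(p, p)) < 1e-12   # criterion 2: vanishes on the diagonal
print(kl(p, q), kl(q, p))      # the two values differ: not symmetric
```

The asymmetry is exactly why a divergence is not a distance function, a point the comparison with Riemannian metrics below returns to.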
One direct corollary of criterion (3) is:

Corollary: If $D$ is a divergence and $\xi$ is a coordinate system around $P$, define

$$G_1(P) = \left( \frac{\partial^2}{\partial \xi_P^i \partial \xi_P^j} D[\xi_P : \xi_Q] \right)\bigg|_{\xi_Q = \xi_P}, \qquad G_2(P) = \left( \frac{\partial^2}{\partial \xi_Q^i \partial \xi_Q^j} D[\xi_P : \xi_Q] \right)\bigg|_{\xi_Q = \xi_P},$$

in which $G_1(P)$ and $G_2(P)$ are positive definite symmetric matrices. Therefore the average

$$G_3(P) = \frac{1}{2}\big(G_1(P) + G_2(P)\big) = -\frac{1}{2}\left( \frac{\partial^2 D}{\partial \xi_P^i \partial \xi_Q^j} + \frac{\partial^2 D}{\partial \xi_P^j \partial \xi_Q^i} \right)\bigg|_{\xi_Q = \xi_P}$$

is also a positive definite matrix.
Each of these three tensors defines a Riemannian metric on the manifold $M$.
The last equality is obtained from the identity $D[\xi : \xi] \equiv 0$: taking its second derivative with respect to $\xi$ gives $G_1 + G_2 + \left( \partial^2_{\xi_P \xi_Q} D + \partial^2_{\xi_Q \xi_P} D \right)\big|_{\xi_Q = \xi_P} = 0$.
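The corollary can be checked numerically. Below is a sketch (my own, not from the book), using the extended KL divergence on positive measures as the example and central finite differences for the Hessians; for this divergence all three matrices come out equal to the Fisher-like metric $\mathrm{diag}(1/p)$:

```python
import numpy as np

def ekl(p, q):
    # Extended KL divergence on positive measures (a genuine divergence,
    # no normalization constraint needed)
    return float(np.sum(p * np.log(p / q) - p + q))

def hessian(f, x, h=1e-4):
    # Central finite-difference Hessian of a scalar function f at x
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

p = np.array([0.2, 0.3, 0.5])

G1 = hessian(lambda x: ekl(x, p), p)   # Hessian in the first argument
G2 = hessian(lambda x: ekl(p, x), p)   # Hessian in the second argument

# Mixed second derivatives on the diagonal, via a joint Hessian in (p, q)
H = hessian(lambda xy: ekl(xy[:3], xy[3:]), np.concatenate([p, p]))
G3 = -0.5 * (H[:3, 3:] + H[3:, :3])    # minus the symmetrized mixed block

assert np.allclose(G1, np.diag(1 / p), atol=1e-3)   # equals the metric diag(1/p)
assert np.allclose(G3, 0.5 * (G1 + G2), atol=1e-3)  # G3 is the average of G1, G2
assert np.all(np.linalg.eigvalsh(G3) > 0)           # positive definite
```

The second assertion is exactly the identity obtained by differentiating $D[\xi:\xi] \equiv 0$ twice.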
There are two major differences between a divergence and a Riemannian metric:
- Local to global:
  - A Riemannian metric is defined point-wise, and the distance function is obtained by integrating the metric along paths.
  - A divergence is defined globally on $M \times M$; the corresponding Riemannian metrics only describe the local behavior of $D$ around the diagonal of $M \times M$.
- Symmetry: the Riemannian distance is symmetric and satisfies the triangle inequality, while a divergence in general satisfies neither; $D[P:Q] \ne D[Q:P]$ in general.
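The "local" point can be made concrete: the quadratic form built from the induced metric matches the divergence only infinitesimally. A small sketch (my own illustration, using the extended KL divergence $\sum_i \left( p^i \log(p^i/q^i) - p^i + q^i \right)$, whose induced metric at $p$ is $\mathrm{diag}(1/p)$):

```python
import numpy as np

def ekl(p, q):
    # Extended KL divergence on positive measures
    return float(np.sum(p * np.log(p / q) - p + q))

p = np.array([0.2, 0.3, 0.5])
G = np.diag(1 / p)  # Riemannian metric induced by ekl at p

ratios = []
for eps in [1e-1, 1e-2, 1e-3]:
    dp = eps * np.array([1.0, -1.0, 0.5])
    quad = 0.5 * dp @ G @ dp          # local quadratic approximation
    ratios.append(ekl(p, p + dp) / quad)
print(ratios)  # approaches 1 as eps shrinks
```

Far from the diagonal the divergence and the quadratic form disagree; only in the limit does the metric recover the divergence, which is the sense in which the metric is a purely local object.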
Consider a Bregman divergence defined by a coordinate system $\xi$ and a strictly convex function $\varphi$:
$$D_\varphi[\xi_P : \xi_Q] = \varphi(\xi_P) - \varphi(\xi_Q) - \nabla \varphi(\xi_Q) \cdot (\xi_P - \xi_Q).$$
Then it is not hard to prove
$$g_{ij}(\xi) = \frac{\partial^2 \varphi}{\partial \xi^i \partial \xi^j}(\xi).$$
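A quick numerical check of this claim (my own sketch; the negative-entropy potential $\varphi(\xi) = \sum_i \xi^i \log \xi^i$ is chosen purely for illustration, and it happens to reproduce the extended KL divergence):

```python
import numpy as np

def phi(x):
    # Negative entropy: a strictly convex potential (illustrative choice)
    return float(np.sum(x * np.log(x)))

def grad_phi(x):
    return np.log(x) + 1

def bregman(p, q):
    # Bregman divergence D_phi[p:q] = phi(p) - phi(q) - <grad phi(q), p - q>
    return phi(p) - phi(q) - grad_phi(q) @ (p - q)

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.4, 0.4, 0.2])

# For this potential the Bregman divergence is the extended KL divergence
ekl = float(np.sum(p * np.log(p / q) - p + q))
assert abs(bregman(p, q) - ekl) < 1e-12

# The induced metric is the Hessian of phi, here diag(1/p)
h = 1e-4
H = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        ei, ej = np.eye(3)[i] * h, np.eye(3)[j] * h
        H[i, j] = (bregman(p + ei + ej, p) - bregman(p + ei - ej, p)
                   - bregman(p - ei + ej, p) + bregman(p - ei - ej, p)) / (4 * h * h)
assert np.allclose(H, np.diag(1 / p), atol=1e-3)
```

Note that only the first-argument Hessian of $D_\varphi$ on the diagonal is computed here; for a Bregman divergence it coincides with $\nabla^2 \varphi$ exactly.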
Because $\varphi$ is a globally defined function, we can say that this Riemannian metric has a global potential. In a later blog post on the dually flat structure, we will see more details of this divergence.