Distances and Riemann matrices from Divergence

【Information Geometry and Its Applications】

I am reading Shun-ichi Amari’s book, Information Geometry and Its Applications. This series of posts records my thoughts and understanding derived from Amari’s book.

One central concept in information geometry is the divergence.

Definition 1.1: D[P:Q] is called a divergence when it satisfies the following criteria:

  1. D[P:Q]\geq 0
  2. D[P:Q] = 0 if and only if P=Q
  3. D:M\times M\to \mathbb{R}_+ is a smooth function.

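Since D is smooth and attains its minimum value 0 on the diagonal, its first-order terms vanish there. A direct consequence of (1) to (3) is therefore the following second-order expansion.

Corollary: If D[P:Q] is a divergence and \xi is a coordinate system around P, then

  • D[\xi(P),\xi(P)+d\xi] =\frac{1}{2}\sum\limits_{i,j}gr_{ij}d\xi_id\xi_j + O(|d\xi|^3)
  • D[\xi(P)+d\xi,\xi(P)] =\frac{1}{2}\sum\limits_{i,j}gl_{ij}d\xi_id\xi_j + O(|d\xi|^3)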

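Here gr = (gr_{ij}) and gl = (gl_{ij}) are symmetric matrices. Conditions (1) and (2) make them positive semidefinite; Amari’s definition additionally requires them to be positive definite, which is assumed from here on. The average g=\frac{1}{2} (gr + gl) is then a positive definite matrix as well.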
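As a quick numerical sanity check of this expansion, the sketch below uses the Kullback-Leibler divergence between two Bernoulli distributions, with the success probability p playing the role of the coordinate \xi. The choice of example and the helper names kl_bernoulli, fisher_bernoulli and expansion_ratios are illustrative assumptions, not something taken from Amari’s text. For this family the quadratic coefficient is the Fisher information 1/(p(1-p)), so both ratios printed below should approach 1 as d\xi shrinks.

```python
import math


def kl_bernoulli(p, q):
    """KL divergence D[P:Q] between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def fisher_bernoulli(p):
    """Fisher information of Bernoulli(p) in the coordinate p."""
    return 1.0 / (p * (1.0 - p))


def expansion_ratios(p, d):
    """Ratio of the divergence to its quadratic approximation, in both orders."""
    quadratic = 0.5 * fisher_bernoulli(p) * d * d
    right = kl_bernoulli(p, p + d) / quadratic  # D[xi, xi + d xi]
    left = kl_bernoulli(p + d, p) / quadratic   # D[xi + d xi, xi]
    return right, left


p = 0.3
for d in (1e-1, 1e-2, 1e-3):
    right, left = expansion_ratios(p, d)
    print(f"d_xi={d:g}  right ratio={right:.6f}  left ratio={left:.6f}")
```

The two ratios differ from each other for every fixed d\xi, but the gap shrinks linearly in d\xi, which is what an O(|d\xi|^3) remainder predicts.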

Each of these three tensors gl, gr, g defines a Riemannian metric on the manifold M.

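  • gl_{ij} = \partial_i\partial_jD[\xi,\xi']|_{\xi'=\xi}
  • gr_{ij} = \partial_i'\partial_j'D[\xi,\xi']|_{\xi'=\xi}
  • g_{ij} = -\frac{1}{2}(\partial_i\partial_j' + \partial_j\partial_i')D[\xi,\xi']|_{\xi'=\xi}

Here \partial_i = \partial/\partial\xi_i and \partial_i' = \partial/\partial\xi_i'. The last equation follows from f(\xi)=D[\xi,\xi]=0: taking the second derivative \partial_i\partial_j f = 0 gives gl_{ij} + gr_{ij} + (\partial_i\partial_j' + \partial_j\partial_i')D = 0 on the diagonal. In fact more is true. The first derivatives \partial_iD and \partial_i'D vanish identically on the diagonal, because D is minimal there, and differentiating these two identities along the diagonal gives gl_{ij} = gr_{ij} = g_{ij} = -\partial_i\partial_j'D. So the three tensors coincide, and the asymmetry of D only enters at third order.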
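The second derivatives on the diagonal can also be estimated numerically. The sketch below does this by central differences for the Kullback-Leibler divergence on distributions over three outcomes, using the first two probabilities as coordinates. The example divergence, the step size h, and the names kl_categorical and diagonal_hessian_blocks are assumptions made for illustration only.

```python
import math


def kl_categorical(xi, xi_prime):
    """KL divergence between two 3-outcome distributions.

    Coordinates are the first two probabilities (p1, p2); p3 = 1 - p1 - p2.
    """
    p = [xi[0], xi[1], 1.0 - xi[0] - xi[1]]
    q = [xi_prime[0], xi_prime[1], 1.0 - xi_prime[0] - xi_prime[1]]
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q))


def diagonal_hessian_blocks(D, xi, h=1e-4):
    """Central-difference second derivatives of D at the diagonal point (xi, xi).

    Returns (gl, gr, mixed), where
      gl[i][j]    approximates d^2 D / d xi_i  d xi_j,
      gr[i][j]    approximates d^2 D / d xi'_i d xi'_j,
      mixed[i][j] approximates d^2 D / d xi_i  d xi'_j.
    """
    n = len(xi)
    z0 = list(xi) + list(xi)  # stacked point (xi, xi') with xi' = xi

    def F(z):
        return D(z[:n], z[n:])

    def shifted(a, sa, b, sb):
        z = list(z0)
        z[a] += sa * h
        z[b] += sb * h
        return z

    size = 2 * n
    H = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            H[a][b] = (F(shifted(a, 1, b, 1)) - F(shifted(a, 1, b, -1))
                       - F(shifted(a, -1, b, 1)) + F(shifted(a, -1, b, -1))) / (4 * h * h)

    gl = [row[:n] for row in H[:n]]
    gr = [row[n:] for row in H[n:]]
    mixed = [row[n:] for row in H[:n]]
    return gl, gr, mixed


xi = [0.2, 0.3]
gl, gr, mixed = diagonal_hessian_blocks(kl_categorical, xi)
print("gl    =", gl)
print("gr    =", gr)
print("mixed =", mixed)
```

For this example gl and gr both agree, up to discretization error, with the Fisher information matrix \delta_{ij}/p_i + 1/p_3, and the mixed block is its negative.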

There are two major differences between a divergence and a Riemannian metric:

  • Symmetry: in general D[P:Q]\neq D[Q:P], whereas a Riemannian distance is symmetric.
  • Local versus global:
    • A Riemannian metric is defined pointwise, and the distance function is obtained by integrating the metric along curves.
    • A divergence is defined globally as D: M\times M\to \mathbb{R}_+; the corresponding Riemannian metrics gl, gr, g only describe the local behavior of D around the diagonal of M\times M.
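Both differences can be seen in a small example. The sketch below again takes the Bernoulli family with coordinate p, for which the metric induced by the Kullback-Leibler divergence is g(p) = 1/(p(1-p)). It compares the divergence in both orders with the Riemannian distance obtained by integrating \sqrt{g} along the segment from p to q (in one dimension this segment is the only path, so its length is the distance). The example and the names kl_bernoulli and fisher_rao_length are illustrative assumptions.

```python
import math


def kl_bernoulli(p, q):
    """KL divergence D[P:Q] between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def fisher_rao_length(p, q, steps=10000):
    """Length of the segment between p and q under g(p) = 1/(p(1-p)).

    Uses the midpoint rule to integrate sqrt(g) over the segment.
    """
    a, b = min(p, q), max(p, q)
    width = (b - a) / steps
    total = 0.0
    for k in range(steps):
        mid = a + (k + 0.5) * width
        total += math.sqrt(1.0 / (mid * (1.0 - mid))) * width
    return total


p, q = 0.1, 0.5
print("D[P:Q]        =", kl_bernoulli(p, q))
print("D[Q:P]        =", kl_bernoulli(q, p))
print("distance(P,Q) =", fisher_rao_length(p, q))
print("distance(Q,P) =", fisher_rao_length(q, p))
```

The two divergences differ noticeably for points this far apart, while the integrated distance is the same in both directions by construction.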


Consider a Bregman divergence defined by a coordinate system \xi and a convex function \psi. Then it is not hard to prove that

gl = gr = g = \partial_i\partial_j\psi
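For readers who want the computation spelled out, here is a short derivation. It assumes the standard definition of the Bregman divergence, which this post does not state explicitly, and writes \partial_i' for the derivative with respect to \xi_i'.

```latex
% Standard definition (assumed here; not spelled out in the post)
D_\psi[\xi:\xi'] = \psi(\xi) - \psi(\xi') - \sum_k \partial_k\psi(\xi')\,(\xi_k - \xi'_k)

% Second derivatives in the first argument
\partial_i\partial_j D_\psi[\xi:\xi'] = \partial_i\partial_j\psi(\xi)

% First and second derivatives in the second argument
\partial_j' D_\psi[\xi:\xi'] = -\sum_k \partial_j\partial_k\psi(\xi')\,(\xi_k - \xi'_k)
\partial_i'\partial_j' D_\psi[\xi:\xi'] = \partial_i\partial_j\psi(\xi')
    - \sum_k \partial_i\partial_j\partial_k\psi(\xi')\,(\xi_k - \xi'_k)

% Mixed second derivative
\partial_i\partial_j' D_\psi[\xi:\xi'] = -\partial_i\partial_j\psi(\xi')

% On the diagonal \xi' = \xi the sum over k vanishes, so
gl_{ij} = gr_{ij} = \partial_i\partial_j\psi(\xi),
\qquad g_{ij} = \tfrac{1}{2}(gl_{ij} + gr_{ij}) = \partial_i\partial_j\psi(\xi)
```

Positive definiteness of the metric then corresponds to strict convexity of \psi, in the sense that its Hessian is positive definite.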

Because \psi is a globally defined function, we can say that this Riemannian metric has a global potential. A later post on dually flat structures will look at this divergence in more detail.


One thought on “Distances and Riemann matrices from Divergence”

  1. When we talk about a divergence, we are really talking about a non-negative function defined on M\times M that vanishes exactly on the diagonal.

    Several questions need further understanding:
    1. Which properties of a metric with a potential are useful in this scenario?
    2. Is this related to generalized complex structures? (Consider the 2 by 2 block matrix (gl,-g\\ -g,gr).)
    3. Will this give a better understanding of the dual function?
    4. Can we also build this type of dually flat structure on moduli spaces?

