- Algebraic foundation of mathematical statistics, Nikolai Nikolaevich Čencov (Chentsov),
Statistics: A Journal of Theoretical and Applied Statistics 9.2 (1978): 267-276.
A very readable synthesis of the results in Chentsov's monograph, defining statistical invariance, total variation, Kullback-Leibler and Chernoff divergences, etc.
- Defining the Curvature of a Statistical Problem (with Applications to Second Order Efficiency), Bradley Efron,
The Annals of Statistics (1975): 1189-1242.
Statistical inference on curved exponential families, linked with a novel notion of statistical curvature, information loss, etc. Great discussions of the paper by many statisticians.
- Differential geometry of smooth families of probability distributions, Hiroshi Nagaoka and Shun-ichi Amari,
METR (1982): 82-7.
Introduces the theory of dual connections and its key theorems (Pythagorean theorem, projections, etc.).
- Natural gradient works efficiently in learning, Shun-ichi Amari, Neural Computation 10.2 (1998): 251-276.
Introduces the natural gradient on Riemannian manifolds and demonstrates its efficiency theoretically.
- Second order efficiency of minimum contrast estimators in a curved exponential family,
Shinto Eguchi, The Annals of Statistics (1983): 793-803.
Introduces the information geometry of contrast functions/divergences.
- A characterization of monotone and regular divergences,
José Manuel Corcuera and Federica Giummole,
Annals of the Institute of Statistical Mathematics 50 (1998): 433-450.
Characterizes monotone divergences and regular divergences, with their Taylor expansions.
- Dependence, correlation and Gaussianity in independent component analysis,
Jean-François Cardoso, The Journal of Machine Learning Research 4 (2003): 1177-1203.
Information geometry in action for ICA.
- Information geometry on hierarchy of probability distributions, Shun-ichi Amari, IEEE Transactions on Information Theory 47.5 (2001): 1701-1711.
Mixed primal/dual coordinate systems, dual foliations, and divergence decomposition.
- Information geometry of Boltzmann machines,
Shun-ichi Amari, Koji Kurata and Hiroshi Nagaoka, IEEE Transactions on Neural Networks 3.2 (1992): 260-271.
Information geometry in action for neural networks.
- Exponentially concave functions and a new information geometry,
Soumik Pal and Ting-Kam Leonard Wong, The Annals of Probability 46.2 (2018): 1070-1113.
Logarithmic divergences extend Bregman divergences and are canonical divergences on manifolds of constant sectional curvature.