-----------------------------------
Classical and quantum probabilities
-----------------------------------
Suppose you have a system with N energy eigenstates |k> (k=1:N) of
energies E_k, E_1<...<E_N, and take |k> to be the k-th unit vector.
The Hamiltonian is then the diagonal matrix H=Diag(E_1,...,E_N) with
diagonal entries E_1,...,E_N. A general state of the system is
described by a density matrix rho, a positive semidefinite Hermitian
matrix of trace 1. In particular, the diagonal elements p_k:=rho_{kk}
are nonnegative and satisfy sum p_k = Tr rho = 1. Thus they look like
probabilities. Observables are represented by arbitrary Hermitian
matrices X, and their expectation in the state rho is defined to be
<X> = Tr rho X.
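
As a minimal sketch of these definitions, the following Python snippet
evaluates Tr rho X for a two-level system. The numerical entries of rho
and X and the helper name trace_product are assumptions chosen only for
illustration; they do not come from any particular physical system.

```python
# Illustrative sketch: expectation <X> = Tr(rho X) for a 2-level system.
# All numerical values are made up for illustration.

def trace_product(a, b):
    """Return Tr(a b) for two square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

# Density matrix: Hermitian, trace 1, eigenvalues 0.8 and 0.2
# (hence positive semidefinite).
rho = [[0.7, 0.2 - 0.1j],
       [0.2 + 0.1j, 0.3]]

# Observable: an arbitrary Hermitian matrix.
X = [[1.0, 2.0 - 1.0j],
     [2.0 + 1.0j, -1.0]]

# Diagonal elements p_k = rho_kk; nonnegative, summing to Tr rho = 1.
p = [complex(rho[k][k]).real for k in range(len(rho))]

# Tr(rho X) is real whenever rho and X are both Hermitian.
expectation = trace_product(rho, X)

print("p_k =", p)
print("<X> =", expectation.real)
```

The off-diagonal entries of rho contribute to the expectation here,
which is exactly what has no counterpart in the diagonal case.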
A classical system corresponds (in some sense) to the case where the
only allowed states and observables are diagonal. Thus
rho=Diag(p_1,...,p_N) and X=Diag(x_1,...,x_N), giving
<f(X)> = Tr rho f(X) = sum p_k f(x_k).
This is precisely the formula for the expectation of a function f(x)
of a random variable x that takes the values x_k when the random event
k happens with probability p_k. (In measure-based probability theory,
one would write omega in place of k, call it an elementary event,
organize the possible events in a sigma algebra, and write x(omega)
in place of x_k, thereby turning the random variable into a function
of elementary events.)
The quantum case is therefore just the generalization of classical
probability calculus to the case where densities and observables may
be matrices rather than functions.
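
The diagonal (classical) case can be checked numerically with a short
sketch. The probabilities p, the values x, and the function f below are
assumed example data, and diag and trace_product are hypothetical helper
names introduced only for this illustration.

```python
# Illustrative sketch: for diagonal rho and X, Tr(rho f(X)) reduces to the
# classical expectation sum_k p_k f(x_k). All data are example values.

def diag(v):
    """Return the diagonal matrix Diag(v_1,...,v_N) as nested lists."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def trace_product(a, b):
    """Return Tr(a b) for two square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

def f(t):
    """Example function of the random variable."""
    return t * t

p = [0.5, 0.3, 0.2]   # probabilities p_k, summing to 1
x = [1.0, 2.0, 4.0]   # values x_k of the random variable

rho = diag(p)
# A function of a diagonal matrix acts entrywise on the diagonal.
fX = diag([f(t) for t in x])

matrix_form = trace_product(rho, fX)
classical_form = sum(pk * f(xk) for pk, xk in zip(p, x))

print(matrix_form, classical_form)
```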
A very special class of states consists of the so-called pure states.
These are characterized by the fact that all columns of their density
matrix are proportional to the same unit vector psi (called the state
vector of the pure state). Because the density matrix must be Hermitian
and have trace 1, it is not difficult to conclude that in this case
rho=psi psi^*, where psi^* is the conjugate transpose of psi.
Therefore, the diagonal elements are
p_k = rho_{kk} = psi_k psi_k^* = |psi_k|^2,
which is the Born rule.
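
A small sketch makes the pure-state relation concrete. The state vector
psi below is an assumed example (any unit vector would do); the snippet
builds rho = psi psi^* and compares its diagonal with |psi_k|^2.

```python
# Illustrative sketch: a pure state rho = psi psi^* and its diagonal.
# The state vector is an arbitrary example with |psi|^2 = 0.36 + 0.64 = 1.

psi = [0.6 + 0j, 0.8j]
n = len(psi)

# Outer product psi psi^*: entry (i, j) is psi_i * conj(psi_j), so
# column j is psi scaled by conj(psi_j).
rho = [[psi[i] * psi[j].conjugate() for j in range(n)] for i in range(n)]

# rho is Hermitian and has trace 1.
is_hermitian = all(abs(rho[i][j] - rho[j][i].conjugate()) < 1e-12
                   for i in range(n) for j in range(n))
trace = sum(rho[k][k] for k in range(n)).real

# Diagonal elements versus squared moduli of the components.
diagonal = [rho[k][k].real for k in range(n)]
born = [abs(c) ** 2 for c in psi]

print("Hermitian:", is_hermitian, " trace:", trace)
print("rho_kk    =", diagonal)
print("|psi_k|^2 =", born)
```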
Thus nothing fancy is going on. But typical introductions to quantum
mechanics make it unnecessarily mysterious by starting with the
special case of pure states rather than with the (in reality much more
frequent) case of a general (mixed) state. A notable exception is my
online book
Arnold Neumaier and Dennis Westra,
Classical and Quantum Mechanics via Lie algebras, 2008.
http://lanl.arxiv.org/abs/0810.1019
Note that in quantum optics one does not demand the condition
Tr rho = 1, in order to be able to discuss matters of efficiency of
detection. In this case, r:= Tr rho denotes a rate of detection, and
<X> := Tr rho X / Tr rho
defines the conditional expectation given a detection.
The probability interpretation still follows, with p_k=rho_{kk}/r.
And the derivation of Born's rule is still valid except that now the
wave function is related to the density matrix by rho = r psi psi^*.
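
The unnormalized case can be sketched in the same way. The detection
rate r_assumed, the state vector psi, and the observable X below are
assumed example values, and trace_product is a hypothetical helper; the
snippet recovers r from the trace and then forms the conditional
quantities.

```python
# Illustrative sketch: unnormalized density matrix rho = r psi psi^*.
# The rate, state vector and observable are example values only.

def trace_product(a, b):
    """Return Tr(a b) for two square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

r_assumed = 0.5               # assumed detection rate
psi = [0.6 + 0j, 0.8j]        # unit state vector
n = len(psi)

rho = [[r_assumed * psi[i] * psi[j].conjugate() for j in range(n)]
       for i in range(n)]

# The rate of detection is recovered as r = Tr rho.
r = sum(rho[k][k] for k in range(n)).real

# Conditional probabilities p_k = rho_kk / r again equal |psi_k|^2.
p = [rho[k][k].real / r for k in range(n)]

# Conditional expectation <X> = Tr(rho X) / Tr(rho) for a diagonal X.
X = [[1.0, 0.0],
     [0.0, -1.0]]
expectation = (trace_product(rho, X) / r).real

print("r =", r)
print("p_k =", p)
print("<X> =", expectation)
```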