-----------------------------------
Classical and quantum probabilities
-----------------------------------

Suppose you have a system with N energy eigenstates |k> (k=1:N) of energies E_k, E_1<...<E_N, and take |k> to be the k-th unit vector. The Hamiltonian is then the diagonal matrix H=Diag(E_1,...,E_N) with diagonal entries E_1,...,E_N.

A general state of the system is described by a density matrix rho, a positive semidefinite Hermitian matrix of trace 1. In particular, the diagonal elements p_k := rho_{kk} are nonnegative and satisfy sum p_k = Tr rho = 1. Thus they look like probabilities.

Observables are represented by arbitrary Hermitian matrices X, and their expectation in the state rho is defined to be
   <X> = Tr rho X.

A classical system corresponds (in some sense) to the case where the only allowed states and observables are diagonal. Thus rho=Diag(p_1,...,p_N) and X=Diag(x_1,...,x_N), giving
   <f(X)> = Tr rho f(X) = sum p_k f(x_k).
This is precisely the formula for the expectation of a function f(x) of a random variable x that takes the value x_k when the random event k happens with probability p_k. (In measure-based probability theory, one would write omega in place of k, call it an elementary event, organize the possible events in a sigma algebra, and write x(omega) in place of x_k, thereby turning the random variable into a function of elementary events.)

The quantum case is therefore just the generalization of classical probability calculus to the case where densities and observables may be matrices rather than functions.

A very special class of states are the so-called pure states. These are characterized by the fact that all columns of rho are proportional to the same unit vector psi (called the state vector of the pure state). Because the density matrix must be Hermitian and have trace 1, it is not difficult to conclude that in this case rho=psi psi^*, where psi^* is the conjugate transpose of psi. (Proportional columns give rho = psi c^* for some vector c; Hermiticity forces c to be a multiple of psi, and Tr rho = 1 together with psi^* psi = 1 fixes the multiple to 1.)
Therefore, the diagonal elements are
   p_k = rho_{kk} = psi_k psi_k^* = |psi_k|^2,
which is the Born rule.

Thus nothing fancy is going on. But typical introductions to quantum mechanics make it unnecessarily mysterious by starting with the special case of pure states rather than with the (in reality much more frequent) case of a general (mixed) state. A notable exception is my online book
   Arnold Neumaier and Dennis Westra,
   Classical and Quantum Mechanics via Lie algebras, 2008.
   http://lanl.arxiv.org/abs/0810.1019

Note that in quantum optics one does not demand the condition Tr rho = 1, in order to be able to discuss matters of efficiency of detection. In this case, r := Tr rho denotes a rate of detection, and
   <X> := Tr rho X / Tr rho
defines the conditional expectation given a detection. The probability interpretation still follows with p_k = rho_{kk}/r. And the derivation of Born's rule is still valid, except that now the wave function is related to the density matrix by rho = r psi psi^*, so that p_k = |psi_k|^2 as before.
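
For readers who like to check such formulas numerically, here is a minimal sketch in plain Python (standard library only). It is an illustration under stated assumptions, not part of the derivation: the helper names (matmul, trace, diag, expectation) and all numerical values (p, x, psi, r) are made up for the example. It checks that the diagonal case reproduces the classical weighted sum, that rho = psi psi^* has diagonal |psi_k|^2, and that dividing by r = Tr rho recovers the same probabilities in the unnormalized case.

```python
# Illustrative sketch only; all names and numbers are hypothetical.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def diag(values):
    """Diagonal matrix with the given diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

def expectation(rho, x):
    """<X> = Tr(rho X) / Tr(rho); equals Tr(rho X) when Tr rho = 1."""
    return trace(matmul(rho, x)) / trace(rho)

# Classical case: diagonal state and diagonal observable.
p = [0.5, 0.3, 0.2]                  # probabilities p_k (assumed values)
x = [1.0, 2.0, 4.0]                  # values x_k of the random variable
classical = expectation(diag(p), diag(x))
weighted_sum = sum(pk * xk for pk, xk in zip(p, x))

# Pure state: rho = psi psi^*, so rho_kk = |psi_k|^2 (Born rule).
psi = [0.6, 0.8j]                    # unit vector: 0.36 + 0.64 = 1
rho_pure = [[a * b.conjugate() for b in psi] for a in psi]
born = [rho_pure[k][k].real for k in range(len(psi))]

# Unnormalized state (quantum optics convention): rho = r psi psi^*.
r = 0.25                             # detection rate (assumed value)
rho_rate = [[r * entry for entry in row] for row in rho_pure]
rate = trace(rho_rate).real
p_conditional = [rho_rate[k][k].real / rate for k in range(len(psi))]

print(classical, weighted_sum)       # the two numbers should agree
print(born, p_conditional)           # the two lists should agree
```

The function expectation divides by Tr rho on purpose, so that the same code covers both the normalized case and the quantum optics convention.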