In this posting series, I will introduce recent quantum machine learning algorithms such as QAOA (Quantum Approximate Optimization Algorithm), QKE (Quantum Kernel Estimator), and QPCA (Quantum Principal Component Analysis).

However, all of these papers come from physicists who are experts in quantum theory. Therefore, they use terminology from quantum physics that is not familiar to us, and we need to study the underlying concepts before reading the papers. In this posting, I’ll introduce the basic concepts of quantum physics.

This posting is based on the book “Supervised Learning with Quantum Computers” by M. Schuld and F. Petruccione.

1. Quantum Theory

Quantum Theory

Quantum Theory, more famously known as Quantum Mechanics, is ‘first and foremost a calculus for computing the probabilities of outcomes of measurements made on physical systems’. Unlike classical mechanics, which aims to describe a situation as it is, quantum theory aims to describe the possible outcomes of measurements and what gives rise to them.

It is useful because it can explain observations that classical mechanics cannot. However, because it is complicated, difficult, and in some parts still not intuitively understood, quantum theory has not replaced classical mechanics entirely.

According to the book, quantum theory can be understood as a generalization of classical probability theory in which outcomes are regarded as results of measurement. In classical probability theory we use the measure theory of real analysis (we have to distinguish “measure” from “measurement”: the former is a special function in mathematical analysis, the latter an experimental evaluation in physics) and interpret a probability as the size assigned to a set in a Borel sigma-algebra, which can be understood as the collection of all possible experimental outcomes. Quantum information theory then goes beyond measure-theoretic probability and analyzes these measures together with superposition and entanglement.

State

Assume there is a set of events (or observations) \(S = \{s_1, s_2, ..., s_N\}\). Then we can define a random variable X which takes the real values \(\{x_1, x_2, ..., x_N\}\) corresponding to those events. Recall that X is defined as a function mapping events to real values (the definition could be generalized further). For these values we define probabilities \(\{p_1, p_2, ..., p_N\}\) which satisfy \(\sum_{i=1}^{N} p_i = 1\). Then the expectation value of the random variable X is \(\langle X \rangle = \sum_{i=1}^{N} p_i x_i\) (note that we restricted the codomain of X to a real scalar field).

In quantum theory, we generalize these concepts. In the classical expectation, \(p_i\) is a real scalar in \((0,1)\). Let’s replace it with the square root \(\sqrt{p_i}\), which will later be generalized to a complex amplitude, and define the following vector and matrix:

\(q = \left[\begin{array}{c} \sqrt{p_1}\\ \sqrt{p_2}\\ \vdots \\ \sqrt{p_N} \end{array}\right]\), \(X = \left[ \begin{array}{ccc} x_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & x_N \end{array}\right]\)

Then we can rewrite the expectation value as \(\langle X \rangle = q^t X q = \sum_{i=1}^{N}p_i x_i\)
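As a quick sanity check (a minimal numpy sketch with made-up numbers, not from the book), we can verify that \(q^t X q\) reproduces the ordinary expectation \(\sum_i p_i x_i\):

```python
import numpy as np

# made-up example: three events with values x_i and probabilities p_i
x = np.array([1.0, 2.0, 3.0])
p = np.array([0.2, 0.3, 0.5])          # sums to one

q = np.sqrt(p)                          # q_i = sqrt(p_i)
X = np.diag(x)                          # diagonal matrix of the event values

print(q @ X @ q)                        # q^t X q
print(np.sum(p * x))                    # classical expectation, same value: 2.3
```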

Since we use matrices and vectors to represent events and probabilities, we can assign each event’s value \(x_i\) to the standard basis vector \(e_i\) and write the observable compactly as \(O = \sum_i x_i e_i e_i^t\). Furthermore, in the complex field there are infinitely many values whose squared magnitude equals a given probability. To be exact, therefore, \(q\) becomes \(\alpha = (\alpha_1, \alpha_2, ..., \alpha_N)^t\) with \(\mid \alpha_i \mid^2 = p_i\).

The vector \(\alpha\) is called the amplitude vector, and \(O\) is called an observable, which has to be a self-adjoint matrix. A self-adjoint, or Hermitian, matrix is the complex generalization of a symmetric matrix.

Finally, we call the linear combination of the amplitudes with the event basis, \(\sum_{i=1}^N \alpha_i e_i\), a state.
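To see why the complex amplitudes carry the same probabilistic content, here is a small sketch of my own (arbitrary phases, not from the book): multiplying each \(\sqrt{p_i}\) by a phase \(e^{i\theta_i}\) leaves \(\mid\alpha_i\mid^2 = p_i\) and the expectation \(\alpha^\ast O \alpha\) unchanged.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
p = np.array([0.2, 0.3, 0.5])

thetas = np.array([0.3, 1.1, -2.0])            # arbitrary phases
alpha = np.sqrt(p) * np.exp(1j * thetas)       # amplitudes with |alpha_i|^2 = p_i
O = np.diag(x)                                 # the observable (diagonal here)

print(np.abs(alpha)**2)                        # recovers p
print((alpha.conj() @ O @ alpha).real)         # same expectation value: 2.3
```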

Unitary Evolutions

In quantum physics, we are generally interested in how a state changes; such changes are called evolutions. In the discrete case they can be represented entirely with linear algebra. Recall the probability transition matrix (also called a stochastic matrix) of a Markov process.

\[MP = \left[ \begin{array}{ccc} m_{11} & \cdots & m_{1n} \\ \vdots & \ddots & \vdots \\ m_{n1} & \cdots & m_{nn} \end{array}\right]\left[ \begin{array}{c} p_{1}\\ \vdots \\ p_{n} \end{array}\right]=\left[ \begin{array}{c} p'_{1}\\ \vdots \\ p'_{n} \end{array}\right]\]

This means that a system with probability distribution \(\{p_1, p_2, ..., p_n\}\) at time t has the distribution \(MP\) at time \(t+1\). The probabilities must sum to one both before and after the transition (\(\sum_i^n p_i = \sum_i^n p_i' = 1\)). To guarantee this, each column of the stochastic matrix must sum to one.
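A small sketch (my own toy example) of one Markov step: the columns of M sum to one, so the updated vector is again a valid probability distribution.

```python
import numpy as np

# column-stochastic matrix: each column sums to one
M = np.array([[0.9, 0.2],
              [0.1, 0.8]])
p = np.array([0.5, 0.5])                 # distribution at time t

p_next = M @ p                           # distribution at time t+1
print(p_next, p_next.sum())              # [0.55 0.45], still sums to 1.0
```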

We can generalize this to quantum evolution.

\[U\alpha = \left[ \begin{array}{ccc} u_{11} & \cdots & u_{1n} \\ \vdots & \ddots & \vdots \\ u_{n1} & \cdots & u_{nn} \end{array}\right]\left[ \begin{array}{c} \alpha_{1}\\ \vdots \\ \alpha_{n} \end{array}\right]=\left[ \begin{array}{c} \alpha'_{1}\\ \vdots \\ \alpha'_{n} \end{array}\right] = \alpha'\]

For this notation, we need the constraint \(\sum_i^n \mid \alpha_i \mid^2 = \sum_i^n \mid \alpha_i' \mid^2 =1\)

To meet this restriction, U has to be a unitary matrix, which means \(U^\ast U = U U^{\ast}=I\)

A brief confirmation is as follows.

\[\alpha^{\ast} \alpha = \sum_i^n \mid \alpha_i \mid^2=1\] \[\alpha'^\ast \alpha' = \alpha^\ast U^\ast U \alpha =1 ,\forall \alpha \mbox{ such that }\alpha^\ast \alpha =1\]

By plugging the basis vectors \(e_i\) (and normalized combinations of them) in as \(\alpha\), we can conclude that \(U^\ast U = I\), i.e. that U is unitary.

Because any transformation operator in quantum theory has to be unitary, we know that \(U^{-1}=U^{\ast}\), which means that to reverse an evolution, all we need to do is apply the conjugate transpose of the operator.
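The same check in the quantum case (a toy sketch of my own, using the Hadamard matrix as a concrete unitary): U preserves the sum of squared amplitudes, and applying \(U^\ast\) undoes the evolution.

```python
import numpy as np

# a concrete unitary: the Hadamard matrix
U = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)

alpha = np.array([0.6, 0.8j])                      # |0.6|^2 + |0.8|^2 = 1
alpha_new = U @ alpha

print(np.sum(np.abs(alpha_new)**2))                # still 1.0
print(np.allclose(U.conj().T @ alpha_new, alpha))  # True: U* reverses the evolution
```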

In addition to unitary operators, there is another operation that changes the state. It is called measurement, and it collapses each probability to either one or zero.

Density Matrices

If we know the amplitude vector \(\alpha = (\alpha_1 , \alpha_2)^t\) of a state perfectly, the state is called a pure state. However, if we are uncertain whether our state follows \(\alpha\) or \(\beta\), the state is called a mixed state. Recall that in statistics a distribution composed of several densities is called a mixture.

In physics, the amplitude vector is usually represented in matrix form, called a density matrix. The density matrix \(\rho\) corresponding to a pure quantum state \(\alpha\) is given by the outer product \(\rho = \alpha \alpha^\ast\). According to the book, this is similar to a covariance matrix.

For a mixed state, where we do not know the exact state, we can write, for example, \(\rho = p_1 \alpha \alpha^\ast + p_2 \beta \beta^\ast\).
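A short sketch (my own two-dimensional example) of the two cases: a pure-state density matrix built from an outer product, and a mixed one built as a probability-weighted sum. Both are Hermitian with unit trace.

```python
import numpy as np

alpha = np.array([1.0, 0.0])                     # one pure state
beta = np.array([1.0, 1.0]) / np.sqrt(2)         # another pure state

rho_pure = np.outer(alpha, alpha.conj())         # rho = alpha alpha*
rho_mixed = 0.5 * np.outer(alpha, alpha.conj()) + 0.5 * np.outer(beta, beta.conj())

for rho in (rho_pure, rho_mixed):
    print(np.trace(rho).real,                    # both have trace 1
          np.allclose(rho, rho.conj().T))        # both are Hermitian
```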

\[\def\ket#1{\mid #1 \rangle}\]

2. The Postulates of Quantum Mechanics

From now on we will use Dirac notation. The basics of Dirac notation are covered in a previous posting.

Observables

As mentioned before, observables have to be self-adjoint, a.k.a. Hermitian, matrices. Every Hermitian matrix can be decomposed as \(O = \underset{i}{\sum} o_i \ket{e_i} \langle e_i \mid\), which is known as its spectral decomposition.

There is a theorem that \(f(O) = \underset{i}{\sum}f(o_i)\ket{e_i} \langle e_i \mid\).

\[\begin{aligned} f(O) &= \underset{k=0}{\overset{\infty}{\sum}}\frac{f^{(k)}(0)}{k!}O^k \\ &= \underset{k=0}{\overset{\infty}{\sum}}\frac{f^{(k)}(0)}{k!} \underset{i}{\sum} o_i^k \ket{e_i} \langle e_i \mid \\ &= \underset{i}{\sum} \underset{k=0}{\overset{\infty}{\sum}}\frac{f^{(k)}(0)}{k!} o_i^k \ket{e_i} \langle e_i \mid \\ &= \underset{i}{\sum} f(o_i) \ket{e_i} \langle e_i \mid \end{aligned}\]

We can use this theorem for any analytic function f.

The most important case is \(f(A) = \exp(i\epsilon A)\), because for any Hermitian matrix H, \(\exp(i\epsilon H)\) is a unitary matrix: \(\exp(i\epsilon H)^{\ast}\exp(i\epsilon H) = \exp(-i\epsilon H +i \epsilon H) = \exp(0)=I\), where combining the exponents is allowed because \(-i\epsilon H\) and \(i\epsilon H\) commute.
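Here is a quick numerical check (an illustrative sketch of my own, not from the book) that \(\exp(i\epsilon H)\) is unitary and that it matches the spectral-decomposition formula; scipy’s `expm` provides the matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

# a Hermitian matrix
H = np.array([[1.0, 1.0],
              [1.0, -1.0]])
eps = 0.7

U = expm(1j * eps * H)

# check unitarity: U* U = I
print(np.allclose(U.conj().T @ U, np.eye(2)))    # True

# same result from the spectral decomposition f(H) = sum_i f(o_i) |e_i><e_i|
o, e = np.linalg.eigh(H)
U_spec = sum(np.exp(1j * eps * o_i) * np.outer(e[:, i], e[:, i].conj())
             for i, o_i in enumerate(o))
print(np.allclose(U, U_spec))                    # True
```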

The expectation value of a quantum mechanical observable O can be calculated as follows.

\(\langle O \rangle = tr(\rho O)\), i.e. the trace of the product of the density matrix and the observable.

It can be rewritten in the following form if \(\rho = \ket{\psi} \langle \psi\mid\):

\(\langle O \rangle = tr(\ket{\psi} \langle \psi\mid O ) = tr(\langle \psi\mid O \ket{\psi}) = \langle \psi\mid O \ket{\psi}\), using the cyclic property of the trace (and the fact that the trace of a scalar is the scalar itself).

The latter form is more commonly seen than the former.
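A quick check with my own numbers (not from the book) that the two forms agree for a pure state:

```python
import numpy as np

psi = np.array([np.sqrt(0.3), np.sqrt(0.7) * 1j])   # a normalized state
O = np.array([[2.0, 1.0],
              [1.0, -1.0]])                          # a Hermitian observable

rho = np.outer(psi, psi.conj())                      # rho = |psi><psi|

print(np.trace(rho @ O).real)                        # tr(rho O)
print((psi.conj() @ O @ psi).real)                   # <psi|O|psi>, same value
```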

Time Evolution

There is a famous equation in quantum physics: the Schrödinger equation.

\[i\hbar \frac{d}{d t} \ket{\psi} = H \ket{\psi}\]

It represents the time evolution of a quantum mechanical system. H denotes the Hamiltonian, which describes how the quantum system changes, and \(\hbar\) is the reduced Planck constant.

By solving this differential equation, we can derive the solution for \(\ket{\psi(t)}\) when the Hamiltonian is time-independent, as follows.

\[\ket{\psi(t)} = \exp(-i \frac{t}{\hbar}H) \ket{\psi(0)}\]

which shows that if we know the initial state \(\ket{\psi(0)}\) and the Hamiltonian H governing its motion, then we know the exact state of \(\psi\) after any time t. We can verify the solution by differentiating:

\[\frac{d}{dt} \ket{\psi(t)} = -\frac{i}{\hbar}H \exp(-i\frac{t}{\hbar}H)\ket{\psi(0)}\] \[i\hbar \frac{d}{dt}\ket{\psi(t)} = H \exp(-i \frac{t}{\hbar}H) \ket{\psi(0)} = H \ket{\psi(t)}\]
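A numerical sketch of this solution (my own toy Hamiltonian, with \(\hbar\) set to 1 for simplicity): the propagated state stays normalized, and a finite-difference derivative reproduces the Schrödinger equation.

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5],
              [0.5, -1.0]])                 # a time-independent Hamiltonian (hbar = 1)
psi0 = np.array([1.0, 0.0], dtype=complex)  # initial state |psi(0)>

def psi(t):
    return expm(-1j * t * H) @ psi0         # |psi(t)> = exp(-i t H) |psi(0)>

t, dt = 1.3, 1e-6
dpsi_dt = (psi(t + dt) - psi(t - dt)) / (2 * dt)   # numerical time derivative

print(np.linalg.norm(psi(t)))                      # 1.0: the norm is preserved
print(np.allclose(1j * dpsi_dt, H @ psi(t)))       # True: i d/dt |psi> = H |psi>
```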

There is a similar equation called the von Neumann equation.

\[i\hbar \frac{d}{dt} \rho(t) = i\hbar \frac{d}{dt} \underset{i}{\sum}p_i \ket{\psi_i}\langle \psi_i \mid = [H , \rho]\]

where \([A,B] = AB-BA\) denotes the commutator.

This equation can be regarded as the density matrix version of the Schrödinger equation.

Its solution is \(\rho(t) = \exp(-i\frac{t}{\hbar}H) \rho(0) \exp(i\frac{t}{\hbar}H)\)
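The corresponding check for the density matrix version (again my own sketch, with \(\hbar = 1\) and the initial time taken as 0): propagate \(\rho\) with the two exponentials and compare a finite-difference derivative with the commutator.

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5],
              [0.5, -1.0]])                  # Hamiltonian (hbar = 1)
rho0 = np.diag([0.7, 0.3]).astype(complex)   # a mixed initial state

def rho(t):
    U = expm(-1j * t * H)
    return U @ rho0 @ U.conj().T             # rho(t) = exp(-itH) rho(0) exp(itH)

t, dt = 0.8, 1e-6
drho_dt = (rho(t + dt) - rho(t - dt)) / (2 * dt)
commutator = H @ rho(t) - rho(t) @ H

print(np.allclose(1j * drho_dt, commutator))  # True: i d/dt rho = [H, rho]
```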

Using this formalism, we can also describe transformations of only a smaller portion (a subsystem) of the state.

Quantum Measurement

For the projector \(P_n\) onto the eigenspace of a specific measurement outcome \(o_n\) of an observable, the probability of obtaining that outcome from a given state is given by the Born rule.

\[Pr(o_n) = tr(P_n \rho) \\ = tr(P_n \ket{\psi}\langle \psi \mid)\\ = \parallel P_n \ket{\psi} \parallel^2\]

Using this, we can derive the probability that a state yields a specific measurement result.

After the measurement, the state becomes \(\ket{\psi} \rightarrow \frac{P_n\ket{\psi}}{\sqrt{\langle\psi\mid P_n \ket{\psi}}}\), and in density matrix form it becomes \(\rho \rightarrow \frac{P_n \rho P_n}{tr(P_n \rho)}\)
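A final sketch (my own two-outcome example): the Born rule probability for a projector, and the renormalized post-measurement state in both the state-vector and density-matrix forms.

```python
import numpy as np

psi = np.array([np.sqrt(0.3), np.sqrt(0.7)])       # a normalized state
P0 = np.diag([1.0, 0.0])                           # projector onto the first basis state

prob = np.linalg.norm(P0 @ psi)**2                 # Born rule: ||P_n |psi>||^2
print(prob)                                        # 0.3

post = P0 @ psi / np.sqrt(psi.conj() @ P0 @ psi)   # post-measurement state
print(post)                                        # [1., 0.]

rho = np.outer(psi, psi.conj())
print(np.trace(P0 @ rho).real)                     # same probability via tr(P_n rho): 0.3
print(P0 @ rho @ P0 / np.trace(P0 @ rho))          # density-matrix form of the update
```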


Even after studying this unfamiliar notation, there are more fundamentals that have to be studied. In the next posting, we’ll look at information encoding methods in quantum computing.


Schuld, M. & Petruccione, F. (2019). Supervised Learning with Quantum Computers. Cham, Switzerland: Springer.