[su_box title=”Summary”]
  • We’ll invent quantum mechanics from scratch.
  • You’ll understand how the framework of Quantum Mechanics works.
  • We’ll talk about the best textbooks.

Nowadays there are hundreds of books and other resources about quantum mechanics and, especially as a beginner, it’s quite hard to decide what is essential and what is fluff. The spectrum ranges from layman books to hardcore mathematical treatments. The former tend to stray into mysticism, and the latter tend to lose sight of what physics is all about: describing nature.

Happily, we are free to ignore things that don’t really help us and can focus on the shortest path to a deep understanding of quantum mechanics. Of course you can fight for a year or so with some hardcore, abstract treatment, but there is no guarantee that such torture will give you an advantage, or a deep understanding in any sense.

The same is true for books full of fluffy words. Bad writing has the tendency to confuse instead of to illuminate.

I guess that’s enough negativity for today. Now to the good news: quantum mechanics isn’t as esoteric as many layman books want to make you believe, and there is no need to study dry mathematics before being “allowed” to read about the “good stuff”.

Quantum mechanics is, if explained from the correct perspective, a quite natural framework to describe nature at the most fundamental level.

Now, where to start? Let’s get started with some crucial concepts of quantum mechanics and talk about the best textbooks afterwards. Ready?

Inventing quantum mechanics from scratch

Let’s imagine for a moment that quantum mechanics, Newton’s classical mechanics and Maxwell’s electrodynamics hadn’t been discovered yet and we start thinking about nature. Not just nature, but nature at the most fundamental level. Basically, what the Greeks did roughly 2,500 years ago, but with the power of hindsight.

Firstly, no matter how we define fundamental, it is quite rational to assume there are some basic building blocks everything else is made of. The Greeks called them atoms (atomos is Greek for indivisible), but because someone in the modern world decided to call something that is not fundamental atoms, we must use the slightly ugly term elementary particles.


Will the framework that is perfectly suited to describe everyday objects like a rolling ball be a good fit to describe elementary particles? How do we know something about the rolling ball?

Well, we need some light and then we look at it, film it with a camera or use some other sophisticated motion tracking device. We can do this all the time. Does the ball care about the lamp or the camera? Of course not! A ball does not behave differently in the dark.

How do we know something about elementary particles? Well, we need to measure. Of course this is just a fancy term for the process described in the last paragraph, but it‘s a useful one as we will see. Again, we need some light and a camera. Does the elementary particle care about the lamp and the camera? Of course it does!


The rolling ball would behave differently, too, if we measure its position by shooting tennis balls at it and analyze the direction of the tennis balls afterwards.

That is not the usual measurement technique for rolling balls, because we can shoot something much smaller at it that does not change its behaviour. For fundamental particles we have no choice.


Light itself consists of elementary particles, called photons. To learn something about fundamental objects we have no choice but to use objects of similar size. There is nothing smaller, because that is how we define the fundamental scale.

Now we can prepare ourselves for some seemingly crazy property a theory describing nature at the most fundamental scale must have.

For the rolling ball we can in principle compute how the rolling ball changes its momentum if we shoot tennis balls at it, because we control the location and momentum of the tennis balls that we use to measure its properties, like its location and its momentum.

On the fundamental scale we aren’t able to completely control the particles we use to measure, because how would we do that? There is nothing smaller, and every time we measure the properties of the particles that we use to measure, we change their properties. In addition, how should we know the properties of the particles that we use to measure the properties of the particles that we use to measure?

On the fundamental scale, in order to measure we have no choice but to use particles of similar size. This process necessarily changes the properties of the particles in question, because again we can‘t control their properties completely.

Changing back to the macroscopic example, we can imagine what it would be like if we could only measure the location and momentum of the yellow tennis balls by shooting red tennis balls onto them. Then we could use blue tennis balls to measure the properties of the red balls, but how would we know their properties? As long as we have only tennis balls and nothing fundamentally smaller there is no way to measure their properties with arbitrary precision.

Of course for macroscopic objects we have smaller things, but on the fundamental scale there is nothing smaller and this is why we will necessarily have some randomness to account for in our framework. In mathematical terms this requires that we need to talk about probabilities.

Short disclaimer: Please don’t get confused by the notion of size, because it’s terribly ill-defined for elementary particles. Energy or momentum might be more appropriate. I hope the message I’m trying to get across is clear whatever we call it. Macroscopic objects, like a ball, don’t change their behaviour if we shoot elementary particles at them. In contrast, elementary particles behave differently when we shoot elementary particles at them and, unfortunately, in order to learn something about them we have no other choice.

For macroscopic objects there is no need to talk about measurements, because a few photons crashing into the big ball aren’t a big deal. Therefore a framework suited to describe a rolling ball works perfectly without anything that accounts for measurements.

It should be clear by now that the same framework is ill-suited for describing elementary particles.

Let’s build a better framework. Remember, we know nothing about the conventional theories and want to start from scratch. We know nothing about an elementary particle until we measure its properties, like position or momentum. (The same is of course true for the rolling ball, but on the macroscopic scale the statement is trivial, because when we talk about the position and momentum of the ball it’s clear that we mean the position and momentum that we measure with the camera etc.)

Our framework must be able to account for the possibility that our measurement changes the properties of the particle in question. Therefore, we introduce measurement operators $\hat O$. For example, the momentum operator $\hat p$ or the position operator $\hat x$. (For macroscopic objects we can use ordinary numbers, but numbers do not really change anything. On the fundamental scale we need a mathematical concept that is able to change something.)

Next, we need something that describes the elementary particle, or to be more general the physical system in question. Our framework should be able to describe lots of different situations and therefore we simply invent something abstract $| \Psi >$, that describes the state of our physical system or of the elementary particle.

If we measure the momentum of an elementary particle, we have in our framework $\hat p | \Psi > $. The measurement operator acts on the object that describes the elementary particle. Okay, what now?

At this point it is a good idea to talk to the best friends of every physicist: Mathematicians. Happily, mathematicians have thought about the things we need for our framework, like operators, for quite some time. All we need to do to develop our framework further is ask some mathematician: What can you tell me about operators? [div  class=”bubble-container”]

[div  class=”bubble”] Firstly, operators have abstract (!) eigenvectors and eigenvalues. Famous examples are matrices, but be aware of the fact that not every operator in mathematics is a matrix. The definition of an eigenvector is that the corresponding operator only rescales it and leaves its direction unchanged. Concretely this means
$$ \hat O | o> = E_o | o>, $$

if $ | o> $ is an eigenvector of $\hat O$ and $E_o$ is the corresponding eigenvalue. In contrast, for a general, abstract vector $ | x>$, we have
$$ \hat O | x> = z | y >. $$
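
To make the two equations a little more tangible, here is a minimal sketch in Python. It uses a 2×2 matrix as a stand-in for $\hat O$; the matrix and the helper names `apply` and `is_eigenvector` are assumptions made up for this illustration, and real measurement operators are usually far bigger than this.

```python
# Toy stand-in for an operator: a 2x2 matrix (an assumption for illustration).
O = [[2.0, 0.0],
     [0.0, 3.0]]

def apply(op, vec):
    """Act with a matrix operator on a vector (plain matrix-vector product)."""
    return [sum(op[i][j] * vec[j] for j in range(len(vec))) for i in range(len(op))]

def is_eigenvector(op, vec, tol=1e-12):
    """Return (True, eigenvalue) if op only rescales vec, else (False, None)."""
    out = apply(op, vec)
    # Use the first non-zero component of vec to read off the candidate eigenvalue.
    k = next(i for i, v in enumerate(vec) if abs(v) > tol)
    lam = out[k] / vec[k]
    ok = all(abs(out[i] - lam * vec[i]) <= tol for i in range(len(vec)))
    return (True, lam) if ok else (False, None)

eigvec = [1.0, 0.0]    # plays the role of |o>
general = [1.0, 1.0]   # plays the role of a general |x>

print(is_eigenvector(O, eigvec))   # only rescaled, direction unchanged
print(is_eigenvector(O, general))  # direction changes, so not an eigenvector
```

The first vector comes back as a multiple of itself, the second comes back pointing somewhere else, which is exactly the difference between the two equations.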



The number of eigenvectors and eigenvalues depends on the operator and in physics often we will have to deal with infinitely many. Does this sound strange?

It shouldn‘t, because how else should we interpret the eigenvalues of a measurement operator, if not as the actual numbers we measure in experiments? There are infinitely many different momenta we can measure and therefore we need an infinite number of eigenvalues and eigenvectors for the momentum operator. (You‘ll learn soon enough what an operator with an infinite number of eigenvalues looks like.)

Back to our framework: what can we say about the abstract $| \Psi >$ we introduced in order to account for our elementary particle or physical system? The crucial ingredients of our framework are the measurement operators.

Let’s listen again to what our mathematician has to tell us about our abstract $| \Psi >$:

[div  class=”bubble-container”]

[div  class=”bubble”]

Every operator that represents a measurable quantity (mathematicians call such operators Hermitian) has a set of eigenvectors that form a basis for the corresponding vector space. The abstract $| \Psi >$ must live in this vector space, because otherwise acting with the measurement operators on it makes no sense, and we conclude that the $| \Psi >$ are vectors in an abstract sense.

The defining feature of a basis is that we can write every element of this vector space, which includes $| \Psi >$, as a linear combination of eigenvectors.

For a general $| \Psi >$ we have

$$ \hat O | \Psi > = | \Phi >, $$

because in general $| \Psi >$ is no eigenvector of $\hat O $. Nevertheless, we can write

$$ | \Psi > = \sum_o c_o | o> , $$

with some numbers $c_o$ that are called the coefficients in this series expansion. This yields

\begin{align} \hat O | \Psi > &= \hat O \sum_o c_o | o > \notag \\
&= \sum_o c_o \hat O | o > \notag \\
&= \sum_o c_o E_o | o > \notag \\
&= \sum_o d_o | o > \notag \\
&= |\Phi > , \end{align}

which we can see as a series expansion of $|\Phi >$ in terms of the basis $ | o> $, now with different coefficients $d_o = c_o E_o$.
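
The chain of equalities above can be checked numerically. The following is only a sketch under stated assumptions: the 2×2 symmetric matrix `O`, its hand-computed eigenvectors in `basis`, and the vector `psi` are all invented for the illustration.

```python
import math

# Toy operator (an illustrative assumption): a real symmetric 2x2 matrix.
O = [[2.0, 1.0],
     [1.0, 2.0]]

# Its normalized eigenvectors |o> and eigenvalues E_o, worked out by hand.
s = 1.0 / math.sqrt(2.0)
basis = [([s, s], 3.0),
         ([s, -s], 1.0)]

def dot(a, b):
    """Scalar product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))

def apply(op, vec):
    """Matrix-vector product: the operator acting on a vector."""
    return [dot(row, vec) for row in op]

psi = [1.0, 0.0]  # a general vector, not an eigenvector of O

# Coefficients c_o of psi in the eigenbasis, and d_o = c_o * E_o.
c = [dot(vec, psi) for vec, _ in basis]
d = [c_o * E_o for c_o, (_, E_o) in zip(c, basis)]

# Rebuild |Phi> = sum_o d_o |o> and compare with acting on psi directly.
phi_from_sum = [sum(d_o * vec[i] for d_o, (vec, _) in zip(d, basis)) for i in range(2)]
phi_direct = apply(O, psi)

print(phi_from_sum, phi_direct)  # the two agree up to rounding
```

Both routes give the same vector: acting with the operator directly, or expanding first and multiplying every coefficient by its eigenvalue.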




That may seem like a lot of unnecessary yada yada. Maybe you wonder: Why all this talk about general things like $|\Psi >$ ? Don‘t we only need eigenvectors where we get some easy answer when we act on it with our operators?

If the eigenvalues are the result of our measurements, what does something like $ \sum_o c_o E_o | o>$ mean? If we measure the momentum of an elementary particle, we get a definite number, and here in the formalism we get a wild sum.

You’re right! Of course, as long as we don‘t measure the properties of our elementary particle, we don‘t know its properties and therefore we need a sum that includes all possibilities to describe it.

The same is true for an ordinary rolling ball. If we have an elementary particle with definite momentum, which we can adjust for example with magnetic fields, and then measure its momentum, we get of course just a number: its momentum. An elementary particle with prepared definite momentum, say $4 \frac{kg m}{s}$ is of course described by an eigenvector $| 4 \frac{kg m}{s} >$ of the momentum operator $ \hat p$ and we have

$$ \hat p | 4 \frac{kg m}{s} > = 4 \frac{kg m}{s} | 4 \frac{kg m}{s} >. $$

For situations like this our formalism is really easy and works as we would expect it, but now prepare for a big surprise!

Our good friend, the mathematician, wants to explain to us something else about operators:

[div  class=”bubble-container”]

[div  class=”bubble”]

Operators have some curious feature that ordinary numbers do not have. If we have two operators $ \hat p $ and $ \hat x $ with different eigenvectors $| p>$ and $| x>$, we have in general
$$ \hat p \hat x | x> \neq \hat x \hat p | x> , $$

because $ | x>$ is not an eigenvector of $\hat p $ and therefore we have $\hat p | x>= z | y>$. We have on the one hand:
$$ \hat x \hat p | x> = \hat x z | y> \text{ and on the other hand } \hat p \hat x | x> = \hat p E_x | x>$$

There is no reason why these two terms should be the same and in general they are really different. (There are operator pairs where $ \hat O_1 \hat O_2 | x> = \hat O_2 \hat O_1 | x>$, but in general this is not the case.)

[end-div] [end-div]
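
Here is a small sketch of what the mathematician means, in Python. The two 2×2 matrices `A` and `B` are toy stand-ins chosen for the illustration; they are not the actual position and momentum operators, which cannot be written as small matrices like these.

```python
# Two toy 2x2 operators (illustrative assumptions, not the real x and p).
A = [[0, 1],
     [1, 0]]
B = [[1, 0],
     [0, -1]]

def apply(op, vec):
    """Matrix-vector product: the operator acting on a vector."""
    return [sum(op[i][j] * vec[j] for j in range(len(vec))) for i in range(len(op))]

v = [1, 0]  # an eigenvector of B (eigenvalue 1), but not of A

ab = apply(A, apply(B, v))  # first B, then A
ba = apply(B, apply(A, v))  # first A, then B

print(ab, ba)  # the two orders give different vectors
```

The order matters because `v` is an eigenvector of only one of the two operators, just as in the argument above.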

What does this mean? Translated into words, this means that a measurement of the location followed by a measurement of the momentum is in general something different from a measurement of the momentum followed by a measurement of the location. A measurement of the location changes the momentum and vice versa!

We already talked about how on a fundamental scale we can‘t expect that our measurement does not change anything, but what we have here isn‘t some effect that appears because our experiment was designed poorly. It is intrinsic! We are always changing the location if we measure the momentum.

There is no way to avoid this, and this is why we need to talk about general, abstract things like $|\Psi >$.

As explained above, we can prepare our particle with definite momentum and we get a simple answer if we act with our measurement operator on the corresponding vector. But what if we want to know its location, too?

An eigenvector of $ \hat p $ is not necessarily an eigenvector of $ \hat x$ and therefore acting with $ \hat x $ on our carefully prepared particle results in a big question mark:

$$ \hat x |4 \frac{kg m}{s} > = ? $$

We can expand $ |4 \frac{kg m}{s} > $ in terms of eigenvectors of $\hat x$ and then the question mark turns into a sum:

$$ \hat x |4 \frac{kg m}{s} > = \hat x \sum_x c_x |x > = \sum_x c_x \hat x |x > = \sum_x c_x E_x |x >. $$

All this to come to a dead end? We must find some interpretation for this if we want to make sense of our framework.

Let‘s recap how we got here. We started by thinking about one crucial feature our framework must include: Measurements.

This led us to the notion of a measurement operator, which must be one of the cornerstones of our framework, because we know nothing until we measure. Then we took a look at what mathematics can tell us about operators and we learned many things that make a lot of sense. Unfortunately, the last feature is somewhat weird.

Happily, there is another mathematical idea that will help us make sense of all this: the vector spaces we need here come with a scalar product. This means it’s possible to combine two vectors in such a way that the result is an ordinary, possibly complex, number.

For the moment don‘t worry about the details, just imagine some trustworthy mathematician told you and now we want to think about the physical interpretation.

Not caring about any details, we denote our scalar product of two vectors $ |y >$ and $ |z >$ abstractly:

$$ <z| |y > = \text{ some number} $$

The bases we talked about earlier have the nice feature of being orthogonal, which means we have

$$ <o'| |o > = 0 $$

unless $o' = o$, then we have

$$ <o| |o > = \text{ some number} . $$
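
For readers who like to see numbers, here is a minimal sketch of such a scalar product in Python. The function name `braket` and the two example vectors are assumptions for the illustration. For vectors with complex components, the usual convention is to complex-conjugate the components of the left vector before multiplying and summing.

```python
import math

def braket(z, y):
    """Scalar product <z|y>: conjugate the components of z, multiply, and sum."""
    return sum(complex(a).conjugate() * b for a, b in zip(z, y))

s = 1.0 / math.sqrt(2.0)
o1 = [s, s * 1j]    # two example basis vectors with complex components
o2 = [s, -s * 1j]

print(abs(braket(o1, o2)))  # different basis vectors: zero up to rounding
print(abs(braket(o1, o1)))  # same vector: the "some number", here normalized to 1
```

Choosing the basis vectors so that the "some number" equals 1 is a common convenience, not something the argument here depends on.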

Going back to our example from above we can now see that this is exactly the missing puzzle piece. We have a sum that we couldn‘t interpret in any sensible way:

$$ \sum_x c_x E_x |x > . $$

Recall what we mean by $ |x >$. We use the abstract eigenvector $|x >$ to describe an elementary particle at position $x$.

For example, if we had prepared the location $x=5$ cm, using some coordinate system, instead of the momentum, we would have $|5 cm >$ as abstract object that describes our particle. But we decided to prepare our particle with definite momentum and therefore we describe it by a momentum eigenvector: $|4 \frac{kg m}{s} >$.

Acting with the location operator on this eigenvector results in the sum. Now we can multiply $ \hat x |4 \frac{kg m}{s} > $ with for example $<5 cm |$, which means we compute the scalar product and make sense of the big question mark, because the result will be a number instead of a sum of many, many vectors! We have

$$ <5 cm | \hat x |4 \frac{kg m}{s} > = <5 cm| \sum_x c_x E_x |x >  = \sum_x c_x E_x <5 cm | |x > $$

Every term in this sum is zero except for $|x >= |5 cm >$ and therefore we have

\begin{align} <5 cm | \hat x |4 \frac{kg m}{s} > &= <5 cm| \sum_x c_x E_x |x >  \\ &= \sum_x c_x E_x <5 cm | |x > \\ &= c_{5 cm} E_{5 cm} <5 cm | |5 cm > \\ &= c_{5 cm} E_{5 cm} \times \text{ some number} \end{align}

Still confused? The sum has gone away because we multiplied it with the vector that we would use to describe a particle sitting at the definite location $x=5$ cm. We can do the same with vectors for different locations like $|7 cm >$ or $|99 cm >$ and get a different number.

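These numbers contain the probability amplitudes for measuring that the particle we prepared with momentum $4 \frac{kg m}{s}$ sits at $x=5$ cm, $x=7$ cm or $x=99$ cm: the coefficient $c_x$ is the amplitude, and the eigenvalue $E_x$ is simply the position it belongs to. The probability itself is the squared absolute value of the amplitude.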
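
A toy version of this calculation, as a sketch in Python: a two-dimensional space with just two possible "positions", 5 and 7. The operator `X`, the dictionary `positions` and the prepared state `psi` are all assumptions invented for the illustration. The step from amplitude to probability (squaring the absolute value) is the standard rule of quantum mechanics rather than something derived in this post.

```python
import math

s = 1.0 / math.sqrt(2.0)
psi = [s, s]  # prepared state: stands in for a "momentum-like" eigenvector

# Toy "position-like" operator with eigenvalues 5 and 7 (think: cm),
# together with its eigenvectors.
X = [[5.0, 0.0],
     [0.0, 7.0]]
positions = {5.0: [1.0, 0.0], 7.0: [0.0, 1.0]}

def dot(a, b):
    """Scalar product of two real vectors (no conjugation needed here)."""
    return sum(x * y for x, y in zip(a, b))

def apply(op, vec):
    """Matrix-vector product: the operator acting on a vector."""
    return [dot(row, vec) for row in op]

for E_x, ket_x in positions.items():
    c_x = dot(ket_x, psi)                # amplitude <x|psi>
    element = dot(ket_x, apply(X, psi))  # <x| X |psi> = E_x * c_x
    print(E_x, c_x, element, c_x ** 2)   # last column: probability |c_x|^2

total = sum(dot(ket, psi) ** 2 for ket in positions.values())
print(total)  # the probabilities add up to 1 (up to rounding)
```

In this toy case both positions come out equally likely, which mirrors the idea that a particle prepared with definite momentum has no definite location.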

Of course it is strange that we must now talk about probabilities, but please keep in mind how we got here.

Mathematics tells us that the position and momentum operators have different eigenvectors. Therefore each time we measure the location we change the momentum and vice versa.

This is exactly what we expect from the discussion at the beginning. Just think about measuring the location of a rolling ball by shooting tennis balls at it. Of course this will change its momentum and if we could only determine the properties of the tennis balls, by using different tennis balls, we would need to talk about probabilities, too.

There are still a lot of things missing to transform our rough framework into an actual physical theory. For example, we must know how our vectors change in time. This is what the Schrödinger equation tells you, and you will spend a lot of time solving it for different situations. In addition, take note that we haven’t talked about any real experiments, but be assured that all experiments up to today are in agreement with the predictions of quantum mechanics.

I feel you are now prepared to start your own journey through the wonders of quantum mechanics and learn about the Schrödinger equation, the double-slit experiment and the Stern-Gerlach experiment!

As noted at the beginning there are hundreds of books about quantum mechanics. To help you get started I’ll tell you which books helped me the most.

Best Introductory treatments

I remember sitting as a junior in a beginner lecture about quantum mechanics. The professor talked for almost two hours about Hilbert spaces and obscure objects called kets $|\Psi >$. I didn’t understand a single thing. “What the heck have these things to do with things in the real world?” I felt stupid.

Frustrated I went to the library afterwards and there I found:

  • The Feynman Lectures Vol. 3. I did nothing but read for the next three days. Everything seemingly obscure started to make a lot of sense. I had tried to read other books about quantum mechanics before that. For example, my professor recommended THE standard book about quantum mechanics: Cohen-Tannoudji: Quantum Mechanics I & II. The first 100 pages are filled with concise mathematical talk: Hilbert space, square-integrable functions, bras, kets… yada yada yada. When Feynman introduces kets, he tells us that they represent the initial state $|\Psi_i >$ of our physical system. A bra represents one possible final state $<\Psi_f |$. If we put a bra and a ket together, $<\Psi_f |\Psi_i >$, we get the probability amplitude that our system is found in the final state. Things are really that simple. The Feynman Lectures Vol. 3 is the best book to start learning quantum mechanics, because Feynman answers all the questions that bother beginner students. (There is no need to read the other two volumes before you start Vol. 3, although they are awesome, too!) Afterwards you can read more technical and more modern treatments. Another great book, which I read afterwards, is
  • Griffiths: Introduction to Quantum Mechanics. It’s not really technical, but explains things very nicely. The third book I read and really liked is
  • Sakurai: Modern Quantum Mechanics. Sakurai does many things a little differently, but it’s great to see many things from a different perspective. Many advanced concepts that are nearly impossible to understand elsewhere are explained brilliantly. A must-read for everyone who wants to dive deeper.

That said, please don‘t let yourself get confused by philosophical, mystical, fluffy or incredibly abstract talk that is circulated everywhere. Simply try to understand what quantum mechanics is all about on an adequate mathematical level. Things aren‘t that strange after all. Have fun!

[div class=”smallfontforcredits”] Image Credit:

Headline Background and Character vectors designed by Freepik


2 comments on “How to learn Quantum Mechanics”

  • I am not sure if this explanation develops a right intuition for QM.

    1. Measuring a position of tennis balls with tennis balls cannot be precise (as the author points out) but QM measurement must provide (in theoretical idealization) a precise result if applied to the eigenstate.
    2. The analogy with bombarding a ball with other balls may explain why the momentum of the first ball may change after the “coordinate measurement”, but this is very different from the QM formalism, which supposes that coordinate and momentum simply cannot both be defined with 100% precision in the same state. It is not just a feature of non-ideal measurement but an inherent property of the quantum state, because otherwise this classical analogy takes us back to the problem of the stability of the hydrogen atom.
    3. An operator associated with measurement should not be applied to the original state to get a new state. It cannot describe a normal evolution (it is not unitary) and as a matter of fact it also does not describe state evolution in the course of measurement.
