Jakob Schwichtenberg

Write down what you learn – or – why I wrote a textbook for students as a student

I wrote a textbook during my master’s studies. People who hear about this usually ask me:

“How did this happen?”

The truth is that my book is simply a collection of things that I wrote down for myself.

Whenever I understand something, I write it down. Too often, something I understood a few weeks ago no longer makes sense to me. That is why I’ve developed the habit of writing things down.

As my handwriting is horrible, I always write at a computer. This way I can always share quickly with others what I’ve learned. I used ShareLatex in the past, but recently switched to my own public notebook.

Most people believe that only experts should write about complicated topics. That you’re only “allowed” to write about something when you are an expert in the field.

This is a problem.

It takes a long time to become an expert, in most fields several years. During this time one easily forgets what it’s like to be a beginner. The problems one had a few months ago now seem obvious and not worth talking about. An expert doesn’t understand the problems a beginner has because he was a beginner far too long ago.

Whenever I look back a few months I think “how stupid was I?”. The things I struggled with for weeks or even months are now obvious. When I don’t write things down, I find it impossible to recall my train of thought from a few months ago. In addition, most of the time I find the problems I struggled with embarrassing because their solutions are now so obvious. Thus, if I wrote things down only after I had “mastered” the topic, I wouldn’t talk about these struggles, either because I had forgotten about them or because they no longer seemed important to me.

However, beginners need help to overcome these hurdles. Almost everyone struggles with the same things in the beginning. Unfortunately, the standard textbooks, written by experts, are of little help here. Experts and beginners find completely different things important. Often the most helpful sources are fellow students who struggled with the same things a few days ago, or a record that someone kept of their own learning process as a beginner. This, of course, doesn’t mean that textbooks by experts aren’t important. However, they serve a different purpose and are often not the best choice for beginners.

I’m not alone in this observation:

British writer C. S. Lewis wrote in 1958 in Reflections on the Psalms:

“It often happens that two schoolboys can solve difficulties in their work for one another better than the master can. […] The fellow-pupil can help more than the master because he knows less. The difficulty we want him to explain is one he has recently met. The expert met it so long ago he has forgotten. He sees the whole subject, by now, in such a different light that he cannot conceive what is really troubling the pupil; he sees a dozen other difficulties which ought to be troubling him but aren’t.”

More recently, product designer Scott Riley tweeted:

I think everyone should write down what they learn. Partly this already happens in the form of master’s and Ph.D. theses, and I often find these more helpful than textbooks. However, the primary goal of a thesis is not to teach, but to present results. Therefore, one rarely reads about the struggles and things that didn’t work out.

The biggest problem here is, of course, that writing up what you have learned in a form presentable to others takes time, and there is almost no reward for doing it. You don’t get citations because you’re able to explain some topic better than the standard textbooks. The citations go to the authors who invented the method, not to the person who can explain it beautifully. There isn’t even a good place where you can publish what you learn or a beautiful explanation.

However, one possibility is to publish things on your own website. Nowadays a personal website is incredibly cheap or even free. An alternative is to write a book. This requires a lot of time and the reward isn’t much bigger.

I’ll continue to publish what I learn here at JakobSchwichtenberg.com. This way, I can always come back to read my own explanations for things I no longer understand and maybe someone else will find it helpful.

PS: An awesome book about the benefits of sharing more than just the final results is Show your Work by Austin Kleon.

EDIT: I recently stumbled upon the following quote in “Group Theory – Birdtracks, Lie’s, and Exceptional Groups” by Predrag Cvitanovic:

Almost anybody whose research requires sustained use of group theory (and it is hard to think of a physical or mathematical problem that is wholly devoid of symmetry) writes a book about it.

and immediately was reminded of this image

Just because there are already a hundred books on a given topic doesn’t mean that the world doesn’t need another one. Each new piece of writing on a given topic offers a new perspective that could be incredibly valuable for some readers.

One Thing You Must Understand About Studying Physics

As a beginning student it’s really easy to feel overwhelmed and stupid, because every page in the textbook you’re reading makes the question mark above your head bigger. It took me quite some time to realize that most textbooks aren’t written for the reader, but for the author. My realization started with a little sentence in a book about classical mechanics.

Many of the scientific treatises of today are formulated in a half-mystical language, as though to impress the reader with the uncomfortable feeling that he is in the permanent presence of a superman.

Lanczos: The Variational Principles of Mechanics

 

I mean, it’s understandable. Physics, like any scientific discipline, is about prestige. Writing a super-complicated treatise may help the author get recognized as super-smart, but surely isn’t very helpful for the reader. For a book every reader understands, the probability that someone criticizes your treatment is much higher than for a book nobody understands. There is a risk with every understandable sentence in a book.

It’s much safer to simply write down equation after equation without talking much about them. The equations can be checked and there is little room for criticism. But as soon as an author starts writing about the meaning of the equations, the why, the how, things get dangerous. Sentences that interpret things and put them into context are often the most valuable, but are in danger of being criticized.

There is no way to make them as bullet-proof as an equation. That’s why quite often the author’s ego wins the fight and explanatory sentences are kept to a minimum. Always keep in mind that books that are hard to understand are written with the author’s ego in mind and not the reader’s needs.

It’s the author’s job to explain the subject to you. If you don’t understand, it’s the author’s fault, not yours. If you are reading a book that you find hard to understand, simply go to the library and get another book. Please don’t let yourself get demotivated by such books! You simply need to find a book that you understand. Physics is never really complicated, but only badly explained.

“Dr. Hoenikker used to say that any scientist who couldn’t explain to an eight-year-old what he was doing was a charlatan.”

Vonnegut: Cat’s Cradle

So why do people read and recommend all these complicated books at all?

For the same reasons outlined above. Unfortunately, most “standard books” and books recommended by professors aren’t particularly good for the reader. These books don’t have the best explanations, but are recommended because they are safe to recommend. There is little room for criticizing them, because the explanations and illuminating remarks are kept to a safe minimum.

In addition, recommending a book that is hard to understand is good for your ego. Indirectly, you’re signalling: “Well, I understand and even enjoy this super-complicated stuff. I’m smart!”

A recommendation for a book that explains everything in great detail signals: “I needed such dummy treatment to understand the subject. I’m not particularly smart.”

Be assured that if you first read some treatment of the subject that was written with the reader’s needs in mind and then read the super-complicated stuff, you will understand, too. Don’t panic, don’t feel stupid. It’s the same for everyone. Most of the time, hard-to-understand books are recommended with no ill intent. Your professor learned the subject some decades ago, so he’s not the best person to get recommendations from.

“It often happens that two schoolboys can solve difficulties in their work for one another better than the master can. The fellow-pupil can help more than the master because he knows less. The difficulty we want him to explain is one he has recently met. The expert met it so long ago he has forgotten. He sees the whole subject, by now, in such a different light that he cannot conceive what is really troubling the pupil; he sees a dozen other difficulties which ought to be troubling him but aren’t.”

-C. S. Lewis, Introductory to Reflections on the Psalms

 

Most books are written by professors, not students. That’s what makes them hard to understand.

I mean, it’s fine that books exist that are regarded as the standard books, but they aren’t the best option to learn the subject from. Later, once you have learned and understood the subject, you’ll come back to these books. They are well suited as references, for example to look up an equation when you’re writing a paper.

The message to take away is: Stop Reading Books you don’t understand immediately! There is absolutely no reason to feel ashamed or stupid. If you don’t understand something, simply search for another explanation that makes sense to you!

Everything around you that you call life was made up by people that were no smarter than you. And you can change it, you can influence it… Once you learn that, you’ll never be the same again.

Steve Jobs

 

This is especially true for physics.

Vectors, Forms, p-Vectors, p-Forms and Tensors

This is a topic that can cause quite a bit of confusion, so here is a short post I can come back to whenever I get confused.

Let’s start with the definition of a vector. A vector is… uhmm … I guess you have a rough idea of what a vector is. Otherwise this is stuff for another post.

(The term “vector” here means the abstract elements of a vector space. Euclidean vectors (=arrow-like objects) are just one special example. Nevertheless, Euclidean vectors are the example that inspired the idea of abstract vectors. All objects that share some basic properties with Euclidean vectors, without needing to be arrow-like objects in Euclidean space, are called vectors.)

One-Forms

A one-form (1-form) is the dual object to a vector: A one-form $ \tilde w()$ eats a vector $V$ and spits out a number

\begin{equation} \tilde w(V) . \end{equation}

The word dual is used, because we can think of a vector $V()$ as an object that eats a one-form $\tilde w$ and spits out a number

\begin{equation}  V(\tilde w) \equiv \tilde w(V) . \end{equation}

We will see in a moment why defining such an object is a useful idea.

Two examples:

From matrix algebra: If we decide to call column vectors “vectors”, then row vectors are one-forms. Given a vector $ V=\begin{pmatrix} 2 \\ 4 \end{pmatrix}$, ordinary matrix multiplication (in the correct order) with a one-form $ \tilde w = \begin{pmatrix} 1 & 3 \end{pmatrix}$, results in a single real number:

\begin{equation} \tilde w(V) \equiv \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2+12 = 14
\end{equation}

Another example, used in quantum mechanics, is the pairing of bras $\langle\Psi\vert$ and kets $\vert\Phi\rangle$. Kets are used to describe the initial state of a system, bras the final state. Together they give a single number:

\begin{equation} \langle\Psi\vert \Phi\rangle
\end{equation}

which is the probability amplitude for finding (measuring) a system that starts in the state $\vert \Phi\rangle$ in the final state $\langle\Psi\vert$. Kets, like $\vert \Phi\rangle$, are vectors, and bras, like $\langle\Psi\vert$, are one-forms.
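The matrix-algebra example above can be sketched in a few lines of Python. This is only an illustration: the helper name `one_form` is my own choice, and vectors are represented as plain lists of components.

```python
def one_form(components):
    """Build a one-form from its components (the entries of a row vector).

    The returned function eats a vector (a list of numbers of the same
    length) and spits out a single number.
    """
    def w(vector):
        return sum(c * x for c, x in zip(components, vector))
    return w


w = one_form([1, 3])   # the row vector (1 3)
V = [2, 4]             # the column vector with entries 2 and 4

print(w(V))  # 1*2 + 3*4 = 14
```

The one-form is literally a function here, which matches the picture of an object that "eats" a vector.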

Tensors

Tensors are the natural generalization of the ideas described above. Tensors are linear operators on vectors and one-forms. A tensor of type $ \begin{pmatrix}N \\ N' \end{pmatrix}$ eats $N$ one-forms and $N'$ vectors and spits out a single number.

A $ \begin{pmatrix}2 \\ 0\end{pmatrix}$ tensor is an object $F( \quad , \quad )$ that eats two one-forms, say $\tilde w$ and $\tilde v$, and spits out a number $F( \tilde w , \tilde v )$. A $ \begin{pmatrix}1 \\ 2\end{pmatrix}$ tensor $F( \quad ; \quad , \quad)$ eats one one-form $\tilde w$ and two vectors, say $V$ and $W$, and spits out a number $F( \tilde w ; V, W)$.

The term “linear operator” means that tensors obey (for arbitrary numbers $a$, $b$, shown here for a $ \begin{pmatrix}2 \\ 2\end{pmatrix}$ tensor)

\begin{equation}  F( a \tilde w + b \tilde v , \tilde z ; V, W) = a F(  \tilde w , \tilde z ; V, W) + b F(  \tilde v , \tilde z ; V, W)
\end{equation}

and similarly for the other arguments. In general, the order of the arguments makes a difference

\begin{equation}  F( \tilde w , \tilde v ) \neq F( \tilde v, \tilde w ),
\end{equation}

just as for a function of real variables, say $f(x,y)=4x+7y$, we have $f(1,3)\neq f(3,1)$.
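Linearity and the role of the argument order can be checked numerically. The following is a minimal sketch, assuming a $ \begin{pmatrix}2 \\ 0\end{pmatrix}$ tensor in two dimensions whose components are stored in a matrix `M`; the numbers in `M` and the helper `combine` are made up purely for illustration.

```python
# Components of a hypothetical (2,0) tensor in some basis (illustrative values).
M = [[1, 2],
     [0, 3]]


def F(w, v):
    """Eat two one-forms (given as component lists) and return a number."""
    return sum(M[i][j] * w[i] * v[j] for i in range(2) for j in range(2))


def combine(a, w, b, v):
    """Return the components of the one-form a*w + b*v."""
    return [a * wi + b * vi for wi, vi in zip(w, v)]


w, v, z = [1, 0], [0, 1], [1, 1]
a, b = 2, 5

# Linearity in the first argument: F(a*w + b*v, z) == a*F(w, z) + b*F(v, z)
lhs = F(combine(a, w, b, v), z)
rhs = a * F(w, z) + b * F(v, z)
print(lhs == rhs)          # True

# The order of the arguments matters for this choice of M.
print(F(w, v), F(v, w))    # 2 0
```

The order only matters because `M` is not a symmetric matrix; for a symmetric `M` both calls would agree.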

A special and very important kind of tensor is the antisymmetric tensor. Antisymmetry means that if we change the order of two arguments, the tensor changes only by a sign:

\begin{equation}  F_\mathrm{asym}( \tilde w , \tilde v ) = - F_\mathrm{asym}( \tilde v, \tilde w ).
\end{equation}

Analogously, a symmetric tensor does not care about the order of its arguments:

\begin{equation}  F_\mathrm{sym}( \tilde w , \tilde v ) =  F_\mathrm{sym}( \tilde v, \tilde w ).
\end{equation}

If the tensor has more than two arguments of the same kind, the tensor is said to be totally antisymmetric (symmetric) if it is antisymmetric (symmetric) under the exchange of any two of these arguments. For example:

\begin{equation}  F_\mathrm{tasym}( \tilde w , \tilde v, \tilde z ) = - F_\mathrm{tasym}( \tilde v, \tilde w ,\tilde z) = F_\mathrm{tasym}( \tilde v,\tilde z, \tilde w ) = - F_\mathrm{tasym}(\tilde z,  \tilde v, \tilde w )
\end{equation}
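To see this sign pattern with actual numbers, here is a small sketch that uses the 3x3 determinant as a stand-in for a totally antisymmetric object with three arguments. The determinant is my choice of example, not something the definition requires, and the component values are arbitrary.

```python
def F_tasym(a, b, c):
    """3x3 determinant with rows a, b, c: a totally antisymmetric function
    of three arguments, used here purely as an illustration."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


w, v, z = [1, 2, 0], [0, 1, 3], [2, 0, 1]

print(F_tasym(w, v, z))   # 13
print(F_tasym(v, w, z))   # -13
print(F_tasym(v, z, w))   # 13
print(F_tasym(z, v, w))   # -13
```

Each exchange of two arguments flips the sign, so two exchanges bring back the original value.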

Antisymmetric tensors play a special role in mathematics and therefore they are given special names: p-forms for antisymmetric $ \begin{pmatrix} 0 \\ p \end{pmatrix}$ tensors, and p-vectors for antisymmetric $ \begin{pmatrix} p \\ 0\end{pmatrix}$ tensors.

P-Forms and P-Vectors

A p-form is simply an object that eats $p$ vectors, spits out a real number, and is antisymmetric in its arguments. Analogously, a p-vector eats $p$ one-forms, spits out a number, and is antisymmetric in its arguments. For example, a 2-form $\tilde w( , )$ eats two vectors $V$, $W$, spits out a real number $\tilde w(V , W)$, and is antisymmetric in its arguments: $\tilde w(V , W) = - \tilde w(W , V)$.

P-forms are important, because they are exactly the objects we need if we want to talk about areas and volumes (and higher dimensional analogues).

If we have a metric (the mathematical object defining length), defining areas and volumes is straightforward. Nevertheless, the notion of area is less restrictive than the notion of a metric, and we can define area without having to define a metric on the manifold in question.

Let’s see how this comes about.

We start by taking a step back and thinking about what properties a mathematical object describing area should have. Suppose we have two (possibly infinitesimal) vectors forming a two-dimensional parallelogram, and we need the mathematical object that tells us the area of this parallelogram:

\begin{equation}  \mathrm{area}(V,W) = \text{ area of the parallelogram formed by V and W}.
\end{equation}

An obvious property is that the number $\mathrm{area}(V,W)$ ought to double if we double the length of one vector:

\begin{equation}  \mathrm{area}(2V,W) = 2 \times \text{ area of the parallelogram formed by V and W}
\end{equation}

 

In addition, the area object should be additive under the addition of vectors

\begin{equation}  \mathrm{area}(V,W+Z) = \mathrm{area}(V,W)  + \mathrm{area}(V,Z)
\end{equation}

A pictorial proof can be seen in the following figure:

[Figure: pictorial proof of the additivity of area]

 

Together these properties are what we called linearity above, and we see that this fits perfectly with the definition of a $ \begin{pmatrix} 0 \\ 2 \end{pmatrix}$ tensor.

Another important property is that $\mathrm{area}(V,W)$ must vanish if $V$ and $W$ are parallel, i.e. if $W=aV$ for some number $a$, we have

\begin{equation} \mathrm{area}(V,W)  =\mathrm{area}(V,aV)
\stackrel{!}{=} 0.
\end{equation}

Because linearity gives $\mathrm{area}(V,aV)=a \times \mathrm{area}(V,V)$, this requirement is equivalent to $\mathrm{area}(V,V) \stackrel{!}{=} 0$. We start from this condition in the next step to see that $\mathrm{area}(V,W)$ must be an antisymmetric $ \begin{pmatrix} 0 \\ 2 \end{pmatrix}$ tensor.

The proof is simple: From

\begin{equation}  \mathrm{area}(V,V) = 0,
\end{equation}

if we write $V= U+W$, it follows

\begin{equation}  \mathrm{area}(U+W,U+W) = 0.
\end{equation}

Using linearity in both arguments yields

\begin{equation} \mathrm{area}(U,U) +  \mathrm{area}(U,W)  + \mathrm{area}(W,U) + \mathrm{area}(W,W) = 0.
\end{equation}

Using now $\mathrm{area}(V,V) = 0$ for $W$ and $U$, i.e., $\mathrm{area}(W,W) = 0$,  $\mathrm{area}(U,U) = 0$, yields

\begin{equation}  \underbrace{\mathrm{area}(U,U)}_{=0} +  \mathrm{area}(U,W)  + \mathrm{area}(W,U) + \underbrace{\mathrm{area}(W,W)}_{=0} =  \mathrm{area}(U,W)  + \mathrm{area}(W,U)  =  0 .
\end{equation}

\begin{equation} \rightarrow   \mathrm{area}(U,W)   =  - \mathrm{area}(W,U).  \end{equation}

We conclude that demanding $\mathrm{area}(V,aV) \stackrel{!}{=} 0$ leads us directly to the property $\mathrm{area}(U,W) = - \mathrm{area}(W,U)$ of the area object. This is what we call antisymmetry.

Therefore, the appropriate mathematical object to describe area is an antisymmetric $  \begin{pmatrix} 0 \\  2 \end{pmatrix}$ tensor, which we call a 2-form.
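All the properties collected above can be checked on a concrete candidate. As a minimal sketch, assume the flat plane with ordinary coordinates and take the signed area $V_1 W_2 - V_2 W_1$ as the area object. This specific formula is my assumption for the example; the argument above only fixes the general properties. The helpers `add` and `scale` and the sample vectors are illustrative too.

```python
def area(V, W):
    """Signed area of the parallelogram spanned by V and W in the plane."""
    return V[0] * W[1] - V[1] * W[0]


def add(V, W):
    """Component-wise sum of two plane vectors."""
    return [V[0] + W[0], V[1] + W[1]]


def scale(a, V):
    """Multiply a plane vector by the number a."""
    return [a * V[0], a * V[1]]


V, W, Z = [3, 1], [1, 2], [-2, 5]

print(area(scale(2, V), W) == 2 * area(V, W))          # doubling one side
print(area(V, add(W, Z)) == area(V, W) + area(V, Z))   # additivity
print(area(V, scale(7, V)) == 0)                       # parallel vectors
print(area(V, W) == -area(W, V))                       # antisymmetry
```

All four checks print True. Note that this object returns a signed number; the ordinary, non-negative area is its absolute value.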