Nature loves math | Tensors for the noob

15 Feb 2016

Tensors for the noob

Posted at 00:40h in Mathematics by Johann 7 Comments


Update (13/10/2017) : New detailed visual description of the covariant and contravariant components (without any new formulas).
Beware : As always on this blog, this is a rather exotic approach, so probably not a good way to learn to use these notions rigorously. It’s only a way to give them some meaning.

If you are here, then it’s because you don’t like formulas and complicated calculations too much, but you’re still curious about what a tensor is.

So we will do a quick tour of the concept, in the easiest way possible. Obviously, there will be approximations and shortcuts, so if some paragraph hurts your feelings, then you’re not on the right page and you should check out the detailed, more rigorous post (link at the bottom).

How to calculate the length of a vector ?

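Vector

In a usual frame with perpendicular axes, the Pythagorean theorem gives the length of a vector from its components $x$ and $y$ : $\sqrt{x^2+y^2}$ . For our vector, that is $\sqrt2$
(or $-\sqrt2$ but negative lengths are pretty much useless).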

Nice. Now what to do when the axes are not perpendicular ?
(we say that the basis is not orthogonal anymore)

Vector in a squeezed basis

Here we have $y\approx 1.15$ and $x \approx 0.42$ . Using the same formula would yield a length of $1.22$ … But as far as we know, the vector hasn’t changed, so something’s wrong here !
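To see the mismatch in numbers, here is a small Python sketch. It only applies the usual Pythagorean formula to the two component values quoted above; the variable names are mine.

```python
import math

# Components read off the squeezed (non-orthogonal) axes, as quoted in the text.
x, y = 0.42, 1.15

# Pythagorean formula, which is only valid when the axes are perpendicular.
naive_length = math.sqrt(x**2 + y**2)

print(round(naive_length, 2))   # 1.22
print(round(math.sqrt(2), 2))   # 1.41, the length the vector really has
```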

To solve this, we will need a new kind of vector components.

Covariant and contravariant components

By “squeezing” the axes of our frame, we lost the orthogonality of the basis in which the vector was defined. Not to worry, let’s build a pair of new axes to compensate !

Let’s take the first axis, and build a new one perpendicular to it. Then let’s do the same with the second one :

Covariant and contravariant components

Note : careful, the axes’ colors are only for visual effect, they do not differentiate the two bases.

In the first frame, we’ll name the components $v^1$ and $v^2$ , they are called “contravariant” components.
In the new frame, $v^{}_1$ and $v^{}_2$ will be called “covariant” components.

Representation of covariant and contravariant components of a vector

We already have the (contravariant) components of the vector in the first frame, they are $v^1\approx 0.42$ and $v^2\approx 1.15$ .

Skipping over the calculations, the (covariant) components of the vector in the new frame are $v^{}_1=1$ and $v^{}_2\approx 1.37$ .
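For the curious, one possible way to redo the skipped calculation is sketched below in Python. Careful, it rests on assumptions that are not stated in this post: the first axis stays horizontal, the second one is tilted at 60° from it, both basis vectors have length 1, and the vector is (1, 1) in the original perpendicular frame. I picked these values only because they reproduce the rounded numbers quoted above. The covariant components are obtained as scalar products with the basis vectors, and the contravariant ones by solving $v = v^1 e_1 + v^2 e_2$ .

```python
import math

# ASSUMPTION (not from the post): unit basis vectors 60 degrees apart.
angle = math.radians(60)
e1 = (1.0, 0.0)
e2 = (math.cos(angle), math.sin(angle))

# ASSUMPTION (not from the post): the vector is (1, 1) in the perpendicular frame.
v = (1.0, 1.0)

def dot(a, b):
    """Ordinary scalar product in the perpendicular frame."""
    return a[0] * b[0] + a[1] * b[1]

# Covariant components: scalar products of v with each basis vector.
v_cov = (dot(v, e1), dot(v, e2))

# Contravariant components: solve v = c1*e1 + c2*e2 with Cramer's rule.
det = e1[0] * e2[1] - e2[0] * e1[1]
v_con = ((v[0] * e2[1] - e2[0] * v[1]) / det,
         (e1[0] * v[1] - v[0] * e1[1]) / det)

print([round(c, 2) for c in v_con])  # [0.42, 1.15]
print([round(c, 2) for c in v_cov])  # [1.0, 1.37]
```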

Ok, now what do we do with that ?

Well, now we have a new formula to calculate the vector’s length :

$$\|\vec v\|^2=v^iv^{}_i=v^1v^{}_1+v^2v^{}_2$$

We find $0.42 \times 1 + 1.15\times 1.37=1.9955\approx 2$ and in the end we get our length $\sqrt2$ .
(of course if we used the exact values it would be exactly $2$ )
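The same sum can be written as a tiny Python function. This is only a sketch using the rounded values quoted in the text; the function name `length_squared` is made up for the illustration.

```python
import math

def length_squared(contravariant, covariant):
    """Sum of v^i * v_i over all indices i."""
    return sum(up * down for up, down in zip(contravariant, covariant))

v_con = (0.42, 1.15)   # rounded contravariant components from the text
v_cov = (1.0, 1.37)    # rounded covariant components from the text

sq = length_squared(v_con, v_cov)
print(round(sq, 4))              # 1.9955
print(round(math.sqrt(sq), 2))   # 1.41
```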

Covariant norm

Note : Picture may differ from actual product ! This animation simplifies the concept. Since the components depend on different bases (not pictured here), the rectangles do not have the right proportions…
Oh and I inadvertently inverted the components, but whatever, you get the idea.

So we have a method to get the length of a vector in any frame. Physicists say that the length is invariant under coordinate transformations. Why is that important ? Because it guarantees that the laws of physics won’t change when we observe them from another place or at another time, and even if we are moving (in a straight line and at a uniform speed) !
(to be fair, here we are in a euclidean space, so the geometry is simple, it’s the one we all learned in school. In special relativity things are more complicated… but the principle still applies)

In fact, the covariant components of the vector are the components of another vector, identical to ours but in another vector space, mirroring that one in a way. We call it the dual space. A tensor is an object based on that duality, combining both of their properties.

My first tensor : the vector

A vector is a tensor of order $1$ . There are two kinds, depending on the nature of their components : $v^i$ with contravariant components is a usual vector, $v^{}_i$ with covariant components is called a dual vector, a covector or a linear form. To calculate the length of a vector in any frame, we need both components.

With all that said, covariant and contravariant components are pretty tough to figure out in higher dimensions… So let’s see a different description, which by the way will be a better approximation of their formal definitions : these kinds of components are characterized by how they behave in a change of basis.

Measures and graduations

First, let’s start by giving an intuitive definition of length. The length of a brick wall for example, is the number of bricks times their length :

A wall made of 20 bricks

I already hear you saying “we are chasing our tail here ! How is the length of a brick defined ?” Well, it is not. On the contrary, it is the brick that will define a unit of length. We can measure the length of our wall by the number of bricks alone. But this is an arbitrary choice of course, we could just as well have counted in a number of feet, in a number of cars, or with any object we want. This choice defines a unit of length. Thankfully, humans have (almost) all agreed on common units of length : the meter, centimeter, etc. Hum. ;-)

Since all bricks have the same length (not by chance), we only need to measure one brick in our chosen unit, before multiplying it by the number of bricks. So far so good, we have all learned this in kindergarten…

Our choice of unit will give us two different interpretations of the notion of components. Note that to count bricks we do not need to restrict ourselves to whole numbers. With a brick that was cut in half for example, we can use rational and even real numbers.

$\star\quad$ Suppose we want to build a 1000 feet long wall. If we use 10 feet long bricks, we will need 100 of them to build a layer. Now suppose we choose to use 20ft bricks instead. Then we’ll only need 50 bricks. The bigger the bricks are, the fewer we need to reach our length objective.

$\star\quad$ Now suppose we decide to make a wall with only 10 bricks, whatever their length may be. If we use 10ft long bricks, our wall is going to be 100ft long. If instead we choose to use 20ft long bricks, our wall will be two times longer, so 200ft. The bigger the bricks are, the longer our wall will be.

We just described contravariant and covariant components in a one-dimensional space !

Effect of a change of unit on contravariant graduations (fixed length) and covariant graduations (fixed number of graduations).
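The two brick stories above can be sketched in a few lines of Python. The function names are hypothetical, they are only there to label the two behaviours.

```python
def bricks_needed(wall_length_ft, brick_length_ft):
    """Contravariant behaviour: fixed length, the count shrinks as bricks grow."""
    return wall_length_ft / brick_length_ft

def wall_length(brick_count, brick_length_ft):
    """Covariant behaviour: fixed count, the length grows with the bricks."""
    return brick_count * brick_length_ft

for brick in (10, 20):
    print(brick, bricks_needed(1000, brick), wall_length(10, brick))
# 10 100.0 100
# 20 50.0 200
```

Doubling the brick size halves the first number and doubles the second one.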

A change of unit (which we will treat as a change of basis) is a passive transformation on a vector, and an active transformation on a covector.

Let’s see these interpretations on a couple of numbers (two-dimensional space) :

$\star\quad$ If we define our couple as a vector, we are fixing a length objective in two chosen units. For example, suppose we want to build a brick house that is 100ft long on each side with 10ft long bricks (let’s forget height again). With the coordinates $(100, 100)$ , we define a vector which is our length objective in feet.
Here, we will need 10 bricks on each side, so its contravariant components are $(10, 10)$ :

A contravariant component is a number of bricks, 10 here (note : I was too lazy to change the units in the picture, so sue me ^^).

A change of basis is equivalent to switching brick sizes : with bricks two times longer (so 20ft) we will need half as many to build the same house. Only 5 for each side. So after a change of basis, our vector will have contravariant components $(5, 5)$ :

For a house of fixed size, the number of bricks needed is inversely proportional to the bricks’ size. The bigger the bricks, the fewer we need. That’s why they are called contravariant.

$\star\quad$ Now let’s say we define our couple of numbers as a quantity objective (a number without dimension). Say for artistic reasons, we want a house made of 10 bricks on each side, whatever their sizes may be. The couple $(10, 10)$ here defines a covector. With 10ft bricks, our dual basis will be bricks of inverse size $\frac1{10}$ ft ${}^{-1}$ . The covariant components will be the lengths of the house, so $(100, 100)$ in feet.
Why ? Because to get the number of bricks (10), we multiply the sides’ lengths in feet (100) by the number of bricks necessary for each foot, which is $\frac1{10}$ … I admit that it must feel weird at first, but it is consistent.

Covariant components are lengths. Here 100 ft.

If we decide to switch to smaller bricks of, say, 5 feet, it is a new change of basis. But since by design the house must still have the same number of bricks for each side (10), it will be 50 feet wide, so it will be two times smaller ! But 4 times less spacious obviously ;-)
Our covariant components would then be $(50, 50)$ .

For a fixed number of bricks, the house’s size varies in the same proportion as the bricks’ size. The bigger the bricks, the bigger the house. That’s why they are called covariant.

To sum up, a vector defines a couple of measures (lengths in feet expressed as $\text{number of bricks }\times\text{ brick size}$ ), while a covector defines a couple of dimensionless numbers (or lengths in quantity of bricks, which is the same, expressed as $\text{house size }\times\text{ number of bricks needed for each unit of length}$ ).

Remarks :
– to simplify things, we used the same unit of length in both dimensions, but we could have used feet in one direction and yards in another for example…
– on the other hand, we can’t mix dimensionless numbers and lengths in the components of a tensor of order one.
– again, there is no need to restrict ourselves to whole numbers when counting bricks. One can have an irrational number of bricks, though I wouldn’t recommend trying that in real life…

This describes well the effect of a change of basis on covariant and contravariant components. But it doesn’t include duality. Let’s go a little further.

Another look at duality

For this, we will need to do some dimensional analysis. We will write [1] a dimension in “number of bricks” (to be fair it should be considered a dimensionless number) and [L] a dimension of length (or any other kind of unit !). No need to go too far in complexity, we only need to remember these two rules :
First, a length multiplied by a number of bricks (dimensionless number) gives a length, so $[1] \times [L] = [L]$
Second, a length divided by another length gives a number of bricks, so $\dfrac{[L]}{[L]}= [1]$

Now let’s get to it : let’s say we use 2 feet long bricks. It is our basis. Say we want to build a 12 feet wall, so we will need 6 of them. 6 is our contravariant component (a number of bricks).

In the dual space, things will be reversed, like in a very (very !) weird mirror. The image of our wall in that mirror will be the same length, but in number of “mirror bricks” instead of feet. It will be 12 “mirror bricks” long. The construction method given earlier to build a second set of axes gives us only one choice for a dual basis, which must be the inverse. Our dual basis will then be $\frac12$ ft ${}^{-1}$ . That’s the “inverse length” of our “mirror bricks”. So the covariant components will be equal to 24 feet. Why 24 ? because $24$ ft $\times\frac12$ ft ${}^{-1}=12$ mirror bricks.

And here comes again our calculation of length using tensor components : the square root of 6 (bricks) times 24 (feet) is precisely 12 (feet) ! Again the dimensional analysis and the calculations hold, as weird as it seems, it just works.
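Here is the same bookkeeping as a short Python sketch, using only the numbers of this example. The variable names (`mirror_bricks`, `dual_basis`…) are mine.

```python
import math

brick_length_ft = 2.0    # our basis: one brick is 2 feet long
wall_length_ft = 12.0    # the wall we want to build

# Contravariant component: a number of bricks.
contravariant = wall_length_ft / brick_length_ft

# Dual basis: the inverse of the brick length, in 1/ft.
dual_basis = 1.0 / brick_length_ft

# In the "mirror", the wall is 12 mirror bricks long.
mirror_bricks = 12.0

# Covariant component: a length in feet such that
# covariant * dual_basis = number of mirror bricks.
covariant = mirror_bricks / dual_basis

print(contravariant, covariant)               # 6.0 24.0
print(math.sqrt(contravariant * covariant))   # 12.0
```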

So duality can be seen as the pairing of a unit of length and a dimensionless unit precisely in the only way that can make this work.

Tensors of order 2

A tensor of order 2 is an operator acting on tensors of order 1.
So in a two dimensional vector space, when a tensor of order 2 acts on a couple of numbers, it will transform that couple but also “decide” if they represent a length or a number of bricks. That is to say how the numbers are going to change with new units.
Beware : the dimensional analysis is done on the components.

For example, let’s take a vector. It represents two lengths, say in feet. Then its (contravariant) components will be two numbers of bricks. If a twice-covariant tensor of order 2 acts on the components, it will result in two lengths in feet, so the (covariant) components of a covector. This tensor is represented by a matrix, whose rows could be interpreted as lengths in feet.

With that in mind, there are four types of tensors of order 2, depending on how they transform (again, in terms of components only) :
– a vector into another vector (it is a linear operator, whose rows are of dimension [1])
– a vector into a covector (it is a bilinear form, whose rows are of dimension [L])
– a covector into a vector (it is a bivector, whose columns are of dimension [L] ${}^{-1}$ )
– a covector into another covector (it is the transpose of a linear operator, whose columns are of dimension [1])

Why do we attribute dimensions to columns sometimes instead of rows ? It is because of the rules of multiplication between matrices and the fact that vectors are written in columns while dual vectors are written in rows…

Since the contravariant components are easy to find, we only need to calculate the covariant ones, and for that there are two methods : the first consists in using the scalar product, which is just a way to project the vector onto the new axis; the second uses another tensor which links vectors and covectors. It is called the metric tensor.

The metric tensor

Written $g^{}_{ij}$ , it is a tensor of order $2$ (it has 2 indices), and it sends the contravariant components onto the covariant components : $v^{}_j=g^{}_{ij}v^i$ ,
and vice-versa with its inverse : $v^j=g^{ij}v^{}_i$ .

Note that $g$ is different for every “type” of frame.
Its components form a matrix. In an orthonormal basis (perpendicular axes and basis vectors of length $1$ ), $g$ is the identity matrix, so it doesn’t change anything : that’s why in a usual frame (the ones we are used to) we only have one type of components, the contravariant components are equal to the covariant components.
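To make this a bit more concrete, here is a Python sketch. It uses the standard fact that the components of the metric tensor are the scalar products of the basis vectors, $g^{}_{ij}=e_i\cdot e_j$ . The squeezed basis (unit vectors 60° apart) and the vector (1, 1) are assumptions of mine, chosen because they reproduce the rounded values of the first example; they are not stated in this post.

```python
import math

def dot(a, b):
    """Scalar product of two vectors given in Cartesian coordinates."""
    return sum(p * q for p, q in zip(a, b))

def metric(basis):
    """Components g_ij = e_i . e_j of the metric tensor for a basis."""
    return [[dot(ei, ej) for ej in basis] for ei in basis]

def lower(g, v_con):
    """Covariant components v_j = g_ij v^i (summed over i)."""
    n = len(v_con)
    return [sum(g[i][j] * v_con[i] for i in range(n)) for j in range(n)]

# Orthonormal basis: the metric is the identity matrix.
print(metric([(1.0, 0.0), (0.0, 1.0)]))   # [[1.0, 0.0], [0.0, 1.0]]

# ASSUMPTION (not from the post): unit basis vectors 60 degrees apart.
a = math.radians(60)
c, s = math.cos(a), math.sin(a)
g = metric([(1.0, 0.0), (c, s)])

# ASSUMPTION: contravariant components of the Cartesian vector (1, 1).
v_con = [1 - c / s, 1 / s]
v_cov = lower(g, v_con)

print([round(x, 2) for x in v_con])   # [0.42, 1.15]
print([round(x, 2) for x in v_cov])   # [1.0, 1.37]
```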

In the theory of special relativity, the frames are “squeezed” depending on the speed of the reference frame, so we need the metric tensor to navigate from one kind of component to the other and make sure the overall “space-time length” does not change from one observer to another.

To get a general idea of what a tensor of order $2$ is, let’s see how they can be built.

Building tensors of order 2

Say we have a vector $\vec x$ with contravariant components $x^i=\begin{pmatrix}-1\\1\end{pmatrix}$ and covariant components $x^{}_i=\begin{pmatrix}1\\2\end{pmatrix}$ and a covector $\vec y$ with covariant components $y^{}_j=\begin{pmatrix}-3\\8\end{pmatrix}$ and contravariant components $y^j=\begin{pmatrix}3\\4\end{pmatrix}$ .

We are going to build matrices by multiplying each component of one by those of the other, but in an unusual way (though very intuitive).

For example in $3$ dimensions : $\begin{pmatrix}1\\2\\3\end{pmatrix}\otimes\begin{pmatrix}a\\b\\c\end{pmatrix}=\begin{pmatrix}1a&1b&1c\\2a&2b&2c\\3a&3b&3c\end{pmatrix}$
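The layout rule of this product (row from the first factor, column from the second) can be sketched in Python. The helper `outer_labels` is hypothetical: it glues labels together instead of multiplying numbers, just to show where each entry lands.

```python
def outer_labels(left, right):
    """Entry (i, j) is the label of left[i] followed by the label of right[j]."""
    return [[f"{l}{r}" for r in right] for l in left]

for row in outer_labels([1, 2, 3], ["a", "b", "c"]):
    print(row)
# ['1a', '1b', '1c']
# ['2a', '2b', '2c']
# ['3a', '3b', '3c']
```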

$\star\quad$ Case 1 : By using one vector and one covector.
$T^j_i=x^{}_i\otimes y^j=\begin{pmatrix}1\\2\end{pmatrix}\otimes\begin{pmatrix}3\\4\end{pmatrix}=\begin{pmatrix}1\times 3 & 1\times 4\\2\times 3 & 2\times 4\end{pmatrix}=\begin{pmatrix}3 & 4\\6 & 8\end{pmatrix}$
Since we used one vector and one covector, this tensor is one time contravariant and one time covariant. It’s the type of every “usual” matrix.

$\star\quad$ Case 2 : By using two vectors.
$T^{ij}=x^i\otimes y^j=\begin{pmatrix}-1\\1\end{pmatrix}\otimes\begin{pmatrix}3\\4\end{pmatrix}=\begin{pmatrix}-1\times 3 & -1 \times 4\\ 1\times 3 & 1\times 4\end{pmatrix}=\begin{pmatrix}-3 & -4\\3 & 4\end{pmatrix}$
Since we used two vectors, this tensor is two times contravariant. It is called a bivector.

$\star\quad$ Case 3 : By using two covectors.
$U^{}_{ij}=x^{}_i\otimes y^{}_j=\begin{pmatrix}1\\2\end{pmatrix}\otimes\begin{pmatrix}-3\\8\end{pmatrix}=\begin{pmatrix}1\times -3 & 1 \times 8\\ 2\times -3 & 2\times 8\end{pmatrix}=\begin{pmatrix}-3 & 8\\-6 & 16\end{pmatrix}$
Since we used two covectors, this tensor is two times covariant. It is called a bilinear form.
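The three cases can be checked with a few lines of Python. This is a minimal sketch; the function name `outer` and the variable names are mine, the numbers are those of the example.

```python
def outer(a, b):
    """Matrix whose entry (i, j) is a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]

x_con, x_cov = [-1, 1], [1, 2]    # components of x
y_con, y_cov = [3, 4], [-3, 8]    # components of y

print(outer(x_cov, y_con))   # Case 1: [[3, 4], [6, 8]]
print(outer(x_con, y_con))   # Case 2: [[-3, -4], [3, 4]]
print(outer(x_cov, y_cov))   # Case 3: [[-3, 8], [-6, 16]]
```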

Remarks :

$\star\quad$ We just did tensor calculation. It’s not that hard !

$\star\quad$ We cannot build every matrix that way, because this method only produces non-invertible matrices…

$\star\quad$ In the third case we get a tensor $U$ different from $T$ precisely for the aforementioned reason. So if we did the product $x^i\otimes y^{}_j$ we would get $U^i_j$ .

$\star\quad$ BEWARE : the matrix notation of its components is not enough to completely define a tensor of order two. Two tensors can have the exact same components, but one can be entirely covariant while the other is contravariant for example, which would make them different objects. This is true whatever their order (i.e. the number of dimensions of the component space) : at order one for example a vector is entirely defined by its components. Not so for a tensor of order one that can be either a vector or a covector…
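About the remark on non-invertible matrices : a quick way to see it is to compute the determinant of the three matrices built above, which is zero each time because every row is a multiple of the same row. A small Python sketch (the helper names `outer` and `det2` are mine) :

```python
def outer(a, b):
    """Matrix whose entry (i, j) is a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

cases = [
    outer([1, 2], [3, 4]),     # Case 1
    outer([-1, 1], [3, 4]),    # Case 2
    outer([1, 2], [-3, 8]),    # Case 3
]
print([det2(m) for m in cases])   # [0, 0, 0]
```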

The metric tensor in action

As we have seen, it changes a vector into a covector : $y^{}_j=g^{}_{kj}y^k$ and a covector into a vector : $x^i=g^{ki}x^{}_k$ .

But it works also with tensors of greater order. So we can verify that $T^{}_{ik}=T^j_ig^{}_{jk}$ or that $U^{ik}=U^i_jg^{jk}$ .
In fact, the metric tensor either brings down or brings up one index of a tensor.

Here the metric tensor is $g^{}_{ij}=\begin{pmatrix}-1 & 0\\0 & 2\end{pmatrix}$ and of course, it is always invertible (try it out !).
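If you want to try it out, here is a Python sketch using this metric tensor and the components of $\vec x$ , $\vec y$ and $T^j_i$ given earlier. The helper names are mine. It lowers the index of $y$ , raises the index of $x$ , and lowers one index of $T^j_i$ ; the resulting matrix has the same entries as the one computed in Case 3.

```python
g = [[-1.0, 0.0], [0.0, 2.0]]       # metric tensor from the text
g_inv = [[-1.0, 0.0], [0.0, 0.5]]   # its inverse (diagonal, so invert each entry)

def contract(m, v):
    """Result w_j = m_kj v_k (summed over k); used to lower or raise an index."""
    n = len(v)
    return [sum(m[k][j] * v[k] for k in range(n)) for j in range(n)]

def lower_second_index(t, metric):
    """T_ik = T_i^j g_jk (summed over j), with t[i][j] holding T_i^j."""
    n = len(t)
    return [[sum(t[i][j] * metric[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

y_con = [3, 4]
x_cov = [1, 2]
T_mixed = [[3, 4], [6, 8]]           # T_i^j from Case 1

print(contract(g, y_con))            # [-3.0, 8.0]  -> covariant components of y
print(contract(g_inv, x_cov))        # [-1.0, 1.0]  -> contravariant components of x
print(lower_second_index(T_mixed, g))
# [[-3.0, 8.0], [-6.0, 16.0]]
```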

But the most important property of the metric tensor is that it is the same for a whole lot of frames (basis systems). Instead of working with vectors that depend heavily on the basis in which they are expressed, we will then be able to ignore the basis on the condition that they share some geometrical properties which make them generate the same metric tensor…

Conclusion : What is a tensor ?

In the end, a tensor is much, much more than just the generalization of the concept of vector and matrix. It is an object that does not depend on the choice of a basis nearly as much as vectors (or matrices) do. Instead of depending on a particular basis, tensors will behave the same for a whole lot of “similar” bases. As long as the bases in which we work generate the same metric tensor, the calculations will be greatly simplified and won’t change each time we change frames.
In particular, in special relativity and quantum electrodynamics, every equation must stay the same if we change inertial reference frames. We then only have to express the laws of physics in terms of tensors, and we’re golden !
The same principle applies to general relativity, but it is much more complicated…

If this post didn’t quench your thirst on the subject, go see the detailed post on tensors !

7 Comments

Krishna
Posted at 03:22h, 09 November

Thank you

AMARNADH
Posted at 07:22h, 06 March

Such a writing is seen nowhere… Thank you…

Francesco Bolleri
Posted at 14:07h, 11 December

Dear Johann, this was a great post. Thank you!
I’d like to ask if you can please explicitate the way you calculate the covariant components of the vector (the point you tell “Skipping over the calculations…”)

In my understanding the base lenght change of 1*cos(angle), but i am not sure..
Should be lovely to see the right way to calculate the components.

Thanks
Francesco

Johann
Posted at 12:08h, 12 December

Dear Francesco. Thank you for your kind comment. You can find the calculations in the detailed post about tensors. Not sure if the example uses the exact same vector, but even if it doesn’t, I think it is detailed enough for you to get how it works. If it isn’t clear enough don’t hesitate to ask for further details !
Cheers,
Johann.

Francesco Bolleri
Posted at 18:02h, 14 December

Dear Johann, thanks!! Sorry if I don’t get this point by myself!

Referring to the detailed post, I still not have clear one step, in the case of non-orthogonal basis.

I arrive to calculate via scalar product the covariant components (1 , 1.37), but now I don’t understand how to get the lenght of the E* basis components (in the figure “Dans E” / “Dans E*”)

I think this lenght is what you express when you tell (immediately above the figure)
“That gives us the dual basis system f*…” But how you calculate these quantities?

Because in my understanding, in “covariant world”, the meaning of value 1.37 is “1.37 times” the lengt of the base v2 in E*.

I hope my question is clear.

Sorry for disturb you again and thank you very much for your time and effort!
Francesco

Francesco Bolleri
Posted at 15:10h, 16 December

Dear Johann, I think I understood where I was wrong, please forget my previous question.

Geometrically, the quantity 1.37 (covariant v2) is “how much we have of “f2-contravariant-red base”, in fact it is visible in the diagram “Dans E *” which is little more than “1 base”.

What made me confused is that the projection diagram “CORRECT vs INCORRECT” seems to suggest measuring covariant v2 as the length of the projection on the path of f2-contravariant-red.
Instead the correct quantity measurement is from the origin of the axes until the crossing between the v2 red dotted projection and the f2-covariant-blue extension line, where there is the perpendicular symbol.
Now I think I understand why contravariant * covariant in the correction of the calculation of the square of length! The length of the f2-contravariant-red base is exactly the amount of the v2 contravariant component of the vector.
It’s pretty hard to describe here, because I can not put the subscripts/quotes.. but i think my understanding is now finally correct.

Now I am curious about facing a mix of non-orthogonal bases with non-normal bases togheter.. I will try to make some exercise.

Thanks for everything.
Francesco

Agraj Yadav
Posted at 14:43h, 02 May

Very Very Very Coooool!
It was a very clear explanation!