Introduction to general relativity

Here I take a top-down approach to introducing the theory of general relativity: I present the field equations from the start. In the form in which they are usually presented, the field equations are extremely compact. This means that the equations we have to deal with are encoded and need to be made explicit.

In traditional courses on general relativity, a lot of time and effort is invested in learning the elements of differential geometry needed to derive and justify Einstein’s field equations. Very often this is discouraging: the student tackles many mathematical definitions that often do not help to understand the physics behind the symbols, and that instead increase the feeling that, besides the physics, there is something else the student doesn’t understand so well, namely, differential geometry.

I think that it is possible to start understanding the physics of the general theory of relativity while acquiring, in parallel, the necessary knowledge of differential geometry. The advantage is that the physical ideas help to identify what one needs to know about differential geometry. In my opinion, this is better than starting by learning a lot of complicated mathematics with the promise that you will need it and will understand everything later.

The field equations in general relativity

In general relativity, the field equation that describes gravity was proposed by Einstein. It is usually written in the form

\begin{equation}R_{\mu\nu}-\frac{1}{2}g_{\mu\nu}R+\Lambda g_{\mu\nu}=\kappa T_{\mu\nu}.\end{equation}

The goal of the present post is to unpack this equation and briefly explain (when possible) what each term means.

Let’s start by saying that \(\Lambda\) and \(\kappa\) are constants. \(\Lambda\) is the cosmological constant and \(\kappa\) is the coupling constant between matter-energy and space-time geometry.

Actually, Eq.(1) is a set of ten coupled partial differential equations. Let’s see why ten. Each index \(\mu\) and \(\nu\) runs from zero to three, so we can write the equation in the form of a matrix equation. The term \(R_{\mu\nu}\), which is called the Ricci tensor, looks like this in matrix form, for example:

\begin{equation}
R_{\mu\nu}:=\begin{pmatrix}R_{00}&R_{01}&R_{02}&R_{03}\\R_{10}&R_{11}&R_{12}&R_{13}\\R_{20}&R_{21}&R_{22}&R_{23}\\
R_{30}&R_{31}&R_{32}&R_{33}\end{pmatrix}.
\end{equation}

We see that \(R_{\mu\nu}\) refers generically to any entry of this matrix. The symbol “:=” means that the tensor is represented by the matrix on the right side. Don’t worry about the word tensor. We will learn its meaning later on. For the time being it is enough to consider a tensor as a matrix when it has two indexes.

There are sixteen entries in the Ricci tensor. The tensors in the gravitational equations are symmetric. This means that the six entries \(R_{01}\), \(R_{02}\), \(R_{03}\), \(R_{12}\), \(R_{13}\), and \(R_{23}\) that appear above the diagonal of the matrix are, in the respective order, equal to the six entries \(R_{10}\), \(R_{20}\), \(R_{30}\), \(R_{21}\), \(R_{31}\), and \(R_{32}\) that appear below the diagonal. Thus, the independent terms are those on the diagonal and those on one side of it, i.e., ten independent terms. The same applies to the terms \(g_{\mu\nu}R\) and \(T_{\mu\nu}\). The value \(\mu=0\) or \(\nu=0\) refers to the time coordinate of the space-time, usually written as \(x^0=ct\), while the values \(\mu=1,2,3\) or \(\nu=1,2,3\) refer to the space coordinates; thus, in Cartesian coordinates, for example, we have \(x^1=x\), \(x^2=y\), and \(x^3=z\).
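The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: it enumerates the index pairs \((\mu,\nu)\) and keeps those with \(\mu\le\nu\), which is one way of choosing a representative for each symmetric pair. The variable names are arbitrary.

```python
# Count the independent entries of a symmetric n x n array (here n = 4).
n = 4

# All index pairs (mu, nu), with each index running from 0 to 3.
indices = [(mu, nu) for mu in range(n) for nu in range(n)]

# For a symmetric array, the entries (mu, nu) and (nu, mu) are equal,
# so it is enough to keep the diagonal and one side of it.
independent = [(mu, nu) for (mu, nu) in indices if mu <= nu]

print(len(indices))      # total number of entries
print(len(independent))  # number of independent entries, n * (n + 1) // 2
```

For \(n=4\) this gives sixteen entries in total, of which ten are independent, in agreement with the text.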

The explicit form of \(R_{\mu\nu}\) will be given later. The geometric meaning of the Ricci tensor is not accessible without some knowledge of differential geometry that we will acquire in other lectures. We can loosely say that this tensor measures how far the space-time is from being Euclidean.

The metric tensor

The unknowns in general relativity are the entries of the matrix whose elements are \(g_{\mu\nu}\). The independent ones are

\begin{equation}
g_{\mu\nu}:=\begin{pmatrix}g_{00}&g_{01}&g_{02}&g_{03}\\\cdot&g_{11}&g_{12}&g_{13}\\\cdot&\cdot&g_{22}&g_{23}\\
\cdot&\cdot&\cdot&g_{33}\end{pmatrix}.
\end{equation}

The matrix whose elements are \(g_{\mu\nu}\) is called “the metric”, or “the metric tensor”. The reason is that its entries are needed to calculate the length of arcs and distances in space-time. The square of the length, \(ds^2\), of an arc connecting two infinitesimally close points in space-time is given (in Cartesian coordinates) by the expression

\[\begin{split}ds^2&=g_{00}(cdt)^2+g_{11}dx^2+g_{22}dy^2+g_{33}dz^2\\
&+2g_{01}cdtdx+2g_{02}cdtdy+2g_{03}cdtdz\\
&+2g_{12}dxdy+2g_{13}dxdz+2g_{23}dydz\\
\end{split}\]

The factor 2 comes from the symmetry of the metric tensor, for example: \(g_{02}cdtdy+g_{20}dycdt=2g_{02}cdtdy\).

The elements \(g_{\mu\nu}\) are, in general, functions of the space-time coordinates. Their form depends on the coordinates used to express them. When there is a coordinate system in which all of them are constant, the space-time is said to be flat. Later we will see why. For the time being, we recall here the expression for \(ds^2\) in special relativity, namely, \(ds^2=(cdt)^2-dx^2-dy^2-dz^2\). So, the metric tensor in special relativity, which is commonly called \(\eta_{\mu\nu}\) instead of \(g_{\mu\nu}\), is

\[\eta_{\mu\nu}:=\begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix},\]

and therefore, in special relativity the space-time is flat.
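As a small illustration of how the metric turns coordinate differences into \(ds^2\), the following sketch evaluates the double sum \(\sum_{\mu,\nu}g_{\mu\nu}dx^{\mu}dx^{\nu}\) for the flat metric \(\eta_{\mu\nu}\). The function name `interval_squared` and the sample displacements are my own choices for the example, not part of the theory.

```python
# Minkowski metric eta_{mu nu} with signature (+, -, -, -), as in the text.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def interval_squared(metric, dx):
    """Return ds^2 = sum over mu, nu of g_{mu nu} dx^mu dx^nu."""
    return sum(metric[mu][nu] * dx[mu] * dx[nu]
               for mu in range(4) for nu in range(4))


c = 299_792_458.0  # speed of light in m/s
dt = 1.0e-9        # a sample time step of one nanosecond

# Light ray travelling along x: dx = c dt, so ds^2 vanishes.
light_ray = [c * dt, c * dt, 0.0, 0.0]
# Body at rest: only the time coordinate changes, so ds^2 = (c dt)^2.
at_rest = [c * dt, 0.0, 0.0, 0.0]

print(interval_squared(ETA, light_ray))
print(interval_squared(ETA, at_rest))
```

The same function works for any symmetric \(4\times4\) metric evaluated at a point; the off-diagonal terms are then counted twice by the double sum, which is where the factor 2 in the expanded expression for \(ds^2\) comes from.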

The metric tensor is non-degenerate, i.e., its determinant is different from zero at every point in space-time. This means that the inverse of the metric tensor exists. The entries of the inverse matrix are written with upper indexes, i.e., in the form \(g^{\mu\nu}\):

\begin{equation}g^{\mu\nu}:=\begin{pmatrix}g^{00}&g^{01}&g^{02}&g^{03}\\\cdot&g^{11}&g^{12}&g^{13}\\\cdot&\cdot&g^{22}&g^{23}\\
\cdot&\cdot&\cdot&g^{33}\end{pmatrix}.
\end{equation}

Upper indexes refer to the inverse matrix in the case of the metric tensor only. For any other tensor, the connection between the expression with lower indexes and the one with upper indexes follows a rule that will be explained in other articles. Let me emphasize that the Ricci tensor with upper indexes, \(R^{\mu\nu}\), for example, is not represented by the inverse of the matrix that represents the same tensor with lower indexes, \(R_{\mu\nu}\).

The connection between the metric and gravity

The metric tensor determines the geometry of the space-time. Thus for example, if we know the metric we can say whether the geometry of the space-time is either Euclidean or not. To better understand what non-Euclidean means, let’s consider the surface of the sphere. On the sphere, the role of straight lines is played by circles that result from intersecting the sphere with planes that pass through the center. With these lines, we can construct a triangle on the sphere such that the sum of its inner angles is greater than 180 degrees. That is not possible in a plane, whose geometry is Euclidean, so the geometry of the surface of the sphere is non-Euclidean.

On the surface of a sphere, the sum of the inner angles of the triangle shown in red is \(180^\circ+\theta\), where \(\theta\) can take values in the interval \(0^\circ<\theta<360^\circ\). This is impossible on a plane surface. The geometry on the surface of the sphere is non-Euclidean.
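The angle sum can be verified numerically for a concrete family of triangles. The sketch below assumes a particular triangle that I chose for convenience: one vertex at the north pole and two vertices on the equator separated by a longitude difference \(\theta\). It measures each inner angle as the angle between the tangent directions of the two great-circle arcs leaving the vertex. To keep the code short it only handles \(0^\circ<\theta<180^\circ\); the helper names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tangent(p, q):
    """Unit tangent at p of the great-circle arc from p towards q (unit vectors)."""
    t = [q_i - dot(p, q) * p_i for p_i, q_i in zip(p, q)]
    norm = math.sqrt(dot(t, t))
    return [t_i / norm for t_i in t]


def vertex_angle(p, q, r):
    """Inner angle (radians) at vertex p of the spherical triangle p, q, r."""
    cos_angle = dot(tangent(p, q), tangent(p, r))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def angle_sum_degrees(theta):
    """Angle sum for the north pole plus two equator points theta radians apart."""
    n = [0.0, 0.0, 1.0]
    a = [1.0, 0.0, 0.0]
    b = [math.cos(theta), math.sin(theta), 0.0]
    total = vertex_angle(n, a, b) + vertex_angle(a, b, n) + vertex_angle(b, n, a)
    return math.degrees(total)


for theta_deg in (30, 90, 150):
    print(theta_deg, angle_sum_degrees(math.radians(theta_deg)))
```

For this family the two angles on the equator are right angles and the angle at the pole is \(\theta\), so the printed sums are \(180^\circ+\theta\) up to rounding.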

The generalization of straight lines to non-Euclidean geometry is given by the geodesics, which are completely determined by the metric tensor. But what are geodesics? They are curves with zero acceleration. To better understand what that means, think about a point particle moving through space. It describes a trajectory. At each point of the trajectory, there is a velocity vector that is tangent to it. If the body has no acceleration, then the velocity vector doesn’t change along the trajectory. In other words, the derivative of the velocity vector along the trajectory is zero. Now imagine for a while that the space is two-dimensional and is the surface of a sphere. A point particle moving on this surface will describe a certain curve on it and will have a velocity vector at each point of its trajectory. If the derivative of the velocity vector along the trajectory on the sphere is zero, the curve is a geodesic. This geodesic is the analog of a straight line in the case where the surface is a plane instead of a sphere. Thus, the geodesics on the sphere are curved but have zero acceleration. The curvature of the geodesic is due to the fact that the underlying space is itself curved.

The geodesics on the surface of a sphere are arcs of great circles. The great circles are the lines resulting from the intersection of the spherical surface with planes passing through the center. These lines are the only solutions to the geodesic equations given in Eq.(5).
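One way to see that circles cut by planes through the center are special is to compare arc lengths. The sketch below relies on a fact that is not derived in this post, namely that geodesics are also the locally shortest paths. It takes two points at the same northern latitude and opposite longitudes, and compares the path along the circle of latitude with the path over the north pole. The function names and the sample latitudes are illustrative choices.

```python
import math


def parallel_arc(radius, latitude):
    """Half a circle of latitude: joins two points at opposite longitudes."""
    return math.pi * radius * math.cos(latitude)


def great_circle_arc(radius, latitude):
    """Arc over the north pole joining the same two points (latitude in radians)."""
    return radius * (math.pi - 2.0 * latitude)


R = 1.0  # sphere radius, arbitrary units
for lat_deg in (0, 30, 45, 60, 85):
    lat = math.radians(lat_deg)
    print(lat_deg, parallel_arc(R, lat), great_circle_arc(R, lat))
```

At latitude 45 degrees the path along the parallel is longer by a factor of \(\sqrt{2}\). The two lengths coincide only at latitude zero, where the parallel is the equator and therefore itself a circle through the center.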

Why are geodesics important? The key is the principle of inertia: a material particle on which no forces act moves along a geodesic. If the space is flat, the geodesics are straight lines; otherwise, the geodesics are curved. The differential equations determining the geodesics are

\begin{equation}
\frac{d^2x^{\sigma}}{ds^2}=-\sum_{\mu=0}^{3}\sum_{\nu=0}^{3}\Gamma^{\sigma}_{\mu\nu}\frac{dx^{\mu}}{ds}\frac{dx^{\nu}}{ds}.
\end{equation}

There are four differential equations, one for each value of \(\sigma=0,1,2,3\). The terms on the right side contain the factors \(\Gamma^{\sigma}_{\mu\nu}\) which are called Christoffel symbols and are determined by the metric tensor as follows:

\begin{equation}\Gamma^{\sigma}_{\mu\nu}=\frac{1}{2}\sum_{\lambda=0}^3g^{\sigma\lambda}\left\{\frac{\partial g_{\lambda\nu}}{\partial x^{\mu}}+\frac{\partial g_{\mu\lambda}}{\partial x^{\nu}}-\frac{\partial g_{\mu\nu}}{\partial x^{\lambda}}\right\}.
\end{equation}
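To make the formula for the Christoffel symbols concrete, here is a minimal numerical sketch that evaluates it for a two-dimensional example, the unit sphere with coordinates \((\theta,\phi)\) and metric \(ds^2=d\theta^2+\sin^2\theta\,d\phi^2\). The derivatives of the metric are approximated by central differences, and the inverse is computed only for a diagonal metric; both are simplifications chosen for this example. The results are compared with the known values \(\Gamma^{\theta}_{\phi\phi}=-\sin\theta\cos\theta\) and \(\Gamma^{\phi}_{\theta\phi}=\cot\theta\).

```python
import math


def sphere_metric(x):
    """Metric of the unit sphere at x = (theta, phi): diag(1, sin^2 theta)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_diagonal(g):
    """Inverse of a diagonal metric (sufficient for this example)."""
    n = len(g)
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def d_metric(metric, x, k, h=1.0e-5):
    """Central-difference estimate of d g_{ij} / d x^k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(n)] for i in range(n)]


def christoffel(metric, x):
    """Gamma[s][m][nu] = 1/2 sum_l g^{sl} (d_m g_{l nu} + d_nu g_{m l} - d_l g_{m nu})."""
    n = len(x)
    g_inv = inverse_diagonal(metric(x))
    dg = [d_metric(metric, x, k) for k in range(n)]  # dg[k][i][j] = d g_{ij} / d x^k
    return [[[0.5 * sum(g_inv[s][l] * (dg[m][l][nu] + dg[nu][m][l] - dg[l][m][nu])
                        for l in range(n))
              for nu in range(n)] for m in range(n)] for s in range(n)]


theta = 1.0
gamma = christoffel(sphere_metric, [theta, 0.3])
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))   # Gamma^phi_{theta phi}
```

Each printed pair should agree to many decimal places; the remaining difference comes from the finite-difference step `h`.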

If there is a system of coordinates in which the metric tensor is constant, then the Christoffel symbols are zero (in that system) and we get the differential equations \(\frac{d^2x^{\sigma}}{ds^2}=0\), whose solutions are straight lines. In other posts, we will derive the equation of the geodesics.
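When the Christoffel symbols do not vanish, the geodesic equations generally have to be integrated numerically. The following sketch does this for the unit sphere with coordinates \((\theta,\phi)\), using the known non-zero symbols \(\Gamma^{\theta}_{\phi\phi}=-\sin\theta\cos\theta\) and \(\Gamma^{\phi}_{\theta\phi}=\Gamma^{\phi}_{\phi\theta}=\cot\theta\), and a standard fourth-order Runge-Kutta step. It then checks that the curve stays in the plane through the center spanned by the initial position and velocity, i.e., on a great circle. The initial data, step size, and helper names are arbitrary choices for the illustration.

```python
import math


def rhs(y):
    """Geodesic equations on the unit sphere; y = (theta, phi, dtheta/ds, dphi/ds)."""
    theta, phi, dtheta, dphi = y
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return [dtheta, dphi, ddtheta, ddphi]


def rk4_step(y, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(y)
    k2 = rhs([yi + 0.5 * h * k for yi, k in zip(y, k1)])
    k3 = rhs([yi + 0.5 * h * k for yi, k in zip(y, k2)])
    k4 = rhs([yi + h * k for yi, k in zip(y, k3)])
    return [yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def embed(theta, phi):
    """Cartesian position of the point (theta, phi) on the unit sphere."""
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]


def embedded_velocity(y):
    """Cartesian velocity corresponding to (dtheta/ds, dphi/ds)."""
    theta, phi, dtheta, dphi = y
    return [dtheta * math.cos(theta) * math.cos(phi) - dphi * math.sin(theta) * math.sin(phi),
            dtheta * math.cos(theta) * math.sin(phi) + dphi * math.sin(theta) * math.cos(phi),
            -dtheta * math.sin(theta)]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def max_plane_deviation(y0, steps=2000, h=1.0e-3):
    """Largest distance of the integrated curve from the initial great-circle plane."""
    normal = cross(embed(y0[0], y0[1]), embedded_velocity(y0))
    norm = math.sqrt(sum(c * c for c in normal))
    y, worst = list(y0), 0.0
    for _ in range(steps):
        y = rk4_step(y, h)
        r = embed(y[0], y[1])
        worst = max(worst, abs(sum(n * x for n, x in zip(normal, r))) / norm)
    return worst


deviation = max_plane_deviation([1.0, 0.0, 0.3, 0.8])
print(deviation)
```

The printed deviation should be tiny compared with the radius of the sphere, which is consistent with the solution being an arc of a great circle. The coordinates \((\theta,\phi)\) are singular at the poles, so the initial data were chosen to keep the curve away from them.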

In Newtonian mechanics, it was assumed that the geometry of space-time is flat, and consequently, the motion of free bodies is along straight lines. In general relativity, instead of assuming the geometry of the space-time, it is calculated.

The role of the mass in the Newtonian theory of gravitation is to exert a force on other masses curving their trajectories in space. In general relativity, the role of the mass is to modify the geometry of the surrounding space-time curving the straight lines to geodesics in the new geometry.

In general relativity, the distribution of matter determines the metric tensor and the metric tensor determines the geodesics in space-time. Thus, a material body affects the trajectory of another body, not by exerting a force on it, but by changing the geometry of the space-time in the region where the other body is freely moving. We will say more about this point in another post.

The scalar of curvature \(R\)

We have seen that the Ricci tensor \(R_{\mu\nu}\) and the metric \(g_{\mu\nu}\) can be represented by matrices and that the matrix representing the metric has an inverse. The scalar function \(R\) is defined as the trace (the sum of the diagonal elements of a square matrix) of the product of the inverse of the metric and the matrix representing \(R_{\mu\nu}\):

\[R:=\text{Tr}\left(\begin{pmatrix}g^{00}&g^{01}&g^{02}&g^{03}\\
g^{10}&g^{11}&g^{12}&g^{13}\\g^{20}&g^{21}&g^{22}&g^{23}\\
g^{30}&g^{31}&g^{32}&g^{33}\end{pmatrix}
\begin{pmatrix}R_{00}&R_{01}&R_{02}&R_{03}\\
R_{10}&R_{11}&R_{12}&R_{13}\\R_{20}&R_{21}&R_{22}&R_{23}\\
R_{30}&R_{31}&R_{32}&R_{33}\end{pmatrix}
\right)\]

Like the Ricci tensor, the scalar function \(R\) measures how far the space-time is from being Euclidean. The function \(R\) is the sum of sixteen terms:

\[R=\sum_{\mu=0}^3\sum_{\nu=0}^3g^{\mu\nu}R_{\mu\nu}.\]
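The double sum is easy to evaluate once the two matrices are known. As a hedged illustration, the sketch below uses a two-dimensional example instead of the four-dimensional space-time: a sphere of radius \(a\) with coordinates \((\theta,\phi)\), for which I quote, without deriving them here, the standard results \(g_{\mu\nu}=\mathrm{diag}(a^2,a^2\sin^2\theta)\) and \(R_{\mu\nu}=\mathrm{diag}(1,\sin^2\theta)\). In two dimensions the sum has four terms instead of sixteen.

```python
import math


def scalar_curvature(g_inv, ricci):
    """R = sum over mu, nu of g^{mu nu} R_{mu nu}."""
    n = len(ricci)
    return sum(g_inv[mu][nu] * ricci[mu][nu] for mu in range(n) for nu in range(n))


a = 2.0      # radius of the sphere (sample value)
theta = 0.7  # polar angle at which the tensors are evaluated (sample value)

# Inverse of the diagonal metric diag(a^2, a^2 sin^2 theta).
g_inv = [[1.0 / a ** 2, 0.0],
         [0.0, 1.0 / (a ** 2 * math.sin(theta) ** 2)]]

# Ricci tensor of the sphere in these coordinates (quoted, not derived here).
ricci = [[1.0, 0.0],
         [0.0, math.sin(theta) ** 2]]

print(scalar_curvature(g_inv, ricci), 2.0 / a ** 2)
```

Both printed numbers should coincide: the scalar of curvature of a sphere is \(2/a^2\) at every point, so it decreases as the sphere gets larger and flatter.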

In other posts, we will explicitly calculate the Ricci tensor and the scalar of curvature for some known surfaces in three-dimensional space in order to gain some intuition about their meaning.

The explicit form of the Ricci tensor

The explicit expression of the Ricci tensor is cumbersome. Here it is:

\begin{equation}\begin{split}R_{\mu\nu}&=\sum_{\sigma=0}^3\left(\frac{\partial\Gamma^{\sigma}_{\mu\nu}}{\partial x^{\sigma}}-\frac{\partial\Gamma^{\sigma}_{\sigma\nu}}{\partial x^{\mu}}\right)\\
&+\sum_{\lambda=0}^3\sum_{\sigma=0}^3\left(\Gamma_{\mu\nu}^{\lambda}\Gamma_{\lambda\sigma}^{\sigma}-\Gamma^{\sigma}_{\mu\lambda}\Gamma^{\lambda}_{\sigma\nu}\right).
\end{split}\end{equation}

The Christoffel symbols \(\Gamma^{\sigma}_{\mu\nu}\) were defined in Eq.(6).
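To see the expression for the Ricci tensor at work, the sketch below evaluates it term by term for a two-dimensional example, the unit sphere with coordinates \((\theta,\phi)\). The Christoffel symbols of the sphere are entered by hand (the known values \(\Gamma^{\theta}_{\phi\phi}=-\sin\theta\cos\theta\) and \(\Gamma^{\phi}_{\theta\phi}=\Gamma^{\phi}_{\phi\theta}=\cot\theta\)), and their derivatives are approximated by central differences. The sums run over two indices instead of four; everything else follows the formula. The function names are illustrative.

```python
import math

DIM = 2  # two coordinates: x^0 = theta, x^1 = phi


def gamma(x):
    """Christoffel symbols of the unit sphere, indexed as G[sigma][mu][nu]."""
    theta = x[0]
    G = [[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G


def d_gamma(x, k, h=1.0e-5):
    """Central-difference estimate of d Gamma^s_{mn} / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    Gp, Gm = gamma(xp), gamma(xm)
    return [[[(Gp[s][m][n] - Gm[s][m][n]) / (2.0 * h) for n in range(DIM)]
             for m in range(DIM)] for s in range(DIM)]


def ricci(x):
    """Ricci tensor R[mu][nu] assembled from the four groups of terms."""
    G = gamma(x)
    dG = [d_gamma(x, k) for k in range(DIM)]  # dG[k][s][m][n]
    R = [[0.0] * DIM for _ in range(DIM)]
    for m in range(DIM):
        for n in range(DIM):
            total = 0.0
            for s in range(DIM):
                # Derivative terms: d_s Gamma^s_{mn} - d_m Gamma^s_{sn}
                total += dG[s][s][m][n] - dG[m][s][s][n]
                for l in range(DIM):
                    # Quadratic terms: Gamma^l_{mn} Gamma^s_{ls} - Gamma^s_{ml} Gamma^l_{sn}
                    total += G[l][m][n] * G[s][l][s] - G[s][m][l] * G[l][s][n]
            R[m][n] = total
    return R


theta = 1.0
R = ricci([theta, 0.3])
print(R[0][0], 1.0)                    # R_{theta theta}
print(R[1][1], math.sin(theta) ** 2)   # R_{phi phi}
print(R[0][1], R[1][0])                # off-diagonal entries
```

Up to the finite-difference error, the output reproduces \(R_{\theta\theta}=1\), \(R_{\phi\phi}=\sin^2\theta\), and vanishing off-diagonal entries.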

The Ricci tensor contains the second derivatives of the metric tensor and therefore the gravitational equations given in Eq.(1) are second-order partial differential equations for the metric tensor. This is a generalization of the Poisson equation for the gravitational potential in the Newtonian theory of gravitation. Indeed, the metric tensor plays the role of the gravitational potential in general relativity.

The source of the gravitational field

In the Newtonian theory of gravitation, the source of the gravitational potential is the mass density \(\rho\). The gravitational potential \(\phi\) satisfies the Poisson equation \(\nabla^2\phi=4\pi G\rho\), where \(G\) is Newton’s gravitational constant. In general relativity, the source of the gravitational field is the energy-momentum tensor \(T_{\mu\nu}\), and the role of the Poisson equation is played by Einstein’s equations given in Eq.(1).

In the Newtonian theory, one is often interested in the gravitational effect produced by a distribution of matter in its surroundings, i.e., outside the matter, where \(\rho=0\) and the potential satisfies the Laplace equation \(\nabla^2\phi=0\). Similarly, in general relativity, we are often interested in determining the gravitational effect produced by a certain distribution of matter and energy (remember that mass and energy are related by \(m=E/c^2\)) in its surroundings. Thus, in free space, \(T_{\mu\nu}=0\), and the metric tensor has to satisfy the analog of the Laplace equation:

\begin{equation}
R_{\mu\nu}-\frac{1}{2}g_{\mu\nu}R=0.
\end{equation}
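This equation can be simplified by one further step, sketched here with the sums written out as elsewhere in this post. Multiplying it by \(g^{\mu\nu}\) and summing over both indices gives

\[\sum_{\mu=0}^3\sum_{\nu=0}^3g^{\mu\nu}\left(R_{\mu\nu}-\frac{1}{2}g_{\mu\nu}R\right)=R-\frac{1}{2}\cdot4\,R=-R=0,\]

where the first term is the definition of \(R\) and the factor 4 is \(\sum_{\mu,\nu}g^{\mu\nu}g_{\mu\nu}\), the trace of the \(4\times4\) identity matrix. Hence \(R=0\) in free space, and substituting this back shows that the equation is equivalent to \(R_{\mu\nu}=0\).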

The cosmological term \(\Lambda g_{\mu\nu}\) is not relevant in this kind of problem. In cosmology, the cosmological term has to be included as well as the tensor \(T_{\mu\nu}\) which has to be given beforehand. Indeed, the expression “outside the universe” has no meaning and we are always inside the distribution of matter.

Some final words

The goal in general relativity is to calculate the components \(g_{\mu\nu}\) of the metric tensor. This is achieved by solving Eq.(1). Once the metric has been calculated one can investigate the motion of bodies and light rays through space.

Solving the gravitational equations is usually extremely difficult. Thus, one first learns how to solve them in very simple cases to gain experience and some intuition.

For a better understanding of the geometric ideas contained in the general theory of relativity, it is sometimes convenient to study the physics of artificial spaces whose metric does not necessarily satisfy Einstein’s equations but is given from the start.

2 comments

Barbaro Q. Leyva
March 4, 2022 at 4:06 pm
1-Why are the curves given by the intersection of the sphere with planes (not passing through the center) not considered geodesics? They are also the equivalent of straight lines
2-Does the energy-momentum tensor (Tmu,nu) depend on the gravitational potential? (If yes: what if
potential=-GM/r^2 instead of the usual -GM/r?)
3-Please let me know when I can read a derivation of the Newtonian Poisson equation from general relativity
(I have already read some derivations but I want to have a clearer picture of the dependence on specific potentials)
Thank you.
1. J.P. Mizrahi
  March 12, 2022 at 8:26 am
  Thanks for your questions. The answer to the first one I think is very nice because it doesn’t require differential geometry but simple Euclidean geometry. Take two points A and B on the plane and draw a circle passing through them. Now do the same but this time with a circle of larger radius. If you continue this process, the arc of the circle connecting A and B will tend to a straight line as the radius increases. So it is clear that the larger the circle, the shorter the arc length between A and B. On the sphere, you cannot continue the process indefinitely because the maximum radius available is that of the sphere. So, arcs of great circles are the straightest lines you can draw on the sphere and therefore these are the geodesics.
  There is another easy proof that not every circle on the sphere is a geodesic. This proof doesn’t require differential geometry either. On the sphere, every circle results from the intersection with some plane and is equivalent to some parallel of latitude. Consider a sphere of radius R, take the circle of latitude 45 degrees, and consider two diametrically opposite points A and B on that circle. The radius of that circle is R*cos(45°)=R*sqrt(2)/2, so the arc length between A and B along it is pi*R*sqrt(2)/2. Now consider the great circle passing through A, B, and the north pole. The arc length on this great circle is pi*R/2, which is shorter by a factor of sqrt(2). You can easily generalize this particular example to any circle of latitude and see that the great circle gives you the shortest arc length.
  In response to your second question, the answer is no. In general relativity, the gravitational potential is not a tensor. Indeed, you can make it vanish locally by selecting a free fall coordinate system. So you cannot include in the energy-momentum tensor a non-tensor quantity. However, there are alternative theories of gravitation where you not only can, but you have to include the gravitational potential on the side of the sources. This happens in Lorentz invariant theories and in covariant theories of gravitation.
  If in those alternative theories you include gravitational energy in the form -GM/r, then you are implicitly considering the weak-field approximation. Terms proportional to 1/r^2 would correspond to dipole approximations. But dipoles are usually used either when the monopole terms are zero and the dipole becomes the leading-order term, or are included together with the monopole terms just because their contribution is non-negligible and can have observable effects. In the detection of gravitational waves, for example, quadrupole and octupole terms are taken into account.
  In response to your third point, I will soon write an article presenting one possible derivation. For the time being, I can say that it is actually not so difficult. Indeed, it is just the linear nonrelativistic approximation of the Einstein equations.