HPS 0410

Einstein for Everyone


Technical Appendix

Philosophical Significance of the General Theory of Relativity
or
What does it all mean, again?

Ontology of Space and Time

John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh


Temporal metric

The coordinate T coincides with the time read by clocks. We introduce a new time coordinate t(T) as a function of T. Then small increments of T, ΔT, transform according to

$$\Delta t=\frac{dt}{dT}\Delta T$$

or equivalently

$$\Delta T=\frac{dT}{dt}\Delta t$$

If t = log₁₀T so that T = 10^t, then

$$\frac{dT}{dt} = \frac{d \thinspace 10^t}{dt} = \frac{d e^{t \thinspace \text{log}_e10}} {dt} = (\text{log}_e10) e^{t \thinspace \text{log}_e10} = (\text{log}_e10) 10^t \approx 2.3026 \times 10^t = 2.3026 \thinspace T$$
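As a quick numerical check of this derivative, the following minimal sketch compares a finite-difference estimate of dT/dt with the closed form (log_e 10) 10^t. The function names and the step size h are illustrative choices, not part of the original text.

```python
import math

def T_of_t(t):
    """Clock time T as a function of the logarithmic coordinate t = log10(T)."""
    return 10.0 ** t

def dT_dt_numeric(t, h=1e-6):
    """Central finite-difference estimate of dT/dt (h is an assumed step size)."""
    return (T_of_t(t + h) - T_of_t(t - h)) / (2.0 * h)

def dT_dt_formula(t):
    """Closed form from the derivation: dT/dt = ln(10) * 10**t."""
    return math.log(10.0) * T_of_t(t)

# Print the two estimates side by side for a few sample values of t.
for t in (0.0, 1.0, 2.0):
    print(t, dT_dt_numeric(t), dT_dt_formula(t))
```

The two columns should agree to many decimal places, and the ratio of dT/dt to T is the same constant, log_e 10, at every t.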

Minkowski Spacetime Line Element

Represent the spacetime interval by s. In a coordinate system (T, X, Y, Z) adapted to an inertial frame of reference in a Minkowski spacetime, the small difference of interval between two neighboring events ds is given by the expression for the line element:

ds² = - dT² + dX² + dY² + dZ²

where units are chosen so that c=1.

For events that are spacelike separated, ds² corresponds to the square of the ordinary Euclidean distance. For example, consider two events that differ only in their X coordinate and do so by a small amount dX, for example (0, X, 0, 0) and (0, X+dX, 0, 0). The interval-squared separating them is ds² = dX², which corresponds to a measured distance of dX.

For events that are timelike separated, ds² is negative and its magnitude corresponds to the square of the time elapsed on a clock that moves on a geodesic between them. For example, consider two events that differ only in their T coordinate and do so by a small amount dT, for example (T, 0, 0, 0) and (T+dT, 0, 0, 0). The interval-squared separating them is ds² = - dT². To recover the proper time difference, we drop the minus sign and take the square root, which gives dT.

Events that are lightlike separated (that is, events that can be connected by a light signal) are zero interval apart. Since we have set c=1, an example is a pair of events (T, X, 0, 0) and (T+1, X+1, 0, 0). More generally, setting ds = 0 picks out all the lightlike curves and thus specifies the lightcone structure of the spacetime.

To see this, we have

ds² = - dT² + dX² + dY² + dZ² = 0

entails

dX² + dY² + dZ² = dT²

so that

$$1 = \frac{dX^2 + dY^2 + dZ^2}{dT^2} = \frac{dX^2}{dT^2} + \frac{dY^2} {dT^2} + \frac{dZ^2}{dT^2} = \left( \frac{dX}{dT} \right)^2 + \left( \frac{dY}{dT} \right)^2 + \left( \frac{dZ}{dT} \right)^2 = |\mathbf v|^2$$

where the velocity along the trajectory is v with

$$ \mathbf v = \left ( \frac{dX}{dT} , \frac{dY}{dT} ,\frac{dZ}{dT} \right) $$

That is, if ds = 0 along a trajectory, that trajectory is one that is traced out by a point moving at unit speed. Since we have set c=1, unit speed is the speed of light.
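The three cases above can be illustrated with a small sketch that evaluates ds² for a given set of coordinate differences and reports the type of separation. The function names and the tolerance used to test for zero are assumptions made for this illustration.

```python
def interval_squared(dT, dX, dY, dZ):
    """ds^2 = -dT^2 + dX^2 + dY^2 + dZ^2, in units with c = 1."""
    return -dT**2 + dX**2 + dY**2 + dZ**2

def classify(dT, dX, dY, dZ, tol=1e-12):
    """Label a separation by the sign of ds^2 (tol is an assumed zero threshold)."""
    ds2 = interval_squared(dT, dX, dY, dZ)
    if abs(ds2) <= tol:
        return "lightlike"
    return "spacelike" if ds2 > 0 else "timelike"

print(classify(0.0, 1.0, 0.0, 0.0))  # events differing only in X
print(classify(1.0, 0.0, 0.0, 0.0))  # events differing only in T
print(classify(1.0, 1.0, 0.0, 0.0))  # the pair (T, X, 0, 0) and (T+1, X+1, 0, 0)
```

The last case has unit speed, dX/dT = 1, which is why its interval vanishes.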

How to Introduce Arbitrary Spacetime Coordinates

Let us introduce a new spacetime coordinate system x^μ = (x⁰, x¹, x², x³). Since ds is a quantity ascertainable by direct measurement, independent of the choice of the coordinate system, it must remain unchanged when determined in the new coordinate system.

To ensure this sameness, we recover an expression for ds by replacing each of the small coordinate differences in the original coordinate system by corresponding differences in the new coordinate system. To do this we use expressions:

$$dT = \frac{\partial T}{\partial x^0}dx^0 + \frac{\partial T}{\partial x^1}dx^1 + \frac{\partial T}{\partial x^2}dx^2 + \frac{\partial T}{\partial x^3}dx^3$$

...

$$dZ = \frac{\partial Z}{\partial x^0}dx^0 + \frac{\partial Z}{\partial x^1}dx^1 + \frac{\partial Z}{\partial x^2}dx^2 + \frac{\partial Z}{\partial x^3}dx^3$$

After substituting for dT, ..., dZ, we recover an intimidatingly large expression for the line element:

$$ds^2 = \left [ - \left ( \frac{\partial T}{\partial x^0} \right )^2 + \left ( \frac{\partial X}{\partial x^0} \right )^2 + ... \right ] \left ( dx^0 \right )^2
+ 2 \left [ - \left ( \frac{\partial T}{\partial x^0} \right )\left ( \frac{\partial T}{\partial x^1} \right )
+ \left ( \frac{\partial X}{\partial x^0} \right )\left ( \frac{\partial X}{\partial x^1} \right )
+ ... \right ]dx^0dx^1
+ ...$$

This is formally rather unwieldy. We can simplify the notation if we write

T = X⁰, X = X¹, Y = X², Z = X³.

These four coordinates can then be written simply as X^μ where μ takes the values 0, 1, 2, 3.

Then, following the Einstein summation convention explained below, the expression for ds² is written as

$$ds^2 = \eta _{\mu\nu} dX^\mu dX^\nu$$

where μ and ν take values 0, 1, 2, 3 and the matrix $$\eta _{\mu\nu}=\begin{bmatrix}
-1 & 0 & 0 & 0\\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix}$$

is the representation of the metric tensor of Minkowski spacetime in this coordinate system.

Einstein's Summation Convention

In Einstein's convention, when an index is repeated "upstairs" and "downstairs" we sum over that index. So the expression for ds² is really:

$$ds^2 =\sum_{\mu =0}^{3} \sum_{\nu =0}^{3} \eta _{\mu\nu} dX^\mu dX^\nu$$

We will continue to use this convention below.
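To make the convention concrete, here is a minimal sketch that writes out the double sum over μ and ν explicitly and compares it with the line element written term by term. The names ETA, ds2_by_summation and ds2_direct, and the sample coordinate differences, are illustrative assumptions.

```python
# The matrix eta_{mu nu} in the inertial coordinates (T, X, Y, Z).
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def ds2_by_summation(dX):
    """Explicit double sum over mu and nu of eta[mu][nu] * dX[mu] * dX[nu]."""
    total = 0.0
    for mu in range(4):
        for nu in range(4):
            total += ETA[mu][nu] * dX[mu] * dX[nu]
    return total

def ds2_direct(dX):
    """The same quantity written out as -dT^2 + dX^2 + dY^2 + dZ^2."""
    dT, dx, dy, dz = dX
    return -dT**2 + dx**2 + dy**2 + dz**2

d = (0.3, 0.1, -0.2, 0.4)  # sample small coordinate differences
print(ds2_by_summation(d), ds2_direct(d))
```

Only the four diagonal terms of the sixteen in the sum are non-zero, which is why the two results coincide.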

The Spacetime Metric Revealed

We transform the expression for ds² to the new coordinate system as before, by substituting the new coordinate differences dx^α for the old coordinate differences dX^μ:

$$ds^2 = \eta _{\mu\nu}
\left (\frac{\partial X^\mu}{\partial x^\alpha} \right )
\left (\frac{\partial X^\nu}{\partial x^\beta} \right ) dx^\alpha dx^\beta$$

This expression within the new coordinate system can be rewritten as

$$ds^2 = g _{\alpha\beta} dx^\alpha dx^\beta$$

where the Minkowski metric tensor in this new coordinate system is given by

$$g_{\alpha\beta} = \eta _{\mu\nu}
\left (\frac{\partial X^\mu}{\partial x^\alpha} \right )
\left (\frac{\partial X^\nu}{\partial x^\beta} \right )$$

This last equation shows that g_αβ is a tensor, for the defining property of a tensor is the rule used to transform its components from an old coordinate system to a new one.

The new components are linear functions of the old components, where the coefficients in the linear transformations are coordinate derivatives like $$\left (\frac{\partial X^\mu}{\partial x^\alpha} \right )$$

Covariant and Contravariant Tensors

The way these coordinate derivatives enter into the transformation equations determines whether the tensor is known as a "covariant tensor" or a "contravariant tensor."

The metric tensor g_μν is "covariant" since the coordinate derivatives in the transformation equations have the form

$$\left (\frac{\partial X^\mu}{\partial x^\alpha} \right ) = \left (\frac{\text{old coordinate}}{\text{new coordinate}} \right )$$

The coordinate differences $$dx^\alpha$$ transform contravariantly:

$$dx^\mu = \left ( \frac{\partial x^\mu}{\partial X^\nu} \right ) dX^\nu$$

What makes this an example of a contravariant transformation is that the coordinate derivatives are reversed and have the form

$$\left (\frac{\text{new coordinate}}{\text{old coordinate}} \right )$$
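The two transformation rules work together to keep ds² unchanged. The sketch below illustrates this in two dimensions (T, X) for a hypothetical linear change of coordinates, T = 2x⁰ and X = x¹ - 2x⁰, chosen only for this example. The metric is transformed covariantly with the (old/new) derivatives and the coordinate differences contravariantly with the (new/old) derivatives.

```python
ETA = [[-1.0, 0.0], [0.0, 1.0]]  # Minkowski metric restricted to (T, X)

# Assumed example transformation: T = 2*x0, X = x1 - 2*x0.
# d(old)/d(new): rows are old coordinates (T, X), columns are new (x0, x1).
D_OLD_D_NEW = [[2.0, 0.0], [-2.0, 1.0]]
# d(new)/d(old): inverse relations x0 = T/2, x1 = X + T.
D_NEW_D_OLD = [[0.5, 0.0], [1.0, 1.0]]

def transform_covariant(eta, J):
    """g_ab = eta_mn * (dX^m/dx^a) * (dX^n/dx^b), summed over m and n."""
    n = len(eta)
    return [[sum(eta[m][v] * J[m][a] * J[v][b]
                 for m in range(n) for v in range(n))
             for b in range(n)] for a in range(n)]

def transform_contravariant(dX, K):
    """dx^m = (dx^m/dX^n) * dX^n, summed over n."""
    n = len(dX)
    return [sum(K[m][v] * dX[v] for v in range(n)) for m in range(n)]

def contract(g, d):
    """ds^2 = g_ab * d^a * d^b, summed over a and b."""
    n = len(d)
    return sum(g[a][b] * d[a] * d[b] for a in range(n) for b in range(n))

dX_old = [0.3, 0.7]  # sample differences (dT, dX)
g_new = transform_covariant(ETA, D_OLD_D_NEW)
dx_new = transform_contravariant(dX_old, D_NEW_D_OLD)
print(contract(ETA, dX_old), contract(g_new, dx_new))
```

The metric components and the coordinate differences each change, but the factors of (old/new) and (new/old) cancel in the contraction, so both printed values of ds² agree.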

Transformation to Uniformly Accelerated Coordinates

An illustration of these transformations is provided by the transformation from a Minkowski spacetime in an inertial coordinate system to one in a uniformly accelerated coordinate system.

In the inertial coordinate system (T, X, Y, Z) = (X⁰, X¹, X², X³), we have the line element

ds² = - dT² + dX² + dY² + dZ²

and the metric matrix

$$\eta _{\mu\nu}= \begin{bmatrix}
-1 & 0 & 0 & 0\\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix}$$

We transform to a uniformly accelerated coordinate system (t, x, y, z) = (x⁰, x¹, x², x³). Following A. Einstein and N. Rosen, "The Particle Problem in the General Theory of Relativity," Physical Review, 48 (1935), pp. 73 - 77, on p. 74, we have for the transformation equations that:

T = x sinh(at),   X = x cosh(at),   Y = y,   Z = z

We seek the form of the Minkowski metric, g_μν, in this new coordinate system, using the rule for transforming the covariant tensor η_μν,

$$g_{\alpha\beta} = \eta _{\mu\nu}
\left (\frac{\partial X^\mu}{\partial x^\alpha} \right )
\left (\frac{\partial X^\nu}{\partial x^\beta} \right )$$

To apply the formula, we need first to calculate the partial derivatives in it. Most of them are zero valued. The only non-zero ones are

$$ \frac{\partial T}{\partial t} = \frac{\partial (x \sinh(at))}{\partial t} = ax \cosh(at) $$

$$ \frac{\partial T}{\partial x} = \frac{\partial (x \sinh(at))}{\partial x} = \sinh(at) $$

$$ \frac{\partial X}{\partial t} = \frac{\partial (x \cosh(at))}{\partial t} = ax \sinh(at) $$

$$ \frac{\partial X}{\partial x} = \frac{\partial (x \cosh(at))}{\partial x} = \cosh(at) $$

$$ \frac{\partial Y}{\partial y} = \frac{\partial Z}{\partial z} = 1 $$

Most of the terms in the summation for g_μν vanish, which greatly simplifies the calculation. Here are the non-zero terms:

$$
\begin{multline}
g_{00} =
\eta _{00} \left (\frac{\partial T}{\partial t} \right )^2
+
\eta _{11} \left (\frac{\partial X}{\partial t} \right )^2 \\
=
(-1) (ax \cosh(at))^2 + (+1) (ax \sinh(at))^2
= -a^2x^2
\end{multline}
$$

since cosh²(at) - sinh²(at) =1.

$$
\begin{multline}
g_{11} =
\eta _{00} \left (\frac{\partial T}{\partial x} \right )^2
+
\eta _{11} \left (\frac{\partial X}{\partial x} \right )^2 \\
=
(-1) (\sinh(at))^2 + (+1) (\cosh(at))^2
= 1
\end{multline}
$$

$$ g_{22} =
\eta _{22} \left (\frac{\partial Y}{\partial y} \right )^2
= (+1) (1)^2 = 1
$$

$$
g_{33} =
\eta _{33} \left (\frac{\partial Z}{\partial z} \right )^2
= (+1) (1)^2 = 1 $$

Combining we find that

$$ g _{\mu\nu}= \begin{bmatrix}
-a^2x^2 & 0 & 0 & 0\\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix}$$

and that the Minkowski line element becomes

ds² = - a²x²dt² + dx² + dy² + dz²
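The calculation can be checked numerically. The following sketch builds the matrix of partial derivatives for T = x sinh(at), X = x cosh(at), Y = y, Z = z at one sample point and applies the covariant transformation rule. The function names and the sample values of t, x and a are assumptions chosen for illustration.

```python
import math

ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

def jacobian(t, x, a):
    """Rows: old coordinates (T, X, Y, Z); columns: new coordinates (t, x, y, z)."""
    return [
        [a * x * math.cosh(a * t), math.sinh(a * t), 0.0, 0.0],  # dT/dt, dT/dx
        [a * x * math.sinh(a * t), math.cosh(a * t), 0.0, 0.0],  # dX/dt, dX/dx
        [0.0, 0.0, 1.0, 0.0],                                    # dY/dy
        [0.0, 0.0, 0.0, 1.0],                                    # dZ/dz
    ]

def metric_in_new_coordinates(t, x, a):
    """g_ab = eta_mn * (dX^m/dx^a) * (dX^n/dx^b), summed over m and n."""
    J = jacobian(t, x, a)
    return [[sum(ETA[m][n] * J[m][al] * J[n][be]
                 for m in range(4) for n in range(4))
             for be in range(4)] for al in range(4)]

# Sample point: t = 0.4, x = 1.5, a = 2.0, so that -a^2 x^2 = -9.
g = metric_in_new_coordinates(t=0.4, x=1.5, a=2.0)
for row in g:
    print(row)
```

Up to rounding error, the printed matrix is diagonal with entries -a²x², 1, 1, 1, and the result does not depend on the value of t chosen.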

Why are these Uniformly Accelerated Coordinates?

Finally, we might ask why the transformation

T = x sinh(at),   X = x cosh(at),   Y = y,   Z = z

is to a coordinate system adapted to a uniformly accelerating frame of reference (or briefly "uniformly accelerated coordinates").

The Newtonian transformation for constant acceleration A is:

T = t and X = x + (1/2)At²

A point at rest in the accelerating frame will have a constant x coordinate, so that its trajectory in the inertial coordinate system is

X = constant + (1/2)At² = constant + (1/2)AT²

Its velocity is V = dX/dT = AT, which grows without limit with time and will eventually exceed the speed of light.

The formula using sinh and cosh functions ensures that points at rest in the uniformly accelerated frame of reference approach the speed of light, c=1, asymptotically. Instead of the Newtonian parabolic motion, they have a hyperbolic motion.

A point at rest in the accelerating frame has x = constant = k, so that its trajectory is

T = k sinh(at),   X = k cosh(at)

We can treat t as a path parameter and form

$$ \frac {dT} {dt} = ak \cosh(at) $$

$$ \frac {dX} {dt} = ak \sinh(at) $$

It follows that the speed of the point at rest in the accelerating frame of reference is

$$ \left( \frac {dX} {dT} \right ) = \left( \frac {dX} {dt} \right ) \left( \frac {dt} {dT} \right ) = \frac {ak \sinh(at)} {ak \cosh(at)} $$

$$ \lim_{t \to \infty} \left( \frac {dX} {dT} \right ) = \lim_{t \to \infty} \frac {ak \sinh(at)} {ak \cosh(at)} = 1 $$
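The contrast between the two cases can be seen numerically. In this sketch the speed of the point at rest in the accelerating frame is sinh(at)/cosh(at) = tanh(at), since the factor ak cancels, while the Newtonian speed is AT. The values a = 1 and A = 1 and the function names are assumptions for illustration only, and the two columns are parametrized differently (by t and by T), so only their qualitative behavior should be compared.

```python
import math

def hyperbolic_speed(t, a):
    """dX/dT = sinh(at)/cosh(at) = tanh(at); the factor a*k cancels."""
    return math.tanh(a * t)

def newtonian_speed(T, A):
    """V = dX/dT = A*T for the Newtonian trajectory X = constant + (1/2)*A*T^2."""
    return A * T

# Sample parameter values; a = A = 1 is an assumed choice.
for value in (0.1, 1.0, 3.0, 10.0):
    print(value, hyperbolic_speed(value, 1.0), newtonian_speed(value, 1.0))
```

The hyperbolic speed rises toward 1 without exceeding it, while the Newtonian speed passes 1 and keeps growing.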

For small t, however, the transformation is approximately Newtonian. Then we have

sinh(at) ≈ at and cosh(at) ≈ 1 + (1/2)a²t²

Once again, we track a point at rest in the accelerating frame at x = constant = k, but now for small t. Then we have

T ≈ kat,   X ≈ k(1 + (1/2)a²t²)

Substituting t = T/(ka) in the expression for X, we recover that the trajectory of the point is

X ≈ k(1 + (1/2)a²t²) = k(1 + (1/2)a²(T/(ka))²) = k(1 + (1/2)(1/k)²T²) = k + (1/2)(1/k)T²

Hence we recover the Newtonian parabolic motion X = constant + (1/2)AT² in a small region around x=k, where the acceleration corresponding to the Newtonian acceleration is A = 1/k. This means that the corresponding Newtonian acceleration differs according to the value of the coordinate x=k of the point whose motion is tracked.
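A short numerical comparison shows how good the approximation is. Eliminating t from T = k sinh(at) and X = k cosh(at) with cosh² - sinh² = 1 gives the exact hyperbola X² - T² = k², so that X = √(k² + T²). Multiplying out the last expression for X above gives the parabola k + T²/(2k). The function names and the sample values of k and T in this sketch are assumptions.

```python
import math

def exact_X(T, k):
    """Exact hyperbolic trajectory: X^2 - T^2 = k^2, taking the positive root."""
    return math.sqrt(k * k + T * T)

def parabolic_X(T, k):
    """Small-T approximation: X = k + T^2 / (2k)."""
    return k + T * T / (2.0 * k)

k = 2.0  # assumed sample value of the constant coordinate x = k
for T in (0.01, 0.1, 0.5, 2.0):
    print(T, exact_X(T, k), parabolic_X(T, k))
```

For T much smaller than k the two trajectories are nearly indistinguishable. Once T is comparable to k the parabola pulls away from the hyperbola.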

In another chapter, you will find diagrams of the parabolic motions of uniform acceleration and corresponding hyperbolic motions that display these effects graphically.