Relativity Theory and Causality


Ether & Experiments of Michelson and Morley

Another historically significant experiment that changed the prevailing theories of its time was that of the US scientists Albert Abraham Michelson and Edward Williams Morley. In their now famous experiment of the late 1880s, they employed a clever method of length measurement, based on the phenomenon of wave interference, to measure the speed of light to very high accuracy. The relation of light to electromagnetism was well understood by that time, thanks to the theories of Maxwell. Matter waves, such as sound and water waves, require a medium for their propagation. By analogy with matter waves, it was theorized that electromagnetic waves also have a medium for their propagation.

In the case of sound waves, for example, scientists had well established that the speed of travel depends on the medium being traversed. Sound travels much faster in solid objects than in air. It was also a well-known fact that sound does not travel at all in vacuum! So, it was clear that sound needs a "medium" to travel in. In the case of light, too, it was known that it traveled at different speeds in different materials. In fact the measure of this, the so-called index of refraction, had been determined for a variety of substances; it is also the quantity that governs the refraction (bending) of light rays as they pass from one medium (say air) into another (say diamond). Unlike sound, however, light was observed to pass through a vessel (say one made of glass) from which all the air had been evacuated. There were therefore two logical options: either light was an exception that did not need a medium for its travel, or the medium was there but we could not see its effect because it was everywhere. It is this second explanation (the latter) that people chose to accept; i.e. light needed a medium, and where no matter was present the medium was assumed to be one in which all of the universe is embedded, called the ether (or aether). So, the challenge was to find a way to measure some aspect of this medium's existence.

Michelson's optics training in Europe had led him to devise an instrument that could measure the speed of light to a high degree of accuracy. Along with Morley, Michelson set out to measure the speed of the earth as it travels through the ether (i.e. the ether wind velocity), by measuring the variation in the speed of light when it propagates along the direction of the earth's motion versus when it travels perpendicular to it. Despite the very high accuracy of this measurement, these scientists detected no variation! This result was very hard to accept. It clearly needed an explanation.

The existence of the ether was so strongly believed that Michelson's null result was "explained" by two theorists, Lorentz and FitzGerald, by invoking the assumption that travel through the ether causes a length shrinkage along the direction of travel that grows with the speed of travel. This assumption, however, was not based on any independent measurement, observation, or physical relation; its sole purpose was to retain an earlier assumption, that of the existence of the ether, in light of Michelson and Morley's experiment. Einstein, on the other hand, created his Theory of Relativity based on the null result of the Michelson-Morley experiment. This theory rejected the ether hypothesis altogether.
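To see why a length shrinkage could hide the ether wind, it may help to compare the round-trip travel times of light along the two arms of the instrument. The sketch below assumes the standard ether-hypothesis expressions for these times, which are not given in the text above. The function name round_trip_times, the 11 m arm length, and the 30 km/s speed are illustrative assumptions only.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def round_trip_times(arm_length, v, contract=False):
    """Round-trip light travel times predicted by the ether hypothesis.

    Returns (t_parallel, t_perpendicular) in seconds for two arms of rest
    length arm_length (m), in an apparatus moving at speed v (m/s) through
    the hypothetical ether.
    """
    beta2 = (v / C) ** 2
    # Lorentz-FitzGerald assumption: only the arm along the motion shrinks.
    if contract:
        parallel_length = arm_length * math.sqrt(1.0 - beta2)
    else:
        parallel_length = arm_length
    t_parallel = 2.0 * parallel_length / (C * (1.0 - beta2))
    t_perpendicular = 2.0 * arm_length / (C * math.sqrt(1.0 - beta2))
    return t_parallel, t_perpendicular


# Illustrative values (assumptions): 11 m light path, 30 km/s through the ether.
t_par, t_perp = round_trip_times(11.0, 3.0e4)
print(t_par - t_perp)  # small positive difference the experiment looked for

t_par, t_perp = round_trip_times(11.0, 3.0e4, contract=True)
print(math.isclose(t_par, t_perp, rel_tol=1e-12))  # True: the difference vanishes
```

Without the shrinkage the two times differ slightly. With it, they agree, which is all the Lorentz-FitzGerald assumption was introduced to achieve.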


Einstein's Postulates of Special Theory of Relativity

If light requires no medium for its travel, then there is no preferred frame of reference for it. So, when we measure the speed of light, the result of our measurement should be independent of our speed. This is why the Michelson-Morley experiment detected no change. Based on this, Einstein formed two postulates that he incorporated in the development of his theory of relativity.

Before discussing these postulates it is useful to define some terminology that is often used in this theory: frame of reference, transformation equations, and observer. In discussing relative motion one often refers to the "frame of reference". In the context of position measurements this is nothing other than a fixed set of coordinate axes. Note that an object's position value depends on the choice of a frame of reference (a set of coordinate axes), even though the distance that it travels does not. So, a measurable quantity can have different values depending on the choice of a frame of reference.

As an example consider rotational motion. In particular, consider the case of an object traveling in a circular path. One could consider a frame of reference that revolves at the same rate as the object, or one that is fixed on earth (the laboratory frame). Of course, the formulation of the observation becomes very different depending on the choice of frame. An object traveling in a circular path is constantly changing its direction of motion. In the laboratory frame, even if the object's speed remains constant, the laws of mechanics imply that the object has an acceleration. But in the frame that is also rotating, the object is not changing its direction of motion; so if its speed remains constant, its acceleration is measured to be zero! Which is true? The accepted answer is that we need to come up with a theory, a set of so-called transformation relations, that allows us to reconcile both of these results. That is to say, by using these transformation relations we could "transform" the results of one frame into the results of the other frame.

As a simple example to demonstrate this idea, consider a moving escalator of the type you see in most shopping malls and airports. Let us say that this escalator moves upward with a constant speed of v_e. Now, consider two separate frames of reference, one that is fixed on the escalator (the moving frame) and another that is stationary with respect to the building (the laboratory frame). Consider a person who is climbing up this escalator with a speed of, say, v_fixed as measured in the frame fixed on the escalator. What is this person's speed as measured in the frame fixed to the building? The answer is that this speed is just the sum of the person's speed and the escalator's. We can formulate this as a mathematical transformation according to:

v_laboratory = v_fixed + v_e

In the above "transformation equation" the velocities are vectors, and we've arbitrarily chosen the "up" direction as positive. The "laboratory" subscript refers to the velocity measured in the frame of the building, and the "fixed" subscript to the velocity measured in the frame fixed on the escalator; v_e is the velocity of the escalator itself. So, in this notation a person climbing down our escalator has a negative v_fixed value.
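The escalator transformation can be sketched in a few lines of code. This is only an illustration in one dimension: the function names to_laboratory and to_escalator and the numerical speeds are hypothetical, chosen to match the sign convention above ("up" is positive).

```python
def to_laboratory(v_fixed, v_e):
    """Velocity in the building frame, given the velocity in the escalator frame."""
    return v_fixed + v_e


def to_escalator(v_laboratory, v_e):
    """Inverse transformation: from the building frame back to the escalator frame."""
    return v_laboratory - v_e


v_e = 0.5  # escalator speed in m/s (hypothetical value), "up" is positive

print(to_laboratory(1.0, v_e))   # climbing up at 1.0 m/s -> 1.5
print(to_laboratory(-0.5, v_e))  # walking down at 0.5 m/s -> 0.0 (stays in place)
```

The second call shows that a person walking down at exactly the escalator's speed is at rest for the observer in the building, even though the observer on the escalator measures a nonzero velocity.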

The last term for us to discuss, the observer, simply refers to the hypothetical scientist who is "sitting" motionless in one frame of reference collecting data. For this observer, all measurements are performed with respect to his or her frame. (In this context the term observer is used to mean a person performing an experiment, not just a casual witness.) So, any motion must be measured with respect to the coordinate system that is at rest relative to this observer. Finally, we need to imagine that there is an observer associated with each frame of reference. The job of our transformation relations, then, is to allow these different observers to translate their results and reach agreement with each other.

The first of Einstein's two postulates states that all laws of physics must have the same mathematical form as observed by any inertial observer. Again, what we mean by "observed" is the actual process of measurement and not a simple visual observation. What we mean by an "inertial observer" is any observer (person making a measurement) among a collection of observers that have non-accelerating motion with respect to one another. The second postulate is just the above-mentioned statement regarding the null result of the ether wind measurement. It states that all inertial observers must measure the same value for the speed of light (as it propagates in vacuum).

The measurements of position x, y, z and of time t made by an observer in the (unprimed) frame F are related to the measurements x', y', z', and t' made by the observer in the (primed) frame F', which has a relative speed of v along the x-direction with respect to F, according to
x' = γ (x - βct)
y' = y
z' = z
ct' = γ(ct - βx)

where β = v/c, γ = (1 - β²)^(-1/2), and c is the speed of light in vacuum.
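As a minimal sketch, the transformation above can be written as a short function and used to check the second postulate numerically. The function name lorentz_transform and the sample numbers are hypothetical. Times are entered as ct so that all four coordinates carry the same length unit.

```python
import math


def lorentz_transform(ct, x, y, z, beta):
    """Map event coordinates (ct, x, y, z) in F to (ct', x', y', z') in F'.

    beta = v/c is the speed of F' relative to F along x, with |beta| < 1.
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct_prime = gamma * (ct - beta * x)
    x_prime = gamma * (x - beta * ct)
    return ct_prime, x_prime, y, z


# A light pulse emitted from the origin along +x reaches x = ct in frame F.
ct, x = 5.0, 5.0
ct_p, x_p, _, _ = lorentz_transform(ct, x, 0.0, 0.0, beta=0.6)
print(ct_p, x_p)  # the two values agree, so the pulse also travels at c in F'
```

The same function can be used to confirm that the combination (ct)² - x² has the same value in both frames for any event, which is a convenient check on an implementation.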

The derivation of a set of mathematical transformation rules by Einstein, i.e. the theory of Special Relativity, is based on the above two postulates. Using these transformations one inertial observer can calculate results that agree with the observations of another inertial observer. Thus the same physical laws can be employed, without the need for "corrections", by all inertial observers. (Einstein's General Theory of Relativity expands this to include even non-inertial observers.) In this way Einstein has required physics to be "the truth"! After all, it is the truth that you would find no matter how or where you make a measurement of a physical process.

This newly created universality of laws of physics in the absence of the preferred frame of reference of ether, however, came at a price! Its penalty is the breakdown of simultaneity and creation of restrictions on causality.


Causality and Space-Time

Among the many interesting consequences of the Special Theory of Relativity (see the exercises) one is that nothing that carries energy and momentum can travel any faster than light does in the vacuum. This immediately sets a limit on causality and it clearly requires us to be very careful when we talk about simultaneous events.

The restriction on causality arises because the only way that a "cause" at location x = 0 and time t = 0 can be responsible for an "effect" at location x = L and time t is that c t ≥ L, where c is the speed of light in vacuum. So, two events that are separated far enough in space, compared with their separation in time, can have no causal connection. (Some people prefer to refer to this consequence as the breakdown of locality; i.e. two events that are far enough apart in space may be causally connected, but are non-local. But so long as we understand what it is that we mean, the rest is terminology.) Similarly, two events can be regarded as simultaneous, or not, depending on their space-time locations! In fact, for two events that are too far apart to be causally connected, different observers may view the "same two events" in a different order in time.
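The causality condition and the reversal of time order can both be illustrated with a short sketch. It assumes the time part of the Lorentz transformation given earlier, applied to the separation between two events. The function names and the sample separations are hypothetical.

```python
import math


def can_be_causally_connected(c_dt, dx):
    """True if a signal at or below light speed can link the two events.

    c_dt is c times the time separation and dx the spatial separation,
    both in the same length unit and measured in one inertial frame.
    """
    return abs(c_dt) >= abs(dx)


def c_dt_in_moving_frame(c_dt, dx, beta):
    """c times the time separation as measured in a frame moving at beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (c_dt - beta * dx)


# Event B occurs after event A in F, but too far away for light to link them.
c_dt, dx = 1.0, 2.0
print(can_be_causally_connected(c_dt, dx))       # False
print(c_dt_in_moving_frame(c_dt, dx, beta=0.8))  # negative: B precedes A in F'

# If light can link the events, every inertial observer agrees on their order.
print(c_dt_in_moving_frame(2.0, 1.0, beta=0.8) > 0)  # True
```

Only events that cannot be causally connected can have their order reversed, so no observer ever sees an effect precede its cause.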

In a similar consideration, this theory now allows us to check causality (or rather the lack of it) by separating events in space such that the light from one cannot reach the other within the time period of their occurrences. This is in fact how Einstein's theory of relativity relates to quantum measurement.