You may be aware that I’m a huge fan of
the YouTube channel 3blue1brown, run by Grant Sanderson. Grant makes excellent videos about
math, and mathy aspects of other topics, so I’m letting him take over my channel for
the day. Grant, take it away. (Grant): A week ago, I put out a tweet showing
a peculiar place where an ellipse arises, but what I didn’t mention is that this arbitrary-seeming
construction is highly relevant to a once lost lecture by Richard Feynman on why planets
orbit in ellipses. The construction starts by drawing a circle,
and choosing some point within the circle which is not the center, what I’ll call
an “eccentric” point. Then draw a bunch of lines from this eccentric
point to the circumference of the circle. For each of those lines, rotate it 90 degrees
about its midpoint. Once you do this for all the lines, an ellipse emerges in the middle.
Out of context, this is a mildly pleasing curiosity, but there’s a much deeper form of satisfaction
on its way once you understand the full story surrounding this. Front and center in that story is Richard
Feynman, who’s famous along a number of dimensions. To scientists, he’s a giant of 20th century
physics, winner of the Nobel prize for his foundational insights into quantum electrodynamics,
among many other things. To the public, he’s a refreshing contradiction
to stereotypes about physicists: A safe-cracking, bongo-playing, mildly-philanderous non-conformist
whose heavily Brooklyn-accented voice you’ve probably heard either relaying some bit of
no-nonsense pragmatic wisdom about the only sensible way to view the world, or else some
wry joke told through a crooked smile. But to physics students, he was an exceptionally
skillful teacher, both for his charisma and his uncanny ability to make complicated topics
feel natural and approachable. Many of the lectures he gave as a Caltech
freshman course are immortalized in the now famous “Feynman lectures”, whose three
volumes are available for free online. But not all of the lectures he gave made it
into this collection. One in particular, a guest lecture
given on March 13, 1964 entitled “The motion of planets around the sun”, survived only
as an unpublished partial transcript with a smattering of notes buried in the office
of one of Feynman’s colleagues until it was eventually dug up by Caltech archivist
Judith Goodstein. Despite the absence of some crucial blackboard
drawings to follow what Feynman was saying, her husband David eventually reconstructed
the argument of the lecture, which the two of them published in a book titled “Feynman’s
lost lecture”, conveying both the lecture itself and the surrounding story in a really
beautiful way. Here, I’d like to give a more animated and
slightly simplified retelling of the argument Feynman presented. The lecture itself is about why planets and
other astronomical objects orbit the sun in ellipses.
It ultimately has to do with the inverse square law, the fact that the gravitational force
pulling an object towards the sun is inversely proportional to the square of the distance
between the orbiting object and the sun. But why? How exactly does that give rise to an
ellipse, of all shapes? Of course, the gravitational attraction between
planets, moons, comets, and such means that no one orbit is a perfect ellipse, but to
a very good approximation this is the shape of an orbit.
You can solve this analytically, setting up the appropriate differential equation and
seeing the formula for an ellipse pop out, but Feynman’s goal was not to rely on any
heavy mathematical machinery. In fact, let’s take a listen to him articulate his own goal.
I am going to give what I will call an elementary demonstration. But elementary does not mean
easy to understand. Elementary means that very little is required to know ahead of time
in order to understand it, except to have an infinite amount of intelligence.
There may be a large number of steps that are hard to follow, but each step does not require
already knowing the calculus or Fourier transforms. Yeah, that’s all, infinite intelligence.
I think you’re up to that, don’t you? I’ve done what I can to simplify things
down further, but that’s not to say a good deal of focus won’t be required. First things first, we need some definition
of an ellipse, otherwise there’s no hope of proving that they’re the shape of orbits.
Some of you may be familiar with the classic way to construct an ellipse using two thumbtacks
and a piece of string. Use the thumbtacks to fix the ends of a small length of string
in place, then pull the string taut with a pencil, and trace out a curve while keeping
the string taut. It’s similar to how you might use a single
pushpin to construct a circle, where the fixed length of string guarantees that every point
you trace is a constant distance from the thumbtack. But in this case, with two thumbtacks,
what property are you guaranteeing about each point you trace?
Well, at every point, the sum of the distances from that point to each of the thumbtacks
will be the full length of the string, right? So the defining property of this curve is
that when you draw lines from any point on the curve to these two special thumbtack locations,
the sum of the lengths of those lines is a constant, namely the length of the string.
Each of these points is called a “focus” of your ellipse, collectively called “foci”.
Fun fact, the word focus comes from the Latin for “fireplace”, since one of the first
places ellipses were studied was for orbits around the sun, a sort of fireplace of the
solar system, sitting at one of the foci of a planet’s orbit.
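If you’d like to see that constant-sum property in numbers, here’s a minimal Python sketch. The specific values are assumptions chosen for illustration: a string of length 10 (so a = 5) and thumbtacks 6 apart (so c = 3), with the curve parametrized in the standard way as (a cos t, b sin t).

```python
import math

def ellipse_point(a, c, t):
    """Point on the ellipse with foci (-c, 0) and (c, 0) and semi-major axis a."""
    b = math.sqrt(a * a - c * c)          # semi-minor axis
    return a * math.cos(t), b * math.sin(t)

def distance_sum(point, c):
    """Sum of the distances from `point` to the two thumbtacks (foci)."""
    x, y = point
    return math.hypot(x + c, y) + math.hypot(x - c, y)

a, c = 5.0, 3.0   # assumed values: string length 2a = 10, thumbtacks 2c = 6 apart
sums = [distance_sum(ellipse_point(a, c, 2 * math.pi * k / 12), c) for k in range(12)]
print(sums)       # every entry should match the string length, 2a
```

Every sampled point gives the same total, which is just the length of the string.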
Making up a bit of terminology, let’s call this constant sum of the distances from any
point on the ellipse to the two foci the “focal sum” of the ellipse. We’ll get to orbital mechanics in a moment,
but first let’s turn back to that construction I showed at the beginning, which will come
up again later in the story. Remember, we take all these lines from an
eccentric point of a circle to its circumference, and rotate each 90 degrees about its midpoint.
Why on earth should an ellipse pop up? You could just take my word for it, but I
think you’ll be much more satisfied in the end if we take the time now for a brief sidestep
into geometry proof land. First off, there are really only two special
points in this diagram, the eccentric point from which all the lines emerge, and the center
of the circle, so you might guess that each of these is a focus of the ellipse.
Given the defining property of an ellipse, you know you’re going to want to look at
the sum of the distances from these two points to…something.
Also, if you’re doing a geometry problem involving a circle, you’ll very likely want
to draw a radius of that circle, and at some point use the fact that this radius has a
constant length no matter where you draw it. I mean, that’s what defines a circle, so
you’ll probably need to incorporate that somewhere.
With those two thoughts in the back of our mind, let’s limit our attention to just
one of these lines, touching some point P on the circle.
Remember what happens in our construction: You rotate this line from the eccentric point
90 degrees about its midpoint, and the geometry enthusiasts in the room might fancifully call
this a “perpendicular bisector” of the original line.
Take a moment to think about the sum of the distances from our two proposed focus points
to any point Q along this perpendicular bisector. The key insight here is that you can find
two congruent triangles to conclude that the distance from the eccentric point to Q is
the same as the distance from Q to P. So, that means adding the distances to each
focus is the same as adding the distances from the center to Q, then Q to P.
Now there are two key things I want you to notice: First, at the point where this perpendicular
bisector intersects the radius, that sum is clearly the radius of the circle. Since that
radius is a constant no matter where we draw it, the focal sum at that intersection point
stays constant, which by definition means it traces out an ellipse, specifically an
ellipse whose focal sum equals the radius of this circle. Isn’t that neat?
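Here’s a small numerical sanity check of that claim, as a sketch rather than a proof. The circle is assumed to be centered at the origin, and the radius R = 2 and eccentric point E = (0.8, 0.3) are arbitrary values I picked; the function name is also just for illustration. For each point P on the circle, it intersects the perpendicular bisector of the line from E to P with the radius out to P, then adds up the distances from that intersection to the center and to E.

```python
import math

def bisector_meets_radius(R, E, theta):
    """Intersection of the perpendicular bisector of E-P with the radius O-P.

    The circle is centered at the origin O; P sits at angle theta on it.
    """
    px, py = R * math.cos(theta), R * math.sin(theta)   # P on the circle
    ex, ey = E
    mx, my = (px + ex) / 2, (py + ey) / 2               # midpoint M of E-P
    dx, dy = px - ex, py - ey                           # direction of E-P
    # Q = t * P lies on the bisector exactly when (Q - M) . (P - E) = 0.
    t = (mx * dx + my * dy) / (px * dx + py * dy)
    return t * px, t * py

def focal_sum(Q, E):
    """Distance from Q to the center plus distance from Q to E."""
    return math.hypot(Q[0], Q[1]) + math.hypot(Q[0] - E[0], Q[1] - E[1])

R, E = 2.0, (0.8, 0.3)   # assumed radius and eccentric point
sums = [focal_sum(bisector_meets_radius(R, E, 2 * math.pi * k / 16), E)
        for k in range(16)]
print(sums)              # each entry should equal the radius R
```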
Second, because the sum of these two lengths at every other point on this perpendicular
bisector is larger than the radius, meaning the sum of the distances to the foci from
those points is bigger than the ellipse’s focal sum, all other points of this line must
lie outside the ellipse. What this means, and this will be important,
is that this perpendicular bisector, the line we got after our special 90 degree rotation,
is tangent to the ellipse. So the reason all the lines we drew earlier
make an ellipse appear is because we’re drawing a bunch of that ellipse’s tangent
lines. The reason this will be important, as you’ll
see later, is that this tangency direction will correspond to the velocity of an orbiting
object. Okay, geometry proofiness done, onto some
actual physics and orbital mechanics! The first fact to use is Kepler’s (very
beautiful) second law, which says that as an object orbits around the sun, the area
it sweeps out during a given amount of time, like 1 day, will be a constant, no matter
where you are in the orbit. For example, think of a comet whose orbit
is very skewed. Close to the sun, it’s getting whipped around very quickly, so it covers
a larger arc length during a given time interval. Farther away, it’ll move slower, so covers
a shorter arc length during that same time. And this trade off between radius and arc
length balances in just such a way that the swept area is the same.
A quick way to see why this is true is to leverage conservation of angular momentum.
For a tiny time step, delta t, the area swept out is essentially a triangle.
In principle you should think of this as a small sliver for a tiny time step, but I’ll
draw it thicker so we can better see all the parts.
The area is ½ base times height, right? The base is the distance to the sun, and the
height will be this little length here, which you can think of as the component of the object’s
velocity perpendicular to the line to the sun, which I’ll call v_perp, multiplied
by the small duration of time. So the area is ½ R * (v_perp) * (delta t).
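To make that formula concrete, here’s a minimal Python sketch comparing it against the area of the actual triangle. The position, velocity, and time step are made-up numbers, and the sun is assumed to sit at the origin.

```python
import math

def swept_triangle_area(r, v, dt):
    """Area of the triangle with corners at the sun (origin), r, and r + v*dt."""
    (x, y), (vx, vy) = r, v
    x2, y2 = x + vx * dt, y + vy * dt
    return 0.5 * abs(x * y2 - y * x2)

def half_R_vperp_dt(r, v, dt):
    """The same area written as 1/2 * R * v_perp * dt."""
    (x, y), (vx, vy) = r, v
    R = math.hypot(x, y)
    v_parallel = (vx * x + vy * y) / R                    # component along the line to the sun
    v_perp = math.sqrt(vx * vx + vy * vy - v_parallel ** 2)
    return 0.5 * R * v_perp * dt

r, v, dt = (3.0, 4.0), (-1.0, 2.0), 0.01   # assumed sample values
area_exact = swept_triangle_area(r, v, dt)
area_formula = half_R_vperp_dt(r, v, dt)
print(area_exact, area_formula)
```

The two agree because only the part of the velocity perpendicular to the line to the sun contributes any area; the part pointing along that line just slides the far corner of the triangle in or out.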
Conservation of angular momentum with respect to a given origin point, like the sun, tells
us that this radius times the component of velocity perpendicular to it will remain constant,
so long as all forces acting on the object are directed towards that origin.
Well, specifically it says this quantity times the mass of the object stays constant, but
the mass of an orbiting object won’t be changing.
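If you want to watch that conservation happen, here’s a rough simulation sketch. Everything specific in it is an assumption for illustration: unit mass, the starting state, the step size, and the choice of a leapfrog integrator. It tries three different pulls toward the origin (a spring-like pull that grows with distance, a 1/r pull, and an inverse square pull) and records R times v_perp at every step.

```python
import math

def r_cross_v_history(n, steps=20000, dt=1e-3):
    """Leapfrog (kick-drift-kick) integration under a pull of magnitude r**n
    directed toward the origin; returns the signed R * v_perp after each step."""
    x, y, vx, vy = 1.0, 0.0, 0.3, 0.9      # arbitrary starting state
    history = []
    for _ in range(steps):
        k = math.hypot(x, y) ** (n - 1)    # acceleration is -k * (x, y)
        vx -= 0.5 * dt * k * x             # half kick
        vy -= 0.5 * dt * k * y
        x += dt * vx                       # drift
        y += dt * vy
        k = math.hypot(x, y) ** (n - 1)
        vx -= 0.5 * dt * k * x             # half kick
        vy -= 0.5 * dt * k * y
        history.append(x * vy - y * vx)    # signed R * v_perp
    return history

spreads = {}
for n in (1, -1, -2):                      # spring-like, 1/r, inverse square
    h = r_cross_v_history(n)
    spreads[n] = max(h) - min(h)
print(spreads)
```

In all three cases the recorded quantity stays put up to floating-point rounding, even though the paths themselves look nothing alike. All that matters is that the pull always points at the origin.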
So! Our expression for the area swept out depends only on the amount of time that has
passed, delta t. Historically, this went the other way around,
and Kepler’s second law is one of the empirical facts that led to an understanding of angular
momentum. I should emphasize, this law does not assume
that the orbit is an ellipse. Heck, it doesn’t even assume the inverse square law, the only
thing needed for this to hold is that the only force acting on the orbiting object is
directed straight towards the sun. This is a fact Feynman spent much more time
showing, recounting an argument by Newton in his Principia, but it kind of distracts
from our main target, so I figure assuming conservation of angular momentum is good enough
for our purposes here, albeit at some loss of elementarity. At this point we don’t know the shape of
an orbit; for all we know it’s some wonky non-elliptical egg shape.
The inverse square law will help pin down that shape precisely, but the strategy is
a little indirect. Before showing the shape of the path traced by the orbiting object,
we’ll show the shape traced out by the velocity vectors. Here, let me show you what I mean
by that. As the object orbits, its velocity will be
changing, always tangent to the curve of the orbit, longer at points where the object moves
quickly, and shorter at points where it moves more slowly.
What we’ll show is that if you take all these velocity vectors, and collect them together
so that their tails all sit at a single point, their tips actually trace out a perfect circle.
This is a pretty awesome fact, if you ask me. The velocity spins around and gets faster
and slower at various angles, but evidently the laws of physics cook things up just right
so that these trace out a perfect circle. The astute among you might have a little internal
lightbulb starting to turn on at the sight of this circle with an off-center point. Now, why on earth should this be true?
Feynman describes being unable to easily follow Newton at this point, so instead he comes
up with his own elegant line of reasoning to explain where this circle comes from.
He starts by looking at the orbit, and slicing it up into little pieces which all cover the
same angle with respect to the sun. Alright, now think about how the amount of
time it takes the orbiting object to traverse one of these equal-angle slices changes as
it gets farther away. Well, by Kepler’s 2nd law, it’s proportional
to the area swept out, right? And because these slices have the same angle, as you get
farther away from the sun, not only does the radius increase, but the component of arc
length perpendicular to that radial line goes up in proportion to that radius. So the area
of one of these slices, and hence the time it takes the object to traverse it, is proportional
to the distance away from the sun squared. In principle, we’ll ultimately be considering
very small slices, so there won’t be ambiguity in what I mean by the radius from the planet
to the sun on a given slice, and the relevant bits of arc length will be effectively straight.
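Written out in symbols, with r for the distance to the sun and Δθ for the fixed angle of each slice (both just my shorthand here), the reasoning looks like this:

```latex
\Delta A \;\approx\; \tfrac{1}{2}\, r \cdot (r\,\Delta\theta) \;=\; \tfrac{1}{2}\, r^{2}\,\Delta\theta,
\qquad
\Delta t \;\propto\; \Delta A
\;\;\Longrightarrow\;\;
\Delta t \;\propto\; r^{2}
\quad \text{(for a fixed } \Delta\theta\text{)}.
```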
Alright, now think about how the inverse square law comes into play. At any given point, the
force the sun imparts on the object is proportional to 1/(the radius)^2, but what does that really
mean? What force is, is the acceleration of the object, the amount that its velocity
changes per unit time, multiplied by that object’s mass.
This is enough to give us a super useful bit of information about how the velocity of our
orbiting object changes from one slice to the next. The change in velocity is acceleration
times change in time, right? Which means it’s proportional to the change in time over the
radius squared. But since the time it takes to traverse one
slice is proportional to the radius squared, these terms cancel, so the change in velocity
as the object traverses a given slice is actually some constant that doesn’t depend on the
slice at all. In other words, if you look at the velocity at
the start of the slice, and at the end of the slice, then directly compare them by joining
their tails, looking at the difference between the two, the vector joining their tips, this
difference has the same length no matter which slice of the orbit you were looking at.
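Here is that cancellation in symbols. The constants are my own labels, not something from the lecture: k is the strength of the inverse square pull per unit mass, h is the constant value of r times v_perp, and Δθ is the fixed angle of each slice.

```latex
|\mathbf{a}| = \frac{k}{r^{2}},
\qquad
\Delta t \approx \frac{r^{2}\,\Delta\theta}{h}
\;\;\Longrightarrow\;\;
|\Delta \mathbf{v}| \;\approx\; |\mathbf{a}|\,\Delta t
\;=\; \frac{k}{r^{2}} \cdot \frac{r^{2}\,\Delta\theta}{h}
\;=\; \frac{k\,\Delta\theta}{h}.
```

The r² from the slice’s duration exactly cancels the 1/r² from the force, so nothing about the particular slice is left on the right-hand side.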
Also! Since the force vector is always pointing towards the sun, as we go from the start of
one slice to the next, that force vector is turning by a constant angle. In geometry lingo,
you might say that all the “external angles” of this polygon that has formed will be equal.
I know this is a little tricky, but hang in there! Remember that all you need to follow
along is infinite intelligence. Take a moment to make sure it’s clear what’s
happening with our velocity diagram: The change from one vector to the next, the little difference
vector joining one tip to the next, will always have the same length, which was a consequence
of the perfect cancelation that happens when mixing Kepler’s second law with the inverse
square law. Now, because those constant-length change
vectors rotate by a constant angle each time, it means they form a regular polygon.
As we consider finer and finer slices of the original orbit, based on smaller and smaller
angles for those slices, the relevant regular polygon defining the tips of the vectors in
our velocity diagram will approach a circle. Isn’t that really neat? Hopefully, at this point you’re looking
at this circle with a special eccentric point, and you’re just itching to see it give rise
to an ellipse the way we saw earlier. But, it’s a little weird, right? We’re looking
at a diagram in velocity space, how exactly will this give us the shape of the orbit?
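Before tackling that question, you can sanity-check the circle claim numerically. The sketch below is an illustration under assumptions of my own: units where the sun’s pull is 1/r², an arbitrary starting position and speed, and a simple leapfrog integrator. It simulates a little more than one orbit, fits a circle through three of the sampled velocity vectors, and measures how far every other velocity tip sits from that circle.

```python
import math

def orbit_velocities(x, y, vx, vy, dt=1e-3, steps=16000, every=200):
    """Leapfrog integration under an inverse square pull toward the origin
    (units assumed so the pull is 1/r**2); returns sampled velocity vectors."""
    samples = []
    for i in range(steps):
        r3 = math.hypot(x, y) ** 3
        vx -= 0.5 * dt * x / r3            # half kick
        vy -= 0.5 * dt * y / r3
        x += dt * vx                       # drift
        y += dt * vy
        r3 = math.hypot(x, y) ** 3
        vx -= 0.5 * dt * x / r3            # half kick
        vy -= 0.5 * dt * y / r3
        if i % every == 0:
            samples.append((vx, vy))
    return samples

def circle_through(p, q, s):
    """Center and radius of the circle passing through three points."""
    (ax, ay), (bx, by), (cx, cy) = p, q, s
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

samples = orbit_velocities(1.0, 0.0, 0.0, 1.2)   # assumed starting state
n = len(samples)
center, radius = circle_through(samples[0], samples[n // 3], samples[2 * n // 3])
deviation = max(abs(math.hypot(vx - center[0], vy - center[1]) - radius)
                for vx, vy in samples)
print(center, radius, deviation)
```

The deviation comes out tiny compared to the radius, and the fitted center is not at the origin, where all the velocity tails sit. That off-center origin is the eccentric point.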
What follows is tricky, but very clever. Step back and consider what we know: We don’t
know the specific shape of the orbit, only the shape the velocity vectors trace.
But more specifically than that, we know that once the planet has turned an angle theta
off the horizontal with respect to the sun, this corresponds to walking theta degrees
around our circle in the velocity diagram, since the acceleration vectors rotate just
as much as the radius vector. This tells us the tangency direction for each
point on the orbit; whichever vector from our velocity diagram touches that point theta
degrees around, that’s the velocity vector of our orbiting object, and hence the tangency
direction of the curve. In fact, let me just start drawing all those
need to use is the information they carry about
the slope of the orbit curve; the specific magnitude of each velocity will not be as
important. Notice, it’s not that the angle of the velocity
vector at this point is an angle theta off the vertical. No no no. The angle I’m referencing
in the velocity diagram is with respect to the circle’s center, which is almost certainly
a little different from where the velocity vectors are rooted.
So the question is, what special curve satisfies the property that the tangency direction for
a point theta degrees off the horizontal is given by this vector from a special eccentric
point of a circle to a point theta degrees around the circle from the vertical? Well, here’s the trick. First, rotate this
whole circle setup 90 degrees. Then take each of those individual velocity
directions and rotate them 90 degrees back the other way, so that they’re oriented
as they were before, it’s just that each is rooted in a different spot.
Aha! We’ve spotted our ellipse! But we still have just a little thinking ahead of us to
really understand how this emergent ellipse is related to the astronomical orbit.
Importantly, I didn’t just rotate these lines about any point, I rotated each about
its own midpoint, which means we can leverage the geometric proof we saw several minutes ago.
And this is the moment where you kind of have to furrow your brow and think back “wait,
what was going on in that proof again?” One of the key points was that when you have
two lines, one from the center of the circle, one from the eccentric point, both to a common
spot on the circle’s circumference, the perpendicular bisector to the eccentric line
will be tangent to the ellipse. What’s more, the point of tangency is where it intersects
with the radial line from the center. What that means is that the point of our little
ellipse which is theta degrees off the horizontal, with respect to the circle’s center, has
a tangent slope perpendicular to this eccentric line. And because of the whole 90 degree rotation,
this means it’s parallel to the velocity vector we need it to be.
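That perpendicularity is easy to test numerically. In this sketch the radius R = 2, the eccentric point E = (0.8, 0.3), the circle centered at the origin, and the function names are all assumptions for illustration. It estimates the direction the little ellipse is heading at each point with a finite difference, then measures the cosine of the angle between that direction and the eccentric line from E to P.

```python
import math

def ellipse_point(R, E, theta):
    """Where the perpendicular bisector of E-P crosses the radius O-P,
    for the point P at angle theta on a circle of radius R about the origin."""
    px, py = R * math.cos(theta), R * math.sin(theta)
    dx, dy = px - E[0], py - E[1]
    # Q = t * P with (Q - M) . (P - E) = 0, where M is the midpoint of E-P,
    # and M . (P - E) simplifies to (R**2 - |E|**2) / 2.
    t = (R * R - (E[0] ** 2 + E[1] ** 2)) / (2 * (px * dx + py * dy))
    return t * px, t * py

def cos_tangent_vs_eccentric_line(R, E, theta, h=1e-6):
    """Cosine of the angle between the curve's tangent at theta and E-P."""
    x1, y1 = ellipse_point(R, E, theta - h)
    x2, y2 = ellipse_point(R, E, theta + h)
    tx, ty = x2 - x1, y2 - y1                             # tangent direction estimate
    dx, dy = R * math.cos(theta) - E[0], R * math.sin(theta) - E[1]
    return (tx * dx + ty * dy) / (math.hypot(tx, ty) * math.hypot(dx, dy))

R, E = 2.0, (0.8, 0.3)   # assumed radius and eccentric point
cosines = [cos_tangent_vs_eccentric_line(R, E, 2 * math.pi * k / 16) for k in range(16)]
print(max(abs(c) for c in cosines))   # close to 0 means perpendicular
```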
So this little emergent curve inside the velocity diagram has exactly the tangency property
we need our orbit to have! And hence, the shape of the orbit must be
an ellipse. QED. Alright, pat yourself on the back, because
there’s no small amount of cleverness required to follow this.
First there was this peculiar way of constructing an ellipse, requiring some geometry savviness
to prove. Then there’s the pretty clever step of even
thinking to ask the question about what shape the velocity vectors trace out when you move
all their tails to the same spot. And showing that this is a circle requires
mixing together the inverse square law with Kepler’s second law in another sly move.
But the cleverness doesn’t end there! Showing how this velocity diagram with vectors rooted
at a point off the circle’s center implies an elliptical orbit brings in this neat 90
degree rotation trick. I just love this. Watching Feynman do physics,
even elementary physics, is like watching Bobby Fischer play chess. (Henry): Thanks again to Grant, and you should
definitely go check out his videos on 3blue1brown