# Learning General Relativity

Math blogger Joseph Nebus is running another A – Z series of posts, explaining technical terms in mathematics. He asked readers for their favorite pick of things to be covered in this series, and I came up with General Covariance. He laid it out in this post – in his signature style, using neither equations nor pop-science images like deformed rubber mattresses – but ‘just words’. As so often, he manages to explain things really well!

Actually, I asked for that term as I am in the middle of yet another physics (re-)learning project – in the spirit of my ventures into QFT a while back.

For a while now I have tried (on this blog) to cover only the physics related to something I have both education in and hands-on experience with. As for General Relativity, I have neither: My PhD was in applied condensed-matter physics – lasers, superconductors, optics – and this article by physicist Chad Orzel about What Math Do You Need For Physics? covers well what sort of math you need in that case. Quote:

I moved into the lab, and was concerned more with technical details of vacuum pumps and lasers and electronic circuits and computer data acquisition and analysis.

So I cannot find the remotest way to justify why I would need General Relativity on a daily basis – insider jokes about very peculiarly torus-shaped underground water/ice tanks for heat pumps aside.

My motivation is what I described in this post of mine: Math-heavy physics is – for me, that means a statistical sample of 1 – the best way of bracing myself for any type of tech / IT / engineering work. This positive effect is not even directly related to math/physics aspects of that work.

But I also noticed ‘on the internet’ that there is a community of science and math enthusiasts, who indulge in self-studying theoretical physics seriously as a hobby. Often these are physics majors who ended up in very different industry sectors or in management / ‘non-tech’ jobs and who want to reconnect with what they once learned.

For those fellow learners I’d like to publish links to my favorite learning resources.

There seem to be two ways to start a course or book on GR, and sometimes authors toggle between both modes. You can start from the ‘tangible’ physics of our flat space (spacetime) plus special relativity and then gradually ‘add a bit of curvature’ and related concepts. In this way the introduction sounds familiar, and less daunting. Or you could try to introduce the mathematical concepts at a most rigorous abstract level, and return to the actual physics of our 4D spacetime and matter as late as possible.

The latter makes a lot of sense as you better unlearn some things you took for granted about vector and tensor calculus in flat space. A vector must no longer be visualized as an arrow that can be moved around carelessly in space, and one must be very careful in visualizing what transforming coordinates really means.

For motivation or as an ‘upper level pop-sci intro’…

Richard Feynman’s lecture on curved space might be a very good primer. Feynman explains what curved space and curved spacetime actually mean. Yes, he is using that infamous beetle on a balloon, but he also gives some numbers obtained by back-of-the-envelope calculations that explain important concepts.

For learning about the mathematical foundations …

I cannot praise these Lectures given at the Heraeus International Winter School Gravity and Light 2015 enough. Award-winning lecturer Frederic P. Schuller goes to great lengths to introduce concepts carefully and precisely. His goal is to make all implicit assumptions explicit and avoid allusions to misguided ‘intuitions’ one might have got used to when working with vector analysis, tensors, gradients, derivatives etc. in our tangible 3D world – covered by what he calls ‘undergraduate analysis’. Only in lecture 9 is the first connection made back to Newtonian gravity. Then it is back to math only for some more lectures, until finally our 4D spacetime is discussed in lecture 13.

Schuller mentions in passing that Einstein himself struggled with the advanced math of his own theory, e.g. in the sense of not yet distinguishing clearly between the mathematical structure that represents the real world (a topological manifold) and the multi-dimensional chart we project our world onto when using an atlas. It is interesting to pair these lectures with this paper on the history and philosophy of general relativity – a link Joseph Nebus has pointed to in his post on covariance.

When learning physics or math from videos you need to be much more disciplined than when plowing through textbooks – in the sense that you absolutely have to do every single step in a derivation on your own. It is easy to delude yourself that you understood something by following a derivation passively, without calculating anything yourself. So what makes these lectures so useful is that tutorial sessions have been recorded as well: Tutorial sheets and videos can be found here.
(Edit: The Youtube channel of the event does not have all the recordings of the tutorial sessions, only this conference website has. It seems the former domain does not work any more, but the content is preserved at gravity-and-light.herokuapp.com)

You also find brief notes for these lectures here.

For a ‘physics-only’ introduction …

… I picked a classical, ‘legendary’ resource: Landau and Lifshitz give an introduction to General Relativity in the last third of the second volume in their Course of Theoretical Physics, The Classical Theory of Fields. Landau and Lifshitz’s text is terse, perhaps similar in style to Dirac’s classical introduction to quantum mechanics. No humor, but sublime and elegant.

Landau and Lifshitz need neither manifolds nor tangent bundles, and they use the 3D curvature tensor of space a lot in addition to the metric tensor of 4D spacetime. They introduce concepts of differences in space and time right from the start, plus the notion of simultaneity. Mathematicians might be shocked by a somewhat handwaving, ‘typical physicist’s’ way to deal with differentials, the way vectors at different points in space are related, etc. – neglecting (at first sight, explore every footnote in detail!) the tower of mathematical structures you actually need to do this precisely.

But I would regard Lev Landau as sort of a Richard Feynman of The East, so it takes his genius not to make any silly mistakes by taking the seemingly intuitive notions too literally. And I recommend this book only when combined with a most rigorous introduction.

I recommend Sean Carroll’s  Lecture Notes on General Relativity from 1997 (precursor of his textbook), together with his short No-Nonsense Introduction to GR as a summary. Carroll switches between more intuitive physics and very formal math. He keeps his conversational tone – well known to readers of his popular physics books – which makes his lecture notes a pleasure to read.

__________________________________

So this was a long-winded way to present just a bunch of links. This post should also serve as sort of an excuse that I haven’t been really active on social media or followed up closely on other blogs recently. It seems in winter I am secluding myself from the world in order to catch up on theoretical physics.

# Rowboats, Laser Pulses, and Heat Energy (Boring Title: Dimensional Analysis)

Dimensional analysis means to understand the essentials of a phenomenon in physics and to calculate characteristic numbers – without solving the underlying, often complex, differential equation. The theory of fluid dynamics is full of interesting dimensionless numbers – the Reynolds number is perhaps the most famous.

In the previous post on temperature waves I solved the Heat Equation for a very simple case, in order to answer the question How far does solar energy get into ground in a year? Reason: I have been working on simulations of our heat pump system for a few years. This also involves heat transport between the water/ice tank and ground. If you set out to simulate a complex phenomenon you have to make lots of assumptions about materials’ parameters, and you have to simplify the system and equations you use for modelling the real world. You need a way of cross-checking if your results sound plausible in terms of orders of magnitude. So my goal has been to find yet another method to confirm assumptions I have made about the thermal properties of ground elsewhere.

Before I am going to revisit heat transport, I’ll try to explain what dimensional analysis is – using the best example I’ve ever seen. I borrow it from theoretical physicist – and awesome lecturer – David Tong:

How does the speed of a rowing boat depend on the number of rowers?

References: Tong’s lecture notes called Dynamics and Relativity (Chapter 3). This is the original paper from 1971 that Tong quotes: Rowing: A similarity analysis.

The boat experiences a force of friction in water. As for a car impeded by the friction of the surrounding air, the force of friction depends on velocity.

Force is the change of momentum, momentum is proportional to mass times velocity. Every small ‘parcel’ of water carries a momentum proportional to speed – so force should at least be proportional to one factor of v. But these parcels move at a speed v, so the faster they move the more momentum is exchanged with the boat; so there has to be a second factor of v, and force is proportional to the square of the speed of the boat.

The larger the cross-section of the submerged part of the boat, A, the higher is the number of collisions between parcels of water and the boat, so putting it together:

$F \sim v^{2}A$

Rowers need to put in power to compensate for friction. Power is energy per time, and energy is force times distance. Since distance over time is velocity, power is also force times velocity.

So there is one more factor of v to be included in power:

$P \sim v^{3}A$

For the same reason wind power harvested by wind turbines is proportional to the third power of wind speed.

A boat does not sink because downward gravity and upward buoyancy just compensate each other; buoyancy is the weight of the volume of water displaced. The heavier the load, the more water needs to be displaced. The submerged volume of the boat V is proportional to the weight of the rowers, and thus to their number N if the mass of the boat itself is negligible:

$V \sim N$

The volume of something scales with the third power of its linear dimensions – think of a cube or a sphere; so the surface area scales with the square of the length, and the cross-section A scales with $V^{\frac{2}{3}}$ – and thus with $N^{\frac{2}{3}}$:

$A \sim N^{\frac{2}{3}}$

Each rower contributes the same share to the total rowing power, so:

$P \sim N$

Inserting for A in the first expression for P:

$P \sim v^{3} N^{\frac{2}{3}}$

Eliminating P as it has been shown to be proportional to N:

$N \sim v^{3} N^{\frac{2}{3}}$
$v^{3} \sim N^{\frac{1}{3}}$
$v \sim N^{\frac{1}{9}}$

… which is in good agreement with measurements according to Tong.
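To get a feeling for how weak this dependence is, the scaling law can be turned into numbers. This is only a minimal sketch: it assumes nothing but the result $v \sim N^{\frac{1}{9}}$ from above, the function name `relative_speed` is made up for illustration, and speeds are given relative to a boat with a single rower.

```python
def relative_speed(n_rowers: int) -> float:
    """Speed relative to a one-rower boat, assuming v ~ N^(1/9)."""
    return n_rowers ** (1.0 / 9.0)


# Typical crew sizes in rowing.
for n in (1, 2, 4, 8):
    print(f"{n} rowers: {relative_speed(n):.3f} x single-rower speed")
```

Under this scaling, going from one rower to eight buys only about a quarter more speed, since $8^{\frac{1}{9}} = 2^{\frac{1}{3}}$.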

Heat Transport and Characteristic Lengths

In the last post I’ve calculated characteristic lengths, describing how heat is slowly dissipated in ground: 1) The wavelength of the damped oscillation and 2) the run-out length of the enveloping exponential function.

Both are proportional to the square root of a simple number:

$l \sim \sqrt{D \tau}$

… the factor of proportionality being ‘small’ on a logarithmic scale, like π or 2 or their inverse. τ is the period, and D was a number expressing how well the material carries away heat energy.

There is another ‘simple’ scenario that also results in a length scale described by
$\sqrt{D \tau}$ times a small number: If you deposit a confined ‘lump of heat’, a ‘point heat’, it will peter out, and the average width of the lump after some time τ is about this length as well.

Using a very short laser pulse to heat solid material is very close to depositing ‘point heat’. Decades ago I worked with pulsed excimer lasers, used for ablation (‘shooting off’) of material from ceramic targets. This type of laser is used in eye surgery today:

Heat is deposited in nanosecond pulses, and the run-out length of the heat peak in the material is about $\sqrt{D \tau}$ with τ being equal to the laser’s very short pulse length of several nanoseconds. As the pulse duration is short, the penetration depth is short as well, and tissue is ‘cut’ precisely without heating much of the underlying material.

So this type of $\sqrt{D \tau}$ length is not just a result of a calculation for a specific scenario, but it rather seems to encompass important characteristics of heat conduction as such.

The unit of D is area over time, m²/s. If you accept the heat equation as a starting point, analysing the dimensions involved by counting x and t you see that D has to contain two powers of x and one of t. Half of applied physics and engineering is about getting units right.

But I pretend I don’t even know the heat equation and ‘visualize’ heat transport in this way: ‘Something’ – like heat energy – is concentrated in space and slowly peters out. The spreading out is faster, the more concentrated it is. A thin needle-like peak quickly becomes a rounded hill, and then is flattened gradually. Concentration in space means curvature. The smaller the space occupied by the lump of heat is, the smaller its radius, the higher its curvature as curvature is the inverse of the radius of a tangential circular path.

I want to relate curvature to the change with time. Change in time has to be measured in units including the inverse of time, curvature is the inverse of space. Equating those, you have to come up with something including the square of a spatial dimension and one temporal dimension – something like D [m²/s].

How to get a characteristic length from this? D has to be multiplied by a characteristic time, and then we need to take the square root. So we need to put in some characteristic time that is a property of the specific system investigated and not of the equation – like the yearly period or the laser pulse. And the resulting length is exactly that $l \sim \sqrt{D \tau}$ that shows up in any of the solutions for specific scenarios.
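The recipe ‘multiply D by a characteristic time and take the square root’ is short enough to write down as code. The following is a sketch only: the function name is made up, prefactors like π or 2 are deliberately ignored, and the diffusivity of about 7,3 · 10⁻⁷ m²/s is an assumed value for soil (roughly the 0,0026 m²/h used in the post on temperature waves).

```python
import math


def characteristic_length(diffusivity_m2_per_s: float, time_s: float) -> float:
    """Order-of-magnitude length sqrt(D * tau); prefactors such as pi are ignored."""
    return math.sqrt(diffusivity_m2_per_s * time_s)


D_SOIL = 7.3e-7          # m^2/s, assumed value for soil
DAY = 24 * 3600.0        # s
YEAR = 365 * DAY         # s

print(f"one day:  {characteristic_length(D_SOIL, DAY):.2f} m")
print(f"one year: {characteristic_length(D_SOIL, YEAR):.2f} m")
```

Because of the square root, a characteristic time that is 365 times longer stretches the length only by a factor of about 19.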

_________________________________

The characteristic width of the spreading lump of heat is visible in the so-called Green’s functions. These functions describe a system’s response to a ‘source’ which resembles a needle-like peak in time. In this case it is a Gaussian function with a ‘width’ $\sim \sqrt{D \tau}$. See e.g. equation (14) on PDF-page 14 of these lecture notes.

Penetration depth of excimer lasers in human tissue – in this book the $\sqrt{D \tau}$ formula is used and depths are calculated to be equal to several tens of micrometers.

# Temperature Waves and Geothermal Energy

Nearly all of the renewable energy exploited today is, in a sense, solar energy. Photovoltaic cells convert solar radiation into electricity, solar thermal collectors heat water. Plants need solar power for photosynthesis, for ‘creating biomass’. The motion of water and air is influenced by the fictitious forces caused by the earth’s rotation, but also by temperature gradients imposed by the distribution of solar energy.

Also geothermal heat pumps with ground loops near the surface actually use solar energy deposited in summer and stored for winter – that’s why I think that ‘geothermal heat pumps’ is a bit of a misnomer.

Collector (heat exchanger) for brine-water heat pumps.

Within the first ~10 meters below the surface, temperature fluctuates throughout the year; at 10m the temperature remains about constant and equal to 10-15°C for the whole year.

Only at greater depths can the flow of ‘real’ geothermal energy be spotted: In the top layer of the earth’s crust the temperature rises about linearly, at about 3°C (3K) per 100m. The details depend on geological peculiarities; the gradient can be higher in active regions. This is the energy utilized by geothermal power plants delivering electricity and/or heat.

Geothermal gradient adapted from Boehler, R. (1996). Melting temperature of the Earth’s mantle and core: Earth’s thermal structure. Annual Review of Earth and Planetary Sciences, 24(1), 15–40. (Wikimedia, user Bkilli1). Geothermal power plants use boreholes a few kilometers deep.

This geothermal energy originates from radioactive decays and from the violent past of the primordial earth: when the kinetic energy of celestial objects colliding with each other turned into heat.

The flow of geothermal energy per area directed to the surface, associated with this gradient is about 65 mW/m2 on continents:

Global map of the flow of heat, in mW/m2, from Earth’s interior to the surface. Davies, J. H., & Davies, D. R. (2010). Earth’s surface heat flux. Solid Earth, 1(1), 5-24. (Wikimedia user Bkilli1)

Some comparisons:

• It is small compared to the energy from the sun: In middle Europe, the sun provides about 1.000 kWh per m² and year, thus 1.000.000 Wh / 8.760 h = 114 W/m² on average.
• It is also much lower than the rule-of-thumb power of ‘flat’ ground loop collectors – about 20 W/m².
• The total ‘cooling power’ of the earth is several $10^{10}$ kW: If the energy were not replenished by radioactive decay, the earth would lose a seemingly impressive $10^{14}$ kWh per year, yet this would result only in a temperature difference of ~$10^{-7}$ °C (This is just a back-of-the-envelope check of orders of magnitude, based on earth’s mass and surface area, see links at the bottom for detailed values).
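These comparisons can be reproduced in a few lines. The sketch below is a back-of-the-envelope check only: the surface area and mass of the earth are rounded, applying the continental value of 65 mW/m² to the whole surface is a simplification, and the average specific heat of 1000 J/(kg K) is my own rough assumption.

```python
HOURS_PER_YEAR = 8760.0

# Solar input in middle Europe: about 1000 kWh per m^2 and year (value from the text).
solar_w_per_m2 = 1000.0 * 1000.0 / HOURS_PER_YEAR

# Rounded constants; the specific heat is an assumed order-of-magnitude value.
EARTH_SURFACE_M2 = 5.1e14
EARTH_MASS_KG = 6.0e24
SPECIFIC_HEAT_J_PER_KG_K = 1000.0
GEOTHERMAL_FLUX_W_PER_M2 = 0.065  # continental value, applied to the whole surface

cooling_power_kw = GEOTHERMAL_FLUX_W_PER_M2 * EARTH_SURFACE_M2 / 1000.0
energy_per_year_kwh = cooling_power_kw * HOURS_PER_YEAR
energy_per_year_j = energy_per_year_kwh * 3.6e6  # 1 kWh = 3.6e6 J
delta_t_k = energy_per_year_j / (EARTH_MASS_KG * SPECIFIC_HEAT_J_PER_KG_K)

print(f"average solar power:  {solar_w_per_m2:.0f} W/m^2")
print(f"total cooling power:  {cooling_power_kw:.1e} kW")
print(f"energy lost per year: {energy_per_year_kwh:.1e} kWh")
print(f"temperature change:   {delta_t_k:.1e} K per year")
```

Only the orders of magnitude matter here; changing the assumed specific heat by a factor of two does not alter the conclusion.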

The constant temperature in 10m depth – the ‘neutral zone’ – is about the same as the average temperature of the earth (averaged over one year over the surface of the earth): About 14°C. I will show below that this is not a coincidence: The temperature right below the fluctuating temperature wave ‘driven’ by the sun has to be equal to the average value at the surface. It is misleading to attribute the 10°C in 10m depth to the ‘hot inner earth’ only.

In this post I am toying with theoretical calculations, but in order not to scare readers off too much I show the figures first, and add the derivation as an appendix. My goal is to compare these results with our measurements, to cross-check assumptions for the thermal properties of ground I use in numerical simulations of our heat pump system (which I need for modeling e.g. the expected maximum volume of ice).

1. The surface temperature varies periodically in a year, and I use maximum, minimum and average temperature from our measurements (corrected a bit for the mild last seasons). These are daily averages as I am not interested in the daily temperature changes between day and night.
2. A constant geothermal flow of 65 mW/m² is superimposed on that.
3. The slow transport of solar energy into ground is governed by a thermal property of ground, called the thermal diffusivity. It describes ‘how quickly’ a lump of heat deposited will spread; its unit is area per time. I use an assumption for this number based on values for soil in the literature.

I am determining the temperature as a function of depth and of time by solving the differential equation that governs heat conduction. This equation tells us how a spatial distribution of heat energy or ‘temperature field’ will slowly evolve with time, given the temperature at the boundary of the interesting part of space in question – in this case the surface of the earth. Fortunately, the yearly oscillation of air temperature is about the simplest boundary condition one could have, so you can calculate the solution analytically.
Another nice feature of the underlying equation is that it allows for adding different solutions: I can just add the effect of the real geothermal flow of energy to the fluctuations caused by solar energy.

The result is a  ‘damped temperature wave’; the temperature varies periodically with time and space: The spatial maximum of temperature moves from the surface to a point below and back: In summer (beginning of August) the measured temperature is maximum at the surface, but in autumn the maximum is found some meters below – heat flows back from ground to the surface then:

Calculated ground temperature, based on measurements of the yearly variation of the temperature at the surface and an assumption of the thermal properties of ground. Calculated for typical middle European maximum and minimum temperatures.

This figure is in line with the images shown in every textbook of geothermal energy. Since the wave is symmetrical about the yearly average, the temperature in about 10m depth, when the wave has ‘run out’, has to be equal to the yearly average at the surface. The wave does not have much chance to oscillate as it is damped down in the middle of the first period, so the length of decay is much shorter than the wavelength.

The geothermal flow just adds a small distortion, an asymmetry of the ‘wave’. It is seen only when switching to a larger scale.

Same data as in the previous plot, just extended to greater depths. The geothermal gradient is about 3°C/100m, the detailed value being calculated from the value of thermal conductivity also used to model the fluctuations.

Now varying time instead of space: The greater the depth, the more time it takes for ground to reach maximum temperature. The lag of the maximum temperature is proportional to depth: For 1m difference in depth it is less than a month.

Temporal change of ground temperature at different depths. The wave is damped, but otherwise simply ‘moving into the earth’ at a constant speed.

Measuring the time difference between the maxima for different depths lets us determine the ‘speed of propagation’ of this wave – its wavelength divided by its period. Actually, the speed depends in a simple way on the thermal diffusivity and the period as I show below.

But this gives me an opportunity to cross-check my assumption for diffusivity: I  need to compare the calculations with the experimentally determined delay of the maximum. We measure ground temperature at different depths, below our ice/water tank but also in undisturbed ground:

Temperature measured with Pt1000 sensors – comparing ground temperature at different depths, and the related ‘lag’. Indicated by vertical dotted lines, the approximate positions of maxima and minima. The lag is about 10-15 days.

The lag derived from the figure is of the same order as the lag derived from the calculation and thus in accordance with my assumed thermal diffusivity: In 70cm depth, the temperature peak is delayed by about two weeks.

___________________________________________________

Appendix: Calculations and background.

I am trying to give an outline of my solution, plus some ‘motivation’ of where the differential equation comes from.

Heat transfer is governed by the same type of equation that describes also the diffusion of gas molecules or similar phenomena. Something lumped together in space slowly peters out, spatial irregularities are flattened. Or: The temporal change – the first derivative with respect to time – is ‘driven’ by a spatial curvature, the second derivative with respect to space.

$\frac{\partial T}{\partial t} = D\frac{\partial^{2} T}{\partial x^{2}}$

This is the heat transfer equation for a region of space that does not have any sources or sinks of heat – places where heat energy would be created from ‘nothing’ or vanish – like an underground nuclear reaction (or freezing of ice). All we know about the material is covered by the constant D, called thermal diffusivity.

The equation is based on local conservation of energy: The energy stored in a small volume of space can only change if something is created or removed within that volume (‘sources’) or if it flows out of the volume through its surface. This is a very general principle applicable to almost anything in physics. Without sources or sinks, this translates to:

$\frac{\partial [energy\,density]}{\partial t} = -\frac{\partial \overrightarrow{[energy\,flow]}}{\partial x}$

The energy density [J/m³] stored in a volume of material by heating it up from some start temperature is proportional to temperature, proportionality factors being the mass density ρ [kg/m³] and the specific heat cp [J/(kg K)] of this material. The energy flow per area [W/m²] is typically nearly proportional to the temperature gradient, the constant being heat conductivity κ [W/(m K)]. The gradient is the first-order derivative in space, so inserting all this we end up with the second derivative in space.

All three characteristic constants of the heat conducting material can be combined into one – the diffusivity mentioned before:

$D = \frac{\kappa }{\varrho \, c_{p} }$

So changes in more than one of these parameters can compensate for each other; for example low density can compensate for low conductivity. I hinted at this when writing about heat conduction in our gigantic ice cube: Ice has a higher conductivity and a lower specific heat than water, thus a much higher diffusivity.
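The comparison of ice and water can be made concrete with $D = \kappa / (\varrho \, c_{p})$. The material values below are approximate textbook numbers near 0°C that I plug in for illustration; they are assumptions, not the values used in my simulations, and the function name is made up.

```python
def diffusivity(conductivity_w_per_m_k: float,
                density_kg_per_m3: float,
                specific_heat_j_per_kg_k: float) -> float:
    """Thermal diffusivity D = kappa / (rho * c_p) in m^2/s."""
    return conductivity_w_per_m_k / (density_kg_per_m3 * specific_heat_j_per_kg_k)


# Approximate values near 0 degrees C (assumed for illustration).
d_water = diffusivity(0.56, 1000.0, 4200.0)
d_ice = diffusivity(2.2, 917.0, 2100.0)

print(f"water: {d_water:.2e} m^2/s")
print(f"ice:   {d_ice:.2e} m^2/s")
print(f"ratio ice/water: {d_ice / d_water:.1f}")
```

With these rough inputs the diffusivity of ice comes out several times higher than that of water, because conductivity goes up while both density and specific heat go down.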

I am considering a vast area of ground irradiated by the sun, so heat conduction will be one-dimensional and temperature changes only along the axis perpendicular to the surface. At the surface the temperature varies periodically throughout the year. t=0 is to be associated with beginning of August – our experimentally determined maximum – and the minimum is observed at the beginning of February.

This assumption is just the boundary condition needed to solve this partial differential equation. The real ‘wavy’ variation of temperature is close to a sine wave, which also makes the calculation very easy. As a physicist I have been trained to use a complex exponential function rather than sine or cosine, keeping in mind that only the real part describes the real world. This is a legitimate choice, thanks to the linearity of the differential equation:

$T(t,x=0) = T_{0} e^{i\omega t}$

with ω being the angular frequency corresponding to one year (2π/ω = 1 year).

It oscillates about 0 with amplitude T0, that is half of the difference between the yearly maximum and minimum. But after all, the definition of 0°C is arbitrary and – again thanks to linearity – we can use this solution and just add a constant function to shift it to the desired value. A constant changes neither with space nor with time and thus solves the equation trivially.

If you have more complicated sources or sinks, you would represent those mathematically as a composition of simpler ‘sources’, for each of which you can find a quick solution, and then add up the solutions, again thanks to linearity. We are lucky that our boundary condition consists just of one such simple harmonic wave, and we guess at the solution for all of space, adding a spatial wave to the temporal one.

So this is the ansatz – an educated guess for the function that we hope solves the differential equation:

$T(t,x) = T_{0} e^{i\omega t + \beta x}$

It’s the temperature at the surface, multiplied by an exponential function. x is positive and increasing with depth. β is some number we don’t know yet. For x=0 it’s equal to the boundary temperature. If β were a real, negative number, temperature would decrease exponentially with depth.

The ansatz is inserted into the heat equation, and every differentiation with respect to either space or time just yields a factor; then the exponential function can be cancelled from the heat transfer equation. We end up with a constraint for the factor β:

$i\omega = D\beta^{2}$

Taking the square root of the complex number, there would be two solutions:

$\beta=\pm \sqrt{\frac{\omega}{2D}}(1+i)$

β has a real and an imaginary part: Using it in T(t,x) the real part corresponds to exponential ‘decay’ while the imaginary part is an oscillation (similar to the temporal one).

Both real and imaginary parts of this function solve the equation (as any linear combination does). So we take the real part and insert β – only the solution for β with negative sign makes sense as the other one would describe temperature increasing to infinity.

$T(t,x) = Re \left(T_{0}e^{i\omega t} e^{-\sqrt{\frac{\omega}{2D}}(1+i)x}\right)$
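If you do not trust the algebra, the ansatz can also be checked numerically. The sketch below assumes the soil diffusivity and yearly period I use for the plots, sets T0 to 1, and compares a finite-difference estimate of the time derivative with D times a finite-difference estimate of the second spatial derivative. The step sizes and the test point are arbitrary choices of mine.

```python
import cmath
import math

D = 0.002631                  # m^2/h, assumed soil diffusivity
TAU = 8760.0                  # h, one year
OMEGA = 2 * math.pi / TAU     # angular frequency in 1/h

# Decaying root of the constraint i*omega = D*beta^2.
beta = -math.sqrt(OMEGA / (2 * D)) * (1 + 1j)


def temperature(t_h: float, x_m: float, t0: float = 1.0) -> complex:
    """Complex ansatz T0 * exp(i*omega*t + beta*x)."""
    return t0 * cmath.exp(1j * OMEGA * t_h + beta * x_m)


# 1) The algebraic constraint should hold up to rounding errors.
constraint_error = abs(1j * OMEGA - D * beta ** 2)

# 2) Finite-difference check of dT/dt = D * d2T/dx2 at an arbitrary point.
t, x, ht, hx = 1000.0, 1.5, 1e-2, 1e-3
dT_dt = (temperature(t + ht, x) - temperature(t - ht, x)) / (2 * ht)
d2T_dx2 = (temperature(t, x + hx) - 2 * temperature(t, x)
           + temperature(t, x - hx)) / hx ** 2
residual = abs(dT_dt - D * d2T_dx2)

print(f"constraint error: {constraint_error:.1e}")
print(f"heat equation residual: {residual:.1e} (|dT/dt| = {abs(dT_dt):.1e})")
```

The residual should be many orders of magnitude smaller than the derivative itself; taking the real part afterwards does not change this, since the equation is linear with real coefficients.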

The thing in the exponent has to be dimensionless, so we can express the combinations of constants as characteristic lengths, and insert the definition of ω = 2π/τ:

$T(t,x) = T_{0} e^{-\frac{x}{l}}\cos\left(2\pi\left(\frac {t} {\tau} -\frac{x}{\lambda }\right)\right)$

The two lengths are:

• the wavelength of the oscillation $\lambda = \sqrt{4\pi D\tau }$
• and the attenuation length  $l = \frac{\lambda}{2\pi} = \sqrt{\frac{D\tau}{\pi}}$

So the ratio between those lengths does not depend on the properties of the material, and the attenuation length is always shorter than the wavelength by a factor of 2π. That’s why there is hardly one period visible in the plots.

The plots have been created with these parameters:

• Heat conductivity κ = 0,0019 kW/mK
• Density ρ = 2000 kg/m3
• Specific heat cp = 1,3 kJ/kgK
• tau = 1 year = 8760 hours

Thus:

• Diffusivity D = 0,002631 m2/h
• Wavelength λ = 17 m
• Attenuation length l = 2,7 m
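These numbers follow directly from the formulas for D, λ and l. A short sketch to reproduce them; the only thing to watch is the unit conversion, as kW divided by kJ gives 1/s and has to be converted to 1/h:

```python
import math

KAPPA = 0.0019   # heat conductivity in kW/(m K)
RHO = 2000.0     # density in kg/m^3
C_P = 1.3        # specific heat in kJ/(kg K)
TAU = 8760.0     # period in hours

# kappa / (rho * c_p) is in m^2/s here; times 3600 gives m^2/h.
D = KAPPA / (RHO * C_P) * 3600.0

wavelength = math.sqrt(4 * math.pi * D * TAU)
attenuation_length = wavelength / (2 * math.pi)

print(f"D = {D:.6f} m^2/h")
print(f"wavelength = {wavelength:.1f} m")
print(f"attenuation length = {attenuation_length:.1f} m")
```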

The wave (any wave) propagates with a speed v equivalent to wavelength over period: v = λ / tau.

$v = \frac{\lambda}{\tau} = \frac{\sqrt{4\pi D\tau}}{\tau} = \sqrt{\frac{4\pi D}{\tau}}$

The speed depends only on the period and the diffusivity.

The maximum of the temperature as observed in a certain depth x is delayed by a time equal to x over v. Cross-checking our measurements of the temperature T(30cm) and T(100cm), I would thus expect a delay of 0,7m / (17m/8760h) = 360 h = 15 days, which is approximately in agreement with experiments (taking orders of magnitude). Note one thing though: Only the square root of D is needed in calculations, so any error I make in assumptions for D will be generously reduced.
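The expected lag, and its weak dependence on D, can be sketched as follows. The function name and the ‘what if D were twice as large’ scenario are just illustrations; D and τ are the values from the parameter list above.

```python
import math

D = 0.002631   # m^2/h, diffusivity used for the plots
TAU = 8760.0   # h, one year


def lag_hours(depth_difference_m: float, diffusivity_m2_per_h: float = D) -> float:
    """Delay of the temperature maximum: depth difference divided by wave speed."""
    speed_m_per_h = math.sqrt(4 * math.pi * diffusivity_m2_per_h / TAU)
    return depth_difference_m / speed_m_per_h


lag = lag_hours(0.7)
lag_if_d_doubled = lag_hours(0.7, 2 * D)

print(f"lag for 0.7 m: {lag:.0f} h = {lag / 24:.1f} days")
print(f"with twice the diffusivity: {lag_if_d_doubled / 24:.1f} days")
```

Doubling D shortens the lag only by a factor of $\sqrt{2}$, which is why a rough assumption for the diffusivity is good enough for this cross-check.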

I have not yet included the geothermal linear temperature gradient in the calculation. Again we are grateful for linearity: A linear – zero-curvature – temperature profile that does not change with time is also a trivial solution of the equation that can be added to our special exponential solution.

So the full solution shown in the plot is the sum of:

• The damped oscillation (oscillating about 0°C)
• Plus a constant denoting the true yearly average temperature
• Plus a linear increase with depth due to the geothermal gradient, the linear correction being 0 at the surface to meet the boundary condition.
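Put together, the full profile could be coded like this. It is a sketch under assumptions: the yearly average of 10°C and the amplitude of 10°C are placeholders rather than our measured values, the gradient of 0,03°C per meter corresponds to the roughly 3°C per 100m quoted above with temperature rising with depth, and the function name is made up.

```python
import math

D = 0.002631                                  # m^2/h
TAU = 8760.0                                  # h, one year
WAVELENGTH = math.sqrt(4 * math.pi * D * TAU) # m
ATTENUATION = WAVELENGTH / (2 * math.pi)      # m


def ground_temperature(t_h: float, x_m: float,
                       t_avg: float = 10.0,      # assumed yearly average in deg C
                       amplitude: float = 10.0,  # assumed amplitude in deg C
                       gradient: float = 0.03) -> float:
    """Yearly average + damped wave + geothermal gradient; t=0 is the surface maximum."""
    wave = (amplitude * math.exp(-x_m / ATTENUATION)
            * math.cos(2 * math.pi * (t_h / TAU - x_m / WAVELENGTH)))
    return t_avg + wave + gradient * x_m


for depth in (0.0, 1.0, 3.0, 10.0, 100.0):
    print(f"t=0, depth {depth:5.1f} m: {ground_temperature(0.0, depth):.2f} deg C")
```

At the surface the linear term vanishes and the function reduces to the boundary condition; at great depth the wave has died out and only the average plus the gradient remain.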

If there were no geothermal gradient (thus no flow from beneath) the temperature at infinite distance (practically in 20m) would be the same as the average temperature of the surface.

Daily changes could be taken into account by adding yet another solution that satisfies an amendment to the boundary condition: Daily fluctuations of temperatures would be superimposed on the yearly oscillations. The derivation would be exactly the same, just the period is different by a factor of 365. Since the characteristic lengths go with the square root of the period, yearly and daily lengths differ only by a factor of about 19.

________________________________________

Intro to geothermal energy:

Geothermal gradient and energy of the earth:

These data for bore holes using one scale show the gradient plus the disturbed surface region, with not much of a neutral zone in between.

Theory of Heat Conduction

Heat Transfer Equation on Wikipedia
Textbook on Heat Conduction, available on archive.org in different formats.

I have followed the derivation of temperature waves given in my favorite German physics book on Thermodynamics and Statistics, by my late theoretical physics professor Wilhelm Macke. This page quotes the classic on heat conduction, by Carslaw and Jaeger, plus the main results for the characteristic lengths.

# How to Evaluate a Heat Pump’s Performance?

The straightforward way is to read off two energy values at the end of a period – day, month, or season:

1. The electrical energy used by the heat pump
2. and the heating energy delivered.

The Seasonal Performance Factor (SPF) is the ratio of these – the factor by which the input electrical energy is ‘multiplied’ to yield heating energy. The difference between these two energies is supplied by the heat source – the underground water tank / ‘cistern’ plus solar collector in our setup.

But there might not be a separate power meter just for the heat pump’s compressor. Fortunately, performance factors can also be evaluated from vendors’ datasheets and measured brine / heating water temperatures:

Datasheets provide the Coefficient of Performance (COP) – the ‘instantaneous’ ratio of heating power and electrical power. The COP decreases with increasing temperature of the heating water, and with decreasing temperature of the source – the brine circuit immersed in the cold ice / water tank. E.g., when heating the water in floor loops to 35°C the COP is a bit greater than 4 if the water in the underground tank is frozen (0°C). The textbook formula based on Carnot’s ideal process for thermodynamic machines gives 8,8 for 0°C / 35°C; realistic COPs are typically lower by about a factor of 2.
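The Carnot value quoted above is easy to reproduce. The sketch below uses the textbook expression for an ideal heat pump, hot temperature divided by the temperature difference, with both temperatures in kelvin. The function name is made up for illustration.

```python
def carnot_cop(source_celsius, heating_water_celsius):
    """Ideal (Carnot) COP of a heat pump between two temperatures given in °C."""
    t_cold = source_celsius + 273.15
    t_hot = heating_water_celsius + 273.15
    return t_hot / (t_hot - t_cold)


# Frozen tank (0°C) as source, floor loops at 35°C as sink.
ideal_cop = carnot_cop(0.0, 35.0)
print(f"Carnot COP for 0°C / 35°C: {ideal_cop:.1f}")
```

Dividing this ideal value by the rule-of-thumb factor of 2 lands close to the datasheet COP of a bit more than 4.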

COPs, electrical power (input) and heating power (output) of a ‘7 kW’ brine / water heat pump. Temperatures in the legend are heating water infeed temperatures – 35°C as required by floor loops and 50°C for hot water heating.

If you measure the temperature of the brine and the temperature of the heating water every few minutes, you can determine the COP from these diagrams and take averages for days, months, or seasons.

But should the Performance Factor (PF) and the average COP actually be the same?

Average power is total energy divided by time, so (with bars denoting averages):

$\text{Performance Factor } = \frac {\text{Total Heating Energy } \mathnormal{E_{H}}} {\text{Total Electrical Energy } \mathnormal{E_{E}}} = \frac {\text{Average Heating Power } \mathnormal{\bar{P}_{H}}} {\text{Average Electrical Power }\mathnormal{\bar{P}_{E}} }$

On the other hand the average COP is calculated from data taken at many different times. At any point of time t,

$\text{Coefficient of Performance}(t) = \frac {\text{Heating Power } P_{H}(t)} {\text{Electrical Power } P_{E}(t)}$

Having measured the COP at N times, the average COP is thus:

$\overline{COP} = \frac {1}{N} \sum_{i=1}^{N} \frac{P_{H}(t_i)}{P_{E}(t_i)} = \overline{\frac{P_{H}(t)}{P_{E}(t)}}$

$\overline{\frac{P_{H}(t)}{P_{E}(t)}}$ is not necessarily equal to $\frac{\overline{P_{H}}}{\overline{P_{E}}}$

When is the average of ratios equal to the ratio of the averages?

If electrical power and heating power fluctuated wildly we would be in trouble. Consider this hypothetical scenario of odd non-physical power readings:

• PH = 10, PE = 1
• PH = 2, PE = 20

The ratio of averages is: (10 + 2) / (1 + 20) = 12 / 21 = 0,57
The average of ratios is: (10/1 + 2/20) / 2 = (10 + 0,1) / 2 = 5,05
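The same comparison as a small Python sketch, assuming the readings are taken at equal time intervals. The first pair of lists holds the odd readings from above. The second pair is a made-up example of nearly constant powers, added only to show how the two indicators converge when the powers barely change.

```python
def ratio_of_averages(heating, electrical):
    """Performance Factor: total heating energy over total electrical energy."""
    return sum(heating) / sum(electrical)


def average_of_ratios(heating, electrical):
    """Average COP: mean of the instantaneous power ratios."""
    ratios = [h / e for h, e in zip(heating, electrical)]
    return sum(ratios) / len(ratios)


# The wildly fluctuating, non-physical readings from the text.
wild_heating = [10.0, 2.0]
wild_electrical = [1.0, 20.0]
wild_pf = ratio_of_averages(wild_heating, wild_electrical)    # 12 / 21
wild_cop = average_of_ratios(wild_heating, wild_electrical)   # (10 + 0.1) / 2

# Hypothetical, nearly constant powers in kW (illustrative values only).
steady_heating = [7.0, 6.5]
steady_electrical = [1.6, 1.7]
steady_pf = ratio_of_averages(steady_heating, steady_electrical)
steady_cop = average_of_ratios(steady_heating, steady_electrical)

print(f"wild:   PF = {wild_pf:.2f}, average COP = {wild_cop:.2f}")
print(f"steady: PF = {steady_pf:.2f}, average COP = {steady_cop:.2f}")
```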

Quite a difference. Good that typical powers look like this:

Powers measured on 2015-02-20. Two space heating periods with a COP between 4 and 5, and one heating hot water cycle: the COP gradually decreases as heating water temperature increases.

Powers change only by a fraction of their absolute values – the heat pump is basically ON or OFF. When these data were taken in February, average daily ambient temperature was between 0°C and 5°C, and per day about 75 kWh were used for space heating and hot water. Since heat pump output is constant, daily run times change with heating demands.

Results for the red hot tap water heating cycle:

• Performance Factor calculated from energies: 3,68
• Average COP: 3,76.

I wanted to know how much powers are allowed to change without invalidating the average COP method:

Electrical power and heating power rise / fall approximately linearly, so they can be described by two parameters: the initial power when the heat pump is turned on, and the slope of the curve, i.e. the relative change of power within one cycle. The Performance Factor is determined from energies, the areas of the trapezoids under the curves. For calculating the average COP the ratio of powers needs to be integrated, which results in a not so nice integral.

The important thing is that both average COP and PF are proportional to the ratio of initial powers, so their relative match depends only on the slopes of the heating power and electrical power curves. As long as the relative increase / decrease of those powers is significantly smaller than 1, the difference in performance indicators is just a few percent. In the example curve, the heating power decreases by 15%, while electrical power increases by 52% – performance indicators would differ by less than 2%. This small difference is not too sensitive to changes in slopes.
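A rough numerical sketch of this comparison, under the assumption that both powers are perfectly linear ramps over one cycle of unit length. The relative changes of -0.15 and +0.52 are the ones quoted above. Everything is expressed in units of the ratio of initial powers, so no absolute power values are needed, and the integral is simply approximated by a midpoint sum.

```python
def performance_factor(rel_change_heat, rel_change_el):
    """PF in units of the initial power ratio: ratio of the trapezoid areas."""
    return (1 + rel_change_heat / 2) / (1 + rel_change_el / 2)


def average_cop(rel_change_heat, rel_change_el, samples=10_000):
    """Time average of the instantaneous ratio, sampled at slice midpoints."""
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) / samples  # fraction of the cycle elapsed
        total += (1 + rel_change_heat * x) / (1 + rel_change_el * x)
    return total / samples


pf = performance_factor(-0.15, 0.52)
cop = average_cop(-0.15, 0.52)
relative_gap = cop / pf - 1

print(f"PF = {pf:.4f}, average COP = {cop:.4f}, gap = {relative_gap:.1%}")
```

With these idealized ramps the gap comes out at roughly 2 percent, the same size as the gap between 3,68 and 3,76 measured for the hot water cycle.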

All is well.

Happily harvesting ambient energy.

________________

Detailed monthly and seasonal performance data are given in this document.

# Gödel, Escher, Bach, and Strange Loops: Nostalgia and Random Thoughts

I am curious – who read the book, too? Did you like it?

I read it nearly 30 years ago and I would also tag it as one of the most influential books I read as a teenager.

[This might grow into a meandering and lengthy post with different (meta-)levels – given the subject of the post I think this is OK.]

A modern variant of the ambigram presented on GEB’s cover. The shadows created by this cube represent the QR codes of Wikipedia articles about Gödel, Escher, Bach, respectively.

In 1995 author Douglas Hofstadter said the following in an interview with Wired – and this also resembles similar statements in his book I am a Strange Loop, published in 2007. He voices frustration with the effect of GEB on readers and on his reputation – although he won a Pulitzer Prize for his unusual debut book (published in 1979).

From the Wired interview:

What Gödel, Escher, Bach was really about – and I thought I said it over and over again – was the word I. Consciousness. It was about how thinking emerges from well-hidden mechanisms, way down, that we hardly understand. How not just thinking, but our sense of self and our awareness of consciousness, sets us apart from other complicated things. How understanding self-reference could help explain consciousness so that someday we might recognize it inside very complicated structures such as computing machinery. I was trying to understand what makes for a self, and what makes for a soul. What makes consciousness come out of mere electrons coursing through wires.

There is nothing metaphysical in the way the term soul is used here. Having re-read GEB now I marvel at how well Hofstadter was able to provide an interpretation devoid of metaphysics – yet elegant and even poetic. Hofstadter quotes Zen koans but he does not force “spirituality” upon the subject – he calls Zen intellectual quicksand.

GEB is about the machinery of mind without catering to the AI enthusiasm shared by transhumanists. It has once been called a Bible of AI but maybe today it would not be considered optimistic enough in the nerdy sense. It is not about how new technology might exploit our (alleged) understanding of the mind – it is only about said understanding.

When I read the book nearly 30 years ago I enjoyed it for two main reasons: the allusions and references to language, metaphors and translation – especially as implemented in the whimsical Lewis-Carroll-style dialogues of Achilles, Mr. Tortoise and friends…

And yet many people treated the book as just some sort of big interdisciplinary romp whose point was simply to have fun. In fact, the fun was merely icing on the cake.

… and, above all, that popular yet mathy introduction to Gödel’s Incompleteness Theorem(s). Gödel’s theorem is presented as the analogue of oxymoronic statements such as I am a liar or This statement is false – translated to math. More precisely, in sufficiently powerful formal systems there are true statements about integers that nevertheless cannot be proven within those systems.

Originally, the book was purely about the way the proof of Gödel’s theorem kept cropping up in the middle of a fortress – Principia Mathematica by Bertrand Russell and Alfred North Whitehead – that was designed to keep it out. I thought, Here’s a structure that attempts to keep out self-knowledge, but when things get sufficiently complex and sufficiently tangled, all of a sudden – whammo! – it’s got self-representation in it. That to me was the trick that underlies consciousness.

I had considered Gödel the main part of the trio and I think I was sort of “right” due to this:

So, at first, there were no dialogs, no jokes, no wordplay, and no references to Escher or Bach. But as I typed the manuscript up in ’74, I decided it was written in an immature style. I decided to insert the dialogs and the Escher so that the playfulness became a kind of a secondary – but extremely important – part of the book. Many people focused on those things and treated the book as a big game-playing thing.

I am afraid I did. I read the chapters dealing with a gradual introduction of the theorem more often than the parts about consciousness. Blending something abstract – that only hardcore nerds might appreciate – with wordplay, Escher drawings and musings on music theory (pun not intended but obviously this is contagious) was a masterpiece of science writing. It seems this has widened the audience, but not in the intended way.

But isn’t that the fate of nearly every really well-written science book transcending the boundaries of disciplines? Is there any philosopher-physicist writing about quantum mechanics who has not been quoted out of context by those who prefer to cook up metaphysical / emotionally appealing statements using scientific-sounding phrases as ingredients?

Anyway, focusing on the theorem: The gist of Hofstadter’s argument is that inherent contradictions were introduced directly into the very epitome of pristine rationality that Russell and Whitehead attempted to create. So we should not be surprised to find self-reference and emergent symbols in other systems built from boring little machine-like components. In a dialogue central to the idea of GEB his main protagonists discuss holism and reductionism with a conscious ant hill – made up of dumb ants.

The meticulously expounded version of Gödel’s theorem is the heart and the pinnacle of the storyline of GEB from my point of view, and it is interesting to compare Hofstadter’s approach to the crisp explanation Scott Aaronson gives in Quantum Computing since Democritus. Scott Aaronson calls Gödel’s way of having formal statements talk about themselves an elaborate hack to program without programming. Aaronson makes the very convincing case that you could avoid all that talk about grand difficult math and numbering statements by starting from the notion of a computer, a Universal Turing machine.

Model of a Turing machine (Wikimedia, http://aturingmachine.com): an idealized computer working on a tape. It can move the read head forward or back, (over-)write symbols on the tape or halt. Its actions are determined by the instructions on the tape and its internal state. Given a program you cannot decide, in general, if it will ever halt.

Gödel’s Proof then turns into a triviality, as a formal system of the kind envisaged by Russell would be equivalent to having found a solution to the halting problem. The philosophical implications are preserved but it sounds more down-to-earth and it takes about two orders of magnitude fewer pages.
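To make the notion of a machine working on a tape a bit more tangible, here is a toy simulator in Python. It is only an illustrative sketch, not Aaronson’s or Hofstadter’s construction: the rule format, the state names and the two example machines are invented for this purpose. Note what the step budget does: the simulator can report that a machine halted, but otherwise it can only say ‘unknown’, which is the practical face of the halting problem.

```python
from collections import defaultdict

BLANK = "_"


def run(rules, tape, state="start", max_steps=1000):
    """Run a one-tape machine; give up without a verdict after max_steps."""
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            return "unknown", None
        # Each rule maps (state, symbol read) to (symbol to write, move, next state).
        write, move, state = rules[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    contents = "".join(cells[i] for i in sorted(cells))
    return "halted", contents.strip(BLANK)


# Hypothetical machine 1: invert every bit, halt at the first blank cell.
flip_bits = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", BLANK): (BLANK, "R", "halt"),
}

# Hypothetical machine 2: walk to the right over blank cells forever.
runs_forever = {
    ("start", BLANK): (BLANK, "R", "start"),
}

flipped = run(flip_bits, "0110")
no_verdict = run(runs_forever, "")
print(flipped, no_verdict)
```

Raising `max_steps` never turns ‘unknown’ into a proof that a machine runs forever; it only postpones giving up.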

As Hofstadter says implicitly and explicitly: Metaphors and context are essential. Starting from a proof involving a program that is fed its own code probably avoids unwanted metaphysical-mystical connotations – compared to cooking up a scheme for turning statements of propositional logic into numbers, framed with Zen Buddhism, molecular biology, and art. But no matter in which way I might prefer to think about Gödel’s proof I guess I missed the mark:

(From the Wired interview – continued)

I had been aiming to have the book reach philosophers, people who thought about the mind and consciousness, and a small number actually saw what I was getting at, but most people just saw the glitter. At the time, I felt I’d lost a great deal by writing a book like that so early in my career, because I was no longer taken seriously by anybody.

If you did not get the message either, you are in good company. David Deutsch says in his review of I am a Strange Loop:

Hofstadter … expresses disappointment that his 1979 masterpiece Gödel, Escher, Bach (one of my favourite books) was not recognized as explaining the true nature of consciousness, or “I”-ness. I have to confess that it never occurred to me that it was intended to do so. I thought it merely explained the problem, highlighting stark flaws in common-sense ideas about minds. It also surveyed the infinite depth and meaning that can exist in “mere” computer programs. One could only emerge from the book (or so I thought) concluding that brains must in essence be computers, and consciousness an attribute of certain programs – and that discovering exactly what attribute is an urgent problem for philosophy and computer science. Hofstadter agrees with the first two conclusions but not the third; he considers that problem solved.

I can’t comment on the problem of consciousness being a yet-to-be-clarified attribute / by-product of computing but I find the loopy part about brains that must in essence be computers convincing.

Incidentally, I have now read three different refutations of the so-called Chinese Room argument against strong AI – by Hofstadter, Aaronson and Ray Kurzweil. A human being in a hypothetical room pretends to exchange messages (on paper) in Chinese with interrogators. They might believe the guy speaks Chinese though he only looks up rules in a book and mindlessly shuffles papers.

But how could you not associate the whole room, the rule book, the (high-speed!) paper-shuffling process with what goes on in the system of the brain’s neurons? The person does not speak Chinese but “speaking Chinese” is an emergent phenomenon of the whole setup. Mental images invoked by “rule book” and “paper” are called intuition pumps by Hofstadter (a term coined by his friend Daniel Dennett) – examples picked deliberately to invoke that sudden “self-evident” insight along the lines of: Of course the human mind does not follow a mere rulebook!

[Pushing to the level of self-referential navel-gazing now]

Re-reading the blurb of my old copy of the book I am able to connect some dots: I had forgotten that Hofstadter actually has a PhD in physics – theoretical condensed matter physics – and not in computer science or cognitive science. So the fact that a PhD in physics could prepare you for a career / life of making connections between all kinds of hard sciences, arts and literature was certainly something that might have shaped my worldview. All the author heroes who wrote the books that influenced me the most as a teenager were scientist-philosophers, such as Albert Einstein and Viktor Frankl.

If I go on like this, talking about the science books and classics I read as a child I might get the same feedback as Hofstadter (see amazon.com reviews for example): This is elitist and only about showing off his education etc.

I am not sure what Hofstadter should have done to avoid this. Not writing the books at all? Focusing on a narrower niche in order to comply with the common belief that talents in seemingly diverse fields have to be mutually exclusive?

Usually a healthy dose of self-irony mitigates the smarty effect. Throw in some jokes about how your stereotypical absent-mindedness prevents you from exchanging that clichéd light bulb. But Hofstadter’s audience is rather diverse – so zooming in on the right kind of humor could be tricky.

[Pop]

And now I do what is explained with such virtuosity in GEB – having pushed and popped through various meta-levels I will not resolve the tension and return to the tonic of the story … a music pun, pathetically used out of context.

[Coda]

You might wonder why I did not include any Escher drawings. They are all still copyrighted since fewer than 70 years have passed since Escher’s death. But there are some interesting DIY projects on YouTube, bringing to life Escher’s structures – such as this one:

_____________________________

Further reading: The Man Who Would Teach Machines to Think (The Atlantic)

Douglas Hofstadter, the Pulitzer Prize–winning author of Gödel, Escher, Bach, thinks we’ve lost sight of what artificial intelligence really means. His stubborn quest to replicate the human mind.

Scott Aaronson’s website, blog and papers – a treasure trove! His book is not an easy read and probably unlike every so-called science book you have ever read. It has been created from lecture notes. His tone is conversational and the book is incredibly witty – but nonetheless it is quite compressed information covering more than one course in math, quantum physics and computer science. And yet – this is exactly the kind of science book I want to read when trying to make myself familiar with a new field. One “warning”: it is about theory, not about how to build a quantum computer. Thanks to wavewatching.net for the pointer.

# In Praise of Textbooks with Tons of Formulas (or: The Joy of Firefighting)

I know. I am repeating myself.

Maurice Barry has not only recommended Kahneman’s Thinking, Fast and Slow to me, but he also runs an interesting series of posts on his eLearning blog.

These got mixed and entangled in my mind, and I cannot help but return to that pet topic of mine. First, some statistically irrelevant facts of my personal observations – probably an example of narrative fallacy or mistaking correlation for causation:

As you know I had planned to reconnect to my roots as a physicist for a long time despite working crazy schedules as a so-called corporate knowledge worker. Besides making the domain subversiv.at mine and populating it with content similar to the weirdest in this blog I invented my personal therapy to deflect menacing burn-out: I started reading, or better, working with my old physics textbooks. Due to time constraints I sometimes had to do this very early in the morning – and I am not a lark. I have read three books on sleep research recently – I know that both my sleep duration and my midsleep are above average and that I lived in a severely sleep-deprived state most of my adult life.

Anyway, the point was: Physics textbooks gave me some rehash of things I had forgotten and prepared me to e.g. work with the heat transfer equation again. But what was more important was: These books transformed my mind in unexpected ways. Neither entertaining science-is-cool pop-sci books nor philosophical / psychological books about life, the universe and everything could do this for me at that level. (For the record: I tried these, too, and I am not too shy to admit I picked some self-help books also. Dale Carnegie, no less.)

There were at least two positive effects – I try to describe them in my armchair psychologist’s language. Better interpretations welcome!

Concentrating and abstract reasoning seem to be effective in stopping or overruling the internal over-thinking machine that runs in circles if you feel trapped in your life or career. Probably people like me try to over-analyze what has to be decided intuitively anyway. Keeping the thinking engine busy lets the intuitive part do its work. Whatever it was – it was pleasant, and despite the additional strain on sleep and schedule it left me more energetic, more optimistic, and above all more motivated and passionate about that non-physics work.

I also found that my work-related results – the deliverables as we say – improved. I have always been the utmost perfectionist and my ability to create extensive documentation in parallel to doing the equivalent of cardiac surgery to IT systems is legendary (so she says in her modest manner). Nevertheless, plowing through tensor calculus and field equations helps to hone these skills even more. For those who aren’t familiar with that biotope: The mantra of other Clint-Eastwood-like firefighters is rather: Real experts don’t provide documentation!

I would be lying if I described troubleshooting issues with digital certificates as closely related to theoretical physics. You can make some remote connections between skills that are sort of related – cryptography is math after all – but I am not operating at that deep mathematical level most of the time. I rather believe that anything rigorous and mathy puts your mind – or better, its analytical subsystem – in an advanced state. Advanced refers to being better prepared to tackle a specific class of problems. The caveat is that you lose this ability if you stop reading textbooks at 4:00 AM.

Using Kahneman’s terminology (mentioned briefly in my previous post) I consider mathy science the ultimate training for system 2 – your typically slow rational decision making engine. It takes hard work and dedication at the beginning to make system 2 work effortlessly in some domains. In my very first lecture at university the math professor stated that mathematics will purge and accelerate your brain – and right he was.

Hence I am so skeptical about joyful learning and using that science-is-cool-look-at-that-great-geeky-video-of-blackholes-and-curved-space approach. There is no simple and easy shortcut and you absolutely, positively have to love the so-called tedious work you need to put in. You are rewarded later with that grand view from the top of the mountain. The ‘trick’ is that you don’t consider it tedious work.

Kahneman is critical of so-called intuition – effortless intuitive system 1 at work – and he gives convincing accounts of cold-hearted algorithms beating humans, e.g. in picking the best candidate for a job. However, he describes his struggles with another school of thought of psychologists who are wary of algorithms. I have lambasted dumb HR-acronym-checking-bots on this blog, too. But Kahneman finally reached an agreement with algorithm haters as he acknowledged that there is a specific type of expert intuition that appears like magic to outsiders. His examples: Firefighters and nurses who feel what is wrong – and act accordingly – before they can articulate it. He still believes that picking stocks or picking job applicants is not a skill and that positive results don’t correlate at all with skill but are completely random.

I absolutely love the example of firefighters as I can literally relate to it. Kahneman demystifies their magic abilities though as he states that this is basically pattern recognition – you have gathered similar experience, and after many years of exposure system 1 can draw from that wealth of patterns unconsciously.

Returning to my statistically irrelevant narrative, this still does not completely explain why exposure to theoretical physics should make me better at analyzing faulty security protocols. Physics textbooks make you an expert in solving physics textbook problems, that is: in recognizing patterns, and they provide you with the type of out-of-the-box idea you sometimes need to find a clever mathematical proof. You might get better at solving those physics puzzles people enjoy sharing on social media.

But probably the relation to troubleshooting tech problems is very simple and boils down to the fact that you love to tackle formal, technical problems again and again even if many attempts are in vain. The motivation and the challenge lie in looking at the problem as a black box and trying to find a clever way to get in. Every time you fail you learn something nonetheless, and that learning is a pleasure in its own right.

# Mastering Geometry is a Lost Art

I am trying to learn Quantum Field Theory the hard way: Alone and from textbooks. But there is something harder than the abstract math of advanced quantum physics:

You can aim at comprehending ancient texts on physics.

If you are an accomplished physicist, chemist or engineer – try to understand Sadi Carnot’s reasoning that was later called the effective discovery of the Second Law of Thermodynamics.

At Carnotcycle’s excellent blog on classical thermodynamics you can delve into thinking about well-known modern concepts in a new – or better: in an old – way. I found this article on the dawn of entropy a difficult read, even though we can recognize some familiar symbols and concepts such as circular processes, and despite or because of the fact that I was, at the time of reading this article, a heavy consumer of engineering thermodynamics textbooks. You have to translate now unused notions such as heat received and the expansive power into their modern counterparts. It is like reading a text in a foreign language by deciphering every single word instead of having developed a feeling for the language.

Stephen Hawking once published an anthology of the original works of the scientific giants of the past millennium: Copernicus, Galileo, Kepler, Newton and Einstein: On the Shoulders of Giants. So just in case you googled for Hawkins – don’t expect your typical Hawking pop-sci bestseller with lots of artistic illustrations. This book is humbling. I found the so-called geometrical proofs most difficult and unfamiliar to follow. Actually, it is my difficulties in (not) taming that Pesky Triangle that motivated me to reflect on geometrical proofs.

I am used to proofs stacked upon proofs until you get to the real thing. In analysis lectures you get used to starting by proving that 1+1=2 (literally) until you learn about derivatives and slopes. However, Newton and his predecessor giants talk geometry all the way! I have learned a different language. Einstein is the most familiar in the way he tackles problems, though his physics is in principle the most non-intuitive.

This amazon.com review is titled Now We Know why Geometry is Called the Queen of the Sciences and the reviewer perfectly nails it:

It is simply astounding how much mileage Copernicus, Galileo, Kepler, Newton, and Einstein got out of ordinary Euclidean geometry. In fact, it could be argued that Newton (along with Leibnitz) were forced to invent the calculus, otherwise they too presumably would have remained content to stick to Euclidean geometry.

Science writer Margaret Wertheim gives an account of a 20th century giant trying to recapture Isaac Newton’s original discovery of the law of gravitation in her book Physics on the Fringe. (The main topic of the book is outsider physicists’ theories; I have blogged about the book at length here.)

This giant was Richard Feynman.

Today the gravitational force, the gravitational potential and the related acceleration of objects in gravitational fields are presented by means of calculus: The potential is equivalent to a rubber membrane model – the steeper the membrane, the higher the force. (However, this is not a geometrical proof – this is an illustration of the underlying calculus.)

Model of the gravitational potential. An object trapped in these wells moves along similar trajectories as bodies in a gravitational field. Depending on initial conditions (initial position and velocity) you end up with elliptical, parabolic or hyperbolic orbits. (Wikimedia, Invent2HelpAll)

(Today) you start from the equation of motion for an object under the action of a force that weakens with the inverse square of the distance between two massive objects, and out pops Kepler’s law about elliptical orbits. It takes some pages of derivation, and you need to recognize conic sections in formulas – but nothing too difficult for an undergraduate student of science.

Newton actually had to invent calculus together with tinkering with the law of gravitation. In order to convince his peers he needed to use the geometrical language and the mental framework common back then. He uses all kinds of intricate theorems about triangles and intersecting lines (;-)) in order to say what we say today using the concise shortcuts of derivatives and differentials.

Wertheim states:

Feynman wasn’t doing this to advance the state of physics. He was doing it to experience the pleasure of building a law of the universe from scratch.

Feynman said to his students:

“For your entertainment and interest I want you to ride in a buggy for its elegance instead of a fancy automobile.”

But he underestimated the daunting nature of this task:

In the preparatory notes Feynman made for his lecture, he wrote: “Simple things have simple demonstrations.” Then, tellingly, he crossed out the second “simple” and replaced it with “elementary.” For it turns out there is nothing simple about Newton’s proof. Although it uses only rudimentary mathematical tools, it is a masterpiece of intricacy. So arcane is Newton’s proof that Feynman could not understand it.

Given the headache that even Copernicus’ original proofs in the Shoulders of Giants gave me I can attest to:

… in the age of calculus, physicists no longer learn much Euclidean geometry, which, like stonemasonry, has become something of a dying art.

Richard Feynman finally made up his own version of a geometrical proof to fully master Newton’s ideas, and Feynman’s version covered a hundred typewritten pages, according to Wertheim.

Everybody who indulges gleefully in wooden technical prose and takes pride in plowing through mathematical ideas can relate to this:

For a man who would soon be granted the highest honor in science, it was a DIY triumph whose only value was the pride and joy that derive from being able to say, “I did it!”

Richard Feynman gave a lecture on the motion of the planets in 1964 that was later called his Lost Lecture. In this lecture he presented his version of the geometrical proof, which was simpler than Newton’s.

The proof presented in the lecture has been turned into a series of videos by YouTube user Gary Rubinstein. Feynman’s original lecture was 40 minutes long and confusing, according to Rubinstein – who turned it into 8 videos of about 10 minutes each.

The rest of the post is concerned with what I believe that social media experts call curating. I am just trying to give an overview of the episodes of this video lecture. So my summaries do most likely not make a lot of sense if you don’t watch the videos. But even if you don’t watch the videos you might get an impression of what a geometrical proof actually is.

In Part I (embedded also below) Kepler’s laws are briefly introduced. The characteristic properties of an ellipse are shown – in the way used by gardeners to create an ellipse with a cord and a pencil. An ellipse can also be created within a circle by starting from a random point inside it, connecting it to a point on the circumference and constructing the perpendicular bisector of that connection.

Part II starts with emphasizing that the bisector is actually a tangent to the ellipse (this will become an important ingredient in the proof later). Then Rubinstein switches to physics and shows how a planet effectively ‘falls into the sun’ according to Newton, that is, a deviation due to gravity is superimposed on its otherwise straight-lined motion.
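The circle construction from Part I can be checked numerically. The sketch below is my reading of the construction, not code from the videos: the circle is centered at the origin, the chosen point F sits inside it, and for each point Q on the circumference the point P is taken where the perpendicular bisector of F-Q crosses the radius to Q. If P really traces an ellipse with foci at the center and at F, the sum of its distances to both foci must equal the circle’s radius. The coordinates of F and the radius are arbitrary illustrative values.

```python
import math


def point_on_bisector(focus, radius, angle):
    """Intersection of the perpendicular bisector of F-Q with the radius O-Q."""
    fx, fy = focus
    qx, qy = radius * math.cos(angle), radius * math.sin(angle)
    # P = s * Q lies on the radius; s follows from requiring |P - F| = |P - Q|.
    s = (radius**2 - (fx * fx + fy * fy)) / (2 * (radius**2 - (qx * fx + qy * fy)))
    return s * qx, s * qy


def distance_sum(focus, radius, angle):
    """Distance from P to the circle's center plus distance from P to F."""
    px, py = point_on_bisector(focus, radius, angle)
    return math.hypot(px, py) + math.hypot(px - focus[0], py - focus[1])


FOCUS = (0.6, 0.0)  # hypothetical point inside the circle
RADIUS = 1.0

sums = [distance_sum(FOCUS, RADIUS, 2 * math.pi * k / 12) for k in range(12)]
print([round(value, 12) for value in sums])
```

The constant sum of distances is exactly the gardener’s cord from Part I: the cord length plays the role of the circle’s radius.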

Part III shows in detail why the areas of the triangles swept out by the radius vector need to stay the same. The way Newton defined the size of the force in terms of a parallelogram attached to the otherwise undisturbed path (no inverse square law mentioned yet!) gives rise to constant areas of the triangles – no matter what the size of the force is!
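That the areas do not care about the size of the force can be tried out with a small simulation. This is a sketch in the spirit of the polygon picture, with made-up numbers: the planet moves in a straight line for a fixed time step, then receives a kick pointing towards the sun at the origin. The strength of each kick is drawn at random, so no force law is assumed at all, only the direction.

```python
import random


def swept_areas(steps=50, dt=0.1, seed=1):
    """Areas of the triangles sun - old position - new position for each step."""
    rng = random.Random(seed)
    x, y = 1.0, 0.0      # starting position (sun at the origin)
    vx, vy = 0.0, 1.0    # starting velocity
    areas = []
    for _ in range(steps):
        # Straight, undisturbed motion during one time step.
        new_x, new_y = x + vx * dt, y + vy * dt
        areas.append(0.5 * abs(x * new_y - y * new_x))
        # Kick towards the sun with an arbitrary, random strength.
        strength = rng.uniform(0.0, 0.5)
        vx, vy = vx - strength * new_x, vy - strength * new_y
        x, y = new_x, new_y
    return areas


areas = swept_areas()
print(f"smallest area: {min(areas):.6f}, largest area: {max(areas):.6f}")
```

Changing the seed or the range of the kick strengths changes the path completely, but the triangles keep their area as long as every kick points exactly at the sun.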

In Part IV the inverse square law is introduced – the changing force is associated with one side of the parallelogram denoting the deviation from motion without force. Feynman has now introduced the velocity as distance over time, which is equal to the size of the tangential line segments over the areas of the triangles. He created a separate ‘velocity polygon’ of segments denoting velocities. Both polygons – for distances and for velocities – look elliptical at first glance, though the velocity polygon seems more circular (we will learn later that it has to be a circle).

In Part V Rubinstein expounds that the geometrical equivalent of the change in velocity being proportional to 1 over radius squared times time elapsed with time elapsed being equivalent to the size of the triangles (I silently translate back to dv = dt times acceleration). Now Feynman said that he was confused by Newton’s proof of the resulting polygon being an ellipse – and he proposed a different proof:
Newton started from what Rubinstein calls the sun ‘pulsing’ at the same intervals, that is: replacing the smooth path by a polygon, resulting in triangles of equal size swept out by the radius vector but in a changing velocity. Feynman divided the spatial trajectory into parts to which triangles of varying area are attached. These triangles are made up of radius vectors all at the same angles to each other. On trying to relate these triangles to each other by scaling them, he needs to consider that the area of a triangle scales with the square of its height. This also holds for non-similar triangles having one angle in common.

Part VI: Since ‘Feynman’s triangles’ have one angle in common, their respective areas scale with the squares of the heights of their equivalent isosceles triangles, thus basically the distance of the planet to the sun. The force is proportional to one over distance squared, and time is proportional to distance squared (as per the scaling law for these triangles). Thus the change in velocity – being the product of both – is constant! This is what Rubinstein calls Feynman’s big insight. Not only are the changes in velocity constant in size, but so are the angles between adjacent line segments denoting those changes. Thus the changes in velocities make up a regular polygon (which seems to turn into a circle in the limiting case).
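Translated back into formulas, the argument of Parts V and VI might be written as follows. The symbols are my own shorthand and do not appear in the videos: r is the distance to the sun, Δθ the fixed angle between neighbouring radius vectors, ΔA the area of one triangle, Δt the time spent in it and m the mass of the planet.

```latex
\begin{aligned}
\Delta A &\approx \tfrac{1}{2}\, r^{2}\, \Delta\theta
  &&\text{(area of a thin triangle with apex angle } \Delta\theta\text{)} \\
\frac{\Delta A}{\Delta t} &= \text{const}
  \;\Longrightarrow\; \Delta t \propto r^{2}\, \Delta\theta
  &&\text{(equal areas in equal times)} \\
|\Delta \vec{v}\,| &= \frac{|\vec{F}\,|}{m}\, \Delta t
  \;\propto\; \frac{1}{r^{2}} \cdot r^{2}\, \Delta\theta = \Delta\theta
  &&\text{(inverse square law)}
\end{aligned}
```

The factors of r squared cancel, so each step of equal angle changes the velocity by the same amount. Since the force points along the radius vector, the direction of the change also turns by Δθ from one step to the next, which is the regular polygon.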

Part VII: The point used to build up the velocity polygon by attaching the velocity line segments to it is not the center of the polygon. If you draw connections from the center to the endpoints of a velocity segment, the angle between them corresponds to the angle the planet has travelled in space. The animation of the continuous motion of the planet in space – travelling along its elliptical orbit – is put side by side with the corresponding velocity diagram. Then Feynman relates the two diagrams, actually merges them, in order to track down the position of the planet using the clues given by the velocity diagram.
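Both statements about the velocity diagram can be checked against the textbook formula for a Kepler ellipse. This is a sketch only, with assumptions of my own: the orbit is written as r = p / (1 + e cos θ) with the sun at the origin, the values of `MU`, `SEMI_LATUS` and `ECC` are made up, and the centre and radius of the circle are the standard results for inverse-square motion rather than something quoted from the videos.

```python
import math

MU = 1.0                 # strength of the attraction (arbitrary units)
SEMI_LATUS = 1.44        # p in r = p / (1 + e cos(theta)), hypothetical value
ECC = 0.44               # eccentricity e, hypothetical value
H = math.sqrt(MU * SEMI_LATUS)     # twice the area swept per unit time

def velocity(theta):
    """Velocity of the planet at angle theta on the ellipse, sun at the origin."""
    r = SEMI_LATUS / (1.0 + ECC * math.cos(theta))
    dr = SEMI_LATUS * ECC * math.sin(theta) / (1.0 + ECC * math.cos(theta)) ** 2  # dr/dtheta
    omega = H / r**2                                   # dtheta/dt from the equal-area law
    return (omega * (dr * math.cos(theta) - r * math.sin(theta)),
            omega * (dr * math.sin(theta) + r * math.cos(theta)))

RADIUS = MU / H                     # radius of the velocity circle
CENTRE = (0.0, ECC * MU / H)        # not the point (0, 0) the velocities start from

max_radius_error = 0.0
max_angle_error = 0.0
for k in range(360):
    theta = math.radians(k)
    vx, vy = velocity(theta)
    dx, dy = vx - CENTRE[0], vy - CENTRE[1]
    max_radius_error = max(max_radius_error, abs(math.hypot(dx, dy) - RADIUS))
    # Angle of the velocity tip as seen from the centre, minus a quarter turn.
    seen = math.atan2(dy, dx) - math.pi / 2
    diff = (seen - theta + math.pi) % (2 * math.pi) - math.pi   # wrap into [-pi, pi)
    max_angle_error = max(max_angle_error, abs(diff))

print(max_radius_error, max_angle_error)    # both should be at rounding level
```

The offset of the centre relative to the radius equals the eccentricity, so for a circular orbit the starting point of the velocities and the centre coincide.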

In Part VIII (embedded also below) Rubinstein finally shows why the planet traverses an elliptical orbit. The way the position of the planet was finally found in Part VII is equivalent to the insights into the properties of an ellipse found at the beginning of this tutorial. The planet needs to be on the ‘ray’, the direction determined by the velocity diagram. But it also needs to be on the perpendicular bisector of the velocity segment – as the force causes a change in velocity perpendicular to the previous velocity segment and the velocity needs to correspond to a tangent to the path.
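The last step can be replayed in a short sketch, with the caveat that the details are my reading of the construction and not a transcript of the videos. I assume the velocity diagram has been turned by a quarter turn so that the ray from the centre of the circle points in the direction of the planet, that the circle has radius `R` with its centre at the origin, and that the point the velocities were attached to ends up at a distance of eccentricity times `R` from the centre. The names `ECC`, `R` and `constructed_point` are hypothetical. The planet is placed where the ray meets the perpendicular bisector of the rotated velocity segment, and the result is compared with the textbook ellipse r = p / (1 + e cos θ).

```python
import math

ECC = 0.44            # hypothetical eccentricity of the orbit
R = 1.0               # radius of the velocity circle (only the ratio offset / R matters)

def constructed_point(theta):
    """Planet position read off the velocity diagram after a quarter turn.

    Centre of the circle at (0, 0); the point the velocities are attached to
    sits at F = (-ECC * R, 0); Q = R * (cos theta, sin theta) is the velocity tip.
    """
    ux, uy = math.cos(theta), math.sin(theta)
    fx, fy = -ECC * R, 0.0
    # P on the ray towards Q with |P - F| = |P - Q| (perpendicular bisector of F-Q)
    t = (R * R - fx * fx - fy * fy) / (2.0 * (R - (fx * ux + fy * uy)))
    return t * ux, t * uy

def orbit_radius(theta, p=1.0):
    """Textbook Kepler ellipse with the sun at the origin."""
    return p / (1.0 + ECC * math.cos(theta))

ratios = []
for k in range(360):
    theta = math.radians(k)
    px, py = constructed_point(theta)
    ratios.append(math.hypot(px, py) / orbit_radius(theta))

print(min(ratios), max(ratios))   # constant ratio: the construction is a scaled copy of the orbit
```

A constant ratio means the constructed curve and the Kepler ellipse differ only in overall size, which the velocity diagram alone cannot fix anyway.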