In my previous post, Turtles and polynomials, I explained a geometric construction to find the real roots of a polynomial. This post is a follow-up. I want to give you a second perspective to why the construction works: In this post, we will build up our understanding of the construction one degree at a time from degree 1 to degree n. I find that this second perspective provides a neat geometric understanding of polynomial factorization.
Let us first briefly recall how the construction works: We imagine recording the path of a turtle walking in a way which encodes the polynomial; The turtle walks in straight line segments, and the length of the segments are determined by the coefficients of the polynomial. We find the real roots of the polynomial by sending off a baby turtle at an angle $\relax \theta$. It will walk in straight lines and turn at a $90\degree$ angle when it hits extensions of segments of its parents path. The illustration below shows the construction for the polynomial
$$\relax p(x)=x^3-6x^2+9x-4.$$
Recall that $\relax \theta$ is positive in the counterclockwise direction, so that, in the picture above, it is negative. If $\relax \theta$ is such that the baby turtle succeeds in finding its parent, then
$$\relax\lambda=-\tan(\theta)$$
is a root of the polynomial. So, once again, why does this work?
Degree 1
This time, we start our analysis with a polynomial of degree one:
$$\relax p(x)=ax+b.$$
Let us assume $\relax a>0$, so that the turtle will start walking to the right. If $\relax b=0$, it is not hard to see that the construction works: The successful baby turtle simply walks directly to the right ($\relax \theta=0$) until it meets its parent.
When $\relax b\neq0$, the turtle path of $\relax p$ consists of the two catheti of a right-angled triangle; one side with length $\relax a$ and one with length $\relax b$. When $\relax b$ is negative, the triangle points downwards, and when $\relax b$ is positive it points upwards. We complete the right-angled triangle by sending off the baby turtle in the correct angle $\relax\theta$ towards its parent.
With the sign conventions we have chosen, we see that
$$-\tan(\theta)=-\frac{b}{a}$$
And lo and behold, this is the root of $\relax p(x)=ax+b$.
Let us instead start with a root λ. We associate to λ the right-angled triangle whose two catheti are the turtle path of the polynomial $\relax p_1(x)=x-\lambda$:
With the convention that $\relax \theta$ is positive in the counterclockwise direction, we have λ = −tan θ, which is the root of $\relax p_1$.
Now, notice that any scaling of the turtle path by some real number $\relax s\neq0$ yields the path of the scaled polynomial $\relax s\cdot p_1(x)=sx-s\lambda$. And further, such a scaling changes neither the root nor the angle $\relax \theta$.
Our polynomial $\relax p(x)=ax+b$ is a scaling of $\relax p_1$ by a factor of $\relax a$.
Degree 2
Next, we consider a second degree polynomial $\relax p_2(x)=ax^2+bx+c$ with two real roots λ1 and λ2. Since all we care about are the roots, we can assume that $\relax p_2$ is scaled so that its leading coefficient $\relax a=1$.
We associate to each root a right-angled triangle, as we did before:
In the drawing, λ1 = 4 and λ2 = 1. We build the turtle path for $\relax p_2$ from these two triangles as follows: We take two copies of the λ2-triangle, and we scale and rotate them so that we can glue their hypothenuses to each of the two catheti of the λ1-triangle. Importantly there are no reflections in this process; only scalings and rotations.
In this way we obtain rotated turtle path! It consists of 3 line segments, so it is associated to a second degree polynomial. We scale and rotate it so that the first segment of this turtle path is horizontal and has length one.
In the picture above, we have labelled the side lengths of each of the 3 triangles making up the figure. There are two λ2-triangles: one in its original size and one which is scaled by an unknown factor r. And then there is one λ1-triangle which is scaled by another factor s.
To find out what second degree polynomial this turtle path corresponds to, we need to know what r is. But this is easy: Notice that the hypothenus of the first λ2-triangle is s, whereas the hypothenus for the second is s ⋅ λ1. Thus, the second λ2-triangle is scaled by a factor of λ1, and so, r = λ1.
Now we can see that the constructed turtle path is the path for the polynomial $\relax x^2-(\lambda_1+\lambda_2)x+\lambda_1\lambda_2$. Is this our polynomial $\relax p_2$? Indeed it is. As $\relax p_2$ has the two roots λ1 and λ2, it factors like this:
$\relax p_2(x)=(x-\lambda_1)(x-\lambda_2)=x^2-(\lambda_1+\lambda_2)x+\lambda_1\lambda_2$
Notice that it doesn’t matter how we order the 2 roots λ1 and λ2. If we start with a λ2-triangle and glue two λ1-triangles to its sides, we still end up with the turtle path for $\relax p_2$:
In other words, the process of constructing the turtle path by gluing one triangle to the sides of the other is commutative. This corresponds to the commutativity of the product:
$\relax (x-\lambda_1)(x-\lambda_2)=(x-\lambda_2)(x-\lambda_1)$.
One more comment is in place before we proceed: In all of the illustrations, we have worked with two positive roots. If instead, both roots are negative, the picture is exactly the same, except it is mirrored across the horizontal axis. If the two roots have opposite signs, the picture becomes slightly more complicated. Below is an illustration where λ1 = 4 and λ2 = −1/2:
Degree 3
Now suppose we have a third degree polynomial $\relax p_3$ with real roots λ1, λ2 and λ3. We’ll assume $\relax p_3$ is normalized so that its leading coefficient is 1.
We build the turtle path of $\relax p_3$ by first building the turtle path for the second degree polynomial $\relax p_2$ with roots λ1 and λ2. Then we add three copies of the λ3-triangle by gluing their hypothenuses to each of the three segments in the path for $\relax p_2$. Only rotations and scalings of the λ3-triangle are allowed. In the illustration below, we have used the roots λ1 = 4, λ2 = 1 and λ3 = 1.
The result is a rotated turtle path with 4 segments – one more than before. We rotate and scale it so that the first segment in this path is horizontal and has length one. This implies that the first λ3-triangle is back to its original size with the two short sides equal to 1 and λ3, respectively.
If the hypothenus of the first λ3-triangle has length s, it means that the old $\relax p_2$ turtle path is scaled by a factor of s. Then, as we see in the illustration above, the remaining two λ3-triangles are scaled by a factor of whatever the original side lengths were of the segments they are glued to.
So… is this the turtle path of our polynomial $\relax p_3$? Let’s turn to the algebraic expression for $\relax p_3$. Working out what the coefficients are in terms of the roots is a combinatoric task:
$\relax p_3(x)=(x-\lambda_1)(x-\lambda_2)(x-\lambda_3)$
$\relax =x^3-(\lambda_1+\lambda_2+\lambda_3)x^2+(\lambda_1\lambda_2+\lambda_2\lambda_3+\lambda_3\lambda_1)x-\lambda_1\lambda_2\lambda_3$
And indeed, the lengths of the segments in the constructed turtle path matches up with the coefficients of $\relax p_3$. We have to be careful with the signs here. But if we recall in what direction the turtle should be facing on each of the segments when walking this path, we see that the signs do match up (recall that the roots used in the illustration are positive).
Degree n
Let n be any natural number, and let us consider a degree n polynomial $\relax p_n$ with n real roots λ1, λ2, …, λn. That is,
$\relax p_n(x)=(x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_n)$
We build the turtle path of $\relax p_n$ recursively by continuing the process described above. Starting with the λ1-triangle, we glue a scaled and rotated copy of the λ2-triangle to each of its catheti to obtain the turtle path of the degree 2 polynomial with the roots λ1 and λ2. We continue this process adding at each step another layer of triangles. Precisely, at step k, we take k copies of the λk-triangle, and we scale and rotate them so that we can glue one to each of the sides of the path we constructed at step k − 1.
After step n, we have constructed the turtle path of $\relax p_n$ (this, of course, calls for an induction proof). Because the process is commutative, it doesn’t matter how we order the n roots λ1, λ2, …, λn. If we take them in a different order, we still end up with the turtle path for $\relax p_n$.
In all of this, we have assumed that the leading coefficient of $\relax p_n$ is 1. If we want a different leading coefficient, say $\relax a>0$, we simply multiply $\relax p_n$ with $\relax a$, which, in the turtle geometric picture corresponds to scaling the whole picture by a factor of $\relax a$. A negative value of $\relax a$ corresponds to both scaling and mirroring the resulting path across the $\relax x$-axis.
To predict what the side lengths should be, we carry out the multiplication in the decompostion of $\relax p_n$: For $\relax k$ between $\relax 0$ and $\relax n$, the coefficient in front of the $\relax x^k$-term is the sum of all different choices of $\relax n-k$ of the roots multiplied together, possibly with a minus in front. Precisely:
$\relax(-1)^{n-k}\sum_{J^{(n-k)}\subset\{1,\ldots,n\}}\prod_{j\in J^{(n-k)}}\lambda_j$
Here, $\relax J^{(n-k)}$ is a subset of size $\relax n-k$, so that the sum is over all choices of exactly $\relax n-k$ different numbers from the set $\relax \{1,\ldots,n\}$.
Factorization
This way of constructing the turtle path highlights how the polynomial factors into a product of terms of the form $\relax (x-\lambda_k)$:
$\relax p_n(x)=(x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_{n-1})(x-\lambda_n)$
It also shows why the turtle path we get after finding one root can be used to find another root of the original polynomial. A successful baby turtle finding a root, say λn, will walk the turtle path of the degree n − 1 polynomial with the roots λ1, λ2, …, λn − 1. Algebraically, this corresponds to the polynomial division
$$\frac{p_n(x)}{x-\lambda_n}=(x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_{n-1})$$
In the recursive construction, this is like going one step back peeling off the outer layer of λn-triangles.