The use of linear algebra to understand polynomials is well-known. We will go through some prominent examples.

Multiplication of polynomials

We notice that the coefficients of a product of polynomials can be generated by multiplying a matrix by a vector. This is best shown by example. To multiply \(a_0 + a_1x\) and \(b_0 + b_1x\), we have \(c_0 + c_1x + c_2x^2 := (a_0 + a_1x)(b_0 + b_1x) = a_0b_0 + (a_0b_1 + a_1b_0)x + a_1b_1x^2\), hence

\[\begin{bmatrix}c_0 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} a_0 & 0 \\ a_1 & a_0 \\ 0 & a_1\end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} b_0 & 0 \\ b_1 & b_0 \\ 0 & b_1\end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\]

As another example, if \(c_0 + c_1x + c_2x^2 + c_3x^3 = (a_0 + a_1x)(b_0 + b_1x + b_2x^2)\), then

\[\begin{bmatrix}c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix}= \begin{bmatrix}a_0 & 0 & 0 \\ a_1 & a_0 & 0 \\ 0 & a_1 & a_0 \\ 0 & 0 & a_1 \end{bmatrix} \begin{bmatrix}b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} b_0 & 0 \\ b_1 & b_0 \\ b_2 & b_1 \\ 0 & b_2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\]
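In general, multiplication by a fixed polynomial is given by a banded (Toeplitz) matrix whose columns are shifted copies of its coefficients, so the coefficients of the product are the convolution of the two coefficient sequences. Here is a minimal NumPy sketch (the helper name `mult_matrix` is mine) that builds this matrix for arbitrary degrees and checks it against `np.convolve`:

```python
import numpy as np

def mult_matrix(a, deg_b):
    """Matrix M such that M @ b gives the coefficients of a(x) * b(x).

    `a` holds coefficients [a_0, ..., a_m] in increasing-degree order;
    `deg_b` is the degree of b, so b has deg_b + 1 coefficients.
    """
    m = len(a) - 1
    M = np.zeros((m + deg_b + 1, deg_b + 1))
    for j in range(deg_b + 1):
        M[j:j + m + 1, j] = a  # each column is a shifted copy of a
    return M

a = np.array([2.0, 3.0])        # 2 + 3x
b = np.array([1.0, 4.0, 5.0])   # 1 + 4x + 5x^2
c = mult_matrix(a, deg_b=2) @ b
assert np.allclose(c, np.convolve(a, b))  # polynomial product = convolution
print(c)  # [ 2. 11. 22. 15.]  i.e. 2 + 11x + 22x^2 + 15x^3
```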

More generally, we can multiply two pairs of polynomials and sum the products, all in a single matrix-vector product. Here’s an example. Let \(a(x), b(x), c(x), d(x)\) be linear polynomials. Then we can express the coefficients of \(a(x)c(x) + b(x)d(x)\) as follows.

\[\begin{bmatrix} a_0 & 0 & b_0 & 0 \\ a_1 & a_0 & b_1 & b_0 \\ 0 & a_1 & 0 & b_1 \end{bmatrix}\begin{bmatrix} c_0 \\ c_1 \\ d_0 \\ d_1 \end{bmatrix} = \begin{bmatrix} a_0 c_0 \\ a_1 c_0 + a_0 c_1 \\ a_1c_1 \end{bmatrix} + \begin{bmatrix} b_0 d_0 \\ b_1d_0 + b_0d_1 \\ b_1 d_1 \end{bmatrix}\]

This is more or less the multiplication of block matrices: given matrices \(A, B, C, D\) with matching dimensions, one can write \(\begin{bmatrix} A & B \end{bmatrix}\begin{bmatrix} C \\ D\end{bmatrix} = AC + BD\).
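Continuing the NumPy sketch above (this reuses `np` and `mult_matrix` from the earlier block), the block version is just a horizontal stack of two multiplication matrices:

```python
# Coefficients of a(x)c(x) + b(x)d(x) via one block matrix-vector product.
a, b = np.array([1.0, 2.0]), np.array([3.0, 4.0])   # 1 + 2x,  3 + 4x
c, d = np.array([5.0, 6.0]), np.array([7.0, 8.0])   # 5 + 6x,  7 + 8x

M = np.hstack([mult_matrix(a, 1), mult_matrix(b, 1)])  # the [A  B] block
v = np.concatenate([c, d])                             # the [C; D] block
assert np.allclose(M @ v, np.convolve(a, c) + np.convolve(b, d))
print(M @ v)  # [26. 68. 44.]  i.e. 26 + 68x + 44x^2
```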

This leads on to the theory of Sylvester matrices and resultants.
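For a taste of the connection (a sketch, with one common sign convention): if \(a(x) = a_0 + a_1x + a_2x^2\) and \(b(x) = b_0 + b_1x\), the block matrix of the map \((v, u) \mapsto a(x)v + b(x)u\), where \(v\) is a constant and \(u\) is linear, is

\[\begin{bmatrix} a_0 & b_0 & 0 \\ a_1 & b_1 & b_0 \\ a_2 & 0 & b_1 \end{bmatrix},\]

which is the Sylvester matrix of \(a\) and \(b\) up to transposition and reordering of rows and columns. Its determinant \(a_2b_0^2 - a_1b_0b_1 + a_0b_1^2\) is the resultant of \(a\) and \(b\), which vanishes precisely when they have a common root (over \(\mathbb{C}\), assuming \(a_2, b_1 \neq 0\)).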

Formal derivatives

The derivative of a polynomial \(a_0 + a_1x + \dots + a_n x^n\) is \(a_1 + 2a_2x + \dots + na_n x^{n-1}\). This can also be expressed in terms of a matrix.

\[\begin{bmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 2 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & n \\ 0 & 0 & 0 & \dots & 0 \end{bmatrix}\begin{bmatrix}a_{0}\\a_{1}\\\vdots\\ a_{n-1} \\ a_{n}\end{bmatrix} = \begin{bmatrix}a_{1}\\2a_{2}\\\vdots\\ na_{n} \\ 0\end{bmatrix}\]
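As a quick check, here is the same matrix in NumPy (a sketch; it reuses the `np` import above, and `deriv_matrix` is my own name):

```python
def deriv_matrix(n):
    """(n+1) x (n+1) matrix sending [a_0, ..., a_n] to [a_1, 2a_2, ..., n*a_n, 0]."""
    return np.diag(np.arange(1.0, n + 1), k=1)  # 1, 2, ..., n on the superdiagonal

p = np.array([1.0, 3.0, 0.0, 2.0])   # 1 + 3x + 2x^3
print(deriv_matrix(3) @ p)           # [3. 0. 6. 0.]  i.e. 3 + 6x^2
```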

Combined with resultants, one could use the above to derive the discriminant of a polynomial.
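For instance, with \(p(x) = ax^2 + bx + c\) we have \(p'(x) = b + 2ax\), and a short computation gives \(\operatorname{res}(p, p') = ab^2 - 2ab^2 + 4a^2c = -a(b^2 - 4ac)\); with the usual normalisation for degree 2 (dividing by \(-a\)), this recovers the familiar discriminant \(b^2 - 4ac\).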

Polynomial interpolation

If we want to find an \(n\)-th degree polynomial \(p(x) = a_0 + a_1x + \dots + a_nx^n\) that goes through the \(m+1\) points \((x_0, y_0), \dots, (x_m, y_m)\), i.e. \(p(x_0) = y_0, \dots, p(x_m) = y_m\), we can reformulate the problem as follows.

\[\begin{bmatrix}1&x_{0}&x_{0}^{2}&\dots &x_{0}^{n}\\1&x_{1}&x_{1}^{2}&\dots &x_{1}^{n}\\1&x_{2}&x_{2}^{2}&\dots &x_{2}^{n}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&x_{m}&x_{m}^{2}&\dots &x_{m}^{n}\end{bmatrix} \begin{bmatrix}a_{0}\\a_{1}\\\vdots \\a_{n}\end{bmatrix}=\begin{bmatrix}y_{0}\\y_{1}\\\vdots \\y_{m}\end{bmatrix}.\]

This affirms our geometric intuition: if \(m < n\) (and the \(x_i\) are distinct), the system is underdetermined and infinitely many such polynomials go through those points; if \(m = n\), the polynomial exists and is unique.

The matrix on the left is the Vandermonde matrix, whose determinant is \(\prod_{0 \leq i < j \leq m}(x_j - x_i)\); this is nonzero exactly when the \(x_i\) are distinct, which is why the square case has a unique solution. The Vandermonde matrix is interesting in its own right and also shows up in number theory.
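For the square case \(m = n\), solving the Vandermonde system numerically takes a couple of lines of NumPy (a sketch; the data points here are arbitrary):

```python
# Interpolate a quadratic (n = 2) through the m + 1 = 3 points
# (0, 1), (1, 0), (2, 3) by solving the Vandermonde system V a = y.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 0.0, 3.0])

V = np.vander(x, N=3, increasing=True)  # columns 1, x, x^2
a = np.linalg.solve(V, y)               # coefficients a_0, a_1, a_2
print(a)                                # [ 1. -3.  2.]  i.e. 1 - 3x + 2x^2
```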
