Introduction
In this short article, I’ll give a geometric motivation for the definition of the inner product of vectors (also called the scalar product). The objects we will consider are arrows in three-dimensional space, represented by Latin letters with an arrow above them, like \(\vec{A}\) or \(\vec{a}\). The inner product is indicated by a centered dot between the two vectors, as in \(\vec{a}\cdot\vec{b}\). Remember that any arrow \(\vec{a}\) has a length \(|\vec{a}|\), which can be measured with a meter stick. In addition, the angle between two arrows can be measured with a protractor.
The inner product of parallel vectors
What we want is to define a multiplication between a pair of vectors that yields a number. Let’s start by considering two parallel vectors \(\vec{a}\) and \(\vec{b}\) that point in the same direction. If we were to define a multiplication between them, it seems natural to define it as the product of their lengths:
\[\vec{a}\cdot\vec{b}=|\vec{a}||\vec{b}|.\]
This multiplication rule can be interpreted as taking the length of one of the vectors and scaling it by a factor equal to the length of the other.
It follows from the definition that in the particular case when \(\vec{b}=\vec{a}\), we get the relation \(\vec{a}\cdot\vec{a}=|\vec{a}|^2\).
If the vectors are parallel but oppositely directed, we define the product between them in the same way as before, but to distinguish this case from the one where they point in the same direction, we assign a minus sign to the result, so that \(\vec{a}\cdot\vec{b}=-|\vec{a}||\vec{b}|\). This mirrors the multiplication of real numbers, where the product of two numbers of opposite sign is negative.
The inner product of nonparallel vectors
The idea of the inner product is to extend the previous rule of multiplication to the case when the vectors are not parallel. In that case we want a product that takes into account not just the lengths of the vectors, but also their relative orientation, i.e., the angle between them.
Let’s take a look at the case when the angle \(\theta\) between the two vectors \(\vec{a}\), \(\vec{b}\) is more than zero but less than \(90\) degrees. When the vectors were parallel, we took the length of \(\vec{a}\) multiplied by a factor equal to the length of \(\vec{b}\), but now only a smaller portion of \(\vec{b}\) points in the same direction as \(\vec{a}\). The natural thing to do seems to be to define the product between the vectors as the length of \(\vec{a}\) multiplied by the length of the portion of \(\vec{b}\) in the direction of \(\vec{a}\). By “the portion of one vector in the direction of the other” I mean the component of one of them in the direction of the other, that is, its orthogonal projection. Since the projection of \(\vec{b}\) on the direction of \(\vec{a}\) has length \(|\vec{b}|\cos\theta\), this definition reads
\[\vec{a}\cdot\vec{b}=|\vec{a}||\vec{b}|\cos\theta.\]
We could also take the length of \(\vec{b}\) multiplied by a factor equal to the portion of \(\vec{a}\) pointing in the same direction as \(\vec{b}\). The two results coincide, so the inner product so defined is commutative. The verification of this is left as an exercise.
If the angle between the vectors is larger than \(90\) degrees, we take the length of one of them multiplied by a factor equal to the length of the portion of the other one in the opposite direction of the former and assign a minus sign to the result.
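The three cases above (parallel, acute, obtuse) can be tried out numerically. The following is only a minimal sketch: the function names `projection_length` and `geometric_inner` are hypothetical, and it assumes the signed portion of an arrow along a direction at angle \(\theta\) is its length times \(\cos\theta\), so the minus sign for angles larger than \(90\) degrees comes out of the cosine automatically.

```python
import math

def projection_length(length, theta_deg):
    """Signed length of the portion of an arrow along a direction
    that makes an angle theta_deg (in degrees) with it."""
    return length * math.cos(math.radians(theta_deg))

def geometric_inner(len_a, len_b, theta_deg):
    """Length of a multiplied by the signed portion of b along a."""
    return len_a * projection_length(len_b, theta_deg)

# Illustrative lengths, as if read off a meter stick.
len_a, len_b = 2.0, 3.0
for theta in (0, 60, 90, 120, 180):
    ab = geometric_inner(len_a, len_b, theta)  # portion of b along a
    ba = geometric_inner(len_b, len_a, theta)  # portion of a along b
    print(f"theta = {theta:3d} deg: a.b = {ab:+.3f}, b.a = {ba:+.3f}")
```

Swapping the roles of the two arrows gives the same number for every angle, which is the commutativity mentioned above, and the sign changes once the angle passes \(90\) degrees.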
Properties of the inner product
An important property of the inner product is its distributivity over the sum of vectors, namely:
\[\vec{a}\cdot(\vec{b}+\vec{c})=\vec{a}\cdot\vec{b}+\vec{a}\cdot\vec{c}.\]
To see that this is the case, let’s consider the following figure.
The vectors \(\vec{a}\), \(\vec{b}\) and \(\vec{b}+\vec{c}\) share the starting point. The projection of \(\vec{b}+\vec{c}\) on \(\vec{a}\) is shown in light blue, while the projections of \(\vec{b}\) and \(\vec{c}\) are shown in pink and green, respectively. Clearly the sum of the pink and green arrows equals the light blue arrow. Each inner product is the length of \(\vec{a}\) multiplied by the length of the corresponding projection, and multiplication of numbers is distributive over addition, so the property follows.
The inner product also has the property
\[\vec{a}\cdot(\alpha\vec{b})=(\alpha\vec{a})\cdot\vec{b}=\alpha(\vec{a}\cdot\vec{b}),\]
where \(\alpha\) is a real number. The proof of this latter property is left as an exercise.
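Both properties can be checked numerically from the geometric definition alone. The sketch below is restricted, for simplicity, to arrows lying in one plane, each described by a hypothetical pair (length, direction angle). The helpers `geometric_inner`, `add_arrows` and `scale_arrow` are illustrative names, and the tip-to-tail sum is computed with planar components only as a convenience for adding arrows.

```python
import math

def geometric_inner(u, v):
    """Product of the lengths times the cosine of the angle between the arrows."""
    len_u, ang_u = u
    len_v, ang_v = v
    return len_u * len_v * math.cos(ang_v - ang_u)

def add_arrows(u, v):
    """Tip-to-tail sum of two planar arrows, returned as (length, angle)."""
    len_u, ang_u = u
    len_v, ang_v = v
    x = len_u * math.cos(ang_u) + len_v * math.cos(ang_v)
    y = len_u * math.sin(ang_u) + len_v * math.sin(ang_v)
    return (math.hypot(x, y), math.atan2(y, x))

def scale_arrow(alpha, u):
    """Multiply an arrow by a real number; a negative factor flips its direction."""
    length, angle = u
    if alpha >= 0:
        return (alpha * length, angle)
    return (-alpha * length, angle + math.pi)

# Arbitrary illustrative arrows: (length, direction angle in radians).
a = (2.0, math.radians(10))
b = (1.5, math.radians(50))
c = (3.0, math.radians(130))
alpha = -2.0

# Distributivity: a.(b + c) versus a.b + a.c
lhs = geometric_inner(a, add_arrows(b, c))
rhs = geometric_inner(a, b) + geometric_inner(a, c)
print(math.isclose(lhs, rhs, abs_tol=1e-12))

# Scaling: a.(alpha b), (alpha a).b and alpha (a.b)
s1 = geometric_inner(a, scale_arrow(alpha, b))
s2 = geometric_inner(scale_arrow(alpha, a), b)
s3 = alpha * geometric_inner(a, b)
print(math.isclose(s1, s3, abs_tol=1e-12), math.isclose(s2, s3, abs_tol=1e-12))
```

A numerical check of this kind is not a proof, but it is a quick way to catch a sign error before attempting the exercise.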
The inner product in orthogonal coordinates
I have defined the inner product geometrically. It needs no coordinates. Lengths and angles are directly measurable quantities and the notion of perpendicularity is purely geometric. If we take an ordered basis consisting of three mutually perpendicular vectors of unit length and call them \(\hat{x},\hat{y},\hat{z}\), it is well known that any vector of three-dimensional space can be expressed as a linear combination of them. The usual expression looks like \(\vec{a}=a_x\hat{x}+a_y\hat{y}+a_z\hat{z}\), and similarly for any other vector. The numbers \(a_x,a_y,a_z\) are the coordinates of \(\vec{a}\) in this basis.
Let’s look for the expression of the inner product in terms of the coordinates of the vectors. We will need two theorems from Euclidean geometry, namely Pythagoras’ theorem and the theorem of cosines. The usual expression \(|\vec{a}|^2=a_x^2+a_y^2+a_z^2\) is just the statement of Pythagoras’ theorem, applied twice: once in the plane of \(\hat{x}\) and \(\hat{y}\), and once more to bring in the component along \(\hat{z}\).
The theorem of cosines relates the squares of the lengths of the sides of any triangle to the cosine of one of its interior angles: if the sides have lengths \(a\), \(b\), \(c\) and \(\theta\) is the angle opposite the side of length \(c\), then \(c^2=a^2+b^2-2ab\cos\theta\).
In terms of vectors, two sides of the triangle in the preceding figure are the vectors \(\vec{a}\) and \(\vec{b}\), drawn from a common starting point with the angle \(\theta\) between them, and the third side is \(\vec{c}=\vec{a}-\vec{b}\).
From the rule of subtraction of two vectors and the theorem of cosines we see that
\[|\vec{c}|^2=|\vec{a}|^2+|\vec{b}|^2-2|\vec{a}||\vec{b}|\cos\theta.\]
The latter equation is a purely geometric relation. Let’s see what this relation looks like when the vectors are expressed in the basis.
First, we note that
\[\vec{a}=a_x\hat{x}+a_y\hat{y}+a_z\hat{z},\]
\[\vec{b}=b_x\hat{x}+b_y\hat{y}+b_z\hat{z},\]
and therefore
\[\begin{split}\vec{c}&=\vec{a}-\vec{b}\\&=(a_x-b_x)\hat{x}+(a_y-b_y)\hat{y}+(a_z-b_z)\hat{z}\end{split}\]
In other words, we have the relations \(c_x=a_x-b_x\), \(c_y=a_y-b_y\), and \(c_z=a_z-b_z\).
Pythagoras’ theorem implies \(|\vec{c}|^2=c_x^2+c_y^2+c_z^2\). From \((a_x-b_x)^2=a_x^2+b_x^2-2a_xb_x\), and similar expressions for the other coordinates, after regrouping terms we arrive at
\[|\vec{c}|^2=|\vec{a}|^2+|\vec{b}|^2-2(a_xb_x+a_yb_y+a_zb_z).\]
Now we compare the two expressions for \(|\vec{c}|^2\) and arrive at the conclusion that
\[|\vec{a}||\vec{b}|\cos\theta=a_xb_x+a_yb_y+a_zb_z,\]
but the left side of the latter equation is just the geometric definition of the inner product, so we conclude that
\[\vec{a}\cdot\vec{b}=a_xb_x+a_yb_y+a_zb_z.\]
I want to emphasize that this latter expression is not a definition. It was obtained starting from four things, namely: the geometric definition of the inner product, the algebraic expression of vectors in terms of an ordered basis of mutually perpendicular unit vectors, Pythagoras’ theorem, and the theorem of cosines.
The expression for the inner product looks somewhat different when the selected basis is more general, i.e., when the basis vectors are not orthogonal, not of unit length, or both. I will explain that case in another lecture, in which the concepts of contravariant and covariant components will be discussed.
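The chain of reasoning can be replayed numerically. This is a small sketch with hypothetical helper names (`norm`, `inner_coords`, `inner_from_lengths`) and arbitrary sample coordinates. One function uses the coordinate expression directly. The other uses only the three lengths \(|\vec{a}|\), \(|\vec{b}|\), \(|\vec{a}-\vec{b}|\), solving the theorem of cosines for \(|\vec{a}||\vec{b}|\cos\theta\).

```python
import math

def norm(v):
    """Length of a vector from its coordinates (Pythagoras' theorem)."""
    return math.sqrt(sum(x * x for x in v))

def inner_coords(a, b):
    """Coordinate expression: a_x b_x + a_y b_y + a_z b_z."""
    return sum(x * y for x, y in zip(a, b))

def inner_from_lengths(a, b):
    """|a||b|cos(theta), obtained from the theorem of cosines applied
    to the triangle with sides a, b and c = a - b."""
    c = [x - y for x, y in zip(a, b)]
    return (norm(a) ** 2 + norm(b) ** 2 - norm(c) ** 2) / 2

# Arbitrary sample coordinates in an orthonormal basis.
a = (1.0, 2.0, 3.0)
b = (4.0, -5.0, 6.0)

print(inner_coords(a, b), inner_from_lengths(a, b))
print(math.isclose(inner_coords(a, b), inner_from_lengths(a, b)))
```

The two numbers agree up to rounding, as the derivation says they must for any pair of vectors expressed in an orthonormal basis.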
In “Inner product of vectors”, why do you use the Pythagorean theorem separately from the cosine theorem? Can’t you say that the Pythagorean theorem is just a special case of the cosine theorem?
Thanks for asking. Historically, Pythagoras’ theorem was proved long before the theorem of cosines was even stated, and I think nobody would call it a “corollary of the theorem of cosines”. The two theorems can be proved independently of each other, and once one of them is proved, it can be used to prove the other.
In reality, Pythagoras’ theorem is more fundamental. Indeed, the definition of the cosine (and of the sine) is based on the orthogonal projection of a point of the unit circle onto a diameter, and is then extended to general right triangles through Thales’ theorem. Every proof of the theorem of cosines uses this definition of the cosine, which in the end satisfies the fundamental Pythagorean relation \(\sin^2\theta+\cos^2\theta=1\).
I use them separately because, in practice, both theorems are usually regarded as distinct useful relations.