Consider a walk in $\mathbb{Z}^n$ with the following constraints:
1) At each step you may move only $1$ unit in the positive or negative direction along one of the orthogonal basis vectors of $\mathbb{Z}^n$.
2) You may not visit the same position more than once.
What is the fewest number of steps you must take to get "stuck," that is, so that all positions $1$ step away have already been visited, leaving you with $0$ possible moves that follow the above constraints?
Claim:
The minimum number of steps needed to get stuck in $\mathbb{Z}^n$ is given by
$4n - 1$
for all integers $n \ge 2$.
Proof:
Let $x_1, x_2, x_3, \ldots , x_n$ be the $n$ unit orthogonal basis vectors of $\mathbb{Z}^n$. Let $v_1 = \pm x_a$ and $v_2 = \pm x_b$ for any $a, b$ such that $1 \le a \ne b \le n$.
$v_1$ can reach $v_2$ in $2$ steps and no fewer. Each step can be seen as adding one of these basis vectors (multiplied by $\pm 1$) to the "current position." If $v_1$ is the current position, then the first step is to add $v_2$, putting the current position at $v_1 + v_2$. The second step is to add $-v_1$, putting the current position at $v_2$.
We will start the walk at $x_1$. Using the method just described, we will move from
$x_1$ to $x_1 + x_2$ to $x_2$ to $\ldots$ to $x_n$ to $x_n - x_1$ to $-x_1$ to $\ldots$ to $-x_n$.
Each one of the vectors we visit and each intermediary position is unique and not equal to the zero vector. There are $2n$ such basis-vector positions that we will visit (namely $\pm x_1, \ldots, \pm x_n$), not including the origin. Then there are $2n - 1$ jumps between them, each costing $2$ steps, putting the total number of steps at $2(2n-1) = 4n-2$ steps.
Lastly, we step to the origin, putting the total number of steps at $4n-1$. We have already visited all $2n$ positions reachable from the origin, and we did so using the minimal number of steps possible, so we have no more possible steps to take. Therefore, the minimal number of steps to get stuck in $\mathbb{Z}^n$ is $4n-1$.
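As a quick sanity check (not part of the original argument), here is a small C++ sketch that builds the walk described above for a few values of $n$ and verifies that it takes $4n-1$ steps, never revisits a position, and ends stuck at the origin:

#include <cassert>
#include <cstdlib>
#include <iostream>
#include <set>
#include <vector>

int main()
{
    for (int n = 2; n <= 6; ++n)
    {
        using Point = std::vector<int>;
        auto unit = [n](int axis, int sign) { Point p(n, 0); p[axis] = sign; return p; };

        // The chain of basis-vector positions: x1, x2, ..., xn, -x1, -x2, ..., -xn.
        std::vector<Point> chain;
        for (int i = 0; i < n; ++i) chain.push_back(unit(i, +1));
        for (int i = 0; i < n; ++i) chain.push_back(unit(i, -1));

        // Walk: start at x1; between consecutive chain elements v1 -> v2,
        // visit v1 + v2 and then v2; finish by stepping to the origin.
        std::vector<Point> walk{ chain[0] };
        for (std::size_t k = 1; k < chain.size(); ++k)
        {
            Point mid(n);
            for (int i = 0; i < n; ++i) mid[i] = chain[k - 1][i] + chain[k][i];
            walk.push_back(mid);
            walk.push_back(chain[k]);
        }
        walk.push_back(Point(n, 0));   // the final step to the origin

        // No position is repeated, and consecutive positions are one unit step apart.
        std::set<Point> visited(walk.begin(), walk.end());
        assert(visited.size() == walk.size());
        for (std::size_t k = 1; k < walk.size(); ++k)
        {
            int dist = 0;
            for (int i = 0; i < n; ++i) dist += std::abs(walk[k][i] - walk[k - 1][i]);
            assert(dist == 1);
        }

        // Stuck: every neighbor of the final position (the origin) was already visited.
        for (int i = 0; i < n; ++i)
        {
            assert(visited.count(unit(i, +1)));
            assert(visited.count(unit(i, -1)));
        }

        std::cout << "n = " << n << ": stuck after " << walk.size() - 1 << " steps\n";
    }
}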
Friday, October 25, 2019
Saturday, July 13, 2019
Euclid-Euler Theorem Proved Visually
Update: I made an animated version of this same proof which I believe is easier to follow than the written version. You can view it on my YouTube channel.
Forever ago, Euclid proved that any positive integer of the form
$(2^{k-1})(2^{k}-1)$
will be a perfect number so long as $2^{k}-1$ is prime (primes that are one less than a power of two are called Mersenne primes).
In the 1700s, Euler proved that all even perfect numbers will be of this form.
If you're interested, you can read their proofs on Wikipedia. You will notice that they're very concise and do a good job showing that this is true, but (at least for me) do not leave you with a good intuition why this should be true.
So let me see if I can help with that!
Recall that a perfect number is a positive integer whose factors (including itself) add up to double itself. For example, $6$ is a perfect number since the factors of $6$ are $1, 2, 3, 6$. If you add up all the factors, you get $12$ which is double $6$ and therefore $6$ is a perfect number.
This is an illustration emphasizing the role that the factors $2^{k-1}$ and $2^{k}-1$ play in the makeup of $6$. Note that in this case, $k=2$.
I have drawn it again, but this time I have replaced the three blocks that represent $2^{k}-1$ with one bigger block that says $3$. And notice that there are $2$ of these blocks. This is because
$2^{2-1} = 2^{1} = 2$
So essentially what we've done so far is draw the number $(2^{k-1})(2^{k}-1)$ chopped up into its factors of $2^{k}-1$. And if $6$ is a perfect number (which it is) then if we continue drawing its factors then they should all add up to be twice as much as what we've drawn so far.
The $\sigma$ function simply means that we will add up all of the factors of the input. The illustration shows the "goal" that we're trying to get to. We want to show that $6$ is a perfect number by showing that its factors add up to $2 \cdot 6$.
The first row was easy to derive, because that's just $6$, and of course $6$ is a factor of itself. Well, $6$ is divisible by $2$. So let's chop $6$ in half and add that to the next row. Notice we're still missing a chunk from our picture that we need in order to show that $6$ is perfect.
We have an easy way to find the remaining factors! Let's pause and think about what we've done so far.
We started with $2 \cdot 3$ which we get from $(2^{k-1})(2^{k}-1)$ when $k=2$. So $3$ functions as our prime number and the first row has $2^{k-1}$ many $3$s. It worked nicely to get the second row since $6$ is divisible by $2$. But this was no coincidence: because $2^{k-1}$ is a power of $2$, we can always produce $k-1$ more rows by repeatedly halving the original number. In this case, $k-1=1$ so we only divide by $2$ once and we're left with $2$ rows, but soon you will see this works nicely for larger perfect numbers too.
So we start with the factor $6$ then divide by $2$ to get the factor $3$. And as the image above illustrates, we can find the remaining factors by checking what we need to multiply each row by to get back to $6$. In this case, those factors are $1$ and $2$. So let's add that to the image.
And this is the super cool part! Since $3$ is a prime number (and for this proof to be valid, $2^{k}-1$ will always be prime) then we know its only factors are $3$ and $1$, which are both already accounted for. This is important, because this means that we will always have $\sigma(2^{k-1})$ "big blocks" on the top, since each big block is $2^{k}-1$. Also,
$\sigma(2^{k-1}) = 2^{k}-1$
which has a simple proof by induction that I show in my previous post.
This is crucial! We have $2^{k}-1$ big blocks, each of which represents the value $2^{k}-1$.
And we will also have $2^{k}-1$ small blocks (which each represent $1$). We start with the first row of big blocks that represents our original number, which corresponds with the "small block" factor of $1$ since
$1 \cdot (2^{k-1})(2^{k}-1) = (2^{k-1})(2^{k}-1)$
The next row we got by dividing the first row in two. This corresponds with the "small block" factor of $2$ since
$2 \cdot \frac{(2^{k-1})(2^{k}-1)}{2} = (2^{k-1})(2^{k}-1)$
And this pattern continues, making the total number of small blocks:
$1 + 2 + ... + 2^{k-1}$
And these are exactly the factors of $2^{k-1}$, so as already pointed out, this means:
$1 + 2 + ... + 2^{k-1} = \sigma(2^{k-1}) = 2^{k}-1$
So now we have $2^{k}-1$ small blocks, which add up to exactly one big block!
Then we now have $2^{k}-1 + 1$ big blocks, which is $2^{k}$. Each big block represents the value $2^{k}-1$. So in total we have:
$(2^{k})(2^{k}-1) = 2 \cdot (2^{k-1})(2^{k}-1)$
Then that means the factors of $(2^{k-1})(2^{k}-1)$ add up to $2 \cdot (2^{k-1})(2^{k}-1)$ which is the definition of a perfect number!
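If you want to check this numerically, here is a rough C++ sketch (my own check, not part of Euclid's or Euler's proofs) that computes $\sigma(n)$ by trial division and confirms that $\sigma(n) = 2n$ for numbers of the form $(2^{k-1})(2^{k}-1)$ with $2^{k}-1$ prime:

#include <cstdint>
#include <iostream>

// Sum of divisors of n, including 1 and n itself.
std::uint64_t sigma(std::uint64_t n)
{
    std::uint64_t sum = 0;
    for (std::uint64_t d = 1; d * d <= n; ++d)
        if (n % d == 0)
        {
            sum += d;
            if (d != n / d) sum += n / d;
        }
    return sum;
}

bool isPrime(std::uint64_t n)
{
    if (n < 2) return false;
    for (std::uint64_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

int main()
{
    for (unsigned k = 2; k <= 13; ++k)
    {
        std::uint64_t m = (1ull << k) - 1;          // 2^k - 1
        if (!isPrime(m)) continue;                  // only Mersenne primes qualify
        std::uint64_t n = (1ull << (k - 1)) * m;    // 2^(k-1) * (2^k - 1)
        std::cout << "k = " << k << ", n = " << n << ", sigma(n) = " << sigma(n)
                  << (sigma(n) == 2 * n ? " (perfect)\n" : " (not perfect)\n");
    }
}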
I will do one more example using $28$ as the perfect number, which may help solidify this explanation.
We start by dividing up $28$ into $2^{k-1}$ equal big blocks, each of which has value $2^{k}-1$. Then we produce the remaining $k-1$ rows by repeatedly halving the previous row.
By the same reasoning as last time, we know we have $2^{k}-1$ small blocks:
Then these small blocks all form one big block:
And these big blocks are guaranteed to add up to twice the original amount:
And to reiterate, the importance of $2^{k}-1$ being prime is that it makes sure there are no other "little block" factors beyond the $2^{k}-1$ we already found. Then the little blocks will always add up to exactly one big block, hence allowing all the factors to add up to twice the value of the perfect number.
This was very hard to explain in writing but I hope it's coherent!
Tuesday, July 9, 2019
Proof Sum of Divisors Function is Multiplicative
I was reading the Wikipedia article on the Euclid-Euler theorem and in the proof section, two truths are mentioned that are necessary for Euler's proof, but neither are proved themselves.
I decided to prove both of these statements as an exercise and so that I may better understand Euler's proof.
The first of two statements is the following:
"Euler's proof is short and depends on the fact that the sum of divisors function $\sigma$ is multiplicative; that is, if $a$ and $b$ are any two relatively prime integers, then $\sigma(ab) = \sigma(a)\sigma(b)$."The sum of divisors function $\sigma$ sums all factors of its input. For example,
$\sigma(6) = 1 + 2 + 3 + 6 = 12$
The numbers summing to $12$ are all the positive integers that divide $6$.
Let $a$ and $b$ be two co-prime integers. Then we have:
$\sigma(a) = 1 + a_1 + a_2 + ... + a_n$
$\sigma(b) = 1 + b_1 + b_2 + ... + b_m$
Where $a_i$ and $b_j$ represent the factors besides $1$ (for prime numbers, the only such factor is the number itself), and $a$ and $b$ need not have the same number of factors. Now I will define two new variables:
$A = \sigma(a) - 1$
$B = \sigma(b) - 1$
$A$ is the sum of all the factors of $a$ that are not factors of $b$, and likewise for $B$. Since $a$ and $b$ are co-prime, we only need to subtract $1$ from each, since that's the only factor they share.
Now consider the product $ab$. What will its factors be? Well, it will inherit all the factors of $a$ and all the factors of $b$, since anything that can divide $a$ can also divide $ab$. Also, $ab$ will be divisible by any product $a_i b_j$, that is, any factor from $a$ times any factor from $b$.
Thanks to the distributive rules of multiplication, we can express the sum of all factors of $ab$ that can be written as $a_i b_j$ by the product:
$AB = (a_1 + a_2 + ... + a_n)(b_1 + b_2 + ... + b_m)$
$= a_1 b_1 + a_1 b_2 + ... + a_1 b_m\\
+ a_2 b_1 + a_2 b_2 + ... + a_2 b_m\\
+ ...\\
+ a_n b_1 + a_n b_2 + ... + a_n b_m$
This results in every product of a factor of $a$ (other than $1$) with a factor of $b$ (other than $1$), all summed together.
And now we have all the information we need! Consider:
$\sigma(ab) = 1 + A + B + AB$
This must be true: since $a$ and $b$ are co-prime, every factor of $ab$ can be written in exactly one way as a factor of $a$ times a factor of $b$, so the terms $1$, $A$, $B$, and $AB$ account for every factor of $ab$ exactly once, following the definition of the $\sigma$ function. And expressing it in this form makes it obvious that:
$\sigma(a)\sigma(b) = (1 + A)(1 + B) = 1 + A + B + AB = \sigma(ab)$
And so the $\sigma$ function is multiplicative.
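Here is a small brute-force C++ sketch (a check, not a proof) that verifies $\sigma(ab) = \sigma(a)\sigma(b)$ for every co-prime pair with $a, b \le 30$:

#include <iostream>
#include <numeric>   // std::gcd

// Sum of divisors of n, including 1 and n itself.
unsigned long long sigma(unsigned long long n)
{
    unsigned long long sum = 0;
    for (unsigned long long d = 1; d <= n; ++d)
        if (n % d == 0) sum += d;
    return sum;
}

int main()
{
    for (unsigned long long a = 1; a <= 30; ++a)
        for (unsigned long long b = 1; b <= 30; ++b)
        {
            if (std::gcd(a, b) != 1) continue;   // only co-prime pairs
            if (sigma(a * b) != sigma(a) * sigma(b))
                std::cout << "counterexample: a = " << a << ", b = " << b << '\n';
        }
    std::cout << "done (no output above means the identity held)\n";
}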
The second statement that the Wikipedia article mentions is the following:
"The sum of divisors for $2^{p − 1}$ is $2^{p} − 1$, e.g. for $p=5$ we get $16$ $(2^{5 − 1})$ and the sum of its divisors is $31$ $(1 + 2 + 4 + 8 + 16 = 31 = 2^{p} − 1)$."In the article, $p$ is any prime number, however for this particular statement it's not necessary that $p$ be prime, as you will see.
Consider the number $2^{p}$. Conveniently, this number is already written as its prime factorization, since $2$ is a prime number. We can easily recover all of a number's factors from its prime factorization by multiplying together every combination of its prime factors.
In this case we get:
$2^{0}, 2^{1}, 2^{2}, ... , 2^{p} = 1, 2, 4, ... , 2^{p}$
Now let's take the sum of all the factors:
$\sigma(2^{p}) = 1 + 2 + 4 + ... + 2^{p}$
Notice that:
$\sigma(2^{0}) = 2^{1} - 1$
So the claim $\sigma(2^{k-1}) = 2^{k} - 1$ holds in the base case $k = 1$. Now suppose that for some value of $k$ it's true that:
$\sigma(2^{k-1}) = 1 + 2 + 4 + ... + 2^{k-1} = 2^{k} - 1$
Then with that assumption we can prove that it's true for $k + 1$:
$\sigma(2^{k}) = 1 + 2 + 4 + ... + 2^{k-1} + 2^{k}$
$= (2^{k} - 1) + 2^{k}$
$= 2(2^{k}) - 1$
$= 2^{k+1} - 1$
Then by induction it's true for all integers $p \geq 1$ that $\sigma(2^{p-1}) = 2^{p} - 1$.
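And here is a tiny C++ sketch of that fact, just summing the divisors $1, 2, 4, \ldots, 2^{p-1}$ and comparing against $2^{p}-1$:

#include <iostream>

int main()
{
    // sigma(2^(p-1)) = 1 + 2 + 4 + ... + 2^(p-1), which should equal 2^p - 1.
    for (unsigned p = 1; p <= 16; ++p)
    {
        unsigned long long sum = 0;
        for (unsigned i = 0; i < p; ++i)   // the divisors of 2^(p-1) are 2^0, ..., 2^(p-1)
            sum += 1ull << i;
        std::cout << "p = " << p << ": sum of divisors = " << sum
                  << ", 2^p - 1 = " << ((1ull << p) - 1) << '\n';
    }
}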
The Sum of Cubes and the Triangle Numbers
Update: I created an animated version of this same proof. Check it out on my YouTube channel!
Earlier today I came across the summation:
$\sqrt{1^3 + 2^3 + ... + n^3}$
After calculating the sum for some low values of $n$, it seemed as though this summation was exactly equivalent to the following:
$1 + 2 + ... + n$
The latter summation is the definition of the $n$th triangle number. In other words, the triangle number of the positive integer $n$ is the sum of all positive integers less than or equal to $n$.
I was very perplexed to see this, and needed some proof and reasoning of why this should be true.
And this is where it gets fun!
Let us first see how this is true geometrically for small values of $n$, then we can generalize from there.
Consider the following image:
We start off with one cube (right). Now let us see if we can arrange it into a square (left). Imagine the square view is just viewing the cube from the top down.
Alright, that was easy. Since $1$ is a perfect square, we can take its square root and get an integer value, showing that
$\sqrt{1^3} = 1$
which so far is consistent with the triangle numbers, since the first triangle number is also $1$. Don't worry if this doesn't make sense yet, because we haven't really done anything.
Let's make things a little more interesting by looking at the summation for $n=2$
The left view represents the sum so far. We started with $1^3$ which is simply $1$, so we've drawn one box to the left. Now we must add $2^3$, which is pictured to the right. To visualize this I will start by moving only one layer of the cube over to the left.
And now it's pretty apparent that we can move the remaining $4$ cubes over to the left side to make another perfect square.
Great! Our assumption that the square root of the sum of cubes is equivalent to the triangle numbers still holds, because
$\sqrt{1^3 + 2^3} = 1 + 2 = 3$
Now we're ready to generalize this, so we can know that this is true for any positive integer $n$.
Once again, the left view represents the sum so far (which we know is a perfect square), and the right view represents the $n$th cube that we're adding.
We always start by adding only one layer of the cube to our sum. Notice that every layer of $n^3$ is composed of exactly $n^2$ unit cubes. This means that there are $n^3 - n^2$ remaining unit cubes on the right that we somehow need to fit into the square on the left. We've done it for $n=1$ and $n=2$ but now we must find a way to prove we can make a perfect square for any positive integer $n$.
I notice that there are two rectangles missing from the square. One side length is simply $n$ since that's the side length of the square we've just added. The other side length is the sum of the side lengths of all the previous cubes we've added. In other words, it's the triangle number of $n-1$. We can be sure of this, since we always construct the next perfect square by placing one layer of the cube diagonally, ensuring that each side length of the square increases by the side length of the cube.
And in fact, there's a cool trick for deriving an equation for the $n$th triangle number. Consider:
This schematic demonstrates adding up the numbers from $1$ to $n$ to calculate the $n$th triangle number. Nothing too special. But, if we double that triangle we made and rotate it...
Amazing! We've made a rectangle that we know has the area of twice the $n$th triangle number. So we can just divide the area by $2$ and we get the equation:
$\frac{n(n+1)}{2}$
And what we actually need is the triangle number for $n-1$, which comes out to be:
$\frac{n(n-1)}{2}$
And with that, we've found the area of the missing rectangles!
Notice that the area of both missing rectangles combined is $2 \cdot n \cdot \frac{n(n-1)}{2} = n^3 - n^2$, which is exactly the remaining number of unit cubes on the right! This means we know we can make a perfect square for any value of $n$. And what's more, we know the side length of the perfect square is the $n$th triangle number, thanks to our sneaky method of constructing the perfect square starting with one layer of the cube.
This shows that the square root of the sum of the first $n$ cubes is in fact equivalent to the $n$th triangle number! Wow, I love math.
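For anyone who wants a numerical check alongside the picture, this short C++ sketch confirms that the sum of the first $n$ cubes equals the square of the $n$th triangle number:

#include <iostream>

int main()
{
    unsigned long long sumOfCubes = 0;
    for (unsigned long long n = 1; n <= 20; ++n)
    {
        sumOfCubes += n * n * n;                         // 1^3 + 2^3 + ... + n^3
        unsigned long long triangle = n * (n + 1) / 2;   // the nth triangle number
        std::cout << "n = " << n << ": sum of cubes = " << sumOfCubes
                  << ", triangle^2 = " << triangle * triangle << '\n';
    }
}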
I would not be surprised if this proof has been shown elsewhere, but I did discover this proof independently and had fun doing it! Now time to sleep.
Thursday, June 20, 2019
How Quaternions Produce 3D Rotation
Edit: I now have an animated video where I explain this topic. Check it out!
Before reading this post, I absolutely recommend watching 3Blue1Brown's video on quaternions and then experimenting with them on his and Ben Eater's interactive website. I consider this post to just be supplemental to their excellent explanation.
So then, how can quaternions be used to produce 3D rotations that could be used in video games, for example?
Quaternions are a 4-dimensional extension of the complex numbers.
Complex numbers are of the form:
$a + bi$
Where $a, b \in \mathbb{R}$ and $i$ is the imaginary unit.
Quaternions on the other hand, are of the form:
$a + bi + cj + dk$
where $a, b, c, d \in \mathbb{R}$ and $i$, $j$, and $k$ are the fundamental quaternion units.
So complex numbers can be thought of as quaternions which have $c = d = 0$, similarly to how real numbers can be thought of as complex numbers with $b = 0$.
Multiplication of two unit complex numbers will result in a pure rotation, which I discuss in more detail in my last post. Similarly, multiplication of two unit quaternions results in another unit quaternion, but the wacky 4D rotation will not directly translate to a 3D rotation because when we project it back into 3D it does not preserve the lengths of the original 3D vectors, so the space comes out looking... Well, wacky!
If you've ever read anything on quaternions, you are probably familiar with their rules of multiplication:
$i^2 = j^2 = k^2 = -1$
$ij = -ji = k$
$jk = -kj = i$
$ki = -ik = j$
These rules don't make it immediately obvious what happens rotation-wise when two quaternions are multiplied. Note though that two distinct fundamental quaternion units do not commute (one product is the exact negative of the other), but a fundamental quaternion unit multiplying itself or a real number does commute. This will be important later.
The diagram above is the same stereographic projection that Grant used in his video. Look at this diagram and imagine left-multiplying by $i$. It rotates the circle passing through $1$ and $i$ as well as rotating the perpendicular circle that passes through $j$ and $k$ such that:
$1 \mapsto i$
$j \mapsto k$
Take a minute to understand and visualize that.
Now imagine left-multiplying by $-i$. It rotates the same two circles in the opposite direction than positive $i$ did, such that:
$1 \mapsto -i$
$j \mapsto -k$
This shouldn't be too surprising. But now instead of left-multiplying, imagine right-multiplying by $-i$! Well, remember, right-multiplying rotates in the opposite direction from left-multiplying only when you're multiplying two different fundamental quaternion units. So the circle that passes through $1$ and $i$ will be unaffected by switching to right-multiplication, but the circle passing through $j$ and $k$ will now rotate the opposite direction!
$1 \mapsto -i$
$j \mapsto k$
Hmm... Let's try first right-multiplying by $-i$ then left-multiplying the result of that by $i$, so that our equation looks like $iv(-i)$ where $v$ is some quaternion from the picture.
$1 \mapsto -i \mapsto 1$
$j \mapsto k \mapsto -j$
This is very important what just happened! The rotation of the circle passing through $1$ and $i$ has completely cancelled out, while the rotation of the circle passing through $j$ and $k$ has doubled! We have effectively created a rotation only about the i-axis!
We can generalize this to create a pure rotation about any unit vector that is a combination of $i$, $j$, and $k$! But we're gonna need some pictures.
Notice in the picture we first left-multiply by quaternion $q$ (in our example, $q=i$) which creates two perpendicular rotations, both of $\theta$ radians. Then we right-multiply by $\bar{q}$ (conjugate of $q$, which in our example was $-i$. More on this in a moment) which also produces two rotations of $\theta$ radians, but only one of them is opposite of the rotations produced by $q$. Then at last I show "sandwiching" $v$ between $q$ and $\bar{q}$ to produce the single rotation by $2\theta$ radians.
Now let's learn how this works with any unit quaternion $q$ rather than just sticking with our simple example of $q=i$ and $\bar{q}=-i$. Also I will discuss what the conjugate of $q$ is.
First, I'm going to attempt to visualize the 4D unit hypersphere because I think it will be helpful in explanations.
Pay attention to the red region. Since I obviously cannot create a 4D image, I have squished the 3-dimensional ijk space down into a flat circle. The real axis is perpendicular to this space. But remember that the red region is actually 3D, and the 2D appearance is only an artifact of the schematic.
It will be useful to differentiate between the real and vector part of the quaternion. The vector part of a quaternion is the component that lies in the ijk space, and the real part lies on the real axis.
I should mention that when you have a 3D vector $v$ (for rendering graphics for example), and want to rotate this point using quaternion multiplication, you must specify this vector as a "pure quaternion," that is, as a quaternion whose real part is $0$.
So if $v$ has components $x, y, z$ in 3D space, then to specify it as a quaternion, you would write $v = 0 + xi + yj + zk$. Then the multiplication $qv\bar{q}$ will yield a new pure quaternion if $q$ is a unit quaternion. This new pure quaternion represents the point $v$ after the rotation has been applied.
Now back to the question of how to find a quaternion $q$ that will perform the desired rotation. Think back to complex numbers. To specify a complex number $z$ on the unit circle that makes some angle $\theta$ with the horizontal, you can write:
$z = \cos{\theta} + i\sin{\theta}$
Increasing $\theta$ will rotate a point around the circle passing through $1$ and $i$. But instead of multiplying $\sin{\theta}$ by $i$, we could multiply it by any unit vector that is orthogonal to the real axis and we would still get a unit circle. So let's do exactly that.
$q = \cos{\theta} + \sin{\theta}(xi + yj + zk)$
We are multiplying $\sin{\theta}$ by the vector part of a quaternion! And in fact, the whole equation describes a single quaternion whose real part is $\cos{\theta}$ and vector part is $\sin{\theta}(xi + yj + zk)$.
Now instead of making a circle that passes through $1$ and $i$ like complex numbers, we have made a circle that passes through $1$ and any arbitrary point on the 3D unit sphere (3D because there are only 3 components in the vector part of the quaternion that is multiplying $\sin{\theta}$). But remember that this point in 3D space is entirely orthogonal to the real axis (it is the red region in the first schematic), so we're still forming a perfect unit circle assuming that $x^2 + y^2 + z^2 = 1$.
And this also guarantees that the whole quaternion will still have length $1$ since $\cos^2{\theta} + \sin^2{\theta} = 1$ is always true.
Now think back to the quirk of quaternion multiplication. It will rotate two circles: the one passing through $1$ and the vector $xi + yj + zk$, and the perpendicular circle to that. And if we were to negate the vector part of $q$ (which results in the conjugate of $q$, denoted $\bar{q}$), both circles would rotate the opposite direction. But like before, if we right multiply by $\bar{q}$ instead of left multiply, we will only negate the rotation direction of the one circle passing through $1$ and the vector part.
Just like before, we can then multiply $qv\bar{q}$ where $v$ is a pure quaternion, both opposite rotations that pass through $1$ and the vector part will cancel out, and the rotation orthogonal to that will be doubled! But the rotation orthogonal to that will be entirely in the ijk space since it's orthogonal to the real axis. In fact, it will be a rotation exactly around the vector $xi + yj + zk$ since the circle must also be orthogonal to that vector! So we know exactly what rotation this multiplication will produce, and it occurs entirely in the ijk space which is a 3-dimensional space. This means we can specify any 3D vector we want in the ijk space and rotate it about any axis also specified in the ijk space!
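To make this concrete, here is a minimal C++ sketch of the sandwich product $qv\bar{q}$. The Quat type and its multiplication operator are my own stand-ins (not from any particular library); the example rotates the point $(1, 0, 0)$ about the $k$-axis with $\theta = \pi/4$, which by the angle doubling described above gives a $90$ degree rotation:

#include <cmath>
#include <iostream>

// Minimal quaternion type: w + x*i + y*j + z*k.
struct Quat { double w, x, y, z; };

// Hamilton product, implementing i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j.
Quat operator*(const Quat& a, const Quat& b)
{
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
    };
}

Quat conjugate(const Quat& q) { return { q.w, -q.x, -q.y, -q.z }; }

int main()
{
    // q = cos(theta) + sin(theta)*k rotates by 2*theta about the k-axis,
    // so theta = pi/4 produces a 90 degree rotation.
    const double theta = std::acos(-1.0) / 4.0;   // pi / 4
    Quat q{ std::cos(theta), 0.0, 0.0, std::sin(theta) };

    Quat v{ 0.0, 1.0, 0.0, 0.0 };   // the point (1, 0, 0) as a pure quaternion

    Quat rotated = q * v * conjugate(q);
    std::cout << "rotated: (" << rotated.x << ", " << rotated.y << ", "
              << rotated.z << ")\n";   // expect approximately (0, 1, 0)
}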
A side effect of this doubling of the angles is that there are two ways to specify any orientation in 3D space using a quaternion and its conjugate. Specifically, $-q$ produces the exact same orientation as $q$. Negating a unit quaternion corresponds to adding a 180 degree rotation, but the sandwich product doubles this angle, resulting in an extra 360 degree rotation, which is the same orientation!
This property becomes especially useful when you want to interpolate between two orientations. It allows the programmer to choose whether to take the long path or short path to rotate to the new orientation. The illustration above shows the difference in path lengths interpolating from $\frac{\pi}{2}$ to $2\pi$. If you would like to take the shorter path to $2\pi$, simply negate $q$ before interpolating.
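In code, that choice can be as simple as checking the sign of the 4D dot product before interpolating. A hedged sketch (again with a made-up Quat type, not any particular engine's API):

#include <cmath>
#include <iostream>

struct Quat { double w, x, y, z; };

// Before interpolating from orientation a to orientation b, flip the sign of b
// when the 4D dot product is negative. Since q and -q represent the same 3D
// orientation, this chooses the shorter of the two interpolation paths.
Quat pickShorterPath(const Quat& a, Quat b)
{
    double dot = a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
    if (dot < 0.0) { b.w = -b.w; b.x = -b.x; b.y = -b.y; b.z = -b.z; }
    return b;
}

int main()
{
    Quat a{ 1.0, 0.0, 0.0, 0.0 };                        // the identity orientation
    Quat b{ -std::cos(0.1), 0.0, 0.0, -std::sin(0.1) };  // a small rotation, written the "long way round"
    Quat c = pickShorterPath(a, b);
    std::cout << "interpolate toward (" << c.w << ", " << c.x << ", "
              << c.y << ", " << c.z << ") instead\n";
}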
Hopefully this has been insightful. Another thanks to 3Blue1Brown for making such an amazing introduction to quaternions which I still recommend you look at if you haven't yet.
Monday, June 17, 2019
Geometric Interpretation of Complex Multiplication
You probably know that multiplication of two complex numbers results in a third complex number whose angle with the horizontal can be found by adding the two input angles, and whose magnitude can be found by multiplying the two input magnitudes.
It works out really nicely but... Why? How does this geometric interpretation come about from the simple definition that $i=\sqrt{-1}$?
Let $z$ be a complex number of the form $a + bi$ where $a, b \in \mathbb{R}$, and $i$ is the imaginary unit.
The magnitude of $z$ is $\sqrt{a^2 + b^2}$. We will call this $r$ for radius.
$r = ||z|| = \sqrt{a^2 + b^2}$
We can now divide $z$ by $r$ resulting in a unit vector (or unit complex number I guess).
$z = r\frac{z}{r} = r(\text{unit vector})$
Now let's think of a way to rewrite "unit vector" because it's not doing us much good right now... Well, the point $(\cos{\theta}, \sin{\theta})$ is guaranteed to lie on the unit circle, so this is promising! Of course we're dealing with the complex plane, so we'll multiply $\sin{\theta}$ by $i$ so that $\sin{\theta}$ will represent the imaginary component which is perpendicular to the real component.
Rewriting our new and improved equation, we get:
$z = r\frac{z}{r} = r(\cos{\theta} + i\sin{\theta})$
Ok great! We've written it in a form where the magnitude and direction are separated so it's easy to keep track of each one. Maybe this will let us discover what happens to magnitude and direction during multiplication!
Let's go ahead and multiply two complex numbers written in this form and see what happens:
$r_1(\cos{\theta} + i\sin{\theta}) \cdot r_2(\cos{\beta}+ i\sin{\beta})$
$= r_1 r_2(\cos{\theta} + i\sin{\theta})(\cos{\beta} + i\sin{\beta})$
$= r_1 r_2(\cos{\theta}\cos{\beta} + i\cos{\theta}\sin{\beta} + i\sin{\theta}\cos{\beta} - \sin{\theta}\sin{\beta})$
$= r_1 r_2(\cos{\theta}\cos{\beta} - \sin{\theta}\sin{\beta} + i(\sin{\theta}\cos{\beta} + \cos{\theta}\sin{\beta}))$
It's pretty obvious that the magnitudes are being multiplied which explains half of the puzzle! But it looks like the angles have become a little... messy.
Let's take an intermission from this problem, and give linear algebra a quick visit. In fact let's just derive the angle addition formula, just a totally random unrelated activity to get our minds off complex numbers.
Recall the 2D rotation matrix:
$\left[\begin{matrix}\cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta}\end{matrix}\right]$
So then if we first rotate by angle $\beta$ followed by a rotation of angle $\theta$, we can write this either as a rotation by angle $\theta + \beta$ or as the multiplication of two matrices representing the rotation of angle $\beta$ and $\theta$ respectively:
$\left[\begin{matrix}\cos(\theta + \beta) & -\sin(\theta + \beta)\\ \sin(\theta + \beta) & \cos(\theta + \beta)\end{matrix}\right]$
$= \left[\begin{matrix}\cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta}\end{matrix}\right] \left[\begin{matrix}\cos{\beta} & -\sin{\beta}\\ \sin{\beta} & \cos{\beta}\end{matrix}\right]$
$= \left[\begin{matrix}\cos{\theta}\cos{\beta} - \sin{\theta}\sin{\beta} & -\cos{\theta}\sin{\beta} - \sin{\theta}\cos{\beta}\\ \sin{\theta}\cos{\beta} + \cos{\theta}\sin{\beta} & -\sin{\theta}\sin{\beta} + \cos{\theta}\cos{\beta}\end{matrix}\right]$
Therefore,
$\cos(\theta + \beta) = \cos{\theta}\cos{\beta} - \sin{\theta}\sin{\beta}$
$\sin(\theta + \beta) = \sin{\theta}\cos{\beta} + \cos{\theta}\sin{\beta}$
We just derived the angle addition formulas!! And it looks like we can use it to make some substitutions where we left off on our complex number multiplication!
...
$= r_1 r_2(\cos{\theta}\cos{\beta} - \sin{\theta}\sin{\beta} + i(\sin{\theta}\cos{\beta} + \cos{\theta}\sin{\beta}))$
$= r_1 r_2(\cos(\theta + \beta) + i\sin(\theta + \beta))$
And that's it! We can now see clearly that multiplying two complex numbers will result in a new complex number whose magnitude is multiplying the input magnitudes, and whose angle is the addition of the input angles.
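You can also watch this happen numerically with std::complex: the sketch below builds two complex numbers from magnitudes and angles, multiplies them, and compares the product's magnitude and angle against $r_1 r_2$ and $\theta + \beta$ (the particular values of $r_1$, $r_2$, $\theta$, and $\beta$ are arbitrary):

#include <complex>
#include <iostream>

int main()
{
    double r1 = 2.0, theta = 0.4;
    double r2 = 3.0, beta  = 1.1;

    std::complex<double> z1 = std::polar(r1, theta);   // r1 * (cos(theta) + i*sin(theta))
    std::complex<double> z2 = std::polar(r2, beta);
    std::complex<double> product = z1 * z2;

    // The product's magnitude should be r1*r2 and its angle theta + beta.
    std::cout << "magnitude: " << std::abs(product) << " vs " << r1 * r2 << '\n';
    std::cout << "angle:     " << std::arg(product) << " vs " << theta + beta << '\n';
}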
This can also be shown using Euler's Formula:
$e^{ix} = \cos{x}+i\sin{x}$
Just multiply by some radius so you're no longer restricted to unit vectors, multiply two of them together and watch the magic.
$r_1 e^{i\theta} \cdot r_2 e^{i\beta} = r_1 r_2 e^{i(\theta + \beta)} = r_1 r_2(\cos(\theta + \beta) + i\sin(\theta + \beta))$
Euler's Formula also gives rise to the ubiquitous Euler's Identity: $e^{i\pi}=-1$.
There is much more to be said about Euler's Formula and its derivation, but this post is long enough.
Thursday, May 23, 2019
Mandelbrot and Julia Fractals
The Mandelbrot set and the Julia sets are very similar in the way they're defined. Both are generated iteratively by the equation:
$z_{n+1} = z_{n}^2 + c$
Where $z$ and $c$ are complex numbers.
The Mandelbrot set is generated by starting with $z_{0}=0$ and varying $c$. Each $c$ for which the iteration remains bounded is part of the set, while each $c$ that causes it to diverge to infinity is not part of the set.
The Julia sets are similar to the Mandelbrot set, except $z_{0}$ is varied, and $c$ is constant. Each $c$ will create a unique Julia set.
Both sets can be visualized by treating each pixel in an image as a complex number where the horizontal axis is the real component of a complex number, and the vertical axis is the imaginary component. Each pixel will represent $c$ in the Mandelbrot set and $z_{0}$ in the Julia sets.
Here is an image of the Mandelbrot set that I generated using a small program I wrote in C++
Here is the most relevant function:
std::vector<unsigned char> generateMandelbrot(const unsigned width, const unsigned height, int iterations,
                                              double scale, double realAnim, double imagAnim)
{
    int offset{ -static_cast<int>(width) / 4 };
    int maxX{ static_cast<int>(width / 2) + offset }, minX{ -static_cast<int>(width / 2) + offset };
    int maxY{ static_cast<int>(height / 2) }, minY{ -maxY };

    std::vector<unsigned char> img(width * height * 4, 255);

    for (int y{ minY }; y < maxY; ++y)
        for (int x{ minX }; x < maxX; ++x)
        {
            Complex z{ realAnim, imagAnim };
            Complex c{ x / (scale * 500), y / (scale * 500) };

            int i;
            for (i = 0; i < iterations; ++i)
            {
                z = z * z + c;
                if (complexLength(z) > 2)
                    break;
            }

            unsigned index{ 4 * width * (y + height / 2) + 4 * (x + width / 2 - offset) };
            addRGBA(img, index, static_cast<double>(i) / iterations);
        }

    return img;
}
Here are two Julia fractals:
And here's the function I wrote for the Julia fractals:
std::vector<unsigned char> generateJulia(const unsigned width, const unsigned height, int iterations, double scale, int offset, const Complex& c)
{
    int maxX{ static_cast<int>(width / 2) + offset }, minX{ -static_cast<int>(width / 2) + offset };
    int maxY{ static_cast<int>(height / 2) }, minY{ -maxY };
    std::vector<unsigned char> img(width * height * 4, 255);  // RGBA image, initially white

    for (int y{ minY }; y < maxY; ++y)
        for (int x{ minX }; x < maxX; ++x)
        {
            // For a Julia set, each pixel is a value of z0, while c is fixed for the whole image.
            Complex z{ x / (scale * 500), y / (scale * 500) };

            // Iterate z = z^2 + c until it escapes the radius-2 circle or we give up.
            int i;
            for (i = 0; i < iterations; ++i)
            {
                z = z * z + c;
                if (complexLength(z) > 2)
                    break;
            }

            // Color the pixel based on how quickly the point escaped.
            unsigned index{ 4 * width * (y + height / 2) + 4 * (x + width / 2 - offset) };
            addRGBA(img, index, static_cast<double>(i) / iterations);
        }

    return img;
}
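Both functions rely on a small Complex type along with complexLength and addRGBA helpers that aren't shown here. Here is a minimal sketch of what they might look like; these are my guesses at reasonable implementations, not necessarily the exact helpers the images above were generated with.

#include <cmath>
#include <vector>

// Hypothetical sketch of the helpers used by generateMandelbrot and generateJulia.
struct Complex
{
    double re{}, im{};
};

// (a + bi)(c + di) = (ac - bd) + (ad + bc)i
Complex operator*(const Complex& a, const Complex& b)
{
    return { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
}

Complex operator+(const Complex& a, const Complex& b)
{
    return { a.re + b.re, a.im + b.im };
}

double complexLength(const Complex& z)
{
    return std::sqrt(z.re * z.re + z.im * z.im);
}

// Shade a pixel based on how quickly the iteration escaped (t in [0, 1]).
void addRGBA(std::vector<unsigned char>& img, unsigned index, double t)
{
    const unsigned char shade{ static_cast<unsigned char>(255 * (1.0 - t)) };
    img[index] = shade;      // R
    img[index + 1] = shade;  // G
    img[index + 2] = shade;  // B
    img[index + 3] = 255;    // A (fully opaque)
}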
I was curious what it would look like to vary some of the parameters for Mandelbrot and Julia fractals so I made an animation of it and put it on my new YouTube channel. It turned out pretty cool, take a look:
Saturday, May 4, 2019
Why are V and U orthogonal in singular-value decomposition?
Singular-value decomposition is the process of factoring a matrix $A$ into $A = U\Sigma V^T$, where $\Sigma$ is a diagonal matrix that contains the singular values.
$V$ is a matrix whose columns are the eigenvectors of $A^TA$. A useful property of this decomposition is that $V$ and $U$ are both guaranteed to be orthogonal matrices. So why is this true?
First, it is helpful to know that symmetric matrices always have orthogonal eigenvectors for distinct eigenvalues. This can be shown:
For $\lambda_1 \neq \lambda_2$ and a symmetric matrix $B$ with eigenvectors $\vec{x}$ and $\vec{v}$ corresponding to $\lambda_1$ and $\lambda_2$ respectively,
$\lambda_1 \vec{x} \cdot \vec{v} = B \vec{x} \cdot \vec{v} = (\vec{x}^T B^T)\vec{v} = \vec{x}^T B\vec{v} = \lambda_2 \vec{x}^T \vec{v} = \lambda_2 \vec{x} \cdot \vec{v}$
Since $\lambda_1 \neq \lambda_2$, this forces $\vec{x} \cdot \vec{v} = 0$; they are orthogonal. This argument only works for distinct eigenvalues of $B$. How might you show that repeated eigenvalues also admit orthogonal eigenvectors for symmetric matrices?
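For a concrete example, the symmetric matrix
$B = \left[\begin{matrix}2 & 1\\ 1 & 2\end{matrix}\right]$
has eigenvalues $\lambda_1 = 3$ and $\lambda_2 = 1$ with eigenvectors $\left[\begin{matrix}1\\ 1\end{matrix}\right]$ and $\left[\begin{matrix}1\\ -1\end{matrix}\right]$, and sure enough, $\left[\begin{matrix}1\\ 1\end{matrix}\right] \cdot \left[\begin{matrix}1\\ -1\end{matrix}\right] = 0$.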
This explains why $V$ is an orthogonal matrix, since we will choose to make $V$ from eigenvectors of $A^T A$ with length 1, and $A^T A$ is symmetric.
So what happens when we multiply some vector from $V$ by $A$?
$A \vec{v} = \vec{u}$
$A^T A \vec{v} = A^T \vec{u}$
$\lambda \vec{v} = A^T \vec{u}$
$\lambda A \vec{v} = AA^T \vec{u} = \lambda \vec{u}$
So then $\vec{u}$ is an eigenvector of $AA^T$, which is also symmetric, so we know all of its eigenvectors are orthogonal as well. Notice that $A^T A$ and $AA^T$ have the same eigenvalues; see the previous post for an illustration of this.
We will put all the $\vec{u}$ vectors into a matrix called $U$, and factor out the length of each vector into the diagonal matrix $\Sigma$. We factor out the lengths so that each $\vec{u}$ will have length 1, making $U$ an orthogonal matrix.
We found that the orthogonal matrix $V$ will be transformed into the still-orthogonal $U \Sigma$.
$AV = U \Sigma$
So why is this a desirable property for the singular value decomposition?
Well, here's one way that I think about it.
$A = U \Sigma V^T = \sigma_1 \vec{u}_1 \vec{v}_1^T + ... + \sigma_n \vec{u}_n \vec{v}_n^T$
In this form, it becomes clear that each singular value corresponds to a rank 1 matrix, and these rank 1 matrices add up to form the complete $A$. In some sense, each singular value represents the "importance" or amount of influence that its rank 1 matrix has on $A$.
Since the columns of $U$ and $V$ are unit vectors, we know that every element of each rank 1 matrix $\vec{u}_i \vec{v}_i^T$ will be in the interval $[-1, 1]$. So the singular value multiplying each of them tells the "importance" of that particular rank 1 matrix, or how much influence it has on the matrix $A$.
Having $U$ and $V$ be orthogonal matrices gives us more confidence that the singular values express the actual importance of that rank 1 matrix since we know its contribution will not be "undone" by one of the other rank 1 matrices.
To show what I mean by "undone," let me illustrate adding two non-orthogonal bases to express point $P$.
Note the area in red that I've marked with a sad face. Since $\vec{v}_1$ and $\vec{v}_2$ are not orthogonal, there is some redundancy in the direction that they encode. As a consequence, both $\vec{v}_1$ and $\vec{v}_2$ must have larger coefficients to express point $P$, and those coefficients aren't as meaningful because some of the magnitude of each of the vectors only "cancels out" that of the other, meaninglessly inflating the coefficients.
On the other hand, here is the same point $P$ expressed using an orthogonal basis:
Now none of $\vec{v}_1$ or $\vec{v}_2$'s magnitude is cancelling out the other; each coefficient tells us information solely about $P$. In this case, both coefficients decreased, but this won't necessarily be true when switching to an orthogonal basis. The point is that with an orthogonal basis, the coefficients carry real meaning, so we can make better-informed choices about interpreting them.
In the case of singular value decomposition, it is important to know that the coefficients actually tell us how much that rank 1 matrix contributes to the final matrix. One application of this is image compression. If we know certain rank 1 matrices don't contribute much to the final image, we can discard them. This means the image will require less information to encode with little sacrifice to its visual quality.
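Here is a small sketch of that reconstruction idea in code, using made-up 2x2 values for the singular values and vectors rather than a real SVD of an image, just to show how keeping only the dominant rank 1 term works:

#include <iostream>

int main()
{
    // Made-up singular values and orthonormal vectors for a 2x2 example;
    // a real application would get these from an actual SVD of image data.
    const double sigma[2]{ 10.0, 0.1 };
    const double u[2][2]{ { 1.0, 0.0 }, { 0.0, 1.0 } };   // u_1 and u_2, stored as rows here
    const double v[2][2]{ { 0.6, 0.8 }, { -0.8, 0.6 } };  // v_1 and v_2, stored as rows here

    const int k{ 1 };  // keep only the dominant rank 1 term
    double approx[2][2]{};

    // approx = sum over the first k terms of sigma_i * u_i * v_i^T
    for (int t = 0; t < k; ++t)
        for (int row = 0; row < 2; ++row)
            for (int col = 0; col < 2; ++col)
                approx[row][col] += sigma[t] * u[t][row] * v[t][col];

    // Because sigma_2 is tiny compared to sigma_1, dropping its rank 1 term
    // barely changes the reconstructed matrix.
    for (int row = 0; row < 2; ++row)
        std::cout << approx[row][0] << ' ' << approx[row][1] << '\n';
}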
Monday, April 22, 2019
Why do ATA and AAT have the same eigenvalues?
Why is it that $A^TA$ and $AA^T$ have the same non-zero eigenvalues? A symbolic proof is not hard to find, but as usual, I prefer to find a way to visualize it in order to gain a better mathematical intuition.
Let $\vec{x}$ be an eigenvector of $A^TA$.
We start with vector $\vec{x}$. $A$ transforms $\vec{x}$ into some arbitrary vector $A\vec{x}$. This is multiplied by $A^T$ resulting in $A^TA\vec{x}$. But remember, we defined $\vec{x}$ as an eigenvector of $A^TA$, so by definition $A^TA\vec{x} = \lambda \vec{x}$.
Now we're almost back to where we started, except $\vec{x}$ is being multiplied by a scalar! So if $\lambda \vec{x}$ undergoes another linear transformation, the result will be $\lambda$ times the transformation of $\vec{x}$.
So what if we choose to multiply $\lambda \vec{x}$ by $A$? We get $\lambda A \vec{x}$. But to get to this point, we multiplied $A \vec{x}$ by $AA^T$.
This means that $A \vec{x}$ is an eigenvector of $AA^T$ with eigenvalue $\lambda$! This is the same eigenvalue that we found by multiplying $\vec{x}$ by $A^TA$!
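As a small concrete check, take
$A = \left[\begin{matrix}1 & 0\\ 1 & 1\end{matrix}\right]$, which gives $A^TA = \left[\begin{matrix}2 & 1\\ 1 & 1\end{matrix}\right]$ and $AA^T = \left[\begin{matrix}1 & 1\\ 1 & 2\end{matrix}\right]$.
Both have characteristic polynomial $\lambda^2 - 3\lambda + 1$, so they share the same eigenvalues $\frac{3 \pm \sqrt{5}}{2}$ even though the matrices themselves are different.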
Update: This is actually true for any matrices $AB$ and $BA$, not only a matrix and its transpose. Thanks reddit user etzpcm for pointing this out!
Friday, April 19, 2019
Exploring the Dot Product
Let $\vec{v}_1 = \left[ \begin{matrix} a\\ b \end{matrix} \right]$ and $\vec{v}_2 = \left[ \begin{matrix} c\\ d \end{matrix} \right]$.
It is often taught that the dot product $\vec{v}_1 \cdot \vec{v}_2$ can be interpreted as projecting $\vec{v}_2$ onto $\vec{v}_1$ (or vice versa), which I will refer to as $\vec{v}_2'$, and then multiplying the lengths of $\vec{v}_2'$ and $\vec{v}_1$. Or equivalently,
$\vec{v}_1 \cdot \vec{v}_2 = \cos{\theta}\mid\vec{v}_1\mid\mid\vec{v}_2\mid = \mid\vec{v}_1\mid\mid\vec{v}_2'\mid$
But where did this interpretation come from? Why does multiplying each component of the vectors and adding them correspond to projecting then multiplying the lengths?
It helps me to think of it this way: the dot product is a measure of similarity between two vectors. Of course, their magnitudes play a role in the product as well, but that is not the point. In the case of projection, you trash the component of $\vec{v}_2$ that is orthogonal to $\vec{v}_1$, and you are only left with the part of $\vec{v}_2$ that is similar to $\vec{v}_1$.
Multiplying their corresponding components works in much the same way. You break each vector into its orthogonal components, and multiply them with the corresponding orthogonal components of the other vector.
$\vec{v}_1 \cdot \vec{v}_2 = (\vec{v}_a + \vec{v}_b) \cdot (\vec{v}_c + \vec{v}_d)$
$=\vec{v}_a \cdot \vec{v}_c + \vec{v}_a \cdot \vec{v}_d + \vec{v}_b \cdot \vec{v}_c + \vec{v}_b \cdot \vec{v}_d$
$=\vec{v}_a \cdot \vec{v}_c + 0 + 0 + \vec{v}_b \cdot \vec{v}_d$
$=\vec{v}_a \cdot \vec{v}_c + \vec{v}_b \cdot \vec{v}_d$
$=ac + bd$
This can be visualized nicely:
In the image, it becomes apparent that multiplying each component of the two vectors really represents multiplying their "similarity" in the horizontal direction and in the vertical direction, then adding these up to get the "total similarity." This sum of the similarity in both directions is exactly the same as the projection method, because projecting a vector onto another keeps only the "similar" components between the vectors.
It used to puzzle me how the computation of the dot product corresponds with the geometric interpretation, but this idea of similarity between the vectors has helped me gain some intuition. It also makes sense that the dot product of two orthogonal vectors is 0, since they share no similarity in the direction they point.
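If you'd like to convince yourself numerically, here is a tiny sketch comparing the component formula with the projection formula; the vectors are arbitrary numbers I picked for illustration.

#include <cmath>
#include <iostream>

int main()
{
    // v1 = (a, b), v2 = (c, d)
    const double a{ 3.0 }, b{ 4.0 };
    const double c{ 2.0 }, d{ 1.0 };

    // Component form: ac + bd.
    const double componentForm{ a * c + b * d };

    // Geometric form: cos(theta) * |v1| * |v2|,
    // where theta is the angle between the two vectors.
    const double len1{ std::sqrt(a * a + b * b) };
    const double len2{ std::sqrt(c * c + d * d) };
    const double theta{ std::atan2(d, c) - std::atan2(b, a) };
    const double geometricForm{ std::cos(theta) * len1 * len2 };

    std::cout << componentForm << " == " << geometricForm << '\n';  // both 10
}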
Often when multiplication is involved, it can be helpful to visualize with areas. So I will leave with this last visualization of the dot product in which the area of the two outer boxes sums to the area of the inner one.
Let me know if you have any other visualizations or ideas to help with gaining an intuition for the dot product.
Wednesday, April 17, 2019
Double Angle
Next time you come across $\sin{2\theta}$,
Forget about your trig sheet of unexplained data.
Recall how to rotate a vector by some degrees
$\left[\begin{matrix}\cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta}\end{matrix}\right]$
And to do it again, we'll need two of these.
$\left[\begin{matrix}\cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta}\end{matrix}\right]$$\left[\begin{matrix}\cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta}\end{matrix}\right]$
Let us find the matrix composition,
$\left[\begin{matrix}\cos^{2}{\theta}-\sin^{2}{\theta} & -2\sin{\theta}\cos{\theta}\\ 2\sin{\theta}\cos{\theta} & \cos^{2}{\theta}-\sin^{2}{\theta}\end{matrix}\right]$
to guide our vector to its final position.
But wait!
Rotating by the same angle twice
is rotating by $2\theta$ but less concise!
$\left[\begin{matrix}\cos{2\theta} & -\sin{2\theta}\\ \sin{2\theta} & \cos{2\theta}\end{matrix}\right] = \left[\begin{matrix}\cos^{2}{\theta}-\sin^{2}{\theta} & -2\sin{\theta}\cos{\theta}\\ 2\sin{\theta}\cos{\theta} & \cos^{2}{\theta}-\sin^{2}{\theta}\end{matrix}\right]$
Hmm... So that means
$\cos{2\theta} = \cos^{2}{\theta} - \sin^{2}{\theta}$
$\sin{2\theta} = 2\sin{\theta}\cos{\theta}$
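A quick check with $\theta = 30^{\circ}$:
$\cos{60^{\circ}} = \frac{1}{2} = \frac{3}{4} - \frac{1}{4} = \cos^{2}{30^{\circ}} - \sin^{2}{30^{\circ}}$
$\sin{60^{\circ}} = \frac{\sqrt{3}}{2} = 2 \cdot \frac{1}{2} \cdot \frac{\sqrt{3}}{2} = 2\sin{30^{\circ}}\cos{30^{\circ}}$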