Notes for competitive programming

Module

Number theory 3: Extended Euclidean algorithm

Key ideas

  • The extended Euclidean algorithm extends the basic Euclidean algorithm calculates the smallest possible (x,y)(x, y) that satisfies the equation ax+by=gcd(a,b)ax + by = \gcd(a, b).
  • At the base case, we let (x,y)=(1,0)(x, y) = (1, 0); for each recursive step, we let (x,y)=(y,xmy)(x', y') = (y, x - my), where m=aamodbbm = \frac{a - a \mod b}{b} (or a / b).
  • To implement the algorithm iteratively instead of recursively, we calculate a recursive sequence for xx and yy, given as xi+1=xi1mixi;yi+1=yi1miyix_{i+1} = x_{i-1} - m_ix_i; y_{i+1} = y_{i-1} - m_iy_i with x0,y0=1,0x_0, y_0 = 1, 0.

In 14.02, we briefly discussed Bezout’s identity. This states that for integers aa and bb such that gcd(a,b)=d\gcd(a, b) = d, then we can find integers xx and yy such that

ax+by=dax+by=d

In many cases, it would be useful to find the values of xx and yy that satisfy this equation. For example, gcd(28,8)=4\gcd(28, 8) = 4, and we want to find the values of xx and yy such that 28x+8y=428x+8y=4. A solution would be x=1,y=3x=1, y = -3. This solution is not unique; another possible solution would be x=3,y=10x = 3, y = -10.

Euclid’s algorithm

It turns out that Euclid’s algorithm, with additional bookkeeping, can help us find xx and yy. To recall, Euclid’s algorithm uses the following recurrence:

gcd(a,b)={0if b=0gcd(b,amodb)otherwise\gcd(a, b)=\begin{cases}0&\text{if } b = 0 \\ \gcd(b, a \bmod b)&\text{otherwise}\end{cases}

So for the pair (28,8)(28, 8), Euclid’s algorithm does the following transition:

(28,8)(8,4)(4,0)(28, 8) \to (8, 4) \to (4, 0)

Let’s try to visualize what Euclid’s algorithm is actually doing and why it works by analyzing the state transitions. Notice that two successive states share an element:

(28,8)(8,4)(4,0)(28, \textcolor{blue}{8}) \to (\textcolor{blue}{8}, \textcolor{red}{4}) \to (\textcolor{red}{4}, 0)

The element added by state transition arrows \to represent the operation amodba \bmod b. Now recall the definition of the modulo:

amodb=k    ak=mba \bmod b = k \implies a - k = mb

for some positive mm. For example,

28mod8=4    284=m828 \bmod 8 = 4 \implies 28 - 4 = m \cdot 8

Here, mm is 3, since m=akb=3m = \displaystyle \frac{a-k}{b} = 3. We’ll take note of this for later.

We can prove that this transition “preserves” the GCD, i.e. gcd(a,b)=gcd(b,k)\gcd(a, b) = \gcd(b, k). To prove this, we can use the division theorem, which states that a=qb+ka=qb+k for some quotient qq.

By divisibility rules, if an integer cc divides bb and kk, then it must divide aa, since aa is a linear combination of bb and kk. We also note that aqb=ka - qb = k, so if an integer cc divides aa and bb, then it must divide kk, since kk is a linear combination of aa and bb.

Restating the paragraph above, if cc divides bb and kk, it also divides aa, and if cc divides aa and bb, it also divides kk. Bridging these two facts together, if cc divides aa and bb, it must also divide bb and kk, and vice versa. Thus, the pair (a,b)(a, b) shares the same divisors as the pair (b,k)(b, k), and they must share the same GCD.

Now this process must terminate, since the second element keeps on getting smaller (amodb<ba \bmod b < b). Since the modulo operation never becomes negative for two positive inputs, at some point, amodba \bmod b must be 00. In fact, we can prove that the first element is halved every two iterations (abamodb    amodba/2a \to b \to a \bmod b \implies a \bmod b \le a/2), which proves that Euclid’s algorithm is O(loga)O(\log{a}).

Bookkeeping

Let’s extend Euclid’s algorithm by keeping track of mm. Let’s zoom in on the first transition state:

(28,8)(8,4)(28, 8) \to (8, 4)

From before, we know that 284=m×828 - 4 = m \times 8, with m=3m = 3. If we move the 44 on the right side, then we get

283×8=41×28+(3)×8=4\begin{align*} 28 - 3 \times 8 &= 4 \\ 1\times 28 + (-3) \times 8 &= 4 \end{align*}

Note that this already gives us the exact formulas for Bezout’s identity: x=1x = 1 and y=3=akb=aamodbb\displaystyle y = -3 = \frac{a - k}{b} = \frac{a - a \bmod b}{b}. So we see that this formula for mm is somehow key to getting the correct values of xx and yy.

Let’s try to look at a bigger example to see how to get the correct values of xx and yy. The following is a trace table for gcd(305,65)=5\gcd(305, 65) = 5. Note that, as we found before, m=aamodbbm = \displaystyle \frac{a - a \bmod b}{b}.

aabbk=a(modb)k = a \pmod b=amb= a - mb
3056545= 305 - 4 x 65
654520= 65 - 1 x 45
45205= 45 - 2 x 20
2050= 20 - 4 x 5
50--

To recover xx and yy, we can start from the following equation in the bottom row:

1×5+0×0=51 \times 5 + 0 \times 0 = 5

where x=1,y=0x = 1, y = 0. The row above states that 0=204×50 = 20 - 4 \times 5, so we substitute that in:

1×5+0×(204×5)=51×5+(0×20)(0×5)=50×20+1×5=5\begin{align*} 1 \times 5 + 0 \times (20 - 4 \times 5) &= 5 \\ 1 \times 5 + (0 \times 20) - (0 \times 5) &= 5 \\ 0 \times 20 + 1 \times 5 &= 5 \end{align*}

So as (a,b)(a, b) went from (5,0)(5, 0) back to (20,5)(20, 5), (x,y)(x, y) transitioned from (1,0)(0,1)(1, 0) \to (0, 1). Let’s repeat this transition:

0×20+1×(452×20)=50×20+(1×45)(2×20)=51×452×20=5\begin{align*} 0 \times 20 + 1 \times (45 - 2 \times 20) &= 5 \\ 0 \times 20 + (1 \times 45) - (2 \times 20) &= 5 \\ 1 \times 45 - 2 \times 20 &= 5 \end{align*}

Let’s repeat it again:

1×452×(651×45)=51×45(2×65)+(2×45)=52×65+3×45=5\begin{align*} 1 \times 45 - 2 \times (65 - 1 \times 45) &= 5 \\ 1 \times 45 - (2 \times 65) + (2 \times 45) &= 5 \\ -2 \times 65 + 3 \times 45 &= 5 \\ \end{align*}

And one last time:

2×65+3×(3054×65)=52×65+(3×305)(12×65)=53×30514×65=5\begin{align*} - 2 \times 65 + 3 \times (305 - 4 \times 65) &= 5 \\ -2 \times 65 + (3 \times 305) - (12 \times 65) &= 5 \\ 3 \times 305 - 14 \times 65 &= 5 \end{align*}

So the transitions for x,yx, y are (1,0)(0,1)(1,2)(2,3)(3,14)(1, 0) \to (0, 1) \to (1, -2) \to (-2, 3) \to (3, -14), such that the final coefficients for Bezout’s identity are x=3,y=14x = 3, y = -14. It can be proven that this is the pair of values that minimizes x+y|x| + |y|.

In general, when we transition from (ai,bi,xi,yi)(ai1,bi1,xi1,yi1)(a_i, b_i, x_i, y_i) \to (a_{i-1}, b_{i-1}, x_{i-1}, y_{i - 1}), noting that bi=ai1mbi1b_i = a_{i-1} - mb_{i - 1}, we do the following transition:

aixi+biyi=gcd(a,b)aixi+yi(ai1mbi1)=gcd(a,b)aixi+yiai1yimbi1=gcd(a,b)bi1xi+yiai1yimbi1=gcd(a,b)ai1yi+bi1(ximyi)=gcd(a,b)\begin{align*} & a_ix_i+b_iy_i=\gcd(a,b) \\ &\to a_ix_i+y_i\left(a_{i - 1} - mb_{i - 1}\right)=\gcd(a,b) \\ &\to a_ix_i+y_ia_{i - 1} - y_imb_{i - 1}=\gcd(a,b) \\ &\to b_{i-1}x_i+y_ia_{i - 1} - y_imb_{i - 1}=\gcd(a,b) \\ &\to a_{i-1}y_i + b_{i-1}(x_i-my_i) = \gcd(a,b) \end{align*}

So at each step, we transition from (x,y)(y,xmy)(x, y) \to (y, x - my). Recall that m=ai1ai1modbi1bi1\displaystyle m = \frac{a_{i-1} - a_{i-1} \bmod b_{i-1}}{b_{i-1}}.

Recursive algorithm

To calculate xx and yy along with the GCD kk, we thus need to implement this state transition. Let’s start off with the basic Euclidean algorithm. This is an expanded version of the one-liner algorithm from 14.02.

int gcd(int a, int b) {
if (b == 0) {
return a;
}
int k = gcd(b, a % b);
return k;
}

We want to calculate xx and yy as well, so we’ll return a 3-tuple (k,x,y)(k, x, y). As our base case, (k,x,y)=(a,1,0)(k, x, y) = (a, 1, 0), since gcd(a,0)=a=1×a+0×0\gcd(a, 0) = a = 1 \times a + 0 \times 0.

tuple<int, int, int> ext_gcd(int a, int b) {
if (b == 0) {
return {a, 1, 0};
}
int k = gcd(b, a % b);
return k;
}

ext_gcd now returns a tuple, so we’ll have to recover (k,x,y)(k, x', y'). Remember that (x)(b)+(y)(amodb)=k(x')(b) + (y')(a \bmod b) = k.

tuple<int, int, int> ext_gcd(int a, int b) {
if (b == 0) {
return {a, 1, 0};
}
auto [k, xp, yp] = ext_gcd(b, a % b);
return k;
}

Once we’ve gotten the GCD, we need to work our way backwards, like in our trace table above. We need to calculate xx and yy for our current (a,b)(a, b) pair, based on the (a,b)(a, b) pair below it in the table. From before, we know that this is given by the state transition (x,y)(y,xmy)(x, y) \to (y, x - my). Note that this mm is given by aamodbb\displaystyle \frac{a - a \bmod b}{b} for the current pair and not the pair we just calculated in line 5.

To simplify our code, we note that our calculation for mm is equivalent to integer division, a / b.

tuple<int, int, int> ext_gcd(int a, int b) {
if (b == 0) {
return {a, 1, 0};
}
auto [k, xp, yp] = ext_gcd(b, a % b);
int x = yp, y = xp - yp * (a / b);
return {k, x, y};
}

You can confirm that this function calculates xx and yy in the same way as our problem above, starting with (1,0)(1, 0) and ending at the correct values for xx and yy.

Iterative algorithm

Converting this algorithm to work alternatively is a bit difficult because our state transition for (x,y)(x, y) works from base case upwards, in the reverse direction of how we compute (a,b)(a, b).

If we look at how Euclid’s algorithm transitions between states:

(28,8)(8,4)(4,0)(28, \textcolor{blue}{8}) \to (\textcolor{blue}{8}, \textcolor{red}{4}) \to (\textcolor{red}{4}, 0)

Since each pair of values has a clear overlap, we can restate this as a sequence:

ri:28840r_i: 28 \to 8 \to 4 \to 0

This sequence {ri}\left\{r_i\right\} is a sequence of remainders, with r0=ar_0=a, r1=br_1=b, and ri+1=ri1modri=ri1mirir_{i+1} = r_{i-1} \bmod r_{i} = r_{i-1} - m_ir_i (for some quotient mi=ri1rim_i = \left\lfloor\frac{r_{i-1}}{r_i}\right\rfloor). The sequence terminates when rn=0r_n = 0, with the GCD being rn1r_{n-1}.

We can see if we can express x,yx, y as a sequence {xi},{yi}\{x_i\}, \{y_i\}. We want to maintain the following invariant for any ii:

axi+byi=riax_i+by_i = r_i

This way, when i=n1i = n - 1, we have

axn1+byn1=rn1=gcd(a,b)ax_{n-1}+by_{n-1}=r_{n-1}=\gcd(a,b)

which is what we want.

We first set x0=1,x1=0x_0=1,x_1=0 and y0=0,y1=1y_0=0,y_1=1 to make the following two equations correct:

ax0+by0=r0=aax1+by1=r1=b\begin{align*} ax_0+by_0 &= r_0=a\\ ax_1+by_1 &= r_1=b\\ \end{align*}

Now r2=r0modr1=r0m1r1r_2=r_0 \bmod r_1 = r_0 - m_1r_1. If we substitute in the equations for r0r_0 and r1r_1 as above, we get:

r2=r0m1r1=ax0+by0m1(ax1+by1)=ax0am1x1+by0bm1y1=a(x0m1x1)+b(y0m1y1)=ax2+by2\begin{align*} r_2&=r_0-m_1r_1\\ &= ax_0+by_0-m_1\left(ax_1+by_1\right) \\ &= ax_0 - am_1x_1 + by_0-bm_1y_1 \\ &= a(x_0-m_1x_1)+b(y_0-m_1y_1) \\ &= ax_2+by_2 \end{align*}

and thus x2=x0m1x1x_2 = x_0 - m_1x_1, y2=y0m1y1y_2 = y_0-m_1y_1. This holds for greater values of ii, so in general, we have

xi+1=xi1mixiyi+1=yi1miyi\begin{align*} x_{i+1} &= x_{i-1} - m_ix_i \\ y_{i+1} &= y_{i-1} - m_iy_i \end{align*}

which exactly parallels the recurrence for rir_i. We can easily implement this using the code below:

tuple<int, int, int> ext_gcd(int a, int b) {
int xi = 1, yi = 0, ri = a;
int xii = 0, yii = 1, rii = b;
while (rii != 0) {
int m = ri / rii;
tie(xi, xii) = {xii, xi - m * xii};
tie(yi, yii) = {yii, yi - m * yii};
tie(ri, rii) = {rii, ri - m * rii};
}
return {ri, xi, yi};
}

Note that tie(i, j) creates an “assignable pair”; C++ will get the pair on the right, unpack it, and assign it to each individual value on the left.

Module created
2024-11-12T00:00:00.000Z
Last updated
2025-01-04T00:00:00.000Z