
In my recent number theory seminar on “Hilbert’s 90 and generalizations” (notes here), Professor Goins asked the following interesting question.

Let {K} be a field and {d\in K^*}. Define {T_d} to be the torus

\displaystyle \left\{ \begin{pmatrix} x & dy \\ y & x \end{pmatrix} : x,y \in L, x^2-dy^2=1\right\}.

What values of {d} give {K}-isomorphic tori?

(The question was perhaps motivated by the observation that over the reals, the sign of {d} completely determines whether {T_d} is split (i.e., isomorphic to {\mathbb R^*}) or anisotropic (i.e., isomorphic to {S^1}).)

Here are two ways of looking at the answer.

  • For {d,e \in K^*}, we determine when two matrices {\displaystyle\begin{pmatrix} x & dy \\ y & x \end{pmatrix}} and {\displaystyle\begin{pmatrix} u & ev \\ v & u \end{pmatrix}} are conjugate in {\text{SL}_2(K)}. Solving the system

\displaystyle \begin{pmatrix} x & dy \\ y & x \end{pmatrix} = \begin{pmatrix} a & b \\ c & \delta \end{pmatrix} . \begin{pmatrix} u & ev \\ v & u \end{pmatrix} . \begin{pmatrix} \delta & -b \\ -c & a \end{pmatrix}

(writing {\delta} for the lower-right entry to avoid a clash with the parameter {d}) gives, when all four entries are nonzero, {\displaystyle \frac{e}{d} = \left(\frac{\delta}{a}\right)^2, de = \left(\frac{b}{c}\right)^2}.

Thus {e \in d.(K^*)^2}, i.e., the {T_d}'s are classified by {\displaystyle \frac{K^*}{(K^*)^2}}. (For {K=\mathbb R}, this is isomorphic to {\{\pm 1\}}, so the sign of {d} determines {T_d} up to conjugation.) By Kummer theory, {\displaystyle \frac{K^*}{(K^*)^2} \cong H^1(\text{Gal}(\overline K / K), \mu_2)}, where {\mu_2 = \{\pm 1\}} is the group of square roots of unity. Thus there is a correspondence between isomorphism classes of tori {T_d \; (d \in K^*)} and quadratic extensions of {K}.

  • Another way to look at the same thing is as follows. Fix {d \in K^*}. Let {L} be an extension of {K} over which {T_d} splits, so that {T_d(L)} is a split torus of rank 1. For a connected reductive group {G} over an algebraically closed field, we have the exact sequence

\displaystyle 1 \rightarrow \text{Inn}(G) \rightarrow \text{Aut}(G) \rightarrow \text{Aut}(\Psi_0(G)) \rightarrow 1,

where {\Psi_0(G)} is the based root datum {(X,\Delta,X\;\check{}, \Delta\,\check{})} associated to {G}. (Here, {X = X^*(G)} and {\Delta} is the set of simple roots of {X} corresponding to a choice of a Borel subgroup of {G}.) For details, see Corollary 2.14 of Springer’s paper “Reductive Groups” in Corvallis.

In our case, {G = T_d} so {\Psi_0(G) = ( \mathbb Z, \emptyset, \mathbb Z \;\check{}, \emptyset)}.

\displaystyle \text{Aut}(\Psi_0(G)) \cong \text{Aut}(\mathbb Z) \cong \{ \pm 1\}.

Now {L/K} forms of {T_d} are in bijective correspondence with

\displaystyle H^1(\text{Gal}(L/K), \text{Aut}(\Psi_0(G))) = H^1(\text{Gal}(L/K), \{\pm 1\}) \cong \text{Hom}_{\mathbb Z}(\text{Gal}(L/K), \{\pm 1\});

the last isomorphism because the Galois group acts trivially on the split torus {T_d(L)}. {\blacksquare}
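As a sanity check on the first viewpoint, here is a minimal sketch in exact rational arithmetic showing that {T_e} is carried into {T_d} when {d = t^2 e}. Note the assumptions: it conjugates by the diagonal matrix diag(t, 1), which lies in GL_2 and not in SL_2, and the helper names (`torus_point`, `in_torus`) and the sample values of `e`, `t` are made up for illustration.

```python
from fractions import Fraction

def mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def torus_point(e, s):
    """Rational point (u, v) on u^2 - e*v^2 = 1, from a parameter s with e*s^2 != 1."""
    denom = 1 - e * s * s
    return (1 + e * s * s) / denom, 2 * s / denom

def in_torus(M, d):
    """Check that M has the shape [[x, d*y], [y, x]] with x^2 - d*y^2 = 1."""
    (x, dy), (y, x2) = M
    return x == x2 and dy == d * y and x * x - d * y * y == 1

# Hypothetical sample data: e = 2, t = 3, so d = t^2 * e = 18.
e = Fraction(2)
t = Fraction(3)
d = t * t * e

u, v = torus_point(e, Fraction(1, 5))
M_e = ((u, e * v), (v, u))

# Conjugate by g = diag(t, 1); this g has determinant t, so it is in GL_2 only.
g = ((t, Fraction(0)), (Fraction(0), Fraction(1)))
g_inv = ((1 / t, Fraction(0)), (Fraction(0), Fraction(1)))
M_d = mul(mul(g, M_e), g_inv)

print(in_torus(M_e, e), in_torus(M_d, d))
```

The conjugated matrix has the same {x} and the rescaled {y = v/t}, which is exactly why {d} and {e} only matter up to squares.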

In this previous post, we saw the existence of a common eigenvector, namely \phi(n) = a_n, where for a prime p the value a_p is the number of solutions to x^2=d modulo p, minus one. This was not a coincidence. Indeed, it was based on the fact that \{ T_p : p \nmid N := 4|d| \} is a family of commuting normal operators on the space of complex-valued functions on G = (\mathbb Z/N \mathbb Z)^*. (Each T_p is in fact unitary, and it is self-adjoint exactly when p^2 \equiv 1 modulo N; commuting normal operators are simultaneously diagonalizable, which is all we need.)

(Here, adjoints are taken with respect to the inner product

\langle f,g \rangle = \displaystyle\sum_{r\in G} f(r) \overline{g(r)} . \qquad )

This post is a generalization.

Let f be a holomorphic function on the upper half complex plane. We say f is modular of weight k if it satisfies a technical condition called “holomorphic at the cusps” and the following.

\displaystyle f\left(\frac{az+b}{cz+d}\right) = (cz+d)^k f(z) \quad \forall \gamma = \begin{pmatrix} a&b\\c&d \end{pmatrix} \in \Gamma(1) := SL(2,\mathbb Z).

Given any f holomorphic on the upper half plane and \gamma \in \text{GL}(2,\mathbb Q)^+, define

\displaystyle (f|_\gamma)(z) := (cz+d)^{-k} (\text{det} \gamma)^{k/2} f\left(\frac{az+b}{cz+d}\right), \quad \forall \gamma = \begin{pmatrix} a&b\\c&d \end{pmatrix} \in \text{GL}(2,\mathbb Q)^+.

It is a fact that for any \alpha \in \text{GL}(2,\mathbb Q)^+, there is a double-coset decomposition \Gamma(1) \alpha \Gamma(1) = \displaystyle\bigcup_{i=1}^l \Gamma(1) \alpha_i.

Define for such a decomposition,

f|_{T_\alpha} := \displaystyle\sum_{i=1}^l f|_{\alpha_i}.

Observe that (f|_{\alpha})|_{\beta} = f|_{\alpha\beta}, so this gives a well-defined action of \Gamma(1) on the f's. There is a vector space called the space of modular forms and a T_\alpha-invariant subspace S_k(\Gamma(1)), the space of cusp forms (similar to V_N in the previous post). For varying \alpha, the operators T_\alpha (called the Hecke operators) act on it:

T_\alpha : S_k(\Gamma(1)) \to S_k(\Gamma(1)),

T_\alpha(f) := f|_{T_\alpha}.

It’s a cool theorem that the Hecke algebra is commutative and the Hecke operators are self-adjoint with respect to an inner product (the Petersson inner product). A standard result in linear algebra tells us that these can be simultaneously diagonalized; a common eigenvector is called a Hecke eigenform. When suitably normalized, its associated L-function has an Euler product (similar to the \zeta function). Applied to the weight 12 cusp form \Delta, this Euler product gives Ramanujan’s identity:

\displaystyle\sum_{n=1}^\infty \tau(n) n^{-s} = \prod_p \frac{1}{1-\tau(p) p^{-s} + p^{11-2s}}.

(Here, \tau is the Ramanujan-\tau function. )
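The Euler product above is equivalent to two concrete statements about \tau: it is multiplicative on coprime arguments, and \tau(p^2) = \tau(p)^2 - p^{11}. Here is a small sketch that checks both for small n, using the standard fact that \tau(n) is the coefficient of q^n in q \prod_{n \geq 1} (1-q^n)^{24}. The function name `tau_values` is just my own label.

```python
def tau_values(limit):
    """Return {n: tau(n)} for 1 <= n <= limit, from q * prod (1 - q^n)^24."""
    # coeffs[k] = coefficient of q^k in prod_{n>=1} (1 - q^n)^24, for k < limit.
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n); go from high degree to low
            # so that coeffs[k - n] is still the old value when it is read.
            for k in range(limit - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    # The leading factor q shifts every exponent up by one.
    return {m: coeffs[m - 1] for m in range(1, limit + 1)}

tau = tau_values(12)

# Multiplicativity on coprime arguments, and the prime-square relation.
print(tau[6] == tau[2] * tau[3])
print(tau[4] == tau[2] ** 2 - 2 ** 11)
```

Factors (1 - q^n) with n at least `limit` cannot touch the coefficients we keep, which is why the truncated product is enough.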

Pretty cool stuff, eh!


Tailpiece: References (since I am very vague here) –

  • A first course in modular forms – Diamond, Shurman
  • Automorphic forms and representations – Daniel Bump

Also, I was interested in the properties of the L-function corresponding to the a_p's in the earlier post. I haven’t seen any book that mentions these.

Below is Gauss’ quadratic reciprocity, found in most elementary texts in number theory. In this post, we’ll see how the Hecke operators originate from this theorem.

Theorem (Gauss). Let \varepsilon(n) = (-1)^{\frac{n-1}{2}} and \omega(n) = (-1)^{\frac{n^2-1}{8}}. For distinct odd primes p, q,

\displaystyle \left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{q}{p}\right),

\displaystyle \left(\frac{-1}{p}\right) = \varepsilon(p),

\displaystyle \left(\frac{2}{p}\right) = \omega(p).
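For the skeptical reader, here is a quick numerical check of the reciprocity law in the form (p/q)(q/p) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}, together with the two supplementary laws, for a handful of small primes. It computes the Legendre symbol by Euler's criterion, which is my choice for the sketch; the list of primes is arbitrary.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)  # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def eps(n):
    return (-1) ** ((n - 1) // 2)

def omega(n):
    return (-1) ** ((n * n - 1) // 8)

def reciprocity_holds(p, q):
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) == sign * legendre(q, p)

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]

all_ok = all(
    reciprocity_holds(p, q)
    for p in odd_primes
    for q in odd_primes
    if p != q
)
supplements_ok = all(
    legendre(-1, p) == eps(p) and legendre(2, p) == omega(p)
    for p in odd_primes
)
print(all_ok, supplements_ok)
```

The pair p = 3, q = 7 is a good one to try by hand: both are 3 modulo 4, so the two symbols must differ in sign.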

Consider the equation

\textbf{(Q)}: \qquad x^2 = d; \qquad \qquad d\in \mathbb Z \backslash \{0\}.

Let a_p(Q) be the number of solutions to Q modulo p, minus one. Then by definition of the Legendre symbol, a_p(Q) = \left( \frac{d}{p} \right). By a property of the Legendre symbol (or rather, the Jacobi symbol), we have

a_{mn}(Q) = a_m(Q) . a_n(Q).                        (*)

Let N = 4 |d|. Then it follows from the reciprocity law that a_p(Q) depends only on the value of p modulo N. Furthermore, the sequence {a_2(Q), a_3(Q), a_5(Q), \cdots } arises as a set of eigenvalues of linear operators (the Hecke operators) on a finite dimensional complex vector space. We’re going to construct the space.

V_N := \{ f: (\mathbb Z/N\mathbb Z)^* \to \mathbb C \}

T_p : V_N \to V_N, \quad T_p(f)(n) = f(pn) \quad \text{if } p \nmid N \text{ and } 0 \text{ otherwise }.

Verify that T_p is a linear operator on V_N. Now all these operators for varying primes commute with each other. So what is a common eigenvector?

Define \phi (n) = a_n(Q). Then use (*) to show that

T_p(\phi) = a_p(Q) \phi,

for all p prime. So this \phi is indeed a common eigenvector for all the T_p's!
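Here is a small sketch of this eigenvector check for primes p not dividing N. It takes \phi(n) to be the Jacobi symbol (d/n) on representatives 1 \leq n < N coprime to N, and lets T_p act by f(n) \mapsto f(pn \bmod N). The sample values d = 5 and d = -3, and the function names, are my own choices for illustration.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_common_eigenvector(d, primes):
    """Check T_p(phi) = a_p * phi on (Z/NZ)^* for each prime p not dividing N."""
    N = 4 * abs(d)
    units = [n for n in range(1, N) if gcd(n, N) == 1]

    def phi(n):
        return jacobi(d, n)  # n is odd because 4 divides N

    for p in primes:
        if N % p == 0:
            continue  # skip primes dividing N
        a_p = jacobi(d, p)
        # T_p(phi)(n) = phi(p*n mod N)
        if any(phi((p * n) % N) != a_p * phi(n) for n in units):
            return False
    return True

print(is_common_eigenvector(5, [3, 7, 11, 13, 17]))
print(is_common_eigenvector(-3, [5, 7, 11, 13]))
```

The check relies on exactly the two facts used in the text: multiplicativity (*) and the dependence of a_n(Q) only on n modulo N.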

I will explain more about Hecke operators on modular forms in a future post. (Edit July 10, 2013: Link to the said post).

Recently (22 March 2013) I took my advanced topics exam. In spirit of the Princeton Generals, I wrote out my interview questions for use to anyone in these special topics.

Topics: linear algebraic groups, class field theory

Committee: Freydoon Shahidi (chair), David Goldberg

Linear algebraic groups:

Goldberg decided to start with linear algebraic groups. What is meant by ‘split’? (I said I had prepared only the algebraically closed case. They were okay with that.) What is a parabolic subgroup? If Q is parabolic in G and P is parabolic in Q, prove that P is parabolic in G. (Went totally blank. They gave hints: use the other equivalent definition.)

State the Bruhat decomposition. (I started defining terms: maximal torus, root system, Weyl group etc.) How is the Weyl group related to the torus? (Another moment when I blanked out.) How does W act on T? (By conjugation. So W is the quotient of the normalizer of the maximal torus by its centralizer.) When is the centralizer equal to the torus? (When it is maximal.) Shahidi objected and corrected me twice when I wrongly pronounced “veil” for Weyl group; the correct pronunciation is “vile”. There is another important group named after André Weil.

Bruhat decomposition for GL_n. What is its Weyl group? (S_n.) What is the length of an element? (The minimal length of a decomposition in terms of simple reflections.) What is the polynomial the big cell satisfies? (I had prepared this one. I gave the answer and said that the proof goes by induction, but Shahidi was not satisfied. He said something I didn’t quite understand. I gave the decomposition explicitly and he said it generalizes the well-known LU decomposition to classical groups.)

What is the critical step in the classification of semisimple groups of rank 1? (Uses Bruhat decomposition). Give a sketch of proof. (I couldn’t. But turns out, they really wanted me to state the theorem. SL_2 and PSL_2).

Given that the normalizer of a parabolic is itself, show that it must be connected. (Missed a step in that one can choose the conjugating element inside the connected component).

[ That was all about algebraic groups. Nothing about Dynkin diagrams, finding parabolics, simple connectedness and fundamental groups of classical groups. ]


Number Theory:

Class group of Q(\sqrt{-7}). (Minkowski bound works.)

Class group of Q(\sqrt{21}). (I saw that 25 – 21 = 4 and started considering the factorization of the prime 2 above using quadratic reciprocity but got it wrong. It took me almost 10 grueling minutes to figure out that 21 was not a prime! Embarrassment.)

Define the Hilbert symbol. State the norm condition. Prove the bi-multiplicativity property. (Consequence of a group homomorphism). State and prove Hilbert’s reciprocity. (I had prepared a classical proof from Serre but he insisted on proving via Class field theory. Proved that the local p-Artin maps glue to give a global Artin map and the theorem follows elegantly).

Goldberg asked if I knew anything about the generalized symbol. (I stated and mentioned the skew-symmetry property). He said it’s not in your syllabus anyways.

Show that p = a^2 + ab + b^2 precisely when p is 1 (mod 3). (I was looking at the field of cube roots of unity but was blundering some norm calculation. After many frustrating minutes for everyone, it was discovered that the problem was wrong and should have p = a^2 – ab + b^2. I commented, “I have been doing this calculation since tenth grade”, to which Prof. Shahidi quickly retorted, “still you haven’t memorized it!”)


The exam lasted for 90 minutes. They told me to wait outside and after a stressful 5-minute period for me, they came out and congratulated me.

Shahidi: Good, congratulations.

Me: Thanks.

Shahidi: Don’t see me for a month now.

Me: Yeah, the past two weeks were tough on me too.

Shahidi: Let us not meet each other for a while now. (All smile in acknowledgement).

Recently, I’ve been preparing for my advanced topics exam at Purdue and have been going over some algebraic number theory. The question in consideration was to find the Hilbert class field (i.e., the maximal abelian unramified extension) of {\mathbb Q(\sqrt{-6})}. The problem boiled down to determining how a prime splits in an extension, which reduced to finding the ring of integers of a number field. The usual method (following Kummer) is to find an algebraic integer that generates the ring of integers of the extension over the ring of integers of the base field. But such an integer is not guaranteed to exist. I posted this on math.stackexchange and got an interesting answer in the form of a comment, which I now blog.


Theorem: Let {K} and {L} be linearly disjoint number fields, i.e.,

\displaystyle K \otimes_{\mathbb Q} L \rightarrow KL

is an isomorphism. Then, {D_{KL} | D_K^s D_L^r}, where {D_K} denotes the discriminant of {K} over {\mathbb Q}, etc. and {r = [K:\mathbb Q]} and {s = [L : \mathbb Q]}. Moreover, if {D_K} and {D_L} are relatively prime over {\mathbb Z}, then {D_{KL} = D_K^s D_L^r} and {\mathcal O_K \otimes_{\mathbb Z} \mathcal O_L \hookrightarrow \mathcal O_{KL}} is an isomorphism. So, the basis of {\mathcal O_{KL}} can be obtained from the bases of {\mathcal O_K} and {\mathcal O_L} in a natural way.

Let {M} be the image of

\displaystyle \mathcal O_K \otimes_{\mathbb Z} \mathcal O_L \hookrightarrow \mathcal O_{KL}.

Then {M} is an order in {KL} contained in {\mathcal O_{KL}} (i.e., a subring of the ring of integers of full rank). If {x = ab \in M} is the image of a pure tensor {a \otimes b \in \mathcal O_K \otimes \mathcal O_L}, then the trace on {KL} acts on {x} by {T_{KL}(ab) = T_K(a).T_L(b)}. (For a proof of this, go to the Galois closure and use the canonical isomorphism {\text{Gal}(KL/\mathbb Q) \cong \text{Gal}(K/\mathbb Q) \times \text{Gal}(L/\mathbb Q)} given by linear disjointness). Thus, the trace form of {KL} restricted to {M} is {T_K \otimes T_L}, and its matrix is the Kronecker product of the matrices of the trace forms of {K} and {L}. It follows that {D_M = D_K^s D_L^r}. Since {D_M = [\mathcal O_{KL} : M]^2 D_{KL}}, we get {D_{KL} | D_K^s D_L^r}. By considering the towers {\mathbb Q \subseteq K \subseteq KL} and {\mathbb Q \subseteq L \subseteq KL} and using transitivity of discriminants (see below), we also know that {D_{KL}} is divisible by {D_K^s} and {D_L^r}. If {D_K} and {D_L} are relatively prime, {D_{KL}} is therefore divisible by {D_K^s D_L^r}, hence the assertion that {D_{KL} = D_K^s D_L^r}. It follows in this case that the index {[\mathcal O_{KL} : M]} is 1, i.e., {M = \mathcal O_{KL}}.

Transitivity of discriminants:
Let {K\subseteq K' \subseteq K''} be a tower of finite extensions of number fields and let {A, A', A''} be their rings of integers. Then the discriminants satisfy

\displaystyle D_{A''/A} = D_{A'/A}^{[K'':K']}.\mathrm{N}_{ K'/ K}(D_{ A''/ A'}).

In this post, I will show how algebraic curves and algebraic numbers are related. I shall continue using the notations developed in the previous post here. You may read that post to refresh the theory of algebraic curves.

The unifying feature between the two is the concerned rings are Dedekind domains. A Dedekind domain is an integrally closed Noetherian domain of dimension 1. It has two excellent properties,

  • The localization at every nonzero prime ideal is a dvr.
  • The integral closure of a DD in a finite separable extension of its field of fractions is again a DD.

In the curves case, suppose we have two curves (i.e., non-singular projective varieties of dimension 1) and a non-constant morphism between them.

\displaystyle \phi : C_1 \rightarrow C_2.

Then as seen earlier, it induces an inclusion of function fields,

\displaystyle \phi^* : k(C_2) \hookrightarrow k(C_1).


We shall now define what ramification at a point {P \in C_1} means. Intuitively, it means there is a knot at {P}. Let us use some commutative algebra to make this notion precise.

Let {Q = \phi(P)}. Let {t} be an element of the function field {k(C_2)} that generates the maximal ideal of the local ring (also a dvr) {k[C_2]_Q}. Let {e_P} be the {P}-order of {\phi^*(t) = t \circ \phi} as a function in {k(C_1)}. We say {\phi} is ramified at {P} if {e_P>1}, and {\phi} is unramified if {e_P=1} for every point {P\in C_1}.

We have the following :


Let {\phi : C_1 \rightarrow C_2} be a non-constant map of curves. Then,

  • For every point {Q\in C_2},

\displaystyle \text{deg } \phi = \displaystyle\sum_{P\mapsto Q} e_P.

  • For all but finitely many points {Q\in C_2},

\displaystyle |\phi^{-1}(Q)| = \text{deg}_s \phi,

where {\text{deg}_s} denotes the degree of separability of {k(C_1) / \phi^* (k(C_2))}.

  • If {\psi : C_2 \rightarrow C_3} is another non-constant map of curves, then

\displaystyle e_{\psi\circ\phi}(P) = e_\phi (P) . e_\psi (\phi P).

Corresponding results in number theory

The first result corresponds to the identity {\displaystyle\sum_{i=1}^g e_i f_i = [L:K]} for number fields {L/K}. The second one says that only finitely many primes ramify and the third result is the multiplicativity of ramification indices in a tower of number fields. Let us state these results more precisely in the following


Let {K \subseteq L} be number fields with {[L:K]< \infty}. Let {\mathcal O_K} and {\mathcal O_L} be the corresponding rings of integers. Then,

  • For every prime {P} in {\mathcal O_K}, we have

\displaystyle P \mathcal O_L = Q_1^{e_1} Q_2^{e_2} \cdots Q_g^{e_g}

where the {Q_i}'s are primes in {\mathcal O_L}. Then {\displaystyle\sum_{i=1}^g e_i f_i = [L:K]} holds, where {f_i} is the inertial degree given by {[ \mathcal O_L/Q_i : \mathcal O_K/P ]}.

  • At most finitely many primes of {\mathcal O_K} ramify in {\mathcal O_L}. (A prime {P} of {\mathcal O_K} is said to ramify in {L} if {e_i>1} for some {i}).
  • If {M} is a number field containing {L}, then for every prime {P} of {\mathcal O_K},

\displaystyle e_{M/K} = e_{M/L}. e_{L/K}.

More analogy

The similarity between number fields and algebraic curves does not end here. In the number theoretic case, we have the class group of a number field, which is the quotient of the free abelian group on prime ideals (the group of fractional ideals) modulo the principal ideals. Similarly, for algebraic curves we have the Picard group, which is the free abelian group on points (the group of divisors) modulo principal divisors. The class group turns out to be finite (after some struggle in proving it), and so does the degree-zero part of the Picard group when the curve is defined over a finite field.

Finally, the analog of the exact sequence in number theory (here {U_K} is the group of units)

\displaystyle 1 \rightarrow U_K \rightarrow K^* \rightarrow \displaystyle \begin{pmatrix} \text{fractional} \\ \text{ideals of }\mathcal O_K \end{pmatrix} \rightarrow C_K \rightarrow 0

is the exact sequence of degree-zero divisors

\displaystyle 1 \rightarrow K^* \rightarrow K(C)^* \rightarrow \text{Div}^0(C) \rightarrow \text{Pic}^0(C) \rightarrow 0.

It was the brilliant schemes of Grothendieck and his co-workers that unified algebraic geometry and number theory, with tools (results) from the former being made available to the latter. These abstract unified objects, known as schemes, were the setting in which the Weil conjectures were eventually proved (by Grothendieck and his school, with the last step due to Deligne), but more on that later (after I study it!)

In analytic number theory, they denote a complex variable by {s = \sigma + it}. It's an unfortunate choice of notation but has become too standard to change. In this post, we shall study the important notion of a Dirichlet series and its convergence in the complex plane. One important tool in analytic number theory is to define various meromorphic functions and look at the residues at their poles. These residues give a lot of number-theoretic information. We define a Dirichlet series to be a series of the form

\displaystyle f(s) = \displaystyle\sum_{n\in\mathbb N} \frac{a_n}{n^s}, \qquad a_n \in \mathbb C.

The `prime’ example of a Dirichlet series is the famous Riemann {\zeta}-function defined by

\displaystyle \zeta(s) = \displaystyle\sum_{n\in\mathbb N} \frac{1}{n^s}.

Finding its zeros (roots) is a central open problem in number theory, called the Riemann Hypothesis and may be one of the toughest ways to be a millionaire!

There are other generalizations like Dedekind’s {\zeta}-function given by

\displaystyle \zeta_K(s) = \displaystyle\sum \frac{1}{\mathbb N(\mathfrak a)^s},

where the sum is taken over all nonzero ideals of the ring of integers of the number field {K} and \mathbb{N} denotes the norm of the ideal. There are also the Dirichlet {L}-series and Hecke {L}-series which further generalize this but we won’t define them now. We shall see that the convergence of all these in the complex plane follows from a lemma whose proof uses only basic complex analysis. Interested readers can look up a proof in Janusz’s Algebraic Number Fields, Ch. IV Prop. 2.1.

Lemma 1 Let {f(s)} be the Dirichlet series as defined above and let {S(x) = \sum_{n \leq x} a_n}. Suppose there exist positive constants {a} and {b} such that {|S(x)| \leq a x^b} for all {x \gg 0}. Then the series {f(s)} is uniformly convergent in {D(b, \delta, \varepsilon)} for any positive {\delta, \varepsilon}, where

\displaystyle D(b, \delta, \varepsilon) = \{ s : \sigma \geq b + \delta, |\arg(s-b)| \leq \pi/2 - \varepsilon \}.

In particular, {f(s)} is analytic in the half plane {Re(s) > b}.

We use this lemma to prove interesting stuff about the poles of the Riemann {\zeta}-function. The proof involves a smart trick.

Proposition 2

  1. {\zeta(s)} is analytic for {Re(s) > 1}.
  2. {\zeta(s)} extends to a meromorphic function for {Re(s) > 0}.
  3. {\zeta(s)} has only one pole in {Re(s) > 0}; it is located at {s = 1} and is a simple pole.


1. This follows from the lemma since {S(x) \leq x}.

2. Here is a cool trick: define

{\zeta_2(s) = 1 - \displaystyle\frac{1}{2^s} + \frac{1}{3^s} - \frac{1}{4^s} \cdots }

and observe that

\displaystyle \zeta_2(s) = \displaystyle \sum \frac{1}{n^s} - 2. \frac{1}{2^s} \sum \frac{1}{n^s}

so that

\displaystyle \zeta(s) = \zeta_2(s) . \frac{1}{1 - 2^{(1-s)}}. \ \ \ \ \ (1)

Since {S(x)} only takes the values {0} and {1} for {\zeta_2}, we have {|S(x)| \leq x^b} for every {b > 0}, so it follows by the lemma that {\zeta_2} is analytic on {Re(s) > 0}. Since the other fraction in equation (1) above is meromorphic, it follows that {\zeta} too is meromorphic. This settles (2).

3. By (1), the poles of {\zeta} must be poles of the function {\displaystyle\frac{1}{1-2^{(1-s)}}}, which are precisely the points {s = 1 + \displaystyle\frac{2\pi i m}{\ln 2}, \; m \in\mathbb Z}.

Now similar to {\zeta_2}, define

\displaystyle \zeta_3 (s) = 1 + \displaystyle \frac{1}{2^s} - \frac{2}{3^s} + \frac{1}{4^s} + \frac{1}{5^s} - \frac{2}{6^s} \cdots .

Conclude that

\displaystyle \zeta(s) = \zeta_3(s) . \displaystyle\frac{1}{1-3^{(1-s)}},

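so the possible poles of {\zeta(s)} are {s = 1 + \displaystyle\frac{2 \pi i n }{\ln 3}, \; n \in \mathbb Z}. A point common to both lists satisfies {m \ln 3 = n \ln 2}, i.e., {3^m = 2^n}, which forces {m = n = 0}. Since the only common pole of {\displaystyle\frac{1}{1-2^{(1-s)}}} and {\displaystyle\frac{1}{1-3^{(1-s)}}} is therefore {1}, it must be the only possible pole of {\zeta}! Further, since {\zeta_2(1) = \ln 2 \neq 0}, the order of the pole of {\zeta} at {1} equals the order of the pole of {\displaystyle\frac{1}{1-2^{(1-s)}}}, which is 1. The pole is hence simple. \Box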
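To see equation (1) in action, here is a rough numerical sketch that evaluates {\zeta(s)} through the alternating series {\zeta_2(s)}. It only uses partial sums with a fixed number of terms (my arbitrary choice), so it is an approximation and not a proof of anything. The point is that the same formula gives a value at {s = 1/2}, where the original series for {\zeta} diverges.

```python
from math import pi

def zeta2(s, terms=100_000):
    """Partial sum of 1 - 1/2^s + 1/3^s - 1/4^s + ... for real s > 0."""
    total = 0.0
    for n in range(1, terms + 1):
        total += (-1) ** (n + 1) / n ** s
    return total

def zeta_via_zeta2(s, terms=100_000):
    """Approximate zeta(s) for real s > 0, s != 1, using equation (1)."""
    return zeta2(s, terms) / (1 - 2 ** (1 - s))

# Compare with the classical value zeta(2) = pi^2 / 6.
print(zeta_via_zeta2(2.0), pi ** 2 / 6)

# A point inside the strip 0 < Re(s) < 1; convergence is slow here.
print(zeta_via_zeta2(0.5))
```

For an alternating series with decreasing terms the error of a partial sum is at most the first omitted term, so the value at {s = 2} is accurate to many digits while the value at {s = 1/2} is only good to about two.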

As I said earlier, the residue at this pole for various Dirichlet series gives number-theoretic information but I will discuss that after learning further.

2011 has been declared as a Number Theory Year at IMSc and it also coincides with Prof R Balasubramanian’s 60th birthday. On that occasion, we have introductory courses on special topics in Number Theory by various eminent Number Theorists across India as well as abroad. This week, rather this fortnight, we have Pietro Corvaja from Italy speaking on Integral points on varieties: an introduction to Diophantine Geometry. Like most talks, I lost track of the seminar after about a couple of minutes. But till then, I witnessed some fantastic number theory, that I describe below:

Theorem: (Dirichlet – 1842) Let \alpha, Q\in \mathbb R with Q>1. Then there are integers p, q such that 1 \leq q < Q and

| \alpha q - p | \leq \displaystyle \frac{1}{Q}.

Proof: First consider the case when Q is an integer. Writing \{x\} for the fractional part of x, the numbers 0, 1, \{\alpha\}, \{2 \alpha\} , \cdots, \{(Q-1)\alpha\} are Q+1 numbers inside [0,1]. Partitioning the unit interval into Q subintervals, namely, [0, 1/Q), [1/Q, 2/Q), \cdots , [\frac{Q-1}{Q}, 1], it follows from Dirichlet’s box principle that some subinterval must contain two of these points. Each of these points has the form r\alpha - s with integers r, s and 0 \leq r < Q (the points 0 and 1 both have r = 0, but they lie in different subintervals). Hence there must be integers r_1, r_2, s_1, s_2 with 0 \leq r_1, r_2 < Q such that r_1 \neq r_2 and

|( r_1 \alpha - s_1) - (r_2 \alpha - s_2) | \leq \frac{1}{Q}.

Since r_1 \neq r_2, we may take q = |r_1 - r_2| and p = \pm(s_1 - s_2), and the theorem follows for Q\in\mathbb Z.

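For Q\not\in\mathbb Z, take Q' = [Q] + 1, where [Q] is the integer part of Q. The integer case gives p, q with 1 \leq q < Q', that is q \leq [Q] < Q, and |\alpha q - p| \leq \frac{1}{Q'} < \frac{1}{Q}; so the inequality also holds when Q' is replaced by Q.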
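The theorem only asserts that a pair p, q exists; for small Q one can simply search for it. Here is a brute-force sketch (not the pigeonhole argument itself) that tries every q < Q and rounds \alpha q to the nearest integer. It works in floating point, so treat it as an illustration; the function name `dirichlet_pair` is mine.

```python
from math import ceil, pi, sqrt

def dirichlet_pair(alpha, Q):
    """Search for integers p, q with 1 <= q < Q and |alpha*q - p| <= 1/Q."""
    for q in range(1, ceil(Q)):  # exactly the integers q with 1 <= q < Q
        p = round(alpha * q)     # nearest integer to alpha*q
        if abs(alpha * q - p) <= 1 / Q:
            return p, q
    return None  # the theorem says this is never reached for Q > 1

print(dirichlet_pair(sqrt(2), 10))   # (7, 5)
print(dirichlet_pair(pi, 100))       # (22, 7)
```

Note how the search for \pi with Q = 100 stops at the familiar approximation 22/7.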

Dirichlet was a brilliant mathematician who produced astonishing results by smartly applying the (Dirichlet’s) Box principle, more popularly known as the Pigeon-Hole principle. One of the best and useful theorems due to him is that an arithmetic progression with coprime first term and common difference must contain infinitely many primes. In the above theorem and its corollary, we see that irrational numbers can ‘nicely’ be approximated by rationals.

Corollary: Suppose that \alpha is irrational. Then there are infinitely many rational numbers \frac{p}{q} such that

| \alpha - \frac{p}{q} | \leq \frac{1}{q^2}

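Proof: Obviously, we can demand p, q to be relatively prime in the above theorem. Since q < Q, the theorem gives | \alpha - \frac{p}{q} | \leq \frac{1}{qQ} < \frac{1}{q^2}. Since q\alpha - p is never zero, a fixed pair (p,q) can satisfy |q\alpha - p| \leq \frac{1}{Q} for only finitely many integers Q. Thus as Q increases, the pair p, q must also vary, which produces infinitely many such rationals.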
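For a concrete instance of the corollary, take \alpha = \sqrt 2. The sketch below does not use the box principle; instead it borrows the continued-fraction convergents of \sqrt 2 (a technique not discussed in the talk), which satisfy p^2 - 2q^2 = \pm 1 and hence give infinitely many rationals within 1/q^2 of \sqrt 2.

```python
from math import sqrt

def sqrt2_convergents(count):
    """First `count` continued-fraction convergents (p, q) of sqrt(2)."""
    p, q = 1, 1
    out = []
    for _ in range(count):
        out.append((p, q))
        p, q = p + 2 * q, p + q  # 1/1 -> 3/2 -> 7/5 -> 17/12 -> ...
    return out

convergents = sqrt2_convergents(10)

for p, q in convergents:
    error = abs(sqrt(2) - p / q)
    print(p, q, error <= 1 / q ** 2)
```

The identity |p/q - \sqrt 2| = \frac{1}{q^2 (p/q + \sqrt 2)} explains why the bound holds with room to spare.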

Note that the corollary is false if \alpha is rational. There is a result due to Liouville which says that an algebraic number cannot be so ‘nicely’ approximated by rationals. This theorem due to Liouville also gives the first ever found transcendental number, the Liouville number. Returning back, Roth said that Dirichlet’s bound could not be improved upon using the same hypotheses. Here is the precise statement (without proof) :

Theorem: (Roth) If \alpha \in \overline{\mathbb Q} \cap \mathbb R and \varepsilon >0, then there are only finitely many rationals satisfying |\alpha - \frac{p}{q}| < \frac{1}{q^{2+\varepsilon}}.

Another fascinating result due to Thue follows below from Roth’s theorem. (There was a lot of mathematics in the talk that I am not mature enough to follow. Nevertheless, the following theorem needed none of the previous discussion in its proof).

Theorem: (Thue) If F(X,Y)\in\mathbb Z[X,Y] is a homogeneous polynomial of degree at least 3 and if it has distinct roots over \overline{\mathbb Q}, then for every nonzero integer c, the equation

F(x,y) = c

has only finitely many solutions (x,y)\in\mathbb Z^2.


Proof: Since F is homogeneous, we can write F(X,Y) = a (X - \alpha_1 Y) \cdots (X - \alpha_n Y) with the \alpha_i \in\overline{\mathbb Q} all distinct. Then, if (x,y)\in\mathbb Z^2 is a solution,

a (\frac{x}{y} - \alpha_1) \cdots (\frac{x}{y} - \alpha_n) = \frac{c}{y^n} \rightarrow 0 as y \rightarrow \infty.

Thus, at least one of the terms on the left, which we assume WLOG to be the first, goes to zero. This means that there is a \delta>0 such that |\frac{x}{y} - \alpha_i | > \delta for i = 2, 3, \cdots, n.

Now, \displaystyle|\frac{c}{a}|. \frac{1}{|y|^n} = | \frac{x}{y} - \alpha_1 | \prod_{j=2}^n | \frac{x}{y} - \alpha_j | \geq | \frac{x}{y} - \alpha_1 | . \delta^{n-1}. Thus, |\frac{x}{y} - \alpha_1 | \leq | \frac{c}{a\delta^{n-1}}| . \frac{1}{|y|^n}. If there are infinitely many solutions (x,y)\in\mathbb Z^2, then |y| can be made arbitrarily large. But this contradicts Roth’s theorem if n>2.

Thus, we can say, X^3 + 3 Y^3 = 1 can have at most finitely many integer-valued solutions. This is a deep enough result to be interesting, isn’t it?

Recently, I came across a post on MO that asked for complicated proofs of trivial statements. The highest voted answer was :


Theorem: 2^{\frac{1}{n}} is irrational for n > 2.

Proof: Assume the contrary, say 2^{\frac{1}{n}} = \frac{x}{y}. Then, x^n = 2 y^n = y^n + y^n contradicting Fermat’s Last Theorem.

Remark: Unfortunately, FLT can’t prove the irrationality of \sqrt 2!


Tailpiece: I recently made a (just working) webpage on my institute website. Here is its link.

Result: Let {v: \mathbb{Q}^* \rightarrow \mathbb Z} be a discrete valuation, i.e., a map satisfying the following properties:

  • {v} is surjective.
  • {v(ab) = v(a) + v(b) \quad \text{for all} \quad a,b\in \mathbb Q^*}
  • {v(a+b) \geq \min\{ v(a), v(b) \} \quad \text{provided} \quad a+b \neq 0}

Then v=v_p for some prime p, given by, v_p\displaystyle\left(p^r \displaystyle\frac{a}{b} \right) = r for (a,p)=(b,p)=1.

Proof: It is a fact (cf. Dummit & Foote Ex 39 Sec. 7.4) that {R = \{ x \in \mathbb{Q}^* : v(x) \geq 0 \} \cup \{0\}} is a local ring with a unique maximal ideal {\mathfrak m} consisting of {0} and the elements of positive valuation. (Recollect that an element of {R} is a unit iff its valuation is zero.) Now, {v(1)=0 \Rightarrow 1 \in R \Rightarrow \mathbb Z \subseteq R}.
Claim: {\mathbb Z \cap \mathfrak m = (p)}. Clearly, {v} being surjective, {\mathbb Z \cap \mathfrak m \neq (0)} because otherwise, each nonzero integer would have valuation zero and so would every nonzero rational. If {\mathbb Z \cap \mathfrak m = (n)} then we factorize {n=ab} with {a,b} integers. Then {ab\in \mathfrak m \Rightarrow a\in \mathfrak m } or {b \in \mathfrak m}. Thus {\mathbb Z \cap \mathfrak m} must be a prime ideal. So the claim is justified.
Now given {\displaystyle\frac{a}{b}}, write {\displaystyle\frac{a}{b}=p^r \frac{a'}{b'}} with {(p,a')=(p,b')=1}. I claim that {a'} and {b'} are units of {R}. Since {(p,a')=1}, we can write {px+a'y=1} with integers {x, y}. If {a'} is not a unit then {a'\in \mathfrak m} and thus {1\in \mathfrak m}, a contradiction. A similar argument shows that {b'} is also a unit. Hence,
\displaystyle v\displaystyle\left(\frac{a}{b}\right)=rv(p)+v(a')-v(b')=rv(p).

Now {v} surjective implies, there exists {\displaystyle\frac{a}{b}=p^r \frac{a'}{b'}} such that {v\displaystyle\left(\displaystyle\frac{a}{b}\right)=rv(p)=1}. This leaves two possibilities, namely, {v(p)=\pm 1}. But {v(p)=-1} gives an easy contradiction: {-1=v(p)\geq \min\{v(1),v(p-1)\} = 0}. Thus the only possibility is that {v(p)=1} and thus, {v=v_p}. \blacksquare
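To make the statement concrete, here is a short sketch of v_p on nonzero rationals, with spot checks of the three defining properties on a few sample values. The function name `v_p` mirrors the notation above; the sample rationals are arbitrary.

```python
from fractions import Fraction

def v_p(x, p):
    """p-adic valuation of a nonzero rational x, for a prime p."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p is only defined on nonzero rationals")
    num, den = x.numerator, x.denominator
    r = 0
    while num % p == 0:   # powers of p in the numerator count positively
        num //= p
        r += 1
    while den % p == 0:   # powers of p in the denominator count negatively
        den //= p
        r -= 1
    return r

a, b = Fraction(50, 3), Fraction(-7, 25)

print(v_p(a, 5), v_p(b, 5))                               # 2 -2
print(v_p(a * b, 5) == v_p(a, 5) + v_p(b, 5))             # multiplicativity
print(v_p(a + b, 5) >= min(v_p(a, 5), v_p(b, 5)))         # ultrametric inequality
```

Surjectivity is visible from v_p(p^r) = r for every integer r.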

About me

Abhishek Parab

I? An Indian. A mathematics student. A former engineer. A rubik's cube addict. A nature photographer. A Pink Floyd fan. An ardent lover of Chess & Counter-Strike.
