What Can Be Computed
in Algebraic Geometry?

Dave Bayer Partially supported by NSF grant DMS-90-06116.    David Mumford
May 7, 1992

This paper evolved from a long series of discussions between the two authors, going back to around 1980, on the problems of making effective computations in algebraic geometry, and it took more definite shape in a survey talk given by the second author at a conference on Computer Algebra in 1984. The goal at that time was to bring together the perspectives of theoretical computer scientists and of working algebraic geometers, while laying out what we considered to be the main computational problems and bounds on their complexity. Only part of the talk was written down and since that time there has been a good deal of progress. However, the material that was written up may still serve as a useful introduction to some of the ideas and estimates used in this field (at least the editors of this volume think so), even though most of the results included here are either published elsewhere, or exist as “folk-theorems” by now.

The article has four sections. The first two parts are concerned with the theory of Gröbner bases; their construction provides the foundation for most computations, and their complexity dominates the complexity of most techniques in this area. The first part introduces Gröbner bases from a geometric point of view, relating them to a number of ideas which we take up in more detail in subsequent sections. The second part develops the theory of Gröbner bases more carefully, from an algebraic point of view. It could be read independently, and requires less background. The third part is an investigation into bounds in algebraic geometry of relevance to these computations. We focus on the regularity  of an algebraic variety (see Definition 3.2), which, beyond its intrinsic interest to algebraic geometers, has emerged as a measure of the complexity of computing Gröbner bases (see [BS87a], [BS87b], [BS88]). A principal result in this part is a bound on the regularity of any smooth variety by the second author: Theorem 3.12(b). This bound has stimulated subsequent work, and has now been generalized by [BEL91]. Another result of interest is Proposition 3.13, which elucidates the scheme structure of the ideal membership problem. The fourth part is a short discussion of work on algorithms for performing some other key operations on varieties, some open problems about these operations and some general ideas about what works and what doesn’t, reflecting the prejudices of the authors.

One of the difficulties in surveying this area of research is that mathematicians from so many specialties have gotten involved, and they tend both to publish in their own specialized journals and to have specific agendas corresponding to their area. Thus one group of researchers, the working algebraic geometers, are much more interested in actually computing examples than in worst-case complexity bounds. This group, including the first author, has put a great deal of work into building a functioning system, Macaulay, based on Gröbner bases, which has solved many problems and provided many examples to the algebraic geometry community [BS92a]. Another group comes from theoretical computer science and is much more interested in theoretical bounds than practical systems (cf. the provocative comments in Lenstra’s survey [Len92]). It seems to us that more communication would be very helpful: On the one hand, the working algebraic geometer knows lots of facts about varieties that can be very relevant to finding fast algorithms. Conversely asymptotic and/or worst-case performance bounds are sometimes, at least, important indicators of real-time performance. These theoretical bounds may also reveal important distinctions between classes of procedures, and may pose new and deep problems in algebraic geometry. Thus we will see in Section 3 how regularity estimates flesh out a picture explaining why Gröbner basis computations can have such explosive worst case behavior, yet be so useful for the kinds of problems typically posed by mathematicians. Finally, to make this article more useful in bridging this gap, we have tried to include a substantial number of references in our discussions below.

1 A Geometric Introduction

Let XX be a subvariety or a subscheme of projective nn-space 𝐏n{{\bf P}^{n}}, over a field kk. Let {\cal F} be a vector bundle or a coherent sheaf supported on XX. We would like to be able to manipulate such objects by computer. From algebra we get finite descriptions, amenable to such manipulations: Let S=k[x0,,xn]S=k[x_{0},\ldots,x_{n}] be the homogeneous coordinate ring of 𝐏n{{\bf P}^{n}}. Then XX can be taken to be the subscheme defined by a homogenous ideal ISI\subset S, and {\cal F} can be taken to be the sheaf associated to a finitely generated SS-module MM. We can represent II by a list of generators (f1,,fr)(f_{1},\ldots,f_{r}), and MM by a presentation matrix FF, where

M1FM0M0M_{1}\stackrel{F}{\longrightarrow}M_{0}\longrightarrow M\longrightarrow 0

presents MM as a quotient of finitely generated free SS-modules M0M_{0}, M1M_{1}. We concentrate on the case of an ideal II; by working with the submodule J=Im(F)M0J={\rm Im}(F)\subset M_{0}, the module case follows similarly.

The heart of most computations in this setting is a deformation of the input data to simpler data, combinatorial in nature: We want to move through a family of linear transformations of 𝐏n{{\bf P}^{n}} so that in the limit our objects are described by monomials. Via this family, we hope to pull back as much information as possible to the original objects of study.

Choose a one-parameter subgroup λ(t)GL(n+1)\lambda(t)\subset GL(n+1) of the diagonal form

λ(t)=[tw0tw1twn],\lambda(t)=\left[\begin{array}[]{cccc}t^{w_{0}}&&&\\ &t^{w_{1}}&&\\ &&\ldots&\\ &&&t^{w_{n}}\end{array}\right],

where W=(w0,,wn)W=(w_{0},\ldots,w_{n}) is a vector of integer weights. For each t0t\neq 0, λ(t)\lambda(t) acts on XX via a linear change of coordinates of 𝐏n{{\bf P}^{n}}, to yield the subscheme Xt=λ(t)XXX_{t}=\lambda(t)X\cong X. The limit

X0=limt0XtX_{0}=\lim_{t\rightarrow 0}X_{t}

is usually a simpler object, preferable to XX for many computational purposes.

Even if we start out by restricting XX to be a subvariety rather than a subscheme of 𝐏n{{\bf P}^{n}}, it does not suffice to take the limit X0X_{0} set-theoretically; often all we will get pointwise in the limit is a linear subspace L𝐏nL\subset\mbox{${{\bf P}^{n}}$}, reflecting little besides the dimension of the original variety XX. By instead allowing this limit to acquire embedded components and a nonreduced structure, we can obtain an X0X_{0} which reflects much more closely the character of XX itself.

We compute explicitly with the generators f1,,frf_{1},\ldots,f_{r} of II: Let λ\lambda act on SS by mapping each xix_{i} to twixit^{w_{i}}x_{i}; λ\lambda maps each monomial 𝐱A=x0a0xnan\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}=x_{0}^{a_{0}}\cdots x_{n}^{a_{n}} to tWA𝐱A=tw0a0++wnanx0a0xnant^{W\cdot A}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}=t^{w_{0}a_{0}+\ldots+w% _{n}a_{n}}x_{0}^{a_{0}}\cdots x_{n}^{a_{n}}. If f=a𝐱A+b𝐱B+f=a\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}+b\hskip{0.7227pt}{\bf x}\hskip{% 0.7227pt}^{B}+\ldots, then λf=atWA𝐱A+btWB𝐱B+\lambda f=a\,t^{W\cdot A}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}+b\,t^{W% \cdot B}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}+\ldots. We take the projective limit in(f)=limt0λf{\rm in}(f)=\lim_{t\rightarrow 0}\lambda f by collecting the terms of λf\lambda f involving the least power of tt; in(f){\rm in}(f) is then the sum of the terms a𝐱Aa\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} of ff so WAW\cdot A is minimal. For a given ff and most choices of λ\lambda, in(f){\rm in}(f) consists of a single term.

The limit X0X_{0} we want is defined with all its scheme structure by the ideal in(I)=limt0λI{\rm in}(I)=\lim_{t\rightarrow 0}\lambda I, generated by the set {in(f)| fI}\{\,{\rm in}(f)\,|\,f\in I\,\}. For a given II and most choices of λ\lambda, in(I){\rm in}(I) is generated by monomials. Unfortunately, this definition is computationally unworkable because II is an infinite set, and in(I){\rm in}(I) need not equal (in(f1),,in(fr))({\rm in}(f_{1}),\ldots,{\rm in}(f_{r})) for a given set of generators f1,,frf_{1},\ldots,f_{r} of II. To understand how to compute in(I){\rm in}(I), we need to look more closely at the family of schemes XtX_{t} defined by λ\lambda.

Let S[t]S[t] be the polynomial ring k[x0,,xn,t]k[x_{0},\ldots,x_{n},t]; we view S[t]S[t] as the coordinate ring of a one-parameter family of projective spaces 𝐏tn{\bf P}^{n}_{t} over the affine line with parameter tt. For each generator fjf_{j} of II, rescale λfj\lambda f_{j} so the lowest power of tt has exponent zero: Let gj=t-λfjg_{j}=t^{-\ell}\lambda f_{j}, where =WA\ell=W\cdot A is the least exponent of tt in λfj\lambda f_{j}. Then fj=gj|t=1f_{j}=g_{j}\!\!\mid_{t=1} and in(fj)=gj|t=0{\rm in}(f_{j})=g_{j}\!\!\mid_{t=0}. Now, let JS[t]J\subset S[t] be the ideal generated by (g1,,gr)(g_{1},\ldots,g_{r}); JJ defines a family YY over 𝐀1{{\bf A}^{1}} whose central fiber is cut out by (in(f1),,in(fr))({\rm in}(f_{1}),\ldots,{\rm in}(f_{r})).

What is wrong with the family YY? YY can have extra components over t=0t=0, which bear no relation to its limiting behavior as t0t\rightarrow 0. Just as the set-theoretic limit limt0Xt\lim_{t\rightarrow 0}X_{t} can be too small (we need the nonreduced structure), this algebraically defined limit can be too big; the natural limit lies somewhere in between.

The notion of a flat  family captures exactly what we are looking for here. For example, if YY is flat, then there are no extra components over t=0t=0. While the various technical definitions of flatness can look daunting to the newcomer, intuitively flatness captures exactly the idea that every fiber of a family is the natural scheme-theoretic continuation of its neighboring fibers.

In our setting, all the XtX_{t} are isomorphic for t0t\neq 0, so we only need to consider flatness in a neighborhood of t=0t=0. Artin [Art76] gives a criterion for flatness applicable here: The syzygies  of g1,,grg_{1},\ldots,g_{r} are the relations h1g1++hrgr=0h_{1}g_{1}+\ldots+h_{r}g_{r}=0 for h1,,hrS[t]h_{1},\ldots,h_{r}\in S[t]. Syzygies correspond to elements (h1,,hr)(h_{1},\ldots,h_{r}) of the S[t]S[t]-module S[t]rS[t]^{r}; the set of all syzygies is a submodule of S[t]rS[t]^{r}. YY is a flat family at t=0t=0 if and only if the restrictions (h1|t=0,,hr|t=0)(h_{1}\!\!\mid_{t=0},\ldots,h_{r}\!\!\mid_{t=0}) of these syzygies to the central fiber generate the SS-module of syzygies of g1|t=0,,gr|t=0g_{1}\!\!\mid_{t=0},\ldots,g_{r}\!\!\mid_{t=0}.

When g1|t=0,,gr|t=0g_{1}\!\!\mid_{t=0},\ldots,g_{r}\!\!\mid_{t=0} are single terms, their syzygies take on a very simple form: The module of syzygies of two terms a𝐱Aa\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}, b𝐱Bb\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B} is generated by the syzygy b𝐱C(a𝐱A)-a𝐱D(b𝐱B)=0b\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}(a\hskip{0.7227pt}{\bf x}\hskip{0.% 7227pt}^{A})-a\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{D}(b\hskip{0.7227pt}{% \bf x}\hskip{0.7227pt}^{B})=0, where 𝐱E=𝐱C𝐱A=𝐱D𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{E}=\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{C}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}=\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{D}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B} is the least common multiple of 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} and 𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}. The module of syzygies of rr such terms is generated (usually not minimally) by the syzygies on all such pairs.

We want to lift these syzygies to syzygies of g1,,grg_{1},\ldots,g_{r}, working modulo increasing powers of tt until each syzygy lifts completely. Whenever we get stuck, we will find ourselves staring at a new polynomial gr+1g_{r+1} so tgr+1Jt^{\ell}g_{r+1}\in J for some >0\ell>0. Including gr+1g_{r+1} in the definition of a new JJJ^{\prime}\supset J has no effect on the family defined away from t=0t=0, but will cut away unwanted portions of the central fiber; what we are doing is removing tt-torsion. By iterating this process until every syzygy lifts, we obtain explicit generators g1,,gr,gr+1,,gsg_{1},\ldots,g_{r},g_{r+1},\ldots,g_{s} for a flat family describing the degeneration of X=X1X=X_{1} to a good central fiber X0X_{0}. The corresponding generators g1|t=1,,gs|t=1g_{1}\!\!\mid_{t=1},\ldots,g_{s}\!\!\mid_{t=1} of II are known as a Gröbner basis for II.

This process is best illustrated by an example. Let S=k[w,x,y,z]S=k[w,x,y,z] be the coordinate ring of 𝐏3{{\bf P}^{3}}, and let I=(f1,f2,f3)SI=(f_{1},f_{2},f_{3})\subset S for

f1=w2-xy, f2=wy-xz, f3=wz-y2.f_{1}=w^{2}-xy,\;f_{2}=wy-xz,\;f_{3}=wz-y^{2}.

II defines a twisted cubic curve X𝐏3X\subset\mbox{${{\bf P}^{3}}$}; XX is the image of the map (r,s)(r2s,r3,rs2,s3)(r,s)\mapsto(r^{2}s,r^{3},rs^{2},s^{3}). Let

λ(t)=[t-16t-4t-1t0].\lambda(t)=\left[\begin{array}[]{cccc}t^{-16}&&&\\ &t^{-4}&&\\ &&t^{-1}&\\ &&&t^{0}\end{array}\right].

If waxbyczdw^{a}x^{b}y^{c}z^{d} is a monomial of degree <4<4, then λwaxbyczd=t-waxbyczd\lambda\cdot w^{a}x^{b}y^{c}z^{d}=t^{-\ell}w^{a}x^{b}y^{c}z^{d} where =16a+4b+c\ell=16a+4b+c. Thus, sorting the monomials of SS of each degree <4<4 by increasing powers of tt with respect to the action of λ\lambda is equivalent to sorting the monomials of each degree in lexicographic order.

We have

g1\displaystyle g_{1} =t32λf1\displaystyle=\;t^{32}\lambda f_{1} =w2-t27xy,\displaystyle=\;w^{2}-t^{27}xy,
g2\displaystyle g_{2} =t17λf2\displaystyle=\;t^{17}\lambda f_{2} =wy-t13xz,\displaystyle=\;wy-t^{13}xz,
g3\displaystyle g_{3} =t16λf3\displaystyle=\;t^{16}\lambda f_{3} =wz-t14y2.\displaystyle=\;wz-t^{14}y^{2}.

The module of syzygies on w2w^{2}, wywy, wzwz is generated by the three possible pairwise syzygies; we start with the syzygy y(w2)-w(wy)=0y(w^{2})-w(wy)=0. Substituting g1g_{1}, g2g_{2} for the lead terms w2w^{2}, wywy we get

y(w2-t27xy)-w(wy-t13xz)=t13wxz-t27xy2y(w^{2}-t^{27}xy)-w(wy-t^{13}xz)=t^{13}wxz-t^{27}xy^{2}

which is a multiple t13xt^{13}x of g3g_{3}. Thus, the syzygy

yg1-wg2-t13xg3=0yg_{1}-wg_{2}-t^{13}xg_{3}=0

of g1g_{1}, g2g_{2}, g3g_{3} restricts to the monomial syzygy y(w2)-w(wy)=0y(w^{2})-w(wy)=0 when we substitute t=0t=0, as desired.

Similarly, the syzygy

zg1-t14yg2-wg3=0zg_{1}-t^{14}yg_{2}-wg_{3}=0

restricts to the monomial syzygy z(w2)-w(wz)=0z(w^{2})-w(wz)=0. When we attempt to lift z(wy)-y(wz)=0z(wy)-y(wz)=0, however, we find that

z(wy-t13xz)-y(wz-t14y2)=-t13xz2+t14y3.z(wy-t^{13}xz)-y(wz-t^{14}y^{2})=-t^{13}xz^{2}+t^{14}y^{3}.

xz2xz^{2} is not a multiple of w2w^{2}, wywy, or wzwz, so we cannot continue; J=(g1,g2,g3)J=(g_{1},g_{2},g_{3}) does not define a flat family. Setting t=1t=1, the troublesome remainder is -xz2+y3-xz^{2}+y^{3}. Making this monic, let f4=xz2-y3f_{4}=xz^{2}-y^{3}; f4If_{4}\in I and

g4=t4λf4=xz2-ty3.g_{4}=\;t^{4}\lambda f_{4}=\;xz^{2}-ty^{3}.

Adjoin g4g_{4} to the ideal JJ, redefining the family YY. Now,

zg2-yg3+t13g4=0zg_{2}-yg_{3}+t^{13}g_{4}=0

restricts to z(wy)-y(wz)=0z(wy)-y(wz)=0 as desired.

The module of syzygies of w2w^{2}, wywy, wzwz, and xz2xz^{2} is generated by the pairwise syzygies we have already considered, and by the syzygy xz(wz)-w(xz2)=0xz(wz)-w(xz^{2})=0, which is the restriction of

-ty2g2+xzg3-wg4=0.-ty^{2}g_{2}+xzg_{3}-wg_{4}=0.

Thus, J=(g1,g2,g3,g4)J=(g_{1},g_{2},g_{3},g_{4}) defines a flat family YY, and

w2-xy, wy-xz, wz-y2, xz2-y3w^{2}-xy,\;wy-xz,\;wz-y^{2},\;xz^{2}-y^{3}

is a Gröbner basis for II. The limit X0X_{0} is cut out by the monomial ideal in(I)=(w2,wy,wz,xz2){\rm in}(I)=(w^{2},wy,wz,xz^{2}), which we shall see shares many properties with the original ideal II. Note that xz2-y3=0xz^{2}-y^{3}=0 defines the projection of XX to the plane 𝐏2{{\bf P}^{2}} in xx, yy, and zz.

The scheme structure of X0X_{0} is closely related to the combinatorial structure of the monomial kk-basis for S/in(I)S/{\rm in}(I): For each degree dd in our example, the monomials not belonging to in(I){\rm in}(I) consist of three sets {xd,xd-1y,,yd}\{x^{d},x^{d-1}y,\ldots,y^{d}\}, {xd-1z,xd-2yz,,yd-1z}\{x^{d-1}z,x^{d-2}yz,\ldots,y^{d-1}z\}, {yd,yd-1z,,zd}\{y^{d},y^{d-1}z,\ldots,z^{d}\}, and a lone extra monomial xd-1wx^{d-1}w. The first two sets correspond to a double line supported on w=z=0w=z=0, the third set to the line w=x=0w=x=0, and the extra monomial to an embedded point supported at w=y=z=0w=y=z=0. Together, this describes the scheme structure of X0X_{0}. The first two sets consist of d+1d+1 and dd monomials, respectively; the third set adds d-1d-1 new monomials, and overlaps two monomials we have already seen. With the extra monomial, we count 3d+13d+1 monomials in each degree, which agrees with the dimensions of the graded pieces of S/IS/I. The embedded point is crucial; it makes this count come out right, and it alone keeps X0X_{0} nonplanar like XX.

The new monomial generator xz2xz^{2} of in(I){\rm in}(I) excludes the line w=y=0w=y=0 from X0X_{0}; combinatorially, it excludes all but three monomials of the set {xd,xd-1z,,zd}\{x^{d},x^{d-1}z,\ldots,z^{d}\} from the monomial kk-basis for each degree of the quotient S/in(I)S/{\rm in}(I). We can see that this line is unwanted as follows: Away from t=0t=0, YY is parametrized by (r,s,t)(t16r2s,t4r3,trs2,s3,t)(r,s,t)\mapsto(t^{16}r^{2}s,t^{4}r^{3},trs^{2},s^{3},t). Thus, fixing rr and ss, the curve (r,ts,t)(t17r2s,t4r3,t3rs2,t3s3,t)(r,ts,t)\mapsto(t^{17}r^{2}s,t^{4}r^{3},t^{3}rs^{2},t^{3}s^{3},t), with projective limit (0,0,r,s,0)(0,0,r,s,0) as t0t\rightarrow 0. Similarly, the curve (r,t3s,t2)(r,t^{3}s,t^{2}) has as its limit (0,r2,s2,0,0)(0,r^{2},s^{2},0,0). These calculations show that the lines w=z=0w=z=0 and w=x=0w=x=0 indeed belong set-theoretically to the limit X0X_{0}. We can find no such curve whose limit is a general point on the line w=y=0w=y=0, for (r,t4s,t3)(r,t^{4}s,t^{3}) doesn’t work. Thus, the line w=y=0w=y=0 sticks out of the good total space YY.

One usually computes Gröbner bases by working directly in the ring SS, dispensing with the parameter tt. The one-parameter subgroup λ\lambda is replaced by a total order on the monomials of each degree, satisfying the multiplicative  property 𝐱A>𝐱B𝐱C𝐱A>𝐱C𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B}\Rightarrow\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\hskip{0.7227pt% }{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\hskip% {0.7227pt}{\bf x}\hskip{0.7227pt}^{B} for all 𝐱C\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}. In fact, for our purposes these are equivalent concepts: The weight vector WW associated with λ\lambda induces the order 𝐱A>𝐱BWA<WB\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B}\Longleftrightarrow W\cdot A<W\cdot B, which is a total multiplicative order in low degrees as long as no two monomials have the same weight. Conversely, given any multiplicative order and a degree bound dd, one can find many λ\lambda which induce this order on all monomials of degree <d<d. See [Bay82], [Rob85] for characterizations of such orders.

We shall be particularly interested in two multiplicative orders, the lexicographic  order used in our example, and the reverse lexicographic  order. The lexicographic order simply expands out the monomials of each degree into words, and sorts them alphabetically, i.e. 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B} iff the first nonzero entry in A-BA-B is positive. The reverse lexicographic order pushes highest powers of xnx_{n} in any expression back to the end, then within these groups pushes highest powers of xn-1x_{n-1} to the end, etc., i.e. 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B} iff the last nonzero entry of A-BA-B is negative.

What do these orders mean geometrically? The dominant effect of the lexicographic order is a projection from 𝐏n{{\bf P}^{n}} to 𝐏n-1{{\bf P}^{n-1}}, eliminating x0x_{0}. A second order effect is a projection to 𝐏n-2{{\bf P}^{n-2}}, and so forth. We could compute the deformation from XX to X0X_{0} with respect to the lexicographic order in stages carrying out these projections, first applying a λ\lambda with W=(-1,0,,0)W=(-1,0,\ldots,0), then with W=(-1,-1,0,,0)W=(-1,-1,0,\ldots,0), etc. Alternatively, for monomials of each degree <d<d, we can apply the single λ\lambda with W=(-dn-1,,-d,-1,0)W=(-d^{n-1},\ldots,-d,-1,0), generalizing the λ\lambda used in our example. Use of the lexicographic order tends to muck up the family YY more than necessary in most applications, because projections tend to complicate varieties.

For the reverse lexicographic order, the dominant effect is a projection of 𝐏n{{\bf P}^{n}} down to the last coordinate point (0,,0,1)(0,\ldots,0,1). As a secondary effect, this order projects down to the last coordinate line, and so forth. In other words, this order first tries to make XX into a cone over the last coordinate point, and only then tries to squash the result down to or cone it over the last coordinate line, etc. For monomials of each degree <d<d, this can be realized by applying λ\lambda with W=(0,1,d,,dn-1)W=(0,1,d,\ldots,d^{n-1}). Like such cones, the reverse lexicographic order enjoys special properties with respect to taking linear sections of XX or X0X_{0} by intersection with the spaces defined by the last variable(s) (see [BS87a]). The preferred status of the reverse lexicographic order can be attributed to this relationship, because generic linear sections do not complicate varieties.

For example, if we take XX to be three general points in 𝐏2{{\bf P}^{2}}, then using the lexicographic order X0X_{0} becomes a triple point on a line, because the first order effect is the projection of the three points to a line, and the second order limiting process keeps the points within this line. By contrast, if we use the reverse lexicographic order then X0X_{0} becomes the complete first order neighborhood of a point (a point doubled in all directions). This is because the first order limiting process brings the three points together from distinct directions, tracing out a cone over the three points. The first order neighborhood of the vertex in this cone has multiplicity 3, and is the same as the complete first order neighborhood in the plane of this vertex.

For those familiar with the theory of valuations in birational geometry [ZS76, Vol. II, Ch. VI], the lexicographic and reverse lexicographic orders have simple interpretations. Recall that if XX is a variety of dimension nn, and

F:X=Z0Z1Z2ZnF:X=Z_{0}\supset Z_{1}\supset Z_{2}\supset\ldots\supset Z_{n}

is a flag of subvarieties, codimX(Zi)=i{\rm codim}_{X}(Z_{i})=i, with ZiZ_{i} smooth at the generic point of Zi+1Z_{i+1}, then we can define a rank nn valuation vFv_{F} on XX as follows: For each i=1,,n-1i=1,\ldots,n-1, fix fif_{i} to be a function on Zi-1Z_{i-1} with a 1st1\raisebox{3.0pt}{\small{st}} order zero on ZiZ_{i}. Then for any function ff, we can define e1=ordZ1(f)e_{1}={\rm ord}_{Z_{1}}(f), e2=ordZ2((f/f1e1)|Z1)e_{2}={\rm ord}_{Z_{2}}((f/f_{1}^{e_{1}})\!\mid_{Z_{1}}), etc., and vF(f)=(e1,,en)𝐙nv_{F}(f)=(e_{1},\ldots,e_{n})\in\mbox{\bf Z}^{n}, where the value group 𝐙n\mbox{\bf Z}^{n} is ordered lexicographically. The arbitrarily chosen fif_{i} are not needed to compare two functions ff, gg: We have vF(f)vF(g)v_{F}(f)\succ v_{F}(g) if and only if ordZ1(f/g)>0{\rm ord}_{Z_{1}}(f/g)>0, or if this order is zero and ordZ2((f/g)|Z1)>0{\rm ord}_{Z_{2}}((f/g)\!\mid_{Z_{1}})>0, and so forth. Such a valuation also defines an order on each graded piece SdS_{d} of the homogeneous coordinate ring: take any f0Sdf_{0}\in S_{d} and say f>gf>g if and only if vF(f/f0)vF(g/f0)v_{F}(f/f_{0})\succ v_{F}(g/f_{0}). More generally, one may take the ZiZ_{i} to be subvarieties of a variety XX^{\prime} dominating XX and pull back functions to XX^{\prime} before computing vFv_{F}.

The lexicographic order on monomials of each degree of 𝐏n{{\bf P}^{n}} is now induced by the flag

𝐏nV(x0)V(x0,x1)V(x0,,xn-1).\mbox{${{\bf P}^{n}}$}\supset V(x_{0})\supset V(x_{0},x_{1})\supset\ldots% \supset V(x_{0},\ldots,x_{n-1}).

For example, the first step in the comparison defining vF(𝐱A/f0)vF(𝐱B/f0)v_{F}(\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}/f_{0})\succ v_{F}(\hskip{0.7% 227pt}{\bf x}\hskip{0.7227pt}^{B}/f_{0}) has the effect of asking if a0-b0>0a_{0}-b_{0}>0.

The reverse lexicographic order is induced by a flag on a blowup XX of 𝐏n{{\bf P}^{n}}: First blow up V(x0,,xn-1)V(x_{0},\ldots,x_{n-1}) and let E1E_{1} be the exceptional divisor. Next blow up the proper transform of V(x0,,xn-2)V(x_{0},\ldots,x_{n-2}), and let E2E_{2} be this exceptional divisor. Iterating, we can define a flag

XE1E1E2E1EnX\supset E_{1}\supset E_{1}\cap E_{2}\supset\ldots\supset E_{1}\cap\ldots\cap E% _{n}

which induces the reverse lexicographic order on monomials in each degree. For example, looking at the affine piece of the first blow up obtained by substituting x0=x0xn-1, , xn-2=xn-2xn-1x_{0}=x_{0}^{\prime}x_{n-1},\;\ldots,\;x_{n-2}=x_{n-2}^{\prime}x_{n-1}, the power of xn-1x_{n-1} in the transform of 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} is a0++an-1a_{0}+\ldots+a_{n-1}, which is the order of vanishing of this monomial on E1E_{1}. Thus, the first step in the comparison defining vF(𝐱A/f0)vF(𝐱B/f0)v_{F}(\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}/f_{0})\succ v_{F}(\hskip{0.7% 227pt}{\bf x}\hskip{0.7227pt}^{B}/f_{0}) has the effect of asking if a0++an-1-b0--bn-1>0a_{0}+\ldots+a_{n-1}-b_{0}-\ldots-b_{n-1}>0, which is what we want.

Taking into account the equivalence between multiplicative orders and one-parameter subgroups, the process we have described in S[t]S[t] is exactly the usual algorithm for computing Gröbner bases. It is computationally advantageous to set t=1t=1 and dismiss our extra structure as unnecessary scaffolding, but it is conceptually advantageous to treat our viewpoint as what is “really” going on; many techniques of algebraic geometry become applicable to the family YY, and assist in analyzing the complexity of Gröbner bases. Moreover, this picture may help guide improvements to the basic algorithm. For example, for very large problems, it could be computationally more efficient to degenerate to X0X_{0} in several stages; this has not been tried in practice.

The coarsest measure of the complexity of a Gröbner basis is its maximum degree, which is the highest degree of a generator of the ideal in(I){\rm in}(I) defining X0X_{0}. This quantity is bounded by the better-behaved regularity  of in(I){\rm in}(I): The regularity of an ideal II is the maximum over all ii of the degree minus ii of any minimal iith syzygy of II, treating generators as 00th syzygies. When II is the largest (the saturated) ideal defining a scheme XX, we call this the regularity of XX. We take up regularity in detail in Section 3; here it suffices to know that regularity is upper semi-continuous  on flat families, i.e. the regularity can only stay the same or go up at special fibers.

Let reg(I){\rm reg}(I) denote the regularity of II, and reg0(I){\rm reg}_{0}(I) denote the highest degree of a generator of II. In our case, t=0t=0 is the only special fiber, and the above says that

reg0(I)reg(I)reg(in(I))reg0(in(I)),{\rm reg}_{0}(I)\;\leq\;{\rm reg}(I)\;\leq\;{\rm reg}({\rm in}(I))\;\geq\;{\rm reg% }_{0}({\rm in}(I)),

where reg0(I){\rm reg}_{0}(I) can be immediately determined from the input data, and reg0(in(I)){\rm reg}_{0}({\rm in}(I)) is the degree-complexity of the Gröbner basis computation. In practice, each of these inequalities are often strict.

However when kk is infinite, then for any set of coordinates for 𝐏n{{\bf P}^{n}} chosen from a dense open set UGL(n+1)U\subset GL(n+1) of possibilities, Galligo ([Gal74]; see also [BS87b]) has shown that the limiting ideal in(I){\rm in}(I) takes on a very special form: in(I){\rm in}(I) is invariant under the action of the Borel subgroup of upper triangular matrices in GL(n+1)GL(n+1). This imposes strong geometric conditions on X0X_{0}. In particular, the associated primes of in(I){\rm in}(I) are also Borel-fixed, so they are all of the form (x0,,xi)(x_{0},\ldots,x_{i}) for various ii. This means that the components of X0X_{0} are supported on members of a flag.

In characteristic zero, it is shown in [BS87a] that the regularity of a Borel-fixed ideal is exactly the maximum of the degrees of its generators, or in our notation, that reg(in(I))=reg0(in(I)){\rm reg}({\rm in}(I))={\rm reg}_{0}({\rm in}(I)) when in(I){\rm in}(I) is Borel-fixed. Thus, for generic coordinates in characteristic zero, the degree-complexity of computing Gröbner bases breaks down into two effects: the gap reg0(I)reg(I){\rm reg}_{0}(I)\leq{\rm reg}(I) between the input degrees and the regularity of XX, and the gap reg(I)reg(in(I)){\rm reg}(I)\leq{\rm reg}({\rm in}(I)) allowed by upper-semicontinuity.

A combination of theoretical results, hunches and experience guides the practitioner in assessing the first gap; what about the second? Does the regularity have to jump at all? One can easily find examples of ideals and total orders exhibiting such a jump, but in [BS87a], it is shown that for the reverse lexicographic order, in generic coordinates and any characteristic, there is no jump: reg(I)=reg(in(I)){\rm reg}(I)={\rm reg}({\rm in}(I)), so in characteristic zero we have

reg0(in(I))=reg(I).{\rm reg}_{0}({\rm in}(I))={\rm reg}(I).

In this sense, this order is an optimal choice: For the reverse lexicographic order, the degree-complexity of a Gröbner basis computation is exactly the regularity of the input data. This agrees with experience; computations made on the same inputs using the lexicographic order can climb to much higher degrees than the reverse lexicographic order, in practice.

For many applications, one is free to choose any order, but some problems restrict us to using orders satisfying combinatorial properties which the reverse lexicographic order fails to satisfy. An example, developed further in Section 2, is that of eliminating variables, or equivalently, of computing projections. To compute the intersection of II with a subring R=k[xi,,xn]R=k[x_{i},\ldots,x_{n}], it is necessary to use an order which in each degree sorts all monomials not in RR ahead of any monomial in RR. The lexicographic order is an example of such an order, for each ii simultaneously. This strength comes at a cost; we are paying in regularity gaps for properties we may not need in a particular problem. An optimal order if you need one specific projection (in the same sense as above) is constructed by sorting monomials by total degree in the variables to be eliminated, and then breaking ties using the reverse lexicographic order. See [BS87b] for this result, and a generalization to the problem of optimally refining any nonstrict order.

Using this elimination order, one finds that the inherent degree-complexity of a computation is given not by the regularity of XX itself, but rather by the regularity of the flat projection  XX^{\prime} of XX, which is the central fiber of a flat family which animates the desired projection of XX as t0t\rightarrow 0. The jump in regularity between XX and XX^{\prime} is unavoidable; by choosing an optimal order, we avoid the penalty of a further jump in regularity between XX^{\prime} and X0X_{0}.

The regularity of algebraic varieties or schemes XX is far from being well understood, but there is considerable interest in its study; this computational interpretation of regularity as the inherent degree-complexity of an ideal is but one more log on the fire.

From a theoretical computer science perspective, the full complexity of computing Gröbner bases is determined not merely by the highest degree reg0(I){\rm reg}_{0}(I) in the basis, but by the total number of arithmetic operations in the field kk required to compute this basis. This has not been analyzed in general, but for 00-dimensional ideals II, Lakshman and Lazard ([Lak91], [LL91]) have shown that the complexity of computing reduced Gröbner bases is bounded by a polynomial in dnd^{n}, where dd is the maximum degree of the generators, and nn is the number of variables.

2 Gröbner Bases

Let S=k[x0,,xn]S=k[x_{0},\ldots,x_{n}] be a graded polynomial ring over the field kk, and let ISI\subset S be a homogeneous ideal.

Let SdS_{d} denote the finite vector space of all homogeneous, degree d polynomials in SS, so S=S0S1SdS=S_{0}\oplus S_{1}\oplus\ldots\oplus S_{d}\oplus\ldots. Writing II in the same manner as I=I0I1IdI=I_{0}\oplus I_{1}\oplus\ldots\oplus I_{d}\oplus\ldots, we have IdSdI_{d}\subset S_{d} for each dd. Recall that the Hilbert function of II is defined to be the function p(d)=dim(Id)p(d)=\dim(I_{d}), for d0d\geq 0.

A total order >> on the monomials of SS is said to be multiplicative  if whenever 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B} for two monomials 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}, 𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}, then 𝐱C𝐱A>𝐱C𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\hskip{0.7227pt}{\bf x}\hskip{0.722% 7pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B} for all monomials 𝐱C\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}. This condition insures that if the terms of a polynomial are in order with respect to >>, then they remain in order after multiplication by a monomial.

Definition 2.1

Let >> be a multiplicative order. For a homogeneous polynomial f=c1𝐱A1++cm𝐱Amf=c_{1}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A_{1}}+\ldots+c_{m}\hskip{0.72% 27pt}{\bf x}\hskip{0.7227pt}^{A_{m}} with 𝐱A1>>𝐱Am\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A_{1}}>\ldots>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{A_{m}}, define the initial term in(f){\rm in}(f) to be the lead (that is, the largest) term c1𝐱A1c_{1}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A_{1}} of ff. For a homogeneous ideal ISI\subset S, define the initial ideal in(I){\rm in}(I) to be the monomial ideal generated by the lead terms of all elements of II.

Note that the definitions of in(f){\rm in}(f) and in(I){\rm in}(I) depend on the choice of multiplicative order >>. See [BM88] and [MR88] for characterizations of the finite set of in(I){\rm in}(I) realized as the order >> varies.

Fix a multiplicative order >> on SS.

Proposition 2.2 (Macaulay)

I and in(I) have the same Hilbert function.

Proof.([Mac27]) The lead terms of IdI_{d} span in(I)d{\rm in}(I)_{d}, because every monomial 𝐱Ain(I)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}\in{\rm in}(I) is itself the lead term in(f){\rm in}(f) of some polynomial fIf\in I: Since 𝐱A=𝐱C𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}=\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{C}\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B} for some 𝐱B=in(g)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}={\rm in}(g) with gIg\in I, we have 𝐱A=in(f)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}={\rm in}(f) for f=𝐱Cgf=\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}g.

Choose a kk-basis BdIdB_{d}\subset I_{d} with distinct lead terms, and let in(Bd){\rm in}(B_{d}) be the set of lead terms of BdB_{d}; in(Bd){\rm in}(B_{d}) has cardinality p(d)=dim(Id)p(d)=\dim(I_{d}). Since any element of IdI_{d} is a linear combination of elements of BdB_{d}, any lead term of IdI_{d} is a scalar multiple of an element of in(Bd){\rm in}(B_{d}). Thus, in(Bd){\rm in}(B_{d}) is a basis for in(I)d{\rm in}(I)_{d}, so p(d)=dim(in(I)d).p(d)=\dim({\rm in}(I)_{d}).      

One can compute the Hilbert function of II by finding in(I){\rm in}(I) and applying this result; see [MM83], [BCR91], and [BS92b].

Corollary 2.3

The monomials of S which don’t belong to in(I){\rm in}(I) form a kk-basis for S/IS/I.

Proof. These monomials are linearly independent in S/IS/I, because any linear relation among them is a polynomial belonging to II, and all such polynomials have lead terms belonging to in(I){\rm in}(I). These monomials can be seen to span S/IS/I by a dimension count, applying Proposition 2.2.      

Two examples of multiplicative orders are the lexicographic order and the reverse lexicographic order. 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B} in the lexicographic order if the first nonzero coordinate of A-BA-B is positive. For example, if S=k[w,x,y,z]S=k[w,x,y,z], then w>x>y>zw>x>y>z in S1S_{1}, and

w2>wx>wy>wz>x2>xy>xz>y2>yz>z2w^{2}>wx>wy>wz>x^{2}>xy>xz>y^{2}>yz>z^{2}

in S2S^{2}.

𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B} in the reverse lexicographic order if the last nonzero coordinate of A-BA-B is negative. For example, if S=k[w,x,y,z]S=k[w,x,y,z], then w>x>y>zw>x>y>z in S1S_{1}, and

w2>wx>x2>wy>xy>y2>wz>xz>yz>z2w^{2}>wx>x^{2}>wy>xy>y^{2}>wz>xz>yz>z^{2}

in S2S^{2}. These two orders agree on S1S_{1}, but differ on the monomials of SS of degree >1>1 when n2n\geq 2.

The lexicographic order has the property that for each subring k[xi,,xn]Sk[x_{i},\ldots,x_{n}]\subset S and each polynomial fSf\in S, fk[xi,,xn]f\in k[x_{i},\ldots,x_{n}] if and only if in(f)k[xi,,xn]{\rm in}(f)\in k[x_{i},\ldots,x_{n}]. The reverse lexicographic order has the property that for each fk[x0,,xi]f\in k[x_{0},\ldots,x_{i}], xix_{i} divides ff if and only if xix_{i} divides in(f){\rm in}(f).

One can anticipate the applications of these properties by considering a kk-basis BdIdB_{d}\subset I_{d} with distinct lead terms, as in the proof of Proposition 2.2. With respect to the lexicographic order, Bdk[xi,,xn]B_{d}\cap k[x_{i},\ldots,x_{n}] is then a kk-basis for Idk[xi,,xn]I_{d}\cap k[x_{i},\ldots,x_{n}] for each ii. With respect to the reverse lexicographic order, Bd(xn)B_{d}\cap(x_{n}) is then a kk-basis for Id(xn)I_{d}\cap(x_{n}). Thus, these orders enable us to find polynomials in an ideal which do not involve certain variables, or which are divisible by a certain variable. For a given degree dd, one could construct such a basis BdB_{d} by applying Gaussian elimination to an arbitrary kk-basis for IdI_{d}. However, this cannot be done for all dd at once; such a computation would be infinite. We will finesse this difficulty by instead constructing a finite set of elements of I whose monomial multiples yield polynomials in II with every possible lead term.

Such sets can be described as follows:

Definition 2.4

A list F=[f1,,fr]IF=[f_{1},\ldots,f_{r}]\subset I is a (minimal) Gröbner basis for II if in(f1),,in(fr){\rm in}(f_{1}),\ldots,{\rm in}(f_{r}) (minimally) generate in(I){\rm in}(I).

in(I){\rm in}(I) is finitely generated because SS is Noetherian, so Gröbner bases exist for any ideal I.

The order of the elements of FF is immaterial to this definition, so FF can be thought of as a set. We are using list notation for FF because we are going to consider algorithms for which the order of the elements is significant. For convenience, we shall extend the notation of set intersections and containments to the lists FF.

A minimal set of generators for an ideal II need not form a Gröbner basis for I. For example, if S=k[x,y]S=k[x,y] and I=(x2+y2,xy)I=(x^{2}+y^{2},xy), then with respect to the lexicographic order, in(x2+y2)=x2{\rm in}(x^{2}+y^{2})=x^{2} and in(xy)=xy{\rm in}(xy)=xy. Yet y(x2+y2)-x(xy)=y3Iy(x^{2}+y^{2})-x(xy)=y^{3}\in I, so y3in(I)y^{3}\in{\rm in}(I). Thus, any Gröbner basis for II must include y3y^{3}; it can be shown that in(I)=(x2,xy,y3){\rm in}(I)=(x^{2},xy,y^{3}) and [x2+y2,xy,y3][x^{2}+y^{2},xy,y^{3}] is a Gröbner basis for II.

On the other hand,

Lemma 2.5

If F=[f1,,fr]F=[f_{1},\ldots,f_{r}] is a Gröbner basis for II, then f1,,frf_{1},\ldots,f_{r} generate II.

Proof. For each degree dd, we can construct a kk-basis BdIdB_{d}\subset I_{d} with distinct lead terms, whose elements are monomial multiples of f1,,frf_{1},\ldots,f_{r}: For each 𝐱Ain(I)d\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}\in{\rm in}(I)_{d}, 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} is a scalar multiple of 𝐱Cin(fi)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}{\rm in}(f_{i}) for some 𝐱C\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C} and some ii; include 𝐱Cfi\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}f_{i} in the set BdB_{d}. Thus, the monomial multiples of f1,,frf_{1},\ldots,f_{r} span II.      

Proposition 2.6 (Spear, Trinks)

Let RSR\subset S be the subring R=k[xi,,xn]R=k[x_{i},\ldots,x_{n}]. If F=[f1,,fr]F=[f_{1},\ldots,f_{r}] is a Gröbner basis for the ideal II with respect to the lexicographic order, then FRF\cap R is a Gröbner basis for the ideal IRI\cap R. In particular, FRF\cap R generates IRI\cap R.

Proof. ([Spe77], [Zac78], [Tri78]) Let fIRf\in I\cap R; in(f){\rm in}(f) is a multiple of in(fi){\rm in}(f_{i}) for some ii. Since in(f)R{\rm in}(f)\in R, in(fi)R{\rm in}(f_{i})\in R, so fiRf_{i}\in R. Thus, FRF\cap R is a Gröbner basis for IRI\cap R. By Lemma 2.5, FRF\cap R generates IRI\cap R.      

Proposition 2.6 has the following geometric application: If II defines the subscheme X𝐏nX\subset\mbox{\bf P}^{n}, then Ik[xi,,xn]I\cap k[x_{i},\ldots,x_{n}] defines the projection of XX to 𝐏n-i=Proj(k[xi,,xn])\mbox{\bf P}^{n-i}={\rm Proj}(k[x_{i},\ldots,x_{n}]).

Recall that the saturation IsatI^{\rm sat} of II is defined to be the largest ideal defining the same subscheme X𝐏nX\subset\mbox{\bf P}^{n} as II. IsatI^{\rm sat} can be obtained by taking an irredundant primary decomposition for II, and removing the primary ideal whose associated prime is the irrelevant ideal (x0,,xn)(x_{0},\ldots,x_{n}). II is saturated if I=IsatI=I^{\rm sat}.

If the ideal II is saturated, and defines a finite set of points X𝐏nX\subset\mbox{\bf P}^{n}, then Ik[xn-1,xn]I\cap k[x_{n-1},x_{n}] is a principal ideal (f)(f), where {f=0}\{f=0\} is the image of the projection of XX to 𝐏1=Proj(k[xn-1,xn])\mbox{\bf P}^{1}={\rm Proj}(k[x_{n-1},x_{n}]). Given a linear factor of ff of the form (bxn-1-axn)(bx_{n-1}-ax_{n}), we can make the substitution xn-1=azx_{n-1}=az, xn=bzx_{n}=bz for a new variable zz, to obtain from II an ideal Jk[x0,,xn-2,z]J\subset k[x_{0},\ldots,x_{n-2},z] defining a finite set of points in 𝐏n-1\mbox{\bf P}^{n-1}. For each point (c0,,cn-2,d)(c_{0},\ldots,c_{n-2},d) in the zero locus of JJ, (c0,,cn-2,ad,bd)(c_{0},\ldots,c_{n-2},ad,bd) is a point in the zero locus of II.

If X𝐏n-1X\subset\mbox{\bf P}^{n-1} is of dimension 11 or greater, then in general Ik[xn-1,xn]=(0)I\cap k[x_{n-1},x_{n}]=(0), because a generic projection of XX to 𝐏1\mbox{\bf P}^{1} is surjective. In this case, an arbitrary substitution xn-1=azx_{n-1}=az, xn=bzx_{n}=bz can be made, and the process of projecting to 𝐏1\mbox{\bf P}^{1} iterated. Thus, the lexicographic order can be used to find solutions to systems of polynomial equations.

Recall that the ideal quotient (I:f)(I:f) is defined to be the ideal {gS| fgI}\{\,g\in S\,|\,fg\in I\,\}. Since SS is Noetherian, the ascending chain of ideals (I:f)(I:f2)(I:f3)(I:f)\subset(I:f^{2})\subset(I:f^{3})\subset\ldots is stationary; call this stationary limit (I:f)={gS|fmgI for some m}(I:f^{\infty})=\{\,g\in S\,|\,f^{m}g\in I\mbox{ for some }m\,\}.

Proposition 2.7

If [xna1f1,,xnarfr][x_{n}^{a_{1}}f_{1},\ldots,x_{n}^{a_{r}}f_{r}] is a Gröbner basis for the ideal II with respect to the reverse lexicographic order, and if none of f1,,frf_{1},\ldots,f_{r} are divisible by xnx_{n}, then F=[f1,,fr]F=[f_{1},\ldots,f_{r}] is a Gröbner basis for the ideal (I:xn)(I:x_{n}^{\infty}). In particular, f1,,frf_{1},\ldots,f_{r} generate (I:xn)(I:x_{n}^{\infty}).

Proof. ([Bay82], [BS87a]) We have F(I:xn)F\subset(I:x_{n}^{\infty}). Let f(I:xn)f\in(I:x_{n}^{\infty}); xnmfIx_{n}^{m}f\in I for some mm, so in(xnmf){\rm in}(x_{n}^{m}f) is a multiple of in(xnaifi){\rm in}(x_{n}^{a_{i}}f_{i}) for some ii. Since fif_{i} is not divisible by xnx_{n}, in(fi){\rm in}(f_{i}) is not divisible by xnx_{n}, so in(f){\rm in}(f) is a multiple of in(fi){\rm in}(f_{i}). Thus, FF is a Gröbner basis for (I:xn)(I:x_{n}^{\infty}). By Lemma 2.5, f1,,frf_{1},\ldots,f_{r} generate (I:xn)(I:x_{n}^{\infty}).      

If I=𝐪0𝐪1𝐪tI=\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{0}\cap\hskip{0.7227pt}{\bf q}\hskip% {0.7227pt}_{1}\cap\ldots\cap\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{t} is a primary decomposition of II, then (I:xn)=(𝐪i:xn)=(𝐪i:xn)(I:x_{n}^{\infty})=(\cap\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}:x_{n}^{% \infty})=\cap(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}:x_{n}^{\infty}). We have (𝐪i:xn)=(1)(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}:x_{n}^{\infty})=(1) if the associated prime 𝐩i\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{i} of 𝐪i\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i} contains xnx_{n}, and (𝐪i:xn)=𝐪i(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}:x_{n}^{\infty})=\hskip{0.7227pt}{% \bf q}\hskip{0.7227pt}_{i} otherwise. Thus, if II defines the subscheme X𝐏nX\subset\mbox{\bf P}^{n}, then (I:xn)(I:x_{n}^{\infty}) defines the subscheme consisting of those primary components of XX not supported on the hyperplane {xn=0}\{x_{n}=0\}.

(I:xn)(I:x_{n}^{\infty}) is saturated, because it cannot have (x0,,xn)(x_{0},\ldots,x_{n}) as an associated prime. If xnx_{n} belongs to none of the associated primes of II except (x0,,xn)(x_{0},\ldots,x_{n}), or equivalently if {xn=0}\{x_{n}=0\} is a generic hyperplane section of X𝐏nX\subset\mbox{\bf P}^{n}, then (I:xn)=Isat(I:x_{n}^{\infty})=I^{\rm sat}. Thus, the reverse lexicographic order can be used to find the saturation of II.

One of the most important uses of Gröbner bases is that they lead to canonical representations of polynomials modulo an ideal II, i.e. a division algorithm in which every fSf\in S is written canonically as f=gifi+hf=\sum g_{i}f_{i}+h, where [f1,,fr][f_{1},\ldots,f_{r}] is a Gröbner basis for II, and hh is the remainder after division.

Recall the division algorithm for inhomogeneous, univariate polynomials f(x)f(x), g(x)k[x]g(x)\in k[x]: Let in(f){\rm in}(f) denote the highest degree term of ff. The remainder of gg under division by ff can be recursively defined by

Rf(g)=Rf(g-cxaf)R_{f}(g)=R_{f}(g-cx^{a}f)

if in(f){\rm in}(f) divides in(g){\rm in}(g), where cxa=in(g)/in(f)cx^{a}={\rm in}(g)/{\rm in}(f), and by

Rf(g)=gR_{f}(g)=g

otherwise.

Division can be generalized to homogeneous polynomials f1,,fr,gSf_{1},\ldots,f_{r},g\in S, given a multiplicative order on SS ([Hir64], [Bri73], [Gal74], [Sch80]): The remainder RF(g)R_{F}(g) of gg under division by the list of polynomials F=[f1,,fr]F=[f_{1},\ldots,f_{r}] can be recursively defined by

RF(g)=RF(g-c𝐱Afi)R_{F}(g)=R_{F}(g-c\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}f_{i})

for the least ii so in(g){\rm in}(g) is a multiple c𝐱Ac\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} of in(fi){\rm in}(f_{i}), and by

RF(g)=in(g)+RF(g-in(g))R_{F}(g)={\rm in}(g)+R_{F}(g-{\rm in}(g))

if in(g){\rm in}(g) is not a multiple of any in(fi){\rm in}(f_{i}). RF(g)R_{F}(g) is an element of SS.

Thus, the fate of in(g){\rm in}(g) depends on whether or not in(g)(in(f1),,in(fr)){\rm in}(g)\in({\rm in}(f_{1}),\ldots,{\rm in}(f_{r})). Let II be the ideal generated by f1,,frf_{1},\ldots,f_{r}. If F=[f1,,fr]F=[f_{1},\ldots,f_{r}] fails to be a Gröbner basis for II, then the remainder is poorly behaved. For example, with respect to the lexicographic order on k[x,y]k[x,y],

R[xy,x2+y2](x2y)=x2y-x(xy)=0,R_{[xy,x^{2}+y^{2}]}(x^{2}y)=x^{2}y-x(xy)=0,

but

R[x2+y2,xy](x2y)=x2y-y(x2+y2)=-y3,R_{[x^{2}+y^{2},xy]}(x^{2}y)=x^{2}y-y(x^{2}+y^{2})=-y^{3},

so the remainder RF(g)R_{F}(g) is dependent on the order of the list FF. Note that x2y(x2+y2,xy)x^{2}y\in(x^{2}+y^{2},xy).

If on the other hand, FF is a Gröbner basis for the ideal II, then RF(g)R_{F}(g) is a kk-linear combination of monomials not belonging to in(I){\rm in}(I). By Corollary 2.3, these monomials form a kk-basis for S/IS/I, so each polynomial in SS has a unique representation in terms of this kk-basis, modulo the ideal II. The remainder gives this unique representation, and is independent of the order of FF (but dependent on the multiplicative order chosen for the monomials of SS). In particular, RF(g)=0R_{F}(g)=0 if and only if gIg\in I.

An algorithm for computing a Gröbner basis for II from a set of generators for II was first given by Buchberger ([Buc65], [Buc76]). This algorithm was discovered independently by Spear ([Spe77], [Zac78]), Bergman [Ber78], and Schreyer [Sch80]. It was termed the division algorithm by Schreyer, after the division theorem of Hironaka ([Hir64], [Bri73], [Gal74]).

Define S(fi,fj)S(f_{i},f_{j}) for i<ji<j by

S(fi,fj)=b𝐱Bfi-c𝐱Cfj,S(f_{i},f_{j})=b\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}f_{i}-c\hskip{0.722% 7pt}{\bf x}\hskip{0.7227pt}^{C}f_{j},

where 𝐱A=b𝐱Bin(fi)=c𝐱Cin(fj)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}=b\hskip{0.7227pt}{\bf x}\hskip{0.7% 227pt}^{B}{\rm in}(f_{i})=c\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}{\rm in}% (f_{j}) is the least common multiple of in(fi){\rm in}(f_{i}) and in(fj){\rm in}(f_{j}). b𝐱Bfib\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}f_{i} and c𝐱Cfjc\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}f_{j} each have 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} as lead term, so 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} cancels out in S(fi,fj)S(f_{i},f_{j}), and 𝐱A>in(S(fi,fj))\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>{\rm in}(S(f_{i},f_{j})).

If FF is a Gröbner basis for the ideal II, then RF(S(fi,fj))=0R_{F}(S(f_{i},f_{j}))=0 for each i<ji<j, since S(fi,fj)IS(f_{i},f_{j})\in I. Conversely,

Proposition 2.8 (Buchberger)

If RF(S(fi,fj))=0R_{F}(S(f_{i},f_{j}))=0 for each i<ji<j, then F=[f1,,fr]F=[f_{1},\ldots,f_{r}] is a Gröbner basis for the ideal I=(f1,,fr)I=(f_{1},\ldots,f_{r}).

See [Buc65], [Buc76]. We postpone a proof until the theory has been extended to SS-modules. This result can also be thought of as an explicit converse to the assertion that if FF is a Gröbner basis, then division is independent of the order of FF: Whenever we have a choice in division between subtracting off a multiple of fif_{i} and a multiple of fjf_{j}, the difference is a multiple of S(fi,fj)S(f_{i},f_{j}). If division is independent of the order of FF, then these differences must have remainder zero, so by Proposition 2.8, FF is a Gröbner basis.

As sketched in Section 1, Proposition 2.8 can be used to compute a Gröbner basis from a set of generators f1,,frf_{1},\ldots,f_{r} for the ideal II: For each i<ji<j so fr+1=RF(S(fi,fj))0f_{r+1}=R_{F}(S(f_{i},f_{j}))\neq 0, adjoin fr+1f_{r+1} to the list F=[f1,,fr]F=[f_{1},\ldots,f_{r}]. Note that fr+1If_{r+1}\in I. By iterating until no new polynomials are found, a Gröbner basis FF is obtained for II. This process terminates because SS is Noetherian, and each new basis element corresponds to a monomial not in the ideal generated by the preceding lead terms.

We now extend this theory to SS-modules. Let MM be a graded, finitely generated SS-module, given by the exact sequence of graded SS-modules

M1FM0M𝟎,M_{1}\stackrel{F}{\longrightarrow}M_{0}\longrightarrow M\longrightarrow{\bf 0},

where M0=Se01Se0qM_{0}=Se_{01}\oplus\ldots\oplus Se_{0q} and M1=Se11Se1rM_{1}=Se_{11}\oplus\ldots\oplus Se_{1r} are free SS-modules with deg(eij)=dij\deg(e_{ij})=d_{ij} for each ii, jj. We now think of FF both as a list [f1,,fr][f_{1},\ldots,f_{r}] of module elements, and as a map between free modules: Let fi=F(e1i)0f_{i}=F(e_{1i})\neq 0 for i=1,,ri=1,\ldots,r, and let IM0I\subset M_{0} be the homogeneous submodule generated by f1,,frf_{1},\ldots,f_{r}. Thus, M=M0/IM=M_{0}/I.

A monomial of M0M_{0} is an element of the form 𝐱Ae0i\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0i}; such an element has degree deg(𝐱A)+d0i\deg(\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A})+d_{0i}. An order on the monomials of M0M_{0} is multiplicative if whenever 𝐱Ae0i>𝐱Be0j\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0i}>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}e_{0j}, then 𝐱C𝐱Ae0i>𝐱C𝐱Be0j\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\hskip{0.7227pt}{\bf x}\hskip{0.722% 7pt}^{A}e_{0i}>\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\hskip{0.7227pt}{\bf x% }\hskip{0.7227pt}^{B}e_{0j} for all 𝐱CS\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C}\in S. For some applications, such as developing a theory of Gröbner bases over quotients of SS, one wants this order to be compatible with an order on SS: If 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B}, then one wants 𝐱Ae0i>𝐱Be0i\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0i}>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}e_{0i} for i=1,,ri=1,\ldots,r. The orders encountered in practice invariably satisfy this second condition, but it does not follow from the first, and we do not require it here.

One way to extend a multiplicative order on SS to a compatible multiplicative order on M0M_{0} is to declare 𝐱Ae0i>𝐱Be0j\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0i}>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}e_{0j} if i<ji<j, or if i=ji=j and 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B}. Another way is to assign monomials 𝐱C1,,𝐱Cq\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{C_{1}},\ldots,\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{C_{q}} in SS to the basis elements e01,,e0qe_{01},\ldots,e_{0q} of M0M_{0}, and to declare 𝐱Ae0i>𝐱Be0j\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0i}>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}e_{0j} if 𝐱A+Ci>𝐱B+Cj\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A+C_{i}}>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B+C_{j}}, or if A+Ci=B+CjA+C_{i}=B+C_{j} and i<ji<j.

Fix a choice of a multiplicative order >> on M0M_{0}. The constructions developed for SS carry over intact to M0M_{0}, with the same proofs ([Gal79], [Sch80], [Bay82]): Given an element fM0f\in M_{0}, define in(f){\rm in}(f) to be the lead term of ff. Define in(I){\rm in}(I) to be the submodule generated by the lead terms of all elements of IM0I\subset M_{0}; in(I){\rm in}(I) is a monomial submodule of M0M_{0} with the same Hilbert function as II. Define F=[f1,,fr]IF=[f_{1},\ldots,f_{r}]\subset I to be a Gröbner basis for II if in(f1),,in(fr){\rm in}(f_{1}),\ldots,{\rm in}(f_{r}) generate in(I){\rm in}(I); a set of generators for II need not be a Gröbner basis for II, but a Gröbner basis for II generates II. Given an element gM0g\in M_{0}, define RF(g)M0R_{F}(g)\in M_{0} exactly as was done for the free module SS. If FF is a Gröbner basis for II, then RF(g)=0R_{F}(g)=0 if and only if gIg\in I.

The quotient of gg under division by f1,,frf_{1},\ldots,f_{r} can be recursively defined by

QF(g)=c𝐱Ae1i+QF(g-c𝐱Afi)Q_{F}(g)=c\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{1i}+Q_{F}(g-c\hskip{0.% 7227pt}{\bf x}\hskip{0.7227pt}^{A}f_{i})

for the least ii so in(g){\rm in}(g) is a multiple c𝐱Ac\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} of in(fi){\rm in}(f_{i}), and by

QF(g)=QF(g-in(g))Q_{F}(g)=Q_{F}(g-{\rm in}(g))

if in(g){\rm in}(g) is not a multiple of any in(fi){\rm in}(f_{i}). The quotient is an element of M1M_{1}.

Following the recursive definitions of the remainder and quotient, it can be inductively verified that

g=F(QF(g))+RF(g).g=F(Q_{F}(g))+R_{F}(g).

If FF is a Gröbner basis for II, and gIg\in I, then RF(g)=0R_{F}(g)=0, so the quotient lifts gg to M1M_{1}. In this case, the quotient can be thought of as expressing gg in terms of f1,,frf_{1},\ldots,f_{r}.

Define S(fi,fj)S(f_{i},f_{j}) for i<ji<j by

S(fi,fj)=b𝐱Bfi-c𝐱Cfj,S(f_{i},f_{j})=b\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}f_{i}-c\hskip{0.722% 7pt}{\bf x}\hskip{0.7227pt}^{C}f_{j},

if in(fi){\rm in}(f_{i}) and in(fj){\rm in}(f_{j}) have a least common multiple 𝐱Ae0k=b𝐱Bin(fi)=c𝐱Cin(fj)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0k}=b\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}{\rm in}(f_{i})=c\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{% C}{\rm in}(f_{j}). Leave S(fi,fj)S(f_{i},f_{j}) undefined if in(fi){\rm in}(f_{i}) and in(fj){\rm in}(f_{j}) lie in different summands of M0M_{0}, and so don’t have common multiples.

Recall that the module of syzygies of f1,,frf_{1},\ldots,f_{r} is defined to be the kernel of the map FF, which is the submodule of M1M_{1} consisting of all hM1h\in M_{1} so F(h)=0F(h)=0. Thus, if h=h1e11++hre1rh=h_{1}e_{11}+\ldots+h_{r}e_{1r} is a syzygy, then h1f1++hrfr=0h_{1}f_{1}+\ldots+h_{r}f_{r}=0. Let JM1J\subset M_{1} denote the module of syzygies of f1,,frf_{1},\ldots,f_{r}, and let KM1K\subset M_{1} denote the module of syzygies of in(f1),,in(fr){\rm in}(f_{1}),\ldots,{\rm in}(f_{r}).

Define the map in(F):M1M0{\rm in}(F):M_{1}\rightarrow M_{0} by in(F)(e1i)=in(fi){\rm in}(F)(e_{1i})={\rm in}(f_{i}); KK is the kernel of in(F){\rm in}(F). For each i<ji<j so S(fi,fj)S(f_{i},f_{j}) is defined, define tijt_{ij} to be the element

tij=b𝐱Be1i-c𝐱Ce1jM1,t_{ij}=b\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}e_{1i}-c\hskip{0.7227pt}{% \bf x}\hskip{0.7227pt}^{C}e_{1j}\in M_{1},

where 𝐱Ae0k=b𝐱Bin(fi)=c𝐱Cin(fj)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0k}=b\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}{\rm in}(f_{i})=c\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{% C}{\rm in}(f_{j}) is the least common multiple of in(fi){\rm in}(f_{i}) and in(fj){\rm in}(f_{j}), as before. in(F)(tij)=0{\rm in}(F)(t_{ij})=0, so each tijt_{ij} belongs to the syzygy module KK. Observe that F(tij)=S(fi,fj)F(t_{ij})=S(f_{i},f_{j}).

Assign the following multiplicative order on M1M_{1}, starting from the order on M0M_{0} ([Sch80]; see also [MM86]): Let 𝐱Ae1i>𝐱Be1j\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{1i}>\hskip{0.7227pt}{\bf x}% \hskip{0.7227pt}^{B}e_{1j} if 𝐱Ain(fi)>𝐱Bin(fj)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}{\rm in}(f_{i})>\hskip{0.7227pt}{% \bf x}\hskip{0.7227pt}^{B}{\rm in}(f_{j}), or if these terms are kk-multiples of each other and i<ji<j. If the order on M0M_{0} is compatible with an order on SS, then this order on M1M_{1} is compatible with the same order on SS.

With respect to this order on M1M_{1}, we have

Lemma 2.9

The list [tij][\,t_{ij}\,] is a Gröbner basis for the module KK of syzygies of in(f1),,in(fr){\rm in}(f_{1}),\ldots,{\rm in}(f_{r}).

Proof. Let hM1h\in M_{1}, so in(F)(h)=0{\rm in}(F)(h)=0. Then in(F)(in(h)){\rm in}(F)({\rm in}(h)) is canceled by in(F)(h-in(h)){\rm in}(F)(h-{\rm in}(h)) in M0M_{0}. Therefore, if in(h)=𝐱Ae1i{\rm in}(h)=\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{1i}, then hh has another term 𝐱Be1j\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}e_{1j} so 𝐱Ain(fi)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}{\rm in}(f_{i}) and 𝐱Bin(fj)\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}{\rm in}(f_{j}) are kk-multiples of each other and i<ji<j. Thus, tijt_{ij} is defined and in(tij){\rm in}(t_{ij}) divides in(h){\rm in}(h), so [tij][\,t_{ij}\,] is a Gröbner basis for KK.      

Thus, the set {tij}\{t_{ij}\} generates KK. In general, the [tij][\,t_{ij}\,] are far from being a minimal Gröbner basis for KK; we consider the effects of trimming this list in Proposition 2.10 below.

Define

sij=tij-QF(S(fi,fj))s_{ij}=t_{ij}-Q_{F}(S(f_{i},f_{j}))

whenever RF(S(fi,fj))=0R_{F}(S(f_{i},f_{j}))=0. Note that in(sij)=in(tij){\rm in}(s_{ij})={\rm in}(t_{ij}). Each sijs_{ij} is the difference of two distinct elements of M1M_{1}, each of which is mapped by FF to S(fi,fj)S(f_{i},f_{j}), so F(sij)=0F(s_{ij})=0. In other words, sijs_{ij} belongs to the syzygy module JJ. Conversely,

Proposition 2.10 (Richman, Spear, Schreyer)

Choose a set of pairs T={(i,j)}T=\{(i,j)\} such that the set {tij}(i,j)T\{t_{ij}\}_{(i,j)\in T} generates the module KK of syzygies of in(f1),,in(fr){\rm in}(f_{1}),\ldots,{\rm in}(f_{r}). If RF(S(fi,fj))=0R_{F}(S(f_{i},f_{j}))=0 for each (i,j)T(i,j)\in T, then

(a) F=[f1,,fr]F=[f_{1},\ldots,f_{r}] is a Gröbner basis for II;

(b) the set {sij}(i,j)T\{s_{ij}\}_{(i,j)\in T} generates the module JJ of syzygies of f1,,frf_{1},\ldots,f_{r}.

Moreover,

(c) if [tij](i,j)T[\,t_{ij}\,]_{(i,j)\in T} is a Gröbner basis for KK, then [sij](i,j)T[\,s_{ij}\,]_{(i,j)\in T} is a Gröbner basis for JJ.

Proof. ([Ric74], [Spe77], [Zac78], [Sch80]) First, suppose that [tij](i,j)T[\,t_{ij}\,]_{(i,j)\in T} is a Gröbner basis for KK. Let hJh\in J, so F(h)=0F(h)=0. By the same reasoning as in the proof of Lemma 2.9, we can find (i,j)T(i,j)\in T so in(tij){\rm in}(t_{ij}) divides in(h){\rm in}(h). Since in(sij)=in(tij){\rm in}(s_{ij})={\rm in}(t_{ij}), in(sij){\rm in}(s_{ij}) also divides in(h){\rm in}(h), so [sij](i,j)T[\,s_{ij}\,]_{(i,j)\in T} is a Gröbner basis for JJ, proving (c).

Now, suppose that {tij}(i,j)T\{t_{ij}\}_{(i,j)\in T} merely generates KK. Let TT^{\prime} be a set of pairs so [tm](,m)T[\,t_{\ell m}\,]_{(\ell,m)\in T^{\prime}} is a Gröbner basis for KK. It is enough to construct a list [um](,m)T[\,u_{\ell m}\,]_{(\ell,m)\in T^{\prime}} of elements of JJ, generated by {sij}(i,j)T\{s_{ij}\}_{(i,j)\in T}, so in(um)=in(tm){\rm in}(u_{\ell m})={\rm in}(t_{\ell m}) for all (,m)T(\ell,m)\in T^{\prime}. Then by the preceding argument, [um](,m)T[\,u_{\ell m}\,]_{(\ell,m)\in T^{\prime}} is a Gröbner basis for JJ, so {sij}(i,j)T\{s_{ij}\}_{(i,j)\in T} generates JJ.

Write each tm=gmijtijt_{\ell m}=\sum g_{\ell mij}t_{ij}, for (,m)T(\ell,m)\in T^{\prime} and (i,j)T(i,j)\in T, in such a way that the terms of tmt_{\ell m} and each term of each product gmijtijg_{\ell mij}t_{ij} map via in(F){\rm in}(F) to multiples of the same monomial in M0M_{0}. In other words, find a minimal expression for each tmt_{\ell m}, which avoids unnecessary cancellation. Then define

um=gmijsij.u_{\ell m}=\sum g_{\ell mij}s_{ij}.

We have in(um)=in(tm){\rm in}(u_{\ell m})={\rm in}(t_{\ell m}), proving (b).

Let fIf\in I, and choose gM1g\in M_{1} so f=F(g)f=F(g). Let hM1h\in M_{1} be the remainder of gg under division by [um](,m)T[\,u_{\ell m}\,]_{(\ell,m)\in T^{\prime}}; f=F(h)f=F(h). Since in(h){\rm in}(h) is not a multiple of any in(um)=in(tm){\rm in}(u_{\ell m})={\rm in}(t_{\ell m}), the lead term of F(in(h))F({\rm in}(h)) is not canceled by any term of F(h-in(h))F(h-{\rm in}(h)). Therefore, if in(h)=a𝐱Ae1i{\rm in}(h)=a\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{1i}, then in(fi){\rm in}(f_{i}) divides in(F){\rm in}(F). Thus, F=[f1,,fr]F=[f_{1},\ldots,f_{r}] is a Gröbner basis for II, proving (a).      

Proposition 2.8 follows as a special case of this result.

The above proof can be understood in terms of an intermediate initial form in0(h){\rm in}_{0}(h) for hM1h\in M_{1}: Apply the map in(F){\rm in}(F) separately to each term of hh, and let 𝐱AM0\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}\in M_{0} be the greatest monomial that occurs in the set of image terms. Define in0(h){\rm in}_{0}(h) to be the sum of all terms of hh which map via in(F){\rm in}(F) to multiples of 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}. Then in{\rm in} refines in0{\rm in}_{0}, for according to the order we have defined on M1M_{1}, in(h){\rm in}(h) is the term of in0(h){\rm in}_{0}(h) lying in the summand of M1M_{1} whose basis element eie_{i} has the smallest index ii.

In this language, tij=in0(tij)=in0(sij)t_{ij}={\rm in}_{0}(t_{ij})={\rm in}_{0}(s_{ij}). Our expressions for the tmt_{\ell m} have the property that each gmijtij=in0(gmijtij)g_{\ell mij}t_{ij}={\rm in}_{0}(g_{\ell mij}t_{ij}), with each term of each product for a given tmt_{\ell m} mapping via in(F){\rm in}(F) to multiples of the same monomial 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}. Thus, each in0(gmijsij)=gmijtij{\rm in}_{0}(g_{\ell mij}s_{ij})=g_{\ell mij}t_{ij}; the tails gmij(sij-in0(sij))g_{\ell mij}(s_{ij}-{\rm in}_{0}(s_{ij})) stay out of our way, mapping termwise via in(F){\rm in}(F) to monomials which are less than 𝐱A\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A} with respect to the order on M0M_{0}.

Observe that QF(g)Q_{F}(g) is a linear combination of monomials not belonging to in(J){\rm in}(J), for any gM0g\in M_{0}.

In [Buc79], Buchberger gives a criterion for selecting a set TT of pairs (i,j)(i,j) in the case where II is an ideal: If (i0,i1),(i1,i2),,(is-1,is)T(i_{0},i_{1}),(i_{1},i_{2}),\ldots,(i_{s-1},i_{s})\in T, and the least common multiple of in(fi0),in(fi1),,in(fis){\rm in}(f_{i_{0}}),{\rm in}(f_{i_{1}}),\ldots,{\rm in}(f_{i_{s}}) is equal to the least common multiple of in(fi0){\rm in}(f_{i_{0}}) and in(fis){\rm in}(f_{i_{s}}), then (i0,is)(i_{0},i_{s}) need not belong to TT. In other words, if ti0is(ti0i1,,tis-1is)t_{i_{0}i_{s}}\in(t_{i_{0}i_{1}},\ldots,t_{i_{s-1}i_{s}}), then the pair (i0,is)(i_{0},i_{s}) is unnecessary; this condition is equivalent to the condition of Proposition 2.10, for the case of an ideal.

Suppose that we wish to compute the syzygies of a given set of elements g1,,gsg_{1},\ldots,g_{s} of M0M_{0}. To do this, compute a Gröbner basis f1,,frf_{1},\ldots,f_{r} for the submodule IM0I\subset M_{0} generated by g1,,gsg_{1},\ldots,g_{s}. Keep track of how to write each fif_{i} in terms of g1,,gsg_{1},\ldots,g_{s}. Using these expressions, each syzygy of f1,,frf_{1},\ldots,f_{r} can be mapped to a syzygy of g1,,gsg_{1},\ldots,g_{s}. These images generate the module of syzygies of g1,,gsg_{1},\ldots,g_{s}; the set of syzygies obtained in this way is not in general minimal.

Syzygies can be used to find a minimal set of generators for a submodule IM0I\subset M_{0} from a given set of generators g1,,gsg_{1},\ldots,g_{s}: If h1g1++hrgr=0h_{1}g_{1}+\ldots+h_{r}g_{r}=0 is a syzygy of g1,,gsg_{1},\ldots,g_{s} with h1kh_{1}\in k, then g1=(h2g2++hrgr)/h1g_{1}=(h_{2}g_{2}+\ldots+h_{r}g_{r})/h_{1}, so g1g_{1} is not needed to generate II. All unnecessary generators can be removed in this way.

Alternatively, a careful implementation of Gröbner bases can directly find minimal sets of generators for submodules: Starting from an arbitrary set of generators, we can eliminate unnecessary generators degree by degree, by removing those which reduce to zero under division by a Gröbner basis for the ideal generated by the preceding generators.

Either way, we can trim the set of syzygies computed via Gröbner bases for a given set of generators g1,,gsg_{1},\ldots,g_{s} of II, to obtain a minimal set of generators for the syzygy module JJ. By starting with a minimal generating set for II, and iterating this method, a minimal free resolution can be found for II.

A beautiful application of these ideas yields a proof of the Hilbert syzygy theorem, that minimal free resolutions terminate (Schreyer [Sch80], [Sch91], for an exposition see also Eisenbud [Eis92]). At each stage of a resolution, order the Gröbner basis FF for II in such a way that for each i<ji<j, letting in(fi)=a𝐱Ae0k{\rm in}(f_{i})=a\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}e_{0k} and in(fj)=b𝐱Be0{\rm in}(f_{j})=b\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{B}e_{0\ell}, we have 𝐱A>𝐱B\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{A}>\hskip{0.7227pt}{\bf x}\hskip{0.72% 27pt}^{B} in the lexicographic order. If the variables x1,,xmx_{1},\ldots,x_{m} are missing from the initial terms of the fif_{i}, then the variables x1,,xm+1x_{1},\ldots,x_{m+1} will be missing from the initial terms of the syzygies sijs_{ij}. Iterating, we run out of variables, so the resolution terminates.

3 Bounds

How hard are the algorithms in algebraic geometry? We describe some key bounds. The best known example is the bound established by G. Hermann [Her26] for ideal membership:

Theorem 3.1 (G. Hermann)

Let kk be any field, let (f1,,fk)k[x1,,xn](f_{1},...,f_{k})\subset k[x_{1},...,x_{n}] and let d=max(deg(fi))d=\max(\deg(f_{i})). If g(f1,,fk)g\in(f_{1},...,f_{k}), then there is an expression

g=i=1kaifig=\sum_{i=1}^{k}a_{i}f_{i}

where deg(ai)deg(g)+2(kd)2n-1\deg(a_{i})\leq\deg(g)+2(kd)^{2^{n-1}}.

This type of bound is called “doubly exponential”. However, with the advent of the concept of coherent sheaf cohomology [Ser55] and the systematic study of vanishing theorems, it has become apparent that the vanishing of these groups in high degrees is almost always the most fundamental bound. The concept of an ideal being “m-regular” or “regular in degrees m\geq m” was introduced by one of us [Mum66] by generalizing ideas of Castelnuovo:

Definition 3.2
11 The definition has been slightly modified so as to apply to ideals II instead of the corresponding sheaf of ideals {\cal I}.

Let kk be any field, let Ik[x0,,xn]I\subset k[x_{0},\ldots,x_{n}] be an ideal generated by homogeneous polyomials, let IdI_{d} be the homogeneous elements in II of degree dd, let II be the corresponding sheaf of ideals in 𝒪𝐏n{\cal O}_{\mbox{${{\bf P}^{n}}$}}, and let I(d)I(d) be the d𝑡ℎd^{\it th} twist of II. Then the following properties are equivalent and define the term “m-regular”:

(a) the natural map ImH0((m))I_{m}\rightarrow H^{0}({\cal I}(m)) is an isomorphism and Hi((m-i))H^{i}({\cal I}(m-i)) =(0)=(0), 1in1\leq i\leq n

(b) the natural maps IdH0((d))I_{d}\rightarrow H^{0}({\cal I}(d)) are isomorphisms for all dmd\geq m and Hi((d))H^{i}({\cal I}(d)) = (0)(0) if d+imd+i\geq m, i1i\geq 1.

(c) Take a minimal resolution of II by free graded k[X]k[X]-modules:

0α=1rn\;\stackrel{\textstyle\stackrel{r_{n}}{\oplus}}{{}_{\alpha=1}}\;k[𝐱]eα,nϕnϕ1α=1r0\;\stackrel{\textstyle\stackrel{r_{0}}{\oplus}}{{}_{\alpha=1}}\;k[𝐱]eα,0ϕ0k[𝐱]k[𝐱]/I0.0\,\rightarrow\,\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{r_{n}}{% \oplus}}{{}_{\alpha=1}}\;$}\,k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]\cdot e% _{\alpha,n}\,\stackrel{\phi_{n}}{\rightarrow}\,\ldots\stackrel{\phi_{1}}{% \rightarrow}\,\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{r_{0}}{\oplus% }}{{}_{\alpha=1}}\;$}\,k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]\cdot e_{% \alpha,0}\,\stackrel{\phi_{0}}{\rightarrow}\,k[\hskip{0.7227pt}{\bf x}\hskip{0% .7227pt}]\longrightarrow k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]/I% \rightarrow 0.

Then deg(eα,i)m+i\deg(e_{\alpha,i})\leq m+i for all α\alpha, ii. (In particular, if fα=ϕ0(eα,0)f_{\alpha}=\phi_{0}(e_{\alpha,0}), then f1,,fr0f_{1},\ldots,f_{r_{0}} are minimal generators of II, and deg(eα,0)=deg(fα)m\deg(e_{\alpha,0})=\deg(f_{\alpha})\leq m.)

The intuitive idea is that past degree mm, nothing tricky happens in the ideal II. Unfortunately, neither (a), (b) nor (c) can be verified by any obvious finite algorithm. This lack of a finitely verifiable criterion for mm-regularity has been remedied by a joint result of the first author and M. Stillman [BS87a]:

Theorem 3.3 (Bayer-Stillman)

II is mm-regular if and only if the degrees of the minimal set of generators of II are at most mm, and there exists a set y0,,yy_{0},\ldots,y_{\ell} of linear combinations of x0,,xnx_{0},...,x_{n} such that for all homogeneous ff of degree mm,

y0fI\displaystyle y_{0}f\in I \displaystyle\Rightarrow fI\displaystyle f\in I
y1fI\displaystyle y_{1}f\in I \displaystyle\Rightarrow fI+k[𝐱]y0\displaystyle f\in I+k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]\cdot y_{0}
\displaystyle\cdots
yfI\displaystyle y_{\ell}f\in I \displaystyle\Rightarrow fI+i=0-1k[𝐱]yi\displaystyle f\in I+\sum_{i=0}^{\ell-1}k[\hskip{0.7227pt}{\bf x}\hskip{0.7227% pt}]\cdot y_{i}

and

fI+i=0k[𝐱]yi.f\in I+\sum_{i=0}^{\ell}k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]\cdot y_{i}.

Moreover, if this holds at all, it holds for y0,,yy_{0},\ldots,y_{\ell} taken arbitrarily from a Zariski-open set in the space of +1\ell+1 linear forms.

To see why mm-regularity is a key bound, we want to show that it controls some of the geometric features of the ideal II. Let’s introduce several refined notions of the “degree” of II:

Definition 3.4

If I=𝐪0𝐪1𝐪tI=\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{0}\,\cap\,\hskip{0.7227pt}{\bf q}% \hskip{0.7227pt}_{1}\,\cap\,\ldots\,\cap\,\hskip{0.7227pt}{\bf q}\hskip{0.7227% pt}_{t} is a primary decomposition of II, 𝐪i=𝐩i\sqrt{\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}}=\hskip{0.7227pt}{\bf p}% \hskip{0.7227pt}_{i} is prime and V(𝐩i)V(\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{i}) is the subvariety ZiZ_{i} of 𝐏n{{\bf P}^{n}} for i1i\geq 1, while 𝐩0=(x0,,xn)\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{0}=(x_{0},\ldots,x_{n}) (so that V(𝐩0)=V(\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{0})=\emptyset), then first let 𝐪1,,𝐪s\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{1},\ldots,\hskip{0.7227pt}{\bf q}% \hskip{0.7227pt}_{s} be the isolated components, (i.e., ZiZjZ_{i}\not\subset Z_{j} if 1is1\leq i\leq s, 1jt1\leq j\leq t, iji\neq j, or equivalently, V(I)=Z1ZsV(I)=Z_{1}\,\cup\,\ldots\,\cup\,Z_{s} is set-theoretically the minimal decomposition of V(I)V(I) into varieties). Then let

mult(𝐪i)\displaystyle{\rm mult}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}) =\displaystyle= length \ell of a maximal chain of 𝐩i\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{i}-primary ideals:
𝐪i=J\,\stackrel{\subset}{\scriptstyle\neq}\,J-1\,\stackrel{\subset}{\scriptstyle\neq}\,\,\stackrel{\subset}{\scriptstyle\neq}\,J1=𝐩i\displaystyle{\bf q}\hskip{0.7227pt}_{i}=J_{\ell}\raisebox{-2.5pt}{$\,% \stackrel{\subset}{\scriptstyle\neq}\,$}J_{\ell-1}\raisebox{-2.5pt}{$\,% \stackrel{\subset}{\scriptstyle\neq}\,$}\ldots\raisebox{-2.5pt}{$\,\stackrel{% \subset}{\scriptstyle\neq}\,$}J_{1}=\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{i}

(Equivalently, this is the length of the local ring k[𝐱]𝐩i/Ik[𝐱]𝐩ik[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.72% 27pt}_{i}}/Ik[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p% }\hskip{0.7227pt}_{i}}, or, in the language of schemes, if η\eta is the generic point of ZiZ_{i}, then this is the length of 𝒪η,𝐏n{{\bf P}^{n}}.{\cal O}_{\eta,\mbox{${{\bf P}^{n}}$}}.)

deg(Zi)\displaystyle\deg(Z_{i}) =\displaystyle= usual geometric degree of ZiZ_{i}:
the cardinality of ZiLZ_{i}\cap L for almost all
linear spaces LL of complementary dimension.
geom-degr(I)\displaystyle\mbox{\rm geom-deg}_{r}(I) =\displaystyle= 1isii such that dimZi=r\dim Z_{i}=rmult(𝐪i)deg(Zi)\displaystyle\sum_{\stackrel{\mbox{\it\scriptsize$i$ such that $\dim Z_{i}=r$}% }{\mbox{\scriptsize$1\leq i\leq s$}}}{\rm mult}(\hskip{0.7227pt}{\bf q}\hskip{% 0.7227pt}_{i})\deg(Z_{i})

If 𝐪i\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i} is one of the non-isolated, or embedded components, then we extend the concept of multiplicity more carefully: Let

Ii={𝐪jj such that 𝐩j\,\stackrel{\subset}{\scriptstyle\neq}\,𝐩i or equivalently Zj\,\stackrel{\supset}{\scriptstyle\neq}\,Zi}𝐩iI_{i}=\left\{\cap\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{j}\mbox{\Large$\mid$% }j\mbox{ such that }\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{j}\raisebox{-2.5% pt}{$\,\stackrel{\subset}{\scriptstyle\neq}\,$}\hskip{0.7227pt}{\bf p}\hskip{0% .7227pt}_{i}\mbox{ or equivalently }Z_{j}\raisebox{-2.5pt}{$\,\stackrel{% \supset}{\scriptstyle\neq}\,$}Z_{i}\right\}\cap\hskip{0.7227pt}{\bf p}\hskip{0% .7227pt}_{i}

and

multI(𝐪i)\displaystyle{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}) =\displaystyle= length \ell of a maximal chain of ideals:
𝐪iIi=J\,\stackrel{\subset}{\scriptstyle\neq}\,J-1\,\stackrel{\subset}{\scriptstyle\neq}\,\,\stackrel{\subset}{\scriptstyle\neq}\,J0=Ii\displaystyle{\bf q}\hskip{0.7227pt}_{i}\cap I_{i}=J_{\ell}\raisebox{-2.5pt}{$% \,\stackrel{\subset}{\scriptstyle\neq}\,$}J_{\ell-1}\raisebox{-2.5pt}{$\,% \stackrel{\subset}{\scriptstyle\neq}\,$}\ldots\raisebox{-2.5pt}{$\,\stackrel{% \subset}{\scriptstyle\neq}\,$}J_{0}=I_{i}
where each JkJ_{k} satisfies: abJk,a𝐩ibJk.\displaystyle\mbox{where each $J_{k}$ satisfies: }ab\in J_{k},a\not\in\hskip{0% .7227pt}{\bf p}\hskip{0.7227pt}_{i}\Rightarrow b\in J_{k}.

(Equivalently, JkJ_{k} equals 𝐪kIi\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{k}\cap I_{i} for some 𝐩i\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{i}-primary ideal 𝐪k\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{k}.) In particular:

I0\displaystyle I_{0} =\displaystyle= j=1t\;\stackrel{\textstyle\stackrel{t}{\bigcap}}{{}_{j=1}}\;𝐪j is known as Isat, and\displaystyle\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{t}{\bigcap}}{{% }_{j=1}}\;$}\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{j}\mbox{ is known as }I^{% \rm sat}\mbox{, and}
multI(𝐪0)\displaystyle{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{0}) =\displaystyle= length \ell of a maximal chain of ideals
I=J\,\stackrel{\subset}{\scriptstyle\neq}\,J-1\,\stackrel{\subset}{\scriptstyle\neq}\,\,\stackrel{\subset}{\scriptstyle\neq}\,J0=Isat\displaystyle I=J_{\ell}\raisebox{-2.5pt}{$\,\stackrel{\subset}{\scriptstyle% \neq}\,$}J_{\ell-1}\raisebox{-2.5pt}{$\,\stackrel{\subset}{\scriptstyle\neq}\,% $}\ldots\raisebox{-2.5pt}{$\,\stackrel{\subset}{\scriptstyle\neq}\,$}J_{0}=I^{% \rm sat}
=\displaystyle= dimk(Isat/I).\displaystyle\dim_{k}(I^{\rm sat}/I).

For s+1its+1\leq i\leq t, an equivalent way to define multI(𝐪i){\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}) is as the length of the module

Iik[𝐱]𝐩i/Ik[𝐱]𝐩iI_{i}k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip% {0.7227pt}_{i}}/Ik[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{% \bf p}\hskip{0.7227pt}_{i}}

or, in the language of schemes, the length of

Ii𝒪η,𝐏n/I𝒪η,𝐏nI_{i}{{\cal O}}_{\eta,\mbox{${{\bf P}^{n}}$}}/I{{\cal O}}_{\eta,\mbox{${{\bf P% }^{n}}$}}

where η\eta is the generic point of ZiZ_{i}.

Then write

arith-degr(I)=1isii such that dimZi=r\dim Z_{i}=rmultI(𝐪i)deg(Zi)\mbox{\rm arith-deg}_{r}(I)=\sum_{\stackrel{\mbox{\it\scriptsize$i$ such that % $\dim Z_{i}=r$}}{\mbox{\scriptsize$1\leq i\leq s$}}}{\rm mult}_{I}(\hskip{0.72% 27pt}{\bf q}\hskip{0.7227pt}_{i})\deg(Z_{i})

and

arith-deg-1(I)=multI(𝐪0).\mbox{\rm arith-deg}_{-1}(I)={\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.72% 27pt}_{0}).

The idea here is best illustrated by an example: let

I=(x12,x1x2)k[x0,x1,x2].I=(x_{1}^{2},x_{1}x_{2})\subset k[x_{0},x_{1},x_{2}].

Then

I\displaystyle I =\displaystyle= 𝐪1𝐪2\displaystyle{\bf q}\hskip{0.7227pt}_{1}\cap\hskip{0.7227pt}{\bf q}\hskip{0.72% 27pt}_{2}
q1\displaystyle q_{1} =\displaystyle= (x1),𝐩1=(x1),Z1={line x1=0}\displaystyle(x_{1}),\;\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{1}=(x_{1}),\;Z% _{1}=\{\mbox{line }x_{1}=0\}
q2\displaystyle q_{2} =\displaystyle= (x12,x1x2,x22), 𝐩2=(x1,x2), Z2={point (1,0,0)}.\displaystyle(x_{1}^{2},x_{1}x_{2},x_{2}^{2}),\;\hskip{0.7227pt}{\bf p}\hskip{% 0.7227pt}_{2}=(x_{1},x_{2}),\;Z_{2}=\{\mbox{point }(1,0,0)\}.

Then

deg(Z1)=1, mult(q1)=1\deg(Z_{1})=1,\;{\rm mult}(q_{1})=1

so

geom-deg1(I)=arith-deg1(I)=1.\mbox{\rm geom-deg}_{1}(I)=\mbox{\rm arith-deg}_{1}(I)=1.

One might be tempted to simply define

multI(𝐪2)= length of chain of 𝐩2\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}-primary ideals between 𝐪2,𝐩2{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{2})=\mbox{ length of chain of $\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}$-primary ideals % between }\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{2},\hskip{0.7227pt}{\bf p}% \hskip{0.7227pt}_{2}

and since

k[𝐱]𝐪2/𝐪2k[𝐱]𝐩2K1+Kx1+Kx2,K=k(x0)k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf q}\hskip{0.72% 27pt}_{2}}/\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{2}k[\hskip{0.7227pt}{\bf x% }\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}}\cong K\cdot 1% +K\cdot x_{1}+K\cdot x_{2},K=k(x_{0})

this is 33. But embedded components are not unique! In fact,

I\displaystyle I =\displaystyle= 𝐪1𝐪2\displaystyle{\bf q}\hskip{0.7227pt}_{1}\cap\hskip{0.7227pt}{\bf q}\hskip{0.72% 27pt}_{2}^{\prime}
𝐪2\displaystyle{\bf q}\hskip{0.7227pt}_{2}^{\prime} =\displaystyle= (x12x2) also,\displaystyle(x_{1}^{2}x_{2})\mbox{ also},

which leads to

k[𝐱]𝐩2/𝐪2k[𝐱]𝐩2K1+Kx2k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.72% 27pt}_{2}}/\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{2}^{\prime}k[\hskip{0.7227% pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}}\cong K% \cdot 1+K\cdot x_{2}

which has length 22. The canonical object is not the local ring k[𝐱]𝐩2/𝐪2k[𝐱]𝐩2k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.72% 27pt}_{2}}/\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{2}k[\hskip{0.7227pt}{\bf x% }\hskip{0.7227pt}]{\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}} but the ideal

Ker(k[𝐱]𝐩2/Ik[𝐱]𝐩2k[𝐱]𝐩2/𝐩2k[𝐱]𝐩2)kx1\mbox{\rm Ker}\left(k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227% pt}{\bf p}\hskip{0.7227pt}_{2}}/Ik[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{% \hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}}\rightarrow k[\hskip{0.7227pt}{\bf x% }\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}}/\hskip{0.7227% pt}{\bf p}\hskip{0.7227pt}_{2}k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{% \hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{2}}\right)\cong k\cdot x_{1}

which has length 11. Thus, the correct numbers are

multI(𝐪2)=1{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{2})=1

and

geom-deg0(I)\displaystyle\mbox{\rm geom-deg}_{0}(I) =\displaystyle= 0\displaystyle 0
arith-deg0(I)\displaystyle\mbox{\rm arith-deg}_{0}(I) =\displaystyle= 1.\displaystyle 1.

Now the question arises: find bounds on these degrees in terms of generators of II. For geometric degrees, a straightforward extension of Bezout’s theorem gives:

Proposition 3.5

Let d(I)d(I) be the maximum of the degrees of a minimal set of generators of II. Then

geom-degr(I)d(I)n-r.\mbox{\rm geom-deg}_{r}(I)\leq d(I)^{n-r}.

A proof can be found in [MW83]. The idea is clear from a simple case: Suppose f,g,hK[x,y,z]f,g,h\in K[x,y,z] and f=g=h=0f=g=h=0 consists of a curve CC and \ell points PiP_{i} off CC. We can bound \ell like this: Choose 22 generic combinations ff^{\prime}, gg^{\prime} of ff, gg, hh so that f=g=0f^{\prime}=g^{\prime}=0 does not contain a surface. It must be of the form CCC\cup C^{\prime}, CC^{\prime} one-dimensional, containing all the PiP_{i} but not the generic point of CC. Then by the usual Bezout theorem

degCdegfdegg=d(I)2.\deg C^{\prime}\leq\deg f^{\prime}\;\deg g^{\prime}=d(I)^{2}.

Let hh^{\prime} be a 33rd generic combination of f,g,hf,g,h. Then C{h=0}C^{\prime}\cap\{h^{\prime}=0\} consists of a finite set of points including the PiP_{i}’s. Thus

\displaystyle\ell =\displaystyle= #Pi#(C{h=0})\displaystyle\#P_{i}\leq\#(C^{\prime}\cap\{h^{\prime}=0\})
\displaystyle\leq degCd(I) by Bezout’s theorem\displaystyle\deg C^{\prime}\cdot d(I)\mbox{ by Bezout's theorem}
\displaystyle\leq d(I)3.\displaystyle d(I)^{3}.

Can arith-deg(I)\mbox{\rm arith-deg}(I) be bounded in the same way? In fact, it cannot, as we will show below. Instead, we have

Proposition 3.6

If m(I)m(I) is the regularity of II, then for -1rn-1\leq r\leq n,

arith-degr(I)(m(I)+n-r-1n-r)m(I)n-r\mbox{\rm arith-deg}_{r}(I)\leq{m(I)+n-r-1\choose n-r}\leq m(I)^{n-r}

which replaces d(I)d(I) by the regularity of II. A proof is given in the technical appendix.

We have introduced two measures of the complexity of a homogeneous ideal II. The first is d(I)d(I), the maximum degree of a polynomial in a minimum set of generators of II. The second is m(I)m(I), which bounds the degrees of generators and of all higher order syzygies in the resolution of II (Definition 3.2 (c)). Obviously,

d(I)m(I).d(I)\leq m(I).

A very important question is how much bigger can m(I)m(I) be than d(I)d(I)? The nature of the answer was conjectured by one of us in his thesis [Bay82] and this conjecture is being borne out by subsequent investigations. This conjecture is that in the worst case m(I)m(I) is roughly the (2n)(2^{n})th power of d(I)d(I) – a bound like G. Hermann’s. But that if I=I(Z)I=I(Z) where ZZ is geometrically nice, e.g. is a smooth irreducible variety, then m(I)m(I) is much smaller, like the nnth power of d(I)d(I) or better. This conjecture then has three aspects:

 (1) a doubly exponential bound for m(I)m(I) in terms of d(I)d(I),
which is always valid,
 (2) examples of II where the bound in (1) is best possible, or nearly so,
 (3) much better bounds for m(I)m(I)
valid if V(I)V(I) satisfies various conditions.

All three aspects are partially proven, but none are completely clarified yet. We will take them up one at a time.

A doubly exponential bound for m(I)m(I) in terms of d(I)d(I) may be deduced easily in characteristic zero from the work of M. Giusti [Giu84] and A. Galligo [Gal79]:

Theorem 3.7

If char(k)=0\mbox{\rm char}(k)=0 and Ik[x0,xn]I\subset k[x_{0},\ldots x_{n}] is any homogeneous ideal, then

m(I)(2d(I))2n-1.m(I)\leq(2d(I))^{2^{n-1}}.

It seems likely that Theorem 3.7 holds in characteristic pp, too. A weaker result can be derived quickly in any characteristic by straightforward cohomological methods:

Proposition 3.8

If Ik[x0,xn]I\subset k[x_{0},\ldots x_{n}] is any homogeneous ideal, then

m(I)(2d(I))n!.m(I)\leq(2d(I))^{n!}.

The proof is given in the technical appendix.

Next, we ask whether Theorem 3.7 is the best possible, or nearly so. The answer is yes, because of a very remarkable example due to E. Mayr and A. Meyer [MM82].

Example 3.9

Let InAI_{n}^{A} be the ideal in 10n10n variables S(m),F(m),Ci(m),Bi(m),1i4,1mnS^{(m)},F^{(m)},C_{i}^{(m)},B_{i}^{(m)},1\leq i\leq 4,1\leq m\leq n defined by the 10n-610n-6 generators

2mn{S(m)-S(m-1)C1(m-1)F(m)-S(m-1)C4(m-1)Ci(m)F(m-1)B2(m-1)-Ci(m)Bi(m)F(m-1)B3(m-1),  1i41mn-1{F(m)C1(m)B1(m)-S(m)C2(m)F(m)C2(m)-F(m)C3(m)S(m)C3(m)B1(m)-S(m)C2(m)B4(m)S(m)C3(m)-F(m)C4(m)B4(m)  Ci(1)S(1)-Ci(1)F(1)(Bi(1))2,  1i4\begin{array}[]{rl}2\leq m\leq n&\left\{\;\;\begin{array}[]{l}S^{(m)}-S^{(m-1)% }C_{1}^{(m-1)}\\ F^{(m)}-S^{(m-1)}C_{4}^{(m-1)}\\ C_{i}^{(m)}F^{(m-1)}B_{2}^{(m-1)}-C_{i}^{(m)}B_{i}^{(m)}F^{(m-1)}B_{3}^{(m-1)}% ,\;\;1\leq i\leq 4\end{array}\right.\\ \\ 1\leq m\leq n-1&\left\{\;\;\begin{array}[]{l}F^{(m)}C_{1}^{(m)}B_{1}^{(m)}-S^{% (m)}C_{2}^{(m)}\\ F^{(m)}C_{2}^{(m)}-F^{(m)}C_{3}^{(m)}\\ S^{(m)}C_{3}^{(m)}B_{1}^{(m)}-S^{(m)}C_{2}^{(m)}B_{4}^{(m)}\\ S^{(m)}C_{3}^{(m)}-F^{(m)}C_{4}^{(m)}B_{4}^{(m)}\end{array}\right.\\ \\ &\mbox{  }C_{i}^{(1)}S^{(1)}-C_{i}^{(1)}F^{(1)}(B_{i}^{(1)})^{2},\;\;1\leq i% \leq 4\end{array}

Let InHI_{n}^{H} be the ideal gotten from InAI_{n}^{A} by homogenizing with an extra variable uu. Then Mayr and Meyer [MM82, lemma 8, p. 318] prove:

Lemma 3.10

Let en=22ne_{n}=2^{2^{n}}. If MM is any monomial in these variables, S(n)Ci(n)-F(n)MInAS^{(n)}C_{i}^{(n)}-F^{(n)}M\in I_{n}^{A} if and only if

M=Ci(n)(Bi(n))en,M=C_{i}^{(n)}(B_{i}^{(n)})^{e_{n}},

and S(n)Ci(n)-S(n)MInAS^{(n)}C_{i}^{(n)}-S^{(n)}M\in I_{n}^{A} if and only if

M=Ci(n).M=C_{i}^{(n)}.

Now note that the generators of InAI_{n}^{A} and InHI_{n}^{H} are all of the very simple type given by a difference of two monomials. Quite generally, if

J\displaystyle J \displaystyle\subset k[x1,,xn]\displaystyle k[x_{1},\ldots,x_{n}]
J\displaystyle J =\displaystyle= (,𝐱αi-𝐱βi,)1ik\displaystyle(\ldots,\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{\alpha_{i}}-% \hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{\beta_{i}},\ldots)_{1\leq i\leq k}

then the quotient ring k[𝐱]/Jk[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]/J has a very simple form. In fact, we get an equivalence relation between monomials generated by

𝐱αi+γ𝐱βi+γ, any i,γ\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}^{\alpha_{i}+\gamma}\sim\hskip{0.7227pt% }{\bf x}\hskip{0.7227pt}^{\beta_{i}+\gamma},\mbox{ any }i,\gamma

and

k[𝐱]/Jδk𝐱δk[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]/J\cong\oplus_{\delta}\;k\cdot\hskip% {0.7227pt}{\bf x}\hskip{0.7227pt}^{\delta}

where δ\delta runs over a set of representatives of each equivalence class.

Bearing this in mind, let’s look at the 11st order syzygies for the homogeneous ideal:

JnH=(S(n),F(n),InH).J_{n}^{H}=(S^{(n)},F^{(n)},I_{n}^{H}).

S(n)S^{(n)} and F(n)F^{(n)} are part of a minimal set of generators, and let fαInHf_{\alpha}\in I_{n}^{H} complete them. Then syzygies are equations:

pS(n)+qF(n)+rαfα= 0.p\,S^{(n)}\,+\,q\,F^{(n)}\,+\,\sum\,r_{\alpha}\,f_{\alpha}\,=\,0.

One such is given by:

[uen+eCi(n)]S(n)+[-ue(Bi(n))enCi(n)]F(n)+Rαfα= 0\left[u^{e_{n}+e}\,C_{i}^{(n)}\right]\,S^{(n)}\,+\,\left[-u^{e}\,(B_{i}^{(n)})% ^{e_{n}}\,C_{i}^{(n)}\right]\,F^{(n)}\,+\,\sum\,R_{\alpha}\,f_{\alpha}\,=\,0

for some RαR_{\alpha}, and some e0e\geq 0 (the extra power ueu^{e} is necessary because some terms RαfαR_{\alpha}f_{\alpha} have degree greater than en+2e_{n}+2) whose degree is 2+en+e2+e_{n}+e. Now express this syzygy as a combination of a minimal set of syzygies. This gives us in particular:

uen+eCi(n)\displaystyle u^{e_{n}+e}\,C_{i}^{(n)} =\displaystyle= aλpλ\displaystyle\,\sum\,a_{\lambda}\,p_{\lambda}
-ue(Bi(n))enCi(n)\displaystyle-u^{e}\,(B_{i}^{(n)})^{e_{n}}\,C_{i}^{(n)} =\displaystyle= aλqλ\displaystyle\,\sum\,a_{\lambda}\,q_{\lambda}
pλS(n)+qλF(n)+Rαλfα\displaystyle p_{\lambda}\,S^{(n)}\,+\,q_{\lambda}\,F^{(n)}\,+\,\sum\,R_{% \alpha\lambda}\,f_{\alpha} =\displaystyle= 0 .\displaystyle\,0\,.

Then for some λ\lambda, pλp_{\lambda} must have a term of the form uu^{\ell} or uCi(n)u^{\ell}\,C_{i}^{(n)}, hence the monomial uS(n)u_{\ell}\,S^{(n)} or uCi(n)S(n)u_{\ell}\,C_{i}^{(n)}\,S^{(n)} occurs in pλS(n)p_{\lambda}\,S^{(n)}. But by the general remark on quotient rings by such simple ideals, this means that this term must equal some second term MS(n)M\,S^{(n)} (MM a monomial in pλp_{\lambda}) or MF(n)M\,F^{(n)} (MM a monomial in qλq_{\lambda}) mod InHI_{n}^{H}. By the lemma, the first doesn’t happen and the second only happens if the term uCi(n)(Bi(n))enu^{\ell}\,C_{i}^{(n)}\,(B_{i}^{(n)})^{e_{n}} occurs in qλq_{\lambda}, in which case en+1degqλ=deg(syzygy(pλ,qλ,Rαλ))-1e_{n}+1\leq\deg\,q_{\lambda}\,=\,\deg(\mbox{syzygy}(p_{\lambda},q_{\lambda},R_% {\alpha\lambda}))-1. This proves:

Proposition 3.11

JnHJ_{n}^{H} has for its bounds:

d(J)\displaystyle d(J) =\displaystyle= 4\displaystyle 4
m(J)\displaystyle m(J) \displaystyle\geq 22n+1.\displaystyle 2^{2^{n}}+1.

Going on to the 33rd aspect of the conjecture, consider results giving better bounds for m(I)m(I) under restrictive hypotheses on V(I)V(I).

Theorem 3.12

If Z𝐏nZ\subset\mbox{${{\bf P}^{n}}$} is a reduced subscheme purely of dimension rr, and I=I(Z)I=I(Z) is the full ideal of functions vanishing on ZZ, then

(a) if r1r\leq 1, or ZZ is smooth, 𝑐ℎ𝑎𝑟(k)=0\hbox{char}(k)=0 and r3r\leq 3, then:

m(I)degZ-n+r+1m(I)\leq\deg Z-n+r+1

(b) if char(k)=0(k)=0 and ZZ is smooth,

m(I)(r+1)(deg(Z)-2)+2.m(I)\leq(r+1)(\deg(Z)-2)+2.

Since deg(Z)d(I)n-r\deg(Z)\leq d(I)^{n-r} (Proposition 3.5), these bound m(I)m(I) in terms of d(I).d(I).

Part (a) of this are due to Gruson-Lazarsfeld-Peskine [GLP83] for r1r\leq 1, and to Pinkham [Pin86], Lazarsfeld [Laz87], and Ran [Ran90] for r3r\leq 3. It is conjectured by Eisenbud and Goto [EG84], and others, that the bound in (a) holds for all reduced irreducible ZZ, and it might well hold even for reduced equidimensional ZZ which are connected in codimension 1. As this problem is now understood, the needed cohomological arguments follow formally, once one can control the singularities of a projection of the variety. These singularities become progressively harder to subdue as the dimension of the variety increases, and are what impedes definitive progress beyond dimension 3.

Part (b) is due to the second author and is proven in the technical appendix. It has been generalized by Bertram, Ein, and Lazarsfeld [BEL91] to show that any smooth characteristic 00 variety of codimension ee defined as a subscheme of 𝐏n{\bf P}^{n} by hypersurfaces of degrees d1dmd_{1}\geq\ldots\geq d_{m} is (d1+de-e+1)(d_{1}+\ldots d_{e}-e+1)-regular. Since we cannot decide the previous conjecture, this is a result of considerable practical importance, for it strongly bounds the complexity of computing Gröbner bases of smooth characterisitic 00 varieties in terms of the degrees of the input equations.

The biggest missing link in this story is a decent bound on m(I)m(I) for any reduced equidimensional ideal II. We would conjecture that if a linear bound as in part (a) doesn’t hold, at the least a so-called “single exponential” bound, i.e. m(I)d0(n)m(I)\leq d^{0(n)} ought to hold. This is an essential ingredient in analyzing the worst-case behavior of all algorithms based on Gröbner bases, and would complete the story about what causes the bad examples discussed above. At least in some cases Ravi [Rav90] has proven that the regularity of the radical of a scheme is no greater than the regularity of the scheme itself.

There is a direct link between the bounds that we have given so far and the G. Hermann bound with which we started the section. This results from the following:

Proposition 3.13

Let IAk[x1,,xn]I^{A}\subset k[x_{1},\ldots,x_{n}] have generators f1,,fkf_{1},\ldots,f_{k} and let IHk[x0,x1,,xn]I^{H}\subset k[x_{0},x_{1},\ldots,x_{n}] be the ideal generated by homogenizations f1h,,fkhf_{1}^{h},\ldots,f_{k}^{h} of the fif_{i}. Let IH=𝐪0𝐪tI^{H}=\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{0}\cap\ldots\cap\hskip{0.7227pt% }{\bf q}\hskip{0.7227pt}_{t} be the primary decomposition of IHI^{H}, let Zi=V(𝐪i)Z_{i}=V(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}) and let

mult(IH)=max[multI(𝐪i1)++multI(𝐪ik)+multI(𝐪0)]{\rm mult}_{\infty}(I^{H})=\max\left[{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}% \hskip{0.7227pt}_{i_{1}})+\ldots+{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{% 0.7227pt}_{i_{k}})+{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{0})\right]

where the max\max is taken over chains V((x0))Zi1\,\stackrel{\supset}{\scriptstyle\neq}\,\,\stackrel{\supset}{\scriptstyle\neq}\,ZikV((x_{0}))\supset Z_{i_{1}}\raisebox{-2.5pt}{$\,\stackrel{\supset}{% \scriptstyle\neq}\,$}\ldots\raisebox{-2.5pt}{$\,\stackrel{\supset}{% \scriptstyle\neq}\,$}Z_{i_{k}}. If gIAg\in I^{A}, then we can write:

g=i=1kaifig=\sum_{i=1}^{k}\,a_{i}\,f_{i}

where

degaidegg+mult(IH).\deg a_{i}\leq\deg g+{\rm mult}_{\infty}(I^{H}).

The proof goes like this: Let ghg^{h} be the homogenization of gg. Consider the least integer mm such that x0mghIHx_{0}^{m}g^{h}\in I^{H}. Since gIg\in I, this mm is finite. Moreover, if

x0mgh=x0miaihfihx_{0}^{m}g^{h}=\sum x_{0}^{m_{i}}a_{i}^{h}f_{i}^{h}

then

g=aifig=\sum a_{i}f_{i}

and

degai=deg(ah)deg(x0mgh)-degfjm+deg(g).\deg a_{i}=\deg(a^{h})\leq\deg(x_{0}^{m}g^{h})-\deg f_{j}\leq m+\deg(g).

Now in the primary decomposition of IHI^{H}, suppose that for some kk,

x0kghiS\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{i\in S}}\;qi, and x0kgh𝐪j if jS.x_{0}^{k}g^{h}\in\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{}{\bigcap}% }{{}_{i\in S}}\;$}q_{i},\mbox{ and }x_{0}^{k}g^{h}\not\in\hskip{0.7227pt}{\bf q% }\hskip{0.7227pt}_{j}\mbox{ if }j\not\in S.

Choose S\ell\not\in S such that V(q)V(q_{\ell}) is maximal. Since gIAg\in I^{A}, we know V(𝐪)V((x0))V(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{\ell})\subset V((x_{0})), hence x0𝐩x_{0}\in\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{\ell}. Let

IS=iS\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{i\in S}}\;𝐪i.I_{S}=\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{i\in S% }}\;$}\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{i}.

Then multI(𝐪){\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{\ell}) is easily seen to be the length of a maximal chain of ideals between:

Ik[𝐱]p and ISk[𝐱]p.I\cdot k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{p_{\ell}}\mbox{ and }I_{S}\cdot k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{p_{\ell}}.

But look at the ideals JpJ_{p}, for p0p\geq 0, defined by

Ik[𝐱]𝐩(I,x0k+pgh)k[𝐱]𝐩JpISk[𝐱]𝐩.I\,k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0% .7227pt}_{\ell}}\subset\underbrace{(I,x_{0}^{k+p}g^{h})\,k[\hskip{0.7227pt}{% \bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{\ell}}}_{% \textstyle J_{p}}\subset I_{S}\,k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{% \hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{\ell}}.

If Jp=Jp+1J_{p}=J_{p+1}, then

x0k+pgh\displaystyle x_{0}^{k+p}g^{h} \displaystyle\in (I,x0k+p+1gh)\displaystyle(I,x_{0}^{k+p+1}g^{h})
i.e.,x0k+pgh\displaystyle\mbox{i.e.,}\;\;x_{0}^{k+p}g^{h} =\displaystyle= ax0k+p+1gh+b, bI.\displaystyle a\,x_{0}^{k+p+1}g^{h}+b,\;\;b\in I.

But 1-ax01-ax_{0} is a unit in k[𝐱]𝐩k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.72% 27pt}_{\ell}}, so

Jp=x0k+pgh=(1-ax0)-1bIk[𝐱]𝐩.J_{p}=x_{0}^{k+p}g^{h}=(1-ax_{0})^{-1}b\in I\,k[\hskip{0.7227pt}{\bf x}\hskip{% 0.7227pt}]_{\hskip{0.7227pt}{\bf p}\hskip{0.7227pt}_{\ell}}.

This means that in any case

x0k+multI(𝐪)ghIk[𝐱]𝐩x_{0}^{k+{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{\ell})}g^{h}% \in I\cdot k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{\hskip{0.7227pt}{\bf p}% \hskip{0.7227pt}_{\ell}}

hence, because qq_{\ell} is pp_{\ell}-primary:

x0k+multI(𝐪)gh𝐪x_{0}^{k+{\rm mult}_{I}(\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{\ell})}g^{h}% \in\hskip{0.7227pt}{\bf q}\hskip{0.7227pt}_{\ell}

Induction now shows that

x0mult(IH)ghIHx_{0}^{{\rm mult}_{\infty}(I^{H})}g^{h}\in I^{H}
Corollary 3.14

Let IAI^{A}, IHI^{H} be as above. If gIAg\in I^{A}, then

g=aifig=\sum a_{i}f_{i}

where deg(ai)deg(g)+(m(I)+n+1n+1)\deg(a_{i})\leq\deg(g)+{m(I)+n+1\choose n+1}.

Proof. Combine Propositions 3.6 and 3.13.

If we further estimate m(I)m(I) by Theorem 3.7 in characteristic 00 or by Proposition 3.8, we get somewhat weaker versions of Hermann’s Theorem 3.1. But if I=V(Z)I=V(Z), ZZ a good variety, we may expect the Corollary to give much better bounds than Theorem 3.1.

Corollary 3.14 shows that any example which demonstrates the necessity of double exponential growth in Hermann’s ideal membership bound (Theorem 3.1) also demonstrates the necessity of double exponential growth in the bounds on m(I)m(I) given in Theorem 3.7 and Proposition 3.8. Thus we can make use of the general arguments for the existence of such examples given in [MM82], rather than depending on the single example of Proposition 3.11, to show that the bounds on m(I)m(I) inevitably grow double exponentially: Since in Corollary 3.14, the degrees of the aia_{i} are bounded by a single exponential function of m(I)m(I), in all examples where the degrees of the aia_{i} grow double exponentially, m(I)m(I) also grows double exponentially.

This line of argument gives a geometric link between the ideal membership problem and m(I)m(I): In Corollary 3.14, if IAI^{A} exhibits aia_{i} of high degree, then IHI^{H} has primary components of high multiplicity. These components force m(I)m(I) to be large, and distinguish IHI^{H} from good ideals considered in Theorem 3.12 and related conjectures.

A major step in understanding the gap between the double exponential examples and the strong linear bounds on the regularity of many smooth varieties was taken by Brownawell [Bro87] and Kollár [Kol88]. They discovered the beautiful and satisfying fact that if we replace membership in II by membership in I\sqrt{I}, then there are single exponential bounds on the degrees of aia_{i}:

Theorem 3.15 (Brownawell, Kollár)

Let kk be any field, let I=(f1,,fk)k[x1,,xn]I=(f_{1},...,f_{k})\subset k[x_{1},...,x_{n}] and let d=max(deg(fi),i=1,,k;3)d=\max(\deg(f_{i}),i=1,\cdots,k;3). If n=1n=1, replace dd by 2d-12d-1. If gIg\in\sqrt{I}, then there is an expression

gs=i=1kaifig^{s}=\sum_{i=1}^{k}a_{i}f_{i}

where sdns\leq d^{n} and deg(ai)(1+deg(g))dn\deg(a_{i})\leq(1+\deg(g))d^{n}. In particular:

(I)dnI.\left(\sqrt{I}\right)^{d^{n}}\subset I.

What this shows is that although the bad examples have to have primary components at infinity of high degree, nonetheless these primary ideals contain relatively small powers of IH\sqrt{I^{H}}. The picture you should have is that these embedded components at infinity are like strands of ivy that creep a long way out from the hyperplane at infinity, but only by clinging rather closely to the affine components.


Technical Appendix to Section 3


1. Proof of the equivalence of the conditions in Definition 3.2:

In [Mum66, pp. 99-101], it is proven that for any coherent sheaf {\cal F} on 𝐏n{{\bf P}^{n}}, Hi((-i))=(0)H^{i}({\cal F}(-i))=(0), i1i\geq 1 implies that the same holds for (d){\cal F}(d), all d0d\geq 0, and that H0((d))H^{0}({\cal F}(d)) is generated by H0()H0(𝒪(d))H^{0}({\cal F})\otimes H^{0}({\cal O}(d)). In particular, if you apply this to =(m){\cal F}={\cal I}(m), the equivalence of (a) and (b) follows. (Note the diagram:

IdH0((d))k[𝐱]dH0(𝒪𝐏n(d))\begin{array}[]{ccc}I_{d}&\longrightarrow&H^{0}({\cal I}(d))\\ \bigcap&&\bigcap\\ k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{d}&\longrightarrow&H^{0}({\cal O}_% {\mbox{${{\bf P}^{n}}$}}(d))\end{array}

which shows that ImH0((m))I_{m}\rightarrow H^{0}({\cal I}(m)) is injective for every dd). To show that (b) \Rightarrow (c), first note that we may rephrase the reults in [Mum66] to say that if Hi((-i))=(0)H^{i}({\cal F}(-i))=(0), i1i\geq 1, then the degrees of the minimal generators of the k[𝐱]k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]-module

d𝐙H0((d))\stackrel{\textstyle\oplus}{d\in\mbox{\bf Z}}\;\;H^{0}({\cal F}(d))

are all zero or less. So we may construct the resolution in (c) inductively: at the kkth stage, say

α=1rk+1\;\stackrel{\textstyle\stackrel{r_{k}+1}{\oplus}}{{}_{\alpha=1}}\;k[𝐱]eα,k-1ϕk-1k[𝐱]k[𝐱]/I0\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{r_{k}+1}{\oplus}}{{}_{% \alpha=1}}\;$}\;k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]\cdot e_{\alpha,k-1}% \stackrel{\phi_{k-1}}{\longrightarrow}\cdots\longrightarrow k[\hskip{0.7227pt}% {\bf x}\hskip{0.7227pt}]\longrightarrow k[\hskip{0.7227pt}{\bf x}\hskip{0.7227% pt}]/I\longrightarrow 0

has been constructed, let Mk= ker(ϕk-)M_{k}=\mbox{ ker}(\phi_{k-\ell}) and let k{\cal F}_{k} be the corresponding sheaf of ideals. The induction hypothesis will say that Hi(k(m+k-1))=(0)H^{i}({\cal F}_{k}(m+k-1))=(0), i1i\geq 1. Therefore MkM_{k} is generated by elements of degree m+k\leq m+k, i.e., dα=degeα,km+kd_{\alpha}=\deg e_{\alpha,k}\leq m+k, all α\alpha. We get an exact sequence

0Mk+1α=1rk\;\stackrel{\textstyle\stackrel{r_{k}}{\oplus}}{{}_{\alpha=1}}\;(k[𝐱]eα,k)Mk 00\;\longrightarrow\;M_{k+1}\;\longrightarrow\raisebox{-5.0pt}{$\;\stackrel{% \textstyle\stackrel{r_{k}}{\oplus}}{{}_{\alpha=1}}\;$}\left(k[\hskip{0.7227pt}% {\bf x}\hskip{0.7227pt}]\cdot e_{\alpha,k}\right)\;\longrightarrow\;M_{k}\;% \longrightarrow\;0

hence

0k+1α=1rk\;\stackrel{\textstyle\stackrel{r_{k}}{\oplus}}{{}_{\alpha=1}}\;𝒪𝐏n(-dα)k 0\displaystyle 0\;\longrightarrow\;{\cal F}_{k+1}\;\longrightarrow\raisebox{-5.% 0pt}{$\;\stackrel{\textstyle\stackrel{r_{k}}{\oplus}}{{}_{\alpha=1}}\;$}{\cal O% }_{\mbox{${{\bf P}^{n}}$}}(-d_{\alpha})\;\longrightarrow\;{\cal F}_{k}\;% \longrightarrow\;0 (1)

Therefore

α=1rk\;\stackrel{\textstyle\stackrel{r_{k}}{\oplus}}{{}_{\alpha=1}}\;Hi(𝒪𝐏n(m+k-i-dα))Hi(k(m+k-i))\displaystyle\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{r_{k}}{\oplus}% }{{}_{\alpha=1}}\;$}H^{i}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m\!+\!k\!-\!i\!-\!% d_{\alpha}))\longrightarrow H^{i}({\cal F}_{k}(m\!+\!k\!-\!i))\longrightarrow (2)
Hi+1(k+1(m+(k+1)-(i+1)))α=1rk\;\stackrel{\textstyle\stackrel{r_{k}}{\oplus}}{{}_{\alpha=1}}\;Hi+1(𝒪𝐏n(m+k-i-dα))\displaystyle\;\;\;\;H^{i+1}({\cal F}_{k+1}(m\!+\!(k\!+\!1)\!-\!(i\!+\!1)))% \longrightarrow\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{r_{k}}{% \oplus}}{{}_{\alpha=1}}\;$}H^{i+1}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m\!+\!k\!% -\!i\!-\!d_{\alpha}))

is exact. But m+k-i-dα-im+k-i-d_{\alpha}\geq-i so Hi+1(𝒪𝐏n(m+k-i-dα))=(0)H^{i+1}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m+k-i-d_{\alpha}))=(0). This shows that k+1{\cal F}_{k+1} satisfies the induction hypothesis and we can continue. Thus (c) holds. To see that (c) \Rightarrow (a), we just use the same exact sequences (1) and prove now by descending induction on kk that Hi(k(m+k-i))=(0)H^{i}({\cal F}_{k}(m+k-i))=(0), i1i\geq 1. Since I=0I={\cal F}_{0}, this does it. The inductive step again uses (2), since Hi(𝒪𝐏n(m+k-i-dα))=(0)H^{i}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m+k-i-d_{\alpha}))=(0) too.


2. Proof of Proposition 3.6:

Look first at the case r=0r=0. Let {\cal I} be the sheaf of ideals defined by II and let *{\cal I}^{*}\supset{\cal I} be the sheaf defined by omitting all 00-dimensional primary components of II. Consider the exact sequence:

0(m-1)*(m-1)(*/)(m-1) 00\;\longrightarrow\;{\cal I}(m-1)\;\longrightarrow{\cal I}^{*}(m-1)\;% \longrightarrow\;({\cal I}^{*}/{\cal I})(m-1)\;\longrightarrow\;0

This gives us:

H0(*(m-1))H0((*/)(m-1))H1((m-1))H^{0}({\cal I}^{*}(m-1))\;\longrightarrow\;H^{0}(({\cal I}^{*}/{\cal I})(m-1))% \;\longrightarrow\;H^{1}({\cal I}(m-1))

Now H1((m-1))=(0)H^{1}({\cal I}(m-1))=(0) by mm-regularity, and h0((*/)(m-1))=h0(*/)= length(*/)=arith-deg0(I)h^{0}(({\cal I}^{*}/{\cal I})(m-1))=h^{0}({\cal I}^{*}/{\cal I})=\mbox{ length% }({\cal I}^{*}/{\cal I})=\mbox{\rm arith-deg}_{0}(I) since */{\cal I}^{*}/{\cal I} has 00-dimensional support. But H0(*(m-1))H0(𝒪𝐏n(m-1))H^{0}({\cal I}^{*}(m-1))\subset H^{0}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m-1)), so

arith-deg0(I)\displaystyle\mbox{\rm arith-deg}_{0}(I) \displaystyle\leq h0(*(m-1))\displaystyle h^{0}({\cal I}^{*}(m-1))
\displaystyle\leq h0(𝒪𝐏n(m-1))\displaystyle h^{0}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m-1))
=\displaystyle= (m+n-1n)\displaystyle{m+n-1\choose n}

If r>0r>0, we can prove the Proposition by induction on rr. Let HH be a generic hyperplance in 𝐏n{{\bf P}^{n}}, given by h=0h=0. Let IH=(I,h)/(h)k[x0,,xn]/(h)k[x0,,xn-1]I_{H}=(I,h)/(h)\subset k[x_{0},\ldots,x_{n}]/(h)\cong k[x_{0}^{\prime},\ldots,% x_{n-1}^{\prime}] for suitable linear combinations xix_{i}^{\prime} of xix_{i}. Then it is easy to check that:

arith-degr(I)=arith-degr-1(IH)\mbox{\rm arith-deg}_{r}(I)=\mbox{\rm arith-deg}_{r-1}(I_{H})

and that IHI_{H} is also mm-regular, so by induction

arith-degr-1(IH)\displaystyle\mbox{\rm arith-deg}_{r-1}(I_{H}) \displaystyle\leq (m+(n-1)-(r-1)-1(n-1)-(r-1))\displaystyle{m+(n-1)-(r-1)-1\choose(n-1)-(r-1)}
=\displaystyle= (m+n-r-1n-r)\displaystyle{m+n-r-1\choose n-r}

If r=-1r=-1, we use the fact that

0IdH0((d))(Isat)d0\longrightarrow I_{d}\longrightarrow H^{0}({\cal I}(d))\stackrel{\approx}{% \longleftarrow}(I^{\mbox{sat}})_{d}

if dmd\geq m, hence

dim(Isat/I)dimk[𝐱]/(x0,,xn)m=(m+nn+1).\dim(I^{\mbox{sat}}/I)\leq\dim k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]/(x_{% 0},\ldots,x_{n})^{m}={m+n\choose n+1}.

3. Proof of Proposition 3.8:

Let Ik[x0,xn]I\subset k[x_{0},\ldots x_{n}] and assume, after a linear change of coordinates, that xnx_{n} is not contained in any associated prime ideals of II. Let I¯k[x0,xn-1]\overline{I}\subset k[x_{0},\ldots x_{n-1}] be the image of II. Then d(I¯)=d(I)d(\overline{I})=d(I) and by induction we may assume

m(I¯)(2d(I))(n-1)!.m(\overline{I})\leq(2d(I))^{(n-1)!}.

We will prove, in fact, that

m(I)m(I¯)+(m(I¯)-1+nn)\displaystyle m(I)\leq m(\overline{I})+{m(\overline{I})-1+n\choose n} (3)

and then we will be done by virtue of the elementary estimate:

if m*=(2d(I))(n-1)!, and d2, then m*+(m*-1+nn)(2d(I))n!\mbox{if }m^{*}=(2d(I))^{(n-1)!},\mbox{ and }d\geq 2,\mbox{ then }m^{*}+{m^{*}% -1+n\choose n}\leq(2d(I))^{n!}

To prove (3), we use the long exact sequence

0(I:(x0))k-1x0IkI¯k00H0((k-1))H0((k-1))H0(¯(k-1))δδH1((k-1))H1((k))H1(¯(k))\begin{array}[]{ccccccccc}0&\longrightarrow&(I:(x_{0}))_{k-1}&\stackrel{x_{0}}% {\longrightarrow}&I_{k}&\longrightarrow&\overline{I}_{k}&\longrightarrow&0\\ &&\downarrow&&\downarrow&&\downarrow\\ 0&\longrightarrow&H^{0}({\cal I}(k-1))&\longrightarrow&H^{0}({\cal I}(k-1))&% \longrightarrow&H^{0}(\overline{\cal I}(k-1))&\stackrel{\delta}{% \longrightarrow}\\ \\ &\stackrel{\delta}{\longrightarrow}&H^{1}({\cal I}(k-1))&\longrightarrow&H^{1}% ({\cal I}(k))&\longrightarrow&H^{1}(\overline{\cal I}(k))&\\ \end{array}

where (I:(x0))={f|x0fI}(I:(x_{0}))=\{\,f\,|\,x_{0}f\in I\,\}. Let m¯=m(I¯)\overline{m}=m(\overline{I}). Note that Hi(¯(k-1))=(0)H^{i}(\overline{\cal I}(k-1))=(0), i1i\geq 1, km¯k\geq\overline{m}, hence

Hi((k-1))Hi((k))H^{i}({\cal I}(k-1))\rightarrow H^{i}({\cal I}(k))

is an isomorphism if km¯-1+1k\geq\overline{m}-1+1 and i2i\geq 2. Since Hi((k))=(0)H^{i}({\cal I}(k))=(0), k0k\gg 0, this shows that Hi((k))=(0)H^{i}({\cal I}(k))=(0), i2i\geq 2, km¯-ik\geq\overline{m}-i. Moreover I¯kH0(¯(k))\overline{I}_{k}\rightarrow H^{0}(\overline{\cal I}(k)) is an isomorphism if km¯k\geq\overline{m}, hence δ=0\delta=0 if km¯k\geq\overline{m}, hence H1((k))=(0)H^{1}({\cal I}(k))=(0), km¯-1k\geq\overline{m}-1. But now look at the surjectivity of IkH0((k))I_{k}\rightarrow H^{0}({\cal I}(k)). For all kk, let MkM_{k} be the cokernel. Then Mk\oplus M_{k} is a k[𝐱]k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]-module of finite dimension. Multiplication by x0x_{0} induces a sequence:

0(I:(x0))k-1Ik-1Mk-1x0Mk 00\;\longrightarrow\;{(I:(x_{0}))_{k-1}\over I_{k-1}}\;\longrightarrow\;M_{k-1}% \;\stackrel{x_{0}}{\longrightarrow}\;M_{k}\;\longrightarrow\;0

which is exact if km¯k\geq\overline{m}. But if, for one value of km¯k\geq\overline{m},

(I:(x0))k=Ik\displaystyle(I:(x_{0}))_{k}=I_{k} (4)

then by Theorem 3.3, II is kk-regular and (4) continues to hold for larger kk, and MkM_{k} must be (0)(0). In other words,

dimMk, km¯-1\dim M_{k},\;\;k\;\geq\;\overline{m}-1

is non increasing and monotone decreasing to zero when km¯k\geq\overline{m}. Therefore

m(I)\displaystyle m(I) \displaystyle\leq m¯+dimMm¯-1\displaystyle\overline{m}+\dim M_{\overline{m}-1}
\displaystyle\leq m¯+dimk[𝐱]m¯-1\displaystyle\overline{m}+\dim k[\hskip{0.7227pt}{\bf x}\hskip{0.7227pt}]_{% \overline{m}-1}
\displaystyle\leq m¯+(m¯-1+nn)\displaystyle\overline{m}+{\overline{m}-1+n\choose n}

which proves (3).


4. Proof of Theorem 3.12(b):

Let ZZ be a smooth rr-dimensional subvariety of 𝐏n{{\bf P}^{n}} and d=d= degree of ZZ. We first consider linear projections of ZZ to 𝐏r{{\bf P}^{r}} and to 𝐏r-1{{\bf P}^{r-1}}. To get there, let L1𝐏nL_{1}\subset\mbox{${{\bf P}^{n}}$} be a linear subspace of dimension n-r-1n-r-1 disjoint from ZZ and L2L1L_{2}\subset L_{1} a linear subspace of dimension n-r-2n-r-2. Take these as centers of projection:

𝐏n-L1Zp2p1𝐏r+1-{P}Z1p2𝐏r\begin{array}[]{ccc}\mbox{${{\bf P}^{n}}$}-L_{1}&\supset&\;\;\;\;\;Z\;\;\;% \stackrel{p_{2}}{\longrightarrow}\\ \downarrow&&\downarrow p_{1}\\ \mbox{${{\bf P}^{r+1}}$}-\{P\}&\supset&Z_{1}\\ \downarrow\\ \stackrel{p_{2}}{\longrightarrow}\;\mbox{${{\bf P}^{r}}$}\end{array}

Let x0,xr+1x_{0},\ldots x_{r+1} be coordinates on 𝐏r+1{{\bf P}^{r+1}} so that p=(0,,0,1)p=(0,\ldots,0,1), hence x0,xrx_{0},\ldots x_{r} are coordinates on 𝐏r{{\bf P}^{r}}. Let f(x0,xr+1)=0f(x_{0},\ldots x_{r+1})=0 be the equation of the hypersurface Z1Z_{1}.

Now there are two ways of getting rr-forms on ZZ: by pullback of rr-forms on 𝐏r{{\bf P}^{r}} and by residues of (r+1)(r+1)-forms on 𝐏r-1{{\bf P}^{r-1}} with simple poles along Z1Z_{1}. The first gives us a sheaf map

p2*Ω𝐏rrΩZrp_{2}^{*}\;\Omega_{\mbox{${{\bf P}^{r}}$}}^{r}\hookrightarrow\Omega_{Z}^{r}

whose image is ΩZr(-B1)\Omega_{Z}^{r}(-B_{1}), B1B_{1} the branch locus of p2p_{2}. Corresponding to this on divisor classes:

KZ\displaystyle K_{Z} \displaystyle\equiv p2*(K𝐏r)+B1\displaystyle p_{2}^{*}(K_{\mbox{${{\bf P}^{r}}$}})+B_{1} (5)
\displaystyle\equiv -(r+1)H+B1,\displaystyle-(r+1)H+B_{1},

where H=H= hyperplane divisor class on ZZ. The second is defined by

a(𝐱)dx1dxr+1fp1*(a(𝐱)dx1dxrf/xr+1)\displaystyle a(\hskip{0.7227pt}{\bf x}\hskip{0.7227pt})\cdot{dx_{1}\wedge% \ldots\wedge dx_{r+1}\over f}\;\longmapsto\;p_{1}^{*}\left(a(\hskip{0.7227pt}{% \bf x}\hskip{0.7227pt})\cdot{dx_{1}\wedge\ldots\wedge dx_{r}\over\partial f/% \partial x_{r+1}}\right) (6)

and it gives us an isomorphism

p1*(Ω𝐏r+1r+1(Z1)|Z1)ΩZr(B2)p_{1}^{*}(\Omega_{\mbox{${{\bf P}^{r+1}}$}}^{r+1}(Z_{1}){\mid}_{Z_{1}})\;\cong% \;\Omega_{Z}^{r}(B_{2})

B2B_{2} is a divisor which can be interpreted as the conductor of the affine rings of ZZ over those of Z1Z_{1}: i.e.,

f𝒪Z(-B2)f(p1,*𝒪Z)𝒪Z1.f\in{\cal O}_{Z}(-B_{2})\;\Longleftrightarrow\;f\cdot(p_{1,*}{\cal O}_{Z})% \subset{\cal O}_{Z_{1}}.

In particular,

p1,*(𝒪Z(-B2)) sheaf of 𝒪Z1- ideals C in 𝒪Z1.\displaystyle p_{1,*}({\cal O}_{Z}(-B_{2}))\cong\mbox{ sheaf of }{\cal O}_{Z_{% 1}}-\mbox{ ideals }C\mbox{ in }{\cal O}_{Z_{1}}. (7)

A classical reference for these basic facts is Zariski [Zar69], Prop. 12.13 and Theorem 15.3. A modern reference is Lipman [Lip84] (apply Def. (2.1)b to p1p_{1} and apply Cor. (13.6) to Z1𝐏r+1Z_{1}\subset\mbox{${{\bf P}^{r+1}}$}). (4) gives us the divisor class identity:

KZ+B2\displaystyle K_{Z}+B_{2} \displaystyle\equiv p1*(K𝐏r+1+Z1)\displaystyle p_{1}^{*}(K_{\mbox{${{\bf P}^{r+1}}$}}+Z_{1}) (8)
\displaystyle\equiv (d-r-2)H.\displaystyle(d-r-2)H.

(5) and (8) together tell us that

B1+B2(d-1)H.B_{1}+B_{2}\equiv(d-1)H.

In fact, the explicit description (6) of the residue tells us more: namely that if y1,,yry_{1},\ldots,y_{r} are local coordinates on ZZ, then

(x1,,xr)(y1,,yr)1f/xr+1dy1dyr{\partial(x_{1},\ldots,x_{r})\over\partial(y_{1},\ldots,y_{r})}\cdot{1\over% \partial f/\partial x_{r+1}}\;dy_{1}\wedge\ldots\wedge dy_{r}

generates ΩZr(B2)\Omega_{Z}^{r}(B_{2}) locally. But (x1,,xr)(y1,,yr)=0{\partial(x_{1},\ldots,x_{r})\over\partial(y_{1},\ldots,y_{r})}=0 is a local equation for B1B_{1}, so this means that f/xr+1=0\partial f/\partial x_{r+1}=0 is a local equation for B1+B2B_{1}+B_{2}. But f/xr+1=0\partial f/\partial x_{r+1}=0 is a global hypersurface of degree d-1d-1 in 𝐏r+1{{\bf P}^{r+1}}, hence globally:

B1+B2=p1*(V(fxr1))B_{1}+B_{2}=p_{1}^{*}(V({\partial f\over\partial x_{r_{1}}}))

(equality of divisors, not merely divisor classes). All this is standard classical material.

(7) has an important cohomological consequence: let C*𝒪𝐏r+1C^{*}\subset{\cal O}_{\mbox{${{\bf P}^{r+1}}$}} be the sheaf of ideals consisting of functions whose restriction to Z1Z_{1} lies in C. Then we get an exact sequence:

0𝒪𝐏r+1(-Z1)C*𝒪𝐏r+1C𝒪Z1 00\,\rightarrow\,{\cal O}_{\mbox{${{\bf P}^{r+1}}$}}(-Z_{1})\,\rightarrow\,C^{*% }\,{\cal O}_{\mbox{${{\bf P}^{r+1}}$}}\,\rightarrow\,C\,{\cal O}_{Z_{1}}\,% \rightarrow\,0

hence an exact sequence

0𝒪𝐏r+1(-d)C*𝒪𝐏r+1()p1,*(𝒪Z(H-B2)) 00\,\rightarrow\,{\cal O}_{\mbox{${{\bf P}^{r+1}}$}}(\ell-d)\,\rightarrow\,C^{*% }{\cal O}_{\mbox{${{\bf P}^{r+1}}$}}(\ell)\,\rightarrow\,p_{1,*}({\cal O}_{Z}(% \ell H-B_{2}))\,\rightarrow\,0

for all integers \ell. But H1(𝒪𝐏r+1(-d))=(0)H^{1}({\cal O}_{\mbox{${{\bf P}^{r+1}}$}}(\ell-d))=(0), hence

H0(C*𝒪𝐏r+1())H0(𝒪Z(H-B2))H^{0}(C^{*}{\cal O}_{\mbox{${{\bf P}^{r+1}}$}}(\ell))\,\rightarrow\,H^{0}({% \cal O}_{Z}(\ell H-B_{2}))

is surjective, hence

H0(𝒪Z(H-B2)) Im[H0(𝒪𝐏r+1())H0(𝒪Z(H))].\displaystyle H^{0}({\cal O}_{Z}(\ell H-B_{2}))\,\subset\,\mbox{ Im}\left[H^{0% }({\cal O}_{\mbox{${{\bf P}^{r+1}}$}}(\ell))\,\rightarrow\,H^{0}({\cal O}_{Z}(% \ell H))\right]. (9)

Now let us vary the projections p1p_{1} and p2p_{2}. For each choice of L1L_{1}, we get a different B1B_{1}: call it B1(L1)B_{1}(L_{1}), and for each choice of L2L_{2}, as different B2B_{2}: call it B2(L2)B_{2}(L_{2}). By (5) and (8), all divisors B1(L1)B_{1}(L_{1}) are linearly equivalent as are all divisors B2(L2)B_{2}(L_{2}). Moreover:

L1\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{L_{1}}}\;B1(L1)\displaystyle\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}% _{L_{1}}}\;$}B_{1}(L_{1}) =\displaystyle= \displaystyle\emptyset
L2\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{L_{2}}}\;B2(L2)\displaystyle\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}% _{L_{2}}}\;$}B_{2}(L_{2}) =\displaystyle= \displaystyle\emptyset

This is because, if xZx\in Z, then there is a choice of L1L_{1} such that p1:Z𝐏rp_{1}:Z\rightarrow\mbox{${{\bf P}^{r}}$} is unramified at yy; and a choice of L2L_{2} such that p2(x)Z1p_{2}(x)\in Z_{1} is smooth, hence p2p_{2} is an isomorphism near xx. Thus

B1(L1)=KZ+(r+1)H\mbox{${\mid\!B_{1}(L_{1})\!\mid}$}=\mbox{${\mid\!K_{Z}+(r+1)H\!\mid}$}

and

B2(L2)=KZ+(d-r-2)H\mbox{${\mid\!B_{2}(L_{2})\!\mid}$}=\mbox{${\mid\!K_{Z}+(d-r-2)H\!\mid}$}

are base point free linear systems.

Next choose (r+1)(r+1) L2L_{2}’s, called L2αL_{2}^{\alpha}, 1αr+11\leq\alpha\leq r+1, so that if B2(α)=B2(L2(α))B_{2}^{(\alpha)}=B_{2}(L_{2}^{(\alpha)}), then α\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{\alpha}}\;B2(α)=\raisebox{-5.0pt}{$\;\stackrel{\textstyle\stackrel{}{\bigcap}}{{}_{\alpha}}\;$% }B_{2}^{(\alpha)}=\emptyset. Look at the Koszul complex:

0𝒪Z(H-B2(α))\displaystyle 0\,\rightarrow\,{\cal O}_{Z}(\ell H-\sum B_{2}^{(\alpha)}) \displaystyle\rightarrow \displaystyle\cdots
α,β𝒪Z(H-B2(α)-B2(β))\displaystyle\rightarrow\,\sum_{\alpha,\beta}{\cal O}_{Z}(\ell H-B_{2}^{(% \alpha)}-B_{2}^{(\beta)}) \displaystyle\rightarrow α𝒪Z(H-B2(α))𝒪Z(H) 0.\displaystyle\sum_{\alpha}{\cal O}_{Z}(\ell H-B_{2}^{(\alpha)})\,\rightarrow\,% {\cal O}_{Z}(\ell H)\,\rightarrow\,0.

This is exact and diagram chasing gives the conclusion:

Hi(𝒪Z(H-(i+1)B2))=(0), all i1H^{i}({\cal O}_{Z}(\ell H-(i+1)B_{2}))=(0),\mbox{ all }i\geq 1
αH0(𝒪Z(H-B2(α)))H0(𝒪Z(H)) surjective\Rightarrow\sum_{\alpha}H^{0}({\cal O}_{Z}(\ell H-B_{2}^{(\alpha)}))% \rightarrow H^{0}({\cal O}_{Z}(\ell H))\mbox{ surjective}

hence by (9)

H0(𝒪𝐏n())H0(𝒪Z(H)) surjectiveH^{0}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(\ell))\rightarrow H^{0}({\cal O}_{Z}(% \ell H))\mbox{ surjective}

and

Hi+j(𝒪Z(H-(i+1)B2))=(0), all i0H^{i+j}({\cal O}_{Z}(\ell H-(i+1)B_{2}))=(0),\mbox{ all }i\geq 0
Hj(𝒪Z(H))=(0).\Rightarrow\;\;H^{j}({\cal O}_{Z}(\ell H))=(0).

Now I(Z)I(Z) is mm-regular if and only if Hi(Z(m-i))=(0)H^{i}({\cal I}_{Z}(m-i))=(0), i1i\geq 1, hence if and only if

H0(𝒪𝐏n(m-1))H0(𝒪Z(m-1)) surjectiveH^{0}({\cal O}_{\mbox{${{\bf P}^{n}}$}}(m-1))\rightarrow H^{0}({\cal O}_{Z}(m-% 1))\mbox{ surjective}
Hi(𝒪Z(m-i-1))=(0), i1.H^{i}({\cal O}_{Z}(m-i-1))=(0),\;i\geq 1.

By the previous remark, this follows provided that

Hi+j(𝒪Z((m-i-1)H-(j+1)B2))=(0), if i,j0, i+j1.H^{i+j}({\cal O}_{Z}((m-i-1)H-(j+1)B_{2}))=(0),\mbox{ if }i,j\geq 0,\;i+j\geq 1.

But let us rewrite:

(m-i-1)H-(j+1)B2KZ+jB1+(m-i-(j+1)(d-1)+r)H(m-i-1)H-(j+1)B_{2}\;\equiv\;K_{Z}+jB_{1}+(m-i-(j+1)(d-1)+r)H

using (5) and (8). Note that jB1+HjB_{1}+\ell H is an ample divisor if 1,j0\ell\geq 1,j\geq 0, because B1{\mid\!B_{1}\!\mid} is base point free. Therefore by the Kodaira Vanishing Theorem,

Hi(𝒪Z(KZ+jB1+H))=(0), i,j1, j0H^{i}({\cal O}_{Z}(K_{Z}+jB_{1}+\ell H))=(0),\;i,j\geq 1,\;j\geq 0

and provided m=(r+1)(d-2)+2m=(r+1)(d-2)+2, this gives the required vanishing.

4 Applications

From some points of view, the first main problem of algebraic geometry is to reduce the study of a general ideal II to that of prime ideals, or the study of arbitrary schemes to that of varieties. One way of doing this is to find a decomposition of the ideal into primary ideals: i.e. write it as an intersection of primary ideals. But even when non-redundancy is added, this is not unique, and usually one actually wants something less: to find its radical and perhaps write the radical as an intersection of prime ideals, or to find its top dimensional part, or to find its associated prime ideals and their multiplicities. There are really four computational problems involved here which should be treated separately: (i) eliminating the multiplicities in the ideal II, (ii) separating the pieces of different dimension, (iii) “factoring” the pieces of each dimension into irreducible components, and finally (iv) describing the original multiplicities, either numerically or by a primary ideal. Three of these four problems are the direct generalizations of the basic problems for factoring a single polynomial: we can eliminate multiple factors, getting a square-free polynomial, we can factor this into irreducible pieces and we can ask for the multiplicities with which each factor appeared in the original polynomial. There is a fifth question which arises when we work, as we always must do on a computer, over a non-algebarically closed field kk: we can ask (v) for an extension field kk^{\prime} of kk over which the irreducible components break up into absolutely irreducible components.

Classical algorithms for all of these of these rely heavily on making explicit projections of V(I)V(I) to lower dimensional projective spaces. This can be done either by multi-variable resultants if you want only the set-theoretic projection, or by Gröbner bases with respect to the lexicographic order or an elimination order, to get the full ideal Ik[X0,,Xm]I\cap k[X_{0},\cdots,X_{m}]. Recent treatments of multi-variable resultants can be found in [Can89], [Cha91], and a recent treatment of the basis method can be found in [GTZ88]. There is no evidence that either of these is an efficient method, however, and taking Gröbner bases in the lexicographical order or an elimination order is often quite slow, certainly slow in the worst case. The general experience is that taking projections can be very time consuming. One reason is that the degree of the generators may go up substantially and that sparse defining polynomials may be replaced by more or less generic polynomials. A specific example is given by principally polarized abelian varieties of dimension rr: they are defined by quadratic polynomials in (4r-1)(4^{r}-1)-space, but their degree here (hence the degree of their generic projection to 𝐏r+1{{\bf P}^{r+1}}) is 4rr!4^{r}r! [Mum70a]. In fact, any variety is defined purely by quadratic relations in a suitable embedding [Mum70b].

Instead of using real computational experience, the fundamental method in theoretical computer science for analyzing complexity of algorithms is to count operations. For algebraic algorithms, the natural measure of complexity is not the number of bit operations, but the number of field operations, addition, subtraction, multiplication and (possibly) division that are used. In this sense, any methods that involve taking Gröbner bases for any order on monomials will have a worst-case behavior whose complexity goes up with the regularity of the ideal hence will take “double exponential time”. However, it appears that this worst-case behavior may in fact only concern problem (iv) – finding the primary ideals – and that problems (i), (ii) and (iii) may be solvable in “single exponential time”. The idea that such algorithms should exist for finding V(I)V(I) set-theoretically was proposed in the 1984 lecture on which this article is based, but turned out, in fact, to have been already proven by Chistov and Grigoriev, cf. their unpublished 1983 note [CG83]. Their line of research led, in some sense, to the work of Brownawell and Kollár, showing the single exponential bound (I)mI\left(\sqrt{I}\right)^{m}\subset I for m=dnm=d^{n}, where d=max( degrees of generators of I)d=\max(\hbox{ degrees of generators of }I).

Based on this work, Giusti and Heintz [GH91] give a singly exponentially bounded algorithm for computing ideals 𝐪i{\bf q}_{i} such that V(𝐪i)V({\bf q}_{i}) are the irreducible components of V(I)V(I) (over the ground field kk). The method depends on computing what is essentially the Chow form of each component, and leads to an ideal defining this variety but not its full ideal. In fact, their 𝐪i{\bf q}_{i} may be guaranteed to be prime except for possible embedded components.

A direct approach to constructing both I\sqrt{I} and the intersection of the top-dimensional primary components of II, denoted Top(I)\hbox{Top}(I), is given in a recent paper by Eisenbud, Huneke and Vasconcelos [EHV92]. Their construction of the radical uses the Jacobian ideals, i.e. the ideals of minors of various sizes of the Jacobian matrix of generators of II. This is certainly the most direct approach, but, again they have trouble with possible embedded components, and must resort to ideal quotients, hence they need a Gröbner basis of II in the reverse lexicographic order. They compute Top(I)\hbox{Top}(I) as the annihilator of Extcodim(I)(k[X0,,Xn]/I,k[X0,,Xn])\hbox{Ext}^{{\rm codim}(I)}(k[X_{0},\cdots,X_{n}]/I,k[X_{0},\cdots,X_{n}]), which is readily found from a full resolution using Gröbner bases. Their algorithm appears to be practical in some cases of interest, but still has double exponential time worst-case behavior.

It may turn out to be most effective in practice to combine these ideas. Often an ideal under study has regularity far smaller than the geometric degree of its top dimensional components; projecting these components to a hypersurface requires computing in degrees up to the geometric degree, which is wasteful. On the other hand, methods such as those in [EHV92] work better in low codimensions, if only because there are fewer minors to consider in the Jacobian matrix. Thus, projecting an arbitary scheme down to low codimension and then switching to direct methods may work best of all.

This still does not settle the issue of the complexity of calculating I\sqrt{I}, or, for that matter, calculating the full prime ideal of any subvariety of codimension greater than one. Chow form type methods give you an effective method of defining the set V(I)V(I) but only of generating II up to possible embedded components. For this reason, the two schools of research, one based on the algebra of II, the other based on subsets of 𝐏n{\bf P}^{n} have diverged. If we knew, as discussed in the previous section, that the regularity of a reduced ideal could be bounded singly exponentially, then we could bound the degrees of the generators of I\sqrt{I}, and, using Brownawell-Kollár, we could determine I\sqrt{I} up to these degrees and get the whole ideal. But without such a bound, it is still not clear whether only V(I)V(I) and not I\sqrt{I} can be found in worst-case single exponential time.

Let’s look at problem (iii). Assume you have found a reduced equidimensional II. To study splitting it into irreducible or absolutely irreducible pieces, we shall assume initially it is a hypersurface, i.e. I=(f)I=(f). Computationally, there may often be advantages to not projecting a general II to a hypersurface, and we will discuss one such approach below. Geometrically, there is nothing very natural about irreducible but not absolutely irreducible varieties: from the standpoint of their properties, they behave like reducible varieties, except that, being conjugate over kk, their components have very similar properties. If the ground field kk gets bigger or smaller, the set of absolutely irreducible components gets partitioned in finer or coarser ways into the kk-components. If one has never done any calculations, one would therefore be inclined to say – let’s extend kk as far as needed to split our algebraic set up into absolutely irreducible components. This is a very bad idea! Unless this extension kk^{\prime} happens to be something simple like a quadratic or cyclotomic extension of kk, the splitting field kk^{\prime} is usually gigantic. This is what happens if one component of V(I)V(I) is defined over an extension field k1k_{1} of kk of degree ee, and the Galois group of k1/kk_{1}/k is the full symmetric group, a very common occurence. Then V(I)V(I) only splits completely over the Galois closure of k1/kk_{1}/k and this has degree e!e!. The moral is: never factor unless you have to.

In fact, unless you need to deal simultaneously with more than one of its irreducible components, you can proceed as follows: the function field K=k[X0,,Xn]/(f)K=k[X_{0},\cdots,X_{n}]/(f) contains as a subfield an isomorphic copy of k1k_{1}: you find that field as an extension k1=k[y]/(p(y))k_{1}=k[y]/(p(y)) of kk, and solve for the equation of one irreducible component f1k1[X0,,Xn]f_{1}\in k_{1}[X_{0},\cdots,X_{n}] by the formula Normk1/k(f1)=f\hbox{Norm}_{k_{1}/k}(f_{1})=f.

Pursuing this point, why should one even factor the defining equation ff over kk? Factoring, although it takes polynomial time [LLL82], is often very slow in real time, and, unless the geometry dictates that the components be treated separately, why not leave them alone. In some situations, for instance, [DD84] one may have an ideal, module or other algebraic structure defined by polynomials or matrices of polynomials over a ground ring D=k[y]/(p(y))D=k[y]/(p(y)), where pp is a square-free polynomial. Thus DD is a direct sum of extension fields, but there is no need to factor pp or split up DD until the calculations take different turns with the structures over different pieces of Spec(D)Spec(D).

The standard methods of factoring in computer algebra all depend on (i) writing the polynomial over a ring, finitely generated over 𝐙\bf Z, and reducing modulo a maximal ideal 𝐦\bf m in that ring, obtaining a polynomial over a finite field; and (ii) restricting to a line LL, i.e. substituting Xi=aiX0+bi,i1X_{i}=a_{i}X_{0}+b_{i},i\geq 1 for all but one variable, obtaining a polynomial in one variable over a finite field. This is then factored and then, using Hensel’s lemma, one lifts this factorization modulo higher powers of 𝐦\bf m and of the linear space LL. One then checks whether a coarsened version of this factorization works for ff. This is all really the arithmetic of various small fields. Geometrically, every polynomial in one variable factors over a suitable extension field and the question of counting the absolutely irreducible components of a variety is really more elementary: it is fundamentally topological and not arithmetic. One should, therefore, expect there to be direct geometric ways of counting these components and separating them. Assuming II is a reduced, equi-r-dimensional ideal, the direct way should be to use Serre duality, computing the cohomology Hr(ΩV(I)r)H^{r}(\Omega^{r}_{V(I)}), where ΩV(I)rωV(I)\Omega^{r}_{V(I)}\subset\omega_{V(I)} is the subsheaf of the top-dimensional dualizing sheaf of V(I)V(I) of absolutely regular rr-forms. Its dimension will be the number of absolutely irreducible components into which V(I)V(I) splits. Calculating this cohomology involves two things: algebraically resolving the ideal II and geometrically resolving the singularities of V(I)V(I) far enough to work out ΩV(I)r\Omega^{r}_{V(I)}. Classically, when I=(f)I=(f) was principal, ΩV(I)r\Omega^{r}_{V(I)} was called its ideal of “subadjoint” polynomials.

There is one case where this is quite elementary and has been carried out: this is for plane curves. One can see immediately what is happening by remarking that a non-singular plane curve is automatically absolutely irreducible, hence one should expect that its singularities control its decomposition into absolutely irreducible pieces. Indeed, if 𝐂k[X0,X1,X2]/(f){\bf C}\subset k[X_{0},X_{1},X_{2}]/(f) is the conductor ideal, then ΩV(f)1\Omega^{1}_{V(f)} is given by the homogeneous ideal 𝐂{\bf C}, but with degree 0 being shifted to be polynomials of degree d-3d-3, dd the degree of ff. To calculate H1H^{1}, assume X0X_{0} is not zero at any singularity of V(f)V(f) and look at the finite-dimensional vector space of all functions k[X1/X0,X2/X0]/(𝐂+(f))k[X_{1}/X_{0},X_{2}/X_{0}]/({\bf C}+(f)) modulo the restrictions g/(X0d-3)g/(X_{0}^{d-3}) for all homogeneous polynomials gg of degree d-3d-3. This will be canonically the space of functions on the set of components of V(f)V(f) with sum 00. In particular, it is (0)(0) if and only if V(f)V(f) is absolutely irreducible. This follows from standard exact sequences and duality theory. It was known classically as the Cayley-Bacharach theorem, for the special case where V(f)V(f) was smooth except for a finite number of ordinary double points. It states that V(f)V(f) is absolutely irreducible if and only if for every double point PP, there is a curve of degree d-3d-3 passing through all the double points except PP.

This example gives one instance where a deeper computational analysis of varieties requires a computation of its resolution of singularities. We believe that there will be many instances where practical problems will require such an analysis. In many ways, resolution theorems look quite algorithmic, and, for instance, Abhyankar and his school have been approaching the problem in this way [Abh82], as have Bierstone and Milman [BM91]. However, the only case of resolution of singularities to be fully analyzed in the sense of computational complexity is that of plane curves. This has been done by Teitelbaum [Tei89], [Tei90]. His analysis is notable in various ways: he is extremely careful about not making unnecessary factorizations, let alone taking unnecessary field extensions, and uses the “DD” formalism discussed above. He describes his algorithm so precisely that it would be trivial to convert it to code and, as a result, he gives excellent bounds on its complexity.

References

  • Abh82 S. S. Abhyankar, Weighted expansions for canonical desingularization, Lecture Notes in Math., vol. 910, Springer-Verlag, 1982.
  • Art76 M. Artin, Lectures on deformations of singularities, Tata Institute on Fundamental Research, Bombay, 1976.
  • Bay82 Dave Bayer, The division algorithm and the Hilbert scheme, Ph.D. thesis, Harvard University, Department of Mathematics, June 1982, order number 82-22588, University Microfilms International, 300 N. Zeeb Rd., Ann Arbor, MI 48106.
  • BEL91 Aaron Bertram, Lawrence Ein, and Robert Lazarsfeld, Vanishing theorems, a theorem of Severi, and the equations defining projective varieties, J. Amer. Math. Soc. 4 (1991), 587–602.
  • Ber78 G. M. Bergman, The diamond lemma for ring theory, Adv. in Math. 29 (1978), 178–218.
  • BM88 Dave Bayer and Ian Morrison, Standard bases and geometric invariant theory I. Initial ideals and state polytopes, J. Symb. Comput. 6 (1988), no. 2–3, 209–217, reprinted in [Rob89].
  • BM91 E. Bierstone and P. Milman, A simple constructive proof of canonical resolution of singularities, Effective methods in algebraic geometry (Castiglioncello, 1990), Progr. Math., vol. 94, Birkhauser Boston, 1991, pp. 11–30.
  • BCR91 A. M. Bigatti, M. Caboara, and L. Robbiano, On the computation of Hilbert-Poincare series, Applicable Algebra in Engineering, Communications, and Computing 2 (1991), 21–33.
  • Bri73 J. Briancon, Weierstrass prepare a la Hironaka, Astérisque 7,8 (1973), 67–73.
  • Bro87 W. D. Brownawell, Bounds for the degrees in the Nullstellensatz, Ann. of Math. (2) 126 (1987), 577–591.
  • BS87a Dave Bayer and Mike Stillman, A criterion for detecting mm-regularity, Invent. Math. 87 (1987), 1–11.
  • BS87b Dave Bayer and Mike Stillman, A theorem on refining division orders by the reverse lexicographic order, Duke Math. J. 55 (1987), no. 2, 321–328.
  • BS88 Dave Bayer and Mike Stillman, On the complexity of computing syzygies, J. Symb. Comput. 6 (1988), 135–147.
  • BS92a Dave Bayer and Mike Stillman, Macaulay: A system for computation in algebraic geometry and commutative algebra, 1982–1992, computer software available via anonymous ftp from zariski.harvard.edu.
  • BS92b Dave Bayer and Mike Stillman, Computation of Hilbert functions, J. Symb. Comput. 6 (1992), 31-50.
  • Buc65 B. Buchberger, Ph.D. thesis, Univ. Innsbrück, 1965.
  • Buc76 B. Buchberger, A theoretical basis for the reduction of polynomials to canonical forms, ACM SIGSAM Bull. 39 (1976), 19–29.
  • Buc79 B. Buchberger, A criterion for detecting unnecessary reductions in the construction of Gröbner bases, Symbolic and Algebraic Computation (Proceedings of EUROSAM 79), Lecture Notes in Computer Science, vol. 72, Springer-Verlag, 1979, pp. 3–21.
  • Can89 J. Canny, Generalized characteristic polynomials, Symbolic and Algebraic Computation (Proceedings of ISSAC 88), Lecture Notes in Computer Science, vol. 358, Springer-Verlag, 1989, pp. 293–299.
  • CG83 A. L. Chistov and D. Yu. Grigoriev, Subexponential-time solving systems of algebraic equations I, II, Steklov Mathematical Institute, Leningrad department, LOMI Preprints E-9-93, 0E-10-c83, 1983.
  • Cha91 Marc Chardin, Un algorithme pour le calcu des résultants, Effective methods in algebraic geometry (Castiglioncello, 1990), Progr. Math., vol. 94, Birkhauser Boston, 1991, pp. 47–62.
  • DD84 C. Dicrescenzo and D. Duval, Computations on curves, Lecture Notes in Computer Science, vol. 174, Springer-Verlag, 1984.
  • EG84 David Eisenbud and Shiro Goto, Linear free resolutions and minimal multiplicity, J. Algebra 88 (1984), no. 1, 89–133.
  • EHV92 David Eisenbud, Craig Huneke, and Wolmer Vasconcelos, Direct methods for primary decomposition, Invent. Math. (1992), to appear.
  • Eis92 David Eisenbud, Commutative algebra with a view toward algebraic geometry, 1992, in preparation.
  • Gal74 A. Galligo, A propos du theoreme de preparation de Weierstrass, Fonctions de Plusieurs Variables Complexes, Lecture Notes in Math., vol. 409, Springer-Verlag, 1974, pp. 543–579.
  • Gal79 A. Galligo, Theoreme de division et stabilite en geometrie analytique locale, Ann. Inst. Fourier (Grenoble) 29 (1979), 107–184.
  • GH91 Marc Giusti and Joos Heintz, Algorithmes—disons rapides—pour la decomposition d’une variete algebrique en composantes irreductibles et equidimensionnelles [“Fast” algorithms for the decomposition of an algebraic variety into irreducible and equidimensional components], Effective methods in algebraic geometry (Castiglioncello, 1990), Progr. Math., vol. 94, Birkhauser Boston, 1991, pp. 169–194.
  • Giu84 Marc Giusti, Some effectivity problems in polynomial ideal theory, EUROSAM 84), Lecture Notes in Computer Science, vol. 204, Springer-Verlag, 1984, pp. 159–171.
  • GLP83 L. Gruson, R. Lazarsfeld, and C. Peskine, On a theorem of Castelnuovo, and the equations defining space curves, Invent. Math. 72 (1983), 491–506.
  • GTZ88 P. Gianni, B. Trager, and G. Zacharias, Gröbner bases and primary decomposition of polynomial ideals, J. Symb. Comput. 6 (1988), no. 2–3, 149–167, reprinted in [Rob89].
  • Her26 Grete Hermann, Die Frage der endlich vielen Schritte in der Theorie der Polynomideale, Math. Ann. 95 (1926), 736–788.
  • Hir64 H. Hironaka, Resolution of singularities of an algebraic variety over a field of characteristic zero: I, II, Ann. of Math. (2) 79 (1964), 109–326.
  • Kol88 János Kollár, Sharp effective Nullstellensatz, J. Amer. Math. Soc. 1 (1988), no. 4, 963–975.
  • Lak91 Y. N. Lakshman, A simple exponential bound on the complexity of computing gröbner bases of zero-dimensional ideals, Effective methods in algebraic geometry (Castiglioncello, 1990), Progr. Math., vol. 94, Birkhauser Boston, 1991, pp. 227–234.
  • Laz87 Robert Lazarsfeld, A sharp Castelnuovo bound for smooth surfaces, Duke Math. J. 55 (1987), 423–429.
  • Len92 H. W. Lenstra, Jr., Algorithms in algebraic number theory, Bull. Amer. Math. Soc. (N.S.) 26 (1992), no. 2, 211–244.
  • Lip84 Joseph Lipman, Dualizing sheaves, differentials and residues on algebraic varieties, Astérisque, vol. 117, 1984.
  • LL91 Y. N. Lakshman and D. Lazard, On the complexity of zero-dimensional algebraic systems, Effective methods in algebraic geometry (Castiglioncello, 1990), Progr. Math., vol. 94, Birkhauser Boston, 1991, pp. 217–225.
  • LLL82 A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász, Factoring polynomials with rational coefficients, Math. Ann. 261 (1982), 515–534.
  • Mac27 F. S. Macaulay, Some properties of enumeration in the theory of modular systems, Proc. London Math. Soc. 26 (1927), 531–555.
  • MM82 Ernst W. Mayr and Albert R. Meyer, The complexity of the word problem for commutative semigroups and polynomial ideals, Adv. in Math. 46 (1982), 305–329.
  • MM83 H. Michael Möller and Ferdinando Mora, Upper and lower bounds for the degree of Gröbner bases, Computer Algebra (EUROCAL 83), Lecture Notes in Computer Science, vol. 162, Springer-Verlag, 1983, pp. 157–167.
  • MM86 H. Michael Möller and Ferdinando Mora, New constructive methods in classical ideal theory, J. Algebra 100 (1986), no. 1, 138–178.
  • MR88 T. Mora and L. Robbiano, The Gröbner fan of an ideal, J. Symb. Comput. 6 (1988), no. 2–3, 183–208, reprinted in [Rob89].
  • Mum66 David Mumford, Lectures on curves on an algebraic surface, Princeton University Press, Princeton, New Jersey, 1966.
  • Mum70a David Mumford, Abelian varieties, Oxford University Press, Oxford, 1970.
  • Mum70b David Mumford, Varieties defined by quadratic equations, Questions on Algebraic Varieties, Centro Internationale Matematica Estivo, Cremonese, Rome, 1970, pp. 29–100.
  • MW83 D. W. Masser and G. Wüstholz, Fields of large transcendence degree generated by values of elliptic functions, Invent. Math. 72 (1983), 407–464.
  • Pin86 Henry C. Pinkham, A Castelnuovo bound for smooth surfaces, Invent. Math. 83 (1986), 491–506.
  • Ran90 Ziv Ran, Local differential geometry and generic projections of threefolds, J. Differential Geom. 32 (1990), 131–137.
  • Rav90 M. S. Ravi, Regularity of ideals and their radicals, Manuscripta Math. 68 (1990), 77–87.
  • Ric74 F. Richman, Constructive aspects of Noetherian rings, Proc. Amer. Math. Soc. 44 (1974), 436–441.
  • Rob85 L. Robbiano, Term orderings on the polynomial ring, Proceedings of EUROCAL ’85 (Linz), Lecture Notes in Computer Science, vol. 204, Springer-Verlag, 1985, pp. 513–517.
  • Rob89 Lorenzo Robbiano (ed.), Computational aspects of commutative algebra, Academic Press, 1989, ISBN 0-12-589590-9.
  • Sch80 Frank-Olaf Schreyer, Die Berechnung von Syzygien mit dem verallgemeinerten Weierstrass’schen Divisionssatz, Diplomarbeit am Fachbereich Mathematik der Universität Hamburg, 1980.
  • Sch91 Frank-Olaf Schreyer, A standard basis approach to syzygies of canonical curves, J. Reine Angew. Math. 421 (1991), 83–123.
  • Ser55 J.-P. Serre, Faisceaux algebrique coherents, Ann. of Math. (2) 61 (1955), 197–278.
  • Spe77 D. Spear, A constructive approach to commutative ring theory, Proceedings of the 1977 MACSYMA Users’ Conference, NASA CP-2012, 1977, pp. 369–376.
  • Tei89 Jeremy Teitelbaum, On the computational complexity of the resolution of plane curve singularities, Symbolic and algebraic computation (Rome, 1988), Lecture Notes in Computer Science, vol. 358, Springer, 1989, pp. 285–292.
  • Tei90 Jeremy Teitelbaum, The computational complexity of the resolution of plane curve singularities, Math. Comp. 54 (1990), no. 190, 797–837.
  • Tri78 W. Trinks, Über B. Buchberger’s Verfahren, Systeme algebraischer Gleichungen zu lösen, J. Number Theory 10 (1978), 475–488.
  • Zac78 G. Zacharias, Bachelor’s thesis, Mass. Inst. of Technology, 1978.
  • Zar69 Oscar Zariski, An introduction to the theory of algebraic surfaces, Lecture Notes in Math., vol. 83, Springer-Verlag, 1969.
  • ZS76 Oscar Zariski and Pierre Samuel, Commutative algebra, Volumes I, II, Graduate texts in mathematics, vol. 28–29, Springer-Verlag, 1975–1976.