These notes present some ideas that I believe almost everyone who uses the integral as a tool takes for granted, but which are often not taught systematically (in part because they seem too obvious) and which many students seem to find very difficult to learn.

The basic principles here apply just as easily to double and triple integrals as to simple integrals.

The purpose of these notes is not to **prove**
things, but to **explain** them.
There is nothing in these notes that has any
claim to mathematical originality.
For the most part,
what is here is essentially just basic measure theory.
In fact, if I were still doing the style of mathematics
I used to,
I might write ``Definition:
By an *application of integration*
we mean an absolutely continuous measure on R^{n}.''
(Fortunately, I don't write mathematics like that anymore.
But a sentence like that does make the point
that finding the right formula for most applications
of integration is actually just a matter
of finding the appropriate measure;
one need not re-invent the integral
by constructing a sequence of step functions
that converges to the given value.)

An integral measures the cumulative effect that a function produces over a finite closed interval [a,b] (or over a compact subregion of the plane or 3-space in the case of double or triple integrals).

The first question addressed by these notes is how to tell whether a scientific relationship -- such as that between velocity and distance, or between force and work, or between pressure and force -- can be expressed by an integral, as contrasted with scientific rules such as Ohm's Law, Hooke's Law, or Newton's Second Law of Motion, which can never be written as integrals. (I have deliberately chosen the word ``relationship'' in these notes because it is not a standard mathematical term. In particular, I wanted to avoid intimidating terms such as ``functional.'')

The crucial idea here is that a relationship expressed by
an integral is always **cumulative**.
For instance, if a time t_{2} is intermediate
between times t_{1} and t_{3},
then the distance resulting from a velocity function
v(t) applied between t_{1} and t_{3}
will be the sum of the distance traveled between time
t_{1} and t_{2} and the distance
between t_{2} and t_{3}.

Likewise, if a pressure p(x,y) is applied to a finite region in the plane, and we think of the region as being made up of two (non-overlapping) pieces, then the force resulting on the entire region will be the sum of the forces on the two separate pieces.
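
In symbols, the two examples above can be written as follows. This is only a restatement of the text, under the assumption that distance and force are given by the usual integrals of v(t) and p(x,y); the region names R, R_{1}, R_{2} are introduced here for illustration.

```latex
% Distance is additive over adjacent time intervals (t_1 < t_2 < t_3):
\int_{t_1}^{t_3} v(t)\,dt \;=\; \int_{t_1}^{t_2} v(t)\,dt \;+\; \int_{t_2}^{t_3} v(t)\,dt

% Force is additive over non-overlapping pieces R_1, R_2 making up the region R:
\iint_{R} p(x,y)\,dA \;=\; \iint_{R_1} p(x,y)\,dA \;+\; \iint_{R_2} p(x,y)\,dA
```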

In mathematical terms,
we are saying that a scientific relationship which
can be given by an integral is always
**finitely additive** with respect to the
relevant sets.
In courses in Measure Theory, it is shown
that being finitely additive over sets
is not quite an adequate condition
to guarantee that a relationship such as is considered here
can be expressed as an integral.
However, in practice it is an extremely good rule of thumb
that any relationship
where a quantity is determined by the values
a function takes on a set
in a cumulative way
will in fact be expressible as an integral.
I call this the **First Rule of Thumb**.

Once one has determined that a relationship is expressible by an integral, the next question is how to find the correct integral formula. In many cases this is obvious, and the quantity to be computed is simply the integral of the function in question. In other cases, though, the formula is a little more complicated. The most standard example of such a case is the ``shell method'' for determining the volume of a solid of revolution.

In graduate analysis courses,
one learns that a relationship
which is expressible by an integral
is determined by what is called a **measure**,
which is a process of assigning a numerical
value to every (reasonable) set.
In practical terms,
what this means is that a relationship expressible
by an integral
is completely determined once we know
how it works in those cases when the relevant function
is a constant function.
(In technical terms, we need to know how to
evaluate the given relationship
in the case of ``characteristic functions.''
But on the level of sophistication -- or lack thereof --
of these notes,
this simply comes down to the fact that the
proposed formula must give the correct result
in the case of step functions.
A step function is nothing except a function
made by piecing together constant functions.)

This gives us the **Second Rule of Thumb:**
When looking for a formula expressed as an integral,
it's a very good bet that one which gives the
correct answer for constant functions
will in fact be correct.
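
As an illustration, the rule can be checked against the shell method mentioned earlier. The setup below is an assumption made for the sake of the example (the region under y = f(x) for 0 <= a <= x <= b, revolved about the y-axis); it is a sketch of the constant-function test, not a derivation taken from the notes.

```latex
% Constant case f(x) = h: the solid is a cylindrical shell of height h,
% inner radius a and outer radius b, so its volume is known directly:
V = \pi (b^2 - a^2)\, h

% Candidate formula (the shell method) evaluated on the same constant function:
\int_a^b 2\pi x\, h \,dx \;=\; 2\pi h \cdot \frac{b^2 - a^2}{2} \;=\; \pi (b^2 - a^2)\, h

% The two agree, so the Second Rule of Thumb suggests
V = \int_a^b 2\pi x\, f(x)\,dx
```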

Unfortunately, there do exist important exceptions to the Second Rule of Thumb. The most obvious such exception is the formula for the length of the graph of a function f(x) between points x=a and x=b. In this case, one will not be able to find a correct formula by looking for one which gives the correct answer in the case when the function f(x) is constant.

In order to understand **when** the Second Rule of Thumb is valid
(as it usually is),
the notes proceed to a discussion of **why**
it usually works.
Namely, as already stated,
if a relationship is cumulative
(additive over disjoint sets)
and a formula gives the correct answer
for constant functions,
then it will also give the correct answer
for step functions.
But every Riemann integrable function
is a limit of step functions.
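
A small numerical sketch of this argument, under assumptions that are not in the notes: the velocity v(t) = 3t^2 on [0, 2] is a hypothetical example, and the step function is built by sampling v at the midpoint of each subinterval. The formula "distance = velocity times elapsed time" is applied to each constant piece and the results are added, which is all that cumulativeness and correctness for constants provide.

```python
def step_function_integral(v, a, b, n):
    """Apply 'value times width' to each constant piece of an n-step
    approximation of v on [a, b] and add the pieces up."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * width
        total += v(midpoint) * width  # formula for a constant function
    return total


def velocity(t):
    # Hypothetical velocity chosen for illustration: v(t) = 3 t^2
    return 3.0 * t * t


# Exact distance on [0, 2], from the antiderivative t^3
exact_distance = 2.0 ** 3 - 0.0 ** 3

for n in (4, 40, 400):
    approx = step_function_integral(velocity, 0.0, 2.0, n)
    print(n, approx, abs(approx - exact_distance))
```

The printed error shrinks as the number of steps grows, which is the behavior the argument above relies on.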

One can now see why the Second Rule of Thumb
breaks down in the case of the formula for the length
of a graph.
The relationship between a function
and the length of its graph is not
**stable** (in my terminology)
with respect to the function.
(I think the usual word ``continuous''
in this context would be
confusing for beginning calculus students.)
Two functions can be extremely close to each other,
to the extent that the eye can't even distinguish
the two graphs,
and yet one graph can be far longer than the other.
In fact, no matter how closely one approximates
a given function by a step function,
the length of the graph of the step function
will never be a good approximation for the length of
the graph of the original function.
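
The failure can be seen numerically in a minimal sketch, assuming the hypothetical example f(x) = x on [0, 1], whose graph has length sqrt(2). The step function samples f at midpoints; the `include_jumps` flag is an illustrative choice for whether the vertical jumps between steps are counted as part of the graph.

```python
import math


def step_graph_length(f, a, b, n, include_jumps=False):
    """Length of the graph of an n-step approximation of f on [a, b]."""
    width = (b - a) / n
    heights = [f(a + (i + 0.5) * width) for i in range(n)]
    # Each constant piece is a horizontal segment of length `width`.
    horizontal = n * width
    # Optionally count the vertical jumps between consecutive steps.
    vertical = sum(abs(h2 - h1) for h1, h2 in zip(heights, heights[1:]))
    return horizontal + (vertical if include_jumps else 0.0)


def line(x):
    # Hypothetical example function f(x) = x
    return x


true_length = math.sqrt(2.0)  # length of the segment from (0, 0) to (1, 1)

for n in (10, 100, 1000):
    flat = step_graph_length(line, 0.0, 1.0, n)
    with_jumps = step_graph_length(line, 0.0, 1.0, n, include_jumps=True)
    print(n, flat, with_jumps, true_length)
```

With either convention the computed length stays away from sqrt(2) as n grows: the horizontal pieces always total b - a = 1, and counting the jumps pushes the total toward 2, even though the step function itself gets uniformly closer to f.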

Intuitively, it is clear that most cumulative scientific relationships are stable in the sense indicated above. This is why the Second Rule of Thumb is so widely useful.

In particular, it can be easily shown
that any **increasing** cumulative relationship
(in the sense that making the function larger
will increase the resulting value)
is stable in this sense.
(This is essentially the theorem in Bourbaki
[which I here misstate]
that every positive linear functional
is representable as an integral.
The notes here stop just short of actually proving that result
for Riemann-integrable functions.
The proof is essentially given in many calculus books
when the integral is first defined,
although the result is usually not stated in this generality.)

Most scientific relationships tend to be increasing. For instance, increasing velocity will increase the resulting distance; increasing force will result in greater work; increasing pressure will produce a greater force on a given region. In all such cases, the Second Rule of Thumb will be valid, and to find the correct integral formula for such relationships one needs only find one which gives the correct answer for constant functions.
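
For the three examples just named, the constant-function case and the resulting integral formula can be set side by side. The symbols v_0, F_0, p_0 for the constant values and R for the region are introduced here for illustration only.

```latex
% Constant velocity v_0 over the time interval [a, b]:
\text{distance} = v_0\,(b - a)
  \quad\Longrightarrow\quad \text{distance} = \int_a^b v(t)\,dt

% Constant force F_0 along a displacement from x = a to x = b:
\text{work} = F_0\,(b - a)
  \quad\Longrightarrow\quad \text{work} = \int_a^b F(x)\,dx

% Constant pressure p_0 on a plane region R:
\text{force} = p_0 \cdot \operatorname{area}(R)
  \quad\Longrightarrow\quad \text{force} = \iint_R p(x,y)\,dA
```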

Finally, the notes take up the subject of the derivative. According to the Fundamental Theorem of Calculus, differentiation and integration are reverse processes (under the hypothesis that the relevant function is continuous). Thus the same principles given here for finding formulas in the form of integrals apply just as well to finding those in the form of derivatives.

Furthermore, for every integral formula there is a corresponding derivative one, and vice-versa. Since derivatives are usually easier to think about than integrals, this means that one can often most easily figure out how to set up an integral by first looking for the corresponding formula in the form of a derivative.

As an example, this approach is used here to find, fairly convincingly and simply, the formula for the length of the graph of a function. The interesting thing is that, unlike the conventional derivation, one can do this without having a rigorous definition for the length of a curve, provided that one takes as axioms simple properties of arc length that students will universally accept.
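
In outline, such a derivation might run as follows. The notes' own axioms are not reproduced in this summary, so the sketch assumes one plausible property: over a short interval, the length of the graph is well approximated by the length of the chord. The function s(x), the length of the graph from a to x, is a name introduced here.

```latex
% Chord approximation over a short interval [x, x + h]:
s(x + h) - s(x) \;\approx\; \sqrt{h^2 + \bigl(f(x + h) - f(x)\bigr)^2}

% Divide by h and let h -> 0 (assuming f is differentiable):
s'(x) \;=\; \lim_{h \to 0} \sqrt{1 + \left(\frac{f(x + h) - f(x)}{h}\right)^{2}}
      \;=\; \sqrt{1 + f'(x)^2}

% The corresponding integral formula, by the Fundamental Theorem of Calculus
% (assuming f' is continuous):
\text{length} \;=\; s(b) - s(a) \;=\; \int_a^b \sqrt{1 + f'(x)^2}\,dx
```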