# Functions, Operators, and Derivatives

A quick trip through some essential concepts and language.

### Terminology

Terminology in pure mathematics and in applied mathematics and theoretical physics is not always in perfect agreement. This account seeks to bridge the gap between the two. It is necessary to define terms. I lean toward the terminology and concepts of pure mathematics, because in pure mathematics everything is accurately and precisely defined. A proper understanding of physics requires a proper understanding of the application of mathematics in physics. Use of other language can lead to misapplication. The foundation of mathematics lies in definitions made in set theory. Set theory tends to reduce to hieroglyphics, designed to eliminate any possible trace of ambiguity. That can be unnecessarily difficult to read, but the more intuitive treatments of applied mathematics often misstate what is said in pure mathematics. To clarify what is said in a rigorous treatment, I will translate the formal language of pure mathematics into less formal language appropriate to applied mathematics. It will be sufficient to work with an intuitive understanding that a set is a finite collection of objects.

### Functions and Graphs

A function, or a *map*, may be defined between any two sets, X and Y. Often, in applied mathematics, X and Y are sets of scalars. In pure mathematics, generally, as a matter of policy, we do not specify what we are talking about. I do not define a function to mean necessarily a numerical function. Any sets, X and Y, may be used. In pure mathematics, X and Y can be infinite sets, for example real or complex numbers, but, in the application of mathematics to physical problems, it is sufficient to use finite sets containing a range and density of numbers appropriate to the accuracy of physical measurement.

**Definition:** A *function*, *f*: X → Y, is a set of ordered pairs, (*x*, *y*) with *x* in X and *y* in Y, such that, for every *x* in X, there is a unique pair (*x*, *y*) in *f*.

It follows that, for each *x*, *f* specifies a particular *y*. We say that *f* is a function of *x*, and we write, *f*: *x* → *y*, or *y* = *f*(*x*). We should avoid the common notation, *f* = *f*(*x*). It has no merit over correct notation, *f*: *x* → *f*(*x*), and literally means that a function is equal to one of its values. Although, in context, what is meant is usually clear enough, the use of such language leads to a lack of clarity of thought, which, in turn, leads to an increasing level of difficulty with more sophisticated concepts. I have known highly reputable physicists make absurd errors in differential geometry because they do not have the language to distinguish a function of spacetime from the value of that function at a particular point.
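As an illustration of the definition, a finite function can be stored literally as a set of ordered pairs. The following is a minimal Python sketch; the sets `X` and `Y`, the pairs in `f`, and the helper names `is_function` and `value` are assumptions made for the example only.

```python
# A function f: X -> Y modelled literally as a set of ordered pairs.
X = {1, 2, 3}
Y = {"a", "b"}
f = {(1, "a"), (2, "b"), (3, "a")}


def is_function(pairs, domain, codomain):
    """True if every x in domain occurs in exactly one pair (x, y), with y in codomain."""
    if any(x not in domain or y not in codomain for x, y in pairs):
        return False
    firsts = [x for x, _ in pairs]
    return all(firsts.count(x) == 1 for x in domain)


def value(pairs, x):
    """Return the unique y with (x, y) in pairs, that is, the value f(x)."""
    (y,) = [b for a, b in pairs if a == x]
    return y


print(is_function(f, X, Y))               # f satisfies the definition
print(is_function(f | {(1, "b")}, X, Y))  # two pairs for x = 1, so not a function
print(value(f, 2))                        # a value of f, not f itself
```

The sketch keeps the function `f` (a set of pairs) and a value `value(f, 2)` (a member of Y) as objects of different kinds, which is the distinction the notation *f*: *x* → *f*(*x*) is meant to preserve.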

A function is continuous if its graph can be plotted without breaks. X and Y are not necessarily sets of scalars.

**Definition:** The *graph* of a function *f*: X → Y is the set of ordered pairs, (*x*, *f*(*x*)).

**Definition:** If, for sufficiently small *dx* and some *l* in Y, *f*(*a + dx*) − *l* is negligible, then *l* is the *limit of f*(*x*) *as x tends to a*. We write *f*(*x*) → *l* as *x → a*, or

$$\lim_{x \to a} f(x) = l.$$

**Definition:** If, for sufficiently large *x* and some *l* in Y, *f*(*x*) − *l* is negligible, then *l* is the *limit of f*(*x*) *as x tends to infinity*. We write *f*(*x*) → *l* as *x → ∞*, or

$$\lim_{x \to \infty} f(x) = l.$$

**Definition:** *f*: X → Y is *continuous* at *a*, if *f*(*x*) → *f*(*a*) as *x → a*.

**Definition:** *f*: X → Y is *continuous*, if it is continuous at all *x* in X.

**Definition:** A *curve* is a continuous function from a one dimensional space to an *n*-dimensional coordinate space.

In spaces with curvature, coordinates are not vectors; a curve is not, in general, a vector valued function. A curve may be expressed in terms of coordinate functions of a parameter, *t*, *x*^{i}: *t* → *x*^{i}(*t*), or *x*: *t* → *x*(*t*).

### Function Composition

Function composition means acting with one function followed by another.

**Definition:** If *f*: X → Y with *f*: *x* → *f*(*x*) and *g*: Y → Z with *g*: *y* → *g*(*y*) then we define the *composite* function *g f*: X → Z, by *g f*: *x* → *gf*(*x*) = *g*(*f*(*x*)).

Sometimes the notation *g* ∘ *f* is used for *g f*, but I have never known it necessary. Note the reversal in order; *g f* means that *f* is performed first, then *g*, as indicated by brackets in the definition.
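The reversal of order can be seen in a short Python sketch. The helper `compose` and the two sample functions are hypothetical, chosen only so that the two orders of composition give different results.

```python
def compose(g, f):
    """Return the composite g f, which applies f first and then g."""
    return lambda x: g(f(x))


def f(x):
    return x + 1  # f: x -> x + 1


def g(y):
    return 2 * y  # g: y -> 2y


gf = compose(g, f)  # x -> g(f(x)) = 2(x + 1)
fg = compose(f, g)  # x -> f(g(x)) = 2x + 1

print(gf(3), fg(3))
```

Here `gf(3)` evaluates `f` first and gives 8, while `fg(3)` evaluates `g` first and gives 7, so the order of composition matters in general.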

### Identity and Inverse

**Definition:** The *identity* function, 1 : X → X, is the set of pairs of the form (*x*, *x*).

Sometimes *I* is used for the identity function. I wish to keep that notation for operators describing interactions between particles. In spite of the slight ambiguity, there is little prospect of confusion. In function composition, the identity 1 behaves like the number 1 in multiplication; for any *f*: X → Y,

1 *f* = *f* 1 = *f*.

For a function *f*: X → Y, the *inverse* function, *f*^{−1}, exists, if and only if there is a unique pair (*x*, *y*) in *f* for every member *y* of Y.

**Definition:** When the set of ordered pairs, (*y*, *x*), found by reversing the pairs in *f*: X → Y, is a function, it is the *inverse* function, *f*^{−1}: Y → X.

If the functions *f*: X → Y and *g*: Y → Z have inverses, then

(*g f*)^{−1} = *f*^{−1} *g*^{−1}.

Clearly, the composition of any function *f* with its inverse, *f*^{−1}, when it exists, is the identity function, *f*^{−1} *f* = *f f*^{−1} = 1.
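For finite sets these statements can be checked directly by reversing pairs. The following Python sketch assumes small hypothetical sets `X`, `Y`, `Z` and helper names `inverse` and `compose` that are not part of the text.

```python
def inverse(pairs, codomain):
    """Reverse the pairs of f; return None if the result is not a function on codomain."""
    reversed_pairs = {(y, x) for x, y in pairs}
    firsts = [y for y, _ in reversed_pairs]
    if all(firsts.count(y) == 1 for y in codomain):
        return reversed_pairs
    return None


def compose(g, f):
    """Composite g f as a set of pairs: f acts first, then g."""
    return {(x, z) for x, y in f for y2, z in g if y == y2}


X, Y, Z = {1, 2, 3}, {"a", "b", "c"}, {10, 20, 30}
f = {(1, "a"), (2, "b"), (3, "c")}
g = {("a", 10), ("b", 20), ("c", 30)}

f_inv, g_inv = inverse(f, Y), inverse(g, Z)
identity_X = {(x, x) for x in X}

print(compose(f_inv, f) == identity_X)                     # f^-1 f is the identity on X
print(inverse(compose(g, f), Z) == compose(f_inv, g_inv))  # (g f)^-1 = f^-1 g^-1
print(inverse({(1, "a"), (2, "a"), (3, "b")}, Y))          # no inverse exists
```

The last call returns `None` because reversing the pairs gives two pairs for "a" and none for "c", so the reversed set is not a function on Y.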

### Functionals and Operators

**Definition:** A *functional* is a function from a vector space to a set of scalars.

For example, bras are functionals on a vector space of kets.

**Definition:** An *operator* is a function from a vector space to a vector space.

The vector spaces used to define an operator can be, but are not necessarily, the same. Since scalars are a one dimensional vector space, functionals and functions can both be regarded as operators. For the operators *O*_{1}: V_{1} → V_{2} and *O*_{2}: V_{1} → V_{2}, there is a natural definition of addition and multiplication by scalars, giving operators the structure of a vector space.

**Definition:** For any vector, *x* in V_{1}, and for any scalars, *a* and *b*,

$$(aO_1 + bO_2)(x) = a\,O_1(x) + b\,O_2(x).$$

Thus functions, functionals and operators can be treated as vectors. In physics, we only require answers at a finite range and resolution. It is sufficient to regard them as *n*-dimensional vectors, where *n* is a large, but unspecified, finite number. A lower bound on the value of *n* will depend on the problem in hand. If we need an actual value of *n*, for example in programming a computer solution, we simply choose some value greater than this lower bound (providing sufficient computer power is available).

### Linear Operators

A linear operator is one which preserves vector addition and multiplication by scalars.

**Definition:** (*Linearity*) If V_{1} and V_{2} are vector spaces, then *O*: V_{1} → V_{2} is *linear* if for any vectors, *x* and *y* in V_{1}, and for any scalars, *a* and *b*,

$$O(ax + by) = a\,O(x) + b\,O(y).$$

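Let *O* be a linear operator from an *n*-dimensional Hilbert space H_{1}, with basis elements $|e_j\rangle$, to an *m*-dimensional Hilbert space, H_{2}, with basis elements $|f_i\rangle$. For any $|\psi\rangle$ in H_{1} and $|\phi\rangle$ in H_{2}, we can form the inner product in H_{2} between $|\phi\rangle$ and $O|\psi\rangle$. Using the resolution of unity, together with linearity,

$$\langle\phi|O|\psi\rangle = \sum_{i=1}^{m}\sum_{j=1}^{n} \langle\phi|f_i\rangle\,\langle f_i|O|e_j\rangle\,\langle e_j|\psi\rangle.$$

*O* can be regarded as an *m* × *n* matrix,

$$O_{ij} = \langle f_i|O|e_j\rangle.$$

When H_{1} and H_{2} are the same Hilbert space (or of the same dimension), $O_{ij}$ is a square matrix.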
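The passage from a linear operator to its matrix can be sketched in plain Python. The example assumes orthonormal standard bases and a hypothetical operator `O` from a 3-dimensional to a 2-dimensional space; the helper names are illustrative only.

```python
def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def basis(n):
    """Standard orthonormal basis of an n-dimensional space."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def O(x):
    """A hypothetical linear operator from a 3-dimensional to a 2-dimensional space."""
    return [x[0] + 2j * x[1], 3 * x[2] - x[0]]


n, m = 3, 2
e, f = basis(n), basis(m)

# Matrix elements: inner product of the i-th basis element of the target space
# with O acting on the j-th basis element of the source space (m rows, n columns).
matrix = [[inner(f[i], O(e[j])) for j in range(n)] for i in range(m)]


def apply(M, x):
    """Matrix acting on a column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


x = [1 + 1j, 2, -1j]
print(matrix)
print(apply(matrix, x) == O(x))  # the matrix reproduces the action of O
```

The matrix has *m* = 2 rows and *n* = 3 columns, and multiplying it into the components of a vector gives the same result as applying `O` directly, which is what linearity guarantees.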

### The Commutator and Anticommutator

Commutation and anticommutation relations between operators play an important role in quantum theory.

**Definition:** The *commutator* between operators *A* and *B* is [*A*, *B*] = *AB* − *BA*.

**Definition:** The *anticommutator* between operators *A* and *B* is {*A*, *B*} = *AB* + *BA*.
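Both definitions are easy to try out on small matrices. The sketch below is a plain Python illustration; the Pauli matrices and the integer matrices `A`, `B`, `C` are chosen here only as convenient test operators, and the helper names are assumptions.

```python
def matmul(A, B):
    """Product AB of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B, sign=1):
    """Entrywise A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def commutator(A, B):
    return add(matmul(A, B), matmul(B, A), sign=-1)


def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))


# Pauli matrices, used only as sample operators.
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
zero = [[0, 0], [0, 0]]

print(commutator(sx, sy))      # equal to 2i times diag(1, -1), so sx and sy do not commute
print(anticommutator(sx, sy))  # the zero matrix

# Antisymmetry and the Jacobi identity hold for any square matrices.
A, B, C = [[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [1, 1]]
jacobi = add(add(commutator(commutator(A, B), C),
                 commutator(commutator(B, C), A)),
             commutator(commutator(C, A), B))
print(commutator(A, B) == add(zero, commutator(B, A), sign=-1))
print(jacobi == zero)
```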

In general the order in which functions and operators act affects the result; [*A*, *B*] ≠ 0. It is straightforward to show that, for any scalars, *a* and *b*, and for any operators, *A*, *B*, *C*: H → H, the commutator satisfies the following relations, which, together with the fact that operators form a vector space, define a Lie algebra.

**Bilinearity:**

[*aA + bB*, *C*] = *a*[*A*, *C*] + *b*[*B*, *C*].

[*C*, *aA + bB*] = *a*[*C*, *A*] + *b*[*C*, *B*].

**Anticommutativity or skew-symmetry:**

[*A*, *B*] = −[*B*, *A*].

**The Jacobi Identity:**

[[*A*, *B*], *C*] + [[*B*, *C*], *A*] + [[*C*, *A*], *B*] = 0.

### Hermitian Conjugation

For any $|\phi\rangle$ in a Hilbert space H_{2}, and any linear operator *O*: H_{1} → H_{2} there is a unique bra, $\langle\phi|O$, in the dual space of H_{1}, such that, for any $|\psi\rangle$ in H_{1},

$$\big(\langle\phi|O\big)|\psi\rangle = \langle\phi|\big(O|\psi\rangle\big).$$

**Definition:** The Hermitian adjoint, *Hermitian conjugate*, or conjugate transpose of the linear operator *O*: H_{1} → H_{2} is the linear operator, *O*^{†}: H_{2} → H_{1}, such that $O^\dagger|\phi\rangle$ is the ket corresponding to $\langle\phi|O$.

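Thus, for $|\psi\rangle$ in H_{1} and $|\phi\rangle$ in H_{2}, the inner product between $|\phi\rangle$ and $O|\psi\rangle$ is the same as the inner product between $O^\dagger|\phi\rangle$ and $|\psi\rangle$. Clearly, for any operators *O*_{1}: H_{1} → H_{2} and *O*_{2}: H_{2} → H_{3}, (*O*_{2} *O*_{1})^{†} = *O*_{1}^{†} *O*_{2}^{†}.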

Since reversing an inner product is complex conjugation, *O*^{†} is found by taking the complex conjugate of each matrix element and taking the transpose. The Hermitian conjugate may thus be regarded as a generalisation of the complex conjugate. Sometimes a Hermitian conjugate is taken to mean that H_{1} and H_{2} are the same Hilbert space, but the definition is also useful when H_{1} and H_{2} are different Hilbert spaces.
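The conjugate transpose, and its defining property for inner products, can be checked on a rectangular matrix. This is a minimal Python sketch; the matrix `O` (3-dimensional to 2-dimensional), the kets `psi` and `phi`, and the helper names are hypothetical.

```python
def dagger(M):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]


def apply(M, x):
    """Matrix acting on a column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


# A hypothetical operator from a 3-dimensional space to a 2-dimensional space.
O = [[1 + 1j, 2, 0],
     [3j, 1 - 2j, 4]]
psi = [1, 1j, 2 - 1j]  # a ket in the 3-dimensional space
phi = [2 + 1j, -1j]    # a ket in the 2-dimensional space

lhs = inner(phi, apply(O, psi))          # inner product of phi with O psi
rhs = inner(apply(dagger(O), phi), psi)  # inner product of O-dagger phi with psi
print(abs(lhs - rhs) < 1e-12)
```

Note that `dagger(O)` has 3 rows and 2 columns, so it acts in the opposite direction to `O`, which is why the definition remains meaningful when the two spaces differ.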

### Hermitian Operators

**Definition:** An operator, *O*: H → H, is *Hermitian*, or *self-adjoint*, if *O*^{†} = *O*.

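Inner products formed with Hermitian operators are real; for any $|\psi\rangle$,

$$\langle\psi|O|\psi\rangle^{*} = \langle\psi|O^\dagger|\psi\rangle = \langle\psi|O|\psi\rangle.$$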
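A quick numerical check, sketched in Python under the assumption of a hypothetical 2 × 2 Hermitian matrix `H` and ket `psi`: the inner product of `psi` with `H` acting on `psi` has no imaginary part.

```python
def dagger(M):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]


def apply(M, x):
    """Matrix acting on a column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


# A hypothetical Hermitian matrix: equal to its own conjugate transpose.
H = [[2, 1 - 1j],
     [1 + 1j, -3]]
psi = [1 + 2j, 3 - 1j]

expectation = inner(psi, apply(H, psi))
print(dagger(H) == H)
print(abs(expectation.imag) < 1e-12)
```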

### Unitary Operators

**Definition:** An operator, *O*: H_{1} → H_{2}, is *unitary*, if its Hermitian conjugate is also its inverse, *O*^{†} *O* = *OO*^{†} = 1.

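For a unitary operator *O*, and any $|\phi\rangle$ and $|\psi\rangle$ in H_{1},

$$\langle O\phi|O\psi\rangle = \langle\phi|O^\dagger O|\psi\rangle = \langle\phi|\psi\rangle.$$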
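That a unitary operator leaves inner products unchanged can be tested numerically. The Python sketch below assumes a particular 2 × 2 unitary matrix `U` and two arbitrary kets, all chosen for illustration, and compares results within a floating-point tolerance.

```python
import math


def apply(M, x):
    """Matrix acting on a column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


s = 1 / math.sqrt(2)
# A hypothetical unitary matrix: its columns are orthonormal.
U = [[s, s],
     [1j * s, -1j * s]]
phi = [1 + 2j, -1j]
psi = [0.5, 3 - 1j]

before = inner(phi, psi)
after = inner(apply(U, phi), apply(U, psi))
print(abs(before - after) < 1e-12)
```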

### The Differential Operator

A derivative, when it exists, is an approximation to the slope, or gradient, of the tangent to a graph, using a value of dx so small that making it any smaller makes no practical difference to the result. The differential operator maps a function to its derivative.

**Definition:** If X is a set of scalars, Y is a set of scalars or a vector space, and *f*: X → Y, then the *derivative*, *f '*: X → Y, when it exists, is given by

$$f'(x) = \lim_{dx \to 0} \frac{f(x + dx) - f(x)}{dx}.$$

**Definition:** *f* is *differentiable*, if its derivative exists for all *x* in X.

**Definition:** The *differential operator* is D_{x}: *f → f '*.

It is straightforward to show that D_{x} is a linear operator. Newton’s dot notation for the derivative with respect to time may be used for a curve parameterised by *t*,

$$\dot{x}(t) = \frac{dx}{dt}(t).$$

The *unit tangent vector* for a curve is found by normalising the result of differentiating the curve with respect to the parameter,

$$\frac{\dot{x}(t)}{|\dot{x}(t)|}.$$
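The idea of a differential operator acting on functions, with a value of *dx* small enough that reducing it further makes no practical difference, can be sketched in Python. The step size, the sample functions and the name `D` are assumptions for illustration, and the result is a finite-difference approximation rather than an exact derivative.

```python
def D(f, dx=1e-6):
    """Map a function f to an approximation of its derivative, using a small dx."""
    return lambda x: (f(x + dx) - f(x)) / dx


def f(x):
    return x ** 3


def g(x):
    return 2 * x


a, b = 3.0, -1.0


def h(x):
    """The linear combination a f + b g."""
    return a * f(x) + b * g(x)


x0 = 2.0
print(D(f)(x0))                                # close to 3 * x0**2 = 12
print(D(h)(x0), a * D(f)(x0) + b * D(g)(x0))   # linearity: the two values agree closely
```

`D` takes a function and returns a function, which mirrors the distinction between the operator D_{x}, the derivative *f '*, and the value *f '*(*x*) at a point.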

### Partial Derivatives

Partial differentiation generalises differentiation of scalar functions to functions of more than one variable. In particular, it generalises differentiation to operators on a vector space. It is normal to distinguish a partial derivative using the symbol ∂ instead of *d*.

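**Definition:** When it exists, the *partial derivative*, *f*_{,i} of an operator, *f*: X → Y, is

$$f_{,i}(x) = \frac{\partial f}{\partial x^i}(x) = \lim_{dx^i \to 0} \frac{f(x + dx^i) - f(x)}{dx^i},$$

where, in the numerator, *dx*^{i} is a small vector along the *i*-axis and, in the denominator, its component along that axis.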

**Definition:** *f* is *differentiable*, if its partial derivatives exist for all *x* in X.

**Definition:** The *partial differential operator* is ∂_{i}: *f* → ∂_{i} *f* = *f*_{,i}.

A second order (partial) derivative is found by (partial) differentiation of a (partial) derivative. An *n*^{th} order (partial) derivative is found by (partial) differentiation of an (*n* − 1)^{th} order (partial) derivative. The notation is usually abbreviated by the removal of all but the first comma, e.g., *f*_{,i,j} = *f*_{,ij}.

**Definition:** A function *f* is *smooth*, if its (partial) derivatives exist to all orders for all *x* in X.

It is straightforward to show that ∂_{i} is a linear operator, and that, if second order partial derivatives exist, the *total derivative* of *f* along a curve, *x*: *t* → *x*(*t*), is

$$\frac{df}{dt} = f_{,i}\,\frac{dx^i}{dt},$$

with summation over the repeated index *i*. *f*_{,i} is a covariant vector and this is an inner product. This result does not extend to tensors; the partial derivative of a vector or tensor valued function is not in general a tensor.

For a coordinate transformation,

**Clairaut’s theorem:** For a functional, *f*: X → Y, with continuous second derivatives, the partial derivatives commute,

$$f_{,ij} = f_{,ji}.$$
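Both the total derivative along a curve and the commuting of partial derivatives can be checked numerically. The following Python sketch uses central differences; the function `f`, the curve, the step size `h` and the tolerances are assumptions made for illustration, and the agreement is approximate rather than exact.

```python
import math


def partial(f, i, h=1e-4):
    """Return a function approximating the partial derivative of f along axis i."""
    def df(x):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        return (f(up) - f(down)) / (2 * h)
    return df


def derivative(g, t, h=1e-4):
    """Central-difference derivative of a function of one parameter."""
    return (g(t + h) - g(t - h)) / (2 * h)


def f(x):
    """A hypothetical smooth scalar function of two coordinates."""
    return x[0] ** 3 * x[1] ** 2 + math.sin(x[0] * x[1])


def curve(t):
    """A hypothetical curve with coordinate functions cos t and sin t."""
    return [math.cos(t), math.sin(t)]


# Total derivative: sum over i of (partial derivative of f) * (derivative of x^i).
t0 = 0.7
velocity = [derivative(lambda t, i=i: curve(t)[i], t0) for i in range(2)]
chain = sum(partial(f, i)(curve(t0)) * velocity[i] for i in range(2))
direct = derivative(lambda t: f(curve(t)), t0)
print(abs(chain - direct) < 1e-6)

# Mixed second derivatives taken in either order, compared with the analytic value.
point = [0.5, 1.5]
f_01 = partial(partial(f, 0), 1)(point)  # differentiate along axis 0, then axis 1
f_10 = partial(partial(f, 1), 0)(point)  # differentiate along axis 1, then axis 0
x, y = point
exact = 6 * x ** 2 * y + math.cos(x * y) - x * y * math.sin(x * y)
print(abs(f_01 - f_10) < 1e-6, abs(f_01 - exact) < 1e-6)
```

Because nested central differences sample the same four points in either order, the comparison with the analytic value `exact` is the more informative of the two checks.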
