Implicit function theorem

In mathematics, more specifically in multivariable calculus, the implicit function theorem^[1] is a tool that allows relations to be converted to functions of several real variables. It does so by representing the relation as the graph of a function. There may not be a single function whose graph can represent the entire relation, but there may be such a function on a restriction of the domain of the relation. The implicit function theorem gives a sufficient condition to ensure that there is such a function.

The theorem states that if the equation F(x₁, ..., x_n, y₁, ..., y_m) = F(x, y) = 0 satisfies some mild conditions on its partial derivatives, then one can in principle (though not necessarily with an analytic expression) express the m variables y_i in terms of the n variables x_j as y_i = f_i(x), at least in some disk. Geometrically, the locus defined by F(x, y) = 0 then coincides locally (that is, in that disk) with the graph of these implicit functions f_i(x),^[2] in other words with the hypersurface given by y = f(x).

History

Augustin-Louis Cauchy (1789-1857) is credited with the first rigorous form of the implicit function theorem. Ulisse Dini (1845-1918) generalized the real-variable version of the implicit function theorem to the context of functions of any number of real variables.^[3]

First example

[Figure: The unit circle can be specified as the level curve f(x, y) = 1 of the function $f(x,y)=x^2 + y^2$. Around point A, y can be expressed as a function y(x), specifically $g_1(x)=\sqrt{1-x^2}$. No such function exists around point B.]

If we define the function $f(x,y)=x^2 + y^2$ , then the equation f(x, y) = 1 cuts out the unit circle as the level set {(x, y)| f(x, y) = 1}. There is no way to represent the unit circle as the graph of a function of one variable y = g(x) because for each choice of x ∈ (−1, 1), there are two choices of y, namely $\pm\sqrt{1-x^2}$ .

However, it is possible to represent part of the circle as the graph of a function of one variable. If we let $g_1(x) = \sqrt{1-x^2}$ for −1 < x < 1, then the graph of $y = g_1(x)$ provides the upper half of the circle. Similarly, if $g_2(x) = -\sqrt{1-x^2}$ , then the graph of $y = g_2(x)$ gives the lower half of the circle.

The purpose of the implicit function theorem is to guarantee the existence of functions like $g_1(x)$ and $g_2(x)$ , even in situations where we cannot write down explicit formulas. It also guarantees that $g_1(x)$ and $g_2(x)$ are differentiable, and it applies even when we do not have a formula for f(x, y).
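The idea that the two branches exist without being written down can be illustrated numerically. The following is a minimal sketch, not part of the theorem itself: it assumes Newton's method as the solver, and the helper name `solve_y` and the starting guesses are illustrative choices. It recovers the values of $g_1(x)$ and $g_2(x)$ from the relation f(x, y) = 1 alone, without using the square-root formula.

```python
import math

def f(x, y):
    """Left-hand side of the relation f(x, y) = 1 for the unit circle."""
    return x * x + y * y

def df_dy(x, y):
    """Partial derivative of f with respect to y."""
    return 2.0 * y

def solve_y(x, y_start, tol=1e-12, max_iter=50):
    """Newton iteration in y for f(x, y) = 1, with x held fixed.

    Hypothetical helper: the starting guess selects the branch.
    """
    y = y_start
    for _ in range(max_iter):
        step = (f(x, y) - 1.0) / df_dy(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

x = 0.6
upper = solve_y(x, y_start=1.0)    # branch with y > 0, i.e. g_1(x)
lower = solve_y(x, y_start=-1.0)   # branch with y < 0, i.e. g_2(x)
print(upper, math.sqrt(1 - x * x))
print(lower, -math.sqrt(1 - x * x))
```

The iteration divides by the partial derivative 2y, so it breaks down near the points (±1, 0), which are exactly the points where no function y = g(x) exists.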

Definitions

Let f : R^{n+m} → R^m be a continuously differentiable function. We think of R^{n+m} as the Cartesian product Rⁿ × R^m, and we write a point of this product as (x, y) = (x₁, ..., x_n, y₁, ..., y_m). Starting from the given function f, our goal is to construct a function g: Rⁿ → R^m whose graph (x, g(x)) is precisely the set of all (x, y) such that f(x, y) = 0.

As noted above, this may not always be possible. We will therefore fix a point (a, b) = (a₁, ..., a_n, b₁, ..., b_m) which satisfies f(a, b) = 0, and we will ask for a g that works near the point (a, b). In other words, we want an open set U of Rⁿ containing a, an open set V of R^m containing b, and a function g : U → V such that the graph of g satisfies the relation f = 0 on U × V, and that no other points within U × V do so. In symbols,

\{ (\mathbf{x}, g(\mathbf{x})) \mid \mathbf x \in U \} = \{ (\mathbf{x}, \mathbf{y})\in U \times V \mid f(\mathbf{x}, \mathbf{y}) = 0 \}.

To state the implicit function theorem, we need the Jacobian matrix of f, which is the matrix of the partial derivatives of f. Abbreviating (a₁, ..., a_n, b₁, ..., b_m) to (a, b), the Jacobian matrix is

(Df)(\mathbf{a},\mathbf{b}) = \left[\begin{matrix} \frac{\partial f_1}{\partial x_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_1}{\partial x_n}(\mathbf{a},\mathbf{b})\\ \vdots & \ddots & \vdots\\ \frac{\partial f_m}{\partial x_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_m}{\partial x_n}(\mathbf{a},\mathbf{b}) \end{matrix}\right|\left. \begin{matrix} \frac{\partial f_1}{\partial y_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_1}{\partial y_m}(\mathbf{a},\mathbf{b})\\ \vdots & \ddots & \vdots\\ \frac{\partial f_m}{\partial y_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_m}{\partial y_m}(\mathbf{a},\mathbf{b})\\ \end{matrix}\right] = [X|Y]

where X is the matrix of partial derivatives in the variables x_i and Y is the matrix of partial derivatives in the variables y_j. The implicit function theorem says that if Y is an invertible matrix, then there are U, V, and g as desired. Writing all the hypotheses together gives the following statement.

Statement of the theorem

Let f: R^{n+m} → R^m be a continuously differentiable function, and let R^{n+m} have coordinates (x, y). Fix a point (a, b) = (a₁, ..., a_n, b₁, ..., b_m) with f(a, b) = c, where c ∈ R^m. If the Jacobian matrix $J_{f,y}(a, b) = \left[\frac{\partial f_i}{\partial y_j}(a, b)\right]$ is invertible, then there exist an open set U containing a, an open set V containing b, and a unique continuously differentiable function g: U → V such that

\{(\mathbf {x} ,g(\mathbf {x} ))\mid \mathbf {x} \in U\}=\{(\mathbf {x} ,\mathbf {y} )\in U\times V\mid f(\mathbf {x} ,\mathbf {y} )=\mathbf {c} \}.
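As an illustration of the statement with n = 1 and m = 2, the sketch below uses a made-up system f (not taken from the article) with f(1, 1, 1) = (0, 0), so that c = 0, a = 1 and b = (1, 1). It first checks that the 2 × 2 block of partial derivatives with respect to y is invertible at (a, b), and then approximates g(x) near a by Newton's method. The solver, the function names and the tolerances are all assumptions of this sketch.

```python
def f(x, y1, y2):
    """Hypothetical system f : R^(1+2) -> R^2 with f(1, 1, 1) = (0, 0)."""
    return (x * x + y1 * y1 + y2 * y2 - 3.0,
            x + y1 - y2 - 1.0)

def jac_y(x, y1, y2):
    """2 x 2 block of partial derivatives of f with respect to (y1, y2)."""
    return ((2.0 * y1, 2.0 * y2),
            (1.0, -1.0))

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def g(x, y_start=(1.0, 1.0), tol=1e-12, max_iter=50):
    """Newton iteration in (y1, y2) for f(x, y1, y2) = 0, with x held fixed."""
    y1, y2 = y_start
    for _ in range(max_iter):
        r1, r2 = f(x, y1, y2)
        (a, b), (c, d) = jac_y(x, y1, y2)
        det = a * d - b * c
        # Solve the 2 x 2 linear system J * step = residual by Cramer's rule.
        s1 = (r1 * d - b * r2) / det
        s2 = (a * r2 - c * r1) / det
        y1, y2 = y1 - s1, y2 - s2
        if abs(s1) + abs(s2) < tol:
            return y1, y2
    raise RuntimeError("Newton iteration did not converge")

print(det2(jac_y(1.0, 1.0, 1.0)))   # non-zero, so the hypothesis holds at (a, b)
print(g(1.0))                        # recovers b = (1, 1)
print(g(1.1))                        # a nearby point on the solution set
```

The theorem only asserts that g exists on some neighbourhood U of a; the sketch does not determine how large U is.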

Regularity

It can be proven that whenever we have the additional hypothesis that f is continuously differentiable k times inside U × V, then the same holds true for the explicit function g inside U and

{\frac {\partial g}{\partial x_{j}}}(x)=-J_{f,y}(x,g(x))^{-1}{\frac {\partial f}{\partial x_{j}}}(x,g(x))

Similarly, if f is analytic inside U × V, then the same holds true for the explicit function g inside U.^[4] This generalization is called the analytic implicit function theorem.
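The derivative formula above can be checked numerically in the case n = m = 1, where it reduces to g′(x) = −(∂f/∂x)/(∂f/∂y). The sketch below assumes a hypothetical relation y⁵ + y − x = 0, which it solves for y by Newton's method rather than by a formula, and compares the formula with a central finite difference. The function names and the step size h are illustrative choices.

```python
def f(x, y):
    """Hypothetical relation f(x, y) = y**5 + y - x = 0."""
    return y ** 5 + y - x

def df_dx(x, y):
    """Partial derivative of f with respect to x."""
    return -1.0

def df_dy(x, y):
    """Partial derivative of f with respect to y; at least 1, so never zero."""
    return 5.0 * y ** 4 + 1.0

def g(x, tol=1e-13, max_iter=100):
    """Solve f(x, y) = 0 for y by Newton iteration, starting from y = 0."""
    y = 0.0
    for _ in range(max_iter):
        step = f(x, y) / df_dy(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

def dg_formula(x):
    """Derivative of g from the implicit function theorem formula."""
    y = g(x)
    return -df_dx(x, y) / df_dy(x, y)

def dg_finite_difference(x, h=1e-5):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

x0 = 2.0   # here g(x0) = 1, since 1**5 + 1 - 2 = 0
print(dg_formula(x0), dg_finite_difference(x0))
```

At x = 2 the formula gives 1/(5·1⁴ + 1) = 1/6, and the finite difference should agree with it up to discretisation and rounding error.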

The circle example

Let us go back to the example of the unit circle. In this case n = m = 1 and $f(x,y) = x^2 + y^2 - 1$ . The matrix of partial derivatives is just a 1 × 2 matrix, given by

(Df)(a,b) = \left [ \frac{\partial f}{\partial x}(a,b) \ \ \frac{\partial f}{\partial y}(a,b) \right ] = [2a \ \ 2b]

Thus, here, the Y in the statement of the theorem is just the number 2b; the linear map defined by it is invertible if and only if b ≠ 0. By the implicit function theorem we see that we can locally write the circle in the form y = g(x) for all points where y ≠ 0. For (±1, 0) we run into trouble, as noted before. The implicit function theorem may still be applied to these two points, but writing x as a function of y, that is, $x = h(y)$ ; now the graph of the function will be $\left(h(y), y\right)$ , since where b = 0 we have a = ±1, so the partial derivative with respect to x, namely 2a, is non-zero and the conditions to locally express the function in this form are satisfied.

The implicit derivative of y with respect to x, and that of x with respect to y, can be found by totally differentiating the implicit function $x^2+y^2-1$ and equating to 0:

2x dx+2y dy = 0,

giving

dy/dx=-x/y

and

dx/dy=-y/x.
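As a quick check of the first formula, take the point (3/5, 4/5) on the upper half of the circle, where $y = g_1(x) = \sqrt{1-x^2}$ . Differentiating $g_1$ directly gives

g_1'(x) = -\frac{x}{\sqrt{1-x^2}}, \qquad g_1'\left(\tfrac{3}{5}\right) = -\frac{3/5}{4/5} = -\frac{3}{4},

which agrees with dy/dx = −x/y = −(3/5)/(4/5) = −3/4 obtained from the implicit derivative.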

Application: change of coordinates

Suppose we have an m-dimensional space, parametrised by a set of coordinates $(x_1,\ldots,x_m)$ . We can introduce a new coordinate system $(x'_1,\ldots,x'_m)$ by supplying m functions $h_1\ldots h_m$ . These functions allow us to calculate the new coordinates $(x'_1,\ldots,x'_m)$ of a point, given the point's old coordinates $(x_1,\ldots,x_m)$ using $x'_1=h_1(x_1,\ldots,x_m), \ldots, x'_m=h_m(x_1,\ldots,x_m)$ . One might want to verify if the opposite is possible: given coordinates $(x'_1,\ldots,x'_m)$ , can we 'go back' and calculate the same point's original coordinates $(x_1,\ldots,x_m)$ ? The implicit function theorem will provide an answer to this question. The (new and old) coordinates $(x'_1,\ldots,x'_m, x_1,\ldots,x_m)$ are related by f = 0, with

f(x'_1,\ldots,x'_m,x_1,\ldots x_m)=(h_1(x_1,\ldots x_m)-x'_1,\ldots , h_m(x_1,\ldots, x_m)-x'_m).

Now the Jacobian matrix of f at a certain point (a, b) [ where $a=(x'_1,\ldots,x'_m), b=(x_1,\ldots,x_m)$ ] is given by

(Df)(a,b) = \left [\begin{matrix} -1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & -1 \end{matrix}\left| \begin{matrix} \frac{\partial h_1}{\partial x_1}(b) & \cdots & \frac{\partial h_1}{\partial x_m}(b)\\ \vdots & \ddots & \vdots\\ \frac{\partial h_m}{\partial x_1}(b) & \cdots & \frac{\partial h_m}{\partial x_m}(b)\\ \end{matrix} \right.\right] = [-1_m |J ].

where 1_m denotes the m × m identity matrix, and J is the m × m matrix of partial derivatives, evaluated at (a, b). (In the above, these blocks were denoted by X and Y. As it happens, in this particular application of the theorem, neither matrix depends on a.) The implicit function theorem now states that we can locally express $(x_1,\ldots,x_m)$ as a function of $(x'_1,\ldots,x'_m)$ if J is invertible. Demanding J is invertible is equivalent to det J ≠ 0, thus we see that we can go back from the primed to the unprimed coordinates if the determinant of the Jacobian J is non-zero. This statement is also known as the inverse function theorem.

Example: polar coordinates

As a simple application of the above, consider the plane, parametrised by polar coordinates (R, θ). We can go to a new coordinate system (cartesian coordinates) by defining functions x(R, θ) = R cos(θ) and y(R, θ) = R sin(θ). This makes it possible given any point (R, θ) to find corresponding cartesian coordinates (x, y). When can we go back and convert cartesian into polar coordinates? By the previous example, it is sufficient to have det J ≠ 0, with

J =\begin{bmatrix} \frac{\partial x(R,\theta)}{\partial R} & \frac{\partial x(R,\theta)}{\partial \theta} \\ \frac{\partial y(R,\theta)}{\partial R} & \frac{\partial y(R,\theta)}{\partial \theta} \\ \end{bmatrix}= \begin{bmatrix} \cos \theta & -R \sin \theta \\ \sin \theta & R \cos \theta \end{bmatrix}.

Since det J = R, conversion back to polar coordinates is possible if R ≠ 0. So it remains to check the case R = 0. It is easy to see that in case R = 0, our coordinate transformation is not invertible: at the origin, the value of θ is not well-defined.
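The computation above can be reproduced in a few lines. The sketch below evaluates det J from the closed-form entries of the matrix and inverts the transformation away from the origin. The use of `math.atan2`, which fixes θ in (−π, π], is an assumption of this sketch and is only one of many possible local inverses.

```python
import math

def to_cartesian(R, theta):
    """Map polar coordinates (R, theta) to cartesian coordinates (x, y)."""
    return R * math.cos(theta), R * math.sin(theta)

def jacobian_det(R, theta):
    """Determinant of J, computed from its closed-form entries."""
    a, b = math.cos(theta), -R * math.sin(theta)
    c, d = math.sin(theta), R * math.cos(theta)
    return a * d - b * c

def to_polar(x, y):
    """One local inverse away from the origin, with theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x)

R, theta = 2.0, 0.75
x, y = to_cartesian(R, theta)
print(jacobian_det(R, theta))   # equals R up to rounding
print(to_polar(x, y))           # recovers (R, theta) up to rounding
print(jacobian_det(0.0, 1.3))   # vanishes at R = 0, whatever theta is
```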

Generalizations

Banach space version

Based on the inverse function theorem in Banach spaces, it is possible to extend the implicit function theorem to Banach space valued mappings.^[5]

Let X, Y, Z be Banach spaces. Let the mapping f : X × Y → Z be continuously Fréchet differentiable. If $(x_0,y_0)\in X\times Y$ , $f(x_{0},y_{0})=0$ , and $y\mapsto Df(x_0,y_0)(0,y)$ is a Banach space isomorphism from Y onto Z, then there exist neighbourhoods U of x₀ and V of y₀ and a Fréchet differentiable function g : U → V such that f(x, g(x)) = 0 and f(x, y) = 0 if and only if y = g(x), for all $(x,y)\in U\times V$ .

Implicit functions from non-differentiable functions

Various forms of the implicit function theorem exist for the case when the function f is not differentiable. It is standard that local strict monotonicity suffices in one dimension.^[6] The following more general form was proven by Kumagai^[7] based on an observation by Jittorntrum.^[8]

Consider a continuous function $f : R^n \times R^m \to R^n$ such that $f(x_{0},y_{0})=0$ . If there exist open neighbourhoods $A \subset R^n$ and $B \subset R^m$ of x₀ and y₀, respectively, such that, for all y in B, $f(\cdot, y) : A \to R^n$ is locally one-to-one then there exist open neighbourhoods $A_0 \subset R^n$ and $B_0 \subset R^m$ of x₀ and y₀, such that, for all $y \in B_0$ , the equation f(x, y) = 0 has a unique solution

x = g(y) \in A_0

where g is a continuous function from B₀ into A₀.
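A simple illustration, chosen here as an example and not taken from the cited papers, is the case n = m = 1 with

f(x, y) = x^3 - y.

This f is continuous with f(0, 0) = 0, and for every y the map $x \mapsto x^3 - y$ is strictly increasing, hence one-to-one. The unique solution of f(x, y) = 0 is

g(y) = \sqrt[3]{y},

which is continuous but not differentiable at y = 0. The differentiable version of the theorem cannot be applied at (0, 0) to solve for x, because the relevant partial derivative $\frac{\partial f}{\partial x}(0, 0) = 3 \cdot 0^2 = 0$ is not invertible.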

Notes

[1] Also called Dini's theorem by the Pisan school in Italy. In the English-language literature, Dini's theorem is a different theorem in mathematical analysis.
[2] See Chiang 1984, pp. 204–206.
[3] Steven Krantz and Harold Parks, The Implicit Function Theorem, Modern Birkhäuser Classics, Birkhäuser, 2003.
[4] See Fritzsche & Grauert 2002, p. 34.
[5] See Lang 1999, pp. 15–21 and Edwards 1994, pp. 417–418.
[6] See Kudryavtsev 2001.
[7] See Kumagai 1980, pp. 285–288.
[8] See Jittorntrum 1978, pp. 575–577.

References

Chiang, Alpha C. (1984). Fundamental Methods of Mathematical Economics (3rd ed.). McGraw-Hill.

Danilov, V.I. (2001), "Implicit function (in algebraic geometry)", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4.

Edwards, Charles Henry (1994) [1973]. Advanced Calculus of Several Variables. Mineola, New York: Dover Publications. ISBN 978-0-486-68336-2.

Fritzsche, K.; Grauert, H. (2002). From Holomorphic Functions to Complex Manifolds. Springer.

Jittorntrum, K. (1978). "An Implicit Function Theorem". Journal of Optimization Theory and Applications. 25 (4). doi:10.1007/BF00933522.

Kudryavtsev, Lev Dmitrievich (2001), "Implicit function", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4.

Kumagai, S. (1980). "An implicit function theorem: Comment". Journal of Optimization Theory and Applications. 31 (2). doi:10.1007/BF00934117.

Lang, Serge (1999). Fundamentals of Differential Geometry. Graduate Texts in Mathematics. New York: Springer. ISBN 978-0-387-98593-0.

This article is issued from Wikipedia - version of the 12/4/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.