Author's Notes on Measure Theory for Applications to Probability and Economics

PMA: Principles of Mathematical Analysis (Rudin)
RCA: Real and Complex Analysis (Rudin)
PS: Probability and Stochastics (Cinlar)
PM: Probability and Measure (Billingsley)
CPM: Convergence of Probability Measures (Billingsley)
LAF: Linear Algebra (Friedberg)
LAH: Linear Algebra (Hoffman and Kunze)

1. Topology and Preliminaries

As the title suggests, this chapter covers the topological and analytical concepts I believe are necessary for measure theory and probability. The topics are developed following Munkres' Topology and RCA.

I focus more on the countability and separation axioms than most measure theory texts do; second countability in particular proved to be a key property connecting topologies and Borel σ-algebras.

I also present three different versions of Urysohn's lemma. The first is the one found in RCA, which assumes the underlying topological space is locally compact Hausdorff and requires the closed set involved to be compact. The second, Urysohn's lemma for normal spaces, drops the compactness requirement and holds for any pair of disjoint closed sets, at the cost of assuming normality; neither class of spaces contains the other, since a locally compact Hausdorff space need not be normal and a normal space need not be locally compact. Lastly, restricting the underlying space to a metrizable space even allows us to construct a Lipschitz continuous Urysohn function, which is particularly useful when proving the Portmanteau theorem later on.
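
For concreteness, one standard construction of a Lipschitz Urysohn function in a metric space (X, d) is sketched below; it may differ in detail from the one used in the chapter. The symbols F (a closed set), ε > 0 and f_ε are notation assumed for this sketch.

```latex
\[
  d(x, F) = \inf_{y \in F} d(x, y),
  \qquad
  f_\varepsilon(x) = \max\Bigl\{ 0,\; 1 - \frac{d(x, F)}{\varepsilon} \Bigr\}.
\]
% Properties:
%   0 \le f_\varepsilon \le 1 on X,
%   f_\varepsilon = 1 on F,
%   f_\varepsilon(x) = 0 whenever d(x, F) \ge \varepsilon,
%   |f_\varepsilon(x) - f_\varepsilon(y)| \le d(x, y) / \varepsilon  for all x, y \in X.
```

As ε decreases to 0, f_ε decreases pointwise to the indicator function of F, which is the property typically exploited in proofs of the Portmanteau theorem.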

Although it is not a topological topic, I conclude the chapter with a discussion of subsequences and limits superior/inferior. I have yet to see a satisfactorily comprehensive treatment of these core concepts of analysis, even in PMA, so I decided to include my own treatment of the subject here. The results derived for subsequences are then immediately applied to Euclidean spaces to prove their completeness, the Heine-Borel theorem, and the extreme value theorem. These applications, as well as the alternate proof of the intermediate value theorem based on the binary search algorithm, are due to Princeton's Math Camp classes taught by Fedor Sandomirskiy.
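
The binary search idea behind such a proof of the intermediate value theorem can be sketched numerically as follows. This is only an illustration of the bisection argument, not the proof given in the chapter; the function name bisect_root and the tolerance tol are assumptions made for this sketch.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) <= 0 <= f(b)."""
    if not f(a) <= 0 <= f(b):
        raise ValueError("need f(a) <= 0 <= f(b)")
    lo, hi = a, b
    # Invariant maintained by every step: f(lo) <= 0 <= f(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) <= 0:
            lo = mid   # keep the right half [mid, hi]
        else:
            hi = mid   # keep the left half [lo, mid]
    return (lo + hi) / 2


# Illustrative use: the positive zero of x^2 - 2 on [0, 2].
root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
```

In the proof itself the halving never stops: the nested intervals shrink to a single point c, and continuity together with the invariant forces f(c) = 0.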

2. Measure Spaces and Measurable Functions

We define measurable spaces and study properties of measurable functions as well as (positive) measures in this chapter.

Unlike most textbooks on measure theory, we include the π-λ theorem and the monotone class theorem for functions (basically the π-λ theorem but for measurable functions). These theorems, as found in PS, are central to many of the proofs in probability theory, hence their inclusion.

For similar reasons, the conditions for equivalence of finite and σ-finite measures are also touched upon. Our chapter concludes with Caratheodory's extension theorem, which remains one of the most frequently invoked and most easily applicable results when constructing a measure space from more primitive conditions. For instance, the extension theorem is key to Ionescu-Tulcea's theorem, which proves the existence of an underlying probability space supporting a sequence of random variables with the desired distributions.

3. Abstract Integration

In this chapter we define abstract integration and study some of the properties of abstract integrals. The exposition is quite straightforward. The exceptions are some additional theorems related to the Dominated Convergence Theorem, such as Scheffe's lemma, that are seldom included in measure theory textbooks but prove useful for the study of probability theory.

A section on transition kernels is also included, with the intention of formally introducing Markov transition probabilities. Markov chains are central to Bayesian econometrics and MCMC methods, among other areas, and the content in this section furnishes us with the mathematical tools for studying Markov chains with infinite state spaces.
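
As a small illustration of how a transition kernel acts on a measure, here is a finite-state sketch. The chapter works with general (possibly infinite) state spaces, where the sum below becomes the integral (μK)(A) = ∫ μ(dx) K(x, A); the two-state kernel K, the initial measure mu0, and the function name push_forward are all hypothetical.

```python
def push_forward(mu, K):
    """Return the measure mu K, where (mu K)(j) = sum_i mu(i) * K(i, j)."""
    n_states = len(K[0])
    return [sum(mu[i] * K[i][j] for i in range(len(mu))) for j in range(n_states)]


# Hypothetical two-state kernel: row i is the probability measure K(i, .).
K = [[0.9, 0.1],
     [0.5, 0.5]]

mu0 = [1.0, 0.0]             # start in state 0 with probability one
mu1 = push_forward(mu0, K)   # distribution after one step
mu2 = push_forward(mu1, K)   # distribution after two steps
```

Each row of K sums to one, so applying the kernel to a probability measure again yields a probability measure.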

4. Borel Measures and Lebesgue Integration

This chapter constructs the Lebesgue measure using the Riesz representation theorem, and follows chapter 2 of RCA almost word for word.

Where this chapter deviates from RCA is in the content on linear algebra and the connection between linear algebra and the geometry of Euclidean spaces. After proving the Riesz representation theorem we take an extended detour to study finite-dimensional vector spaces, linear transformations, determinants and eigenvalues/eigenvectors. This machinery reappears after the construction of the Lebesgue measure, where we show that the determinant has a geometric interpretation and present a related formula for a linear change of variables. The content on linear algebra in this chapter is based mostly on LAF, with the exception of the development of determinants, which borrows heavily from LAH.
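
The geometric interpretation and the change of variables formula can be stated as follows. This is the standard form of the results; the notation (λ for the Lebesgue measure on R^k, T for a linear map, E for a Borel set) is assumed here and may differ from the chapter's.

```latex
% Volume scaling: for every linear T : R^k -> R^k and every Borel set E,
\[
  \lambda\bigl(T(E)\bigr) = \lvert \det T \rvert \, \lambda(E).
\]
% Linear change of variables: for invertible T and nonnegative Borel f,
\[
  \int_{\mathbb{R}^k} f \, d\lambda
  = \lvert \det T \rvert \int_{\mathbb{R}^k} (f \circ T) \, d\lambda.
\]
```

When T is singular, T(E) lies in a proper subspace of R^k, so both sides of the first identity are zero.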

5. L^p Spaces

L^p spaces, being at once function spaces and Banach spaces, appear very often in probability theory. As such, we delve deep into the various results associated with L^p spaces, including the Holder and Minkowski inequalities as well as Jensen's inequality. The exposition is standard, and borrows from both RCA and PS.
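
For reference, the three inequalities in their usual form, on a measure space (Ω, F, μ) with measurable f and g. The notation is assumed for this summary, and the exact hypotheses used in the chapter may be stated slightly differently.

```latex
% Holder: 1 < p, q < \infty with 1/p + 1/q = 1
\[
  \lVert f g \rVert_1 \le \lVert f \rVert_p \, \lVert g \rVert_q .
\]
% Minkowski: 1 \le p \le \infty
\[
  \lVert f + g \rVert_p \le \lVert f \rVert_p + \lVert g \rVert_p .
\]
% Jensen: \mu(\Omega) = 1, f \in L^1(\mu) real-valued,
% \varphi convex on an open interval containing the range of f
\[
  \varphi\!\left( \int_\Omega f \, d\mu \right) \le \int_\Omega (\varphi \circ f) \, d\mu .
\]
```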

6. Hilbert Space Theory

This chapter is dedicated to inner product spaces and Hilbert spaces in particular. We focus on the Hilbert projection theorem, which ensures the existence of an orthogonal projection onto any closed subspace. The projection theorem is the source of many of the most useful results concerning Hilbert spaces, most notably the Riesz-Frechet representation theorem for continuous linear functionals; the representation theorem in turn plays an integral role in von Neumann's proof of the Radon-Nikodym theorem.
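
In symbols, the projection theorem for a closed subspace M of a Hilbert space H reads as follows; the names P and M are notation assumed for this summary.

```latex
% Every x in H decomposes uniquely as
\[
  x = Px + (x - Px), \qquad Px \in M, \quad x - Px \in M^{\perp},
\]
% and Px is the unique nearest point to x in M:
\[
  \lVert x - Px \rVert = \min_{y \in M} \lVert x - y \rVert .
\]
```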

While it is overlooked in many introductory measure theory courses, the Radon-Nikodym theorem is a crucial result for probability theory. The theorem is used to prove the existence of conditional expectations and probability density functions, among various other results. I thus devote the last section of this chapter to the proof of the Radon-Nikodym theorem, and more generally the Lebesgue decomposition theorem, based on chapter 6 of RCA.

Although our main preoccupation is with Hilbert spaces, we also briefly turn to finite-dimensional inner product spaces. In particular, I develop Schur's theorem and subsequently the principal axis theorem, based on LAF. Likewise, the QR and Cholesky decompositions, the latter of which features heavily in almost all areas of economics, are proved here with the help of inner product methods.
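
To make the Cholesky decomposition concrete, here is a minimal sketch of the textbook row-by-row algorithm for a symmetric positive definite matrix. It illustrates the factorization A = L L^T only; it is not the inner product proof given in the chapter, and the function name cholesky and the example matrix are assumptions.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Inner product of the already computed parts of rows i and j of L.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


# Hypothetical 2x2 covariance matrix.
A = [[4.0, 2.0],
     [2.0, 3.0]]
L = cholesky(A)
```

One common use in economics: if z is a vector of independent standard normal draws, then Lz has covariance matrix A, which is how correlated shocks are typically simulated.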

7. Integration on Product Spaces

Product spaces are defined and studied in this chapter, which includes most of the standard results such as Fubini's theorem. Also included is a short section on transition probability kernels and the construction of probability measures on product spaces. The results derived in this section are then used for a mathematically formal proof of Bayes' rule. The content in this chapter overlaps with chapter 2, section 2 of my text on probability theory.

8. Differentiation

This chapter is included here mostly for the sake of completeness, since a development of integration would feel incomplete without a discussion of the fundamental theorem of calculus.

Instead of delving into the theory of absolute continuity and the Lebesgue differentiation theorem, which constitute a worthy topic in their own right, I focus here on the basic form of the FTC for continuous functions, since it suffices for our purposes in most cases. The material in this chapter is based on chapters 5 and 9 of PMA, and is duplicated in the text on convex analysis.
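
The basic form of the FTC referred to here can be written as follows, for a continuous function f on [a, b]; the symbols F and G are notation assumed for this summary.

```latex
% Part 1: the indefinite integral of a continuous f is an antiderivative of f.
\[
  F(x) = \int_a^x f(t) \, dt
  \quad \Longrightarrow \quad
  F'(x) = f(x) \quad \text{for all } x \in [a, b].
\]
% Part 2: if G is differentiable on [a, b] with G' = f, then
\[
  \int_a^b f(t) \, dt = G(b) - G(a).
\]
```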