Recently, I encountered a number of seemingly simple questions about
strings. For example, what is the runtime complexity of sorting an array
of strings? Intuitively, if n is the
number of elements in the array, then the optimal runtime complexity – using
merge sort or heap sort – should be O(n\log n), no?
Well, not quite. You probably don’t need to think too much to realize
that it depends on the length of the strings in the array. If we are
sorting an array of small English words, that’s going to have a
different runtime to sorting an array of large strings of DNA sequences
or Dickens novels.
That and similar problems made me search my soul a bit. What other
string problems have I oversimplified in my mind?
In a series of blog entries, I want to discuss a few truly astounding
“linear” results that I learned from some of the great cybertutors
(references below), starting with the one that has the narrowest focus:
Manacher’s Algorithm. In fact, I have never seen a complete exposition
of Manacher’s algorithm presented together with a proof of its optimality.
I thought I’d be the documentarian of this clever algorithm; I found it
an effort well spent.
In future parts, I would like to discuss Knuth-Morris-Pratt (Part 2),
Aho-Corasick (Part 3), Compact Tries (Part 4) and Suffix
Trees (Part 5).
My hope is to pique the interests of algorithm enthusiasts to add a few
string-related tricks to their arsenal of cool algorithms.
Longest Palindromic Substring Problem
A palindrome is a string that is identical when reversed. For example,
the name “BOB” is the same spelled backwards. Ignoring white-spaces,
phrases like “we panic in a pew” are often fun and famous examples that,
when set to music, may sound something like this: “Bob”.
Given a string S, we can ask: what is the longest palindromic substring?
As an example, the string "abracarbrabaddabra", while not itself a
palindrome, has a few palindromic substrings: "bracarb" (length 7)
and "baddab" (length 6). What is an algorithm that finds the
longest such palindromic substring, "bracarb"?
Let’s start with the Brute-force approach: list out every substring
and check which one is the longest one. We begin by writing down a
function to check if a string is a palindrome. Here, we assume that
"" is a palindrome.
(For the TDD folks, you might want to start with
some obvious test cases: "", "a", "abba", "aba", "abbb"
and "abb".)
// IsPalindrome - Tests whether a string s is a palindrome
// or not
func IsPalindrome(s string) bool {
	sLen := len(s)
	for i := 0; i < sLen/2; i++ {
		if s[i] != s[sLen-i-1] {
			return false
		}
	}
	return true
}
To solve the problem, we can now approach it like this:
set aside a tracker for the longest palindrome substring seen
so far maxPalindrome (optionally cache its length maxLen)
generate a list of all substrings of some s
iterate through each substring: if a substring is a palindrome
and is longer than the current maximum, replace the current
maximum with the substring (and update the maximal length)
(A few test cases might include: "", "a", "abba", "acadab",
"abababcac" – multiple maximal palindromes – and "abrasive" – only
trivial one-character palindromes. A sketch of the brute force in code follows.)
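Putting the steps above into code, a minimal brute-force sketch might look like the following; the name LongestPalindrome is mine, and it leans on the IsPalindrome helper above.

// LongestPalindrome - brute force: enumerate every substring and keep the
// longest one that passes IsPalindrome.
func LongestPalindrome(s string) string {
	maxPalindrome := ""
	for i := 0; i < len(s); i++ {
		for j := i; j <= len(s); j++ {
			sub := s[i:j]
			if IsPalindrome(sub) && len(sub) > len(maxPalindrome) {
				maxPalindrome = sub
			}
		}
	}
	return maxPalindrome
}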
The runtime complexity is O(n^3) where
n is the length of the string.
There are a few suboptimal steps in the computation. For example, if
the substring defined by (i, j) is not a palindrome, then the one defined
by (i - 1, j + 1) will not be a palindrome either. Furthermore, if
(i, j) defines a palindrome, then checking whether (i - 1, j + 1) defines
a palindrome requires only one comparison: s[i - 1] against s[j + 1].
These would cut down on the number of unnecessary comparisons.
However, it is far from obvious that such simplifications – if implemented
correctly – would reduce the runtime from O(n^3) to
O(n^2) (nevermind a linear runtime!).
Manacher’s Algorithm
Let’s instead take a different approach. Let us create an auxiliary array
that tracks the size of the largest possible palindrome centered at each
character. As some palindromes have an even number of characters – "aa" (say)
– whose point of symmetry falls between characters, we will use a special
indexing trick to track these as well. The index trick can be thought of
as inserting a special character # between each character, and at the front
and end of the string.
To help us visualize, consider the string in the last section abracarbrabaddabra.
Writing the array of maximal palindrome lengths P as numbers beneath the augmented
string, we arrive at values such as 7 beneath the c (the centre of "bracarb") and
6 beneath the # between the two ds (the centre of "baddab").
Going through the array, it is immediately obvious that the longest palindromic
substring is centered at c (index 4) with length 7. Given such an array,
the runtime for finding the maximum length and retrieving the string value will be
linear. The trick is then to find P in linear time – and that is the crux of
Manacher’s algorithm.
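To make the linear retrieval concrete, here is a sketch of the two bookkeeping pieces around P: the augmentation and the final scan. The helper names augment and longestFromP are my own, not part of the algorithm proper.

// augment inserts '#' between every character of s and at both ends.
func augment(s string) string {
	out := make([]byte, 0, 2*len(s)+1)
	out = append(out, '#')
	for i := 0; i < len(s); i++ {
		out = append(out, s[i], '#')
	}
	return string(out)
}

// longestFromP scans P once and cuts the winning palindrome out of the
// original (un-augmented) string s.
func longestFromP(s string, P []int) string {
	best, center := 0, 0
	for i, p := range P {
		if p > best {
			best, center = p, i
		}
	}
	// P is indexed over the augmented string: a palindrome of length best in
	// the original string, centered at augmented index center, starts at
	// original index (center-best)/2.
	start := (center - best) / 2
	return s[start : start+best]
}

For abracarbrabaddabra, the scan finds the maximum value 7 at the centre c, and the slice recovered is "bracarb".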
The naive solution to find P is relatively simple to code up:
// Assuming that s is already augmented with '#', computes
// an array of the largest palindrome lengths
func findPNaively(s string) []int {
	P := make([]int, len(s))
	for i := range s {
		longestPalindrome := 1
		for j := 1; j <= i; j++ {
			if i+j >= len(s) {
				break
			}
			if s[i-j] == s[i+j] {
				longestPalindrome += 2
			} else {
				break
			}
		}
		// we are dividing by 2 to discount the '#' characters
		P[i] = longestPalindrome / 2
	}
	return P
}
It is easy to see that the naive algorithm runs in O(n^2)
time. (Consider, for example, the string aaaaaaaaaaaa.) Not great.
However, there is also quite a bit of waste. In the naive implementation,
a palindrome’s symmetry is ignored in findPNaively when we iterate through
each character of s. Let’s see how we can better exploit the symmetry
to reduce wasted lookups.
Consider the palindrome “acacdabadcaca”. If you look carefully, there
are two “off-centered” palindromic substrings (both “cac”). The fact that
there are two of the same is not
a coincidence; the right “cac” is just a reflection of the
left one about the point of symmetry of the parent palindrome.
In fact, this is generally true: any “off-centered” palindromes
contained entirely within another palindrome occur in pairs. This
is because of two facts:
palindromes are, by definition, invariant under reversal and
for any palindrome, the substring after the pivot point is equal
to the reverse of the substring before the pivot point
We might want to state this insight more formally in terms of
indices of centers of palindromes and the array P of
maximal palindrome length:
For nonnegative integers i and j where
i < j, if i - P[i]/2 > j - P[j]/2 then
P[i] = P[2j - i].
Proof: if i - P[i]/2 > j - P[j]/2, then the
maximal palindrome centered at i is contained entirely
within the palindrome centered at j. Its reflection
about j is another palindrome of exactly the same length.
To see this, if the maximal palindrome centered at 2j - i were longer by 2 characters (say),
then, since i - P[i]/2 > j - P[j]/2, those two characters
are also within the palindrome centered at j, and their
reflection about j yields a longer palindrome centered at
i, contradicting the maximality of P[i].
\blacksquare
The above covers the case when one palindromic substring is contained
entirely inside another. What about the case when two palindromic
substrings only partially intersect? In this case, we derive the
following results about the array P:
For nonnegative integers i and j where
i < j, if i - P[i]/2 < j - P[j]/2 then the
maximal palindrome centered at 2j - i is a suffix of the palindrome
centered at j. In particular, P[2j - i] = 2i + P[j] - 2j
Proof: That the suffix centered at 2j - i is a
palindrome is clear: it is the reflection about j of the prefix of the palindrome
centered at j that is centered at i, and that prefix sits inside the maximal palindrome centered at i.
If the maximal palindrome centered at 2j - i were any longer by 2
characters, for example, then the character immediately before the maximal palindrome
centered at j and the character immediately after it would be the
same, contradicting its maximality.
\blacksquare
A “reflection” of the above result about j gives the following:
For nonnegative integers i and j where
i < j, if i - P[i]/2 = j - P[j]/2 then
the maximal palindrome centered at 2j - i extends at least to
the end of the palindrome centered at j. That is,
P[2j - i] \geq P[i]. \blacksquare
To put it altogether, let us sketch out the algorithm for FindPManacher.
The idea is to iterate through each character to find the maximal palindrome
centered at that character. We will leverage these three results to avoid
checking the maximal palindrome sizes when we already have information about
them.
The algorithm may be stated as follows:
Given a non-nil string s
Let j track the index of the center of the current largest palindrome.
This value is initialised as 0 and will be updated when we start to evaluate
indices outside of the maximal palindrome centered at j, or when
we encounter a palindrome that extends beyond the right boundary of the one centered at j.
Let r track the radius of the palindrome initialised as 0. Notice that
this is the same as P[j] because we are treating the augmented string s
with # characters inserted. This value will be updated whenever we update j.
We will include the variable in the verbal description of the algorithm, but
omit this value from the code.
Let P track the array of maximal palindrome lengths, initialised with
0 with capacity and length equal to the length of s
For each i, do the following:
if i is greater than j + r, then set
j = i and find the longest palindrome centered at j; suppose its
radius is maxLen then set P[i] = maxLen and set r = maxLen
otherwise, let k equal the pivot of i about j. If
k - P[k] < j - r then set P[i] = j + r - i. If
k - P[k] > j - r then set P[i] = P[k]. Otherwise, find the
largest palindrome centered at i, starting the comparisons at position j + r + 1.
Suppose its radius is maxLen; then set P[i] = maxLen.
Update j and r if the palindrome centered at i extends beyond
the palindrome centered at j – because the former offers us more
symmetry data than the latter.
Return the array P
In code, this looks something like:
func pivot(i, center int) int {
	return 2*center - i
}

func findMaxPalindromeLen(s string, center, rightStart int) int {
	right := rightStart
	for ; right < len(s); right++ {
		left := pivot(right, center)
		if left < 0 {
			break
		}
		if s[left] != s[right] {
			break
		}
	}
	return right - center - 1
}

// FindPManacher - for a given string s, find the array of
// maximal palindrome lengths where the value at position i
// is the length of a palindrome centered at i
func FindPManacher(s string) []int {
	// As defined above:
	// j is the index of the last largest palindrome
	// P is the array of maximal palindrome lengths
	j := 0
	P := make([]int, len(s))
	for i := range s {
		if i > j+P[j] {
			maxLen := findMaxPalindromeLen(s, i, i)
			j = i
			P[i] = maxLen
			continue
		}
		k := pivot(i, j)
		if k-P[k] < j-P[j] {
			// if the palindrome centered at the mirror extends
			// beyond the boundary of the palindrome, then the
			// palindrome at i must extend to the boundary and
			// no further
			P[i] = j + P[j] - i
		} else if k-P[k] > j-P[j] {
			// if the palindrome centered at the mirror does
			// not extend beyond the boundary of the palindrome
			// then the palindrome at i has exactly the same length
			// as its mirror counterpart
			P[i] = P[k]
		} else {
			maxLen := findMaxPalindromeLen(s, i, j+P[j]+1)
			P[i] = maxLen
			// if the palindrome centered at i extends beyond
			// the boundary of that centered at j, then update j
			if i+P[i] > j+P[j] {
				j = i
			}
		}
	}
	return P
}
Walk-through of an Example
Let us run through the code for an example string "dadccdadccd". Recall
that the string we will be operating on is actually augmented with
#; therefore, let s equal the resulting string: "#d#a#d#c#c#d#a#d#c#c#d#".
Rather than going through the entire example, we will focus, instead,
on the key steps:
start
when it encounters its first long palindrome "dadccdad"
when it encounters its longest palindrome "dccdadccd"
For (1), j = 0 and P is an array of length and capacity
equal to 23. For i = 0, since i = j + P[j], we skip the first block,
and begin by defining k = pivot(0, 0), which equals 0.
In this case, k - P[k] equals 0, and is therefore equal to
j - P[j] (the else condition applies). Here,
findMaxPalindromeLen returns 0, and nothing consequential
happens thereafter.
Going into (2), j = 7, P = [0,1,0,3,0,1,0,1,0,...,0] and
i = 8. In this case, i == j + P[j] and, skipping the first
if-block, we set k = pivot(8, 7) which equals 6. From this,
we see that k - P[k] equals 6, which equals j - P[j].
At this point, we start to find the maximal palindrome length
at i starting at position 9. In this case, the length of
the maximal palindrome is 8. We set P[8] = 8 and, since this
new palindrome extends beyond the one centered at 7, we set
j = 8.
Finally, going into (3), we have i = 13, j = 8 and
P = [0,1,0,3,0,1,0,1,8,1,0,1,0,0,...,0]. We see that i < j + P[j]
and, once again, we skip the if-block. Setting k = pivot(13, 8) which
equals 3, we then have k - P[k] equal to 0, which equals to j - P[j];
we are now executing the else-block.
At this point, we find the maximal palindrome length at 13 starting at
17; the maximal palindrome has length 9, and we set P[13] = 9. Since
i + P[i] > j + P[j], we update j = 13.
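As a quick sanity check on the walk-through, running FindPManacher on the augmented string reproduces the two key values (assuming the function above is in scope and fmt is imported):

P := FindPManacher("#d#a#d#c#c#d#a#d#c#c#d#")
fmt.Println(P[8], P[13]) // prints: 8 9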
Linearity of Manacher’s Algorithm
Looking at FindPManacher, it is not obvious that this function runs
in linear time. The interaction between the for-loop and findMaxPalindromeLen
might indicate that, in some pathological example, the algorithm runs
in quadratic time.
There are two ways that we would like to prove that FindPManacher, in fact,
has linear run-time. The first way involves a clever argument and no additional
machinery. The second way – which we will cover in another entry – involves
potential functions and amortized analysis.
Before we present the first proof, we make the observation that the
only potential source of non-linearity of Manacher’s is from comparing
characters in findMaxPalindromeLen; the other cases in the for-loop
in FindPManacher are constant time operations. Therefore, to demonstrate
that FindPManacher runs in linear time, it is enough to prove that
the total number of comparisons across all iterations is O(n).
In fact, the first proof shows that the total number of comparisons is
bounded by 2n.
The function FindPManacher runs in linear time with respect to
the length of the input string s.
Proof: Fix the string s, and consider the iterations
in FindPManacher on each character of s. Let C_i
denote the number of character comparisons that takes place during the
i^{\mathrm{th}} iteration. These character comparisons would
take place as a result of calling findMaxPalindromeLen when, for a given
centre, we compare right and left characters with respect to the center
to detect the size of the largest possible palindrome.
Recall that, as we iterate through each character, j is tracking
the index of the center of the palindrome whose right boundary extends to
the right-most position in s. That is, j + P[j] is
the largest of all known palindromes at the start of the
i^{\mathrm{th}} iteration. Let I_i denote
2n - i - (j + P[j]).
The idea is to show that (1)
I_i is strictly monotonically decreasing, with I_0 = 2n and
I_n = 1, and (2) C_i \leq I_{i - 1} - I_i for every i.
Summing (2) over all iterations telescopes to \sum_i C_i \leq I_0 - I_n < 2n,
so the total number of comparisons is bounded by 2n. (As math
and computer science treat indices differently, some tidiness is sacrificed
for the sake of bridging the two languages.)
The intuition here is that the more you compare in
the i^{\mathrm{th}} iteration, the less you need to compare in
future iterations; you may need to access the same characters multiple times
– for example in aaaab – but the total number of comparisons across all
iterations is still bounded by 2n.
To show (1), note that i increases strictly monotonically and that
j + P[j] never decreases: it only changes when i + P[i] > j + P[j]
for some i, in which case it increases. Hence I_i strictly decreases.
The fact that I_0 = 2n follows from initial values of j
and P[j]. When i = n, j + P[j] = n - 1. It
follows that I_{n} = 1.
To show (2), recall that during an iteration within FindPManacher, we do one
of the following:
(a) we retrieve existing values from P, make no
comparisons and just increment i.
(b) we make some comparisons but do not change the right-most boundary
j + P[j]
(c) we make some comparisons and we do update the right-most boundary
j + P[j]
In case (a), I_{i - 1} - I_i = 1 and C_i = 0. In
this case, C_i \leq I_{i - 1} - I_i. Good.
In case (b), I_{i - 1} - I_i = 1. We claim that we made
at most one comparison. Since we did not update j + P[j], we have
i > j, P[i] = j + P[j] - i,
and the value returned by findMaxPalindromeLen(s, i, j + P[j] + 1) is
j + P[j] - i.
Looking at the function findMaxPalindromeLen, this can only happen if either
j + P[j] + 1 \geq len(s) – in which case the number of comparisons
is 0 – or the pivot of j + P[j] + 1 about i is less than 0
– in which case, again, no comparison is made – or the comparison against
s[j + P[j] + 1] fails immediately – namely, we made a single comparison.
In all these possible cases, 1 = I_{i - 1} - I_i \geq C_i. Good.
Finally, in case (c), suppose (i + P[i]) - (j + P[j]) = c. We want
to show that c \geq C_i - 1. To see this, we know that the value returned
by findMaxPalindromeLen is P[i] by definition. This must mean that the
value of right at the end of the for-loop in
findMaxPalindromeLen(s, i, j + P[j] + 1) is i + P[i] + 1. Since right
is initialised at j + P[j] + 1, the number of iterations in the for-loop is
at most i + P[i] - (j + P[j]) + 1, each of which makes at most one
comparison. It follows that c \geq C_i - 1 and
I_{i - 1} - I_i = i - (i - 1) + c \geq 1 + (C_i - 1) = C_i
as desired.
\blacksquare
Readers familiar with potential functions in amortized analysis will note
that we effectively defined a potential function \Phi(i) = I_i
and worked around the mechanics of analysing with potential functions.
We were satisfied with how little machinery was needed to
prove the linearity of Manacher’s algorithm, and thought it a
treat to include the argument without another few pages of the theory behind
amortized analysis.
Before we end this section, let us note that, since it is not possible to
find palindromes without at least scanning the string from beginning to
end, the following is clear:
Manacher’s algorithm has optimal run-time complexity in resolving the maximal
palindrome question. \blacksquare
Acknowledgement
I want to thank the many Youtubers and authors who contributed to my
understanding of this algorithm, all of whom are represented in the
references.
The proof of linearity is my own. If similarity with your work appears here
without due reference, I apologise profusely; ideas often overlap in
shape and presentation, especially if they are very wrong or very right.
I want to reach a higher level of unit test coverage
in my code. In Python, this is often quite doable without a lot
of effort: during test time, you can set any method or function
— imported or built-in — to be its mocked equivalent. The trick
is then to keep the mocked methods organised, and there are a number of
packages that allow you to do this.
This is trickier in Go. First, it’s not possible to redefine methods
at will. For example, the following will not compile:
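(The original snippet is missing here; the following is a hypothetical stand-in that makes the same point.)

package main

import "os"

func main() {
	// Functions declared at the package level (and methods on types) cannot
	// be reassigned; the compiler rejects this assignment.
	os.Open = func(name string) (*os.File, error) {
		return nil, nil
	}
}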
Furthermore, if your code is heavy on side effects — such as modifying an external
data store by making library calls to a connector or a transaction — then some care
has to be taken to expose the library functions or methods so that assertions can be
made about the API calls. Why you would or would not want to unit test certain aspects
of your code aside, it is definitely of some interest to think about ways that would
increase unit test coverage of your application.
We introduce this pattern in the context of a common use case: opening
a file for reading. The idea here is to test as much of the code as possible. While
in general this may be somewhat wasteful, it is of practical interest to:
achieve 100% unit test coverage of code
test that the correct API calls are made to external libraries given proper inputs
exercise the critical code paths and all error scenarios to ensure that they have
been properly handled
Whatever the practical merits are for doing this, and notwithstanding the huge
effort that would pour into implementing the unit tests, knowing how to do it
would help you with more effective code design for unit level verifications.
To that end, here are some ideas.
Initial Code
Let’s open a file for reading some data. Here is one way one might do this in Go:
package data

import (
	"io"
	"os"

	"github.com/golang/glog"
	"github.com/pkg/errors"
)

type Data struct{ /* ... */ }

type DataMarshaler struct{ /* ... */ }

func (marshaler *DataMarshaler) Marshal(reader io.Reader) (*Data, error) {
	data := Data{}
	/* Magic! */
	return &data, nil
}

func ReadData(filePath string) (*Data, error) {
	// open the file for reading
	dataFile, ioErr := os.Open(filePath)
	if ioErr != nil {
		glog.Errorf("Error opening file: %v", ioErr)
		return nil, errors.Wrap(ioErr, "Error opening file.")
	}
	defer dataFile.Close()

	marshaler := DataMarshaler{ /* ... */ }
	// return the marshaler's output
	return marshaler.Marshal(dataFile)
}
To limit the scope, suppose we have very good test coverage for marshaler.
After all, the struct author thought ahead and used the io.Reader interface
as input, which allowed us to pass in a bytes.Buffer — smart! But what about
ReadData?
One possible way to test this code is to actually use files: one for each positive test case that
you want to exercise; a few invalid file paths or improperly permissioned files to generate
common IO errors for the negative test cases. Another way to test is to stub out os.Open
and pass in a substitute function instead. While there are practical advantages to the former
approach, we will instead focus on the latter strategy. Some
advantages of the latter approach are:
the test data files may be hard to generate (because they are large, those files may need
to be generated programmatically, or files are media or binary in nature)
it is easier to manage test code that pretends to be files than to manage test data files
(when your library changes, it is easier to change test code, and harder to change test
files)
the test cases may be more visible and easier to comprehend, whereby a test writer
would otherwise have to map file names to test cases, and test cases to file content
the tests are conducive to benchmarking or stress testing which may be available only via
code (TB of static data files or a streaming ByteBuffer that programmatically generates
data)
conceptually the tests are better encapsulated in the sense that all test preambles live
entirely in the test code/binary itself; the correctness of tests is wholly determined
by code and code alone
Whatever the benefits may be, os.Open is an archetypical example of a
library function whose (potentially complicated) behaviours are driven by
external states that can be mocked, and whose outputs are readily usable data structures
or interface objects. Other examples include rand.Float32, time.Now, or s3.New.
Refactoring Pattern
To mock out os.Open in the example above, we want to perform the following refactoring:
define (implicitly) dataFile to be an io.ReadCloser
create a singleton object that tracks all external functions via private fields.
In particular, the open field — a function of signature func(string) (io.ReadCloser, error)
— is assigned a thin wrapper around os.Open during struct initialisation
expose the singleton object’s open field through a package-level function open
instead of using os.Open directly in ReadData, use the wrapper function
In code this looks like:
package data

import (
	"io"
	"os"

	"github.com/golang/glog"
	"github.com/pkg/errors"
)

/* as before */

type Data struct{ /* ... */ }

type DataMarshaler struct{ /* ... */ }

func (marshaler *DataMarshaler) Marshal(reader io.Reader) (*Data, error) {
	data := Data{}
	/* ... */
	return &data, nil
}

// A singleton object that tracks all external functions
var __external__ *external_functions

type external_functions struct {
	open func(string) (io.ReadCloser, error)
	// etc
}

func EXTERNAL() *external_functions {
	if __external__ == nil {
		__external__ = &external_functions{
			// a thin wrapper around os.Open so that we are dealing
			// completely with interface objects
			open: func(filePath string) (io.ReadCloser, error) {
				return os.Open(filePath)
			},
			// etc
		}
	}
	return __external__
}

// open exposes __external__.open so that the rest of the package never
// touches the singleton directly.
func open(filePath string) (io.ReadCloser, error) {
	external := EXTERNAL()
	return external.open(filePath)
}

func ReadData(filePath string) (*Data, error) {
	// open the file for reading
	dataFile, ioErr := open(filePath)
	if ioErr != nil {
		glog.Errorf("Error opening file: %v", ioErr)
		return nil, errors.Wrap(ioErr, "Error opening file.")
	}
	defer dataFile.Close()

	marshaler := DataMarshaler{ /* ... */ }
	// return the marshaler's output
	return marshaler.Marshal(dataFile)
}
Now we can test away:
package data

import (
	"bytes"
	"io"
	"io/ioutil"
	"testing"

	"github.com/pkg/errors"
)

type MockOpen struct {
	Content  io.ReadCloser
	Error    error
	filePath string
}

func NewMockOpen(content []byte, err error) *MockOpen {
	return &MockOpen{
		Content: ioutil.NopCloser(bytes.NewBuffer(content)),
		Error:   err,
	}
}

func (mock *MockOpen) Open(filePath string) (io.ReadCloser, error) {
	mock.filePath = filePath
	return mock.Content, mock.Error
}

/* Test positive */
func TestReadData(t *testing.T) {
	testCases := []struct {
		FileName string
		Content  []byte
		Data     *Data
	}{
		{ /* ... */ },
	}

	external := EXTERNAL()
	oldOpen := external.open
	defer func() { external.open = oldOpen }()

	for _, c := range testCases {
		mock := NewMockOpen(c.Content, nil)
		external.open = mock.Open

		// verify that no error occurred in the process
		data, err := ReadData(c.FileName)
		if err != nil {
			t.Error( /* ... */ )
		}
		// verify that the correct parametre(s) is (are)
		// passed to the external function
		if mock.filePath != c.FileName {
			t.Error( /* ... */ )
		}
		// test that data generated is as expected
		if !theSame(data, c.Data) {
			t.Error( /* ... */ )
		}
	}
}

func TestReadDataError(t *testing.T) {
	external := EXTERNAL()
	oldOpen := external.open
	defer func() { external.open = oldOpen }()

	mock := NewMockOpen([]byte{}, errors.New("File open error."))
	external.open = mock.Open

	_, err := ReadData("random")
	// check that the function handles error correctly
	if err == nil {
		t.Error( /* ... */ )
	}
}
We may also want to verify that the external methods are correctly assigned
in the singleton object:
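(The original snippet is also missing here; a minimal sketch of such a check follows. Function values cannot be compared directly in Go, so we settle for verifying that the field is populated and behaves like os.Open on a bad path.)

func TestExternalDefaults(t *testing.T) {
	external := EXTERNAL()
	if external.open == nil {
		t.Error("expected the open field to be initialised")
	}
	// the default should behave like os.Open: a bad path yields an error
	if _, err := external.open("no/such/file"); err == nil {
		t.Error("expected an error opening a nonexistent file")
	}
}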
Now we’re all covered! All for the price of a little (arguably necessary)
increase in code complexity. What you have gained with this code are:
programmatic access to the content of the stubbed input data
100% test coverage (for what it’s worth)
ease of generating new test cases and more logically complete input coverage
semantically apparent test cases in the test source file itself
However, it is worth noting that there are some trade-offs. Here are a few that
come to mind:
there is at least one layer of redirection: rather than making an API call
directly, we are wrapping the library function in potentially multiple layers,
and this pattern can be confusing to someone new to the code-base
there are some performance penalties paid by creating objects of function
pointers. In isolation, the penalties are relatively scant, but in aggregate,
the penalties add up. If performance over testability is a big factor, don’t
do this.
factoring out external library functions is particularly time consuming, and
greatly impacts velocity of development in two ways: it takes some time to
plan out, and it adds some complexity to making changes. These are also
considerations for having more of your code nailed down by unit tests early
in the application dev cycle. I would lean more towards this level of
refactoring for more mature library code. However, when seasoned code needs this
level of refactoring, refactoring may be risky, especially since one of the reasons for
this type of refactoring was a dearth of test coverage. There is a very valid
question of when the optimal time is for doing this type of refactoring.
I have been doing some pretty serious Scala coding these days, all
of which really come from working with Apache Spark on this project or
that. From the very start, a term that I have not encountered
before (in 12 years or so of building things) began to surface in surplus:
monads.
Now, researching monads has been a relatively bad experience.
Here is a list of sources that I have studied and found
insufficient in one way or another:
And the list goes on, with (10) perfectly summarising the problem
at hand:
I don’t know if I’m exaggerating but it seems like every programmer
who gets monads posts a tutorial about them. (And each post begins
with: There’s already a lot of monad tutorials on the Internet,
but...)
Here is the thing: it doesn’t matter how many people who understand monads write about
them. At the time of writing this piece, I was still having trouble understanding
them. It is not that the above
expositions are poorly written. On the contrary, some of them are
quite enjoyable and brimming with individual creativity (see 7, which
is excellent!). What I find most unsatisfying is that the following
questions remained unanswered for me, either because the authors fail
to address them or because they were answered without the authority that comes
from unifying the various angles on this subject.
What do people mean when they say X is a monad? They seem to
mean things that are too different from each other for the word
to have a single focused meaning.
What is the connection with the category theoretical monad in
mathematics? There is a perfectly good definition in category
theory, and if I am worth a damn at what I do, I better be able
to explain the connection to this seemingly unrelated concept.
What is the unifying idea that relates all the monad examples:
Maybe, List, State, Future, IO, etc.? They behave
overtly differently as code constructs (especially IO and List). How are
IO or List (as monads) a “programmable semicolon”?
What problem do monads solve, specifically?
More than that, this is intentionally meant to be very mathematical. We
want a unified understanding that only comes into focus when we
bring more mathematical machinery — that of categories, functors,
and natural transformations — into the picture. We apologise in advance
that such a choice may render an otherwise practical note on
the subject rather verbose.
A Note About Notation
Most pseudocode is in Scala syntax. I find that, without its more arcane notations,
Scala is rather straightforward to read, and, for a typed language that isn’t Haskell
(about which there have been numerous complaints), it is the most
succinct at dealing with the notational demands of functional programming.
I do accept the criticism that it is meaningless to talk about monads
in a language known for imperative elements. After all,
Seq("a").map {s => sideEffects(); s} is not particularly “monadic”
since s => sideEffects(); s is not actually a “morphism” in the
functional universe. However, if any language, divorced from its
implementations, can be functional in that it is equipped with not
only all the functional features but also readable ones, Scala is
certainly right for the job, and its support for the functional
features in this lengthy exposition is more than sufficient for
our purposes: it has clear type-delineations, and very compact
notations for anonymous functions.
Finally, those unfamiliar with Scala should not fear. The following
code block demonstrates all the Scala language features essential
to understanding the remainder of this post. You may find the
embedded comments helpful:
// this is a comment

// foo is a function that accepts two arguments;
// one of type S and one of type T, returns an
// instance of type U
def foo(s: S, t: T): U = ...

// bar is a function with the same signature
// as foo
val bar: (S, T) => U = foo

// currying
val baz: S => (T => U) = {
  // takes some s and returns a function which
  // takes an instance t of T and returns
  // foo(s, t)
  s => foo(s, _)
}

// generics/templates: [U] means of class U
// morp is an instance of a List of type U, and
// it is obtained by transforming (via map) a
// list of a single item t of type T via the
// curried baz(s)
val morp: List[U] = List[T](t).map[U](baz(s))
What do people mean when they say M is a monad?
Monads as chainable, flattenable wrappers
Most prevalent explanations of monads seem to have one thing in
common: monads are wrappers, augmenters, extensions, containers that
expose some kind of “flattening” and “chaining”. In essence,
a monad is a generic class Monad[?] that exposes the following interface
patterns
Monad[T].wrap - constructs an instance of Monad[T] from a value of type T.
I will sometimes write this as wrap in the case that the context
can be inferred.
Monad[T].map[S] - takes a method with signature T => S and returns a method
with signature Monad[T] => Monad[S], sometimes written as map,
especially if the context can be inferred
Monad[Monad[T]].flatten - takes an instance of Monad[Monad[T]] and gives an
instance of Monad[T], written flatten if the context
can be inferred
such that, for any foo of signature T => S, the following hold:
Naturality of wrap For any instance t: T:
// wrapping t then mapping foo into it is the
// same as applying foo first, and wrapping that
Monad[T].map[S](foo)(wrap(t)) == Monad[S].wrap(foo(t))
Naturality of flatten For any instance dblWrapped: Monad[Monad[T]]:
// the two operations on dblWrapped are the same:
// 1. flatten dblWrapped and then map foo to it
// 2. flatten the result of lifting foo to dblWrapped
map(foo)(flatten(dblWrapped)) == flatten(map(map(foo))(dblWrapped))
Associativity of flatten For any instance
triWrapped: Monad[Monad[Monad[T]]]:
// the two operations on triWrapped are the same:
// 1. flatten the outer layer twice, outside in
// 2. flatten the inner layer first (by mapping flatten),
//    then flatten the result
flatten(flatten(triWrapped)) == flatten(map(flatten)(triWrapped))
Right inverse of flatten For any instance wrapped: Monad[T]:
// The following are all equal to wrapped itself
// 1. mapping wrap to wrapped and then flattening
// the result
// 2. apply wrap to wrapped directly, and then flatten
flatten(map(wrap)(wrapped)) == flatten(Monad[Monad[T]].wrap(wrapped)) == wrapped
In the literature (especially of the Haskell persuasion), the method
wrap is often referred to as return, while the operation that applies a method
foo: T => Monad[S]
to a Monad[T] – map followed by flatten – is referred to as bind. The definition
most common in the Haskell literature is that a monad is any type
that exposes bind and return. This definition is equivalent to our notion,
since flatten can be recovered as bind(identity[Monad[T]]).
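To make the two presentations concrete, here is a minimal Scala sketch of the interface; the trait and its names are mine, not those of any particular library.

// A hand-rolled monad interface: wrap/map/flatten on one side,
// return/bind on the other.
trait MyMonad[M[_]] {
  def wrap[T](t: T): M[T]                       // a.k.a. return
  def map[T, S](m: M[T])(foo: T => S): M[S]
  def flatten[T](mm: M[M[T]]): M[T]

  // bind is map followed by flatten...
  def bind[T, S](m: M[T])(foo: T => M[S]): M[S] = flatten(map(m)(foo))
  // ...and flatten is recovered as bind applied to the identity
  def flattenViaBind[T](mm: M[M[T]]): M[T] = bind(mm)(identity[M[T]])
}

// the List instance, for concreteness
object ListMonad extends MyMonad[List] {
  def wrap[T](t: T): List[T] = List(t)
  def map[T, S](m: List[T])(foo: T => S): List[S] = m.map(foo)
  def flatten[T](mm: List[List[T]]): List[T] = mm.flatten
}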
As the name wrap may suggest, when I code up monads in practice,
I think of them as predominantly wrappers. However, those familiar with
the Maybe monad in Haskell may take issue with that, since Nothing is an
instance of Maybe that can’t be easily thought of as wrapping
anything in particular. Hence “predominantly”. However, the intuition is
mostly correct. Monads conjure up the idea of containers because, in most
cases, container types such as List and Set are monads.
However, if one were to ask for the most general understanding,
monads are adapters that expose the above methods as a way to substitute
a more semantically appropriate type for the original one.
For example, Option[T] (the Scala equivalent of the Haskell Maybe) is used in the place of T
whenever null may occur during the course of some computation purported
to return T. Using Option is in many ways better than checking for null
explicitly because Option allows for compile-time checking that the null
case is handled appropriately and promotes flexibility by allowing the
null-case to be handled at a time of the developer’s choosing.
The flexibility afforded by Option[T] in handling the
null case is really attributable to the monadic map. The method
handles the null-case within the abstraction layer, allowing the
developer to work with an instance of Option[T] as if it were that
of T, deferring any null-case handling till necessary.
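A small illustration of that deferral, with a hypothetical findUser standing in for some computation that might otherwise return null:

// a lookup that may fail, returning Option[String] instead of null
def findUser(id: Int): Option[String] =
  if (id == 42) Some("douglas") else None

// keep working "as if" we had the String, deferring the None case...
val greeting: Option[String] = findUser(42).map(name => s"hello, $name")

// ...until the one place where we decide what absence should mean
println(greeting.getOrElse("hello, stranger"))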
In general, a more semantically appropriate type is always preferred to a
lesser one souped up with additional code, because whenever a library or a language
enforces semantics at the syntactic level, the compiler can
make semantic verifications which would otherwise be possible
only at runtime; compilable code is thereby more robust, or it does not pass
muster at the compilation stage.
Understanding monads as semantic modifiers of types makes sense for Maybe,
List, and Future. How about for IO and State? In this way,
monads in practice actually bear two meanings: one as containers, the
other as vessels for functional execution. I must admit that this was one major
gap for me to overcome — that the term “monad” is really
a theoretical construct that unifies two distinct patterns in practice.
To see how these two patterns are related requires a theoretical viewpoint.
Before writing at length about that, let me talk about the second
code pattern found only in functional programming and attributed
— rightly so — to monads.
Monads as “programmable semicolons”
Pure functions have no side effects; that is, pure functions are referentially
transparent in that any
expression can be replaced by its value without altering the behavior of the
programme. For example, the following is referentially transparent:
// the following is the same as
//
// "hello world!"
//
({ x: List[String] => x.reduce(_ + _) })(List[String]("hello", " world", "!"))
while the following is not
// the following is NOT the same as
//
// val str = "hello world!"
//
// because of the output to stdout,
// the HTTP request to google to
// search for "yodel", and writing the
// response string to stdout
List[String]("hello", " world", "!").reduce { (x, y) =>
  println(x, y)
  Http(url("http://bit.ly/1TfvCeN") OK as.String).foreach(println)
  x + y
}
How can you modify state that is external to the application (such as via IO
functions) and still maintain referential transparency? (Is
this even possible for applications that have external dependent
processes whose states can also be affected by other sources?)
One way to enforce RT is by mandating that a book-keeping object
(say, an IOTranscript type) persist for the duration of the application
to represent the external state change affected by the application.
// now we are (sort of) referentially
// transparent! If you find the following
// code ugly, you're meant to...
val kitchenSink: Tuple4[String,
                        IOTranscript,
                        HTTPCall,
                        IOTranscript] =
  List[String]("hello", " world", "!")
    .fold(("", IOTranscript.empty, HTTPCall.empty, IOTranscript.empty)) { (x, y) =>
      val stdoutXY: IOTranscript = PrintLn(x._1, y)
      val googleCall: HTTPCall = new HTTPCall("http://bit.ly/1TfvCeN")
      val stdoutGoogleResponse: IOTranscript = PrintLn(googleCall.response)
      (x._1 + y, x._2 ++ stdoutXY, x._3 ++ googleCall, x._4 ++ stdoutGoogleResponse)
    }
val str: String = kitchenSink._1
Barring the obvious problems — that the code doesn’t handle
failures gracefully, that the code is monstrous because
there are all these extraneous IOTranscript and HTTPCall
transcript objects that we don’t much care about, not to
mention all the extra code that we would have to write to
wrap dispatch.Http and println — there is one major flaw with
this: assignments like the three vals we just made inside the fold are not strictly
referentially transparent!
If one tries to eliminate the assignments by (say) moving their
construction directly into the returned tuple, one would find
that we cannot get around the fact that the instance of HTTPCall
is needed to create the second PrintLn. While it is possible to
eliminate all assignments in the previous code block, some nontrivial
refactoring seems to be in order.
However, a much cleaner way to achieve this would be to design an
abstraction that:
keeps State during all those IO calls
encapsulates transcript objects except the ones we care about
we can use to maintain referential transparency
looks a lot cleaner
That abstraction layer is IO (and more generally, the State monad):
// how's this for a slice of fried gold?
val strIO: IO[String] =
  List[String]("hello", " world", "!").fold(IO[String].wrap("")) { (x, y) =>
    x.flatMap[IOTranscript] {
      // now I'm an IO[IOTranscript]
      _ => x.map(PrintLn(_, y))
    }.map[HTTPCall] {
      // now I'm an IO[HTTPCall]
      _ => new HTTPCall("http://bit.ly/1TfvCeN")
    }.map[IOTranscript] {
      // now I'm an IO[IOTranscript] again
      googleCall => PrintLn(googleCall.response)
    }.flatMap[String] {
      // and now I'm flattened from
      // IO[IO[String]]
      _ => x.map(_ + y)
    }
  }
What just happened? And how is that referentially transparent? Fried gold
or not, the code looks cleaner, and it seems to encapsulate those darn
transcript objects almost to the point of obscurity. As for referential transparency,
the map and, by extension, flatMap methods of
IO know how to keep a record of all the return values of the
function arguments. At each iteration of map/flatMap, the IO object is
recording state via the transcript objects, and the resulting object strIO is
functionally equivalent to each of its composite IO invocations, thereby
preserving the functional purity of the program.
While the hand-wavy colloquy above
is not a “proof” of referential transparency, readers who seek rigor are now
equipped with enough to verify this for themselves.
(What if I told you that Scala out of the box provides sugaring for the above:
the for comprehension? A sketch follows.)
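Here is a hedged sketch of what the sugared pipeline might look like, reusing the same hypothetical IO, PrintLn and HTTPCall types from above; the compiler desugars the for block into a map/flatMap chain much like the one written by hand.

val strIO: IO[String] =
  List[String]("hello", " world", "!").fold(IO[String].wrap("")) { (x, y) =>
    for {
      acc        <- x                                         // IO[String]
      _          <- IO[IOTranscript].wrap(PrintLn(acc, y))    // record stdout
      googleCall <- IO[HTTPCall].wrap(new HTTPCall("http://bit.ly/1TfvCeN"))
      _          <- IO[IOTranscript].wrap(PrintLn(googleCall.response))
    } yield acc + y
  }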
In this way, the methods map and flatMap are binding together what would
otherwise be imperative statements. Thus originates a second prevalent way
of thinking about monads: as programmable semicolons.
In many ways, the pattern
is quite elegant. However, the elegance is framed in a restrictive
programming paradigm. The imperative programmer would ask “why
do this to yourself?” The merits of functional programming are well known,
and deserve no more mention here. Anyone interested in the topic can
find plenty of references online and on Amazon.
Is there some conceptual connection between these two views?
The two views: a wrapper and the programmable semicolon. How do we connect
them? It is difficult to say that they are part of the same phenomenon
or that they share a conceptual ancestor. It has been suggested to me quite
vehemently that the semicolon notion is ultimately the unifying idea;
that at the root of it, whenever we are using monads, we are really doing
one thing, and that is to define the semantic context under which operations
are performed. In the case of State or IO, this is especially apparent.
In the case of List, this is less so, but a bit of thinking on it would
almost convince me that there is a semantic context in this case, and it is that of
iteration. Whenever we use List, we are in some sense saying to the
executor: “How you execute these functions (via map) is to perform them
on each object in this collection.”
In this sense, wrap is a way to embed some object into that context,
and flatten is a way to realize nested contexts as a (less nested)
context. If we look at the
wikipedia article
on this subject, the phrase
to build pipelines that process data in
steps, in which each action is decorated with additional processing
rules provided by the monad
captures this idea well. Especially regarding monads like List or
Option, where their map method is enabling this construction of
sequences of steps by defining how each step should be applied to
the underlying objects, if any.
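Concretely, in the List case:

// wrap embeds a value into the "iteration" context, flatten collapses a
// nested context, and map says "apply this step to whatever is inside"
val wrapped: List[Int]       = List(3)                   // wrap
val nested:  List[List[Int]] = List(List(1, 2), List(3))
val flat:    List[Int]       = nested.flatten            // flatten
val stepped: List[Int]       = flat.map(_ + 1)           // one "step" per element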
What problem do monads solve specifically?
We have already seen that monads arose naturally as an elegant way
to allow for IO operations in pure functional programming
languages. However, what about imperative languages?
I cannot take credit for this answer, which is really one that
my coworker Brett
provided. Paraphrasing, “monadic pattern is useful for designing
libraries.”
Thinking on it, it makes a lot of sense. For languages that support
generics (otherwise — in a language like Go, say — one cannot
reasonably speak of this pattern), it is really useful to design
data structures in the form of a monad. I’m merely speculating here,
but the notion of a monad must have played a key role in the design
and implementation of the Apache Spark project, where the
RDD
abstract class implements parallelize (the wrap equivalent),
map, and flatMap to specifically leverage the power of monads.
We have definitely seen how monads simplify code. However, I have
more than just succinctness in mind. Instead, I want to speculate
and analyze what monads provide along the following dimensions:
Delegating Control
Libraries that strike a good
balance between conceding and retaining control are often robust
without diminishing reuse. Let us look at the Java Collection
library, for example, in the following buggy code that prunes “bad names”
out and replicates “good names”:
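(The original buggy snippet is missing; the following is a hypothetical reconstruction of the kind of code the paragraph describes.)

List<String> names = getLotsOfNames(somehow);
for (String name : names) {
    if (badName(name)) {
        // structurally modifying the list we are iterating over triggers
        // a ConcurrentModificationException on the next iterator step
        names.remove(name);
    } else if (goodName(name)) {
        // likewise; and with an index-based loop instead of an iterator,
        // appending while scanning can keep the loop going forever
        names.add(name);
    }
}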
There are two problems: infinite loops and the infamous concurrent modification
exception. But what caused these problems? What we really want to do is
operate on the elements themselves, and not the list. Modifying the
list directly is what one might try as a first pass, and it is permitted because the list
exposes its mutating methods. If, instead, the library allowed only map and flatten
(rolled up as flatMap), then the code might go something like this:
List<String> names = getLotsOfNames(somehow);
names = names.flatMap(
    // using the Java 8 Lambda expression,
    // not as clean as I would have liked
    (String name) ->
        goodName(name)
            ? new ArrayList<String>(Arrays.asList(new String[]{name, name}))
            : badName(name)
                ? new ArrayList<String>()
                : new ArrayList<String>(Arrays.asList(new String[]{name})));
Efficiency aside, the code is now more robust, because only the ability
to access the elements is needed, and that level of access is encouraged.
This doesn’t prevent you from modifying names during the flatMap stage,
but in most use cases the developer’s first instinct is to write
element-level functions rather than attempting to modify the iterated list.
Promoting readability
As this is my personal speculation, I hesitate to speak without
qualifications: don’t believe this if you have good evidence not to.
Code is more readable if there are fewer forks. Code is more readable
if the forks are more local (within the same method, or definitely
within the same file). We have already pointed to the fact that Option
defers handling of the null case (and even ensures that it is handled),
which maximizes readability by allowing the developer to choose the most convenient place
to handle such a branching.
What about other monads? The List monad’s map can be thought of
as functionally equivalent to:
// applying func to the collection list: List[T]
func =>
  val result: List[T] = List()
  for (i <- list) {
    result += func(i)
  }
  result
There isn’t any trouble with this version per se, except that there
are lines here that deal specifically with iterating, which can be
replaced with a single method that speaks to the function of the
code.
To moderate this point of view, the map-flatten-wrap pattern
often requires familiarity on the developer’s part. The vocabulary is at times
subtle — what does it mean to map an Option? In truth, much
of the streamlined feel of monads is due to the developers
themselves, who must sometimes expend no insignificant amount of
energy to structure or refactor their code to fit the pattern.
Whether or not this effort is worthwhile is subject to the discretion
of the engineer.
Simplifying Reasoning
This is as speculative as the last: what makes for ease of
reasoning is domain isolation. What I mean by this is, if two pieces
of code do exactly the same thing, but one requires the developer
to keep less of the logic in her mind, then that code is easier to
reason about. This definition is rather naïve, but allows me to
make my point: monads simplify reasoning by allowing the developer
to substitute an object for a semantically appropriate wrapper,
but still operate on the wrapper as if it were the original object.
The advantage is twofold. First, semantically more appropriate wrappers
simplify reasoning because the quirks of working with the original type
are now encapsulated within
the type information. Second, because the wrapper abstraction layer
is thin and allows for a nearly transparent way to work with
the original type, one needs only to deal with the logic of
processing the underlying object.
As an example, consider the scenario of retrieving an object from a
map, first without Option:
// retrieves grades from some kind of record
Grade getGrade(String name, Semester semester)
        throws NoSemesterGradesException, NoSuchStudentException {
    Map<String, Record> records = getAllGrades();
    if (records.containsKey(name)) {
        Record record = records.get(name);
        if (record.grades.containsKey(semester)) {
            return record.grades.get(semester);
        }
        throw new NoSemesterGradesException(semester);
    }
    throw new NoSuchStudentException("No such student: " + name);
}

...

// processing the record
try {
    Grade sonGrade = getGrade("George-Michael Bluth", new Semester(2009, "Fall"));
    process(sonGrade);
} catch (NoSuchStudentException | NoSemesterGradesException ex) {
    alertClient();
}
and then with Option:
// this is Scala
def getGrade(name: String, semester: Semester): Option[Grade] = {
  val grades: Map[String, Record] = getGrades()
  grades.get(name).flatMap(_.grades.get(semester))
}

...

getGrade("George-Michael Bluth", new Semester(2009, "fall")) match {
  case Some(record) => process(record)
  case None         => alertClient()
}
The two examples illustrate the main difference between
the two approaches: the first forces the developer to deal
explicitly with the possibility of keys not being present in a Map, and
with what should happen when a key is not present. Without
a type like Option, one is forced to handle the different
cases explicitly, forcing the developer to reason not only
about how to process a record to retrieve the grade, but
also about how to handle the various pathological cases that may arise.
The choice to throw exceptions adds additional complexity,
since now the developer must also keep track of how to
handle these exceptions whenever getGrade is called.
In effect, the pathological cases in getGrade have escaped the
encapsulation of the method via the exceptions.
The semantic change to Option addresses these issues
perfectly by pushing the responsibility of book-keeping the
pathological cases to the compiler. Via flatMap, we need
only to focus on retrieving the grade object from the record
with some syntactic tweaks (made even simpler via for
comprehension). The word
“uncomplecting”
comes to mind: Option uncomplected the getGrade logic
from the pathological cases, and both the author of getGrade and the
consumer of the method no longer need to know that the student’s
record may be absent or the student’s grade for a particular
semester may be missing.
What is the connection with the monad of category theory?
In the last few frames, I would like to explain, mostly to
myself and to those who are curious, what the connection is with the
category-theoretical monad.
I was pleased to learn that there is actually a rather elegant adaptation
of monads to homological algebra, which I will not be going into — though
I will allow myself to say that it is an endofunctor on the suitable abelian
category exhibiting the monadic laws. The following is, I would say, not
technical, but may be more mathematically theoretical and require some prior
familiarity with category theory.
The concept of a category was developed around the middle of the 20th century by the mathematicians
Eilenberg and Mac Lane to formalize the relationship between geometric and
topological objects and algebraic structures like modules over rings. They
wanted to express the still nascent idea that geometry and algebra were
actually intricately related by examining and comparing the properties of maps
— functions that respected the underlying mathematical structures —
between geometric and algebraic objects respectively. Their discovery would
revolutionize mathematics, and the framework that they invented to present
their results evolved into present-day abstract category theory, an important
tool in the repertoire of many research mathematicians today. Later, the language of
category theory was adapted to present the theory of \lambda-calculus
(introduced by Alonzo Church), the
theoretical beginning of the functional programming paradigm.
So what is a category? There are several very good expositions that answer
the question better than I would hope to:
wikipedia (free), Mac Lane’s
Categories for the Working Mathematician, and
Awodey’s Category Theory. Roughly speaking,
categories are classifications. A category is composed of objects and relations,
where the relations are often interpreted as specific mappings from one object to
another, and they satisfy very specific laws: closed under composition, associative,
and equipped with units.
For example, one can define
the category of numbers (where the objects are real numbers), and the
relation between numbers x and y is the difference
between them: x - y. Or the category of words (sequences of
letters) on the English alphabet, where the relations between words
A and B are words that begin with A
and end with B. In the latter example, there are infinitely many relations
between “abc” and “cde”: “abcde” and “abccde” are two obvious examples.
These examples are rather contrived, and in practice, examples abound. However
most of these are beyond our scope.
To formalize the study of categories — how categories relate to each other,
and what are the properties of those relations — category theorists introduced the notions of
functors and
natural transformations.
The language is versatile enough to describe very concrete
types of objects, but is also capable of capturing the mathematical essence
of very abstract concepts to provide the backbone for modern algebraic geometry.
The application of category theory to computer science is in defining
a suitable category of data types, i.e. a category whose objects are data
structures, and whose morphisms between data structures S and
T are functions with signature S \Rightarrow T.
The category of types must satisfy the following criteria
Terminal Objects - There is a data type \mathbf{Unit} to which every other type
maps uniquely.
Closed under finite products - If S and T are types, then there
is a product type subsuming S and T. (The term “product type”
is actually derived from the notion of products
of two objects in category theory.)
Closed under exponents - If S and T are types, then
there is a type to represent functions with signature S \Rightarrow T.
The above criteria describe the defining characteristics of the so-called
Cartesian Closed Category.
The type system that we use in practice for typed languages is usually a
variation of this, subject to the natural constraints and limitations of compilers,
interpreters and runtime engines. In practice, we would want objects like arrays and
dictionaries, so modern type systems are also equipped with an “arbitrary coproduct”
requirement; type systems for such languages are thereby equipped with “sum types”
— from the notion of sums or “coproducts”.
(For example, an array of type T is actually an arbitrary sum type of all
finite tuples of type T.)
What is the connection between math monads and code monads?
In the last section, we hand-wavingly discussed the category of types as the
category whose objects are data types (a.k.a. classes) and whose morphisms
are the single-input functions. To save on exposition, let
\mathbf{Types} represent this category.
In category theory, a monad is a (covariant) endofunctor
M: \mathcal{C} \to \mathcal{C} together with two natural
transformations \eta: \mathrm{Id} \to M and
\mu: M^2 \to M such that for any object A in
\mathcal{C}:
\mu_{A} \circ \mu_{M(A)} = \mu_{A} \circ M(\mu_{A}) as
natural transformations from M^3 \to M
\mu_A \circ \eta_{M(A)} = \mu_{A} \circ M(\eta_A) =
1_{M(A)}, where 1_{M(A)} is the identity natural
transformation from M \to M
At the outset, it is unclear how the above relates to the monad commonly
discussed in blogs and forums. However, let’s unravel the definition in the
context of the category of types.
What is an endofunctor M on \mathbf{Types}? It is an
association from one type T to another type M(T)
such that for all morphisms \varphi: T \to S, there exists a
morphism M(\varphi): M(T) \to M(S), preserving composition, associativity
and unity. In code-speak: an endofunctor is a way to associate a type T with another
type M[T] that is equipped with a way to “lift” a function T => S to
M[T] => M[S]. This latter requirement is satisfied precisely by map,
which is actually a function of type (T => S) => (M[T] => M[S]).
How about the bit about natural transformations?
In \mathbf{Types}, the natural transformation \mathrm{Id}
\to M consists of morphisms — one for each type S —
w(S): S \to M(S) such that for each map \varphi: T \to S,
the composition
w(S) \circ \varphi: T \to S \to M(S)
is equal to
M(\varphi) \circ w(T): T \to M(T) \to M(S).
Similarly, the natural transformation M^2 \to M consists of
morphisms f(S): M(M(S)) \to M(S) such that, for each morphism
\varphi: T \to S, the composition
f(S) \circ M(M(\varphi)): M(M(T)) \to M(M(S)) \to M(S)
is equal to
M(\varphi) \circ f(T): M(M(T)) \to M(T) \to M(S).
So what do these mean in terms of code? As mentioned in the previous section,
morphisms are single-parameter functions. As such, saying that there are
two natural transformations is the same as saying for each type T, there
are specific single-parameter functions with signatures T => M[T] and
M[M[T]] => M[T]. These will correspond to the wrap and flatten methods
respectively. The fact that these are natural transformations means that
the functions must satisfy some additional criteria, namely:
for any given function foo: T => S, the function map(foo)(wrap(_))
(of signature T => M[S]) is the same as wrap(foo(_)). This is
the naturality of wrap.
for any given function foo: T => S, map(foo)(flatten(_)) (of signature
M[M[T]] => M[S]) is the same as flatten(map(map(foo))(_)). This is the
naturality of flatten.
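As a quick concrete check of these two conditions, here is a small sketch using Scala's Option as a stand-in for M, with wrap taken to be Some(_) and flatten taken to be Option's own flatten:

object OptionNaturality extends App {
  val foo: Int => String = _.toString

  // Naturality of wrap: map(foo)(wrap(x)) == wrap(foo(x)).
  assert(Some(42).map(foo) == Some(foo(42)))

  // Naturality of flatten: map(foo)(flatten(xx)) == flatten(map(map(foo))(xx)).
  val xx: Option[Option[Int]] = Some(Some(42))
  assert(xx.flatten.map(foo) == xx.map(_.map(foo)).flatten)
}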
Finally, it is a matter of interpreting the two monad laws, which correspond
exactly to the criteria that we have mentioned:
For any instance triWrapped: M[M[M[T]]], flatten(flatten(triWrapped)) is the same as
flatten(map(flatten)(triWrapped)). This is the associativity law
\mu_{A} \circ \mu_{M(A)} = \mu_{A} \circ M(\mu_{A}).
For any instance wrapped: M[T], both flatten(wrap(wrapped)) and
flatten(map(wrap)(wrapped)) are equal to wrapped itself. This is the unit law
\mu_A \circ \eta_{M(A)} = \mu_{A} \circ M(\eta_A) = 1_{M(A)}.
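Putting these together, here is a minimal Scala sketch of the whole package, with both laws checked on sample values. Box is again my hypothetical container (redefined here so the snippet stands alone), and MyMonad is a sketch of the pattern, not any particular library's trait.

trait MyMonad[M[_]] {
  def wrap[T](t: T): M[T]                // eta: Id => M
  def flatten[T](mmt: M[M[T]]): M[T]     // mu: M^2 => M
  def map[T, S](f: T => S): M[T] => M[S] // the functor part
}

case class Box[T](value: T)

object BoxMonad extends MyMonad[Box] {
  def wrap[T](t: T): Box[T] = Box(t)
  def flatten[T](mmt: Box[Box[T]]): Box[T] = mmt.value
  def map[T, S](f: T => S): Box[T] => Box[S] = bt => Box(f(bt.value))
}

object MonadLawsCheck extends App {
  import BoxMonad._

  // Associativity: flatten(flatten(x)) == flatten(map(flatten)(x)).
  val triWrapped: Box[Box[Box[Int]]] = Box(Box(Box(42)))
  assert(flatten(flatten(triWrapped)) ==
         flatten(map((bb: Box[Box[Int]]) => flatten(bb))(triWrapped)))

  // Unit laws: flatten(wrap(x)) == x == flatten(map(wrap)(x)).
  val wrapped: Box[Int] = Box(42)
  assert(flatten(wrap(wrapped)) == wrapped)
  assert(flatten(map((t: Int) => wrap(t))(wrapped)) == wrapped)
}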
Afterword - Why is it so hard to get a straight answer about monads?
Having gone through all this math and code, I thought I’d reclaim
a bit of my own space and speculate.
I think the titular question is a tough one to answer, and the following
is speculation at best. A monad is a very abstract
concept, and people who aren’t used to that level of conceptual
abstraction may explain it poorly. Even when experts understand something
very well, e.g. \infty-categories, they may not always be able
to give a succinct and meaningful answer to the questions asked of them
about the subject, whether explicitly or, as is often the case, implicitly.
Monads seem to be tangible because, thanks to Scala and Haskell, they
have concrete representations that illustrate the fundamental elements
and help demonstrate how specific monads may be implemented and used in
practice. However, a fish is not a cod, and a monad is not a Maybe
or an IO, no matter how clearly these examples are explained. What
the examples seek to illustrate is a general pattern that is then obfuscated
by one or two odd examples.
The problem is further compounded by the fact that most colloquial
understanding of monads seems specific to Haskell, and those who
write tutorials explaining the subject for Haskell have a different audience
in mind: someone who develops in Haskell need not be a type theorist,
and definitely need not have studied category theory. Fortunately
for me, the classes in Haskell (Monad, Functor, etc) are
consistent with their namesakes in type theory. Unfortunately, though,
the tutorials rely on very different concepts than those familiar to
their imperative counterparts.
Having finally gotten some handle on the subject, I believe the acute problem
is that monads in practice actually capture two outwardly
different patterns: one of a container/wrapper, and one of semantic binding
for functional expressions. There is a tenuous link between these two
notions if one moves past what happens to the underlying data structures
and focuses instead on the syntactic appearance. The true link is in type theory,
which is rarely discussed (if at all) in programming treatments of the lambda calculus.
With that bridge conspicuously absent from programming discourse, any hope of
a coherent narrative for “monad” is largely lost.
In trying to verify II.3.7 in [H], we came upon a discussion which I would like to summarize here:
Let k be a field. Then any finitely generated k-algebra R with finitely many prime ideals has dimension 0. (In general, a ring R with finitely many prime ideals is not necessarily Artinian.) The case here follows from the following facts.
1) Noether normalization: any finitely generated k-algebra R is finite (as a module) over a polynomial subring k[x_1,...,x_n] for some n. For the case where k is infinite, see [AM, Ex 5.16]; for the general case, see [E, Sect. 8.2.1].
2) k[x] (a PID and thus a UFD) necessarily has infinitely many prime ideals, since it has infinitely many irreducible polynomials, each of which generates a prime. (The Euclid-style argument used to prove this is due to EK: if there were only finitely many irreducible polynomials f_1,...,f_k, consider 1 + \prod_{i=1}^{k} f_i, which is either irreducible or has an irreducible factor not among f_1,...,f_k.)
It follows from (1) and (2) that if R is finite over k[x_1,...,x_n] and R has only finitely many primes, then n = 0 and R is finite over k: since the extension k[x_1,...,x_n] \subseteq R is integral, every prime of k[x_1,...,x_n] is the contraction of a prime of R, so k[x_1,...,x_n] would have only finitely many primes as well, which by (2) forces n = 0. That is, R is a finite dimensional vector space over k.
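As a trivial concrete instance of the statement (my own toy example): take R = k[x]/(x^2). Its only prime ideal is (x), and indeed
\[\dim_k R = 2 < \infty, \qquad \dim R = 0.\]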
In this way, let B be any domain and A a finitely generated B-algebra that is a domain. If A \otimes_B K(B) is a K(B)-algebra with finitely many primes, then by Noether normalization A \otimes_B K(B) is finite over K(B)[x_1,\dots,x_n]; by the above, n = 0, so it must be integral over K(B). That is, A \otimes_B K(B) is a field, equal to K(A).
In preparation for what is to come, we prove that for A and B as above, there exists some multiplicative set S \subset B such that A \otimes_B S^{-1}B is a finite S^{-1}B-module.
To see this, notice that A is a finitely generated B-algebra. Write A = B[x_1,...,x_n]/I for some ideal I in the polynomial ring with coefficients in B. Certainly, each x_i is an element of K(A) and satisfies some monic polynomial with coefficients in K(B): \[x_i^{N_i} + \frac{g_{i,1}}{f_{i,1}}\,x_i^{N_i - 1} + \cdots + \frac{g_{i,N_i}}{f_{i,N_i}} = 0, \qquad g_{i,j}, f_{i,j} \in B,\ f_{i,j} \neq 0.\]
Let S be the multiplicative subset of B generated by the f_{i,j}. It is plain that each x_i is integral over S^{-1}B, and A \otimes_B S^{-1}B is finite over S^{-1}B as a module, generated by the monomials x_1^{m_1} \cdots x_n^{m_n} with m_1 + \cdots + m_n < N_1 + \cdots + N_n (hence finitely many).
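For a toy illustration of this localization trick (my own example, not taken from the references): take B = \mathbb{Z} and A = \mathbb{Z}[x]/(2x - 1) \cong \mathbb{Z}[1/2]. Over K(B) = \mathbb{Q}, the generator x satisfies the monic polynomial \[x - \tfrac{1}{2} = 0,\] so the only denominator that appears is 2. Taking S = \{1, 2, 4, \dots\}, we get S^{-1}B = \mathbb{Z}[1/2] and A \otimes_B S^{-1}B \cong \mathbb{Z}[1/2], which is a finite (indeed free of rank one) S^{-1}B-module, even though A itself is not a finite \mathbb{Z}-module.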
Chern classes are a collection of invariants in differential and algebraic geometry. In the analytic setting, Chern classes are characteristic invariants of vector bundles on smooth complex manifolds; in algebraic geometry, Chern classes are invariants of locally free coherent sheaves over projective schemes. For the latter, when the schemes are \mathbb{C}-schemes, such coherent sheaves are the algebraic analogues of vector bundles, and, via duality theories in the respective domains, the two notions of Chern class agree.
In general, when it comes to manifolds, there are a lot of parallels between analytic and algebraic techniques. Whether or not it is the aim of these two camps to cooperate, understanding how to transition from one to the other certainly deepens one's understanding of manifold theory. In view of this, I aim to answer the following: what is an intuitive way to understand why the two approaches to Chern classes are the same?
The exposition is broken roughly into the following sections:
Chern classes of locally free coherent sheaves over projective schemes
Chern classes of complex vector bundles over smooth manifolds
Intersection theory
Dictionary: bridging the gap
These sections will not correspond to entries: I intend to split this exposition into manageable chunks to write. The references will be composed of bits and pieces from [BoTu], [F], [GrHa], and [V]. If other references arise, I will make these explicit.
I would like to give thanks to AM and VD for innumerable conversations that attempt to break down the language barriers between geometers and a stubborn algebraist such as myself. I would also like to thank CW for the conversations that motivated this side project.
Section 1 – Locally Free Coherent Sheaves of Projective Schemes
The story of Chern classes began for me with the Intersection Theory appendix of [H]: Let X be a projective \mathbb{k}-scheme. For simplicity, assume \mathbb{k} = \overline{\mathbb{k}}. There is a unique way to associate to each locally free coherent sheaf \mathcal{E} on X a formal polynomial c_t(\mathcal{E}) = \sum_{i \geq 0} c_i(\mathcal{E})t^i with c_i(\mathcal{E}) \in A^i(X) such that
Functoriality. For any morphism f: Y \rightarrow X and any locally free coherent sheaf \mathcal{E} on X, we have c_t(f^*\mathcal{E}) = f^*c_t(\mathcal{E}).
Normality. For any line bundle \mathcal{E} = \mathcal{L}(D), we have that c_t(\mathcal{E}) = 1 + Dt.
Additivity. For any short exact sequence \[0 \longrightarrow \mathcal{E}\,' \longrightarrow \mathcal{E} \longrightarrow \mathcal{E}\,'' \longrightarrow 0,\] we have c_t(\mathcal{E}) = c_t(\mathcal{E}\,')c_t(\mathcal{E}\,'').
The statement is proven in [Gr1]. Here, the Chern classes c_i(\mathcal{E}) are defined by \[\sum_{i = 0}^r (-1)^i\,\pi^*c_i(\mathcal{E})\cdot\xi^{r - i} = 0,\] where \xi is the generator of A^*(\mathbb{P}(\mathcal{E})) as a free module over A^*(X). This is the unique assignment among characteristic classes satisfying functoriality, normality, and additivity. In fact, [Gr1] offers an explanation, albeit an unintuitive one, of how the geometric Chern class is the same as its algebraic counterpart.
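As a sanity check of the normality axiom in the simplest case (a spelled-out example of my own): take X = \mathbb{P}^n and \mathcal{E} = \mathcal{O}(1) = \mathcal{L}(H), where H is a hyperplane. Then
\[c_t(\mathcal{O}(1)) = 1 + Ht,\]
so c_1(\mathcal{O}(1)) is the hyperplane class in A^1(\mathbb{P}^n), and all higher Chern classes vanish since \mathcal{O}(1) has rank 1.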
In the following, we briefly introduce the notions necessary to define and understand Chern classes in the algebraic setting.
1.1 Projectivization of Locally Free Coherent Sheaves
For the remainder, let \mathcal{E} be a locally free coherent sheaf on a scheme X. That is, there exist affine open subsets \{U_i \subseteq X\} covering X for which \mathcal{E}|_{U_i} = \tilde{M_i}, where M_i is a free \mathcal{O}_X(U_i)-module and \tilde{M_i} is the sheaf of \mathcal{O}_{U_i}-modules associated to M_i (see [H] 2.5).
Let S(\mathcal{E}) be the symmetric algebra on \mathcal{E}: \[S(\mathcal{E}) = \bigoplus_{i \geq 0} S^i(\mathcal{E}).\] Locally, it sends an open subset U of X to S(\mathcal{E}(U)) = \bigoplus_{i \geq 0} S^i(\mathcal{E}(U)). One can check via ([H] Ex 2.1.22) that S(\mathcal{E}) is a sheaf on X. In particular, locally at least, the module is graded by symmetric power (and you may wish to verify that the grading is preserved along the transition maps). It is easy to see that for an affine open U = \mathrm{Spec}\,R of X, the generators of \Gamma(U, \mathcal{E}) generate the symmetric algebra S(\mathcal{E}(U)) over R.
For example, let \mathcal{O}(1) be the hyperplane bundle over \mathbb{P}^n = \mathbf{Proj}\,\mathbb{k}[x_0,\dots,x_n]; then locally over U_i = D_+(x_i), \mathcal{O}(1) is generated by x_i over R_i = \mathbb{k}[x_0,\dots,x_n]_{x_i}, and S(\mathcal{O}(1)(U_i)) is the graded ring R_i[x_it]. The transition maps of S(\mathcal{E}) from U_i to U_j are obtained from the (degree 0) ring map which sends x_it \in R_i[x_it] to x_jt\cdot x_i/x_j \in R_j[x_jt].
At this point, we construct the projective fibre bundle from the sheaf S(\mathcal{E}), which is a scheme \mathbb{P}(\mathcal{E}) together with a map \pi: \mathbb{P}(\mathcal{E}) \rightarrow X such that the fibre over each p \in X is \mathbf{Proj}\,S(\mathcal{E}_p). The construction is exactly that of ([H], p 160), applied to S(\mathcal{E}):
Construction: to each affine open subset U of X on which \mathcal{E} is trivial, associate the projective space \mathbf{Proj}\,S(\mathcal{E}(U)); this is well-defined since S(\mathcal{E}(U)) is graded (see [H] p 76-77). This gives a family of projective schemes \{P_U = \mathbf{Proj}\,S(\mathcal{E}(U))\,|\,U \subset X\textrm{ is affine open and trivializes }\mathcal{E}\}. The end goal is to use the gluing lemma ([H], Ex 2.2.12). To do so, we take as our covering collection the affine open subsets of X on which \mathcal{E} is trivial, and for every pair U, V in this collection we must exhibit open subschemes P_{UV} \subset P_U and P_{VU} \subset P_V together with isomorphisms P_{UV} \simeq P_{VU}. In our case, notice that over U on which \mathcal{E} is trivial, P_U = \mathbb{P}^{n - 1} \times U, where n is the rank of \mathcal{E}. With this in mind, let i_U (resp. i_V) be the inclusion of U \cap V into U (resp. V). Then set P_{UV} = \mathbb{P}^{n - 1} \times i_U(U \cap V) and P_{VU} = \mathbb{P}^{n - 1} \times i_V(U \cap V). It is easy to see that the conditions of ([H], Ex 2.2.12) are satisfied. From the gluing we obtain a scheme \mathbb{P}(\mathcal{E}) together with a map \pi: \mathbb{P}(\mathcal{E}) \rightarrow X such that the fibre over each point p of X is a projective scheme.
Example. If \mathcal{E} is trivial, then \mathbb{P}(\mathcal{E}) = \mathbb{P}^{n - 1} \times X, where n (assuming X connected) is the (global) rank of the bundle.
What about \mathbb{P}(\mathcal{O}(1)) on \mathbb{P}^n? On each U_i = D_+(x_i), we have a trivial line bundle, hence \pi^{-1}(U_i) = \mathbb{P}^0 \times U_i. One might think that \mathbb{P}(\mathcal{O}(1)) = \mathbb{P}(\mathcal{O}) = \mathbb{P}^n. In fact, this is not true: no trivialization exists for \mathbb{P}(\mathcal{O}(1)). To see this, it is enough to show that there are no constant nonzero sections of \mathbb{P}(\mathcal{O}(1)); notice that
I recently encountered an IMO question that’s stated as follows: Given a function f: \mathbb{R} \rightarrow \mathbb{R} such that for any x, y \in \mathbb{R}, f(x + y) \leq yf(x) + f(f(x)), show that f(x) = 0 for all x \leq 0.
There are two solutions, and I will present the one that AM and I came up with: one can easily verify that f(x) \leq f(f(x)) (spelled out after the list below). Let f^2(x) = f(f(x)). We will proceed by showing the following:
f(x) \leq f^2(0)
f(x) \leq 0
f^2(0) = 0
At which point, the result should be fairly straightforward.
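The easy verification mentioned above is just the substitution y = 0 in the hypothesis:
\[f(x) = f(x + 0) \leq 0 \cdot f(x) + f(f(x)) = f(f(x)).\]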
To show (1), consider f(f(x)) = f(0 + f(x)) \leq f(x)f(0) + f^2(0) (taking the base point to be 0 and y = f(x) in the given inequality). From this, we have that f(x - f(0)) \leq -f(0)f(x) + f^2(x) \leq -f(0)f(x) + f(0)f(x) + f^2(0) = f^2(0). As x is arbitrary, (1) follows.
For (2), let y > 0, and consider f(x) = f(x - y + y) \leq yf(x - y) + f^2(x - y). As y is positive, we have that yf(x - y) \leq -y^2f(x) + yf(f(x)), so f(x) \leq -y^2f(x) + yf^2(x) + f^2(x - y). Via (1), we have that f^2(x) \leq f^2(0) and f^2(x - y) \leq f^2(0), which in turn implies that f(x) \leq -y^2f(x) + (1 + y)f^2(0).
But this implies (1 + y^2)f(x) \leq (1 + y)f^2(0), which in turn implies that
f(x) \leq \frac{1 + y}{1 + y^2}f^2(0).
As y may be taken arbitrarily large and \frac{1 + y}{1 + y^2} \to 0 as y \to \infty, we conclude that f(x) \leq 0: this proves (2).
To prove (3), notice that for any x, f(x) = f(x - 1 + 1) \leq f(x - 1) + f(f(x - 1)), and use (1) on each term: f(x) \leq 2f^2(0) for any x. In particular, taking x = f(0) gives f^2(0) \leq 2f^2(0), so f^2(0) \geq 0. Together with (2) (applied at x = f(0)), we have f^2(0) = 0.
Finally, as f^2(0) \leq f^3(0) = f(f^2(0)) = f(0), we have 0 \leq f(0); together with (2), f(0) = 0. Now observe that 0 = f(0) = f(x - x) \leq -xf(x) + f(f(x)) = -xf(x) + f^2(x). Since f^2(x) \leq 0 by (2), this gives xf(x) \leq 0; for x < 0 this forces f(x) \geq 0, which together with (2) yields f(x) = 0 for all x \leq 0, as desired.