Monads
26 Nov 2015

I have been doing some pretty serious Scala coding these days, all of which really comes from working with Apache Spark on this project or that. From the very start, a term that I had not encountered before (in 12 years or so of building things) began to surface in surplus: monads.
Now, researching monads has been a relatively bad experience. Here is a list of sources that I have studied and found insufficient in one way or another:
- Wiki Article on Monads
- Stack Overflow - Haskell
- Haskell Wiki
- More Monad and Haskell
- Mark Toberdorf’s Paper
- Python and Monad
- Monads, Haskell and Pictures
- Multi-Part Monads
- Maybe Monad and Javascript
- Monads and C++
And the list goes on, with (10) perfectly summarising the problem at hand:
I don’t know if I’m exaggerating but it seems like every programmer who gets monads posts a tutorial about them. (And each post begins with: There’s already a lot of monad tutorials on the Internet, but...)
Here is the thing: it doesn’t matter how many people who understand monads write about them. At the time of writing this piece, I was still having trouble understanding them. It is not that the above expositions are poorly written. On the contrary, some of them are quite enjoyable and beaming with individual creativity (see 7, which is excellent!). What I find most unsatisfying is that the following questions remained unanswered for me, either because the authors failed to address them or because they were answered without the authority that comes from unifying the various angles on this subject.
- What do people mean when they say X is a monad? They seem to mean things too different from each other for the word to have a single focused meaning.
- What is the connection with the category-theoretical monad in mathematics? There is a perfectly good definition in category theory, and if I am worth a damn at what I do, I had better be able to explain the connection to this seemingly unrelated concept.
- What is the unifying idea that relates all the monad examples? Maybe, List, State, Future, IO, etc. They behave overtly differently as code constructs (esp. IO and List). How are IO or List (as monads) a “programmable semicolon”?
- What problem do monads solve, specifically?
More than that, this piece is intentionally meant to be very mathematical. We want a unified understanding that only comes into focus when we bring more mathematical machinery — that of categories, functors, and natural transformations — into the picture. We apologise here that any such choice renders an otherwise practical note on the subject rather verbose.
A Note About Notation
Most pseudocode is in Scala syntax. I find that, without its more arcane notations, Scala is rather straightforward to read, and for a typed language that isn’t Haskell (about which there have been numerous complaints), it is the most succinct at dealing with the notational demands of functional programming.
I do accept the criticism that it is meaningless to talk about monads in a language known for imperative elements. After all,

Seq("a").map { s => sideEffect(); s }

is not particularly “monadic”, since s => sideEffect(); s is not actually a “morphism” in the functional universe. However, if any language, divorced from its implementations, can be functional in that it is equipped with not only all the functional features but also readable ones, Scala is certainly right for the job, and its support for the functional features in this lengthy exposition is more than sufficient for our purposes: it has clear type delineations and very compact notations for anonymous functions.
Finally, those unfamiliar with Scala should not fear. The following code block demonstrates all the Scala language features essential to understanding the remainder of this post. You may find the embedded comments helpful:
// this is a comment
// foo is a function that accepts two arguments;
// one of type S and one of type T, returns an
// instance of type U
def foo(s: S, t: T): U = ...
// bar is a function with the same signature
// as foo
val bar: (S, T) => U = foo
// currying
val baz: S => (T => U) = {
// takes some s and returns a function which
// takes an instance t of T and returns
// foo(s, t)
s => foo(s, _: T)
}
// generics/templates: [U] means of class U
// morp is an instance of a List of type U, and
// it is obtained by transforming (via map) a
// list of a single item t of type T via the
// curried baz(s)
val morp: List[U] = List[T](t).map[U](baz(s))
What do people mean when they say M is a monad?
Monads as chainable, flattenable wrappers
Most prevalent explanations of monads seem to have one thing in common: monads are wrappers, augmenters, extensions, or containers that expose some kind of “flattening” and “chaining”. In essence, a monad is a generic class Monad[?] that exposes the following interface patterns:
- Monad[T].wrap - construct an instance of Monad[T] from an instance of type T. I will sometimes write this as wrap when the context can be inferred.
- Monad[T].map[S] - takes a method with signature T => S and returns a method with signature Monad[T] => Monad[S]; sometimes written as map, especially if the context can be inferred.
- Monad[Monad[T]].flatten - take an instance of Monad[Monad[T]] and get an instance of Monad[T]; written flatten if the context is clear.
Such that, for any foo of signature T => S:

- Naturality of wrap. For all instances t: T:
// wrapping t then mapping foo into it is the
// same as applying foo first, and wrapping that
Monad[T].map[S](foo)(wrap(t)) ==
Monad[S].wrap(foo(t))
- Naturality of flatten. For all instances dblWrapped: Monad[Monad[T]]:
// the two operations on dblWrapped are the same:
// 1. flatten dblWrapped and then map foo to it
// 2. flatten the result of lifting foo to dblWrapped
map(foo)(flatten(dblWrapped)) ==
flatten(map(map(foo))(dblWrapped))
- Distributivity of map over flatten. For any instance triWrapped: Monad[Monad[Monad[T]]]:

// the two operations on triWrapped are the same:
// 1. flatten triWrapped twice
// 2. map flatten over triWrapped, then flatten the result
flatten(flatten(triWrapped)) ==
flatten(map(flatten)(triWrapped))
- Right inverse of flatten. For any instance wrapped: Monad[T]:

// the following are all equal to wrapped itself:
// 1. mapping wrap over wrapped and then flattening
//    the result
// 2. wrapping wrapped directly, and then flattening
flatten(map(wrap)(wrapped)) ==
flatten(Monad[Monad[T]].wrap(wrapped)) ==
wrapped
In the literature (especially of the Haskell persuasion), the method wrap is often referred to as return, while, for foo: T => Monad[S], the method

def bind(f: T => Monad[S], m: Monad[T]) = flatten(map(f)(m))

is often referred to as bind. The definition most common in the Haskell literature is that a monad is any type that exposes bind and return. This definition is equivalent to ours, since flatten can be recovered as bind(identity[Monad[T]]).
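To make the interface concrete, here is a minimal Scala sketch of the pattern (the names wrap, map, and flatten follow this post’s conventions rather than any standard library), with bind derived from map and flatten:

import scala.language.higherKinds

// a minimal sketch of the interface above; names follow this
// post's conventions, not a standard library
trait Monad[M[_]] {
  def wrap[T](t: T): M[T]
  def map[T, S](foo: T => S): M[T] => M[S]
  def flatten[T](mm: M[M[T]]): M[T]

  // bind, derived from map and flatten
  def bind[T, S](f: T => M[S]): M[T] => M[S] =
    m => flatten(map(f)(m))
}

// an instance for List, to make the pattern tangible
object ListMonad extends Monad[List] {
  def wrap[T](t: T): List[T] = List(t)
  def map[T, S](foo: T => S): List[T] => List[S] = _.map(foo)
  def flatten[T](mm: List[List[T]]): List[T] = mm.flatten
}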
As the name wrap may suggest, when I code up monads in practice, I think of them predominantly as wrappers. However, those familiar with the Maybe monad in Haskell may take issue with that, since Nothing is an instance of Maybe that can’t easily be thought of as wrapping anything in particular. Hence “predominantly”. Still, the intuition is mostly correct. Monads conjure up the idea of containers because in most cases, container types such as List and Set are monads.
However, if one were to ask for the most general understanding, monads are adapters that expose the above methods as a way to substitute a more semantically appropriate type for the original one.
For example, Option[T] (the Scala equivalent of the Haskell Maybe) is used in place of T whenever null may occur during the course of some computation purported to return T. Using Option is in many ways better than checking for null explicitly, because Option allows for compile-time checking that the null case is handled appropriately, and promotes flexibility by allowing the null case to be handled at a time of the developer’s choosing.
The flexibility afforded by Option[T] in handling the null case is really attributable to the monadic map. The method handles the null case within the abstraction layer, allowing the developer to work with an instance of Option[T] as if it were an instance of T, deferring any null-case handling until necessary.
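As a quick sketch of this deferral (lookupUser here is a hypothetical helper):

// hypothetical helper: may or may not find a user
def lookupUser(id: Int): Option[String] =
  if (id == 42) Some("george-michael") else None

// transform the wrapped value as if it were a plain String;
// the None case is threaded through automatically by map
val shouted: Option[String] = lookupUser(42).map(_.toUpperCase)

// the absence is handled only when the value is finally needed
println(shouted.getOrElse("no such user"))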
In general, a more semantically appropriate type is always preferable to a lesser one souped up with additional code, because whenever a library or a language enforces semantics at a syntactic level, the compiler can make semantic verifications that would otherwise only be possible at runtime; code that compiles is thereby more robust, and code that is not does not pass muster at the compilation stage.
Understanding monads as semantic modifiers of types makes sense for Maybe, List, and Future. How about for IO and State? In this way, monads in practice actually bear two meanings: one as containers, the other as vessels for functional execution. I must admit that this was one major gap for me to overcome — that the term “monad” is really a theoretical construct unifying two distinct patterns in practice. To see how these two patterns are related requires a theoretical viewpoint. Before writing at length about that, let me talk about the second code pattern, found only in functional programming and attributed — rightly so — to monads.
Monads as “programmable semicolons”
Pure functions have no side effects; that is, pure functions are referentially transparent in that any expression can be replaced by its value without altering the behavior of the programme. For example, the following is referentially transparent:
// the following is the same as
//
// "hello world!"
//
({x: List[String] => x.reduce(_ + _)})
(List[String]("hello", " world", "!"))
while the following is not
// the following is NOT the same as
//
// val str = "hello world!"
//
// because of the output to stdout,
// the HTTP request to google to
// search for "yodel", and writing the
// response string to stdout
List[String]("hello", " world", "!").reduce {
x, y =>
println(x, y)
Http(url("http://bit.ly/1TfvCeN")
OK as String).foreach(println)
x + y
}
How can you modify state that is external to the application (such as via IO functions) and still maintain referential transparency? (Is this even possible for applications that have external dependent processes whose states can also be affected by other sources?) One way to enforce referential transparency is by mandating that a book-keeping object (say, an IOTranscript type) persist for the duration of the application to represent the external state change effected by the application.
// now we are (sort of) referentially
// transparent! If you find the following
// code ugly, you're meant to...
val kitchenSink: Tuple4[String,
                        IOTranscript,
                        HTTPCall,
                        IOTranscript] =
  List[String]("hello", " world", "!").foldLeft(
    ("",
     IOTranscript.empty,
     HTTPCall.empty,
     IOTranscript.empty)) {
    (x, y) =>
      val stdoutXY: IOTranscript =
        PrintLn(x._1, y)
      val googleCall: HTTPCall =
        new HTTPCall("http://bit.ly/1TfvCeN")
      val stdoutGoogleResponse: IOTranscript =
        PrintLn(googleCall.response)
      (x._1 + y,
       x._2 ++ stdoutXY,
       x._3 ++ googleCall,
       x._4 ++ stdoutGoogleResponse)
  }
val str: String = kitchenSink._1
Barring the obvious problems — the code doesn’t handle failures gracefully, and it is monstrous because of all these extraneous IOTranscript and HTTPCall transcript objects that we don’t much care about, not to mention all the extra code we would have to write to wrap dispatch.Http and println — there is one major flaw: assignments like the ones we just made (stdoutXY and friends) are not strictly referentially transparent!

If one tries to eliminate the assignments by (say) moving their construction into the returned tuple, one finds that we cannot get around the fact that the instance of HTTPCall is needed to create the second PrintLn. While it is possible to eliminate all assignments in the previous code block, some nontrivial refactoring seems to be in order.
However, a much cleaner way to achieve this would be to design an abstraction that:
- keeps State during all those IO calls
- encapsulates transcript objects except the ones we care about
- we can use to maintain referential transparency
- looks a lot cleaner
That abstraction layer is IO (and more generally, the State monad):
// how's this for a slice of fried gold?
val strIO: IO[String] =
  List[String]("hello", " world", "!").foldLeft(
    IO[String].wrap("")) {
    (x, y) =>
      x.flatMap[IOTranscript] {
        // now I'm an IO[IOTranscript]
        _ => x.map(PrintLn(_, y))
      }
      .map[HTTPCall] {
        // now I'm an IO[HTTPCall]
        _ => new HTTPCall(
          "http://bit.ly/1TfvCeN")
      }
      .map[IOTranscript] {
        // now I'm an IO[IOTranscript] again
        googleCall =>
          PrintLn(googleCall.response)
      }
      .flatMap[String] {
        // and now I'm flattened from
        // IO[IO[String]]
        _ => x.map(_ + y)
      }
  }
What just happened? And how is that referential transparency? Fried gold or not, the code looks cleaner, and it seems to encapsulate those darn transcript objects almost to the point of obscurity. As for referential transparency: the map and (by extension) flatMap methods of IO know how to keep a record of all the return values of their function arguments. At each invocation of map/flatMap, the IO object records state via the transcript objects, and the resulting object strIO is functionally equivalent to each of its composite IO invocations, thereby preserving the functional purity of the program. While the hand-wavy colloquy above is not a “proof” of referential transparency, readers who seek rigor are now equipped with enough to verify this for themselves.
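For the curious, here is one minimal way such a transcript-keeping IO could be implemented — a Writer-flavored sketch under this post’s naming conventions, not a real library’s API:

// each IO action produces a value together with the transcript
// of the effects it represents
final case class IO[A](run: () => (List[String], A)) {
  def map[B](f: A => B): IO[B] =
    IO { () =>
      val (transcript, a) = run()
      (transcript, f(a))
    }
  def flatMap[B](f: A => IO[B]): IO[B] =
    IO { () =>
      val (t1, a) = run()
      val (t2, b) = f(a).run()
      (t1 ++ t2, b) // transcripts concatenate in order
    }
}

object IO {
  def wrap[A](a: A): IO[A] = IO(() => (Nil, a))
  // a PrintLn-style action: records the line in the transcript
  def printLn(s: String): IO[Unit] = IO(() => (List(s), ()))
}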
(What if I told you that Scala, out of the box, provides sugaring for the above:
val strIO: IO[String] =
  List[String]("hello", " world", "!").foldLeft(
    IO[String].wrap("")) {
    (x, y) =>
      for {
        printXY <- x.map(PrintLn(_, y))
        googleCall <-
          IO[HTTPCall].wrap(
            new HTTPCall(
              "http://bit.ly/1TfvCeN"))
        printSearchResult <-
          PrintLn(googleCall.response)
      }
      yield printSearchResult.flatMap {
        _ => x.map(_ + y)
      }
  }
It is quite a neat language.)
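For-comprehensions are pure sugar: the compiler rewrites each <- into a flatMap call, with the final generator becoming a map. Schematically, using the transcript-keeping IO sketch from earlier:

// for { a <- ioA; b <- ioB(a) } yield f(a, b)
// desugars into:
// ioA.flatMap { a => ioB(a).map { b => f(a, b) } }
val greeting: IO[String] =
  IO.wrap("hello").flatMap { a =>
    IO.printLn(a).map { _ => a + " world" }
  }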
In this way, the methods map and flatMap bind together what would otherwise be imperative statements. Thus originates a second prevalent way of thinking about monads: as programmable semicolons.
In many ways, the pattern is quite elegant. However, the elegance is framed within a restrictive programming paradigm. The imperative programmer would ask, “why do this to yourself?” The merits of functional programming are well known and deserve no further mention here; anyone interested in the topic can find plenty of references online and on Amazon.
Is there some conceptual connection between these two views?
The two views: a wrapper and the programmable semicolon. How do we connect them? It is difficult to say that they are part of the same phenomenon or that they share a conceptual ancestor. It has been suggested to me quite vehemently that the semicolon notion is ultimately the unifying idea; that at the root of it, whenever we are using monads, we are really doing one thing: defining the semantic context under which operations are performed. In the case of State or IO, this is especially apparent. In the case of List, this is less so, but a bit of thinking on it would almost convince me that there is a semantic context in this case too, and it is that of iteration. Whenever we use List, we are in some sense saying to the executor: “How you execute these functions (via map) is to perform them on each object in this collection.”
In this sense, wrap is a way to embed some object into that context, and flatten is a way to realize nested contexts as a (less nested) context. If we look at the wikipedia article on this subject, the phrase

to build pipelines that process data in steps, in which each action is decorated with additional processing rules provided by the monad

captures this idea well, especially regarding monads like List or Option, where the map method enables this construction of sequences of steps by defining how each step should be applied to the underlying objects, if any. The short sketch below makes this reading concrete for List.
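A minimal illustration of List as the context of iteration: each step declares what happens to one element, and the monad supplies the “for each element in the collection” semantics:

// each step describes what to do with one element; the List
// context applies it across the whole collection
val pairs: List[(Int, Int)] =
  List(1, 2, 3).flatMap { x =>
    List(10, 20).map { y => (x, y) }
  }
// pairs == List((1,10), (1,20), (2,10), (2,20), (3,10), (3,20))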
What problem do monads solve specifically?
We have already seen that monads arose naturally as an elegant way to allow for IO operations in pure functional programming languages. However, what about imperative languages?
I cannot take credit for this answer, which is really one that my coworker Brett provided. Paraphrasing: “the monadic pattern is useful for designing libraries.”
Thinking on it, this makes a lot of sense. For languages that support generics (otherwise — in a language like Go, say — one cannot reasonably speak of this pattern), it is really useful to design data structures in the form of a monad. I am merely speculating here, but the notion of a monad must have played a key role in the design and implementation of the Apache Spark project, where SparkContext.parallelize (the wrap equivalent) and the RDD abstract class’s map and flatMap specifically leverage the power of monads.
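To see the resemblance, here is a sketch against the standard Spark API (the SparkContext sc is assumed to be set up already):

// wrap-equivalent: lift a local collection into the RDD context
val words = sc.parallelize(Seq("hello", "world"))

// map: lifts String => Int to RDD[String] => RDD[Int]
val lengths = words.map(_.length)

// flatMap: map and flatten rolled into one
val chars = words.flatMap(_.toSeq)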
We have definitely seen how monads simplify code. However, I have more than just succinctness in mind. Instead, I want to speculate and analyze what monads provide along the following dimensions:
Delegating Control
Libraries that strike a good balance between conceding and allowing control are often robust without diminishing reuse. Let us look at the Java Collections library, for example, in the following buggy code meant to prune out “bad names” and replicate “good names”:
List<String> names = getLotsOfNames(somehow);
for(int i = 0; i < names.size(); ++i) {
String name = names.get(i);
if (goodName(name)) {
// uh oh, infinite loop???
names.add(name);
} else if (badName(name)) {
// concurrent modification??
names.remove(name);
}
}
There are two problems: infinite loops and the infamous concurrent-modification exception. But what caused them? What we really want to do is operate on the elements themselves, not on the list. Making changes to the list, as in the first-pass attempt above, is permitted only because the library exposes its mutating methods. If, instead, the library allowed only map and flatten (rolled up as flatMap), then the code might go something like this:
List<String> names = getLotsOfNames(somehow);
names = names.flatMap(
// using the Java 8 Lambda expression,
// not as clean as I would have liked
(String name) ->
goodName(name)?
new ArrayList<String>(
Arrays.asList(
new String[] {name, name})) :
badName(name)?
new ArrayList<String>() :
new ArrayList<String>(
Arrays.asList(
new String[] {name}));
Efficiency aside, the code is now more robust, because only access to the elements is needed, and that level of access is what the library encourages. This doesn’t prevent you from modifying names during the flatMap stage, but in most use cases the developer’s first instinct is to write element-level functions rather than attempt to modify the iterated list.
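For comparison, Scala’s standard immutable collections expose exactly this restricted interface, so the same logic reads as follows (goodName and badName are the same hypothetical predicates):

// prune-and-replicate against Scala's immutable List
val cleaned: List[String] = names.flatMap { name =>
  if (goodName(name)) List(name, name) // replicate good names
  else if (badName(name)) Nil          // prune bad names
  else List(name)                      // keep everything else
}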
Promoting readability
As this is my personal speculation, I hesitate to state it without qualification: don’t believe this if you have good evidence not to. Code is more readable if there are fewer forks, and more readable still if the forks are local (within the same method, or certainly within the same file). We have already pointed to the fact that Option defers handling of the null case (and even ensures that it is handled), which maximizes readability by allowing the branching to be handled at the most convenient place.
What about other monads? The List monad’s map can be thought of as functionally equivalent to:
// applying func: T => S to the collection list: List[T]
func => {
  val result = scala.collection.mutable.ListBuffer[S]()
  for (i <- list) {
    result += func(i)
  }
  result.toList
}
There isn’t any trouble with this version per se, except that there are lines here that deal specifically with iterating, which can be replaced with a single method that speaks to the function of the code.
To moderate this point of view: the map-flatten-wrap pattern often requires developer familiarity. The vocabulary is at times subtle — what does it mean to map an Option? In truth, much of the streamlined feeling of monads is due to the developers themselves, who must expend a sometimes not insignificant amount of energy structuring or refactoring their code to fit the pattern. Whether or not this effort is worthwhile is subject to the discretion of the engineer.
Simplifying Reasoning
This is as speculative as the last: what makes for ease of reasoning is domain isolation. What I mean is this: if two pieces of code do exactly the same thing, but one requires the developer to keep less of the logic in her mind, then that code is easier to reason about. This definition is rather naïve, but it allows me to make my point: monads simplify reasoning by allowing the developer to substitute a semantically appropriate wrapper for an object, while still operating on the wrapper as if it were the original object.

The advantage is twofold. First, semantically more appropriate wrappers simplify reasoning because the quirks of working with the original type are now encapsulated within the type information. Second, because the wrapper abstraction layer is thin and allows for a nearly transparent way to work with the original type, one needs only to deal with the logic of processing the underlying object.
As an example, consider the scenario of retrieving an object from a map, first without Option:
// retrieves grades from some kind of record
Grade getGrade(String name, Semester semester)
  throws NoSemesterGradesException, NoSuchStudentException
{
  Map<String, Record> records = getAllGrades();
  if (records.containsKey(name)) {
    Record record = records.get(name);
    if (record.grades.containsKey(semester)) {
      return record.grades.get(semester);
    }
    throw new NoSemesterGradesException(semester);
  }
  throw new NoSuchStudentException("No such student: " + name);
}
...
// processing the record
try {
  Grade sonGrade = getGrade("George-Michael Bluth",
                            new Semester(2009, "Fall"));
  process(sonGrade);
} catch (NoSuchStudentException|NoSemesterGradesException ex) {
  alertClient();
}
and then with Option:
// this is Scala
def getGrade(name: String, semester: Semester):
    Option[Grade] = {
  val records: Map[String, Record] = getAllGrades()
  records.get(name)
    .flatMap(_.grades.get(semester))
}
...
getGrade("George-Michael Bluth",
         new Semester(2009, "Fall")) match {
  case Some(grade) => process(grade)
  case None => alertClient()
}
The two examples illustrate the main difference between the two approaches: one forces the developer to deal with the possibility of keys not being present in a Map, and also with what happens when a key is not present. Without a type like Option, one is forced to handle the different cases explicitly, reasoning not only about how to process a record to retrieve the grade, but also about the various pathological cases that may arise. The choice to throw exceptions adds additional complexity, since now the developer must also keep track of how to handle these exceptions whenever getGrade is called. In effect, the pathological cases in getGrade have escaped the encapsulation of the method via the exceptions.
The semantic change to Option addresses these issues by pushing the responsibility of book-keeping the pathological cases to the compiler. Via flatMap, we need only focus on retrieving the grade object from the record, with some syntactic tweaks (made even simpler via for-comprehensions; see below). The word “uncomplecting” comes to mind: Option uncomplected the getGrade logic from the pathological cases, and neither the author of getGrade nor the consumer of the method needs to know that the student’s record may be absent, or that the student’s grade for a particular semester may be missing.
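For completeness, here is the same lookup written with a for-comprehension, which the compiler desugars into the flatMap chain above:

// the same lookup; each <- desugars into flatMap/map
def getGrade(name: String, semester: Semester): Option[Grade] =
  for {
    record <- getAllGrades().get(name)
    grade  <- record.grades.get(semester)
  } yield grade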
What is the connection with the monad of category theory?
In these last few frames, I would like to explain, mostly to myself and to those who are curious, what the connection is with the category-theoretical monad. I was pleased to learn that there is actually a rather elegant adaptation of monads to homological algebra, which I will not go into — though I will allow myself to say that it is an endofunctor on a suitable abelian category exhibiting the monadic laws. The following is, I would say, not technical, but it may be more mathematically theoretical and requires some prior familiarity with category theory.
The concept of a category was developed around the mid-20th century by the mathematicians Eilenberg and Mac Lane to formalize the relationship between geometric and topological objects and algebraic structures like modules over rings. They wanted to express the then-nascent idea that geometry and algebra were intricately related, by examining and comparing the properties of maps — functions that respect the underlying mathematical structures — between geometric and algebraic objects respectively. Their discovery would revolutionize mathematics, and the framework that they invented to present their results evolved into present-day abstract category theory, an important tool in the repertoire of many research mathematicians today. Later, the language of category theory was connected to Alonzo Church’s \lambda-calculus, the theoretical beginning of the functional programming paradigm.
So what is a category? There are several very good expositions that answer the question better than I could hope to: wikipedia (free), Mac Lane’s Categories for the Working Mathematician, and Awodey’s Category Theory. Roughly speaking, categories are classifications. A category is composed of objects and relations, where the relations are often interpreted as specific mappings from one object to another, and they satisfy very specific laws: closed under composition, associative, and equipped with units.
For example, one can define the category of numbers (where the objects are real numbers), and the relation between numbers x and y is the difference between them: x - y. Or the category of words (sequences of letters) on the English alphabet, where the relations between words A and B are words that begin with A and end with B. In the latter example, there are infinitely many relations between “abc” and “cde”: “abcde” and “abccde” are two obvious examples. These examples are rather contrived, and in practice, examples abound; however, most of them are beyond our scope.
To formalize the study of categories — how categories relate to each other, and what are the properties of those relations — category theorists introduced the notions of functors and natural transformations. The language is versatile enough to describe very concrete types of objects, but is also capable of capturing the mathematical essence of very abstract concepts to provide the backbone for modern algebraic geometry.
The application of category theory to computer science is in defining a suitable category of data types, i.e. a category whose objects are data structures, and whose morphisms between data structures S and T are functions with signature S \Rightarrow T.
The category of types must satisfy the following criteria:

- Terminal object - there is a data type \mathbf{Unit} to which every other type maps uniquely.
- Closed under finite products - if S and T are types, then there is a product type subsuming S and T. (The term “product type” is in fact derived from the notion of the product of two objects in category theory.)
- Closed under exponents - if S and T are types, then there is a type representing functions with signature S \Rightarrow T.

The above criteria describe the defining characteristics of the so-called Cartesian Closed Category. The type system we use in practice for typed languages is usually a variation of this, subject to the natural constraints and limitations of compilers, interpreters, and runtime engines. In practice, we want objects like arrays and dictionaries, so modern type systems are also equipped with “sum types” — from the notion of sums, or “coproducts”, in category theory. (For example, an array of type T is actually an arbitrary sum type of all finite tuples of type T.)
What is the connection between math monads and code monads?
In the last section, we hand-wavingly discussed the category of types as the category whose objects are data types (a.k.a. classes) and whose morphisms are the single-input functions. To save on exposition, let \mathbf{Types} represent this category.
In category theory, a monad is a (covariant) endofunctor M: \mathcal{C} \to \mathcal{C} together with two natural transformations \eta: \mathrm{Id} \to M and \mu: M^2 \to M such that for any object A in \mathcal{C}:
- \mu_{A} \circ \mu_{M(A)} = \mu_{A} \circ M(\mu_{A}), as natural transformations M^3 \to M
- \mu_A \circ \eta_{M(A)} = \mu_{A} \circ M(\eta_A) = 1_{M(A)}, where 1_{M(A)} is the identity morphism on M(A) (the component at A of the identity natural transformation M \to M)
At the outset, it is unclear how the above relates to the monad commonly discussed in blogs and forums. However, let’s unravel the definition in the context of the category of types.
What is an endofunctor M on \mathbf{Types}? It is an association from each type T to another type M(T), such that for every morphism \varphi: T \to S there exists a morphism M(\varphi): M(T) \to M(S), preserving composition, associativity, and unity. In code-speak: a monad is a way to associate a type T with another type M[T], equipped with a way to “lift” a function T => S to M[T] => M[S]. This latter requirement is satisfied precisely by map, which is actually a function of type (T => S) => (M[T] => M[S]).
How about the bit about natural transformations? In \mathbf{Types}, a natural transformation \mathrm{Id} \to M is a family of morphisms — one for each type S — w(S): S \to M(S), such that for each map \varphi: T \to S, the composition

w(S) \circ \varphi: T \to S \to M(S)

is equal to

M(\varphi) \circ w(T): T \to M(T) \to M(S).
Furthermore, a natural transformation M^2 \to M is a family of morphisms f(S): M(M(S)) \to M(S), such that for each morphism \varphi: T \to S, the composition

M(\varphi) \circ f(T): M(M(T)) \to M(T) \to M(S)

is the same as

f(S) \circ M(M(\varphi)): M(M(T)) \to M(M(S)) \to M(S).
So what do these mean in terms of code? As mentioned in the previous section, morphisms are single-parameter functions. As such, saying that there are two natural transformations is the same as saying that for each type T, there are specific single-parameter functions with signatures T => M[T] and M[M[T]] => M[T]. These correspond to the wrap and flatten methods respectively. The fact that these are natural transformations means that the functions must satisfy some additional criteria, namely:
- for any given function foo: T => S, the function map(foo)(wrap(_)) (of signature T => M[S]) is the same as wrap(foo(_)). This is the naturality of wrap.
- for any given function foo: T => S, map(foo)(flatten(_)) (of signature M[M[T]] => M[S]) is the same as flatten(map(map(foo))(_)). This is the naturality of flatten.
Finally, it is a matter of interpreting the two laws, which correspond exactly to the criteria that we have mentioned:
- For any instance triWrapped: Monad[Monad[Monad[T]]]:
flatten(flatten(triWrapped)) ==
flatten(map(flatten)(triWrapped))
- For any instance wrapped: Monad[T]:
flatten(map(wrap)(wrapped)) ==
flatten(Monad[Monad[T]].wrap(wrapped)) ==
wrapped
Afterword - Why is it so hard to get a straight answer about monads?
Having gone through all this math and code, I thought I’d reclaim a bit of my own space and speculate. I think the titular question is a tough one to answer, and the following is speculation at best. Monad is a very abstract concept, and people who aren’t used to that level of conceptual abstraction may explain it poorly. Even when experts understand something very well — \infty-categories, say — they may not always be able to give a succinct and meaningful answer to the questions explicitly and implicitly asked of them about the subject.
Monads seem tangible because, thanks to Scala and Haskell, they have concrete representations that illustrate the fundamental elements and help demonstrate how specific monads may be implemented and used in practice. However, a fish is not a cod, and a monad is not a Maybe or an IO, no matter how clearly these examples are explained. What the examples seek to illustrate is a general pattern, which is then obfuscated by one or two odd examples.
The problem is further compounded by the fact that most colloquial understanding of monads seems specific to Haskell, and those who contributed tutorials on the subject for Haskell have a different audience in mind: someone who develops in Haskell need not be a type theorist, and definitely need not have studied category theory. Fortunately for me, the classes in Haskell (Monad, Functor, etc.) are consistent with their namesakes in type theory. Unfortunately, though, the tutorials rely on concepts very different from those familiar to their imperative counterparts.
Having finally gotten some handle on the subject, I believe the acute problem is that monads in practice capture two outwardly different patterns: one of container/wrapper, one of semantic binding for functional expressions. There is a tenuous link between these two notions if one moves past what happens to the underlying data structures and focuses instead on the syntactic appearance. The true link is in type theory, which is rarely discussed (if at all) alongside lambda calculus in programming circles. With that bridge conspicuously absent from programming discourse, any hope of a coherent narrative for “monad” is lost.