I love Haskell. My first encounter with Haskell started out about eight years ago. Like many people in those days, when I was in high school I spent a lot of time playing around with code on my computer. Reading and understanding open source projects was a main source of knowledge and inspiration for me when I was learning how to program. When I came upon the bzip2 homepage and consequently Julian Seward’s homepage I found a short note about Haskell and how it was a super fun and interesting language to write a compiler for. Haskell? What’s that?
After reading more about Haskell, functional programming, lazy evaluation, and type inference and seeing the elegance of the various code samples, I was hooked. I spent the next couple of weeks going through “Yet Another Haskell Tutorial” and I remember it being incredibly difficult yet incredibly rewarding. After I wrote my first fold over a recursive algebraic datatype, I felt like I was finally starting to speak Haskell. I felt like I had massively improved as a programmer.
While I’ve been at Dropbox, Python has been my main language of computational expression. Even though it was a bit rocky at first, I’ve grown to really love Python and the culture of fast iteration and duck typing. Like a good Pythonista, I’m of the opinion that types are training wheels, but that’s really only until you use a language with a real type system. In C# or Java, types can get in your way and even force you to write overly verbose code or follow silly “design patterns.” In Haskell, types help you soar to higher computational ground. They encourage you to model your data in coherent, concise, and elegant ways that feel right. They aren’t annoying.
More people should use Haskell. The steep learning curve forces you to understand what you are doing at a deeper level and you will be a better programmer because of it. To help that happen, this post will be in a semi-literate programming style and I’ll be describing a Dropbox API app written in Haskell. This won’t be like a normal tutorial so you’ll probably have to do a bit of supplemental reading and practice afterward. The goal is to give you a flavor of what a real program in Haskell looks like.
This post assumes no previous knowledge of Haskell but it does assume moderate programming ability in another language, e.g. C, C++, Ruby, Python, Java, or Lisp. Since this post does not assume previous Haskell experience, the beginning will be more of a Haskell tutorial and core concepts will be sectioned off to facilitate the learning process. This post is a published version of a Literate Haskell file. Code lines prefixed with the “>” character are actually part of the final program. This makes it possible to simply copy & paste the text here into your favorite editor and run it; just make sure you save the file with a “.lhs” extension. If you ever get tired of reading you can get the real source at the GitHub repo.
The Haskell implementation we’ll be using is The Haskell Platform, a full stack of Haskell tools prepackaged to work out of the box on all three major desktop operating systems. It’s based on GHC, the Glasgow Haskell Compiler. GHC is an advanced optimizing compiler focused on producing efficient code.
Image Search for Dropbox
Years ago Robert Love of Linux kernel hacker fame wrote a FUSE file system that made it so that user-created folders were populated with Beagle search results using the folder name as the search query. It was called beaglefs. The point was to demonstrate the power of user-space file systems, notably the power of having so much more library code available to you than in kernel-space.
We can do a similar thing with the Dropbox API. We’re going to write a hypothetical Dropbox API app that populates user-created folders with Creative Commons licensed images found by performing a web image search using the folder name as the search term. Using Dropbox, all the user has to do to perform an image search is simply create a folder.
Let’s get started!
LANGUAGE Pragma
> {-# LANGUAGE TypeFamilies #-}
> {-# LANGUAGE QuasiQuotes #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE TemplateHaskell #-}
> {-# LANGUAGE DeriveDataTypeable #-}
Despite being more than two decades old, Haskell is still evolving. You can tell your Haskell compiler to allow newer language features using this syntax, which is called the LANGUAGE pragma. Please don’t worry about what these exact language extensions do just yet; you can read more in the GHC docs.
Import Declarations
> import Yesod
This is an import declaration. Declarations like these inform the Haskell module system that I am going to use definitions from this other module. A module in Haskell is a collection of values (functions are values in Haskell too!), datatypes, type synonyms, type classes, etc.
Here I am telling the module system to bring all definitions from the module Yesod into the namespace of this module. If I wanted to, I could also access those definitions prefixed with “Yesod.”, similar to Java.
Yesod is a fully-featured modern web-framework for Haskell. We’ll be using it to create the web interface of our API app.
> import Text.XML.HXT.Core hiding (when)
This is another import declaration. This is just like the Yesod import except we use the “hiding” syntax to tell the Haskell module system to not import the “when” definition into this module’s namespace. The module we are importing is the main module from HXT, the XML processing library that we use to parse out the search results from an image search result page. More details on this much later.
> import Control.Applicative ((<$>), (<*>))
> import Control.Concurrent (forkIO, threadDelay, newChan,
>                            writeChan, readChan, Chan, isEmptyChan)
> import Control.Monad (forM_, when)
> import Control.Monad.Base (MonadBase)
> import Control.Monad.Trans.Control (MonadBaseControl)
> import Data.Maybe (fromJust, mapMaybe, isNothing)
> import Data.Text.Encoding (decodeUtf8With)
> import Data.Text.Encoding.Error (ignore)
> import Data.Typeable (Typeable)
> import Network.HTTP.Enumerator (simpleHttp)
> import System.FilePath.Posix (takeFileName, combine)
These import declarations are slightly different. Instead of bringing in all names from the modules I am only bringing in specific names. This is very similar to the “from module import name” statement in Python.
> import qualified Control.Exception.Lifted as C
> import qualified Data.ByteString as B
> import qualified Data.ByteString.Lazy as L
> import qualified Data.List as DL
> import qualified Data.Map as Map
> import qualified Data.Text as T
> import qualified Data.URLEncoded as URL
> import qualified Dropbox as DB
> import qualified Network.URI as URI
Remember how I said that I could also access names by prefixing them with “Yesod.” earlier? Adding qualified to the import declaration makes it so you must refer to names in other modules using the module prefix. The “as C” part in the first line makes it so that I can write C.try instead of Control.Exception.Lifted.try.
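As a small illustration of qualified imports, here is a sketch that is separate from the app itself. The `extensions` table and the `mimeFor` and `sortedKeys` names are hypothetical, made up for this example; only `Data.Map` and `Data.List` come from the import list above.

```haskell
import qualified Data.List as DL
import qualified Data.Map as Map

-- Hypothetical lookup table, invented for this illustration.
extensions :: Map.Map String String
extensions = Map.fromList [("jpg", "image/jpeg"), ("png", "image/png")]

-- Map.lookup must carry its prefix; a bare `lookup` would refer to the
-- Prelude function that works on lists of pairs instead.
mimeFor :: String -> Maybe String
mimeFor ext = Map.lookup ext extensions

-- DL.sort is likewise only reachable through the DL prefix.
sortedKeys :: [String]
sortedKeys = DL.sort (Map.keys extensions)
```

The prefix makes it obvious at each call site which module a name comes from, which helps when several modules export functions with the same name.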
Algebraic Datatypes
> data ImageSearch = ImageSearch (Chan (DB.AccessToken, String)) DB.Config
This is an algebraic datatype declaration. Never mind what algebraic means for the moment; this is the basic way to define new types in Haskell. Even though it’s very short, this little bit of code actually does a couple of things:
- It defines a type called ImageSearch. This is the immediate text after “data”.
- It defines a function called ImageSearch that takes two arguments, the first of type Chan (DB.AccessToken, String), the second of type DB.Config, and it returns a value of type ImageSearch. Ignore these complex types for now; they aren’t important. This function is called a constructor in Haskell.
Constructors play a big role in Haskell. Wherever a name can be bound to a value in Haskell, you can also use constructor pattern matching, or deconstruction, to extract the data contained within that value. Here’s a function to get the channel component out of our ImageSearch type:
imageSearchChan (ImageSearch chan config) = chan
imageSearchChan takes in an ImageSearch argument and returns the channel wrapped inside of it. You’ll see deconstruction a lot more later.
So far we’ve defined a type called ImageSearch and we’ve also defined a function called ImageSearch. This is okay because in Haskell type names and value names live in different namespaces.
Maybe Type
The Maybe type is one algebraic datatype that you’ll see a lot in Haskell code. It’s often used to denote an error value from a function or an optional argument to a function.
data Maybe a = Nothing | Just a
Unlike ImageSearch, the Maybe type has not one but two constructors: Nothing and Just. You can use either constructor to create a value in the Maybe type. A value of Nothing usually denotes an error in the Maybe type. A Just value indicates success.
Another difference from our ImageSearch type is that Maybe isn’t a concrete type on its own; it has to wrap some other type. In Haskell, a “higher-order” type like this is called a type constructor; it takes a concrete type and returns a new concrete type. For example, Maybe Int or Maybe String are two concrete types created by the Maybe type constructor. This is similar to generics, like List<T>, in Java or C#. We use the type variable “a” in the declaration to show that the Maybe type constructor can be applied to any concrete type.
It’s important to note that type constructors, like Maybe, are different from data constructors, like Nothing or Just.
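To make the two constructors concrete, here is a minimal sketch. The `safeDiv` and `describe` functions are hypothetical names invented for this example and are not part of the app.

```haskell
-- Hypothetical helper: integer division that reports failure with Nothing
-- instead of crashing on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Deconstruct a Maybe value by pattern matching on its data constructors.
describe :: Maybe Int -> String
describe Nothing  = "no result"
describe (Just n) = "result: " ++ show n
```

Here Maybe Int is the concrete type, while Nothing and Just are the data constructors that build and take apart its values.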
Type Signatures & Currying
> toStrict :: L.ByteString -> B.ByteString
> toStrict = B.concat . L.toChunks
This is a function definition in Haskell. I’ll explain the definition in the next section but first I wanted to talk about the use of “::” since it keeps coming up. This is how we explicitly tell the Haskell compiler what type we expect a value to be. Even though a Haskell compiler is good enough to infer the types of all values in most cases, it’s considered good practice to at least explicitly specify the types of top-level definitions.
The arrow notation in the type signature denotes a function. toStrict is a function that takes an L.ByteString value and returns a B.ByteString value.
One cool thing about Haskell is that functions can only take one argument. I know it doesn’t sound cool but it’s actually really cool. How do you specify a function that takes two arguments, you ask? Well, that’s a function that takes a single argument and returns another function that takes another argument. This reformulation of multiple-argument functions is called currying. Currying is cool because it allows us to easily do partial application with functions; you’ll see examples of this later.
Here’s the type signature for a function that takes two arguments of any two types and returns a value in the second type:
twoArgFunc :: a -> b -> b
We use the type variables “a” and “b” to indicate that twoArgFunc is polymorphic, i.e. we can apply twoArgFunc to any two values of any two respective types.
With currying in mind, it shouldn’t be too hard to figure out that the arrow is right-associative, i.e. “a -> b -> b” actually means “a -> (b -> b)”. That makes sense: twoArgFunc is a function that takes an argument and returns another function that takes an argument and returns the final value. Say that over and over again until you understand it.
What if we grouped the arrow the other way?
weirdFunc :: (a -> b) -> b
In this case weirdFunc is a function that takes another function as its sole argument and returns a value. This is much different from twoArgFunc, which instead returns a second function after it accepts its first argument. Passing functions to functions like this is a common idiom in Haskell and it’s one of the strengths of a language where functions are ordinary values just like integers and strings.
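The two shapes of signature can be sketched in runnable form. The bodies below are assumptions made for illustration: the post only gives the type of twoArgFunc, and `ignoreTrue` and `applyToTen` are hypothetical names. A fully polymorphic (a -> b) -> b has no sensible total implementation, since there is no value of type a to feed to the function, so this sketch fixes the argument type to Int.

```haskell
-- One possible body for twoArgFunc: ignore the first argument, return the second.
twoArgFunc :: a -> b -> b
twoArgFunc _ y = y

-- Partial application: supplying only the first argument yields a new function.
ignoreTrue :: b -> b
ignoreTrue = twoArgFunc True

-- A concrete cousin of weirdFunc: takes a function and applies it to 10.
applyToTen :: (Int -> b) -> b
applyToTen f = f 10
```

For example, applyToTen show evaluates to the string "10", and ignoreTrue returns whatever it is given.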
Functions
The definition of toStrict makes use of the function composition operator, “.”, but Haskell functions don’t have to be defined this way. Here’s a function defined using a variable name:
square x = x * x
What about two arguments?
plus x y = x + y
We can define a function that adds five to its argument by making use of currying:
addFive = plus 5
We curried the 5 argument into the plus function and, just as we said, it returns a new function that takes an argument and adds five to it:
addFive 2 == 7
Another way to define functions is to use a lambda abstraction. You can write a function as an expression by using the backslash and arrow symbols. Lambda abstractions are also known as anonymous functions. Here’s another way to define plus:
plus = \x y -> x + y
All functions in Haskell are pure. This means that functions in Haskell cannot do anything that causes side-effects like changing a global variable or performing IO. More on this later.
Operators
Okay now that you’re cool with functions let’s get back to toStrict. Here’s another way to define it using a lambda:
toStrict = \x -> B.concat (L.toChunks x)
Instead, we define toStrict using the function composition operator, “.”. The function composition operator takes two functions and returns a new function that passes its argument to the right function, passes the result of that to the left function, and returns the final result. Here’s one possible definition:
f . g = \x -> f (g x)
Yes this is a real way to define an operator in Haskell! The function composition operator comes standard in Haskell but even if it didn’t you’d still be able to define it yourself. In Haskell, operators and functions are actually two different syntaxes for the same thing. The form above is the infix form but you can also use an operator in the prefix form, like a normal function. Here’s another way to define “.”:
(.) f g = \x -> f (g x)
In this definition of the function composition operator we use prefix notation. It is only necessary to surround an operator with parentheses to transform it into its prefix form. Switching between infix and prefix notation works for any operator, e.g. you can add two numbers with (+) 4 5 in Haskell.
The ability to switch between prefix and infix notation isn’t limited to operators, you can do it with functions too by surrounding the function name with backticks, “`”:
div 1 2 == 1 `div` 2
This is useful for code like: "Dropbox" `isInfixOf` "The Dropbox API is sick!"
One last thing about operators: Haskell has special syntax to curry in arguments to operators in infix form. For example, (+5) is a function that takes in a number and adds five to that number. These are called sections. To further illustrate, all of the following functions do the same thing:
(+5)
(5+)
\x -> x + 5
(+) 5
plus 5
addFive
With operator currying it’s important to recognize that the side you curry the argument in matters. For example, (.g) and (g.) behave differently.
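Addition hides the difference because it is commutative, so here is a short sketch with operators where the side does matter. All four function names are hypothetical, chosen for this example.

```haskell
-- Right section: the missing argument goes on the left of the operator.
halve :: Double -> Double
halve = (/ 2)                      -- same as \x -> x / 2

-- Left section: the missing argument goes on the right of the operator.
twoOver :: Double -> Double
twoOver = (2 /)                    -- same as \x -> 2 / x

-- (. g) f is f . g, so the increment runs first and the doubling second.
incThenDouble :: Int -> Int
incThenDouble = (. (+ 1)) (* 2)

-- (g .) f is g . f, so the doubling runs first and the increment second.
doubleThenInc :: Int -> Int
doubleThenInc = ((+ 1) .) (* 2)
```

One known wrinkle: (-5) is the number negative five rather than a section, so the usual way to write "subtract five" is subtract 5.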
The “$” Operator
Besides the “.” operator there is another function-oriented operator that you’ll see often in Haskell code. This is the function application operator and it’s defined like this:
f $ x = f x
Weird right? Why does Haskell have this?
In Haskell, normal function application has a higher precedence than any other operator and it’s left-associative:
f g h j x == (((f g) h) j) x
Conversely, “$” has the lowest precedence of all operators and it’s right-associative:
f $ g $ h $ j $ x == f (g (h (j x)))
Using “$” can make Haskell code more readable as an alternative to using parentheses. It has other uses too; more on that later.
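As a small sketch of the readability point, here is the same pipeline written twice. The function names `shoutParens` and `shoutDollar` are hypothetical; `toUpper` comes from the standard Data.Char module.

```haskell
import Data.Char (toUpper)

-- Nested parentheses: read from the innermost call outward.
shoutParens :: String -> String
shoutParens s = reverse (map toUpper (filter (/= ' ') s))

-- The same pipeline with "$": each "$" stands in for "parenthesize the rest".
shoutDollar :: String -> String
shoutDollar s = reverse $ map toUpper $ filter (/= ' ') s
```

Both versions drop spaces, upper-case what remains, and reverse it; only the punctuation differs.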
Lists
> listToPair :: [a] -> (a, a)
> listToPair [x, y] = (x, y)
> listToPair _ = error "called listToPair on list that isn't two elements"
listToPair is a little function I use to convert a two-element list to a two-element tuple, usually called a pair.
A list in Haskell is a higher-order type that represents an ordered collection of same-typed values. A list type is denoted using brackets, e.g. “[a]” is the polymorphic list of any inner type and “[Int]” is a concrete type that denotes a list of Int values. Unlike vectors or arrays in other languages, you can’t index into a Haskell list in constant time. It is more akin to the traditional linked list data structure.
You can construct a list in a number of ways:
- The empty list: []
- List literal: [1, 2, 3]
- Adding to the front of a list: 1 : [2]
The “:” operator in Haskell constructs a new list that starts with the left argument and continues with the right argument:
1 : [2, 3, 4] == [1, 2, 3, 4]
One last thing about the list type in Haskell: it’s not that special. We can define our own list type very simply:
data MyList a = Nil | Cons a (MyList a)
-- using "#" as my personal ":" operator
x # xs = Cons x xs
myHead :: MyList a -> a
myHead (Cons x xs) = x
myHead Nil = error "empty list"
Yep, a list is just a recursive algebraic datatype.
foldr and map
In the Lisp tradition, lists are a fundamental data structure in Haskell. They provide an elegant model for solving problems that deal with multiple values at once. While lists are still fresh in your mind let’s go over two functions that are essential to know when manipulating lists.
The first function is foldr. foldr means “fold from the right” and it’s used to build an aggregate value by visiting all the elements in a list. Its arguments are an aggregating function, a starting value, and a list of values. The aggregating function accepts two argu... Screw it, let’s just define it using recursion:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x:xs) = f x $ foldr f z xs
Notice how we used the “:” operator to deconstruct the input list into its head and tail components. Like I said, you can use foldr to aggregate things in a list, e.g. adding all the numbers in a list:
foldr (+) 0 [1..5] == 15
“[1..5]” is syntactic sugar for all integers from 1 to 5, inclusive.
Another less useful thing you can do with foldr is copy a list:
foldr (:) [] ["foo", "bar", "baz"] == ["foo", "bar", "baz"]
foldr’s brother is foldl; it means “fold from the left.”
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
foldl collects values in the list from the left while foldr starts from the right. Notice how the arguments of the aggregating function are swapped (b -> a -> b instead of a -> b -> b); this should help as a sort of mnemonic when using foldr and foldl. Though it may not seem like it, the direction of the fold matters a lot. As an exercise try copying a list by using foldl instead of foldr.
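Without giving away the exercise, a non-commutative operator shows why the direction matters. This is a minimal sketch using the standard foldr and foldl; the names `fromRight` and `fromLeft` are hypothetical.

```haskell
-- foldr nests to the right: 1 - (2 - (3 - 0))
fromRight :: Int
fromRight = foldr (-) 0 [1, 2, 3]

-- foldl nests to the left: ((0 - 1) - 2) - 3
fromLeft :: Int
fromLeft = foldl (-) 0 [1, 2, 3]
```

Working through the parentheses by hand gives 2 for fromRight and -6 for fromLeft, even though the operator, starting value, and list are identical.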
map is another common list operation. map is awesome! It takes a function and a list and returns a new list with the user-supplied function applied to each value in the original list. Recursion is kind of clunky so let’s define it using foldr:
map :: (a -> b) -> [a] -> [b]
map f l = foldr ((:) . f) [] l
Even though foldr is more primitive than map I find myself using map much more often. Here’s how you would map a list of strings to a list of their lengths:
map length ["a", "list", "of", "strings"] == [1, 4, 2, 7]
Remember “$”, the function application operator? We can also use map to apply the same argument to a list of functions:
map ($5) [square, (+5), div 100] == [25, 10, 20]
We curried the 5 value into the right side of the “$” operator. That creates a function that takes a function and then returns the application of that function to the value 5. Using map we then apply that to every function in the list. This is why having a “$” operator in a functional language is a good idea, but it’s also why map is awesome!
A Quick Note About Tuples
I talked about lists but I kind of ignored tuples. Like lists, tuples are a way to group values together in Haskell. Unlike lists, with tuples you can store values of different types in a single tuple.
intAndString :: (Int, String)
intAndString = (07734, "world")
A more subtle difference from lists is that tuples of different lengths are of different types. For instance, writing a function that returns the first element of a three-element tuple is easy:
fst3 :: (a, b, c) -> a
fst3 (x, _, _) = x
Unfortunately, there is no general way to define a “first” function for tuples of any length without writing a function for each tuple type. Haskell does at least provide a fst function for two-element tuples.
One final note: see how we ignored the second and third elements of the tuple deconstruction by using “_”? This is a common way to avoid assigning names in Haskell.
Template Haskell
> $(mkYesod "ImageSearch" [parseRoutes|
> / HomeR GET
> /dbredirect DbRedirectR GET
> |])
Yesod makes heavy use of Template Haskell. Template Haskell allows you to do compile-time metaprogramming in Haskell, essentially writing code that writes code at compile-time. It’s similar to C++ templates but it’s a lot more like Lisp macros. Template Haskell is a pretty exotic feature that is rarely used in Haskell but Yesod makes use of it to minimize the amount of code you have to write to get a website up and running.
The mkYesod function here generates all the boilerplate code necessary for connecting the HTTP routes to user-defined handler functions. In our app we have two HTTP routes:
/ -> HomeR
/dbredirect -> DbRedirectR
The first route connects to a resource called HomeR. The second route, located at /dbredirect, connects to a resource called DbRedirectR. We’ll define these resources later.
Type Classes
This part of the app brings us to one of the most powerful parts of Haskell’s type system, type classes.
Type classes specify a collection of functions that can be applied to multiple types. This is similar to interfaces in C# and Java or duck typing in dynamically-typed languages like Python and Ruby. An instance declaration actually defines the functions of a type class for a specific type. Formally, type classes are an extension to the Hindley-Milner type system that Haskell implements to allow for ad-hoc polymorphism, i.e. an advanced form of function overloading.
Note that the word instance used in this context is very different from the meaning in object-oriented languages. In an object-oriented language instances are more akin to Haskell values.
Section 6.3 of the Haskell 2010 Report has a graphic of the standard Haskell type classes. Many core functions in the standard library are actually part of some type class, e.g. (==), the equals operator, is part of the Eq type class. For fun, let’s make a new type class called Binary. It defines functions that convert data between the instance type and a byte format:
import qualified Data.ByteString as B
class Binary a where
  toBinary :: a -> B.ByteString
  fromBinary :: B.ByteString -> a
You can imagine I might use this class when serializing Haskell data over a byte-oriented transmission medium, for example a file or a BSD socket. Here’s an example instance for the String type:
import qualified Data.Text as T
import qualified Data.Text.Encoding as E
import qualified Data.ByteString as B
-- String is a synonym for [Char]; under Haskell 2010 this instance
-- needs the FlexibleInstances extension
instance Binary String where
  toBinary = E.encodeUtf8 . T.pack
  fromBinary = T.unpack . E.decodeUtf8
For this instance toBinary is defined as a composition of E.encodeUtf8 and T.pack. Notice the use of T.pack; you’ll see it a lot. T.pack converts a String value into a Text value. E.encodeUtf8 converts the resulting Text value into a ByteString value. fromBinary does the inverse conversion: it converts a ByteString value into a String value.
Let’s define another instance:
import Control.Arrow ((***))
import Data.Int (Int32)
import Data.Bits (shiftR, shiftL, (.&.))
-- stores in network order, big endian
instance Binary Int32 where
  toBinary x = B.pack $ map (fromIntegral . (0xff.&.) . shiftR x . (*8))
                      $ reverse [0..3]
  fromBinary x = sum $ map (uncurry shiftL . (fromIntegral *** (*8)))
                     $ zip (B.unpack x) (reverse [0..3])
fromBinary in this instance may look kind of gnarly but you should know what it does; it converts a ByteString value to an Int32 value. Understanding it is left as an exercise for the reader.
Now getting Haskell data into a byte format is as easy as calling toBinary. An important distinction between type classes and the interfaces of C# and Java is that it’s very easy to add new functionality to existing types. In this example, the creators of both the String and Int32 types didn’t need any foreknowledge of the Binary type class. With interfaces, it would have been necessary to specify the implementation of toBinary and fromBinary at the time those types were defined.
It’s also possible to define functions that depend on their arguments or return values being part of a certain type class:
packWithLengthHeader :: Binary a => a -> B.ByteString
packWithLengthHeader x = B.append (toBinary $ B.length bin) bin
  where bin = toBinary x
Here packWithLengthHeader requires that its input type “a” be a part of the Binary type class; this is specified using the “Binary a =>” context in the type signature. A subtle point in the definition of this function is that it requires the Int type to be a part of the Binary type class as well (the return value of B.length).
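Since the Binary example depends on the bytestring and text packages, here is a smaller sketch of the same three ideas (a class, instances for existing types, and a constrained function) that uses only the Prelude. The `Pretty` class and `prettyAll` function are hypothetical, invented for this illustration.

```haskell
-- A hypothetical type class with a single function.
class Pretty a where
  pretty :: a -> String

-- Instances can be added for types that were defined long before the class.
instance Pretty Bool where
  pretty True  = "yes"
  pretty False = "no"

instance Pretty Int where
  pretty n = "#" ++ show n

-- The "Pretty a =>" context restricts this function to element types
-- that have a Pretty instance.
prettyAll :: Pretty a => [a] -> String
prettyAll xs = unwords (map pretty xs)
```

Calling prettyAll on a list of a type without a Pretty instance, such as a list of Char, is rejected at compile time rather than failing at run time.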
> instance Yesod ImageSearch where
>   approot _ = T.pack "http://localhost:3000"
Yesod requires you to declare a couple of instances for your app type. Most of the definitions in the Yesod type class are optional and have reasonable defaults but it does require you to define approot, the root URL location of your app. This is necessary for Yesod to be able to generate URLs.
> instance RenderMessage ImageSearch FormMessage where
>   renderMessage _ _ = defaultFormMessage
Here’s another instance declaration. The RenderMessage type class in this case is actually based on two types, ImageSearch and FormMessage. It defines a function called renderMessage which takes two arguments and returns defaultFormMessage.
Exceptions & Automatic Type Class Derivation
> data EitherException = EitherException String
>   deriving (Show, Typeable)
> instance C.Exception EitherException
There are multiple ways to specify errors in Haskell. In purely functional contexts it’s not uncommon to see the use of either the Maybe or Either types. In monads based on the IO monad, I usually like to use Haskell exceptions. What’s a monad you say? It’s complicated. Just kidding 🙂 I’ll get to them later.
For now, we’re defining a new exception type. It’s the same algebraic datatype declaration you saw earlier for our ImageSearch type except now there’s this “deriving” thing. A Haskell compiler can automatically derive instance declarations for some of the standard type classes. For EitherException we automatically derive instances for the type classes Show and Typeable. As a note, the ability to automatically derive instances for the Typeable type class was enabled by the LANGUAGE pragma DeriveDataTypeable above.
The last line declares EitherException to be an instance of C.Exception. It might seem weird that we didn’t define any functions for this instance. This is because type classes sometimes provide default implementations for the functions in the class. The C.Exception type class actually provides default implementations for types that are part of the Typeable type class.
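The same pattern can be tried in isolation with the plain Control.Exception module from base instead of Control.Exception.Lifted. This is a sketch under a few assumptions: `ParseError`, `parsePositive`, and `safeParse` are hypothetical names, and a reasonably recent GHC is assumed, where Typeable instances are generated automatically and only Show needs deriving.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Hypothetical exception type, declared the same way as EitherException.
data ParseError = ParseError String
  deriving (Show)

-- No functions are defined here; the class defaults are used.
instance Exception ParseError

-- Throws a ParseError for non-positive input.
parsePositive :: Int -> IO Int
parsePositive n
  | n > 0     = return n
  | otherwise = throwIO (ParseError ("not positive: " ++ show n))

-- try turns a thrown ParseError back into an Either value.
safeParse :: Int -> IO (Either ParseError Int)
safeParse = try . parsePositive
```

The type annotation on safeParse is what tells try which exception type to catch; other exception types would pass through uncaught.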
Monads
> exceptOnFailure :: MonadBase IO m => m (Either String v) -> m v
> exceptOnFailure = (>>=either (C.throwIO . EitherException) return)
exceptOnFailure is a function that takes a monad that wraps an Either type and returns that same monad except now wrapping the right side of the Either type. This makes sense to me very clearly but I know, o patient reader, that this must look like gibberish to you.
First let’s talk about monads. Monads are types within the Monad type class. The Monad type class specifies two functions (or an operator and a function):
class Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a
The actual Monad type class definition is slightly more complicated but for simplicity’s sake this will do.
The first operator, “>>=”, is called bind. What does it do? Well it depends on your monad instance! What we can say for sure is that it takes a monad of type “a”, a function that maps from “a” to a monad of type “b”, and returns a monad of type “b”.
The second function, return, takes a value and wraps it in the monad. What it means to be wrapped in the monad, again, depends on the instance. Please note, the return function isn’t like the return statement in other languages, i.e. it doesn’t short-circuit execution of a monad bind sequence.
If that sounds abstract that’s because it is! Monads are a sort of computational framework, many different types of computation are monadic, i.e. they fit the form imposed by bind. Why are monads important? Perhaps the most important reason they exist in Haskell is that they provide a purely functional way to perform (or specify how to perform) IO. Monads are much more than just a way to do IO, however, their general applicability extends to many things.
I won’t dwell on monads too much in this post but for the purposes of your immediate understanding, it suffices to explain the IO monad and do notation. Just as I’m not dwelling on monads, you shouldn’t either. It takes a long time to really understand and master what’s going on. The more you program in Haskell, the more it’ll make sense.
So why can’t we do IO in Haskell without the IO monad? Haskell is a purely functional language; that means that functions in Haskell mimic their mathematical counterparts. A pure function is a mathematical entity that consistently maps values from one domain to another. You expect cos(0) to always evaluate to 1; if it ever evaluated to something else something would be very wrong.
In a purely functional language how would you define getChar?
getChar :: Char
getChar = ...
It takes no arguments so how can it deterministically return what the user is submitting? The answer is it can’t. You can’t do this in a purely functional language.
So what are our options? The answer is to generate a set of actions to take as IO is occurring, in a purely functional manner. This is what the IO monad is and this is why values in the IO monad are called IO Actions. It’s a form of metaprogramming. Here’s an IO action that prints “yes” if a user types “y” and “no” otherwise:
-- type signatures
print :: Show a => a -> IO ()
getChar :: IO Char
main :: IO ()
main = getChar >>= (\x -> print $ if x == 'y' then "yes" else "no")
Why does this work? Notice how the “output” of getChar isn’t tied to its own value; instead the bind operation gets the output value for us. We’re using monads here to build and model sequential and stateful computation, in a purely functional way!
You can imagine that writing real programs in the IO monad could get ugly if you used “>>=” and lambdas everywhere, so that’s why Haskell has some syntactic sugar for writing out complex monadic values. This is called do notation. Here’s the same IO action from above written in do notation:
main = do
  x <- getChar
  print $ if x == 'y'
            then "yes"
            else "no"
In do notation each monad is bound using bind in order and values are pulled out using the left-pointing arrow “<-”.
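Do notation is not specific to IO, so the translation to bind can be checked with a monad whose results are easy to inspect. The sketch below uses Maybe, which is a Monad instance in the standard library; `halveEven`, `quarterDo`, and `quarterBind` are hypothetical names invented for this example.

```haskell
-- Hypothetical helper: halving succeeds only for even numbers.
halveEven :: Int -> Maybe Int
halveEven n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- Written with do notation.
quarterDo :: Int -> Maybe Int
quarterDo n = do
  h <- halveEven n
  q <- halveEven h
  return q

-- The same computation written with explicit binds and lambdas.
quarterBind :: Int -> Maybe Int
quarterBind n =
  halveEven n >>= \h ->
  halveEven h >>= \q ->
  return q
```

Each "<-" line corresponds to one ">>=" and one lambda. In the Maybe monad a Nothing at any step makes the whole result Nothing, which is what "it depends on your monad instance" means in practice.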
We’re coming to the close of yet another Haskell monad explanation but before we finish I really want to emphasize that monads and the IO monad in particular aren’t that special. Here’s my very own implementation of the IO monad:
-- the forall here needs the ExistentialQuantification extension
data MyIO a = PrimIO a | forall b. CompIO (MyIO b) (b -> (MyIO a))
myGetChar :: MyIO Char
myGetChar = PrimIO 'a'
myPrint :: Show a => a -> MyIO ()
myPrint s = PrimIO ()
-- GHC 7.10 and later also require Functor and Applicative instances for MyIO
instance Monad MyIO where
  m >>= f = CompIO m f
  return x = PrimIO x
runMyIO :: MyIO m -> m
runMyIO (PrimIO x) = x
runMyIO (CompIO m f) = runMyIO (f (runMyIO m))
Of course in the real IO monad, getChar isn’t hard-coded to return the same thing each time and print actually prints something on your terminal. IO actions are run by your Haskell runtime which is usually written in a language where you can actually call a non-pure getChar function, like C.
Either Type
Now, back to exceptOnFailure. Let’s look at it again:
exceptOnFailure :: MonadBase IO m => m (Either String v) -> m v
exceptOnFailure = (>>= either (C.throwIO . EitherException) return)
Is it still confusing? 🙂
Remember how I said earlier that the Either datatype was used to denote errors in Haskell? The definition of Either looks like this:
data Either a b = Left a | Right b
either :: (a -> c) -> (b -> c) -> Either a b -> c
either f _ (Left x) = f x
either _ g (Right x) = g x
You can use the either function to return different values depending on the Either datatype passed in. By convention the Left constructor is used to denote an error value.
For exceptOnFailure, first we create a function that takes an Either value and, if it’s a failure, throws an exception using C.throwIO; otherwise it calls return to rewrap the success value. Then we supply that function as the right operand of the “>>=” operator, using an operator section that leaves the left operand open.
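Here’s a sketch of the same idea specialized to plain IO, using only Control.Exception from base. The EitherException type below is a stand-in with an assumed shape, and exceptOnFailureIO and exceptOnFailureLong are names I made up so they don’t clash with the real function.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Stand-in for the post's EitherException type (assumed shape).
newtype EitherException = EitherException String
  deriving (Eq, Show)

instance Exception EitherException

-- Point-free version, mirroring the definition in the post.
exceptOnFailureIO :: IO (Either String v) -> IO v
exceptOnFailureIO = (>>= either (throwIO . EitherException) return)

-- The same function with the section and `either` spelled out.
exceptOnFailureLong :: IO (Either String v) -> IO v
exceptOnFailureLong action = do
  result <- action
  case result of
    Left err -> throwIO (EitherException err)
    Right v  -> return v
```

Reading the long version next to the short one is a decent way to convince yourself that the operator section really does the same thing.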
Some Utility Functions
> myAuthStart :: DB.Config -> Maybe DB.URL -> IO (DB.RequestToken, DB.URL)
> myAuthStart config murl = exceptOnFailure $ DB.withManager $ \mgr ->
>   DB.authStart mgr config murl
>
> myAuthFinish :: DB.Config -> DB.RequestToken -> IO (DB.AccessToken, String)
> myAuthFinish config rt = exceptOnFailure $ DB.withManager $ \mgr ->
>   DB.authFinish mgr config rt
>
> myMetadata :: DB.Session -> DB.Path -> Maybe Integer
>            -> IO (DB.Meta, Maybe DB.FolderContents)
> myMetadata s p m = exceptOnFailure $ DB.withManager $ \mgr ->
>   DB.getMetadataWithChildren mgr s p m
Here I’ve defined a couple of convenience functions for using the Dropbox SDK. This is just so I don’t have to use DB.withManager every time I call these functions. Also all of the vanilla Dropbox SDK functions return an Either value in the IO monad so we make use of exceptOnFailure to automatically throw an exception for us if something goes wrong.
> tryAll :: MonadBaseControl IO m => m a -> m (Either C.SomeException a)
> tryAll = C.try
C.try is the normal way to catch exceptions in Haskell. Unfortunately it uses ad-hoc polymorphism within the Exception type class to determine which exception it catches. Since tryAll has an explicit type signature, it’s bound to the instance of C.try that catches C.SomeException.
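To see how the type signature picks which exceptions get caught, here’s a sketch using try from Control.Exception in plain IO (the post’s C.try is a lifted variant). The names tryAllIO, tryArith, and divide are made up for this example.

```haskell
import Control.Exception
  (ArithException (..), SomeException, evaluate, try)

-- Catches every exception, because SomeException is the root type.
tryAllIO :: IO a -> IO (Either SomeException a)
tryAllIO = try

-- Catches only arithmetic errors; anything else propagates.
tryArith :: IO a -> IO (Either ArithException a)
tryArith = try

-- Integer division, forced inside IO so the error surfaces here.
divide :: Int -> Int -> IO Int
divide x y = evaluate (x `div` y)
```

Both wrappers are literally try; only the annotated result type differs, and that is all the compiler needs to choose the exception type.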
> dbRequestTokenSessionKey :: T.Text
> dbRequestTokenSessionKey = T.pack "db_request_token"
This is a constant I use for the name of my session key that stores the OAuth request token when authenticating the user to my API app but more on that later.
The Dropbox API Authentication Process
The Dropbox API uses OAuth to grant apps access to Dropbox user accounts. To access any of the HTTP endpoints of the Dropbox API an app must provide an access token as part of the HTTP request. Access tokens are revocable long-lived per-user per-app tokens that are granted to an app at the request of a user.
Acquiring an access token is a three-step process:
- The app must obtain an unauthenticated request token via the Dropbox API.
- The app redirects the user to the Dropbox website. Once the user is at the Dropbox website the user can either approve or deny the request to authenticate the request token for the app.
- The Dropbox website redirects the user back to the app’s website. The app can then exchange the authenticated request token for an access token.
A request token actually consists of two components, a key and a secret. Only the key component should be exposed in plaintext. To ensure proper security, the secret should only ever be known to the Dropbox servers and the API app attempting authentication. To exchange an authenticated request token for an access token, the app must also provide the original secret of the request token. This prevents third parties from hijacking authenticated request tokens.
An access token is long-lived but at any point in time can become invalid. When an access token becomes invalid it is the responsibility of the API app to go through the authentication process again. This allows Dropbox and its users to revoke access tokens at will.
Initiating the Authentication Process
> getHomeR :: Handler RepHtml
> getHomeR = do
Users can enable our app for their Dropbox account using the web. Yesod is the web framework we are using to implement web pages in Haskell. In Yesod all HTTP routes are values in the Handler monad. The convention is that the handler name is the combination of the HTTP method (GET in this case) and the name of the resource. getHomeR is the handler for the GET method on the “HomeR” resource, which is located at the root HTTP path, “/”. Handlers are connected to HTTP routes served by the web server via the use of Template Haskell above.
The Handler monad is essentially a decorated IO monad so don’t worry about what it is that much. You should use it just like you would use the IO monad.
>   ImageSearch _ dbConfig <- getYesod
getYesod retrieves the app’s value. In our app this value has the ImageSearch type that we defined at the very beginning. Here we’re deconstructing the ImageSearch value and extracting only the config component (while ignoring the channel component). The config value stores some information about our app, like app key and locale, that is used by the Dropbox SDK.
>   myRender <- getUrlRender
getUrlRender gets the URL render function for your app. It turns a resource value for your app into a Text URL.
>   (requestToken, authUrl) <- liftIO
>     $ myAuthStart dbConfig
>     $ Just $ T.unpack $ myRender DbRedirectR
Here we call the DB.authStart function (by way of myAuthStart). This function performs the first step of the Dropbox API authentication process. We pass in the URL, DbRedirectR, that we want the user to be redirected to after they authenticate and we get back our new unauthenticated request token and the Dropbox URL where the user can authenticate it. Note that T.unpack converts a Text value into a String value.
The myAuthStart function is in the IO monad so we make use of liftIO to execute myAuthStart in the Handler monad. Since Handler is a wrapper around the IO monad, the liftIO function “lifts” the IO action into the higher monad.
>   let DB.RequestToken ky scrt = requestToken
Do notation allows you to bind names using “let” in a do block. This is for when you need to assign a name to a value that isn’t coming from a monad. Here we’re deconstructing the request token from myAuthStart to get the key and the secret.
>   setSession dbRequestTokenSessionKey $ T.pack $ ky ++ "|" ++ scrt
A session in Yesod is a set of key-value pairs that is preserved per browsing session with our web site. It’s implemented using encryption on top of HTTP cookies. We store the key and the secret of the request token in the session using setSession. We’ll need the secret to finish the authentication process after the user authenticates our app.
setSession expects two Text values so we use T.pack to turn the second argument from a String type into a Text type.
>   redirectText RedirectTemporary $ T.pack authUrl
Finally we redirect the user to the URL given to us from myAuthStart, authUrl. Dropbox will ask the user if they want to allow our app to have access to their account. After they respond, Dropbox will authenticate the request token and then redirect the user to the URL passed to the call to myAuthStart above.
Applicative Functors
> getDropboxQueryArgs :: Handler (T.Text, T.Text)
> getDropboxQueryArgs = runInputGet $ (,)
>   <$> ireq textField (T.pack "oauth_token")
>   <*> ireq textField (T.pack "uid")
When Dropbox redirects the user back to our site it passes along some query args in the GET request: “oauth_token” and “uid”. Yesod provides a convenient way, using runInputGet, to extract those in the handler.
The (,) operator is a special prefix-only operator that creates pairs for us:
(,) x y = (x, y)
You might be wondering what “<$>” and “<*>” are. Relax, these are regular operators. They are used for applicative functors. Applicative functors are kind of like monads except not as powerful. I’m going to do something horrible here and define “<$>” and “<*>” in monad terms:
(<*>) :: Monad m => m (a -> b) -> m a -> m b
mf <*> m = do
  f <- mf
  x <- m
  return $ f x
(<$>) :: Monad m => (a -> b) -> m a -> m b
f <$> m = return f <*> m
If you pretend that bind, “>>=”, doesn’t exist and you only have “<*>” and return defined for your type, then your type isn’t a monad, it’s an applicative functor. The only exception is that return is instead called pure in the applicative functor type class:
class Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
The actual Applicative type class definition is slightly more complicated but for simplicity’s sake this is good enough.
Now let’s put it all together: in our definition of getDropboxQueryArgs we apply (,) in the applicative functor, then pass the resulting value to runInputGet. runInputGet then runs the applicative functor in the Handler monad.
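If the Yesod form machinery is getting in the way, the same shape shows up with plain old Maybe. This is only an analogy, not how runInputGet works internally: Params and dropboxQueryArgs are hypothetical names, and lookup from the Prelude plays the role of ireq.

```haskell
-- Hypothetical stand-in for the GET parameters of a request.
type Params = [(String, String)]

-- Build a pair from two lookups; the result is Nothing if either
-- parameter is missing.
dropboxQueryArgs :: Params -> Maybe (String, String)
dropboxQueryArgs params = (,)
  <$> lookup "oauth_token" params
  <*> lookup "uid" params
```

Each lookup can fail independently, and “<*>” combines them so that the pair is only built when both succeed.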
I know it sounds crazy, I know it does, but luckily you don’t have to fully understand what’s going on behind the scenes to understand how it’s supposed to behave. Keep writing and reading Haskell and eventually it’ll make a lot of sense. Trust me, if I can understand this stuff you can too.
The Authentication Result Handler
> getDbRedirectR :: Handler RepHtml
> getDbRedirectR = do
This is the handler for the location of the redirect in the app authentication process. After the user has given our app access to their account they are redirected here.
>   mtoken <- lookupSession dbRequestTokenSessionKey
Remember how before we redirected the user to the Dropbox authentication URL we first set a key-value pair in the Yesod session using setSession? After the user is redirected, we use the lookupSession function to get the token back out. lookupSession returns a Maybe value so that if a key-value pair does not exist in the current session it can return Nothing, otherwise it will return the value wrapped in the Just constructor.
>   let noCookieResponse = defaultLayout [whamlet|
> cookies seem to be disabled for your browser! to use this
> app you have to enable cookies.|]
>
>   when (isNothing mtoken) $ noCookieResponse >>= sendResponse
when is a nice function courtesy of Control.Monad that runs the second argument, a value in some monad, only if the first argument is true. It’s defined like this:
when :: Monad m => Bool -> m () -> m ()
when p m = if p then m else return ()
If mtoken is bound to a Nothing value we’ll return an error message to the user asking them to enable cookies and discontinue normal execution by using sendResponse. After the isNothing check we are guaranteed that mtoken is bound to a Just value.
>   let rt@(sessionTokenKey:_) = T.splitOn (T.pack "|")
>           $ fromJust mtoken
We extract the token from mtoken using fromJust and pass that along to T.splitOn. T.splitOn will split a Text value into a list of Text values using the input argument (“|” in this case) as the delimiter. Then we use the “@” syntax to simultaneously bind the result to the rt name and deconstruct the first element of the result into the sessionTokenKey name.
>   (getTokenKey, _) <- getDropboxQueryArgs
>   when (getTokenKey /= sessionTokenKey)
>     $ invalidArgs [T.pack "oauth_token"]
To get the request token key that the user authenticated at the Dropbox website we use getDropboxQueryArgs. Checking this key against the request token key that we stored in the session helps prevent request forgery. If the keys don’t match we stop execution of this handler by calling invalidArgs. “/=” is the not equals operator, like “!=” in other languages.
We do this verification because we want to prevent other sites from successfully coercing a user into invoking this handler. We only want the Dropbox website to invoke this handler.
>   let requestToken = uncurry DB.RequestToken
>           $ listToPair $ map T.unpack rt
Here’s listToPair in action! Since it returns a tuple we use this nifty built-in function called uncurry:
uncurry f (x, y) = f x y
Using uncurry and listToPair, we pass the DB.RequestToken constructor the request token key and secret that we stored in the session. Since rt contains two Text values we use “map T.unpack” to convert them into two String values.
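Here’s a self-contained sketch of the round trip through the session value, using String and the Prelude instead of Text and the SDK. The RequestToken type below is a hypothetical stand-in for DB.RequestToken, and splitPair replaces the T.splitOn plus listToPair combination; unlike those, it splits only at the first “|” and returns Nothing instead of crashing on malformed input.

```haskell
-- Hypothetical stand-in for the SDK's request token (key, then secret).
data RequestToken = RequestToken String String
  deriving (Eq, Show)

-- Pack the key and secret into a single session string.
toSessionValue :: RequestToken -> String
toSessionValue (RequestToken ky scrt) = ky ++ "|" ++ scrt

-- Split at the first '|'; fail if there is no delimiter.
splitPair :: String -> Maybe (String, String)
splitPair s = case break (== '|') s of
  (ky, '|' : scrt) -> Just (ky, scrt)
  _                -> Nothing

-- uncurry turns the two-argument constructor into one that takes a pair.
fromSessionValue :: String -> Maybe RequestToken
fromSessionValue = fmap (uncurry RequestToken) . splitPair
```

Returning a Maybe here is a design choice for the sketch; the handler in the post uses fromJust because it has already checked that the session value exists.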
>   ImageSearch chan dbConfig <- getYesod
>   accessToken <- liftIO $ myAuthFinish dbConfig requestToken
Now we can finish up the authentication process and get our access token. We pass in the authenticated request token to DB.authFinish (by way of myAuthFinish) and if everything is successful we obtain the access token.
>   liftIO $ writeChan chan accessToken
In this app we make use of Concurrent Haskell. This allows us to create multiple independent threads of control in the IO monad. There are logically two main concurrently running threads in this app, the web server thread where all of our handler code is run, and the thread that is updating the Dropbox accounts of the users of our app with the relevant search results. We’ll talk about our use of threads a lot more later.
Channels are a mechanism for typed inter-thread communication in Haskell. We use the channel component of our app value to send over the access token so that we can begin the updating process for this user’s Dropbox account.
>   defaultLayout [whamlet|okay you are hooked up!|]
Finally, we send a success message to the user’s browser indicating they have linked their account to our app.
Arrows
We’ve gone over monads and we’ve gone over their weaker counterparts, applicative functors. Arrows are another pattern you’ll likely see used in Haskell code. They commonly serve as a general framework for representing and combining operations that convert input to output, kind of like filters. Arrows, like monads and applicative functors, are actually types and arrow types are instances of the Arrow type class:
class Arrow a where
  arr :: (b -> c) -> a b c
  (>>>) :: a b c -> a c d -> a b d
  first :: a b c -> a (b, d) (c, d)
The actual Arrow type class definition is slightly more complicated but for simplicity’s sake this will do just fine 🙂
Given my analogy to filters you might think that regular Haskell functions resemble arrows and you’d be right. Arrows are generalizations of regular Haskell functions and functions are, in fact, defined as instances of the Arrow type class:
instance Arrow (->) where
  arr f = f
  f >>> g = g . f
  first f (x, y) = (f x, y)
To understand this instance declaration it’s important to note that the arrow symbol, “->”, is an operator in the Haskell type language. Just like normal operators, operators in the type language also have prefix forms, e.g. “(->) a a” is the same type as “a -> a.”
In “instance Arrow (->)”, we turned the “->” operator into its prefix form and used that to define the Arrow instance for functions. In Haskell, the “->” operator in the type language is actually a type constructor, i.e. when you apply it to two types it creates a new type, a function of those two types.
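Since functions are arrows, you can try the arrow combinators on ordinary functions right away. Here’s a small sketch using Control.Arrow from base; shout, lengthOfFirst, and doubleThenShow are made-up names.

```haskell
import Control.Arrow (arr, first, (>>>))
import Data.Char (toUpper)

-- Left-to-right composition: uppercase first, then append "!".
shout :: String -> String
shout = map toUpper >>> (++ "!")

-- `first` applies a function to the first component of a pair only.
lengthOfFirst :: (String, Bool) -> (Int, Bool)
lengthOfFirst = first length

-- For functions, `arr` is the identity, so this is just show . (* 2).
doubleThenShow :: Int -> String
doubleThenShow = arr (* 2) >>> show
```

For example, shout "hi" gives "HI!" and lengthOfFirst ("abc", True) gives (3, True).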
> selectImageUrls :: ArrowXml a => a XmlTree String
> selectImageUrls = deep (isElem
>                         >>>
>                         hasName "a"
>                         >>>
>                         hasAttr "href"
>                         >>>
>                         getAttrValue "href"
>                         >>>
>                         isA isImgUrl
>                        )
>   where isImgUrl = DL.isInfixOf "imgurl="
It’s common to use the Arrow type class as a basis for combinator libraries: libraries that allow you to combine domain-specific functions in intuitive ways to create new functions. HXT is one such combinator library; it’s a library for manipulating XML documents. In our app we define an arrow, selectImageUrls, that takes an XML document, denoted by the XmlTree type, and extracts all of the links that contain the string “imgurl=”. These are links in the image search result page that contain the URLs to the found image files.
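To get a feel for how a chain of filters like this behaves without installing HXT, here’s a deliberately simplified model. It is not HXT’s API: I’m assuming a filter is just a function that returns zero or more results, and Element, keepIf, href, and imageHrefs are hypothetical names. Composition uses “>=>” from Control.Monad in place of “>>>”.

```haskell
import Control.Monad ((>=>))
import Data.List (isInfixOf)

-- A filter maps one input to zero or more outputs.
type Filter a b = a -> [b]

-- Hypothetical, flattened stand-in for an XML element.
data Element = Element
  { elemName :: String
  , elemHref :: Maybe String
  }

-- Keep the input only if the predicate holds.
keepIf :: (a -> Bool) -> Filter a a
keepIf p x = [x | p x]

-- Yield the href attribute if the element has one.
href :: Filter Element String
href = maybe [] (: []) . elemHref

-- Anchor elements whose href mentions "imgurl=".
imageHrefs :: Filter Element String
imageHrefs =
  keepIf ((== "a") . elemName) >=> href >=> keepIf ("imgurl=" `isInfixOf`)
```

Running concatMap imageHrefs over a list of elements keeps only the matching links, which is roughly the job selectImageUrls does over a real document tree.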
Thread Architecture
Above I wrote that there were two main threads of execution in this app: the thread that served out HTTP requests for our web front-end and the thread that implemented the image search functionality in our users’ Dropboxes. For the latter part, there are actually many threads. There is one thread that is listening on the Haskell channel for new users linking our app to their accounts from the web server thread. There is a thread per user account that polls the user’s Dropbox account every 30 seconds and waits for new folders for it to populate with images. There is also a thread per new folder per user account that is responsible for populating a specific folder with the images found during the image search.
In other languages like C, C++, Java, or Python this unbounded use of threads wouldn’t be very efficient since threads in those languages often map 1:1 to kernel-level threads. Normally you can’t depend on kernel-level threads scaling into the tens of thousands. In Haskell (or at least in modern versions of GHC) threads are relatively cheap and the runtime does a good job of distributing many Haskell threads across a bounded number of kernel-level threads, usually one per CPU.
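If you want to see how cheap Haskell threads are, here’s a sketch that forks one thread per number and collects the results through MVars, using only base. The function name sumInThreads is made up, and this says nothing precise about performance; it just shows that spawning thousands of threads is an ordinary thing to do.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork one lightweight thread per number from 1 to n. Each thread
-- hands its number back through its own MVar; the caller sums them.
sumInThreads :: Int -> IO Int
sumInThreads n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (putMVar box i)
    return box
  results <- mapM takeMVar boxes
  return (sum results)
```

Calling sumInThreads 10000 spawns ten thousand threads and returns the sum of the numbers from 1 to 10000.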
Image Uploading Thread
> handleFolder :: DB.Session -> String -> IO ()
> handleFolder session filePath = do
handleFolder is the thread that is responsible for populating a specific folder in a user’s Dropbox with the image search results.
>   let searchTerm = takeFileName filePath
We use the file name portion of the folder path as the search term.
>       src__ = "http://www.yourfavoritesearchengine.com/"
>       src_ = src__ ++ "search?tbm=isch&tbs=sur:f&"
>       src = src_ ++ (URL.export $ URL.importList [("q", searchTerm)])
src is the generated full image search URL. We use the URLEncoded library to generate a properly escaped URL query string.
>   imageSearchResponse_ <- simpleHttp src
>   let imageSearchResponse = T.unpack
>           $ decodeUtf8With ignore
>           $ toStrict imageSearchResponse_
Here we fetch the image search result page using simpleHttp. simpleHttp returns a lazy ByteString type but our XML library requires a String type so we have to convert between the two using a combination of T.unpack, decodeUtf8With, and toStrict.
>   images <- runX ( readString [ withParseHTML yes
>                               , withWarnings no
>                               ] imageSearchResponse
>                    >>>
>                    selectImageUrls
>                  )
Here we make use of the HXT XML processing library to parse out all the relevant image search URLs from the HTML document returned from the search query. Notice the use of the selectImageUrls arrow defined earlier. images is of type [String].
>   forM_ images $ \url -> tryAll $ do
forM_ executes a monadic action for each element in its list argument. The second argument is a function that takes an element from the input list and returns the corresponding monadic value.
Using forM_ we’re performing an IO action for each image URL we parsed out of the result page to ultimately upload that image into the user’s Dropbox. We wrap the monad expression in a tryAll to prevent an exception in the processing of any single element from stopping the entire process.
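Here’s a sketch of that “keep going when one element fails” pattern in plain IO. It uses forM rather than forM_ so the per-element results are kept for inspection, and try from Control.Exception stands in for tryAll. The function divideEach is a made-up example.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Control.Monad (forM)

-- Divide 100 by each number. A zero raises an exception for that one
-- element only; the remaining elements are still processed.
divideEach :: [Int] -> IO [Either SomeException Int]
divideEach ds = forM ds $ \d -> try (evaluate (100 `div` d))
```

With the input [5, 0, 10] the first and last elements succeed and the middle one comes back as a Left holding the exception.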
>     urlenc <- URL.importURI $ fromJust $ URI.parseURIReference url
>     let imgUrl = fromJust $ URL.lookup ("imgurl" :: String) urlenc
>         imgUrlURI = fromJust $ URI.parseURIReference imgUrl
>         imgName = takeFileName $ URI.uriPath imgUrlURI
>         dropboxImgPath = combine filePath imgName
Each of the URLs that were parsed out of the HTML contains the source URL of the image in an embedded query arg, “imgurl”. In this code snippet we extract the actual source URL of the image, imgUrl, from the “imgurl” query arg and we generate the path into the user’s Dropbox where we want to place the image, dropboxImgPath.
>     image <- simpleHttp imgUrl
simpleHttp performs an HTTP request to the location of the source image and returns the response body in a lazy ByteString.
>     DB.withManager $ \mgr ->
>       DB.addFile mgr session dropboxImgPath
>         $ DB.bsRequestBody $ toStrict image
This is the call to the Dropbox SDK that allows us to upload a file. We upload the file data, image, to the path we generated earlier, dropboxImgPath. If a file already exists at that path, DB.addFile won’t overwrite it.
New Folder Detection Thread
> handleUser :: Chan DB.AccessToken
>            -> DB.Config
>            -> DB.AccessToken -> [String] -> IO ()
> handleUser chan dbConfig accessToken_ foldersExplored = do
handleUser is the thread that runs for each user that is linked to our API app. It monitors the user’s Dropbox for new folders that we should populate with search results. It polls the user’s Dropbox every 30 seconds and loops forever.
While this thread is running it’s possible for the handleNewUsers thread to send us a new access token to use through the channel given by the chan argument.
>   accessToken <- let getCurrentAccessToken x = do
>                            emp <- isEmptyChan chan
>                            if emp
>                              then return x
>                              else readChan chan >>= getCurrentAccessToken
>                  in getCurrentAccessToken accessToken_
I use the “let ... in ...” syntax to privately define the getCurrentAccessToken function. This function repeatedly polls the access token channel using isEmptyChan until it’s empty at which point it returns the last access token that was pulled off the channel.
>   efolders <- tryAll $ do
We wrap all the IO actions in this run of handleUser just in case a transient exception occurs.
>     let session = DB.Session dbConfig accessToken
session is the name bound to the DB.Session value that the Dropbox SDK interface needs to upload file data into a user’s Dropbox.
>     metadata <- myMetadata session "/" Nothing
We make use of myMetadata to get a collection of all the children inside the root of our API app sandbox, “/”.
>     let (_, Just (DB.FolderContents _ children)) = metadata
>         folders = mapMaybe (\(DB.Meta base extra) ->
>                               case extra of
>                                 DB.Folder -> Just $ DB.metaPath base
>                                 _ -> Nothing) children
>         newFolders = folders DL.\\ foldersExplored
This snippet of code extracts the list of new Dropbox paths that we should populate with image search results.
mapMaybe is a combination of map and filter. Any element that the input function returns Nothing for is filtered out of the returned list. Elements that the function returns a Just value for are included in the output list without the Just wrapper. We use it here to return all the paths in the “/” folder that are folders and we exclude children that are plain files.
The “DL.\\” operator returns all the elements in the first list operand that aren’t included in the second list operand; it’s like a set difference operation.
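Here’s a sketch of those list functions on plain data, using Data.Maybe and Data.List from base. The Entry type is a hypothetical stand-in for the SDK’s metadata, and folderPaths and newFolderPaths are made-up names.

```haskell
import Data.List (union, (\\))
import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for a child entry in a Dropbox folder listing.
data Entry = Folder String | File String

-- Keep only the folders, unwrapping their paths.
folderPaths :: [Entry] -> [String]
folderPaths = mapMaybe pick
  where
    pick (Folder p) = Just p
    pick (File _)   = Nothing

-- Folders in the listing that haven't been explored yet.
newFolderPaths :: [String] -> [Entry] -> [String]
newFolderPaths explored entries = folderPaths entries \\ explored
```

For example, with "/cats" already explored and a listing containing the folders "/cats" and "/dogs" plus a file, newFolderPaths returns ["/dogs"]. Merging that back with union, as in ["/dogs"] `union` ["/cats"], gives ["/dogs", "/cats"].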
>     forM_ newFolders (forkIO . handleFolder session)
Here we use the forM_ function again. This time we spawn off a new thread using forkIO for each new folder we found in the app’s sandbox folder.
>     return folders
We need to give our parent IO action access to the new list of paths in the sandbox folder so it can keep track of what paths are new.
>   threadDelay $ 30 * 1000 * 1000
threadDelay is like sleep() in other languages, except that it takes its argument in microseconds; here it pauses execution for 30 seconds.
>   let curFolders = either (const []) id efolders
If an error occurred while polling the user’s account for new folders we bind an empty list to curFolders, otherwise we bind the current list of folders to curFolders.
>   handleUser chan dbConfig accessToken
>     $ curFolders `DL.union` foldersExplored
After sleeping for 30 seconds we loop by recursing. This is the common way in Haskell to loop in a monad. Before we recurse here we update the list of all folders we’ve ever seen so that we don’t attempt to update them again.
New App User Thread
> handleNewUsers :: ImageSearch
>                -> Map.Map String (Chan DB.AccessToken)
>                -> IO ()
> handleNewUsers app_@(ImageSearch chan dbConfig) map_ = do
handleNewUsers is the thread that is listening for newly linked users to our API app via the channel and spawns off a handleUser thread for each new user.
We make use of the “@” syntax again to simultaneously bind the ImageSearch argument to the app_ name and deconstruct it into its chan and dbConfig components. The map_ argument keeps a mapping from user ID to the channel of the thread that is handling that user ID. We need that so we can update the access token a thread is using if it is revoked.
>   (accessToken, userId) <- readChan chan
readChan gets a value off the channel shared between this thread and the web server thread. Each value is a tuple that contains an access token and the user ID it belongs to.
>   newMap <- case Map.lookup userId map_ of
>     Just atChan -> do
>       writeChan atChan accessToken
>       return map_
>     Nothing -> do
>       nChan <- newChan
>       _ <- forkIO $ handleUser nChan dbConfig accessToken []
>       return $ Map.insert userId nChan map_
We look up the user ID we were given in our map of thread channels. If we already have a thread handling that user’s account, we send it the new access token by writing to its channel. If we don’t have a thread handling this specific user account then we create a new channel, spawn off a new handleUser thread, and update our channel map.
>   handleNewUsers app_ newMap
Finally we loop with the new map.
Main Program Entry Point
> main :: IO ()
> main = do
So it’s been a long and arduous path but finally we arrive at the main IO action. The main IO action kicks off execution for every Haskell program just like in C/C++ and Java.
>   let defaultAppKey = undefined :: String
>       defaultAppSecret = undefined :: String
>       defaultAppType = undefined :: DB.AccessType
Here are the default credentials for the app. We use the Haskell value undefined, otherwise known as _|_. This is a polymorphic constant that you can use anywhere in Haskell; it can be of any type. An exception will be thrown if an undefined value is ever evaluated in your Haskell program. To get this app to work you will need to supply your own values for these constants.
In theory these should be parsed out of the command line or a configuration file but for the purposes of this demo app we define them inline here.
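Here’s a small sketch of how undefined behaves under lazy evaluation, using only base. The names placeholderKey, unusedIsFine, and forcedFails are made up for this example.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A placeholder of type String, like the credentials in the post.
placeholderKey :: String
placeholderKey = undefined

-- Counting the list never looks inside its elements, so this is fine.
unusedIsFine :: Int
unusedIsFine = length [placeholderKey, placeholderKey]

-- Forcing the placeholder itself throws; we catch that and report
-- True if an exception occurred.
forcedFails :: IO Bool
forcedFails = do
  r <- try (evaluate (length placeholderKey)) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

This is why the program compiles and even starts with undefined credentials: nothing blows up until some code actually needs their values.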
>   chan <- newChan
Create the inter-thread communication channel using newChan.
>   dbConfig <- DB.mkConfig
>     DB.localeEn
>     defaultAppKey
>     defaultAppSecret
>     defaultAppType
>   let app_ = ImageSearch chan dbConfig
Create our application specific ImageSearch value. It contains both the channel and a DB.Config value.
>   _ <- forkIO $ handleNewUsers app_ Map.empty
Kick off the handleNewUsers thread that accepts new users to our app.
>   warpDebug 3000 app_
And finally, call warpDebug which kicks off our Yesod web interface.
Conclusion
That’s it. That’s our Dropbox API app in Haskell. If you were a newcomer to Haskell this would be a healthy time to have tons of questions. Actually if I’ve done my job right you should be very curious to know more about Haskell 🙂 Head on over to HaskellWiki and start your journey. If you want a nice friendly book to help you get more formally acquainted I can recommend both “Real World Haskell” and “Learn You a Haskell for Great Good”. One piece of advice for your new Haskell journey: don’t sweat the monads.
As for our API app, it’s actually not finished yet. One huge thing missing is that it doesn’t remember which users linked to our app and what folders we’ve populated across restarts. We’d need to store that data in some kind of persistent database to fix that.
Another pain point is the user has to wait 30 seconds in the worst case for their folders to be populated with images. This is because each of our handleUser threads polls the Dropbox API every 30 seconds. While this is bad from a user experience perspective it’s also bad from an engineering perspective. This will cause the load we induce on the Dropbox API to increase linearly with the number of users using our app; we’d instead like it to increase linearly with the number of active users using our app. Currently there’s no way to get around this issue but we’re working on it!
Other minor improvements include picking a better algorithm to decide which folders we populate with photos, better HTML for our web interface, and streaming uploads to the Dropbox API. I’m sure there are more.
As an exercise, consider fixing some of these problems. Remember you can fork this project on GitHub. By the way our Haskell SDK is also on GitHub, so you may need to fork that too.
If you have any questions, feel free to reach out. My email is rian+dropbox+com. Have fun!
Many thanks to Kannan Goundan, Brian Smith, Dan Wheeler, ChenLi Wang, Martin Baker, Tony Grue, Ramsey Homsany, Bart Volkmer, Jie Tang, Chris Varenhorst, and Scott Loganbill for their help in reviewing this post. Also special thanks to Michael Snoyman for creating Yesod and accepting my patches. Lastly, a huge thanks to all those who have researched and pushed Haskell forward throughout the years.