Musings on code

There is No Magic

It's my catchphrase, if I were to have one in the technical realm. There is No Magic. It's a mantra and a truism and a guiding principle.
All too often, in a technical context, I see developers behave like primitive man during an eclipse - their dependencies and frameworks and languages, some unassailable force of nature that is beyond the reckoning of mankind.

The confusion seems to stem from a misjudgment - that something one doesn't understand in the moment must be driven by forces beyond understanding. It's evil omen thinking. But we're developers, our core skill is problem solving, and yet we don't always reach for it in the moment. We perhaps get too focused on the goal, or on our own lack of knowledge, and leave that skill behind. We don't break it down, we don't do the analysis, we don't trust ourselves.

Let's start changing that.

Computation Expressions

One of the features that makes F# super neat, but also confuses the hell out of newcomers to the language, is Computation Expressions. In some cases we'll see them called workflows or builders - for reasons we'll end up at later.

We tend to be first introduced to them via F#'s core async and task workflows, as they're the strongly preferred way to work with asynchronous operations in F#. A direct example is a workflow for making an HTTP request using the standard .NET HttpClient:

let httpGet (url:string) = task {
    use http = new HttpClient()
    let! response = http.GetAsync(url)
    return! response.Content.ReadAsStringAsync()
}

There are two clues that we are working with a Computation Expression - the first is the combination of an identifier: task being followed by a block of curly-braced code; the second is that block containing ! versions of familiar terms: let! for example.

Regardless, we can all read this code, and broadly understand what it's doing. Especially if we're familiar with languages that do similar things with await as an inline keyword construct. We have asynchronous operations like GetAsync and something here is doing something to wait for them to finish before moving onwards to the next step in the code; but also in a non-blocking way. And we'd be correct in that assumption.

But then we're going to stumble across a custom Computation Expression:

let unsafeAsyncOperation x = asyncResult {
    let! step1 = doUnsafeThing x
    let step2 = thenSomething step1

    return getResult step1 step2
}

And we try to modify it by adding a third step, using that previous HTTP workflow:

let unsafeAsyncOperation x = asyncResult {
    let! step1 = doUnsafeThing x
    let step2 = thenSomething step1
    let! step3 = httpGet "http://laenas.github.io"

    return getResult step1 step2
}

The compiler suddenly complains, it wants an Async<'a> but we're giving it a Task<string>. And this is one, but not the only, place where people seem to get conceptually stuck. What is going on, why doesn't it just work? They enter a spiral of removing the !s, which causes even more arcane compiler errors, or adding them to other places - to the same effect. The panic sets in, we retreat from our place of logic and principle, and resort to just throwing keypresses at the problem, hoping something will just compile.

We can work with CEs like this, but we can work with them more effectively - especially with regards to building our own - if we understand their machinery. MSDN has documentation covering much of this, but I think it skips (rightfully) a more foundational step to thinking about the sugaring the compiler applies to Computation Expressions.

Back to the Beginning

The F# language (and compiler) is constantly sugaring things for us, in ways that we forget about - especially if we haven't been with the language since its birth - or don't have experience with sibling languages.

One direct example being functions - the mathematical-theoretical underpinnings of F# (and functional programming more generally) are based in the mathematical concept of functions - mappings from some single input value to some output value. The key point there is single input, even if that single input is a structure like a tuple or a list, it is still logically a single input. When we write functions with multiple parameters - what we're 'actually' doing is writing a chain of single-input functions that produce single-input functions:

let f x y = x >>> y
//Is conceptually equivalent to
let g x = (fun y -> x >>> y)

We don't have to think about this with F# - we can do the more natural thing by just listing all parameters and, if anything, we end up needing to learn from a different direction the nature of partial application that aligns with this form. The language has our backs, without sacrificing the foundational concept at play.
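To make the partial application point concrete, here is a minimal sketch. The names add, addCurried and addFive are purely illustrative, and it uses + rather than the operator above:

```fsharp
// Two spellings of the same two-parameter function.
let add x y = x + y
let addCurried = fun x -> (fun y -> x + y)

// Supplying only the first argument yields a new single-input function.
let addFive = add 5              // int -> int

let a = addFive 10               // 15
let b = (addCurried 5) 10        // 15
```

Both forms hand back a function after the first argument, which is all partial application really is.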

In another slightly more dated example, F# has - though not used often - a so-called verbose syntax. One notable feature of it is its use of the keyword in:

let x = 1 in
let y = 2 in
x + y

I draw attention to it because it poses a curious question: Why in? What about that order of operations puts x "in" the code that follows? And herein lies a curious link between this and the previous point - If F# is ostensibly grounded in mathematics - input/output functions - then where is the function in that snippet? With experience elsewhere, we might casually shrug this question off as wordplay, but it's got a curious, subtle power. Those three lines might reside inside a function, certainly, but there is still no function amongst them. (let us disregard the operator, dear reader)

We can rewrite that snippet in a more conceptually pure way as follows:

1 |> (fun x -> 
  2 |> (fun y ->
    x + y
  )
)

It looks weird, and isn't going to pass for day to day code, but it's logically what's happening. The value we bind to x becoming an 'external' value that gets pushed into a function whereby x is the input binding name and the rest of the code being the body of that function. Allowing us to write it the other way not only helps reduce the burden of increasingly arrowed code, but also structures it in ways that are more familiar and similar to other languages.

This is all well and good, it's curious, but how does it help us demystify Computation Expressions? We have one more brief detour to make.

The other Bind

It likely seems a curiosity that when talking about code like let x = 1 we refer to them as bindings. That we have bound a value to a name. As we see above, there's a logic to it. But we are also familiar with the same word being used in module functions: Option.bind, for example. How this fits together is another, important, part of the overall picture we are attempting to assemble. When we see code such as this:


Some 1
|> Option.bind (fun x -> Some (x + 2))

It's visible that we are binding the contextual value and assigning a name to it in the function that follows. In this case, x is 1, and Option.bind is just removing the boilerplate of matching on the input value: in the Some case it applies the function to the value, and in the None case it skips the function and passes None along.
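That boilerplate is small enough to write out by hand. What follows is a sketch of an equivalent, not the FSharp.Core source; the name bindOption is made up here to avoid shadowing the real function:

```fsharp
// A hand-written equivalent of Option.bind (hypothetical name: bindOption).
let bindOption f opt =
    match opt with
    | Some x -> f x      // a value is present: feed it to the continuation
    | None -> None       // nothing to bind: skip f and pass None along

let viaSketch  = Some 1 |> bindOption (fun x -> Some (x + 2))    // Some 3
let viaLibrary = Some 1 |> Option.bind (fun x -> Some (x + 2))   // Some 3
let skipped    = None   |> bindOption (fun x -> Some (x + 2))    // None
```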

There's a parallel here:

Some 1
|> Option.bind (fun x ->
  Some 2 
  |> Option.bind (fun y ->
    Some (x + y)
  )
)

There's that arrow again - the same as above, except that we're now dealing with Option values, rather than primitives. And this code is entirely reasonable. Your codebase is probably filled with situations where you will have chains of functions following this pattern. It's entirely predictable and normal. But it's the same arrow, and wouldn't it be nice if we could flatten it, in the same way that we can with primitive types?

Enter Computation Expressions.

F# doesn't ship with one for Option - and from a practicality standpoint it seems like overhead, since Option tends to be relatively short-lived in the context of workflows - either getting defaulted or converted to a richer type like Result in short order. But that doesn't mean we can't make one.

What makes it tick

Computation Expressions are based on creating a type - generally called a Builder - that implements by convention some combination of methods of specific known signatures. And while the documentation's table of those methods may seem unassailable to begin with, we can demystify it by looking at our own use case and its implementation.

The let! keyword, for example, requires a Bind method to be present, and in the table we can see it expects a signature of M<'T> * ('T -> M<'U>) -> M<'U> - and it initially looks frightening, just a soup of letters. But we can quickly make sense of it by substituting in our specific example: Option<'T> * ('T -> Option<'U>) -> Option<'U> and that's much more understandable! Indeed, while it's using tupled params, it's clearly just Option.bind!

type MaybeBuilder() =
  member _.Bind(x,f) = Option.bind f x

let maybe = MaybeBuilder()

let flat = maybe {
  let! x = Some 1
  let! y = Some 2
  // Not complete yet: the block still needs a final expression to produce a result
}

Note here that the term Maybe comes up in this (and all other similar examples) both because it's a friendly alternate name for Option, and because the CE-style keyword option is already used as a postfix type notation, so maybe keeps it simple.

Along with that, a note of curiosity: maybe is just a name bound to an instance of our Builder type. There's no magic here either - and indeed the consistency remains - you can use the full class name as well (but why would you?):

let flat = MaybeBuilder() {
  let! x = Some 1
  let! y = Some 2
}

But we haven't finished yet, our arrow isn't entirely flattened, we still need the sum of these two values. To do so, we just need to add support for the return keyword, via the Return method ('T -> M<'T>). We can logic this one out: If I give you a value, then clearly you have Some value, and that's what we want Return to do:

type MaybeBuilder() =
  member _.Bind(x,f) = Option.bind f x
  member _.Return(x) = Some x

let maybe = MaybeBuilder()

let flat = maybe {
  let! x = Some 1
  let! y = Some 2

  return x + y
}

And thus, we've flattened the arrow. It's all still just that chain of function calls under the hood, but it looks like it isn't. It's easier to read, it doesn't arrow, and we can ignore some of the complexities.
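To see that chain of function calls for ourselves, here is a sketch that puts the sugared form next to a hand-written approximation of what the compiler produces. The real translation has a few more details, but the shape is this:

```fsharp
type MaybeBuilder() =
    member _.Bind(x, f) = Option.bind f x
    member _.Return(x) = Some x

let maybe = MaybeBuilder()

// What we write:
let sugared = maybe {
    let! x = Some 1
    let! y = Some 2
    return x + y
}

// Roughly what it becomes: nested calls to our own Bind and Return.
let desugared =
    maybe.Bind(Some 1, fun x ->
        maybe.Bind(Some 2, fun y ->
            maybe.Return(x + y)))

// A None anywhere short-circuits the rest of the chain.
let shortCircuited = maybe {
    let! x = Some 1
    let! y = None
    return x + y
}
```

Both sugared and desugared evaluate to Some 3, while shortCircuited is None because the second Bind never calls its continuation.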

And thus armed with this knowledge...

Let's return to our earlier example:

let unsafeAsyncOperation x = asyncResult {
    let! step1 = doUnsafeThing x
    let step2 = thenSomething step1
    let! step3 = httpGet "http://laenas.github.io"

    return getResult step1 step2
}

Where our compiler was complaining about wanting an Async, getting a Task, all that good stuff. With what we now know, can we puzzle out a reasoning to that error, and in doing so, a solution?

We first need to think about what it is that the CE is doing, with all that binding. We might be inclined to handwave it away as some sort of magic, we've learned the incantation, we're good to go. But it's important for us to connect the dots, to think it through, and see the bigger picture for what it is. Let's do so by looking at the CE in the above example: asyncResult - we don't need to know the exact implementation - can be inferred to work with the type Async<Result<'T,'TError>> - but something very curious is happening here, at the head of which we might ask the question: Why not just use async {}? And the answer can be drafted quickly enough:

let asyncResultX : Async<Result<int,unit>> = async {return (Ok 1)}
let asyncResultY : Async<Result<int,unit>> = async {return (Ok 2)}

let func = async {
  let! x = asyncResultX
  let! y = asyncResultY

  return x + y //ERROR
}

We get an error here because you cannot directly add, with the + infix operator, two Results. They themselves would need to be bound. So how can something like asyncResult {} get around this?

Consider how CEs work: let! is not magically, presciently, stripping away the context of Async or Option - nor is it automatically calling to Option.bind - the builder type is something in our control, and the compiler desugars let! into a call to our Bind method. It's all in the control of the developer of the CE! And since code is just code, we can do what we need.

How can asyncResult {} allow you to let! bind Async<Result<>> to get at the inner value? Well, it might look something like this:

//snip
member _.Bind(x,f) = async {
  match! x with
  | Ok x'' -> return! f x''
  // Rebuild the Error so its type matches whatever f would have returned
  | Error e -> return Error e }

We use an async CE to let us match! (bind, then match on the inner value) and that lets us unwrap the Result to get at the innermost value to pass it to our continuation function. As an aside, return! does exactly what you'd expect from a ! - it binds and then returns. Since our continuation function by necessity must return an AsyncResult - if we were to return from the async {} we would have an Async<Async<Result<>>>.
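For something to actually run, here is a minimal self-contained sketch of such a builder. The AsyncResultBuilder type and everything in it are assumptions for illustration; real libraries offering an asyncResult CE implement many more members than this:

```fsharp
// Hypothetical minimal builder: just enough for let! and return.
type AsyncResultBuilder() =
    // Bind: wait for the Async, then continue only if the Result is Ok.
    member _.Bind(x: Async<Result<'T, 'E>>, f: 'T -> Async<Result<'U, 'E>>) : Async<Result<'U, 'E>> =
        async {
            let! result = x
            match result with
            | Ok value -> return! f value
            | Error e -> return Error e
        }
    // Return: wrap a plain value in both contexts.
    member _.Return(value: 'T) : Async<Result<'T, 'E>> = async { return Ok value }

let asyncResult = AsyncResultBuilder()

let okX : Async<Result<int, string>> = async { return Ok 1 }
let okY : Async<Result<int, string>> = async { return Ok 2 }
let failing : Async<Result<int, string>> = async { return Error "boom" }

let sum = asyncResult {
    let! x = okX
    let! y = okY
    return x + y
}

let failed = asyncResult {
    let! x = okX
    let! y = failing
    return x + y
}

let sumResult = Async.RunSynchronously sum          // Ok 3
let failedResult = Async.RunSynchronously failed    // Error "boom"
```

Inside the block x and y are plain ints, so the addition that failed under async {} now compiles, and the first Error skips everything after it.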

The key here is to recognize that since it's all just code, it's possible to tweak and adjust this code to suit our use cases. We're able to create CEs for composed cases like AsyncResult - just as we could do so for AsyncOption - what would that look like? Give it a shot! But I draw upon the greater impression: it's all just code, and we can customize it in other ways as well, using our foundational principles!

Let's return to our maybe {} builder. Let's watch it in motion!

type MaybeBuilder(format) =
  member _.Bind(x,f) = 
    printfn format x
    Option.bind f x
  member _.Return(x) = Some x

let maybe format = MaybeBuilder(format)

let flat = maybe "--binding: %A" {
  let! x = Some 1
  let! y = Some 2

  return x + y
}

Take a moment and consider the somewhat unexpected composition here, don't just understand what it is doing, but how it is possible to combine things in this manner. The tool here isn't the pattern of how to debug print in a builder - it's how you can customize a builder, and therefore a CE; from 'outside' the CE itself, as a consumer of it.

MaybeBuilder() isn't magic, it's just a class type, and like any other class type it can have constructor parameters that it uses elsewhere in its internal logic.

maybe isn't magic, it's just a binding, and we've simply changed it from a value binding to a function that is parameterized.

And now our CE prints out the incoming binding values when our Bind method is called.

What other compositional logic can you, reader, envision being useful and interesting to add to a CE?

And thus, our solution

We return, finally, to that initial error. Confused as we were when the compiler barked about type mismatches, and we wrestled with exclamation points, do we see it more clearly now?

Inside our hypothetical asyncResult {} we've attempted to bind a Task - perhaps even a Task<Result<>>! But what the compiler is hinting towards is that the builder for our CE doesn't operate against that type. A conclusion forms: We need to turn our Task into an Async - and thus a call to Async.AwaitTask resolves our problem.
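As a small sketch of that conversion, using the plain async {} builder so it runs on its own: fakeHttpGet is a made-up stand-in for the earlier httpGet that avoids a real network call. Whether a given asyncResult builder also needs the value wrapped in Ok depends on which overloads that builder provides, so treat this as the Task-to-Async step only:

```fsharp
open System.Threading.Tasks

// Hypothetical stand-in for httpGet: returns a completed Task, no network involved.
let fakeHttpGet (url: string) : Task<string> =
    Task.FromResult(sprintf "response from %s" url)

// async { } binds Async values, so the Task is converted before let! sees it.
let fetchBody url = async {
    let! body = fakeHttpGet url |> Async.AwaitTask
    return body
}

let body = fetchBody "http://example.invalid" |> Async.RunSynchronously
// body = "response from http://example.invalid"
```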

When we see our own code arrowing rightwards across the indents, we might be able to stop and consider building our own CE, to not only improve readability, but declutter and reduce opportunities for error and confusion when handling increasingly nested functions.

There's a lot more to be said about CEs, and perhaps in the future the topic can be revisited, but in this instance they're a proxy, a show pony, for the greater lesson I'm trying to deliver: There Is No Magic. Everything makes sense, and when faced with the strange, the unknown, the confusing, or the magical - we should use those impressions as a clue to ourselves to strap in and analyze the problem.

It'll save you a lot of time in the long run.

Three words

If there's one thing, one rule, that I hope to get across in this pile of words, it's that functions are values.

Functions. Are. Values.

These three words are a key principle that unlocks a door to a different way of solving problems, a significant step towards writing simpler, more functional code. But there often seems to be a divide between three things: being able to read this - which everyone gets; the ability to use it when presented with an API expecting it - something I think most developers these days do, even if they don't realize it; and the ability to wield the idea to design our own code with this concept in mind. The pause appears to occur most severely on that last step - as though standing on the bridge over the chasm has given pause and a wondering which side is closer. Let's help cross that bridge, the better to add another tool to our bags.

A value or a name

What do we mean when we say 'functions are values'? Let's attack this as specifically as possible, and ensure that we've defined our terms.
First: values. Every developer understands values; they're so critical to our work that needing to define or think about them is like trying to be conscious of the act of breathing.

let x = 3
let y = "foo"
let z = [9;8;7]

Here, we have some values. x is an int with a value of 3. And so on. Left hand side is a name, right hand side is the value.

These also mean that there is referential equivalence between using the name and the value itself. In the above, anywhere I can use the raw value 3, I can now use x and achieve the same result.

Then: functions. Every developer understands functions, they're also fundamental to our work, regardless of language or paradigm. They allow us to name and capture a subset of our code in such a way that we can both split up our program textually, but also reuse parts of the code from multiple places. Functions have one defining characteristic: They take some values as arguments and produce some other value as a result.

let double x = x * 2
let add x y = x + y

We have a few arithmetic functions: double takes a single argument and multiplies it by two. add takes two arguments and sums them. We're all perfectly familiar with this: functions taking values and producing values.

Left hand side: name.

We're all used to declaring functions. We're all used to calling them. And in what seems to be most languages these days, we're even comfortable using them as arguments:

[1;2;3]
|> List.map (fun i -> i * 2) //[2;4;6]

This sort of lambda syntax comes naturally once you've used it for even a short while. I just have to provide a little transformation function and it'll call it with each thing in my list. Sure, this makes sense.

But here's the question: If functions are defined because they take values and produce other values, and List.map is a function, then what are the values that we give it as arguments? Definitely our list of [1;2;3] sure - but...also that lambda.

Let's rewrite the above, slightly:

let double x = x * 2

[1;2;3]
|> List.map double

See it?

Right hand side: value.

A quick recap of points.
Values can be given a name: let x = 3
And afterwards, we can use either the basic value or the name interchangeably: double x and double 3 will both produce the same result.
Functions take values as arguments to produce new values: let double x = x * 2

And now, with List.map as our go-to example function, we've seen that the mapping function we use can be passed in as either a lambda, or a previously declared and named function. Repeated with emphasis: It can be passed in as either a lambda or a named function. A value or a name.

Now, the weeds

Let's look at the same simple function, but written three separate ways:

let add x y = x + y

let add x = (fun y -> x + y)

let add = (fun x y -> x + y)

These are all conceptually equivalent - and all can be called identically: add 1 2 will produce 3. But it's worth it to pause for a moment and consider the three forms:

Everyone is familiar with the first form - we declare the name, we name the parameters, and we do something with them. It's the baseline way of working with functions[1]

The second form is one that we tend to see while learning F# - oftentimes in examples involving partial application. But it's here that we also may start feeling a fog set in: Because we tend to learn, by default, that functions have to return solid things, values, something - for lack of a better word - tangible. We're used to the concept of something like add giving us a number back. So there's something a little weird and alien at the idea that this...thing...isn't a number, it's a function. But it's also here where we probably have that lightbulb moment, at least in part, that 'ok, cool, so I can make lambdas and return them so that they can be called!'. But that's like learning to walk, without learning to run.

The third form - and I want to be perfectly crystal clear that I sincerely don't advocate for writing functions in this way - is here solely to unify some examples. But here we have something that looks more like what we see in all non-function cases in the language - a name on the left, a value on the right. It lays bare the fact that our function is just 'a lambda' with a name. Equivalency, the same as let x = 3.

Put to good use

While this is trivial - in the case of add - or familiar to a point of acceptance, in the case of List.map - there's something more powerful lurking here that deserves a little bit more attention. It's the power to dramatically shift the overall flow of an application without needing to dramatically alter the primary flow path of the code.

Consider an example: We have an application that will search an InputDir for files with ExtensionSuffix and we'll output whatever work we're doing to some OutputDir. We also have some defaults.

type AppConfig = {
    InputDir: string
    ExtensionSuffix: string
    OutputDir: string
}

let defaultConfig = {
    InputDir = "./search"
    ExtensionSuffix = ".log"
    OutputDir = "./results"
}

But we want to allow users to override these settings using some form of runtime config. For the sake of brevity, we'll ignore where this configuration comes from, as well as how it gets parsed. But we'll assume that what we have to work with is something in the form of:

type ConfigOption =
    | InputDir of string
    | ExtensionSuffix of string
    | OutputDir of string

And we'll assume these are parsed into a list.

Now, the first way newcomers to FP will tend to approach such a problem has the air of other languages to it - especially those without first-class functions:

let configureApp (options: ConfigOption list) =
    let config = defaultConfig

    let inputDir =
        options
        |> List.tryFind (fun opt -> 
            match opt with
            | InputDir _ -> true
            | _ -> false)

    let config = 
        if Option.isSome inputDir then
            {config with InputDir = Option.get inputDir}
        else
            config

    //Snip - you get the picture    

Do note, none of the above is 'real' code, it's just an off-the-cuff amalgamation of the sort of things that seem to come up rather repeatedly. And somewhere around here, the question rightfully tends to come up as well: "How can I easily test which case a union type is?" (Which is generally a smell in its own right.) As well as having to deal better with the Option that's caused by dealing safely with the list, branching in general - it's painful. I earnestly think it is likely to push some people away - because in this sense, something like C# looks cleaner - you aren't forced to else the if and can use null-coalescing with LINQ to avoid the Option. And those feelings will also make us potentially lean towards trying to model all our individual options as their own types - and write helper functions for them - but then we start to recognize the duplication of logic down that road as well. But what about if we take what we've learned about functions and apply it here, what can we do?

let configureApp (options: ConfigOption list) =
    options
    |> List.map (fun opt ->
        match opt with
        | InputDir dir -> (fun cfg -> {cfg with InputDir = dir})
        | ExtensionSuffix ext -> (fun cfg -> {cfg with ExtensionSuffix = ext})
        | OutputDir dir -> (fun cfg -> {cfg with OutputDir = dir}))
    |> List.fold (|>) defaultConfig

What's going on here? The major difference here is in the recognition of three key details: First is that our options - conceptually, not in code - just want to update the config; thinking about this in F#, pretending to write that function itself, we can notice how the shape AppConfig -> AppConfig just feels natural. Take a config, return an altered config. Second is what we're trying to do: take an AppConfig (our default), pass it to our first function, then take that function's output and pass it to the second, and so forth. That's pipelining - but since we can't know at compile-time which functions to call we can't just write a pipeline, so we need a way to create a 'dynamic' pipeline. The third and final piece of the puzzle, that ties the whole thing together, is that functions are values - as we saw above - and so it's possible in the first place to have something like an (AppConfig -> AppConfig) list - a list of functions. So we have a list of functions that by signature compose with one another, we just need a starting point, and some way to combine them sequentially - and that sure sounds pretty much spot-on for List.fold, doesn't it?

So we take our list of options. We transform each one into a function that updates a config - keep in mind that they don't have to be lambdas: they can be named functions. It's entirely the same! But now that we have a list of functions that update a config, we fold them over our default config. Admittedly, using the |> operator in such a way is a flourish, but it conceptually pairs better with that previous realization that what we want is to pipeline. We could have used an explicit folder function as well:

//snip
|> List.fold (fun cfg optF -> optF cfg) defaultConfig
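Pulling the pieces together, here is a self-contained sketch of the same idea using named functions instead of lambdas. The helper names (withInputDir, withExtension, withOutputDir, toUpdater) are my own for illustration; the types and defaults are the ones from above:

```fsharp
type AppConfig = {
    InputDir: string
    ExtensionSuffix: string
    OutputDir: string
}

let defaultConfig = {
    InputDir = "./search"
    ExtensionSuffix = ".log"
    OutputDir = "./results"
}

type ConfigOption =
    | InputDir of string
    | ExtensionSuffix of string
    | OutputDir of string

// Named updaters: given a setting, each yields an AppConfig -> AppConfig.
let withInputDir dir (cfg: AppConfig) = { cfg with InputDir = dir }
let withExtension ext (cfg: AppConfig) = { cfg with ExtensionSuffix = ext }
let withOutputDir dir (cfg: AppConfig) = { cfg with OutputDir = dir }

// Turn each option into the function that applies it.
let toUpdater opt =
    match opt with
    | InputDir dir -> withInputDir dir
    | ExtensionSuffix ext -> withExtension ext
    | OutputDir dir -> withOutputDir dir

let configureApp (options: ConfigOption list) =
    options
    |> List.map toUpdater
    |> List.fold (fun cfg update -> update cfg) defaultConfig

let configured = configureApp [ InputDir "./in"; ExtensionSuffix ".txt" ]
// configured.InputDir = "./in", configured.ExtensionSuffix = ".txt",
// and configured.OutputDir keeps its default of "./results"
```

Because the fold runs left to right, if the same option appears twice the later one wins, and an empty list simply hands back defaultConfig.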

When people say that F# is a terse language, and packs a lot of power per LOC relative to C#, this is the sort of thing they seem to mean. Yes - C# is shorter and nicer when we compare direct, naive solutions - but when you lean into the power of the language, F# starts to be shorter, more expressive, and still somehow more type-safe - the union means we know that all config options get checked, and not needing to do an equivalent of tryFind to figure out if an option exists means we can just directly map whatever we do have and work with it. And while you can achieve a similar behavior in C# with Func<TConfig,TConfig> and LINQ's Aggregate, both of those will result in much less comprehensible code - to say nothing about the trouble in trying to model the options themselves without a union.

In summary

Functions are values and behave similarly to all other values we use in the language. The left hand side of a let is a name (and some parameter names, for functions) - and the right hand side is the value (and what we do with those params, for functions). This goes from ints to lists to classes to functions. And anywhere we can use more primitive types, we can also use function types themselves:

//Record fields (though Interfaces can be recommended)
type FuncRecord = {F: int -> int}

//Union cases
type FuncUnion = FuncUnion of (int -> int)

//Type argument to generic types
type FuncList = (int -> int) list

//Tuple members
type FuncTuple = (int * (int -> int))
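And a quick sketch of those four types in use, to show that the functions inside them come back out as ordinary, callable values. The bindings below (double, wrapped, pipeline and so on) are illustrative names only:

```fsharp
type FuncRecord = { F: int -> int }
type FuncUnion = FuncUnion of (int -> int)
type FuncList = (int -> int) list
type FuncTuple = (int * (int -> int))

let double x = x * 2

// Record field holding a function
let wrapped = { F = double }
let fromRecord = wrapped.F 4                                    // 8

// Union case holding a function, unwrapped by pattern
let (FuncUnion unwrapped) = FuncUnion (fun i -> i + 1)
let fromUnion = unwrapped 4                                     // 5

// A list of functions, applied in order to a starting value
let pipeline : FuncList = [ double; (fun i -> i + 1) ]
let fromList = pipeline |> List.fold (fun acc f -> f acc) 1     // (1 * 2) + 1 = 3

// A tuple pairing a value with a function
let (seed, step) : FuncTuple = (10, double)
let fromTuple = step seed                                       // 20
```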

And in the same way that we're all familiar that things like LINQ's Select (F#'s map) are more expressive than writing a full foreach or list expression to do the same work - we can also use that same elegance, that same power, in our own code outside the BCL itself. Because whether they come from the BCL, or our own code, functions are just values. So let's use them as such!

I promise not to talk about LEGO

We've all heard that one before, and I think people might be getting a little tired of it. (Even if it's a good metaphor.) But it's undeniable that, in the world of F# (and from my perusing, statically-typed functional programming as a whole), the signatures of things are given a markedly more central role. Immensely more so than in our sibling language C#, to use the reliable old point of comparison. I've reflected upon this a fair bit, recently, and come to the conclusion that it boils down to two primary differences:
First, foremost, we can actually express them without the patience of a monk and the memory of a pub quiz champion. Consider the differences between these two equivalent things (and don't worry too much about making sense of either, we'll get there later, this is just a run-on sentence in a run-on introduction):

('a -> 'b -> 'a) -> 'a -> seq<'b> -> 'a
TAccumulate Aggregate<TSource,TAccumulate> (this System.Collections.Generic.IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate,TSource,TAccumulate> func)

Second, there's a much stronger compositional inertia in a language like F#; and part of that composition is being able to quickly and easily understand how pieces fit together. There's a reason why tooling like Ionide adds signature hints liberally, and FSI prints them for all bindings.

So let's walk through this together, starting simple, and working upwards. And each step of the way, let's consider how to reason about what we're doing in a way that will develop our intuition about how to work with unfamiliar code and libraries.

No, no, not the T word, I thought we weren't going to talk about the T word

Types. Whoo, been holding that in for the entire introduction.
When we talk about signatures what we're talking about is types - or, more clearly, a specific subset of types. The types of functions. But before I get there, I want to start from the ground up, to make sure we're really clear about the distinction between a Type and a Value - because that's going to be important to understand before moving on.

let ten = 10

We have three elements in this snippet:

  • A binding name: ten
  • A value: 10
  • The type of that value, and thereby, of ten: int

We can phrase this more conversationally as: ten is an int with a value of 10. "hello" is a string - 'x' is a char - (1,2) is an int * int and () is unit.

That last point is one to linger on for just a moment: () is a value of the type unit - a type that has only that one value. There are no magic voids in the F# type system, everything has a clearly defined type and value.

This rule has two implications, beyond the simple values above, which I suspect everyone can reason easily about.

The first is with regards to generics:

let abc = ['a';'b';'c']

abc is a char list - or put another way - a List<char> and the type parameter there is important. Because you can also have List<int> or List<string> - but they all behave the same, they are all, fundamentally, a list. Or, to wit, 'a list. Because in F# we denote type parameters for our generic types as 'a and 'b and 'anyNameReally and so forth. It just needs to start with that tick.

The second is with regards to functions:

let addOne i = i + 1

addOne is an int -> int. The -> notation in F# indicates a function. The final type in the signature is the output type of the function, the rest are the parameters that are sent as input to the function. So in this case, exactly as we see in the code - we want one input (an int) and from that the function will produce an output (an int).
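The same reading rule scales to more parameters. As a small sketch, here are a few functions with their signatures, plus annotated bindings that ask the compiler to confirm we've read them correctly. The names add, describe and the ...Checked bindings are illustrative only:

```fsharp
let addOne i = i + 1                                                  // int -> int
let add x y = x + y                                                   // int -> int -> int
let describe (n: int) (label: string) = sprintf "%s: %d" label n      // int -> string -> string

// Annotating a binding with a signature makes the compiler check our reading of it;
// a wrong signature here would be a compile error.
let addOneChecked : int -> int = addOne
let addChecked : int -> int -> int = add
let describeChecked : int -> string -> string = describe

let described = describeChecked 3 "count"    // "count: 3"
```

In each case the last type is the output and everything before it is an input, in order.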

Thinking our way through the basics

So we have looked at simple values. They have types. Probably not much confusion there. And when dealing with generic types, like list, we can have types that generically operate on other types, their values dependent on that other type. And with functions, they have types - signatures - also.

But stop and read that through again. There's a logical association at play.
Values have types. 2 is an int.
Generic values have types. [|1;2;3|] is an int array.
And functions have types. But where did the value in that statement go? What is the value of a function?

It's not its output. It's not its binding name. It is the function itself.

Consider for a moment how we think about values and types otherwise: Consider 3. What can we say about it? It's an int, yes. Its value is three - as in the natural number. But we can't necessarily do much with just 3. It's a fine value, but it's just a single value. Consider "shanty town faded sub-orbital city range-rover rain singularity artisanal modem" - it's a fine string, but do we really want to pass that around everywhere we might need it? Nope, so we bind it to a name that's easier to work with:

let lorem = "shanty town faded sub-orbital city range-rover rain singularity artisanal modem"

And we can now use these interchangeably, because for all intents and purposes, they are the same thing. We put the name beside let and then a value after the = and now we have associative equivalence.

So when we look at a function like this:

let addOne i = i + 1

We can say that addOne has the signature int -> int. But the value of it is the entire function itself - more easily seen (but please don't write it this way in day-to-day code) by simply writing our function more like every other value we work with:

let addOne = (fun i -> i + 1)

Pause there. Both of those addOne declarations are functionally equivalent. Both can be called by simply doing addOne 1 (producing 2). And this is because - and this is critical for folks not used to functional programming: Functions Are Values

Ok one more time, because this is going to get very important very soon.

Functions are values. They can be bound to a name (often are!) but they can also be used in their 'value' form - generally just called a lambda (as in other languages as well).

In exactly the same way that we can assign the value 3 to a name, and then use either interchangeably, we can do the same with functions.
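A small sketch of what "interchangeably" means in practice. The names increment and steps are made up for this example; the point is only that a function can go anywhere a value can.

```fsharp
let addOne i = i + 1

// Bind the very same function value to a second name
let increment = addOne

// Store function values in a list, exactly as we would store ints or strings
let steps : (int -> int) list = [ addOne; (fun i -> i * 2); increment ]

// Call each stored function with 10
let results = [ for f in steps -> f 10 ]   // [11; 20; 11]
```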

Higher order readership

So to briefly recap: All values have a type, and can be bound to a name. Functions are values, and by previous definition, have a type signature. That signature is a definition that describes the input(s) and output by type. We can call that function, to produce a new value, by sending values as input(s).

There's a logical loop in our definitions there. If you aren't already familiar with it, I strongly encourage thinking about it. Functions are values and functions take values to produce new values. This means, very simply, that functions can accept other functions as arguments. When you see the term 'higher order functions' this is (part of) what that means.

Let's build one for ourselves!

let transformInt i (f: int -> int) = f i

Surprisingly simple, right? The type of transformInt is int -> (int -> int) -> int - note the parentheses there that distinguish the second parameter as a function, as compared to an int -> int -> int -> int which is a function taking 3 int arguments. Depending on your experience, this can be a significant insight, and may not settle immediately. Make a cup of coffee, think about it, consider any problems you've worked on that perhaps may have benefitted from being able to change the behavior of a function by using a function parameter.

The subtlety here is that any function of the form int -> int can be used here. And we can use their values directly, as well!

let transformInt i (f: int -> int) = f i
let addOne i = i + 1
transformInt 2 addOne //produces '3'

transformInt 3 (fun i -> i - 2) //produces '1'

Reasoning about Signatures

Hey, that's the title of the article!

So we know now what signatures are, and how to read them. But that's trivia, right? I mean, we still need to stare at a mountain of documentation to understand what they do, don't we? While partially true, and while samples are always helpful, understanding how to reduce unfamiliar code to raw signatures - combined with helpful names on the functions themselves - is a powerful tool to help every developer. Especially in a non-top-5 language like F#, not every library is documented to the gills and with rich samples for every use case. Indeed, several times a week there are people in the various F# communities asking How-To questions about libraries that have reasonable documentation, but simply lack fully-featured sample code. Understanding how to sift through the signatures in a library - either in API documentation or via Intellisense - leads to rapid productivity gains.

But how do we do it? Well, the primary trouble seems to be around generics. When we talk about an int -> int there's something easy to grasp - give an int, get an int. Even when we have different inputs and outputs: int -> char -> string gives us something I think we can reason about easily. But I suspect something starts to short circuit for a lot of people when we move to a signature that looks like ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c

That looks indecipherable at first. And oftentimes our first instinct internally when faced with something like that is just to throw up our hands, panic, shut down, and just decide it's too much. But hold yourself steady for a minute, because we're going to apply our thinking powers and we are going to conquer this!

As with any problem, let's start small.

'a

We know this is a generic, but unlike the ones we've seen previously it's not attached to anything. There's no list or option. It's entirely alone. So in that sense, it can be...anything. Any type whatsoever. Let's just scribble in something simple, to help ourselves think. And remember, we're using types and not values right now. A forewarning: this is about helping that panic to subside, and introduces a fatal logical flaw to our reasoning, that we'll need to remove later by going back to the generic signature, but for now, let's get our mind right.

(int -> 'b) -> ('b -> 'c) -> int -> 'c

Ok, one thing down. Now it's the same problem again. Let's choose another type for 'b, just to mix things up.

(int -> string) -> (string -> 'c) -> int -> 'c

And, once more, for completeness:

(int -> string) -> (string -> char array) -> int -> char array
('a -> 'b) -> ('b -> 'c) -> 'a -> 'c

There they are, side-by-side. And while the top one feels a little more approachable, filling in the types like that is just a means to reason a little bit more about what a function like this is doing. Don't forget that those types are chosen arbitrarily and could be anything!

Now, with things simplified for the sake of understanding, we can continue methodically and logically thinking about this function.

We know that it has three parameters: (int -> string), (string -> char array), and int. And that it produces a char array.

We can tell that the first two parameters are functions. This can lead us, through logic, to a conclusion that this function on the whole must be reasonably simple - because if it's taking functions as parameters, then it is altering its behavior based on those functions, rather than holding the logic itself. There's a hypothetically large number of ways that a function like int -> string can be implemented, and so by taking that signature as a parameter, we can't know which one is going to be used. Or, put another way, we've declared with this overall signature that any int -> string function will do. That's a wide net to cast, and we can work backwards from it to our conclusion that there must not be 'very much' concrete logic happening in this function.

The other logical conclusion we can draw is that because of that abstracted behavior, there must be some relationship between the parameters. Since we're taking in a function, we're probably intending to call it somehow, and in the case of int -> string that means we are going to need an int. We can find it as the third parameter - which logic demands must be used as an input to the first parameter.

Wait. That's wrong. There could be a hardcoded int inside the function that is used. And here's the fatal flaw in thinking about generic signatures from the standpoint of concrete types:

(int -> string) -> (string -> char array) -> int -> char array can have unknown, hard-coded behavior that we can't make strong logical assumptions about.
('a -> 'b) -> ('b -> 'c) -> 'a -> 'c cannot.

You cannot write a function that holds this signature and hides away a (meaningful) hardcoded 'a to be used with the function parameters because you can't know what the type of 'a might be. It's this very logic that lets us presume that the 'a used as a parameter is used as input to the 'a -> 'b.

So we can see that there's a strong association between the first and third parameters, and as we scan over the rest of the signature, we can draw some other logical conclusions: Our second parameter is a 'b -> 'c which means we need to get a 'b from somewhere. Unlike our 'a - there isn't one sent into our function alone. But from the signature itself we see one is the output of the first parameter. So it's reasonable to conclude that the output of the 'a -> 'b is used as input to 'b -> 'c.

And then our final output, a 'c is produced by our function, and that's also the output of the second parameter. So that makes sense.

So what can we reason out about this function's behavior solely from its signature? 'a is passed into the ('a -> 'b) to give us a 'b. That 'b is passed into the ('b -> 'c) to give us a 'c. And that 'c is the output of this very function.

('a -> 'b) -> ('b -> 'c) -> 'a -> 'c is the signature of function composition - present in F# via the >> operator.

Stop and take a moment to reflect on what we've just done. Imagine looking at >> in your IDE and seeing that signature and just...gasping. But we've walked through it, step by step, and made sense of exactly what it's doing. That's huge! We've done it without needing to see examples of it being used. We've done it without needing to reach for reliable old metaphors about Danish plastic. Well done!
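For anyone who wants to check that reasoning against running code afterwards, here is a minimal sketch. compose, toText, and toChars are hypothetical names written for this example; only >> comes from F# itself.

```fsharp
// A hand-written function with the signature ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c.
// The only thing it can do with x is feed it to f, and the only thing
// it can do with that result is feed it to g.
let compose (f: 'a -> 'b) (g: 'b -> 'c) (x: 'a) : 'c = g (f x)

// The arbitrary concrete types we scribbled in earlier
let toText (i: int) : string = string i                   // int -> string
let toChars (s: string) : char array = s.ToCharArray()    // string -> char array

let viaCompose = compose toText toChars 42     // [|'4'; '2'|]
let viaOperator = (toText >> toChars) 42       // [|'4'; '2'|]
```

Both lines produce the same array, which is what the signature led us to expect.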

Once more, with less flourish.

There's an awkward flaw in written tuition like this - I'm victim to it as often as you. It's so easy to read something, especially when it is in a teaching pace and tone, and feel very good about understanding it, but then walk away not really able to apply that knowledge. Part of that, I think, is that so many of these sorts of articles in development tend to make their point once and move on, understanding assured. Instead, I want to walk through approaching signatures using reasoning with you here, with slightly less boilerplate, to show how this isn't about a single magical >> operator - but that it works universally and really can be applied.

Up at the start, we glanced at this one: ('a -> 'b -> 'a) -> 'a -> seq<'b> -> 'a

Uff! It's a weird one! But is it really? Our parameterized function uses an 'a and a 'b to produce an 'a. We once again see that we send in an 'a as another parameter. So let's use the same logic - that it's used as input to that function. We also take in a seq<'b> and that 'b matches our function...but not the seq. It's notable that there's no other seq in the signature. So we take in this collection of 'b values but we can't hand the collection itself to our function, nor do we output a seq.

Logic leads us to consider that we must be iterating over all the values in that seq - because if we were using only a single value, then this function would be better stated as ('a -> 'b -> 'a) -> 'a -> 'b -> 'a (itself nonsensical, since we could just call the function parameter directly); and if we wanted some subset of values from the seq - then we'd still be handling a seq and so the matter of selecting a subset would be better handled in its own function (or in yet another parameter that somehow indicated how many to use, like an int or an 'a -> bool). We can also explore the idea that we keep using our single 'a with each 'b in the function...but that would give us a seq<'a>, which doesn't match our output, either.

So we can break it down: We call our function with our 'a parameter and the first 'b and that gives us...an 'a - which we can use with the second 'b to get an 'a. Once we're out of 'b then we would be left with just the last 'a.

And that's exactly what this function does. Seq.fold (which has siblings in List and Array, amongst others). We've sussed out the behavior without seeing the code in action, or even the (very helpful) documentation about it.
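If you'd like to confirm that after the fact, here is a sketch of the behavior we reasoned out. myFold is a hypothetical hand-rolled version written for this article, not how the core library implements it; Seq.fold is the real thing.

```fsharp
// Hypothetical fold that follows the reasoning step by step:
// start with the 'a we were given, then feed it through f once per 'b
let myFold (f: 'a -> 'b -> 'a) (initial: 'a) (items: seq<'b>) : 'a =
    let mutable state = initial
    for item in items do
        state <- f state item
    state

let total = myFold (fun acc n -> acc + n) 0 [1; 2; 3; 4]           // 10
let sameTotal = Seq.fold (fun acc n -> acc + n) 0 [1; 2; 3; 4]     // 10

// 'a and 'b do not have to be the same type
let joined = Seq.fold (fun (acc: string) (n: int) -> acc + string n) "" [1; 2; 3]   // "123"
```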

Here's one that I have, prior to this, never actually used - just to level the playing field.

Seq.scan has a signature of ('a -> 'b -> 'a) -> 'a -> seq<'b> -> seq<'a>

That panic I mentioned earlier? Yeah, I just felt a rush of it. But almost all our logic from up above still stands, up until the very end. That's helpful. Instead of a single 'a we have a seq<'a>. Wait, with fold, we had reasoned that the 'a parameter must be used to bootstrap the chaining of our function, since we didn't get a seq<'a> on the output. But now we do. Does that mean it simply pairs our single 'a with every 'b? Well no, that's flawed. If we had a seq<'b> that we wanted to transform into a seq<'a> then we would just use Seq.map (('a -> 'b) -> seq<'a> -> seq<'b>) - so this must be doing something else. The other significant logical thing it could be doing is collecting all the output 'a from the function we pass in. And sure enough, that's precisely what it does.
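A quick sketch to put the two side by side, using the same adding function for both. One detail the signature alone doesn't reveal: the collected values begin with the initial 'a itself.

```fsharp
// scan keeps every intermediate 'a, starting with the initial state
let runningTotals =
    Seq.scan (fun acc n -> acc + n) 0 [1; 2; 3; 4]
    |> Seq.toList                                              // [0; 1; 3; 6; 10]

// fold keeps only the last one
let finalTotal = Seq.fold (fun acc n -> acc + n) 0 [1; 2; 3; 4]   // 10
```

The last element of the scan is the same value that fold returns.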

More than just a neat trick

None of this is to say that there is infallibility or that a signature can tell you everything. Seq.foldBack is ('a -> 'b -> 'b) -> seq<'a> -> 'b -> 'b - and without the hint in the name - or in the documentation - the signature alone doesn't clue you into the fact that it's operating from the end of the collection instead of the front. (Even though the raw behavior we reasoned out is correct).
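A small sketch of that difference in direction. String concatenation is used here only because, unlike addition, it makes the order of processing visible in the result.

```fsharp
let items = ["a"; "b"; "c"]

// fold walks from the front: "" then "a" then "ab" then "abc"
let fromFront =
    Seq.fold (fun (acc: string) (item: string) -> acc + item) "" items       // "abc"

// foldBack walks from the end: "" then "c" then "cb" then "cba"
let fromBack =
    Seq.foldBack (fun (item: string) (acc: string) -> acc + item) items ""   // "cba"
```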

But when we look at the F# ecosystem, there's a strong sense in projects and libraries that this sort of thing is implicitly understood. We talk a lot about the signatures of functions, because in so little text they can communicate so much. We tend to find a relatively comprehensive amount of API Reference documentation - with modules, function names, and their signatures - but much smaller amounts of sample code covering all the possible functions and how to call them and use them. We, as developers, can bridge that divide by learning skills such as this in order to empower ourselves. These sorts of things need not remain mysteries, or esoteric corners we avoid until illuminated by some other hand.

Let us use our own reasoning faculties to illuminate these dark corners. Let us realize that we are already capable of understanding and working through the code before us. We just need, sometimes, to sit down and consider it step by logical step. But we'll come back to that in a later article.

Type Safety is difficult (or at least irritating)

One of the challenges in moving to a language like F#, especially when your point of departure is a language like C#, is in coping with what seems to be a fixation on types. Whereas in C# it is commonplace and idiomatic to cast types to and fro, up and down their inheritance trees, such practices are discouraged and made more difficult in F#. While not uniquely the cause of confusion, being well-practiced in Object Oriented patterns appears to have a problematic effect when a developer approaches programming involving algebraic data types, especially sum types - such as Discriminated Unions in F#. They provide an enormous boost to type safety - reducing our need to worry about the permutations on data state - but that does come at a cost of actually needing to ensure that safety exists. It's a major stumbling block that I see again, and again, and again with F# novices, so let's take a slightly in-depth look at unions and how they can be used to solve familiar problems in different ways.

Everything old is new again

F#'s discriminated unions are, despite their simple nature, an unbelievably powerful tool in modelling a domain, because they allow us to cleanly model the 'either-or' nature of much of our data.

type Hypothesis =
    | True
    | False

Here, we can see how we can immediately constrain a Hypothesis to being either true or false. Immediately, we can see that this could also be modelled as a bool - which is true. OO-savvy readers will additionally notice that this looks a lot like an enum - and that is also true. But hang on, it gets better:

type Hypothesis =
    | True of Proof
    | False of CounterExample

We can define the cases with different data! In our example, we can acknowledge that a True Hypothesis consists of a Proof and a False Hypothesis consists of a CounterExample.
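Here is a minimal sketch of what working with those cases can look like. The article never defines Proof or CounterExample, so the two record types below, and the summarize function, are invented purely so the example compiles.

```fsharp
// Hypothetical payload types, assumed for this sketch only
type Proof = { Steps: string list }
type CounterExample = { Input: int }

type Hypothesis =
    | True of Proof
    | False of CounterExample

// Each branch only sees the data that belongs to its own case
let summarize hypothesis =
    match hypothesis with
    | True proof -> $"Proven in {List.length proof.Steps} step(s)"
    | False counter -> $"Disproven by input {counter.Input}"

let proven = summarize (True { Steps = ["assume"; "derive"] })   // "Proven in 2 step(s)"
let disproven = summarize (False { Input = 7 })                  // "Disproven by input 7"
```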

We should really stop to think for a moment about what this means, especially if we're used to Object Oriented type hierarchies.

Unions as alternative inheritance

Spend almost any time modelling a domain in an object-oriented fashion and you'll end up with examples of that leaky abstraction, or that hanging method, due to inheritance. At which point we go back to the drawing board and consider long and hard how to restructure things to ensure that we don't accidentally call the wrong method at the wrong time. We end up thinking about everything, not necessarily in terms of their own types - but of their base types, as well as their child types. What data properties should be private or protected or public - what is virtual or perhaps abstract?

Unions allow us to gracefully sidestep these issues of trying to keep track of what data bleeds to where in a class hierarchy. Each case defines its own data. Consider for a moment the parallels, in an ideal world: In C#, we often use inheritance as a means to pass child instances with different types by handling them as a shared ancestor class. Apple and Orange both inherit from Fruit. But with unions, we say clearly: A Fruit is either an Apple or an Orange - and there is nothing else.

While this initially seems constraining, understanding how it actually simplifies and solidifies our domain requires us to look a little more at the usage of unions.

Cases != Types

Consider the following definition:

type Fruit =
    | Apple
    | Orange

Here, we have defined one type. It's important to understand that. Neither Apple nor Orange are types. An extremely common mistake made by novices is to attempt to use unions as a form of shorthand for describing inheritance trees:

let core (apple:Apple) = //"The type 'Apple' is not defined."

We can't write a function that only handles Apples because, as the compiler tells us, Apple is not a defined type. This confusion is often magnified when using a union with cases named after other types (often records):

type Apple = {Type: string; Cored: bool}
type Orange = {Type: string}
type Fruit =
    | Apple of Apple
    | Orange of Orange

let core (apple:Apple) =
    {apple with Cored = true}

let myFruit = Apple {Type = "Ambrosia"; Cored = false}

core myFruit 
(* This expression was expected to have type
    'Apple'    
but here has type
    'Fruit' *)

In this instance, core takes an Apple (the type) but we give it a Fruit (of case Apple), and the compiler throws an error. And without understanding the critical difference between the case and the type, even the function signature of core (Apple -> Apple) doesn't immediately help us. We've given it an Apple! What more does it want!?
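One way out, sketched under the same type definitions: match on the Fruit to get at the Apple record inside it. coreFruit is a hypothetical helper added for illustration, and leaving oranges untouched is just one possible choice for the other case.

```fsharp
type Apple = { Type: string; Cored: bool }
type Orange = { Type: string }
type Fruit =
    | Apple of Apple
    | Orange of Orange

// Works on the record type only: Apple -> Apple
let core (apple: Apple) = { apple with Cored = true }

// Hypothetical helper that works on the union: Fruit -> Fruit
let coreFruit (fruit: Fruit) =
    match fruit with
    | Apple apple -> Apple (core apple)   // unwrap the case, core the record, re-wrap
    | Orange _ -> fruit                   // nothing to core, hand it back unchanged

let myFruit = Apple { Type = "Ambrosia"; Cored = false }
let cored = coreFruit myFruit   // Apple { Type = "Ambrosia"; Cored = true }
```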

Unions are not just glorified Enums

One common pattern seems to follow developers with experience with other languages:

type FruitType = 
    | Apple
    | Orange

type Fruit = {Type: FruitType}

Which seems logical, in isolation: We want a Fruit as a record type, which looks and feels familiar, like a class - and then we just use the union to indicate which type of Fruit any given instance is.

The problem here is that, in what I would estimate to be the majority of cases, we've sacrificed everything to be in this state:

In the simple case: If we want to do anything meaningful with the distinction between types of Fruit, we now end up needing even more pattern matching code - either by dotting into Fruit or by writing helpers to identify types for us: if IsApple someFruit then //... Both being ultimately redundant, since they're just hiding the union matching.

We lose in the complex case as well: We now have to be very careful about how we extend our Fruit type - all common traits must be shared across all Fruit types, and we introduce the need to do very careful error handling to track states. Apples have Cores, Oranges have rinds, Cherries have pits (shared with Peaches, let us not forget), and so forth.

A remedy worse than the ailment?

It's at about this point the realization dawns: In order to work with Fruit in the abstract - we have to pattern match it to do anything meaningful with it.

//For the sake of this example, we're not going to attach data
type Fruit = 
    | Apple
    | Orange

let prepare fruit =
    match fruit with
    | Apple -> //core it
    | Orange -> //peel it

let juice fruit =
    match fruit with
    | Apple  -> //This example just gave me a twenty minute long distraction
    | Orange -> //Into researching how to make apple juice at home

And so on. It's understandable that after some amount of this repetition, something starts to feel like it needs to be reduced. But it's important to note that we're playing by slightly different rules here - because of the earlier point: this is an alternative to type hierarchies. So whereas in that traditional inherited model, we'd hide away our differing behavior inside the child classes themselves, with unions it stands at the forefront and we gain compiler support to ensure that we always handle all our cases.
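A compilable sketch of that compiler support. The string results are placeholders invented here so the branches have something to return.

```fsharp
type Fruit =
    | Apple
    | Orange

// Every case is handled, so this compiles without warnings
let prepare fruit =
    match fruit with
    | Apple -> "cored"
    | Orange -> "peeled"

let prepared = [Apple; Orange] |> List.map prepare   // ["cored"; "peeled"]

// If a third case (say, Cherry) were added to Fruit, the compiler would flag
// prepare with warning FS0025 (incomplete pattern matches) until we handled it.
```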

The Prestige

So with a lot of boilerplate out of the way, here's the real why and how of the power of unions - we just need to move our example to something a little more involved:

//Let's imagine we're writing a game and want to model equipment that benefits a character
type Equipment =
    | Base of name: string * wisdomStat: int
    | Prefixed of equipment: Equipment * wisdomBonus: int * prefix: string

Do you see it? Yes? No? Let's pull that trick one more time.

type Equipment =
    | Base of name: string * wisdomStat: int
    | Prefixed of equipment: Equipment * wisdomBonus: int * prefix: string
    | Suffixed of equipment: Equipment * wisdomBonus: int * suffix: string

The power comes from that most terrifying r-word: recursion. Deep breath, it's alright, let's just reason about this for a moment!
In this example we'll see that there is (and must be, as with all recursion) our base case: Base - which specifically does not include other Equipment. The other two cases do, however, and if we think about how we'd go about creating those cases, we can work from a position of pure logic: In order to create a Prefixed we need to have some other Equipment...if we have some other Prefixed then...we're back where we started. And if we have Suffixed then we end up with a similar problem. But I can create Prefixed Equipment by passing some form of Base in, and that's that:

let heroicSocks = Prefixed(Base("Socks",0),3,"Heroic")
//Heroic Socks (+3 Wisdom)

And to step back to that logic, we can make Suffixed equipment in the same way. Even better, we can do both, and in either order!

let heroicSocksOfBar = Prefixed(Suffixed(Base("Socks",0),2, "of Bar"),3,"Heroic")
let heroicSocksOfFoo = Suffixed(Prefixed(Base("Socks",0),3, "Heroic"),2,"of Foo")

And note that nothing here stops us from continuing onwards: Fancy Oversized Blue Socks of Astonishment and Woe is just a matter of stacking Prefixed and Suffixed cases on top of one another, using the same, simple, elegant pattern.
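To make the stacking concrete, here is a sketch of two small recursive functions that walk those layers. displayName and totalWisdom are hypothetical helpers written for this example, and both assume bonuses simply add up.

```fsharp
type Equipment =
    | Base of name: string * wisdomStat: int
    | Prefixed of equipment: Equipment * wisdomBonus: int * prefix: string
    | Suffixed of equipment: Equipment * wisdomBonus: int * suffix: string

// Hypothetical helper: prefixes go in front of the inner name, suffixes behind it
let rec displayName equipment =
    match equipment with
    | Base (name, _) -> name
    | Prefixed (inner, _, prefix) -> $"{prefix} {displayName inner}"
    | Suffixed (inner, _, suffix) -> $"{displayName inner} {suffix}"

// Hypothetical helper: sum the base stat and every bonus layered on top
let rec totalWisdom equipment =
    match equipment with
    | Base (_, wisdom) -> wisdom
    | Prefixed (inner, bonus, _)
    | Suffixed (inner, bonus, _) -> bonus + totalWisdom inner

let heroicSocksOfBar = Prefixed(Suffixed(Base("Socks",0),2,"of Bar"),3,"Heroic")

let shownName = displayName heroicSocksOfBar   // "Heroic Socks of Bar"
let wisdom = totalWisdom heroicSocksOfBar      // 5
```

Each function bottoms out at Base, which is exactly the role the base case plays in the type itself.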
It's not to say this is impossible to handle without unions - just more complex, or difficult to work with. Consider how you'd model this behavior with a class hierarchy. Trying to subclass PrefixedEquipment : Equipment makes standalone SuffixedEquipment : Equipment an exclusive choice. Adding IPrefixedEquipment interfaces demands handling those interfaces and types separately from the ISuffixedEquipment. Perhaps even an IAugmentedEquipment - and then writing complicated (and leaky) logic to indicate whether a given implementation is a suffix or a prefix.

Even more so, consider how you'd model this, even in F#, but without that recursive union. It's not impossible, just not as clean and - importantly - typesafe through-and-through. And that holds not just now, but later too, because we can just as easily add more enhancements in the same way, beyond just the scope of a name and stat calculation: | Scripted of equipment: Equipment * behavior: (EquipmentEvent -> Equipment) - by adding a case that contains a function that returns equipment given some known type - we can end up with ad-hoc behavior. And even better, just as with the name, we can do this repeatedly so that individual snippets of behavior can be just that - small snippet functions - rather than a single enormous function with complicated logic to handle all possible cases.

What's the cost? Where's the struggle? Just more types.

So we can have these nested recursive unions, but unwrapping them and processing all of those layers every time I just want to know 'what is the name of this item' is annoying - even if we have a helper function. Entirely true! But by simply creating more types to suit those needs, we can retain both the flexibility we have and also augment it with simpler handling:

type Equipment =
    | Base of name: string * wisdomStat: int
    | Prefixed of equipment: Equipment * wisdomBonus: int * prefix: string

type Practical = {Name: string; WisdomBonus: int; RawEquipment: Equipment}

let makePractical e =
    let rec build equip (pName,pWis)= 
        match equip with
        | Base (name,wis) -> 
            ($"%s{pName}%s{name}",wis + pWis)
        | Prefixed (eq,wis,prefix) -> 
            build eq ($"%s{prefix} %s{pName}", wis + pWis)

    let (name,wis) = build e ("",0)

    {Name = name; WisdomBonus = wis; RawEquipment = e}

let heroicSocks = Prefixed(Base("Socks",0),3,"Heroic")

let practicalSocks = makePractical heroicSocks

We hold that reference to our raw equipment so that we can augment, modify, and recalculate on the fly:

let cursedSocks = Prefixed(practicalSocks.RawEquipment, -5, "Cursed") |> makePractical

I leave, knowingly, the sequencing of these augmentations as an exercise for the reader.

In conclusion

What more can be said, beyond what stands above? We've looked at how unions aren't just enums. We've built up the need to do complicated pattern matching by using unions - but then also reduced them back to simply handled, but strictly typesafe, records again. We've seen how they can take what would normally be a complicated, interwoven jumble of interfaces and inherited types, and distill them down to simple, high-level abstractions. We've talked about fruit and socks.

And, hopefully, we've learned something.

Foreword:

My adoption of F# as a language-of-choice was slightly rocky. After around a decade of nearly exclusive C# work, my curiosity was piqued with an uptick in hearing about this other #-lang. My initial reaction was one I've since seen in other C# developers - dismissal - C# is a good language and I was comfortable with it so why bother with the effort of learning a different one? But the curiosity remained - and at least a few times I decided I'd set aside an evening to go through a basic introduction post and try to write some katas in F#. It didn't stick because I just felt lost and couldn't translate my experience with C# into feeling even remotely comfortable with F#. Easy enough to drop the curly braces, a little bit of a hiccup to remember to write let instead of var - but how to do what I wanted?

I didn't realize it then, but what I was observing is, I think, a potential gap in the way that F# developers talk about, describe, and introduce their language to the outside world. There's a thorough library of materials about all the features and functionality of F#: Algebraic Data Types, Exhaustive Matching, Type Inference, the works. There are a lot of articles handling how to solve a wide range of problems with F#. But what's missing is, I think, something like what follows here: Some guideposts about how to take what you are already comfortable with in C# and translate it into F#. So, I wonder if we can't close that gap somewhat.

Doing so expects just a little bit from the reader - a passing familiarity with three main points of F# syntax:

  • let is used like var in C# - to declare a variable.
  • |> is F#'s piping operator, which takes the left side and uses it as the final argument to the right side.
  • F# uses lowercase and a tick for generic type annotations, so SomeType<T> is represented as SomeType<'a>.
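Of those three, the piping operator tends to be the least familiar, so here is a minimal sketch. addOne is just a throwaway function for the example.

```fsharp
let addOne i = i + 1

// These two lines do the same thing; the pipe just moves the last argument to the front
let direct = List.map addOne [1; 2; 3]      // [2; 3; 4]
let piped = [1; 2; 3] |> List.map addOne    // [2; 3; 4]

// Pipes chain left to right, a little like method chaining in C#
let chained = 5 |> addOne |> string         // "6"
```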

The rest should be understandable from usage and context as we go. This isn't meant to be a comprehensive, no stone left unturned, guide - but enough information to cover most initial questions and get people off on the right foot. A primer, if you will.

I need to:

Work with collections

F#'s core collection types (mostly) tend to look a lot like C#'s, but often with (sometimes subtle) behavioral differences to enforce immutability. In most cases, functions that operate on these collections will return a new collection and will not modify the contents of the original.

Choose a collection type

Something like Array<T>

You're in luck! Arrays in F# are the same as Arrays in C#. A few points to be made, however:

  1. Arrays in F# generally use the [|element|] notation - because [] is the notation for F# Lists.
  2. Separating collection elements in F# involves a semicolon, rather than a comma: [|elementA;elementB|]
  3. Accessing by index in F# has traditionally required a prefixed dot before the brackets (F# 6 and later also accept myArray[1]):
let myArray = [|1;2;3|]
myArray.[1] //2
  4. F# also offers multidimensional arrays of up to 4 dimensions, through the Array2D, Array3D, and Array4D modules (the types themselves are written 'a[,], 'a[,,], and 'a[,,,]).
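A short sketch pulling those points together. The grid and its initializer are invented for illustration; Array2D.init is one of several ways to build a two-dimensional array.

```fsharp
let myArray = [|1; 2; 3|]
let second = myArray.[1]       // 2

// Arrays are mutable in place, just as they are in C#
myArray.[1] <- 20

// A 2 x 3 grid where each cell holds row * 3 + col
let grid = Array2D.init 2 3 (fun row col -> row * 3 + col)
let cell = grid.[1, 2]         // 5
```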

Something like List<T>

The default list type in F# is slightly different than the List<T> type in C#.

Here's what you need to know:

  1. Lists in F# generally use the [element] notation, where arrays use [|element|].
  2. Lists, just like arrays, separate elements with semicolons instead of commas: [elementA;elementB]
  3. F# Lists are implemented as singly-linked lists - which means that adding individual elements happens at the front of the list, with the :: operator:
let myList = [1;2;3]
4 :: myList //[4;1;2;3]
  4. If we need to append at the end, we can use the @ operator to join two lists:
let listA = [1;2]
let listB = [3;4]
listA @ listB //[1;2;3;4]

Something like Dictionary<TKey,TValue>

Along with the looks-similar-but-isn't motif of list - F# provides a default Map<'key,'value> type that isn't C#'s native Dictionary<TKey,TValue>, but does implement the usual group of .NET interfaces such as IDictionary<TKey,TValue> and IEnumerable<T>

Here's what you need to know:

  1. Maps can be created from any collection of 2-item tuples, where the first item is the key and the second is the value:
[(1,2);(3,4)] |> Map.ofList //[1] = 2, [3] = 4
  2. If there are duplicates when we create from a sequence like this, the last value for a given key is what the Map contains:
[(1,2);(1,3)] |> Map.ofList |> Map.find 1 = 3 //true
  3. The reverse also holds: Maps can be easily turned into collections of 2-item tuples:
[(1,2);(3,4)] |> Map.ofList |> Map.toList //[(1,2);(3,4)]
  4. F#'s native Map type isn't especially well suited to consumption by C#. In cases of interop, we can create a more C#-friendly IDictionary by using the dict function with any collection of 2-item tuples. But do note, this is still an immutable structure, and will throw an exception on attempts to add elements to it.
[(1,2);(3,4)] |> dict
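A sketch of that last point, since it tends to surprise people coming from Dictionary<TKey,TValue>. The names lookup and addFailed are made up for the example.

```fsharp
open System
open System.Collections.Generic

// dict gives us something typed as IDictionary, which C# code can consume directly
let lookup : IDictionary<int, int> = dict [(1, 2); (3, 4)]
let two = lookup.[1]   // 2

// Reading works as expected, but Add is rejected at runtime
let addFailed =
    try
        lookup.Add(5, 6)
        false
    with :? NotSupportedException -> true
```

If mutation really is needed on the C# side, constructing a regular Dictionary<TKey,TValue> is the more honest choice.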

Choose a function

One important distinction between F# and C# when it comes to working with Collections is that in C# you tend to operate on an instance of a collection - by dotting into methods on that type; while F# prefers to offer families of functions in modules that take instances as an argument. So C#'s myDictionary.Add(someKey,someValue) in F# would be Map.add someKey someValue myMap.
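A minimal sketch of the module-function style, which also shows the immutability at work: Map.add hands back a new Map and leaves the one we passed in alone. The keys and values are arbitrary.

```fsharp
let original = Map.ofList [("apple", 1)]

// The instance goes last, which is what makes piping work
let extended = original |> Map.add "orange" 2

let originalCount = Map.count original                        // 1
let extendedCount = Map.count extended                        // 2
let originalHasOrange = Map.containsKey "orange" original     // false
```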

I just want my LINQ

F# offers functions that are analogous to those that C# programmers will be familiar with from LINQ, but the names are often different, as F# uses nomenclature that is more in alignment with the terminology used in the rest of the functional programming world. Rest assured, they mostly behave as you would expect. Rather than be exhaustive - LINQ is huge - I'll list what in my experience are the most common LINQ methods and their F# analogues:

  • .Aggregate() is called .fold or .reduce depending on whether or not you're providing an initial state or just using the first element, respectively.
  • .Select() is called .map
  • .SelectMany() is called .collect
  • .Where() is called .where or .filter (same thing, two names, long story)
  • .All() is called .forall
  • .Any() is called .exists if we are supplying a predicate, or the negation of .isEmpty if we just want to know if the collection has any elements
  • .Distinct() is still .distinct - or .distinctBy if we are supplying a projection function.
  • .GroupBy() is still .groupBy
  • .Min() and .Max() are still .min and .max - with .minBy and .maxBy alternatives for using a projection.
  • .OrderBy() is called .sortBy - and similarly, .OrderByDescending() is .sortByDescending
  • .Reverse() is called .rev
  • .First() is called .head if we want the first element - or .find if we want the first element that matches a predicate. Similarly, instead of .FirstOrDefault() we use .tryHead and .tryFind - which will return an Option of either Some matchingValue or None if not found, instead of throwing an exception.
  • .Single() is called .exactlyOne - and similarly, .SingleOrDefault() is .tryExactlyOne
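
Here is a quick sketch of a handful of those side by side, using the List module and an arbitrary list of numbers of my own choosing. The same function names exist on Seq and Array as well:

```fsharp
let numbers = [ 3; 1; 4; 1; 5; 9; 2; 6 ]

let doubled = numbers |> List.map (fun n -> n * 2)            // .Select()
let evens = numbers |> List.filter (fun n -> n % 2 = 0)       // .Where()
let total = numbers |> List.fold (fun acc n -> acc + n) 0     // .Aggregate() with a seed
let largest = numbers |> List.reduce max                      // .Aggregate() without a seed
let hasBig = numbers |> List.exists (fun n -> n > 8)          // .Any(predicate)
let hasAny = not (List.isEmpty numbers)                       // .Any()
let unique = numbers |> List.distinct                         // .Distinct()
let descending = numbers |> List.sortByDescending id          // .OrderByDescending()
let firstHuge = numbers |> List.tryFind (fun n -> n > 100)    // .FirstOrDefault(predicate)

printfn "total = %d, largest = %d, firstHuge = %A" total largest firstHuge
```

Note that firstHuge comes back as None rather than a default 0, so the caller can tell "not found" apart from "found a zero".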

I'm not sure which function I need. I have

  • A collection, and want:

    • A single value or element:
      • .min, .minBy, .max, and .maxBy will get an element of your collection relative to the others
      • .sum, .sumBy, .average, and .averageBy will calculate a total or a mean from the elements of your collection
      • .find, .tryFind, .pick, and .tryPick will allow you to get a single specific element of your collection
      • .head, .tryHead, .last, and .tryLast will get you items from the front or back of your collection
      • .fold and .reduce will allow you to apply logic and use every element of your collection to create a single value
      • .foldBack and .reduceBack do the same, but from the end of the collection
    • An equal number of elements:
      • .map will allow you to transform each element of your collection.
      • .indexed will turn each element of your collection into a tuple, whose first item is its index: [1] would become [(0,1)], for example.
      • .mapi does this implicitly, by providing the index as an additional first argument to the mapping function.
      • .sort, .sortDescending, .sortBy, and .sortByDescending allow you to change the order of your collection.
    • A possibly smaller number of elements:
      • .filter will give you back a collection only containing elements that match the predicate provided.
      • .choose is like .filter - but allows you to map the elements at the same time.
      • .skip will return the remaining elements after ignoring the first n
      • .take and .truncate will both return the first n items - if the collection has fewer than n, .take throws while .truncate returns what is there.
      • .distinct and .distinctBy will allow you to remove duplicates from the collection
    • A possibly greater number of elements:
      • .collect will apply a collection-generating function to each element of your collection, and concatenate all the results together.
    • To change the shape of the collection:
      • .windowed will return a new collection of all n sized groups from the original collection: [1;2;3] would become [[1;2];[2;3]] when n = 2, for example.
      • .groupBy will return a new collection of tuples, where the first item is the projection key, and the second is a collection of starting elements that matched the projection: [1;2;3] projected by (fun i -> i % 2) would result in [(1, [1; 3]); (0, [2])], for example - keys appear in the order they are first encountered.
      • .chunkBySize will return a new collection of up to n sized collections of your original. [1;2;3] would become [[1;2];[3]] when n = 2, for example.
      • .splitInto will return a new collection containing at most n collections, as close to equal in size as possible, from your original. [1;2;3] would become [[1];[2];[3]] when n = 3, for example.
    • To iterate it without changing it:
      • .iter and .iteri take a function and apply it to each element of your collection, but do not return any value.
  • A single value, and want:

    • It to be part of a collection:
      • .singleton can be used to create a one-item collection from the value
      • .init will take a size and an initializer function and create a new collection of that size.
  • Multiple collections, and want:

    • To combine them:
      • .append takes two collections and creates a new single collection containing all the elements of both.
      • .concat does the same but for a collection of collections.
      • .map2 and .fold2 act like map and fold from above, but will provide items from the same index in two source collections to the mapping/folding function.
      • .allPairs takes two collections and provides every pairing of an item from the first with an item from the second.
      • .zip and .zip3 take 2 (or 3) collections and produce a single collection consisting of tuples of items from the same index in the sources.
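
A few of the less obvious entries from that list are easier to grasp when seen running. This is only a sketch against a small list I've made up, again using the List module:

```fsharp
let source = [ 1; 2; 3; 4; 5 ]

// choose: filter and map in one go, by returning Some to keep or None to drop
let evenSquares =
    source |> List.choose (fun n -> if n % 2 = 0 then Some (n * n) else None)

// Three different ways to change the shape
let windows = source |> List.windowed 2      // overlapping groups of 2
let chunks = source |> List.chunkBySize 2    // non-overlapping groups of up to 2
let halves = source |> List.splitInto 2      // at most 2 groups, as even as possible

// truncate tolerates asking for too many, take does not
let truncated = source |> List.truncate 10
let taken = try Some (List.take 10 source) with _ -> None

// Combining two collections
let zipped = List.zip [ 1; 2 ] [ "a"; "b" ]
let pairs = List.allPairs [ 1; 2 ] [ "a"; "b" ]

printfn "%A" halves
```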

Work Asynchronously

F#'s asynchronicity model resembles C#'s but has a few important differences that will occasionally catch out C# developers:

  1. F# has a separate Async<'t> type that is similar to C#'s Task<T>
  2. Due to F#'s type system requiring returns, it uses Async<unit> instead of Task for cases where we don't return an actual value
  3. F# can generate and consume Task<T> with the Async.StartAsTask and Async.AwaitTask functions from the core library.
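
As a rough sketch of that third point, here is a round trip between the two worlds. The Task.FromResult call is just standing in for whatever a C# library would hand us, and all of the names are mine:

```fsharp
open System.Threading.Tasks

// Pretend this came from a C# library
let fromCSharp : Task<int> = Task.FromResult(21)

// Task<'T> -> Async<'T>
let asAsync : Async<int> = Async.AwaitTask fromCSharp

// Async<'T> -> Task<'T>, ready to hand back to C#
let backToTask : Task<int> =
    async {
        let! value = asAsync
        return value * 2
    }
    |> Async.StartAsTask

// Blocking on .Result is only for the sake of the demonstration
let result = backToTask.Result
printfn "result = %d" result
```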

F# has one other very notable difference from C# with regards to asynchronous code: C# 'enables' the await keyword inside a method by applying the async keyword to that method's signature; F# uses a language feature called a computation expression - which results in the async being part of the function body instead. This also comes with some implications for how you write the code inside that function body:

let timesTwo i = i * 2 // We have our basic function definition

//And now we can make it async

let timesTwoAsync i = async { //Note that when working with computation expressions, we start with our keyword, and then the function itself inside curly braces
    return i * 2 //We also use the `return` keyword to end the expression
}

let timesFour i = async {
    let! doubleOnce = timesTwoAsync i //Note the ! in our let! - this is like `await` in C# - the function we call on the right side has to be something that returns an Async<'a>
    //After we have bound the result of an Async function with let! - we can use it afterwards just like normal
    let doubleTwice = timesTwo doubleOnce //In the case of non-Async functions, we can write our code like usual

    return doubleTwice
}
  1. Keep in mind that let! in async blocks only works when binding an Async-producing expression - similar to how C#'s await can only be used on awaitable values, such as those from Task-returning methods.
  2. The difference is that F# handles async purely in the body of functions, so nothing is added to the signature: any expression of type Async<'a> can be bound with let!, as long as the binding sits inside an async { } block. C#, in contrast, requires the enclosing method to be flagged async before await can be used inside it.
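
One thing the example above doesn't show is how to actually get an answer out. An Async<'a> is a description of work that does nothing until it is started, so here is a minimal sketch of starting it. The two functions are repeated in a shortened form so the block stands alone, and answer and many are names of my own:

```fsharp
let timesTwoAsync i = async { return i * 2 }

let timesFour i = async {
    let! doubleOnce = timesTwoAsync i
    let! doubleTwice = timesTwoAsync doubleOnce
    return doubleTwice
}

// Nothing has executed yet - RunSynchronously starts the work and blocks for the result
let answer = timesFour 5 |> Async.RunSynchronously

// Async.Parallel combines several Asyncs into one that yields an array of results
let many =
    [ 1; 2; 3 ]
    |> List.map timesFour
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "answer = %d, many = %A" answer many
```

Async.RunSynchronously is fine at the very edge of a program or in a script, but inside other async code let! is the better way to wait.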

Signal an error or control the program flow

First, a definition: When we talk about error signalling and program flow, I don't mean exceptions - F# has those and they work very similarly to C#. What I mean is predictable and potentially recoverable errors; because this is an area where F# can seem like C# at a glance, but very quickly it becomes apparent how different it is. Specifically, this turns up in the use of null as a common error signal in C#. A familiar C# pattern looks something like this:

public Foo DoSomething(Bar bar)
{
    if(bar.IsInvalid)
    {
        return null;
    }

    return new Foo(bar.Value);
}

And then the caller of DoSomething can check the return value for null and, if it is null, either handle it or pass it on. One area where this pops up often, in my experience, is around LINQ's FirstOrDefault() - which gets used to avoid the exception in the case of an empty IEnumerable<T> - but often ends up just propagating the null.

F# initially appears to offer a translation of this with its Option<'a> type - and the question tends to arise: isn't None just a shortcut for null, except now it's more difficult to get at the value, since it's wrapped in Some? Because that's going to require pattern matching or checking .IsSome on the option - and is that really better? It isn't, and that's why F# by way of functional programming offers a cleaner solution: writing the majority of your code without worrying about checking for existing errors, and instead only worrying about signalling potential new ones specific to a given function. We can do this by writing most of our functions as though the inputs have already been validated for us, and then by using the map or bind functions to chain our happy-path functions together. Let's look at these in the context of Option:

map wants two arguments: a 'a -> 'b function, and an Option<'a>, from which it will produce an Option<'b>
bind also wants two arguments: a 'a -> Option<'b> function, and an Option<'a>, from which it will produce an Option<'b>
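
There is no magic in either of them. As a sketch, here is roughly what they boil down to if we write them by hand - mapOption, bindOption and half are hypothetical names of my own so they don't clash with the real Option.map and Option.bind:

```fsharp
// Hand-written equivalent of Option.map
let mapOption (f: 'a -> 'b) (opt: 'a option) : 'b option =
    match opt with
    | Some value -> Some (f value)   // wrap the plain result back up
    | None -> None

// Hand-written equivalent of Option.bind
let bindOption (f: 'a -> 'b option) (opt: 'a option) : 'b option =
    match opt with
    | Some value -> f value          // f already returns an Option, so no re-wrapping
    | None -> None

// A function that can fail: only even numbers can be halved here
let half n = if n % 2 = 0 then Some (n / 2) else None

let worked = Some 10 |> bindOption half |> mapOption string
let failed = Some 3 |> bindOption half |> mapOption string

printfn "worked = %A, failed = %A" worked failed
```

The only difference between the two is whether the Some case wraps the result again, and that is exactly the difference in their signatures.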

Let's consider what these can do for us:

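open System.IO

// string -> Option<string>
// Private.configFile is assumed to be a Map<string,string> defined elsewhere
let getConfigVariable varName =
    Private.configFile
    |> Map.tryFind varName

// string -> Option<string[]>
let readFile filename =
    if File.Exists(filename)
        then Some (File.ReadAllLines(filename)) // parentheses so that Some wraps the result of the call
        else None

// string[] -> int
let countLines textRows = Seq.length textRows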

getConfigVariable "storageFile"                 // 1
|> Option.bind readFile                         // 2
|> Option.map countLines                        // 3

So what's going on there?

  1. We try to grab a variable from our configuration. Maybe it exists, maybe it doesn't, but it only matters to that single function.
  2. Then we pipe into Option.bind - which implicitly handles the safety logic for us: if the previous step has Some value - use it as an argument to this function - otherwise keep it as None and move on
  3. Option.map does the same - if there is Some value, use it with this function, otherwise just move on.

The astute observer here will notice that there doesn't appear to be an immediate difference between bind and map at step 3 - they're both just automatically handling the same thing, right? But note the different signatures between readFile and countLines - bind has an additional step that flattens the Option that its function outputs. Consider the alternative: If we had used map then at the end of line 2 we would have an Option<Option<string[]>> - and so on line 3 we would need to Option.map (Option.map countLines)!
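
The nesting is easy to see with a smaller example. In this sketch tryParseInt is a made-up helper that plays the same role as readFile, a function that returns an Option of its own:

```fsharp
open System

// string -> Option<int>
let tryParseInt (text: string) =
    match Int32.TryParse(text) with
    | true, value -> Some value
    | false, _ -> None

// map wraps the already-optional result a second time
let nested = Some "42" |> Option.map tryParseInt

// bind flattens as it goes
let flat = Some "42" |> Option.bind tryParseInt

// If we do end up nested, Option.flatten removes one layer
let flattened = nested |> Option.flatten

printfn "nested = %A, flat = %A" nested flat
```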

But, the question stands, how do I actually get the value, if there is one, out of that Option? And it's a fair question. And the answer is to avoid doing so as long as possible. Because the later you wait to try to unwrap an Option, the less code you have to write that has any idea an error is even possible. And at a point when you finally, absolutely, need to get a value out - you have two options:
Option.defaultValue takes an 'a and an Option<'a> - if the Option has a value, it returns that, otherwise it returns the 'a you've given it.
Option.defaultWith is the same, but instead of a value, it takes a unit -> 'a function to generate a value.
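
A short sketch of both, with invented names. The counter is only there to show that the function given to Option.defaultWith is not called unless it is needed:

```fsharp
let lineCount : int option = None

// Falls back to the value we supply
let shown = lineCount |> Option.defaultValue 0
let present = Some 12 |> Option.defaultValue 0

// A fallback we would rather not run unless we have to
let fallbackCalls = ref 0
let expensiveFallback () =
    fallbackCalls.Value <- fallbackCalls.Value + 1
    -1

let notNeeded = Some 12 |> Option.defaultWith expensiveFallback
let callsAfterSome = fallbackCalls.Value

let needed = None |> Option.defaultWith expensiveFallback
let callsAfterNone = fallbackCalls.Value

printfn "shown = %d, present = %d, needed = %d" shown present needed
```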

Coincidentally, this same logic applies with F#'s built-in Result<'a,'b> type, which also offers bind and map (and mapError if you need it) - but instead of None you have the Error case, which you can use to store information about what went wrong - be it a string or a custom error type of your choosing.
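
Here is the same shape of pipeline as before, sketched with Result and plain strings as the error type. The settings map and both helper functions are hypothetical, made up for this example:

```fsharp
open System

let settings = Map.ofList [ ("port", "8080"); ("name", "demo") ]

// string -> Result<string,string>
let getSetting key =
    match Map.tryFind key settings with
    | Some value -> Ok value
    | None -> Error (sprintf "missing key '%s'" key)

// string -> Result<int,string>
let parseInt (text: string) =
    match Int32.TryParse(text) with
    | true, value -> Ok value
    | false, _ -> Error (sprintf "'%s' is not a number" text)

let port = getSetting "port" |> Result.bind parseInt |> Result.map (fun p -> p + 1)
let notNumeric = getSetting "name" |> Result.bind parseInt
let missing = getSetting "host" |> Result.bind parseInt

// mapError transforms only the Error side
let prefixed = missing |> Result.mapError (fun msg -> "config: " + msg)

printfn "%A / %A / %A" port notNumeric prefixed
```

Unlike None, each failure carries its own explanation, and the first Error short-circuits everything after it.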

Use a C# library in F#

One of the great benefits to F# - and probably why a C# developer looks at it first rather than something like Haskell - is that it is part of the greater .NET ecosystem and supports interop with all of the C# libraries that a developer is already familiar with. C# code can (mostly) be consumed by F# - some rough edges tend to crop up, but generally with easy workarounds:

  • When calling C# methods, the F# compiler treats the method as taking a single tupled argument. Because of this, partial application is not available, and piping can be difficult due to overload resolution:
"1" |> Int32.Parse                          //Works like Int32.Parse("1")
("1", NumberStyles.Integer) |> Int32.Parse  //Works like Int32.Parse("1", NumberStyles.Integer)
NumberStyles.Integer |> Int32.Parse "1"     //Won't compile, because it's expecting a tupled argument, not two separate args.
  • C# libraries - specifically those that involve serialization or reflection - are often not equipped for understanding native F# types. The most common case here is JSON libraries, which can struggle with serialization and/or deserialization of Unions and Records. In cases such as this, it's strongly advisable to check for an extension library that provides F# specific functionality: Newtonsoft.Json has the Newtonsoft.Json.FSharp package, for example, and System.Text.Json has FSharp.SystemTextJson. Alternatively, these cases may also make for a good time to check out the native F# libraries for the same work, like Thoth or Chiron.

  • Owing to C#'s ability to produce nulls for any reference type - and no (at time of writing) native interop for C#'s nullable ? type notation for reference types - it's helpful to try to isolate C# code on the outside edge of your logic, and use helpers such as Option.ofNullable (for Nullable) or Option.ofObj (for reference types) to quickly provide type safety for your own code.
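
A minimal sketch of those two helpers at the boundary. The null and Nullable values are written out by hand here to stand in for whatever a C# API might return:

```fsharp
open System

// Reference types: null becomes None, anything else becomes Some
let fromNull = Option.ofObj (null: string)
let fromValue = Option.ofObj "hello"

// Nullable<T>: an empty Nullable becomes None
let fromEmptyNullable = Option.ofNullable (Nullable<int>())
let fromFullNullable = Option.ofNullable (Nullable 5)

// From here on, the rest of the code never has to think about null
let length =
    fromNull
    |> Option.map (fun s -> s.Length)
    |> Option.defaultValue 0

printfn "%A %A %A %A %d" fromNull fromValue fromEmptyNullable fromFullNullable length
```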

  • C# methods that expect delegate types such as Action<T> or Func<T> can be given an F# lambda of the appropriate signature, and the compiler will handle the conversion. Remember: unit fills in for void in F# - and its value is () - so an Action<T> would expect a 'T -> unit, such as (fun _ -> printfn "I'm a lambda!"); and likewise, Func<T> would expect a unit -> 'T, such as (fun () -> 123).
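
As a sketch of both directions, here is List<T>.ForEach from the .NET base library (which expects an Action<T>) being handed a lambda directly, alongside a Func<T> built explicitly. The variable names are my own:

```fsharp
open System
open System.Collections.Generic

let numbers = List<int>([ 1; 2; 3 ])
let seen = List<int>()

// ForEach expects an Action<int>; the int -> unit lambda is converted at the call
numbers.ForEach(fun n -> seen.Add(n * 10))

// A delegate can also be constructed explicitly from a lambda when we need to hold on to one
let makeNumber = Func<int>(fun () -> 123)
let made = makeNumber.Invoke()

printfn "seen = %A, made = %d" (List.ofSeq seen) made
```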

  • In cases where a C# library expects things to be decorated with Attributes, they can be used almost identically with the tricky catch that F# uses <> inside the square brackets - so [Serializable] in C# would become [<Serializable>] in F#. Arguments work the same: [<DllImport("user32.dll", CharSet = CharSet.Auto)>]. And, just like with collections above, multiple attributes are separated with a semicolon, not a comma: [<AttributeOne; AttributeTwo>], for example.
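
To see the attribute syntax on something that runs, here is a small sketch using two attributes from the System namespace on an enum. The Permissions type is invented for the example:

```fsharp
open System

// Two attributes in one set of brackets, separated by a semicolon
[<Flags; Serializable>]
type Permissions =
    | Read = 1
    | Write = 2
    | Execute = 4

// Flags changes how combined values are formatted, so we can observe that it took effect
let combined = (Permissions.Read ||| Permissions.Write).ToString()

// Reflection sees the attribute exactly as it would on a C# type
let hasFlags = typeof<Permissions>.IsDefined(typeof<FlagsAttribute>, false)

printfn "combined = %s, hasFlags = %b" combined hasFlags
```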