Monday, 24 November 2008

Some notes on F#

About a year ago I was looking at how to convert a second-order polynomial fit written in R to F#. The function looked like this:

polyfit <- function(x, y, filename = 'fit_out.csv') {
  x2 <- x^2
  ## make linear model
  fit <- lm(y ~ x + x2)
  write('Intercept,x,x^2', filename, append = FALSE)
  cat(fit$coefficients, file = filename, sep = ',', append = TRUE)
}

Unfortunately, at that time there were not many mathematical libraries available for F# and I would have had to write my own. So I put F# to one side.

At PDC 2008 there were many sessions on how to make the most of multi-core processors. Doing this in C# is very challenging and the result is hard to maintain. Here are some notes I made during the F# session on the last day of the PDC:

F# is a non-imperative programming language. In imperative languages x = x + 1 makes sense; from a mathematical standpoint, however, it is nonsense. In F# you first explore the domain by writing code and executing it interactively with Alt+Enter. Once finished, you can compile the code and expose it to other .NET languages. Since values in F# are immutable by default, it does not have the problem of updating shared values when running in parallel. Its declarative style lends itself well to parallel processing and is used by some users of the htc server.

let y = 0
This does not mean "assign 0 to y". Instead it means "bind the value 0 to the symbol y". In this way y is read-only and immutable. The same principle holds for functions, e.g.
let sqr x = x * x
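
To make the difference between binding and assignment concrete, here is a minimal sketch. The name counter is my own, added only for illustration; it is not from the session.

```fsharp
let y = 0                  // binds the name y to the value 0
// y <- 1                  // would not compile if uncommented: y is not mutable

let mutable counter = 0    // mutation has to be requested explicitly
counter <- counter + 1     // <- is the assignment operator, distinct from =

let sqr x = x * x          // a function is bound to a name in the same way

printfn "%d %d %d" y counter (sqr 3)
```

Immutability is the default, and the few places that do mutate state are marked with the mutable keyword and the <- operator, so they are easy to spot.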

Writing a function in F# as if it were imperative:
let sqr x = x * x
let sumOfSquaresI nums =
    let mutable acc = 0.0
    for x in nums do
        acc <- acc + sqr x
    acc

This is not the way to program in F#. Notice the generic approach to variable typing: no types are declared anywhere. In this case the return type was determined by the compiler working backward through the code to 0.0, which is a double.

A mathematician would write:
let rec sumOfSquaresF nums =
    match nums with
    | [] -> 0.0
    | h :: t -> sqr h + sumOfSquaresF t

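Match is a switch on steroids. It's a branching mechanism that also binds: each pattern gives names to the parts of the value it matches, such as the head h and the tail t of a list. This has the effect of making the calculation parallelizable.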

The F# way would be:
let sumOfSquares nums =
    nums
    |> Seq.map sqr
    |> Seq.sum

The map function applies the function sqr to each element of nums. The |> is a pipeline operator that works in the same way as a pipe in Unix or DOS.
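
As a small sketch of what the pipeline operator does: x |> f is simply f x, so the piped version is the nested call turned inside out. The two function names and the float annotation on sqr are mine, added so the snippet stands alone.

```fsharp
let sqr (x: float) = x * x

// Nested calls: has to be read from the inside out.
let sumOfSquaresNested nums = Seq.sum (Seq.map sqr nums)

// Piped: the same computation, read top to bottom in the order it happens.
let sumOfSquaresPiped nums =
    nums
    |> Seq.map sqr
    |> Seq.sum

let sample = [1.0; 2.0; 3.0]
printfn "%f" (sumOfSquaresNested sample)
printfn "%f" (sumOfSquaresPiped sample)
```

Both functions return the same value for the same input; the pipeline only changes how the code reads.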

The flow has been encapsulated. There are fewer places to screw up. It lets the bartender make the coffee.

There is a parallel extension framework that can be used to make this multi-core and multi-server enabled.

Raising the level of abstraction enables parallelisation without restructuring code.
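
A minimal sketch of that idea, assuming the Array.Parallel module that ships in today's F# core library (this is not necessarily the framework shown in the session). Only the map step changes; the shape of the pipeline stays the same.

```fsharp
let sqr (x: float) = x * x

// Same pipeline shape as before; the map step is spread across cores.
let sumOfSquaresParallel (nums: float[]) =
    nums
    |> Array.Parallel.map sqr   // squares computed in parallel
    |> Array.sum                // summed sequentially afterwards

let data = Array.init 1000 float   // 0.0, 1.0, ..., 999.0
printfn "%f" (sumOfSquaresParallel data)
```

Because sqr has no side effects and the input is never modified, swapping the sequential map for the parallel one does not change the result.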

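open System.IO
open System.Net

let ticker = "msft"
let url = "http....
// Download the whole CSV file as a single string
let req = WebRequest.Create(url)
let resp = req.GetResponse()
let stream = resp.GetResponseStream()
let reader = new StreamReader(stream)
let csv = reader.ReadToEnd()
// Split into lines, skip the header, keep complete rows, pick out date and price
let prices =
    csv.Split([|'\n'|])
    |> Seq.skip 1
    |> Seq.map (fun line -> line.Split([|','|]))
    |> Seq.filter (fun values -> values |> Seq.length = 7)
    |> Seq.map (fun values -> System.DateTime.Parse(values.[0]), float values.[6])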
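
The parsing half of this pipeline can be tried without a network connection. The sketch below runs it on a made-up CSV string; the seven-column layout with the date first and the wanted value last is only inferred from the indices used above, the two sample rows are invented, and parseRow is my own helper.

```fsharp
open System
open System.Globalization

// Invented sample data: a header line, two rows, and a trailing newline.
let csv =
    "Date,Open,High,Low,Close,Volume,Adj Close\n" +
    "2008-11-21,10.0,11.0,9.0,10.5,1000,10.5\n" +
    "2008-11-20,11.0,12.0,10.0,11.25,2000,11.25\n"

// Hypothetical helper: turn one row into a (date, price) pair.
let parseRow (values: string[]) =
    DateTime.Parse(values.[0], CultureInfo.InvariantCulture), float values.[6]

let prices =
    csv.Split([|'\n'|])
    |> Seq.skip 1                                   // drop the header line
    |> Seq.map (fun line -> line.Split([|','|]))
    |> Seq.filter (fun values -> values.Length = 7) // drops the empty last line
    |> Seq.map parseRow
    |> Seq.toList

for (date, price) in prices do
    printfn "%s %f" (date.ToString("yyyy-MM-dd")) price
```

The filter step is what keeps the empty string after the final newline from reaching the parser.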

The results can be visualized by the FlyingFrog library and the following commands

Grid prices;;
Plot prices ;;

To give up a thread so as not to block other tasks while waiting:

let! resp = req.AsyncGetResponse()

To make a static method asynchronous:

let internal loadPrices ticker = async {.....

To make a pipeline asynchronous:

|> Seq.map loadPrices
|> Seq.map (fun prices -> new StockAnalyser(prices, days))

Without this, the flow of logic would be completely overwhelmed by the code needed to set up delegates and threads.
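
As a rough sketch of how these pieces fit together, the example below fans out several asynchronous loads and collects the results. It assumes the Async.Parallel and Async.RunSynchronously functions from the current F# core library, which may differ from the preview shown at the PDC. The loadPrices here is a hypothetical stand-in that sleeps instead of calling a web service, and the tickers and prices are made up.

```fsharp
// Hypothetical stand-in for loadPrices: waits briefly instead of hitting the network.
let loadPrices (ticker: string) = async {
    do! Async.Sleep 100                   // the thread is released while waiting
    return ticker, [| 1.0; 2.0; 3.0 |] }  // made-up prices

let results =
    [ "msft"; "goog"; "aapl" ]
    |> Seq.map loadPrices                 // build one async computation per ticker
    |> Async.Parallel                     // combine them into one that runs them together
    |> Async.RunSynchronously             // start it and wait for all the results
    |> Array.map (fun (ticker, prices) -> ticker, Array.average prices)

for (ticker, avg) in results do
    printfn "%s %f" ticker avg
```

The pipeline still reads as a straight line of steps; no delegates, callbacks or thread management appear in the code.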

In 2009 there will be a supported release of F#.
For further details see