Wednesday, 17 July 2013

Roslyn - The June and Sept 2012 releases broke some of the examples and updates were not provided

Arg! As the title says, the June and September 2012 releases of Roslyn made some API changes which meant that some of the published examples, even ones on the Roslyn page itself, no longer compile. Of particular note is this example, wherein the author shows us how to highlight broken usages of Enumerable.Count(). If you're not aware, Enumerable.Count() will almost certainly enumerate the entire sequence. The broken usage is illustrated by code like this
var l = < some enumerable > ;
return l.Count()>0;
Nooooo! Don't do it! Use Any()! Anyway, the code example given by the Roslyn peeps no longer compiles because the GetSemanticInfo() function has been removed and replaced with a host of other, context-specific, functions. If you want the example to compile, these changes will fix it up for you.
        private bool IsCallToEnumerableCount(IDocument document, ExpressionSyntax expression, CancellationToken cancellationToken)
        {
            var invocation = expression as InvocationExpressionSyntax;
            if (invocation == null) return false;
            var call = invocation.Expression as MemberAccessExpressionSyntax;
            if (call == null) return false;

            var semanticModel = document.GetSemanticModel(cancellationToken);
            var methodSymbol = semanticModel.GetSymbolInfo(expression).Symbol as MethodSymbol;
            if (methodSymbol == null || methodSymbol.Name != "Count" || methodSymbol.ConstructedFrom == null) return false;

            var enumerable = semanticModel.Compilation.GetTypeByMetadataName(typeof(Enumerable).FullName);
            if (enumerable == null || !methodSymbol.ConstructedFrom.ContainingType.Equals(enumerable)) return false;

            return true;
        }
and
        private bool IsRelevantRightSideComparison(IDocument document, ExpressionSyntax expression, SyntaxKind kind, CancellationToken cancellationToken)
        {
            var constant = document.GetSemanticModel(cancellationToken).GetConstantValue(expression);
            int? value;
            if(!constant.HasValue || (value = constant.Value as int?) == null) return false;

            if (kind == SyntaxKind.GreaterThanExpression && value == 0 || kind == SyntaxKind.GreaterThanOrEqualExpression && value == 1) return true;

            return false;
        }
Obviously you will have to make the same changes to IsRelevantLeftSideComparison but I'll leave that to you, the diligent reader ;-) In case you are wondering, the changes are to replace GetSemanticInfo() with GetSymbolInfo().Symbol in the initial check function and then with GetConstantValue() in the comparison checking function. Easy, but it took me an hour or two to track it down because all I could find on t'internet were comments saying that GetSemanticInfo() had been removed but not how to use the replacements. Doh!

Tuesday, 5 March 2013

Span part 2 - this time the file is XML

Oh ho! It transpires that the Australian Stock Exchange (ASX) publishes their Span data in XML. I haven't found the schema description but, because XML is human-readable, I was able to build a parser. This time I did it using discriminated unions. Much better, although disappointingly more difficult to query after construction. That merits further investigation. (edit: URL at ASX which explains what Span is all about: ASX) and here is a link to the input files used by the parser: Risk parameter files Go and check out Span2 on Google Code Aha! I've found the schema definition. It's here

Span (or SPAN?) - they use it for margin calculations but I need to build a parser

London Clearing House provide daily Span files at LCH which list, as far as I can determine, a large number of contracts and the inter-contract spreads. These inter-contract spreads may apply to 2, 3, 4 or more particular contracts and they are applied in a specific order. Therefore, if the portfolio contains contracts AA, BB and CC and there is a listed spread between AA and BB and another between BB and CC then we take the one with the highest priority and calculate margin for the remaining contract by itself. For example, if AA - BB has priority 1 and BB - CC has priority 2 we will be left with CC in our portfolio on which we must calculate the full amount margin. AA and BB are mutually (although not necessarily completely) offsetting. The Span files given at the LCH URL above are fixed-format. Ewwww! When was the last time you had to build a parser for a fixed-format file? Especially one containing hierarchical data. Moreover, when I started this the specific column format of the file was unclear and required reverse-engineering. Hmm, nice. Luckily, before I had mistaken too much error as progress, I found a specification for the format. Phew! The Span v4 file format specification There it is. What a beauty! It seems that the format specification is incomplete. That is, it is sufficient for files received from LCH but for the general case, there is a fuller description. This format is specified at the CME and this is it. So I set about creating a parser for this little gem. Oh, and I should mention that I did it in F#. Check it out at Google Code in the Span directory. Yes, I know it's in the wrong place but it'll do. So go and look at it. Note that the fixed-format parser does not currently support the fuller format from CME, just the one supplied by LCH. Lovely. Now, assuming you've made it this far you'll be thinking that that code is not very nice. Well, you'd be right! I'm not afraid to admit that it's pretty evil stuff. In my defense, however, I would like to say that it was done quite quickly with a view to being rewritten later and that it's been quite a while since I wrote any F#. Ho hum ...

Wednesday, 30 January 2013

Which is quicker? Iteration or Recursion?

For a simple problem, see Project Euler #1, I thought it would an interesting exercise to code alternative implementations and time them. You see, when I originally solved the problem way back in the mists of time (well, perhaps not that long ago as it was done using F# 1.9.9 as far as I recall) I was using PE as a tool to teach myself F#. Nowadays I'm comfortable with F# so I thought I would return to PE and explore a little more. So, follows is some F# code which solves problem #1. Spoiler alert! If you haven't already solved it and intend to do so, stop reading now!
 let first = fun (a,b) -> a
  
 let second = fun (a,b) -> b
  
 let upperLimit = 1000 // ie we want to act upon the integers which are less than 1000
  
 let maxLoops = 25000 // an arbitrary large integer - the number of times to repeat the core calculation in a single (timed) execution
  
 let howMany = 15 // ie the number of times we want to perform the calculation in order to find the average calculation time
  
 let st = new System.Diagnostics.Stopwatch()
  
 let Impl1a = fun () ->
  
   let whichImpl = "Iteration"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let mutable sum = 0    
  
     for i = 1 to upperLimit-1 do
  
       sum <- if (i%3 = 0 || i%5 = 0) then sum + i else sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl1b = fun () ->
  
   let whichImpl = "Iteration with mutable creation outside loop"
  
   st.Restart()
  
   let mutable sum = 0    
  
   for loop = 1 to maxLoops do
  
     sum <- 0
  
     for i = 1 to upperLimit-1 do
  
       sum <- if (i%3 = 0 || i%5 = 0) then sum + i else sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl1c = fun () ->
  
   let whichImpl = "Reverse iteration"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let mutable sum = 0    
  
     for i = upperLimit-1 downto 1 do
  
       sum <- if (i%3 = 0 || i%5 = 0) then sum + i else sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl1d = fun () ->
  
   let whichImpl = "Reverse iteration with mutable creation outside loop"
  
   st.Restart()
  
   let mutable sum = 0    
  
   for loop = 1 to maxLoops do
  
     sum <- 0
  
     for i = upperLimit-1 downto 1 do
  
       sum <- if (i%3 = 0 || i%5 = 0) then sum + i else sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl2a = fun () ->
  
   let whichImpl = "Iteration using List higher-order fns"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = [1..upperLimit-1] |> List.filter (fun i -> i%3=0 || i%5=0) |> List.fold (fun state i -> i+state) 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl2b = fun () ->
  
   let whichImpl = "Iteration using List higher-order fns with pre-compiled integer range"
  
   st.Restart()
  
   let input = [1..upperLimit-1]
  
   for loop = 1 to maxLoops do
  
     let sum = input |> List.filter (fun i -> i%3=0 || i%5=0) |> List.fold (fun state i -> i+state) 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl2c = fun () ->
  
   let whichImpl = "Iteration using Seq higher-order fns"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = seq {1 .. upperLimit-1} |> Seq.filter (fun i -> i%3=0 || i%5=0) |> Seq.fold (fun state i -> i+state) 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl2d = fun () ->
  
   let whichImpl = "Reverse Iteration using List higher-order fns"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = [upperLimit-1 .. -1 .. 1] |> List.filter (fun i -> i%3=0 || i%5=0) |> List.fold (fun state i -> i+state) 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl2e = fun () ->
  
   let whichImpl = "Reverse Iteration using List higher-order fns with pre-compiled integer range"
  
   st.Restart()
  
   let input = [upperLimit-1 .. -1 .. 1]
  
   for loop = 1 to maxLoops do
  
     let sum = input |> List.filter (fun i -> i%3=0 || i%5=0) |> List.fold (fun state i -> i+state) 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl2f = fun () ->
  
   let whichImpl = "Reverse Iteration using Seq higher-order fns"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = seq {upperLimit-1 .. -1 .. 1} |> Seq.filter (fun i -> i%3=0 || i%5=0) |> Seq.fold (fun state i -> i+state) 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl3a = fun () ->
  
   let whichImpl = "Recursion with Accumulator"
  
   let rec Impl3p counter acc =
  
     if counter=upperLimit then
  
       acc
  
     else if counter%3=0 || counter%5=0 then
  
       Impl3p (counter+1) (acc+counter)
  
     else
  
       Impl3p (counter+1) acc
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = Impl3p 1 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl3b = fun () ->
  
   let whichImpl = "Recursion"
  
   let rec Impl3p counter =
  
     if counter=upperLimit then
  
       0
  
     else if counter%3=0 || counter%5=0 then
  
       counter + Impl3p (counter+1)
  
     else
  
       Impl3p (counter+1)
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = Impl3p 1
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl4a = fun () ->
  
   let whichImpl = "Iteration incrementing a pair of counters"
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let mutable sum = 0
  
     let mutable loopDiv3,loopDiv5 = 0,0
  
     while loopDiv3<upperLimit do // because loopDiv3 increases slower than loopDiv5
  
       loopDiv3 <- loopDiv3 + 3
  
       loopDiv5 <- loopDiv5 + 5      
  
       // if we haven't already added it in loopDiv5 ...
  
       sum <- sum + (if loopDiv3 < upperLimit then (if loopDiv3 % 5 <> 0 then loopDiv3 else 0) else 0)
  
       sum <- sum + (if loopDiv5 < upperLimit then loopDiv5 else 0)
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl4b = fun () ->
  
   let whichImpl = "Recursion using a pair of counters"
  
   let rec Impl4p counterDiv3 counterDiv5 =
  
     if counterDiv3>=upperLimit then
  
       0
  
     else 
  
       (if counterDiv3 % 5 <> 0 then counterDiv3 else 0) + (if counterDiv5 < upperLimit then counterDiv5 else 0) + Impl4p (counterDiv3+3) (counterDiv5+5)
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = Impl4p 0 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl
  
 let Impl4c = fun () ->
  
   let whichImpl = "Recursion with Accumulator using a pair of counters"
  
   let rec Impl4p counterDiv3 counterDiv5 acc =
  
     if counterDiv3>=upperLimit then
  
       acc
  
     else 
  
       Impl4p (counterDiv3+3) (counterDiv5+5) ((if counterDiv3 % 5 <> 0 then counterDiv3 else 0) + (if counterDiv5 < upperLimit then counterDiv5 else 0) + acc)
  
   st.Restart()
  
   for loop = 1 to maxLoops do
  
     let sum = Impl4p 0 0 0
  
     ignore sum
  
   st.Stop()
  
   st.ElapsedMilliseconds, whichImpl  

 let run (whichImpl : unit -> int64 * string) = 
  
   let res = [ for count = 1 to howMany do yield whichImpl() ]
  
   let name = res |> List.head |> second
  
   let elapsed = res |> List.map first
  
   elapsed |> List.iter (fun elem -> printfn "%s : %i iterations took %i milliseconds" name maxLoops elem)
  
   let avg = float (elapsed |> List.fold (fun state i -> state+i) 0L) / float (List.length elapsed)
  
   name, avg
  
 let implementations = [ Impl1a; Impl1b; Impl1c; Impl1d; Impl2a; Impl2b; Impl2c; Impl2d; Impl2e; Impl2f; Impl3a; Impl3b; Impl4a; Impl4b; Impl4c ]
  
 implementations |> List.map (fun impl -> run impl)
  
Well, well! How interesting! Using F# interactive, these are the results obtained from my test run:
  [("Iteration", 252.8);
  
   ("Iteration with mutable creation outside loop", 255.0666667);
  
   ("Reverse iteration", 225.0);
  
   ("Reverse iteration with mutable creation outside loop", 236.1333333);
  
   ("Iteration using List higher-order fns", 3509.933333);
  
   ("Iteration using List higher-order fns with pre-compiled integer range", 386.6);
  
   ("Iteration using Seq higher-order fns", 4848.933333);
  
   ("Reverse Iteration using List higher-order fns", 3679.466667);
  
   ("Reverse Iteration using List higher-order fns with pre-compiled integer range", 396.7333333);
  
   ("Reverse Iteration using Seq higher-order fns", 4883.266667);
  
   ("Recursion", 242.4);
  
   ("Recursion with Accumulator", 231.9333333);
  
   ("Iteration incrementing a pair of counters", 48.6);
  
   ("Recursion using a pair of counters", 90.4);
  
   ("Recursion with Accumulator using a pair of counters", 53.26666667)]