Archive for April, 2008

Storing Colours in 31 bits (Part 2)

Tuesday, April 15th, 2008

Jean-Baptiste Rouquier rose to the challenge in my last post: to provide a fast way of storing premultiplied colours in OCaml’s 31bit integers (see his post here).

With his kind permission, I’ve included this code (somewhat optimized) in the Colour module, together with the longstanding compositing code.
http://www.coherentgraphics.co.uk/colour.zip
It’s under the BSD license.

A Simple Parser with Genlex

Wednesday, April 9th, 2008

(If you’re reading this via the OCaml planet, click the post title to see it with proper formatting - I shall fix this problem for the next post, I hope.)

When calling our command line PDF Tools to, for instance, merge files, one can use page specifiers with a simple grammar. For instance:

cpdf file.pdf 1,2,6-end -o out.pdf


cpdf in.pdf 1,all -o fax.pdf

(The second example is useful to duplicate the first page of a document as a fax cover sheet)

If the grammar is simple, OCaml’s Genlex plus a small recursive parser is sufficient.

Here’s an informal description of the mini language:

A dash (-) defines ranges, e.g. 1-5 or 6-3. 

A comma (,) allows one to specify several ranges, e.g. 1-2,4-5. 

The word ”end” represents the last page number. 

The words ”odd” and ”even” represent the odd and even pages. 

The word ”reverse” is the same as end-1. 

The word ”all” is the same as 1-end. 

A range must contain no spaces. 


Our input is the string in this language, our output an ordered list of page numbers. (I’m using some functions from the Utility module available with CamlPDF)

First build a lexer:

let lexer =

  Genlex.make_lexer ["-"; ","; "all"; "reverse"; "even"; "odd"; "end"]

and define a utility function which lexes a string to a list of Genlex lexemes (no need to be lazy here):

let lexwith s =

  list_of_stream (lexer (Stream.of_string s))

Now, this language is quite simple to deal with. We’ll pattern-match for all the simple cases - anything which doesn’t contain a comma is finite. If nothing matches, we assume the string contains one or more commas, and attempt to split it up.

Here is the main function, which is given the end page of the PDF file to which the range applies (the start page is always 1), and a list of tokens to match against:

let rec mk_numbers endpage = function

  | [Genlex.Int n] -> [n]

  | [Genlex.Int n; Genlex.Kwd "-"; Genlex.Int n'] ->

      if n > n’ then rev (ilist n’ n) else ilist n n’

  | [Genlex.Kwd "end"; Genlex.Kwd "-"; Genlex.Int n] ->

      if n <= endpage

        then rev (ilist n endpage)

        else failwith “n > endpage”

  | [Genlex.Int n; Genlex.Kwd "-"; Genlex.Kwd "end"] ->

      if n <= endpage

        then ilist n endpage

        else failwith “n > endpage2″

  | [Genlex.Kwd "end"; Genlex.Kwd "-"; Genlex.Kwd "end"] ->

      [endpage]

  | [Genlex.Kwd "even"] ->

       drop_odds (ilist 1 endpage)

  | [Genlex.Kwd "odd"] ->

       really_drop_evens (ilist 1 endpage)

  | [Genlex.Kwd "all"] ->

       ilist 1 endpage

  | [Genlex.Kwd "reverse"] ->

       rev (ilist 1 endpage)

  | toks ->

      let ranges = splitat_commas toks in

        (* Check we’ve made progress *)

        if ranges = [toks] then error “Bad page range” else 

          flatten (map (mk_numbers endpage) ranges)

(ilist x y  produces the list [x, x + 1, ... y - 1, y],  drop_odds [1;2;3;4;5;6;7] is [2;4;6], really_drop_evens [1;2;3;4;5;6] is [1;3;5])

The auxilliary function to split a token list at commas into a list of token lists:

let rec splitat_commas toks =

  match cleavewhile (neq (Genlex.Kwd “,”)) toks with

  | [], _ -> []

  | some, [] -> [some]

  | _::_ as before, _::rest -> before::splitat_commas rest 

(cleavewhile returns, in order, the elements at the beginning of a list until a predicate is false, paired with the rest of the elements, in order.  neq is ( <> ))

And here’s the wrapper function, which unifies the error handling and tests for page numbers not within range.

let parse_pagespec pdf spec =

  let endpage = endpage_of_pdf pdf in

    let numbers =

      try mk_numbers endpage (lexwith spec) with

        _ -> error (”Bad page specification ” ^ spec)

    in

      iter

        (fun n -> if n <> endpage then

           error (”Page ” ^ string_of_int n ^ ” does not exist.”))

        numbers;

      numbers


This is often easier to write and debug than using a separate tool intended for highly complex grammars.

Storing Colours in 31 bits (Part One)

Friday, April 4th, 2008

For rendering vector graphics scenes, colours are usually stored with what we call “premultiplied” or “associated” alpha. For instance, opaque dark red is stored as:

0.5, 0, 0, 0.5 (R * A, G * A, B * A, A)
instead of
0.5, 0, 0, 1 (R, G, B, A).
This originally had to do with making compositing algorithms fast (fewer multiplications), but it has other advantages - for instance there is only a single value for the clear colour (0, 0, 0, 0) instead of many (x, y, z, 0).
We usually use 8 bits for each component, packing them into a 32 bit word.
Now, 32 bits can store 2^32 colours. In fact, in the premultiplied scheme, many bit patterns are unused (when R, G or B is > A). It turns out there are only slightly more than 2^30 unique premultiplied colours. In other words, with a suitable mapping, we should be able to store them in OCaml’s 31 bit integers. This is important so we can store them in native arrays unboxed, for example.
Such a mapping (togther with a discussion of all this) is in Jim Blinn’s book “Dirty Pixels”, Chapter 20. Unfortunately, it’s too slow for practical use. Can you think of a fast one?
Meanwhile, here’s some code from our renderer which uses a lossy approach: throwing away the least significant red bit (The question of which colour to lose the bit is not clear: theoretically the eyes are less sensitive to changes in blue, but my tests didn’t seem to bear that out).
To build one of these colours (assertions left out for this post)
let colour_of_rgba r g b a =
  (a lsl 23) lor (b lsl 15) lor (g lsl 7) lor (r lsr 1)
Extraction of blue, green and alpha components is easy, but where we’ve dropped the LSB, we need to reconstruct carefully, at least making sure 254 reconstructs to 255 - otherwise we couldn’t represent full red. We must also make sure the invariant that a component can never be more than the alpha is obeyed.
let rec red_of_colour c =
  let red =
    match (c land 127) lsl 1 with
    | 254 -> 255
    | r -> r
  and alpha = c lsr 23 in
    if red > alpha then alpha else red
In Part Two, I’ll release the Colour module, which provides for all this, and implements the standard Porter/Duff compositing arithmetic efficiently.

A Proper GUI for OCaml (Part Two)

Tuesday, April 1st, 2008

I’ve packaged up the basic libraries described last time and you can get them here, together with a little example.

It should work on any platform / toolchain where OCaml and Python and WxPython can be installed. It allows you to build standalone windows exes and mac applications out of the box.

As I mentioned before, this is just a proof of concept. I have no intention of writing a generic GUI library for my renderer, instead sticking with the specialised one I’m using already.

That means I’m using a different (specialised) wxgui.ml and main.ml and main.py, but the same mltalk.py, pytalk.ml, camlpy.ml and pycaml.py — and I’ll only be keeping those basic four files up to date.

I’ve put it under the BSD license, so have fun.