CamlPDF Version 0.4 (for OCaml and F#)

CamlPDF Version 0.4 has been released.

The biggest change is that this release now compiles with Microsoft F Sharp as well as with OCaml.

Some major incompatible changes have been made to the low-level API, in the light of our experience of building large software with the library. These should be the last such changes, at least to the basic modules.

One of the changes is that many functions which used to take a Pdf.pdfdoc and return another one now modify the document in-place. This is rather un-idiomatic for a functional library, but threading all the documents through complicated functions in code using CamlPDF became wearying.
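To make the difference concrete, here is a minimal sketch of the two styles. The record type `doc` and all four functions are made up for illustration; they are not CamlPDF's real `Pdf.pdfdoc` or its API.

```ocaml
(* Hypothetical stand-in for a PDF document; not the real Pdf.pdfdoc. *)
type doc = { mutable title : string; mutable pages : int }

(* Old style: take a document and return a new one. *)
let set_title_pure title d = { d with title }
let add_page_pure d = { d with pages = d.pages + 1 }

(* New style: modify the document in place and return unit. *)
let set_title title d = d.title <- title
let add_page d = d.pages <- d.pages + 1

let () =
  (* Pure style: every intermediate document has to be threaded through. *)
  let d0 = { title = ""; pages = 0 } in
  let d1 = set_title_pure "Report" d0 in
  let d2 = add_page_pure d1 in
  assert (d0.pages = 0 && d2.pages = 1);
  (* In-place style: one document and a sequence of updates. *)
  let d = { title = ""; pages = 0 } in
  set_title "Report" d;
  add_page d;
  assert (d.title = "Report" && d.pages = 1)
```

The in-place version is shorter to call, at the cost that the original document is no longer available afterwards unless it is copied first.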

There are several new modules:

  • PDFSpace (Parsing Colourspaces)
  • PDFText module extended for more encodings and better text extraction
  • Cff module (Parse Type 1 fonts and convert to Type 3)
  • PDFMarks (Bookmark handling – unfinished)
  • PDFAnnot (PDF Annotations – unfinished)
  • PDFGraphics (Structured graphics – unfinished – included only because Cff uses it)

This release is about a third faster in general than the last one, due to profiling under .NET. Many bug fixes are included, and extra facilities for dealing with malformed PDF files.

CamlPDF is in commercial use in our Command Line PDF Toolkit and PDF Toolkit for .NET.

Posted in Uncategorized | Tagged | Leave a comment

Keeping the Codebase Together

We have to generate the following things from our OCaml codebase:

(a) Command Line PDF Tools for Windows, Mac, Linux and Solaris, demo and full versions
(b) .NET DLL by compiling with F#, demo and full versions
(c) The open-source CamlPDF library
(d) The demonstration renderer and its GUI
(e) Literate Programs for all the code

So far all this has been achieved with just plain OCaml, a couple of makefiles and the very useful OCamlMakefile. A little trickery is required to have demo and full versions compiled from the same source.

Soon we’ll add:

(f) Plain C Wrapper for the same library exposed to .NET

which will allow the PDF library to work natively from C on any platform where OCaml can natively compile. This wrapper will be a little harder to write than the .NET wrapper, since we have less rich types available to express the various ML data structures.
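One common way around the poorer type system of C is to hand out plain integer handles and keep the real ML values in a table on the OCaml side. The sketch below assumes that design; the record type and every function name in it are hypothetical and say nothing about how the actual wrapper will be written. `Callback.register` is the standard library call that makes an OCaml value available to C code through `caml_named_value`.

```ocaml
(* Hypothetical sketch: none of these names come from CamlPDF or cpdf. *)
type doc = { mutable pages : int }

(* C code only ever sees integer handles; this table maps them to ML values. *)
let table : (int, doc) Hashtbl.t = Hashtbl.create 16
let next_handle = ref 0

(* Create a document and return its handle. *)
let new_doc () =
  let h = !next_handle in
  incr next_handle;
  Hashtbl.add table h { pages = 0 };
  h

(* Operations take a handle and look the document up. *)
let add_page h =
  let d = Hashtbl.find table h in
  d.pages <- d.pages + 1

let page_count h = (Hashtbl.find table h).pages

(* Dropping the table entry lets the garbage collector reclaim the document. *)
let delete_doc h = Hashtbl.remove table h

(* Register the functions under names a C stub could look up with
   caml_named_value. *)
let () =
  Callback.register "new_doc" new_doc;
  Callback.register "add_page" add_page;
  Callback.register "page_count" page_count;
  Callback.register "delete_doc" delete_doc
```

With this shape, only integers cross the language boundary, so the C header needs no knowledge of the ML data structures.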

I’ve also been playing with Apple’s Cocoa with Objective-C, in preparation for a new PDF-related product for OS X. I’ll be linking in the new Plain C wrapper to that. Cocoa and Objective-C are all object-oriented kool-aid, but seemingly of the less gross kind. I’ll only be using them for the interface, where they seem to work.

This will be our first consumer rather than business product, so a whole new set of problems to deal with.


.NET Toolkit Released, Command Line Tools Updated

Our .NET PDF Toolkit, in 100% F#, cross-compiled with OCaml, is now available, starting at £495. It does everything the command line tools do, and more, but is usable natively from VB.NET, C# and ASP.NET.

At the same time, we’ve updated the Command Line Tools (written in pure OCaml) with bug fixes and new features, and they are now provided for Solaris 10 (SPARC and Intel) out of the box.

The codebase of about 40,000 lines ended up requiring conditional compilation at only 30 points to cover the differences between F# and OCaml, though plenty of code had to be modified so that it compiles in both environments without conditional compilation.

There will be a new release of CamlPDF soon, which will have many new features, and which will cross-compile with F# out of the box.


Coherent PDF Toolkit for .NET Beta Release

We’re almost ready to release the .NET SDK version of our PDF Command Line Toolkit, and are looking for feedback.

This is the fruit of altering our OCaml codebase of about 20,000 lines to cross-compile with F# – a not entirely straightforward process.

Here’s the .msi which installs a demo version of the SDK:

http://www.coherentpdf.com/cpdflib.msi

Here’s the PDF of the user manual:

http://www.coherentpdf.com/cpdflibmanual.pdf

Installation is covered in Chapter One of the manual, and is simple.

I’m particularly interested in feedback from existing .NET developers, so do take a look, and you can contact us with your suggestions and criticisms either in the comments below or by email.


Profiling F Sharp Code for Speed

I wrote earlier about profiling F# code for memory usage. I’ve been looking at products for profiling speed, and have settled on JetBrains dotTrace for the forthcoming .NET release of our PDF tools. Here are a couple of screenshots profiling speed on our PDF library:

[Screenshot: Profiling for speed]

[Screenshot: Profiling for speed, hotspots]

The speed increases achieved are partly F#-specific, but plenty of the changes made have speeded up the code when compiled with OCaml too.


PDF Command Line Tools 1.3 and CamlPDF Progress

PDF Command Line Tools 1.3 is now out, with new features for fonts, better splitting by bookmark, and dozens of smaller improvements.

There will be a new release of CamlPDF in the next few months. It will be somewhat backward-incompatible, partly because of the changes required to get it cross-compiling with F#, and partly because of some redesign in the light of experience.

New modules:

  • Pdfspace: Colourspaces
  • Cff: Parse Type 1 Compact Format Fonts

Large numbers of other things have been fixed or added, including better text support.

And, if there’s time, the Pdfgraphics module, which lifts a PDF page operator stream and its resources (fonts, colourspaces, images) to an immutable tree of paths, images, clipviews, resources and so on. This allows proper structured editing of graphic content, with the guarantee that every page can be represented in this format and that the read-edit-write cycle is non-destructive. This is somewhat difficult to get right, though, so it may take a while to complete. Here’s the main part of the (rather large) type Pdfgraphics.graphic:


and graphic_elt =
  | Path of (path * path_attributes)
  | Text of textblock list * textblock_attributes
  | MCPoint of string
  | MCPointProperties of string * Pdf.pdfobject
  | MCSection of string * graphic_elt list
  | MCSectionProperties of string * Pdf.pdfobject * graphic_elt list
  | Image of image_attributes * int
  | GraphicInlineImage of Pdf.pdfobject * bytestream * Transform.transform_matrix
  | Clip of path * graphic_elt list
  | Shading of path option * shading * Transform.transform_matrix

and graphic =
  { elements : graphic_elt list;
    fonts : fontname list;
    resources : Pdf.pdfobject }
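To give a feel for what structured editing over such a tree might look like, here is a small sketch using a cut-down stand-in for the type above. The simplified constructors (strings in place of paths, attributes and PDF objects) and both functions are assumptions made for illustration, not part of the Pdfgraphics module.

```ocaml
(* Simplified stand-ins: the real type carries paths, attributes, PDF objects
   and matrices, which are replaced by strings and ints here. *)
type graphic_elt =
  | Path of string
  | Text of string list
  | Image of int
  | MCSection of string * graphic_elt list
  | Clip of string * graphic_elt list

(* Count the Path nodes anywhere in the tree, including nested ones. *)
let rec count_paths elts =
  List.fold_left
    (fun n elt ->
      match elt with
      | Path _ -> n + 1
      | MCSection (_, inner) | Clip (_, inner) -> n + count_paths inner
      | Text _ | Image _ -> n)
    0 elts

(* Build a new tree with every image removed; the input is left untouched. *)
let rec remove_images elts =
  List.concat_map
    (function
      | Image _ -> []
      | MCSection (name, inner) -> [ MCSection (name, remove_images inner) ]
      | Clip (path, inner) -> [ Clip (path, remove_images inner) ]
      | (Path _ | Text _) as elt -> [ elt ])
    elts
```

Because the tree is immutable, an edit such as `remove_images` returns a fresh tree and the original page content stays available for comparison or undo.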


Building Cpdf into a .NET library – progress

I’ve been turning our cpdf PDF command line tools into a .NET DLL by cross-compiling it with F#. Here are a couple of screenshots of editing code using the library in C#.

[Screenshot: Automatically generated tooltips for Cpdf functions]

And here are a couple of screenshots of Microsoft’s CLRProfiler which profiles garbage collection and memory usage.

[Screenshots: Debugging with CLRProfiler]

There’s a lot more work to be done on speed and functionality, but I’m hoping to have it out by the end of November. A new release of CamlPDF which cross-compiles with F# will be released shortly after. There’s a lot of new stuff in CamlPDF too, especially text support.


On the Use of non-Tail-Recursive Functions In Commercial Code

Dear Customer,

Attached is the software for which you paid several hundred pounds. It should work. Maybe on larger inputs it will fail. Sometimes several small parts of it will work, but when you put them together it will fail.

You see, there’s a maximum input data size for which it works. I don’t know exactly what that is, and it may vary according to your platform, configuration, or the day of the week.

On some platforms it might fail cleanly, on some it might segfault. We’re not sure.

(Our codebase is entirely tail-recursive, but people I’ve mentioned this to in passing seem to consider this an extreme position, citing loss of speed. Really?)
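For readers who have not met the distinction, here is a minimal illustration using a list sum. The function names are made up for the example and nothing here is taken from our codebase.

```ocaml
(* Not tail-recursive: the addition happens after the recursive call returns,
   so every list element costs a stack frame. *)
let rec sum_naive = function
  | [] -> 0
  | x :: rest -> x + sum_naive rest

(* Tail-recursive: the running total travels in an accumulator, so the
   recursive call is the last thing the function does and stack use stays
   constant however long the list is. *)
let sum_tail l =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + x) rest
  in
  go 0 l

let () =
  (* A million elements is no problem for the accumulator version. *)
  let big = List.init 1_000_000 (fun i -> i) in
  Printf.printf "%d\n" (sum_tail big)
```

Whether `sum_naive` survives the same input depends on the compiler version, the platform and the stack limit in force, which is exactly the uncertainty the letter above complains about.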


New Reviews of Old Books #57

Digital Typography, Donald E. Knuth, 1999 (Amazon)

[Image: Digital Typography book front cover]

This is a collection of more than thirty articles and notes covering Knuth’s foray into digital type in the late seventies and eighties. They range from font design (a whole chapter on the shape of the letter S), to the history of typography, to some TeX-related material (the entire exposition of TeX’s line-breaking algorithm, for instance). There is also plenty of the concrete mathematical analysis associated with Knuth, including a piece explaining why arrowheads plotted on bitmap displays can look asymmetrical even at quite high resolutions.

Perhaps the most appealing of the less-technical articles is Chapter 17 (AMS Euler), which describes the collaboration between Knuth and Hermann Zapf to produce a new maths font for the American Mathematical Society. It’s drawn directly from the correspondence between the two as the maths of MetaFont progresses along with the design of the font itself.

The book is, as you would expect, beautifully typeset and bound. It costs about sixteen pounds. 


Compiling Code Under OCaml and F# (Part Two)

[Part One]

Twenty thousand lines of CamlPDF and cpdf later, here are some numbers:

  • Occasions on which conditional compilation is required: 22
  • Compilation warnings with fsc --no-warn 62: 15
  • Time taken: 22 hours

The current executable appears to be about 8 times slower than OCaml native compilation, but I haven’t examined this enough to know how much we might be able to improve upon it.

I’m planning to clean up the code to see how much of the conditional compilation we can get into a single Utility module.
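A sketch of what such a Utility module could look like follows. It assumes the ML-compatibility comment markers that F# has offered for this purpose, `(*IF-OCAML*) ... (*ENDIF-OCAML*)` and `(*F# ... F#*)`; check them against your compiler version before relying on them. The names `compiler_name` and `banner` are hypothetical.

```ocaml
(* utility.ml: a hypothetical module gathering the compiler-specific code. *)

(* OCaml treats the two markers below as ordinary comments and compiles the
   definition between them; F# is expected to skip the whole region. *)
(*IF-OCAML*)
let compiler_name = "OCaml"
(*ENDIF-OCAML*)

(* To OCaml this is a single comment; F# is expected to compile its body. *)
(*F#
let compiler_name = "F#"
F#*)

(* The rest of the codebase uses only the shared name. *)
let banner () = "Compiled with " ^ compiler_name
```

Keeping every such pair in one file means the other modules stay free of markers and compile unchanged under both compilers.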

What’s next is to repackage the command line tools as an API for .NET users. I know very little about this topic, so it’s going to require quite a bit of effort before I’m willing to put it on sale and support it.

If you’re familiar with packaging up libraries for .NET and would like to beta test, drop me a line in the comments, or via our website.
