B-Splines for VTK

This is the conclusion of yet another of my long-delayed projects. I started writing some classes for doing image resampling with B-splines in VTK way back in 2007. After working on them for a while, I realized that even after I finished them, I would have to do a lot of testing before I could safely use them. Recently I was reading up on B-splines again, and happened to come across some code from the authors of one of the important B-spline papers. That gave me enough incentive to go back to my old code and finish it off. See a short paper that I wrote here.

Optimized floor() for cross-platform software

My work in computer graphics software involves a lot of float-to-int conversions, the majority of which require a floor() operation rather than a simple truncation. The problem is that floor() is a math library function, and quite hefty one at that. To conform to the IEEE-754 standard, it must correctly deal with overflow and with NANs, in addition to the fact that the function call itself introduces some overhead.

Since the days of the Pentium III, I had been using a fast inlined floor() approximation based on the bit-twiddle trick suggested at stereopsis.com, but that fix was specific to the FPU on the Pentium III and earlier x86 CPUs, and was completely irrelevant to other CPU brands or to newer x86 CPUs with SSE2 instructions. Also, the code not only ugly but was, as I noted in the first sentence of this paragraph, an approximation. Float values that were only slightly less than an integer (on the order of 1e-6 less) would sometimes be rounded up instead of down.

So, a few months ago, I decided to get rid of the obsolete bit-twiddle trick in favor of something more general. I came up with the following, which gives exact results as long as it doesn't encounter overflow or NAN, and which does not require any branching or other expensive CPU operations:

inline int FastFloor(double x)
{
  int i = (int)x;
  return i - ( i > x );
}

The trick here is very simple. If truncation to an int caused the value to increase, then we have to subtract one. We rely on the fact that conditionals such as ( i > x ) evaluate to one or zero.

As I mentioned earlier, this trick does not give correct results on overflow or underflow: it will counter-intuitively return 2147483647 on underflow and -2147483648 on overflow, though this is hardly worse than static_cast<int>(), which returns -2147483648 on both underflow and overflow in most implementations. A check for over/underflow could be included without adding any conditional branches, but that is left as an exercise for the reader.

Don't worry, I didn't forget about the ceil() operation:

inline int FastCeil(double x)
{
  int i = (int)x;
  return i + ( i < x );
}

Like FastFloor(), this code requires as a preconditon that the floating-point value is not NAN and lies within the range [-2147483648,2147483647]. If either of these is likely to occur, then an isnan() check and/or a range check on the float value can be performed. The same is also true for the math-library floor().

New WrapVTK XML format

I've started working on my WrapVTK project again, which aims to be a new wrapper-generator for VTK. This project started last spring as a test-bed for the "2010 VTK Wrapper Overhaul". Essentially, I coupled my revamped vtkParse wrapper-generator front end with an XML-generator back end, so that I could be certain that vtkParse was correctly parsing the VTK header files.

My big news is that I have updated the XML that WrapVTK produces. In the first iteration, it produced highly verbose XML output that was less readable than I liked. This time around, I put a lot of thought into how I could use attributes to make the XML more compact, without losing any flexibility. This is the result. At the same time, I did a few fixes to make sure that it will work with the upcoming VTK 5.8 release (the first VTK release with my new Python wrappers!)

Also, if anyone out there knows of any other XML file formats for describing C++ header files, can you send me a comment? The closest things I have been able to find in my searches are GCCXML, XML-RPC, and SOAP, none of which are really suitable for me. GCCXML is almost usable, except that it cannot provide information about unspecialized templates, and is lacking in human-readability and compactness.

Better late than never

I have an embarrassing amount of source code that I haven’t released. Little projects that I started but never finished. Or worse, project that were finished but never tested. I like to pick through my pile of detritus every so often to see if anything is worth bringing back to life. Last month I did so for two projects.

Project 1: Stencil iterators. Way back during my PhD, I wrote a C++ class for storing a binary image stencil with run-length encoding. It performed well but was a big pain to use. I started working on an iterator-style interface for it to address its usability issues but (ah, yes) ran out of time and had to move on to other things. Now, six years later, I finally got back to it, and converted four VTK image processing filters so that they use it. I also took my old python code for generating image stencils from contours, and converted it to C++. And as soon as I had done so, a grad student in Vancouver started using it. Progress!

Project 2: MNI file formats. I use several file formats that were developed by the Brain Imaging Centre at the Montreal Neurological Institute. Some of the software that I've shipped uses these file formats, too. For the past four years, though, the code that I wrote for these formats was sitting around unused. Well, I finally started working that that old data again, so I thought it would be worthwhile to spend a few days cleaning up the code and releasing it. Now VTK has a vtkMNIObjectReader/Writer (for anatomical geometry like brain surfaces), a vtkMNITransformReader/Writer (for image-to-image transformations), and a vtkMNITagPointReader/Writer (for dataset labelling).

All of these have just barely come in under the wire for the upcoming VTK 5.8 release. It's nice to see the kids leave the nest.

SimVTK 0.2.4 Release

Announcing SimVTK 0.2.4, the lates SimVTK release. Source and binary packages can be downloaded at http://media.cs.queensu.ca/media/SimITKVTK/downloads.php. We have also written a new article on SimVTK, which can be found in the VTK Journal. This new release includes Windows and Mac OS X binaries for VTK 5.6.1, and features various style and stability fixes.

Progress on SimVTK had been going slow for the past year, but we finally have a student working on the project again. With this added manpower, we will be able to merge this project with my WrapVTK project, which is itself a spinoff of the VTK/Python wrapper enhancements that I did over the summer. The result will be that more VTK parameters will be successfully wrapped in SimVTK. We also hope to speed up the wrapping by doing a rewrite of the Perl code-generating code.

Major refactoring

Over the last decade I've done a lot of work on the python wrappers. The wrapper-generator code in "vtkPython.c" grew from 760 lines of code mid-way through my PhD, to 5300 lines of code last month. And a lot of that code was within a single, massive function. That was the function that generated a python "wrapper" function for each method in the C++ class being wrapped. Now, my guideline is that a comfortable file length is 1000 lines, and anything over 1500 is too long.

So I split off three files: one for Python-to-C++ data conversion functions, one for reusable code-generation functions, and one for reusable text/help processing. And the main file, still 3700 lines long, was reorganized into many smallish subroutines with clear inputs and outputs. But the most important thing is that the generated code, i.e. the C++ code that this code writes out, is much cleaner and more compact than before. No goto statements or complex error handling logic, just a simple flow from beginning to end. And it is 25% faster than before.

This was the last time that I'll have to hold the whole file in my head at once, so to speak. Now it is possible to understand the program one subroutine at a time, and I won't have to do a major study of the code every time that I want to add a couple new features. Which is good, because I have many features left to add, but I'll have to do them in bits and snatches when I find the time.

More VTK in Python

Last week I switched vtkWrapPython.c over to the new vtkParse API, which means that vtkWrapPython now has access to much more information about the VTK classes than it used to. After this was done, it was amazing how straightforward it was to add new features to the python wrappers:

  • Multidimensional args like Multiply3x3(double A[3][3], double x[3], double y[3]) are wrapped.
  • Reference args like GetDistance(double p1[3], double p2[3], double &d) are wrapped.
  • The types size_t and ssize_t are wrapped.
  • All typedef'd types are wrapped, as long as they resolve to a supported type.
  • Methods with optional args with default values are properly wrapped.

All of these changes except for reference args were only possible because of the move to the new vtkParse API. The old API simply did not make enough information available.

The reference arg support required a different bit of trickery. Python's ints and floats are immutable (as are Python strings) so it is impossible to change their value through a reference arg the way that C does. So the obvious solution was to make a new python type that is a mutable numeric type. My new mutable() type behaves just like a number, except that its value can be changed. You construct a mutable object like this, mutable(2), and it will behave just like a "2". I also plan to make it so that mutable('string') will generate a mutable string object.

Pointers are still not wrapped, except in cases where the wrappers know the size of the memory area that they point to. Proper wrapping of pointers is dependent on developing a hinting system that will allow the wrappers to deduce the size. For the vtkDataArray::SetTuple(double *) method, I have hard-coded a hint that the size is given by GetNumberOfComponents(). This works fine for this one class, but there are tons of pointer methods in VTK and they cannot all be hard-coded into the wrappers. Once a good hinting scheme is in place, most of these methods can be automatically wrapped. That is a project that will have to wait for another day.

Python ghosts

Python and VTK each have their own garbage collection systems, with reference counts as the primary means of tracking references. The two GC systems are tied together by a very simple rule: the python wrappers hold only one reference to the VTK object. Because of this simple rule, there are only two places in the Python wrapper code where we have to worry about VTK reference counts: 1) the point where VTK objects first enter Python and have a PyVTKObject constructed for them, and 2) the point where the PyVTKObject is destroyed by Python's GC system.

The above system would work perfectly except that, several years ago, I added a Python "dict" to PyVTKObject so that python users could add custom attributes to their VTK objects. I even made it possible to add new methods by subclassing VTK classes within Python. Hence, VTK objects can by extended with pure-python attributes and methods. The problem was that these attributes would only last for as long as the PyVTKObject existed. So if a PyVTKObject was e.g. stored in a vtkCollection and then retrieved, it would come back as a stripped object with all of its pythonic attributes removed because only the vtkObject and not the PyVTKObject would be stored. I would work around this by always using pythonic containers to store python-customized VTK objects.

I have finally fixed the python wrappers so that this unexpected stripping of pythonic attributes no longer occurs. Whenever a PyVTKObject is destroyed, its attributes are saved in a 'ghost' object. The ghost is simple, it is just a C struct that holds the pythonic attributes and a weak pointer to the VTK object. Every time a new ghost object is to be added to the list, the weak pointers are checked, and any ghosts for VTK objects that no longer exist are erased. This keeps the "graveyard" from growing continuously.

So now whenever a VTK object pointer enters Python, the graveyard (the list of ghosts) is checked to see if the object used to have special pythonic attributes. If so, then a PyVTKObject is constructed with these attributes. So to all appearances, the user will be getting the old object back again, but really, they are getting a resurrected ghost. I really don't think that I did a good job of explaining all of that. But since I'm just filling some quiet time during my vacation, it will just have to stand as it is.

Cleaning up VTK’s weakrefs

Today I replaced the internals of VTK's weak pointer implementation. The weak pointers added an observer to their object so they could catch the DeleteEvent signal that every VTK object emits when it destructs. This made weak pointers very resource-intensive, and it also meant that a weak pointer could be left with an invalid pointer if someone called RemoveAllObservers() on the object. My new implementation adds an optional list of weak pointers to vtkObjectBase. This is very lightweight, and it means that observers and weak pointers will no longer get in each other's way.

Since this new implementation works with vtkObjectBase instead of vtkObject, it will also allow me to do a useful thing with the VTK Python wrappers. It used to be that when a python reference to VTK object went out of scope, the pythonic part of the VTK object would be destroyed, even if C++ references to the object still existed. So if the VTK object was later returned to Python, the pythonic part would have to be recreated and important parts of it like its python "dict" would be lost. With weak pointers, I can make an engram of the python object before it destructs, and then resurrect it when the VTK object returns to Python, or I can mark the engram for deletion when the VTK object destructs.