Supported Operations

The following images are from the toy,
and make up the 16 boolean operations
AB
AB
UnionIntersect
UnionIntersect
A-BB-A
A - BB - A
ExcludeNone
ExcludeNone

Complements:

AB
AB
UnionIntersect
UnionIntersect
A-BB-A
A - BB - A
ExcludeFull
ExcludeFull

BOOLOPS Report

Boolean Operations

My project for this summer (of 07) has been graciously funded by Google. This funding helped make the project "official" - it was something I knew I would release to the world, and I had to make it as good as possible. I have completed the objectives laid out in my proposal, yet I intend to continue refining it.

This project certainly had some low points - points where if working under my own impetus I'd have probably jumped ship and started work on something else ( in this case a robot simulator in Haskell :) ). Yet because of the mandatory nature of the arrangement, I persisted, and things soon got better. I'd like to think that this project has taught me to stick through the rough periods - I now know the gratification of having something complex work (somewhat) flawlessly after persistent labor.

The project involved many redesigns and rewrites, which I document here.

Trying It Out

Currently it's not very easy to try out the boolean operations toy. Even if they were released, compiled binaries have various dependancies such as gtk and cairo. Hopefully it will soon be included as an Inkscape Live-Path-Effect (another google summer of code project).

I did my work within lib2geom, a library with close Inkscape affiliation. Instructions for compiling are contained within the SVN repository. Once the project is compiled, run src/toys/boolops. If run from a command line, the first two parameters specify svgd files to be used as the shapes in the operations. SVGD files are just the contents of an SVG path element's data.

API / Architectural Overview

Layer 0: Parametric Paths

Curve: a continuous 2d function on the interval [0,1], with precise start and end points.

Path: a series of connected Curves, implicitly closed with a final line segment if necessary.

These particular concepts/structures already existed in lib2geom, within which I have written my boolean operations code.

Layer 1: Path Intersection

src/path-intersection.cpp
src/path-intersection.h
src/crossing.h
src/crossing.h
src/toys/winding-test.cpp

Winding Number: the number of times the path goes around a particular point. Since paths are closed, the result is always integral. A counter-clockwise rotation corresponds to a positive wind, whereas a clockwise rotation is represented as a negative wind. Here, and throughout, the terms clockwise and counter-clockwise refer to the standard mathematical coordinate system rather than the computer, inverted y-axis, coordinate system.

Crossing: a location where two paths cross. This is a subset of the intersections of the paths, with the requirement that the winding must change. Crossings are stored as two time values and a boolean indicating the relationship between the curves at the crossing. A particular time value corresponds to the point on curve i at f, where i is the integral part, and f is the fractional part. This boolean is useful for making decisions during a traversal of crossings. It is defined as the sign of the cross product of the derivatives. When the paths are regions with the same winding, this indicates that along B, A crosses 'outside'.

Winding functions:

int winding(Path, Point);
bool path_direction(Path);  —  assumes the path is simple (like a region)
bool contains(Path, Point, bool even-odd = true);  —  convenience wrapper around winding

There are a few convenience typedefs for the popular crossing collections:
typedef std::vector<Crossing> Crossings;
typedef std::vector<Crossings> CrossingSet;

The intersection system is designed to be modular, so that different algorithms may be used. The current algorithms are generic, and apply to anything that implements Curve. A custom curve type might provide an algorithm with special cases, and default to the generic algorithms. The following crossing functions use the DefaultCrosser, which is currently a simple binary search.

Crossings crossings(Path, Path);  —  finds the crossings between two paths
CrossingSet crossings(std::vector<Path>, std::vector<Path>);  —  finds the crossings between two sets of paths.

Layer 2: Regions

src/region.cpp
src/region.h

Region: a limited point-set, defined by a boundary. This boundary is stored as a simple (non-self-crossing) path. Regions with positive winding (counter-clockwise) include points which are inside the boundary (fills), whereas Regions with negative winding include all points outside the boundary (holes). It also caches information such as winding direction and bounding box.

Most of the public api for region simply wraps and caches queries to a Path. It also provides various convenience functions such as as_fill and as_hole. Region should usually only be used by clients of this library when dealing with paths with the same invariants as regions.

Layer 3: Shapes

src/shape.cpp
src/shape.h
src/toys/boolops.cpp

Shape: a point-set upon which boolean operations may be performed. Shapes are defined by a list of regions, where the resulting point-set is the cumulative intersection. Though it may be figured out from the regions, shapes also store whether the most outer paths are fill or hole. In other words, it stores the value of all the points completely outside the boundary of all its constituent regions. Shapes have the following invariants:

The shape files really constitutes the main body of my work, while the other stuff is peripheral support. It contains the algorithms which perform boolean operations and sanitization on shapes.

The most important function looks like this:

Shape shape_boolean(bool, Shape, Shape, CrossingSet);

If the initial boolean is false, the function unions the shapes. If it is true, it intersects them. The CrossingSet contains the crossings between the regions of the two shapes (an overload of this function allows omission of this parameter). The functions operation is actually surprisingly simple. It traverses the crossings, keeping track of which it has hit, collecting portions of path as it goes, until it reaches a crossing it has already visited, at which point it stores the path and starts again at an unvisited crossing.

The real magic happens in the traversal. It took me quite a long time to really figure it out, but eventually I realized that quite simple logic would allow for traversing intersecting sets of regions correctly. If the boolean input doesn't equal the direction of the crossing, path A is followed, if they are equal, the path B is followed. I actually figured this particular bit fairly early on, but it was with a pair of regions. At the time I had no idea it could also work for doing boolean operations on sets of regions.

While the previous function contains the main code, the actual public function intended for public use is quite different.
Shape boolop(Shape, Shape, unsigned

It takes the two shapes, and a flag integer specifying which of 16 boolean operations to perform. These flags use the first 4 bits to represent a boolean truth table. This is the function which is used in the boolops toy. Half of the operations, the complements, deal with cases where the output shape is filled in areas where neither of the input shapes are. The binary complement operation(~) applied to the flags, will yield flags specifying the complementary boolean operation. 6 of the combinations of flags don't even call shape_boolean - the various identity, inversion, and none/all operations.

src/sweep.cpp
src/sweep.h
src/toys/sweep.cpp
src/tests/intersection-test.cpp

These are some support algorithms which provide a sweep on bounds, for efficient intersections between sets of bounded objects.

Current Issues

The main current deficiency is the sanitizer. I've only recently managed to get a path uncrosser somewhat working. I have a variety of sanitization methods implemented to various degrees, yet have only recently figured out the real reasons I haven't yet gotten it fully functioning. I hope to fix it soon, so that boolops may be used as a live-path-effect within inkscape (the live-path-effects project is a Summer-of-Code project). I think this is a good way to introduce the new boolean operations, and I definitely want my work to be used.

There's also some code which is currently unused:

I have an implementation of intersection routines which splits paths into monotonic portions, and uses the properties of these sections to perform very fast intersection. It works somewhat, but there are bugs I haven't had time to work out, which make the current implementation unstable. This intersection algorithm is especially valuable for self intersections, as monotonic splits are already required.

I wrote some unit test code, however, I tended towards testing using toys during development. However, for debugging and optimizing it may be useful in the future.

Possible Future Design Changes

It may be possible for the boolean operations to return a set of portions. When the actual path data is required, the portion operations would be applied. The actual boolean operations would take no time to execute, as all the computational load would be in the crossing-finder and path synthesizer. This would also work for the sanitization step, saving many spurious portions. The portions found in the boolean operations could be composed onto those produced by the sanitization routine, so that the unnecessary intermediary paths between sanitization and boolean operations are never generated.

The current representation of crossings between sets of paths is fairly awkward. The main core of the data structure, storing times and the direction boolean is fine, the main issue is the storage methods of collections of these crossings. I've come to realize it's actually one of the main reasons that sanitization is as hard as it has been. I've figured out that a better representation would be for each crossing to store pointers to the next and previous crossings along both its participating paths. This would form a connectivity graph, where each vertex has a degree of 4. A dedicated structure such as this would likely be worth it, as algorithms that use it would be more efficient, simpler, and less buggy.

Without too much work, it should also be possible to do boolean operations with non-filled paths. The main change would be to have shape_boolean handle coincident crossings, and to derive the resultant region fill from the fill of the contributing regions. This would allow all combinations of boolean operations between filled regions and paths.

Other Implementations of Boolean Operations

There are a few other interesting implementations of boolean operations:

Inkscape's current boolean operations work by converting to poly-line, using a Bentley-Ottman sweep-line, with the Hobby algorithm, and then converting back to curves. The speed gains found by using clever algorithms are lost in the overhead of converting to and from poly-lines. It is also imprecise and unstable for anything more than trivial cases. The worst part is probably that no-one can fix it, as it consists of about 9000 lines of impenetrable code. This technique of converting to poly-lines seems to be all to common.

CGAL's 2D Regularized Boolean Set-Operations - this implementation of boolean operations takes the approach of doing polygonal boolean operations very well, and then generalizing the technique to restricted curves. Theirs is likely better accuracy and stability-wise, however, the extreme template style, no support for sanitization, and conflicting licence makes it unattractive for anything other than academic projects. The interesting thing about this implementation of boolean operations is that some aspects of it directly coincide with my eventual design, in particular, the treatment of holes.

End-Notes

Cows with Guns
"Cows with Guns" - this is the exclusion of a cow and a gun.

The official time for this project has been about 50 days, during which I've worked hard on it. I intend to continue work on it, though at a less intense pace, as school has started. The total line count comes to ~1400 for the files authored specifically for this project. I also added many things to the lib2geom library in order to support boolops, in particular, numerous support path functions.

Many thanks to Raph Levien for arranging this, as well as Nathan Hurst, Mentalguy, Peter Moulder, Paul Harrison, and others for advice and help on this project.