# 3. Algorithms to generate Autostereograms

Copyright 1996 Simon Booth, University of Leeds

## 3.1 Depth Maps

A 3-Dimensional volume contains far more information than a 2-Dimensional image, and much of the information is unnecessary for binocular vision. Since stereograms are generated from a fixed viewpoint, it is only necessary to store the distance from the eyes to the first point in space visible through each pixel of the image. Hence, scenes are usually stored as ‘Depth Maps’ for the purpose of stereogram generation. These are two dimensional images in which the colour of each pixel is taken to represent depth. Suppose the image allows 256 shades of grey, then black would be the far plane, and white would be the nearest point represented. Since proper colour information cannot be included in a stereogram it does not matter that pixel colour is used to represent a different quantity, and this method is the simplest means of storage and manipulation of the necessary 3-D information.

Depth Maps can be generated in many ways. The simplest is to use an art package to create them by hand, though this can be difficult since using brightness as depth goes against intuitive perception of lighting. For simple stereo images it is the easiest method though. Alternatively, a ray-tracing approach can be taken. Standard ray-tracing packages can be modified so that they return the depth at each pixel on screen rather than colour. This is the method used by most commercial companies, and a public domain package ‘Ray-SIS’ is available on the internet to perform the task. A similar approach is to use a 3-D rendering package which supports Z-Buffered output. A Z-Buffer is essentially a Depth-Map – a 2-D image where colour represents depth, so if the Z-Buffer can be exported from the package it can be used to generate stereograms. Mathematical formulae can also be used to generate interesting depth maps, where the depth at each pixel is a function of x and y – Stereogram fractals can be created in this way, for instance. Once a depth map is created, the next stage is to apply an algorithm to create a stereogram with the same depth pattern.

## 3.2. Processing Depth-Maps

Because the human eyes lie on a horizontal line, the constraints on the stereogram to produce an illusion of depth also lie on a horizontal line (assuming the image is viewed straight on). This means that the image can be generated a line at a time since no information is required from other lines to calculate the constraints. The input to a stereogram creation algorithm at any stage is, therefore, a single line of a depth map. The ‘colour’ at each point on the line is the depth, or z co-ordinate of that point in the scene. The position on the line is the x co-ordinate and the line number the y co-ordinate. Converting a depth map to a stereogram involves using the formula derived in section 2.2 to calculate the two (i,j) pixels on the stereogram that correspond to that (x,y,z) point of 3-D space. There are infact a number of algorithms used to perform this seemingly simple task. First, though, a little terminology for the features that are common to all the algorithms:

The terms far plane, near plane, and image plane are assumed from the stereogram geometry discussed in section 2.2. Let S be the stereo separation of the image in pixels. Further, let Z(x,y) by the value of the Depth Map at pixel (x,y), scaled to the range [0,1], where 0 is the furthest depth in the image (the far plane), and 1 is the nearest possible – the image plane, or position of the actual stereogram. Let µ be the field depth of the image – the range of depth values actually used. The first stage of stereogram generation is to create a strip of dots S pixels wide, which I will call the Source Strip. It may be either random dots or a bitmap.  The nature of this strip will determine the texture of the resulting stereogram.

## 3.2.1 The simplest algorithm.

The simplest algorithm, credited to Tyler & Chang in around 1977, involves repeating a strip of dots across the stereogram, wallpaper-like, but modulating the repeat width according to the depth at each pixel. For areas of the depthmap with Z = 0, the Source Strip is repeated at intervals of S pixels. The repeat width is decreased for areas with greater Z values – at the image plane the repeat width would be 0, were depth values that great allowed. In the simplest form of the algorithm, the repeat width is a linear function of Z. Assuming a field depth of µ, the repeat width s(x,y) = µ * Z(x,y) * S. Hence, at the near plane, the Source Strip would be repeated at intervals of µS pixels. This linear interpolation of repeat widths is technically incorrect, but it is a simple matter to replace the formula with that derived in section 2.2.

The actual application of this process is complicated by the fact that Z will probably not remain constant across the whole span of the repeat width, so it is not possible to simply copy a whole strip of that width to the stereogram. To handle this fact, a quantity dx is calculated at each pixel (x,y). This is the quantity that needs to be added to the offset in the Source Strip at each pixel of the stereogram to produce a strip s(x,y) pixels wide. In general, dx = S/s(x,y).  If Z(x,y) = 0, then s(x,y) = S, so dx = 1. If Z(x,y) = 1, then s(x,y) ≈ µS, so dx 1/m. To generate the stereogram, each line of the depthmap is scanned in turn. Two offsets are maintained – x is the current pixel of the depth map and i is the current pixel of the source strip.

Starting with x = i = 0, pixel x of the stereogram is set equal to pixel i of the Source Strip. Then dx is calculated and added to i. If this makes i greater than S, i is reset to i mod S. x is then incremented by 1, and this process is repeated until the end of the line is reached.

The effect of this algorithm can be clearly seen in stereograms with bitmap textures – the basic ‘wallpaper effect’ is distorted, with areas of the image seemingly squashed. These squashed areas correspond to the parts that will be closest to the viewer when the image is viewed in stereo.

The problem with the algorithm is that it is basically wrong. It does not provide the correct stereo separation for all points of the image, and hence does not produce a correct image. This is particularly apparent in stereograms generated from depth maps with sharp changes of depth. The stereo image can deviate very wildly from the depth map, and often be completely unintelligible. It can produce some nice images when the depthmap has only smooth changes of depth, though – particularly when a bitmap texture is used. In order to generate a technically correct image, however, it is not possible to consider each pixel in isolation; all pixels need to be related to the pixels that have gone before. This observation leads to the ‘Look-Back Algorithm’.

Note: Stereogrammer no longer supports the Simplest Algorithm.

## 3.2.2. The Look-Back Algorithm.

The Look-Back Algorithm was created by Christopher Tyler & Maureen Chang in 1979, and handles sharp changes in depth properly. It differs from the simplest algorithm in that the colour of each pixel is calculated by ‘looking back’ at an earlier pixel in the line. The distance that is looked back (the ‘Look-Back distance’) is governed by the depth at that pixel.

First, the initial strip of dots is copied into the stereogram image without distortion, so it is S pixels wide. This is necessary so that there are sufficient pixels in the image to Look Back at, but it has the side effect that the area they are copied into cannot have any depth variations. Scanning of the depth map begins after this initial strip has been copied. Let x be the position in a line of the depth map, and let i be the position in the corresponding line of the stereogram. The first S pixels of the stereogram line are equal to the Source Strip, so the algorithm begins with i = x = S. At each stage, both i and x are incremented by 1, and the depth value Z(x,y) is examined. A record of the depth at the last pixel, Z(x-1,y), is also maintained throughout the line. If Z(x,y) > Z(x-1,y), the depth is increasing and the pixel is not constrained to be the same as any previous one. A new random colour can be chosen, or the pixel can be set equal to the pixel before. For random dot textures, generating a new random pixel reduces the chance of ‘artifactual echoes’, caused by a false pixel association by the brain, but for bitmap textures this produces unattractive images and repeating the previous pixel looks far better. If Z(x,y) < Z(x-1,y) then the pixel needs to be paired with an earlier one. The stereo separation s(x,y) for a point at that depth is calculated, using the formula in section 2.2, and the algorithm ‘looks back’ to find the colour of the pixel (i – s(x,y), y), which was determined earlier in the process. Pixel x is then set to the same colour. For simplicity, the look-back distance s(x,y) is usually referred to as L, so the algorithm becomes:

For every pixel (i,y):

if Z(x,y) < Z(x-1,y), let pixel (i,y) = pixel (i-1,y)

else if Z(x,y) ≥ Z(x-1,y), let pixel (i,y) = pixel (i-L,y)

The above formulation is asymmetric, as it assumes that the left eye looks directly at the image whilst the right eye looks at an angle (Figure 3.2.1). This produces a distortion in the image to the right, so that objects in the stereo image will not be in the same position as they are in the depth map. This can be compensated for by setting x = i – S/2 for every point, so that each pixel of the stereogram is determined by examining the pixel S/2 pixels back in the depth map. This re-adjusts the position of objects so that depth map and stereo image concur.

The look-back algorithm produces results that are similar to the simplest algorithm, in that nearer areas appear to be more squashed, but it can also be seen that are sometimes ‘leaps’ in the texture, corresponding to sharp changes in depth. These leaps occur because the stereo separation at neighbouring pixels is significantly different, so the distance looked back is correspondingly different. The Look-Back algorithm produces correct images whether the depth map has sharp changes or not, and generates perfectly acceptable stereograms under most circumstances. It is still asymmetric however, even with the compensation described above. This is because the mathematics used still assume that the left eye is looking directly at the image, whilst the right eye looks at an angle. In reality the lines of sight of both eyes meet the stereogram at the same angle, and the point of convergence lies in line with the centre of the eyes. The effect of this is a distortion with depth, so that nearer objects in the stereo image appear horizontally displaced. The effect is not large, or even noticeable under most circumstances. However, in order to be truly accurate, Harold Thimbleby, Stuart Inglis and Ian Witten developed a new stereogram generation algorithm, based upon constraint satisfaction. The algorithm also has the advantage of supporting Hidden Surface Removal.

## 3.2.3. The Constraint Satisfaction Algorithm.

The Thimbleby, Inglis and Witten algorithm is adequately described, along with other aspects of the Autostereogram, in their paper ‘Displaying 3D Images: Algorithms for Single Image Random Dot Stereograms’, available by ftp from the university of Waikato in New Zealand. Their approach is much closer to the actual mathematical theory behind the stereogram. The algorithm calculates the constraints imposed on each pixel of a stereogram by the points of the stereo image to which they correspond. An array same[] is maintained to record these constraints. If the stereogram is w pixels wide, than the array same has w elements, initialised so that same[i] = i for i ε [0,w]. In other words, each pixel of the stereogram is initially constrained to be the same as itself, which is the same as no constraint. The line of the depthmap is then scanned from right to left (the direction is arbitrary). The stereo separation s at each point is calculated as a function of Z(x,y), using the formula from section 2.2. Since the (x,y,z) is assumed to be in line with the centre of the eyes under this scheme, the pixels corresponding to that point in the image are at left=x-s/2 and right=x+s/2. These two pixels must be of the same colour, and this fact is recorded by setting same[left]=right, provided both left and right lie within the bounds of the image. When this is repeated for every Z(x,y) on the line, the result is a complete record of the constraints on the pixels of that line. The pixels of the stereogram are then filled in by scanning the same array from right to left. For every i, if same[i] = i there is no constraint on the pixel, so it is allocated a random colour. Otherwise, if same[i] = x, pixel i is set to the same colour as pixel x. The result is a line of the stereogram which represents the depthmap in a symmetric manner – there is no lateral distortion of closer parts of the image.

The algorithm produces largely similar results to the Look-Back algorithm, visually, since the lateral distortion was never very apparent.  The differences between this algorithm and the look-back algorithm become more apparent when using bitmap textures.  Firstly, since it assigns pixels from the right to the left, the image appears more distorted at the left than the right. The opposite is true of the Look-Back algorithm, which works from left to right.  Secondly, unconstrained pixels do not always match up with their neighbouring pixels, which can cause incongruous lines in the stereogram. This also happens in the Look-Back algorithm when a random pixel is assigned rather than setting pixel (x,y) equal to pixel (x-1,y), and it could be removed from the constraint satisfaction algorithm by setting the colour of pixels where same[i] = i equal to the colour of pixel i+1, but I chose not to do this.

The major difference between Thimbleby, Inglis and Witten’s algorithm and the Look-Back algorithm is the provision of Hidden Surface Removal. The term is common in 3-D graphics to refer to faces of an object that cannot be seen from the current viewpoint as they are on the ‘back’ side. It has a slightly different meaning in the context of stereograms – it refers to parts of an object that are only visible to one eye, because they are obscured from the other eye by another part of the object. This situation is best explained by means of a diagram (Figure 3.2.2.).

Because the point should not be seen by one eye, it is not necessary that its two corresponding image pixels be of the same colour. Indeed, it can be damaging to the effect of the stereogram as it increases the risk of ‘echoes’, which are pixels of the image that the brain mistakenly associates, creating a depth change where there should not be one.

Thimbleby, Inglis and Witten’s Hidden Surface Removal operates in a simplistic and slow manner. Before a constraint is recorded, the line of sight of each eye is traced from the point in the stereo image back to the image plane. At each point, the depthmap is tested to see if there is a part of the image which intersects the line of sight. If so, the pixels are left unconstrained. This reduces the number of constraints on each line of the stereogram (assuming there are some hidden surfaces), and thereby increases the randomness of a random dot stereogram. This reduces the chance of there being pairs of pixels that the brain should not associate, but actually does, which reduces the chance of echoes in the stereo image. If the stereogram is being generated from a bitmap texture, the effect of hidden surface removal is greatly reduced, since the choice of colour for an unconstrained pixel is not random after all. It does diminish the effect, but echoes can still occur. It should be noted that these echoes are mostly caused by imperfect convergence on the image, and they can be almost totally removed by concentrating harder on getting the convergence correct.

The main draw-back of the hidden surface removal used in the constraint satisfaction algorithm is its speed. It can slow stereogram generation down to less than a quarter of its free speed.

## 3.2.4. Horoptic Algorithm.

The Horoptic algorithm is a variation on the constraint satisfaction algorithm, developed by myself. It was observed in section 2.1 that the geometry developed was simplified because it assumed that the retina was a flat surface, and that points in 3-D space which registered on corresponding retinal points therefore lay on a flat plane. Infact, since the retina is curved, the points actually lie on a circle approximately connecting the focal point with the centre of rotation of each eye. This circle is known as the ‘Horopter’, and has been investigated by Woodworth & Schlosberg (1954) and Shipley & Rawlings (1970). The effect of assuming a flat retina on the stereogram generation algorithms is to produce a slight bulge in the centre of the resulting image. Stereo separations should really be calculated by considering the distance of each plane on the depth image from the horopter, rather than the flat far plane. The horoptic algorithm adjusts the geometry of Thimbleby, Inglis and Witten’s algorithm to account for the difference between the horopter and the far plane. The distance from the far plane to the horopter is calculated, and this is subtracted from the Z(x,y) value of the depth map. This is illustrated by Fig 3.2.3. It should be noted, first, that this means that some stereo separations will be calculated from a negative z value. This does not matter as the equation remains consistent. Negative z values simply indicate that the point being represented lies behind the horopter (a z of 0 implies that the point lies on the horopter). More significantly, the algorithm assumes that the eye positions are fixed with relation to the stereogram. It is assumed that they lie directly in line with the centre of the image. Further, the distance between the viewer and the stereogram is assumed to be fixed. I assume a distance equal to 20 times the image stereo separation. For an Sfar of 1.25”, this fixes the viewing distance at 25”, a little over two feet. In my experience this is a good average for the viewing distance. If these figures are not exactly correct, the bulge in the image will not be totally eliminated, though it will be reduced. The difference between this algorithm and the Constraint Satisfaction algorithm can be observed by generating a stereogram from a blank depth map. In the Constraint Satisfaction algorithm a bulge toward the viewer in the centre of the image is apparent, but with the Horoptic algorithm it is greatly reduced.

In order to simplify the mathematics in the Horoptic algorithm, I assume that the horopter passes not through the centre of rotation of each eye, but through the point centred between the two eyes. With the high ratio of viewing distance to eye separation involved, this change is negligible. It does mean, however that the centre of the circle can be assumed to lie on the image plane, and that the radius of the horopter is equal to the viewing distance. The algorithm only assumes the eyes to be fixed horizontally with respect to the stereogram, but to be directly level with each line as it is calculated. This does leave a slight distortion along the y axis.

Since the centre of the circle approximating the horopter is known, and its radius, it is possible to calculate the z coordinate of any point given the x and y coordinates. y is fixed for each line, and x spans the range -W/2 to W/2, where W is the width of the stereogram. The distance from the image plane to the horopter is given by the positive solution of the equation:

x2 + z2 = R2

This is the standard equation of a circle. As R is known to be equal to D, and x is known for each point on the line, the distance from the far plane to the horopter is given by: This is then scaled to the range [0,1] to match the values of the depth image received by the algorithm, and subtracted from Z(x,y) before the stereo separation S(x,y) is calculated. The result is a compensation for the effects of the horopter.

I have also implemented an alternative constraint satisfaction algorithm to that used by Thimbleby, Inglis and Witten, which produces more symmetrical results.

## 3.2.5 Techmind Algorithm

The Techmind algorithm was developed by Andrew Steer, and is described in detail on his website Techmind.org.  The algorithm has been adapted slightly to fit into the Stereogrammer framework, but operates on the same principles.

One advantage of the Techmind algorithm is that Hidden Surface Removal is handled implicitly by the algorithm (the option has no effect on this algorithm) and is performed in a manner which is much faster than the method used in the Constraint Satisfaction and Horoptic algorithms.  Typically, this algorithm is the fastest except for the Lookback algorithm (which does not support HSR at all).

The algorithm also introduces a vertical distortion into textures, which can help to eliminate artefacts in the generated image (such as ghosting).

Next: Observations and Conclusions