Pinch-to-zoom is one of the basic multi-touch gestures. It's an intuitive way to zoom in/out and is found pretty much anywhere multi-touch is found. It's really easy to implement but it seems that more often than not, people get it wrong. Unfortunately, even Nick Gravelyn's touch-gesture sample on the XNA website got it wrong. (But to be fair, the sample was more about demonstrating the API than what you could do with it.)
Let's first take a look at how pinch-to-zoom is supposed to work.
The Gold Standard
If you have an iPhone or iPod Touch, pull it out now and view a big image. Pick two "landmarks" in the image: spots that are easy to identify precisely, such as the tip of a skyscraper or somebody's nose. Place a finger on each and pinch. As you pinch, you'll notice that your two chosen landmarks stay in place underneath your fingers.
This is how pinch-to-zoom is supposed to work. Each of your two fingers should act as an "anchor" on the image - once the two fingers are pressed on the screen, the points on the image underneath each finger should stay stationary relative to their respective finger.
Okay, so maybe that explanation wasn't very clear. Maybe an image or two will help:
Now you can see what I mean: the points on the image underneath each finger don't move relative to the finger. After the gesture is complete our friendly koala's eye is still under the thumb and the top of his head is still under the index finger.
Okay, so how do you actually implement that? The easiest way would be to use matrices: with a matrix you can define the center of scaling, which handles much of this for you. But the SpriteBatch.Draw method doesn't take a transformation matrix, so you have to make do with specifying a scale and a position.
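For reference, here is a minimal sketch of what "defining the center of scaling" with a matrix means. It is not part of the approach used in the rest of this post. It uses System.Numerics as a stand-in for XNA's math types so that it runs outside XNA, and the class and method names (PivotScaleSketch, ScaleAbout) are made up for illustration.

```csharp
using System.Numerics;

// Hypothetical helper: scales a screen-space point about a pivot using a matrix.
public static class PivotScaleSketch
{
    public static Vector2 ScaleAbout(Vector2 point, Vector2 pivot, float scale)
    {
        // CreateScale with a center point is equivalent to:
        // move the pivot to the origin, scale, then move it back.
        Matrix3x2 transform = Matrix3x2.CreateScale(scale, pivot);
        return Vector2.Transform(point, transform);
    }
}
```

The pivot itself never moves, and every other point moves away from it (or towards it) by the scale factor. For example, scaling the point (110, 100) by 2 about the pivot (100, 100) gives (120, 100).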
Scaling is pretty simple. Each pinch GestureSample gives you a position and a delta for each finger (Position and Delta for the first finger, Position2 and Delta2 for the second). If you subtract the delta from the position, you have the previous position of that finger. The amount you need to scale by is simply the ratio of the new distance between the two fingers to the old distance.
position1 = position of finger 1
position2 = position of finger 2
delta1 = delta of finger 1
delta2 = delta of finger 2
oldPosition1 = position1 - delta1;
oldPosition2 = position2 - delta2;
newDistance = dist(position1, position2);
oldDistance = dist(oldPosition1, oldPosition2);
scaleFactor = newDistance / oldDistance;
The scaleFactor is a multiplier for your object's current scale, not the absolute scale of your object. It tells you how much to change the scale by, not what the scale should be. In pseudocode:
float scaleFactor = GetScaleFactor();
obj.Scale *= scaleFactor;
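The scaling pseudocode above could be written in C# roughly as follows. This is a sketch rather than the code from the sample project: System.Numerics.Vector2 stands in for XNA's Vector2 so the snippet runs on its own, the class name PinchScaleSketch is made up, and the guard against a zero old distance is my own addition.

```csharp
using System.Numerics;

// Hypothetical helper mirroring the scaling pseudocode.
public static class PinchScaleSketch
{
    // Below this distance the ratio is treated as meaningless (assumed threshold).
    private const float MinDistance = 1e-6f;

    // position1/position2 are the current finger positions,
    // delta1/delta2 are how far each finger moved since the last sample.
    public static float GetScaleFactor(Vector2 position1, Vector2 position2,
                                       Vector2 delta1, Vector2 delta2)
    {
        // Recover where each finger was before this sample.
        Vector2 oldPosition1 = position1 - delta1;
        Vector2 oldPosition2 = position2 - delta2;

        float newDistance = Vector2.Distance(position1, position2);
        float oldDistance = Vector2.Distance(oldPosition1, oldPosition2);

        // Added guard (not in the pseudocode): avoid dividing by zero
        // if both fingers started at the same point.
        if (oldDistance < MinDistance)
        {
            return 1f;
        }

        return newDistance / oldDistance;
    }
}
```

For example, fingers now at (100, 100) and (200, 100) that each moved 25 pixels outwards were previously 50 pixels apart and are now 100 pixels apart, which gives a scale factor of 2. You would then apply it with obj.Scale *= scaleFactor.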
Translation is a little bit trickier. We've found the amount to scale our object by, but remember that the points underneath each finger shouldn't move relative to the finger - those points should be "anchored" to each finger. That means we need to translate the object in such a way that whatever was under each finger before the pinch motion is still under that finger afterwards. Again, in pseudocode:
newPos1 = position1 - (oldPosition1 - obj.Position) * scaleFactor;
newPos2 = position2 - (oldPosition2 - obj.Position) * scaleFactor;
obj.Position = midpoint(newPos1, newPos2);
Here the definitions of position1, position2, etc. are the same as previously.
Note: this assumes that obj.Position is in screen-space!
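Here is the translation pseudocode as a C# sketch. As before, System.Numerics.Vector2 stands in for XNA's Vector2 and the class name PinchTranslateSketch is made up. It also assumes that objPosition is the screen-space point the object is scaled about (its origin) and that the object isn't rotated.

```csharp
using System.Numerics;

// Hypothetical helper mirroring the translation pseudocode.
public static class PinchTranslateSketch
{
    public static Vector2 GetNewPosition(Vector2 objPosition,
                                         Vector2 position1, Vector2 position2,
                                         Vector2 delta1, Vector2 delta2,
                                         float scaleFactor)
    {
        // Where each finger was before this sample.
        Vector2 oldPosition1 = position1 - delta1;
        Vector2 oldPosition2 = position2 - delta2;

        // The object position that keeps each finger's anchor point under that finger.
        Vector2 newPos1 = position1 - (oldPosition1 - objPosition) * scaleFactor;
        Vector2 newPos2 = position2 - (oldPosition2 - objPosition) * scaleFactor;

        // Use the midpoint of the two candidate positions.
        return (newPos1 + newPos2) * 0.5f;
    }
}
```

As a quick check: with the object at (0, 0), fingers moving from (125, 100) and (175, 100) to (100, 100) and (200, 100), and a scale factor of 2, the new position is (-150, -100). The point that was 125 pixels right and 100 pixels down from the object's position is now 250 right and 200 down from it, which puts it at (100, 100), exactly under the first finger.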
Consider only one of the two fingers for now. You know that the finger moved by some amount this frame, and that the object has also been scaled by some amount this frame. You want to ensure that the point on the object that was underneath the finger before is still underneath the finger now.
All that the code does is find the difference between the object's position and the finger's old position, scale it by the scaling factor, then add it to the finger's new position. But wait! There are two fingers - which one do we use? Well, both, actually - which is why we take the midpoint of the two possible new positions. (If the line between the fingers keeps its direction, the two results are identical; if the fingers also rotate, the midpoint splits the difference.)
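The same reasoning can be written out as a short derivation. The symbols are my own shorthand, not the author's: p is obj.Position, f and f' are one finger's old and new positions, k is scaleFactor, and p' is the object position we are solving for.

```latex
% The point under the finger sits at offset (f - p) from the object's position.
% After scaling by k that offset becomes k (f - p), and it must land on f':
\[
  p' + k\,(f - p) = f'
  \quad\Longrightarrow\quad
  p' = f' - k\,(f - p)
\]
% With two fingers there are two candidates, p'_1 and p'_2; take their midpoint:
\[
  p'_{\text{obj}} = \tfrac{1}{2}\,\bigl(p'_1 + p'_2\bigr)
\]
```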
That's it. If you apply these simple formulae for the scaling and translation of your objects, your pinch-zoom behaviour will be correct and will "feel" right to your user. Reading and translating all that obtuse prose can be a pain, I know, so I've also provided some sample code that you can use.
There isn't much to it, really. I've provided a static class PinchZoom which contains the functions necessary to implement pinch zooming, as well as a sample project. You'll need a multi-touch capable monitor or an actual WP7 device to test it, of course. Alternatively you can try this excellent multi-touch simulator drop-in component.
PinchZoom.zip Sample project (789KB)
Here it is in action: