The Work done so far...

("don't you ever think it'll stop!")

"The facial lift" (25-Apr-99)

The first major part of getting to grips with the project was to learn all there was to know about the existing one. This I did by reading Gail Shaw's project report and going through printouts of her code. Some suggestions for improvement came readily to mind, while others took some time. A lot of code was repeated (the spline would be traversed once for each feature) without really providing new computational information. I therefore decided to remodel the system.
The linked list of 3D points that was previously passed around has now become a structure of its own. In an object-oriented manner, it knows its data points and its latest length and angle, and can compute these values autonomously.
The GestureRecognizer now holds the pieces together. It holds the spline and the features, and obviously knows about the gestures. New features can easily be plugged into the system, and repetitive calculations are no longer necessary, since the spline keeps all information important to the features readily available. Iterating through the list only once, the spline calls each feature and causes it to perform its feature-specific calculations. After all feature values have been extracted, the evaluation can take place.
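The restructured design described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's actual implementation; all names (Spline, Feature, GestureRecognizer's methods) and the exact division of responsibilities are assumptions based on the description.

```python
import math

class Spline:
    """Owns its 3D points and computes its length and latest angle autonomously."""
    def __init__(self):
        self.points = []
        self.length = 0.0   # running arc length over all segments
        self.angle = 0.0    # angle of the most recent segment (xy-plane)

    def add_point(self, p):
        if self.points:
            x0, y0, z0 = self.points[-1]
            x1, y1, z1 = p
            self.length += math.sqrt((x1 - x0)**2 + (y1 - y0)**2 + (z1 - z0)**2)
            self.angle = math.atan2(y1 - y0, x1 - x0)
        self.points.append(p)

class Feature:
    """Base class: pluggable features implement per-point observation."""
    def reset(self): pass
    def observe(self, spline, point): pass
    def value(self): return 0.0

class GestureRecognizer:
    """Holds the features and extracts all feature values in one traversal."""
    def __init__(self, features):
        self.features = features

    def extract(self, spline):
        for f in self.features:
            f.reset()
        for p in spline.points:          # the list is iterated only once
            for f in self.features:
                f.observe(spline, p)
        return [f.value() for f in self.features]
```

The key point of the remodelling is visible in `extract`: each feature is called during a single pass over the spline, instead of every feature traversing the spline on its own.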
The interface to the system has not been changed yet and probably will not be.


Gesture Recognition goes democratic (25-Apr-99)

This is not so much a political statement as the latest development in the recognition part of the VRGestureRecognizer. The previously implemented recognizers (Recognizer1 and Recognizer2 respectively) had some severe shortcomings.
Recognizer1, based on a hard cut-off mechanism, simply determined whether the computed features of a gesture lay strictly within the bounds specified by an upper and lower bound. The first gesture to lie within these bounds would be declared recognized. This mechanism relies heavily on the independence of gestures, as no two gestures may have the same allowed ranges for their characteristic feature values.
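The cut-off idea can be sketched as below. This is a hedged reconstruction in Python; the function name, the template format and the first-match rule are assumptions drawn from the description, not the original code.

```python
def recognize_cutoff(values, templates):
    """Recognizer1-style cut-off: a gesture matches only if every computed
    feature value lies strictly within that gesture's (low, high) bounds.
    templates: {gesture_name: [(low, high), ...]}, one bound pair per feature."""
    for name, bounds in templates.items():
        if all(lo < v < hi for v, (lo, hi) in zip(values, bounds)):
            return name            # the first gesture within bounds wins
    return None                    # nothing recognized
```

The weakness noted above shows directly in the code: if two gestures had overlapping bounds for all features, whichever happens to be checked first would always win.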
Recognizer2 improved on this by calculating an error from an expected value for each gesture and feature. The gesture that minimized the error would be chosen. This proved a distinct improvement, but failed under certain circumstances: some features are extremely unreliable for specific gestures, and the errors returned by these features would "drown out" any small errors and dominate the recognition process.
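A minimal sketch of this error-minimization scheme, assuming a simple sum of absolute deviations (the actual error metric is not specified in the report):

```python
def recognize_min_error(values, expected):
    """Recognizer2-style selection: sum each gesture's per-feature deviations
    from its expected values and pick the gesture with the smallest total.
    expected: {gesture_name: [expected value per feature]}."""
    def total_error(name):
        return sum(abs(v - e) for v, e in zip(values, expected[name]))
    return min(expected, key=total_error)
```

The flaw described above is inherent in the summation: one wildly unreliable feature contributes a huge term that swamps the small, informative errors of the other features.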
This is where the idea of a voting system evolved. Instead of accumulating all errors into one big result, I looked at the errors individually. This was first realised for the original three gestures, where a gesture was recognised if two or three of the features agreed on it. Soon the concept hardened and was generalized to any number of gestures.
At present, any number of features can "vote" on what they think the recognized gesture should be. A majority of two-thirds or more decides the final result. Not only does this make it possible to ignore recognition outliers, but an estimate of confidence can also be computed: the more features agree on a result, the more reliable the result should be.
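The voting scheme just described can be sketched as follows. This is an illustrative Python sketch with hypothetical names; the treatment of abstaining features and the exact confidence formula are assumptions.

```python
from collections import Counter

def recognize_vote(votes):
    """Each feature casts one vote (a gesture name, or None to abstain).
    Returns (gesture, confidence) when at least two thirds of the cast
    votes agree; otherwise (None, confidence of the best candidate)."""
    cast = [v for v in votes if v is not None]
    if not cast:
        return None, 0.0
    name, count = Counter(cast).most_common(1)[0]
    confidence = count / len(cast)       # fraction of features agreeing
    if count * 3 >= 2 * len(cast):       # two-thirds majority or more
        return name, confidence
    return None, confidence              # no sufficient agreement
```

The confidence value falls straight out of the tally: three features agreeing out of three gives 1.0, two out of three gives about 0.67, and anything below the two-thirds line yields no recognition at all.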
The next step was to forbid some features from voting for certain gestures, knowing that they are "trouble-causers" whose judgement cannot be trusted, thus preventing them from spoiling an election. This was achieved by computing a normalised standard deviation over sample data, giving an estimate of the reliability of a feature for a particular gesture: the smaller the normalised standard deviation, the more reliable the feature is considered. This new value can also be used to improve the mechanism of Recognizer2, by weighting the respective errors with the reliability factor provided by the standard deviation.
Presently, features are allowed or disallowed to vote according to whether their normalised standard deviation falls below or above an arbitrarily assigned threshold value.
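A sketch of this reliability test, assuming the normalised standard deviation is the standard deviation divided by the absolute mean (the report does not spell out the exact normalisation, so this formulation and the threshold value are assumptions):

```python
import math

def normalised_std(samples):
    """Standard deviation of the sample data divided by the absolute mean
    (hypothetical normalisation). Smaller means more reliable."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return math.sqrt(var) / abs(mean) if mean else float("inf")

def may_vote(samples, threshold=0.2):
    """A feature may vote for a gesture only if its normalised standard
    deviation over that gesture's sample data falls below the threshold."""
    return normalised_std(samples) < threshold
```

A tightly clustered set of recorded feature values passes the test; widely scattered values disqualify the feature from the election for that gesture.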


The birth of a new Feature (25-Apr-99)

Having developed the new recognizer system, I was in urgent need of a new feature to improve the usability of my system. From a very early stage in the project I had wanted to get some information about the geometrical shape of the object, as in "is it round", a triangle, a rectangle and so on. I decided it would be useful to know how many corners (discontinuities in an otherwise smooth curve) a spline contains. With the restructured recognition model in place, this was relatively easily accomplished.
I derived a new feature, KinkyFeature, from the Feature class and implemented its ComputeFeatureValue method as a low-pass filter with debouncing capabilities. This worked very well for gestures that actually have "kinks" in them (like the arrow, the cross, the scribble, etc.), but was not totally reliable for smooth gestures (such as the select or circles gestures), because some irregularities were still able to penetrate the filter. To remedy this I decided to multiply the number of corners found in a gesture by the sum of the derivatives of the curve at those points. This made for some improvement, but still needs looking into.
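The corner-counting idea might look roughly like this. This is a speculative Python sketch: the filter coefficient, threshold, debounce length and the use of the smoothed turning angle as the "derivative" are all assumptions, not the project's actual parameters.

```python
def count_kinks(angles, threshold=0.8, alpha=0.5, debounce=3):
    """Count corners in a sequence of per-segment turning angles (radians).
    A simple low-pass filter smooths the angles; a threshold crossing is
    counted as a corner, and a debounce window suppresses re-triggering.
    Returns (corner count, count weighted by the summed angle change)."""
    smoothed, kinks, weight = 0.0, 0, 0.0
    cooldown = 0
    for a in angles:
        smoothed = alpha * a + (1 - alpha) * smoothed   # low-pass filter
        if cooldown:
            cooldown -= 1                               # debouncing
        elif abs(smoothed) > threshold:
            kinks += 1
            weight += abs(smoothed)                     # angle change at corner
            cooldown = debounce
    return kinks, kinks * weight
```

The second return value reflects the remedy described above: the corner count is multiplied by the summed sharpness at the detected corners, so a few shallow wobbles on a smooth gesture weigh far less than genuine kinks.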


"You never stop learning" (25-Apr-99)

The new recogniser introduced earlier not only proved to be more adaptable and autonomous than the previous ones, but also paved the way for the next big step in improving the gesture recognition system: the ability to learn new gestures and/or train existing ones.
The system will prompt a user to perform a certain gesture repeatedly and then evaluate the data. This will be done in several steps: the gesture will be recorded, and feature values for all features will be calculated and tabulated. Outliers will then be eliminated from these recorded feature values, after which the maximum, minimum, average and normalised standard deviation will be calculated. From these values the system will be able to judge the quality of a gesture (i.e. if the majority of features would be disallowed from voting on that gesture, it is a "bad" gesture) and also be able to tell if it is too similar to an existing gesture.


"Everybody loves stats..." (10-May-99)

In order to test the system's learning abilities, I recorded some data and ran it through the evaluator and the validator. The evaluator successfully pointed out weaknesses and strengths of features with regard to each gesture. In a sufficiently large data set, the outliers even for Feature 3 were spotted and the data set thus smoothed. From visual inspection of the resulting graphs, the evaluator proved a valuable tool for learning new gestures or training existing ones. The fact that the validator seems to be too strict in rejecting features for the recognition process needs further looking into.

But for all of those who can't take the suspense any longer: Here are the graphs!!!

Feature 0 - Curvature
Feature 1 - Absolute Curvature (no alcohol involved here!)
Feature 2 - LengthOverDisplacement
Feature 3 - KinkyFeature

Not all gestures were recorded the same number of times. This results in the frequency distribution having higher values for often-recorded gestures. Since the absolute values of the frequency distribution are irrelevant to the discussion, care should be taken to look only at the spatial distribution and not the absolute values. For a closer look at the gestures mentioned in this section, follow this

In this histogram we can clearly see some of the outliers spotted by the evaluator (e.g. Tick values around 2217 and 2253, or a Scribble value of 2519). The unpredictability of the Arrow values (grouped in two regions) resulted in Feature 0 not being eligible to vote for this gesture.
Here it becomes obvious how several gestures yield the same values for Feature 1 (i.e. Select & Cross, as well as Hill & Tick). These similarities are obvious when looking at visual representations of these gestures: the Hill is basically a round Tick, and the Cross a "kinky" Select. These situations will be spotted by the validator, which disallows features from voting for gestures that are too similar.
The graph here is represented differently, because otherwise some gestures would overlap others, rendering them invisible. The spikes are of interest here: it looks like a considerable overlap occurs in basically all the gestures (except Select, which for obvious reasons is disallowed anyway). Practice shows, though, that Feature 2 is well able to distinguish most of the gestures (except perhaps Tick and Arrow), so the previously assumed strategy for the validator has to be reviewed.
Feature 3 shows well-defined integer values for the gestures. An apparent flaw is that Select, Circles and Hill naturally share the feature value 10, which makes them indistinguishable. It has been seen that, for a sufficiently large data set, outliers are removed even here with great confidence.

(Notice the aforementioned uneven number of recordings for each gesture. Also notice that the spatial distribution is not affected by this.)