I see two ways of getting your desired result:
- First, by Haar training with OpenCV. Download and install OpenCV; it ships with a training utility you can use to build a hand/finger cascade file, which you then use with the DetectObject node later.
- Second, use the Contour node and analyse the contour coordinates for angles (Point2Vector, then + (Spectral) with a bin size of 5 or 10, then an Expr with abs(a) > threshold) to get the finger caps.
If you watch the video closely you can see that only one upper, vertical finger is taken as a sort of mouse pointer.
To get a click, you have to check whether the coordinate values change rapidly, using FrameDifference (Animation).
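The click test boils down to thresholding the per-frame coordinate delta, which is what FrameDifference gives you in the patch. A minimal sketch, where the 25-pixel jump threshold is an arbitrary example value:

```python
import numpy as np

def is_click(prev_xy, cur_xy, jump_px=25.0):
    """Fire a click when the tracked fingertip coordinate jumps more
    than jump_px pixels between two consecutive frames."""
    dx = cur_xy[0] - prev_xy[0]
    dy = cur_xy[1] - prev_xy[1]
    return bool(np.hypot(dx, dy) > jump_px)

slow = is_click((100, 100), (102, 101))   # small drift: no click
fast = is_click((100, 100), (100, 140))   # sudden jump: click
```

In practice you would also debounce this (ignore further jumps for a few frames) so one gesture does not fire several clicks.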
To detect hits in predefined pull-down areas, Trautner is your node.
I would prefer the contour and coordinate analysis, as it is faster than DetectObject.
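As a rough illustration of the contour-angle idea outside vvvv, here is a numpy sketch; the smoothing window stands in for the Spectral bin, the final comparison for the Expr, and the square test contour and 0.25-radian threshold are made-up example values:

```python
import numpy as np

def fingertip_indices(contour, window=5, threshold=0.25):
    """Indices where the smoothed turning angle of a closed contour
    exceeds `threshold` radians -- candidate finger caps."""
    pts = np.asarray(contour, dtype=float)
    vecs = np.roll(pts, -1, axis=0) - pts             # edge vectors (Point2Vector)
    angles = np.arctan2(vecs[:, 1], vecs[:, 0])
    # turning angle at each vertex, wrapped to [-pi, pi]
    turn = np.angle(np.exp(1j * (np.roll(angles, -1) - angles)))
    # moving average over `window` samples (the Spectral bin)
    smooth = np.convolve(turn, np.ones(window) / window, "same")
    return np.where(np.abs(smooth) > threshold)[0]    # Expr: abs(a) > threshold

# Toy closed contour: a 10x10 square sampled every unit; its four sharp
# corners play the role of fingertips.
n = 10
e = np.arange(n, dtype=float)
square = np.concatenate([
    np.stack([e, np.zeros(n)], axis=1),               # bottom, left to right
    np.stack([np.full(n, float(n)), e], axis=1),      # right, bottom to top
    np.stack([n - e, np.full(n, float(n))], axis=1),  # top, right to left
    np.stack([np.zeros(n), n - e], axis=1),           # left, top to bottom
])
tips = fingertip_indices(square)
# the corners sit near indices 9, 19, 29, 39
```

A real hand contour is noisier than this square, so the window and threshold need tuning against your camera image.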