You'll need VR SDKs or utilities for whatever your development pipeline is. Depending on your VR setup, there are several VR packages for Unity. Google has one for Cardboard, and I also like the one from Dive. All of them split the feed into left and right eye views and add distortion so the output works in a VR headset. If you're on an Oculus DK2 or Gear VR, you will need the Oculus Utilities for Unity. I haven't tinkered with Vuforia since early DK2 last year, but it worked fine with it then.
The key is getting a rig that uses webcams or the phone camera in addition to the VR virtual cameras in your scene. You need to present the video stream somehow in your application/game and line it up with the CG world stereoscopically. The Leap isn't good for AR recognition because its cameras are too low-res, so consider using the phone camera or a webcam for your video feed and AR recognition. You can probably still do the Leap passthrough stuff, but you'll need at least one other camera for good AR recognition.
Once you get all of that set up, the issue is performance, because you are now running 2 virtual cameras displaying CG, the Leap Motion (2 more cameras), the Leap gesture overhead (hands, recognition, animations), plus at least one and maybe 2 other HD+ cameras. Getting this optimized is a serious task. A DK2 with a solid computer can probably handle this at a decent framerate, but not mobile.
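To make the performance point a bit more concrete, here is a rough Python sketch of a frame-time budget check. It is only back-of-the-envelope arithmetic, not part of any SDK: the function names are mine, and the per-stage costs are made-up placeholders that you would replace with numbers from your own profiler. The only real figure in it is the 75 Hz refresh rate of the DK2.

```python
def frame_budget_ms(refresh_hz):
    """Time available per frame, in milliseconds, at a given display refresh rate."""
    return 1000.0 / refresh_hz


def remaining_budget_ms(refresh_hz, stage_costs_ms):
    """Budget left after subtracting every per-frame stage cost.

    A negative result means the pipeline is over budget and will drop frames.
    """
    return frame_budget_ms(refresh_hz) - sum(stage_costs_ms.values())


# Hypothetical per-frame costs in ms. These are placeholders, not measurements.
stage_costs_ms = {
    "render_left_eye": 3.0,
    "render_right_eye": 3.0,
    "leap_tracking": 2.0,
    "webcam_upload": 2.5,
    "ar_recognition": 4.0,
}

DK2_REFRESH_HZ = 75  # DK2 display refresh rate

budget = frame_budget_ms(DK2_REFRESH_HZ)
left = remaining_budget_ms(DK2_REFRESH_HZ, stage_costs_ms)
print(f"budget {budget:.2f} ms, remaining {left:.2f} ms")
```

With these placeholder numbers the stages add up to more than the roughly 13.3 ms available per frame at 75 Hz, so something would have to be trimmed. Swap in your own measurements to see where you actually stand.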
Let me know how far you get with this. Would love to see what you come up with.
-C