Hi Theo,
Thanks for your feedback! I completely agree with your statements about where we need to be heading in the near future. The biggest challenge in my project was building in strong visual cues to convey the depth of objects in a 3D environment. For example, the photo below shows my heart model using Unity's built-in Transparency shader (left) and the custom transparency shader I eventually made (right).
The built-in transparency does not do a great job of respecting strong depth cues like occlusion, cast shadows, and reflected light. My shader relies heavily on rim lighting and layers of culling masks to preserve contours.
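In case it helps to see the core idea, here is a minimal sketch of the fresnel-style rim term that kind of shader typically leans on (the real version would live in HLSL inside the shader; the function name and default exponent here are just illustrative):

```python
def rim_intensity(normal, view_dir, rim_power=3.0):
    """Rim term: brightest where the surface turns away from the camera,
    which keeps silhouettes of transparent layers readable.
    Assumes both vectors are already normalized."""
    ndotv = max(0.0, sum(n * v for n, v in zip(normal, view_dir)))
    return (1.0 - ndotv) ** rim_power

# Surface facing the camera head-on: no rim contribution
print(rim_intensity((0, 0, 1), (0, 0, 1)))  # → 0.0
# Surface at a grazing angle: full rim contribution
print(rim_intensity((1, 0, 0), (0, 0, 1)))  # → 1.0
```

Adding that term to the base color (and optionally to alpha) is what makes overlapping transparent surfaces stay distinguishable instead of blending into mush.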
I did quite a bit of usability testing on using two hands compared to one, and complex vs. simple gestures. The two-handed interactions were viewed as strenuous, and it was difficult to get smooth transitions between navigation and selection tasks without distorting the view of the model. Interestingly, even with navigation and selection functions simultaneously enabled, students tended to focus on one task at a time: either manipulating the position of the model or identifying specific structures, but rarely both at once. In the end, the one-handed approach was well received, easiest to learn, and resulted in the fastest workflow.
I'm happy to share some of my code and other design strategies. Is there anything in particular you would be interested in?