An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers
Sven Bambach, David Crandall, Linda Smith, Chen Yu
IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL) 2017
Abstract: Toddlers quickly learn to recognize thousands of everyday objects despite the seemingly suboptimal training conditions of a visually cluttered world. One reason for this success may be that toddlers do not just passively perceive visual information, but actively explore and manipulate objects around them. The work in this paper is based on the idea that active viewing and exploration create "clean" egocentric scenes that serve as high-quality training data for the visual system. We tested this idea by collecting first-person video of free toy play between toddler-parent pairs. We then used the raw frames from this video, weakly annotated with toy object labels, to train state-of-the-art machine learning models (Convolutional Neural Networks, or CNNs). Our results show that scenes captured by parents and toddlers have different properties, and that toddler scenes lead to models that learn more robust visual representations of the toy objects.
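To make the training setup concrete: the paper's pipeline assigns each egocentric frame a single toy label (weak, frame-level supervision) and trains a classifier on the raw frames. The sketch below is a minimal stand-in under assumed synthetic data; it uses a linear softmax classifier rather than the CNNs used in the paper, and all data shapes and hyperparameters are illustrative, not taken from the paper.

```python
import numpy as np

# Hypothetical sketch of weakly-supervised frame classification:
# each "frame" carries one toy-object label, and a classifier is fit
# to raw frames. The paper used CNNs; a softmax linear classifier on
# synthetic frames stands in here, just to illustrate the setup.

rng = np.random.default_rng(0)
n_toys, frame_dim, n_frames = 3, 64, 300

# Synthetic frames: each toy has a distinct mean appearance,
# plus noise standing in for visual clutter.
prototypes = rng.normal(size=(n_toys, frame_dim))
labels = rng.integers(0, n_toys, size=n_frames)
frames = prototypes[labels] + 0.5 * rng.normal(size=(n_frames, frame_dim))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One-hot targets from the weak (frame-level) labels.
targets = np.eye(n_toys)[labels]

W = np.zeros((frame_dim, n_toys))
for _ in range(200):                      # plain gradient descent
    probs = softmax(frames @ W)
    grad = frames.T @ (probs - targets) / n_frames
    W -= 0.5 * grad

accuracy = (np.argmax(frames @ W, axis=1) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The paper's comparison of parent vs. toddler views would correspond to generating two such datasets with different noise/clutter statistics and comparing the learned representations, which this toy sketch does not attempt.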