crossposted from willowgarage.com
Ethan Dreyfuss, who recently received a master's degree from Stanford University, is continuing his work here on autonomous person-following and dataset collection and annotation. The former project provides a useful building block for a wide variety of tasks. Consider a robot that helps you carry groceries. This robot is vastly more useful if it can carry your bags to the house without requiring teleoperation; the robot can simply track you and follow behind. At a high level, person-following comprises two principal tasks: person tracking and navigation.
The approach developed by Ethan and Caroline Pantofaru fuses a face detector with two weak person trackers: one for legs, and one for 3D blobs at person height. None of these components is individually effective enough to provide robust tracking, but their strengths are complementary. The face detector is effective when the person is close to, and directly facing, the robot. The leg tracker provides high accuracy when multiple people are present, but it is often confused by non-human obstacles and therefore cannot work reliably from afar. Conversely, the height-based blob tracker can track effectively from farther away, yet it is easily confused by groups of people. By combining techniques, Ethan and Caroline were able to develop a more robust person-tracking tool.
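To make the idea of combining complementary detectors concrete, here is a minimal sketch of one simple way such a fusion could work: a confidence-weighted average of whichever sources currently report a position. This is an illustration only, not the method Ethan and Caroline implemented, which the post does not describe. The `Estimate` type, the `fuse` function, and the 0-to-1 confidence scale are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Estimate:
    """One source's guess at the person's position in the robot frame (metres).

    Hypothetical type for illustration; not taken from the original software.
    """
    x: float
    y: float
    confidence: float  # assumed scale: 0.0 (no trust) to 1.0 (full trust)


def fuse(estimates: Sequence[Optional[Estimate]]) -> Optional[Tuple[float, float]]:
    """Confidence-weighted average of the sources that currently see the person.

    A source that has lost the person passes None and is simply skipped.
    Returns None when no source has a usable estimate.
    """
    valid = [e for e in estimates if e is not None and e.confidence > 0.0]
    if not valid:
        return None
    total = sum(e.confidence for e in valid)
    x = sum(e.x * e.confidence for e in valid) / total
    y = sum(e.y * e.confidence for e in valid) / total
    return (x, y)


# Example: the face detector has lost the person (None), while the leg and
# blob trackers still report positions with equal confidence.
legs = Estimate(x=2.0, y=0.0, confidence=0.5)
blob = Estimate(x=4.0, y=0.0, confidence=0.5)
fused = fuse([None, legs, blob])
```

In a scheme like this, each source's confidence would be raised or lowered according to the conditions it handles well, for example trusting the face detector more at close range and the blob tracker more at a distance.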
Once the robot can track a designated person, the tracking information is passed on to the navigation stack. This same navigation software was used to complete Milestone 2, with some improvements made to help deal more quickly and robustly with dynamically moving obstacles such as people.
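As a rough sketch of how a tracked position might be turned into something a navigation system can act on, the function below computes a goal a fixed distance short of the person, with the robot turned to face them. The function name `follow_goal`, the `standoff` parameter, and its 1.0 m default are assumptions made for illustration; the post does not say how the real system chooses its goals.

```python
import math
from typing import Tuple


def follow_goal(person_x: float, person_y: float,
                standoff: float = 1.0) -> Tuple[float, float, float]:
    """Return a hypothetical navigation goal (x, y, yaw) in the robot frame.

    The goal lies on the straight line from the robot (at the origin) to the
    person, `standoff` metres short of them, with yaw pointing at the person.
    """
    distance = math.hypot(person_x, person_y)
    yaw = math.atan2(person_y, person_x)
    if distance <= standoff:
        # Already close enough: stay in place and only turn to face the person.
        return (0.0, 0.0, yaw)
    scale = (distance - standoff) / distance
    return (person_x * scale, person_y * scale, yaw)


# Example: a person 4 m straight ahead yields a goal 3 m ahead, facing forward.
goal = follow_goal(4.0, 0.0)
```

A real planner would also have to route around obstacles on the way to that goal, which is the job the post assigns to the navigation stack.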
In addition to the person-following project, Ethan is contributing to the collection and labeling of a large dataset of people in an indoor office environment. One of the major drivers of computer vision research is the availability of high-quality labeled data. The bulk of existing person datasets exclude indoor environments and instead focus on outdoor pedestrians. Indoor environments present numerous challenges for person detection, including poor lighting and environmental clutter. By automating as much as possible of the process of both collecting the data (using the robot) and labeling it (using Amazon's Mechanical Turk and Alex Sorokin's CV Web Annotation Toolkit), Ethan's team will be able to provide a large, compelling dataset to encourage other researchers to tackle these challenging problems.
Ethan also picked up a number of side projects, including rapid neighborhood computation on point clouds and a package that uses the open-source video codec Theora to allow low-bandwidth video streaming within ROS.