Abstract: The Internet of Things (IoT) and smart city paradigms rely on ubiquitous
technology to extract context information and deliver useful services to
users and citizens. Computer vision applications often play an essential role
in this scenario, and they require images acquired from dedicated devices.
The need for high-end cameras often penalizes this process, since such cameras
are power-hungry and their output demands substantial computational resources.
Thus, the availability of novel low-power vision sensors, implementing advanced
features like in-hardware motion detection, is crucial for computer vision in
the IoT domain. Unfortunately, to be highly energy-efficient, these sensors
may sacrifice perception performance (e.g., resolution, frame rate, color).
Therefore, domain-specific pipelines are usually developed to exploit
the full potential of these cameras. This paper presents the development,
analysis, and embedded implementation of a real-time detection, classification,
and tracking pipeline able to exploit the full potential of background-filtering
Smart Vision Sensors (SVS). The power consumption obtained for the
inference, which requires 8 ms, is 7.5 mW.
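
As a rough reading of these two figures, the energy spent per inference can be estimated as power multiplied by latency. The sketch below assumes that 7.5 mW is the average power drawn over the 8 ms inference window, which the abstract does not state explicitly; the function name is illustrative only.

```python
def energy_per_inference_uj(power_mw: float, latency_ms: float) -> float:
    """Return energy in microjoules (mW * ms = uJ).

    Assumes power_mw is the average power over the whole inference window.
    """
    return power_mw * latency_ms


# Figures quoted in the abstract: 7.5 mW for an 8 ms inference.
energy_uj = energy_per_inference_uj(7.5, 8.0)
print(f"{energy_uj:.1f} uJ per inference")
```

Under that assumption, the estimate is on the order of tens of microjoules per inference.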