In some implementations, there is provided a method. The method may include receiving data characterizing a plurality of digital video frames; detecting a plurality of features in each of the plurality of digital video frames; determining, from the detected features, a local scale change and a translational motion of one or more groups of features between at least a pair of the plurality of digital video frames; and calculating a likelihood of collision. Related apparatus, systems, techniques, and articles are also described.
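By way of non-limiting illustration, the determining and calculating steps may be sketched as follows for a single group of features already matched between a pair of frames. The sketch is illustrative only and is not the described implementation. The function names, the use of mean distance from the group centroid as the scale measure, the time-to-contact approximation dt / (scale - 1), and the constants `ttc_ref` and `drift_ref` are all assumptions introduced for this example.

```python
import math


def centroid(points):
    """Mean (x, y) position of a group of feature points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mean_spread(points):
    """Mean distance of the points from their centroid (a proxy for group size)."""
    cx, cy = centroid(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def scale_and_translation(prev_pts, curr_pts):
    """Return (scale, (dx, dy)) for one group of matched features.

    prev_pts and curr_pts hold the same features, in the same order,
    as (x, y) pixel coordinates in the earlier and later frame.
    """
    if len(prev_pts) != len(curr_pts) or len(prev_pts) < 2:
        raise ValueError("need at least two matched features per group")
    spread_prev = mean_spread(prev_pts)
    if spread_prev == 0.0:
        raise ValueError("features in the earlier frame are coincident")
    scale = mean_spread(curr_pts) / spread_prev
    (px, py), (cx, cy) = centroid(prev_pts), centroid(curr_pts)
    return scale, (cx - px, cy - py)


def collision_likelihood(scale, translation, dt, ttc_ref=2.0, drift_ref=50.0):
    """Map scale change and translation to a score in [0, 1].

    Assumed heuristic: a group that is not expanding scores 0. Otherwise
    the score rises as the estimated time to contact shrinks and falls
    as the group drifts faster across the image.
    """
    if scale <= 1.0:
        return 0.0
    ttc = dt / (scale - 1.0)                 # seconds, assumed approximation
    drift = math.hypot(*translation) / dt    # pixels per second
    urgency = 1.0 / (1.0 + ttc / ttc_ref)
    centered = 1.0 / (1.0 + drift / drift_ref)
    return urgency * centered


# Hypothetical data: four features that expand and shift right between frames.
prev_group = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
curr_group = [(2.5, -0.5), (5.5, -0.5), (5.5, 2.5), (2.5, 2.5)]
scale, shift = scale_and_translation(prev_group, curr_group)
likelihood = collision_likelihood(scale, shift, dt=0.1)
```

In this sketch, a scale greater than one indicates that the group is expanding in the image, which is consistent with an approaching object, while a large translation suggests the object is moving across the field of view rather than toward the camera. Other mappings from scale change and translational motion to a likelihood are possible.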