A system, comprising a computer that includes a processor and a memory, the memory storing instructions executable by the processor to detect and locate an object by processing video camera data with a capsule network, wherein training the capsule network includes determining routing coefficients with a scale-invariant normalization function. The computer can be further programmed to receive the detected and located object.
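By way of illustration only, the following minimal sketch shows one way a scale-invariant normalization of routing logits could look. The abstract does not name a specific function; max-min rescaling is assumed here as one example, and the names `max_min_normalize`, `softmax`, `lower`, and `upper` are hypothetical. The sketch normalizes the logits of a single lower-level capsule across its candidate parent capsules and contrasts the result with softmax, which is not scale-invariant.

```python
import math
from typing import List


def max_min_normalize(logits: List[float], lower: float = 0.0,
                      upper: float = 1.0) -> List[float]:
    """Linearly rescale routing logits into [lower, upper].

    Assumed example of a scale-invariant function: multiplying every
    logit by the same positive factor leaves the output unchanged.
    """
    lo, hi = min(logits), max(logits)
    if hi == lo:
        # Assumption: with identical logits there is no routing preference,
        # so every coefficient is set to the upper bound.
        return [upper] * len(logits)
    span = hi - lo
    return [lower + (b - lo) / span * (upper - lower) for b in logits]


def softmax(logits: List[float]) -> List[float]:
    """Conventional softmax, shown only for contrast."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(b - peak) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Routing logits for one lower-level capsule, and the same logits scaled by 10.
logits = [1.0, 2.0, 4.0]
scaled = [10.0 * b for b in logits]

# Max-min coefficients are identical for both inputs.
same_under_scaling = all(
    math.isclose(x, y)
    for x, y in zip(max_min_normalize(logits), max_min_normalize(scaled))
)

# Softmax coefficients change when the logits are scaled.
softmax_changes = any(
    not math.isclose(x, y)
    for x, y in zip(softmax(logits), softmax(scaled))
)
```

In this sketch, `same_under_scaling` and `softmax_changes` both evaluate to true. Under max-min rescaling the coefficients depend only on the relative spacing of the logits and not on their magnitude. Unlike softmax outputs, the coefficients are not forced to sum to one.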