The object recognition device includes: an imaging unit that captures images of a predetermined monitoring area to acquire a three-dimensional image and a two-dimensional image; an object extraction unit that extracts, from the three-dimensional image acquired by the imaging unit, an area made up of pixels whose values fall within a predetermined range; an image searching unit that searches the two-dimensional image acquired by the imaging unit for a reference image registered in advance for each type of object; and a determination unit that determines the type of the object according to whether the reference image searched for by the image searching unit is present in the area extracted by the object extraction unit.
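The flow through the extraction, searching, and determination units can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the three-dimensional image is treated as a per-pixel distance map aligned with the two-dimensional image, the search is done by brute-force sum of squared differences, and a reference image counts as being "in the area" when the centre of its matched window falls inside the extracted mask. All function names, the `max_ssd` threshold, and the `"person"` label are hypothetical.

```python
import numpy as np

def extract_area(depth, lo, hi):
    """Object extraction: boolean mask of pixels whose value lies in [lo, hi]."""
    return (depth >= lo) & (depth <= hi)

def search_reference(image, ref, max_ssd):
    """Image search: top-left (row, col) of the best match, or None.

    Assumed method: exhaustive sum-of-squared-differences matching.
    """
    ih, iw = image.shape
    rh, rw = ref.shape
    best, best_pos = None, None
    for r in range(ih - rh + 1):
        for c in range(iw - rw + 1):
            diff = image[r:r + rh, c:c + rw].astype(np.float64) - ref
            ssd = float(np.sum(diff * diff))
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    if best is None or best > max_ssd:
        return None  # no window is close enough to the reference
    return best_pos

def determine_type(depth, image, references, lo, hi, max_ssd=0.0):
    """Determination: first type whose reference is found inside the area."""
    mask = extract_area(depth, lo, hi)
    for obj_type, ref in references.items():
        pos = search_reference(image, ref, max_ssd)
        if pos is None:
            continue
        # Assumed criterion: centre of the matched window lies in the mask.
        centre_r = pos[0] + ref.shape[0] // 2
        centre_c = pos[1] + ref.shape[1] // 2
        if mask[centre_r, centre_c]:
            return obj_type
    return None

# Toy data: a near object occupies the middle of a 6x6 distance map.
depth = np.full((6, 6), 5.0)
depth[1:5, 1:5] = 2.0
ref = np.array([[1.0, 2.0], [3.0, 4.0]])
image = np.zeros((6, 6))
image[2:4, 2:4] = ref  # reference pattern appears on the near object

print(determine_type(depth, image, {"person": ref}, lo=1.0, hi=3.0))  # prints: person
```

If the same pattern appears only outside the extracted area, the sketch returns `None`, which mirrors the abstract's point that a match in the two-dimensional image alone is not sufficient to decide the object type.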