An object detection method and an object detection apparatus are provided. The object detection method includes: mapping at least one image frame in an image sequence into a three-dimensional physical space to obtain three-dimensional coordinates of each pixel in the at least one image frame; extracting a foreground region in the at least one image frame; segmenting the foreground region into a set of blobs; and detecting, for each blob in the set of blobs, an object in the blob through a neural network based on the three-dimensional coordinates of at least one predetermined reference point in the blob, to obtain an object detection result.
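
The four steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation, and everything specific in it is an assumption made for illustration: the mapping uses a per-pixel depth map with pinhole intrinsics (fx, fy, cx, cy), the foreground is obtained by thresholded background subtraction, blobs are 4-connected components, the predetermined reference point is taken to be the bottom-most pixel of each blob, and placeholder_detector is a hypothetical stand-in for the neural network. All function names are likewise hypothetical.

```python
from collections import deque


def map_to_3d(depth, fx, fy, cx, cy):
    """Back-project each pixel (u, v) with depth z to (X, Y, Z).

    Assumption: a pinhole camera with known intrinsics and a depth map.
    """
    return [[((u - cx) * z / fx, (v - cy) * z / fy, z)
             for u, z in enumerate(row)]
            for v, row in enumerate(depth)]


def extract_foreground(frame, background, threshold):
    """Mark pixels that differ from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def segment_blobs(mask):
    """Split the foreground mask into 4-connected blobs of (row, col) pixels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for v in range(height):
        for u in range(width):
            if not mask[v][u] or seen[v][u]:
                continue
            seen[v][u] = True
            queue, blob = deque([(v, u)]), []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


def reference_point(blob):
    """Assumed reference point: bottom-most pixel, leftmost on ties."""
    return max(blob, key=lambda p: (p[0], -p[1]))


def placeholder_detector(blob, ref_xyz):
    """Hypothetical stand-in for the neural network; it only echoes its inputs."""
    return {"pixels": len(blob), "reference_xyz": ref_xyz}


def detect_objects(frame, background, depth, intrinsics, detector, threshold=30):
    """Run the four steps in order and return one result per blob."""
    coords = map_to_3d(depth, *intrinsics)
    mask = extract_foreground(frame, background, threshold)
    results = []
    for blob in segment_blobs(mask):
        v, u = reference_point(blob)
        results.append(detector(blob, coords[v][u]))
    return results
```

In this sketch the detector receives both the blob pixels and the three-dimensional coordinates of the reference point, so a real network could use the physical position (for example, distance from the camera) when deciding how many objects a blob contains.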