US Patent 10318803 Text line segmentation method

In a text line segmentation process, connected components (CCs) in document image are categorized into three subsets (normal, large, small) based on their sizes. The centroids of the normal size CCs are used to perform line detection using Hough transform. Among the detected candidate lines, those with line bounding box heights greater than a certain height are removed. For each normal size CC, if its bounding box does not overlap the bounting box of any line with an overlap area greater than a predefined fraction of the CC bounding box, a new line is added for this CC, which passes through the centroid of the CC and has an average slant angle. Each large size CCs are broken into two or more CCs. All CCs are then assigned to the nearest lines. A refinement method is also described, which can take any text line segmentation result and refine it.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 10318803 Text line segmentation method

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 10318803 Text line segmentation method