A Hierarchical Cluster Tree Approach Leveraging Delaunay Triangulation

Authors

  • Cristian Avatavului, University Politehnica of Bucharest, 060042 Bucharest, Romania
  • Costin-Anton Boiangiu

DOI:

https://doi.org/10.18662/brain/14.3/482

Keywords:

hierarchical clustering, document image layout analysis, Delaunay triangulation, cluster tree formation, layout element segmentation, advanced document image processing

Abstract

This research introduces a robust technique for structuring document image pages hierarchically using Delaunay triangulation. Central to the approach is the construction of a cluster tree, which captures the page's content by exploiting the arrangement of layout elements and their relative distances. Applying the technique, we partition the page into distinct clusters comprising images, titles, and paragraphs. The resulting hierarchy, built on the cluster tree, provides a stable representation of the document layout that supports faster document comprehension and analysis.
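
The general idea of building a cluster tree from Delaunay neighbours and their distances can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the merge rule, so a simple single-linkage merge along Delaunay edges (shortest edge first) is swapped in as a stand-in. The function names, the brute-force triangulation, and the sample points (imagined here as centroids of layout elements) are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist

def circumcircle(a, b, c):
    """Return (center, squared radius) of the circle through a, b, c; None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), (ux - ax) ** 2 + (uy - ay) ** 2

def delaunay_edges(points):
    """Brute-force Delaunay (fine for tiny inputs only): keep every triangle
    whose circumcircle contains no other point, and collect its edges."""
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        cc = circumcircle(points[i], points[j], points[k])
        if cc is None:
            continue
        (ux, uy), r2 = cc
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
               for n, (px, py) in enumerate(points) if n not in (i, j, k)):
            edges.update({(i, j), (i, k), (j, k)})
    return edges

def cluster_tree(points):
    """Assumed merge rule: visit Delaunay edges from shortest to longest and
    join the two clusters an edge connects. Leaves are point indices and
    inner nodes are (left, right) tuples."""
    edges = sorted(delaunay_edges(points),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
    parent = list(range(len(points)))            # union-find forest
    node = {i: i for i in range(len(points))}    # cluster root -> subtree

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]        # path halving
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            node[ri] = (node[ri], node.pop(rj))
    return node[find(0)]

# Hypothetical centroids: two tight groups of layout elements, far apart.
points = [(0, 0), (1, 0), (0, 1.5), (10, 0), (11.2, 0), (10, 1.7)]
tree = cluster_tree(points)
print(tree)  # (((0, 1), 2), ((3, 4), 5))
```

In this toy input the two top-level branches correspond to the two spatially separated groups, which is the kind of hierarchy the abstract describes. A practical implementation would replace the brute-force triangulation with an efficient one and would likely use a richer distance or merge criterion than raw edge length.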

Published

2023-10-06

How to Cite

Avatavului, C., & Boiangiu, C.-A. (2023). A Hierarchical Cluster Tree Approach Leveraging Delaunay Triangulation. BRAIN. Broad Research in Artificial Intelligence and Neuroscience, 14(3), 408-433. https://doi.org/10.18662/brain/14.3/482
