Information Extraction from Handwritten Tables in Historical Documents
Published in DAS 2022: Document Analysis Systems, 2022
Recently, significant advances have been made in Document Understanding for structured historical documents. However, little research has addressed information extraction from handwritten structured historical documents. In this paper, we compare two Machine Learning approaches and a third approach based on heuristic rules for extracting information from historical pre-printed forms with handwritten information. We analyze how each approach performs at each step of the extraction process. The proposed approaches improve on the heuristic-rule baseline by up to 0.14 F-measure points throughout the information extraction pipeline.
Recommended citation: Andrés, J., Prieto, J.R., Granell, E., Romero, V., Sánchez, J.A., Vidal, E. (2022). Information Extraction from Handwritten Tables in Historical Documents. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_13