• Title of article

    Predictive Analysis for Optimal Text Visibility: A Comprehensive Study on Frame-of-Interest Prediction in Book Digitization Videos

  • Author/Authors

    Buddhawar, G. (Sardar Vallabhbhai National Institute of Technology); Dave, D. (Pimpri Chinchwad College of Engineering); Jariwala, K. N. (Sardar Vallabhbhai National Institute of Technology); Chattopadhyay, C. (School of Computing and Data Sciences, FLAME University)

  • From page
    2256
  • To page
    2267
  • Abstract
    This research paper addresses an important challenge in book digitization: accurately predicting the frames in which text visibility is optimal. Existing models often suffer from high computational complexity, which leads to inefficient automation and reduced accuracy. In contrast, our proposed models offer lower complexity and higher accuracy. Leveraging a diverse dataset of book flipping videos, we introduce three novel models: the Regular CNN LeNet-5 Model, the Custom LSTM Model, and the 3D CNN Model. Evaluation reveals that our 3D CNN Model achieves an accuracy of 99.01% with 377,921 parameters. The proposed models achieve higher accuracy with significantly fewer parameters, and the approach thereby improves the identification of frames of interest. Our findings highlight the potential of these models to streamline book digitization workflows and improve accessibility to digitized textual content. This study contributes insights at the intersection of computer vision, machine learning, and digitization efforts, offering a promising avenue for enhancing the usability of digitized textual resources.
  • Keywords
    Book Flipping Videos, Frame of Interest, Book Digitization, Predictive Analysis
  • Journal title
    International Journal of Engineering
  • Record number

    2777018