This blog stems from my curiosity to research the performance of Paddle-OCR, a robust and versatile optical character recognition (OCR) toolkit. Paddle-OCR offers an array of features and capabilities for extracting text from images and documents, and I was eager to explore its capabilities and limitations.
Text detection of image data of the English-1 book, Chapter-1, Council of Higher Secondary Education, Odisha, Bhubaneswar for +2 Examination
Line number, bounding box, text and confidence are captured in a Python dictionary.