Rekognition as the proper solution to detect and recognize text from the scanned

cdt78 updated 2 years ago 2 Members · 3 Posts
AWS Certified Solutions Architect Professional
cdt78

Member
April 27, 2022 at 10:02 pm

I came across this question:

"A print media company has a popular web application hosted on their on-premises network which allows anyone around the globe to search its back catalog and retrieve individual newspaper pages on their web portal. They have scanned the old newspapers into PNG image format and used Optical Character Recognition (OCR) software to automatically convert images to a text file. The license of their OCR software will expire soon and the news organization decided to move to AWS and produce a scalable, durable, and highly available architecture."

The suggested correct answer names Rekognition as the proper solution to detect and recognize text from the scanned old newspapers.

From my reading, it seems that the DetectText operation of Rekognition can detect up to 100 words in an image.

So maybe this is not the ideal solution, since a standard A4 page has on average 400-500 words?

See here
https://docs.aws.amazon.com/rekognition/latest/dg/limits.html
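For anyone who wants to check this on a real scan, here is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The helper names and the WORD_LIMIT constant are my own (hypothetical); the value 100 is just the figure from the limits page above and may change. It counts the WORD entries that DetectText returns and does the rough arithmetic for a 400-500 word page.

```python
import math

# Assumption: per-image word limit as quoted from the Rekognition limits page.
WORD_LIMIT = 100


def count_word_detections(response):
    """Count WORD-type entries in a Rekognition DetectText response dict."""
    detections = response.get("TextDetections", [])
    return sum(1 for d in detections if d.get("Type") == "WORD")


def detect_word_count(image_bytes, client=None):
    """Call DetectText on raw image bytes and return the number of words found."""
    if client is None:
        import boto3  # imported lazily so the helpers above work without it
        client = boto3.client("rekognition")
    response = client.detect_text(Image={"Bytes": image_bytes})
    return count_word_detections(response)


def min_regions_needed(words_on_page, limit=WORD_LIMIT):
    """Lower bound on how many separate images a page would have to be split into."""
    return math.ceil(words_on_page / limit)
```

If detect_word_count comes back at or near 100 for a dense newspaper page, the result is probably truncated. By the arithmetic alone, a 400-500 word page would need to be split into at least min_regions_needed(400) to min_regions_needed(500) separate images, and a full newspaper page likely has far more text than that.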
Kenneth-Samonte-Tutorials-Dojo

Member
April 30, 2022 at 8:25 pm

Hi cdt78,

Thank you for sharing your feedback on this question.

Our team will work hard to review this question and will update it if necessary.

We’ll upload the changes as soon as our team reviews the updates.

Thanks and regards,

Kenneth Samonte @ Tutorials Dojo
cdt78

Member
May 5, 2022 at 11:24 pm

OK. I believe Amazon Textract could be a good alternative 🙂
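A rough sketch of what that could look like with Textract's synchronous DetectDocumentText call, assuming boto3 and configured AWS credentials. The function name extract_lines is hypothetical, and this is only an illustration, not the answer from the practice exam.

```python
def extract_lines(image_bytes, client=None):
    """Return the text of each LINE block Textract finds in a scanned page image."""
    if client is None:
        import boto3  # imported lazily; pass your own client to avoid this
        client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    # The response holds PAGE, LINE and WORD blocks; keep only the LINE text.
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
```

Usage would be something like reading a PNG with open("page.png", "rb").read() and joining the returned lines with newlines to get the page text (the file name is just a placeholder).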
