Rekognition as the proper solution to detect and recognize text from the scanned
cdt78 updated 2 years, 9 months ago 2 Members · 3 Posts
-
I came across this question:
"A print media company has a popular web application hosted on their on-premises network which allows anyone around the globe to search its back catalog and retrieve individual newspaper pages on their web portal. They have scanned the old newspapers into PNG image format and used Optical Character Recognition (OCR) software to automatically convert images to a text file. The license of their OCR software will expire soon and the news organization decided to move to AWS and produce a scalable, durable, and highly available architecture."
The suggested correct answer names Rekognition as the proper solution to detect and recognize text from the scanned old newspapers.
From my reading, it seems that Rekognition's DetectText operation can detect up to 100 words in an image.
So maybe this is not the ideal solution, since a standard A4 page has 400-500 words on average?
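A quick way to sanity-check this on a real scan might look like the sketch below. It assumes boto3 with configured AWS credentials; the helper names (`count_rekognition_words`, `likely_truncated`, `check_page`) and the file path are hypothetical, and the default of 100 simply mirrors the documented limit quoted above. The client is passed in so the counting logic can be tried without calling AWS.

```python
def count_rekognition_words(client, image_bytes):
    """Count the WORD-level detections Rekognition returns for one image."""
    response = client.detect_text(Image={"Bytes": image_bytes})
    # TextDetections mixes LINE and WORD entries; only WORD entries are counted.
    return sum(1 for d in response["TextDetections"] if d["Type"] == "WORD")


def likely_truncated(word_count, limit=100):
    """Heuristic (an assumption): a count at the limit suggests text was cut off."""
    return word_count >= limit


def check_page(path):
    """Hypothetical usage: needs boto3, AWS credentials and a region."""
    import boto3

    with open(path, "rb") as f:
        n = count_rekognition_words(boto3.client("rekognition"), f.read())
    return n, likely_truncated(n)
```

If a dense newspaper page comes back with a word count sitting right at the limit, that would support the concern that part of the page is being dropped.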
See here: https://docs.aws.amazon.com/rekognition/latest/dg/limits.html
-
Hi cdt78,
Thank you for sharing your feedback on this question.
Our team will review this question and update it if necessary.
We'll publish the changes as soon as the review is complete.
Thanks and regards,
Kenneth Samonte @ Tutorials Dojo
-
OK. I believe Amazon Textract could be a good alternative 🙂
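For anyone who wants to try that, here is a minimal sketch of reading one scanned page with Textract's synchronous DetectDocumentText operation. It assumes boto3 with configured AWS credentials; the helper names (`extract_page_text`, `read_scanned_page`) and the file path are hypothetical, and this is only an illustration, not the answer from the practice exam.

```python
def extract_page_text(client, image_bytes):
    """Return (lines, word_count) for one scanned page using Textract."""
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    blocks = response["Blocks"]
    # Blocks include PAGE, LINE and WORD entries; LINE and WORD carry a Text field.
    lines = [b["Text"] for b in blocks if b["BlockType"] == "LINE"]
    word_count = sum(1 for b in blocks if b["BlockType"] == "WORD")
    return lines, word_count


def read_scanned_page(path):
    """Hypothetical usage: needs boto3, AWS credentials and a region."""
    import boto3

    with open(path, "rb") as f:
        return extract_page_text(boto3.client("textract"), f.read())
```

Comparing the word count from this call with the Rekognition count on the same page should show whether the full 400-500 words are being picked up.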