Does PDF.ConvertToText support PDFs which have protections to prevent text being "copied and pasted"?

Question

Hi,

I have a requirement to validate the contents of an invoice PDF. The PDF has a protection mechanism that prevents the contents from being copied and pasted (the contents come out as "garbled" text). This means I can't use libraries like Apache PDFBox (I tried this with TestComplete, and the contents come out "garbled").

I wanted to know if anyone can confirm whether TestComplete's PDF.ConvertToText method would support this type of PDF, since it uses OCR to extract the text. My organization has a secured network and port 443 is blocked. The process for me to get the port opened is quite lengthy, with numerous approvals. I would hate to go through the process only to find out that the functionality can't extract the text from my PDF.

Thank you!

hkim5 · Accepted Answer

If you were to get port 443 opened, PDF.ConvertToText should work, since it does not rely on copy and paste but on OCR, as you mentioned.
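For anyone planning the validation step once the text can be extracted, here is a minimal sketch in Python (one of the scripting languages TestComplete supports). It assumes the OCR output will have unpredictable line breaks and spacing, so it normalizes whitespace before comparing. The helper names `normalize` and `missing_fields`, the sample text, and the file path in the comment are all made up for illustration. `PDF.ConvertToText` is only available inside TestComplete, so a hard-coded string stands in for its result here.

```python
import re


def normalize(text):
    """Collapse runs of whitespace so OCR line breaks do not affect matching."""
    return re.sub(r"\s+", " ", text).strip()


def missing_fields(extracted_text, expected_values):
    """Return the expected values that do not appear in the extracted text."""
    haystack = normalize(extracted_text).lower()
    return [v for v in expected_values if normalize(v).lower() not in haystack]


# Inside a TestComplete Python script unit, the text would come from the PDF object:
#   sample_text = PDF.ConvertToText(r"C:\invoices\sample.pdf")  # hypothetical path
# A hard-coded sample is used here so the sketch runs outside TestComplete.
sample_text = "INVOICE  No. 1001\nTotal due:\n 250.00 USD"

expected = ["Invoice No. 1001", "Total due: 250.00 USD", "Net 30"]
print(missing_fields(sample_text, expected))  # ['Net 30']
```

In a real test you would log a failure whenever the returned list is not empty. OCR can also misread individual characters, so exact substring matching like this may need to be loosened for your invoices.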

