How to quickly batch-rename PDFs with garbled file names using the text inside each PDF

I have written a tutorial about this before. That method handles the files one at a time with Quicker, which is tedious and a bit slow, but still finishes in a few minutes: https://mp.weixin.qq.com/s/khBLIcmrNiLEJ1jCAxXiLw

Alternatively, Umi-OCR can process the whole batch at once.

First, download Umi-OCR and open it: https://github.com/hiroi-sora/Umi-OCR

Select the batch document function, then drag all the PDF files into it.

Then double-click a document in the list on the left and set it up as shown in the following picture: change the page range to 1-1 so that only the first page is recognized, then, in the preview on the right, hold down the right mouse button and drag out the two areas that should not be recognized.

Configure the remaining options as shown in the following picture. Change the layout analysis scheme to "Single column - No line break" and the content extraction mode to "Full page forced OCR". The page range has to be set manually for each document, one by one, so this step is a bit tedious.

Then open the advanced settings, keep only "%name" in the file name format so that each output file carries the same name as its source PDF, and select the "p.txt" plain text format as the output.

After recognition finishes, the folder will contain a text file for each PDF, as in the following picture:

Then download the conversion script I wrote: https://direct-link.net/1394003/aBLeqQoONL6Y

Put the script in the same folder as the PDFs and the text files generated by Umi-OCR, then double-click the script to run it.
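For readers who would rather see what such a renaming step could look like, here is a minimal Python sketch. It is not the script linked above, whose contents are not shown here. It assumes that each PDF sits next to a text file with the same base name (ending in ".p.txt" or ".txt") and that the first non-empty recognized line is a suitable title. The function names title_from_text, pdf_stem and rename_pdfs, and the 80-character limit, are made up for this illustration.

```python
import re
from pathlib import Path

# Characters that Windows does not allow in file names, plus control characters.
ILLEGAL = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def title_from_text(text, max_len=80):
    """Return the first non-empty line of the OCR text as a safe file name, or None."""
    for line in text.splitlines():
        cleaned = ILLEGAL.sub(" ", line)
        cleaned = " ".join(cleaned.split()).strip(". ")
        if cleaned:
            return cleaned[:max_len].rstrip(". ")
    return None


def pdf_stem(txt_path):
    """Strip the assumed '.p.txt' or '.txt' ending to recover the PDF's base name."""
    name = txt_path.name
    for suffix in (".p.txt", ".txt"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return txt_path.stem


def rename_pdfs(folder, dry_run=True):
    """Rename each PDF after its OCR text; return a list of (old_name, new_name)."""
    folder = Path(folder)
    taken = set()   # targets already planned in this run
    renamed = []
    for txt in sorted(folder.glob("*.txt")):
        pdf = folder / (pdf_stem(txt) + ".pdf")
        if not pdf.exists():
            continue  # no matching PDF for this text file
        title = title_from_text(txt.read_text(encoding="utf-8-sig", errors="ignore"))
        if not title:
            continue  # OCR produced no usable text; leave the PDF alone
        target = folder / (title + ".pdf")
        n = 1
        # Avoid overwriting another file or colliding with an earlier rename.
        while target != pdf and (target.exists() or target.name in taken):
            n += 1
            target = folder / f"{title} ({n}).pdf"
        if target == pdf:
            continue  # already has the desired name
        taken.add(target.name)
        if not dry_run:
            pdf.rename(target)
        renamed.append((pdf.name, target.name))
    return renamed
```

Calling rename_pdfs(".") only returns the planned renames without touching anything, which is a good way to check the result first. Calling rename_pdfs(".", dry_run=False) performs them.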


Kyon's Blog
