09-08-2009, 11:42 AM
Download these files from http://code.google.com/p/tesseract-ocr/downloads/list and unpack them (e.g. with 7-Zip) somewhere:
1. tesseract-2.04.tar.gz. It contains tesseract.exe that we'll use.
2. tesseract-2.00.eng.tar.gz. It contains English language data. Place the tessdata folder in the same folder as tesseract.exe.
You also need GflAx:
Image manipulation
In the code, change the path to tesseract.exe (the fTes variable).
Macro Macro1276
str fTes="D:\Downloads\tesseract\tesseract.exe" ;;change this
int scale=2 ;;try changing this if recognition is poor. Tesseract is very sensitive to text size; 2 usually works best.
;---------------------
str fBmp="$temp$\qm_tesseract.bmp"
str fTif="$temp$\qm_tesseract.tif"
str fTxt="$temp$\qm_tesseract.txt"
fTes.expandpath
fBmp.expandpath
fTif.expandpath
fTxt.expandpath
;capture bitmap (optional)
if(!CaptureImageOrColor(0 0 _hwndqm fBmp)) ret
;bmp -> tif
typelib GflAx {059321F1-207A-47A7-93A1-29CDF876FDD3} 1.0
GflAx.GflAx g._create
g.LoadBitmap(fBmp)
g.Resize(g.width*scale g.height*scale)
g.SaveFormatName="tiff"
g.SaveBitmap(fTif)
;run fTif
;ret
;convert to text
if(fTxt.endi(".txt")) fTxt.fix(fTxt.len-4) ;;tesseract always adds ".txt"
str cl.format("%s ''%s'' ''%s''" fTes fTif fTxt) so
if(RunConsole2(cl so)) end so
;show results
fTxt+".txt"
_s.getfile(fTxt)
out
out _s
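
For readers who want to try the "convert to text" step outside Quick Macros, here is a rough Python sketch of the same idea. It is not part of the macro above and is only an illustration: the function names (output_base, build_command, run_ocr) are made up for this example, and it assumes the usual command form "tesseract image outputbase". It skips the capture and GflAx resize steps and expects a TIFF file that already exists.

```python
import subprocess


def output_base(txt_path):
    """Drop a trailing ".txt" (any case), because tesseract appends it itself."""
    if txt_path.lower().endswith(".txt"):
        return txt_path[:-4]
    return txt_path


def build_command(tesseract_exe, image_path, txt_path):
    """Build the argument list: tesseract <image> <output base name>."""
    return [tesseract_exe, image_path, output_base(txt_path)]


def run_ocr(tesseract_exe, image_path, txt_path):
    """Run tesseract and return the recognized text (hypothetical helper)."""
    cmd = build_command(tesseract_exe, image_path, txt_path)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Like the macro, stop and show the console output on failure.
        raise RuntimeError(result.stderr or result.stdout)
    # tesseract wrote <output base>.txt; the encoding is assumed to be UTF-8.
    with open(output_base(txt_path) + ".txt", encoding="utf-8") as f:
        return f.read()
```

Usage would be something like run_ocr(r"D:\Downloads\tesseract\tesseract.exe", "qm_tesseract.tif", "qm_tesseract.txt"), with the paths changed to match your system.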