
tesseract-ocr

 conda activate

 conda install -c conda-forge tesseract


    package                    |            build
    ---------------------------|-----------------
    leptonica-1.82.0           |       h950d820_0         2.6 MB  conda-forge
    libarchive-3.5.2           |       hccf745f_1         1.6 MB  conda-forge
    lzo-2.10                   |    h516909a_1000         314 KB  conda-forge
    tesseract-5.0.1            |       h84e3e21_0       171.4 MB  conda-forge
    ------------------------------------------------------------
                                           Total:       175.9 MB

tesseract

Usage:
  tesseract --help | --help-extra | --version
  tesseract --list-langs
  tesseract imagename outputbase [options...] [configfile...]

OCR options:
  -l LANG[+LANG]        Specify language(s) used for OCR.
NOTE: These options must occur before any configfile.

Single options:
  --help                Show this help message.
  --help-extra          Show extra help for advanced users.
  --version             Show version information.
  --list-langs          List available languages for tesseract engine.


example 

wget https://tesseract-ocr.github.io/tessdoc/images/eurotext.png


 

 tesseract eurotext.png eurotext-eng

tesseract appends .txt to the output base, so the text ends up in eurotext-eng.txt

ls
eurotext-eng.txt  eurotext.png

cat eurotext-eng.txt
The (quick) [brown] {fox} jumps!
Over the $43,456.78 <lazy> #90 dog
& duck/goose, as 12.5% of E-mail
from aspammer@website.com is spam.
Der ,.schnelle” braune Fuchs springt
tiber den faulen Hund. Le renard brun
«rapide» saute par-dessus le chien
paresseux. La volpe marrone rapida
salta sopra il cane pigro. El zorro
marron rapido salta sobre el perro
perezoso. A raposa marrom rapida
salta sobre 0 cao preguigoso.

 

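TESSDATA_PREFIX environment variable: set it to the tessdata directory (the one holding the .traineddata files) if tesseract cannot find a language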
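A minimal sketch of setting the variable for a conda install. It assumes the conda-forge package keeps its language files under share/tessdata inside the environment prefix, which is worth checking on your own machine; tessdata_dir is a hypothetical helper name, not part of tesseract.

```shell
# Sketch: derive a tessdata directory from a conda environment prefix.
# Assumption: traineddata files live in <prefix>/share/tessdata.
tessdata_dir() {
    # $1 is the environment prefix; fall back to the active conda env.
    printf '%s/share/tessdata\n' "${1:-${CONDA_PREFIX:-}}"
}

# Only export when a conda environment is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
    TESSDATA_PREFIX="$(tessdata_dir)"
    export TESSDATA_PREFIX
fi
```

After this, tesseract --list-langs should list the languages found in that directory.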

tesseract testing/eurotext.tif testing/eurotext-eng -l eng
tesseract testing/eurotext.png testing/eurotext-engdeu -l eng+deu
tesseract testing/bilingual.jpg testing/bilingual-enghin -l eng+hin
tesseract testing/bilingual.jpg testing/bilingual-hineng -l hin+eng
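
A combined spec such as eng+hin only works when every listed language is installed. The sketch below checks a spec against the output of tesseract --list-langs before running OCR. missing_langs is a hypothetical helper, and it assumes the listing prints one language code per line after a header line.

```shell
# Sketch: print the languages from a LANG[+LANG] spec that are not installed.
# $1: spec such as eng+hin
# stdin: output of `tesseract --list-langs` (header line, then one code per line)
missing_langs() {
    available=$(cat)
    for lang in $(printf '%s' "$1" | tr '+' ' '); do
        # -x matches whole lines only, -F treats the code as a fixed string
        printf '%s\n' "$available" | grep -qxF "$lang" || printf '%s\n' "$lang"
    done
}
```

Usage: tesseract --list-langs | missing_langs eng+hin prints nothing when both languages are available, and prints hin when only eng is installed.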
 
OUTPUTS
 
searchable pdf output:
tesseract testing/eurotext.png testing/eurotext-eng -l eng pdf
 
pdf+txt layer
tesseract c:\temp\test_ara.jpg c:\temp\test_ara -l ara --psm 3 pdf
  
Hocr output
tesseract testing/eurotext.png testing/eurotext-eng -l eng hocr
cat testing/eurotext-eng.hocr
 
tsv output
tesseract testing/eurotext.png testing/eurotext-eng -l eng tsv    
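
The tsv file has one row per detected element, with columns level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text. A rough sketch for pulling out only the words tesseract is fairly sure about; confident_words and the default threshold of 80 are illustrative choices, not anything tesseract defines.

```shell
# Sketch: list words from a tesseract TSV file with confidence >= a threshold.
# $1: TSV file, $2: minimum confidence (default 80, an arbitrary choice)
confident_words() {
    awk -F '\t' -v min="${2:-80}" '
        # skip the header row; level 5 rows are individual words
        NR > 1 && $1 == 5 && $11 + 0 >= min { print $11 "\t" $12 }
    ' "$1"
}
```

For example, confident_words testing/eurotext-eng.tsv 90 would print the confidence and text of each word scoring 90 or higher.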
 
page segmentation
tesseract testing/san002.png testing/san002-psm6 -l san --psm 6
tesseract testing/san002.png testing/san002-psm3 -l san --psm 3
    
 
src:

https://anaconda.org/conda-forge/tesseract

https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html 

https://github.com/tesseract-ocr/

