"german_kids_song.pdf", 0 ) echo "Example 3: german_kids_song.pdf \n" //- // Example 4) Process multipage tiff with text/ignore zones specified for each page, optionally provide English as the target language // A) Setup empty destination doc $doc = new PDFDoc ( ) // B) Setup options with a single language plus text/ignore zones $opts = new OCROptions ( ) $opts -> AddLang ( "eng" ) $ignore_zones = new RectCollection ( ) // ignore signature box in the first 2 pages $ignore_zones -> AddRect ( new Rect ( 1492.0, 56.0, 2236.0, 432.0 ) ) $opts -> AddIgnoreZonesForPage ( $ignore_zones, 1 ) $opts -> AddIgnoreZonesForPage ( $ignore_zones, 2 ) // can use a combination of ignore and text boxes to focus on the page area of interest, // as ignore boxes are applied first, we remove the arrows before selecting part of the diagram $ignore_zones -> Clear ( ) $ignore_zones -> AddRect ( new Rect ( 992.0, 1276.0, 1368.0, 1372.0 ) ) $opts -> AddIgnoreZonesForPage ( $ignore_zones, 3 ) $text_zones = new RectCollection ( ) // we only have text zones selected in page 3 // select horizontal BUFFER ZONE sign $text_zones -> AddRect ( new Rect ( 900.0, 2384.0, 1236.0, 2480.0 ) ) // select right vertical BUFFER ZONE sign $text_zones -> AddRect ( new Rect ( 1960.0, 1976.0, 2016.0, 2296.0 ) ) // select Lot No. ![]() pdf with options OCRModule :: ProcessPDF ( $doc, $opts ) // D) check the result $doc -> Save ( $output_path. "german_kids_song.pdf" ) // B) Setup options with a single language and an ignore zone $opts = new OCROptions ( ) $opts -> AddLang ( "deu" ) $ignore_zones = new RectCollection ( ) $rect = new Rect ( 424.0, 163.0, 493.0, 730.0 ) $ignore_zones -> AddRect ( $rect ) $opts -> AddIgnoreZonesForPage ( $ignore_zones, 1 ) // C) Run OCR on the. pdf document $doc = new PDFDoc ( $input_path. pdf specifying a language - German - and ignore zone comprising a sidebar image // A) Open the. "multi_lang.pdf", 0 ) echo "Example 2: multi_lang.jpg \n" //- // Example 3) Process a. 
"multi_lang.jpg", $opts ) // C) check the result $doc -> Save ( $output_path. png with options OCRModule :: ImageToPDF ( $doc, $input_path. ![]() "psychomachia_excerpt.pdf", 0 ) echo "Example 1: psychomachia_excerpt.png \n" //- // Example 2) Process document using multiple languages // A) Setup empty destination doc $doc = new PDFDoc ( ) // B) Setup options with multiple target languages, English will always be considered as secondary language $opts = new OCROptions ( ) $opts -> AddLang ( "rus" ) $opts -> AddLang ( "deu" ) // B) Run OCR on the.