Extracting Text from Scanned Images (OCR)

Extract text from scanned magazines and newspapers and display it in your text editor.

Note

  • You can extract text when scanning via Document, Custom, or Driver.
  • The screens for scanning documents are used as examples in the following descriptions.
  1. Start IJ Scan Utility.

  2. Click Settings....

    figure: IJ Scan Utility

    The Settings dialog appears.

  3. Click Document Scan.

    figure: Settings dialog

    Note

    • For Resolution, only 300 dpi or 400 dpi can be set when Start OCR is selected in Application Settings.
  4. Select Start OCR for Application Settings, then select the application in which you want to display the result.

    figure: Settings dialog

    Note

    • If a compatible application is not installed, the text in the image is extracted and appears in your text editor.
      The text is extracted according to the language set for Document Language in the Settings (General Settings) dialog. Select the language of the text you want to extract for Document Language, then scan.
    • You can add an application from the pop-up menu.
  5. Click OK.

    figure: Settings dialog

    The IJ Scan Utility main screen appears.

  6. Click Document.

    figure: IJ Scan Utility

    Scanning starts.

    When scanning is completed, the scanned images are saved according to the settings, and the extracted text appears in the specified application.

    Note

    • Click Cancel to cancel the scan.
    • Text displayed in your text editor is for guidance only. Text in the image of the following types of documents may not be detected correctly.

      • Documents containing text with font size outside the range of 8 points to 40 points (at 300 dpi)
      • Slanted documents
      • Documents placed upside down or documents with text in the wrong orientation (rotated characters)
      • Documents containing special fonts, effects, italics, or hand-written text
      • Documents with narrow line spacing
      • Documents with colors in the background of text
      • Documents containing multiple languages
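
The font size limit above can be made concrete with a little arithmetic. The following Python sketch is only an illustration and is not part of IJ Scan Utility: the function and constant names are hypothetical, and it assumes the standard definition of 1 point = 1/72 inch. It converts a nominal font size to its approximate height in pixels at a given resolution and checks a size against the 8 to 40 point range stated for 300 dpi. Actual glyphs are usually somewhat smaller than the nominal size, so treat the pixel values as rough estimates.

```python
POINTS_PER_INCH = 72  # standard definition: 1 point = 1/72 inch

# Values taken from this page; the names are hypothetical.
OCR_RESOLUTIONS_DPI = (300, 400)  # resolutions selectable when Start OCR is set
OCR_FONT_RANGE_PT = (8, 40)       # font size range stated for 300 dpi


def em_height_px(font_size_pt, dpi):
    """Approximate height in pixels of a nominal font size at a scan resolution."""
    return font_size_pt * dpi / POINTS_PER_INCH


def font_size_in_range(font_size_pt):
    """Return True if the font size is inside the range stated for 300 dpi."""
    low, high = OCR_FONT_RANGE_PT
    return low <= font_size_pt <= high


if __name__ == "__main__":
    for size_pt in (6, 8, 12, 40, 48):
        height = em_height_px(size_pt, 300)
        status = "within range" if font_size_in_range(size_pt) else "outside range"
        print(f"{size_pt} pt at 300 dpi is about {height:.0f} px high: {status}")
```

For example, 12-point body text scanned at 300 dpi is about 50 pixels high, which falls comfortably inside the range, while 6-point footnotes (about 25 pixels) fall below it and may not be detected correctly.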