Tahiti has wide support for scanning and works with desktop, mid-range, and high-speed scanners. The scanning process should be prepared carefully so that it is cost- and time-effective. There are many different scenarios, and Tahiti can easily be adapted to each of them. Scanning is often directly connected not only with document digitization but also with document identification: attaching a document type, filling in attributes, and sending the document to the document store.
Some of the possible scanning scenarios:
scan documents and store on the local disk
scan documents, attach attributes and send to the document store
scan documents, read attributes from the barcode and send to the document store
scan documents and send them to the Document Assembly system (see Chapter 7, Document Assembly)
The scenario used depends on the type of the scanned documents, the number of documents, and other conditions. It is also possible to combine these methods.
An important part of an effective scanning process is the use of predefined scanning profiles (see Section 1, “Scanning profiles”). These profiles are stored in a stand-alone XML file that can be distributed to users. Profiles can also be used to further automate document processing, set barcode recognition options, and generate attributes.
Scanning profiles are defined in an XML file. This file is part of the configuration and is called scans.xml. The file has to be in one of the following locations:
<tahiti-dir>/locale/<locale>/scans.xml
<tahiti-dir>/configs/<domain>/scans.xml
<repository>/<domain>/scans.xml
scans.xml can contain a list of definitions common to all scanners as well as individual lists of definitions for specific scanners. The following example contains two profiles common to all scanners. The first profile is called "ADF-gray-150" and the second "Flat-Photo 9x13". A profile name should be short and easily understandable. For example, the first part of each name in the example says whether the automatic document feeder (ADF) or the flatbed (Flat) is used.
Example 6.1. scans.xml
    <?xml version="1.0"?>
    <scan_formats>
      <scanner ProductName="*">
        <format id="1" name="ADF-gray-150" resolution="150" depth="8"
                feeder="1" autofeed="1" duplex="0"
                size_x="8.268" size_y="11.692" option="compression=30;" />
        <format id="2" name="Flat-Photo 9x13" resolution="300" depth="24"
                feeder="0" autofeed="0" duplex="0"
                size_y="4.2" size_x="5.8" />
      </scanner>
    </scan_formats>
There has to be a root tag called <scan_formats>. It contains one or more <scanner> tags, each describing the configuration for one scanner. The common configuration for all scanners is <scanner ProductName="*">, and every configuration file has to contain such an entry. For an individual scanner configuration, the name of the scanner is used instead of the asterisk.
The <scanner> tag contains a list of predefined scanning profiles. Each of them is defined inside a separate <format> tag.
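As an illustration of the structure described above, the following sketch combines the mandatory common entry with an entry for one specific scanner. The product name "Example Scanner 1000", the profile name "ADF-color-200", and the id value 10 are hypothetical placeholders. The exact string expected in ProductName for a real device, and whether ids must be unique across <scanner> entries, are not stated here, so treat both as assumptions.

```xml
<?xml version="1.0"?>
<scan_formats>
  <!-- Common profiles, offered for every scanner (mandatory entry) -->
  <scanner ProductName="*">
    <format id="1" name="ADF-gray-150" resolution="150" depth="8"
            feeder="1" autofeed="1" duplex="0"
            size_x="8.268" size_y="11.692" option="compression=30;" />
  </scanner>
  <!-- Profiles for one specific scanner; the product name is a hypothetical placeholder -->
  <scanner ProductName="Example Scanner 1000">
    <format id="10" name="ADF-color-200" resolution="200" depth="24"
            feeder="1" duplex="0"
            size_x="8.268" size_y="11.692" />
  </scanner>
</scan_formats>
```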
Table 6.1. scans.xml, tag format
Attribute | Mandatory | Description |
---|---|---|
id | Y | Identifier of the entry. |
name | Y | Name of the profile (visible to the user). |
resolution | Y | Resolution in DPI. |
depth | Y | Bit depth; can be 1, 8, or 24. |
threshold | N | Threshold (0-255); valid only for depth="1". |
option | N | String describing options for the codec used. |
xfer | N | Transfer protocol (communication between Tahiti and the scanner). Only TWAIN experts should change this value. Possible values: NATIVE, FILE, MEMORY. |
xferFormat | N | Transfer format if xfer="FILE". Possible values: TIFF, PICT, BMP, XBM, JFIF, FPX, TIFFMULTI, PNG, SPIFF, EXIF. |
size_x | N | Page size in inches (width). |
size_y | N | Page size in inches (height). |
feeder | N | Flag indicating whether the feeder should be used. 0 - not used, 1 - use the automatic feeder. |
duplex | N | Flag indicating whether duplex scanning should be used. 0 - not used, 1 - use duplex scanning. |
pageFormat | N | Format of the scanned page; can be used instead of size_x and size_y. Available values: A3, A4, A5, B3, B4, B5, C3, C4, C5, LETTER, USLEGAL. Some scanners do not allow size_x and size_y to be set, and only the page format can be specified. |
transformation | N | Transformation function. |
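The optional attributes from Table 6.1 might be combined as in the following sketch. The profile names, ids, and the threshold value 128 are illustrative assumptions, not recommended settings. The first profile relies on pageFormat instead of an explicit size, the second gives the A5 page size in inches.

```xml
<scanner ProductName="*">
  <!-- Black-and-white duplex profile; threshold applies because depth="1" -->
  <format id="3" name="ADF-bw-300-duplex" resolution="300" depth="1" threshold="128"
          feeder="1" duplex="1" pageFormat="A4" />
  <!-- Color flatbed profile with an explicit page size in inches (A5: 5.827 x 8.268) -->
  <format id="4" name="Flat-color-200-A5" resolution="200" depth="24"
          feeder="0" duplex="0" size_x="5.827" size_y="8.268" />
</scanner>
```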
During the scanning process, various attributes can be generated and used in the created documents. Attribute generation is driven by the configuration file scanattrs.xml.
Example 6.2. scanattrs.xml
    <ScanAttributes>
      <Profile name="All">
        <Fixed id="Scan.Agency" name="Agentura" value="TA"/>
        <Fixed id="Scan.separator" name="separator" value=""/>
        <DocumentType id="Document.type" name="Dokument" shared="0"/>
        <Incremented id="CISLO_JEDNACI" name="CISLO_JEDNACI" prefix=""
                     length="4" shared="0" incrOnValue="1"/>
        <Fixed id="HOSP_ROK" name="HOSP_ROK" value=""/>
        <Fixed id="OBDOBI" name="OBDOBI" value=""/>
      </Profile>
    </ScanAttributes>
Attributes are generated in groups (all attributes from the active group are inserted into each newly created document). A group is defined in a <Profile> section. Every group has its name and a set of attribute generators. During scanning, at most one group can be active (the user selects the active group by its name in Tahiti). A group can contain attribute generators of the following types:
Fixed
DocumentType
Incremented
The Fixed attribute generator produces a constant value for every new document.
Table 6.2. Attributes of the Fixed generator
Attribute Name | Description |
---|---|
id | An attribute with this id will be added to the created document. |
name | Name of the attribute. This name is displayed in Tahiti. |
value | Value of the attribute. Can be changed in Tahiti. |
The DocumentType attribute generator produces, for every new document, an attribute that contains the name of the document type. The value can be set in Tahiti, where the user can select the document type from all document types supported in Tahiti.
Table 6.3. Attributes of the DocumentType generator
Attribute Name | Description |
---|---|
id | An attribute with this id will be added to the created document. |
name | Name of the attribute. This name is displayed in Tahiti. |
shared | 1 - the value of this attribute is shared across all groups. 0 - the value is local to this group. |
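A minimal sketch of how two groups might share a document type is shown below. It assumes that scanattrs.xml may contain more than one <Profile> section, which the description of groups implies but does not state explicitly. The group names "Invoices" and "Contracts" and the value "TB" are hypothetical.

```xml
<ScanAttributes>
  <!-- First group; the user selects it by the name "Invoices" (hypothetical name) -->
  <Profile name="Invoices">
    <Fixed id="Scan.Agency" name="Agency" value="TA"/>
    <!-- shared="1": the selected document type is shared across all groups -->
    <DocumentType id="Document.type" name="Document" shared="1"/>
  </Profile>
  <!-- Second group with a different fixed value (hypothetical) -->
  <Profile name="Contracts">
    <Fixed id="Scan.Agency" name="Agency" value="TB"/>
    <DocumentType id="Document.type" name="Document" shared="1"/>
  </Profile>
</ScanAttributes>
```

With shared="0" instead, each group would keep its own document type selection.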
The Incremented attribute generator produces, for every new document, an attribute whose value is composed of a prefix and a numerical part. The numerical part is incremented on a given event type. The value is incremented before it is inserted into the document. The last used value is stored on disk for the next use.
Table 6.4. Attributes of the Incremented generator
Attribute Name | Description |
---|---|
id | An attribute with this id will be added to the created document. |
name | Name of the attribute. This name is displayed in Tahiti. |
prefix | Prefix of the value. |
length | Length of the value. |
incrOnValue | Event type that triggers incrementing the numerical part of the value. The numerical part is incremented when the value of the attribute Scan.separator equals the given value. Tahiti internally generates the following values of Scan.separator: 1 - separation page type 1, 2 - separation page type 2. |
shared | 1 - the value of this attribute is shared across all groups. 0 - the value is local to this group. |
The prefix can include the current date, the current time, or the user name. Variables usable in the prefix:
year (last 2 digits)
month (2 digits)
day (2 digits)
hours
minutes
username
For example, prefix="cp-%y%m%d" generates a string consisting of the fixed text and the current date, e.g. cp-080517.
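Put together, an Incremented generator with a date-based prefix could be configured as in the following sketch. The group name "Mailroom" and the attribute id "Scan.CaseNumber" are hypothetical. The Scan.separator entry mirrors Example 6.2, and whether it is required for incrOnValue to work is an assumption.

```xml
<ScanAttributes>
  <Profile name="Mailroom">
    <!-- Mirrors Example 6.2; incrOnValue is compared against Scan.separator -->
    <Fixed id="Scan.separator" name="separator" value=""/>
    <!-- Numerical part is incremented on separation page type 1 (incrOnValue="1") -->
    <Incremented id="Scan.CaseNumber" name="Case number"
                 prefix="cp-%y%m%d" length="4" shared="0" incrOnValue="1"/>
  </Profile>
</ScanAttributes>
```

The date part of the prefix then follows the current date, while the numerical part continues from the last value stored on disk.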
During the scanning process, it is possible to detect and reject empty pages. This is very useful when duplex scan mode is used.
The parameters are set in tahiti.xml and can be overridden in domain.xml.
Table 6.5. Parameters driving detection of empty page.
Parameter | Description |
---|---|
Scan.EmptyPage.Soil.Level | Detection of "interesting" pixels (pixels carrying information). A pixel is interesting when its intensity differs from the average value by more than Scan.EmptyPage.Soil.Level. Possible values: <0,255>. Default value: 15. |
Scan.EmptyPage.Soil.Ratio | Page fill factor, range <0,10000>, in units of 0.01 % of the page; 0 means no data on the page. A page is not empty when the detected fill is greater than Scan.EmptyPage.Soil.Ratio. Default value: 90 (at least 0.9 % of the page is filled). |
Scan.EmptyPage.Side | Width of the strip along each margin that is ignored, range <0,1000>, in per mille. Default value: 30 (ignore 3 % from each margin). |
Scan.EmptyPage.Type | Type of detection algorithm: 2 - old, 3 - new (recommended). |