The world changes every day, and so does technology. Librarians and archivists
now find themselves facing the prospect of digitization. Digitization is the
process of converting analogue signals or information of any form into a
digital format that can be understood by computer systems or electronic
devices. Digitization can be an important element in protecting originals from
excessive handling and repeated copying. Digitized information is easier to
store, access and transmit, and digitization is used by a number of consumer
electronic devices. A digitization process can use any method, such as data
capture, to collect information and then change it into a form that can be
read and used by a computer.

Data capture is the method of putting a document into an electronic format.
Many organizations implement it to automatically identify and classify
information and to make that information available within particular systems.
It takes a document's content, in any format, and converts it into something
that a computer can process. Examples of systems for automated data capture
are OCR, OMR and ICR. One function of data capture is to make information easy
for the user to find, so it does not matter how fast a document is captured if
the user can’t extract data, append metadata or integrate it with other
content. Capturing data without searchability severely limits users' ability
to know what data the organization holds and where to find it.
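
The link between appended metadata and searchability can be illustrated with a
minimal Python sketch. It is not a description of any particular capture
system: the names CapturedDocument and CaptureIndex, the metadata fields and
the sample records are all hypothetical, and the sketch assumes the text has
already been extracted from the document.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CapturedDocument:
    """A captured document: extracted text plus appended metadata (hypothetical fields)."""
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


class CaptureIndex:
    """A tiny inverted index mapping each word to the documents that contain it."""

    def __init__(self):
        self._docs = {}
        self._index = defaultdict(set)

    def add(self, doc):
        self._docs[doc.doc_id] = doc
        # Index words from the text and from the metadata values alike.
        words = doc.text.lower().split()
        for value in doc.metadata.values():
            words.extend(str(value).lower().split())
        for word in words:
            self._index[word.strip(".,;:()")].add(doc.doc_id)

    def search(self, term):
        """Return the sorted ids of documents containing the term."""
        return sorted(self._index.get(term.lower(), set()))


# Illustrative records only; the ids and contents are made up.
index = CaptureIndex()
index.add(CapturedDocument("inv-001", "Invoice for paper supplies.",
                           {"type": "invoice", "year": 1998}))
index.add(CapturedDocument("let-002", "Letter about the paper archive.",
                           {"type": "letter", "year": 2001}))

print(index.search("paper"))    # ['inv-001', 'let-002']
print(index.search("invoice"))  # ['inv-001']
```

Without the index, the two records would still be stored, but a user would
have no way to learn that both mention paper or that only one is an invoice.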

Data is usually captured from paper-based forms, and there are two ways to
capture it: manually or automatically. Manual data capture, or rekeying, is
used when the materials can’t be captured automatically because they are not
in good repair or contain handwritten notes or additions that are impossible
to read automatically. Handwriting that does not OCR well needs to be rekeyed
by hand. Rekeying is the process of taking a document and physically typing
the information contained in the document directly into a word processor.
Rekeying is extremely involved and time consuming. If a document or a document
project needs to be rekeyed, extra time must be allotted for rekeying the text
and proofreading it. When rekeying text, open a word processing program and
type the text exactly as it appears on the document, taking care to preserve
the structure and content of the original as closely as possible. The work
should be done carefully and without rushing. Rekeying is a long and slow
process and should only be performed when necessary.
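
Part of the proofreading step can be supported by software. The Python sketch
below assumes that a second version of the text exists to compare against, for
example an independent second keying or a rough OCR draft; the text above does
not prescribe this, and the function name proofread and the sample lines are
hypothetical. It uses the standard difflib module to list the lines on which
the two versions disagree, so that a person can check those lines against the
paper original.

```python
import difflib


def proofread(first_pass, second_pass):
    """Return (line_number, first_lines, second_lines) for each disagreement.

    Both arguments are lists of text lines. Line numbers are 1-based and
    refer to the first pass. Extra or missing lines are reported as well.
    """
    problems = []
    matcher = difflib.SequenceMatcher(a=first_pass, b=second_pass, autojunk=False)
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag == "equal":
            continue
        problems.append((a_start + 1,
                         first_pass[a_start:a_end],
                         second_pass[b_start:b_end]))
    return problems


# Illustrative lines only; the second pass contains one typing error.
first = ["Dear Sir,", "I enclose the report.", "Yours faithfully"]
second = ["Dear Sir,", "I enclose the reprot.", "Yours faithfully"]

for line_number, expected, typed in proofread(first, second):
    print(line_number, expected, typed)
```

A comparison like this only finds places where the two versions differ. It
cannot detect an error that appears in both, so it supplements careful
proofreading against the original rather than replacing it.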

Automated capture, known as automatic identification and data capture (AIDC),
is used to identify, verify, record, communicate and store information on
discrete, packaged or containerized items. Technologies that are considered
part of AIDC include bar codes, magnetic stripes, Optical Character
Recognition (OCR), Intelligent Character Recognition (ICR), Optical Mark
Recognition (OMR) and others. These technologies are capable of performing
automatic data capture. Modern technology allows data capture to be quick,
accurate and reliable. Automatic data capture is a technology-driven solution
to document processing. Earlier technology was unable to accurately process
forms such as invoices because of the various fields that such documents
contain. The captured data is then saved electronically for access at a later
point, which makes document work efficient and convenient. For instance, to
convert a document image to electronic text, OCR software can open the TIFF
image of the scanned document, select the necessary text portion and put it
into a format in which the text can be edited for accuracy and usability. OCR
recognizes text and characters in scanned PDF documents (including multipage
files), photographs and images captured by digital cameras. OCR takes archival
TIFF images and converts them into readable and editable text. OCR does not
change or alter the image files; instead, a program reads the text in an image
and creates text files that are used for long-term storage and to mark up
documents for online viewing.
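
The workflow described above, in which the archival image is read but never
changed and the recognized text is saved as a separate file, can be sketched
in Python as follows. This is a sketch under stated assumptions rather than a
working OCR tool: the function names ocr_to_text_file and fake_engine are
hypothetical, the recognize argument stands in for whatever OCR engine is
actually used, and the sample file holds placeholder bytes rather than a real
TIFF image.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def ocr_to_text_file(image_path, recognize):
    """Run OCR on an image and save the result as a sibling .txt file.

    `recognize` is a placeholder for a real OCR engine: any callable that
    takes the image bytes and returns the recognized text.
    """
    image_path = Path(image_path)
    before = file_digest(image_path)

    text = recognize(image_path.read_bytes())
    text_path = image_path.with_suffix(".txt")
    text_path.write_text(text, encoding="utf-8")

    # The archival image is only read, so its digest must be unchanged.
    if file_digest(image_path) != before:
        raise RuntimeError(f"{image_path} was modified during OCR")
    return text_path


def fake_engine(image_bytes):
    """Stand-in for an OCR engine; it does not recognize anything."""
    return "Sample recognized text"


with tempfile.TemporaryDirectory() as workdir:
    scan = Path(workdir) / "page_001.tif"
    scan.write_bytes(b"placeholder bytes, not a real TIFF")
    scan_digest = file_digest(scan)

    text_file = ocr_to_text_file(scan, fake_engine)
    ocr_text = text_file.read_text(encoding="utf-8")
    image_unchanged = file_digest(scan) == scan_digest
    text_file_name = text_file.name

print(text_file_name, image_unchanged)
```

Keeping the text in its own file next to the image means the text can be
corrected, marked up or regenerated later without touching the archival scan.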