Project Documentation was prepared by Peter Ryan.
Project Documentation: Metadata

Page Contents:

1. Descriptive Metadata
2. Example of a Complete Record

1. Descriptive Metadata (Original Document)

The following guidelines are based on the work of Erika Banski, Joel MacKeen, and Fern Russell (June 2000), which can be found at the Alberta Folklore and Local History Collection Digitization Project web site:

http://digital.library.ualberta.ca/folklore/documentation/index.html

Use this section to record information about the original document (item or file) being photographed and recorded in the digital database. There is no original library for these objects as they are still kept at the family home in Kyoto, Japan.

Title

- Use the title indicated in the archive metadata inventory on each CD-ROM
- Capitalize first word and proper nouns only

Creator: PersonalName and CorporateName

- record creator name if known
- if unknown, record: "unknown" without quotation marks
- if personal name, record as "last name, first name"
- in most instances this is IMANISHI, Kinji, with the exception of Maps that he used and other memorabilia that he kept of his travels

Contributor

- use for secondary authorship information
- follow instructions for "Creator" field

Subtype.Description

- describe original document in hand, not digital archival object; additional sources to consult:
- when a date appears in the Description field, use the Dublin Core date layout (yyyy-mm-dd)

Publisher.Of Original

- use for published documents only
- record place of publication and name of publisher
- example: Mainichi Shinbun, Kyoto, Japan

Date: Created and Digitized
Date.Created

- the original document creation year, e.g. "1938"
- if the exact date of creation is known, record it in the Description field
- if year unknown use approx. decade "1940 ca"
- Note: see Date.Digitized below under Admin. Metadata

Type.Genre

- list of possible genre types that will be added to the pull-down menu in the template: personal reminiscences; legends; tall tales; newspapers, periodical articles; play scripts; poetry, short stories; essays, radio scripts; school histories, photographs; project documents, correspondence, reports
- select from pull-down menu or add new to the table and then select it for the record

Coverage: Geographic and Time

- Coverage.Geographic e.g. Kyoto, Japan.

Coverage.Time

- record the time that the document refers to, not the date that the document was created. Use year or range of years (do not use days or months in date range)
- Examples: 1920 ca.; 1935-1940, 19th century
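
The Coverage.Time rule above (a year, an approximate year, a range of years, or a century, but never days or months) could be checked at data entry. The following is a minimal sketch, not part of the project's software; the function name and the exact patterns are assumptions inferred from the three examples given.

```python
import re

# Accepted layouts, inferred from the examples "1920 ca.", "1935-1940"
# and "19th century". These patterns are an assumption, not a project rule.
_COVERAGE_TIME_PATTERNS = [
    r"[0-9]{4}",                          # single year, e.g. 1938
    r"[0-9]{4} ca\.?",                    # approximate year, e.g. 1920 ca.
    r"[0-9]{4}-[0-9]{4}",                 # range of years, e.g. 1935-1940
    r"[0-9]{1,2}(st|nd|rd|th) century",   # century, e.g. 19th century
]


def is_valid_coverage_time(value):
    """Return True if value looks like a year, year range, or century."""
    value = value.strip()
    if not any(re.fullmatch(p, value) for p in _COVERAGE_TIME_PATTERNS):
        return False
    # A range must run from the earlier year to the later year.
    year_range = re.fullmatch(r"([0-9]{4})-([0-9]{4})", value)
    if year_range and int(year_range.group(1)) > int(year_range.group(2)):
        return False
    return True


print(is_valid_coverage_time("1935-1940"))   # True
print(is_valid_coverage_time("1938-05-01"))  # False: days and months are not allowed
```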

Subject.Keyword

- Subject Headings List for new headings
- add additional subject headings upon completion of archive and verification by other scholars

Language

- the default is Japanese
- English and German are also used at times

Source.Collection

- the default entry is The Kinji Imanishi Digital Archive

Source.Series

- indicate the Series to which the item/file belongs (list of Series will be determined after the inventory is complete)

Source.Publication

- use for published documents only, e.g. newspaper articles, maps, etc.
- example: originally published in the Mainichi Shinbun; originally published by the Uganda Protectorate

Accession number

- Dublin Core element used: Source.Relationship; the repeated DC element "Source.Relationship" has been changed to "Accession number" for clarity
- enter the accession number of the file/item to relate the original document to the digital archival object
- example: 01-01-3

Rights.Access

- this element is used to inform about accessibility of original document
- use "Open access" or "Restricted access"
- the status of "Restricted access" files/items must be clarified before they are made accessible in the public domain

Note: use the following model for naming the image files:

uniqueid#.tif

uniqueid = 3rd level number in the Accession number of the original document

# = page/leaf number of the document being scanned to which the image corresponds; always use a four-digit number (e.g. 0001)

.tif = extension of the image file format

Examples:
the TIFF image of the 1st leaf in the file 96-93-3 will be named 00030001.tif
the TIFF image of the 14th leaf in the file 96-93-3 will be named 00030014.tif
the TIFF image of the 1st page in the file 96-93-201 will be named 02010001.tif
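
The naming model can be expressed as a small helper. This is a minimal sketch under stated assumptions: the function name is hypothetical, and padding the uniqueid to four digits is inferred from the examples (3 becomes 0003, 201 becomes 0201) rather than stated in the rule. The extension is left as a parameter because this documentation mentions both tif and jpg files.

```python
def image_file_name(accession_number, leaf_number, extension="tif"):
    """Build an image file name from an accession number and a leaf number.

    accession_number: three-level number such as "96-93-3"
    leaf_number: page/leaf number of the scanned image, starting at 1
    """
    parts = accession_number.split("-")
    if len(parts) != 3:
        raise ValueError("expected a three-level accession number, e.g. 96-93-3")
    unique_id = int(parts[2])  # 3rd level number of the accession number
    if not 0 <= unique_id <= 9999:
        raise ValueError("3rd level number does not fit in four digits")
    if not 1 <= leaf_number <= 9999:
        raise ValueError("leaf number must be between 1 and 9999")
    # Both parts are zero-padded to four digits (padding of the uniqueid
    # is an assumption inferred from the examples).
    return f"{unique_id:04d}{leaf_number:04d}.{extension}"


print(image_file_name("96-93-3", 14))    # 00030014.tif
print(image_file_name("96-93-201", 1))   # 02010001.tif
```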

Format

- record file format in which the product of scanning will be stored (select from pull-down menu)
- Example: jpg

Total number of image files

- record total number of image files that make up the pages/sections of the full document
- this is important information for the work at the final stage when the complete document will be pulled together in a format that will allow us to manage multi-page electronic documents (pdf, html or other)

Dimensions

- give the dimension of the images in pixels
- Example: 640 x 480

File range

- generated automatically by software after File sequence information (see below) is entered
- Example: AB0001.jpg - AB0014.jpg
- this is important information for the work at the final stage when the complete document will be pulled together in a format that will allow us to manage multi-page electronic documents (pdf, html or other)

File sequence

- list all image files created that constitute the pages/leaves/parts of pages of the original document
- Example: AFNB0001.jpg; AFNB0002.jpg ... AFNB0014.jpg
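
The File range is described as being generated from the File sequence. How the project's software does this is not documented here, so the following is only an illustrative sketch; the function names and the prefix argument are hypothetical.

```python
def file_sequence(prefix, total_files, extension="jpg"):
    """List every image file of a document, numbered from 0001 upward."""
    return [f"{prefix}{n:04d}.{extension}" for n in range(1, total_files + 1)]


def file_range(sequence):
    """Summarize a file sequence as 'first - last' (empty string if no files)."""
    if not sequence:
        return ""
    return f"{sequence[0]} - {sequence[-1]}"


sequence = file_sequence("AFNB", 14)
print(len(sequence))          # 14, the value for "Total number of image files"
print(file_range(sequence))   # AFNB0001.jpg - AFNB0014.jpg
```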

Total size

- calculate the total size of all image files that constitute the document
- Example: 7 MB

Resolution

- Example: 300 dpi

Equipment

- select from the three options of equipment used: Nikon Coolpix, Fuji Finepix Camera, or the Panasonic Digital Scanner

Date.Digitized

- enter date of photography (YYYY-MM-DD)
- Example: 2001-05-18
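
A date entered in this field can be checked against the YYYY-MM-DD layout before the record is saved. This is a minimal sketch; the function name is hypothetical and the check is not part of the project's template.

```python
import re
from datetime import datetime


def is_valid_digitized_date(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    # Require zero-padded month and day, which strptime alone does not enforce.
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False  # e.g. month 13 or day 30 in February
    return True


print(is_valid_digitized_date("2001-05-18"))  # True
print(is_valid_digitized_date("18/05/2001"))  # False
```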

Photographer

- enter initials of the person who completed the scanning
- Notes: enter important scanning information for which no specific field was provided

Publisher.Of Digital

- default: Edmonton: University of Alberta Department of Anthropology

Rights.Copyright

- used to record copyright clearance
- default: Copyright 2000 University of Alberta Department of Anthropology
- Notes: enter important information for which no specific field was provided


2. Example of a Complete Record

This is an example of a completed archive record for an individual image in Excel format:

1) By Category: Archive_Contents.xls

2) By Individual Work (CD-ROM Level): Toimisaki_uma_1950-51-52_IMAGES.xls

Documentation: Photography Guidelines & Camera Notes

Equipment

  • Nikon Coolpix
  • Fuji Finepix Camera
  • Panasonic Digital Scanning Camera (with Mac Computer): Supplied by Computer and Networking Services (CNS) at the University of Alberta and the MACI lab.

Goal

When scanning text images we focused on the content rather than on preserving the image itself. Where possible, however, both image and content were preserved as well as possible. Most of the documents were handwritten or typed. In some cases the ink was a different color, and the text was blurred and faded. We also had to take the paper quality into account, which varies from onionskin to heavy, thickly textured pages. Some of the documents also have folds and wrinkles in them, which were also picked up by the photography equipment.

Another aspect that had to be taken into consideration was the changes that would be made using the Photoshop retouching software. We tailored the photographic settings to what was desirable, taking into account the above considerations and Photoshop's capabilities. We used Photoshop to clean and color-correct the images so that they were as clear as possible.

General Settings

Setting

Description

Image Type -- Real Color

There is a very slight difference between the scans done at 256 and the 12-bit grayscale. Scanning at the higher quality does not necessarily appear to be all that much better for this type of material. The 12-bit grayscale would be better for photos.

Resolution -- TIFF Quality

This is the best for text images because the image is not too dark and still scans in high quality. Cornell suggests scanning this type of material at 400 dpi; however, after analyzing and comparing the images with one another, the 300 dpi scan seems to be the cleaner copy.

Focus -- -2.0

For text documents, the focus is important. The focus can be adjusted manually or automatically in the preview of the scan. The automatic setting is better for graphic images.

Exposure -- 1

For text documents this is the most important setting and will be adjusted nearly every time. For a plain white page we had to set the exposure to 1. The setting will also vary with the text of the document and with how dark and clear it is overall. The exposure should be neither too dark nor too light: if it is too dark, the image will be overly dark and the text hard to distinguish; if the image is scanned at too high an exposure, the text may be lost.

 

 

Sample 1

Perfect text document; clear nice type on a plain white paper.

Resolution: we determined that the best resolution for photographing text documents was between 200 and 400. Anything higher than 400 not only took longer to photograph but also made the scan darker. As the resolution value increases, the image also becomes more grainy and unclear.

White page - 1

Gray - 2

Yellow newsprint (when a picture is present, make changes so the text is visible; the picture is changed in the retouching software) - 6

Onion paper - 2

Dark brown - 6/7 and up
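
The values listed above can be kept as a lookup table of starting points. The list itself does not name the setting; the sketch below assumes these are Exposure values, since the Exposure description also gives 1 for a plain white page. The names are hypothetical, and every value is only a starting point to be adjusted in the preview window.

```python
# Starting values by paper type, copied from the list above.
# ASSUMPTION: these are Exposure settings; the list does not say so explicitly.
STARTING_EXPOSURE = {
    "white page": 1,
    "gray": 2,
    "yellow newsprint": 6,
    "onion paper": 2,
    "dark brown": 6,  # listed as "6/7 and up", so 6 is the lower bound
}


def starting_exposure(paper_type):
    """Return the suggested starting value for a paper type."""
    key = paper_type.strip().lower()
    if key not in STARTING_EXPOSURE:
        raise KeyError(f"no starting value recorded for paper type: {paper_type!r}")
    return STARTING_EXPOSURE[key]


print(starting_exposure("Onion paper"))  # 2
```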

Gamma - 50

This adjustment can be made after setting the exposure, to darken the text if the entire image has been lightened. If it is used, it is usually set to 50 for dark documents. If you find that you lose some darkness in the text, move the scroll bar to the left and the text should become a little darker. Be careful, however, because this will also darken changes in the texture of the paper background, such as creases and folds.

The other scan settings are not as important in comparison but can make a slight difference. Thus far these settings have only hindered the quality of the image and have been left alone.

All the scan settings are found in the preview window and you will see the changes that you make to the document before you do the final scan. It is difficult to ascertain how the actual image will look, and thus it will take some time playing with the process.

Tonal adjustments are rarely made because no image was well enhanced by this process. This is more effective on color images rather than grayscale images. If you have no other options then this can be used, but the results are not worth either the time or the trouble.

Duration for complete scan: average 2 minutes per scan if the document is typed or handwritten on light paper. Handwritten documents have usually been easier to capture than typed ones, because the handwriting is darker and not blurred like the typed copies. The more an image deviates from the standard, the more time is needed to determine which settings should be changed to produce the best image.

Once the scan is complete, the image is enlarged, because its quality is easier to judge when it is blown up. The image is checked to make sure that nothing is missing from the edges. At the same time, a judgment is made as to what, if anything, will need to be retouched.

If there have been problems with creases and pages that do not lie flat on the scanning bed, we have used a large filer so that the document can be placed between sheets of plastic, but the results have not been very impressive. Instead, we have been placing a sheet of plexiglass (which does not leave any shadows) on the back so that heavy creases can be eliminated. Surprisingly, the cover of the scanner does not flatten the original document completely, which is why we use plexiglass as a weight: because it is clear, it does not hinder the quality of the scan. Of course, the fragility of the original document is taken into consideration before this is done.

Retouching

This is done right after photographing, using Photoshop 6.0 on the Panasonic Scanning Camera. Other TIFF images were retouched later. Frequently, only the Brightness and the Contrast are adjusted. However, cropping has also been performed, and changes have been made to the Color Levels when creating the on-line JPEGs.

Retouching was limited to four months in the summer of 2004. Extensive retouching can easily take hours for one image, so we have stayed away from any really complex changes, for reasons of both time and authenticity.


 

 
Copyright 2004 Pamela Asquith | Contact | Web design by Natasha Nunn