Difference between revisions of "SUNScholar/Digitisation"

From Libopedia

Revision as of 12:42, 3 July 2010

Objectives

The objective is to convert to digital format any material using digital equipment.

The resultant digital object must adhere to the following:

  1. Use an uncompressed bitstream for storage.
  2. Use open digital formats with no patent liability and which have open published standards.

For more information, see: http://en.wikipedia.org/wiki/Digitizing
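
As a rough illustration of the second requirement, the sketch below sorts files into open and closed formats by extension alone. The extension lists are assumptions made for this example, drawn loosely from formats named elsewhere on this page; they are not an official registry, and `classify_format` is a hypothetical helper. An extension cannot confirm that a file is uncompressed or that a PDF is actually PDF/A, so this is only a first-pass filter.

```python
from pathlib import Path

# Illustrative extension lists (assumptions for this sketch, not a registry).
OPEN_EXTENSIONS = {".tif", ".tiff", ".png", ".flac", ".ogg", ".oga", ".ogv", ".spx", ".mkv"}
CLOSED_EXTENSIONS = {".doc", ".wmv", ".wma", ".mp3"}


def classify_format(filename: str) -> str:
    """Return 'open', 'closed' or 'unknown', judging by the file extension only."""
    ext = Path(filename).suffix.lower()
    if ext in OPEN_EXTENSIONS:
        return "open"
    if ext in CLOSED_EXTENSIONS:
        return "closed"
    return "unknown"


if __name__ == "__main__":
    for name in ("scan_0001.TIF", "lecture.wmv", "notes.xyz"):
        print(name, "->", classify_format(name))
```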

Digital Format Registry

Common Closed Digital Formats

See: http://patentabsurdity.com and http://en.swpat.org

Documents

All the Microsoft document formats are closed.

This is a huge problem for digital preservation.

Multimedia

All the Microsoft media formats are closed.

This is a huge problem for digital preservation.

Other closed media formats

Multimedia

Converter software

List of HTML5 compatible video formats

See: http://wiki.whatwg.org/wiki/Main_Page

Open Codecs

Dirac video and Vorbis audio in Matroska container
<source src='video.mkv' type='video/x-matroska; codecs="dirac, vorbis"'>
Theora video and Vorbis audio in Matroska container
<source src='video.mkv' type='video/x-matroska; codecs="theora, vorbis"'>
Dirac video and Vorbis audio in Ogg container
<source src='video.ogv' type='video/ogg; codecs="dirac, vorbis"'>
http://diracvideo.org/wiki/index.php/Ffmpeg2dirac
Theora video and Vorbis audio in Ogg container
<source src='video.ogv' type='video/ogg; codecs="theora, vorbis"'>
http://v2v.cc/~j/ffmpeg2theora
Theora video and Speex audio in Ogg container
<source src='video.ogv' type='video/ogg; codecs="theora, speex"'>
Vorbis audio alone in Ogg container
<source src='audio.ogg' type='audio/ogg; codecs=vorbis'>
Speex audio alone in Ogg container
<source src='audio.spx' type='audio/ogg; codecs=speex'>
FLAC audio alone in Ogg container
<source src='audio.oga' type='audio/ogg; codecs=flac'>
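
The `type` values in the list above all follow one pattern: a container MIME type followed by a `codecs` parameter. A minimal sketch of generating such tags is shown below; `mime_type` and `source_tag` are hypothetical helper names, and the sketch does no HTML escaping, so it assumes simple file names like the ones above. It always quotes the codec list, which differs cosmetically from the unquoted single-codec entries above.

```python
from typing import Sequence


def mime_type(container: str, codecs: Sequence[str]) -> str:
    """Build a MIME type with a codecs parameter, e.g. video/ogg; codecs="theora, vorbis"."""
    codec_list = ", ".join(codecs)
    return f'{container}; codecs="{codec_list}"'


def source_tag(src: str, container: str, codecs: Sequence[str]) -> str:
    """Build an HTML5 <source> element in the same style as the list above."""
    return f"<source src='{src}' type='{mime_type(container, codecs)}'>"


if __name__ == "__main__":
    print(source_tag("video.ogv", "video/ogg", ["theora", "vorbis"]))
    print(source_tag("audio.oga", "audio/ogg", ["flac"]))
```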

Closed Codecs

H.264

H.264 Constrained baseline profile video (main and extended video compatible) level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
H.264 Extended profile video (baseline-compatible) level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.58A01E, mp4a.40.2"'>
H.264 Main profile video level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"'>
H.264 'High' profile video (incompatible with main, baseline, or extended profiles) level 3 and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="avc1.64001E, mp4a.40.2"'>
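
The six hexadecimal digits after `avc1.` in the strings above encode the profile, the constraint flags and the level, two digits each. The sketch below decodes them for the four profiles listed here; `parse_avc1` is a hypothetical helper, and the profile table covers only the values that appear on this page.

```python
# Profile names for the profile_idc values used in the list above.
PROFILE_NAMES = {0x42: "Baseline", 0x4D: "Main", 0x58: "Extended", 0x64: "High"}


def parse_avc1(codec: str) -> dict:
    """Split an 'avc1.PPCCLL' string into profile, constraint flags and level."""
    prefix, _, hex_part = codec.partition(".")
    if prefix != "avc1" or len(hex_part) != 6:
        raise ValueError(f"not an avc1.PPCCLL codec string: {codec!r}")
    profile_idc = int(hex_part[0:2], 16)   # PP: profile indicator
    constraints = int(hex_part[2:4], 16)   # CC: constraint flag byte
    level_idc = int(hex_part[4:6], 16)     # LL: level times ten
    return {
        "profile": PROFILE_NAMES.get(profile_idc, f"unknown ({profile_idc})"),
        "constraint_flags": f"{constraints:08b}",
        "level": level_idc / 10,
    }


if __name__ == "__main__":
    for codec in ("avc1.42E01E", "avc1.58A01E", "avc1.4D401E", "avc1.64001E"):
        print(codec, parse_avc1(codec))
```

All four strings end in `1E`, which is 30 in decimal and corresponds to level 3.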

MPEG-4

MPEG-4 Visual Simple Profile Level 0 video and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="mp4v.20.8, mp4a.40.2"'>
MPEG-4 Advanced Simple Profile Level 0 video and Low-Complexity AAC audio in MP4 container
<source src='video.mp4' type='video/mp4; codecs="mp4v.20.240, mp4a.40.2"'>
MPEG-4 Visual Simple Profile Level 0 video and AMR audio in 3GPP container
<source src='video.3gp' type='video/3gpp; codecs="mp4v.20.8, samr"'>
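
The `mp4v` and `mp4a` strings above can be read in a similar way. The sketch below assumes the middle element is a hexadecimal object type indication (`20` for MPEG-4 Visual, `40` for MPEG-4 audio) and the last element is a decimal code; the lookup tables hold only the meanings given in the descriptions above, and `describe_mp4_codec` is a hypothetical helper.

```python
# Meanings taken from the descriptions in the list above.
MP4V_PROFILES = {
    8: "MPEG-4 Visual Simple Profile Level 0",
    240: "MPEG-4 Advanced Simple Profile Level 0",
}
MP4A_OBJECT_TYPES = {2: "Low-Complexity AAC"}


def describe_mp4_codec(codec: str) -> str:
    """Describe an 'mp4v.20.N' or 'mp4a.40.N' codec string."""
    parts = codec.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated elements: {codec!r}")
    family, oti_hex, detail = parts
    oti = int(oti_hex, 16)   # assumed hexadecimal object type indication
    code = int(detail)       # assumed decimal profile or object type code
    if family == "mp4v" and oti == 0x20:
        return MP4V_PROFILES.get(code, f"unlisted MPEG-4 Visual profile {code}")
    if family == "mp4a" and oti == 0x40:
        return MP4A_OBJECT_TYPES.get(code, f"unlisted MPEG-4 audio object type {code}")
    raise ValueError(f"unsupported codec string: {codec!r}")


if __name__ == "__main__":
    for codec in ("mp4v.20.8", "mp4v.20.240", "mp4a.40.2"):
        print(codec, "->", describe_mp4_codec(codec))
```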

Audio

Codecs

Video

Codecs
MPEG-4
H.264/MPEG-4 AVC

Images

Container Formats

Documents

Comments
Dear Hilton,

I would advise that you adopt open (i.e. non-proprietary) standards, as these have the best chance of remaining readable in the long-term future.
Proprietary formats depend on the continued existence of the firm that markets them, and on that firm's continued support even if it does continue to exist.
This is in my opinion very risky.

For documents I am aware of an ISO standard that is targeted at archival, known as PDF/A (see www.pdfa.org).

For audio and video the situation is less developed, and there are as far as I know no standards specifically for archival.
In both cases I would recommend that data be saved without lossy compression, and again that open standards be sought.
Hence MP3 and WMV should be avoided, both because they are based on lossy compression and because they are proprietary.
The audio format FLAC on the other hand is open and does not employ lossy compression.

I hope this is of help,
Best regards,
Thomas Niesler.

------------------------------------------------
Prof. Thomas Niesler
Digital Signal Processing Group
Department of Electronic Engineering
University of Stellenbosch
Private Bag X1, Stellenbosch 7602, South Africa
Phone: +27 21 8084118
Fax:   +27 21 8084981
Email: trn@dsp.sun.ac.za

Microfiche

Software

Data Sets

Engineering drawings

See: http://www.opendesign.com

Metadata

Click on the heading above.

Language

Digitisation Guidelines

Media type                     | Resolution                      | Bit depth                              | Enhancements allowed
Printed text                   | 300 dpi                         | Bitonal                                | Sharpening, descreening, cropping, deskewing, despeckling
Rare/damaged printed text      | 400 dpi                         | 8-bit grey or 24-bit colour            | Contrast stretching; minimal adjustments for tone and colour
Book illustrations             | 400 - 600 dpi with enhancement  | 8-bit grey or 24-bit colour; bitonal   | Contrast stretching; minimal adjustments for tone and colour; descreen/rescreen, sharpen
Manuscripts                    | 300 - 500 dpi with enhancement  | 8-bit grey or 24-bit colour            | Contrast stretching; minimal adjustments for tone and colour
Maps and other oversized items | 300 - 400 dpi                   | 8-bit grey or 24-bit colour            | Contrast stretching; minimal adjustments for tone and colour
Graphic art                    | 400 - 600 dpi                   | 8-bit/channel internal reduction       | Contrast stretching; minimal adjustments for tone and colour
Please note
  • All archival material is to be digitised in TIFF format
  • The TIFF copy, together with the derived PNG or any additional copies, is to be submitted to SUNScholar
  • Document provenance metadata:
    • dc.description.provenance e.g. Original scanned in at 600 dpi, 100% DigiBook 10000 RGB colour, downsized to 840 pixels in width, resolution 250. Web version done automatically by PhotoShop 7 software. Downloading time approx. 26 seconds. Date done March - April 2007.
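
To get a feel for what the resolutions and bit depths in the guidelines mean in practice, the sketch below estimates the pixel dimensions and uncompressed raster size of a scan. The A4 page size (210 x 297 mm) is an assumption chosen for illustration, the function names are hypothetical, and the byte count covers pixel data only, not file headers or metadata.

```python
MM_PER_INCH = 25.4


def pixel_dimensions(width_mm: float, height_mm: float, dpi: int) -> tuple:
    """Convert a physical page size and scan resolution to pixel dimensions."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data, with each row padded to a whole byte."""
    bytes_per_row = (width_px * bits_per_pixel + 7) // 8
    return bytes_per_row * height_px


if __name__ == "__main__":
    # Printed text: 300 dpi, bitonal (1 bit per pixel), assumed A4 page.
    w, h = pixel_dimensions(210, 297, 300)
    print(w, h, uncompressed_bytes(w, h, 1))

    # Rare or damaged printed text: 400 dpi, 24-bit colour, assumed A4 page.
    w, h = pixel_dimensions(210, 297, 400)
    print(w, h, uncompressed_bytes(w, h, 24))
```

Under these assumptions a bitonal A4 page at 300 dpi is 2480 x 3508 pixels and about 1.1 MB uncompressed, while 24-bit colour at 400 dpi is several tens of megabytes per page, which is worth budgeting for when planning storage.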