Difference between revisions of "SUNScholar/Media Filters/5.X"

From Libopedia
Revision as of 14:13, 21 February 2015

PLEASE NOTE:

The media filters have changed and now use ImageMagick and Ghostscript. See the link below for details.

https://wiki.duraspace.org/display/DSDOC5x/ImageMagick+Media+Filters

To enable them, follow the instructions at the link above.

Requirements

Check the following and then return.

http://wiki.lib.sun.ac.za/index.php/SUNScholar/Install_DSpace/S03#Step_3.2

Step 1 - Log in to the server

http://wiki.lib.sun.ac.za/index.php/SUNScholar/Prepare_Ubuntu/S01

Complete ALL of the following as the "dspace" user!

Step 2 - Install the Ubuntu software packages

Type the following:

sudo apt-get install xpdf poppler-utils
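
After the packages are installed, it may help to confirm that the three utilities used later in the configuration are actually on the PATH. The following is a minimal sketch; check_tools is a hypothetical helper name, not part of DSpace or Ubuntu.

```shell
# Sketch: report which of the given commands are missing from PATH.
# check_tools is a hypothetical helper, not part of DSpace or Ubuntu.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    # Non-zero exit status when at least one tool was not found.
    return "$missing"
}
```

For this page the call would be: check_tools pdftotext pdftoppm pdfinfo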

Step 3 - Install the Java packages

Step 3A - Install "jai_imageio.jar"

mkdir -p $HOME/temp
cd $HOME/temp
curl -O http://download.java.net/media/jai-imageio/builds/release/1.1/jai_imageio-1_1-lib-linux-i586.tar.gz
tar -xzvf jai_imageio-1_1-lib-linux-i586.tar.gz
mvn install:install-file \
                    -Dfile=jai_imageio-1_1/lib/jai_imageio.jar  \
                    -DgroupId=com.sun.media                     \
                    -DartifactId=jai_imageio                    \
                    -Dversion=1.0_01                            \
                    -Dpackaging=jar                             \
                    -DgeneratePom=true

Step 3B - Install "jai_core.jar"

mkdir -p $HOME/temp
cd $HOME/temp
wget --no-check-certificate https://m2.duraspace.org/content/repositories/thirdparty/org/fcrepo/jai_core/1.1.2_01/jai_core-1.1.2_01.jar
mvn install:install-file \
                    -Dfile=jai_core-1.1.2_01.jar               \
                    -DgroupId=javax.media                      \
                    -DartifactId=jai_core                      \
                    -Dversion=1.1.2_01                         \
                    -Dpackaging=jar                            \
                    -DgeneratePom=true
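
To confirm that both jars from Steps 3A and 3B landed in the local Maven repository, a check along the following lines could be used. It is only a sketch and assumes the default repository location, $HOME/.m2/repository; the function names m2_jar_path and check_jai_jars are hypothetical.

```shell
# Sketch: print where Maven stores an installed artifact, ASSUMING the
# default local repository at $HOME/.m2/repository.
# Arguments: groupId artifactId version
m2_jar_path() {
    group_dir=$(printf '%s' "$1" | tr '.' '/')
    printf '%s\n' "$HOME/.m2/repository/$group_dir/$2/$3/$2-$3.jar"
}

# Check the two artifacts installed in Steps 3A and 3B.
check_jai_jars() {
    status=0
    for spec in "com.sun.media jai_imageio 1.0_01" "javax.media jai_core 1.1.2_01"; do
        # Split "group artifact version" into positional parameters.
        set -- $spec
        jar=$(m2_jar_path "$1" "$2" "$3")
        if [ -f "$jar" ]; then
            echo "ok: $jar"
        else
            echo "not found: $jar"
            status=1
        fi
    done
    return "$status"
}
```

Running check_jai_jars after both installs should list two "ok" lines; a "not found" line suggests the matching mvn install:install-file command did not complete.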

Step 4 - Configuration

Step 4A

First, be sure there is a value for thumbnail.maxwidth and that it corresponds to the size you want for preview images for the UI.

Edit the "dspace.cfg" file.

nano $HOME/source/config/dspace.cfg

Search for the following and modify.

# maximum width and height of generated thumbnails
thumbnail.maxwidth  = 160
thumbnail.maxheight = 160
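
To check the values without opening the editor, a small lookup helper such as the one below could be used. This is a sketch only: cfg_value is a hypothetical name, and it handles simple single-line "key = value" entries, not values continued over several lines.

```shell
# Sketch: print the value of a key from a Java-properties-style file.
# Handles only single-line "key = value" entries (no line continuations).
# Arguments: key file
cfg_value() {
    # Escape the dots in the key so they match literally in the regex.
    key_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$2" |
        sed 's/[[:space:]]*$//'
}
```

For example: cfg_value thumbnail.maxwidth $HOME/source/config/dspace.cfg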

Step 4B

Search for "filter.plugins" and replace with the following.

filter.plugins = \
        PDF Text Extractor, \
        PDF Thumbnail, \
        HTML Text Extractor, \
        Word Text Extractor, \
        PowerPoint Text Extractor, \
        JPEG Thumbnail, \
        Branded Preview JPEG

Step 4C

Change the MediaFilter plugin configuration: remove the old "org.dspace.app.mediafilter.PDFFilter" entry and add the new filters "org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor" and "org.dspace.app.mediafilter.XPDF2Thumbnail = PDF Thumbnail". The resulting block should read as follows.

plugin.named.org.dspace.app.mediafilter.FormatFilter = \
  org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor, \
  org.dspace.app.mediafilter.XPDF2Thumbnail = PDF Thumbnail, \
  org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
  org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
  org.dspace.app.mediafilter.PowerPointFilter = PowerPoint Text Extractor, \
  org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail, \
  org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG
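
Multi-line settings such as filter.plugins and the plugin list above rely on a backslash at the very end of each continued line. In Java properties syntax a backslash followed by a space or tab does not continue the line, so a stray trailing space can silently cut the list short. The sketch below looks for such lines; find_broken_continuations is a hypothetical helper name.

```shell
# Sketch: list lines where a backslash is followed by trailing whitespace,
# which stops the line from being continued in a properties file.
# Prints "lineno:line" for each hit; exit status 0 if any hit, 1 if none.
find_broken_continuations() {
    grep -n '\\[[:space:]][[:space:]]*$' "$1"
}
```

For example: find_broken_continuations $HOME/source/config/dspace.cfg
No output means no suspicious lines were found.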

Step 4D

Then replace "filter.org.dspace.app.mediafilter.PDFFilter.inputFormats = Adobe PDF" with the following:

filter.org.dspace.app.mediafilter.XPDF2Thumbnail.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.XPDF2Text.inputFormats = Adobe PDF

Step 4E

Above the comment, "#Custom settings for PDFFilter", add the following:

#The paths to the XPDF utilities
xpdf.path.pdftotext = /usr/bin/pdftotext
xpdf.path.pdftoppm  = /usr/bin/pdftoppm
xpdf.path.pdfinfo   = /usr/bin/pdfinfo
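
The three paths above must point at real executables, otherwise the filters cannot run the utilities. A quick check might look like the following sketch; check_xpdf_paths is a hypothetical helper that reads the three xpdf.path.* keys from the file it is given.

```shell
# Sketch: verify that each xpdf.path.* entry in a config file names an
# executable file. Argument: path to dspace.cfg
check_xpdf_paths() {
    cfg=$1
    status=0
    for key in pdftotext pdftoppm pdfinfo; do
        # Extract the value of xpdf.path.<key>, trimming trailing whitespace.
        path=$(sed -n "s/^[[:space:]]*xpdf\.path\.${key}[[:space:]]*=[[:space:]]*//p" "$cfg" |
            sed 's/[[:space:]]*$//')
        if [ -n "$path" ] && [ -x "$path" ]; then
            echo "ok: xpdf.path.$key = $path"
        else
            echo "problem: xpdf.path.$key = '$path'"
            status=1
        fi
    done
    return "$status"
}
```

For example: check_xpdf_paths $HOME/source/config/dspace.cfg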

Step 4 - Build and Install

To build, type the following:

cd $HOME/source
mvn -U clean package -Pxpdf-mediafilter-support

To install, type the following, replacing XXX with your DSpace version number:

cd $HOME/source/dspace/target/dspace-XXX-build
ant update
ant clean_backups
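
If you would rather not type the version number, the build folder can be located by pattern. The sketch below assumes the folder name follows the dspace-XXX-build form shown above; find_build_dir is a hypothetical helper and returns the first match only.

```shell
# Sketch: print the first directory matching dspace-*-build under the
# given target folder. Exit status 1 if none exists.
# Argument: target folder, e.g. $HOME/source/dspace/target
find_build_dir() {
    for dir in "$1"/dspace-*-build; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

For example: cd "$(find_build_dir $HOME/source/dspace/target)" && ant update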

Step 5 - Update dspace rebuild script

If the test build works, add the switch "-Pxpdf-mediafilter-support" to the DSpace rebuild script, so that:

mvn -U clean package

becomes

mvn -U clean package -Pxpdf-mediafilter-support

See: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Rebuild_DSpace
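
The same change can be made non-interactively. The following is a sketch only: add_xpdf_profile is a hypothetical helper, the path of your rebuild script is whatever you set up on the page linked above, and the function assumes the script contains the build command exactly as "mvn -U clean package" at the end of a line. It keeps a ".bak" copy of the script before changing it.

```shell
# Sketch: append -Pxpdf-mediafilter-support to the "mvn -U clean package"
# line of a rebuild script. Argument: path to the script.
add_xpdf_profile() {
    script=$1
    # Do nothing if the switch is already there.
    if grep -q -- '-Pxpdf-mediafilter-support' "$script"; then
        echo "already present"
        return 0
    fi
    # Keep a backup, then rewrite the script from the backup copy.
    cp -p "$script" "$script.bak" &&
        sed 's/mvn -U clean package[[:space:]]*$/mvn -U clean package -Pxpdf-mediafilter-support/' \
            "$script.bak" > "$script"
}
```

Running it a second time on the same script prints "already present" and leaves the file unchanged.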

Step 6 - Test the media filters

Type the following to test. Select an item that has PDF files attached and substitute its handle for "123456789/29097".

$HOME/bin/dspace filter-media -n -v -i 123456789/29097

Step 7 - Create new thumbnails

The command below processes at most 1000 items per run (the "-m 1000" option), which limits memory and CPU use. On a large system you may therefore need to run it several times. Also make sure that the dspace user has full read/write access to all items in the assetstore folders.

$HOME/bin/dspace filter-media -n -v -f -m 1000 -p "PDF Thumbnail"

Step 8 - Add a daily admin task

See: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Daily_Admin. Check the "filter-media" options!

References