Difference between revisions of "SUNScholar/Media Filters/4.X"

From Libopedia

Latest revision as of 15:51, 29 May 2016

PLEASE NOTE:

After a while we noticed that server load rose sharply during the nightly media-filter jobs. We traced the problem to the "Branded Preview JPEG" filter and disabled it, because branded previews are not important to us. If you want to do the same, leave "Branded Preview JPEG" out of the filter.plugins list in Step 3B.

Step 1 - Install the Ubuntu software packages

Type the following:

sudo apt-get install xpdf poppler-utils curl
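
To confirm the tools landed where Step 3E expects them, something like the following sketch may help. The function name locate_tools is made up for this example; it relies only on the shell's own "command -v" lookup.

```shell
# Print the absolute path of each named tool, or flag it as missing.
# (locate_tools is an illustrative helper, not part of DSpace or Ubuntu.)
locate_tools() {
    for tool in "$@"; do
        if path=$(command -v "$tool"); then
            printf '%s = %s\n' "$tool" "$path"
        else
            printf '%s is MISSING\n' "$tool"
        fi
    done
}

locate_tools pdftotext pdftoppm pdfinfo curl
```

The paths printed for pdftotext, pdftoppm and pdfinfo are the values to use for the xpdf.path.* settings in Step 3E.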

Step 2 - Install the Java packages

Step 2A - Install "jai_imageio.jar"

mkdir -p $HOME/temp
cd $HOME/temp
wget --no-check-certificate http://download.java.net/media/jai-imageio/builds/release/1.1/jai_imageio-1_1-lib-linux-i586.tar.gz
tar -xzvf jai_imageio-1_1-lib-linux-i586.tar.gz
mvn install:install-file \
                    -Dfile=jai_imageio-1_1/lib/jai_imageio.jar  \
                    -DgroupId=com.sun.media                     \
                    -DartifactId=jai_imageio                    \
                    -Dversion=1.0_01                            \
                    -Dpackaging=jar                             \
                    -DgeneratePom=true

Step 2B - Install "jai_core.jar"

mkdir -p $HOME/temp
cd $HOME/temp
wget --no-check-certificate https://m2.duraspace.org/content/repositories/thirdparty/org/fcrepo/jai_core/1.1.2_01/jai_core-1.1.2_01.jar
mvn install:install-file \
                    -Dfile=jai_core-1.1.2_01.jar  \
                    -DgroupId=javax.media                      \
                    -DartifactId=jai_core                      \
                    -Dversion=1.1.2_01                         \
                    -Dpackaging=jar                            \
                    -DgeneratePom=true
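
A quick way to check that both jars were installed is to look for them in the local Maven repository. The sketch below assumes the repository is in Maven's default location, $HOME/.m2/repository; the helper m2_jar_path is a name invented for this example.

```shell
# Print the path a jar occupies in a Maven repository with the default layout.
# Arguments: repository root, groupId, artifactId, version.
m2_jar_path() {
    group_dir=$(printf '%s' "$2" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s.jar\n' "$1" "$group_dir" "$3" "$4" "$3" "$4"
}

repo="$HOME/.m2/repository"   # assumption: default local repository
for jar in \
    "$(m2_jar_path "$repo" com.sun.media jai_imageio 1.0_01)" \
    "$(m2_jar_path "$repo" javax.media jai_core 1.1.2_01)"
do
    if [ -f "$jar" ]; then echo "found:   $jar"; else echo "missing: $jar"; fi
done
```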

Step 3 - Configuration

Step 3A

Edit the "dspace.cfg" file.

nano $HOME/source/dspace/config/dspace.cfg

First, enable thumbnails. Search for the following properties and set each one to true:

webui.browse.thumbnail.show = true
webui.item.thumbnail.show = true
webui.preview.enabled = true

Then, search for the following and change as needed:

webui.preview.brand = My Institution Name
webui.preview.brand.abbrev = MyOrg

Lastly, make sure thumbnail.maxwidth and thumbnail.maxheight are set to the size you want for preview images in the UI. Search for the following and modify as needed.

# maximum width and height of generated thumbnails
thumbnail.maxwidth  = 160
thumbnail.maxheight = 160
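
After saving, the values can be read back without reopening the editor. This is a minimal sketch: cfg_get is a hypothetical helper, it only understands single-line "key = value" entries (no backslash continuations), and it ignores commented-out lines.

```shell
# Print the value of a single-line "key = value" property.
# $1 = property name, $2 = path to the configuration file.
cfg_get() {
    key=$(printf '%s' "$1" | sed 's/\./\\./g')   # treat dots in the name literally
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

cfg="$HOME/source/dspace/config/dspace.cfg"
if [ -f "$cfg" ]; then
    for key in webui.preview.enabled thumbnail.maxwidth thumbnail.maxheight; do
        printf '%s = %s\n' "$key" "$(cfg_get "$key" "$cfg")"
    done
fi
```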

Step 3B

Search for "filter.plugins" and replace its value with the following.

filter.plugins = \
        PDF Text Extractor, \
        PDF Thumbnail, \
        HTML Text Extractor, \
        Word Text Extractor, \
        PowerPoint Text Extractor, \
        JPEG Thumbnail, \
        Branded Preview JPEG
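
Because the list is spread over backslash-continued lines, it is easy to lose an entry or a comma. The sketch below prints the enabled filters one per line, which also makes it easy to see whether "Branded Preview JPEG" is still active. Both function names are invented for this example.

```shell
# Join backslash-continued lines so each property sits on a single line.
join_continuations() {
    awk '{
        if (sub(/\\[ \t]*$/, "")) { buf = buf $0; next }
        print buf $0; buf = ""
    }
    END { if (buf != "") print buf }' "$1"
}

# Print each entry of filter.plugins on its own line, trimmed.
list_filters() {
    join_continuations "$1" |
        sed -n 's/^[[:space:]]*filter\.plugins[[:space:]]*=//p' |
        tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

cfg="$HOME/source/dspace/config/dspace.cfg"
if [ -f "$cfg" ]; then
    list_filters "$cfg"
fi
```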

Step 3C

Change the MediaFilter plugin configuration: remove the old "org.dspace.app.mediafilter.PDFFilter" entry and add the new filters "org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor" and "org.dspace.app.mediafilter.XPDF2Thumbnail = PDF Thumbnail". Replace the existing "plugin.named.org.dspace.app.mediafilter.FormatFilter" definition with the following.

plugin.named.org.dspace.app.mediafilter.FormatFilter = \
  org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor, \
  org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
  org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
  org.dspace.app.mediafilter.PowerPointFilter = PowerPoint Text Extractor, \
  org.dspace.app.mediafilter.XPDF2Thumbnail = PDF Thumbnail, \
  org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail, \
  org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG

Step 3D

Then replace "filter.org.dspace.app.mediafilter.PDFFilter.inputFormats = Adobe PDF" with the following:

filter.org.dspace.app.mediafilter.XPDF2Thumbnail.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.XPDF2Text.inputFormats = Adobe PDF
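
Steps 3C and 3D both remove references to the old PDFFilter class, so it can be worth checking that none were left behind. A possible check, with stale_pdffilter_refs as an illustrative name:

```shell
# Print, with line numbers, uncommented lines that still name the old PDFFilter class.
# Prints nothing when the file is clean.
stale_pdffilter_refs() {
    grep -n 'org\.dspace\.app\.mediafilter\.PDFFilter' "$1" |
        grep -v '^[0-9]*:[[:space:]]*#' || true
}

cfg="$HOME/source/dspace/config/dspace.cfg"
if [ -f "$cfg" ]; then
    stale_pdffilter_refs "$cfg"
fi
```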

Step 3E

Above the comment, "#Custom settings for PDFFilter", add the following:

#The paths to the XPDF utilities
xpdf.path.pdftotext = /usr/bin/pdftotext
xpdf.path.pdftoppm  = /usr/bin/pdftoppm
xpdf.path.pdfinfo   = /usr/bin/pdfinfo

Step 4 - Build and Install

To build, type the following:

cd $HOME/source
mvn -U clean package -Pxpdf-mediafilter-support

To install, type the following (replace XXX with your DSpace version number):

cd $HOME/source/dspace/target/dspace-XXX-build
ant update
ant clean_backups
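
If you are unsure what XXX should be, the build directory can be located with a glob rather than typed by hand. This sketch assumes the Maven build produced exactly one dspace-*-build directory under the source tree; find_build_dir is a made-up helper name.

```shell
# Print the single dspace-*-build directory under a source tree.
# Fails if there is none, or more than one. $1 = source tree root.
find_build_dir() {
    set -- "$1"/dspace/target/dspace-*-build
    [ $# -eq 1 ] && [ -d "$1" ] && printf '%s\n' "$1"
}

if build_dir=$(find_build_dir "$HOME/source"); then
    echo "build directory: $build_dir"
else
    echo "no single dspace-*-build directory found; run the Maven build first"
fi
```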

Step 5 - Update the DSpace rebuild script

If the test build works, add the switch "-Pxpdf-mediafilter-support" to the DSpace rebuild script, so that:

mvn -U clean package

becomes

mvn -U clean package -Pxpdf-mediafilter-support
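
The same change can be scripted. The sketch below is only an illustration: add_profile is an invented name, and the path to your rebuild script is not given on this page, so it is left as a commented-out placeholder. The function keeps a ".bak" copy of the script and does nothing if the switch is already present.

```shell
# Append the XPDF profile switch to the Maven command in a rebuild script.
# Safe to run more than once. $1 = path to the rebuild script.
add_profile() {
    grep -q -- '-Pxpdf-mediafilter-support' "$1" ||
        sed -i.bak 's/mvn -U clean package/& -Pxpdf-mediafilter-support/' "$1"
}

# Example call (hypothetical path, adjust to your own script):
# add_profile "$HOME/bin/rebuild-dspace"
```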

See: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Rebuild_DSpace

Step 6 - Test the media filters

Restart DSpace and then type the following to test. Select an item that has PDF files attached and use its handle in place of "123456789/29097".

$HOME/bin/dspace filter-media -n -v -i 123456789/29097

Step 7 - Create new thumbnails

The commands below pass "-m 1000", so each run processes at most 1000 items. This saves memory and CPU time, but on a large system you may need to run each command several times. Also make sure that the dspace user has full read/write access to everything in the assetstore folders (sudo chmod 0777 -R $HOME/assetstore/).

$HOME/bin/dspace filter-media -n -v -m 1000 -p "PDF Thumbnail"
$HOME/bin/dspace filter-media -n -v -m 1000 -p "JPEG Thumbnail"
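
On a large repository the repeated runs can be wrapped in a small loop. This is a sketch under stated assumptions: run_passes is a name invented here, and the number of passes is something you choose yourself (roughly your item count divided by 1000); the loop does not detect when all thumbnails are done.

```shell
# Run a command a fixed number of times, stopping early if a pass fails.
# $1 = number of passes, remaining arguments = the command to run.
run_passes() {
    passes=$1
    shift
    i=1
    while [ "$i" -le "$passes" ]; do
        echo "pass $i of $passes"
        "$@" || return 1
        i=$((i + 1))
    done
}

# Example call (5 passes is an arbitrary figure):
# run_passes 5 "$HOME/bin/dspace" filter-media -n -v -m 1000 -p "PDF Thumbnail"
```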

Step 8 - Add a daily admin task

See: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Daily_Admin. Check the "filter-media" options!

Step 9 - Item list preview settings

Edit the following file:

nano $HOME/source/dspace/config/dspace.cfg

Search for xmlui.theme.mirage.item-list.emphasis. Two values are available: "metadata" and "file". Select "file", save the "dspace.cfg" file, then rebuild DSpace. See the example below.

### Settings for Item lists in Mirage theme ###
# What should the emphasis be in the display of item lists?
# Possible values : 'file', 'metadata'. If your repository is
# used mainly for scientific papers 'metadata' is probably the
# best way. If you have a lot of images and other files 'file'
# will be the best starting point
# (metadata is the default value if this option is not specified)
xmlui.theme.mirage.item-list.emphasis = file
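
The same edit can be made non-interactively. The following is a minimal sketch, with set_emphasis as a hypothetical helper: it rewrites the property line whether or not it is currently commented out, rejects any value other than the two listed above, and leaves a ".bak" copy of the file.

```shell
# Set xmlui.theme.mirage.item-list.emphasis in a dspace.cfg file.
# $1 = "file" or "metadata", $2 = path to dspace.cfg.
set_emphasis() {
    case $1 in
        file|metadata) ;;
        *) echo "invalid value: $1" >&2; return 1 ;;
    esac
    sed -i.bak "s/^[[:space:]]*#\{0,1\}[[:space:]]*\(xmlui\.theme\.mirage\.item-list\.emphasis\)[[:space:]]*=.*/\1 = $1/" "$2"
}

# Example call:
# set_emphasis file "$HOME/source/dspace/config/dspace.cfg"
```

As with the manual edit, DSpace still has to be rebuilt afterwards for the change to take effect.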

References