Difference between revisions of "SUNScholar/Media Filters/5.X"

Latest revision as of 13:36, 26 August 2016

PLEASE NOTE:

  • The media filters have changed: they now use ImageMagick and Ghostscript. For details about enabling media filters, see https://wiki.duraspace.org/display/DSDOC5x/ImageMagick+Media+Filters
  • After a while we noticed the server load increasing sharply during the nightly media-filter jobs.
  • We isolated the problem to the "Branded Preview JPEG" filter.
  • We have disabled this filter, as branded previews are not important to us. The configuration on this page still lists it; to disable it as we did, remove "Branded Preview JPEG" from filter.plugins.

Step 1 - Install the Ubuntu software packages

Type the following:

sudo apt-get install imagemagick ghostscript
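
After the install, it may be worth confirming that the tools are actually on the PATH. The sketch below is not part of the original procedure: the helper name missing_tools is made up for illustration, and the command names checked (convert and identify from ImageMagick, gs from Ghostscript) are an assumption about what the Ubuntu packages provide.

```shell
# Print the name of every given command that cannot be found on PATH.
# "missing_tools" is a hypothetical helper, not a DSpace or Ubuntu command.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: ImageMagick installs "convert" and "identify",
# and Ghostscript installs "gs".
missing=$(missing_tools convert identify gs)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing
else
    echo "ImageMagick and Ghostscript commands found"
fi
```

If anything is reported missing, rerun the apt-get command above before continuing.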

Step 2 - Configuration

Edit the "dspace.cfg" file.

nano $HOME/source/dspace/config/dspace.cfg

Enable

Search for the following settings and change each of them to true:

webui.browse.thumbnail.show = true
webui.item.thumbnail.show = true
webui.preview.enabled = true
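
To double-check the edit without reopening nano, the values can be read back from the file. This is only a sketch: get_prop is a hypothetical helper, it ignores commented-out lines, and it does not handle values continued over several lines with a backslash. The file path is the one used in the nano command above.

```shell
# Print the value of a "key = value" line from a properties-style file.
# Usage: get_prop FILE KEY   ("get_prop" is a hypothetical helper)
get_prop() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }            # skip commented-out lines
        index($0, "=") == 0 { next }   # skip lines without an equals sign
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == key) {
                value = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
            }
        }' "$1"
}

cfg="$HOME/source/dspace/config/dspace.cfg"
if [ -f "$cfg" ]; then
    for key in webui.browse.thumbnail.show webui.item.thumbnail.show webui.preview.enabled; do
        echo "$key = $(get_prop "$cfg" "$key")"
    done
fi
```

Each of the three settings should be reported as true once the file has been saved.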

Dimensions

Check that thumbnail.maxwidth has a value and that it matches the size you want for preview images in the UI.

Search for the following and modify it if needed.

# maximum width and height of generated thumbnails
thumbnail.maxwidth  = 160
thumbnail.maxheight = 160
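
To get a feel for what a 160 by 160 limit produces, the arithmetic can be sketched as follows. This assumes the image is scaled to fit inside the box with its aspect ratio preserved, which is how an ImageMagick WIDTHxHEIGHT geometry behaves by default. The helper fit_box is hypothetical and uses integer division, so a real thumbnail may differ by a pixel.

```shell
# Scale WIDTH x HEIGHT to fit inside MAXW x MAXH, keeping the aspect ratio.
# Usage: fit_box WIDTH HEIGHT MAXW MAXH   ("fit_box" is a hypothetical helper)
fit_box() {
    w=$1 h=$2 maxw=$3 maxh=$4
    if [ $((w * maxh)) -gt $((h * maxw)) ]; then
        # Relatively wider than the box: the width is the limiting side.
        echo "${maxw}x$((h * maxw / w))"
    else
        # Relatively taller than (or the same shape as) the box: the height limits.
        echo "$((w * maxh / h))x${maxh}"
    fi
}

fit_box 2480 3508 160 160   # portrait page scan
fit_box 1600 900 160 160    # landscape image
```

With the values above, the portrait page comes out at 113x160 and the landscape image at 160x90.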

Filters

Enable filters as follows:

#Names of the enabled MediaFilter or FormatFilter plugins
filter.plugins = PDF Text Extractor, HTML Text Extractor, Word Text Extractor, \
                 PowerPoint Text Extractor, \
                 Branded Preview JPEG, \
                 ImageMagick Image Thumbnail, ImageMagick PDF Thumbnail
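
Because the filter.plugins value is spread over several lines with trailing backslashes, it is easy to misread which filters are enabled. The sketch below joins the continued lines and prints one filter name per line. The helper list_filter_plugins is hypothetical, and it assumes the backslash is the very last character on each continued line.

```shell
# Print each name in the filter.plugins list, one per line.
# Usage: list_filter_plugins FILE   (hypothetical helper)
list_filter_plugins() {
    awk '
        {
            buf = buf $0
            # A trailing backslash means the value continues on the next line.
            if (buf ~ /\\$/) { sub(/\\$/, "", buf); next }
            if (buf ~ /^[ \t]*filter\.plugins[ \t]*=/) {
                sub(/^[^=]*=/, "", buf)
                n = split(buf, names, ",")
                for (i = 1; i <= n; i++) {
                    gsub(/^[ \t]+|[ \t]+$/, "", names[i])
                    if (names[i] != "") print names[i]
                }
            }
            buf = ""
        }' "$1"
}

cfg="$HOME/source/dspace/config/dspace.cfg"
if [ -f "$cfg" ]; then
    list_filter_plugins "$cfg"
fi
```

If "Branded Preview JPEG" appears in the output, that filter will still run during the nightly media-filter job.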

Names

Assign names for filters as follows:

#Assign 'human-understandable' names to each filter
plugin.named.org.dspace.app.mediafilter.FormatFilter = \
  org.dspace.app.mediafilter.PDFFilter = PDF Text Extractor, \
  org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
  org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
  org.dspace.app.mediafilter.PowerPointFilter = PowerPoint Text Extractor, \
  org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG, \
  org.dspace.app.mediafilter.ImageMagickImageThumbnailFilter = ImageMagick Image Thumbnail, \
  org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter = ImageMagick PDF Thumbnail

Input Formats

Assign input formats to the media filters as follows. The values are format names as they appear in the DSpace bitstream format registry:

#Configure each filter's input format(s)
filter.org.dspace.app.mediafilter.PDFFilter.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.HTMLFilter.inputFormats = HTML, Text
filter.org.dspace.app.mediafilter.WordFilter.inputFormats = Microsoft Word
filter.org.dspace.app.mediafilter.PowerPointFilter.inputFormats = Microsoft Powerpoint, Microsoft Powerpoint XML
filter.org.dspace.app.mediafilter.BrandedPreviewJPEGFilter.inputFormats = BMP, GIF, JPEG, image/png
filter.org.dspace.app.mediafilter.ImageMagickImageThumbnailFilter.inputFormats = BMP, GIF, image/png, JPG, TIFF, JPEG, JPEG 2000
filter.org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.inputFormats = Adobe PDF

Permissions

Configure media filter permissions. Search for "filter.org.dspace.app.mediafilter.publicPermission" and modify as follows:

#Publicly accessible thumbnails of restricted content.
#List the MediaFilter names that would get publicly accessible permissions
#Any media filters not listed will instead inherit the permissions of the parent bitstream
filter.org.dspace.app.mediafilter.publicPermission = BrandedPreviewJPEGFilter, ImageMagickImageThumbnailFilter, ImageMagickPdfThumbnailFilter

List Emphasis

Search for xmlui.theme.mirage.item-list.emphasis. It accepts two values, "metadata" or "file"; select "file".

See example below.

### Settings for Item lists in Mirage theme ###
# What should the emphasis be in the display of item lists?
# Possible values : 'file', 'metadata'. If your repository is
# used mainly for scientific papers 'metadata' is probably the
# best way. If you have a lot of images and other files 'file'
# will be the best starting point
# (metadata is the default value if this option is not specified)
xmlui.theme.mirage.item-list.emphasis = file

Save the "dspace.cfg" file and exit nano.


NANO Editor Help
CTL+O = Save the file and then press Enter
CTL+X = Exit "nano"
CTL+K = Delete line
CTL+U = Undelete line
CTL+W = Search for a string
CTL+\ = Search for a string and replace it with another string
CTL+C = Show the current cursor position (line and column)

More info = http://en.wikipedia.org/wiki/Nano_(text_editor)


Step 4 - Rebuild DSpace

Step 5 - Test the media filters

Type the following to test. Pick an item that has PDF files attached and use its handle in place of "123456789/29097".

$HOME/bin/dspace filter-media -v -i 123456789/29097
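
A mistyped handle is an easy way to get a confusing test result. The sketch below checks that a value looks like a handle before building the command. The helper is_handle and its pattern (a numeric prefix, a slash, then a suffix) are assumptions for illustration, not a rule taken from DSpace; the sketch only prints the command so that nothing runs by accident.

```shell
# Succeed if the argument looks like PREFIX/SUFFIX with a numeric prefix.
# "is_handle" and its pattern are assumptions made for this sketch.
is_handle() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)*/[^/[:space:]]+$'
}

handle="123456789/29097"   # replace with the handle of your own test item
if is_handle "$handle"; then
    echo "$HOME/bin/dspace filter-media -v -i $handle"
else
    echo "Not a handle: $handle" >&2
fi
```

Copy the printed command and run it once the handle is confirmed.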

Step 6 - Create new thumbnails

The commands below are limited to 1000 items per run (the "-m 1000" option), which saves memory and CPU time. On a large system you may therefore need to run them several times. Also make sure that the dspace user has full read/write access to all files in the asset store folders.

$HOME/bin/dspace filter-media -v -f -m 1000 -p "ImageMagick PDF Thumbnail"
$HOME/bin/dspace filter-media -v -f -m 1000 -p "ImageMagick Image Thumbnail"
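
The read/write requirement mentioned above can be checked before starting a long run. This is a sketch under assumptions: count_inaccessible is a hypothetical helper, and $HOME/assetstore is only a guess at where the asset store lives on an install like this one; adjust the path to your own setup. Run it as the dspace user.

```shell
# Count the files under DIR that the current user cannot both read and write.
# Usage: count_inaccessible DIR   (hypothetical helper)
count_inaccessible() {
    find "$1" -type f | while IFS= read -r f; do
        [ -r "$f" ] && [ -w "$f" ] || printf '%s\n' "$f"
    done | wc -l | tr -d ' '
}

store="$HOME/assetstore"   # assumption: adjust to your asset store location
if [ -d "$store" ]; then
    echo "Files without read/write access: $(count_inaccessible "$store")"
fi
```

A result of 0 means the dspace user can read and write every file found; file names containing newlines are not handled by this simple loop.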

Step 7 - Add a daily admin task

See: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Daily_Admin. Check the "filter-media" options!

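References

  • https://wiki.duraspace.org/display/DSDOC5x/Mediafilters+for+Transforming+DSpace+Content
  • https://wiki.duraspace.org/display/DSDOC5x/Configuration+Reference#ConfigurationReference-XPDFFilter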