SUNScholar/Optimisations

From Libopedia
Revision as of 12:01, 22 January 2013 by Hgibson (talk | contribs) (→‎SOLR)

Introduction

This wiki help page assumes that you have used the three system setup procedures to install an Ubuntu server with DSpace software.

This wiki page details the major optimisations performed on the system at Stellenbosch University in order to create a true production installation of DSpace.

Tomcat

See: http://www.turnkeylinux.org/tomcat

  • Remove "mod_jk" and use "authbind" exclusively, so that the Tomcat AJP connector is no longer needed. This reduces the CPU and memory load.
http://wiki.lib.sun.ac.za/index.php/SUNScholar/Prepare_Ubuntu/S05
  • Take Tomcat out of "development mode" by adding the following attribute to the Connector element in the Tomcat server.xml file. This stops Tomcat from doing a DNS lookup for each client address.
enableLookups="false"

XMLUI

  • Use XMLUI exclusively to reduce the memory load.
http://wiki.lib.sun.ac.za/index.php/SUNScholar/Install_Dspace/S08

Indexes

  • Fix "browse index" configuration to reduce the PostgreSQL database server query load.
http://wiki.lib.sun.ac.za/index.php/SUNScholar/Indexes#Browse_Indexes

Logs

DSpace application

  • Changed all instances of "INFO" to "ERROR" in the following config file to reduce disk I/O and CPU load.
log4j.properties
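
The change above can also be scripted. This is only a minimal sketch: the function name "quiet_log4j" is made up for this example, and the location of log4j.properties depends on your own installation, so the path is passed in as an argument. It replaces every instance of "INFO" with "ERROR" and keeps a backup copy next to the original.

```shell
# Sketch only: quiet_log4j is a hypothetical helper name, not part of DSpace.
# Usage: quiet_log4j /path/to/your/log4j.properties
quiet_log4j() {
    conf="$1"
    # Replace every instance of INFO with ERROR, editing the file in place.
    # The original file is kept alongside it with a .bak suffix.
    sed -i.bak 's/INFO/ERROR/g' "$conf"
}
```

If the result is not what you expected, copy the .bak file back over the edited file.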

SOLR

Create the following file:

nano /home/dspace/dspace-1.8.2-src-release/dspace/modules/solr/src/main/webapp/WEB-INF/classes/logging.properties

Add the following to the file:

org.apache.solr.level = SEVERE

Save the file and rebuild.
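
If you prefer not to use an editor, the same file can be written from the command line. This is a rough sketch: the function name "write_solr_logging" is made up for this example, and the directory is passed in as an argument so that you can point it at the "WEB-INF/classes" folder shown above.

```shell
# Sketch only: write_solr_logging is a hypothetical helper name.
# Usage: write_solr_logging <path to the solr WEB-INF/classes folder>
write_solr_logging() {
    classes_dir="$1"
    # Create the folder if it does not exist yet.
    mkdir -p "$classes_dir"
    # Write the single logging level line, replacing any existing file.
    printf '%s\n' 'org.apache.solr.level = SEVERE' > "$classes_dir/logging.properties"
}
```

For example: write_solr_logging /home/dspace/dspace-1.8.2-src-release/dspace/modules/solr/src/main/webapp/WEB-INF/classes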


NANO Editor Help
CTL+O = Save the file and then press Enter
CTL+X = Exit "nano"
CTL+K = Delete line
CTL+U = Undelete line
CTL+W = Search for %%string%%
CTL+\ = Search for %%string%% and replace with $$string$$
CTL+C = Show the current line number (cursor position)

More info = http://en.wikipedia.org/wiki/Nano_(text_editor)


Bitstream checker

Modified the bitstream checker settings as follows, limiting how long the check history is retained, to reduce the database size.

#### Checksum Checker Settings ####
# Default dispatcher in case none specified
plugin.single.org.dspace.checker.BitstreamDispatcher=org.dspace.checker.SimpleDispatcher

# check history retention
checker.retention.default=1y
checker.retention.CHECKSUM_MATCH=2w

Monit monitor service

In case the Tomcat service halts or hangs for any reason, I installed monit to restart the service and then alert me. See an example of my config below.

dspace@ir1:/etc/monit$ sudo cat /etc/monit/monitrc
set daemon  60
set logfile syslog facility log_daemon
set mailserver localhost
set mail-format { from: XXXX@XX.XXX.XX.XX }
set alert XXXX@localhost
set httpd port 2812
     allow %user%:%password%

check process sshd with pidfile /var/run/sshd.pid
   start program  "/etc/init.d/ssh start"
   stop program  "/etc/init.d/ssh stop"
   if failed port 22 protocol ssh then restart
   
check host sunscholar with address scholar.sun.ac.za
   start program = "/etc/init.d/tomcat6 restart"
   stop program  = "/etc/init.d/tomcat6 stop"
   if failed port 80 proto http then restart
   alert XXXX@XXX.XX.XX
   alert XXXX@XXX.XX.XX

All confidential information has been replaced with % signs or capital X's.

References


All our tweaks and optimisations seem to be working.

The load dropped when we started using "authbind" for Tomcat thereby eliminating the need for the Apache "mod_jk" module, which was creating extra processing overhead.

Sunscholar-load-year.png

At our current rate of submissions, it looks like we have enough disk space in the /home partition for at least the next three years. The /var partition, which holds the database, was reduced in size by tweaking the bitstream checker properties and then running a full database vacuum.

Sunscholar-disk-usage-year.png

We have more than enough compute muscle.

Sunscholar-cpu-year.png

Our memory usage stabilised when we stopped using the JSPUI. However, after the upgrade to DSpace 1.8.2 and enabling discovery, we are back to a memory-intensive system.

Sunscholar-memory-year.png
