SUNScholar/Elastic Statistics/Import Logs

From Libopedia

Revision as of 11:37, 23 January 2015

Introduction

Getting a full statistical history after enabling Elastic Statistics requires importing old log data. The steps below describe how to do that.

PLEASE NOTE:

When importing records, take care not to duplicate them. For example, if you enabled Elastic Statistics a month ago, do not import the logs for the past month, because those visits were already recorded after Elastic Statistics was enabled and would be counted twice.
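
One way to respect the note above is to select only the log files dated before the day Elastic Statistics was enabled. The sketch below assumes the log files are named dspace.log.YYYY-MM-DD (the date suffix is an assumption, the steps on this page only rely on the dspace.log.* pattern), and logs_before is a hypothetical helper name, not part of DSpace.

```shell
#!/bin/sh
# Print the log files in a directory that are dated strictly before a
# cutoff date. ASSUMPTION: files are named dspace.log.YYYY-MM-DD.
# Usage: logs_before CUTOFF_DATE [LOG_DIR]
logs_before() {
	# Turn 2015-01-23 into 20150123 so dates compare as plain integers.
	cutoff=$(printf '%s' "$1" | tr -d '-')
	dir=${2:-$HOME/log}
	for f in "$dir"/dspace.log.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ; do
		# Skip the unexpanded pattern when nothing matches.
		[ -e "$f" ] || continue
		stamp=$(printf '%s' "${f##*dspace.log.}" | tr -d '-')
		if [ "$stamp" -lt "$cutoff" ] ; then
			printf '%s\n' "$f"
		fi
	done
}
```

For example, logs_before 2015-01-01 "$HOME/log" would list only the logs from before 1 January 2015. Newer logs could then be moved out of the log directory before running the export in Step 1.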

Step 1 - Export old log data to suitable import format

  • Create export script:
mkdir -p $HOME/scripts
nano $HOME/scripts/stats-export
  • Copy and paste the following, then save the file:
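#!/bin/sh

# Stop if the log folder is missing.
cd "$HOME/log" || exit 1

# Loop over the file pattern directly instead of parsing "ls" output.
for i in dspace.log.* ; do
	# Skip the unexpanded pattern when no log files exist.
	[ -e "$i" ] || continue
	# Skip export files left behind by an earlier run of this script.
	case "$i" in *.export) continue ;; esac
	echo "###################################"
	echo "Exporting stats for log file:... $i"
	# -n = process new format log lines (1.6+), see References.
	"$HOME/bin/dspace" stats-log-converter -n -i "$i" -o "$i.export"
done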

NANO Editor Help
CTRL+O = Save the file (press Enter to confirm the file name)
CTRL+X = Exit "nano"
CTRL+K = Delete (cut) line
CTRL+U = Undelete (paste) line
CTRL+W = Search for a string
CTRL+\ = Search for a string and replace it with another string
CTRL+C = Show the current cursor position (line and column number)

More info = http://en.wikipedia.org/wiki/Nano_(text_editor)


  • Make the script executable:
chmod 0775 $HOME/scripts/stats-export
  • Convert the logs:
$HOME/scripts/stats-export
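
Before moving on to Step 2, it can help to confirm that every log file produced a non-empty export file. The following is a minimal sketch under the naming used by the export script above (each export is the log file name plus .export); missing_exports is a hypothetical helper name.

```shell
#!/bin/sh
# List log files that have no export file, or only an empty one.
# Usage: missing_exports [LOG_DIR]
missing_exports() {
	dir=${1:-$HOME/log}
	for f in "$dir"/dspace.log.* ; do
		# Skip the unexpanded pattern when nothing matches.
		[ -e "$f" ] || continue
		# Export files themselves are not log files.
		case "$f" in *.export) continue ;; esac
		# -s is true only when the file exists and is not empty.
		[ -s "$f.export" ] || printf '%s\n' "$f"
	done
}
```

Running missing_exports with no arguments checks $HOME/log. If it prints nothing, every log has an export file with content. Note that a log with no recorded visits may legitimately give an empty export, so a listed file is a prompt to look, not proof of an error.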

Step 2 - Import prepared log data to Elastic Statistics


PLEASE NOTE:

  • For a busy web site, the imports will take a long time, because a long history of visits is written to the server in a very short space of time. Please be patient: the import may take hours.
  • You can speed up the imports on an Ubuntu server by installing the name service cache daemon (nscd), which caches DNS lookups. Type: sudo apt-get install nscd before you start the imports. The import utility listed under References also has a -s option that skips reverse DNS lookups altogether.
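
To get a rough lower bound on how long the import will run, count the export files waiting to be imported. The import script in this step rests for 60 seconds after each file, so the count is also the minimum number of minutes spent resting, before any actual import time. This is only a sketch, and count_exports is a hypothetical helper name.

```shell
#!/bin/sh
# Count the *.export files waiting in a log directory.
# Usage: count_exports [LOG_DIR]
count_exports() {
	dir=${1:-$HOME/log}
	count=0
	for f in "$dir"/*.export ; do
		# Skip the unexpanded pattern when nothing matches.
		[ -e "$f" ] || continue
		count=$((count + 1))
	done
	echo "$count"
}
```

For example, echo "At least $(count_exports) minutes of rest time" gives a feel for whether the import should be started in the evening or over a weekend.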

  • Create import script:
mkdir -p $HOME/scripts
nano $HOME/scripts/stats-import
  • Copy and paste the following, then save the file:
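#!/bin/sh

# Stop if the log folder is missing.
cd "$HOME/log" || exit 1

# Loop over the file pattern directly instead of parsing "ls" output.
for i in *.export ; do
	# Skip the unexpanded pattern when no export files exist.
	[ -e "$i" ] || continue
	echo "###################################"
	echo "Importing stats for log file:... $i"
	# Uses the Elasticsearch import utility listed under References.
	"$HOME/bin/dspace" stats-log-importer-elasticsearch -v -i "$i"
	echo "Import of:... $i completed. Resting for 60 seconds... Phew!"
	sleep 60
done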
  • Make the script executable:
chmod 0755 $HOME/scripts/stats-import
  • Import the converted log files:
$HOME/scripts/stats-import
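
Because running the import twice would duplicate records, one option is to move each export file aside once it has been imported, so that a rerun after an interruption only picks up the remaining files. The sketch below is not part of DSpace: import_once and the imported sub-folder are hypothetical names, and it assumes the import command returns a non-zero exit status on failure, which should be verified for your DSpace version.

```shell
#!/bin/sh
# Import every *.export file once, then move it into an "imported"
# sub-folder so that a rerun skips it.
# Usage: import_once LOG_DIR IMPORT_COMMAND [ARGS...]
import_once() {
	dir=$1
	shift
	mkdir -p "$dir/imported" || return 1
	for f in "$dir"/*.export ; do
		# Skip the unexpanded pattern when nothing matches.
		[ -e "$f" ] || continue
		# "$@" is the import command, -i passes it the input file.
		# ASSUMPTION: the command exits non-zero when the import fails.
		if "$@" -i "$f" ; then
			mv "$f" "$dir/imported/"
		else
			echo "Import failed for $f, leaving it in place" >&2
			return 1
		fi
	done
}
```

A possible call is import_once "$HOME/log" "$HOME/bin/dspace" stats-log-importer-elasticsearch -v, using the import utility listed under References. A file whose import failed part-way may still have added some records, so check it before importing it again.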

References

Export Utility

$HOME/bin/dspace stats-log-converter
usage: ClassicDSpaceLogConverter
       
 -h,--help        help
 -i,--in <arg>    source file ('-' or omit for standard input)
 -m,--multiple    treat the input file as having a wildcard ending
 -n,--newformat   process new format log lines (1.6+)
 -o,--out <arg>   destination file or directory ('-' or omit for standard
                  output)
 -v,--verbose     display verbose output (useful for debugging)

	ClassicDSpaceLogConverter -i infilename -o outfilename -v (for verbose output)

Import Utility

$HOME/bin/dspace stats-log-importer-elasticsearch
usage: StatisticsImporterElasticSearch
       
 -h,--help       help
 -i,--in <arg>   the input file ('-' or omit for standard input)
 -m,--multiple   treat the input file as having a wildcard ending
 -s,--skipdns    skip performing reverse DNS lookups on IP addresses
 -v,--verbose    display verbose output (useful for debugging)