SUNScholar/Elastic Statistics/Import Logs

Latest revision as of 12:40, 26 August 2016


Introduction

Getting a full statistical history after enabling Elastic Statistics requires importing the old log data. The steps below describe how to do that.

PLEASE NOTE:

When importing records, try not to duplicate records.

For example:

If you enabled Elastic Statistics a month ago, do not import the logs for the past month. Visits from that period have been recorded since Elastic Statistics was enabled, so importing those logs would duplicate them.
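
One way to keep the overlapping period out of the import is to pick the log files by date. The following is a minimal sketch only: it assumes the log files are named dspace.log.YYYY-MM-DD, and both the helper name logs_before and the cutoff date are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the log files dated strictly before the cutoff.
# Assumes file names of the form dspace.log.YYYY-MM-DD.
logs_before() {
	cutoff="$1"
	shift
	for f in "$@" ; do
		d="${f##*dspace.log.}"
		# ISO dates sort correctly as plain strings, so the earlier of
		# the two dates is the first line of the sorted pair.
		first=$(printf '%s\n%s\n' "$d" "$cutoff" | LC_ALL=C sort | head -n 1)
		if [ "$d" != "$cutoff" ] && [ "$first" = "$d" ] ; then
			echo "$f"
		fi
	done
}

# Example: Elastic Statistics was enabled on 2016-07-26 (made-up date).
logs_before 2016-07-26 dspace.log.2016-07-25 dspace.log.2016-07-26 dspace.log.2016-07-27
```

Only the files printed by the helper would then be converted and imported; the file for the cutoff day itself is left out on purpose.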

Step 1 - Export old log data to suitable import format

  • Create export script:
mkdir $HOME/scripts
nano $HOME/scripts/stats-export
  • Copy and paste the following, then save the file:
#!/bin/sh

cd $HOME/log || exit 1
# List the log files, leaving out any .export files from an earlier run
ITEM=`ls dspace.log.* | grep -v '\.export$'`
## Uncomment the following to debug ##
#echo $ITEM

for i in $ITEM ; do
	echo "###################################"
	echo "Exporting stats for log file:... $i"
	$HOME/bin/dspace stats-log-converter -n -i $i -o $i.export
done

NANO Editor Help
CTL+O = Save the file and then press Enter
CTL+X = Exit "nano"
CTL+K = Delete line
CTL+U = Undelete line
CTL+W = Search for %%string%%
CTL+\ = Search for %%string%% and replace with $$string$$
CTL+C = Show the current cursor position (line number)

More info = http://en.wikipedia.org/wiki/Nano_(text_editor)


  • Make the script executable:
chmod 0775 $HOME/scripts/stats-export
  • Convert the logs:
$HOME/scripts/stats-export
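
After the conversion has finished, it can be worth checking that every log file received a matching .export file. The following is a rough sketch of such a check; the helper name missing_exports and the sample file names are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print each log file in a directory that has no
# non-empty .export file next to it.
missing_exports() {
	for f in "$1"/dspace.log.* ; do
		[ -e "$f" ] || continue
		case "$f" in *.export) continue ;; esac
		[ -s "$f.export" ] || echo "$f"
	done
}

# Example run against a scratch directory with made-up file names.
demo=$(mktemp -d)
echo "sample" > "$demo/dspace.log.2016-01-01"
echo "sample" > "$demo/dspace.log.2016-01-02"
echo "sample" > "$demo/dspace.log.2016-01-01.export"
missing_exports "$demo"
rm -r "$demo"
```

On a real server the call would be missing_exports $HOME/log, and an empty result would mean that every log file has been converted.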

Step 2 - Import prepared log data to Elastic Statistics


PLEASE NOTE:

  • For a busy web site, the imports will take a long time, because a long history of visits is loaded onto the server in a very short space of time. Please be patient: the import may take hours, maybe even days!
  • You can speed up the imports on an Ubuntu server by installing a name service caching daemon before you start the imports, type: sudo apt-get install nscd
  • The import script can be run in the background in a "detached mode" with one of the following: nohup (https://en.wikipedia.org/wiki/Nohup), GNU Screen (https://en.wikipedia.org/wiki/GNU_Screen) or tmux (https://en.wikipedia.org/wiki/Tmux).
  • If the import fails, check the "stats-import.log" file to see which imports succeeded. Then remove the import files that have been processed and begin the import again. Repeat this process until all the import files have been processed.
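
The recovery step in the last note can be partly automated. The following is a minimal sketch, assuming that stats-import.log holds one "Import of:... FILE completed." line for each finished file; the helper name completed_imports and the sample log content are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the files that the log
# reports as completed.
completed_imports() {
	sed -n 's/^Import of:\.\.\. \(.*\) completed\. .*$/\1/p' "$1"
}

# Example with a made-up log: the second import never finished.
demo=$(mktemp)
cat > "$demo" <<'EOF'
###################################
Importing stats for log file:... dspace.log.2016-01-01.export
Import of:... dspace.log.2016-01-01.export completed. Resting for a while!
###################################
Importing stats for log file:... dspace.log.2016-01-02.export
EOF
completed_imports "$demo"
rm "$demo"
```

The files listed by the helper are the ones to remove from the log directory before starting the import again.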

Procedure

  • Create import script:
mkdir -p $HOME/scripts
nano $HOME/scripts/stats-import
  • Copy and paste the following, then save the file:
#!/bin/sh

cd $HOME/log || exit 1
ITEM=`ls *.export`
## Uncomment the following to debug ##
#echo $ITEM

for i in $ITEM ; do
	# Append (>>) to the log so that earlier entries are kept
	echo "###################################" >> stats-import.log
	echo "Importing stats for log file:... $i" >> stats-import.log
	$HOME/bin/dspace stats-log-importer -v -i $i
	echo "Import of:... $i completed. Resting for a while!" >> stats-import.log
	sleep 3
done
  • Make the script executable:
chmod 0755 $HOME/scripts/stats-import
  • Import the converted log files:
$HOME/scripts/stats-import
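
Because the import can run for a long time, it may be convenient to start it in the "detached mode" mentioned in the notes for this step. The following is a small sketch using nohup; the wrapper name run_detached, the DETACHED_PID variable and the output file are assumptions made for illustration, and the example runs a harmless stand-in command rather than the import itself.

```shell
#!/bin/sh
# Hypothetical wrapper: start a command so that it keeps running after
# logout, with all of its output sent to the given file.
run_detached() {
	out="$1"
	shift
	nohup "$@" > "$out" 2>&1 < /dev/null &
	DETACHED_PID=$!
}

# Example with a stand-in command instead of the real import script.
demo=$(mktemp)
run_detached "$demo" sh -c 'echo started'
wait "$DETACHED_PID"
cat "$demo"
rm "$demo"
```

For the real import the call would look like run_detached $HOME/import-console.log $HOME/scripts/stats-import, without the wait, so that the terminal can be closed while the import continues.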

References

See: https://github.com/osulibraries/kb-stats-csv-import-elasticsearch

Export Utility

$HOME/bin/dspace stats-log-converter
usage: ClassicDSpaceLogConverter
       
 -h,--help        help
 -i,--in <arg>    source file ('-' or omit for standard input)
 -m,--multiple    treat the input file as having a wildcard ending
 -n,--newformat   process new format log lines (1.6+)
 -o,--out <arg>   destination file or directory ('-' or omit for standard
                  output)
 -v,--verbose     display verbose output (useful for debugging)

	ClassicDSpaceLogConverter -i infilename -o outfilename -v (for verbose output)

Import Utility

$HOME/bin/dspace stats-log-importer-elasticsearch
usage: StatisticsImporterElasticSearch
       
 -h,--help       help
 -i,--in <arg>   the input file ('-' or omit for standard input)
 -m,--multiple   treat the input file as having a wildcard ending
 -s,--skipdns    skip performing reverse DNS lookups on IP addresses
 -v,--verbose    display verbose output (useful for debugging)
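
The -s option in the listing above skips reverse DNS lookups, which may shorten a long import. The following is a dry-run sketch that only prints the commands it would run, so they can be reviewed first; the helper name preview_imports is made up for illustration, and whether -s suits a given site is left to the administrator.

```shell
#!/bin/sh
# Hypothetical dry run: print, without executing, one import command
# per .export file in the given directory.
preview_imports() {
	for f in "$1"/*.export ; do
		[ -e "$f" ] || continue
		echo "$HOME/bin/dspace stats-log-importer-elasticsearch -s -v -i $f"
	done
}

# Example run against a scratch directory with a made-up file name.
demo=$(mktemp -d)
: > "$demo/dspace.log.2016-01-01.export"
preview_imports "$demo"
rm -r "$demo"
```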