Difference between revisions of "SUNScholar/Export and Import Artifacts"

 
(251 intermediate revisions by 3 users not shown)
<center>
'''[[SUNScholar/Customisation|Back to Customisation]]'''
</center>

==Introduction==
'''<font color="red">The functionality to export communities, whole collections and items using the XMLUI only became available with DSpace versions >= 1.7.0.<br>With DSpace versions >= 5.X it is possible to import collections archived in the simple archive format using the XMLUI.</font>'''

==[[SUNScholar/Export and Import Artifacts/Via Packages|Using Packages Format]]==
==[[SUNScholar/Export and Import Artifacts/Via Simple Archive Format|Using Simple Archive Format]]==
==Transfer Content==
*https://wiki.duraspace.org/display/DSDOC5x/Exchanging+Content+Between+Repositories
*https://wiki.duraspace.org/display/DSDOC4x/Exchanging+Content+Between+Repositories
*https://wiki.duraspace.org/display/DSDOC3x/Exchanging+Content+Between+Repositories
==Archive Content==
*https://wiki.duraspace.org/display/DSDOC5x/AIP+Backup+and+Restore
*https://wiki.duraspace.org/display/DSDOC4x/AIP+Backup+and+Restore
*https://wiki.duraspace.org/display/DSDOC3x/AIP+Backup+and+Restore
*https://wiki.duraspace.org/display/DSPACE/ReplicationTaskSuite
==[[SUNScholar/Curation/Repository|Migrate Content]]==
Click on the heading above.

[[Category:Customisation]]

=Export on the old server=
==Create the folders==
Become the root user by typing:

  sudo -i

Make an exports folder as follows:

mkdir /home/exports

Make a scripts folder as follows:

mkdir /root/scripts

==Create export script==

Make an export script file as follows:

nano /root/scripts/collection-export

Copy and paste the following into the file open in the editor.

<pre>
#!/bin/bash

# Load the settings (HANDLE, EXPORT, DSHOME, PERSON) from the config file
# that sits next to this script, so it can be run from any directory.
cd "$(dirname "$0")" || exit 1
. ./config

# Check that the export folder exists.
if [ ! -d "$EXPORT" ]; then
    echo "Please make an export folder available."
    exit 1
fi

# Export each collection listed in the collections file.
for ECID in $(cat collections); do
    if [ ! -d "$EXPORT/$ECID" ]; then
        mkdir "$EXPORT/$ECID"
        echo "Exporting collection no: $ECID" >> "$EXPORT/export.log"
        "$DSHOME/bin/export" -m -d "$EXPORT/$ECID" -t COLLECTION -n 100 -i "$HANDLE/$ECID"
    else
        echo "Collection:$ECID already exported." >> "$EXPORT/export.log"
    fi
done

# Open up the permissions on the exported files.
chmod -R 0777 "$EXPORT"
chown -R root:root "$EXPORT"
</pre>
 
'''Please note: <font color="red">The -m option exports in migration mode, which strips out repository-specific details (see the Help Info section below). This is for migration, not movement of assets. Only available from DSpace version 1.6.0.</font>'''
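
Before running the export for real, it can help to see the commands the script would run. The following is only a sketch of a dry run: the values stand in for the config file, the two collection IDs are made up, and the hypothetical helper export_command just prints the command line instead of calling DSpace.

```shell
#!/bin/bash
# Hypothetical dry run: print the export command for each collection
# without calling DSpace. All values below are example assumptions.
HANDLE="123456789"
EXPORT="/home/exports"
DSHOME="/home/dspace"

# Build the command line the export script would run for one collection id.
export_command() {
    local ecid=$1
    echo "$DSHOME/bin/export -m -d $EXPORT/$ecid -t COLLECTION -n 100 -i $HANDLE/$ecid"
}

for ecid in 12 34; do
    export_command "$ecid"
done
```

Comparing the printed lines against your own handle prefix and folders is a cheap way to catch a wrong setting before any files are written.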
 
 
 
Save the file and exit the editor.
 
 
 
Now make the file executable as follows:
 
chmod 0755 /root/scripts/collection-export
 
 
 
==Create export config file==
 
Make a config file as follows:
 
nano /root/scripts/config
 
Copy and paste the following into the editor.
 
<pre>
HANDLE="123456789"
EXPORT="/home/exports"
DSHOME="/home/dspace"
PERSON="admin@myrepo.ac.za"
</pre>
 
Modify the above to suit your site and then save and exit the file.
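
A missing or misspelt setting in the config file only shows up once the script is already running. As a minimal sketch, assuming the four variable names shown above, the hypothetical helper check_config below reports any setting that is empty. The temporary file only stands in for /root/scripts/config so that the example runs anywhere.

```shell
#!/bin/bash
# Sketch: fail early if a required setting is missing from the config file.
# CONFIG_FILE is a temporary stand-in so the example runs anywhere.
CONFIG_FILE=$(mktemp)
cat > "$CONFIG_FILE" <<'EOF'
HANDLE="123456789"
EXPORT="/home/exports"
DSHOME="/home/dspace"
PERSON="admin@myrepo.ac.za"
EOF

. "$CONFIG_FILE"

# Print the name of every unset or empty variable; return 1 if any is missing.
check_config() {
    local missing=0 name
    for name in "$@"; do
        if [ -z "${!name}" ]; then
            echo "Missing setting: $name"
            missing=1
        fi
    done
    return $missing
}

check_config HANDLE EXPORT DSHOME PERSON && echo "Config looks complete."
rm -f "$CONFIG_FILE"
```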
 
 
 
==Create export collections file==
 
Make a collections file as follows:
 
nano /root/scripts/collections
 
Copy and paste the following into the editor.
 
<pre>
12
34
23
56
21
</pre>
 
'''Change the above to reflect the collection handle IDs you want to export. See the procedure outlined at the top of this page to determine the collection IDs.'''
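
A stray character in the collections file leads to an export of a collection that does not exist. The sketch below assumes the IDs are plain numbers, one per line, as in the example above. The hypothetical helper invalid_ids lists every line that does not match, and the sample file is made up for illustration.

```shell
#!/bin/bash
# Sketch: flag lines in a collections file that are not plain numeric ids.
# The sample file is made up; point COLLECTIONS at the real file instead.
COLLECTIONS=$(mktemp)
printf '12\n34\n2x3\n56\n' > "$COLLECTIONS"

# Print every non-empty line that contains anything other than digits.
invalid_ids() {
    grep -v -E '^[0-9]+$' "$1" | grep -v '^$'
}

if bad=$(invalid_ids "$COLLECTIONS"); then
    echo "Fix these entries before exporting:"
    echo "$bad"
else
    echo "All collection ids look valid."
fi
rm -f "$COLLECTIONS"
```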
 
 
 
Save the file and exit.
 
 
 
==Do the export==
 
Run the script as follows:
 
/root/scripts/collection-export
 
 
 
To watch the log file as the exports happen, type the following after opening another terminal:
 
tail -f /home/exports/export.log
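
Once the script has finished, the log can be summarised rather than read line by line. This sketch assumes the two message formats written by the export script above and uses a temporary sample file in place of /home/exports/export.log.

```shell
#!/bin/bash
# Sketch: summarise an export log. The sample lines mimic the messages
# written by the export script; LOG is a temporary stand-in for export.log.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
Exporting collection no: 12
Exporting collection no: 34
Collection:23 already exported.
EOF

exported=$(grep -c '^Exporting collection no:' "$LOG")
skipped=$(grep -c 'already exported' "$LOG")
echo "Exported: $exported, skipped: $skipped"
rm -f "$LOG"
```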
 
 
 
==Create an exports tarball archive==
 
After the exports are complete, make a tarball backup as follows:
 
cd /home
 
 
 
tar -czvf exports.tgz exports/
 
Now copy the tarball (exports.tgz) to a portable disk or remote backup device.
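
It is worth confirming that the tarball really contains the collection folders, and recording a checksum, before it leaves the old server. The following is a self-contained sketch under assumptions: a throwaway directory stands in for /home, the two collection folders are made up, and sha256sum is assumed to be installed (it is part of coreutils on Ubuntu).

```shell
#!/bin/bash
# Sketch: build a tarball, confirm it lists the expected folders, and record a checksum.
# WORK is a throwaway directory standing in for /home.
WORK=$(mktemp -d)
mkdir -p "$WORK/exports/12" "$WORK/exports/34"
echo "sample" > "$WORK/exports/12/contents"

cd "$WORK" || exit 1
tar -czf exports.tgz exports/

# List the top-level collection folders stored in the archive.
folders=$(tar -tzf exports.tgz | awk -F/ 'NF >= 2 && $2 != "" { print $2 }' | sort -u)
echo "$folders"

# Record a checksum, then verify the archive against it.
sha256sum exports.tgz > exports.tgz.sha256
sha256sum -c exports.tgz.sha256

# Clean up the throwaway directory.
cd / && rm -rf "$WORK"
```

Copying the .sha256 file along with the tarball lets you run the same sha256sum -c check on the new server after the transfer.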
 
 
 
=Import on the new server=
 
==Copy the exports tarball archive to the new server==
 
Copy the tarball (exports.tgz) of items exported on the old machine to the home folder on the new machine. The following command copies it from one Ubuntu machine to another. Open a terminal on the machine where the items were exported and type:
 
scp dspace@%old-host%:/home/exports.tgz dspace@%new-host%:/home/
 
Replace %old-host% with the host name of your old server and %new-host% with the host name of your new machine. It is assumed that the "dspace" user account exists on both machines.
 
 
 
If the above method does not work, then do the transfer manually using a portable disk or any other method that works for you. You could even try to use '''WinSCP'''.
 
==Create the communities and collections on the new server==
 
Now manually create, on your new machine, the same communities and collections that were on the old machine. Take note of the collection IDs created on the new machine (see the screenshot and instructions above) and map them to the old collection IDs from the old machine in a spreadsheet, for example:
 
old-collection-id = new-collection-id
 
Print a copy of the spreadsheet to use for the following procedures.
 
 
 
Also save a list of submitter email addresses (ePersons) per new collection. You might think of defining one submitter email address (ePerson) to use for the imports only.
 
 
 
==Create the folders==
 
Login to the new server and become the root user as follows:
 
sudo -i
 
Make an imports folder as follows:
 
mkdir /home/imports
 
Make a script folder as follows:
 
mkdir /root/scripts
 
 
 
==Extract the exports tarball archive on the new server==
 
Run the following command to untar the tarball contents into the new server's '''/home/exports''' folder.
 
tar -C /home -xzvf /home/exports.tgz
 
Check that the '''/home/exports''' folder contains the items exported from the old server by typing the following:
 
ls /home/exports
 
A list of folders should be displayed, one per exported collection, named by collection ID. If they are not there, check what went wrong with the copy and extraction of the exports tarball archive.
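
With many collections, comparing the folder listing by eye is error prone. As a sketch only, the hypothetical helper missing_collections below takes a folder and a list of expected IDs and prints the IDs that have no folder. The temporary directory and the IDs are example stand-ins for /home/exports and your own list.

```shell
#!/bin/bash
# Sketch: report expected collection folders that are missing after extraction.
# EXPORTS and the id list are example stand-ins for /home/exports and your own ids.
EXPORTS=$(mktemp -d)
mkdir "$EXPORTS/12" "$EXPORTS/34"

# Print each id from the arguments that has no folder under the given directory.
missing_collections() {
    local dir=$1 id
    shift
    for id in "$@"; do
        [ -d "$dir/$id" ] || echo "$id"
    done
}

missing=$(missing_collections "$EXPORTS" 12 34 23)
echo "Missing collections: ${missing:-none}"
rm -rf "$EXPORTS"
```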
 
 
 
==Create the import script==
 
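Make '''/home/imports''' a symlink to the extracted '''/home/exports''' folder. If an empty '''/home/imports''' folder already exists from the step above, remove it first, otherwise the link is created inside that folder.

rmdir /home/imports

cd /home

ln -s exports imports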
 
Open the import script for editing as follows:
 
nano /root/scripts/collection-import
 
Copy and paste the following into the editor:
 
<pre>
#!/bin/bash

# Load the settings (HANDLE, IMPORT, DSHOME, PERSON) from the config file
# that sits next to this script, so it can be run from any directory.
cd "$(dirname "$0")" || exit 1
. ./config

# Check that the imports folder exists.
if [ ! -d "$IMPORT" ]; then
    clear
    echo "Have you extracted the collections to the correct folder?"
    exit 1
fi

# Go through the collections. Each line of the collections file is "old-id,new-id".
for i in $(cat collections); do
    ECID=${i%%,*}    # exported (old) collection id
    ICID=${i##*,}    # import (new) collection id
    MAPFILE="$IMPORT/$ECID/$ECID-to-$ICID-items-map-list.csv"
    echo "Importing old collection:$ECID to new collection:$ICID" >> "$IMPORT/import.log"
    # The map file only exists once an import has been run for this collection.
    if [ ! -e "$MAPFILE" ]; then
        "$DSHOME/bin/import" -a -w -e "$PERSON" -s "$IMPORT/$ECID" -c "$HANDLE/$ICID" -m "$MAPFILE"
    else
        echo "$IMPORT/$ECID collection already imported" >> "$IMPORT/import.log"
        clear
    fi
done
</pre>
 
 
 
'''Please note:<font color="red">
 
#Check the DC metadata fields in the destination server and ensure that any customisation from the source server is also updated on the destination server.
 
#The -w means add to the collections workflow. This is for movement of assets not migration.
 
</font>'''
 
 
 
Save the file and exit the editor.
 
 
 
Now make the script executable as follows:
 
chmod 0775 /root/scripts/collection-import
 
 
 
==Create import config file==
 
Make a config file as follows:
 
nano /root/scripts/config
 
Copy and paste the following into the editor.
 
<pre>
HANDLE="123456789"
IMPORT="/home/imports"
DSHOME="/home/dspace"
PERSON="submitter@myrepo.ac.za"
</pre>
 
Modify the above to suit your new site and then save and exit the file.
 
 
 
==Create import collections file==
 
Make a collections file as follows:
 
nano /root/scripts/collections
 
Copy and paste the following into the editor.
 
<pre>
12,34
34,43
23,56
56,78
21,99
</pre>
 
'''Change the above to reflect the collection handle IDs you want to import, using the spreadsheet that maps old collection IDs to new collection IDs. The first item on each line is the exported (old) collection ID and the second item, after the comma, is the new import collection ID.'''
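
A malformed line in this file would send items to the wrong collection or make the import fail part way through. The sketch below assumes the "old-id,new-id" format described above, with numeric IDs. The hypothetical helper check_map echoes each valid pair and flags anything else, and the sample data is made up.

```shell
#!/bin/bash
# Sketch: read "old-id,new-id" pairs and reject malformed lines.
# The sample data is made up for illustration.
MAP=$(mktemp)
printf '12,34\n34,43\n23;56\n' > "$MAP"

# Print "old -> new" for valid lines and a warning for anything else.
check_map() {
    local old new
    while IFS=, read -r old new; do
        if [[ $old =~ ^[0-9]+$ && $new =~ ^[0-9]+$ ]]; then
            echo "$old -> $new"
        else
            echo "Bad line: $old${new:+,$new}"
        fi
    done < "$1"
}

check_map "$MAP"
rm -f "$MAP"
```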
 
 
 
Save the file and exit.
 
 
 
==Do the import==
 
Run the script as follows:
 
/root/scripts/collection-import
 
 
 
To watch the log file as the imports happen, type the following after opening another terminal:
 
tail -f /home/imports/import.log
 
 
 
=Test the exports for correctness if imports fail=
 
The imports will fail if the exports are not correct. Use the following to set up a script that checks the exports for correctness on the new server.
 
==Create export check script==
 
Open the export check script for editing as follows:
 
nano /root/scripts/collection-check
 
Copy and paste the following into the open editor:

<pre>
#!/bin/bash

# Load the settings (HANDLE, IMPORT, DSHOME, PERSON) from the config file
# that sits next to this script, so it can be run from any directory.
cd "$(dirname "$0")" || exit 1
. ./config

LOG=/root/test.log

# Start with an empty log.
: > "$LOG"

# Go through the collections. Each line of the collections file is "old-id,new-id".
for i in $(cat collections); do
    ECID=${i%%,*}
    echo "Checking items in collection:$ECID"

    # Every sub-folder of the collection folder is one exported item.
    for ITEMDIR in "$IMPORT/$ECID"/*/; do
        [ -d "$ITEMDIR" ] || continue
        ITM=$(basename "$ITEMDIR")
        [ -e "$ITEMDIR/dublin_core.xml" ] || echo "Item: $ITM, No dublin core." >> "$LOG"
        [ -e "$ITEMDIR/handle" ] || echo "Item: $ITM, No handle." >> "$LOG"
        [ -e "$ITEMDIR/contents" ] || echo "Item: $ITM, No contents file." >> "$LOG"
        [ -e "$ITEMDIR/license.txt" ] || echo "Item: $ITM, No license." >> "$LOG"
        ls "$ITEMDIR" | grep -q pdf || echo "Item: $ITM, No PDF file." >> "$LOG"
    done
done

# Email the report once all collections have been checked.
mail -s "Exported collections test log" "$PERSON" < "$LOG"
echo "Check complete and report emailed to $PERSON"
</pre>
 
Save the file and exit the editor.
 
 
 
Now make the script executable as follows:
 
chmod 0775 /root/scripts/collection-check
 
 
 
==Do the exports check==
 
Type the following to do the check:
 
/root/scripts/collection-check
 
Now check your email. You should have a message containing the test log.

If there are errors, fix them on the old server. Delete the exports on the old server and do a new export after everything is fixed. Basically, start the export and import over again. At least you now have the scripts to do so.
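
For a long test log, a count per problem type shows quickly whether the errors are isolated or systematic. This is a minimal sketch that assumes the "Item: <name>, <problem>" line format written by the check script above. The sample log is made up and stands in for /root/test.log.

```shell
#!/bin/bash
# Sketch: count the problems in a check log by type.
# The sample lines follow the "Item: <name>, <problem>" format of the check script.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
Item: 100, No handle.
Item: 101, No handle.
Item: 101, No PDF file.
EOF

# Split each line on ", " and tally the problem text that follows it.
summary=$(awk -F', ' '{ count[$2]++ } END { for (p in count) print count[p], p }' "$LOG" | sort -rn)
echo "$summary"
rm -f "$LOG"
```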
 
 
 
=Help Info=
 
==/home/dspace/bin/export==
 
It takes the following parameters:
 
<pre>
-d,--dest     destination collection directory
-h,--help     help
-i,--id       collection/item ID
-n,--number   sequence number to begin exporting items with
-m,--migrate  strips out any per-repository-specifics
-t,--type     type: COLLECTION or ITEM
</pre>
 
 
 
==/home/dspace/bin/import==
 
 
 
It takes the following parameters:
 
<pre>
-s,--source      source collection directory
-c,--collection  destination collection handle
-e,--eperson     email of authorised person
-m,--mapfile     collection/item map file name
-a,--add         add items
-d,--delete      delete items listed in mapfile
-r,--replace     replace items listed in mapfile
-w,--workflow    send submission through collection's workflow
-p,--template    apply template
-t,--test        test run - do not actually import items
-R,--resume      resume a failed import (add only)
-h,--help        help
</pre>
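
The -t option listed above suggests that an import can be rehearsed before anything is added. The sketch below only prints such a command for one "old-id,new-id" pair and runs nothing. The helper import_test_command and the map file path under /tmp are hypothetical, the values stand in for the config file, and whether -t behaves as expected together with -a on your DSpace version is an assumption to verify with the -h option first.

```shell
#!/bin/bash
# Sketch: print a test-run import command for one "old-id,new-id" pair.
# Nothing is executed; all values are example assumptions.
HANDLE="123456789"
IMPORT="/home/imports"
DSHOME="/home/dspace"
PERSON="submitter@myrepo.ac.za"

# Build the command line, using a separate map file (hypothetical path)
# so a rehearsal cannot be mistaken for a completed import.
import_test_command() {
    local ecid=${1%%,*} icid=${1##*,}
    echo "$DSHOME/bin/import -a -t -e $PERSON -s $IMPORT/$ecid -c $HANDLE/$icid -m /tmp/$ecid-to-$icid-test-map.csv"
}

import_test_command "12,34"
```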
 
 
 
=References=
 
* http://wiki.dspace.org/index.php/Batch_Metadata_Editing_Prototype
 
* http://services.lib.sun.ac.za/files/dspace/ingest-export.odp
 
* http://services.lib.sun.ac.za/files/dspace/Module%20-%20Import%20and%20Export.odt
 
* http://cadair.aber.ac.uk/dspace/handle/2160/627
 
* http://www.dspace.org/index.php/Architecture/technology/system-docs/storage.html
 
* http://cavlec.yarinareth.net/2008/01/07/the-dspace-batch-importer/
 
* http://tds.terkko.helsinki.fi/utils/
 
 
 
=Command Line Help=
 
'''<font color="red">Go to: http://www.ubuntu.sun.ac.za/wiki/index.php/SelfHelp for more help about the command line programs used in this procedure.</font>'''
 
 
 
'''[[SUNScholar/IR|Back to IR Help]]'''
 

Latest revision as of 15:57, 29 May 2016