SUNScholar/Metadata/Batch Edit/SAFBuilder

SAFBuilder guide to convert CSV files to SimpleArchiveFormat in order to import to DSpace (2018)

SAFBuilder converts a CSV metadata file and its accompanying bitstreams into SimpleArchiveFormat, which lets you bulk upload items to DSpace.

The instructions given on the repository page are sufficient for use on Linux, but the Windows procedure is more involved and not well documented. The following instructions provide the steps needed to get SAFBuilder running on Windows.

Using this guide

This guide refers to values that may change over time, such as the version number of the JDK. Because that version number appears in several places, it is written as the word 'version' in angle brackets, i.e. <version>. Replace every value shown in angle brackets with the actual value.

Install the JDK

Note: JDK vs JRE

Note that the Java Development Kit (JDK) is required, and includes the Java Runtime Environment (JRE). The JRE on its own is not sufficient to compile SAFBuilder.

To start, download the latest JDK (not JRE) from the Oracle website. This guide was tested using jdk-10.0.1_windows-x64_bin.exe. Install the JDK and, once the installation has completed, copy the C:\Program Files\Java\jdk-<version> directory to C:\jdk-<version>. Some tools do not handle spaces in paths well, so keeping everything directly under C:\ reduces the chance of running into that problem.

Install Maven

Next, download Maven 3 from your local Apache mirror. Choose the latest binary archive; its filename contains -bin. This guide was tested using apache-maven-3.5.3-bin.zip. Extract the zip and move its contents to C:\maven, so that the bin directory ends up at C:\maven\bin.

Set the required environment variables

Note: Setting the environment variables

To learn how to set environment variables on Windows, see the following guide.

Once Java and Maven have been installed successfully, create the following system-wide environment variables:

  • JAVA_HOME: C:\jdk-<version>
  • MAVEN_HOME: C:\maven
  • M2_HOME: C:\maven

Additionally, add the Maven bin directory to the system PATH:

C:\maven\bin

To test that these variables have been set correctly, open a new command prompt and run:

mvn --version
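
If the version check fails, it can be hard to tell which setting is at fault. The following is a minimal sketch, assuming Python 3 is available on the machine (it is not otherwise needed by SAFBuilder). It checks the three variables named above and whether the `mvn` launcher can be found on PATH. The function name `check_environment` is illustrative only.

```python
import os
import shutil

# The variables this guide asks you to create.
REQUIRED_VARS = ("JAVA_HOME", "MAVEN_HOME", "M2_HOME")


def check_environment(env=None, which=shutil.which):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
    # shutil.which searches PATH the same way the command prompt does.
    if which("mvn") is None:
        problems.append("mvn was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Environment looks ready for SAFBuilder.")
```

Run it from a newly opened command prompt, since prompts that were already open do not pick up changed environment variables.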

Install SAFBuilder

Download the SAFBuilder source code and extract the contents to C:\SAFBuilder.

Edit the contents of safbuilder.bat to be as follows:

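@echo off
rem Clean and build SAFBuilder, skipping its tests.
echo Cleaning build directory...
call mvn -DskipTests=true clean package
rem %* forwards every argument given to this script, so optional flags are not dropped.
echo Done, compiling and running SAFBuilder...
call mvn exec:java -Dexec.mainClass="safbuilder.BatchProcess" -Dexec.args="%*"
echo Done.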

Run SAFBuilder

Note: Formatting your csv file

The metadata CSV must have 'filename' (case sensitive) as its first column, followed by any other metadata columns you wish to include. For more information, see the SAFBuilder documentation.
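
The first-column rule is easy to break when a spreadsheet program capitalises or reorders headers. As a rough sketch, assuming Python 3 is available, the header can be checked before running SAFBuilder. The function name `check_header` and the sample columns `dc.title` and `dc.contributor.author` are illustrative assumptions, not a list of required fields.

```python
import csv
import io


def check_header(stream):
    """Return the header row if its first column is exactly 'filename'."""
    header = next(csv.reader(stream), None)
    if not header:
        raise ValueError("the CSV has no header row")
    # The comparison is deliberately case sensitive, as the guide requires.
    if header[0] != "filename":
        raise ValueError(f"first column is {header[0]!r}, expected 'filename'")
    return header


if __name__ == "__main__":
    # Hypothetical sample data; the metadata column names are assumptions.
    sample = io.StringIO(
        "filename,dc.title,dc.contributor.author\n"
        'thesis.pdf,An Example Thesis,"Doe, Jane"\n'
    )
    print(check_header(sample))
    # prints ['filename', 'dc.title', 'dc.contributor.author']
```

To check a real file, pass an open file object instead of the sample, for example `open("metadata.csv", newline="", encoding="utf-8")`.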

Now open a command prompt in C:\SAFBuilder and run:

safbuilder.bat -c <filename.csv>

Ensure that the CSV file is in the same directory as your bitstreams. SAFBuilder will create a SimpleArchiveFormat directory, which can be zipped and uploaded to DSpace. You can also have SAFBuilder create the zip file for you by adding the -z flag.
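
Before running SAFBuilder on a large batch, it may help to confirm that every file named in the CSV really sits beside it. The sketch below assumes Python 3 and a UTF-8 encoded CSV, and it assumes that each 'filename' cell names a single file; adjust it if your CSV lists several files per cell. The function name `missing_bitstreams` is illustrative only.

```python
import csv
import os
import sys


def missing_bitstreams(csv_path):
    """Return names from the 'filename' column that are not present beside the CSV."""
    folder = os.path.dirname(os.path.abspath(csv_path))
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            name = (row.get("filename") or "").strip()
            # Empty cells are skipped; only named files are checked.
            if name and not os.path.isfile(os.path.join(folder, name)):
                missing.append(name)
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    for name in missing_bitstreams(sys.argv[1]):
        print("Missing bitstream:", name)
```

Saved under a name of your choosing, the script takes the CSV path as its only argument and prints nothing when all files are found.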
