SAFBuilder guide to convert CSV files to SimpleArchiveFormat in order to import to DSpace (2018)
SAFBuilder allows you to bulk upload items to DSpace. See the documentation below for more details.
The instructions given on the repository page are sufficient for use on Linux, but the Windows procedure is more involved and not well documented. The following instructions provide the exact steps needed to get SAFBuilder running on Windows.
Install the JDK
To start, download the latest JDK (not JRE) from the Oracle website. This guide was tested using jdk-10.0.1_windows-x64_bin.exe. Install the JDK and, once the installation has completed, copy the C:\Program Files\Java\jdk-<version> directory to C:\jdk-<version>. Some tools do not handle spaces in paths well, so keeping everything directly under C:\ reduces the chance of running into that problem.
Note that the Java Development Kit (JDK) is required, and includes the Java Runtime Environment (JRE). The JRE on its own is not sufficient to compile SAFBuilder.
Now download Maven 3 from your local Apache mirror. Choose the latest binary version; it will contain -bin in the filename. This guide was tested using apache-maven-3.5.3-bin.zip. Extract the zip and move the contents to C:\maven, so that the Maven bin directory ends up at C:\maven\bin.
Set the required environment variables
To learn how to set environment variables on Windows, see the following guide.
Once Java and Maven have been installed successfully, create the following system-wide environment variables:
- JAVA_HOME: C:\jdk-<version>
- MAVEN_HOME: C:\maven
- M2_HOME: C:\maven
Additionally, add the Maven bin directory (%MAVEN_HOME%\bin, that is, C:\maven\bin) to the system PATH.
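To test that these variables have been set correctly, open a new command prompt and run, for example, java -version and mvn -version. Both should print version information rather than an error.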
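Separately from the command-prompt check, the same settings can be inspected with a short script. The following is only a sketch, assuming Python is available on the machine (SAFBuilder itself does not need it); the function name check_build_environment is hypothetical. It looks only at the three variables named above and at whether the Maven bin directory appears on PATH.

```python
import os
from pathlib import Path


def check_build_environment(env=None):
    """Return a list of problems found; an empty list means the setup looks fine."""
    env = os.environ if env is None else env
    problems = []

    # Each variable from the guide should be set and point at a directory
    # that contains a bin subdirectory.
    for name in ("JAVA_HOME", "MAVEN_HOME", "M2_HOME"):
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif not (Path(value) / "bin").is_dir():
            problems.append(f"{name} points to {value}, which has no bin directory")

    # The Maven bin directory should be one of the PATH entries.
    maven_home = env.get("MAVEN_HOME", "")
    if maven_home:
        maven_bin = os.path.normcase(os.path.normpath(os.path.join(maven_home, "bin")))
        path_entries = [
            os.path.normcase(os.path.normpath(entry))
            for entry in env.get("PATH", "").split(os.pathsep)
            if entry
        ]
        if maven_bin not in path_entries:
            problems.append("the Maven bin directory is not on PATH")

    return problems
```

Calling check_build_environment() with no arguments reads the current process environment, so it should be run from a command prompt opened after the variables were saved.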
Download the SAFBuilder source code and extract the contents to C:\SAFBuilder.
Edit the contents of safbuilder.bat to be as follows:
@echo off
echo "Cleaning build directory..."
call mvn -DskipTests=true clean package
echo "Done, compiling and running SAFBuilder..."
call mvn exec:java -Dexec.mainClass="safbuilder.BatchProcess" -Dexec.args="%1 %2"
echo "Done. "
The metadata CSV must have 'filename' (case-sensitive) as its first column, followed by all the other metadata columns you wish to include. For more information, see the SAFBuilder documentation.
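The first-column rule is easy to get wrong when a spreadsheet program capitalises or reorders headers, so it can be worth checking before a run. A minimal sketch, assuming Python is available and the CSV is UTF-8 encoded (with or without a byte order mark); the function name first_column_is_filename is hypothetical and not part of SAFBuilder.

```python
import csv


def first_column_is_filename(csv_path):
    """Return True only if the first header cell is exactly 'filename'."""
    # utf-8-sig strips the byte order mark that some spreadsheet exports add,
    # which would otherwise become part of the first header cell.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    # The comparison is deliberately case-sensitive, as the guide requires.
    return bool(header) and header[0] == "filename"
```

The check is strict on purpose: a header such as Filename, or a file where filename is the second column, returns False.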
Now open a command prompt in C:\SAFBuilder (the directory containing safbuilder.bat) and run:
safbuilder.bat -c <filename.csv>
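Ensure that the CSV file is in the same directory as your bitstreams. SAFBuilder will create a SimpleArchiveFormat directory, which can be zipped and uploaded to DSpace. You can also ask SAFBuilder to create the zip for you by adding the -z flag. Note that safbuilder.bat as written above forwards only its first two arguments (%1 and %2), so it would need to forward a third argument for -z to reach SAFBuilder.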
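Because the bitstreams have to sit beside the CSV, a quick pre-flight check can catch misspelled or missing files before a run. The sketch below assumes Python is available, that the CSV is UTF-8 encoded, and that each filename cell holds a single file name; cells listing several files would need to be split first. The function name missing_bitstreams is hypothetical.

```python
import csv
from pathlib import Path


def missing_bitstreams(csv_path):
    """List the names in the 'filename' column that have no file next to the CSV."""
    csv_path = Path(csv_path)
    missing = []
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            name = (row.get("filename") or "").strip()
            # Bitstreams are expected in the same directory as the CSV itself.
            if name and not (csv_path.parent / name).is_file():
                missing.append(name)
    return missing
```

An empty list means every file named in the CSV was found in the CSV's directory.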