SUNScholar/Media Filters/Text Extraction
Check the following settings in the "dspace.cfg" file:
#Custom settings for PDFFilter
# If true, all PDF extractions are written to temp files as they are indexed...this
# is slower, but helps ensure that PDFBox software DSpace uses doesn't eat up
# all your memory
#pdffilter.largepdfs = true
# If true, PDFs which still result in an Out of Memory error from PDFBox
# are skipped over...these problematic PDFs will never be indexed until
# memory usage can be decreased in the PDFBox software
pdffilter.skiponmemoryexception = true
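A line beginning with "#" is a comment, so a setting written that way is disabled. As a quick way to see which of the PDFFilter settings above are actually active, the following is a minimal sketch in Python. It assumes plain "key = value" lines only; the function name active_settings and the inline sample text are illustrative and not part of DSpace.

```python
def active_settings(cfg_text, prefix="pdffilter."):
    """Return uncommented `key = value` pairs whose key starts with prefix."""
    settings = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments; a leading '#' disables a setting.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip().startswith(prefix):
            settings[key.strip()] = value.strip()
    return settings


# Sample text mirroring the snippet above (assumption: simple properties syntax).
sample = """\
#pdffilter.largepdfs = true
pdffilter.skiponmemoryexception = true
"""

print(active_settings(sample))
```

To check a real installation, the contents of the site's own "dspace.cfg" can be read from disk and passed in place of the sample text.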
Enable daily media filter jobs. See link below.