SUNScholar/Repository Website Metrics

From Libopedia
Revision as of 09:19, 15 February 2013 by Hgibson (talk | contribs) (→‎Robots)

Register with the major harvesters

Suggestions from the webometrics ranking editors

See: http://repositories.webometrics.info/en/Best_Practices

All scientific output, formal or informal, draft or final, published or unpublished, should be available from a single website. The institutional repository is an important asset of the institution as a whole, not only of the library. We recommend the following syntax for the institutional repository web address:

http://repository.university.country
  • Avoid changing the institutional domain: it confuses users and has a devastating effect on visibility values.
  • Avoid cumbersome navigation menus based on Flash, Java or JavaScript, which can block robot access.
  • For scientists, it is important that the link to the full text is easily citable.
  • Very long URLs should therefore be avoided in all situations.

Good examples of repositories with friendly persistent URLs as per webometrics best practices

DSpace Google Setup

Google Scholar

Google Analytics

Open your main DSpace config file and look for the xmlui.google.analytics.key setting. Enter your Google Analytics key.

Rebuild the DSpace webapps using the custom rebuild script.

Google Sitemap

First, edit the DSpace config file and set up sitemaps as follows.

#### Sitemap settings #####
# the directory where the generated sitemaps are stored
sitemap.dir = ${dspace.dir}/sitemaps

#
# Comma-separated list of search engine URLs to 'ping' when a new Sitemap has
# been created.  Include everything except the Sitemap URL itself (which will
# be URL-encoded and appended to form the actual URL 'pinged').
#
#sitemap.engineurls = http://www.google.com/webmasters/sitemaps/ping?sitemap=

# Add this to the above parameter if you have an application ID with Yahoo
# (Replace REPLACE_ME with your application ID)
# http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=REPLACE_ME&url=
#
# No known Sitemap 'ping' URL for MSN/Live search

Once you've enabled your sitemaps, they will be accessible at the following URLs:

  • HTML Sitemaps: [dspace.url]/htmlmap
  • Google (XML) Sitemaps: [dspace.url]/sitemap

For example, the SUNScholar sitemaps can be viewed at the links below.

http://scholar.sun.ac.za/htmlmap
http://scholar.sun.ac.za/sitemap
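
The config comments above say that the sitemap URL is URL-encoded and appended to each search engine URL to form the address that gets pinged. A minimal Python sketch of that step is shown here. It is an illustration only, not DSpace's own code: the function names are hypothetical, the base URL is the SUNScholar example, and the script only builds the URLs without sending any request.

```python
from urllib.parse import quote

# Assumed value of the 'dspace.url' setting; replace with your own.
DSPACE_URL = "http://scholar.sun.ac.za"

# Ping prefix copied from the (commented-out) sitemap.engineurls example.
GOOGLE_PING_PREFIX = "http://www.google.com/webmasters/sitemaps/ping?sitemap="


def sitemap_urls(dspace_url):
    """Return the HTML and XML sitemap URLs for a DSpace base URL."""
    base = dspace_url.rstrip("/")
    return {"html": base + "/htmlmap", "xml": base + "/sitemap"}


def ping_url(engine_url, sitemap_url):
    """Append the fully URL-encoded sitemap URL to a ping prefix."""
    # safe="" makes quote() encode ':' and '/' as well.
    return engine_url + quote(sitemap_url, safe="")


urls = sitemap_urls(DSPACE_URL)
google_ping = ping_url(GOOGLE_PING_PREFIX, urls["xml"])
print(urls["html"])
print(urls["xml"])
print(google_ping)
```

Printing the generated URLs is a quick way to confirm that the dspace.url value and the ping prefix combine as expected before enabling sitemap.engineurls.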

Robots

See below for an example robots.txt file.

User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter
 
# This should be the FULL URL to your HTML Sitemap. 
# Make sure to replace "[dspace.url]" with the value of your 'dspace.url' setting in your dspace.cfg file.
Sitemap: [dspace.url]/htmlmap
 
# If you have configured DSpace (Solr-based) Statistics to be publicly accessible,
# then you likely do not want this content to be indexed
# Disallow: /displaystats
 
# Uncomment the following line ONLY if sitemaps.org or HTML sitemaps are used
# and you have verified that your site is being indexed correctly.
# Disallow: /browse
 
# You also may wish to disallow access to the following paths, in order
# to stop web spiders from accessing user-based content:
# Disallow: /advanced-search
# Disallow: /contact
# Disallow: /feedback
# Disallow: /forgot
# Disallow: /login
# Disallow: /register
# Disallow: /search
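
Before deploying a robots.txt like the one above, it can help to check which paths it actually blocks. The sketch below uses Python's standard urllib.robotparser on a shortened copy of the active rules. The SUNScholar address is used as an assumed dspace.url, the item path is a made-up example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the active (uncommented) rules from the example file.
# The Sitemap URL assumes dspace.url = http://scholar.sun.ac.za
ROBOTS_TXT = """\
User-agent: *
# Disable access to Discovery search and filters
Disallow: /discover
Disallow: /search-filter
Sitemap: http://scholar.sun.ac.za/htmlmap
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths to test against the rules.
paths = ["/discover", "/search-filter", "/browse", "/handle/10019.1/1"]
allowed = {path: parser.can_fetch("*", path) for path in paths}

for path, ok in allowed.items():
    print(path, "allowed" if ok else "blocked")

print(parser.site_maps())
```

Because the /browse rule is still commented out in the example file, the browse pages remain crawlable in this check.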

HTML Metadata

Ensure Item Metadata appears in the HTML HEAD.

If you have heavily customized your metadata fields away from Dublin Core, you can change the crosswalk that generates the item metadata elements in the HTML head by editing [dspace]/config/crosswalks/xhtml-head-item.properties.

If you have heavily customized your metadata fields, or wish to change the default mappings to the Highwire Press tags read by Google Scholar, edit [dspace]/config/crosswalks/google-metadata.properties.
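
One way to confirm that item metadata appears in the HTML head is to parse an item page and list its meta tags. The sketch below does this with Python's standard html.parser. The sample page, its values, and the names MetaCollector and extract_head_meta are made up for illustration; in practice you would feed in the saved HTML of one of your own item pages.

```python
from html.parser import HTMLParser

# Hypothetical item page; real pages will carry many more tags.
SAMPLE_PAGE = """
<html><head>
<meta name="DC.title" content="Example thesis" />
<meta name="citation_title" content="Example thesis" />
<meta name="citation_author" content="Doe, Jane" />
</head>
<body><meta name="ignored" content="outside head"></body></html>
"""


class MetaCollector(HTMLParser):
    """Collect name/content pairs of <meta> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                self.meta.setdefault(name, []).append(attributes.get("content"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def extract_head_meta(html):
    """Return a dict mapping meta tag names to lists of content values."""
    collector = MetaCollector()
    collector.feed(html)
    return collector.meta


head_meta = extract_head_meta(SAMPLE_PAGE)
for name, values in head_meta.items():
    print(name, values)
```

If the Dublin Core (DC.*) or citation_* names are missing from the output for a real item page, the crosswalk files mentioned above are the place to look.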

Directories

References
