--- /dev/null
+Running the sitemap generator
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The `sitemap_generator` script must be invoked with the following argument:
+
+* `--lib-hostname`: specifies the hostname for the catalog (for example,
+ `--lib-hostname https://catalog.example.com`); every generated URL starts
+ with this hostname
+
+The following arguments are useful for generating multiple sitemaps
+per Evergreen instance:
+
+* `--lib-shortname`: limits the list of record URLs to records that have
+ copies owned by the designated library or any of its children;
+* `--prefix`: provides a prefix for the sitemap index file names.
+
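As a rough illustration of how these arguments combine, the following sketch prints one `sitemap_generator` command per library instead of running it. The hostname, the shortnames `BR1` and `BR2`, and the `_` suffix on the prefix are placeholders rather than values from a real instance; the script path is the one used in the cron entry under Scheduling.

```shell
#!/bin/sh
# Placeholder values: replace with your own hostname and library shortnames.
SITEMAP_BIN="/openils/bin/sitemap_generator"   # path taken from the cron example
LIB_HOSTNAME="https://catalog.example.com"

# Print (do not run) one sitemap_generator command per library shortname.
# Each command gets its own prefix so the generated files do not collide.
sitemap_commands() {
    for lib in "$@"; do
        printf '%s --lib-hostname %s --lib-shortname %s --prefix %s_\n' \
            "$SITEMAP_BIN" "$LIB_HOSTNAME" "$lib" "$lib"
    done
}

sitemap_commands BR1 BR2
```

Once the printed commands look right for your site, they can be run by hand or added to a scheduled job.
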
+Other options enable you to override the OpenSRF configuration file and the
+database connection credentials, but the default settings are generally fine.
+
+Note that on very large Evergreen instances, sitemaps can consume hundreds of
+megabytes of disk space, so ensure that your Evergreen instance has enough room
+before running the script.
+
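One way to guard against running out of disk space is to check the free space before generating sitemaps. This is a minimal sketch, assuming a POSIX `df` and `awk`; the 500 MB threshold is an arbitrary figure chosen for illustration, not a documented requirement.

```shell
#!/bin/sh
# Minimum free space to require, in kilobytes (about 500 MB; an assumed figure).
MIN_FREE_KB=512000

# Print the free space, in kilobytes, of the filesystem holding directory $1.
# "df -Pk" gives the portable one-line-per-filesystem format in 1024-byte units;
# the fourth column of the second line is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if directory $1 has at least $2 kilobytes free.
has_room() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# Usage (the directory is the one from the cron example):
#   has_room /openils/var/web "$MIN_FREE_KB" || echo "not enough disk space" >&2
```
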
+Sitemap details
+~~~~~~~~~~~~~~~
+
+The sitemap generator script includes located URIs as well as copies
+ listed in the `asset.opac_visible_copies` materialized view, and checks
+ the children or ancestors of the requested libraries for holdings as well.
+
+Scheduling
+~~~~~~~~~~
+To enable search engines to maintain a fresh index of your bibliographic
+records, you may want to include the script in your cron jobs on a nightly or
+weekly basis.
+
+Sitemap files are generated in the same directory from which the script is
+invoked, so a cron entry will look something like:
+
+------------------------------------------------------------------------
+12 2 * * * cd /openils/var/web && /openils/bin/sitemap_generator --lib-hostname https://catalog.example.com
+------------------------------------------------------------------------
+
Reload the bib record summary in the web catalog and your new image will display.
-Sitemap generator
------------------
-A http://www.sitemaps.org[sitemap] directs search engines to the pages of
-interest in a web site so that the search engines can intelligently crawl
-your site. In the case of Evergreen, the primary pages of interest are the
-bibliographic record detail pages.
-
-The sitemap generator script creates sitemaps that adhere to the
-http://sitemaps.org specification, including:
-
-* limiting the number of URLs per sitemap file to no more than 50,000 URLs;
-* providing the date that the bibliographic record was last edited, so
- that once a search engine has crawled all of your sites' record detail pages,
- it only has to reindex those pages that are new or have changed since the last
- crawl;
-* generating a sitemap index file that points to each of the sitemap files.
-
-Running the sitemap generator
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The `sitemap_generator` script must be invoked with the following argument:
-
-* `--lib-hostname`: specifies the hostname for the catalog (for example,
- `--lib-hostname https://catalog.example.com`); all URLs will be generated
- appended to this hostname
-
-Therefore, the following arguments are useful for generating multiple sitemaps
-per Evergreen instance:
-
-* `--lib-shortname`: limit the list of record URLs to those which have copies
- owned by the designated library or any of its children;
-* `--prefix`: provides a prefix for the sitemap index file names
-
-Other options enable you to override the OpenSRF configuration file and the
-database connection credentials, but the default settings are generally fine.
-
-Note that on very large Evergreen instances, sitemaps can consume hundreds of
-megabytes of disk space, so ensure that your Evergreen instance has enough room
-before running the script.
-
-Scheduling
-~~~~~~~~~~
-To enable search engines to maintain a fresh index of your bibliographic
-records, you may want to include the script in your cron jobs on a nightly or
-weekly basis.
-
-Sitemap files are generated in the same directory from which the script is
-invoked, so a cron entry will look something like:
-
-------------------------------------------------------------------------
-12 2 * * * cd /openils/var/web && /openils/bin/sitemap_generator
-------------------------------------------------------------------------
-
-Troubleshooting TPAC errors
----------------------------
-
-If there is a problem such as a TT syntax error, it generally shows up as an
-ugly server failure page. If you check the Apache error logs, you will probably
-find some solid clues about the reason for the failure. For example, in the
-following example, the error message identifies the file in which the problem
-occurred as well as the relevant line numbers.
-
-Example error message in Apache error logs:
-
-----
-bash# grep "template error" /var/log/apache2/error_log
-[Tue Dec 06 02:12:09 2011] [warn] [client 127.0.0.1] egweb: template error:
- file error - parse error - opac/parts/record/summary.tt2 line 112-121:
- unexpected token (!=)\n [% last_cn = 0;\n FOR copy_info IN
- ctx.copies;\n callnum = copy_info.call_number_label;\n
-----
-
--- /dev/null
+Troubleshooting TPAC errors
+---------------------------
+
+If there is a problem such as a TT syntax error, it generally shows up as an
+ugly server failure page. If you check the Apache error logs, you will probably
+find some solid clues about the reason for the failure. In the following
+example, the error message identifies the file in which the problem
+occurred as well as the relevant line numbers.
+
+Example error message in Apache error logs:
+
+----
+bash# grep "template error" /var/log/apache2/error_log
+[Tue Dec 06 02:12:09 2011] [warn] [client 127.0.0.1] egweb: template error:
+ file error - parse error - opac/parts/record/summary.tt2 line 112-121:
+ unexpected token (!=)\n [% last_cn = 0;\n FOR copy_info IN
+ ctx.copies;\n callnum = copy_info.call_number_label;\n
+----
+
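To pull only the template file name and line range out of such messages, a small filter can help. This sketch assumes each error appears on a single log line in the same format as the example above; the function name `tt_parse_errors` is made up for illustration.

```shell
#!/bin/sh
# Read Apache error log lines on stdin and print "template-file line-range"
# for each Template Toolkit parse error found, one per line.
# Lines that do not match the expected format are silently skipped.
tt_parse_errors() {
    sed -n 's/.*parse error - \([^ ]*\) line \([0-9][0-9-]*\):.*/\1 \2/p'
}

# Usage (log path taken from the example above):
#   grep "template error" /var/log/apache2/error_log | tt_parse_errors
```

For the example message, the filter would report `opac/parts/record/summary.tt2` and the range `112-121`.
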
--- /dev/null
+Sitemap generator
+-----------------
+
+A http://www.sitemaps.org[sitemap] directs search engines to the pages of
+interest in a web site so that the search engines can intelligently crawl
+your site. In the case of Evergreen, the primary pages of interest are the
+bibliographic record detail pages.
+
+The sitemap generator script creates sitemaps that adhere to the
+http://sitemaps.org specification, including:
+
+* limiting the number of URLs per sitemap file to no more than 50,000 URLs;
+* providing the date that the bibliographic record was last edited, so
+ that once a search engine has crawled all of your site's record detail pages,
+ it only has to reindex those pages that are new or have changed since the last
+ crawl;
+* generating a sitemap index file that points to each of the sitemap files.
+
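To make the 50,000-URL limit concrete, the sketch below splits a plain list of URLs into chunks and computes how many sitemap files a given number of URLs needs. It is not how `sitemap_generator` itself is implemented; the input file and the chunk naming are hypothetical.

```shell
#!/bin/sh
MAX_URLS=50000   # per-file limit from the sitemaps.org specification

# Split the URL list in file $1 (one URL per line) into files named
# ${2}aa, ${2}ab, ..., each holding at most MAX_URLS lines.
split_urls() {
    split -l "$MAX_URLS" "$1" "$2"
}

# Print the number of sitemap files needed for $1 URLs (ceiling division).
sitemap_file_count() {
    echo $(( ($1 + MAX_URLS - 1) / MAX_URLS ))
}

# Usage (urls.txt is a hypothetical input file):
#   split_urls urls.txt chunk_
#   sitemap_file_count 120000
```

Each chunk would then correspond to one sitemap file, with the sitemap index file listing all of them.
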
--- /dev/null
+Library visibility on the Web
+-----------------------------
+
+Introduction
+~~~~~~~~~~~~
+
+Evergreen follows a number of best practices to
+make library data integrate with the rest of the
+Web. Evergreen's public catalog pages are
+designed so that search engines can easily extract
+meaningful information about your library and
+collections. Evergreen is also preparing for an
+eventual shift toward linked open bibliographic
+data.
+
+Catalog data in search engines
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each record in the catalog is displayed to search
+engines using http://schema.org[schema.org] microdata.
+
+[IMPORTANT]
+Make sure your system administrator has not added
+a restrictive robots.txt file to your server.
+Such files limit what search engines may crawl and,
+at their most restrictive, prevent search engines
+from indexing your site at all.
+
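A coarse way to spot the most restrictive case is to look for a rule that disallows the whole site. This sketch only detects a bare `Disallow: /` line; it does not interpret user-agent groups or more specific rules, and the function name is made up for illustration.

```shell
#!/bin/sh
# Read a robots.txt file on stdin; succeed if any line disallows the
# entire site ("Disallow: /"), ignoring letter case and extra whitespace.
blocks_everything() {
    grep -Eiq '^[[:space:]]*disallow:[[:space:]]*/[[:space:]]*$'
}

# Usage (hostname is a placeholder):
#   curl -s https://catalog.example.com/robots.txt | blocks_everything \
#       && echo "robots.txt blocks all crawling"
```
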
+Details of the schema.org mapping
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ * Each item is listed as a
+ http://schema.org/Offer[schema:Offer], which is
+ the same category that an online bookseller might
+ use to describe an item for sale. These Offers
+ are always listed with a price of $0.00.
+ * Subject headings are exposed as
+ http://schema.org/about[schema:about]
+ properties.
+ * Electronic resources are assigned a
+ http://schema.org/url[schema:url]
+ property, and any notes or link text
+ are assigned a
+ http://schema.org/description[schema:description]
+ property.
+ * Given a Library of Congress relator code for
+ 1xx and 7xx fields, Evergreen surfaces the URL
+ for that relator code along with the
+ http://schema.org/contributor[schema:contributor]
+ property to give machines a better chance
+ of understanding how the person or organization
+ actually contributed to this work.
+ * Linking out to related records:
+ ** Given an LCCN (010 field), Evergreen links to
+ the corresponding Library of Congress record
+ using http://schema.org/sameAs[schema:sameAs].
+ ** Given an OCLC number (035 field, subfield `a`
+ beginning with `(OCoLC)`), Evergreen links to
+ the corresponding WorldCat record using
+ http://schema.org/sameAs[schema:sameAs].
+ ** Given a URI (024 field, subfield 2 = `'uri'`),
+ Evergreen links to the corresponding OCLC
+ Work Entity record using
+ http://schema.org/exampleOfWork[schema:exampleOfWork].
+
+
+Viewing microdata
+^^^^^^^^^^^^^^^^^
+You can learn more about how Evergreen publicizes
+these data by viewing them directly. The
+http://linter.structured-data.org[structured data linter]
+is a helpful tool for viewing microdata.
+
+. Using your favorite Web browser, navigate to a
+ record in your public catalog.
+. Copy the URL that displays in your browser's
+ address bar.
+. Go to http://linter.structured-data.org
+. Under the _Lint by URL_ tab, paste your URL
+ into the text box.
+. Click _Submit_.
+
+Other helpful features for search engines
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ * Titles of catalog pages follow a
+ "Page title - Library name" pattern to provide
+ specific titles in search engine results pages,
+ browser bookmarks, and browser tabs.
+ * Links that robots should not crawl, such as search
+ result links, are marked with the
+ https://support.google.com/webmasters/answer/96569?hl=en[@rel="nofollow"]
+ property.
+ * Catalog pages for record details and for library
+ descriptions express a
+ https://support.google.com/webmasters/answer/139066?hl=en[@rel="canonical"]
+ link to simplify the number of variations of page
+ URLs that could otherwise have been derived from
+ different search parameters.
+ * Catalog pages that do not exist return a proper
+ 404 "HTTP_NOT_FOUND" HTTP status code, and record
+ detail pages for records that have been deleted
+ now return a proper 410 "HTTP_GONE" HTTP status code.
+ * Record detail and library pages include
+ http://ogp.me/[Open Graph Protocol] markup.
+ * Each library has its own page at
+ _http://localhost/eg/opac/library/LIBRARY_SHORTNAME_
+ that provides machine-readable hours and contact
+ information.
+
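The status codes described above can be spot-checked from the command line. This is a minimal sketch, assuming `curl` is installed; the function names and the wording of the descriptions are illustrative only.

```shell
#!/bin/sh
# Print the HTTP status code returned for URL $1 (requires curl).
# The response body is discarded; only the numeric code is printed.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Describe what a status code means for a catalog page.
describe_status() {
    case "$1" in
        200) echo "page exists" ;;
        404) echo "page not found" ;;
        410) echo "record deleted" ;;
        *)   echo "other status" ;;
    esac
}

# Usage (PAGE_URL is a placeholder for a catalog page URL):
#   describe_status "$(http_status "$PAGE_URL")"
```
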
+SKOS support
+~~~~~~~~~~~~
+
+Some vocabularies used (or which could be used) for
+stock record attributes and coded value maps in Evergreen
+are published on the web using SKOS. The record
+attributes system can now associate Linked Data URIs
+with specific attribute values. In particular, seed data
+supplying URIs for the RDA Content Type, Media Type, and
+Carrier Type has been added.
+
+This is an experimental, "under-the-hood" feature that
+will be built upon in subsequent releases.
+
include::admin_initial_setup/designing_your_catalog.adoc[]
+include::opac/sitemap.adoc[]
+
+include::admin/sitemap_admin.adoc[]
+
+include::admin_initial_setup/troubleshooting_tpac.adoc[]
+
:leveloffset: 0
include::admin/audio_alerts.adoc[]
It is organized into Parts, Chapters, and Sections addressing key
aspects of the software.
-Copies of this guide can be accessed in PDF and HTML formats from http://docs.evergreen-ils.org/.
+Copies of this guide can be accessed in PDF and HTML formats from
+http://docs.evergreen-ils.org/.
include::opac/opensearch.adoc[]
+include::opac/visibility_on_the_web.adoc[]
+
:leveloffset: 0
+include::opac/sitemap.adoc[]
+
+See the Command Line System Administration Manual for details about
+running this script.
+
include::shared/attributions.adoc[]
include::shared/end_matter.adoc[]