Comprehensive searching options
Click here for a live demo of Veridian's search features.
Standard Veridian search options include the following:
- Faceted search, allowing search results to be refined by date, publication, content type, or any other
available metadata. Faceted search can also exclude long or short documents from a
search, which is useful when, for example, locating birth or death notices in newspapers.
- Complex boolean searching.
- It is possible to search for words occurring together either in the same page or within the same article
(article-level METS/ALTO is required for the latter).
- Restricting a search to a specific date range, publication, or content type (e.g. excluding advertisements
from a search).
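As an illustration only, the search options listed above can be sketched in a few lines of Python. The `Doc` record, the `search` and `facet_counts` functions, and the sample data are hypothetical and are not part of Veridian; a production system would query a real search index rather than scan documents one by one.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    title: str
    publication: str
    content_type: str  # e.g. "article" or "advertisement"
    published: date
    text: str


def matches(doc, all_of=(), none_of=()):
    """Boolean match: every term in all_of is present, no term in none_of is."""
    words = set(doc.text.lower().split())
    return all(t in words for t in all_of) and not any(t in words for t in none_of)


def search(docs, all_of=(), none_of=(), start=None, end=None,
           exclude_types=(), max_words=None):
    """Combine a boolean query with date, content-type, and length restrictions."""
    hits = []
    for d in docs:
        if not matches(d, all_of, none_of):
            continue
        if start and d.published < start:
            continue
        if end and d.published > end:
            continue
        if d.content_type in exclude_types:
            continue
        if max_words is not None and len(d.text.split()) > max_words:
            continue
        hits.append(d)
    return hits


def facet_counts(hits, field):
    """Count hits per value of one metadata field, as a facet panel would."""
    counts = {}
    for d in hits:
        key = getattr(d, field)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical sample data.
DOCS = [
    Doc("Death notice", "Daily News", "article", date(1905, 3, 1),
        "smith died peacefully on tuesday"),
    Doc("Sale", "Daily News", "advertisement", date(1905, 3, 1),
        "smith and sons sale of boots"),
    Doc("Obituary", "Evening Post", "article", date(1910, 6, 2),
        "the late mr smith was born in the year 1830 and farmed for many years"),
]

# Short notices mentioning "smith", with advertisements excluded.
notices = search(DOCS, all_of=("smith",), exclude_types=("advertisement",), max_words=10)
by_publication = facet_counts(search(DOCS, all_of=("smith",)), "publication")
```

Here the `max_words` limit plays the role of the "exclude long documents" facet: it keeps the short death notice and drops the longer obituary.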
Configurable browsing options
By default, Veridian allows collections to be browsed by title and by date of
publication. However, it is easy to create alternative browse options for specific
collections. For example, for a collection of books the ability to browse by author is
easily added.
On-the-fly image conversion and clipping
Veridian makes the most of available digital storage space by only storing the original
archival master images (i.e. the original "scans" of the materials). Derivative images are
created on-the-fly, unlike many systems which must pre-generate and store large numbers of
digital images.
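A minimal sketch of the idea, assuming a derivative is requested as a rectangular region of the master plus a maximum output width. The function name `derivative_plan` and its parameters are hypothetical; this only computes the crop box and output size that an image library would then apply, and it is not Veridian's actual implementation.

```python
def derivative_plan(master_w, master_h, region=None, max_width=800):
    """Return (crop_box, out_size) for a derivative cut from a master image.

    region is (left, top, width, height) in master pixels; None means the
    whole page. crop_box is (left, top, right, bottom).
    """
    if region is None:
        region = (0, 0, master_w, master_h)
    left, top, w, h = region

    # Clamp the requested region to the bounds of the master image.
    left = max(0, min(left, master_w))
    top = max(0, min(top, master_h))
    right = max(left, min(left + w, master_w))
    bottom = max(top, min(top + h, master_h))

    crop_w, crop_h = right - left, bottom - top
    if crop_w == 0 or crop_h == 0:
        raise ValueError("region lies outside the master image")

    # Scale down to max_width if needed, never up, keeping the aspect ratio.
    out_w = min(crop_w, max_width)
    out_h = max(1, round(crop_h * out_w / crop_w))
    return (left, top, right, bottom), (out_w, out_h)


# A whole-page view and a clipped region from the same 4000 x 6000 master.
page_view = derivative_plan(4000, 6000)
clipping = derivative_plan(4000, 6000, region=(3500, 100, 1000, 500))
```

Because both results are computed from the single stored master, no thumbnail, page view, or clipping needs to be kept on disk ahead of time.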
Friendly to Internet search engines
If desired, Veridian can make all or part of your digital collection accessible through major
search engines like Google, Yahoo, and Bing, to increase the accessibility and discoverability
of your content.
Multi-lingual
Veridian uses Unicode throughout, and can support searching and browsing of digitized content
in almost any language, even multiple languages within the same collection. Additionally, the user
interface is available in several languages, and is easily translated to more languages as required.
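One common ingredient of multilingual search is normalizing Unicode text before comparing it. The sketch below shows that general technique using Python's standard library; the helper name `search_key` is hypothetical, and the source does not say how Veridian itself normalizes text.

```python
import unicodedata


def search_key(word):
    """Normalize a word so that equivalent spellings compare equal.

    NFKC unifies composed and decomposed forms of accented characters;
    casefold() is a Unicode-aware, more aggressive form of lower().
    """
    return unicodedata.normalize("NFKC", word).casefold()


# German sharp s folds to "ss", so both spellings produce the same key.
same_street = search_key("Stra\u00dfe") == search_key("STRASSE")

# "u" followed by a combining diaeresis matches the precomposed "\u00fc".
same_city = search_key("Zu\u0308rich") == search_key("Z\u00fcrich")
```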
A demo in which both German and English content are searched and browsed together. Click the
screenshot to view the live demo.
Source materials: Supports display of books, newspapers, and most documents
METS/ALTO has been heavily used in large-scale newspaper digitization projects. However, it is also
well suited to the digitization of any printed materials, including books and journals.
A book with page-level METS/ALTO data, displayed in Veridian. Click the screenshot to view the
live demo.
Source materials: Supports both page-level and article-level METS/ALTO data
Both page-level and article-level METS/ALTO data can be displayed by Veridian. If
necessary, both types of data can even be displayed together within the same collection.
If article-level data is produced for your collection, Veridian displays the logical
table of contents within the user interface. It is also possible to click any article on
the page, which opens a popup menu with controls for clipping out the article
for printing or viewing, or for zooming in closer to the article.
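Clicking an article relies on knowing which regions of the page belong to which article, which is the information article-level data provides. A minimal sketch of that lookup follows, assuming each article is described by one or more rectangular blocks in page pixel coordinates. The `Block` record and both functions are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Block:
    article_id: str
    left: int
    top: int
    width: int
    height: int


def article_at(blocks, x, y):
    """Return the id of the article whose block contains the click, if any."""
    for b in blocks:
        if b.left <= x < b.left + b.width and b.top <= y < b.top + b.height:
            return b.article_id
    return None


def clip_box(blocks, article_id):
    """Return the (left, top, right, bottom) box enclosing all of an article's blocks."""
    own = [b for b in blocks if b.article_id == article_id]
    if not own:
        return None
    return (
        min(b.left for b in own),
        min(b.top for b in own),
        max(b.left + b.width for b in own),
        max(b.top + b.height for b in own),
    )


# Hypothetical page: article 1 has a headline and a body column, article 2 one column.
BLOCKS = [
    Block("art1", 100, 100, 800, 120),
    Block("art1", 100, 220, 400, 900),
    Block("art2", 520, 220, 380, 900),
]

clicked = article_at(BLOCKS, 600, 300)
box = clip_box(BLOCKS, "art1")
```

With page-level data there are no per-article blocks, so a lookup like `article_at` has nothing to match and only page-related controls can be offered.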
A newspaper with article-level data, displayed in Veridian. The logical
table of contents for the entire newspaper issue is displayed in the left panel. All pages in the newspaper
issue are displayed in the right panel, and may be dragged and zoomed within that panel. Clicking
any article in a page in the right panel allows that specific article to be zoomed to, or to be clipped
out and displayed on its own, for printing. Click the screenshot to view the live demo.
When page-level METS/ALTO data is displayed, it looks very similar to article-level data. With this
data, however, no logical table of contents is available for display, and it isn't possible to
click specific articles within pages.
A newspaper with page-level data, displayed in Veridian. When using page-level
data, clicking anywhere on a page in the right panel displays only page-related controls. Click the
screenshot to view the live demo.
Source materials: Supports the display of scanned images which do not have METS/ALTO
For some large collections it is useful to scan the entire collection and put it online
with Veridian, then process the scanned images to METS/ALTO as time and budget allow. Veridian
enables this by allowing scanned images to be imported and displayed even if they don't have
METS/ALTO data. Users can browse and view these page images along with any other content in Veridian,
and they are displayed in an identical fashion to those pages with page-level METS/ALTO data
available. The limitation with these pages is of course that they are not full-text searchable,
unlike pages with METS/ALTO.
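To see why only pages with METS/ALTO are searchable, it helps to look at where the text comes from. In ALTO, recognized words are stored as `String` elements with a `CONTENT` attribute and position attributes such as `HPOS` and `VPOS`. The sketch below builds a toy word index from such a file using Python's standard library; the sample XML, the function names, and the page identifiers are hypothetical, and this is not Veridian's ingest code.

```python
import xml.etree.ElementTree as ET

# A tiny, hand-written ALTO fragment used only for illustration.
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page ID="P1"><PrintSpace>
    <TextBlock ID="TB1">
      <TextLine>
        <String CONTENT="Birth" HPOS="100" VPOS="200" WIDTH="90" HEIGHT="30"/>
        <SP/>
        <String CONTENT="notice" HPOS="200" VPOS="200" WIDTH="110" HEIGHT="30"/>
      </TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""


def page_words(alto_xml):
    """Return the lowercased words of a page, or [] if it has no ALTO data."""
    if not alto_xml:
        return []
    root = ET.fromstring(alto_xml)
    words = []
    for el in root.iter():
        # Compare local names so that any ALTO namespace version is accepted.
        if el.tag.rsplit("}", 1)[-1] == "String":
            words.append(el.get("CONTENT", "").lower())
    return words


def build_index(pages):
    """Map each word to the set of page ids on which it occurs."""
    index = {}
    for page_id, alto_xml in pages.items():
        for w in page_words(alto_xml):
            index.setdefault(w, set()).add(page_id)
    return index


# page-2 is an image-only page: it can be displayed but adds nothing to the index.
index = build_index({"page-1": ALTO_SAMPLE, "page-2": None})
```

Once METS/ALTO is produced for an image-only page, re-running the same indexing step would make it searchable without changing how it is displayed.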
The intention of this feature is to allow large collections to be made accessible online with Veridian
very quickly, even if those collections don't become full-text searchable for some months or years,
while conversion to METS/ALTO is carried out. However, this feature also allows different types of
non-textual materials (e.g. photographs) to be supported by Veridian.
Source materials: Supports many different types of archival master images
As noted above, Veridian does not pre-generate and store derivative versions of archival master images.
It instead generates all derivative images it requires as it needs them, directly from the archival
masters. The archival masters themselves may be in TIFF, JPEG2000, or JPEG format, and may be
color, black and white, or grayscale. Our preference for new digitization projects is to produce JPEG2000
archival masters, as they are faster and more efficient to process. However, many projects do still
use TIFF for archival master images.
The derivative images generated by Veridian (and delivered to users in their web browsers) may be in
GIF, JPEG, or PNG format, as those are the only three image formats which are well supported by all
modern web browsers. Veridian is configurable to deliver any of these three formats.
Source materials: Easily customized to support formats other than METS/ALTO
If you have digitized materials which are not in METS/ALTO format, it's likely we can
work with you either to add support for your data or to convert your data to METS/ALTO for ingest
into Veridian.
Configuration options: multiple document display alternatives
Any Veridian collection can be configured so the main "document display" pages
look and function in one of two quite different ways, as shown in the screenshots below.
The "multi-page zoom-and-pan" document display. In this configuration all pages
of a document are shown in the right-hand panel. It's possible to zoom in to specific parts
of any page, and to drag all pages within the view panel. This configuration is well suited to documents
with relatively few pages, like newspapers. Click the screenshot to view the live demo.
The "single-page zoom-and-pan" document display. In this configuration
one page of the document is shown in the right-hand panel at any given time, and there are
controls for moving to the next or previous pages. Click the screenshot to view the live demo.
Note that it is also possible to configure Veridian with a single-page document display
that uses static page images, as an alternative to zoom-and-pan page images.
We are currently working on a third document display configuration, which will use
realistic page turning technology for the display of books and other long documents. We expect
this new configuration to be available in mid-2010.
Configuration options: standard user interfaces
Veridian currently has two standard user interfaces from which to choose, with more to be added
in the future. It is also possible to change the color of either user interface, change the fonts and
styles used, and change logos and text on any page. All these configuration changes are included
with the license fee; there is no additional setup or customization charge.
The alternative "standard" user interface. Click the screenshot to
view the live demo.
Configuration options: other
Other Veridian configuration options of note, in addition to those mentioned above, are as follows:
- Two alternative date browser configurations, for use when browsing documents by date.
- Configurable in either "multiple publication mode" or "single publication mode". The latter is
optimized for collections containing multiple issues of a single newspaper publication or journal.
Advanced customization options
Our intention is that all Veridian collections should be unique and tailored to the
requirements of the collection owner. If you need more than the standard configuration options
allow, we can work with you to ensure Veridian looks exactly as you would like, fits in
perfectly with your other web content, and integrates well with all your other systems
(e.g. content management systems, digital rights management systems, federated search systems,
etc.).
"Papers Past", a large (1.3 million pages) newspaper collection
from the National Library of New Zealand. Papers Past uses a heavily customized Veridian installation, with
completely unique graphic design, a "browse by region" feature, a bilingual English/Maori user interface,
and many other specialized features. Click the screenshot to view the site live.
"NewspaperSG" from the National Library Board (NLB) of Singapore. Veridian provides
searching and browsing functions for this large newspaper collection, as well as delivering images
and on-the-fly image clipping services. However, the collection does not use the Veridian user interface; it instead uses a
web interface built by the Singapore NLB themselves. Veridian has an extensive web services API that supports
third-party interfaces like the one built by the NLB. An interesting aspect of this collection is that
some of the material is protected by copyright, so is only available for viewing from multimedia
stations at Singapore's libraries. The rest of the material is available publicly on the Internet,
but images of newspaper pages and articles are displayed with watermarks. Click the screenshot to view the
site live.
Future planned features
Veridian is constantly being updated and improved, and we have an extensive roadmap
for future developments. Most new features are offered free of charge to those who
have purchased Veridian previously, and who have a maintenance contract with Digital Library
Consulting.
Key enhancements scheduled to be completed in 2010 include the following:
- End-user OCR text correction. The Optical Character Recognition (OCR) process used to extract
machine-readable text from scanned documents is not perfect. This leads to errors in the text used for searching
your digitized collection, which in turn means it may not always be possible to find all the articles or pages
in your collection that match a given search. The only way to improve the quality of the text is to have people
correct it manually, which is hugely expensive for a large digital collection. We propose developing
a Veridian module that allows your users to register and correct text themselves, directly in the Veridian user
interface. As users browse and read the materials in your collection, they will be able, at their option,
to correct text where they notice it is incorrect. That is, they will be able to compare the machine-generated
OCR text with the scanned images of your digitized documents and make corrections to the OCR text. Those changes
will be pushed back into the search index, gradually improving the quality and searchability of the collection.
- Support for born-digital PDF files. Our intention is to add support for ingesting born-digital PDF
files into Veridian, and to have them displayed in exactly the same way as page-level METS/ALTO data.
- Realistic page-turning document display. This will be the third major document display option added
to Veridian. It is being developed primarily as an improved interface for the display of books and other
long documents.
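The end-user OCR correction item in the list above describes corrections being pushed back into the search index. A minimal sketch of that loop follows, assuming a simple in-memory inverted index. The `Collection` class and its methods are hypothetical and only illustrate the idea; they say nothing about how the planned Veridian module will be built.

```python
def tokenize(text):
    """Split text into lowercased words with surrounding punctuation removed."""
    stripped = (w.strip(".,;:!?\"'()") for w in text.split())
    return [w.lower() for w in stripped if w]


class Collection:
    def __init__(self):
        self.texts = {}  # page_id -> current OCR text
        self.index = {}  # word -> set of page_ids

    def _unindex(self, page_id):
        """Remove a page's current words from the index."""
        for w in set(tokenize(self.texts.get(page_id, ""))):
            pages = self.index.get(w)
            if pages:
                pages.discard(page_id)
                if not pages:
                    del self.index[w]

    def set_text(self, page_id, text):
        """Store new text for a page and re-index it."""
        self._unindex(page_id)
        self.texts[page_id] = text
        for w in tokenize(text):
            self.index.setdefault(w, set()).add(page_id)

    def correct(self, page_id, wrong, right):
        """Apply a user's correction to one page and push it into the index."""
        self.set_text(page_id, self.texts[page_id].replace(wrong, right))

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


# The OCR misread "Smith" as "Srnith", so the page cannot be found at first.
c = Collection()
c.set_text("p1", "Mr. Srnith died on Tuesday.")
before = c.search("smith")
c.correct("p1", "Srnith", "Smith")
after = c.search("smith")
```

The page becomes findable as soon as the correction is indexed, which is the gradual improvement in searchability the roadmap item describes.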