NAME

Net::Z3950::RadioMARC - Perl extension for testing Z39.50 servers

SYNOPSIS

 use Net::Z3950::RadioMARC;

 $t = new Net::Z3950::RadioMARC();
 $t->set(host => 'z3950.loc.gov', port => '7090', db => 'voyager');
 $t->set(delay => 3);
 $t->add("filename.marc");
 $t->test('@attr 1=4 01245a01', { ok => '245$a is searchable as 1=4',
                                  notfound => 'This server is broken' });
 # -- or --
 set host => 'z3950.loc.gov', port => '7090', db => 'voyager';
 set delay => 3;
 add "filename.marc";
 test '@attr 1=4 01245a01', { ok => '245$a is searchable as 1=4',
                              notfound => 'This server is broken' };

DESCRIPTION

This module provides the harness in which test-scripts can be written for detecting the presence of a ``radioactive MARC record'' in a Z39.50-compliant database, and determining how that database indexes the record. Its key provision is the test() method, which runs a search for some well-known term that is known to occur in a ``radioactive'' record, and generates different output dependent on whether the record is found or not.

This module may be used in two different ways. The first approach is to use a rigorous object-oriented syntax in which a test-harness object is explicitly created, and methods are invoked on it. The other is a simpler syntax in which a test-harness object is transparently created behind the scenes when it's first needed, and subsequently referenced by function calls. These two styles are exemplified by the two code-fragments in the synopsis above.

For most purposes, the simple syntax will be preferable. The object-oriented syntax is useful primarily when it is necessary for a single script to run tests against two or more different databases.

METHODS

`new()`

 $testHarness = new Net::Z3950::RadioMARC();

Creates a new test-harness for checking the searchability of ``radioactive MARC records'' in a database available via Z39.50. There are no arguments; the new object is returned.

Before the new test-harness can be used by the test() method, at least its host property must be set, and often its port, db and format as well. See the documentation for the set() method for details on how this is done and what the arguments mean.

It is not necessary to explicitly create a test-harness object in order to use this module. See the description above of the ``simple syntax'' approach, in which a test-harness object is implicitly created.

`set()`

 $t->set(host => 'z3950.loc.gov', port => '7090', db => 'voyager');
 $t->set(delay => 3);
 # -- or --
 set host => 'z3950.loc.gov', port => '7090', db => 'voyager';
 set delay => 3;

Sets one or more properties of the specified test-harness, or of the implicit test-harness if none is specified (i.e. when the simple syntax is used instead of the object-oriented syntax).

Each pair of arguments is taken to be a property name followed by the corresponding new value. It is an error to provide an odd number of arguments.

The following properties are defined.

host [no default]

The name or IP address of the Internet host where the Z39.50 server to be tested resides. The connection to the host is established when the first test-case is run by means of the test() method.

port [default: 210]

The port number where the Z39.50 server to be tested resides on the specified host.

db [default: "Default"]

The name of the database to be searched.

format [default: "USMARC"]

The record syntax to be used when fetching records from result sets to be compared with the known radioactive records.

delay [default: 1]

The delay, in seconds, between issuing one test search and the next. This delay is a courtesy to the server, to prevent it from being overrun by a large test-script.

messages [default: an empty hash]

A reference to a hash which maps test status values to message templates. These are used to generate the reporting output for tests, depending on the status returned by the test, except when overridden by the messages specified for that particular test (see below).

The interpretation of message templates is described in the documentation of the test() method. Status values for which no message template is provided (i.e. all status values initially) are not reported at all, except for the status fail which is reported using a simple, explicit default format.

verbosity [default: 1]

The level at which debugging and other output is emitted. This may be set to any integer; all messages at the nominated value and less are emitted. These messages, by level, are as follows:

Level 1 is the default, since these messages are arguably reporting configuration errors, whereas the higher levels generate chit-chat that is probably only useful for debugging.

report [default: 1]

A boolean indicating whether or not reporting output (as opposed to debugging output) should be emitted for tests. This should nearly always be true: the principal use of this property is as an additional, one-shot property used for a single test, like this:

 ($status, $errmsg, $addinfo) = test '@attr 1=4 fruit', { report => 0 };

which allows the test-script to explicitly check the status and make whatever choices it deems appropriate without side-effects.

identityField [no default]

An indication of what MARC field or subfield is taken to convey the identity of a record for the purposes of comparison. If a record in a result-set has the same identity-field value as the radioactive record being tested, then they are regarded as the same record.

It may take the form tag for control fields (for example 001 to specify the local identifier) or tag$subfield (for example 245$a to specify the title field).

Multiple candidate identity fields may be specified, separated by commas, like this: 100,035$a. In this case, each such candidate subfield is tried in turn, and the first one that exists in both records being compared is used.

If no identity field is specified, then two records are considered to be the same only if they are byte-for-byte identical.
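The candidate-by-candidate comparison described above can be sketched in plain Perl. This is only an illustration under stated assumptions: the record layout (a hash of tags), the helper names field_value() and same_record(), and the choice to report "different" when no candidate exists in both records are all hypothetical and are not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical record layout: control fields map a tag to a string,
# data fields map a tag to a hash of subfield code => value.
my $radioactive = { '001' => 'rad-0001', '245' => { a => 'Thrickbrutton studies' } };
my $hit         = { '035' => { a => 'xyz' }, '245' => { a => 'Thrickbrutton studies' } };

# Return the value named by one candidate such as "001" or "245$a".
# Returns an explicit undef (not an empty list) when the record lacks it,
# so that the result can be used safely in list assignments.
sub field_value {
    my ($rec, $candidate) = @_;
    my ($tag, $sub) = split /\$/, $candidate, 2;
    my $field = $rec->{$tag};
    return undef unless defined $field;
    return defined $sub ? (ref $field ? $field->{$sub} : undef)
                        : (ref $field ? undef : $field);
}

# Try each comma-separated candidate in turn; the first one present in
# BOTH records decides the comparison.
sub same_record {
    my ($spec, $x, $y) = @_;
    for my $candidate (split /,/, $spec) {
        my ($vx, $vy) = (field_value($x, $candidate), field_value($y, $candidate));
        next unless defined $vx && defined $vy;
        return $vx eq $vy ? 1 : 0;
    }
    return 0;   # assumption: no shared candidate means "not the same"
}

print same_record('001,245$a', $radioactive, $hit) ? "same\n" : "different\n";
```

Here 001 is missing from the hit record, so the comparison falls through to 245$a, which matches.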

It is an error to try to set a property other than those described here.
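The two error conditions for set() (an odd number of arguments, and an unknown property name) can be illustrated with a stand-alone sketch. The function apply_properties() and the %defaults table are hypothetical names introduced for this example; only the property names and default values come from the list above.

```perl
use strict;
use warnings;

# Hypothetical table of known properties with the documented defaults.
# 'host' and 'identityField' have no default, so they start undefined.
my %defaults = (
    host => undef, port => 210, db => 'Default', format => 'USMARC',
    delay => 1, messages => {}, verbosity => 1, report => 1,
    identityField => undef,
);

# apply_properties(\%props, name => value, ...)
# Dies on an odd number of arguments or on an unknown property name.
sub apply_properties {
    my ($props, @args) = @_;
    die "odd number of arguments to set()\n" if @args % 2;
    while (my ($name, $value) = splice(@args, 0, 2)) {
        die "unknown property '$name'\n" unless exists $defaults{$name};
        $props->{$name} = $value;
    }
    return $props;
}

my %props = %defaults;
apply_properties(\%props, host => 'z3950.loc.gov', port => '7090', db => 'voyager');
apply_properties(\%props, delay => 3);
print "$props{host}:$props{port}/$props{db} delay=$props{delay}\n";
```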

`add()`

 $t->add("filename.marc");
 # -- or --
 add "filename.marc";

Adds one or more MARC records to the set that is to be tested for. Records are loaded from the file whose name is specified. Any number of records may be added to the test-set, but using many such records may be self-defeating, since then the radioactive tokens to be searched for are less likely to be unique.

Behind the scenes, this module builds an inverted index of all the words occurring in all the subfields of all the non-control fields in all the records that are added. This is used when test() is called to identify which of the add()ed records is the one that should be retrieved from the server.
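As a rough illustration of what such an inverted index looks like, the sketch below builds one over hand-written data. It is not the module's implementation: the record layout, the lower-casing of words and the whitespace tokenisation are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parsed MARC records: each data field is
# [tag, { subfield code => value }]. Control fields are simply left out.
my @records = (
    [ [ '245', { a => 'Radioactive title 01245a01' } ],
      [ '100', { a => 'Author 01100a01' } ] ],
    [ [ '245', { a => 'Another title 02245a01' } ] ],
);

# Map each lower-cased word to the list of record numbers containing it.
my %index;
for my $recno (0 .. $#records) {
    for my $field (@{ $records[$recno] }) {
        my ($tag, $subfields) = @$field;
        for my $value (values %$subfields) {
            for my $word (split ' ', lc $value) {
                # Record each record number at most once per word.
                push @{ $index{$word} }, $recno
                    unless grep { $_ == $recno } @{ $index{$word} || [] };
            }
        }
    }
}

# Look up which test-set records a search term should find.
print "01245a01 -> record(s) @{ $index{'01245a01'} }\n";
print "title -> record(s) @{ $index{'title'} }\n";
```

A token such as 01245a01 maps to exactly one record, which is the situation the tests work best with; a common word such as "title" maps to several.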

add() returns a list of opaque tokens representing the newly added records. These tokens may be passed as the record parameter into the test() method to indicate explicitly which of the test-set records a particular query is intended to find.

`dumpindex()`

 $t->dumpindex();

Dumps to standard output the inverted index generated for the MARC records that have been added to the test-set by the add() method.

Never call this method.

`test()`

 $t->test('@attr 1=4 01245a01', { ok => '245$a is searchable as 1=4',
                                  notfound => 'This server is broken',
                                  record => $token });
 $t->test('@attr 1=4 thrickbrutton');
 # -- or --
 test '@attr 1=4 01245a01', { ok => '245$a is searchable as 1=4',
                              notfound => 'This server is broken',
                              record => $token };
 test '@attr 1=4 thrickbrutton';

Runs a single test against the server that has been nominated for the specified test-harness. The first argument is a query in PQF (Prefix Query Format) as described in the YAZ manual at http://indexdata.com/yaz/doc/tools.tkl#PQF and the second (optional) is a reference to a hash of parameters, some of which are used for mapping status values to message templates.

The query is analysed to see which of the test-set records it should find. For maximally indicative results, it should match exactly one such record - no more, no less. If it matches more than one, the first one is used for the subsequent matching process: that is, the one that occurred earliest in the MARC file that was first add()ed to the test-set. If the parameter record is provided, then its value is used as the opaque token of the test-set record to be used, and the query is not used for this purpose.

The query is submitted to the server, and returns some number of hit-set records. Again, the most significant test results are obtained when there is exactly one such record.

Each of the candidate hits is compared with the chosen test-set record to see whether there is a match or not - that is, whether the search retrieved the nominated radioactive record.

The result of this process is that a status is generated, being one of the following short strings:

ok: The query succeeded, and a record in the hit-set was the same as the record chosen from the test-set.
notfound: The query succeeded, but no record in the hit-set was the same as the record chosen from the test-set. This may occur for several different reasons: because the query matched no record in the test-set (which is probably a configuration error); because it matched no record in the database (which means either that the radioactive record is not in the database or that it is not indexed in the way being tested for) or because it found some record in the database, but none of them is the one that was expected (for the same reasons).
fail: The query could not be executed, or the records could not be fetched.
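The way the three statuses arise can be summarised in a short sketch. The function classify() and its argument layout are hypothetical; the comparison callback stands in for whatever record comparison is in force (here a byte-for-byte string comparison, as when no identityField is set).

```perl
use strict;
use warnings;

# classify($search_ok, \@hits, $expected, \&same)
# $search_ok is false when the search or the record fetch failed.
# $expected is undef when the query matched no test-set record.
sub classify {
    my ($search_ok, $hits, $expected, $same) = @_;
    return 'fail'     unless $search_ok;
    return 'notfound' unless defined $expected;
    for my $hit (@$hits) {
        return 'ok' if $same->($hit, $expected);
    }
    return 'notfound';   # empty hit-set, or only unexpected records
}

my $identical = sub { $_[0] eq $_[1] };
print classify(1, ['recA', 'recB'], 'recB', $identical), "\n";
print classify(1, [],               'recB', $identical), "\n";
print classify(0, [],               'recB', $identical), "\n";
```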

The test() method returns a triple, ($status, $errmsg, $addinfo), with $errmsg being the human-readable string corresponding to the BIB-1 diagnostic code returned by the server in the case of an error, and $addinfo being any additional information returned by the server along with such a diagnostic.

If the report property of the test-harness is true (as it is by default), then a report is emitted describing the outcome of the test. Under some circumstances, it is useful to inhibit this behaviour by setting report false and testing the explicitly returned values instead.

The reporting output is generated from a template. The template is found by looking up the status of the test in the hash-reference argument, if this is supplied. If it is not supplied, or if the relevant element is missing, it is looked up in the hash that is the value of the messages property. If the relevant element is not in this hash either, a default template is used for notfound and fail tests, but NO OUTPUT AT ALL is emitted for ok tests. This makes it possible to write silent-on-success test scripts. If you want commentary on successful tests, then, you must explicitly specify an ok message template, either in the messages property or in the hash-reference passed into test().
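The lookup order just described can be sketched as follows. The function find_template() and the wording of the %fallback templates are hypothetical; the module's own default templates are not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical default templates, used only when neither hash has an entry.
my %fallback = (
    notfound => '%{query}: not found',
    fail     => '%{query}: failed',
);

# Return the template for $status, or undef when nothing should be printed.
sub find_template {
    my ($status, $per_test, $messages) = @_;
    return $per_test->{$status} if $per_test && exists $per_test->{$status};
    return $messages->{$status} if $messages && exists $messages->{$status};
    return $fallback{$status};   # undef for 'ok': silent on success
}

my %messages = (ok => '%{query}: found');
for my $status (qw(ok notfound fail)) {
    my $t = find_template($status, { notfound => 'This server is broken' }, \%messages);
    print "$status => ", defined $t ? $t : '(no output)', "\n";
}
```

The per-test hash wins for notfound, the messages hash supplies ok, and fail drops through to the fallback.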

Report-generating templates are strings which may contain the following escape sequences, which are replaced by the appropriate values:

%{query}: The query that was run for this test.
%{status}: The status of the test.
%{errmsg}: The human-readable error message returned from the test, if any.
%{addinfo}: The additional information returned from the test, if any.
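Substitution of these escape sequences can be sketched with a single regular expression. The function name expand_template() is hypothetical, and replacing a missing value with an empty string is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Replace each %{name} escape with the corresponding value.
# Values that are not supplied are replaced by the empty string (assumption).
sub expand_template {
    my ($template, %values) = @_;
    $template =~ s/%\{(query|status|errmsg|addinfo)\}/defined $values{$1} ? $values{$1} : ''/ge;
    return $template;
}

print expand_template('[%{status}] %{query}',
                      query => '@attr 1=4 fruit', status => 'notfound'), "\n";
```

Unrecognised sequences such as %{other} are left untouched by this sketch.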

AUTHOR

Mike Taylor, <mike@indexdata.com>

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.3 or, at your option, any later version of Perl 5 you may have available.