Monthly Archives: June 2013

Authority Control

Authority control here means the work involved in creating unique terms or identifiers to be used as access points in bibliographic records. An access point serves two functions: the finding function and the gathering function. Charles Cutter says a catalogue should:

1) enable a user to find a book when any one of these is known: the author, the title, or the subject.
2) show what the library has by a given author, on a given subject, or in a given kind of literature.

Authority records contain data such as: a unique control number, the preferred terms, the non-preferred terms, the variant forms of the terms, the broader/narrower/related terms, scope notes, and the reasons supporting the creation of the terms.
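To make the list above concrete, here is a minimal sketch of how such a record could be modelled. The class name, the field names and the sample values (including the control number format) are assumptions made for illustration; they do not follow MARC authority format or any real authority file.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    """Illustrative authority record; field names are hypothetical."""
    control_number: str                                        # unique identifier
    preferred_term: str                                        # the authorised form
    variant_terms: list[str] = field(default_factory=list)     # non-preferred / variant forms
    broader_terms: list[str] = field(default_factory=list)
    narrower_terms: list[str] = field(default_factory=list)
    related_terms: list[str] = field(default_factory=list)
    scope_note: str = ""                                       # when the term should be used
    sources: list[str] = field(default_factory=list)           # reasons supporting the term


# A made-up record for the 'Haze' collection.
haze = AuthorityRecord(
    control_number="demo-0001",
    preferred_term="Haze",
    variant_terms=["Smoke haze"],
    broader_terms=["Air pollution"],
    related_terms=["Smog"],
    scope_note="Use for works on smoke haze caused by fires.",
    sources=["Illustrative example only"],
)

print(haze.preferred_term, "<-", haze.variant_terms)
```

Keeping the variant forms inside the record is what later allows a search on a non-preferred term to be redirected to the preferred one.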

A library OPAC provides search by author, subject, title, and so on, in addition to keyword search. In FRBR’s terminology, it provides search by type of entity (Work, Expression, Person, Corporate Body, Family, Concept, Event, Place, Object). Nowadays, OPACs even provide faceted search by these entities and by a few key parameters derived from fixed fields and other fields of the MARC record.

Success in providing the FINDING and GATHERING functions in cataloguing relies on two things. The first is authority control, which is the work of building the authority files that house the authority records. The second is bibliographic control, which includes the process of assigning appropriate terms (access points) from the authority file to the bibliographic records (or the metadata). Taking the ‘Haze’ collection from the previous post as an example, the success of the catalogue requires some authority control work to re-organise the index terms.
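The interplay of the two kinds of control can be sketched in a few lines. This is only an illustration under stated assumptions: the tiny authority file, the record titles and the function names `access_point` and `gather` are all made up for this post.

```python
# Hypothetical mini authority file: any known form (lower-cased) -> preferred term.
AUTHORITY = {
    "haze": "Haze",
    "smoke haze": "Haze",
    "psi": "Pollutant Standards Index",
    "pollutant standards index": "Pollutant Standards Index",
}


def access_point(term):
    """Finding: return the preferred form, or None if the term is not established."""
    return AUTHORITY.get(term.strip().lower())


def gather(records, term):
    """Gathering: list every record indexed under the same preferred term."""
    wanted = access_point(term)
    if wanted is None:
        return []
    return [
        title
        for title, terms in records.items()
        if wanted in {access_point(t) for t in terms}
    ]


# Made-up bibliographic records: title -> terms as the indexer typed them.
records = {
    "Haze health tips": ["smoke haze", "health"],
    "PSI readings explained": ["PSI"],
    "Haze and tourism": ["Haze"],
}

print(gather(records, "haze"))
```

The first and third records were indexed with different wording, but both are gathered because the authority file maps the two forms to one preferred term.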

The Library of Congress Name and Subject Authority files are the most widely used controlled vocabularies in libraries worldwide, although they are known to have biases. The two files represent the contributions of a great many librarians serving their communities over more than 100 years. LC Subject Headings are sometimes difficult to use, and there are calls to simplify them. FAST is a project trying to tackle this problem.

Playing with the Index Terms

Index terms are the terms that we use to describe the subject matter of the document in hand. These terms are the index keys assigned to the document. The keys facilitate the search and retrieval of documents. Using the terms listed in the previous post, we can come up with several interesting points:

1. By looking at the list of index terms shown in the surrogate records (catalogue records), we can gain a better understanding of the subject matter of the document being described. For example, if we see the terms ‘Haze’ and ‘Health advice’ in the record, we know that the document is about health tips for when haze comes, and not about the economic impact of haze. That is something free-text search cannot offer.

2. If our imaginary collection grows, we may need to control the words and phrases used in the index terms. We may need to create notes to indicate the scope of use of the terms. We may start to build a thesaurus.

3. We can try to create a map for all these terms. Draw a big circle at the centre and put the word ‘Haze’ in the circle. This becomes the central topic. Draw bubbles to contain each index term and draw lines to connect them together. You will start to see some branches linking the central topic and the subtopics. We should see some hierarchical relationships between the terms. In some cases, we may need to create new terms to connect the nodes and branches. Instead of putting the central topic at the centre, we can put it at the top of an organisation chart. This is a good exercise. You will benefit from it.
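The mapping exercise in point 3 can also be tried in code. The sketch below is one possible arrangement, not the only one: the grouping terms ‘Causes’, ‘Health impacts’ and ‘Responses’ are new connecting terms invented here for illustration, and the `NARROWER` table and `outline` function are hypothetical names.

```python
# Hypothetical hierarchy: each term maps to its narrower terms.
NARROWER = {
    "Haze": ["Causes", "Health impacts", "Responses"],
    "Causes": ["Forest fire", "Slash-and-burn"],
    "Health impacts": ["Health advice", "Face masks"],
    "Responses": ["Cloud seeding", "International cooperation"],
}


def outline(term, depth=0):
    """Return the term and all its narrower terms as indented lines."""
    lines = ["  " * depth + term]
    for child in NARROWER.get(term, []):
        lines.extend(outline(child, depth + 1))
    return lines


# Prints the central topic at the top, like an organisation chart.
print("\n".join(outline("Haze")))
```

Writing the map down this way quickly shows which terms have no obvious parent, which is where new connecting terms are needed.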

Index Terms for our Haze Attack 2013 collection

In response to the previous post, here are the index terms proposed for our imaginary collection.

fire; haze; 2013
June 2013
Singapore; Malaysia; PSI;
Public Health; Human-made disaster; International cooperation
Air quality; Air pollution; smoke; haze; smoke haze; burning; farming; land use; monsoon season
Air cleaners; health; health advice; medicine; outdoor activities; loss; fire-fighting; water; cloud seeding;  rain; dry; schools; children; air-condition; food; agriculture; tourism; tourists; complaints; compensation;
NEA; Air quality; forest; forest fire; risks; health risks; face masks; Pollutant Standards Index; N95 masks; pollutants; measurements; herbal tea; herbs; health impacts; pharmacies; elderly; pregnant; children; slash-and-burn; measurements; Air purifiers; Global Warming; Sumatra; Smog; Weather; Respiratory; Rain; Dry season.

There should be different aspects of information about haze. Use your imagination. Once you find an angle, you should be able to think of more sets of index terms.
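Submitted lists like the ones above tend to repeat terms with different capitalisation and spacing. As a first, rough clean-up step before any real authority work, the terms could be normalised as in this sketch. The function name `normalise` and the sample input are assumptions for illustration; real vocabulary control needs human judgement, not just string matching.

```python
def normalise(raw):
    """Split semicolon-separated lines into terms and drop case-insensitive duplicates."""
    seen = {}
    for line in raw.splitlines():
        for term in line.split(";"):
            term = " ".join(term.split())       # trim and collapse inner whitespace
            if not term:
                continue                        # skip empties left by trailing ';'
            seen.setdefault(term.casefold(), term)  # keep the first spelling seen
    return list(seen.values())


# A shortened sample of submitted terms.
raw = """fire; haze; 2013
Air quality; Air pollution; smoke; haze;  smoke haze
Rain; rain; Air quality;"""

terms = normalise(raw)
print(terms)
```

This only removes exact duplicates; deciding whether ‘smoke haze’ and ‘haze’ should share one preferred term is still the cataloguer's call.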

Please continue to submit your terms in this form:


Haze attack

Singapore & Malaysia – Haze Attack 20 June 2013

Published on Jun 19, 2013


Amazingly, a related article has already been created in Wikipedia: 2013 Southeast Asian haze. The page was created at 6:34 on 19 June 2013.

If you were to compile a collection of information resources on this current event, what index terms would you assign to it?

ALCTS eForum on eBook Cataloguing

The June 2013 issue of the ALCTS Newsletter also includes a summary of the discussion about applying the Provider-Neutral Guidelines in the cataloguing of ebooks. The title of the e-forum is:

MARC Records in the Age of AACR2, Provider-Neutral Guidelines, and now RDA

Moderated by Amy Bailey, ProQuest, and Becky Culbertson, California Digital Library

“This e-forum, held April 23–24, 2013, focused on creating MARC records for ebooks using standards such as AACR2, Provider-Neutral Guidelines, and RDA, as well as issues related to managing those records. The intention was to engage library catalogers, consortiums, authors of standards, and vendors in a productive dialog in order to understand how each approaches the characteristics of ebooks and resolves cataloging issues they may generate.

The forum began with a question about the use of a single record (following the provider-neutral guidelines) or separate records when cataloging ebooks. The question asked if it is difficult to determine when the provider-neutral guidelines should be applied. One respondent stated that she leaves the original paging and illustration information from the print when deriving from a print record. Sometimes the front matter is missing from the ebook but she does not usually verify this information and accepts what was in the print record. Nonrelevant ISBNs left in the record can cause problems with overlaying.

A participant introduced a question about how others are capturing provider/platform information in their local ebook records if they chose to do so. At first the discussion centered around where and what fields to use for this purpose—a stunning array! This included the following suggestions: 856 $3; 856 $z; combination of 793 field and 856 $3; 710 field; 740 field; 773 field; 830 field; combination of 590 and 856 $3; 9XX. The person who mentioned using the 9XX fields felt that since the package names would be only useful for internal record management purposes; that having the 9XX fields be only available on the staff side of the system would be fine. This statement then led to a query by someone else about whether package information was indeed useful to our users. One cataloger felt that while it might not be useful directly to some patrons, that it would indeed be useful to the public service librarians. They have greater knowledge of the “warts” of some providers and would like to be able to quickly steer patrons in other directions.

A new thread addressed the many duplicate records for ebook titles in OCLC and what their merging algorithm was. The questions asked how a record is selected from among so many, if anyone reports duplicates to OCLC, and what issues may arise when records are merged. A problem with multiple records and batch loading was noted as well as ebook and print records that merge because of incorrect use of ISBNs. While separate records for each vendor would help maintain information unique to each provider (format, pagination, multiple versions, links), it makes batch loading difficult.

A participant pointed out that the Provider-Neutral Guidelines are incompatible with RDA and asked if the principles of RDA should trump the problem of duplicate records. A reply noted that P-N was also in contradiction to AACR2 as well. RDA is problematic for electronic reproductions and also microform reproductions because it emphasizes the reproduction information over the original publication information which is likely more important to patrons/researchers. A P-N approach to microforms may be discussed at a PCC meeting in May. Print-on-Demand is another area that might present issues with provider information in the record. (Update: At the PCC Operations Meeting at the Library of Congress on May 3 and 4, the intent was to set up a Task Group to document Best Practices for describing all kinds of reproductions under RDA.)

RDA offers some advantages over AACR2 but the FRBR model does not work well with some current systems. One response noted that converting AACR2 records to RDA would be expensive, so records derived from another will retain the standard used in the original. New access points would be created following RDA. Some aspects of RDA are seen as carryovers from AACR2 and are not FRBR compatible. Relationship designators in RDA records have been inconsistent, in that they may not be there at all, may be in code ($4) or terms ($e). Participants felt that using terms is clearer for users, although codes can be displayed in any term if the system is set up that way. With RDA, the use of $e seems to be preferred among many catalogers. Linking relationships such as “Reproduction of (manifestation)”, while supporting FRBR principles, do not work in most current systems. These relationships are valuable but textual displays are more useful to patrons.

Day 2 began with a posting that addressed call numbers and genre terms. Many participants said they do not use call numbers for ebooks, and remove them from ebook and e-audiobook records when importing or deriving records. It was suggested the call numbers could be moved to a 099 field instead of deleting them. Several participants noted that the classification number is useful for collection development and statistical purposes, so they are retained but suppressed from the public view. Those who display the call numbers often do so because of their virtual call number browse—this way the print and ebook versions are together. Several respondents said they append “eb” or “EBOOK” or some other ebook designation to the end of the call number, so that patrons won’t expect to find the item on the shelf. Some also distinguish between ebooks read remotely and those that are downloadable in a call number that the public sees. One response said that including the vendor name in the call number is useful to find items from a particular vendor if you want to remove or make changes to those records, to manage duplicate ebooks from different packages. A vendor noted that her clients have a wider range of preferences for ebook call numbers than for print records.

While some catalogers say they leave 655s for genre terms in records they import, many libraries now delete these genre/form terms because they feel they are no longer useful. After all, catalogers never have supplied “print books” as a genre term, so why should we do this for ebooks? Streaming video or Internet video might still be a useful term to include. It was suggested that the 072 could be used instead (e.g. 072 _7 ART $x 057000 $2 bisacsh). One participant noted that in OCLC-merged records there could be multiple 655s if there isn’t an exact character string match. Some catalogers add the term when creating an original record but do not add it when copy cataloging. The term could be added automatically by a program. Possibly, the term could be put in a 590 for staff use. A few catalogers mentioned they have an ebook search template in the OPAC or a discovery layer that can be used to find ebooks. One participant pointed out that the 655 rules have changed a lot recently.

A question about ISBNs asked where various ISBNs were recorded on ebook records. Often libraries take great pains to make sure that the ISBNs for the print version are labeled as $z on ebook records and ISBNs for e- are in $z on print records. It is useful to have the other ISBN to prevent the ordering of the other format if the print is already owned (or vice-versa). Not that the other version wouldn’t be purchased, but it is a good practice to flag staff if the title is already owned in a different format. One participant mentioned an excellent PowerPoint by Brian Green, the former Executive Director of the International ISBN Agency. This turned out to be a most sought after item by the e-forum participants!

Although normally demand-driven acquisition (DDA) is thought of as an acquisitions-based activity, the question was posited as to whether there were any procedures or issues related to DDA that were relevant for catalogers. One cataloger said that all his institution did was change the public note from “Read this MyLibrary ebook” to “Read this electronic book” once the book is officially purchased. He felt that the whole process was simple and required little manipulation on their part.

Regarding 856s, libraries generally remove any URLs from the record that are not relevant for them; often MarcEdit is the method of choice to remove them. Practices differ in other subfields in the 856 field. It would appear that the understanding and use of the subfield $3 varies from cataloger to cataloger, but most use this subfield to indicate vendor names. One cataloger said that they prepend their proxy information to the URL string, except in the case of open access journals. Some libraries ignore the $z note field; others use it to indicate “VIEW EBOOK” or “VIEW VIDEO.””
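The 856 practices described at the end of the summary can be illustrated with a small sketch. Everything here is an assumption for illustration only: fields are modelled as plain dictionaries keyed by subfield code rather than real MARC objects, and the proxy prefix, vendor names and URLs are made up.

```python
PROXY_PREFIX = "https://proxy.example.edu/login?url="   # hypothetical proxy address
LOCAL_VENDORS = {"VendorA"}                              # hypothetical: vendors we subscribe to


def localise_856(fields, open_access=False):
    """Keep only 856 fields for our vendors, proxy the URL, and set a public note."""
    kept = []
    for f in fields:
        if f.get("3") not in LOCAL_VENDORS:     # $3 used here for the vendor name
            continue                            # drop URLs not relevant to this library
        url = f["u"] if open_access else PROXY_PREFIX + f["u"]
        kept.append({"3": f["3"], "u": url, "z": "VIEW EBOOK"})
    return kept


# Two made-up 856 fields from an imported record.
fields = [
    {"3": "VendorA", "u": "https://a.example/book1"},
    {"3": "VendorB", "u": "https://b.example/book1"},
]

result = localise_856(fields)
print(result)
```

In practice this kind of batch edit is usually done with a tool such as MarcEdit, as the summary notes; the sketch only shows the logic.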

More on FAST

In building subject terms in an authority file, one decision that has to be made in dealing with compound subjects is precoordination vs postcoordination. In a precoordinated system, the indexer or cataloguer predetermines the combination of topic terms to be used as index terms or headings. In a postcoordinated system, the indexer or cataloguer uses single topic terms as the index terms or headings, and the terms are combined at search time. FAST uses the postcoordination approach for the syntax of its subject terms.
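The difference between the two approaches can be shown with a toy example. This is a rough sketch only: the headings are made up in an LCSH-like style with ‘--’ as the separator, and simply splitting strings is not how FAST headings are actually derived from LCSH.

```python
# Made-up precoordinated headings: the cataloguer fixed the combination in advance.
precoordinated = [
    "Air--Pollution--Singapore",
    "Forest fires--Indonesia--Sumatra",
]


def postcoordinate(headings):
    """Break each combined heading into single terms, keeping first-seen order."""
    terms = []
    for heading in headings:
        for part in heading.split("--"):
            if part not in terms:
                terms.append(part)
    return terms


single_terms = postcoordinate(precoordinated)
print(single_terms)
```

With the single terms, it is the searcher who combines ‘Pollution’ and ‘Singapore’ at query time, instead of relying on the exact string the cataloguer built.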



FAST (Faceted Application of Subject Terminology) is a service from OCLC. It is a ‘remake’ of LCSH.

“The Library of Congress Subject Headings schema (LCSH) is by far the most commonly used and widely accepted subject vocabulary for general application. It is the de facto universal controlled vocabulary and has been a model for developing subject heading systems by many countries. However, LCSH’s complex syntax and rules for constructing headings restrict its application by requiring highly skilled personnel and limit the effectiveness of automated authority control. Recent trends, driven to a large extent by the rapid growth of the Web, are forcing changes in bibliographic control systems to make them easier to use, understand, and apply, and subject headings are no exception. The purpose of adapting the LCSH with a simplified syntax to create FAST is to retain the very rich vocabulary of LCSH while making the schema easier to understand, control, apply, and use. The schema maintains upward compatibility with LCSH, and any valid set of LC subject headings can be converted to FAST headings…..


Metadata Service

Excerpt from “What the heck is a metadata service?”

The outcry regarding the Harvard Library’s restructuring brings to light the vulnerability of traditional technical services librarians.

…. Let’s re-frame and say we have a mandate to turn metadata services into *more visible* public services.

So what do we mean by that? Obviously, the next-gen metadata services are not the standard metadata services we’ve grown accustomed to in academic libraries (i.e., descriptive cataloging in MARC for our ILS, in DC for our repositories, DACS for archives, authority work, holdings work, etc.). These traditional services are not going away, let’s get that clear. Original cataloging will remain a large part of the work of huge libraries with lots of original monographs. The rest of us will keep doing traditional acquisitions, copy cataloging, etc. We must continue providing these services but they will take a lesser and lesser role. You understand the trend towards automating as much as possible in cataloging if you’ve been awake for the past 15 years. Observe academic libraries obtaining more and more electronic resources with record sets we manipulate in batch. This is what the FBI calls a clue. As we automate the old functions, we make room for doing the new functions of a Metadata Services Group.
