Load Bib, Holdings, and Item data

The DocStore is where bibliographic data such as MARC records, holdings and items are stored in OLE.

Institutions typically have millions of MARC records to load when they first implement OLE. Conventional methods such as the Batch process are impractical at that scale because of both time and memory constraints.

A simpler method is to load the MARC data directly into the database as MARCXML and then re-index. The whole process is extensively documented here.

However, institutions typically receive MARC data from Library Management Systems (LMS) in .mrc or .out (Millennium) format. These files can be converted to MARCXML with third-party tools such as MarcEdit for Windows, YAZ for Linux, or other tools recommended by the Library of Congress. The institution should verify the integrity of the converted data, because some tools are known to corrupt it through encoding problems (one version of YAZ always ignored or replaced diacritic characters).
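A quick integrity check on the converted file can catch the most visible encoding damage. The sketch below is only an illustration, not part of OLE: it counts the records in a MARCXML document and lists the records whose text contains U+FFFD, the replacement character many decoders emit for bytes they cannot map. The function name check_marcxml is hypothetical. Characters that a tool silently dropped will not be detected this way, so the record count should also be compared with the count reported by the legacy LMS.

```python
import xml.etree.ElementTree as ET

# Namespace of the Library of Congress MARCXML schema, in ElementTree notation.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"
REPLACEMENT_CHAR = "\ufffd"  # emitted by many decoders for unmappable bytes


def check_marcxml(xml_text):
    """Return (record_count, suspect_ids) for a MARCXML document string.

    suspect_ids holds the Control Field 001 value (or "#<position>" when
    001 is missing) of every record containing a replacement character.
    """
    root = ET.fromstring(xml_text)
    # The root may be a <collection> of records or a single bare <record>.
    if root.tag == MARC_NS + "record":
        records = [root]
    else:
        records = root.findall(MARC_NS + "record")

    suspects = []
    for position, record in enumerate(records, start=1):
        control_001 = record.find(MARC_NS + "controlfield[@tag='001']")
        if control_001 is not None and control_001.text:
            record_id = control_001.text
        else:
            record_id = "#%d" % position
        if REPLACEMENT_CHAR in "".join(record.itertext()):
            suspects.append(record_id)
    return len(records), suspects
```

Any record reported as suspect is worth comparing with the original .mrc record before loading.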

Holdings and Item data

The 9XX fields are reserved for institution-specific data in MARC, and legacy LMS widely use them to carry Item and Holding information, in addition to the 841-88X fields reserved for holdings and item data. Take care to extract this information so that it can be loaded into the corresponding holding and item tables.

Exports of MARC data from a legacy LMS may also be incomplete. In Millennium, for example, the exported MARC records lacked Item Barcode and Status information, and a special plugin from Innovative Interfaces, Inc. (III) was needed to extract the whole record.

Control Fields

Certain LMS allow institutions to use an alphanumeric Control Field 001. However, OLE supports only an 11-digit integer as the Bib Id. There are also cases where a single bib record holds multiple Control Field 003 values because of the lax validation requirements of legacy LMS.
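Records that violate these constraints can be found before the load rather than during it. The following sketch, with the hypothetical function name control_field_problems, inspects one MARCXML record element and reports a non-numeric or overlong Control Field 001 and repeated Control Field 003 values. It assumes that "11-digit integer" means an integer of at most 11 digits; adjust the check if the OLE installation is stricter.

```python
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"
MAX_BIB_ID_DIGITS = 11  # assumption: Bib Id is an integer of at most 11 digits


def control_field_problems(record):
    """Return a list of human-readable problems for one <record> element."""
    values = {"001": [], "003": []}
    for field in record.findall(MARC_NS + "controlfield"):
        tag = field.get("tag")
        if tag in values:
            values[tag].append((field.text or "").strip())

    problems = []
    if len(values["001"]) != 1:
        problems.append("expected exactly one 001, found %d" % len(values["001"]))
    for value in values["001"]:
        # isascii() guards against non-ASCII digits that isdigit() accepts.
        numeric = value.isascii() and value.isdigit()
        if not numeric or len(value) > MAX_BIB_ID_DIGITS:
            problems.append("001 %r is not an integer of at most %d digits"
                            % (value, MAX_BIB_ID_DIGITS))
    if len(values["003"]) > 1:
        problems.append("multiple 003 values: %s" % ", ".join(values["003"]))
    return problems
```

Running this over every record in the converted file gives a list of Bibs that need a replacement 001 or a cleaned-up 003 before they are loaded.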

One option is to write a small program that extracts information from these 9XX fields and builds a CSV file for loading into the holding and item tables. The program parses the MARCXML converted from the MARC file and extracts the crucial item and holding data. It can also be used to manipulate Bib data: replace Control Field 001, remove duplicate Control Field 003 values, move an alphanumeric Control Field 001 to a 9XX field, and so on.
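A minimal sketch of such an extraction program is shown here. Everything specific in it is an assumption made for illustration: the tag 945, the subfield codes i, s and l, and the CSV column names are hypothetical and must be replaced with the layout the legacy LMS actually uses and the columns the target tables expect.

```python
import csv
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Hypothetical mapping: the 9XX tag that carries item data, and the
# subfield codes that feed each CSV column. Adjust to the legacy export.
ITEM_TAG = "945"
ITEM_SUBFIELDS = {"i": "barcode", "s": "status", "l": "location"}


def item_rows(record):
    """Yield one dict per item-bearing field in a MARCXML <record> element."""
    control_001 = record.find(MARC_NS + "controlfield[@tag='001']")
    bib_id = (control_001.text or "") if control_001 is not None else ""
    for field in record.findall(MARC_NS + "datafield[@tag='%s']" % ITEM_TAG):
        row = {"bib_id": bib_id}
        row.update({column: "" for column in ITEM_SUBFIELDS.values()})
        for subfield in field.findall(MARC_NS + "subfield"):
            column = ITEM_SUBFIELDS.get(subfield.get("code"))
            if column:
                # A repeated subfield overwrites the earlier value.
                row[column] = subfield.text or ""
        yield row


def write_item_csv(xml_text, out):
    """Write item rows for every record in xml_text to the file object out.

    Returns the number of item rows written.
    """
    root = ET.fromstring(xml_text)
    columns = ["bib_id"] + list(ITEM_SUBFIELDS.values())
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    count = 0
    # iter() also matches the root itself when the file is a single <record>.
    for record in root.iter(MARC_NS + "record"):
        for row in item_rows(record):
            writer.writerow(row)
            count += 1
    return count
```

For files with millions of records, ET.iterparse can replace ET.fromstring so that the whole document is never held in memory at once.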

A second option is to load MARC records through the Webservice APIs. More detailed information, along with sample request XMLs, can be found here.

Bound-Withs

Institutions are likely to deal with bound-withs during data load. A bound-with is two or more independent monographs placed under one cover, either by the publisher or by the library after publication, and treated as a single entity for all circulation purposes.

To load bound-withs, a Bibliographic record is created for each independent monograph in the bound-with collection. One Bib is designated as the parent and the holding record is created for it. Finally, all the other Bibs are linked to the newly created holding.

The linking information goes into the ole_ds_bib_holdings_t table. The table contains the following columns:

Column          Data Type     Comment
BIB_HOLDINGS_T  Integer (11)  This is the Primary Key and will have to be unique.
HOLDINGS_ID     Integer (11)  This is the Primary Key from the ole_ds_holdings_t table representing the Holding data.
BIB_ID          Integer (11)  This is the Primary Key from the ole_ds_bib_t table representing the Bib data.
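The linking rows for one bound-with can be generated mechanically once the holding exists. The sketch below builds one (BIB_HOLDINGS_T, HOLDINGS_ID, BIB_ID) tuple per Bib, in the column order listed above. The function name bound_with_links and the first_link_id parameter are hypothetical; in practice the starting key must be chosen so that it does not collide with keys already in the table. Whether the parent Bib also needs a row here is not stated above, so the caller decides which Bib ids to pass.

```python
def bound_with_links(holdings_id, bib_ids, first_link_id):
    """Build rows for ole_ds_bib_holdings_t that link bib_ids to one holding.

    Each row is (BIB_HOLDINGS_T, HOLDINGS_ID, BIB_ID). Primary keys are
    assigned sequentially, starting at first_link_id (an assumed scheme).
    """
    if len(set(bib_ids)) != len(bib_ids):
        raise ValueError("a Bib id appears more than once in the bound-with")
    return [
        (first_link_id + offset, holdings_id, bib_id)
        for offset, bib_id in enumerate(bib_ids)
    ]
```

The resulting tuples can be written to a CSV file with the csv module or passed to a database driver's executemany call, depending on how the rest of the load is performed.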

Operated as a Community Resource by the Open Library Foundation