Crawford's Subfield V Questionnaire

From JOCRAW@macc.wisc.edu Wed Feb 14 15:56:33 1996
Date: Mon, 30 Jan 1995 10:33:08 -0500
From: Josephine Crawford
Reply to: usmarc@loc.gov
To: Multiple recipients of list
Subject: Proposal 95-2 : Report on $v Questionnaire : WARNING: 1100 LINE


January 27, 1995

to: USMARC list

from: Josephine Crawford (jocraw@macc.wisc.edu)
Health Sciences Library
University of Wisconsin

re: Report on the Fall 1994 MARBI Questionnaire
on Form Subdivision Subfield Delimiter


BACKGROUND
----------

In September 1994, I wrote and distributed a questionnaire on the USMARC list to pull together information about the probable impact of the introduction of the subfield $v in subject heading fields. For more information, please see Discussion Paper #74, Discussion Paper #79, and Proposal 95-2. This report contains the result of my compilation.

Many, many thanks to the dozen or so people who helped and guided me in my efforts, especially Arlene Taylor and Priscilla Caplan. And many thanks to the respondents who included substantive comments in their questionnaires; their analysis and insights have moved my thinking along considerably.

The questionnaire had 25 questions, divided into the following sections:

Questions 1-3 : Respondent/Organization identification
Questions 4-14: Impact of a minimal implementation
Questions 15-19: Retrospective conversion
Questions 20-23: Beyond a Minimal Implementation
Questions 24-25: Desirable lead time + general comments

The compilation below will show the wording of the original question, numerical results as appropriate, direct quotes from many of the questionnaires, and a few explanatory comments.

A word of warning about the results presented here, particularly the numerical results. The figures represent serious estimates of the amount of effort required by key North American institutions to implement the $v proposal "in a minimum way." Since not every affected system has responded, the totals do not represent an absolute measure of effort across all systems. I believe that the figures show the minimum effort required to improve the machine control we have over subject heading form subdivisions, for newly created records.

Responses indicate that some systems are interested enough in the potential of the $v that retrospective conversion might indeed be undertaken locally. Many questionnaire respondents acknowledged that their decision-making will be influenced by LC's actions.

The questionnaire did not attempt to gather estimates of the work required to perform retrospective conversion, because I felt too little was known about the potential process last summer to yield accurate results. Comments in the responses do suggest that conversion across libraries and systems is only feasible if it can be performed by machine alone; yet the algorithm presented in the retrospective conversion section (the best algorithm several minds could come up with last summer) assumed a combined machine/human conversion effort. More analysis work is underway by members of the ALCTS Subject Analysis Committee, which ought to help us estimate, in future, the work required to perform retrospective conversion across systems.

This questionnaire was an attempt to document the costs involved with a proposed USMARC change. I volunteered to take this on, thinking of it as an experiment, because I am interested in possible methods to improve and expedite MARBI decision-making. Your comments about the methodology and process are most welcome.

Please keep these comments in mind as you review and interpret the results below. And, if there are any questions about this report, please let me know by private email or by the USMARC list (your preference), and I will try to respond promptly, given the upcoming ALA Midwinter meeting.

note: an asterisk in the far left column indicates wording from the original questionnaire.


MARBI $V QUESTIONNAIRE RESULTS

*1. WHICH ORGANIZATIONS RESPONDED?

Ameritech Library Services
Blackwell North America
Cornell University
Data Research Associates
Geac Canada Limited
Harvard University Library
ISM Library Information Systems
JES Library Automation
Library of Congress, APLO and ITS Offices
Library of Congress, Cataloging Distribution Service
The MARC of Quality
Marcive, Inc.
Michigan State University, University Archives
National Library of Medicine
OCLC
Research Libraries Group
University of California, Division of Library Automation
University of North Florida
University of Texas at Austin
University of Wisconsin, Madison
VTLS, Inc.
WLN

I received a total of 22 responses. I treated one response as invalid, because the answers showed some misunderstandings, in my judgement, on the part of the institution.
*2. PLEASE CHECK ALL THAT APPLY TO YOUR ORGANIZATION
* __4__ Bibliographic utility
*
* __6__ Vendor of local systems
*
* __7__ Library which develops and maintains local system
*
* __6__ Vendor of services which process USMARC records
* (e.g. authority control service; GPO record
* processing)
*
* __5__ Vendor which develops specialized programs
* (e.g. an off-the-shelf program for creating spine
* labels or a customized program for downloading or
* uploading records)
*
* __10_ Developer of Z39.50 server/client software
*
*
* __3__ Other:
- A national library which creates and distributes original
  cataloging records.
- An institution responsible for a subject heading list.
- An institution responsible for an online union catalog.

Even though the total number of responses was low, the answers in this section show that a wonderful mix of organizations took a serious interest in the $v proposal. All four bibliographic utilities responded. Two national libraries responded. Vendors providing different types of systems and services responded. Finally, a large number of organizations working with Z39.50 client/server software responded, leading me to the conclusion that the questionnaire has pulled in data from older, well-entrenched systems as well as from newer, state-of-the-art systems.

*3. LIBRARY SYSTEM NAME and SOFTWARE NAME (IF DIFFERENT)

Horizon (Ameritech)
NOTIS (Cornell Univ)
Data Research (DRA)
Geac Advance (Geac Canada)
HOLLIS (Harvard Univ)
CATSS (ISM)
ULISYS (JES)
MUMS (LC)
CDS Products (LC)
MARC Review (MARC of Quality)
MicroMARC:amc (Michigan State Univ)
TESS (NLM)
PRISM (OCLC)
RLIN: Eureka and Zephyr (RLG)
MELVYL (Univ of Calif)
UTCAT (Univ of Texas)
Network Library System (Univ of Wisconsin)
VTLS
LaserCat & MARS (WLN)
* NUMBER OF INSTALLATIONS: total = 6733+
15 organizations supplied a figure to this question. Answers range from one installation (i.e. a local library), to vendors with installations numbering 12 to 600, to a bibliographic utility with 5,000+ cataloging installations.


QUESTIONS ON MINIMAL IMPLEMENTATION OF $V

*NOTE: Please answer the questions in this section based upon doing
*the minimum amount of work to handle subfield $v in your software.
*The assumption behind these questions is that, once defined in the
*USMARC formats, your organization will have to do some work with
*the subfield $v but may prefer to do so in a minimum way, due to
*other development priorities. Due to the uniqueness of your
*software, only you can determine the minimum amount of work which
*you must do but we would like to know how much effort this would
*involve.

*EXAMPLE: At the current time, form subdivisions are stored in
*subfield $x along with topical subdivisions. In a minimal
*implementation, you might choose not to perform retrospective
*conversion to move form subdivisions to subfield $v. But you do
*want new bibliographic records to reflect the current standard so
*that catalogers can accept cataloging records with $v from a
*utility. Therefore, you must add the subfield $v to your tag table
*and make changes to your indexing and display routines. To keep to a
*minimum your changes to your indexing program, you might choose to
*treat subfield $v in exactly the same manner as your indexing
*program currently handles the data in subfield $x.
*Presumably it is less work to add another subfield to programming
*logic already present in your software rather than to create new
*logic from scratch.

NOTE: SEVERAL QUESTIONNAIRE RESPONDENTS INDICATED IN THEIR COMMENTS THAT HANDLING THE $v IN EXACTLY THE SAME WAY AS $x WAS INDEED A REASONABLE METHOD FOR IMPLEMENTING THE $v IN THEIR SYSTEM, IF THE MARBI PROPOSAL PASSES AND IF THEIR INSTITUTION DECIDES TO DO THE MINIMUM TO IMPLEMENT IT.
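As a concrete illustration of how small a minimal implementation could be, here is a sketch of an indexing routine that treats $v exactly like $x. This is illustrative only; the (code, value) pair structure is an invented stand-in for any real system's internal record format.

```python
# Minimal-implementation sketch: index subject subdivisions, treating
# the new $v exactly like $x. The (code, value) pairs below are an
# invented stand-in for a real system's internal record structure.

def index_subdivisions(subfields):
    """Collect the indexable subdivision values from a 600-651 field."""
    # Before $v, a routine like this might have recognized x, y, and z;
    # the minimal change is simply adding "v" to the same set.
    subdivision_codes = {"x", "y", "z", "v"}
    return [value for code, value in subfields if code in subdivision_codes]

heading = [("a", "United States"), ("x", "History"), ("v", "Maps")]
print(index_subdivisions(heading))  # ['History', 'Maps']
```

With a one-line change like this, a form subdivision in $v indexes and retrieves exactly as it did when it sat in $x, which is the behavior several respondents described as acceptable for a minimal implementation.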

*4. When a new field or subfield is defined in the USMARC format,
*does your software require a change to one or more tag tables or
*data dictionaries? Or, is there another mechanism in place? In
*either case, please estimate the number of working days for all
*involved staff. If less than one day, please indicate how many
*hours expressed as a fraction.

*The following information is supplied to help you make this
*determination.
* Bibliographic format: add $v to 10 fields
* Authority format: add $v to 18 fields; define 4 new fields
* Classification format: add $v to 6 fields
* Community Information format: add $v to 8 fields
*The actual fields are listed in Attachment A of MARBI Discussion
*Paper No. 74.

 
*    __15__  TAG TABLE	     Working days   ____ 150.75 ____
 			
				low :		1 hour (Univ Texas; JES)
				high :		72 days  (LC MUMS system)
 				mean :		10.05 days
				median :	2 days
In other words, 15 organizations would need to do some work on a system tag table, for an estimated total of 150.75 days. There was a real difference in the range of time given to the Working Days part of the question; for that reason, I am supplying the range (lowest and highest estimates), the arithmetic mean (or average), the median (midpoint of actual values), and the name of the system supplying a specific estimate, when this seems of interest.
* __5__ OTHER MECHANISM Working days _____ 57.5 ______
 
 				low :		1 day (NLM)
				high :		30 days   (LC MUMS system)
				mean :		11.5 days
				median :	5 days
In other words, five organizations would need to spend time on some other mechanism, for a total of 57.5 days. Interestingly enough, a couple of organizations would need to spend time on both a tag table and some other mechanism.
* __3__ NO WORK NEEDED HERE.
In other words, three organizations would not need to do any work of this type.
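For readers checking the arithmetic, the summary statistics used throughout this report are the ordinary total, mean, and median of the reported working-day estimates; for question 4, mean = 150.75 days / 15 responses = 10.05 days. A quick sketch (the estimate list below is illustrative, not the actual question 4 data):

```python
# Sketch of the summary statistics reported for each question.
# The estimates list is hypothetical, not any question's actual data.
from statistics import mean, median

estimates_in_days = [0.125, 1, 2, 2, 5, 10, 72]  # hypothetical working-day estimates

print(sum(estimates_in_days))     # total, as reported per question
print(mean(estimates_in_days))    # arithmetic mean (average)
print(median(estimates_in_days))  # midpoint of actual values: 2
```

Note how a single large estimate (such as LC's 72 days) pulls the mean well above the median, which is why both figures are reported.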
*5. If the $v subfield is defined in the USMARC formats, will you
*need to make programming changes to any programs used to load new
*bibliographic records into your system? You may have more than
*one program and all should be taken into account (for instance, you
*may have a batch loading program as well as an online transfer
*program). If answering yes, please estimate the time in the same
*way as above.

* __8__ YES Working days _____ 56.5 ___________

				low :		1 day (NLM)
				high :		20 days (LC MUMS; also LC CDS)
				mean :		7.9 days
				median :	3 days
 
* __13_ NO WORK NEEDED HERE.

*6. If the $v subfield is defined in the USMARC formats, would you
*have to make any programming changes to any cataloger workforms or
*online cataloger displays that are used for inputting and editing?

* __5__ YES Working days _______ 18.5 __________

				low :		4 hours (Univ Texas)
				high :		10 days (LC MUMS)
 				mean :		3.7 days
				median :	5 days
 
* __15__ NO WORK NEEDED HERE.

*7. If the $v subfield is defined, would you have to make any
*programming changes to any other display programs for a "minimal"
*implementation?
* __14__ YES Working days ______ 150.25 _________

				low :		2 hours (Ameritech)
				high :		80 days  (LC CDS)	
 				mean :		10.7 days
				median :	1.75 days
 
For most of these 14 organizations, the amount of time was low such as one or two days. One vendor estimated 20 working days and LC estimated 30 working days for the MUMS system. LC also estimated 80 working days for displays programmed by CDS for the CDMARC Bibliographic product; this is the highest estimate.
* __7__ NO WORK NEEDED HERE.

*8. If the $v subfield is defined in the USMARC formats, would you
*have to make any programming changes to any data validation
*routines which might be present in your system? (It might be that
*all data validation for subject headings is done by authority
*control programs. If this is the case, answer NO here because of
*question 11.)

* __5__ YES Working days _______ 113 _______

				low :	    1 day (OCLC)
				high :	    72 days (LC MUMS)
				mean :      22.6 days
				median :    10 days
 						
* __15__ NO WORK NEEDED HERE.

*9. If the $v subfield is defined in the USMARC formats, would you
*have to make any programming changes to any programs which index
*subject heading fields?

* __10__ YES Working days _______ 183.5 ________

 	
				one response : "unknown number of days"
				low :		4 hours (Michigan State)
				high :		60 days (LC CDS)
				mean :		20.3 days
				median :	8 days
* __10__ NO WORK NEEDED HERE.

*10. If the $v subfield is defined in the USMARC formats, would you
*have to make any programming changes to any programs which create
*reports, printed forms, or printouts?

* __11__ YES Working days _______ 86.5 __________

 
				one response:   "unknown no of days"
 				low :		4 hours (Michigan State)
				high :		30 days (LC CDS)
				mean :		8.65 days
				median :	5 days	
 
* __10__ NO WORK NEEDED HERE.

*11. Some systems support links and/or global changes between
*authority records and bibliographic records. Sometimes this is
*done for names and not topical subject headings, or for one subject
*thesaurus but not for others. Sometimes this is done online and
*other times it is done in batch mode.
*    If subfield $v is defined, would you have to make any changes
*to any existing authority control programs? If yes, would the same
*changes apply to all thesauri or would you have to handle each
*thesaurus separately? Please check off whatever applies to your
*situation and estimate the number of working days/hours for each.
*Explanatory comments are also welcome (use as much space as you
*need).

* __5__ All thesauri in one program change

* Working days ____ 257 _________

				two responses:  "already estimated above"
				low : 		7 days (Univ Wisc)
				high :		240 days (BNA)
 				mean :		85.6 days
				median :	10 days
* __5__ Library of Congress Subject Headings

* Working days ____ 90 __________

				one response:   "already estimated above"
				low :		5 days (Marcive, Inc.)
				high :		50 days (LC CDS)
 				mean :		22.5 days
				median :	17.5 days
* __1__ LC's Children's Subject Headings

* Working days ______ 5 __________ (Marcive)

* __2__ Medical Subject Headings

* Working days _____ 15 ____________

 				low :		5 days (Marcive)
				high :		10 days (OCLC)
 				mean/median:	7.5 days
* __0__ Art and Architecture Thesaurus

* Working days ______ 0 ____________

* __1__ Other: ____ Sears ____

Working days ______ 5 ___________ (Marcive)

* __10_ NO WORK NEEDED HERE.

*12. Does your software support the USMARC classification format?
*If this is the case, and if the $v subfield is defined, what impact
*would this have on any classification programs? Please place an X
*by all the choices below which apply; explanatory comments are also
*welcome.

* __18__ USMARC classification format not supported.
* (if you check this, do not check any other choices)

* __1__ No impact on any programs which support the
* classification format.

* __1__ As currently programmed, fields with $v would
* print out as errors or exceptions.

* __1__ Programming changes already estimated above would be
* required. (That is, changes to tag table, printouts,
* etc.)

* __1__ Programming changes different from that estimated
* above would also be required. Please estimate the number
* of working days or, if less than one day, the number of
* hours expressed as a fraction.

* Working days ____ 6.25 ______ (LC CDS)

*13. Does your system software support the USMARC Community
*Information format? If this is the case, and if the $v subfield is
*defined, what impact would this have on any programs which support
*your community information records? Please check all choices below
*that apply.

* __17_ USMARC Community Information Format not supported.
* (if you check this, do not check any other choices)

* __1__ No impact on any programs which support the Community
* Information Format.

* __0__ As currently programmed, fields with $v would
* print out as errors or exceptions.

* __3__ Programming changes already estimated above would be
* required. (That is, changes to tag table, printouts,
* etc.)

* __0__ Programming changes different from that estimated
* above would also be required. Please estimate the number
* of working days or, if less than one day, the number of
* hours expressed as a fraction.

* Working days _____ 0 __________

*14. If the $v subfield is defined, would you have to make any
*programming changes to any software programs which create output
*files of USMARC records? For example, perhaps you have a program
*to write out serial records for inclusion in a regional list. Or,
*perhaps you are a vendor which processes GPO records in order to
*prepare them for loading into another system.

* __0__ Not applicable. No programs of this type.

* __14_ No impact on any program(s) creating output files.

* __6__ Programming changes needed for outputting records.
* Please estimate staff time in same manner as above.

* Working days _____ 158 __________

					low :	1 day (Cornell)
					high :	100 days (LC CDS)
					mean :	26.3 days
					median: 12 days
 
 		note: LC Cataloging Distribution Service
		is in the business of outputting records and has a variety
		of available products, resulting in a high estimate here.
 
 

RETROSPECTIVE CONVERSION

*By retrospective conversion, I am referring to a special project to
*identify form subdivisions in subfield $x of subject headings in
*existing files and then move them to subfield $v of the same
*subject heading. Such a conversion may need to be performed on all
*databases.

*To undertake such a conversion, your computer program would need
*several lists, such as:
* - a list of approved form subdivisions for each subject
*thesaurus.
* - a list of dual-function subdivisions if applicable for a
*specific thesaurus. (See the two LCSH examples above with "Maps"
*functioning in one heading as a form subdivision and in another
*heading as a topical subdivision.)

*A computer algorithm summarized as follows might be used:
* (1) In 600-651 fields, look for occurrence of terms which
*match those present on the list of approved form subdivisions for
*the specific thesaurus. If found, continue to next step.
* (2) Check the term against the list of dual-function form
*subdivisions if available for the specific thesaurus. If there is
*a match, print out the information for human review and editing.
*Otherwise, continue to next step.
* (3) Check to see if the term is the last subfield in the
*field. If this is the case, change the subfield delimiter from $x
*to $v. If this is not the case, print out the information.

As noted in my opening remarks, analysis is underway on the percentage of records that could be converted by machine versus those requiring human intervention. Preliminary results indicate that it may be possible to perform a machine conversion on some form subdivisions, with a very low error rate. Questionnaire respondents also supplied some good ideas on how to improve the algorithm described here.
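For concreteness, the three-step algorithm quoted above can be sketched as follows. This is a minimal sketch only: the FORM_TERMS and DUAL_FUNCTION_TERMS lists are invented stand-ins for a thesaurus's actual approved lists, and the (code, value) pair structure stands in for a real field parser.

```python
# Sketch of the three-step conversion algorithm described above.
# The term lists are invented stand-ins for a thesaurus's real lists
# of approved form subdivisions and dual-function subdivisions.

FORM_TERMS = {"Maps", "Periodicals", "Bibliography"}
DUAL_FUNCTION_TERMS = {"Maps"}  # form in some headings, topical in others

def convert_heading(subfields):
    """subfields: list of (code, value) pairs from a 600-651 field.

    Returns (subfields, needs_review): the possibly modified heading
    and a flag saying whether it must be printed for human review.
    """
    for i, (code, value) in enumerate(subfields):
        if code != "x" or value not in FORM_TERMS:
            continue                       # step 1: no match; keep looking
        if value in DUAL_FUNCTION_TERMS:
            return subfields, True         # step 2: dual-function; human review
        if i == len(subfields) - 1:
            subfields[i] = ("v", value)    # step 3: last subfield; flip $x to $v
        else:
            return subfields, True         # step 3: not last; print for review
    return subfields, False

h = [("a", "France"), ("x", "History"), ("x", "Bibliography")]
print(convert_heading(h))
# ([('a', 'France'), ('x', 'History'), ('v', 'Bibliography')], False)
```

Even in this toy form, the sketch makes the cost visible: every dual-function term, and every form term that is not the final subfield, falls out of the machine path and onto a human review list.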

*15. How many total records in your system(s) would need
*conversion?

* ___________ Number of records needing conversion

The wording of this question turned out to be ambiguous and I don't believe that the results are usable. Some people supplied the total number of records in their database, since presumably all would need to be examined during a conversion. Others supplied an estimate of the records which would require a modification from $x to $v. Others left this question blank or put in a comment like "only authority records need be converted."

*16. Do you have any suggestions on how to improve the suggested
*algorithm listed above? We are particularly interested in finding
*automatic methods to convert dual-function terms, in order to
*reduce or even eliminate the need for human review and editing of
*these terms. Take as much space as you need if you have any
*helpful ideas.

GEAC CANADA LMTD : "The algorithm looks sound for a first pass but this will result in a large list of headings which will require human review. It would be best if the authority record suppliers were to do the conversion and make those records available..."

HARVARD UNIVERSITY : "Build tables of combinations of possible form terms with switch/don't switch flags, or tables of terms which cause a preceding form term to be left as an $x, etc. Determine where coded characteristics in the Leader/008 can disambiguate dual-use terms."

MARCIVE, INC. : "Considerable study needed to develop adequate algorithm."

NLM : "MeSH MARC is available from NLM."

OCLC : "The algorithm as suggested would not be easily implemented within our current authority control. Subject corrections at OCLC for LCSH are not made based on their position in a subject string. Therefore, relying on matching against a defined set of terms, and then checking on position, while it will work for MeSH, will not work for LCSH."

WLN : "Leader byte 6, 007, and 008 data elements could certainly be used to provide additional information about the form of the material. E.G., if a record's subject had $xMaps and Leader byte 6 was 'e' or 'f' or 007/00 was 'a', change $xMaps to $vMaps, or use 008/24-27 (Nature of contents) in books and serials format to identify 'Catalogs", etc. It would be extremely complicated to analyze, design and program and I doubt that we would do this amount of work to correct only a portion of the records with dual-function subdivisions. Most of the dual- function subdivisions would require human review, record by record."
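WLN's Leader-based idea might be sketched like this. It is purely illustrative: the record structure and the cartographic test are assumptions drawn from the comment above, not WLN's actual code, and a production version would need similar rules per dual-function term.

```python
# Sketch of WLN's suggestion: use Leader/06 and 007/00 to decide by
# machine that $xMaps on a cartographic record is a form subdivision.
# The (code, value) structure is an invented stand-in.

def disambiguate_maps(leader06, field007, subfields):
    """Return (subfields, needs_review) for the dual-function term Maps."""
    # Leader/06 'e' or 'f' = cartographic material; 007/00 'a' = map.
    is_cartographic = leader06 in ("e", "f") or (field007 and field007[0] == "a")
    if not is_cartographic:
        return subfields, True  # cannot decide by machine; print for review
    converted = [("v", value) if (code == "x" and value == "Maps")
                 else (code, value)
                 for code, value in subfields]
    return converted, False

heading = [("a", "Wisconsin"), ("x", "Maps")]
print(disambiguate_maps("e", "aj canzn", heading))
# ([('a', 'Wisconsin'), ('v', 'Maps')], False)
```

This automates only the easy case WLN describes; as the comment itself concludes, most dual-function subdivisions would still fall through to record-by-record human review.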

*17. Do you think it is helpful or necessary for all or most
*databases to convert at the same time? Is this important, perhaps
*from the perspective of database maintenance, where catalogers move
*records from a utility to a local system?
* Please check only one response here.

* __2__ No opinion.

* __4__ Not necessary for conversion to occur at same time.

* __6__ Best if all systems convert databases at same time.

* __5__ Important if utilities convert at same time,
* whereas local systems can convert anytime
* afterwards.

*COMMENTS: (take as much space as you need)

HARVARD : "Of course it's best if all or most databases convert at the same time, but is it practical? You need to consider the likely administrative priority changing some $x's to $v's will get in large systems, particularly for LCSH where significant manual review would be required...."

LC CDS : "Would require redistribution of LC records to be effective and in synch. Can systems handle such a high volume redistribution?"

MARCIVE, INC. : "With present diversity of vendors, systems, and libraries, simultaneous conversion cannot possibly occur. Therefore, it does not matter if it is a good idea or not."

OCLC : "The movement of records is not unidirectional. Records may move from utility to local system, back to utility, and then to another utility. Records may also move from local system to local system. If any form of automated authority control is present at any step, the chances for reconversion and misconversion are greatly enhanced."

UNIV OF CALIFORNIA: "Conversion is important for user services. I can imagine our system adding these kinds of form subdivisions to our 'AND FORM' add-on search that now makes use of Leader values for Record Type, plus additional information in 007 fields to identify the type of material. A&I databases on our system often have 'document type' codes that are very useful...."

WLN : "In an ideal world, LC would convert all of its records, redistribute them, and save all systems xxx conversion effort and cost. But whatever LC's role in implementing the $v proposal, every system has to be prepared to convert records or be faced with mixed files because the past is long and old records and old style of headings will always be with us. Thus conversion programs will have to be running whenever records are loaded or batch processing will have to periodically perform this type of cleanup."

*18. Do you think that all conversions (across systems and
*databases) should be done using the same computer algorithms? Not
*only would standardization be helpful to catalogers potentially,
*but the systems analysis work would be done once on a national
*level, saving your organization some time perhaps. Please answer

*for each thesaurus.

* Library of Congress Subject Headings

_11_ YES _2_ NO _8_ No opinion.

* LC's Children's Subject Headings

_7_ YES _2_ NO _12_ No opinion.

* Medical Subject Headings

_8_ YES _2_ NO _11_ No opinion.

* Art and Architecture Thesaurus

_4_ YES _1_ NO _16_ No opinion.

* Other thesaurus: _________________________

_0_ YES _0_ NO _21_ No opinion.

*19. In your environment, is it reasonable for a "minimal
*implementation" to occur without undertaking retrospective
*conversion of existing databases and files, or would this cause
*problems forcing retrospective conversion?
* There are three possible responses below. If you check off
*the last response, please also specify any or all thesauri for
*which this would be true. We want to know if there are differences
*between thesauri.
* Explanatory comments are very welcome if retrospective
*conversion would be required.

* _2__ We have no databases to convert.

* _13_ It is reasonable for a minimal implementation to occur
* without retrospective conversion of existing databases
* and files. This statement is true for all thesauri
* present in our installation(s).

* _5__ In our environment, we would be forced to undertake a
* retrospective conversion even under the rubric of a
* minimum implementation. (If you check this response,
* please answer for each specific thesaurus.)

* _5__ Library of Congress Subject Headings

* _5__ LC's Children's Subject Headings

* _3__ Medical Subject Headings

* _1__ Art and Architecture Thesaurus

* _2__ Other: Canadian (Eng. & Fr.) and NAL

* COMMENTS: (take as much space as you need)

JES LIBRARY AUTOMATION : "It is not practical for us to ignore the retrospective conversion problems. The conversion program would require approximately 10 days to write and test. We also would supervise the conversion for our clients, which would require additional time."

HARVARD : "This depends on LC's policy with regard to the use of the $v for LCSH and on the policies of the utilities regarding accepting records with pre-$v coding. We anticipate no benefit from the $v implementation, and at a minimum would simply insure that $v could be imported, exported, and treated like $x's for all system functions. If LC were to adopt the $v and one of the utilities were to insist on its use in contributed records, we would have to consider retro. conversion. You do not ask for an estimate of the working days required to accomplish that? The only local reason would be the impact of the $v on heading maintenance over time..."

MARC OF QUALITY : "If retrospective conversions are necessary, our program can help libraries to do it themselves."

OCLC : "As a national utility, we have a commitment to provide high quality, accurate records following current standards as to tagging, coding, and content. Not converting would mean split files in our database, and for our member libraries. This would negate the advantages for indexing and displays that implementing the $v would provide. With the move towards greater reliance on batch processes to acquire records, more manual intervention on the part of receiving libraries would be needed if we don't convert, or if conversion is not simultaneous. Each library would incur costs and do repetitive labor as they duplicate the work to check and/or convert.
There was no question given on the amount of time needed to do a conversion. We feel that to do a conversion for just one thesaurus could take up to 2 years. Conversion for other thesauri probably could not be done simultaneously, so that amount of time to do all appropriate conversions could take even longer....."

WLN : "Conversion of the database for the single function 'form' terms from one subfield code to another is not the labor intensive part of this project for WLN. After all the validation software, edit tables, and other system changes are made, programs are already in place that can make this change given a list of 'before' and 'after' forms of subdivision headings. And since the WLN Authority File is linked to the bibliographic records, only the authority heading must be changed, not the individual bib records."


*
BEYOND A MINIMAL IMPLEMENTATION OF SUBFIELD $V

*NOTE: Some systems might want to take advantage of the definition
*of subfield $v to implement improvements to their software. We
*would like to know if it is likely that your organization will
*improve its software through explicit use of subfield $v.

*20. Is it likely that you will improve the display of subject
*headings on an index or headings screen, especially when a large
*number of headings is retrieved?

* __7__ Not applicable in our systems environment.

* __6__ Unknown at this time.

* __4__ YES. We plan to do this.

* __4__ NO. We do not plan to do this.

[multiple reasons given by same four organizations]
* _3_ Not seen as an improvement.

* _2_ Too much work.

* _2_ Other development priorities.

* _1_ Other. Please supply reason:
	"The display needs to indicate clearly that it is part
	of the existing subject index. Although changes are
	possible, the chances are that the display will remain
	the same as it is now." (OCLC)

*21. Is it likely that you will change your search software in
*order to allow for greater precision in searching? For instance,
*form subdivisions could be assigned their own unique search
*code in your software.

* _6_ Not applicable to our systems environment.

* _8_ Unknown at this time.

* _2_ YES. We plan to do this.

* _4_ NO. We do not plan to do this.

		[multiple reasons given by same six organizations]
*		     _3_ Not seen as an improvement.
 				
*                    _2_ Too much work.
 
*                    _3_ Other development priorities.
 
*                    _0_ Other.  Please supply reason:
 
 
*22. Is it likely that you would change your software to provide
*machine validation of form subdivisions from a list of valid forms
*for a specific thesaurus (if it exists)?

* _5__ Not applicable to our systems environment.

* _3__ Unknown at this time.

* _6__ YES. We plan to do this.

* _6__ NO. We do not plan to do this.

		[multiple reasons given by some organizations]

*                    _0_ Not seen as an improvement.

*                    _2_ Too much work.

*                    _5_ Other development priorities.

*                    _2_ Other.  Please supply reason:
	"Searching by subdivision is already supported." (RLG)

*23. Are there any other improvements which you are likely to do
*based upon the definition of the $v subfield? Please take as much
*space as you need to explain.

GEAC RESPONDED TO THIS QUESTION:
"As part of release 7.0 we are making a number of changes to the ADVANCE system which will greatly enhance the handling of subdivisions. This work is not based upon the implementation of the $v, however, the implementation would further ehnhance the usefulness of these changes."


*
CONCLUDING QUESTIONS

*24. If the subfield $v is defined in the USMARC formats, how much
*lead time would your organization like to have, so that you can
*more easily fold its implementation into your development schedule?

* SUPPLY AMOUNT OF TIME IN MONTHS: ___________________

			low :		0 months (MARC of Quality)
 			high :		24 months  (LC and OCLC both)
			mean :		8.3 months
			median :	6 months
 
			note: LC and WLN require that Format Integration
			work be completed first.
 
* OR
* SUPPLY AMOUNT OF TIME IN YEARS: ___________________

*25. If you have any additional comments, please feel free to make
*them here. Take as much space as you need.

BLACKWELL NORTH AMERICA: "From the point of view of authority control, once a $x has been validated as a topic, any repeated authority control on the field will require repeated validation. Possible solutions:

1) Two new subfields: one subfield for the term validated as a topic, and one subfield for the term validated as a form.
2) The authority control vendor maintains a file of validated strings showing 6xx and related 245."

HARVARD: "Our basic concern about this proposal is that it will require quite a lot of work to implement, that in the absence of universal retrospective conversion there are no benefits, and even with universal conversion, we are not convinced that the potential benefits would be substantial. Many of the desired effects can now be accomplished with Boolean operators, with the 655/755 field, or could be accomplished through coded elements in the LEADER or 008. Our main concern is the ambiguity of LCSH, and we cannot reliably judge the impact without knowing how and to what extent LC would choose to use the $v in LCSH. While some terms are obviously forms, others will require judgement (and result in disagreements) both on the part of the cataloger and on the part of the patron, which does not seem to be in the interest of cataloging simplification or improved access. For those thesauri where conversion could be completely automated, the project is a little easier, but we are still not overwhelmed by the potential benefits. Practically, like it or not, LCSH is the dominant thesaurus, and the preponderant impact of this proposal will be its impact on LCSH."

JES LIBRARY AUTOMATION: "The proposed change will have a greater impact on our clients than on our programmers. The headings that must be handled manually could result in much work. We already have some diagnostics [statistics] that can help..... I searched four databases for the keyword 'maps' and other possible dual function words that we discussed (periodicals, personal narratives, atlases, congresses, diaries, and biography). I counted the number of subject headings, the number with one $x, the number with more than one $x, the number with no $x, and the number in which a $x was the last subfield..... More than 100,000 headings may require changing in each database...."
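For illustration only, the kind of diagnostic counts JES describes could be gathered with a short script. The record layout below (each subject heading represented as a list of subfield code/value pairs) is a hypothetical simplification for the sketch, not JES's actual software or data format.

```python
# Hypothetical sketch of the diagnostic counts JES describes: for each
# subject heading, tally how many $x subdivisions it carries and whether
# a $x is the final subfield (a candidate for conversion to $v).

def tally_subdivisions(headings):
    """headings: list of headings; each heading is a list of
    (subfield code, value) pairs, a simplified stand-in for a 6xx field."""
    counts = {"total": 0, "no_x": 0, "one_x": 0, "multi_x": 0, "x_last": 0}
    for subfields in headings:
        counts["total"] += 1
        n_x = sum(1 for code, _ in subfields if code == "x")
        if n_x == 0:
            counts["no_x"] += 1
        elif n_x == 1:
            counts["one_x"] += 1
        else:
            counts["multi_x"] += 1
        # A trailing $x is the case most likely to need manual review.
        if subfields and subfields[-1][0] == "x":
            counts["x_last"] += 1
    return counts

sample = [
    [("a", "Wisconsin"), ("x", "Maps")],                          # one $x, last
    [("a", "Medicine"), ("x", "History"), ("x", "Periodicals")],  # two $x
    [("a", "Chemistry")],                                         # no $x
]
print(tally_subdivisions(sample))
# → {'total': 3, 'no_x': 1, 'one_x': 1, 'multi_x': 1, 'x_last': 2}
```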

LC MUMS SYSTEM : "Answers are provided by LC APLO (Collections Services) and ITS pertaining to the MUMS bibliographic system. CDS is providing a separate response for the MARC distribution system. Associated systems, such as SCORPIO and ACCESS for retrieval, and related client-server databases may also require modification. Many of the afore-mentioned activities would occur in parallel, within the context of a specification development, coding, integration testing, and assurance testing development cycle. It is estimated at this time that 181 working days are required.... for implementation in late 1996, following completion of Phase 2 of Format Integration."

LC CDS : "CDS, as the cataloging distribution arm of LC, offers many products including MARC records that implementation of this proposal would impact. Our responses to this survey reflect the impacts this proposal would have across all our products. The estimates given in our responses at this point are mere guesses. If this proposal goes forward, further in-depth analysis would have to be done to assess the wide-ranging impacts. We anticipate it will be no small feat to accommodate the new $v across all our products. In our estimation the benefits of this proposal do not outweigh the costs. As an alternative, we suggest that further exploration of the use of field 655 for bibliographic records be done for newly created and modified book records. The advantages of use of the 655 are twofold: 1) it raises the distinction to the field rather than the subfield level and 2) the 655 field already exists, hence is easier to implement."

THE MARC OF QUALITY : "Our software can let the user find any subfield code present in a MARC record. It can also find a given code and specified data. It could be used by libraries to assist in retrospective conversion, if the lists of approved form subdivisions and dual form subdivisions for each thesaurus are made publicly available. It could produce reports of the control numbers and relevant data in records needing to be changed, allowing the libraries to find and change those records manually from within their own system. Alternatively, libraries could use it to copy only those records needing to be changed into a separate file. This smaller file could then be sent to a vendor (or some other central site) for global changing, a significant cost saving over sending the entire file."
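As a sketch of the extraction step The MARC of Quality outlines, a small filter could copy into a separate batch only those records whose subject headings end in a subdivision found on an approved form-subdivision list. The list contents and record layout here are hypothetical examples, not the vendor's actual software.

```python
# Hypothetical sketch of the record-extraction workflow described above:
# select only those records whose 6xx headings end in a $x whose value
# appears on a (publicly available) approved form-subdivision list.

# Assumed example list; the real lists would come from each thesaurus.
FORM_SUBDIVISIONS = {"Periodicals", "Maps", "Congresses", "Diaries"}

def needs_change(record):
    """record: dict with a 'subjects' list; each subject is a list of
    (subfield code, value) pairs, a simplified stand-in for MARC 6xx."""
    for subfields in record["subjects"]:
        if (subfields
                and subfields[-1][0] == "x"
                and subfields[-1][1] in FORM_SUBDIVISIONS):
            return True
    return False

def extract_for_conversion(records):
    # The smaller extracted file could be sent to a vendor for global
    # changing instead of shipping the entire database.
    return [r for r in records if needs_change(r)]

batch = [
    {"id": "001", "subjects": [[("a", "Wisconsin"), ("x", "Maps")]]},
    {"id": "002", "subjects": [[("a", "Chemistry"), ("x", "History")]]},
]
print([r["id"] for r in extract_for_conversion(batch)])  # → ['001']
```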

NLM : "... MeSH presently distinguishes form subheadings from topical subheadings, and NLM retains this distinction within its internal files, including its cataloging systems.

Only at the point where NLM's systems interface with external systems conforming to current USMARC subfielding does NLM convert the data and combine form and topical subheadings in the same type of subfield.
Thus, while the implementation of $v by USMARC would require some changes to systems at NLM, the specific changes would be quite different-- and much less extensive-- than the changes required in systems that use subject heading schemes such as LCSH.
NLM *strongly* supports the use of $v because we already structure our subject headings [this] way and because it will simplify and improve how we import data from other sources."

UNIVERSITY OF TEXAS : "Subject search algorithms, either for browsing the subject index or for subject keyword search and retrieval, make no distinction between subject headings and ANY type of subdivision, so there is no point in distinguishing them for any altered indexing, retrieval, or display. We also would not expect to have to make such distinctions among types of subdivisions.

In the context of a browsable Subject Index, users were confused by the separation in the card catalog of subject entries among general, geographic, and chronological subdivisions. It would be more unproductive to emulate that pattern in an online catalog, where such arrangements are not readily seen in an overview as they are in a catalog drawer, but only in a screen-by-screen view of a subject index." [comment written in response to Question #20]

WLN : "The 655 field was supposed to perform the function which is now being proposed in this $v proposal. It appears that if this field had been widely used in the library community, there would be no need for the $v proposal. I fear that what is being proposed is a patch job, a reaction to the lack of full implementation of an existing field that has the potential for rich
genre/form data. The $v proposal will only make searching existing 6xx fields more complex when online catalog users want it to be easier.

Perhaps we should take the opposite approach and remove all form/genre terms from the 6xx fields and require a 655 field! Boolean and online searching techniques could combine topics with genre/form in the same way that materials are now scoped by format of material via the Leader byte 6. I think we need to be able to tell users that subject fields describe what an item is ABOUT and let another field indicate what the item IS..... "

***************************************************************************

Due to obvious space limitations, I have only selectively presented quotations from the questionnaire responses. I originally attempted to write a summary, but found my writing and synthesis skills not up to the task, given the extremely different points of view and the many complicated issues. I came to the conclusion that supplying direct quotes would work better.

This approach has added to the length of this report, and I applaud you if you have reached this ending point! Please note that, in my selection of quotes, I have attempted to represent all points of view in a balanced format, to the best of my ability. With luck, I have not overlooked any crucial remarks. But, if I have, please post them on the USMARC list!

Again, many thanks to the people who have aided me over the last five months or so. Please do not hesitate to contact me directly or to ask questions on the USMARC list about this report. I will do my best to respond quickly.

Josephine Crawford
Health Sciences Library
University of Wisconsin

(608) 262-5709 jocraw@macc.wisc.edu
