Oracle8 ConText Cartridge Application Developer's Guide
This chapter describes how ConText query applications can present documents with highlighted information.
The following topics are covered in this chapter:
In a typical query application, users can issue text or theme queries. The application executes the query and returns to the user a hitlist, allowing the user to select one or more documents.
When the user chooses a document, ConText enables you to present the selected document with the query terms highlighted for text queries, or with the relevant paragraphs highlighted for theme queries.
Your application can also present linguistic summaries of the selected documents.
For more information about linguistic output, see Chapter 7, "Linguistic Concepts".
With ConText, you use the CTX_QUERY.HIGHLIGHT procedure to create various forms of highlighted output that can be presented to users.
This chapter describes how to present highlighted documents for applications built in PL/SQL as well as applications built in a Windows 32-bit client-side environment.
The PL/SQL procedure CTX_QUERY.HIGHLIGHT generates filtered text, marked-up highlight text, and highlight information. You typically call CTX_QUERY.HIGHLIGHT after executing a text or theme query.
With text queries, HIGHLIGHT marks the relevant words or phrases in the document.
With theme queries, HIGHLIGHT marks the relevant paragraphs in the document.
Use CTX_QUERY.HIGHLIGHT to generate the following output for a document:
The positions and lengths of the query terms are specified as offsets from the beginning of the ASCII text version of the document.
When you call CTX_QUERY.HIGHLIGHT, you can specify the markup used to indicate the start and end of a highlighted word or phrase for text queries, or the start and end of a highlighted paragraph for theme queries.
If you specify no markup, HIGHLIGHT uses default markup, which depends on the format of the source document.
If the source document is an ASCII document or a formatted document, the default highlighting markup is three angle brackets immediately to the left (<<<) and right (>>>) of each term.
If the source document is an HTML document filtered through an external filter, the default highlighting markup is the same as the highlighting markup for ASCII or formatted documents (<<< and >>>).
If the source document is an HTML document filtered through the internal HTML filter, the default highlighting markup is the HTML tags used to indicate the start and end of a font change:
For more information about internal and external filters, see Oracle8 ConText Cartridge Administrator's Guide.
To present highlighted documents in an application, do the following:
The result tables required by the HIGHLIGHT procedure can be allocated manually using the CREATE TABLE command in SQL or using the CTX_QUERY.GETTAB procedure.
For example, to create a MUTAB table to store highlighted ASCII markup, issue the following statement:
To create a HIGHTAB table to store highlight offset information, issue the following statement:
Issue a one-step, two-step, or in-memory query to return a hitlist of documents. You can issue either a text or theme query. For text queries, you call CONTAINS with a text policy; for theme queries, you call CONTAINS with a theme policy. The hitlist provides the textkeys that are used to generate highlight and display output for specified documents in the hitlist.
Call CTX_QUERY.HIGHLIGHT with a pointer to a document (generally the textkey obtained from the hitlist) and a text or theme query expression.
CTX_QUERY.HIGHLIGHT returns various forms of the specified document that can be further processed or displayed by the application.
ConText uses the query expression specified in the HIGHLIGHT procedure to generate the highlight offset information and marked-up ASCII text. In addition, the offset information is based on the ASCII text version of the document.
While the query expression is usually the same as the expression used to return documents in the text query, it is not required that the query expressions match. For example, you might allow a user to search for all articles by a particular author and then allow the user to view highlighted references to a specified subject in the returned documents.
To create highlight mark-up for text queries, you must specify a text policy, which is usually the policy you specify with the CONTAINS procedure for the same query.
For example, to highlight all occurrences of the term dog in the document identified by textkey 14, issue the following statement:
ctx_query.highlight (
    cspec   => 'text_policy',
    textkey => '14',
    query   => 'dog',
    id      => 14,
    hightab => 'highlight_ascii',
    mutab   => 'mu_ascii' );
To create highlight mark-up for a theme query, you must specify a theme policy, which is usually the policy you specify with the CONTAINS procedure for the same query. With theme queries, the HIGHLIGHT procedure highlights the relevant paragraphs in the document.
For example, to highlight all the paragraphs that are relevant to the theme query computers in the document with textkey 12, issue the following statement:
ctx_query.highlight (
    cspec   => 'theme_policy',
    textkey => '12',
    query   => 'computers',
    id      => 12,
    hightab => 'highlight_ascii',
    mutab   => 'mu_ascii' );
You can use the MUTAB table to view highlighted ASCII text. For example, in SQL*Plus, you can issue the following statement to view a MUTAB table called mu_ascii:
You can also use the offset information in the HIGHTAB table to highlight the document in ways that suit your application.
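As an illustration of using offset information on the client side, the following Python sketch wraps each highlighted span in markup chosen by the application. It is not part of ConText: the function name apply_highlights, the sample rows, and the assumption that offsets count from 0 are all hypothetical. If the stored offsets count from 1, pass base=1. Fetching the rows from the HIGHTAB table is left out.

```python
from typing import Iterable, Tuple


def apply_highlights(text: str,
                     spans: Iterable[Tuple[int, int]],
                     start_tag: str = "<<<",
                     end_tag: str = ">>>",
                     base: int = 0) -> str:
    """Wrap each (offset, length) span of `text` in start_tag/end_tag.

    `base` is the index assigned to the first character by the offset
    scheme (assumed 0 here; use 1 if the stored offsets count from 1).
    """
    pieces = []
    cursor = 0  # end of the last span already copied to the output
    for offset, length in sorted(spans):
        start = offset - base
        end = start + length
        if start < cursor or end > len(text):
            raise ValueError(
                f"span ({offset}, {length}) overlaps or is out of range")
        pieces.append(text[cursor:start])                     # plain text
        pieces.append(start_tag + text[start:end] + end_tag)  # highlighted
        cursor = end
    pieces.append(text[cursor:])  # remainder after the last span
    return "".join(pieces)


doc = "The dog chased another dog."
rows = [(23, 3), (4, 3)]  # hypothetical (offset, length) rows, any order
print(apply_highlights(doc, rows, "<b>", "</b>"))
# prints: The <b>dog</b> chased another <b>dog</b>.
```

Sorting the spans first means the text is copied in a single left-to-right pass, so inserting one tag never shifts the offsets of spans that follow it.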
With text queries, the word or phrase is highlighted. For example, a text query on dog might produce the following type of highlighted ASCII output for a document:
With theme queries, the relevant paragraphs in the document are highlighted. For example, a theme query on computers might produce the following type of highlighted ASCII output for a document:
<<< LAS VEGAS -- International Business Machines Corp. is using the huge computer trade show here this week to try to prove a much disputed marketing claim of the past year and a half: that its PS/2 line of personal computers really does offer unique benefits.>>>
In the battle for the hearts and minds of the 100,000 dealers, corporate customers and other spectators gathered here, IBM has set up a series of demonstrations of the Micro Channel, which is the PS/2's internal data pathway. The demonstrations seek to show that this pathway has extra flexibility that can translate into more speed. One demonstration uses an add-in circuit board that IBM claims allows data to be sent over a network about 60% faster. Another illustrates a quicker way to store the huge amounts of data handled by a so-called file server, the machine that controls a network of personal computers.
<<< While most personal computers contain just one "master" processor -- the chip that tells the various parts of the computer what to do -- the Micro Channel allows for more than one. That means that in Micro Channel machines, the workhorse central processor can dump lots of work onto another processor, freeing itself to go about other tasks.>>>
...
In this three-paragraph excerpt of a news article that satisfies the theme query computers, ConText highlights (with angle brackets) only the paragraphs that are about computers.
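An application that receives marked-up text like this may want to pull out only the highlighted paragraphs, or swap the default markers for its own. The Python sketch below shows one possible way to do so with a regular expression. The function names and the sample string are hypothetical, and the sketch assumes the document text itself never contains the <<< or >>> sequences.

```python
import re
from typing import List

# Non-greedy match between the default markers; DOTALL lets a
# highlighted paragraph span several lines.
HIGHLIGHT = re.compile(r"<<<(.*?)>>>", re.DOTALL)


def highlighted_spans(marked_up: str) -> List[str]:
    """Return the text of every highlighted span, without the markers."""
    return [span.strip() for span in HIGHLIGHT.findall(marked_up)]


def remark(marked_up: str, start_tag: str, end_tag: str) -> str:
    """Replace the default <<< and >>> markers with application markup."""
    return HIGHLIGHT.sub(
        lambda m: start_tag + m.group(1) + end_tag, marked_up)


sample = ("<<< First paragraph about computers.>>> "
          "Second paragraph. <<< Third one.>>>")
print(highlighted_spans(sample))
print(remark(sample, "<mark>", "</mark>"))
```

A replacement function is used in remark instead of a replacement string so that backslashes in the application's tags are never interpreted as regular-expression escapes.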
After documents have been processed by the HIGHLIGHT procedure and displayed to the user, drop the highlight result tables.
If the tables were created manually, drop the tables using the SQL command DROP TABLE.
You can use the Oracle8 ConText Cartridge Viewer Control (CTXV32.OCX) to present highlighted documents to users in a Windows 32-bit environment, such as Windows NT or Windows 95. The viewer enables the user to browse documents in the supported formats with query terms highlighted.
You embed the control in client-side applications. To operate the viewer, you need not write any PL/SQL code; given the database connection, the document textkey, and the query term, the viewer control displays the document with highlights.
The user can view a Word document, for example, as it would appear in Microsoft Word. The user can also scroll through the document using the Next and Previous buttons to jump to the next or previous occurrence of the search term(s).
Because OCX modules are not stand-alone executables, you need a development environment such as Visual C++ or Visual Basic to use the ConText Viewer Control. Within such an environment, you can add the control to the tool palette, from which you can place instances of the control on a form or canvas.
For example, in Visual Basic 4.0, you add the control to the tool palette by selecting Custom Controls from the Tool menu. Use the browser to select the Oracle8 ConText Viewer Control, CTXV32.OCX, from the oracle_home\BIN directory.
Alternatively, you can create instances of the control dynamically, using the identification string CTXV32.CTXViewer.1.
If the viewer control is embedded in an HTML page, the browser must support ActiveX components, and the client machine must have the viewer installed with all required support files. The viewer uses SQL*Net to communicate with the database. Within HTML, you can invoke the control's methods using Visual Basic scripting, for example, and change its properties with the OBJECT tag and its parameter settings syntax.
You can use the ConText Viewer Control to view documents in the following server-side supported formats:
If you are not using the ConText Viewer Control to present documents in a 32-bit Windows environment, you can use the ConText I/O utility (CTXIO32) to move documents (highlighted or not) from database tables to the client operating system and vice versa. Documents downloaded to the client operating system can be viewed in their native applications.
For more information about the 32-bit Windows I/O utility, see the Oracle8 ConText Cartridge Workbench User's Guide.