Oracle8 ConText Cartridge Workbench User's Guide Release 2.3 A57700-01 |
|
This chapter describes how to use the Oracle8 ConText Cartridge Input/Output (I/O) utility. You can use this utility in Windows 32-bit environments such as Windows NT and Windows 95.
The topics covered in this chapter are:
The I/O utility enables you to move text from database to client-side files in a Windows 32-bit environment and vice-versa.
The I/O utility can be used to perform the following operations:
These operations are performed in a sequence defined by a parameter file that you specify on the command line.
The I/O utility has the following command-line syntax:
ctxio32 [-s] [-w] [-p file] [-l file] [-v editor] [-d connect_string]
where:
Parameter | Description |
---|---|
-s |
This allows the user to confirm the connect string and the name of the parameter file. |
-w |
Messages in the parameter file are displayed in this window during processing. |
-p file |
Specifies the name of parameter file. For more information about the format of the file, see "Parameter File Structure" in this chapter. |
-l file |
Specifies the name of log file. Error and debug messages, as well as any messages in the parameter file, are output to this file. No messages are reported directly to the user. |
-v editor |
Displays log file, using given editor (a Windows executable, such as Notepad). |
-d connect_string |
Specifies an explicit database username/password connection string. If "-d none" is given, there is no log on and no prompt so no database session is started. This option would normally be used to spawn an operating system command. If -d is omitted, the user is prompted to enter the connection string. |
The general structure of a parameter file is similar to the tag structure used in HTML/SGML:
Start Tag | End Tag | Text | Description |
---|---|---|---|
<SQL> |
</SQL> |
Limited to a maximum of 32k of SQL split over many lines. Keep each line below 2k. If the first character is @, the SQL is read from the file that follows; for example, @oco.sql loads SQL from the file oco.sql, and executes it. SQL read from a file must not exceed 5Mb. |
Encloses SQL commands. |
<GET> |
</GET> |
Line 2: Table containing source document Line 3: Column containing source document Line 4: WHERE clause (excluding WHERE keyword)
|
Writes a document from a database table to an operating system file. |
<PUT> |
</PUT> |
Line 2: Destination table name Line 3: Destination column name Line 4: WHERE clause (excluding WHERE keyword)
|
Puts an operating system file into a database table.
|
<REG> |
</REG> |
Reg_key, Reg_value [, Trunc_string] |
Registration entries can be used within <EXE> </EXE> pairs. Reg_key is the registration key, as it appears in the lefthand pane of the Registry Editor window. Reg_value is the registry entry name as it appears in the righthand pane of the Registry Editor window. This defaults to "(Default)". Trunc_string is optional and causes truncation of the fetched registry value from the given string. Many useful registry entries have '%1' or similar placeholders. See the parameter file examples further on. |
<LOAD> |
</LOAD> |
Column_name[COL_OPTIONS],Column_name[COL_OPTIONS]... The backslash represents the escape character. Therefore, to obtain \, use \\. Or to obtain ", use \\'. |
Line 1: TAB_OPTIONS := (Replace) Replace: Data is updated rather than inserted into the database. This option is used if the row that satisfies the 'where' clause already exists.
COL_OPTIONS := (COL_OPTION, COL_OPTION...) COL_OPTION := Reads the contents of the given file into the database column. After reading the file replaces all occurrences of X with Y. There can be more than one Replace call per column. X and Y are case-sensitive if C is omitted or if C equals "N". X and Y are case-insensitive if C equals "Y". After reading the file removes all text between X and Y inclusive of X and Y. If Z is given, remove the text only if Z occurs between X and Y. X, Y and Z are case-sensitive if C is omitted or if C equals "N". X and Y are case-insensitive if C equals "Y". |
|
|
|
ExtractOne "X" "Y" ["Z"] ["C"] ExtractAll "X" "Y" ["Z"] ["C"] After reading the file, extracts all text between X and Y, excluding X and Y. If Z is given, extracts the text only if Z occurs between X and Y. ExtractOne extracts only the first instance of the text; ExtractAll extracts all instances with a carriage return/line feed between instances. X, Y and Z are case-sensitive if C is omitted or if C equals "N". X and Y are case-insensitive if C equals "Y". This parameter copies the post-filtered versions of the input files into directory X. This is useful for checking that search/replace processing has worked as expected. Any temporary files are created by default in the home directory; X can be used to specify another file name that is used instead. Typically, this is not required. This defines the columns to be used to define the where clause. This is used to insert the file data into the database and to perform the updates during Replace mode. The file names given refer to files in this directory. X must end in a double backslash ("\\"). For each file, this spawns a user-defined filter program to pre-filter the files before any Extract, Replace or Ignore actions. The program must take an input file as the first argument, and an output file as the second argument. X is the program name (including the full path). |
|
|
|
This generates a number of the value of this column. This value is initially n (default = 1). LoadDir "A" "B" "C" "D" "E" "F" This recursively scans files below directory A (which must have a trailing slash) and looks for files of type B (containing wildcards and separated by ';'). Use C as the directory separator. Prefix all files with D. Recursion ceases when more than E directory levels is reached (a value of 0 means no limit is set). Do not recurse into directories F (where multiple directories are separated by ';'). See the LoadDir Example further on. This is related to LoadDir. The generated file from LoadDir is used as input for this column. Line 3: Column_data = Strings, dates and so on should be quoted as if they were entered in an INSERT SQL statement. Only ASCII files can be filtered with Replace, Ignore, ExtractOne, ExtractAll. |
<EXE> |
</EXE> |
OS command. |
Operating system command. This can be split over many lines. If so, carriage returns are removed and replaced by spaces. |
<QU> |
</QU> |
Line 3: No_label |
This prompts the user with the given question and resumes execution after the line starting 'Yes_label:' if the user responds 'yes'. If the user responds 'no', execution is continued after the line starting 'No_label:'. Alternatively, labels may take the form '#<string>' |
<GOTO> |
label |
None. |
Resumes execution after the line starting 'label:' |
<MSG> |
end of line |
Message |
Displays message in log file and, if -w option is chosen, displays in status window. <MESS> is still supported as an alternative to <MSG>. |
<ERRORCODE> |
end of line |
Error number |
Sets the error code to the given number. If a command fails, and the current error code is non-zero, then that error code is returned by ctxio32.exe. If the -r command line option is used in this case, the error code is also placed in the registry at the following location: "HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\ConText Workbench\ctxio_ret" |
<AUTOCOMMIT> |
end of line |
On | Off |
Sets the standard autocommit function. If On, there is a commit after every SQL statement until either it is turned Off, or the end of the session. |
# |
end of line |
None |
Parameter file comment - not used. |
This section provides the following four examples for using the I/O utility:
"y:\\webpages\\" "*.htm *.html" "/" "http://oracle.uk.oracle.com" 20 ""
This example loads all .htm and .html files into and below y:\webpages (up to a maximum of 20 levels). The path from y:\webpages\ is then built by using '/' as the separator, and this relative path is then prefixed by 'http://oracle.uk.oracle.com/'.
So, if y:\webpages contains a.htm, subdir\b.html and subdir\c.htm, then the following three strings are loaded into the database:
'http://oracle.uk.oracle.com/a.htm' 'http://oracle.uk.oracle.com/subdir/b.html' 'http://oracle.uk.oracle.com/subdir/c.htm'
Head(GeneratedFile,ExtractOne "<TITLE>" "</TITLE>" "" "Y")
This example loads the file found by LoadDir and extract the text between 'TITLE' and '/TITLE'. Case is ignored because of the "Y" parameter.
The file below is a sample CTXIO32 parameter file. It is installed as part of the ConText Workbench.
The command to execute this parameter file is:
ctxio32 -v notepad -s -w -p pfile.txt -l log.txt -d ctxdemo/ctxdemo
# # CTXIO sample script # File : pfile.txt # # Script split into 2 # # 1) First section (#ReplaceAll) replaces all employees. # 2) Second section (#ReplaceSome) replaces some employees. # # The <QU> .. </QU> section asks a question and jumps to the relevant tag # <QU> Do you wish to replace ALL employees? #ReplaceAll #ReplaceSome </QU> # # ----------- This section replaces all current employees ----------- # #ReplaceAll <MESS>Drop employee table <SQL> DROP TABLE CTXIO_EMP </SQL> <MESS>Create employee table <SQL> CREATE TABLE CTXIO_EMP (EMPNO NUMBER, EMPNAME VARCHAR2(100), RESUME LONG) </SQL> <SQL> alter table ctxio_emp add constraint unique_ctxioemp unique(empno) </SQL> <MESS>Truncate employee table <SQL> truncate table CTXIO_EMP </SQL> <MESS>Loading ALL employees into database... # Use SaveDir so we can check search and replace has required affect <LOAD> CTXIO_EMP EMPNO(WHERE), EMPNAME, RESUME(ReadFromFile, Ignore "<Confidential>" "</Confidential>", Replace "brewing" "chemistry", Replace "Brewing" "Chemistry", SaveDir "temp") 1,"Joe Bloggs","jbloggs.htm" 2,"Ray Smith","rsmith.htm" 3,"Lisa Turner","lturner.htm" </LOAD> # Skip next bit which is for single update only <GOTO> #Common # # -------------- This section updates the resumes of some employees -------------- # #ReplaceSome <MESS>Replacing resumes in database... # Use SaveDir so we can check search and replace has required affect <LOAD> CTXIO_EMP (REPLACE) EMPNO(WHERE), EMPNAME, RESUME(ReadFromFile, Ignore "<Confidential>" "</Confidential>", Replace "brewing" "chemistry", Replace "Brewing" "Chemistry", SaveDir "temp") 2,"Ray Smith","rsmith.htm" </LOAD> # Load this resume without search and replace <PUT> jbloggs.htm ctxio_emp resume EMPNO=1 </PUT> # # ----------- This section common to both ----------- # #Common # Check Smith's CV was filtered and stored OK <GET> temp\rsmith.htm ctxio_emp resume EMPNO=2 </GET> # # ----------- Create OCO index ----------- # <MESS>Dropping policy and index (may produce errors but this is OK)... <SQL> begin ctxsys.ctx_ddl.drop_index('CTXIO_EMPRES'); end; </SQL> <SQL> begin ctxsys.ctx_ddl.drop_policy('CTXIO_EMPRES'); end; </SQL> <MESS>Creating policy on test table and creating index... <SQL> begin ctxsys.ctx_ddl.create_policy(policy_name=>'CTXIO_EMPRES', colspec=>'CTXIO_EMP.RESUME', textkey=>'EMPNO'); end; </SQL> <SQL> begin ctxsys.ctx_ddl.create_index('CTXIO_EMPRES'); end; </SQL> # # ----------- Do some OCO querying ----------- # <MESS>Create marked up table <SQL> drop table mutab </SQL> <SQL> create table mutab (id number, document long) </SQL> <MESS>Get highlights... <SQL> begin ctxsys.ctx_query.highlight(CSPEC=>'CTXIO_EMPRES', TEXTKEY=>2, query=>'chemistry,Chemistry', mutab=>'MUTAB', ID=>999,starttag=>'<EM>', endtag=>'</EM>'); end; </SQL> <MESS>Write marked up text to file... <GET> mudoc.htm mutab document id=999 </GET> # This views the marked-up resume in notepad #<EXE> #notepad mudoc.htm #</EXE> # This views the HTML resume in Netscape <EXE> <REG>HKEY_CLASSES_ROOT\NetscapeMarkup\shell\open\command,,"</REG> file:///mudoc.htm </EXE> #Finished
This example demonstrates how to load a website. In particular, it shows how to use the LoadDir option of the I/O utility.
<MESS>Drop table <SQL> DROP TABLE WEBPAGE </SQL> <SQL> DROP TABLE WEBPAGE_OPTIONS </SQL> <MESS>Create tables.. <SQL> CREATE TABLE WEBPAGE(DOCNO NUMBER, DOC LONG, HEAD VARCHAR2(1000)) </SQL> <SQL> CREATE TABLE WEBPAGE_OPTIONS(NAME VARCHAR2(100), VALUE LONG, DES VARCHAR2(1000)) </SQL> <SQL> CREATE TABLE WEBPAGE_HL(ID NUMBER, DOCUMENT LONG) </SQL> <MESS>Create sequence <SQL> CREATE SEQUENCE HL_SEQ </SQL> <SQL> alter table WEBPAGE add constraint unique_WEBPAGE unique(docno) </SQL> <MESS>Truncate table <SQL> truncate table WEBPAGE </SQL> <AUTOCOMMIT> On <MESS>Scanning and loading roll ... <LOAD> WEBPAGE Docno(Where, Sequence 1), Doc(LoadDir "j:\\www\\ocowsite\\" "*.htm;*.html" "/" "http:// roll.uk.oracle.com/" 20 "www"), Head(GeneratedFile, ExtractOne "<TITLE>" "</TITLE>" "" "Y") </LOAD> <MESS>Scanning and loading shake ... <LOAD> WEBPAGE Docno(Where, Sequence 1000), Doc(LoadDir "y:\\webpages\\" "*.htm;*.html" "/" "http:// shake.uk.oracle.com/" 20), Head(GeneratedFile, ExtractOne "<TITLE>" "</TITLE>" "" "Y") </LOAD> <MESS>Scanning and loading uksn16 ... <LOAD> WEBPAGE Docno(Where, Sequence 2000), Doc(LoadDir "h:\\etc\\httpd\\htdocs\\" "*.htm;*.html" "/" "http:// uksn16.uk.oracle.com/" 20), Head(GeneratedFile, ExtractOne "<TITLE>" "</TITLE>" "" "Y") </LOAD> <MESS>Dropping policy and index (may produce errors but this is OK)... <SQL> begin ctxsys.ctx_ddl.drop_index('WEBPAGE'); end; </SQL> <SQL> begin ctxsys.ctx_ddl.drop_policy('WEBPAGE'); end; </SQL> <MESS>Creating policy on test table and creating index... <SQL> begin ctxsys.ctx_ddl.create_policy(policy_name=>'WEBPAGE', colspec=>'WEBPAGE.DOC', filter_pref => 'CTXSYS.HTML_FILTER', dstore_pref => 'CTXSYS.DEFAULT_URL', textkey=>'DOCNO'); end; </SQL> <SQL> begin ctxsys.ctx_ddl.create_index('WEBPAGE'); end; </SQL> <SQL> TRUNCATE TABLE WEBPAGE_OPTIONS </SQL> <MESS>Initializing search page options ... <LOAD> WEBPAGE_OPTIONS NAME, VALUE, DES 'GenShortTitle','OCO Websearch','Short generic title' 'FrontTitle','Oracle8 ConText Cartridge Web Site Search','Title of front page' 'FrontHeading','<center><h1>Oracle8 ConText Cartridge Web Site Search</h1><center>','Heading of front page' 'FrontIncludeCount','Y','Include web pages count (Y/N)' 'FrontContentsPreQuery','','Contents of page after count and before query section (max 32k)' 'FrontContentsPostQuery','','Contents of page after query section (max 32k)' 'ResultTitle','Oracle8 ConText Cartridge Search Results','Title of results page' 'ResultHeading','<center><h1>Oracle8 ConText Cartridge Search Results</h1></center>','Heading of results page' 'ResultShowExpression','Y','Show query expression (Y/N)' 'ResultLabelHeading','Page heading','Label to describe heading' 'ResultLabelScore','Score','Label to describe score' 'ResultLabelPage','Page','Label to describe page' 'ResultHighStart','<Font color=red size=+4>','Highlight on' 'ResultHighEnd','</font>','Highlight off' 'ResultTagStart','<OCOH>','Internal use' 'ResultTagEnd','</OCOH>','Internal use' 'ResultContentsPreTable','','Contents of page before the results table (max 32k)' 'ResultContentsPostTable','','Contents of page after the results table (max 32k)' 'ResultHighNextYN','Y','Show the [Next] link after each highlight (Y/N)' 'ResultHighNext','<font size=-1>[Next]</font>','If ResultHighNextYN=Y, this is displayed after each highlight' 'ResultHighFirst','[First Highlight]','If ResultHighNextYN=Y, this is displayed at start and end of document' </LOAD> <PUT> frontpre.txt webpage_options value name='FrontContentsPreQuery' </PUT> <PUT> frontpos.txt webpage_options value name='FrontContentsPostQuery' </PUT> <PUT> respre.txt webpage_options value name='ResultContentsPreTable' </PUT> <PUT> respos.txt webpage_options value name='ResultContentsPostTable' </PUT>
All the functionality of ctxio is incorporated in a single dll (ctxio32l.dll). Developers can call the C routines within this dll directly.