||Recreate fulltext index
||ScanRecs(D, I, R, Fields(Felder), ExtABC, Cut, Kontraindex, Modus, Step, MaskenFeld, DynKontraIndex) : INTEGER
||All parameters except the first four are optional!
D table number of the source table
I table number of the index table
R table number of the relations table
Fields is a special function that tells ScanRecs which fields are included in the fulltext index. The field list can be specified either statically or dynamically. In CGI programming only the dynamic version is possible, because no tables are open when compiling in this mode.
Static version: Fields(Feld1,Feld2,Feld3...)
Dynamic version: Fields("Feld1,Feld2,Feld3...")
All fields of the table (except blobs) are allowed. Date, time and number fields are converted to strings. For select fields, the corresponding text constant is included in the fulltext index.
With string fields there is an important exception:
If a string field starts with "#", the rest of the string is interpreted as a path to an external text document. This version of the tdbengine supports external ASCII, ANSI and HTML documents.
Additionally, the contents of ADL-linked records can be included in the fulltext index via the L-field notation and via the R-field notation.
The optional parameters:
ExtABC is a string of characters that count as word elements in addition to letters. For example, if post codes should also be included in the fulltext index, it must contain "0123456789".
Tip: The dash should not be included as a word element (except in special cases), because then the individual parts of compound words can no longer be found (as easily and quickly) via the fulltext search.
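The effect of ExtABC can be illustrated with a small sketch. It is written in Python rather than in the tdbengine's own language and only mimics the idea; the function name split_words is hypothetical, and the engine's real rules (for example regarding upper and lower case) may differ.

```python
def split_words(text, ext_abc=""):
    """Split text into words made of letters plus any characters in ext_abc."""
    words, current = [], []
    for ch in text:
        if ch.isalpha() or ch in ext_abc:
            current.append(ch)          # character belongs to the current word
        elif current:
            words.append("".join(current))  # separator ends the current word
            current = []
    if current:
        words.append("".join(current))
    return words

sample = "80331 Munich, Karl-Heinz"
print(split_words(sample))                 # letters only
print(split_words(sample, "0123456789"))   # digits count as word elements
print(split_words(sample, "0123456789-"))  # dash included: compound stays whole
```

With the dash included, "Karl-Heinz" ends up as a single word, so a search for "Heinz" alone would no longer hit it directly, which is the situation the tip warns about.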
The next two parameters are used to restrict the fulltext index. In many cases words carrying little information, such as "and" or "the", should not be included in the fulltext index.
Cut: if a value larger than 0 is specified, words that occur more often than Cut are not included in the fulltext index.
Kontraindex is the name of an external text file listing the words that are not to be included in the fulltext index. Each word is written on its own line; no particular order is required.
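How the two restrictions work together can be sketched as follows. This is a Python illustration, not tdbengine code. It assumes that Cut compares against the total number of occurrences of a word and that the contra index is matched exactly; the source does not specify either detail, and filter_words is a hypothetical name.

```python
from collections import Counter

def filter_words(words, cut=0, contra_words=()):
    """Return the distinct words that would remain in the index."""
    counts = Counter(words)
    contra = set(contra_words)   # one word per line in the contra index file
    kept = []
    for word, count in counts.items():
        if word in contra:
            continue             # listed in the contra index
        if cut > 0 and count > cut:
            continue             # occurs more often than Cut
        kept.append(word)
    return sorted(kept)

words = ["the", "cat", "the", "dog", "the", "and"]
print(filter_words(words))                              # no restriction
print(filter_words(words, cut=2))                       # "the" occurs 3 times
print(filter_words(words, cut=2, contra_words=["and"]))
```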
The next parameter (Modus) defines the fundamental behaviour of the tdbengine when creating a fulltext index. The individual modes are simply added together:
0 The index table is recreated
1 An existing index table is used and only new words are added
2 An existing index table is used but no new words are added
4 *) Only the absolute necessary files are created
8 HTML tags are skipped and HTML special characters are converted to ASCII
32 External texts are in ASCII format (otherwise in ANSI)
*) This mode is only partially compatible with the relational system of the tdbengine. It is most advantageous when the index table and the relations table are accessed exclusively via the fulltext functions. Moreover, the required memory is reduced to less than 50% and the search speed increases significantly.
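Because the mode values are powers of two, adding them is the same as setting individual bits. The following Python sketch (not tdbengine code) shows how a combined Modus value is built and taken apart again; the constant names are made up for this illustration, only the numbers come from the list above.

```python
# Mode values from the list above; the names are hypothetical.
KEEP_AND_ADD = 1        # existing index table, new words are added
KEEP_NO_NEW_WORDS = 2   # existing index table, no new words
MINIMAL_FILES = 4       # only the absolutely necessary files
HTML_INPUT = 8          # skip HTML tags, convert special characters
EXTERNAL_ASCII = 32     # external texts are ASCII instead of ANSI

KNOWN_FLAGS = (KEEP_AND_ADD, KEEP_NO_NEW_WORDS, MINIMAL_FILES,
               HTML_INPUT, EXTERNAL_ASCII)

def decode_modus(modus):
    """Return the documented mode values contained in a combined Modus."""
    return [flag for flag in KNOWN_FLAGS if modus & flag]

modus = KEEP_AND_ADD + HTML_INPUT + EXTERNAL_ASCII
print(modus)                # 41
print(decode_modus(modus))  # [1, 8, 32]
```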
Step should only be set to a value other than 0 if a really huge amount of data is handled. It is the amount of memory that the tdbengine requests from the operating system. More memory means faster processing in this case. But too large a memory request can force the operating system to swap memory out, which leads to a massive drop in performance! Suppose you have a 256 MByte system on which only a few memory-hungry processes run; then you can enter e.g. the value 100000000 (= 100 million) here.
Tip: Change this value only if the indexing times are unusually long.
MaskenFeld is the number of a field in the source table that stores a 16-bit integer (NUMBER,2). The content of this field is stored along with the fulltext index, or more precisely in the relations table (in the IN2 of the relations table), and can be evaluated during the fulltext search. A special case is the value -1: then the mask values can be specified directly in the field list at Fields, separated from the field name by a colon. During the fulltext search, a number is specified that is combined by a binary AND with the mask saved in the fulltext index. Only those links are found for which the binary AND yields a value other than 0.
That sounds complicated, and it is, but it gives you very nice options. An example: assume you have an address database in which different kinds of addresses are stored: private addresses, business addresses, agencies etc. You store the kind of address as an identifier in an integer field:
8 scientific institution
To handle addresses that belong to more than one category, you simply add these values up. If you specify the field number of this field at fulltext indexing, you can apply any restriction by category already at search time, without any record having to be read!
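The binary AND test can be sketched in a few lines of Python (an illustration only, not tdbengine code). Only the value 8 for scientific institutions is taken from the list above; the values 1, 2 and 4 and all names are assumptions made for this example.

```python
# Category values; 8 is from the text, the other three are assumed.
PRIVATE = 1
BUSINESS = 2
AGENCY = 4
SCIENTIFIC = 8

def mask_matches(stored_mask, search_mask):
    """A link is found only if the binary AND is not 0."""
    return (stored_mask & search_mask) != 0

# An address that is both a business and a scientific institution.
stored = BUSINESS + SCIENTIFIC   # 10

print(mask_matches(stored, SCIENTIFIC))          # True
print(mask_matches(stored, PRIVATE))             # False
print(mask_matches(stored, PRIVATE + BUSINESS))  # True, BUSINESS bit is shared
```

A search mask that sums several categories therefore finds every record that belongs to at least one of them.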
As an alternative to evaluating a mask field, you can also assign mask numbers in Fields. In this case the mask field number is -1 and the mask numbers are specified directly after the field names. Again an example:
Street:4, Country:8, Postcode:8, Town:8,
Here a fulltext index is created as well. However, in the relations table the mask number 1 is stored if the field being scanned is the Firstname field, the number 2 for the Name fields, and so on. With the fulltext search you can then search specifically in single fields or field combinations (or in all fields at once)!
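The per-field mask numbers can be pictured with the following Python sketch (not tdbengine code). The field list combines the masks named in the text (Firstname 1, Name 2) with the ones from the example line; both function names are hypothetical, and the parsing shown here is only an assumption about how such a list could be read.

```python
def parse_field_masks(spec):
    """Turn 'Field:mask, Field:mask, ...' into a dict of field -> mask."""
    masks = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue                      # tolerate a trailing comma
        name, _, number = part.partition(":")
        masks[name.strip()] = int(number)
    return masks

def fields_matching(masks, search_mask):
    """Fields whose mask shares at least one bit with the search mask."""
    return [name for name, mask in masks.items() if mask & search_mask]

masks = parse_field_masks(
    "Firstname:1, Name:2, Street:4, Country:8, Postcode:8, Town:8,")

print(fields_matching(masks, 8))      # ['Country', 'Postcode', 'Town']
print(fields_matching(masks, 1 + 2))  # ['Firstname', 'Name']
```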
DynKontraIndex is a dynamic, field-oriented contra index that is again passed via the function Fields. This means that for every record the fields defined here are read first and a contra index is built from them. This contra index is valid only for that record; for the next record the procedure starts again.
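One way to read this is sketched below in Python (not tdbengine code). It assumes that the words found in the contra fields of a record are simply left out when the other fields of the same record are indexed; the source does not spell this out, and all names are hypothetical.

```python
import re

def words_of(text):
    """Split text into runs of letters."""
    return re.findall(r"[^\W\d_]+", text)

def words_for_record(record, index_fields, contra_fields):
    """Words of one record that remain after applying its own contra index."""
    contra = set()
    for field in contra_fields:
        contra.update(words_of(str(record[field])))   # rebuilt per record
    kept = []
    for field in index_fields:
        for word in words_of(str(record[field])):
            if word not in contra:
                kept.append(word)
    return kept

first = {"Text": "red green blue", "Ignore": "green"}
second = {"Text": "green apple", "Ignore": "apple"}

print(words_for_record(first, ["Text"], ["Ignore"]))   # ['red', 'blue']
print(words_for_record(second, ["Text"], ["Ignore"]))  # ['green']
```

The word "green" is suppressed in the first record but indexed in the second, because each record brings its own contra index.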
||The function returns the number of entries in the relations table if no error occurs. Otherwise a runtime error occurs which you can catch with .ec 1.
Example 1 below includes all components, including a sketched error handling.
Note the following:
Order of opening the tables: the relations table must always be the last one.
Switch the primary table before calling ScanRecs so that the field identifiers of the source table are recognized.
This happens during fulltext indexing:
First, both the index table and the relations table are emptied. Then all records of the source table are read. For every record, the fields specified in Fields are scanned, i.e. split into individual strings consisting only of letters. Each of these words is entered into the index table if it does not already exist there. Finally, an entry is made in the relations table linking the record of the source table with the corresponding record of the index table.
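The procedure can be mimicked in a few lines of Python. This is a conceptual sketch only, not the engine's implementation: the tables are plain Python structures, the function name scan_records is hypothetical, and storing each word only once per record is an assumption.

```python
import re

def scan_records(records, fields):
    """Build an index (word -> number) and relations (record, word number)."""
    index = {}       # index table, numbered automatically from 1
    relations = []   # relations table: (source record no., index record no.)
    for rec_no, record in enumerate(records, start=1):
        seen = set()
        for field in fields:
            # split the field content into strings consisting only of letters
            for word in re.findall(r"[^\W\d_]+", str(record[field])):
                word_no = index.setdefault(word, len(index) + 1)
                if word_no not in seen:   # assumed: one link per word and record
                    seen.add(word_no)
                    relations.append((rec_no, word_no))
    return index, relations

records = [
    {"Name": "Anna Meier", "Town": "Berlin"},
    {"Name": "Bernd Meier", "Town": "Bonn"},
]
index, relations = scan_records(records, ["Name", "Town"])
print(index)
print(relations)
print(len(relations))   # number of entries in the relations table
```

The word "Meier" is stored once in the index but linked from both records, which is what makes the later search possible without reading the source records.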
The parameters allow very flexible fulltext indexing of any data set. But ScanRecs is a table function that changes two tables at once, so the system is strict about correct syntax. The function returns the number of entries in the relations table if no error occurs. Otherwise a runtime error occurs, which you can catch with .ec 1.
The most common errors:
50: (unknown identifier) At least one invalid field name is in Fields
58: (no table) At least one of the three required tables is not opened
64: (no ADL-table) The index table is not numbered automatically
77: (no relations table) The third parameter is not a valid relations table
If the function result is negative, no runtime error has actually occurred, but the function was terminated nevertheless.
-115 Not enough rights for the function (the relations table could not be emptied)
-56 (, expected) The fields in Fields have to be separated by commas
Example 1: ScanRecs
A simple fulltext indexing for an address table adr.dat
VAR D,I,R,fc,N : INTEGER
?fc:=GenList("i-adr.dat",40,1)<>0/HALT .. Error creating index table
?fc:=GenRel("adr","i-adr","r-adr.rel")<>0/HALT .. Error creating relations table
IF D:=OpenDB("adr.dat") THEN
IF I:=OpenDB("i-adr.dat","",0,15) THEN
IF R:=OpenDB("r-adr.rel","",0,15) THEN
..Error opening relations table
.. Error opening index table
.. Error opening source table
Comments:
|The documentation is missing mode 257 for indexing subsets
|User: Hermann||Date: 31.12.2007 08:05||#2666
|i.e. of marked records
FindAndMark(TERMIN,"($den=Today,$um< Now +30) or $den>Today")
|User: sk||Date: 10.01.2008 17:16||#2667