<!-- manual page source format generated by PolyglotMan v3.0.3a12, --> <!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML> <HEAD> <TITLE>WNSEARCH(3WN) manual page</TITLE> </HEAD> <BODY> <A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2> findtheinfo, findtheinfo_ds, is_defined, in_wn, index_lookup, parse_index, getindex, read_synset, parse_synset, free_syns, free_synset, free_index, traceptrs_ds, do_trace
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P> <B>#include "wn.h" </B>
<P> <B>char *findtheinfo(char *searchstr, int pos, int ptr_type, int sense_num); </B>
<P> <B>SynsetPtr findtheinfo_ds(char *searchstr, int pos, int ptr_type, int sense_num); </B>
<P> <B>unsigned int is_defined(char *searchstr, int pos); </B>
<P> <B>unsigned int in_wn(char *searchstr, int pos); </B>
<P> <B>IndexPtr index_lookup(char *searchstr, int pos); </B>
<P> <B>IndexPtr parse_index(long offset, int dbase, char *line); </B>
<P> <B>IndexPtr getindex(char *searchstr, int pos); </B>
<P> <B>SynsetPtr read_synset(int pos, long synset_offset, char *searchstr); </B>
<P> <B>SynsetPtr parse_synset(FILE *fp, int pos, char *searchstr); </B>
<P> <B>void free_syns(SynsetPtr synptr); </B>
<P> <B>void free_synset(SynsetPtr synptr); </B>
<P> <B>void free_index(IndexPtr idx); </B>
<P> <B>SynsetPtr traceptrs_ds(SynsetPtr synptr, int ptr_type, int pos, int depth); </B>
<P> <B>char *do_trace(SynsetPtr synptr, int ptr_type, int pos, int depth); </B>
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
<P> These functions are used for searching the WordNet database. They fall into four categories: functions for reading and parsing index file entries; functions for reading and parsing synsets in data files; functions for tracing pointers and hierarchies; and functions for freeing space occupied by data structures allocated with <B><A HREF="malloc.3.html">malloc</B>(3)</A> . 
<P> In the following function descriptions, <I>pos </I> is one of the following:
<P> <blockquote><B>1 </B><tt> </tt> <tt> </tt> NOUN <BR> <B>2 </B><tt> </tt> <tt> </tt> VERB <BR> <B>3 </B><tt> </tt> <tt> </tt> ADJECTIVE <BR> <B>4 </B><tt> </tt> <tt> </tt> ADVERB <BR> </blockquote>
<P> <B>findtheinfo()</B> is the primary search algorithm for use with database interface applications. Search results are automatically formatted, and a pointer to the text buffer is returned. All searches listed in <B>WNHOME/include/wn.h</B> can be done by <B>findtheinfo()</B>. <B>findtheinfo_ds()</B> can be used to perform most of the searches, with results returned in a linked list data structure. This is for use with applications that need to analyze the search results rather than just display them.
<P> Both functions are passed the same arguments: <I>searchstr </I> is the word or collocation to search for; <I>pos </I> indicates the syntactic category to search in; <I>ptr_type </I> is one of the valid search types for <I>searchstr </I> in <I>pos </I>. (Available searches can be obtained by calling <B>is_defined()</B>, described below.) <I>sense_num </I> should be <FONT SIZE=-1><B>ALLSENSES </B></FONT> if the search is to be done on all senses of <I>searchstr </I> in <I>pos </I>, or a positive integer indicating which sense to search.
<P> <B>findtheinfo_ds() </B> returns a linked list of data structures representing synsets. Senses are linked through the <I>nextss </I> field of a <B>Synset </B> data structure. For each sense, synsets that match the search specified with <I>ptr_type </I> are linked through the <I>ptrlist </I> field. See <FONT SIZE=-1><B>Synset Navigation </B></FONT> below for detailed information on the linked lists returned.
<P> <B>is_defined() </B> sets a bit for each search type that is valid for <I>searchstr </I> in <I>pos </I>, and returns the resulting unsigned integer. Each bit number corresponds to a pointer type constant defined in <B>WNHOME/include/wn.h </B>. 
For example, if bit 2 is set, the <FONT SIZE=-1><B>HYPERPTR </B></FONT> search is valid for <I>searchstr </I>. There are 29 possible searches.
<P> <B>in_wn() </B> is used to find the syntactic categories in the WordNet database that contain one or more senses of <I>searchstr </I>. If <I>pos </I> is <FONT SIZE=-1><B>ALL_POS</B></FONT>, all syntactic categories are checked. Otherwise, only the part of speech passed is checked. An unsigned integer is returned with a bit set corresponding to each syntactic category containing <I>searchstr </I>. The bit number matches the number for the part of speech. <B>0 </B> is returned if <I>searchstr </I> is not present in <I>pos </I>.
<P> <B>index_lookup() </B> finds <I>searchstr </I> in the index file for <I>pos </I> and returns a pointer to the parsed entry in an <B>Index </B> data structure. <I>searchstr </I> must exactly match the form of the word (lower case only, hyphens and underscores in the same places) in the index file. <FONT SIZE=-1><B>NULL </B></FONT> is returned if a match is not found.
<P> <B>parse_index() </B> parses an entry from an index file and returns a pointer to the parsed entry in an <B>Index </B> data structure. Given the byte <I>offset </I> and the syntactic category <I>dbase </I>, it reads the index entry at that location in the corresponding index file. If <I>line </I> is passed, it is taken to contain an index file entry and the database index file is not consulted; <I>offset </I> and <I>dbase </I> should still be passed so that the information can be stored in the <B>Index </B> structure.
<P> <B>getindex() </B> is a "smart" search for <I>searchstr </I> in the index file corresponding to <I>pos </I>. It applies to <I>searchstr </I> an algorithm that replaces underscores with hyphens, hyphens with underscores, removes hyphens and underscores, and removes periods in an attempt to find a form of the string that is an exact match for an entry in the index file corresponding to <I>pos </I>. 
<B>index_lookup() </B> is called on each transformed string until a match is found or all the different strings have been tried. It returns a pointer to the parsed <B>Index </B> data structure for <I>searchstr </I>, or <FONT SIZE=-1><B>NULL </B></FONT> if a match is not found.
<P> <B>read_synset() </B> is used to read a synset from a byte offset in a data file. It performs an <B><A HREF="fseek.3.html">fseek </B>(3)</A> to <I>synset_offset </I> in the data file corresponding to <I>pos </I>, and calls <B>parse_synset() </B> to read and parse the synset. A pointer to the <B>Synset </B> data structure containing the parsed synset is returned.
<P> <B>parse_synset() </B> reads the synset at the current offset in the file indicated by <I>fp </I>. <I>pos </I> is the syntactic category, and <I>searchstr </I>, if not <FONT SIZE=-1><B>NULL</B></FONT>, indicates the word in the synset that the caller is interested in. An attempt is made to match <I>searchstr </I> to one of the words in the synset. If an exact match is found, the <I>whichword </I> field in the <B>Synset </B> structure is set to that word's number in the synset (counting from <B>1 </B>).
<P> <B>free_syns() </B> is used to free a linked list of <B>Synset </B> structures allocated by <B>findtheinfo_ds() </B>. <I>synptr </I> is a pointer to the list to free.
<P> <B>free_synset() </B> frees the <B>Synset </B> structure pointed to by <I>synptr </I>.
<P> <B>free_index() </B> frees the <B>Index </B> structure pointed to by <I>idx </I>.
<P> <B>traceptrs_ds() </B> is a recursive search algorithm that traces pointers matching <I>ptr_type </I> starting with the synset pointed to by <I>synptr </I>. Setting <I>depth </I> to <B>1 </B> when <B>traceptrs_ds() </B> is called indicates a recursive search; <B>0 </B> indicates a non-recursive call. <I>synptr </I> points to the data structure representing the synset to search for a pointer of type <I>ptr_type </I>. 
When a pointer type match is found, the synset pointed to is read and linked onto the <I>nextss </I> chain. Levels of the tree generated by a recursive search are linked via the <I>ptrlist </I> field until <FONT SIZE=-1><B>NULL </B></FONT> is found, indicating the top (or bottom) of the tree. This function is usually called from <B>findtheinfo_ds() </B> for each sense of the word. See <FONT SIZE=-1><B>Synset Navigation </B></FONT> below for detailed information on the linked lists returned.
<P> <B>do_trace() </B> performs the search indicated by <I>ptr_type </I> on the synset <I>synptr </I> in syntactic category <I>pos </I>. <I>depth </I> is defined as above. <B>do_trace() </B> returns the search results formatted in a text buffer.
<H3><A NAME="sect3" HREF="#toc3">Synset Navigation </A></H3> Since the <B>Synset </B> structure is used to represent the synsets for both word senses and pointers, the <I>ptrlist </I> and <I>nextss </I> fields have different meanings depending on whether the structure is a word sense or pointer. This can make navigation through the lists returned by <B>findtheinfo_ds() </B> confusing.
<P> Navigation through the returned list involves the following:
<P> Following the <I>nextss </I> chain from the synset returned moves through the various senses of <I>searchstr </I>. <FONT SIZE=-1><B>NULL </B></FONT> indicates the end of the chain of senses.
<P> Following the <I>ptrlist </I> chain from a <B>Synset </B> structure representing a sense traces the hierarchy of the search results for that sense. Subsequent links in the <I>ptrlist </I> chain indicate the next level (up or down, depending on the search) in the hierarchy. <FONT SIZE=-1><B>NULL </B></FONT> indicates the end of the chain of search result synsets.
<P> If a synset pointed to by <I>ptrlist </I> has a value in the <I>nextss </I> field, it represents another pointer of the same type at that level in the hierarchy. For example, some noun synsets have two hypernyms. 
Following this <I>nextss </I> pointer, and then the <I>ptrlist </I> chain from the <B>Synset </B> structure pointed to, traces another, parallel hierarchy, until the end is indicated by <FONT SIZE=-1><B>NULL </B></FONT> on that <I>ptrlist </I> chain. So, a <B>Synset </B> representing a pointer (versus a sense of <I>searchstr </I>) having a non-NULL value in <I>nextss </I> has another chain of search results linked through the <I>ptrlist </I> chain of the synset pointed to by <I>nextss </I>.
<P> If <I>searchstr </I> contains more than one base form in WordNet (as in the noun <B>axes </B>, which has base forms <B>axe </B> and <B>axis </B>), synsets representing the search results for each base form are linked through the <I>nextform </I> pointer of the <B>Synset </B> structure.
<H3><A NAME="sect4" HREF="#toc4">WordNet Searches </A></H3> There is no extensive description of what each search type is or the results returned. Using the WordNet interface, examining the source code, and reading <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A> are the best ways to see what types of searches are available and the data returned for each.
<P> Listed below are the valid searches that can be passed as <I>ptr_type </I> to <B>findtheinfo() </B>. Passing a negative value (when applicable) causes a recursive, hierarchical search by setting <I>depth </I> to <B>1 </B> when <B>traceptrs() </B> is called.
<P> <TABLE BORDER=0>
<TR> <TD ALIGN=LEFT><B>ptr_type </B> </TD> <TD ALIGN=CENTER><B>Value </B> </TD> <TD ALIGN=CENTER><B>Pointer </B> </TD> <TD ALIGN=LEFT><B>Search </B> </TD> </TR>
<TR> <TD ALIGN=LEFT> </TD> <TD ALIGN=CENTER> </TD> <TD ALIGN=CENTER><B>Symbol </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>ANTPTR </TD> <TD ALIGN=CENTER>1 </TD> <TD ALIGN=CENTER>! 
</TD> <TD ALIGN=LEFT>Antonyms </TD> </TR>
<TR> <TD ALIGN=LEFT>HYPERPTR </TD> <TD ALIGN=CENTER>2 </TD> <TD ALIGN=CENTER>@ </TD> <TD ALIGN=LEFT>Hypernyms </TD> </TR>
<TR> <TD ALIGN=LEFT>HYPOPTR </TD> <TD ALIGN=CENTER>3 </TD> <TD ALIGN=CENTER>~ </TD> <TD ALIGN=LEFT>Hyponyms </TD> </TR>
<TR> <TD ALIGN=LEFT>ENTAILPTR </TD> <TD ALIGN=CENTER>4 </TD> <TD ALIGN=CENTER>* </TD> <TD ALIGN=LEFT>Entailment </TD> </TR>
<TR> <TD ALIGN=LEFT>SIMPTR </TD> <TD ALIGN=CENTER>5 </TD> <TD ALIGN=CENTER>&amp; </TD> <TD ALIGN=LEFT>Similar </TD> </TR>
<TR> <TD ALIGN=LEFT>ISMEMBERPTR </TD> <TD ALIGN=CENTER>6 </TD> <TD ALIGN=CENTER>#m </TD> <TD ALIGN=LEFT>Member meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>ISSTUFFPTR </TD> <TD ALIGN=CENTER>7 </TD> <TD ALIGN=CENTER>#s </TD> <TD ALIGN=LEFT>Substance meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>ISPARTPTR </TD> <TD ALIGN=CENTER>8 </TD> <TD ALIGN=CENTER>#p </TD> <TD ALIGN=LEFT>Part meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>HASMEMBERPTR </TD> <TD ALIGN=CENTER>9 </TD> <TD ALIGN=CENTER>%m </TD> <TD ALIGN=LEFT>Member holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>HASSTUFFPTR </TD> <TD ALIGN=CENTER>10 </TD> <TD ALIGN=CENTER>%s </TD> <TD ALIGN=LEFT>Substance holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>HASPARTPTR </TD> <TD ALIGN=CENTER>11 </TD> <TD ALIGN=CENTER>%p </TD> <TD ALIGN=LEFT>Part holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>MERONYM </TD> <TD ALIGN=CENTER>12 </TD> <TD ALIGN=CENTER>% </TD> <TD ALIGN=LEFT>All meronyms </TD> </TR>
<TR> <TD ALIGN=LEFT>HOLONYM </TD> <TD ALIGN=CENTER>13 </TD> <TD ALIGN=CENTER># </TD> <TD ALIGN=LEFT>All holonyms </TD> </TR>
<TR> <TD ALIGN=LEFT>CAUSETO </TD> <TD ALIGN=CENTER>14 </TD> <TD ALIGN=CENTER>&gt; </TD> <TD ALIGN=LEFT>Cause </TD> </TR>
<TR> <TD ALIGN=LEFT>PPLPTR </TD> <TD ALIGN=CENTER>15 </TD> <TD ALIGN=CENTER>&lt; </TD> <TD ALIGN=LEFT>Participle of verb </TD> </TR>
<TR> <TD ALIGN=LEFT>SEEALSOPTR </TD> <TD ALIGN=CENTER>16 </TD> <TD ALIGN=CENTER>^ </TD> <TD ALIGN=LEFT>Also see </TD> </TR>
<TR> <TD ALIGN=LEFT>PERTPTR </TD> <TD ALIGN=CENTER>17 </TD> <TD 
ALIGN=CENTER>\ </TD> <TD ALIGN=LEFT>Pertains to noun or derived from adjective </TD> </TR>
<TR> <TD ALIGN=LEFT>ATTRIBUTE </TD> <TD ALIGN=CENTER>18 </TD> <TD ALIGN=CENTER>= </TD> <TD ALIGN=LEFT>Attribute </TD> </TR>
<TR> <TD ALIGN=LEFT>VERBGROUP </TD> <TD ALIGN=CENTER>19 </TD> <TD ALIGN=CENTER>$ </TD> <TD ALIGN=LEFT>Verb group </TD> </TR>
<TR> <TD ALIGN=LEFT>DERIVATION </TD> <TD ALIGN=CENTER>20 </TD> <TD ALIGN=CENTER>+ </TD> <TD ALIGN=LEFT>Derivationally related form </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIFICATION </TD> <TD ALIGN=CENTER>21 </TD> <TD ALIGN=CENTER>; </TD> <TD ALIGN=LEFT>Domain of synset </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS </TD> <TD ALIGN=CENTER>22 </TD> <TD ALIGN=CENTER>- </TD> <TD ALIGN=LEFT>Member of this domain </TD> </TR>
<TR> <TD ALIGN=LEFT>SYNS </TD> <TD ALIGN=CENTER>23 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Find synonyms </TD> </TR>
<TR> <TD ALIGN=LEFT>FREQ </TD> <TD ALIGN=CENTER>24 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Polysemy </TD> </TR>
<TR> <TD ALIGN=LEFT>FRAMES </TD> <TD ALIGN=CENTER>25 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Verb example sentences and generic frames </TD> </TR>
<TR> <TD ALIGN=LEFT>COORDS </TD> <TD ALIGN=CENTER>26 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Noun coordinates </TD> </TR>
<TR> <TD ALIGN=LEFT>RELATIVES </TD> <TD ALIGN=CENTER>27 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Group related senses </TD> </TR>
<TR> <TD ALIGN=LEFT>HMERONYM </TD> <TD ALIGN=CENTER>28 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Hierarchical meronym search </TD> </TR>
<TR> <TD ALIGN=LEFT>HHOLONYM </TD> <TD ALIGN=CENTER>29 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Hierarchical holonym search </TD> </TR>
<TR> <TD ALIGN=LEFT>WNGREP </TD> <TD ALIGN=CENTER>30 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Find keywords by substring </TD> </TR>
<TR> <TD ALIGN=LEFT>OVERVIEW </TD> <TD ALIGN=CENTER>31 </TD> <TD ALIGN=CENTER><I>n/a </I> 
</TD> <TD ALIGN=LEFT>Show all synsets for word </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIF_CATEGORY </TD> <TD ALIGN=CENTER>32 </TD> <TD ALIGN=CENTER>;c </TD> <TD ALIGN=LEFT>Show domain topic </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIF_USAGE </TD> <TD ALIGN=CENTER>33 </TD> <TD ALIGN=CENTER>;u </TD> <TD ALIGN=LEFT>Show domain usage </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIF_REGIONAL </TD> <TD ALIGN=CENTER>34 </TD> <TD ALIGN=CENTER>;r </TD> <TD ALIGN=LEFT>Show domain region </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS_CATEGORY </TD> <TD ALIGN=CENTER>35 </TD> <TD ALIGN=CENTER>-c </TD> <TD ALIGN=LEFT>Show domain terms for topic </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS_USAGE </TD> <TD ALIGN=CENTER>36 </TD> <TD ALIGN=CENTER>-u </TD> <TD ALIGN=LEFT>Show domain terms for usage </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS_REGIONAL </TD> <TD ALIGN=CENTER>37 </TD> <TD ALIGN=CENTER>-r </TD> <TD ALIGN=LEFT>Show domain terms for region </TD> </TR>
<TR> <TD ALIGN=LEFT>INSTANCE </TD> <TD ALIGN=CENTER>38 </TD> <TD ALIGN=CENTER>@i </TD> <TD ALIGN=LEFT>Instance of </TD> </TR>
<TR> <TD ALIGN=LEFT>INSTANCES </TD> <TD ALIGN=CENTER>39 </TD> <TD ALIGN=CENTER>~i </TD> <TD ALIGN=LEFT>Show instances </TD> </TR>
</TABLE>
<P> <B>findtheinfo_ds() </B> cannot perform the following searches:
<P> <blockquote>SEEALSOPTR <BR> PERTPTR <BR> VERBGROUP <BR> FREQ <BR> FRAMES <BR> RELATIVES <BR> WNGREP <BR> OVERVIEW <BR> </blockquote>
<H2><A NAME="sect5" HREF="#toc5">NOTES </A></H2> Applications that use WordNet and/or the morphological functions must call <B>wninit() </B> at the start of the program. See <B><A HREF="wnutil.3WN.html">wnutil</B>(3WN)</A> for more information.
<P> In all function calls, <I>searchstr </I> may be either a word or a collocation formed by joining individual words with underscore characters (<B>_ </B>).
<P> The <B>SearchResults </B> structure defines fields in the <I>wnresults </I> global variable that are set by the various search functions. 
This provides a way to get additional information, such as the number of senses the word has, from the search functions. The <I>searchds </I> field is set by <B>findtheinfo_ds() </B>.
<P> The <I>pos </I> passed to <B>traceptrs_ds() </B> is not used.
<P> <H2><A NAME="sect6" HREF="#toc6">SEE ALSO </A></H2> <B><A HREF="wn.1WN.html">wn</B>(1WN)</A> , <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A> , <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A> , <B><A HREF="binsrch.3WN.html">binsrch</B>(3WN)</A> , <B><A HREF="malloc.3.html">malloc</B>(3)</A> , <B><A HREF="morph.3WN.html">morph</B>(3WN)</A> , <B><A HREF="wnutil.3WN.html">wnutil</B>(3WN)</A> , <B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A> .
<H2><A NAME="sect7" HREF="#toc7">WARNINGS </A></H2> <B>parse_synset() </B> must find an exact match between the <I>searchstr </I> passed and a word in the synset to set <I>whichword </I>. No attempt is made to translate hyphens and underscores, as is done in <B>getindex() </B>.
<P> The WordNet database and exception list files must be opened with <B>wninit() </B> prior to using any of the searching functions.
<P> A large search may cause <B>findtheinfo() </B> to run out of buffer space. The maximum buffer size is determined by the computer platform. If the buffer size is exceeded, the following message is printed in the output buffer: <B>"Search too large. Narrow search and try again..." </B>.
<P> Passing an invalid <I>pos </I> will probably result in a core dump.
<P> <HR><P> <A NAME="toc"><B>Table of Contents</B></A><P>
<UL> <LI><A NAME="toc0" HREF="#sect0">NAME</A></LI> <LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI> <LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI> <UL> <LI><A NAME="toc3" HREF="#sect3">Synset Navigation</A></LI> <LI><A NAME="toc4" HREF="#sect4">WordNet Searches</A></LI> </UL> <LI><A NAME="toc5" HREF="#sect5">NOTES</A></LI> <LI><A NAME="toc6" HREF="#sect6">SEE ALSO</A></LI> <LI><A NAME="toc7" HREF="#sect7">WARNINGS</A></LI> </UL>
</BODY></HTML>