Terms and frequencies?

Hi,

After running a query and getting the results back, is it possible to get the terms and frequencies for each document?

Thanks,
Eric

[144 byte] By [ERobinson] at [2007-12-22]
# 1

You can use a hashtable to count the word frequencies for each of your documents.

globelin at 2007-8-30 > top of Msdn Tech,Windows Search Technologies,Windows Desktop Search Development...
# 2
So, is there a column that contains the indexed terms for the documents?
ERobinson at 2007-8-30
# 3

I would use the word-docNo pair as the Hashtable key and count the term frequency (tf) for each key.

It sounds like you are doing some information retrieval (IR) work.
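
A minimal sketch of the hashtable idea, written in Python for brevity. It assumes the text of each document has already been read into memory; the function name, the simple `\w+` tokenizer, and the sample data are illustrative assumptions, not part of any Windows Desktop Search API.

```python
import re
from collections import Counter


def count_term_frequencies(documents):
    """Count term frequencies keyed by (term, doc_no).

    `documents` maps a document number to its plain text. How that text
    is obtained (for example, by re-reading each file) is outside this sketch.
    """
    tf = Counter()
    for doc_no, text in documents.items():
        # Assumption: lowercase the text and split on runs of word characters.
        for term in re.findall(r"\w+", text.lower()):
            tf[(term, doc_no)] += 1
    return tf


# Hypothetical sample data standing in for the documents returned by a query.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog and the fox",
}
tf = count_term_frequencies(docs)
```

Looking up `tf[("the", 2)]` then gives the number of times "the" occurs in document 2, and a key that was never seen returns 0.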

globelin at 2007-8-30
# 4
See the post at: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=516694&SiteID=1

It describes all of the columns that are defined in WDS 2.6.5. Note that there is a column called "contents", but it is not retrievable.

I see no definition for any column that contains the list of indexed terms for the document. What I am asking is: "Is there a way to retrieve a list of indexed terms for the documents that are returned as a part of a query?"

I don't want to have to go open each file and essentially duplicate the work that the indexing engine has already accomplished.

Eric

ERobinson at 2007-8-30
# 5

Hello E Robinson,

There currently isn't a way to retrieve this information. You can query on the column Characterization (System.Search.AutoSummary in 3.0), but that will only give you the terms from the beginning of the document.

If you want access to all of the content, you can instantiate and invoke the IFilter for each file to grab the content stream again.

Paul Nystrom - MSFT

PaulNystrom-MSFT at 2007-8-30