Content indexing limitations

On 3/11/06, MSN Toolbar Help online contained the following text about the limitations of WDS relative to indexing large files:

Desktop Search indexes the first 1 MB of text in each document. This is enough to ensure that all but the very longest documents are completely indexed, but if you have an extremely long document and your search term appears after the first 1 MB of text, it may not be found.

A Google search today still shows the same text on a page which is part of Live Search Toolbar Help:

http://search.sympatico.msn.ca/docs/toolbar.aspx?t=MSNTbar_TROU_CantFindAFileIKnowExists.htm

Please advise about the following points:

Is the stated limit, or some other limit, on the amount of text that WDS will index in any particular document still in effect?

Is there a workaround so that all text in larger documents can be indexed?

Are there plans to modify or eliminate this limitation in future versions of the application?

Thank you.

[1399 byte] By [Jeff3464] at [2007-12-24]
# 1

Hello Jeff,

WDS actually indexes close to 2 MB of the textual content of any document. For the sake of comparison, we've found that WDS will index the entire content of the novel Moby Dick, and that most documents don't exceed that amount of textual content.

At the moment, there isn't a workaround or an intent to change the limit. Doing so for customers with a number of very large documents could result in extremely large WDS indexes.
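To get a rough idea of which files might be affected, something like the following could help. It is only a sketch under stated assumptions: the two limits are simply the figures mentioned in this thread (1 MB from the help text, "close to two MB" from this reply), byte offsets are used as a stand-in for "amount of text", and that stand-in is only reasonable for plain-text files. Formats such as .doc or .one have their text extracted before indexing, so file size says little about them. The function names and the suffix list are hypothetical, not part of WDS.

```python
from pathlib import Path

# Assumed limits taken from this thread, not read from WDS itself.
DOCUMENTED_LIMIT = 1 * 1024 * 1024   # "first 1 MB of text" per the help page
REPORTED_LIMIT = 2 * 1024 * 1024     # "close to two MB" per the reply above


def first_offset(path, term, encoding="utf-8"):
    """Return the byte offset of the first match of `term`, or -1.

    Matching ignores case for ASCII letters only.
    """
    data = Path(path).read_bytes().lower()
    return data.find(term.lower().encode(encoding))


def terms_past_limit(path, terms, limit=REPORTED_LIMIT):
    """Return the terms whose first occurrence starts at or beyond `limit` bytes."""
    late = []
    for term in terms:
        offset = first_offset(path, term)
        if offset >= limit:
            late.append(term)
    return late


def oversized_text_files(root, limit=REPORTED_LIMIT, suffixes=(".txt", ".log", ".csv")):
    """Yield (path, size_in_bytes) for plain-text files larger than `limit`."""
    for p in Path(root).rglob("*"):
        if p.is_file() and p.suffix.lower() in suffixes:
            size = p.stat().st_size
            if size > limit:
                yield p, size
```

For example, `list(oversized_text_files("C:/Users/me/Documents"))` (a hypothetical path) would list the plain-text files whose tail might not be indexed, and `terms_past_limit(path, ["keyword"])` would show whether a given keyword first appears only past the assumed cutoff.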

If the demand is high enough, it might be possible to offer a workaround. What is it that you are looking to index with WDS? I'd be happy to submit your request to our team.

Paul Nystrom - MSFT

PaulNystrom-MSFT at 2007-10-8 > top of Msdn Tech,Windows Search Technologies,Windows Desktop Search Development...
# 2
OneNote 2007 files! It seems logical that a 1-2 MB limit would easily be exceeded, depending on what is considered a 'document' in this app. Not surprisingly, with WDS 3.0 I cannot search for keywords (indexing has completed and the .one file extension is set to index content) unless I am on that page. Interestingly, this worked fine in the first WDS beta if I used the required 'My Documents' folder. However, now that I can define my own path, WDS cannot seem to find any of my OneNote material. Again, this might be a separate bug.
JimBinarry at 2007-10-8
# 3

Jim,

I would suggest reinstalling OneNote. The un-install/upgrade of WDS sometimes impacts some of the OneNote registry settings that enable the indexing of OneNote files. Usually, reinstalling OneNote does the trick.

Paul Nystrom - MSFT

PaulNystrom-MSFT at 2007-10-8