Dictionary Extension

I am trying to develop an application where I can switch the main system dictionary used by the handwriting recognition engine to other dictionaries. I want to use the system dictionary so that the change takes effect in all applications that use this engine. I have been doing some research to discover which DLLs are being used, where the dictionaries are, etc. The only thing that I've managed to locate so far is the user dictionary, and I have not found any resource that explains which files are being used and how.

I am new to development for Tablets and am looking for any pointers, suggestions, or ideas that anyone may have to help me out with this.

Thanks

B.J.

[713 byte] By [Chayodyn] at [2007-12-24]
# 1

Hi B.J.

Please take a look at this article/sample:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dntablet/html/usspdtirec.asp

Let me know if you have any further questions about this.

Thanks,

Stefan Wick

StefanWick-MSFT at 2007-8-31 > top of Msdn Tech,Software Development for Windows Vista,Notebook, Tablet PC, and UMPC Development...
# 2

Stefan,

Thanks for the article. It was a big help, and answered many of the questions that I have had about how the handwriting recognition uses dictionaries.

There are two questions that I haven't been able to answer, however. Is there a way to encrypt the application dictionaries so that the word lists contained within them cannot be viewed (even in a text editor)? And where is the main system dictionary located within the file system?

I know that the second of those questions might be somewhat sensitive, so please feel free to ignore it (just let me know so I don't ask again). I'm trying to find a way to use an encrypted dictionary, and I believe that the main dictionary is encrypted in some way. If I can figure out how the main dictionary is doing it, then I may be able to find a way to get the application dictionaries to do so.

Thanks for your help

B.J.

Chayodyn at 2007-8-31
# 3

Hi B.J.,

The system dictionaries are baked into the respective recognizer DLLs - meaning the US dictionary is embedded in mshwusa.dll, the German dictionary in mshwdeu.dll, etc.

I am not aware of any way to encrypt the speech dictionaries. Can you explain your scenario for doing this?

If you want to use a private dictionary that isn't stored in the file system, you could assign a WordList to your RecognizerContext. This dictionary would then be limited to your own app, though.
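As a rough illustration of that idea, here is a minimal Python sketch (not Tablet PC SDK code) of keeping a word list out of plain-text files and rebuilding it in memory at startup. `pack_words`, `unpack_words`, and `InMemoryWordList` are hypothetical names made up for this example; `InMemoryWordList` only stands in for the SDK's word list object. Note that compressing and base64-encoding is obfuscation, not encryption, so it only hides the words from a casual look in a text editor.

```python
import base64
import zlib


def pack_words(words):
    """Compress and base64-encode a word list so it can be embedded in the app."""
    raw = "\n".join(words).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def unpack_words(blob):
    """Reverse pack_words and return the words; nothing is written to disk."""
    raw = zlib.decompress(base64.b64decode(blob))
    return raw.decode("utf-8").split("\n")


class InMemoryWordList:
    """Hypothetical stand-in for the SDK's word list; not the real API."""

    def __init__(self):
        self._words = set()

    def add(self, word):
        self._words.add(word)

    def __contains__(self, word):
        return word in self._words

    def __len__(self):
        return len(self._words)


def load_private_word_list(blob):
    """Build the in-memory word list from the embedded blob, skipping empty entries."""
    word_list = InMemoryWordList()
    for word in unpack_words(blob):
        if word:
            word_list.add(word)
    return word_list


# Example terms are placeholders; a real app would embed the packed blob at build time.
EMBEDDED = pack_words(["tachycardia", "bronchoscopy", "myocarditis"])
private_list = load_private_word_list(EMBEDDED)
```

In a real application the loop in `load_private_word_list` would add each word to the recognizer's word list instead of the stand-in class.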

Thanks, Stefan

StefanWick-MSFT at 2007-8-31
# 4

Hi Stefan,

Thanks for the info. Basically, I'm trying to find a way to improve handwriting recognition for specialty terminology (such as medical offices that use lots of specialty terms). The word lists that I am using for this are licensed from a company that requires them to be unviewable and uneditable by users when being used in development. So, I have a very comprehensive list of terminology, but in order to use it I must encrypt it or hide it in some way so that users cannot see or edit the contents of the list.

I considered just adding the words into the user dictionary, but I've discovered that the dictionary file is limited in size. Also, you can see the words in the user dictionary by opening the file in any text editor.

Do you know of any way possible to encrypt or hide a dictionary for the speech/handwriting recognition?

Do you do any development for the tablets yourself?

Thanks,

B.J.

Chayodyn at 2007-8-31
# 5

Hi B.J.

>>Do you know of any way possible to encrypt or hide a dictionary for the speech/handwriting recognition?

I am not aware of a way to completely encrypt/hide your custom dictionary entries. Since you want the entries to be accessible by any app system-wide, they will need to be visible via APIs. So even if you encrypted them in the file system, a motivated user could still use the APIs to get to the content. This - by the way - is also true for our system dictionary entries.

>>I've discovered that the dictionary file is limited in size

Just curious: how many words are you trying to add?

>>Do you do any development for the tablets yourself?

Yes. I have been a Software Engineer on the Tablet Platform team at Microsoft since 2001.

Thanks, Stefan

StefanWick-MSFT at 2007-8-31
# 6

Stefan,

>>Just curious: how many words are you trying to add?
I have attempted to import as many as 600,000 words (approx) using the Dictionary Tool PowerToy. The tool 'Encountered an unhandled exception' when there were 325,195 words in the user dictionary and would not add any more words. More significant than that, though, is the fact that the dictionary file is exactly 10MB in size (size on disk - 10,485,760 bytes). Also, at this file size, I am unable to add words to the dictionary through the Dictionary Tool, or through the TIP. I assume that this is a maximum file size that is imposed within the handwriting system itself (maybe for performance issues?).
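A quick sanity check of those numbers, as a small Python sketch. The per-word average assumes the whole file is word data with no header or padding, which is only a guess on my part, and the projection for 600,000 words assumes the same average would hold.

```python
# Figures reported above.
FILE_SIZE_BYTES = 10_485_760   # size on disk of the user dictionary
WORDS_AT_FAILURE = 325_195     # words present when the tool stopped adding
TARGET_WORDS = 600_000         # approximate size of the full list

MIB = 1024 * 1024

# Is the file size an exact multiple of 1 MiB?
size_in_mib = FILE_SIZE_BYTES / MIB
is_exact_mib_multiple = FILE_SIZE_BYTES % MIB == 0

# Rough average storage per word (assumes the file holds nothing but words).
avg_bytes_per_word = FILE_SIZE_BYTES / WORDS_AT_FAILURE

# Projected file size for the full list at the same average.
projected_mib = TARGET_WORDS * avg_bytes_per_word / MIB

print(f"file size: {size_in_mib} MiB (exact multiple: {is_exact_mib_multiple})")
print(f"average bytes per word: {avg_bytes_per_word:.2f}")
print(f"projected size for {TARGET_WORDS} words: {projected_mib:.2f} MiB")
```

The file is exactly 10 x 1024 x 1024 bytes, which is what makes it look like a hard cap rather than a coincidence.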

I would like to discuss this with you in more detail. Would you mind sending me an email? You can reach me at wrhyner AT spellex.com. Replace the ' AT ' with '@' of course :-)

Thanks,

B.J.

Chayodyn at 2007-8-31
# 7

The handwriting recognition engines are optimized to show best performance with less than 100,000 custom entries in the dictionary. Beyond that size, your customers will notice slower performance on average TabletPC hardware. Beyond 300,000 custom words, the performance will most likely be unacceptable for most users/models.
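Those thresholds can be written down as a small check, sketched here in Python. The function name and the tier labels are made up for this example; only the 100,000 and 300,000 cut-offs come from the guidance above, and the actual behavior will vary with hardware.

```python
OPTIMIZED_LIMIT = 100_000      # below this, recognizers perform best
UNACCEPTABLE_LIMIT = 300_000   # beyond this, most users will find it too slow


def expected_performance(custom_word_count):
    """Map a custom dictionary size to the rough performance tier described above."""
    if custom_word_count < 0:
        raise ValueError("custom_word_count must not be negative")
    if custom_word_count < OPTIMIZED_LIMIT:
        return "optimized"
    if custom_word_count <= UNACCEPTABLE_LIMIT:
        return "noticeably slower"
    return "likely unacceptable"


# The sizes mentioned in this thread.
for count in (50_000, 325_195, 600_000):
    print(count, expected_performance(count))
```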

Just curious: what group of customers are you targeting that would frequently use 600,000 custom words on top of the words that are already in the system dictionary?

Thanks, Stefan

StefanWick-MSFT at 2007-8-31
