Separate Process
Will 2008 maintain the separate process/service for Full Text search?
My major headache with FT in 2005 (which was a massive improvement on 2000, BTW) is that, because the searches are performed outside of the engine, it is very slow to join the CONTAINSTABLE results back to tables in your database.
I run a number of processes that return tens of thousands of rows from the FT index and then summarise them using other database tables - this is scary slow.
This article is very interesting and highlights a number of problems with the current implementation:
http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/ftslesld.mspx
It would be interesting to know if these issues are being addressed and, if so, how. For example, one of the recommendations in this article is to allow Windows (not SQL) enough RAM to cache the entire FT catalog - I have 26 catalogs (6 months, weekly), each roughly 20-25GB. That is roughly 520-650GB in total - that's a heap of RAM!
Thanks
Simon
[1193 byte] By [SimonDM] at [2008-1-8]
Hi,
No, FTS has been completely rewritten in SQL Server 2008 and is now fully integrated with the DB engine. We still have a separate daemon running outside of SQL Server that is responsible for loading 3rd-party components (such as iFilters and word breakers). This is for security reasons.
We expect to see major improvements in query time in scenarios where relational and FT predicates are blended together. The Optimizer will now be able to select the best plan to satisfy a given query, whereas in 2000-2005 the FTCatalog would be queried entirely, regardless of any cardinality observed.
For 2008, we will still recommend keeping the FTCatalog in memory when possible; however, I am sure you will experience good performance and manageability improvements even if you cannot fit all of your FTCatalogs in memory at once.
I really recommend that you give this new FT architecture a try.
Thanks.
Thanks Fernando. This sounds very promising indeed.
Is there any documentation on the changes yet? I have the July CTP BOL but it doesn't cover any of these details. I haven't installed the SQL CTP engine yet. Is FT up and running in the CTP or is it down for a future release?
We have really hit a brick wall with the 2005 engine. The project I'm working on is close to either being canned or moving to another technology (which I'm not happy about). If 2008 looks like it can solve some of the problems we have experienced then I'm sure we can wait.
From the sounds of what you are saying, our server memory issues will be helped greatly, as SQL and FT will share memory rather than compete. It also means FT isn't limited by the lack of AWE support. The query optimisations should also reduce the load on the FT search, which in turn should reduce the memory requirement.
Thanks
Simon
You are welcome.
iFTS is not yet in any of the public CTPs. We are working hard to have it ready for CTP6. By then, the documentation will also be available. For now, I recommend that you contact me personally (fernlope@microsoft.com) so I can answer your questions. We could also look into sending you a private pre-CTP build of iFTS so you can try it out well before it is public.
Thanks.
Hi Fernando,
I think I met you at the SCAN Summit back in April. Would it be all right to contact you personally about getting a copy of a private pre-CTP build that has the iFTS bits in it? We are currently looking at other vendors to address features like proximity (find "Bill" within 2 words of "Gates"), and dictionary support (you mentioned "DMVs" in another post and how term occurrences would be exposed to admins).
Do you have any more information on either the proximity support or the DMVs? Will there be a way we can expose this type of functionality ourselves now that the index is part of SQL, even if MS hasn't fleshed it out?
Thanks,
Cameron
Hi Cameron,
Glad to hear from you again.
Let me answer your questions:
-Please write to me at fernlope@microsoft.com to discuss whether you can use our iFTS pre-CTP build right now. We are indeed very interested in you doing so, but we need to verify the NDA and other aspects before we are allowed to share it externally.
-The good news is that iFTS is proving to be faster and more robust than FTS 2005 in many respects. Also, we are exposing the set of DMVs I talked to you about. These allow you to see the content of the FTIndex and infer some statistics or data useful for troubleshooting, etc.
-The bad news is that we are not (yet) exposing term offsets in the FTIndex. Therefore, we cannot help you determine whether 2 words are within N words of each other. Sorry.
I can assure you that a customizable NEAR operator is among the top 5 features we plan to add as soon as we can; however, SQL Server 2008 will most probably not contain this feature due to time constraints and the iFTS work still left.
Thanks.
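For anyone wanting to experiment with the proximity requirement Cameron describes (find "Bill" within 2 words of "Gates") while term offsets are not exposed, one possible workaround is to post-filter candidate rows in application code after the full-text query has returned them. The following is a minimal Python sketch under stated assumptions, not a description of how iFTS works internally: the function names are hypothetical, tokenisation is a simple regex split rather than a real word breaker, and "within N words" is taken to mean that the two word positions differ by at most N, in either order.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens (assumption: a
    plain regex split stands in for a real word breaker)."""
    return re.findall(r"\w+", text.lower())


def within_n_words(text: str, first: str, second: str, max_gap: int) -> bool:
    """Return True if `first` and `second` occur in `text` at word
    positions that differ by at most `max_gap`, in either order."""
    tokens = tokenize(text)
    first_positions = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_positions = [i for i, t in enumerate(tokens) if t == second.lower()]
    # Compare every occurrence of one term with every occurrence of the other;
    # i != j guards against a term matching itself when both terms are equal.
    return any(
        i != j and abs(i - j) <= max_gap
        for i in first_positions
        for j in second_positions
    )


def filter_by_proximity(rows: List[str], first: str, second: str, max_gap: int) -> List[str]:
    """Keep only the candidate rows (hypothetical: text already returned by
    a full-text query) in which the two terms are close enough together."""
    return [row for row in rows if within_n_words(row, first, second, max_gap)]
```

Because this re-reads the text of every candidate row, it is only practical when the full-text query has already narrowed the result set down; it is no substitute for a NEAR operator evaluated inside the index.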