Storing Session State
We have been using SQL Server to store session state for a while. Now we are growing and looking for more scalable alternatives. It looks like there are three products on the market: ScaleOut, Ncache and StateMirror. Just wondering if anyone had any experience with any of these beasts.
I assume the lack of comments indicates a lack of experience with these products. I would be interested in why you think you can't scale out SQL Server to meet your session state requirements. If SQL Server is handling it now, why wouldn't two or three copies of SQL Server handle your growth for a while?
Roger,
I assume you are suggesting a simple partitioning approach, where the web farm is divided into several pools and each pool has a dedicated SQL Server for session management. The drawbacks are:
1. Potentially uneven load balancing. I must admit that the chances of this are small, but still - this is not a 100% scalable model.
2. Single point of failure. I agree this may not be a big deal for some deployments, but in some cases it’s good to have a bombproof solution.
3. Maintenance/upgrade downtime. Distributed storage with redundancy eliminates it.
4. Looks like this class of products also solves the problem of application state sharing. This can be useful for caching small lookup tables etc. As far as I know, ASP.NET 2.0 does not solve this problem for web farms.
5. Money. Let’s face it, these solutions do a good job competing with MS SQL Server on price.
If you suggest something more complicated, there is an excellent article about scaling out SQL Server data at http://msdn2.microsoft.com/en-us/library/aa479364.aspx . In most cases, my comments above are applicable to the scenarios described in the article. And here are some specific notes about every scenario.
Scalable Shared Databases
Not our case, it's read-only.
Peer-to-Peer Replication
1. No conflict resolution, and session state may change fast - consider multiple frames in a single web page that change session state contents;
2. Performance. The products I mentioned use multicast communications, which must be much more efficient than peer-to-peer updates in terms of network traffic and CPU cycles.
Linked servers
Complexity and reliability. As far as I understand, the customer must provide server links between SQL boxes and re-write SQL scripts provided by ASP.NET so they use linked servers. Doesn't seem to be a trivial task to me.
DPVs
This looks pretty much like server clustering. It must be overkill in terms of price/resources for such a specific and straightforward task as session state management.
Data-Dependent Routing
This is almost the same as the classic partitioning scenario I mentioned above, but it also requires additional effort on dispatching the data objects.
SODA
This is a heavy-weight solution and is not applicable here.
Overall, I have the impression that this is a good opportunity for third-party developers. But I would be happy to hear success stories from those who used multiple SQL Server copies for ASP.NET application/session state management.
First, I would agree that the article on scaleout is excellent - I wrote it.
If your solutions do load balancing and fault tolerance without routing queries by session identifier and without replication, and all at a price less than free, then I am truly impressed and I agree that SQL Server can't come close to matching these. Try repeating your analysis with only session state in mind. How does update conflict resolution apply, for example? If you have two different connections updating the same session state simultaneously, you have a lot more issues than just scaleout. I've never seen a session state implementation that didn't use a session identifier. If you hash that identifier and use it for partitioning and distributing to a farm of session state databases, you should be able to scale as far as you wish.
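For readers following along, the hash-and-partition idea described above can be sketched in a few lines. This is only an illustration in Python, not anything shipped with ASP.NET or SQL Server: the database names and the `database_for` helper are made up for the example, and SHA-256 is just one reasonable choice of stable hash.

```python
import hashlib

# Hypothetical database names; the thread does not name any servers.
SESSION_DATABASES = ["sessiondb0", "sessiondb1", "sessiondb2"]


def database_for(session_id: str, databases=SESSION_DATABASES) -> str:
    """Pick the session state database for a given session identifier."""
    # Use a stable hash. Python's built-in hash() is salted per process,
    # so different web servers would disagree on the result.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(databases)
    return databases[bucket]


if __name__ == "__main__":
    # Every web server computes the same answer for the same session.
    print(database_for("example-session-id"))
```

Note that with plain modulo partitioning like this, changing the number of databases remaps most existing sessions, so the list would have to stay fixed while sessions are live.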
>First, I would agree that the article on scaleout is excellent - I wrote it.
This is a good start. Pleased to meet you!
>If your solutions do load balancing and fault tolerance without routing queries by session identifier and without replication, and all at a price less than free ...
These guys use multicast communication, so no explicit replication occurs. And I would be surprised if they use routing by session id.
Anyways, let's concentrate on session state and forget about application state for now. As far as I understand, your weapon of choice is the Data-Dependent Routing model with a good hash algorithm for session id. In your world, I can see two options.
- To use one dedicated SQL Server for session state; this SQL Server has a trigger on session state databases that dispatches data to a number of "real-back-end" SQL Servers.
- To hire a developer that writes a custom session store provider that dispatches session state objects to a number of SQL Servers.
Solution one shouldn't be taken seriously - dispatcher SQL Server is a bottleneck, so the scalability is … hmm… questionable. Model two looks more appealing. Let's compare it with third-party solutions.
- Technology maturity. SQL Server wins.
- Scalability. If we trust the numbers and demos on third-party websites both solutions are ok: just add wat... more servers.
- Reliability. SQL - single point of failure (unless every object is stored on two SQL Servers, which means price multiplied by two); third-party - data redundancy (may cost us some extra network traffic and CPU).
- Upgrade/maintenance downtime. Hardly avoidable in SQL scenario, unless custom session store provider is able to change dispatch settings dynamically and replicate session objects when needed (which is not a trivial task); third-party - zero downtime.
- Manageability. SQL - Enterprise Manager console, maybe some performance counters updated from the custom provider. All third-party solutions have a component that monitors solution activity and allows you to perform some basic management operations.
- Price. SQL scenario: development/testing resources, licenses, support. Third party - product price, minimal (as it looks from product demos) support.
To me it looks like third parties do have a chance.
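The "change dispatch settings dynamically" and "every object is stored on two SQL Servers" points in the comparison above can be illustrated with a consistent hash ring. This is a sketch under assumptions, in Python rather than as an ASP.NET provider: the `HashRing` class and the server names are hypothetical and are not part of any product mentioned in this thread. It only decides where a session should live; it does not move any data.

```python
import bisect
import hashlib


def _point(key: str) -> int:
    """Stable 64-bit position on the ring for a string key."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring; each server owns many virtual points."""

    def __init__(self, servers, vnodes=100):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        """Add a server; only keys next to its new points change owner."""
        for i in range(self._vnodes):
            p = _point(f"{server}#{i}")
            self._owner[p] = server
            bisect.insort(self._points, p)

    def servers_for(self, session_id: str, copies: int = 2):
        """Return up to `copies` distinct servers, walking clockwise from the key."""
        wanted = min(copies, len(set(self._owner.values())))
        start = bisect.bisect_right(self._points, _point(session_id))
        found = []
        step = 0
        while len(found) < wanted:
            position = self._points[(start + step) % len(self._points)]
            server = self._owner[position]
            if server not in found:
                found.append(server)
            step += 1
        return found


if __name__ == "__main__":
    ring = HashRing(["sql-a", "sql-b", "sql-c"])  # hypothetical server names
    primary, replica = ring.servers_for("example-session-id")
    print(primary, replica)
```

With this layout, adding a fourth server changes the primary only for the sessions that now land on the new server, roughly a quarter of them, instead of remapping nearly everything. Copying those sessions over while the farm is running is still the non-trivial part mentioned above.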
I'll accept that using IP broadcast to replicate state to every server gives you fault tolerance and scaleout with only the cost of a copy of the state on every server and some fairly high network and disk IO overhead. Broadcast is by nature unreliable, so I assume either your application or the state server has to deal with session state being missing or corrupt on any given server. I'm a database person, so I have an aversion to unreliable, non-transactional data storage, but if your application can tolerate that, you can definitely get better performance without reliability. Let us know how your application works with your new state storage system.
Runs ok in the test environment (we are using StateMirror). It offers enough flexibility to easily bail out to SQL state storage if we find any problems.
Great chain of answers. It is benefiting a lot of people.
Please carry on.
Regards
Sergey Pikhulya wrote:
>Runs ok in the test environment (we are using StateMirror). It offers enough flexibility to easily bail out to SQL state storage if we find any problems.