Practical questions
Hello all,
As far as I can see, LINQ offers key features that belong to the data access layer. (Not like the current features from System.Data in .NET 2.0.) LINQ is all about replacing existing data access layer code with LINQ code, I think.
So is the true advantage that we don't have to write SQL code anymore? If so, would the LINQ code be placed on the database server (stored procedures) or on the application side?
Would data access layers still exist? And in what form? Where would they reside? On a (for example) SQL Server, or on the application side?
I could not find any 'best practices' yet on this subject.
Hope you understand my issues and can clarify them!
Thank you very much in advance.
Kind regards,
David van Leerdam
[762 byte] By [davidvl] at [2007-12-17]
I see Linq as all about making queries over objects use the same syntax, regardless of whether they're in a database (DLinq), an in-memory collection (Linq), an XML structure (XLinq), or whatever you please. DLinq is specifically geared toward talking to databases (SQL Server, in the current preview).
I can see a place for Linq in a stored procedure, but not necessarily for replacing SQL -- it is much more generally useful than as an alternative to writing in any given query language.
IMHO, I see data layers still existing. But instead of being the place where you store SQL strings and perform the actual query, I see them as being the place where you store the description of the query you want to execute.
Read the Linq and DLinq documents carefully. Realize that with DLinq, absolutely NO query is executed until you call the query's enumerator. When you do so, the expression tree that was created using the query syntax is analyzed and transformed into query language for the database in question, and the data is finally pulled from the database. That is, in DLinq the query syntax is a means to create what in ADO.NET would be a *description* of the SqlCommand object. When you take the query description and put it in, say, a foreach block, the enumerator is called, which then constructs the SqlCommand object and queries the database.
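To see the deferred execution described above without needing a database, here is a minimal sketch. It uses LINQ to Objects rather than DLinq, so no SQL is generated, but the timing is the same: the query is only a description until it is enumerated. The names `DeferredDemo`, `BuildQuery` and `FilterCalls` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Counts how often the filter runs, so we can see when the query executes.
    public static int FilterCalls;

    public static IEnumerable<string> BuildQuery(IEnumerable<string> names)
    {
        // Only a description of the query is built here; the filter has not run yet.
        return from n in names
               where IsLong(n)
               select n.ToUpper();
    }

    static bool IsLong(string n)
    {
        FilterCalls++;
        return n.Length > 3;
    }
}

static class Program
{
    static void Main()
    {
        var names = new List<string> { "Ann", "David", "Keith" };
        var query = DeferredDemo.BuildQuery(names);

        Console.WriteLine(DeferredDemo.FilterCalls); // 0: nothing has been enumerated yet

        foreach (var n in query)                     // enumeration starts here
            Console.WriteLine(n);                    // DAVID, then KEITH

        Console.WriteLine(DeferredDemo.FilterCalls); // 3: one call per source element
    }
}
```

With DLinq the same foreach would be the point at which the expression tree is translated and sent to the database.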
So the data layer still exists, but it's not so strictly divided. The layer calling the data layer is the one that actually makes the query, unless you split things across services or the like.
    static class DAL
    {
        public static IEnumerable<Person> People
        {
            get
            {
                var db = new Northwind(connectionString);

                // creates a description of the query,
                // this is NOT a SQL statement
                var query = from p in db.People
                            select p;

                // returns a QUERY DESCRIPTION, not the data
                // the query description has knowledge about
                // the data source it needs to access, and can
                // generate appropriate queries on demand
                return query;
            }
        }
    }

    class DALConsumer
    {
        public void DisplayPeople()
        {
            foreach (Person p in DAL.People)
            {
                Console.WriteLine(p.Name);
            }
        }
    }
At least, that's how I'm seeing it used. This is a fairly new approach for most people so "best practices" haven't really been published.
Hi Keith,
Thank you for your kind reply.
The reason I was asking for 'best practices' is that LINQ has to tackle some issues that current data access methodologies have. I wondered what issues LINQ would solve, and how this is accomplished. So there should be examples of situations where LINQ is really useful, and where it solves the issues it should solve. From the examples I have found so far, I just don't see the real advantage of using LINQ, so I must be missing some point.
If I may extract this from your reply and code example, the advantage of using LINQ is, among other things, that we are able to construct data commands
- that should be more readable
- that use a single querying language at application level, regardless of the storage format of the data
I understand from your reply that the construction of data commands comes in very handy with data stores that do not support stored procedures. Else LINQ would not be needed, since the command is defined on the data store itself. Is this correct or am I missing something?
Thanks in advance.
Kind regards,
David van Leerdam
Remember -- if it's a database, it's DLinq. Linq and DLinq may look the same on the outside (intentionally!), but they work differently inside.
DLinq should provide the same benefits that the typical ORM package does:
* insulation from database-specific points
* strongly typed entities
* automatic (and correct) connection management
Try not to focus on databases and stored procedures. In the end, the fact that you're using a database is an internal implementation detail of your data persistence efforts. ORMs were developed to try to hide the database, and DLinq is no different.
Hi Keith,
That makes good sense. So I should see LINQ as a generic way of facilitating object-data mappings, and DLINQ is a specific implementation of LINQ for relational databases, somewhat like strongly typed datasets, right?
Thanks a lot!
Kind regards,
David van Leerdam
I completely disagree on the “best-practices” issue for Dlinq. There are indeed volumes of best practices written on using tools like Dlinq. Dlinq is just ORM. The ORM community is rife with best practices and hard experience.
To risk sounding officious, using static methods at DAL component boundaries is a real shot in the unit-testability foot and really would be one of those things that would typically be proscribed in the existing best practices – even without an ORM tool.
The essence of applications using Dlinq or any ORM tool is rooted in Domain-Driven Design, and you can’t find a more in-depth treatment of building applications that have an object-oriented versus a data-oriented perspective than Eric Evans’ book on the subject.
It would be quite unfortunate if the Microsoft developer community were to become deluded into believing that Dlinq is in any way a new thing – except perhaps for query syntax support in the compiler. Microsoft is quite late to the ORM and Domain-Driven game, and I hope that community-facing folks in Redmond do not fail to recognize the wealth of existing knowledge and experience supporting approaches engendered by tools like Dlinq, and do not try to re-invent the entity-centric design practice wheel. At best, such an effort would create confusion and fragmentation. At worst, it could provide more misguidance.
My 2 cents,
-s
That is exactly how I see it.
(So obviously it must be correct! ;)
Linq's a pattern that transforms the query syntax into a set of calls to methods with standardized names. MS has provided some default implementations of these methods for use in various circumstances. Linq's implementations build delegates; DLinq's build expression trees.
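As a rough illustration of that translation, the sketch below writes the same query once in query syntax and once as the method calls the compiler turns it into, and then shows one lambda held both as a delegate and as an expression tree. It uses the released `System.Linq` API on an in-memory array; `TranslationDemo` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

static class TranslationDemo
{
    public static readonly int[] Numbers = { 1, 2, 3, 4, 5, 6 };

    // Query syntax...
    public static IEnumerable<int> QuerySyntax()
    {
        return from n in Numbers
               where n % 2 == 0
               select n * 10;
    }

    // ...is shorthand for these calls to methods with standardized names.
    public static IEnumerable<int> MethodSyntax()
    {
        return Numbers.Where(n => n % 2 == 0).Select(n => n * 10);
    }

    // The same lambda as compiled code (what the in-memory implementation takes)...
    public static readonly Func<int, bool> AsDelegate = n => n % 2 == 0;

    // ...and as a data structure that a provider can inspect and translate.
    public static readonly Expression<Func<int, bool>> AsTree = n => n % 2 == 0;
}

static class Program
{
    static void Main()
    {
        // Both forms yield 20, 40, 60.
        Console.WriteLine(TranslationDemo.QuerySyntax()
            .SequenceEqual(TranslationDemo.MethodSyntax()));     // True

        Console.WriteLine(TranslationDemo.AsDelegate(4));        // True
        Console.WriteLine(TranslationDemo.AsTree.Body.NodeType); // Equal
        Console.WriteLine(TranslationDemo.AsTree.Compile()(4));  // True
    }
}
```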
You're free to implement your own: perhaps you want to query an LDAP directory using Linq, or maybe you want to make AmazonLinq or GoogleLinq using web service calls. In which case, your implementation would build LDAP, AmazonWS, or Google queries.
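A minimal sketch of what "implementing your own" could look like: the compiler only needs methods named `Where` and `Select` with suitable signatures on the source type. The `DirectoryQuery` and `Entry` classes are invented for this example, and instead of building a real LDAP filter the `Where` method merely records the expression tree it receives.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical entity type.
public class Entry
{
    public string Name { get; set; }
}

// Hypothetical query source. It is not IEnumerable and uses no System.Linq
// extension methods; the query syntax binds to its own Where and Select.
public class DirectoryQuery
{
    public readonly List<Expression> Filters = new List<Expression>();

    public DirectoryQuery Where(Expression<Func<Entry, bool>> predicate)
    {
        // A real provider would translate this tree into its own query language.
        Filters.Add(predicate.Body);
        return this;
    }

    public DirectoryQuery Select(Expression<Func<Entry, Entry>> selector)
    {
        return this;
    }
}

static class Program
{
    static void Main()
    {
        var query = from e in new DirectoryQuery()
                    where e.Name == "davidvl"
                    select e;

        Console.WriteLine(query.Filters.Count);       // 1
        Console.WriteLine(query.Filters[0].NodeType); // Equal
    }
}
```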
Of course it's not new.. people have been trying to do this for decades.
Linq, etc is what happens when you actually *do* sit down for a couple years and look at things. Just coming up with something different without taking time to seriously think about it leads to the balkanization you see in the ORM and freeware world. These days, particularly in .NET, innovation seems to be more about unifying the needlessly-different concepts rather than defining some completely new concept. Perfectly valid, in my book, and much more impressive.
Re statics in unit testing: never given me a problem before.
Well, I suppose we're kinda going OT, but it's a valuable topic to be off on... :)
If you have static methods at component boundaries, i.e., if your data access assembly's API is made up of static methods, you can't unit test any code that is a client of that assembly. Since the client is statically bound to the data access component, business logic tests that make calls into data access must make calls into live data access. That's an integration test rather than a unit test.
A unit test of business logic that calls into a data access API should call into a mock of the data access API. Mocking would require instance pluggability which means that the data access API would need to be composed of instance methods.
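A small sketch of that instance pluggability, with a hand-written stub rather than any particular mocking library. All names here (`ICustomerDataGateway`, `CustomerReport`, `StubCustomerDataGateway`) are hypothetical and only illustrate the shape of the design.

```csharp
using System;

// The data access API as an interface of instance methods.
public interface ICustomerDataGateway
{
    int CountCustomers();
}

// Business logic depends on the interface, not on a static class.
public class CustomerReport
{
    private readonly ICustomerDataGateway gateway;

    public CustomerReport(ICustomerDataGateway gateway)
    {
        this.gateway = gateway;
    }

    public string Summary()
    {
        int n = gateway.CountCustomers();
        return n == 0 ? "No customers" : n + " customer(s)";
    }
}

// Test double: returns a canned value and never touches a database.
public class StubCustomerDataGateway : ICustomerDataGateway
{
    private readonly int count;

    public StubCustomerDataGateway(int count)
    {
        this.count = count;
    }

    public int CountCustomers()
    {
        return count;
    }
}

static class Program
{
    static void Main()
    {
        var report = new CustomerReport(new StubCustomerDataGateway(123));
        Console.WriteLine(report.Summary()); // 123 customer(s)
    }
}
```

In production the same `CustomerReport` would be handed a gateway backed by the real data access code.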
I'm not saying that the system couldn’t be tested... it just couldn't be unit tested. Even if integration tests are executed from a unit testing tool like NUnit, they're still integration tests.
One of the great benefits of Domain-Driven Design is the ability to build entities and business logic in isolation of data access (and ultimately other external dependencies like web services, etc). Statically-bound external dependencies shut the door on unit testing and allow only for integration testing. That’s still better than no testing, but it doesn’t allow for the design benefits opened up by testable designs, like IoC, dependency injection, etc.
That’s my terribly-biased feeling, anyway.
I disagree. You can plug a mock into a static member variable. In this case, the DAL becomes an adapter to an adapter, much in the same way I view DLinq:
    static class DAL
    {
        // Swappable so that a test can plug in a mock.
        public static DbInterface Interface;

        public static int NumberOfCustomers()
        {
            // or something like that
            return (from c in Interface.Customers select c).Count();
        }
    }

    void UnitTest()
    {
        DAL.Interface = new MockDbInterface();
        AssertEquals(123, DAL.NumberOfCustomers());
    }
Consider the following:
UI -> Linq [-> Linq...] -> DLinq -> DB
> somewhat like strongly typed datasets, right?
... except that strongly-typed DataSets are mostly strongly-typed accessors over relational structures. Domain entity-oriented design allows you to decouple from the data schema, which ultimately can bring with it a whole host of design opportunities. Even though you *can* make your objects match the structure of your tables, it's often better to let the relational schema be the relational schema and let the object schema be the object schema, and use the mapping tool to resolve any differences in the schemas. You'd end up with differences because modeling a relational schema in an entity-oriented design can cripple the design opportunities of the domain layer. An ORM tool can go far beyond the DataSet/DataAdapter in terms of mapping capabilities, and hopefully Dlinq will go to the extent that the presently-available ORM tools for .NET have gone.
Which really kinda calls the utility of Metal into question... :) Starting with Metal could mean that a brand new domain layer could be smelly from the outset.
Indeed, but in a web app, different requests would be constantly over-writing each other's instance of the data access interface, since a static member is accessible by any thread in an AppDomain and web request threads in any given ASP .NET app run in a single AppDomain. There's no guarantee that a given ASP .NET request will be served by a specific thread, so Thread Local Storage isn't a safe bet either.
Presently, most folks using ORM use the Session per Request pattern and most of the .NET ORM tool makers ship HttpModule implementations to support it out of the box. Some third parties, such as Castle, ship implementations as well. Spring is supposed to support this out of the box soon (if not already).
The practice gaining more momentum is the use of Evans' repository pattern concept with a pluggable data gateway - which is essentially a Strategy pattern at its core. The .NET Provider Model is a Strategy pattern too, so it's essentially a provider over a provider.
Here's a thread on Ben Day's blog that goes a bit deeper, and links to a bunch of other references:
http://blog.benday.com/archive/2005/03/16/198.aspx
If you wanted multi-threaded, write-once, that can be done with a singleton pattern. Nothing gets overwritten.
Nothing you've said prevents statics from being tested with mocks, even in a multi-threaded environment.
I think we're working from different assumptions about design.
ORM development typically leans toward the use of data sessions and, in a web app, the Session per Request pattern. So I'll lay out my design assumptions and let's see if we're talking about the same stuff...
A data session typically holds a single connection (possibly multiple connections, but that’s rare), and possibly transaction(s), as well as the ORM’s cache of entity instances active in that session. The session is stateful.
Each user is assigned an individual data session used during the lifetime of their particular web request. The data session isn’t shared between requests since any given live connection and live transaction isn’t shared between users’ requests.
You can indeed assign a mock to a static member, but that would only solve a testing problem – the design wouldn’t work for production ASP .NET apps. All users’ ASP .NET requests in a given ASP .NET application are processed in a single AppDomain and therefore all request threads have visibility to the same memory space for static members for classes loaded into the AppDomain.
If a data session that is intended for use for a single user on a single web request is written to a static member of a data access Singleton, every time that Singleton is used by the multiple requests representing multiple users, it will hand back the same, shared instance of a single data session to multiple user requests.
One user can get another user’s data session.
In Dlinq, the data session is the DataContext. I’m not quite sure yet if this is indeed a context object in terms of behavior similar to ContextBoundObject, or some other object that would give it some kind of context affinity such as an affinity to an ASP .NET request. If the DataContext is in fact web context-bound in some sense, then Microsoft may have solved much of the problem of using ORM data sessions in web apps right out of the box. I haven’t seen anything in Dlinq that leads me to believe that this is so, but I haven’t looked too deeply yet, and the product is still young. Nonetheless, to solve the problem, the Dlinq developers would still be subject to the operating environment of ASP .NET AppDomains and threads, and would likely ship an implementation of HttpModule to address session affinity like the other ORM vendors have done.
A Repository class lives in the domain layer. CustomerRepository is responsible for coordinating data access for the Customer Aggregate (Customer Entity and any child classes uniquely managed by the Customer Entity).
Typically, when a request begins, an HttpModule inserts a new data session (Dlinq DataContext) into HttpContext.Items. This makes the data session available to all business logic that might need to talk to the data access interface in the course of business logic transactions.
When a CustomerRepository is instantiated in the course of some business logic transaction, it (or a DI framework) plucks the data session from context (HttpContext.Items in the case of ASP .NET apps) and delegates data access behavior to it.
When the request ends, the HttpModule disposes the data session.
This is essentially the behavior of a Singleton, except that it also works in ASP .NET apps, including asmx web services.
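The request lifecycle just described can be sketched without ASP .NET at all. In the sketch below `RequestContext.Items` is a plain dictionary standing in for HttpContext.Items, `SessionModule` stands in for the HttpModule, and `DataSession` stands in for the ORM session (the DataContext in Dlinq). None of these are real framework types; they are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for an ORM data session.
public sealed class DataSession : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

// Stand-in for the per-request context (HttpContext in a web app).
public sealed class RequestContext
{
    public readonly Dictionary<string, object> Items = new Dictionary<string, object>();
}

// Plays the role of the HttpModule: one session per request.
public static class SessionModule
{
    public const string Key = "DataSession";

    public static void BeginRequest(RequestContext context)
    {
        context.Items[Key] = new DataSession();
    }

    public static void EndRequest(RequestContext context)
    {
        ((DataSession)context.Items[Key]).Dispose();
        context.Items.Remove(Key);
    }
}

// The repository plucks the session from the current request's context.
public class CustomerRepository
{
    public readonly DataSession Session;

    public CustomerRepository(RequestContext context)
    {
        Session = (DataSession)context.Items[SessionModule.Key];
    }
}

static class Program
{
    static void Main()
    {
        var requestA = new RequestContext();
        var requestB = new RequestContext();
        SessionModule.BeginRequest(requestA);
        SessionModule.BeginRequest(requestB);

        var repoA = new CustomerRepository(requestA);
        var repoB = new CustomerRepository(requestB);
        Console.WriteLine(ReferenceEquals(repoA.Session, repoB.Session)); // False

        SessionModule.EndRequest(requestA);
        Console.WriteLine(repoA.Session.IsDisposed); // True
        Console.WriteLine(repoB.Session.IsDisposed); // False
    }
}
```

Each request gets its own session, and ending one request does not affect the session of another.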
In unit testing the business logic, a mock or stub context provides an instance of a mock of the data session or a data access interface that encapsulates the specific ORM vendor’s implementation of a data session – say, ICustomerDataGateway, or something of the like. This allows business logic to be tested in isolation of the data access. When business logic under test instantiates and then uses CustomerRepository, the Repository delegates to a mock.
If CustomerRepository held an instance of ICustomerDataGateway in a static member, every time CustomerRepository is instantiated in the course of a web request, it would end up sharing the same ICustomerDataGateway instance among multiple requests and among different users of the web app.
My design assumptions about data access are totally influenced by prevailing ORM practices, the design practices that issue from Test-Driven Development, and Domain-Driven Design. If we’re not playing from the same application micro architecture play book, then we’re probably not really talking about the same concerns that would lead me to affirm that static members at component boundaries are bad things.
TDD practitioners by and large would eschew the use of Singletons altogether because they tend to pollute the clean memory space of a given test fixture’s test cases that test or make use of a Singleton. Each test case of a fixture that tests the Singleton would not get a clean instance of the Singleton – which would negate a key tenet of test case isolation. To make sure that each test case in a fixture uses a clean Singleton, the Singleton would have to be re-initialized by each test case – which is essentially a Strategy pattern with a limp.
This invariably leads to the predominant use of the Strategy pattern over Singleton where testability is a key driver of the engineering practices and of design – as is the case with TDD.
Hello all,
Thanks for having this discussion, I think I now understand the main goals of LINQ.
One remaining (probably also off topic) related question though. My presentation layer code often contains a piece of code like:
Dim myOrder as Order = New Order
myOrder.SelectById(e.CommandArgument)
myOrder.Delete()
Which would select an order from my database and delete it. A result of this design is that in this situation it makes 1 'large' unnecessary database query, since my 'deleteorder' stored procedure takes the orderId as a parameter.
I could do it by passing the Id to a static (shared in VB) Delete method instead of this version, but I feel like that is bad design too.
If I could just determine what the next call on my object would be (that is, detect a 'Delete()' following 'SelectById').
Does LINQ have a (kind of) solution for this? And how would you solve it with .NET 1.1 / 2.0?
Thanks in advance.
Kind regards,
David van Leerdam