Specification Pattern In Entity Framework 4 Revisited

It has been a while since the post Entity Framework 4 POCO, Repository and Specification Pattern was published, and it has received quite a few positive comments from readers. I originally intended that code as a prototype to demonstrate an implementation of EF POCO, the Repository pattern and the Specification pattern. I knew it was not optimal for everyone, and I left improvements to the readers, thinking that once they understood the design idea they could extend, change, or use the API in any way they wanted.

However, some comments raised a concern I was not aware of: the way the Specification pattern is applied can cause a big performance problem. I use this API in my current work, but I am ashamed to say that I rarely query data with a Specification because the other methods are enough for me.

For your information, here are the extracted comments from Jon & Buu (many thanks :)) which pointed out the problem:

Jon: “Linq to Entities uses expressions to build the SQL that will be executed on the database server. If you use the specification pattern as designed above, i.e., without expressions, the generated SQL is never impacted. Instead, EF will generate “SELECT * FROM Table,” returning all rows. From there the specification pattern kicks in and filters the data in memory. Your test passed because you got the right result, but not in the right way. It would never work in a real world scenario.”

Buu: “…the cause of the issue as Jon observed is because the method (IsSatisfiedBy) of Specification is fed into the Where method of the query object. That results in LINQ-2-EF loading all records to memory objects and then filtering those objects using the passed in method. That’s a huge performance hit.

To fix the issue, you should feed into the Where method an instance of Expression so that the Where overload in IQueryable is invoked (instead of the one in IEnumerable). The Specification already stores an instance of Expression, so the code change should be straightforward. However, since not all expression operations are supported by LINQ-2-EF, you might end up breaking some existing code. Be warned…”
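The overload difference Buu describes can be reproduced without a database. The following is a minimal sketch over an in-memory sequence (the class and method names are made up for illustration and are not part of the framework): a Func<T, bool> binds to Enumerable.Where and yields a plain IEnumerable<T>, whereas an Expression<Func<T, bool>> binds to Queryable.Where and keeps the result an IQueryable<T>, which is what a provider such as LINQ to Entities needs in order to translate the filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical demo class; not part of the original framework.
public static class WhereOverloadDemo
{
    // A compiled delegate selects Enumerable.Where (filtering happens in memory).
    public static IEnumerable<int> FilterWithDelegate(IQueryable<int> source, Func<int, bool> predicate)
    {
        return source.Where(predicate);
    }

    // An expression tree selects Queryable.Where (the query provider receives the filter).
    public static IEnumerable<int> FilterWithExpression(IQueryable<int> source, Expression<Func<int, bool>> predicate)
    {
        return source.Where(predicate);
    }

    public static void Main()
    {
        IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();

        var inMemory = FilterWithDelegate(source, n => n > 2);
        var byProvider = FilterWithExpression(source, n => n > 2);

        Console.WriteLine(inMemory is IQueryable<int>);   // False
        Console.WriteLine(byProvider is IQueryable<int>); // True
    }
}
```

Both calls return the same elements, which is why a test that only checks the result cannot tell the two apart.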

I verified the problem with Entity Framework Profiler and I have to say the problem is there. Here are some tests I made to retrieve a product by name, and also the results from the profiler:

Using a standard lambda expression:

private void FindProductByName()
{
    IEnumerable<Product> products = productRepository.Find<Product>(p => p.Name == "Windows XP Professional");
    Assert.AreEqual(1, products.Count());
}

The output from the profiler

EF generated the correct SQL and returned the expected result; there is nothing wrong with this.

Using a specification:

private void FindBySpecification()
{
    Specification<Product> specification = new Specification<Product>(p => p.Name == "Windows XP Professional");
    IEnumerable<Product> products = productRepository.Find<Product>(specification);
    Assert.AreEqual(1, products.Count());
}

The output from the profiler

As you can see, using a specification to retrieve a product is a problem: EF loads all the products from the database and then filters the whole set in memory. If the number of products is huge, this is a real issue because the query is not efficient at all.

This is because the specification pattern was implemented incorrectly. To fix it, I changed the specification contract as follows:

public interface ISpecification<TEntity>
{
    TEntity SatisfyingEntityFrom(IQueryable<TEntity> query);

    IQueryable<TEntity> SatisfyingEntitiesFrom(IQueryable<TEntity> query);
}

And the generic repository’s methods that accept a specification are implemented as follows:

public class GenericRepository : IRepository
{
    // other code...

    public IQueryable<TEntity> GetQuery<TEntity>() where TEntity : class
    {
        var entityName = GetEntityName<TEntity>();
        return ObjectContext.CreateQuery<TEntity>(entityName);
    }

    public TEntity FindOne<TEntity>(ISpecification<TEntity> criteria) where TEntity : class
    {
        return criteria.SatisfyingEntityFrom(GetQuery<TEntity>());
    }

    public IEnumerable<TEntity> Find<TEntity>(ISpecification<TEntity> criteria) where TEntity : class
    {
        return criteria.SatisfyingEntitiesFrom(GetQuery<TEntity>());
    }

    // other code...
}

The implementation of a simple specification is straightforward.

public class Specification<TEntity> : ISpecification<TEntity>
{
    public Specification(Expression<Func<TEntity, bool>> predicate)
    {
        Predicate = predicate;
    }

    public TEntity SatisfyingEntityFrom(IQueryable<TEntity> query)
    {
        return query.Where(Predicate).SingleOrDefault();
    }

    public IQueryable<TEntity> SatisfyingEntitiesFrom(IQueryable<TEntity> query)
    {
        return query.Where(Predicate);
    }

    public Expression<Func<TEntity, bool>> Predicate;
}

Here is the output of the profiler with the new specification implementation:

As you can see, EF now generates the expected SQL and returns the expected result as well, which is very efficient.

What about the composite specification?

If you have read the previous post, you know that the composite specification is used to chain specifications. With the new specification implementation, the composite specification also needs to change. The technique applied here is to combine the lambda expressions (predicates) of the left and right sides of a composite specification into a new lambda expression, which is then fed to the IQueryable object for querying.

Here is the code of the extension methods for lambda expressions, which is mostly inspired by the code here:

public static class ExpressionExtension
{
    public static Expression<T> Compose<T>(this Expression<T> first, Expression<T> second, Func<Expression, Expression, Expression> merge)
    {
        // build parameter map (from parameters of second to parameters of first)
        var map = first.Parameters.Select((f, i) => new { f, s = second.Parameters[i] }).ToDictionary(p => p.s, p => p.f);

        // replace parameters in the second lambda expression with parameters from the first
        var secondBody = ParameterRebinder.ReplaceParameters(map, second.Body);

        // apply composition of lambda expression bodies to parameters from the first expression
        return Expression.Lambda<T>(merge(first.Body, secondBody), first.Parameters);
    }

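    public static Expression<Func<T, bool>> And<T>(this Expression<Func<T, bool>> first, Expression<Func<T, bool>> second)
    {
        // AndAlso is the short-circuiting logical AND (&&); Expression.And is the non-short-circuiting & operator
        return first.Compose(second, Expression.AndAlso);
    }

    public static Expression<Func<T, bool>> Or<T>(this Expression<Func<T, bool>> first, Expression<Func<T, bool>> second)
    {
        // OrElse is the short-circuiting logical OR (||); Expression.Or is the non-short-circuiting | operator
        return first.Compose(second, Expression.OrElse);
    }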
}
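The ParameterRebinder used by Compose is not listed in this post. Below is a minimal sketch of what it might look like, assuming the System.Linq.Expressions.ExpressionVisitor base class that ships with .NET 4. The ParameterRebinderDemo class and its BuildBetween method are hypothetical and only illustrate why the second lambda's parameter has to be replaced before the two bodies are merged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Sketch (assumed shape): a visitor that swaps parameters according to a map.
public class ParameterRebinder : ExpressionVisitor
{
    private readonly Dictionary<ParameterExpression, ParameterExpression> map;

    public ParameterRebinder(Dictionary<ParameterExpression, ParameterExpression> map)
    {
        this.map = map ?? new Dictionary<ParameterExpression, ParameterExpression>();
    }

    public static Expression ReplaceParameters(
        Dictionary<ParameterExpression, ParameterExpression> map, Expression expression)
    {
        return new ParameterRebinder(map).Visit(expression);
    }

    // Called for every parameter reference in the visited tree.
    protected override Expression VisitParameter(ParameterExpression node)
    {
        ParameterExpression replacement;
        if (map.TryGetValue(node, out replacement))
        {
            node = replacement;
        }
        return base.VisitParameter(node);
    }
}

// Hypothetical demo: merges "x > 1" and "y < 5" into a single lambda over x.
public static class ParameterRebinderDemo
{
    public static Func<int, bool> BuildBetween()
    {
        Expression<Func<int, bool>> first = x => x > 1;
        Expression<Func<int, bool>> second = y => y < 5;

        // Map the parameter of the second lambda to the parameter of the first.
        var map = new Dictionary<ParameterExpression, ParameterExpression>
        {
            { second.Parameters[0], first.Parameters[0] }
        };
        Expression secondBody = ParameterRebinder.ReplaceParameters(map, second.Body);

        var combined = Expression.Lambda<Func<int, bool>>(
            Expression.AndAlso(first.Body, secondBody), first.Parameters);
        return combined.Compile();
    }

    public static void Main()
    {
        Func<int, bool> between = BuildBetween();
        Console.WriteLine(between(3)); // True
        Console.WriteLine(between(7)); // False
    }
}
```

Without the rebinding step, the merged body would still refer to the parameter y, which the new lambda does not declare, so the combined expression could not be compiled or translated.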

And here is the implementation of the composite specification:

public class AndSpecification<TEntity> : CompositeSpecification<TEntity>
{
    public AndSpecification(Specification<TEntity> leftSide, Specification<TEntity> rightSide)
        : base(leftSide, rightSide)
    {
    }

    public override TEntity SatisfyingEntityFrom(IQueryable<TEntity> query)
    {
        return SatisfyingEntitiesFrom(query).FirstOrDefault();
    }

    public override IQueryable<TEntity> SatisfyingEntitiesFrom(IQueryable<TEntity> query)
    {
        return query.Where(_leftSide.Predicate.And(_rightSide.Predicate));
    }
}

public class OrSpecification<TEntity> : CompositeSpecification<TEntity>
{
    public OrSpecification(Specification<TEntity> leftSide, Specification<TEntity> rightSide)
        : base(leftSide, rightSide)
    {
    }

    public override TEntity SatisfyingEntityFrom(IQueryable<TEntity> query)
    {
        return SatisfyingEntitiesFrom(query).FirstOrDefault();
    }

    public override IQueryable<TEntity> SatisfyingEntitiesFrom(IQueryable<TEntity> query)
    {
        return query.Where(_leftSide.Predicate.Or(_rightSide.Predicate));
    }
}

The test code to find a product by name and price, which uses the AndSpecification:

private void FindByAndCompositeSpecification()
{
    IEnumerable<Product> products = productRepository.Find<Product>(
        new Specification<Product>(p => p.Price < 100).And(new Specification<Product>(p => p.Name == "Windows XP Professional")));
    Assert.AreEqual(1, products.Count());
}

The output from the profiler:

The test code to find products, which applies the OrSpecification:

private void FindByOrCompositeSpecification()
{
    IEnumerable<Product> products = productRepository.Find<Product>(
        new Specification<Product>(p => p.Price < 100).Or(new Specification<Product>(p => p.Name == "Windows XP Professional")));
    Assert.AreEqual(2, products.Count());
}

The output from the profiler:

The problem is solved!

The updated source code can be downloaded here.

Once again, comments are welcome.

Posted on August 25, 2010, in design, Entity Framework, Programming. 45 Comments.

  1. Nice! Good job, huyrua :).

  2. 🙂 thanks.

  3. Great job. Thanks

  4. This is turning out to be a really nice framework for working with EF. I’ve been a big fan of SharpArch in the past and this is quickly becoming a viable alternative.

    Do you have any example solutions using this framework similar to what SharpArch did with Northwind? I pulled the sources from your googlecode repository and it wasn’t there.

    • Chris,

      Glad you find this framework interesting.

      Actually, I did think about using this framework with SharpArch, replacing its NHibernate data access component with this one. However, after looking at some core components of SharpArch, I changed my mind, because I might have to make many breaking changes and/or extend a lot of the main classes/interfaces. But the biggest problem, IMO, is that I find it hard to abstract an OR/M framework because each one has pros and cons, e.g., some advanced CRUD operations can be done easily with NHibernate but are hard with EF, and vice versa.

      That is why I only ported some classes from SharpArch to this framework, which I use in my current project that mostly relies on MS technologies (very few open source frameworks are used because of some strict requirements, as mentioned at the beginning of the previous post).

      SharpArch + (Fluent) NHibernate is still my preferred choice when building my own apps.

      Thanks.

  5. Thanks again Huyrua –
    Never thought about Extension on Expression. This is again awesome and stellar in my opinion.

    Also –
    What are your thoughts about working with Specification across multiple entities.
    e.g. Get all Products price < 100 but Belong to Category only Media. And somehow generating one and only SQL statement.
    I do understand I can chain Specification easily on related Entities. But what if they are not related. Again just a thought. Maybe there is no easy way, trying to solve too much with this.

    • Good question on the idea of a Specification working across multiple entities. However, the Specification chain is able to solve the problem, so I don’t think I should add another implementation for it. Please bear with me 🙂

  6. Huy,

    This is quite a gem of a find – beautiful generic repository!

    I see in your “Lab” versions that you have left out some operations (e.g. “get by key”), and in the latest (1.2?) version, where you have a UoW implementation in progress, many operations are missing, mainly because they need access to the underlying ObjectContext, and of course the DbContext doesn’t expose its underlying ObjectContext natively (it’s a protected property).

    I have an easy workaround which makes basically all the operations possible. Instead of using the plain DbContext as the main object being managed (instantiated) by DbContextManager, you could go “down” another layer, and declare an object like:


    public class BaseDbContext : DbContext
    {
        #region Constructors...

        public BaseDbContext() { }

        public BaseDbContext(ObjectContext objectContext) : base(objectContext) { }

        #endregion

        #region Properties...

        /// <summary>
        /// Exposes the underlying ObjectContext of the DbContext, which can then be used for
        /// extended operations which may not be directly supported by the DbContext.
        /// </summary>
        public ObjectContext UnderlyingObjectContext
        {
            get { return ObjectContext; }
        }

        #endregion
    }

    And you just use this “BaseDbContext”, for example in DbContextManager, like:


    public static BaseDbContext Current
    {
        get { return CurrentFor(ConnectionStringName); }
    }

    And of course this enables some operations which were left unimplemented, such as:


    public TEntity GetByKey<TEntity>(object keyValue) where TEntity : class
    {
        var entitySetName = GetEntitySetName<TEntity>();
        object originalItem;
        var key = new EntityKey(entitySetName, DefaultEntityKeyName, keyValue);

        if (DbContext.UnderlyingObjectContext.TryGetObjectByKey(key, out originalItem))
        {
            return (TEntity)originalItem;
        }

        return default(TEntity);
    }

    Or even:


    public void BeginTransaction(IsolationLevel isolationLevel)
    {
        if (_transaction != null) { throw new ApplicationException(Resources.EN.Exceptions.TransactionIsolationLevelViolation); }

        OpenConnection();

        _transaction = _dbContext.UnderlyingObjectContext.Connection.BeginTransaction(isolationLevel);
    }

    So far seems to work fine. Hopefully the EF team will be able to allow easier access to this in the future!

    Allen

    • Allen,

      Great suggestion for implementing BaseDbContext. I’ll consider adding it to the framework.

      Thanks.
      Huy

  7. Just upgraded to Code-First CTP5 via NuGet and it can no longer find ObjectContext. Wondering if you could update your stuff to the latest plus above changes. Would be awesome. looking forward to your comments.

  8. Just to help out:
    1. EntityConfiguration is now EntityTypeConfiguration

    2. IncludeMetadataInDatabase = false should be done via:

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
    modelBuilder.Conventions.Remove();
    }

    3. StructuralTypeConfiguration is now generic

    4. From documentation:

    DbContext.ObjectContext has moved
    Rather than being a protected member we have made the underlying ObjectContext available via an explicitly implemented interface, this allows external components to make use of the underlying context. Getting the context now looks like; ((IObjectContextAdapter)myContext).ObjectContext

  9. 2. modelBuilder.Conventions.Remove<IncludeMetadataConvention>();

    5. CreateModel().CreateObjectContext<T>(cn), I think is simply BuildObjectContext()

    6. LazyLoadingEnabled is enabled via ctx.Configuration

    7. DataBaseExists is ctx.Database.Exists()

    8. the connection string is now passed to DbContext as in return (T)(new DbContext(cn, true));

    9. In your DbContextBuilder class, I think there’re easier ways to add configurations now. Not sure here.

  10. Thanks. Would you by any chance have a time frame in mind?

    • Frank,

      Please bear with me as I am quite busy these days. After investigating EF CTP5 carefully, hopefully I will be able to upgrade to EF CTP5 by the end of next week.

      Thanks.

  11. Daniel Lidström

    Hi!

    I have a question: Why is the specification pattern necessary when we have Linq? I don’t see anywhere in your examples how to use Linq. To me it looks like you are simply re-implementing a code version of Linq. Thanks for any clarification!

  12. Any progress on the CTP5 stuff? Just anxiously waiting.


  13. IEnumerable<Product> products = productRepository.Find<Product>(
        new Specification<Product>(p => p.Price < 100).Or(new Specification<Product>(p => p.Name == "Windows XP Professional")));

    But how can I parameterize these specifications (say, 200 instead of 100 and “Debian Linux” instead of “Windows XP Professional”)?

    I have two functions in my code:

    public bool CheckAccountEmailExist(string email)
    {
        var emailExistSpec = new Specification<Account>(a => a.Email.ToUpper() == email.ToUpper());
        return _accountRepository.GetBy(emailExistSpec).Any();
    }

    public bool CheckAccountEmailExist(string email, Guid exceptAccountId)
    {
        var emailExistSpec = new Specification<Account>(a => a.Email.ToUpper() == email.ToUpper());
        var exceptAccountSpec = new Specification<Account>(a => a.Id != exceptAccountId);
        return _accountRepository.GetBy(emailExistSpec.And(exceptAccountSpec)).Any();
    }

    I know they are stupid, but this is just an example, right? 🙂
    I want to extract the specification “a => a.Email.ToUpper() == email.ToUpper()” to use it in both functions, but I have to parameterize it with “email” (the function parameter). How can I do this?

    • Daniel Lidström

      First of all, here’s something you can do with the Specification class to reduce some of the code you have to write. Add the following methods:


      public AndSpecification<TEntity> And(Expression<Func<TEntity, bool>> second)
      {
          return new AndSpecification<TEntity>(this, new Specification<TEntity>(second));
      }

      public OrSpecification<TEntity> Or(Expression<Func<TEntity, bool>> second)
      {
          return new OrSpecification<TEntity>(this, new Specification<TEntity>(second));
      }

      This allows you to write this:


      IEnumerable<Product> products = repository.Find(
          new Specification<Product>(p => p.Price < 100).And(p => p.Name == "Windows XP Professional"));

      I think it looks nicer. Anyway, for your question I would recommend you to subclass Specification:


      public class EmailExistsSpecification : Specification<Account>
      {
          public EmailExistsSpecification(string email)
              : base(e => e.Email.ToUpper() == email.ToUpper())
          {
          }
      }

      Now you can re-use the EmailExistsSpecification in your methods. Nice huh?

  14. The proposed solution seems to limit the number of compositions; for example, I can’t build this composition:
    new Specification().Add(new Specification.Add(new Specification()))

    Are there some limitations?

    • There is no limitation since the And or Or method return AndSpecification or OrSpecification which are CompositeSpecification. So, you can chain it at any level you want.

    • Daniel Lidström

      Be careful where you put the parenthesis. Here’s what you should write:


      new Specification().Add(new Specification()).Add(new Specification());

  15. Problem is that
    new Specification().Add(new Specification()
    have type AndSpecification, which has no method (as well as parent CompositeSpecification) .Add()/.Or()

    and code
    new Specification().And(new Specification().And()

    has error like:

    'SpecsTest.AndSpecification' does not contain a definition for 'And' and no extension method 'And' accepting a first argument of type 'SpecsTest.AndSpecification' could be found (are you missing a using directive or an assembly reference?)

    • sorry, some code had disappeared
      bad code:

      
      new Specification().And(new Specification()).And(new Specification());
      
  16. Daniel Lidström

    Hi Pavel,

    now I see what you are having problems with. You are so right. I think I have a solution. Actually I believe CompositeSpecification, AndSpecification, and OrSpecification are completely unnecessary. I have modified the Specification class slightly to make it possible to chain calls to And/Or.
    Have a look at this Gist: http://gist.github.com/780835

    With these changes you need to be a bit careful how you chain your calls, in order to get correct And/Or precedence. Anyway, I am able to write this:


    private void FindByAndCompositeSpecification()
    {
        IEnumerable<Product> products = repository.Find(
            new Specification<Product>(p => p.Price < 100).And(p => p.Name == "Windows XP Professional")
                .Or(p => p.Name == "Windows 7 Home Edition"));
        Assert.AreEqual(1, products.Count());
    }

    and this:


    private void FindByConcretCompositeSpecification()
    {
        IEnumerable<Product> products = repository.Find(
            new ProductOnSaleSpecification().And(new ProductByNameSpecification("Windows XP Professional")));
        Assert.AreEqual(1, products.Count());
    }

    Daniel

    • Hi Daniel,

      it looks like a solution to the problem! (except that in your code the .And() method calls .Or(predicate) instead of .And(predicate) :)).

      The only thing we must not forget – the arrangement of brackets in the resulting predicate:

      For example, we want to obtain the following predicate:

      
      false
      &&
      (
      	false
      	||
      	true
      );
      
      result = false;
      
      
      Following wrong code:
      
      
      var wrongSpec =
      	new Specification<Account>(_ => false)
      		.And(new Specification<Account>(_ => false))
      		.Or(new Specification<Account>(_ => true));
      
      wrongSpec.Predicate.Compile().Invoke(new Account());
      
      
      will return true, because the resulting predicate would be:
      
      
      (
      	(
      		false 
      		&&
      		false
      	) 
      	||
      	true
      )
      
      
      Correct code:
      
      
      var firstPart = new Specification<Account>(_ => false);
      var secondPart = new Specification<Account>(_ => false)
      				.Or(new Specification<Account>(_ => true));
      var correctSpec = firstPart.And(secondPart);
      
      correctSpec.Predicate.Compile().Invoke(new Account());
      
      will return correct value (false).
      
      
      Daniel, many thanks for your help!
      Many thanks to huyrua for the excellent article, I will use the Specification pattern in my current project :).
  17. Daniel Lidström

    Hi Pavel,

    nice to be able to help! I’ve corrected the And/Or error, good spotting. Good luck with your specifications!

  18. Thanks for great discussion and solution, Pavel & Daniel.

  19. This is a shining example of how to post a great blog on Repository and Specification patterns with Entity Framework. Any developer, particularly those in Domain Driven Design, could benefit greatly from this blog.

    Nicely done!!!!

    Thanks for posting this great blog Huy. Hope to see more of your good work in the future.

  20. Alabi Olushola

    Great work. Can you please upgrade your code for EF4.1
