How a calling method can tell an Entity Framework Code-First Repository to eager-load a variable number of related entities and collections.

Introduction

A Repository acts as the bridge between a database and entities that you can use in code.  When you ask a Repository to save an entity, this is normally delegated to an ORM such as Entity Framework.  If the entity is something like an Order, any related entities that are in a changed or added state will be saved at the same time:

public class Order
{
    public int OrderId { get; set; }
    public Customer Customer { get; set; }
    public decimal Value { get; set; }
    public IList<Item> LineItems { get; set; }
}

A question arises when retrieving the same Order from the database (often referred to as "rehydrating").  Do I just retrieve the Order entity (which in this example is an OrderId and Value) or do I retrieve the related entity, Customer, and the related collection, LineItems?

If you are taking a Domain Driven Design approach, you may have clearly defined Aggregates, in which case you may have a rule that the whole Order Aggregate has to be retrieved at once, every time.

Or you may be able to rely on lazy-loading, such that any related entities are loaded only as they are needed by your code.

But if you need to eager-load entities to avoid multiple trips to the database when iterating over collections, and you want to control exactly which related entities and collections are retrieved, you may want the calling method to tell the Repository which ones to include in the rehydrated object graph.

This post explains one method for doing this.

Repository Methods

Within the repository, this is the long-hand pattern for applying a variable number of Include statements to a DbSet from an Entity Framework Code First DbContext:

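// Requires: using System.Data.Entity; (for the lambda-based Include extension)
public IQueryable<Order> GetAll(
  params Expression<Func<Order, object>>[] includeExpressions)
{
  // Start from the full set; typed as IQueryable so each Include result can be reassigned.
  IQueryable<Order> set = _context.Orders;

  // Apply each requested Include in turn.
  foreach (var includeExpression in includeExpressions)
  {
    set = set.Include(includeExpression);
  }
  return set;
}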

The Include statements are specified as lambda expressions (see below) and by using the "params" keyword the method will accept as many as you provide.

Include calls can be chained, each one returning a new IQueryable, so the code simply loops through the expressions and applies them one after another.

So to call this method, specifying we want the Customer and LineItems included, we use code such as:

var orders = _ordersRepository.GetAll(o => o.Customer, o => o.LineItems);

Of course, you can still chain other LINQ operations onto the end of the statement, such as "Where", "Select", etc., because the Repository is still returning an IQueryable.
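As a rough illustration of composing further operators on the returned IQueryable, the sketch below uses a hypothetical in-memory stand-in (InMemoryOrdersRepository, along with minimal Customer and Item classes, none of which come from the original code) so that it runs without a database. The stand-in ignores the include expressions; an EF-backed repository would apply them and translate the whole chain into a query against the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Minimal illustrative entities (Customer and Item are assumptions).
public class Customer { public string Name { get; set; } }
public class Item { public string Sku { get; set; } }

public class Order
{
    public int OrderId { get; set; }
    public Customer Customer { get; set; }
    public decimal Value { get; set; }
    public IList<Item> LineItems { get; set; }
}

// Hypothetical stand-in for the EF-backed repository, same GetAll signature.
public class InMemoryOrdersRepository
{
    private readonly List<Order> _orders;

    public InMemoryOrdersRepository(IEnumerable<Order> orders)
    {
        _orders = orders.ToList();
    }

    public IQueryable<Order> GetAll(
        params Expression<Func<Order, object>>[] includeExpressions)
    {
        // In-memory objects are already fully populated, so the includes are ignored here.
        return _orders.AsQueryable();
    }
}

public static class ChainingDemo
{
    // Filters, orders and projects on top of whatever GetAll returns.
    public static List<string> LargeOrderCustomers(InMemoryOrdersRepository repository)
    {
        return repository
            .GetAll(o => o.Customer, o => o.LineItems)
            .Where(o => o.Value > 100m)
            .OrderBy(o => o.OrderId)
            .Select(o => o.Customer.Name)
            .ToList();
    }
}
```

Nothing is executed until the query is enumerated (here by ToList), which is what lets the caller keep adding operators after the Repository has returned.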

ReSharper’s Version

ReSharper suggests refactoring the code above into a single statement using the LINQ Aggregate method.

It’s down to personal opinion which version you (and others visiting your code in the future) find easier to understand:

public IQueryable<Order> GetAll(
  params Expression<Func<Order, object>>[] includeExpressions)
{
  return includeExpressions
    .Aggregate<Expression<Func<Order, object>>, IQueryable<Order>>
     (_context.Orders, (current, expression) => current.Include(expression));
}
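If the Aggregate form looks opaque, the following small sketch may help. It folds a list with both a loop and Aggregate, using plain strings in place of IQueryable and a hypothetical Append helper in place of Include, so it runs without Entity Framework. All names here are illustrative only.

```csharp
using System;
using System.Linq;

public static class AggregateDemo
{
    // Hypothetical stand-in for Include: returns a new "query" with one more step applied.
    private static string Append(string query, string path)
    {
        return query + ".Include(" + path + ")";
    }

    // Long-hand version: start from the seed and apply each step in turn.
    public static string WithLoop(string seed, params string[] paths)
    {
        var current = seed;
        foreach (var path in paths)
        {
            current = Append(current, path);
        }
        return current;
    }

    // Aggregate version: same seed, same step function, no explicit loop.
    public static string WithAggregate(string seed, params string[] paths)
    {
        return paths.Aggregate(seed, (current, path) => Append(current, path));
    }
}
```

For the seed "Orders" and the paths "Customer" and "LineItems", both methods return "Orders.Include(Customer).Include(LineItems)". With no paths at all, both return the seed unchanged, which is why calling GetAll with no include expressions simply returns the plain set. The explicit type arguments in the repository version are most likely needed because the seed is a DbSet while each step returns an IQueryable, so the accumulator type cannot be inferred from the seed alone.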

And similarly, if we want a "GetById" method, this can be implemented as:

public Order GetById(
  int id, 
  params Expression<Func<Order,object>>[] includeExpressions)
{
  if (includeExpressions.Any())
  {
    var set = includeExpressions
      .Aggregate<Expression<Func<Order, object>>, IQueryable<Order>>
        (_context.Orders, (current, expression) => current.Include(expression));

    return set.SingleOrDefault(s => s.OrderId == id);
  }

  return _context.Orders.Find(id);
}

If no include expressions are specified, we just fall back to the Find method, which will check what's already available locally in the DbContext, before going to the database if necessary.

With these two methods available in each Repository, the calling code can specify exactly which parts of an object should be included, providing quite a neat solution to the problem.
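Rather than repeating the two methods in each Repository, one option is to move them into a generic base class. The sketch below is only one possible shape, assuming Entity Framework 6 (the System.Data.Entity namespace); RepositoryBase, GetSingle and ApplyIncludes are hypothetical names that do not appear in the code above. Because the key property differs from entity to entity, the caller supplies a predicate instead of an id.

```csharp
using System;
using System.Data.Entity;           // Assumes EF6: DbContext, Set<T>(), lambda-based Include
using System.Linq;
using System.Linq.Expressions;

// Hypothetical generic base class; not part of the original post.
public abstract class RepositoryBase<TEntity> where TEntity : class
{
    private readonly DbContext _context;

    protected RepositoryBase(DbContext context)
    {
        _context = context;
    }

    public IQueryable<TEntity> GetAll(
        params Expression<Func<TEntity, object>>[] includeExpressions)
    {
        return ApplyIncludes(_context.Set<TEntity>(), includeExpressions);
    }

    // The caller provides the match condition, e.g. o => o.OrderId == id.
    public TEntity GetSingle(
        Expression<Func<TEntity, bool>> predicate,
        params Expression<Func<TEntity, object>>[] includeExpressions)
    {
        return ApplyIncludes(_context.Set<TEntity>(), includeExpressions)
            .SingleOrDefault(predicate);
    }

    // Folds the include expressions onto the query, one Include per expression.
    protected static IQueryable<TEntity> ApplyIncludes(
        IQueryable<TEntity> query,
        Expression<Func<TEntity, object>>[] includeExpressions)
    {
        return includeExpressions.Aggregate(
            query,
            (current, expression) => current.Include(expression));
    }
}
```

A concrete Repository would then derive from RepositoryBase<Order> and pass its context to the base constructor. Note that this sketch always queries the database in GetSingle and so gives up the Find shortcut used in GetById above.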

Comments

Comment by George

This is perfect, but then how does one write the controller to handle all the different ways an object and its children want to be requested? Is it a matter of creating a CustomerOrderController and a CustomerOrderLineItemsController?

Comment by Steve Moss

@George. Good question but my answer would be, "it depends". In this example you would probably only get the LineItems in the context of an Order, so you might only have a CustomerOrderController. If the Order can be considered an "aggregate" and you only access LineItems through the "aggregate-root", this would make sense.

Comment by George

Makes sense. Here's another twist. Let's say I want to load customers, which have a child collection of addresses. Is there something I'd need to do to load Customer and its children when requesting Order?

Comment by Emanuel

Perfect post, genius!

Comment by VJenks

YES! So glad I found this. I was stumbling around trying to figure out how to use params to send a varying number of tables in but not quite getting it right. Generic it up and stuff it in your repository base class, and you're good to go! THANKS!
