find a location for property in a new city

Tuesday, 11 January 2011

Optimising Entity Framework - Don’t query from base entity

Be careful to query the derived type when dealing with an entity data model that involves inheritance. You can massively reduce query complexity by making sure you avoid querying the base type of your model.

Say there is a model with a base type of Person and three derived types of Customer, StaffMember, Client. It doesn't matter if your inheritance model is Table per Inheritance or Table per Type (TPT) but say for example it is Table per Hierarchy (TPH) and the condition for Customer is that an OrdersID column is not null. Also say there is a quality that of all Customers that they will have a guaranteed value of PersonType == 1.

So what is the difference between:

context.People.Where(p => p.MessageType == 1) and context.People.OfType()?

Aside from the better code the difference in generated SQL is massive. For some reason Entity Framework will generate SQL that works out which derived type each of the rows in the Person table are even though I'm asking for just the base type. This means unnecessarily joining on a number of tables if you’re using TPT or evaluating many conditional columns if you’re using TPH.

I tried the different approaches on quite a large Entity Data Model of mine. There is one base type with 5 derived types and a few other entities being included (joined). Using the first query from above the generated query was 28224 characters long. The revised query was 14177. That's half the size!

Using query length is a poor way of evaluating a query because most of EF queries are explicit query lengths which don’t actually add to complexity but believe me – in the case the saving of query length was almost exclusively down to removing pointless complexity.

Follow britishdev on Twitter

4 comments:

  1. Sounds like a good point, until you measured the query in "characters". Now that;s a loss. Many queries might be longer than their alternatives yet perform better in terms of execution plan, etc...

    You still have a very good point about the Joins and so of course. Good advice. Thanks :)

    ReplyDelete
  2. I know, I know, that was just laziness on my part and an easier way of quantifing the stupendous queryplan!

    Anyway, rest assured that the 14,000 characters of extra sql was all extra Joins to subqueries and some ridiculously complex join conditions. Bad bad bad!

    ReplyDelete
  3. I'm currently building quite a large model and was wondering about this myself. Being new to ORMs, I've been going a bit nuts on inheritance and asking myself if it's the right way to go. Your example fits well with what I am doing e.g. Contacts <- Person/Business <- Staff <- Employee/Trainee. I then associate AddressBooks and PhoneBooks etc... to the base contacts entity and use something like var person = context.Contact.OfType(); My problem has been the amount of associations that you could theoretically have with Contacts and wonderring if I was on the right path.

    ReplyDelete
  4. I was also going to comment that query length isn't necessary the indicator of poor performance... but if you were seeing lots of unnecessary joins / IO's / and in general a poor execution plan assigned by SQL Server then yeah... bail out. That's the thing with EF, you have to really be on your toes when you use it and make sure you're not telling it to be stupid.

    ReplyDelete