Showing posts with label SQL. Show all posts

Friday, 17 May 2013

SQL MERGE that changes the ON constraint as it runs

What happens when your MERGE statement INSERTs a row that now satisfies the ON condition? If a later source row matches a row inserted earlier in the same statement, will it fall into WHEN MATCHED or WHEN NOT MATCHED?

Let's use an example piece of code to decide what happens. Here I have two table variables created and seeded with some random data:

DECLARE @t1 Table (id int, name varchar(12))
INSERT INTO @t1 VALUES(1, 'hi')

DECLARE @t2 Table (id int, name varchar(12))
INSERT INTO @t2 VALUES(1, 'bye')
INSERT INTO @t2 VALUES(3, 'colin')
INSERT INTO @t2 VALUES(3, 'sarah')

Now I will attempt to MERGE @t2 into @t1. If you have not come across MERGE before, it is essentially an efficient "upsert": UPDATE the row if it exists, INSERT it if it is new.

MERGE @t1 AS t1
 USING(SELECT * FROM @t2) AS t2
 ON t2.id = t1.id
WHEN MATCHED THEN
 UPDATE SET t1.name = t2.name
WHEN NOT MATCHED THEN
 INSERT(id, name)
 VALUES(t2.id, t2.name);

So, what we are saying is:

  • USING this source of data (SELECT * FROM @t2)
  • ON this decider (matching the IDs to judge whether this row already exists)
  • WHEN the ON clause is MATCHED update this row with new values
  • WHEN NOT MATCHED we should INSERT the new row

But, given the values in @t2, what will happen?

  1. The 1st row matches on ID 1 and so will UPDATE
  2. The 2nd row has no match for an ID of 3 and so will INSERT
  3. The 3rd row had no match for an ID of 3 before, but since the last INSERT it now does. So: INSERT or UPDATE?

A SELECT * FROM @t1 will show you that it has INSERTed twice...

id  name
1   bye
3   colin
3   sarah

So watch out for this. The match is decided against the source and destination once, at the start of the statement, and is not re-evaluated as rows are inserted, so you should be sure that the source table being MERGEd contains at most one row per key. A GROUP BY or DISTINCT on the USING query, whichever is most appropriate for your scenario, will do it.
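One way to guard against this, sticking with the tables above, is to collapse the source to one row per id in the USING clause. Here MAX(name) is an arbitrary tie-break; any aggregate or DISTINCT that suits your data will do:

```sql
MERGE @t1 AS t1
 USING(SELECT id, MAX(name) AS name FROM @t2 GROUP BY id) AS t2
 ON t2.id = t1.id
WHEN MATCHED THEN
 UPDATE SET t1.name = t2.name
WHEN NOT MATCHED THEN
 INSERT(id, name)
 VALUES(t2.id, t2.name);
```

Now each id appears at most once in the source, so the MATCHED/NOT MATCHED decision is unambiguous.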

Follow britishdev on Twitter

How to solve Introducing FOREIGN KEY constraint may cause cycles or multiple cascade paths

I was writing my Entity Framework 5 Code First data models one day when I received a dramatic sounding error: "Introducing FOREIGN KEY constraint 'FK_dbo.Days_dbo.Weeks_WeekID' on table 'Days' may cause cycles or multiple cascade paths." I was instructed to "Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints." And then simply the message, "Could not create constraint." So what happened with my foreign key that makes it cyclic?

First a spot of code that can nicely demonstrate this scenario:

public class Year
{
    public int ID { get; set; }
    public string Name { get; set; }

    public ICollection<Month> Months { get; set; }
}
public class Month
{
    public int ID { get; set; }
    public string Name { get; set; }

    public Year Year { get; set; }
    public ICollection<Day> Days { get; set; }
}
public class Day
{
    public int ID { get; set; }
    public string Name { get; set; }

    public Month Month { get; set; }
    public Week Week { get; set; }
}

//problem time
public class Week
{
    public int ID { get; set; }
    public string Name { get; set; }

    public Year Year { get; set; }
    public ICollection<Day> Days { get; set; }
}

So, let's explain this. A Year has Months and a Month has Days; all is well at this point and it will build and generate all your tables happily. The problem comes when you add the Week entity (and the Week property on Day) to bring Weeks into the data model.

By making those Navigation Properties you have implicitly instructed EF Code First to create Foreign Key constraints for you and each of these will have cascading deletes on them. This means, upon deleting a Year, the database will cascade that delete to that Year's Months and in turn those Month's Days. This is to, quite rightly, hold referential integrity; you will not have Yearless Months.

So if you think about it, now that we have added a Week entity that also has a relationship with Days, this too will cascade deletes. As well as the cascading deletes that occur when you delete a Year, as described above, deleting a Year will now also delete its Weeks, and those Weeks will delete their Days. But those Days would already have been deleted by the cascade that went via Months. So who wins? Who gets there first? SQL Server refuses to guess; that's why it rejects the constraint and makes you design an answer to this conundrum.
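In raw DDL terms the two cascade paths look roughly like this (the table, column and constraint names here are just what EF would conventionally generate, so treat this as a sketch). It is the second constraint that SQL Server rejects:

```sql
ALTER TABLE Days ADD CONSTRAINT FK_Days_Months
    FOREIGN KEY (MonthID) REFERENCES Months (ID) ON DELETE CASCADE;

-- Days can also be reached via Years -> Weeks -> Days, so this second
-- cascading path is the one SQL Server refuses to create:
ALTER TABLE Days ADD CONSTRAINT FK_Days_Weeks
    FOREIGN KEY (WeekID) REFERENCES Weeks (ID) ON DELETE CASCADE;
```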

Removing the multiple cascade paths

The way I found to solve this issue is to remove one of the cascades; there is nothing wrong with the foreign keys themselves, only the multiple cascade paths. So let's remove the cascading delete on the Week -> Days relationship, since that delete will be taken care of by each Month cascading its deletes to its Days.

public class CalenderContext : DbContext
{
    public DbSet<Year> Years{ get; set; }
    public DbSet<Month> Months { get; set; }
    public DbSet<Day> Days { get; set; }
    public DbSet<Week> Weeks { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Day>()
            .HasRequired(d => d.Week)
            .WithMany(w => w.Days)
            .WillCascadeOnDelete(false);

        base.OnModelCreating(modelBuilder);
    }
}

If you already have a ForeignKey field explicitly written into your code model (as I usually do - I removed them from the above example for clarity) you will need to specify this existing Foreign Key in the code just written. So if your Day entity looked like this:

public class Day
{
    public int ID { get; set; }
    public string Name { get; set; }

    [ForeignKey("Month")]
    public int MonthID { get; set; }
    public Month Month { get; set; }

    [ForeignKey("Week")]
    public int WeekID { get; set; }
    public Week Week { get; set; }
}

You will need your OnModelCreating method to look like this:

modelBuilder.Entity<Day>()
    .HasRequired(d => d.Week)
    .WithMany(w => w.Days)
    .HasForeignKey(d => d.WeekID)
    .WillCascadeOnDelete(false);

Have fun deleting safely!


Tuesday, 8 November 2011

Cross database joins in SQL Azure

Currently cross database joins are not supported in SQL Azure. You also cannot change database mid-query, so putting a USE [MyDB] in your query will not work either. As a side note, please vote for it to be a priority feature for the SQL Azure team to develop soon.

So, since cross database joins are not supported at this time you must find a workaround. I will give you two possible solutions I would recommend and you can hopefully choose the one that is best for your application.

Combine your databases

If you have tables that are frequently used together, i.e. they are joined in queries or the rows are inserted in the same transactions, then it would be a good idea to move the similar tables into the same database. This of course eliminates the need to traverse databases. SQL Azure has very recently increased the maximum database size from 50GB to 150GB, which potentially makes this a more viable option than perhaps it once was.

Join your data in your application

Two separate queries could be run on the two separate databases and the results joined within the application. The obvious downsides are the potential for large DB I/O, large network transfer and large memory usage in your app. Avoid this approach if the amount of data returned is likely to be large (e.g. 1,000+ rows), but it is fine if the data will be manageable.

Conclusion

Personally I would much rather settle for bringing all my similar tables that are likely to be used within the same queries together into one database so there is no longer a need for cross database solutions. This makes for cleaner application code and more efficient use of your resources. However, if this option is not available to you then perhaps the second option may have to be the one you choose.

Hopefully the SQL Azure team will address this issue soon and your cross database code can stay clean! Although, in fairness, cross database querying isn't available in Entity Framework yet either (but it is still workaround-able). I wonder why this is so difficult for Microsoft? It just shows you should always try to combine similar DBs in any database design where possible.


Monday, 28 March 2011

How to delete an entity by ID in Entity Framework

It may seem unnatural to delete entities in Entity Framework. In the days of stored procedures I used to just pass an ID, whereas now you need to retrieve the entity before deleting it. It seems inefficient: you're basically making two database hits when surely you can do it in one database call, right? It takes a little work, but it is entirely possible.

Since you do not have the entity, you will effectively be updating a detached entity, which is a technique in itself. The trick is that we make a fake (stub) entity that shares its Entity Key (the EF version of a primary key) with the one that needs deleting. Then we attach it, mark it as deleted and SaveChanges.

public void DeleteByID(object id)
{
    //make a stub entity and set its ID to the id passed in
    var car = new Car();
    car.ID = (int)id; //cast needed: the parameter is object but ID is an int

    using (var context = new MyDataContext())
    {
        //add this entity to object set
        context.CreateObjectSet<Car>().Attach(car);
        //mark it as deleted
        context.ObjectStateManager.ChangeObjectState(car, EntityState.Deleted);
        //save changes - effectively saying delete the entity with this ID
        context.SaveChanges();
    }
}

It is worth noting that I am following my own twist on the Repository pattern for data access so I have an "EntityRepository" that manages everything EF related. This is where I will make my generic "DeleteByID" method. My EntityRepository uses generics to specify the type of entities that are dealt with so this has to be completely reusable by any entity set (TEntity) or object context (TContext).

How to Delete by ID in a generic way

public void DeleteByID(object id)
{
    var item = new TEntity();
    var itemType = item.GetType();

    using (var context = new TContext())
    {
        //get the entity container - e.g. MyDataContext - this will have details of all its entities
        var entityContainer = context.MetadataWorkspace.GetEntityContainer(context.DefaultContainerName, DataSpace.CSpace);
        //get the name of the entity set that the type T belongs to - e.g. "Cars"
        var entitySetName = entityContainer.BaseEntitySets.First(b => b.ElementType.Name == itemType.Name).Name;
        //finding the entity key of this entity - e.g. Car.ID
        var primaryKey = context.CreateEntityKey(entitySetName, item).EntityKeyValues[0];
        //using Reflection to get and then set the property that acts as the entity's key
        itemType.GetProperty(primaryKey.Key).SetValue(item, id, null);
        //add this entity to object set
        context.CreateObjectSet<TEntity>().Attach(item);
        //mark it as deleted
        context.ObjectStateManager.ChangeObjectState(item, System.Data.EntityState.Deleted);
        //save changes - effectively saying delete the entity with this ID
        context.SaveChanges();
    }
}

These both translate to the following SQL:
delete from [dbo].[Cars]
where  ([ID] = 123)
And there it is - a reusable delete by ID function with no SELECT!


Wednesday, 27 October 2010

Set an identity seed back after delete in SQL

When you DELETE rows from a table in SQL, the seed of the identity column or primary key will not be forgotten: deleting rows does not wind the counter back. It would be nice to be able to truncate just the selected rows, but you cannot. This is a little annoying, but there is a way around it.

If you want to quickly remove all rows from a table and begin the table from scratch, TRUNCATE is the best approach. It is quicker too, since it is only minimally logged (so beware).

TRUNCATE is only for destroying all data in a table though. What if you want to remove only the last few rows and then begin the seeding off from the last primary key?

Use CHECKIDENT to reseed

You can delete the rows you want and then reset the seed of the identity column to whatever you desire using the following line:
DBCC CHECKIDENT('myTable', RESEED, 10)

Note: the 10 is the new seed, so set it to the current maximum identity value. If you reseed with 10, the next insert will yield 11 as the primary key.
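For example, assuming a table called myTable with an identity column ID, you could delete the last few rows and then reseed from whatever is now the highest key (DBCC CHECKIDENT accepts a variable for the new seed):

```sql
DELETE FROM myTable WHERE ID > 10;

DECLARE @maxID int = (SELECT ISNULL(MAX(ID), 0) FROM myTable);
DBCC CHECKIDENT('myTable', RESEED, @maxID);
-- the next insert will get @maxID + 1
```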


Friday, 2 July 2010

SQL Server 2008 Prevent saving changes that require the table to be re-created

I got this notification when I changed a table's column to NOT NULL and tried to save: "Saving changes not permitted. The changes you have made require the following tables to be dropped and re-created. You have either made changes to a table that can't be re-created or enabled the option Prevent saving changes that require the table to be re-created."

This is because the script generated would contain a DROP TABLE statement. It would, of course, have saved everything to a temp table before dropping and re-creating the table, so nothing would actually have been lost.

Even so, it is probably a nice little safeguard for those who aren't so used to SQL Server, but I am, so: how do I get round it?

Solution

If you are confident that you are a legend with your SQL skills and you just want to be left alone you can disable this warning.

Go Tools » Options...

Go Designers » Table and Database Designers and then uncheck "Prevent saving changes that require table re-creation"

Job done - enjoy at your peril!


Thursday, 17 June 2010

Update minor change to database in your Entity Data Model

I added a new column to a table in the database and couldn't seem to get these changes reflected in my Entity Data Model using Entity Framework 4. There is a function I can use in the context menu called "Update Model from Database..." but that seems overkill. I just want one new column mapped in my entity data model.

I didn't want to use the 'Update Model from Database' since it will update every table and view in my model, reverting all the entity and property names to the database naming convention and breaking all my referencing code. Too much pain for what should be quite a simple change? Yes, I think so.

Solution

As an example, let's say you have added a varchar(50) column called MyDbColumn to MyDbTable, which is modelled by an entity called MyEntity that hasn't recognised this column, and you want the new property to be called MyProperty.

Let's also say you have added MyProperty of type String with MaxLength set to 50 to your MyEntity entity. If not, do that. Now the tough bit: Mapping it to the database without a full update.

Time to leave the comfort of the UI.
  1. Right click your edmx file in the Solution Explorer
  2. Select 'Open With...'
  3. Select 'XML (Text) Editor' and click OK (you may be asked to close any open instances of the edmx file)
  4. Search for where it says <EntityType Name="MyDbTable"> (in the storage model section of the file)
  5. Add the property <Property Name="MyDbColumn" Type="varchar" MaxLength="50" />
  6. Search for <EntitySetMapping Name="MyEntity">
  7. You will find all your other mapped properties within a 'MappingFragment' element.
  8. Add your new mapped column <ScalarProperty Name="MyProperty" ColumnName="MyDbColumn" />
  9. Save & Build

If you now re-open your edmx file in the designer and take a look at the Mapping Details for MyEntity you will see that MyProperty is mapped to MyDbColumn in MyDbTable. Lovely stuff!


Tuesday, 15 June 2010

Cross database joins in Entity Framework 4

Being quite new to Entity Framework 4, I really struggled with the fact that an Entity Data Model cannot contain entities whose tables live across multiple databases. I am still quite disappointed that EF4 cannot support this, and the team behind it doesn't even seem to think cross database entities are worth considering, despite how long this forum has been asking for them.

Anyway, there is a cheeky little way that it can be done. I will use the example of a forum to illustrate this. I need to display posts alongside the username and email address of the contributor of that post. The problem is that the user information lives in a separate database from the post information.


Solution

I got around this by creating a View inside the ForumDB like so:
USE ForumDB
CREATE VIEW dbo.vUserDetails
AS 
 SELECT
  u.UserID,
  u.UserName,
  c.Email
 FROM
  UserDB.dbo.UserNames u
 INNER JOIN
  UserDB.dbo.ContactInfo c ON c.UserID = u.UserID
GO


With this new View in the database I can now create an entity of it in the EDM. Go to your edmx file, right click the background and select 'Update Model from Database...'

Select the data connection to the ForumDB. Then when you are on the Choose Your Database Objects step, make sure you are on the Add tab and select Views and check the vUserDetails view:


You now have access to all your user data. You add an association to link the UserID from the Posts table (Post entity) to the UserID in your new view.


Further

You now have SELECT access to this user data, but you cannot currently INSERT/UPDATE/DELETE users. In this scenario that is fine; however, if you have a different objective you can still achieve writes by mapping stored procedures to the entity's INSERT/UPDATE/DELETE functions.
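As a sketch of that last point (the column names in UserDB are assumed from the view above, so adjust to your schema), you could create a procedure in ForumDB that writes to both user tables and returns the new key, then map it to the entity's INSERT function:

```sql
USE ForumDB
GO
CREATE PROCEDURE dbo.InsertUserDetails
    @UserName varchar(50),
    @Email varchar(100)
AS
BEGIN
    INSERT INTO UserDB.dbo.UserNames (UserName) VALUES (@UserName);
    DECLARE @UserID int = SCOPE_IDENTITY();
    INSERT INTO UserDB.dbo.ContactInfo (UserID, Email) VALUES (@UserID, @Email);
    SELECT @UserID AS UserID; -- EF reads the new key back from this
END
GO
```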


Monday, 14 June 2010

Entity Framework OptimisticConcurrencyException

I got an OptimisticConcurrencyException error with description of "Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. Refresh ObjectStateManager entries." when trying to insert into a table using the Entity Framework 4.

I found that the table I was working on had an INSTEAD OF INSERT trigger on it. This was doing a few things and then returning "(0 row(s) affected)". Entity Framework wasn't happy with this, since it looked as though nothing had been inserted, and so it threw the confusing OptimisticConcurrencyException.
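For illustration, a trigger of roughly this shape will trip EF up. The table and column names here are invented; the key ingredient is that the trigger swallows the rows-affected count the context uses to verify the insert:

```sql
CREATE TRIGGER trg_PITA ON dbo.Cars
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON; -- EF never sees "1 row(s) affected", so it assumes the insert failed
    INSERT INTO dbo.Cars (Name)
    SELECT Name FROM inserted;
END
```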

Workaround

I tried for a while to find a fancy workaround, but in the end a drop was the order of the day:

DROP TRIGGER [dbo].[trg_PITA]

Job done. The main problem was finding this trigger in the first place since it is not something you normally look at and almost impossible to debug.

Better workaround

Almost 3 years later! Rachel Peirson has come up with a workaround to this problem that also allows you to keep your trigger in place. Please see her comments below for details.


Monday, 29 March 2010

Hilariously imaginative SQL injection attack!

Looks like Little Bobby Tables has grown up now and modified his car with an optimistic SQL injection attack. Try inserting that registration number into your database DVLA!

Nice work Gizmodo


Wednesday, 17 March 2010

Delete duplicate rows using CTE and ROW_NUMBER in SQL 2005

Ever had duplicate rows in your table because you forgot to add a UNIQUE constraint? Finding it hard to delete the duplicate rows because not all of the columns in the rows are the same? Well, you can use two of SQL Server 2005's new features to overcome this problem.

I'm going to create a table with duplicate rows to demonstrate this problem:
CREATE TABLE CustomerEmails
(
  ID int IDENTITY (1, 1),
  CustomerID int,
  Email varchar(100),
  CreatedDate smalldatetime
)

INSERT INTO CustomerEmails
SELECT 135, 'apple@spamme.com', '2010-03-31' UNION ALL
SELECT 267, 'orange@spamme.com', '2010-03-21' UNION ALL
SELECT 222, 'pear@spamme.com', '2010-03-11' UNION ALL
SELECT 333, 'kiwi@spamme.com', '2010-03-01' UNION ALL
SELECT 333, 'lemon@spamme.com', '2010-02-28' UNION ALL
SELECT 333, 'lime@spamme.com', '2010-02-21' UNION ALL
SELECT 222, 'grape@spamme.com', '2010-02-11' UNION ALL
SELECT 492, 'banana@spamme.com', '2010-02-01';

So customers 222 and 333 are in there multiple times and if I do a lookup to find their email I get multiple results. Someone has been inserting new rows without trying to update existing rows first... the lunatic!

I remember a DBA faffing with this problem for ages: after realising you can't DISTINCT a single column, trying to DELETE WHERE an ID exists IN a grouped SELECT statement HAVING COUNT(*) > 1, except for the TOP row with an ORDER BY on the date DESC... *yawn*

Well, here is the way for cool people to do it: you create a CTE with a ROW_NUMBER() column that counts the duplicates for each customer:
WITH Duplicates AS
(
  SELECT
    CustomerID,
    CreatedDate,
    Email,
    ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY CreatedDate DESC) as cnt
  FROM
    CustomerEmails
)

The ROW_NUMBER() has a partition on CustomerID, which means the count starts again for each customer. I have ordered within each partition by CreatedDate DESC too, so the row with a count of 1 will be the newest.
Now you simply DELETE from the CTE where the count is higher than 1, removing all the duplicates (the DELETE must immediately follow the CTE, as they form a single statement):
DELETE FROM
  Duplicates
WHERE
  cnt > 1;
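Putting the two together, the full statement run against the CustomerEmails table above is:

```sql
WITH Duplicates AS
(
  SELECT
    CustomerID,
    CreatedDate,
    Email,
    ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY CreatedDate DESC) as cnt
  FROM
    CustomerEmails
)
DELETE FROM
  Duplicates
WHERE
  cnt > 1;
```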

Have a look:
SELECT * FROM CustomerEmails;

Great! You no longer have any duplicates! Now quickly whack a UNIQUE constraint on that column before that lunatic developer starts messing with your data again.
ALTER TABLE CustomerEmails ADD UNIQUE (CustomerID);


Sunday, 7 March 2010

Waiting for VS2010 release date to come. Not long now...

I work on quite an agile environment at work and so it is possible to investigate an upgrade to .NET 4, Visual Studio 2010 etc. In fact I have been asked to investigate such a change (just finding time is the issue).

Of course I am keen to get moving with this, but I'm not confident about pushing a change to a platform that is still a release candidate. To be fair, the latest release candidate does come with a "go live" licence, so if you are confident enough you can download it here. Since I am not willing to upgrade my work's infrastructure to a release candidate, I will wait until Monday 22nd March 2010, when it is released; the official launch is April 12th.

I noticed something though: It seems I've been doing a lot of waiting recently.

There are a lot of technologies that I am interested in and have been patiently waiting for that either come with this VS2010 release or co-incidentally release around the same time:
  • C# 4.0 (22/03/2010)
  • ASP.NET 4 (22/03/2010)
  • Visual Studio 2010 (22/03/2010)
  • SQL Server 2008 R2 (by May 2010)
  • MVC 2.0 (with .NET 4)
  • Entity Framework 4 (with .NET 4)
There is Silverlight 4 too (but I am not interested :P)

Update:

The release date has been delayed until at least the launch date of 12th April 2010. Which you can read about here:
Visual Studio 2010 release date delayed


Tuesday, 2 March 2010

Why should I make a developer blog?

About 3 years ago a friend of mine I used to work with told me he was going to make a developer blog. He planned to put all the interesting discoveries he made whilst going about his day job coding ASP.NET/C# into a blog.

So I did what's right; I called him an idiot and took the piss out of him for having such a stupid idea. He is a mate after all. Anyway, 3 years on and I'm starting to think he could have been on to something. I mean blogging has caught on somewhat...

I too am a (predominantly) ASP.NET/C# and SQL developer and I have been doing this work intensely for about 4 years now. I could not have got anywhere near the level of experience I have if it weren't for the plethora of blogs and forum posts a mere Google away.

So I have everything I need; why should I make a developer blog? I know I make interesting discoveries that other developers can benefit from but how does that help me?! I know this is incredibly selfish and we'd all be worse off if everyone subscribed to this viewpoint but even still this is my spare time we're talking about and there is beer to be drunk.

I'm still not 100% sure why I should start a developer blog at this point if I'm honest. However, here are a list of reasons that have motivated me to come this far:
  • I often work out solutions to problems myself after intensive Googling has failed me. It must be criminal to sit on these secrets
  • My use of blogs so far is almost entirely selfish and that is just being rude
  • I must be getting older (currently approaching 26) because I can feel irresponsibility slipping through my fingers
  • I'm proud of my work so why not show it off
  • I can forget the solutions to problems I have previously solved. This is good documentation too!
  • I would be so so happy with myself if I started to get the feeling I was helping people
  • I think writing a blog that really actually helps people is a new and massive challenge for me. I'm really interested to see if I can do it
  • I would take great pleasure in seeing people read and comment on my posts. Absolutely love it

On that note, please feel free to comment on this post if you can think of any other motives for blogging or if you have any feedback for my first post :)
