Archive for November, 2015

Enforcing type-specific data

“Yeah, yeah, but your scientists were so preoccupied with whether or not they could that they didn’t stop to think if they should.” — Dr Ian Malcolm, Jurassic Park

(Forgive the over-used quote above, but it feels apt — I think you’ll see why.)

In our hypothetical situation, we have two types of Employee: a Manager and an Agent. So to begin with, we have Employee and EmployeeType tables:

CREATE TABLE dbo.EmployeeType
(
  EmployeeTypeID TINYINT NOT NULL
    CONSTRAINT PK_EmployeeType PRIMARY KEY CLUSTERED
  ,[Description] VARCHAR(50) NOT NULL
)
GO
INSERT dbo.EmployeeType(EmployeeTypeID, [Description])
    VALUES(1, 'Agent'), (2, 'Manager')
GO

We’ll say that Agents get a ‘terminal code’ (to access their terminals); Managers don’t need a terminal code, but they do get to use the staff car park, so we’ll need to store their car registration — but only for Managers, Agents aren’t allowed to park at work.

CREATE TABLE dbo.Employee
(
  EmployeeID INT NOT NULL
    CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED
  ,EmployeeTypeID TINYINT NOT NULL 
  ,[Name] VARCHAR(50) NOT NULL
  ,StartDate DATE NOT NULL
  -- 1. Agent-specific
  ,TerminalCode VARCHAR(5) NULL
  -- 2. Manager-specific
  ,CarReg VARCHAR(10) NULL
  ,CONSTRAINT FK_Employee_EmployeeTypeID
    FOREIGN KEY (EmployeeTypeID) REFERENCES dbo.EmployeeType(EmployeeTypeID)
)
GO

Let’s put some data in:

INSERT dbo.Employee
(
  EmployeeID
  ,EmployeeTypeID
  ,[Name]
  ,StartDate
  ,TerminalCode
  ,CarReg
)
  VALUES(1, 1, 'Alan Smith', '2001Jan01', 'AS111', NULL)
    ,(2, 1, 'Brenda Thomson', '2002Feb02', 'BT222', NULL)
    ,(3, 1, 'Colin Underwood', '2003Mar03', 'CU333', NULL)
    ,(4, 2, 'Diana Varney', '2004Apr04', NULL, 'DV65 QAZ')
    ,(5, 2, 'Edward Wilson', '2005May05', NULL, 'EW65 WSX')
GO

So far, so straightforward. Now let’s muck things up:

UPDATE dbo.Employee
  SET CarReg = 'AB91 EDC'
  WHERE EmployeeID = 1
GO

which of course works, but now our mere Agent gets to park for free. (How disastrous!) We can fix this — we’ll put the NULL back in the CarReg field, and add a CHECK constraint to stop it happening again:

UPDATE dbo.Employee
  SET CarReg = NULL
  WHERE EmployeeID = 1
GO

ALTER TABLE dbo.Employee
  ADD CONSTRAINT CK_Employee_CheckFields
    CHECK
      (
      CASE
        WHEN EmployeeTypeID = 1
        THEN CASE WHEN TerminalCode IS NOT NULL AND CarReg IS NULL THEN 1 ELSE 0 END
        WHEN EmployeeTypeID = 2
        THEN CASE WHEN TerminalCode IS NULL AND CarReg IS NOT NULL THEN 1 ELSE 0 END
      END = 1
      )
GO

If we try and set the CarReg for an Agent now, it’ll fail:

The UPDATE statement conflicted with the CHECK constraint "CK_Employee_CheckFields"

Well, that works! Job done, no? Not really… every time we need to add or remove Agent- or Manager-specific fields, we’ll have to change that CHECK constraint; what if there were 100 fields? The constraint could get pretty unwieldy. Not to mention the wasted column space: Agent fields will always be NULL for Managers, and vice versa.

Drop the CHECK constraint, the TerminalCode and CarReg fields in Employee; and then we’ll create two new tables to hold the type-specific data:

CREATE TABLE dbo.AgentData
(
  EmployeeID INT NOT NULL CONSTRAINT PK_AgentData PRIMARY KEY CLUSTERED
  ,TerminalCode VARCHAR(5) NULL
)
GO
INSERT dbo.AgentData(EmployeeID, TerminalCode)
  VALUES(1,  'AS111'), (2, 'BT222'), (3, 'CU333')
GO
CREATE TABLE dbo.ManagerData
(
  EmployeeID INT NOT NULL CONSTRAINT PK_ManagerData PRIMARY KEY CLUSTERED
  ,CarReg VARCHAR(10) NULL
)
GO
INSERT dbo.ManagerData(EmployeeID, CarReg)
  VALUES(4, 'DV65 QAZ'), (5, 'EW65 WSX')
GO

(Foreign keys on EmployeeID removed for clarity.)

That’s better! Except how do we enforce that AgentData only holds EmployeeIDs that relate to Agents, and that ManagerData only holds EmployeeIDs that relate to Managers? We could use a trigger, absolutely. But for the purposes of this post, we have an irrational fear of triggers and don’t want to use them! Can we enforce our rules with just keys?

(And here’s where the quote at the beginning becomes relevant…) Yes, I think we can, but I don’t know that we should. The following solution seems to work, but I don’t know if I’d ever use it in practise, it feels over-the-top. (I should also add, I’ve not tested it in a production-like environment, it might be unsuitable for some reason I’ve not yet discerned.) Anyway, here it is.

Starting from scratch:

CREATE TABLE dbo.Employee
(
  EmployeeID INT NOT NULL
        CONSTRAINT UQ_Employee_EmployeeID UNIQUE
  ,EmployeeTypeID TINYINT NOT NULL 
  ,[Name] VARCHAR(50) NOT NULL
  ,StartDate DATE NOT NULL
  ,CONSTRAINT PK_Employee
        PRIMARY KEY CLUSTERED (EmployeeID, EmployeeTypeID)
  ,CONSTRAINT FK_Employee_EmployeeTypeID
        FOREIGN KEY (EmployeeTypeID) REFERENCES dbo.EmployeeType(EmployeeTypeID)
)

We’ve given the Employee table a composite PRIMARY KEY, comprised of the EmployeeID, and the EmployeeTypeID. The EmployeeID also has a UNIQUE index on it, for two reasons: (1) obviously, to stop the same EmployeeID having multiple types, and (2) so that other tables can reference the EmployeeID in a FOREIGN KEY, without having to include the EmployeeTypeID as well. Without the UNIQUE constraint, we’d get the following error:

There are no primary or candidate keys in the referenced table 'dbo.Employee'
that match the referencing column list in the foreign key

, but with it, we’re ok. We’ll INSERT the same employee details as before:

INSERT dbo.Employee
(
  EmployeeID
  ,EmployeeTypeID
  ,[Name]
  ,StartDate
)
  VALUES(1, 1, 'Alan Smith', '2001Jan01')
    ,(2, 1, 'Brenda Thomson', '2002Feb02')
    ,(3, 1, 'Colin Underwood', '2003Mar03')
    ,(4, 2, 'Diana Varney', '2004Apr04')
    ,(5, 2, 'Edward Wilson', '2005May05')
GO

Here’s a slightly icky bit: we’re going to fix the EmployeeTypeID of 1 in the AgentData table:

CREATE TABLE dbo.AgentData
(
  EmployeeID INT NOT NULL CONSTRAINT PK_AgentData PRIMARY KEY CLUSTERED
  ,EmployeeTypeID AS CAST(1 AS TINYINT) PERSISTED
  ,TerminalCode VARCHAR(5) NULL
  ,CONSTRAINT FK_AgentData_EmployeeID_EmployeeTypeID
    FOREIGN KEY (EmployeeID, EmployeeTypeID) REFERENCES dbo.Employee(EmployeeID, EmployeeTypeID)
)
GO

, and fix the EmployeeTypeID of 2 in the ManagerData table:

CREATE TABLE dbo.ManagerData
(
  EmployeeID INT NOT NULL CONSTRAINT PK_ManagerData PRIMARY KEY CLUSTERED
  ,EmployeeTypeID AS CAST(2 AS TINYINT) PERSISTED
  ,CarReg VARCHAR(10) NULL
  ,CONSTRAINT FK_ManagerData_EmployeeID_EmployeeTypeID
    FOREIGN KEY (EmployeeID, EmployeeTypeID) REFERENCES dbo.Employee(EmployeeID, EmployeeTypeID)
)
GO

Where both tables have a FOREIGN KEY onto the Employee table that includes the EmployeeTypeID, there’s no way to get the types mixed up: an Agent has to be an Agent in both tables, likewise for Managers.

Let’s INSERT the data and check we can query it without any problems:

INSERT dbo.AgentData(EmployeeID, TerminalCode)
  VALUES(1,  'AS111'), (2, 'BT222'), (3, 'CU333')
GO
INSERT dbo.ManagerData(EmployeeID, CarReg)
  VALUES(4, 'DV65 QAZ'), (5, 'EW65 WSX')
GO
SELECT
    e.*
    ,ad.TerminalCode
    ,md.CarReg
  FROM dbo.Employee e
  LEFT JOIN dbo.AgentData ad
    ON ad.EmployeeID = e.EmployeeID
  LEFT JOIN dbo.ManagerData md
    ON md.EmployeeID = e.EmployeeID
  ORDER BY e.EmployeeID
GO
EmployeeID  EmployeeTypeID Name              StartDate  TerminalCode CarReg
----------- -------------- ----------------- ---------- ------------ ----------
1           1              Alan Smith        2001-01-01 AS111        NULL
2           1              Brenda Thomson    2002-02-02 BT222        NULL
3           1              Colin Underwood   2003-03-03 CU333        NULL
4           2              Diana Varney      2004-04-04 NULL         DV65 QAZ
5           2              Edward Wilson     2005-05-05 NULL         EW65 WSX

(5 row(s) affected)

We can also add FOREIGN KEY references from other tables, pointing at the EmployeeIDs on the AgentData and ManagerData tables, and get the expected behaviour, e.g.:

CREATE TABLE dbo.ManagerEmail
(
  EmployeeID INT NOT NULL
    CONSTRAINT PK_ManagerEmail PRIMARY KEY CLUSTERED
  ,EmailAddress VARCHAR(255) NOT NULL
  ,CONSTRAINT FK_ManagerEmail_EmployeeID
    FOREIGN KEY (EmployeeID) REFERENCES dbo.ManagerData(EmployeeID)
)
GO
INSERT dbo.ManagerEmail(EmployeeID, EmailAddress)
	VALUES(4, 'diana.varney@bigcorp.com')
GO
-- works fine, Diana is a Manager

INSERT dbo.ManagerEmail(EmployeeID, EmailAddress)
	VALUES(3, 'colin.underwood@bigcorp.com')
GO
-- Error:
-- "The INSERT statement conflicted with the
-- FOREIGN KEY constraint "FK_ManagerEmail_EmployeeID".

What do you think? Does this work for you, or would a trigger be more straightforward? As always, please let me know in the comments, I’d love to hear from you.

, , , ,

Leave a comment