Here’s something I come across too often: how do we deal with sending data to and receiving data from external systems? I don’t mean the format of the data, or the transmission mechanism; but simply, how do we keep a record of what occurred? (And I use the word ‘external’ here in a vague sense; it could mean external to your company, external to the current app domain, or just something that you want to keep separate.)
Let’s pretend our system needs to call out to Amazon to query the best price for a book: we send the Amazon service an ISBN, and it returns a number that represents the price.
The Novice Developer (TND) has created a service, underneath which is a table that looks like this:
CREATE TABLE dbo.BookPrice ( ISBN CHAR(13) NOT NULL PRIMARY KEY ,LatestPrice SMALLMONEY NULL )
(TND probably uses Entity Framework or similar to talk to the database, but I’ll pretend that the usual SQL statements are being used.)
To get a price, the service sends the ISBN to Amazon, gets a response, and INSERTs into the table:
INSERT dbo.BookPrice(ISBN, Price) VALUES('9780575082076', 8.99)
, or does an UPDATE if the record already exists. So far, so good. Next time TND runs the service, there’s a problem; the book is out of stock, so Amazon doesn’t return a price. The code does this:
UPDATE dbo.BookPrice SET LatestPrice = NULL WHERE ISBN = '9780575082076'
An analyst looks at the data, and asks, “Well, why’s that price NULL?”. TND explains, and the analyst points out that there’s no way of telling what a NULL means. It could be any of:
- The service never attempted to call out to Amazon
- The service tried to make the call, but there was a network problem
- The service made the call correctly, but got an error back from Amazon
A manager overhears the conversation, and asks the TND to create a scheduled task that retries calls that didn’t return a price. To facilitate this, the developer changes the table:
ALTER TABLE dbo.BookPrice ADD LatestStatus VARCHAR(255) NULL , LastOKPrice SMALLMONEY NULL
Now, the UPDATE becomes:
UPDATE dbo.BookPrice SET LatestPrice = NULL ,LatestStatus = 'Out of stock' ,LastOKPrice = 8.99 WHERE ISBN = '9780575082076'
To get the latest price, TND tells the analyst to just run the following query:
SELECT ActualLatestPrice = ISNULL(LatestPrice, LastOKPrice) FROM dbo.BookPrice WHERE ISBN = '9780575082076'
Sometime later, TND’s service requests the latest price, and the network is down. The service notes this accordingly:
UPDATE dbo.BookPrice SET LatestStatus = 'Network problem' WHERE ISBN = '9780575082076'
The analyst then points out that the company has SLAs and needs to know if external service calls are failing. TND thinks, and adds a new column:
ALTER TABLE dbo.BookPrice ADD NetworkProblemCount INT NULL
and UPDATEs the table again:
UPDATE dbo.BookPrice SET NetworkProblemCount = 1 WHERE ISBN = '9780575082076'
“But we need to know when it happened”, says the analyst. So TND thinks again, and adds another new column:
ALTER TABLE dbo.BookPrice ADD LastNetworkProblemDate DATETIME NULL
and UPDATEs the table again:
UPDATE dbo.BookPrice SET LastNetworkProblemDate = GETDATE() WHERE ISBN = '9780575082076'
Back comes the analyst, “But when was the last time we actually successfully retrieved a price?”, so TND adds another column:
ALTER TABLE dbo.BookPrice ADD LastSuccessfulPriceRetrievalDate DATETIME NULL
… and hopefully by this point, you can agree with me that continually adding columns is not the way we should be doing this!
The most straightforward way I know of to record interactions with other systems, is via Request and Response tables. Take a look at the following DDL:
CREATE TABLE dbo.BookPrice_Request ( RequestID INT NOT NULL IDENTITY(1,1) CONSTRAINT PK_BookPrice_Request PRIMARY KEY CLUSTERED ,ISBN CHAR(13) NOT NULL ,CreatedOn DATETIME NOT NULL CONSTRAINT DF_BookPrice_Request_CreatedOn DEFAULT(GETDATE()) ) CREATE TABLE dbo.BookPrice_ResponseType ( ResponseTypeID TINYINT NOT NULL CONSTRAINT PK_BookPrice_ResponseType PRIMARY KEY CLUSTERED ,[Description] VARCHAR(255) NOT NULL ,CreatedOn DATETIME NOT NULL CONSTRAINT DF_BookPrice_ResponseType_CreatedOn DEFAULT(GETDATE()) ) INSERT dbo.BookPrice_ResponseType(ResponseTypeID, [Description]) VALUES(10, 'Price obtained successfully') , (20, 'Out of stock') , (30, 'Network error') CREATE TABLE dbo.BookPrice_Response ( ResponseID INT NOT NULL IDENTITY(1,1) CONSTRAINT PK_BookPrice_Response PRIMARY KEY CLUSTERED ,RequestID INT NOT NULL ,ResponseTypeID TINYINT NOT NULL ,Price SMALLMONEY NULL ,CreatedOn DATETIME NOT NULL CONSTRAINT DF_BookPrice_Response_CreatedOn DEFAULT(GETDATE()) ,CONSTRAINT FK_BookPrice_Response_RequestID FOREIGN KEY (RequestID) REFERENCES dbo.BookPrice_Request(RequestID) ,CONSTRAINT FK_BookPrice_Response_ResponseTypeID FOREIGN KEY (ResponseTypeID) REFERENCES dbo.BookPrice_ResponseType(ResponseTypeID) )
You can run the code above in a test database, and it’ll create three tables: a Request, a Response, and a ResponseType. When our internal service receives a request for a book price, it INSERTs into the Request table. When a reply is received from Amazon, we INSERT into the Response table, with (a) the RequestID we generated earlier, and (b) an indication of what type of response we received.
To recreate what was happening with TND’s service, run the following SQL:
-- Call #1 INSERT dbo.BookPrice_Request(ISBN) VALUES('9780575082076') -- The calling code would get the newly-INSERTed RequestID from SCOPE_IDENTITY() -- This is the first INSERT into the table, so the returned RequestID will be 1. -- ... our service gets the response from Amazon... INSERT dbo.BookPrice_Response(RequestID,ResponseTypeID,Price) VALUES(1, 10, 8.99) -- Call #2 INSERT dbo.BookPrice_Request(ISBN) VALUES('9780575082076') -- the RequestID returned is 2 -- ... our service gets the response 'Out of stock' from Amazon... INSERT dbo.BookPrice_Response(RequestID,ResponseTypeID,Price) VALUES(2, 20, NULL) -- Call #3 INSERT dbo.BookPrice_Request(ISBN) VALUES('9780575082076') -- the RequestID returned is 3 -- ... our service call fails, there's a problem with the network... INSERT dbo.BookPrice_Response(RequestID,ResponseTypeID,Price) VALUES(3, 30, NULL)
Let’s say a fourth service call was completely successful:
-- Call #4 INSERT dbo.BookPrice_Request(ISBN) VALUES('9780575082076') -- the RequestID returned is 4 -- ... our service gets the response from Amazon... INSERT dbo.BookPrice_Response(RequestID,ResponseTypeID,Price) VALUES(4, 10, 6.99) -- it's on offer at a reduced price
Then to get the latest price for a book (and when that price was obtained), the query is simply:
SELECT TOP 1 rp.Price ,rp.CreatedOn FROM dbo.BookPrice_Response rp JOIN dbo.BookPrice_Request rq ON rq.RequestID = rp.RequestID WHERE rq.ISBN = '9780575082076' AND rp.ResponseTypeID = 10 ORDER BY rp.ResponseID DESC
Our Request and Response tables have a complete record of everything that has happened in our service, and it’s simple for the analyst to build reports around how efficient the service is, and if it’s working correctly.
Before I go, a quick ‘gotcha’ – it’s fresh in my mind because I noticed this in a particular service’s database a few days ago: every Request record had a Response. Now, while that’s certainly the perfect situation, it’s just not realistic; things go wrong. In this case, it turns out that the INSERT into the Request table was only being done at the end of the process, only when a Response had been successfully received. So errors were being returned from the 3rd party system, but the service code was bombing out – therefore we had no visibility of this in the database. It’s very important: the Request and Response have to be written back to the database as soon as they happen, without delay.