We explain some of the basics of creating a database and planning relationships
One of the joys of writing about databases is the number of places you get to visit; the downside is the travelling.
But occasionally it yields lighter moments. I recently stayed at the Renaissance Hotel at Gatwick and while I was settling the bill I noticed that a tip I had added for a meal had been increased.
“Oh,” I was told. “That’ll be the new computer system. We had a lot of work done and the new server sometimes just adds random numbers to the bills.”
Right, excellent. I must remember that explanation next time I have to account for errors in one of my databases.
Tables
One of the central tenets of relational databases is that we split data into
multiple tables. There is a formal process (called normalisation) for working
out what data should go into each table, but there is also a good rule of thumb.
Simply identify each real-world type of object that you want to store data about
(Customer, Order or Employee), give each one a table and include in each table
only the fields that are relevant for that object. For example, in an Employee
table we might make fields such as FName, LName and DOB (Date Of Birth).
The table must also have a primary key. This is usually one field and must contain a unique value for each record. So, our three tables might look like the image (above, left): the primary keys in each case are the first column in the table.
Primary keys are significant because we use them to link the data in the separate tables together. If you look at the Order table you can see that Order 1 was placed by customer 2 (Sally Jones) and taken by employee 3 (Harry). The column that points to the primary key (for example, the CustID field in Orders) is called a Foreign key.
The good news about splitting the data into separate tables is that we never store the same data more than once. So, for example, the date of birth of each employee is only ever stored once. We always know where to find it: it has to be in the Employee table. But how can we get to it? For example, how do we list all of the orders, together with the DOB of the associated employee?
Simple, we use the primary and foreign keys to pull the data back together. In this case we could create a query (You’ll find the OrderEmployeeDOB in DBCFeb08. mdb here) that links them like this and it would return the correct information.
In these examples I’ve used a number field to make the joins, which is common, but there is no reason why you cannot make a join between two text fields. If you do so, then the join is case insensitive, so it will happily join “Harry” with “harry”. Most of the time this is what we want, but occasionally people want a case-sensitive join.
caSE SEnsitIve joins
Joins, as described above, are case insensitive. Sara Athwal emailed with a
problem that can only be solved with a case-sensitive join. She has written an
Access database for her company and has a product table.
Each product in her company has a unique identifier made up of characters and numbers. As she says, these are completely unreadable (like Df3dD) because they are machine generated. The machine in question is a Unix-based system and Sara was horrified to discover, after she had built the database, that DF3dD was being used to signify a different product from Df3dD and Df3dd - in other words, it is case sensitive.
Her question is simple: “How can I perform a case-sensitive join?” It’s an interesting question with at least three answers - or one solution and two answers.
Solution 1
One solution is: “Are you sure you want to do that?” An alternative is to keep
the machine-generated value in the table but not use it as the primary key.
Instead, assign another unique number (possibly using an autonumber field) to
each product and use that number as the primary key. This is potentially easier
and may be faster when you perform the joins.
While this may be a solution, it doesn’t answer the question I was asked and you may find a case one day where a case-sensitive join is essential, so let’s look for active solutions.
Solution 2
Access doesn’t support case-sensitive joins without a bit of creative
table-building, which relies on a more or less undocumented feature. When you’re
shown the list of available field types when you’re creating a table, the Binary
type does not figure. Nevertheless, it’s there and is used by some of Access’
internal system tables.
You can use it as you’d use the Text field type and it will behave as a Text field, except when used with any of the comparison operations (such as making joins and sorting or indexing fields). In these cases it uses the ASCII code value equivalent to the field contents and uses that for joining, sorting and indexing. There are different ASCII codes for each character in both cases (‘A’ is 65 and ‘a’ is 97) and this is how the Binary field can act in a case-sensitive fashion.
The Binary type isn’t on offer during the normal table-building process so we have to be a little sneaky in order to create a table with a binary field and use raw SQL. As part of its data definition language, SQL has a command called CREATE TABLE that does exactly what the name suggests. So if we run the following SQL:
CREATE TABLE PENGUIN (MyField binary (30))
it will create a table called ‘Penguin’ with a field called ‘MyField’ that is of type binary and able to accept up to 30 characters. How do we run this SQL?
Start from the Queries tab, selecting ‘Create query in Design view’ and close the Show Table window, which leaves an empty query grid on the screen. The top-left button on the menu bar says ‘SQL’: click it to open the SQL view where you’ll see the word ‘SELECT’, which is how SQL queries start. We’re using the data definition part of SQL so delete this and type the code above.
Run the code by clicking the red exclamation mark button on the menu bar, close the SQL window, and in the Tables tab, you should see the new table. Inspect it in Design view and you’ll see its one field is of Binary type.
I’ve built two tables in this way, and to each I have added an ID field and a Text field.
If we run a query that joins the two tables on the Text field (TextType), we would expect all five rows to match because no account will be taken of case. So the first row in Penguin, ‘Herring’ will be deemed identical to ‘herring’ in the first row of Penguin2. The query on the right, JoinOnCaseInsensitiveField (in the DBCFeb08.mdb), does this and returns all five rows. The query on the left, JoinOnCaseSensitiveField, joins on the Binary field and returns only the row for ‘hake’ because it is all lower case in both tables.
Solution 3
A second approach is to start by building a query that joins the two tables on
the Text field. The answer table contains all rows. Now add a new field to the
query grid:
CaseCompare: StrComp([Penguin].[TextType],[Penguin].[TextType],0)
and make it a visible column. It uses the String Compare function (StrComp) and here it’s comparing the TextType fields from the tables Penguin and Penguin2. The final zero makes the operation case sensitive. It returns 0 if the strings in the TextType fields are identical in case and in character order, and -1 if they don’t.
Run it and four columns contain -1 and one contains a zero: it’s for the ‘hake’ entry, which is the row we identified using solution 2 as the only case-sensitive match.
To see just the case-sensitive matches, type a zero into the Criteria: line in the query grid for the new column and run it again. You can hide the column once you’re sure it’s working with your data.
And there’s your case-sensitive join between tables. The drawback of this solution is that it’s likely to be slow for large data sets.