//  home   //  advanced search   //  news   //  categories   //  sql build chart   //  downloads   //  statistics
 ASP FAQ 
Home
ASP FAQ Tutorials

   8000XXXX Errors
   Alerts
   ASP.NET 2.0
   Classic ASP 1.0
   Databases
      Access DB & ADO
      General SQL Server & Access Articles
      MySQL
      Other Articles
      Schema Tutorials
      Sql Server 2000
      Sql Server 2005
   General Concepts
   Search Engine Optimization (SEO)

Contact Us
Site Map

Search

Web
aspfaq.com
tutorials.aspfaq.com
databases.aspfaq.com

ASP FAQ Tutorials :: Databases :: Other Articles :: How do I prevent duplicates in a table?


How do I prevent duplicates in a table?

The question "how do I remove duplicates?" is asked so often in the SQL Server newsgroups, that I think there needs to be more recommendations in database articles and books regarding the importance of data integrity, and how to protect your data through primary keys and unique constraints. 
 
If you already have duplicates in your table and you want to remove them, you can see a technique (useful if you have a primary key and the duplicate values are in another column), and a couple of links describing methods for other scenarios, in Article #2431
 
If you want to read about some techniques for preventing redundant data in the future, read on. 
 

Primary Key / Unique Constraint 
 
In my opinion, the best possible way to prevent corrupted or redundant data is to enforce data integrity at the table level. You can do this using a PRIMARY KEY CONSTRAINT; for example: 
 
CREATE TABLE blat 

    email VARCHAR(128) PRIMARY KEY 

GO
 
Now, if you try to insert a duplicate record, for example: 
 
<% 
    set conn = CreateObject("ADODB.Connection") 
    conn.open "<connection string>" 
 
    sql = "INSERT blat(email) VALUES('foo@x.com')" 
    conn.execute sql, , 129 
 
    sql = "INSERT blat(email) VALUES('foo@x.com')" 
    conn.execute sql, , 129 
 
    conn.close: set conn = nothing 
%>
 
You will be greeted with this error (the constraint and object name will vary, of course): 
 
Microsoft OLE DB Provider for SQL Server error '80040e2f'  
Violation of PRIMARY KEY constraint 'PK__blat__7CE47361'. Cannot insert duplicate key in object 'blat'.
 
In Access, you would get this error: 
 
Microsoft OLE DB Provider for ODBC Drivers error '80040e14'  
[Microsoft][ODBC Microsoft Access Driver] The changes you requested to the table were not successful because they would create duplicate values in the index, primary key, or relationship. Change the data in the field or fields that contain duplicate data, remove the index, or redefine the index to permit duplicate entries and try again.
 
You can also create a PRIMARY KEY that encompasses multiple columns. Let's pretend for a moment that it would be useful to allow email and/or username to be repeated in a table, but not both. We could enforce this as follows: 
 
CREATE TABLE blat 

    email VARCHAR(128), 
    username VARCHAR(32), 
    PRIMARY KEY(email, username) 

GO
 
See Article #2504 for information on choosing a primary key (the examples in this article are completely contrived and superfluous). 
 
Now, if your table already has a PRIMARY KEY, you might have other columns that need to be unique. Maybe you don't want them to be part of the PRIMARY KEY. Or maybe you want multiple columns to be validated individually for uniqueness, rather than as a whole. Or maybe you only want uniqueness to be enforced if the value is known (e.g. NOT NULL). You can do this with a UNIQUE CONSTRAINT, by adding the keyword UNIQUE to a column's definition in the CREATE TABLE statement. 
 
CREATE TABLE blat 

    email VARCHAR(128) NOT NULL PRIMARY KEY, 
    username VARCHAR(32) NULL UNIQUE 

GO
 
You can also apply the constraint on its own, and on single or multiple columns. So again, imagine that these enforcements of uniqueness actually made sense: 
 
CREATE TABLE blat 

    email VARCHAR(128) NOT NULL PRIMARY KEY, 
    username VARCHAR(32) NULL, 
    phone VARCHAR(12) NULL 
    CONSTRAINT up UNIQUE NONCLUSTERED(username, phone), 
    CONSTRAINT p UNIQUE NONCLUSTERED(phone) 

GO
 
When you violate this constraint, you will get the following error (it might be 'p' depending on which constraint you actually violate): 
 
Microsoft OLE DB Provider for SQL Server error '80040e2f'  
Violation of UNIQUE KEY constraint 'up'. Cannot insert duplicate key in object 'blat'.
 
You can also apply such a constraint to the table that already exists, by simply adding a UNIQUE INDEX: 
 
CREATE UNIQUE INDEX u ON blat(username)
 
When you violate this constraint, you will get the following error: 
 
Microsoft OLE DB Provider for SQL Server error '80040e2f'  
Cannot insert duplicate key row in object 'blat' with unique index 'u'.
 
Of course, all of these errors can be trapped in ASP: 
 
<% 
    set conn = CreateObject("ADODB.Connection") 
    conn.open "<connection string>" 
 
    sql = "INSERT ... that violates constraint ..." 
 
    on error resume next 
    conn.execute sql, , 129 
    if err.number <> 0 then 
        Response.Write err.description 
        ' or your own custom error message 
    end if 
 
    conn.close: set conn = nothing 
%>
 

Pre-checking 
 
So, if you can't add a PRIMARY KEY, UNIQUE CONSTRAINT, or UNIQUE INDEX to the table, then you can check manually to make sure the value you are *about* to insert does not already exist. 
 
From within SQL Server 
 
Of these three pre-checking methods, this is best because minimal lag time and a single transaction can help ensure that peripheral changes to the table don't affect your check. In a stored procedure, you can do this: 
 
CREATE PROCEDURE dbo.FAQ_InsertRow 
    @email VARCHAR(32) 
AS 
BEGIN 
    SET NOCOUNT ON 
 
    IF EXISTS (SELECT 1 FROM blat WHERE email = @email) 
        RAISERROR('This value already exists.', 11, 1) 
    ELSE 
        INSERT blat(email) VALUES(@email) 
END 
GO
 
Then from ASP: 
 
<% 
    set conn = CreateObject("ADODB.Connection") 
    conn.open "<connection string>" 
 
    sql = "EXEC FAQ_InsertRow 'foo@x.com'" 
 
    on error resume next 
    conn.execute sql, , 129 
    if err.number <> 0 then 
        Response.Write err.description 
    end if 
 
    conn.close: set conn = nothing 
%>
 
From ASP code (Access or SQL Server) 
 
Another possibility is to run the check first from ASP, separate from the INSERT. Imagine separate stored procedures, but for brevity I'm going to put the ad hoc SQL statements directly in the code: 
 
<% 
    set conn = CreateObject("ADODB.Connection") 
    conn.open "<connection string>" 
 
    sql = "SELECT COUNT(*) FROM blat " & _ 
        "WHERE email='foo@x.com'" 
 
    set rs = conn.execute(sql) 
 
    if rs(0) = 0 then 
        sql = "INSERT blat(email)" & _ 
            "VALUES('foo@x.com')" 
        conn.execute sql, , 129 
    else 
        response.Write "Value exists." 
    end if 
 
    conn.close: set conn = nothing 
%>
 
A problem with this method is that there is a lag, albeit small, between the query that counts the matching rows, and the actual insert. So, it is technically possible that another user will have inserted the same value in between. 
 
From client-side code (Access or SQL Server) 
 
Another thing you can do (and I wrote about this, way back in Active Server Pages 2.0 Unleashed), is to pull data to the client side, and use client-side validation to prevent duplicates. Note that this method is only feasible for finite datasets. Client-side script is not going to be able to iterate through thousands of values, never mind millions. So this method is usually useful for a small admin table with a very small number of rows. 
 
<% 
    set conn = CreateObject("ADODB.Connection") 
    conn.open "<connection string>" 
 
 
    sql = "SELECT email FROM blat" 
 
    set rs = conn.execute(sql) 
 
    Response.Write "<scr" & "ipt>" & vbCrLf 
    Response.Write "function val(e)" 
    Response.Write "{" & vbCrLf 
 
    do while not rs.eof 
        ' build client-side array syntax 
        response.write vbTab & "if (e=='" & _ 
            lcase(rs(0)) & "') { alert('" & _ 
            rs(0) & " already exists.');" & _ 
            "return; }" & vbCrLf 
        rs.movenext 
    loop 
 
    Response.Write vbTab & "alert('value is okay!');" 
    Response.Write vbTab & "document.a.submit();" 
    Response.Write vbCrLf & "}" 
    Response.Write vbCrLf & "</scr" & "ipt>" 
    conn.close: set conn = nothing 
%> 
 
<form name=a method=post action=somepage.asp> 
<input type=text name=email> 
<input type=button value='go' onclick='val(a.email.value);'> 
</form>
 
It is not recommended that you rely solely on client-side code to prevent duplicates in a server-side database; rather, it should be an enhancement to server-side validation that makes the experience better for the end user — they are told immediately, rather than after a round-trip, that the value already exists. 
 

Triggers 
 
If, for some reason, you can't add a primary key or unique constraint, can't change the stored procedure, and can't change the ASP code, you can still enforce uniqueness in a table using an INSTEAD OF TRIGGER (introduced in SQL Server 2000). There are two problems with this method: one minor, and one major. A minor problem is that an INSTEAD OF TRIGGER cancels the INSERT, so you have to re-code the INSERT again in the case where the value is not a duplicate. Second, if the INSERT includes a TEXT or NTEXT column, the trigger can't deal with those values. 
 
You could also, assuming the INSERT is called within an explicit transaction, use an AFTER TRIGGER to rollback the transaction if it detects a duplicate was inserted.  
 
We're going to stall on creating trigger examples, at least for the time being, because they should be used as a last resort (mainly for performance reasons). If this is not the case, and trigger examples would be useful for you, please let us know.

Related Articles

How do I build a query with optional parameters?
How do I calculate the median in a table?
How do I create a store locator feature?
How do I deal with MEMO, TEXT, HYPERLINK, and CURRENCY columns?
How do I deal with multiple resultsets from a stored procedure?
How do I debug my SQL statements?
How do I determine if a column exists in a given table?
How do I enable or disable connection pooling?
How do I enumerate through the DSNs on a machine?
How do I find a stored procedure containing <text>?
How do I get a list of Access tables and their row counts?
How do I get the latest version of the JET OLEDB drivers?
How do I handle alphabetic paging?
How do I handle BIT / BOOLEAN columns?
How do I handle error checking in a stored procedure?
How do I ignore common words in a search?
How do I page through a recordset?
How do I present one-to-many relationships in my ASP page?
How do I prevent my ASP pages from waiting for backend activity?
How do I prevent NULLs in my database from mucking up my HTML?
How do I protect my Access database (MDB file)?
How do I protect my stored procedure code?
How do I protect myself against the W32.Slammer worm?
How do I remove duplicates from a table?
How do I rename a column?
How do I retrieve a random record?
How do I return row numbers with my query?
How do I send a database query to a text file?
How do I simulate an array inside a stored procedure?
How do I solve 'Could not find installable ISAM' errors?
How do I solve 'Operation must use an updateable query' errors?
How do I temporarily disable a trigger?
How do I use a SELECT list alias in the WHERE or GROUP BY clause?
How do I use a variable in an ORDER BY clause?
Should I index my database table(s), and if so, how?
Should I store images in the database or the filesystem?
Should I use a #temp table or a @table variable?
Should I use a view, a stored procedure, or a user-defined function?
Should I use recordset iteration, or GetRows(), or GetString()?
What are all these dt_ stored procedures, and can I remove them?
What are the limitations of MS Access?
What are the limitations of MSDE?
What are the valid styles for converting datetime to string?
What datatype should I use for my character-based database columns?
What datatype should I use for numeric columns?
What does "ambiguous column name" mean?
What is this 'Multiple-step OLE DB' error?
What is wrong with 'SELECT *'?
What naming convention should I use in my database?
What should I choose for my primary key?
What should my connection string look like?
When should I use CreateObject to create my recordset objects?
Where can I get this 'Books Online' documentation?
Where do I get MSDE?
Which database platform should I use for my ASP application?
Which tool should I use: Enterprise Manager or Query Analyzer?
Why are there gaps in my IDENTITY / AUTOINCREMENT column?
Why can I not 'open a database created with a previous version...'?
Why can't I access a database or text file on another server?
Why can't I use the TOP keyword?
Why do I get 'Argument data type text is invalid for argument [...]'?
Why do I get 'Not enough space on temporary disk' errors?
Why does ASP give me ActiveX errors when connecting to a database?
Should I use COALESCE() or ISNULL()?
Where can I get basic info about using stored procedures?

 

 


Created: 12/11/2003 | Last Updated: 12/11/2003 | broken links | helpful | not helpful | statistics
© Copyright 2006, UBR, Inc. All Rights Reserved. (214)

 

Copyright 1999-2006, All rights reserved.
Finding content
Finding content.  An error has occured...