Tuesday, March 20, 2012

primary key data type

Where is the biggest difference in performance between a 'int' ID and
a 'nvarchar' ID on big tables?
On 7 Feb 2007 21:30:03 -0800, retima@.gmail.com wrote:

>Where is the biggest difference in performance between a 'int' ID and
>a 'nvarchar' ID on big tables?
The PK is used as a final pointer by all other indexes on the table,
so whatever the size difference between the int and average nvarchar
(plus length indicator), may be multiplied, and makes other indexes
somewhat less dense, which is in general a bad thing.
J.
|||On Feb 8, 7:39 am, JXStern <JXSternChange...@.gte.net> wrote:
> On 7 Feb 2007 21:30:03 -0800, ret...@.gmail.com wrote:
>
> The PK is used as a final pointer by all other indexes on the table,
> so whatever the size difference between the int and average nvarchar
> (plus length indicator), may be multiplied, and makes other indexes
> somewhat less dense, which is in general a bad thing.
> J.
thanks J.
Can you direct me to an example which is more specific?
(a certain query that would prove it with numbers)
|||Yaniv,shalom
http://www.sql-server-performance.com/datatypes.asp
<yaniv.harpaz@.gmail.com> wrote in message
news:1170919960.359095.315430@.p10g2000cwp.googlegr oups.com...
> On Feb 8, 7:39 am, JXStern <JXSternChange...@.gte.net> wrote:
> thanks J.
> Can you direct me to an example which is more specific?
> (a certain query that would prove it with numbers)
>
|||I think that you mixed up primary key and clustered index. None
clustered index are using the clustered index's keys as a pointer to
the data. Of course if the primary key is clustered then all the none
clustered index will use the PK as the pointer to the data, but many
times there is another clustered index (and sometimes there is no
clustered index at all).
Adi
On Feb 8, 7:39 am, JXStern <JXSternChange...@.gte.net> wrote:
> On 7 Feb 2007 21:30:03 -0800, ret...@.gmail.com wrote:
>
> The PK is used as a final pointer by all other indexes on the table,
> so whatever the size difference between the int and average nvarchar
> (plus length indicator), may be multiplied, and makes other indexes
> somewhat less dense, which is in general a bad thing.
> J.
|||An int occupies 4 bytes, while nvarchar occupies 2 bytes per character. An
int goes from -2billion to +2 billion (ish) so smallint might be more
appropriate (2 bytes and +/- 32k ish). See
http://www.databasejournal.com/features/mssql/article.phpr/2212141 for exact
sizes. The reason I'm talking about the space occupied, is that this is one
crucial factor in determining the datatype - fewer pages to search = faster
searches. What I find confusing though is that these datatypes are not
really comparable as they'll store different kinds of information, and
perhaps you are questioning whether to use a surrogate key or a key with
business meaning, which is a different discussion and one you'll find many
threads on in this and the programming newsgroup.
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com
|||On Feb 8, 10:22 am, "Paul Ibison" <Paul.Ibi...@.Pygmalion.Com> wrote:
> An int occupies 4 bytes, while nvarchar occupies 2 bytes per character. An
> int goes from -2billion to +2 billion (ish) so smallint might be more
> appropriate (2 bytes and +/- 32k ish). Seehttp://www.databasejournal.com/features/mssql/article.phpr/2212141for exact
> sizes. The reason I'm talking about the space occupied, is that this is one
> crucial factor in determining the datatype - fewer pages to search = faster
> searches. What I find confusing though is that these datatypes are not
> really comparable as they'll store different kinds of information, and
> perhaps you are questioning whether to use a surrogate key or a key with
> business meaning, which is a different discussion and one you'll find many
> threads on in this and the programming newsgroup.
> Cheers,
> Paul Ibison SQL Server MVP,www.replicationanswers.com
Hi Paul,
In one of the systems I'm reviewing, most of the tables have nvarchar
data types because it's "comfortable" for programming and reports. I
want to convince them in the best way, that this move has problems and
might be very expensive to fix in the future.
|||OK - then if the column doesn't contain any 'meaningful' business data and
is just used to identify the records, I'd recommend using the appropriate
whole-number datatype. As a programmer I find them just as easy to use as
nvarchars. A lot of DBAs use surrogate PK keys with identity columns which
has other benefits - no need to design an algorithm for separate values.
Cheers,
Paul Ibison SQL Server MVP,www.replicationanswers.com
|||> The PK is used as a final pointer by all other indexes on the table,
Not really - it is the clustered index key that is used for bookmark
lookups. The clustered index is not necessarily the primary key.
Hope this helps.
Dan Guzman
SQL Server MVP
"JXStern" <JXSternChangeX2R@.gte.net> wrote in message
news:4pdls2125tpm4oln46ijq7lil8ht9i611b@.4ax.com...
> On 7 Feb 2007 21:30:03 -0800, retima@.gmail.com wrote:
>
> The PK is used as a final pointer by all other indexes on the table,
> so whatever the size difference between the int and average nvarchar
> (plus length indicator), may be multiplied, and makes other indexes
> somewhat less dense, which is in general a bad thing.
> J.
>
|||On 8 Feb 2007 00:10:19 -0800, "Adi" <adicohn@.hotmail.com> wrote:

> I think that you mixed up primary key and clustered index.
Ooops, yes I did.
Sorry for any confusion.
To try the original question again, if the PK is not clustered, just
being a PK, then the only impact is as any less dense index is that
much less efficient, and I suppose its use (like any index's use) by
FK's, would also consume that many more bytes of space.
I'll happily use char(8) or even varchar(24) as PKs, heck, I'll use
compound PKs even longer, as required, unless it's some hugely active
system where you have to squeeze out the last bits of performance.
And of course, most databases I walk into, are already using clustered
identity ints (or bigints more recently) as PKs on most tables ...
hence my sloppy answer last night.
J.

> None
>clustered index are using the clustered index's keys as a pointer to
>the data. Of course if the primary key is clustered then all the none
>clustered index will use the PK as the pointer to the data, but many
>times there is another clustered index (and sometimes there is no
>clustered index at all).
>Adi
>On Feb 8, 7:39 am, JXStern <JXSternChange...@.gte.net> wrote:
>

No comments:

Post a Comment