convert dates: convert to set-based

I need help understanding this code and seeing if it can be converted to
set-based.
Declare @.Id int
Declare @.companyId numeric(18,0)
Declare @.franchiseId numeric(18,0)
--I understand this part where a table is created and populated
Declare @.tblCompanyFranchise Table( [Id] int, companyId numeric(18,0),
franchiseId numeric(18,0))
Declare @.pinId Varchar(30)
insert into @.tblCompanyFranchise([Id],companyId , franchiseId )
select [ID],numCompanyId,numFranchiseId from dbo.tblCompanyFranchise
where pinId is null
While(SELECT COUNT(*) FROM @.tblCompanyFranchise) > 0
Begin
--I don't understand where [id], companyid and franchiseid come from here to
compare to what is in the temp @.tblcompanyfranchise table
SELECT TOP 1 @.Id=[id], @.companyId=companyId,@.franchiseId=franch
iseId FROM
@.tblCompanyFranchise
--here deleting row just selected above
DELETE FROM @.tblCompanyFranchise WHERE companyId=@.companyId
and franchiseId=@.franchiseId and [id] =@.id
--not sure purpose of attempts here
Declare @.attempts int
Set @.attempts = 10
while(@.attempts > 0)
Begin
--set @.pinid as unique
--not sure why add @.id to part of pinid
Set @.pinId = Abs(CheckSum(NEWID()))
Set @.pinId = cast ( (cast (substring(@.pinId,1,10-len(@.Id)) as varchar) +
cast (@.Id as varchar)) as Varchar)
--this part below I don't understand. Is the update statement at the bottom
outside of the if statement below? What is the purpose of the attempts? IF
the count is not > 0 then attempts are set to 0. Correct?
if((Select count(*) from dbo.tblCompanyFranchise where pinid = @.pinId) > 0)
Begin
Set @.attempts = @.attempts - 1
If @.attempts = 0
Set @.pinId = null
End
Else
Begin
Set @.attempts = 0
End
End
Update dbo.tblCompanyFranchise set pinId = @.pinId
where numCompanyId=@.companyId and numFranchiseId=@.franchiseId and [id] =@.id
End
Go
Is there a set-based way to do the same thing?Looks to me like it's trying to create a pin for each companyid, but it's
worried that the pin might not be unique. To combat that, they put the
companyid at the end of it, padding to 10 characters. I suppose then though,
they might have the situation where there are two pins the same, if the
companyids are, say, 102 and 3102.
I would suggest that the logic be investigated, and replaced with something
that is going to produce a unique string each time, so that it doesn't need
to do each one individually. For example, if you are allowed a pin of 30
characters (as per the declare statement), then why not use all 10
characters of the random number (pad it out if necessary) and then put the
id on the end. That way, there will never be an overlap, as the digits from
position 11 on would be unique (just longer for larger numbers).
If the pin has to be 10 characters, then perhaps you could put a hyphen in
before the companyid section?
If all the characters have to be digits, then perhaps pad the companyid out
to a known number of digits - but that will restrict the number of companies
you could have in the system.
Of course, the chance of an overlap is really quite small, so you could put
a unique index on the pinid field, and just retry the query if you get an
error.
update dbo.tblCompanyFranchise
set pinId = cast ( (cast (substring(cast(abs(CheckSum(NEWID())) as
varchar),1,10-len(Id)) as varchar) + cast (Id as varchar)) as Varchar)
where pinId is null
Rob
"Dan D." <DanD@.discussions.microsoft.com> wrote in message
news:B80E4114-572B-4B03-9095-99BC70A780E3@.microsoft.com...
>I need help understanding this code and seeing if it can be converted to
> set-based.
> Declare @.Id int
> Declare @.companyId numeric(18,0)
> Declare @.franchiseId numeric(18,0)
> --I understand this part where a table is created and populated
> Declare @.tblCompanyFranchise Table( [Id] int, companyId numeric(18,0),
> franchiseId numeric(18,0))
> Declare @.pinId Varchar(30)
> insert into @.tblCompanyFranchise([Id],companyId , franchiseId )
> select [ID],numCompanyId,numFranchiseId from dbo.tblCompanyFranchise
> where pinId is null
> While(SELECT COUNT(*) FROM @.tblCompanyFranchise) > 0
> Begin
> --I don't understand where [id], companyid and franchiseid come from here
> to
> compare to what is in the temp @.tblcompanyfranchise table
> SELECT TOP 1 @.Id=[id], @.companyId=companyId,@.franchiseId=franch
iseId FROM
> @.tblCompanyFranchise
> --here deleting row just selected above
> DELETE FROM @.tblCompanyFranchise WHERE companyId=@.companyId
> and franchiseId=@.franchiseId and [id] =@.id
> --not sure purpose of attempts here
> Declare @.attempts int
> Set @.attempts = 10
> while(@.attempts > 0)
> Begin
> --set @.pinid as unique
> --not sure why add @.id to part of pinid
> Set @.pinId = Abs(CheckSum(NEWID()))
> Set @.pinId = cast ( (cast (substring(@.pinId,1,10-len(@.Id)) as varchar) +
> cast (@.Id as varchar)) as Varchar)
> --this part below I don't understand. Is the update statement at the
> bottom
> outside of the if statement below? What is the purpose of the attempts? IF
> the count is not > 0 then attempts are set to 0. Correct?
> if((Select count(*) from dbo.tblCompanyFranchise where pinid = @.pinId) >
> 0)
> Begin
> Set @.attempts = @.attempts - 1
> If @.attempts = 0
> Set @.pinId = null
> End
> Else
> Begin
> Set @.attempts = 0
> End
> End
> Update dbo.tblCompanyFranchise set pinId = @.pinId
> where numCompanyId=@.companyId and numFranchiseId=@.franchiseId and [id]
> =@.id
> End
> Go
> Is there a set-based way to do the same thing?
> --
>|||I thought newid() always produced a unique value? Do you know what the
purpose of the 10 "attempts" toward the bottom was?
Thanks,
--
Dan D.
"Rob Farley" wrote:

> Looks to me like it's trying to create a pin for each companyid, but it's
> worried that the pin might not be unique. To combat that, they put the
> companyid at the end of it, padding to 10 characters. I suppose then thoug
h,
> they might have the situation where there are two pins the same, if the
> companyids are, say, 102 and 3102.
> I would suggest that the logic be investigated, and replaced with somethin
g
> that is going to produce a unique string each time, so that it doesn't nee
d
> to do each one individually. For example, if you are allowed a pin of 30
> characters (as per the declare statement), then why not use all 10
> characters of the random number (pad it out if necessary) and then put the
> id on the end. That way, there will never be an overlap, as the digits fro
m
> position 11 on would be unique (just longer for larger numbers).
> If the pin has to be 10 characters, then perhaps you could put a hyphen in
> before the companyid section?
> If all the characters have to be digits, then perhaps pad the companyid ou
t
> to a known number of digits - but that will restrict the number of compani
es
> you could have in the system.
> Of course, the chance of an overlap is really quite small, so you could pu
t
> a unique index on the pinid field, and just retry the query if you get an
> error.
> update dbo.tblCompanyFranchise
> set pinId = cast ( (cast (substring(cast(abs(CheckSum(NEWID())) as
> varchar),1,10-len(Id)) as varchar) + cast (Id as varchar)) as Varchar)
> where pinId is null
> Rob
> "Dan D." <DanD@.discussions.microsoft.com> wrote in message
> news:B80E4114-572B-4B03-9095-99BC70A780E3@.microsoft.com...
>
>|||>> s there a set-based way to do the same thing? <<
What is this nightmare of poorly formatted insanely proprietary code
supposed to do? Without a spec, it is pretty hard to answer your
querstion. I have the feeling that this crap is not using a relatioanl
key at all, but that it is randomly trying to construct an unverifiable
exposed locator on the fly.|||Dan,

>I thought newid() always produced a unique value?
It does. But your code isn't using it in its standard form. It's grabbing
its checksum, and its absolute value, so that just makes it a random number
less than 2^31.
But then you change the last characters of that number with the id from the
table. Eg... if you have an id of 100342, your 10-digit random number might
be:
2834100342
But the whole reason for doing it one by one is that your code is worried
that it might not be unique. But it's going to be _probably_ unique - the
only chance of overlaps is where you have two numbers that overlap already.
For example, there's a 1/1000 chance that the same number above could be
generated for company 4100342. And it figures that it should be able to find
a unique number some time in the first 10 tries - which it shouldn't have
any problem doing at all.
The chance of each one being unique is very high. Not high enough to warrant
doing each one individually and checking each time. But if you need it to be
enforced, then do it with a unique key, and just put a check in to see that
the update hasn't broken the rule. If it has, just re-run it.
Let's have a quick think about where the possible overlaps are:
Record 100342 could overlap with:
record 2 (1/1000000000 chance)
record 42 (1/100000000 chance)
record 342 (1/10000000 chance)
record 1100342 (1/1000 chance)
record 2100342 (1/1000 chance)
...etc
If there's a really good business reason for the uniqueness, this is enough
of a risk to make it worth enforcing, but you could update hundreds of
thousands of records at a time without noticing any overlaps.
Rob
"Dan D." <DanD@.discussions.microsoft.com> wrote in message
news:F8787F1F-4AC0-405C-9E8E-5ACF7841F11D@.microsoft.com...
>I thought newid() always produced a unique value? Do you know what the
> purpose of the 10 "attempts" toward the bottom was?
> Thanks,
> --
> Dan D.
>
> "Rob Farley" wrote:
>|||The pinid column is a varchar(30) so if the contractor was really worried
about uniqueness, I don't know why he didn't use more of the id field. The i
d
itself is also supposed to be unique so the chance of combining the checksum
of newid() and all of the id field is pretty small.
I ran your query and it took 6 seconds. The original code took 4 hours.
Can you tell me what how this part of the code works:
SELECT TOP 1 @.Id=[id], @.companyId=companyId,@.franchiseId=franch
iseId FROM
@.tblCompanyFranchise
I interpret it to mean select the first record from @.tblCompanyFranchise
where the @.id in @.tblCompanyFranchise equals [id] and where @.companyid equals
companyid and where @.franchiseid equals franchiseid. But where does the valu
e
for [id], companyid and franchiseid come from?
Thanks so much for your help Rob.
Dan D.
"Rob Farley" wrote:

> Dan,
>
> It does. But your code isn't using it in its standard form. It's grabbing
> its checksum, and its absolute value, so that just makes it a random numbe
r
> less than 2^31.
> But then you change the last characters of that number with the id from th
e
> table. Eg... if you have an id of 100342, your 10-digit random number migh
t
> be:
> 2834100342
> But the whole reason for doing it one by one is that your code is worried
> that it might not be unique. But it's going to be _probably_ unique - the
> only chance of overlaps is where you have two numbers that overlap already
.
> For example, there's a 1/1000 chance that the same number above could be
> generated for company 4100342. And it figures that it should be able to fi
nd
> a unique number some time in the first 10 tries - which it shouldn't have
> any problem doing at all.
> The chance of each one being unique is very high. Not high enough to warra
nt
> doing each one individually and checking each time. But if you need it to
be
> enforced, then do it with a unique key, and just put a check in to see tha
t
> the update hasn't broken the rule. If it has, just re-run it.
> Let's have a quick think about where the possible overlaps are:
> Record 100342 could overlap with:
> record 2 (1/1000000000 chance)
> record 42 (1/100000000 chance)
> record 342 (1/10000000 chance)
> record 1100342 (1/1000 chance)
> record 2100342 (1/1000 chance)
> ...etc
> If there's a really good business reason for the uniqueness, this is enoug
h
> of a risk to make it worth enforcing, but you could update hundreds of
> thousands of records at a time without noticing any overlaps.
> Rob
>
> "Dan D." <DanD@.discussions.microsoft.com> wrote in message
> news:F8787F1F-4AC0-405C-9E8E-5ACF7841F11D@.microsoft.com...
>
>|||I wish I knew. This project was contracted out. There was no statement of
work, documentation, etc. by the original contractors. The project was takin
g
too long (surprise!) so it was brought in-house and another group of
contractors was hired to fix it.
I look through the code once in a while to see how other people code and to
learn.
--
Dan D.
"--CELKO--" wrote:

> What is this nightmare of poorly formatted insanely proprietary code
> supposed to do? Without a spec, it is pretty hard to answer your
> querstion. I have the feeling that this crap is not using a relatioanl
> key at all, but that it is randomly trying to construct an unverifiable
> exposed locator on the fly.
>|||Dan,
It sounds to me like you need to look through the business rules, and
probably get new contractors. :) If you can use more than 10 digits, then by
all means do that. I would actually suggest starting with the id number and
then using the large number padded to 10 digits. That way, you can guarantee
its uniqueness, plus you won't have 0 as the first character (because you
will need to pad the 10-digits to be sure it's unique - consider the case
where your checksum gives you a very small result).

> Can you tell me what how this part of the code works:
> SELECT TOP 1 @.Id=[id], @.companyId=companyId,@.franchiseId=franch
iseId FROM
> @.tblCompanyFranchise
This just gets the top row from @.tblCompanyFranchise without having any
filter, and populates the variables @.Id, @.companyId and @.franchiseId. It's
basically a cursor without having a cursor. My guess is that your
contractors have read that cursors are bad practice, but instead of taking a
set-based approach, have simply altered the code to remove the cursor
declaration.
Rob
PS: Sorry for my silence over the past several hours - I'm in Australia.
"Dan D." <DanD@.discussions.microsoft.com> wrote in message
news:A69B97BF-8B5F-4BFE-AD9E-573868DBBAD7@.microsoft.com...
> The pinid column is a varchar(30) so if the contractor was really worried
> about uniqueness, I don't know why he didn't use more of the id field. The
> id
> itself is also supposed to be unique so the chance of combining the
> checksum
> of newid() and all of the id field is pretty small.
> I ran your query and it took 6 seconds. The original code took 4 hours.
> Can you tell me what how this part of the code works:
> SELECT TOP 1 @.Id=[id], @.companyId=companyId,@.franchiseId=franch
iseId FROM
> @.tblCompanyFranchise
> I interpret it to mean select the first record from @.tblCompanyFranchise
> where the @.id in @.tblCompanyFranchise equals [id] and where @.companyid
> equals
> companyid and where @.franchiseid equals franchiseid. But where does the
> value
> for [id], companyid and franchiseid come from?
> Thanks so much for your help Rob.
> --
> Dan D.
>
> "Rob Farley" wrote:
>|||:) Yup. I try to sleep at night occasionally.
"Stefan Berglund" <sorry.no.koolaid@.for.me> wrote in message
news:ges982dt67flvs05h9i7pjf0sja9ah6i0j@.
4ax.com...
> On Tue, 6 Jun 2006 11:46:47 +0930, "Rob Farley" <rob_farley@.hotmail.com>
> wrote:
> in <uVR#E9QiGHA.3904@.TK2MSFTNGP02.phx.gbl>
> Oh! - Have they been closed for several hours?
> --
> Stefan Berglund|||Yeah, we have a regular scheduled outage for maintenance. We shut down
the country for a few hours each night - didn't you get the memo?
*mike hodgson*
http://sqlnerd.blogspot.com
Stefan Berglund wrote:

>On Tue, 6 Jun 2006 11:46:47 +0930, "Rob Farley" <rob_farley@.hotmail.com>
>wrote:
> in <uVR#E9QiGHA.3904@.TK2MSFTNGP02.phx.gbl>
>
>Oh! - Have they been closed for several hours?
>--
>Stefan Berglund
>

Monday, March 19, 2012

convert to set-based

No comments:

Post a Comment

convert dates

Blog Archive

About Me