Thursday, March 22, 2012

Converting a longer query into SQL92 syntax?

First of all, I DON'T want to convert the following query into the SQL92
equivalent, I just want to learn how to do it.
Most of the left and right join examples I've found on the Net usually deal
with 2 or three tables at the maximum. What happens if my query uses 7
tables and only two of those require a left join? The 1st Where clause
requires the left join; everything else stays the same. That's one of the
reasons I still haven't switched to SQL92. This is the query that I want to
convert:
select addr.city, addr.secondary_addr, count(*)
from
addr, acct_offer_prod, acct, cust, active_addr_range, vw_bdry_levels,
addr_range_grouping
where
addr.addr_id = acct_offer_prod.addr_id and
acct_offer_prod.acct_id = acct.acct_id and
acct.acct_id = cust.cust_id and
acct_offer_prod.addr_id = active_addr_range.addr_id and
active_addr_range.range_id = addr_range_grouping.range_id and
addr_range_grouping.bdry_id = vw_bdry_levels.bdry_id and
vw_bdry_levels.metro_or_isla like 'HD ISLA' and
vw_bdry_levels.region_short like 'REG-R' and
vw_bdry_levels.district_short like '037' and
acct_offer_prod.addr_id = addr.addr_id and
active_addr_range.eff_dt <= getdate() and
(active_addr_range.exp_dt >= getdate() or active_addr_range.exp_dt is null)
and
addr_range_grouping.eff_dt <= getdate() and
(addr_range_grouping.exp_dt >= getdate() or addr_range_grouping.exp_dt is
null) and vw_bdry_levels.eff_dt <= getdate() and
(vw_bdry_levels.exp_dt >= getdate() or vw_bdry_levels.exp_dt is null) and
acct_offer_prod.del_status_cd = 'NORMAL'
group by addr.city, addr.secondary_addrYour query is already valid SQL92 syntax (except for the non-standard
GETDATE function).
I assume you are referring to replacing the *= / =* notation with the LEFT /
RIGHT JOIN syntax. To do that just requires that you move the joining
criteria into the ON clause rather than the WHERE clause. In principle you
can mix LEFT and INNER JOINs as much as you like in a SELECT statement. The
key point is what goes in the ON clauses.
I'm not really clear what you are asking for so maybe we can help you better
if you post an example. Please do include DDL (CREATE TABLE statement(s)
including keys and constraints) with any query you post.
David Portas
SQL Server MVP
--|||> First of all, I DON'T want to convert the following query into the SQL92
> equivalent, I just want to learn how to do it.
> Most of the left and right join examples I've found on the Net usually dea
l
> with 2 or three tables at the maximum. What happens if my query uses 7
> tables and only two of those require a left join? The 1st Where clause
> requires the left join; everything else stays the same. That's one of the
> reasons I still haven't switched to SQL92. This is the query that I want t
o
> convert:
First of all, let's write it so it's readable. Looking past the naming
convention and the nonsense of putting keywords in all caps, this should be
SQL92 equivalent of your original query (and will work in SQL Server).
Select A.City, A.Secondary_Addr, Count(*)
From Addr As A
Join Acct_Offer_Prod As AOP
On A.Addr_Id = AOP.Addr_Id
And AOP.Del_Status_Cd = 'NORMAL'
Join Cust As C
On A.Acct_Id = C.Cust_Id
Join Active_Addr_Range As AAR
On AOP.Addr_Id = AAR.Addr_Id
And AAR.Eff_Dt <= Current_Timestamp
And (AAR.Exp_Dt >= Current_Timestamp
Or AAR.Exp_Dt Is Null)
Join Addr_Range_Grouping As ARG
On ARG.Range_Id = AAR.Range_Id
And ARG.Eff_Dt <= Current_Timestamp
And (ARG.Exp_Dt >= Current_Timestamp
Or ARG.Exp_Dt Is Null)
Join vw_Bdry_Levels As VBL
On ARG.Bdry_Id = BDry_Id
And VBL.Metro_Or_Isla Like 'HD ISLA'
And VBL.Region_Short Like 'REG-R'
And VBL.District_Short Like '037'
And VBL.Eff_Dt <= Current_Timestamp
And (VBL.Exp_Dt >= Current_Timestamp
Or VBL.Exp_Dt Is Null)
Group By A.City, A.Secondary_Addr

> The 1st Where clause
> requires the left join; everything else stays the same. That's one of the
> reasons I still haven't switched to SQL92. This is the query that I want t
o
> convert:
Are you asking, "What will happen *if* the first where clause needs to be
converted into a left join?" If that is the case there are a couple of
solutions, I'd be inclined to write like so (depending on the actual spec of
what is desired) using paren'ed joins (Not sure if this is officially ANSI
compliant)
Select A.City, A.Secondary_Addr, Count(*)
From Addr As A
Join Cust As C
On A.Acct_Id = C.Cust_Id
Left Join (Acct_Offer_Prod As AOP
Join Active_Addr_Range As AAR
On AOP.Addr_Id = AAR.Addr_Id
And AAR.Eff_Dt <= Current_Timestamp
And (AAR.Exp_Dt >= Current_Timestamp
Or AAR.Exp_Dt Is Null)
Join Addr_Range_Grouping As ARG
On ARG.Range_Id = AAR.Range_Id
And ARG.Eff_Dt <= Current_Timestamp
And (ARG.Exp_Dt >= Current_Timestamp
Or ARG.Exp_Dt Is Null)
Join vw_Bdry_Levels As VBL
On ARG.Bdry_Id = BDry_Id
And VBL.Metro_Or_Isla Like 'HD ISLA'
And VBL.Region_Short Like 'REG-R'
And VBL.District_Short Like '037'
And VBL.Eff_Dt <= Current_Timestamp
And (VBL.Exp_Dt >= Current_Timestamp
Or VBL.Exp_Dt Is Null))
On A.Addr_Id = AOP.Addr_Id
And AOP.Del_Status_Cd = 'NORMAL'
Group By A.City, A.Secondary_Addr
Another solution is to use a series of Left Joins
Select A.City, A.Secondary_Addr, Count(*)
From Addr As A
Join Cust As C
On A.Acct_Id = C.Cust_Id
Left Join Acct_Offer_Prod As AOP
On A.Addr_Id = AOP.Addr_Id
And AOP.Del_Status_Cd = 'NORMAL'
Left Join Active_Addr_Range As AAR
On AOP.Addr_Id = AAR.Addr_Id
And AAR.Eff_Dt <= Current_Timestamp
And (AAR.Exp_Dt >= Current_Timestamp
Or AAR.Exp_Dt Is Null)
Left Join Addr_Range_Grouping As ARG
On ARG.Range_Id = AAR.Range_Id
And ARG.Eff_Dt <= Current_Timestamp
And (ARG.Exp_Dt >= Current_Timestamp
Or ARG.Exp_Dt Is Null)
Left Join vw_Bdry_Levels As VBL
On ARG.Bdry_Id = BDry_Id
And VBL.Metro_Or_Isla Like 'HD ISLA'
And VBL.Region_Short Like 'REG-R'
And VBL.District_Short Like '037'
And VBL.Eff_Dt <= Current_Timestamp
And (VBL.Exp_Dt >= Current_Timestamp
Or VBL.Exp_Dt Is Null)
Group By A.City, A.Secondary_Addr
The results of these two solutions can vary dramatically depending on what i
s
actually desired. Thus, I'd need more details about the actual results desir
ed
before I could provide an exact solution.
Thomas|||That's really my doubt: taking the first joining critera from the WHERE
clause and moving it to the ON clause. Based on your description, it would
look like the following - but won't it alter the resultset since I've swiche
d
all the tables around?
select addr.city, addr.secondary_addr, count(*)
from
acct, cust, active_addr_range, vw_bdry_levels, addr_range_grouping, addr
LEFT JOIN acct_offer_prod on
(addr.addr_id = acct_offer_prod.addr_id)
where
acct_offer_prod.acct_id = acct.acct_id and ...
...
What if my query uses 50 tables and 200 JOINS, would the order of the tables
and/or JOINS alter the resulting recordset?
Thanks.
"David Portas" wrote:

> Your query is already valid SQL92 syntax (except for the non-standard
> GETDATE function).
> I assume you are referring to replacing the *= / =* notation with the LEFT
/
> RIGHT JOIN syntax. To do that just requires that you move the joining
> criteria into the ON clause rather than the WHERE clause. In principle you
> can mix LEFT and INNER JOINs as much as you like in a SELECT statement. Th
e
> key point is what goes in the ON clauses.
> I'm not really clear what you are asking for so maybe we can help you bett
er
> if you post an example. Please do include DDL (CREATE TABLE statement(s)
> including keys and constraints) with any query you post.
> --
> David Portas
> SQL Server MVP
> --
>
>|||On Tue, 10 May 2005 12:54:06 -0700, VMI wrote:

>First of all, I DON'T want to convert the following query into the SQL92
>equivalent, I just want to learn how to do it.
>Most of the left and right join examples I've found on the Net usually deal
>with 2 or three tables at the maximum. What happens if my query uses 7
>tables and only two of those require a left join? The 1st Where clause
>requires the left join; everything else stays the same. That's one of the
>reasons I still haven't switched to SQL92. This is the query that I want to
>convert:
(snip)
Hi VMI,
Are you sure you posted the right query? Since I can find no =* or *=
operator, the query you posted consists solely of INNER JOINs. BTW, why
are you using LIKE for string comparisons when the string constant you
compare to doesn't use any wildcards?
Anyway, here's what I believe would be the equivalent of your query in
the infixed join notation. Note that I also removed the duplicate check
for equality between acct_offer_prod.addr_id and addr.addr_id.
SELECT addr.city, addr.secondary_addr, COUNT(*)
FROM addr
INNER JOIN acct_offer_prod
ON acct_offer_prod.addr_id = addr.addr_id
INNER JOIN acct
ON acct.acct_id = acct_offer_prod.acct_id
INNER JOIN cust
ON cust.cust_id = acct.acct_id
INNER JOIN active_addr_range
ON active_addr_range.addr_id = acct_offer_prod.addr_id
INNER JOIN addr_range_grouping
ON addr_range_grouping.range_id = active_addr_range.range_id
INNER JOIN vw_bdry_levels
ON vw_bdry_levels.bdry_id = addr_range_grouping.bdry_id
WHERE vw_bdry_levels.metro_or_isla = 'HD ISLA'
AND vw_bdry_levels.region_short = 'REG-R'
AND vw_bdry_levels.district_short = '037'
AND active_addr_range.eff_dt <= CURRENT_TIMESTAMP
AND ( active_addr_range.exp_dt >= CURRENT_TIMESTAMP
OR active_addr_range.exp_dt IS NULL)
AND addr_range_grouping.eff_dt <= CURRENT_TIMESTAMP
AND ( addr_range_grouping.exp_dt >= CURRENT_TIMESTAMP
OR addr_range_grouping.exp_dt IS NULL)
AND vw_bdry_levels.eff_dt <= CURRENT_TIMESTAMP
AND ( vw_bdry_levels.exp_dt >= CURRENT_TIMESTAMP
OR vw_bdry_levels.exp_dt is null)
AND acct_offer_prod.del_status_cd = 'NORMAL'
GROUP BY addr.city, addr.secondary_addr
(untested)
Note: for better readability, I would consider rewriting the checks for
effective date / expiry date as
AND CURRENT_TIMESTAMP BETWEEN active_addr_range.eff_dt
AND COALESCE (active_addr_range.exp_dt,
CURRENT_TIMESTAMP)
But of course, readability can be a pretty personal opinion :-)
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||If you simply threw in a Left Join, in the place you described, you would no
t
alter the results. The reason is you have criteria in the where clause on th
e
derived tables. That effectively makes them into inner joins. So, in this ca
se,
the criteria of Del_Status_Cd = 'Normal' would essentially cause the system
to
ignore your left join.
It's not so much the order of the tables as it is how you are applying your
criteria.
Thomas
"VMI" <VMI@.discussions.microsoft.com> wrote in message
news:EAEA96BA-1655-4A4F-AA46-D65AAD0BDBA9@.microsoft.com...
> That's really my doubt: taking the first joining critera from the WHERE
> clause and moving it to the ON clause. Based on your description, it would
> look like the following - but won't it alter the resultset since I've swic
hed
> all the tables around?
> select addr.city, addr.secondary_addr, count(*)
> from
> acct, cust, active_addr_range, vw_bdry_levels, addr_range_grouping, addr
> LEFT JOIN acct_offer_prod on
> (addr.addr_id = acct_offer_prod.addr_id)
> where
> acct_offer_prod.acct_id = acct.acct_id and ...
> ...
> What if my query uses 50 tables and 200 JOINS, would the order of the tabl
es
> and/or JOINS alter the resulting recordset?
> Thanks.
>
> "David Portas" wrote:
>|||Conceptually. each "Join" is a join between only two "Relations", or
"Resultsets". When you have more than two tables in a From Clause, and,
therefore, you have two or more joins. the second "Join" that takes place ca
n
be thought of as a Join between the intermediate resultset created by the
first join, and the third table. So the answer to your question depends on
what order, and what exactly, you wish to Join in this second Join...
With three tables, for example, Two possibilities exist:
You could Join Tables B to A, using Outer Join syntax, and then Join C to
that resultset, also using Outer Join SyntAX...
From TableA
Left Outer Join Table B On .....
Left Outer Join Table C On ......
Or 2) you might be wishing to Join the COmbined Inner Join of Tables B & C
to Table A. In this case you would be joining B & C FIrst, and then Joining
THAT resultset to TableA using Outer Join Syntax
From TableA
Left Outer Join (Table B Join Table C On ....)
On ....
This approach might be used to get ALL Customers, (even the ones with no
Invoices), plus the data from a Invoices and connected Invoice Details table
s
, but only include invoices that have details...
"VMI" wrote:

> First of all, I DON'T want to convert the following query into the SQL92
> equivalent, I just want to learn how to do it.
> Most of the left and right join examples I've found on the Net usually dea
l
> with 2 or three tables at the maximum. What happens if my query uses 7
> tables and only two of those require a left join? The 1st Where clause
> requires the left join; everything else stays the same. That's one of the
> reasons I still haven't switched to SQL92. This is the query that I want t
o
> convert:
> select addr.city, addr.secondary_addr, count(*)
> from
> addr, acct_offer_prod, acct, cust, active_addr_range, vw_bdry_levels,
> addr_range_grouping
> where
> addr.addr_id = acct_offer_prod.addr_id and
> acct_offer_prod.acct_id = acct.acct_id and
> acct.acct_id = cust.cust_id and
> acct_offer_prod.addr_id = active_addr_range.addr_id and
> active_addr_range.range_id = addr_range_grouping.range_id and
> addr_range_grouping.bdry_id = vw_bdry_levels.bdry_id and
> vw_bdry_levels.metro_or_isla like 'HD ISLA' and
> vw_bdry_levels.region_short like 'REG-R' and
> vw_bdry_levels.district_short like '037' and
> acct_offer_prod.addr_id = addr.addr_id and
> active_addr_range.eff_dt <= getdate() and
> (active_addr_range.exp_dt >= getdate() or active_addr_range.exp_dt is null
)
> and
> addr_range_grouping.eff_dt <= getdate() and
> (addr_range_grouping.exp_dt >= getdate() or addr_range_grouping.exp_dt is
> null) and vw_bdry_levels.eff_dt <= getdate() and
> (vw_bdry_levels.exp_dt >= getdate() or vw_bdry_levels.exp_dt is null) and
> acct_offer_prod.del_status_cd = 'NORMAL'
> group by addr.city, addr.secondary_addr
>
>|||>> First of all, let's write it so it's readable. Looking past the
naming convention and the nonsense of putting keywords in all caps, ..
<<
Upon how amny years of research do you base this statement, if any?
I just published a book (SQL PROGRAMMING STYLE) based on 20+ years of
"readability of code" research, some of which I myself did at AIRMICS.
Reserved words in uppercase were shown in *every* study to be easier to
read than code in all-lowercase or all-uppercase. I think in the late
1970's we had about 150+ of them -- it was a popular topic for grad
students (see Ben Schneiderman at U. of Maryland, et al).
Quick lecture.
1) The eye of a Latin alphabet reader is drawn to an uppercase letter;
it is the start of a sentence which models a unit of thought.
Camelcase screws up eye movement. Lowercase does not attract it.
2) All uppercase words are seen as bouma. Bet that you had no idea
what a "bouma" is because you never did research on code readability.
It is "word shape" -- google it.for details. We do not read code the
way we read text. Code is 3D and text is linear.
You are smarter than this! Send me a "snail mail" and I will get you a
copy of SQL PROGRAMMING STYLE.|||1) Why are your using LIKE with patterns that have no wildcards? Do you
like the overhead?
2) Why are you still saying getdate() when you mean CURRENT_TIMESTAMP
? Or do you spit on all standards?
3) Why do you put AND at the end of lines, which only makes the code a
screaming XXXXX to maintain? think about how the eye has to jerk to
the end of the previous line.
4) Here is how OUTER JOINs work in SQL-92. Assume you are given:
Table1 Table2
a b a c
====== ======
1 w 1 r
2 x 2 s
3 y 3 t
4 z
and the outer join expression:
Table1
LEFT OUTER JOIN
Table2
ON Table1.a = Table2.a <== join condition
AND Table2.c = 't'; <== single table condition
We call Table1 the "preserved table" and Table2 the "unpreserved table"
in the query. What I am going to give you is a little different, but
equivalent to the ANSI/ISO standards.
1) We build the CROSS JOIN of the two tables. Scan each row in the
result set.
2) If the predicate tests TRUE for that row, then you keep it. You also
remove all rows derived from it from the CROSS JOIN
3) If the predicate tests FALSE or UNKNOWN for that row, then keep the
columns from the preserved table, convert all the columns from the
unpreserved table to NULLs and remove the duplicates.
So let us execute this by hand:
Let @. = passed the first predicate
Let * = passed the second predicate
Table1 CROSS JOIN Table2
a b a c
=========================
1 w 1 r @.
1 w 2 s
1 w 3 t *
2 x 1 r
2 x 2 s @.
2 x 3 t *
3 y 1 r
3 y 2 s
3 y 3 t @.* <== the TRUE set
4 z 1 r
4 z 2 s
4 z 3 t *
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
3 y 3 t <= only TRUE row
--
1 w NULL NULL Sets of duplicates
1 w NULL NULL
1 w NULL NULL
--
2 x NULL NULL
2 x NULL NULL
2 x NULL NULL
3 y NULL NULL <== derived from the TRUE set - Remove
3 y NULL NULL
--
4 z NULL NULL
4 z NULL NULL
4 z NULL NULL
the final results:
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
1 w NULL NULL
2 x NULL NULL
3 y 3 t
4 z NULL NULL
The basic rule is that every row in the preserved table is represented
in the results in at least one result row.
There are limitations and very serious problems with the extended
equality version of an outer join used in some diseased mutant
products. Consider the two Chris Date tables
Suppliers SupParts
supno supno partno qty
========= ==============
S1 S1 P1 100
S2 S1 P2 250
S3 S2 P1 100
S2 P2 250
and let's do an extended equality outer join like this:
SELECT *
FROM Supplier, SupParts
WHERE Supplier.supno *= SupParts.supno
AND qty < 200;
If I do the outer first, I get:
Suppliers LOJ SupParts
supno supno partno qty
=======================
S1 S1 P1 100
S1 S1 P2 250
S2 S2 P1 100
S2 S2 P2 250
S3 NULL NULL NULL
Then I apply the (qty < 200) predicate and get
Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
Doing it in the opposite order
Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
S3 NULL NULL NULL
Sybase does it one way, Oracle does it the other and Centura (nee
Gupta) lets you pick which one -- the worst of both non-standard
worlds! In SQL-92, you have a choice and can force the order of
execution. Either do the predicates after the join ...
SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
WHERE qty < 200;
.. or do it in the joining:
SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
AND qty < 200;
Another problem is that you cannot show the same table as preserved and
unpreserved in the extended equality version, but it is easy in SQL-92.
For example to find the students who have taken Math 101 and might
have taken Math 102:
SELECT C1.student, C1.math, C2.math
FROM (SELECT * FROM Courses WHERE math = 101) AS C1
LEFT OUTER JOIN
(SELECT * FROM Courses WHERE math = 102) AS C2
ON C1.student = C2.student;|||>> First of all, let's write it so it's readable. Looking past the
> Upon how amny years of research do you base this statement, if any?
I'd say about ten years. That's about how long color coded editors have been
available.

> I just published a book (SQL PROGRAMMING STYLE) based on 20+ years of
> "readability of code" research, some of which I myself did at AIRMICS.
> Reserved words in uppercase were shown in *every* study to be easier to
> read than code in all-lowercase or all-uppercase. I think in the late
> 1970's we had about 150+ of them -- it was a popular topic for grad
> students (see Ben Schneiderman at U. of Maryland, et al).
Really. So if everything is upper case that is easier? Come in to the 21st
century. Do these studies account for high resolution monitors and color cod
ing?
A lot has happened since the 1970's.

> Quick lecture.
> 1) The eye of a Latin alphabet reader is drawn to an uppercase letter;
> it is the start of a sentence which models a unit of thought.
> Camelcase screws up eye movement. Lowercase does not attract it.
Which is why I use pascal casing for keywords not camel casing. Besides, it
would be silly to write sElect * fRom tAbleName wHere cOlumnName > 10.

> 2) All uppercase words are seen as bouma. Bet that you had no idea
> what a "bouma" is because you never did research on code readability.
> It is "word shape" -- google it.for details. We do not read code the
> way we read text. Code is 3D and text is linear.
Admittedly, before you mentioned this, no I did not know what "bouma" meant.
However, after doing some reasearch, I did find this:
" The second key piece of experimental data to support the word shape model
is
that lowercase text is read faster than uppercase text. Woodworth (1938) was
the
first to report this finding in his influential textbook Experimental
Psychology. This finding has been confirmed more recently by Smith (1969) an
d
Fisher (1975). Participants were asked to read comparable passages of text,
half
completely in uppercase text and half presented in standard lowercase text.
In
each study, participants read reliably faster with the lowercase text by a 5
-10%
speed difference. This supports the word shape model because lowercase text
enables unique patterns of ascending, descending, and neutral characters. Wh
en
text is presented in all uppercase, all letters have the same text size and
thus
are more difficult and slower to read. "
From
http://www.microsoft.com/typography...ecognition.aspx

> You are smarter than this! Send me a "snail mail" and I will get you a
> copy of SQL PROGRAMMING STYLE.
Smart people know that they know only a fraction of what there is to know. I
'm
always amenable to learning more and re-evaluating what I think I know. Can
you
get me more than the 20% discount offered by Morgan Kaufmann? To where would
I
send said snail mail request? :->
Thomassqlsql

No comments:

Post a Comment