Quantcast
Channel: InterWorks
Viewing all articles
Browse latest Browse all 2698

Musings on Tableau Custom SQL connections through the JET driver

$
0
0

I’m en route to San Jose for some work with a new client then a weekend in Salt Lake City for a few laps on the Great White Icicle! I usually try to have some kind of grander plan for my blogs, but this one is simply designed to save you some time if you ever need to write some Custom SQL against the JET driver.

I’ve been musing whether to condemn the JET driver for its perplexing syntax and lack of simple functions or laud it for providing much of the functionality found in real databases when connecting to something as simple as an Excel document. Ultimately, I’ve settled on grudging appreciation for the work Excel is able to do as a stand-in for a true SQL DB. As JET is the (Slightly Frustrating) way that happens, here are a few Tips and Tricks that may be time savers the day you need to do them. This isn’t intended to be a comprehensive list, these are simply the Gotcha!’s that I’ve had to work around this week alone.

SubQuery Syntax

When writing a join that requires a sub-query, JET requires a somewhat strange syntax. Imagine I have two tables, each with a different metric and want to join their aggregates based on a shared dimension – let’s not worry with why I’m doing this particular query, I want to keep the example simple and the syntax is the point.

ColumnType
RegionNvarchar(56)
Metric OneFloat
ColumnType
RegionNvarchar(56)
Metric TwoFloat

 

SELECT
QUERYONE.[Region],
QUERYONE.[Metric One Total],
QUERYTWO.[Region_two], 
QUERYTWO.[Metric Two Total]
FROM(
SELECT [Table1$].[Region] AS [Region],
Sum([Table1$].[Metric One]) AS [Metric One Total]
FROM [Table1$]
GROUP BY [Table1$].[Region]
) [QUERYONE]
LEFT JOIN
(
SELECT [Table2$].[Region] AS [Region_two],
Sum([Table2$].[Metric Two]) AS [Metric Two Total]
FROM [Table2$]
GROUP BY [Table2$].[Region]
) AS [QUERYTWO]
ON QUERYONE.[Region]=QUERYTWO.[Region_two]

Note the interesting token assignments for the sub-queries – each requires square brackets on the token outside of the sub-query itself. Failing to follow this syntax and using code you’ve seen in other ANSI SQL sources will result in a myriad of errors like “Syntax Error in FROM Clause” or “Syntax Error in JOIN Clause”. You’ll see that I actually used a slightly different token assignment on the second sub-query, including AS in there – as long as the assignor has square brackets, we’re fine. Note also that I had to alias the Region field in the second query to “Region_two” to avoid the dreaded “Duplicate output alias ‘Region’” error – Tableau and JET won’t support two identically named fields, even from two different tables.

No Cross Join

Let’s build on that example and consider a real-life problem I may want to solve using Custom SQL (yes there are some other solutions including Blending, Table-Calculations and Pass-Through Calculated fields, but we’re focusing on SQL in JET today). Imagine I want to create a % of Total field where I look at one Region as a percent of the total across regions and wanted to achieve that using Custom SQL to get the grand total on all lines. I could easily achieve this using a Cross Join in most databases (Cross Join will essentially create a Cartesian Product), but JET doesn’t support a Cross Join so we need a workaround.

Here’s a cross Join that would solve the problem in another DB:

 SELECT 
ORDERS.[Region],
ORDERS.[Sales],
SUM.[Total Sales]
FROM
(SELECT
[Orders$].[Region] AS [Region],
[Orders$].[Sales] AS [Sales]
FROM [Orders$]) [ORDERS]
CROSS JOIN
(SELECT
sum(Sales) AS [Total Sales]
FROM [Orders$]
) [SUM]

But, alas, no CROSS JOIN. So, I might try the old 1=1 hack job, which essentially joins on TRUE (so achieves the same result in many DB’s)

 SELECT 
ORDERS.[Region],
ORDERS.[Sales],
SUM.[Total Sales]
FROM
(SELECT
[Orders$].[Region] AS [Region],
[Orders$].[Sales] AS [Sales]
FROM [Orders$] ) [ORDERS]
LEFT JOIN
(SELECT
sum(Sales) AS [Total Sales]
FROM [Orders$] ) [SUM]
ON 1=1

But, alas, that’s not allowed in JET either - “Join expression not supported”. Sigh. So, I typically end materializing a rubbish truth column in both sub-queries and joining on that extra dimension.

 SELECT 
ORDERS.[Region],
ORDERS.[Sales],
SUM.[Total Sales],
ORDERS.[One],
SUM.[One_]
FROM
(SELECT
[Orders$].[Region] AS [Region],
[Orders$].[Sales] AS [Sales],
1 AS [One]       
FROM [Orders$]
) [ORDERS]
LEFT JOIN
(SELECT
sum(Sales) AS [Total Sales],
1 AS [One_]
from [Orders$]
) [SUM]
ON [ORDERS].[One] = [SUM].[One_]

“Not Pretty” you say? I agree, but it does work, and I now have a usable dataset.

Nesting SubQueries

Both of the examples above used two tables and two sub-queries. What if I had three? I know many of you are used to writing pretty SQL like:

SELECT * FROM A,B,C
JOIN B ON A.Field=B.Field
JOIN C ON A.Field=C.Field

Not in JET you’re not. If two queries necessitated the form:

 SELECT A.Columns,  B.Columns (no * here)
FROM
(SELECT Columns FROM A) AS [A]
LEFT JOIN
(SELECT Columns FROM B) AS [B]
ON A.Key=B.Key

Which I would abbreviate as (A join B), then adding a third table will take the form ((A join B) join C), or in longerhand:

SELECT AB.Columns, C.Columns FROM
(
SELECT A.Columns, B.Columns (no * here)
FROM(
SELECT Columns FROM A) AS [A]
LEFT JOIN
(SELECT Columns FROM B) AS [B]
ON A.Key=B.Key
) AS [AB]
LEFT JOIN
( SELECT C.Columns FROM C
) AS [C]
ON [AB].Key=[C].Key

That’s it for now – don’t even get me started on FULL OUTER joins or some other subtleties. My flight is going to land soon and I’m looking forward to some dinner. Hopefully this will save you a few moments of headache. I’ve included an example of each in the attached TWBX workbook.

AttachmentSize
Custom-SQL-Blog.twbx14.88 KB

Viewing all articles
Browse latest Browse all 2698

Trending Articles