Quantcast
Channel: InterWorks
Viewing all articles
Browse latest Browse all 2697

Skyline Analytics in EXASOL

$
0
0

Skyline is used for attacking problems involving choice. Where there is more than one criterion, no computer can possibly choose the “best” solution. Skyline function helps in assisting decision making that involves more than one good choice. It gives you a shortlist of “good” dataset choices from a high performance database for further analysis.

What if I want to buy a house closer to my office? I’m also more interested in large neighborhoods. In a nutshell, I want to fetch the nearest neighborhoods that aren’t too small and the largest area neighborhood that isn’t too far. The usual query (SQL) to fetch this kind of information requires us to use inner joins, exists function, etc. This not only increases the line of code but also poses performance issues. Using Skyline analytics, it can be achieved in just one line of code. Skyline runs efficiently, works on “big data” volumes of data and many more criteria options rather than just two.

*** SQL ***

select t1.boroname, round(st_distance('POINT(-74.117 40.617)', t1.geom),2) as disto, t1.shape_area

from nyc_taxi_production.nyct t1

where t1.shape_area is not null

and not exists(

                  select t2.boroname, round(st_distance('POINT(-74.117 40.617)', t2.geom),2) as disti, t2.shape_area

                  from nyc_taxi_production.nyct t2

                  where t2.shape_area is not null

                  and (

                                    (local.disti < local.disto and t2.shape_area >= t1.shape_area)

                                    or

                                    (local.disti <= local.disto and t2.shape_area > t1.shape_area)

                                    )

);
*** SKYLINE ***

select boroname, round(st_distance('POINT(-74.117 40.617)', geom),2) as dist, shape_area

from nyc_taxi_production.nyct

where shape_area is not null

preferring low local.dist plus high shape_area;

I used both techniques to execute. Below is the output, along with the time it took to execute the query. You can clearly see that the time taken to execute the Skyline function is less than the time taken to execute the query using SQL function.

Skyline Analytics Demo

Skyline Analytics example

This example involved very few set of possibilities. Just imagine you have a large set of possibilities and there are several ways to assess each possibility; surely Skyline will come in handy.

 


Viewing all articles
Browse latest Browse all 2697

Trending Articles