Skyline Analytics in EXASOL

Data

Skyline Analytics in EXASOL

by Archana Chinnaswamy
//

Skyline is used for attacking problems involving choice. Where there is more than one criterion, no computer can possibly choose the “best” solution. Skyline function helps in assisting decision making that involves more than one good choice. It gives you a shortlist of “good” dataset choices from a high performance database for further analysis.

What if I want to buy a house closer to my office? I’m also more interested in large neighborhoods. In a nutshell, I want to fetch the nearest neighborhoods that aren’t too small and the largest area neighborhood that isn’t too far. The usual query (SQL) to fetch this kind of information requires us to use inner joins, exists function, etc. This not only increases the line of code but also poses performance issues. Using Skyline analytics, it can be achieved in just one line of code. Skyline runs efficiently, works on “big data” volumes of data and many more criteria options rather than just two.

*** SQL ***

select t1.boroname, round(st_distance('POINT(-74.117 40.617)', t1.geom),2) as disto, t1.shape_area

from nyc_taxi_production.nyct t1

where t1.shape_area is not null

and not exists(

                  select t2.boroname, round(st_distance('POINT(-74.117 40.617)', t2.geom),2) as disti, t2.shape_area

                  from nyc_taxi_production.nyct t2

                  where t2.shape_area is not null

                  and (

                                    (local.disti < local.disto and t2.shape_area >= t1.shape_area)

                                    or

                                    (local.disti <= local.disto and t2.shape_area > t1.shape_area)

                                    )

);
*** SKYLINE ***

select boroname, round(st_distance('POINT(-74.117 40.617)', geom),2) as dist, shape_area

from nyc_taxi_production.nyct

where shape_area is not null

preferring low local.dist plus high shape_area;

I used both techniques to execute. Below is the output, along with the time it took to execute the query. You can clearly see that the time taken to execute the Skyline function is less than the time taken to execute the query using SQL function.

Skyline Analytics Demo

Skyline Analytics example

This example involved very few set of possibilities. Just imagine you have a large set of possibilities and there are several ways to assess each possibility; surely Skyline will come in handy.

 

More About the Author

Archana Chinnaswamy

Analytics Consultant
How to Load Data from JSON to EXASOL If you’ve been looking for a way to load JSON data into EXASOL with just a click, then you’ll love this post! As data scientists, most ...
EXASOL – Small Business Edition Many of you must have heard about EXASOL. If not, I hope this blog gives you the motivation to discover it’s capabilities for yourself. ...

See more from this author →

InterWorks uses cookies to allow us to better understand how the site is used. By continuing to use this site, you consent to this policy. Review Policy OK

×

Interworks GmbH
Ratinger Straße 9
40213 Düsseldorf
Germany
Geschäftsführer: Mel Stephenson

Kontaktaufnahme: markus@interworks.eu
Telefon: +49 (0)211 5408 5301

Amtsgericht Düsseldorf HRB 79752
UstldNr: DE 313 353 072

×

Love our blog? You should see our emails. Sign up for our newsletter!