Sunday, July 29, 2018

A Kind Introduction MySQL Windowing Functions Part I


Windowing functions are a critical tool for grouping rows of data that are related to other rows. But they go far beyond the regular aggregate functions found in MySQL 5.7 and earlier. In MySQL 8 you do not have to collapse all the information down into a single output row. Each row can retain its individual identity but the server can analyze the data as a unit.

Statistics and Damned Lies

Finding the total Population of the District Texas from the world.city table is simple. 

SQL> select District, sum(Population)  
from city where district = 'Texas';
+----------+-----------------+
| District | sum(Population) |
+----------+-----------------+
| Texas    |         9208281 |
+----------+-----------------+
1 row in set (0.0068 sec)






 Simple.  But try to expand it to the entire USA and you get problems.

SQL> select District, sum(Population)  
from city where CountryCode = 'USA';
+----------+-----------------+
| District | sum(Population) |
+----------+-----------------+
| New York |        78625774 |
+----------+-----------------+
1 row in set (0.0046 sec)


The results only give out the results for all the cities and lumps them under New York. This is not the desired answer. By the way the only time New York (and New York city in particular) has 78 million people is when they are all on the road in front of me when I am trying to take a taxi to an airport.

With a windowing function it is easy to iterate over a subset of the entire data.  Imagine reading the data through a page with a section cut out to form a window that is just the right size to read only the group of rows desired!

SQL> select name, District, Population, sum(Population) over() as p0, 
     District, sum(population) over( partition by District) as p1
     from city where CountryCode = 'USA' limit 10;
+------------+----------+------------+----------+----------+---------+
| name       | District | Population | p0       | District | p1      |
+------------+----------+------------+----------+----------+---------+
| Birmingham | Alabama  |     242820 | 78625774 | Alabama  |  801519 |
| Montgomery | Alabama  |     201568 | 78625774 | Alabama  |  801519 |
| Mobile     | Alabama  |     198915 | 78625774 | Alabama  |  801519 |
| Huntsville | Alabama  |     158216 | 78625774 | Alabama  |  801519 |
| Anchorage  | Alaska   |     260283 | 78625774 | Alaska   |  260283 |
| Phoenix    | Arizona  |    1321045 | 78625774 | Arizona  | 3178903 |
| Tucson     | Arizona  |     486699 | 78625774 | Arizona  | 3178903 |
| Mesa       | Arizona  |     396375 | 78625774 | Arizona  | 3178903 |
| Glendale   | Arizona  |     218812 | 78625774 | Arizona  | 3178903 |
| Scottsdale | Arizona  |     202705 | 78625774 | Arizona  | 3178903 |
+------------+----------+------------+----------+----------+---------+
10 rows in set (0.0075 sec)


The above query has two windows.  The keyword to notice is OVER().  The window defined is OVER() with nothing within the parenthesis as this should be under stood to mean 'the widow is open wide enough to see all the data'. So sum(Population) over() as p0 will give us the sum of all the Population columns and name the column p0.

The second window is defined as sum(population) over (partition by District) as p1 will provide all of the Population of each district summed in a column named p1.  

Indianapolis 500 winning Lotus driven by Jimmy Clark. Before this car all other winners of the race had the engine in front of the driver.  What does this have to do with Windowing Functions, databases, SQL, or anything else in this blog post. Well, nothing but while at Detroit PHP I stopped by the Henry Ford Museum and took this picture of a very icon car that changed Indy forever.  Windowing Functions can change your SQL skills.

Different Types of Windows

The OVER clause has two forms - window_spec and window_name.  The window_spec option declares the specified window with the parenthesis.  While window_name is a window defined elsewhere in the query.  The send window in the above query  (where the output is named p1) is a window_spec.

So here is a window_name example:

SQL> SELECT District, Sum(Population) OVER w 
    FROM city where CountryCode ='USA' 
    WINDOW w AS (partition by District) limit 10;
+----------+------------------------+
| District | Sum(Population) OVER w |
+----------+------------------------+
| Alabama  |                 801519 |
| Alabama  |                 801519 |
| Alabama  |                 801519 |
| Alabama  |                 801519 |
| Alaska   |                 260283 |
| Arizona  |                3178903 |
| Arizona  |                3178903 |
| Arizona  |                3178903 |
| Arizona  |                3178903 |
| Arizona  |                3178903 |
+----------+------------------------+
10 rows in set (0.0032 sec)


The window was given the name w and then defined as WINDOW w AS (partition by District).   By the way the declaration within the window_name is itself as window_spec.

So what is a window_spec??

 

The definition from the manual (https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html) informs that a window_spec is defined as:

window_spec: [window_name] [partition_clause] [order_clause] [frame_clause]

The window_name is an alias that can be used elsewhere in the query.

The partition_clause is how the data is to be divided up into groups. If this is unspecified it makes one big group.  

The order_clause provides the sort_order. Remember the ORDER BY from regular SQL queries?  This is how you can order the groups.

And the frame_clause determines sub groups within the big group.

And as it so often happens, there is a lot of material to be covered under the frame_clause and that will be in a future blog.





2 comments:

  1. My Assignment Services provides a 24-hour online Assignment Help and consultation to the students. Be it any subject such as Nursing, Economics, Law, Engineering, or Management, we provide the most reliable help with assignment online by our highly-proficient academic writers. My Assignment Services constantly aim to expand our base of assignment writing experts and call in international experts who are ex-professors from reputed business schools, management schools, engineering universities from across the globe. This provides you with an opportunity to get a global and world perspective in your Assignment Help India answers and lets you connect with a writer who understands you. This company has been trusted by thousands of students in Australia for their incredible help with assignment that are provided to students worldwide. Join these thousands of students and achieve high distinction in each and every one of your college tasks. We are proud of our best assignment help experts because of their dedication towards providing continuous support to students by helping them meet deadlines and scoring better grades. We understand how important academic assessments are in developing a student's career and future opportunities, this is why we take extreme measures to ensure that all Java Assignment Help solutions are best-in-class.

    ReplyDelete
  2. The information you shared was useful. Thank you for taking the time to organize it. blockchain development company has altered pretty radically over the last couple of years. What happening as a means to power up an efficient payment network has transformed into a reorganized processer where developers from all everywhere the world can come and create their own software applications. ADe Technologies offers blockchain development service in an efficient way. Please feel free to call us on +1 800-912-1219 or contact by email (contact@ade-technologies.com), if you require any additional information. Please visit our website www.ade-technologies.com

    ReplyDelete