Tuesday, September 22, 2020

MySQL + WePay MeetUp

Description

Join this virtual meetup featuring WePay and Oracle MySQL! They will review three great talks on MySQL Compatibility Check, Running MySQL on Kubernetes and Scalable Bookkeeping.

Talk 1: MySQL Compatibility Check

Talk 2: Running MySQL on Kubernetes

Talk 3: Scalable Bookkeeping

Date & Time 

Oct 1, 2020 10:00 AM in Pacific Time (US and Canada)

Register here

Wednesday, September 9, 2020

MySQL 5.6 End of Life Is Less Than Six Months Away

     If you are running MySQL 5.6 please note that it will hit support End of Life status in February 2021!  MySQL 5.7 has been Generally Available for almost five years and MySQL 8.0 has been out for over two  years.   You need to upgrade as soon as possible.  

    Upgrading to 5.7 is pretty easy. And I heartily suggest that after upgrading to 5.7 that you also upgrade to 8.0  using the upgrade checker utility.

    MySQL 8.0 has a lot of improvements including CTEs, Windowing Functions, Histograms, Multi-value indexes, upgraded JSON support, a contention aware transaction scheduler. dual passwords, and more.  Plus there are InnoDB Cluster, MySQL Router, the New Shell (mysqlsh) and ReplicaSets. Heck, you no longer have to run the old mysql_upgrade script (often a forgotten step) as the install process now does that for you!

    If you have not started planning to upgrade yet, please put it down on your to-do list.  

    MySQL Community Downloads

Monday, August 31, 2020

Three Bad MySQL Query Types You May Be Writing

    One of the problems with Structured Query Language as opposed to languages like C++, Java, Python, or PHP is that it is hard to tell if a query is good or bad just by looking at it.  A programmer with a few months of coding experience can be shown a snippet of code from a project and have a pretty good chance of telling if that code is good or not.  Some call this a 'code smell'.

    A programmer can tell if the code is legible, understandable, and many times if it performs well often times at just a glance.  Want to do an experiment on this?  Go out to Github and check out a handful of random projects.  Good code often just looks good.

    But SQL is different.  It is mostly a declarative language, telling the computer what it wants rather than how to make what it wants. You can random pull SQL queries off Gitthub or Stackoverflow or Reddit and can not tell if a query is really good or really bad.

    Why?  SQL depends a lot on the underlying structure of the data.  Looking for a phone number? Is it in a column designed to hold the phone number, or buried in a long string of data?  Separate country/area code from the main number?  Is a second phone number kept in different records, different columns, or not stored at all?

    Is there relational logic, foreign keys, and how are they implemented?  Are there keys? What do the indexes really index? Are there any histograms?  These things are not evident by just looking at the query. So you end up using some variant of EXPLAIN along with DESCRIBE TABLE to start to dig into things.  

    Plus SQL has a unique learning curve once you get past the basic UPDATE, SELECT, and DELETE commands.  So it is inevitable that people write poor queries.  Or they turn to Object Relational Mappers to write bad queries for them.  

    So on Sunday when I got an urgent plea for help from a well practiced programmer friend who did not have a lot of SQL skills and a new 'to be updated' project filled with horrible queries.  How horrible? Well, read on!

Bad Query Type Number 1 -- Wild Wildcards

    SELECT * FROM foo WHERE x=y;

    Everyone uses the * wildcard to get all the columns from a table.  But what if you need only a few columns of data and the table is a few hundred columns wide, with a few CHAR(255)s, and a pair of BLOBS.  All that data has to read off disk, transferred into memory, and then drained off a network interfaces into an application.  The program code in question wanted an pair of INTs and a DATE.  Not the several thousand other bits of data being pulled out of the database by the query.

    This query, from a glance at the code, was the original core of the program and the proficient programmer was pretty confident that the original table had only a handful of columns when it was created.  But sadly the table got bloated and the original fast performance is now history.

    So use * as a quick-and-dirty to look at the data but please avoid it in production unless your really, really need all of the data.

Bad Query Type Number 2 -- No JOINS

    One of the early developers left a comment that JOINs did not perform adequately.  So what did they do instead?  They read three different tables into memory and pared down the data within the application.  So beside the problems of pulling a lot of data from the database into application the application gets bloated to perform an operation that database should be doing.  

    So where did the original code get the idea that JOINs do not perform?  Well, on the Internet everything lives forever, especially if it is bad or outdated information.  The early developer was nice enough to date their comments and the first version was built in the MySQL 4.x days.  And there were JOINs back in those  but I do not remember their performance being poor.  But in this case this other  developer was still around the company and was available for a quick phone call. Yes, they admitting writing the code and being warned by some bulletin board (Are there still BBS?)  that performance of JOINs was less than optimum.  While I was rewriting the query the author called back and pointed to a very old post where someone was complaining about needing to join 65 tables when the hard coded limit in MySQL was 64 as the original instigator.

    And they there were other URLS of old, bad admonitions against using JOINS.  Folks, a lot has changed over the 25 year life of MySQL. JOINs work very well, support the relational model that you should be using, and put the work on the DATABASE and not the application.  And if you do search the internet for an answer to a problem please try to avoid coding advice that is old enough to vote or buy adult beverages.

Bad Query Type Number 3 -- Obfuscated Queries

    One query was the equivalent of  this:

SELECT (City.name)
    , (Country.name) 
    FROM City 
    JOIN Country on (City.CountryCode = Country.code) 
    WHERE 
        ((City.CountryCode=('USA')) 
        or ((((City.District IS NULL))))) 
        AND (((City.CountryCode IS NOT NULL))) 
limit 10;

    Wow! That is a lot of parentheses!  Looking at the table description had both of the columns being tested against NULL were both created as not being null-able.  The 'or' and 'and' clauses to the WHERE were not immediately evident of why they were there. Well, after another phone call to the original developer I found out the third or forth code developer was a fan of writing tricky to read code for reasons of job security.  And yes they were eventually fired for writing code that even they could not maintain.

    So do not obfuscate code to keep the next developer from figuring out what you are doing as most times you will be that next developer to work on that code but you will not remember you cute little code trick.   Now the optimizer is smart enough to work out the proper query plan but the MySQL optimizer wants to optimize a query each time it sees it; so writing extra clauses puts more load on the optimizer and degrades performance marginally.  Exercise for you -- See how the MySQL optimizer cleans up the above query against the World database, and the query plan may surprise you.

Wrap Up

    So with a little bit of effort on both our parts, the well practiced programmer was able to get acceptable performance out of their software with a promise to refactor the rest of the queries very quickly.  But in one setting we ran in to what I feel are three of the most common types of bad queries sent to MySQL servers, or other database servers.  


Wednesday, August 26, 2020

Better VIEWs with the WITH CHECK OPTION

     VIEWs have been a handy feature of MySQL for many years but do you know about the WITH CHECK OPTION clause?  The WITH CHECK OPTION clause is used for a updatable views to prohibit the changes to those views that would produce rows which are not included in the defining query.

    VIEW Definition

    Since a lot of SQL novices read my blog, I'd like to start with the definition of a view "Views are stored queries that when invoked produce a result set. A view acts as a virtual table" according the MySQL Manual.

SQL > CREATE VIEW x_city AS 
            SELECT     Name, 
                       CountryCode, 
                       District 
       FROM city;
Query OK, 0 rows affected (0.0358 sec)
SQL > SELECT * FROM x_city LIMIT 4;
+----------------+-------------+----------+
| Name           | CountryCode | District |
+----------------+-------------+----------+
| Kabul          | AFG         | Kabol    |
| Qandahar       | AFG         | Qandahar |
| Herat          | AFG         | Herat    |
| Mazar-e-Sharif | AFG         | Balkh    |
+----------------+-------------+----------+
4 rows in set (0.0200 sec)
 SQL > SELECT Name, District FROM x_city limit 4;
+----------------+----------+
| Name           | District |
+----------------+----------+
| Kabul          | Kabol    |
| Qandahar       | Qandahar |
| Herat          | Herat    |
| Mazar-e-Sharif | Balkh    |
+----------------+----------+

    Views are used to obfuscate column names, capture vital queries for later reuse, and many other reasons. For the most part we can treat it as a table. The above example joins data from two tables but we can treat the view as a separate, different table.  As can be seen above we can select all the columns from the view or a subset of the columns defined in the view. 

Updatable Views


    Some views can be used to specify tables to be updated in data change statements. In the example below we will use a simple one integer column table.
.
 SQL >  create view v_2 as select * from t2;
Query OK, 0 rows affected (0.0072 sec)
 SQL > select * from v_2;
+----+
| id |
+----+
|  1 |
|  3 |
|  5 |
|  7 |
|  9 |
| 11 |
+----+
6 rows in set (0.0009 sec)
 SQL > insert into v_2 (id) values (99);
Query OK, 1 row affected (0.0060 sec)
  SQL > select * from v_2;
+----+
| id |
+----+
|  1 |
|  3 |
|  5 |
|  7 |
|  9 |
| 11 |
| 99 |
+----+


    The v_2 view updated the underlying table t2.  

WITH CHECK OPTION


    The WITH CHECK OPTION  prevents inserts to rows for which the WHERE clause in the select_statement is not true. The view v_2a only retrieves data where the id value is greater than 4. If it is used in an attempt to update a value less than 5 it will raise and error.

SQL > CREATE VIEW v_2a AS 
        SELECT * FROM t2 WHERE id > 4 WITH CHECK OPTION;
Query OK, 0 rows affected (0.0102 sec)
 SQL > SELECT * FROM v_2a;
+----+
| id |
+----+
|  5 |
|  7 |
|  9 |
| 11 |
| 99 |
+----+
5 rows in set (0.0009 sec)
 SQL > insert into v_2a (id) values (2);
ERROR: 1369: CHECK OPTION failed 'demo.v_2a'

    And note the error provides the name of the view and the schema to helpt track it down.

    For more details please refer to the manual page on The View WITH CHECK OPTION Clause





Monday, August 24, 2020

MySQL Comparing INTs and CHARs

     How does MySQL compare INTs with CHARs? I was asked this very question over the weekend and wanted to share that information plus some of the changes in MySQL 8.0.21 in this area.  And yes those changes are pretty big.

Casting

    Computers are good at making comparisons of two values but only if everything else the same.  Comparing an integer with another integer is simple.  Same data with same data type comparisons are a good thing.  But what about when you need to compare a numeric 7 with a "7" where the number is in a string?  In such cases one or both numbers need to be changed into the same basic data type. Imagine your favorite Harry Potter character waving their magic wand and shouting 'accio data' to change two different magical pieces of data into one common data type.  No, Hogwarts was the the reason this conversion is called casting but this 'magic' needs to be made for a good comparison.

    If you read the Optimizer Notes section of the MySQL 8.0.21 Release Notes you will run into a notice that MySQL injects casts to avoid mismatches for numeric and temporal data with string data. The big trick was keeping backward compatibility with previous versions while matching the SQL standard.  Now when the optimizer is comparing numeric and temporal types and the expected data type does not match  it will now add casting operations in the item tree inside expressions and conditions. For instance if you are comparing a YEAR to string they will both be converted to a DOUBLE.

Example

    We have two tables and are comparing an INT to a CHAR.  If we run EXPLAIN ANALYZE of the query we get the details.

explain analyze select * 
from t1 
join t2 on t2.k = t1.k\G

*************************** 1. row ***************************

EXPLAIN: -> Inner hash join (cast(t2.k as double) = cast(t1.k as double))  (cost=4.75 rows=5) (actual time=1.759..1.771 rows=5 loops=1)

    -> Table scan on t1  (cost=0.22 rows=6) (actual time=1.670..1.677 rows=6 loops=1)

    -> Hash

        -> Table scan on t2  (cost=0.75 rows=5) (actual time=0.034..0.043 rows=5 loops=1)

    If we look at the original query we are trying to join two tables where the CHAR t2.k is equal to the INT t1.k.   The magenta highlighted text above shows where both the t2.k and the t1.k columns are cast as doubles.

    Running EXPLAIN without the ANALYZE we can see the query plan's version of the query that has been generated by the optimizer.

explain select * from t1 join t2 on t2.k = t1.k\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5
     filtered: 100
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6
     filtered: 16.666667938232422
        Extra: Using where; Using join buffer (hash join)
2 rows in set, 1 warning (0.0010 sec)
Note (code 1003): /* select#1 */ select `demo`.`t1`.`id` AS `id`,`demo`.`t1`.`k` AS `k`,`demo`.`t2`.`id` AS `id`,`demo`.`t2`.`k` AS `k` from `demo`.`t1` join `demo`.`t2` where (cast(`demo`.`t2`.`k` as double) = cast(`demo`.`t1`.`k` as double))

    We can see that the original query of select * from t1 join t2 on t2.k = t1.k has been rewritten to select `demo`.`t1`.`id` AS `id`,`demo`.`t1`.`k` AS `k`,`demo`.`t2`.`id` AS `id`,`demo`.`t2`.`k` AS `k` from `demo`.`t1` join `demo`.`t2` where (cast(`demo`.`t2`.`k` as double) = cast(`demo`.`t1`.`k` as double)) by the optimizer.  

    I highly recommend looking at the query plan to help understand what the MySQL server needs to do to make your query work.  




Friday, August 21, 2020

What Is The Difference Between GROUP BY and Windowing Functions? PART II

     Last time we covered the basic differences between GROUP BY and Windowing functions. We ended up with this query:

select distinct count(ID) over (partition by District) as count, 
    CountryCode, 
    count(CountryCode) over (partition by District) as '#', 
    District, 
    SUM(Population) over (partition by District)  as 'Pop City',        SUM(Population) over (partition by CountryCode) as 'Pop Country' from city 
where CountryCode IN ('USA','BEL') 
order by CountryCode,District;

    Which I admit is a little messy. We have three windows over rows with two separate qualifiers.  One of the nice things about Windowing Functions is that they clean up so nicely and also make them easier to read. What we need to do is break out the 'over (...) ' sections.  Let us call the 'partition by District' as D_win and partition by CountryCode as C_win.  

select distinct count(ID) over D_win as count, 
    CountryCode, 
    count(CountryCode) over D_win as '#', 
    District, 
    SUM(Population) over D_win  as 'Pop City', 
    SUM(Population) over C_win  as 'Pop Country' 
from city 
where CountryCode IN ('USA','BEL') 
window C_win as (partition by CountryCode),
D_win as (partition by District)
order by CountryCode,District;

    This is much easier to comprehend and will be much easier to modify.  

GROUP BY and ORDER BY Mutually Exclusive No More!

    Now back to GROUP BY.  

    Prior to MySQL 8.0.12 there was something a few folks tripped overyou cannot also use an ORDER BY clause to sort the results. In other words, ROLLUP and ORDER BY were mutually exclusive in MySQL. But this as changed in '12 and you can no send queries like this:

select region, 
    count(name) as '# counties'  
from country 
group by Region with rollup 
order by Region;
+---------------------------+------------+
| region                    | # counties |
+---------------------------+------------+
| NULL                      |        239 |
| Antarctica                |          5 |
| Australia and New Zealand |          5 |
| Baltic Countries          |          3 |
| British Islands           |          2 |
| Caribbean                 |         24 |
| Central Africa            |          9 |
| Central America           |          8 |
| Eastern Africa            |         20 |
| Eastern Asia              |          8 |
| Eastern Europe            |         10 |
| Melanesia                 |          5 |
| Micronesia                |          7 |
| Micronesia/Caribbean      |          1 |
| Middle East               |         18 |
| Nordic Countries          |          7 |
| North America             |          5 |
| Northern Africa           |          7 |
| Polynesia                 |         10 |
| South America             |         14 |
| Southeast Asia            |         11 |
| Southern Africa           |          5 |
| Southern and Central Asia |         14 |
| Southern Europe           |         15 |
| Western Africa            |         17 |
| Western Europe            |          9 |
+---------------------------+------------+


    The reader of the output still has to know that the NULL entry under the region column is the rolled up total. And a lot of casual readers will be confused by having the total on the first line instead of the traditional placement on the last time. Now you can get the total on the bottom by adding DESC to the ORDER BY clause but now the regions are not ascending alpha order.  Dang trade offs.

    One thing to note is the query cost for the GROUP BY version is 264.40  (and yes these timing are relative to my hardware and should only be used for rough comparisions)

    But lets look at a Windowing Function version of this query:

select  distinct Region, 
        count(Name) over (partition by Region) 
from Country 
order by Region;

    The query cost here is 503.40!  Ouch!  The sort cost alone is 239.0! But Windowing Functions let you move the ORDER BY clause.

select    distinct Region, 
          count(Name) over (partition by Region order by Region) 
from Country;

    And the query cost is again 264.40. Yup, the same as the GROUP BY version.

    We do not have the ROLLUP functionality but as you saw in the last blog that it would not be hard to add.  First let us clean up the window definition.

select distinct Region,    
       count(Name) over w1 
from Country 
window 
    w1 as (partition by Region order by Region);

    Then add a new window definition:

select distinct Region, 
       count(Name) over w1, 
       count(Name) over w2 
from Country 
window 
    w1 as (partition by Region order by Region), 
    w2 as ();

    And the query cost is still 264.40.  












Wednesday, August 19, 2020

What Is The Difference Between GROUP BY and Windowing Functions? PART I

    I had an question that made me stop in my tracks -- What Is The Difference Between GROUP BY and Windowing Functions?  At one level there is a big difference and at another they are similar, at least in a few respects.

Definitions

    GROUP BY is described as "The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". Another source says Window functions belong to a type of function known as a ‘set function’, which means a function that applies to a set of rows. The word ‘window’ is used to refer to the set of rows that the function works on.

    So both work on sets of rows. But what are the differences?

    Lets start with a goal and for the first example we want to know how many cities are in each District (or state) in the world.city table where the CountryCode = 'USA'. And lets write one query that uses GROUP BY and one query that uses a Windowing Function.  This query will be expanded to include more data just like queries expanded by the 'hey, this is nice but can you also get me ...' method.

Example Queries

    An example using GROUP BY

select count(id), 
       District 
from city 
where CountryCode='usa' 
group by District;

    And an example using a Windowing Function.  If you are unfamiliar with windowing functions the key word to look for is OVER()

select distinct count(ID) over (partition by District)
    District  
 from city 
 where CountryCode='USA';

    Not too much of a difference. What about the query plan?  Using EXPLAIN FORMAT=JSON reveals the query cost is 102.4 for the GROUPY BY and 276.4 for the Windowing Function.  

 Which Query to Use?!?!

    I have always used the Occum's Razor rule when confronted with two equivalent queries and picked the simpler of the two. But while visually these took look similar the performance of the GROUP BY is appreciably better. 

WITH ROLLUP or OVER()??

    So lets make the query work a little harder.  Now we want a total off all the IDs counted and with the GROUP BY version we add WITH ROLLUP.

select count(id), 
        District 
from city 
where CountryCode='usa' 
group by District WITH ROLLUP;

    And at the very end of the output we see the number 274 (the number of rows) and NULL as the name of the district used for the rollup. And yes you are not only stuck with the district name of NULL but explaining that it is a place holder to those wanting this data.  

    The Windowing Function version's output is not as pretty as it adds a repeating column of '274' to each row of output. The difference is now a selection of count(id) over() to the query.

select distinct count(ID) over (partition by District), 
                count(ID) over()
District 
from city 
where CountryCode='USA';

    A bit of explanation is needed here.  There are now two Windowing Functions being used in this query.  The first count(ID) is partitioned by the District column while the second, with the empty parens, is not partitioned -- it is counting all the records.

Now Add the Population

    Now let's us pretend we like our queries but now we want to add the sum of the Population column for each district.

select count(id), 
        District, 
        sum(Population) as Pop 
       from city 
        where CountryCode='usa' 
        group by District, Population with rollup
                    limit 5;

    Please note that I am limit the output to ten rows to save the length of this blog from the horrendous level.  Here the GROUP BY has prioritized Population over District and the results are not exactly usable. We are getting the population rolled up for each state but it is probably not obvious at first glance.

+-----------+----------+--------+
| count(id) | District | Pop    |
+-----------+----------+--------+
|         1 | Alabama  | 158216 |
|         1 | Alabama  | 198915 |
|         1 | Alabama  | 201568 |
|         1 | Alabama  | 242820 |
|         4 | Alabama  | 801519 |
|         1 | Alaska   | 260283 |
|         1 | Alaska   | 260283 |
|         1 | Arizona  | 108364 |
|         1 | Arizona  | 109697 |
|         1 | Arizona  | 158625 |
+-----------+----------+--------+

  So lets switch the group by to 'GROUP BY Population, District' to prioritize Population. And what we get is not much better. The count of the cities in the individual districts is gone.  We are not getting the population sum by District with the ROLLUP but the output is not what we want.

+-----------+----------------+-------+
| count(id) | District       | Pop   |
+-----------+----------------+-------+
|         1 | South Carolina | 89063 |
|         1 | NULL           | 89063 |
|         1 | California     | 89089 |
|         1 | NULL           | 89089 |
|         1 | Texas          | 89293 |
|         1 | NULL           | 89293 |
|         1 | Illinois       | 89408 |
|         1 | NULL           | 89408 |
|         1 | Wisconsin      | 89447 |
|         1 | NULL           | 89447 |
+-----------+----------------+-------+

       GROUP BY can prioritize what we group the data by but as you see above it is limited by the mechanics of its definition. So we can very quickly outgrow GROUP BY.

Defenestration

    Windowing Function let you have a little more control over the results.  In the follow example we count the number of IDs per each District and sum the population for each District.      

select distinct count(ID) over (partition by District) as count,            District, 
        SUM(Population) over (partition by District)  as 'Pop' 
from city 
where CountryCode='USA' 
limit 10;
+-------+----------------------+----------+
| count | District             | Pop      |
+-------+----------------------+----------+
|     4 | Alabama              |   801519 |
|     1 | Alaska               |   260283 |
|     9 | Arizona              |  3178903 |
|     1 | Arkansas             |   183133 |
|    68 | California           | 16716706 |
|     9 | Colorado             |  1851149 |
|     5 | Connecticut          |   609087 |
|     1 | District of Columbia |   572059 |
|    15 | Florida              |  3151408 |
|     6 | Georgia              |  1148875 |
+-------+----------------------+----------+


    But let us expand the query to include two countries with the goal of getting the per district and per country information included.

select distinct count(ID) over (partition by District) as count
        CountryCode, 
        count(CountryCode) over (partition by District) 
               as 'Cities in District',             
        District, 
        SUM(Population) over (partition by District)  as 'Pop' 
from city 
where CountryCode IN ('USA','BEL') 
order by CountryCode,District 
limit 10;
+-------+-------------+--------------------+---------------+---------+
| count | CountryCode | Cities in District | District      | Pop     |
+-------+-------------+--------------------+---------------+---------+
|     1 | BEL         |                  1 | Antwerpen     |  446525 |
|     2 | BEL         |                  2 | Bryssel       |  239551 |
|     1 | BEL         |                  1 | East Flanderi |  224180 |
|     2 | BEL         |                  2 | Hainaut       |  291762 |
|     1 | BEL         |                  1 | Liège        |  185639 |
|     1 | BEL         |                  1 | Namur         |  105419 |
|     1 | BEL         |                  1 | West Flanderi |  116246 |
|     4 | USA         |                  4 | Alabama       |  801519 |
|     1 | USA         |                  1 | Alaska        |  260283 |
|     9 | USA         |                  9 | Arizona       | 3178903 |
+-------+-------------+--------------------+---------------+---------+

    Now we are partitioning by District over three separate pieces of data.  If we need to add another column for the sum of the population of each country, it is simple addition to the query. We do have to specify a new partitioning factor as now we need to group the rows for the new calculation accordingly.  Unlike using GROUP BY, there is no ambiguity on what is being prioritized. And we are getting the data the way we want.

select distinct count(ID) over (partition by District) as count,            CountryCode,     
        count(CountryCode) over (partition by District) as '#',            District, 
        SUM(Population) over (partition by District)  as 'Pop City',         SUM(Population) over (partition by CountryCode) 
            as 'Pop Country' 
from city 
where CountryCode IN ('USA','BEL') 
order by CountryCode,District 
limit 10;
+-------+-------------+---+---------------+----------+-------------+
| count | CountryCode | # | District      | Pop City | Pop Country |
+-------+-------------+---+---------------+----------+-------------+
|     1 | BEL         | 1 | Antwerpen     |   446525 |     1609322 |
|     2 | BEL         | 2 | Bryssel       |   239551 |     1609322 |
|     1 | BEL         | 1 | East Flanderi |   224180 |     1609322 |
|     2 | BEL         | 2 | Hainaut       |   291762 |     1609322 |
|     1 | BEL         | 1 | Liège        |   185639 |     1609322 |
|     1 | BEL         | 1 | Namur         |   105419 |     1609322 |
|     1 | BEL         | 1 | West Flanderi |   116246 |     1609322 |
|     4 | USA         | 4 | Alabama       |   801519 |    78625774 |
|     1 | USA         | 1 | Alaska        |   260283 |    78625774 |
|     9 | USA         | 9 | Arizona       |  3178903 |    78625774 |
+-------+-------------+---+---------------+----------+-------------+

    This is just scratching the surface of how and when to use Windowing Functions over GROUP BY and just skims the surface of the power of Windowing Functions (and yes, you can neaten things up from the examples). I hope to show in a future post some more of this power.