Monday, January 30, 2017

PHP and MySQL Basics IV -- SQL Injection and Prepared Statements

SQL Injection is a highly feared and often misunderstood problem. The basic phobia is that someone hijacks your SQL request and suddenly has full access to everything in your database. Well, it usually is not that easy and it is actually easy to avoid.

Rule 1: Never Trust User Supplied Data

The usual example is a query such as SELECT * FROM customer_data WHERE customer_id=$id, where the programmer was expecting an integer for the customer_id. But a dastardly user inserts some horrible SQL code to pirate the information, so the query looks like SELECT * FROM customer_data WHERE customer_id=1 OR customer_id > 0 and suddenly all your customer data is out free in the universe waiting for who knows what.

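The code could have checked whether the value of customer_id was truly an integer and returned an error if not. The is_int function checks the type of a variable. Be aware that values arriving from a form or a URL are strings, so is_int returns false for them even when they contain only digits; filter_var with FILTER_VALIDATE_INT is the usual check for that kind of input.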

if (is_int($customer_id)) {
  // Do all the stuff we want to do if we have an integer
  // submitted for a customer_id
} else {
  echo "Hey! I want an INTEGER for a customer identification number!";
}

Even more dastardly is this example from the PHP Manual, where the injector resets all the passwords:

But a malicious user submits the value ' or uid like '%admin% to $uid to change the admin's password, or simply sets $pwd to hehehe', trusted=100, admin='yes to gain more privileges. Then, the query will be twisted:

<?php

// $uid: ' or uid like '%admin%
$query = "UPDATE usertable SET pwd='...' WHERE uid='' or uid like '%admin%';";
// $pwd: hehehe', trusted=100, admin='yes
$query = "UPDATE usertable SET pwd='hehehe', trusted=100, admin='yes' WHERE
...;";
?>

Paranoid yet? This is another case where checking the values for reasonableness can save grief. Is the uid an integer? Is that integer in a proper range for uids? If it is too low or too high, you need to suspect someone is doing something bad.
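
A minimal sketch of that reasonableness check, written in Python (the language used in the Document Store post on this blog) so it runs without a web server. The function name parse_uid and the 1 to 100000 range are assumptions for illustration; use whatever bounds fit your own uid column.

```python
def parse_uid(raw, low=1, high=100000):
    """Return raw as an int if it is a plain integer within [low, high], else None.

    The name and the default bounds are hypothetical, chosen for illustration.
    """
    text = str(raw).strip()
    # Accept only ASCII digits: no signs, quotes, or SQL fragments.
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    if value < low or value > high:
        return None
    return value


for candidate in ("42", "0", "999999999", "' or uid like '%admin%"):
    print(repr(candidate), "->", parse_uid(candidate))
```

Returning None rather than a "cleaned up" number is deliberate: a value that fails the check is rejected outright instead of being repaired and used.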

On the database side

One thing I have recommended and implemented for years is separate users for SELECTs and INSERT/UPDATE/DELETEs. It is too easy to use one connection string over and over. But if you are in a situation where you need to take 'loose' information from a user and use that information against your database, you need to split the queries. Set up a cust_read account on the MySQL server that can read customer data ONLY. Then set up a corresponding cust_mod account for all the other queries. This way you are assured that nobody can delete or update records through the read account if things get compromised. Also, do not use superuser accounts like root for production database interactions -- save these accounts for maintenance functions.

Bound variables

The MySQL server supports anonymous, positional placeholders written as ?. Again from the PHP Manual:

/* Prepared statement, stage 1: prepare */
if (!($stmt = $mysqli->prepare("INSERT INTO test(id) VALUES (?)"))) {
    echo "Prepare failed: (" . $mysqli->errno . ") " . $mysqli->error;
}

Or as used in a SELECT:

/* Prepared statement, stage 1: prepare */
if (!($stmt = $mysqli->prepare("SELECT * FROM test WHERE id = ?"))) {
    echo "Prepare failed: (" . $mysqli->errno . ") " . $mysqli->error;
}
/* Prepared statement, stage 2: bind and execute */
$id = 1;
if (!$stmt->bind_param("i", $id)) {
    echo "Binding parameters failed: (" . $stmt->errno . ") " . $stmt->error;
}

if (!$stmt->execute()) {
    echo "Execute failed: (" . $stmt->errno . ") " . $stmt->error;
} else {
    echo "Got it!!!\n";
}
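
To see why a bound value cannot rewrite the query, here is a small sketch in Python using the standard sqlite3 module as a stand-in for MySQL, so it runs without a server. SQLite also uses ? as its positional placeholder. The customer_data rows and the find_customer helper are made up for illustration.

```python
import sqlite3


def find_customer(conn, customer_id):
    """Look up rows with the id bound as a parameter, never pasted into the SQL text."""
    cur = conn.execute(
        "SELECT customer_id, name FROM customer_data WHERE customer_id = ?",
        (customer_id,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_data (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customer_data VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob")],
)

# A legitimate id matches exactly one row.
print(find_customer(conn, 1))

# The injection attempt is compared as a single value, so the OR never runs as SQL.
print(find_customer(conn, "1 OR customer_id > 0"))

conn.close()
```

SQLite and MySQL compare a string against an integer column differently, so the exact rows returned for the second call may differ between the two. What carries over is that the bound value is treated as data and is never parsed as part of the statement.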

Prepared Statements - Help But Not A Cure

Prepared statements make it nearly impossible to cram extra SQL into one variable, because the bound value travels separately from the statement text. Note that was nearly impossible, not impossible.

Take the extra time to make sure that integers are really integers and that they are in the correct range of numbers. If you are expecting a string of up to, say, 30 characters, make sure you do something when that length is overrun. MySQL used to get a lot of flak from people running servers in less than strict mode and having excess data truncated with only a warning generated. Now sometimes that data is valuable. But if your corporate standard is to store emails in 45 characters and the user is trying to store 50, you need to programmatically warn those with long emails that their data is too long (and warn the Ops folks that email to the account is going to bounce if you do accept the truncated address).
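
One way to do that length check before the data ever reaches the server, sketched in Python. The 45-character limit comes from the example above; the function name check_email_length and the message wording are assumptions.

```python
MAX_EMAIL_LENGTH = 45  # the corporate standard used in the example above


def check_email_length(email, limit=MAX_EMAIL_LENGTH):
    """Return (ok, message); refuse over-long values instead of letting them truncate."""
    length = len(email)
    if length > limit:
        return False, "Email is {} characters; the limit is {}.".format(length, limit)
    return True, "ok"


# 38 + 12 = 50 characters, five over the limit.
ok, message = check_email_length("a" * 38 + "@example.com")
print(ok, message)
```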

RTFM

The PHP Manual's section on SQL Injection is a must read. Do use separate MySQL accounts for SELECTs and INSERT/UPDATE/DELETEs. Keep asking yourself how to keep from exposing more data than the absolute minimum needed. Do not use SELECT * FROM foo but instead explicitly name the columns in your select statement. If something goes wrong, you are not exposing data columns that may hold private information. And be paranoid -- keep asking if there is something else you can do to protect the data.

Wednesday, January 18, 2017

Python and the MySQL Document Store

The MySQL Document Store and X DevAPI have a lot of very interesting features but right now my programming language of choice, PHP, is not yet supported. My Python is rusty and learning Node.js is progressing. But the ability to search data from a database without knowing Structured Query Language (SQL) is going to appeal to many.

Example One

import mysqlx

# Connect over the X Protocol (port 33060 by default)
session = mysqlx.get_session({
    'host': 'localhost',
    'port': 33060,
    'user': 'dstokes',
    'password': 'Hell0Dave!'})

schema = session.get_schema('world_x')
collection = schema.get_collection('countryinfo')

# Fetch any three documents from the collection
print("Find three records***\n")
result = collection.find().limit(3).execute()
docs = result.fetch_all()

for i, data in enumerate(docs):
    print("{iteration}: {data}".format(iteration=i, data=data))

# Fetch the document whose _id is USA
print("Find USA***\n")
result = collection.find('_id = "USA"').execute()
row = result.fetch_all()

for i, data in enumerate(row):
    print("{iteration}: {data}".format(iteration=i, data=data))

session.close()

Is result = collection.find().limit(3).execute() simpler than SELECT * FROM countryinfo LIMIT 3;? Maybe not. But the second query, result = collection.find('_id = "USA"').execute(), is likely a better example. Put it next to SELECT * FROM countryinfo WHERE _id = 'USA'; and you can start to see the value of the document store.

Variables can be bound to queries:

country = "Mexico"
result = collection.find('Name = :param').bind('param', country).execute()

Or an expanded version:

for country in ("Mexico", "Canada", "Brazil"):
    result = collection.find('Name = :param').bind('param', country).execute()

This is closer to what most programming languages teach today. SQL is a declarative language. The differences between object-oriented/procedural languages and declarative ones are subtle. But the subtlety escapes novices (and is often ignored by those at higher skill levels).

Of course it would be more efficient to make one dive into the database with a
SELECT * FROM countryinfo WHERE Name = 'Mexico' OR Name = 'Canada' OR Name = 'Brazil'
or
SELECT * FROM countryinfo WHERE Name IN ('Mexico', 'Canada', 'Brazil')

But those are not going to work.

Remember the data we are peeking at is in a JSON column.

So we would use
SELECT * FROM countryinfo WHERE doc->"$.Name" = "Mexico" OR doc->"$.Name" = "Canada" OR doc->"$.Name" = "Brazil"
or
SELECT * FROM countryinfo WHERE doc->"$.Name" IN ("Canada","Mexico","Brazil")
(note the use of the -> shorthand in place of JSON_EXTRACT)

That gets much more complicated. And complication can introduce errors. Novices should have few problems if they keep working in their chosen language without having to make a mental context switch to write some SQL. And these days very few are being taught SQL.

Friday, January 13, 2017

PHP and MySQL Basics III -- Resulting Results

In the first two blog entries in this series we set up a connection to MySQL and sent off a query. Now we need to get the data back from the database and into the application.

An Embarrassment of Riches

PHP has many options for what we want to do. But the best place to start is checking that rows were actually returned from a query. Below, the results from a query are returned to a variable named $result. We can find out how many rows were returned from the server by examining $result->num_rows.

if (!$result = $mysqli->query($sql)) {
    
    // Again, do not do this on a public site, but we'll show you how
    // to get the error information
    echo "Error: Our query failed to execute and here is why: \n";
    echo "Query: " . $sql . "\n";
    echo "Errno: " . $mysqli->errno . "\n";
    echo "Error: " . $mysqli->error . "\n";
    exit;
}

// succeeded, but do we have a result?
if ($result->num_rows === 0) {
    // Oh, no rows! Sometimes that's expected and okay, sometimes
    // it is not. You decide.
    echo "No data returned.";
    exit;
}

This is a case where a programmer needs to know their data. In some cases you will not have a record or records returned because there is no data. Other times no data returned is a sign of big problems. So you have to have some education on what you expect back, and what you do not expect back.

Example

<?php
$mysqli = new mysqli("localhost", "root", "hidave", "world_x");

/* check connection */
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}

/* Select queries return a resultset */
$query="SELECT Name, CountryCode, District FROM city LIMIT 10";

if ($result = $mysqli->query($query)) {

        if ($result->num_rows){
                printf("Select returned %d rows.\n", $result->num_rows);

                /* free result set */
                $result->close();
        } else {
                echo "No data returned";
        }
} else {   // if ($result)
   printf("Query failed: %s", $mysqli->error);
}

$mysqli->close();
?>

Sometimes you just need the number of records, like the number of outstanding customer orders. But in this case we are making sure we have some data to work with before proceeding.

So Now We Have Data

Now you have at least three choices -- rare, medium, or well done. Err, make that an associative array, an array, or an object. Each has its uses and it is okay to have a favorite you use more.

$query="SELECT Name, CountryCode, District FROM city LIMIT 10";

if ($result = $mysqli->query($query)) {

        if ($result->num_rows){
                printf("Select returned %d rows.\n", $result->num_rows);
                $assoc = $result->fetch_assoc();
                $row = $result->fetch_row();
                $obj = $result->fetch_object();
        } else {
                echo "No data returned";
        }
} else {   // if ($result)
   printf("Query failed: %s", $mysqli->error);
}

So you make your choice of method and take the results. Here we use fetch_assoc(), fetch_row(), and fetch_object() one after another, so each call picks up the next row of the result. Depending on how you want to refer to the data, you use the one that fits the situation. Of course they are similar in use.

//associated array keys = column name, data = data from DB
printf("Sample assoc array %s -> %s\n", $assoc['Name'], $assoc['CountryCode']);

// simple row
printf("Sample row array %s -> %s\n", $row[0], $row[1]);

//object
printf("Sample object %s -> %s\n", $obj->Name, $obj->CountryCode);

Yes, you need to know all three as you will be looking at old code or someone else's code that does not use your favorite. And sometimes you may need an object rather than a simple row.

Full Listing

<?php
$mysqli = new mysqli("localhost", "root", "hidave", "world_x");

/* check connection */
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}

/* Select queries return a resultset */
$query="SELECT Name, CountryCode, District FROM city LIMIT 10";

if ($result = $mysqli->query($query)) {

        if ($result->num_rows){
                printf("Select returned %d rows.\n", $result->num_rows);
                $assoc = $result->fetch_assoc();
                $row = $result->fetch_row();
                $obj = $result->fetch_object();
        } else {
                echo "No data returned";
        }
} else {   // if ($result)
   printf("Query failed: %s", $mysqli->error);
}
//associated array keys = column name, data = data from DB
printf("Sample assoc array %s -> %s\n", $assoc['Name'], $assoc['CountryCode']);

// simple row
printf("Sample row array %s -> %s\n", $row[0], $row[1]);

//object
printf("Sample object %s -> %s\n", $obj->Name, $obj->CountryCode);

$result->close();
$mysqli->close();
?>

Wednesday, January 11, 2017

PHP and MySQL Basics II - Case Sense

Last time we set up a connection from a PHP program to a MySQL server. This time we will progress a little further in that direction.

Query

We ask the MySQL server for data with a query written in a language named Structured Query Language (SQL). Now that we have a connection open to the server, we can pass our request to the server.

Manual Labor

The PHP Manual is wonderful 99% of the time. If you take a peek at the page for mysqli::query there is a great example of a simple query. Many of us learned to program by copying/pasting from books/manuals and this is a great use of the examples in the PHP manual. Except it may not work for you.

MySQL is usually case SeNsItIvE about database and table names, so 'A' may not be the same thing as 'a'. But this depends to some extent on your operating system; on some systems 'A' = 'a'. I was using the example from the manual and ... it did not work.

What Happened

Here is an excerpt of the code, somewhat cut down:
<?php
$mysqli = new mysqli("localhost", "user", "secret", "world_x");

/* check connection */
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}

$mysqli->close();
?>

Run the program and ... nothing.

So What Happened?

What happened is a subtle problem that novices will smack into very hard. Take a look at this section of the example.
/* Select queries return a resultset */
if ($result = $mysqli->query("SELECT Name FROM City LIMIT 10")) {
    printf("Select returned %d rows.\n", $result->num_rows);

    /* free result set */
    $result->close();
}

If you try the query SELECT Name FROM City LIMIT 10; with the MySQL command line client program you will get the answer. And the answer is:

mysql> SELECT Name FROM City LIMIT 10;
ERROR 1146 (42S02): Table 'world_x.City' doesn't exist
mysql>

I am using the new world_x example database where city is NOT capitalized, instead of the old world database where it is! This lesson can be summed up as: check your schema/table/column names for case sensitivity. Except that there is another problem here.

In the real world, occasionally the database/table/column that you carefully double checked was spelled correctly and with the proper case will go away. It may have been renamed, deleted, munged, or what have you. What is needed is a way to check for an error when the query can not run.

Let's change the code slightly:

/* Select queries return a resultset */
if ($result = $mysqli->query("SELECT Name FROM City LIMIT 10")) {
    printf("Select returned %d rows.\n", $result->num_rows);

    /* free result set */
    $result->close();
} else {
    printf("Query failed: %s\n", $mysqli->error);
}

Always Check for Return Codes

By simply adding about 40 characters, the reliability of the program shoots up immensely AND we get an exact answer of what went wrong.

Query failed: Table 'world_x.City' doesn't exist

Same error as when we tried the query by hand. But now our code can handle this issue. We could even try to catch the error, send a note via a message queue to the operations staff about the nature of the problem, and possibly limp along until things are resolved. Or we could just call exit().

When you are offered a return code, be sure to check it. Yes, it may be overkill in simple examples. But the payoff comes when things go bad and you are scrambling to find out eleven months from now why your program is suddenly not working.
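
The same habit carries over to other languages. In this Python sketch the standard sqlite3 module stands in for MySQL, and the error arrives as an exception rather than a false return value. The run_query helper and the table names are assumptions for illustration.

```python
import sqlite3


def run_query(conn, sql):
    """Return (rows, error): rows on success, or None plus the error text on failure."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        # Report the failure to the caller instead of carrying on silently.
        return None, str(exc)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (Name TEXT)")
conn.execute("INSERT INTO city VALUES ('Kabul')")

# Query a table that does not exist, to exercise the failure path.
rows, error = run_query(conn, "SELECT Name FROM towns LIMIT 10")
if error is not None:
    print("Query failed:", error)
else:
    print("Select returned", len(rows), "rows.")

conn.close()
```

SQLite does not treat table names as case sensitive, so the sketch asks for a table that is missing altogether rather than one that is merely mis-capitalized.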

Tuesday, January 10, 2017

PHP and MySQL Basics

PHP and MySQL have had a long intertwined path together. I have been talking with a lot of newbies in the past several months who are trying to become PHP developers but are amazed at all the ancillary parts that go along with PHP such as unit testing, databases, JavaScript, continuous integration, and much more. Add in that there are two MySQL APIs -- PDO & MySQLi -- and an older deprecated mysql API still often found in the wild. This blog is the start of a series for new PHP developers to learn to program with a database.

Client Server Model

When PHP code seeks to talk to MySQL (or most other databases), it makes a connection to a port at an IP address. Usually MySQL is listening on port 3306. If you are developing and accessing a database on your local computer, the IP address used will generally be 127.0.0.1. The software that goes between the PHP application and the database is called a connector.

So your code on your local system can be talking to a database server on your local system or through a network connection. It does not matter which.

Can't connect to MySQL server on 'x.x.x.x' (111)

The Can't connect error can be especially frustrating. An experienced developer will know what to check from tears of experience. But this is a column on basics so we need to spell out the steps.
  1. Is the IP address correct? It is easy to fat finger an IP address and ironically 127.0.01 on many Linux boxes will connect up to 127.0.0.1.
  2. Is there an instance of MySQL running at that IP address?
  3. Is that instance listening on the generic port 3306? It may be running someplace else and you will have to chase down that port number.
  4. Can the MySQL command line shell or other tool connect to the instance? MySQL Workbench, phpMyAdmin, the cli tools, and everything else authenticate through the same steps, so if they work and your PHP program does not then most likely the fault is in the PHP code.
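
Steps 2 and 3 can be checked from code as well. This Python sketch only tests whether something accepts TCP connections at the address and port; it does not prove that the listener is MySQL. The port_is_open name and the two-second timeout are assumptions.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


print("127.0.0.1:3306 open?", port_is_open("127.0.0.1", 3306))
```

If this reports False for the host and port you believe MySQL is using, the problem is in steps 1 through 3 and not in your PHP code.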

Setting up the client server connection

The PHP Manual is worth its weight in gold and you should refer to it often. Its examples are clear, or usually as clear as can be, and concise. Below is an excerpt example from the manual.

<?php

$mysqli = new mysqli("127.0.0.1", "user", "password", "database", 3306);
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
}

echo $mysqli->host_info . "\n";
?>

Note that the IP address, user name of "user", password of "password", and the port of 3306 will need to be changed to fit the installation. The mysqli call sets up the connection between the application and the MySQL database server.

Please note that you should protect usernames and passwords or any other information that could allow someone to compromise the server and data.

The body of the if statement runs when a nonzero error code is returned from $mysqli->connect_errno. The error message from the server can then be printed out using $mysqli->connect_error. The error message itself can be terse but often points out what is wrong in the code.

Bad arguments

What follows below are three bad connection strings.
// Bad IP address
$mysqli = new mysqli("19.10.0.3", "root", "barfoo", "world_x", 3306);
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
}

// Bad account information
$mysqli = new mysqli("127.0.0.1", "root", "foobar", "world_x", 3306);
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
}

// Bad database specified
$mysqli = new mysqli("127.0.0.1", "root", "foobar", "world_xx", 3306);
if ($mysqli->connect_errno) {
    echo "Failed to connect to MySQL: (" . $mysqli->connect_errno . ") " . $mysqli->connect_error;
}

Part of mastering any computer programming language is learning to understand the error messages. The three examples above return similar but distinctly different messages.

The first of the trio provides the following error:


PHP Warning:  mysqli::__construct(): (HY000/2002): Network is unreachable in /home/dstokes/php/m02.php 
Failed to connect to MySQL: (2002) Network is unreachable

It would be nice to get more information than 'Network is unreachable' but it provides a starting point to diagnose the problem. Generally the more specific the problem, the more specific the error message.

The third of the trio attempts to connect to a database named 'world_xx' when we really wanted 'world_x'.

PHP Warning:  mysqli::__construct(): (HY000/1049): Unknown database 'world_xx' in /home/dstokes/php/m02.php
Failed to connect to MySQL: (1049) Unknown database 'world_xx'

Sadly for beginners it takes time and experience to get to the point where you can instantly look at an error and know what has gone wrong (or have a pretty good idea of what has gone wrong). But do not worry as many of us learn by correcting OUR mistakes and learning not to repeat them.

Connection Good

So after establishing a good connection to the MySQL server, we can now query it for data.
Next Time -- what happened to my query??

Friday, January 6, 2017

Using MySQL to Output JSON

MySQL has had a JSON data type since version 5.7 was released way back in '15. But did you know you could produce JSON output from non-JSON columns? It is very simple and saves a lot of time over trying to format it in your application.

World Database

We will be using the good old World database that MySQL has used for years in documentation, examples, and in the classroom. Starting with a simple query we will build up to something more complex.

SELECT Name, District, Population FROM City;

This will output the data from the table in a tabular format.

'Kabul', 'Kabol', '1780000'
'Qandahar', 'Qandahar', '237500'

Array or Object?

We have two options for composing JSON data: JSON_ARRAY and JSON_OBJECT.

Of the two, you will find JSON_ARRAY the less fussy. It will JSON-ize your data very easily. It takes a list of values or an empty list and returns a JSON array.

We add this function to our example query and it becomes SELECT JSON_ARRAY(Name, District, Population) FROM City;

And the output looks like:


'[\"Kabul\", \"Kabol\", 1780000]'
'[\"Qandahar\", \"Qandahar\", 237500]'
...

JSON_OBJECT wants key/value pairs and will complain if a key name is NULL or you have an odd number of arguments. If we try SELECT JSON_OBJECT(Name, District, Population) FROM City; we will get Error Code: 1582. Incorrect parameter count in the call to native function 'JSON_OBJECT'. This function sees the odd-numbered arguments as the keys and the even-numbered arguments as the values in key/value pairs, and therefore we should not have an odd number of arguments. We could stick a dummy literal string into the select but odds are that we want the three fields specified and need to turn them into key/value pairs. So let's add 'keys' and let the database supply the values.

SELECT JSON_OBJECT('City', Name, 'Dist', District, 'Pop', Population) FROM City;

And the output looks like:


'{\"Pop\": 1780000, \"City\": \"Kabul\", \"Dist\": \"Kabol\"}'
'{\"Pop\": 237500, \"City\": \"Qandahar\", \"Dist\": \"Qandahar\"}'
...
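
On the application side the string can then be handed to a stock JSON library. A short Python sketch, using the first row of the output above as a literal so it runs without a database:

```python
import json

# First row of the JSON_OBJECT output shown above, as the client would receive it.
row = '{"Pop": 1780000, "City": "Kabul", "Dist": "Kabol"}'

record = json.loads(row)
print(record["City"], record["Dist"], record["Pop"])
```

The application only parses here; the work of pairing keys with column values was already done by the server.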

Conclusion

Developers need to work smarter and not harder. And I have been harping on letting the database do the heavy lifting for years. This is an example of letting the database format your information for you rather than feeding it into a function within your application. Sure you can do it but this saves you a step or two and reduces the complexity of your application.