Or: Things you don't know until you do them on accident
I was working on a content retrieval procedure recently, and hit a snag. I needed to retrieve the content of one field, do some other processing, then compare that content to a field in another table. That middle processing step was preventing me from just doing a JOIN between the two.
Logically, what I wanted to do was this:
BEGIN
-- set up variable to store name of table
DECLARE storedValue VARCHAR(64);
-- query to grab the value used later
SELECT firstValue
INTO storedValue
FROM FirstTable
WHERE id = 1;
-- other stuff happens here
DECLARE tableCur CURSOR FOR
SELECT *
FROM SecondTable
WHERE secondValue = storedValue;
OPEN tableCur;
tableLoop: LOOP
-- process results
END LOOP tableLoop;
CLOSE tableCur;
END
Read more →
It's not as bad as you'd think
Sometimes, you just need a way to GOTO. Here is a little trick that will duck out of a section of code using break
Read more →
Not as snazzy-looking as my last two, but a useful query nonetheless. One the applications we built over the years, among other things, takes a list of names and addresses (currently a little over 2600), and selects around 200 of them at random. The client has been keeping track of the alphabetical distribution of last names with each random batch, and was concerned that the apparent weighting of the list toward the beginning of the alphabet was evidence that the lists were not truly random.
Fortunately I was able to get a quick statistical breakdown of the overall list, using the following:
Read more →
Fun with ASCII graphs!

(Author's note: not necessarily actually a practical idea. But fun!)
So pictured here is a histogram of a moderately large set of random integers. Each vertical line represents the total number of entries at each particular integer. Since each number is made up of multiple random factors (10 different random numbers, each between 0 and 100, added together), the distribution tends toward a bell curve.
So how did I build the graph? Excel? PHP? Nope. Just a MySQL query.
Read more →
Finding spikes in your normal data

With most large sets of data, especially numerical data, statistical analysis plays a key role. You can't be bothered to look at every record yourself; that's what computers are for.
One useful tool in any statistical analysis is the identification of
outliers. Assuming you have a
normally distributed set of data, outliers can help to identify user error in the data entry process, or genuine spikes in the data. Once found, these numbers can be set aside for closer analysis or eliminated to normalize the data set.
There are many different methods for identifying outliers, with varying levels of rigor. Here I'll just demonstrate one of the simplest definitions: an outlier is any value greater than three
standard deviations away from the
mean.
Read more →
Why yes, you can log in with an invalid username
Stefan over at
Suspekt brought up some
interesting security vulnerabilities based on MySQL's column truncation tendencies (when not in
strict mode), so I thought I'd add my own to the pile, this one right in the grant tables.
MySQL's user table restricts
user names to 16 characters (and hosts to 60). Any attempt to create a user with a longer login results in an error. However, unlike Stefan's example where a field is compared, then truncated and then inserted, MySQL actually truncates a login attempt before processing it.
Read more →
Fun with stored procedures pt. I
(First in what will probably be a series of blogs as we move all our projects to a replicated, MySQL 5.0 environment, and I finally get to start playing with all the useful features that come with it)Say you've got a client, Mystery Client A. Mystery Client A has hired a marketing consultant. As a part of their rebranding efforts, they have decided to refactor their company spelling convention to MysteryClientA!, for whatever reason. That's fine for replacing a few logos, but MCA has a database-driven content management system, and their name is riddled throughout the database in page content, event description, news headlines and so forth. Your job is now to sift through the entire system and apply the newly crafted spelling to the entire database. So what do you do?
Read more →
Fun with bitwise logic!

A web app full of data is often going to be full of tables (at least on the administration end). A listing of users here, all the recently created events there. Often for readability we will alternate rows between two slightly different background colors.
This can be handled on the PHP end with a bit of math and a counter, like so:
$count = 0;
while( $row = mysql_fetch_assoc($res) ) {
if( $count % 2 == 0 )
$background_color = 'white';
else
$background_color = 'gray';
/* OUTPUT TABLE CONTENTS WITH GIVEN BACKGROUND */
$count++;
}However, it can also be done entirely in-database, via creative use of the XOR operator:
Read more →
Deleting all semi-matching rows in MySQL
One of the more recent additions to the SiteCrafting CMS arsenal is a comprehensive error logger, tracking all PHP and MySQL errors (by default... other error types can be created on a case by case basis) that occur in new sites we build. Errors are stored in our own intranet system with a timestamp, error body and a site ID (assigned to each client at a different stage of our project workflow). The table looks something like this:
+----+---------------------+---------------+---------+
| id | logTime | text | project |
+----+---------------------+---------------+---------+
| 2 | 2008-05-14 14:42:15 | A PHP Error | 1 |
| 3 | 2008-05-14 14:42:26 | A PHP Error | 1 |
| 4 | 2008-05-14 14:42:34 | A PHP Error | 1 |
| 5 | 2008-05-14 14:42:47 | A MySQL Error | 1 |
| 6 | 2008-05-14 14:42:56 | A MySQL Error | 1 |
| 7 | 2008-05-14 14:43:05 | A PHP Error | 2 |
| 8 | 2008-05-14 14:43:10 | A PHP Error | 2 |
| 9 | 2008-05-14 14:43:21 | A MySQL Error | 2 |
+----+---------------------+---------------+---------+
8 rows in set (0.00 sec)
Obviously sometimes we get duplicate errors coming through. Aside from being mere mortals who aren't always fast enough to correct an error before it recurs, one of the first stages of debugging is to try and replicate the error. These are often pretty easy to manage. It's pretty trivial to search for all matching errors, check them all, and delete them. Sometimes, though, this just doesn't cut it. Like when there are a few different errors with 10,000 occurrences apiece.
Read more →
Mar. 31, 2008 at 11:40amGot API?
An API reference does a method's body good...

gotAPI.com is one of the most useful online resources I've come across, primarily because it places resources spread all over the internet into one simple site. I've been using this for quite some time, and have for the most part I have taken its usefulness for granted. Then it occurred to me that I might not be the only one that could find this tool useful (I know, it was a big 'DUH!' moment). So now I will share this gem with others...
Read more →
The DAO and VO Patterns

In this installment, we will be looking at two patterns that have been 'borrrowed' from Java. If you've had any development experience with J2EE, you are probably well aware of how handy Data Access Objects and Value Objects can be. If you haven't, don't fret! This article was written especially for you!
If you've never heard these terms before, you may be wondering why I have chosen to group them together within one article. The simple explanation is ... well you'll see. For now just accept that they go hand-in-hand, much like salt and pepper or peanut butter and jelly or .
Excited? Let's dig deeper...
Read more →
I just finished installing Leopard on my computer, and my first impressions are that it's very slick and well thought out. One of my necessities as a developer is that I must have a webserver running on my personal computer, so I was dismayed when it wasn't functional after the upgrade. The main reason is that Leopard uses a different version of Apache than 10.4 did, and so some things get wonky. But it's easy to fix. Apache and PHP are included in Leopard, so the only thing missing is MySQL. To install that, go to
MySQL's site, and download the latest copy for OS X. It's incredibly simple to install.
After that comes setting up Apache. Open up a terminal window, and type in "sudo pico /etc/apache2/httpd.conf". (Note that you must be a computer administrator to access the files in etc/) OS X 10.4 had Apache in /etc/httpd/, and that's part of why it didn't work after the upgrade. Find the line in httpd.conf that looks like "#LoadModule php5_module libexec/apache2/libphp5.so", and remove the # sign at the beginning. Then search for AddType, and put the following somewhere around it.
AddType application/x-httpd-php .php
AddType application/x-httpd-php-source .phps
Save the file, and exit. Then open up System Preferences in Finder, and click on Sharing. Then turn on Web Sharing. That's all you need do to to setup a webserver on Leopard, or upgrade from OS 10.4. You should be able to open up a browser, and enter http://localhost/YOUR_USERNAME/ and see that the server is running.
What's your take on the database admin debate?

While writing up a review on a database tool I discovered today, I was inspired to spark a discussion about database GUIs in general. The value of GUI tools for administering database systems like MySQL has been a topic of much debate.
Read more →
Fie on commas! Fie!

SiteCrafting is in the process of phasing out some of our older servers, and as an added bonus, the clients hosted on those servers are getting a MySQL jumpstart, leapfrogging over 4.1 to go straight from 4.0.24 to 5.0.32. Tragically, it's not quite as simple as dump | import. This is what I get for bothering my bosses for a few weeks not long after coming aboard about how nice stored procedures, updatable views, and triggers could be.
"The wonderful thing about standards," a wiser person than me once said, "is that there are so many of them." That's not the whole of it, though. One good thing about standards is that there are certain features one can generally rely upon to work, translate, port, etc. Assuming one works within them, rather than taking advantage of loopholes allowed by their not-entirely-compliant-but-we're-getting-there-and-anyway-isn't-this-way- easier-and-faster software. When people don't (and I'm not entirely innocent here), you run the risk of turning your simple upgrade into a serious project when your favorite software decides it's time to comply a bit more.
Read more →
Creative use of temporary tables
At SiteCrafting, I enjoy working with a large number of different projects, each with their own requirements, technology, and problems to be solved, unfortunately, I sometimes forget about past solutions, until after I have finished writing a piece of code. Such is the case with a query that was eating up some serious processing time.
The problem was with a GROUP BY query with LEFT JOINs to several other tables and summing up totals from those joined tables. This query was taking about 4 minutes 45 seconds to run, and worse yet, it was affecting searches which had nothing to do with that query, and probably also eating up precious memory and cpu resources.
Read more →
The fewer the queries the better
I always enjoy trying to do my work creating web applications using the fewest number of database queries possible. It's kind of a pride thing, I guess. That and I suppose it makes to improve an app's performance and reduce the load on a server. If you're into that kind of thing. ;)
A technique I use subtitutes the temptation of using queries whose results call queries and instead uses only two.
Read more →
One of the more interesting adventures working with data storage is trying to aggregate information meaningfully from multiple very different data structures. Imagine you've got a website filled with content (say, a few hundred pages). All the content is stored and output dynamically - who wants to create and maintain 500 static html pages, anyway? And of course, you need a bit of variety, so all this content is spread across five different page designs, each requiring its own template and data structure.
Now you say you want to search your site? All the content? And you want the results all together in one big happy sorted-by-relevancy list? How on earth am I going to do that?
Well... like this.
Read more →
Applying MySQL Wizardry to MS SQL Server
Back in the day, we worked on developing a specialty application that was basically a lead generation system with a database that stored over 200,000 records, with a potential for a lot more. The web application displayed numerous reports that calculated totals from disparate sources. We discovered that once our client began adding all their data that those reports were running slower and slower and slower.
The problem was that we had one primary query to pull the records out, then, as the code looped through each record, several other queries were needed to calculate the disparate totals. That resulted in numerous database calls that slowed the entire web application. That's when Mike discovered
MySQL Wizardry, that used the SUM(IF()) and the GROUP BY clause, problem solved.
Read more →
Making Inserts Sucks
If you've ever created
SQL queries with
PHP, you probably know what a pain it can be to create insert and update statements. I really, really (really) don't like it. As I was working on my personal site, and exploring possible frameworks to use, I came across
CodeIgniter. They have a great database interaction library, especially the function for creating the insert queries.
Today, armed with only the descriptions of
CodeIgniter's query helper functions, I spent 20 minutes trying to duplicate some the effect of the insert and update functions. I've never seen the code, or even used it, but I didn't have to see the code to write a similar function. Both functions take a table name and an associative array of column names and values. The update function also requires a WHERE statement, and it can't be blank. This is different from CodeIgniter, and that's so you don't accidentally reset all of the passwords in the mysql users table, or any table for that matter. And then, *poof*, the function gives you a nice sql statement.
I'll never have to write another "INSERT blah blah blah" again. Yay!
Read more →