Or: Things you don't know until you do them on accident
I was working on a content retrieval procedure recently, and hit a snag. I needed to retrieve the content of one field, do some other processing, then compare that content to a field in another table. That middle processing step was preventing me from just doing a JOIN between the two.
Logically, what I wanted to do was this:
BEGIN
-- set up variable to store name of table
DECLARE storedValue VARCHAR(64);
-- query to grab the value used later
SELECT firstValue
INTO storedValue
FROM FirstTable
WHERE id = 1;
-- other stuff happens here
DECLARE tableCur CURSOR FOR
SELECT *
FROM SecondTable
WHERE secondValue = storedValue;
OPEN tableCur;
tableLoop: LOOP
-- process results
END LOOP tableLoop;
CLOSE tableCur;
END
Read more →
Not as snazzy-looking as my last two, but a useful query nonetheless. One the applications we built over the years, among other things, takes a list of names and addresses (currently a little over 2600), and selects around 200 of them at random. The client has been keeping track of the alphabetical distribution of last names with each random batch, and was concerned that the apparent weighting of the list toward the beginning of the alphabet was evidence that the lists were not truly random.
Fortunately I was able to get a quick statistical breakdown of the overall list, using the following:
Read more →
Fun with ASCII graphs!

(Author's note: not necessarily actually a practical idea. But fun!)
So pictured here is a histogram of a moderately large set of random integers. Each vertical line represents the total number of entries at each particular integer. Since each number is made up of multiple random factors (10 different random numbers, each between 0 and 100, added together), the distribution tends toward a bell curve.
So how did I build the graph? Excel? PHP? Nope. Just a MySQL query.
Read more →
Finding spikes in your normal data

With most large sets of data, especially numerical data, statistical analysis plays a key role. You can't be bothered to look at every record yourself; that's what computers are for.
One useful tool in any statistical analysis is the identification of
outliers. Assuming you have a
normally distributed set of data, outliers can help to identify user error in the data entry process, or genuine spikes in the data. Once found, these numbers can be set aside for closer analysis or eliminated to normalize the data set.
There are many different methods for identifying outliers, with varying levels of rigor. Here I'll just demonstrate one of the simplest definitions: an outlier is any value greater than three
standard deviations away from the
mean.
Read more →
Why yes, you can log in with an invalid username
Stefan over at
Suspekt brought up some
interesting security vulnerabilities based on MySQL's column truncation tendencies (when not in
strict mode), so I thought I'd add my own to the pile, this one right in the grant tables.
MySQL's user table restricts
user names to 16 characters (and hosts to 60). Any attempt to create a user with a longer login results in an error. However, unlike Stefan's example where a field is compared, then truncated and then inserted, MySQL actually truncates a login attempt before processing it.
Read more →
Fun with stored procedures pt. I
(First in what will probably be a series of blogs as we move all our projects to a replicated, MySQL 5.0 environment, and I finally get to start playing with all the useful features that come with it)Say you've got a client, Mystery Client A. Mystery Client A has hired a marketing consultant. As a part of their rebranding efforts, they have decided to refactor their company spelling convention to MysteryClientA!, for whatever reason. That's fine for replacing a few logos, but MCA has a database-driven content management system, and their name is riddled throughout the database in page content, event description, news headlines and so forth. Your job is now to sift through the entire system and apply the newly crafted spelling to the entire database. So what do you do?
Read more →
Fun with bitwise logic!

A web app full of data is often going to be full of tables (at least on the administration end). A listing of users here, all the recently created events there. Often for readability we will alternate rows between two slightly different background colors.
This can be handled on the PHP end with a bit of math and a counter, like so:
$count = 0;
while( $row = mysql_fetch_assoc($res) ) {
if( $count % 2 == 0 )
$background_color = 'white';
else
$background_color = 'gray';
/* OUTPUT TABLE CONTENTS WITH GIVEN BACKGROUND */
$count++;
}However, it can also be done entirely in-database, via creative use of the XOR operator:
Read more →
Deleting all semi-matching rows in MySQL
One of the more recent additions to the SiteCrafting CMS arsenal is a comprehensive error logger, tracking all PHP and MySQL errors (by default... other error types can be created on a case by case basis) that occur in new sites we build. Errors are stored in our own intranet system with a timestamp, error body and a site ID (assigned to each client at a different stage of our project workflow). The table looks something like this:
+----+---------------------+---------------+---------+
| id | logTime | text | project |
+----+---------------------+---------------+---------+
| 2 | 2008-05-14 14:42:15 | A PHP Error | 1 |
| 3 | 2008-05-14 14:42:26 | A PHP Error | 1 |
| 4 | 2008-05-14 14:42:34 | A PHP Error | 1 |
| 5 | 2008-05-14 14:42:47 | A MySQL Error | 1 |
| 6 | 2008-05-14 14:42:56 | A MySQL Error | 1 |
| 7 | 2008-05-14 14:43:05 | A PHP Error | 2 |
| 8 | 2008-05-14 14:43:10 | A PHP Error | 2 |
| 9 | 2008-05-14 14:43:21 | A MySQL Error | 2 |
+----+---------------------+---------------+---------+
8 rows in set (0.00 sec)
Obviously sometimes we get duplicate errors coming through. Aside from being mere mortals who aren't always fast enough to correct an error before it recurs, one of the first stages of debugging is to try and replicate the error. These are often pretty easy to manage. It's pretty trivial to search for all matching errors, check them all, and delete them. Sometimes, though, this just doesn't cut it. Like when there are a few different errors with 10,000 occurrences apiece.
Read more →

If the iPhone is web apps to go, Google and the Open Handset Alliance's new open source mobile phone API, dubbed Android, is The Joy Of Cooking Webapps. Once again Google has seen a good idea (packing a cell phone full of convenient little apps and widgets that sync with websites and external systems) and figured out a way to make it a great one.
Read more →
Fie on commas! Fie!

SiteCrafting is in the process of phasing out some of our older servers, and as an added bonus, the clients hosted on those servers are getting a MySQL jumpstart, leapfrogging over 4.1 to go straight from 4.0.24 to 5.0.32. Tragically, it's not quite as simple as dump | import. This is what I get for bothering my bosses for a few weeks not long after coming aboard about how nice stored procedures, updatable views, and triggers could be.
"The wonderful thing about standards," a wiser person than me once said, "is that there are so many of them." That's not the whole of it, though. One good thing about standards is that there are certain features one can generally rely upon to work, translate, port, etc. Assuming one works within them, rather than taking advantage of loopholes allowed by their not-entirely-compliant-but-we're-getting-there-and-anyway-isn't-this-way- easier-and-faster software. When people don't (and I'm not entirely innocent here), you run the risk of turning your simple upgrade into a serious project when your favorite software decides it's time to comply a bit more.
Read more →

I'm in a band. Well, two of them, but only one that matters for this discussion. I've noticed that the progression of a musical group that wants to be professional on some scale in many ways closely mirrors that of any product-based business. You start out spending a certain amount of time in development. You come up with ideas, assemble the best team available to execute those ideas, and then do your level best to refine and perfect your product before you release it to the public.
Read more →
One of the more interesting adventures working with data storage is trying to aggregate information meaningfully from multiple very different data structures. Imagine you've got a website filled with content (say, a few hundred pages). All the content is stored and output dynamically - who wants to create and maintain 500 static html pages, anyway? And of course, you need a bit of variety, so all this content is spread across five different page designs, each requiring its own template and data structure.
Now you say you want to search your site? All the content? And you want the results all together in one big happy sorted-by-relevancy list? How on earth am I going to do that?
Well... like this.
Read more →
Oct. 23, 2006 at 10:50amFirefox 2.0
Mozilla releases new browser when it's good and ready.
Something no developer wants to see...
foreach ($left_elements AS $le) {
?>
<table class="leftnav_<?=$le['class']?>">
<tr>
<td class="leftnav_<?=$le['class']?>">
<a href="<?=$le['target']?>" class="<?=$le['theme']?>"><?=$le['name']?></a>
</td>
</tr>
</table>
<?
}
Nobody wants to see that. Especially not seven different times with seven tiny variants covering seven different if conditions.
Each higher 'class' number is a deeper indentation. Apparently it's necessary for this class identification to be repeated for the table, table cell and link. Converting this to a nested list format (part of my current task) is going to be an adventure.
Well... at least it's indented nicely.