Oct. 13, 2008 at 8:45pmVariable Resolution In Server Side Cursors

Or: Things you don't know until you do them on accident

I was working on a content retrieval procedure recently, and hit a snag. I needed to retrieve the content of one field, do some other processing, then compare that content to a field in another table. That middle processing step was preventing me from just doing a JOIN between the two.

Logically, what I wanted to do was this:
BEGIN
    -- set up variable to store name of table
    DECLARE storedValue VARCHAR(64);
    
    -- query to grab the value used later
    SELECT firstValue
    INTO storedValue
    FROM FirstTable
    WHERE id = 1;

    -- other stuff happens here

    DECLARE tableCur CURSOR FOR
    SELECT *
    FROM SecondTable
    WHERE secondValue = storedValue;

    OPEN tableCur;
    tableLoop: LOOP

        -- process results

    END LOOP tableLoop;
    CLOSE tableCur;
END
Read more →

Oct. 6, 2008 at 4:43pmStats In MySQL Pt. III: Alphabetical Distribution

Not as snazzy-looking as my last two, but a useful query nonetheless. One the applications we built over the years, among other things, takes a list of names and addresses (currently a little over 2600), and selects around 200 of them at random. The client has been keeping track of the alphabetical distribution of last names with each random batch, and was concerned that the apparent weighting of the list toward the beginning of the alphabet was evidence that the lists were not truly random.

Fortunately I was able to get a quick statistical breakdown of the overall list, using the following:

Read more →

Aug. 27, 2008 at 9:51amStats in MySQL Pt. II: Histograms

Fun with ASCII graphs!

(Author's note: not necessarily actually a practical idea. But fun!)

So pictured here is a histogram of a moderately large set of random integers. Each vertical line represents the total number of entries at each particular integer. Since each number is made up of multiple random factors (10 different random numbers, each between 0 and 100, added together), the distribution tends toward a bell curve.

So how did I build the graph? Excel? PHP? Nope. Just a MySQL query.

Read more →

Aug. 26, 2008 at 8:45amStats in MySQL Pt. I: Outliers

Finding spikes in your normal data

With most large sets of data, especially numerical data, statistical analysis plays a key role. You can't be bothered to look at every record yourself; that's what computers are for.

One useful tool in any statistical analysis is the identification of outliers. Assuming you have a normally distributed set of data, outliers can help to identify user error in the data entry process, or genuine spikes in the data. Once found, these numbers can be set aside for closer analysis or eliminated to normalize the data set.

There are many different methods for identifying outliers, with varying levels of rigor. Here I'll just demonstrate one of the simplest definitions: an outlier is any value greater than three standard deviations away from the mean. Read more →

Aug. 20, 2008 at 11:20amMySQL Login Truncation

Why yes, you can log in with an invalid username

Stefan over at Suspekt brought up some interesting security vulnerabilities based on MySQL's column truncation tendencies (when not in strict mode), so I thought I'd add my own to the pile, this one right in the grant tables.

MySQL's user table restricts user names to 16 characters (and hosts to 60). Any attempt to create a user with a longer login results in an error. However, unlike Stefan's example where a field is compared, then truncated and then inserted, MySQL actually truncates a login attempt before processing it.
Read more →

Aug. 19, 2008 at 5:00pmFinding the Needle in your MySQL Haystack

Fun with stored procedures pt. I

(First in what will probably be a series of blogs as we move all our projects to a replicated, MySQL 5.0 environment, and I finally get to start playing with all the useful features that come with it)

Say you've got a client, Mystery Client A. Mystery Client A has hired a marketing consultant. As a part of their rebranding efforts, they have decided to refactor their company spelling convention to MysteryClientA!, for whatever reason. That's fine for replacing a few logos, but MCA has a database-driven content management system, and their name is riddled throughout the database in page content, event description, news headlines and so forth. Your job is now to sift through the entire system and apply the newly crafted spelling to the entire database. So what do you do?

Read more →

Jul. 11, 2008 at 4:33pmMySQL Tidbits: The XOR Toggle

Fun with bitwise logic!

A web app full of data is often going to be full of tables (at least on the administration end). A listing of users here, all the recently created events there. Often for readability we will alternate rows between two slightly different background colors.

This can be handled on the PHP end with a bit of math and a counter, like so:

$count = 0;
while( $row = mysql_fetch_assoc($res) ) {
    if( $count % 2 == 0 )
        $background_color = 'white';
    else
        $background_color = 'gray';

    /* OUTPUT TABLE CONTENTS WITH GIVEN BACKGROUND */

    $count++;
}


However, it can also be done entirely in-database, via creative use of the XOR operator: Read more →

May. 14, 2008 at 4:15pmDumping Duplicates

Deleting all semi-matching rows in MySQL

One of the more recent additions to the SiteCrafting CMS arsenal is a comprehensive error logger, tracking all PHP and MySQL errors (by default... other error types can be created on a case by case basis) that occur in new sites we build. Errors are stored in our own intranet system with a timestamp, error body and a site ID (assigned to each client at a different stage of our project workflow). The table looks something like this:

+----+---------------------+---------------+---------+
| id | logTime             | text          | project |
+----+---------------------+---------------+---------+
|  2 | 2008-05-14 14:42:15 | A PHP Error   |       1 |
|  3 | 2008-05-14 14:42:26 | A PHP Error   |       1 |
|  4 | 2008-05-14 14:42:34 | A PHP Error   |       1 |
|  5 | 2008-05-14 14:42:47 | A MySQL Error |       1 |
|  6 | 2008-05-14 14:42:56 | A MySQL Error |       1 |
|  7 | 2008-05-14 14:43:05 | A PHP Error   |       2 |
|  8 | 2008-05-14 14:43:10 | A PHP Error   |       2 |
|  9 | 2008-05-14 14:43:21 | A MySQL Error |       2 |
+----+---------------------+---------------+---------+
8 rows in set (0.00 sec)

Obviously sometimes we get duplicate errors coming through. Aside from being mere mortals who aren't always fast enough to correct an error before it recurs, one of the first stages of debugging is to try and replicate the error. These are often pretty easy to manage. It's pretty trivial to search for all matching errors, check them all, and delete them. Sometimes, though, this just doesn't cut it. Like when there are a few different errors with 10,000 occurrences apiece.

Read more →

Nov. 14, 2007 at 12:01amAndroid: The REAL Next Gen of Web Apps

If the iPhone is web apps to go, Google and the Open Handset Alliance's new open source mobile phone API, dubbed Android, is The Joy Of Cooking Webapps. Once again Google has seen a good idea (packing a cell phone full of convenient little apps and widgets that sync with websites and external systems) and figured out a way to make it a great one.

Read more →

Sep. 6, 2007 at 4:45pmAdventures in Database Migration Pt. I

Fie on commas! Fie!

SiteCrafting is in the process of phasing out some of our older servers, and as an added bonus, the clients hosted on those servers are getting a MySQL jumpstart, leapfrogging over 4.1 to go straight from 4.0.24 to 5.0.32. Tragically, it's not quite as simple as dump | import. This is what I get for bothering my bosses for a few weeks not long after coming aboard about how nice stored procedures, updatable views, and triggers could be.

"The wonderful thing about standards," a wiser person than me once said, "is that there are so many of them." That's not the whole of it, though. One good thing about standards is that there are certain features one can generally rely upon to work, translate, port, etc. Assuming one works within them, rather than taking advantage of loopholes allowed by their not-entirely-compliant-but-we're-getting-there-and-anyway-isn't-this-way- easier-and-faster software. When people don't (and I'm not entirely innocent here), you run the risk of turning your simple upgrade into a serious project when your favorite software decides it's time to comply a bit more.

Read more →

Dec. 18, 2006 at 10:37amThe Right People For Your Job

I'm in a band. Well, two of them, but only one that matters for this discussion. I've noticed that the progression of a musical group that wants to be professional on some scale in many ways closely mirrors that of any product-based business.  You start out spending a certain amount of time in development. You come up with ideas, assemble the best team available to execute those ideas, and then do your level best to refine and perfect your product before you release it to the public.

Read more →

Oct. 27, 2006 at 11:34amCross Table Content Search

One of the more interesting adventures working with data storage is trying to aggregate information meaningfully from multiple very different data structures. Imagine you've got a website filled with content (say, a few hundred pages). All the content is stored and output dynamically - who wants to create and maintain 500 static html pages, anyway? And of course, you need a bit of variety, so all this content is spread across five different page designs, each requiring its own template and data structure.

Now you say you want to search your site? All the content? And you want the results all together in one big happy sorted-by-relevancy list? How on earth am I going to do that?

Well... like this.
Read more →

Oct. 23, 2006 at 10:50amFirefox 2.0

Mozilla releases new browser when it's good and ready.

Hot on the heels of the Internet Explorer 7 release, Mozilla Firefox 2.0 is slated for official release tomorrow, but, as often happens, they have posted the full release version up on the mirrors already, so you can find what you're looking for at ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases/2.0

Read more →

Oct. 18, 2006 at 10:46amNavigation Nightmare pt. 1

Something no developer wants to see...

foreach ($left_elements AS $le) {
    ?>
    <table class="leftnav_<?=$le['class']?>">
        <tr>
            <td class="leftnav_<?=$le['class']?>">
                <a href="<?=$le['target']?>" class="<?=$le['theme']?>"><?=$le['name']?></a>
            </td>
        </tr>
    </table>
    <?
}


Nobody wants to see that. Especially not seven different times with seven tiny variants covering seven different if conditions.

Each higher 'class' number is a deeper indentation. Apparently it's necessary for this class identification to be repeated for the table, table cell and link. Converting this to a nested list format (part of my current task) is going to be an adventure.

Well... at least it's indented nicely.