Author Archives: admin

Bayesian calculations using MySQL 5.0.1+ Views (ratings)

I’ve been incredibly lax about posting, but I’m going to try to write a new article every week. I have more than enough material, what I’m short on is time. That being said:

One of the sites I manage allows users to cast ratings for other members. It’s always been sort of a kludge but it worked. However, it was far from perfect. I recently had the time to improve it substantially by using Bayesian calculations and a feature of MySQL 5.0.1+ called a view.

For all intents and purposes, a View is a Stored Procedure (for those familiar with Oracle, Sybase or Microsoft SQL).

Bayesian calculations are pretty simple but they can be very CPU intensive. Take a simple example. Say two users have all ratings of 10. User A has six votes and User B has 7 votes. Basically, we want to see which user is more popular over all.

What we have is essentially a table of all votes and we need to track the average number of votes multiply by the average rating. We then add the individuals total votes and multiply that by their rating. We then divide that by the average number of votes plus the number of votes this individual has. The equation looks like this:

((avg_num_votes * avg_rating) + (this_num_votes * this_rating)) / (avg_num_votes + this_num_votes)

I created the view like this: (this assumes we have an existing table in the dB to track ratings, where the columns is “user_account”, the person the vote was cast for. “user” which is the person who cast the vote, and “rating” which is the rating value left by “user” for “user_account”)

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `jon`@`%` 
    SQL SECURITY DEFINER
VIEW `ratings_bayesian` AS
    SELECT 
        `ratings`.`user_account` AS `user_account`,
        ((SELECT 
                COUNT(`ratings`.`user_account`)
            FROM
                `ratings`) / (SELECT 
                COUNT(DISTINCT `ratings`.`user_account`)
            FROM
                `ratings`)) AS `avg_num_votes`,
        (SELECT 
                avg(`ratings`.`rating`)
            FROM
                `ratings`) AS `avg_rating`,
        COUNT(`ratings`.`user_account`) AS `this_num_votes`,
        avg(`ratings`.`rating`) AS `this_rating`
    FROM
        `ratings`
    GROUP BY `ratings`.`user_account`

You call views in MySQL just like you would an ordinary table. The difference is that all of the heavy lifting is handle as a background process.

Now the Display Query is simply

SELECT User_account, ((avg_num_votes * avg_rating) + (this_num_votes * this_rating)) / (avg_num_votes + this_num_votes)
as real_rating FROM `urnawm`.`ratings_bayesian` ORDER BY real_rating DESC

We now have a weighted ratings system that processes in milliseconds and provides an incredibly accurate real time snap shot of your ratings system.

For more information on Bayesian methods and theory, check out http://en.wikipedia.org/wiki/Bayesian.

Nano Syntax Highlighting (using git on Centos)

Having used linux for years, I’ve run the gamut of command line text editors. VI, VIM, EMACS, Pico and of course Nano. My get down and dirty editor, need to edit it fast editor of the moment is Nano.

Have you ever wished that Nano supported Syntax Highlighting? Well it does. All you need are the files. Thanks to git, creating a local repository of syntax highlighting definitions is pretty straight forward.

Just clone the repository of your choice: git clone git://github.com/serialhex/nano-highlight.git ~/.nano and then edit your .nanorc to include the highlighting definitions that you want.


## CSS
include "~/.nano/css.nanorc"

## HTML
include “~/.nano/html.nanorc”

That’s really it. Pretty easy eh?

Sublime Text 2, the best Text Editor you aren’t using.

images

If you’ve been living under a rock, then you haven’t heard about the Sublime Text Editor. You are probably using something like EMACS, TextWrangler or (gasp) Notepad++.

If you are, then you need to stop everything you are doing and run over to the Sublime Text 2 website and install this gem. It exists for pretty much every operating system out there, and out of the box, it’s feature set alone is worth the $80 price tag.

My recommendation is to hit up YouTube and look for Sublime tutorial videos. You will be amazed. Not only that, but you will find mundane repetitive tasks no longer take up a bulk of your editing time.

imgres

Probably the biggest selling point for me was multi-line editing. The ability to edit every single instance of a variable name for instance. A close second was auto complete followed by column select.

That being said, in addition to the already full featured list of capabilities, Sublime Text 2 allows for third party extensions. Need Lint? FTP or SFTP to a remote site? How about a robust Dif utility? All this and more are at your fingertips. Especially after you have installed package control.

I’m not going to waste eny more of your valuable time yammering on. Go, check it out. Thank me later.

The Evolution of PHP and the rise of Composer and Git

This could be a really long post, but I’m going to keep it pretty short. PHP has been around in various forms since 1994. That’s almost 20 years, which is ancient in the tech world.

It’s gone through many changes over the years but it’s truly beginning to mature. Like most evolutionary changes, it took a couple of things to accelerate the process.

There are a number of content versioning systems. Over the years, I’ve used most of them on various projects hear and there. Git has been around for a while too, but it was always relegated to the status of the “geek cvs” with it’s arcane command line interface, and often difficult to grasp methodologies.

I’m not sure when the community began to rally around git or why. Perhaps the community at large finally became savvy enough to comprehend it, or git itself had enough useful tools and utilities mature so that the entry point or ease of accessibility became more attainable.

For whatever reason, git has become the gold standard. Developers who aren’t using it are missing out on an effective tool to track code changes and a vibrant community of knowledgable people exchanging ideas.

Git made the next major step possible. A lovely dependency manager called Composer.

PHP has had dependency managers in the past. Everyone has wrestled with PEAR. It has still been a struggle to keep software up to date and organized. Spaghetti code has been rampant and often left in place simply because the only alternative was to re-develop an entire project from scratch.