Smiley Madness

It sounded so simple. Wouldn’t it be fun to add smiley recognition to my own little web programs? WordPress does it automatically. They just can’t decide whether it’s spelled “smilie” or “smiley” – depends who did their coding.

Well, after all, every time you see the symbols “:-)”, you’d just replace them with an image tag pointing to the appropriate smiley, wouldn’t you :-)?

So you can see the manual smiley markups, I even had to enclose the symbols in quotes for this post, to prevent WordPress from converting them to their image equivalents.

Outside of WordPress, it turned out not to be so simple. You see, many of the symbols used in smiley markup, such as the parentheses in “:-)” and the question marks in “:???:”, are also “special characters” (metacharacters) in Perl regular expressions and in the pattern syntax of most other programming languages. The code that tests for their presence therefore builds its pattern out of symbols which are, themselves, pattern control characters. So the program thinks it sees a syntax error, such as an unmatched parenthesis, and blows up.
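To make the failure concrete, here is a minimal sketch (not the original program) that drops the raw “:-)” code into a Perl pattern, traps the resulting error with eval, and then retries with the code escaped. It uses Perl's core quotemeta function for the escaping, which is an assumption made for brevity; the variable names are invented for illustration.

```perl
use strict;
use warnings;

my $code = ':-)';
my $text = 'Nice to see you :-)';

# Unescaped: the ")" is read as the end of a group that was never
# opened, so compiling the pattern dies. eval traps the error.
my $ok = eval { $text =~ /$code/; 1 };
print "unescaped pattern failed: $@" unless $ok;

# Escaped: quotemeta puts a backslash in front of every ASCII
# non-word character, so the code is matched as literal text.
my $escaped = quotemeta $code;
print "escaped pattern is $escaped\n";
print "match found\n" if $text =~ /$escaped/;
```

Writing the pattern as /\Q$code\E/ has the same effect as calling quotemeta first.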

Not only that, there are a lot more recognized smiley codes than the two I knew about. WordPress handles 44 distinct smiley codes. If you’re going to add smileys to your own projects, add all of them, as users aren’t going to be interested in guessing which ones you personally happen to know about.

I spent a day studying the WordPress code to see how they do it. I translated most of their relevant PHP code to Perl. I finally decided that (for my purposes) they are using the Large Hadron Collider to make refrigerator magnets, and I wrote my own code instead. That works for me.

Nothing like a little coding exercise to clear out the cobwebs. Now all I have to decide is what I want to use it for.

In the end, all I was able to use from the WordPress code was their lookup table, a hash that maps each markup key to its icon value. I needed to write a subroutine to handle escaping of all the special characters. Even there, the characters have to be encoded in hex so the program doesn’t blow up. Fortunately Perl has a ready-made module for the actual escaping (use URI::Escape); it is part of the URI distribution on CPAN rather than the Perl core, so it may need installing.

Lastly, I had to write the code for the program that outputs the test display below. It loops through the big hash table once. For each of the 44 smileys, it loops through every word in the input text, checking whether the word matches that particular smiley. You can’t assume people will only use one smiley in an entry or post. But there should be a pre-test to see how many smileys the text sample contains. Then the program can exit as soon as it’s found all of them: done!
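The loop with a pre-test and an early exit could look something like the sketch below. It is not the original program: the helper names count_smileys and find_smileys are hypothetical, the table holds only the four entries that appear in the sample output instead of all 44, and words are compared with eq, so no escaping is needed in this particular version.

```perl
use strict;
use warnings;

# Four entries taken from the sample output; the full table has 44.
my %icon_for = (
    ':grin:' => 'icon_biggrin.gif',
    ':-('    => 'icon_sad.gif',
    ':???:'  => 'icon_confused.gif',
    ':-)'    => 'icon_smile.gif',
);

# Hypothetical pre-test: how many words are smiley codes at all?
sub count_smileys {
    my ($table, @words) = @_;
    return scalar grep { exists $table->{$_} } @words;
}

# Hypothetical search: walk the table once, scan the words for each
# code, and stop as soon as every smiley has been accounted for.
sub find_smileys {
    my ($table, @words) = @_;
    my $remaining = count_smileys($table, @words);
    my @found;
    CODE: for my $code (sort keys %$table) {
        last CODE if $remaining == 0;
        for my $i (0 .. $#words) {
            next unless $words[$i] eq $code;
            push @found, [ $i, $code, $table->{$code} ];
            last CODE if --$remaining == 0;
        }
    }
    return @found;
}

my @words = split ' ', 'Some is sad :-( and some is glad :grin: :-(';
my @found = find_smileys(\%icon_for, @words);
printf "word %d: %s -> %s\n", @$_ for @found;
```

One thing the sketch makes visible: with exact string comparison, the pre-test already performs a hash lookup per word, so a single pass over the words would be enough. The nested loop is kept here only to mirror the approach described above.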

Example: In the test output below, quote marks enclose the found keys again, to make the display look more like the actual HTML output from the Perl program. The quotes are not used in the original. The backslashes found elsewhere in the output below tell Perl not to treat the special characters as pattern symbols, but to treat them like any other literal character. In captured text containing smiley markup, you backslash all the special characters (this is called escaping). You also escape the patterns you are trying to match. Then, word by word, wherever there’s a match, substitute the corresponding smiley icon image for the matched smiley code markup. Reassemble the words back into a sentence or paragraph. Finally, unescape the smiley-converted text and print it out.

IN text = This is some smilie ":-)" text. Some is sad ":-(" some is confused < ":???:" and some is really glad. ":grin:"

TEST3. Try splitting into words and testing each word against hash.

read was key ":grin:" , value icon_biggrin.gif Matched key \:grin\: to word \:grin\:

read was key ":-(" , value icon_sad.gif Matched key \:\-( to word \:\-(

read was key ":???:" , value icon_confused.gif Matched key \:\?\?\?\: to word \:\?\?\?\:

read was key ":-)" , value icon_smile.gif Matched key \:\-\) to word \:\-\)

This is some smilie text. Some is sad some is confused and some is really glad.
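The escape, match, substitute and reassemble steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original program: the smilify subroutine and the /images/smilies directory are hypothetical, the table again holds only the four codes from the sample, and only the pattern is escaped (with the core quotemeta function), so the text never needs unescaping afterwards.

```perl
use strict;
use warnings;

# Four entries taken from the sample output; the full table has 44.
my %icon_for = (
    ':grin:' => 'icon_biggrin.gif',
    ':-('    => 'icon_sad.gif',
    ':???:'  => 'icon_confused.gif',
    ':-)'    => 'icon_smile.gif',
);

# Hypothetical image directory; adjust to wherever the icons live.
my $icon_dir = '/images/smilies';

# Hypothetical helper: replace each smiley word with an <img> tag.
sub smilify {
    my ($text) = @_;
    my @words = split ' ', $text;          # split into words
    for my $word (@words) {                # $word aliases the array element
        for my $code (keys %icon_for) {
            my $pattern = quotemeta $code; # escape the pattern
            next unless $word =~ /\A$pattern\z/;
            $word = qq{<img src="$icon_dir/$icon_for{$code}" alt="$code">};
            last;                          # at most one code per word
        }
    }
    return join ' ', @words;               # reassemble the sentence
}

print smilify('This is some smilie :-) text. Some is sad :-( some is '
            . 'confused :???: and some is really glad. :grin:'), "\n";
```

Because the words are split on whitespace and rejoined with single spaces, runs of spaces and line breaks in the input are not preserved; a production version would need to handle that, and would also need to HTML-escape any code used as alt text.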
