This is the first post in what may become a series on solutions developed in the course of building LazyTweet. Today’s tip covers how to manage the profile pictures that many Twitter mashups display.
Whether it uses the Twitter Search API or the Twitter REST API, a Twitter app has access to full user information for each tweet, including the user’s name, username, and profile picture URL.
In the Search API, it looks something like this:
<link type="image/png" rel="image" href="http://s3.amazonaws.com/twitter_production/profile_images/57595462/avatar_normal.jpg"/>
<author>
  <name>lazytweet (lazytweet)</name>
  <uri>http://twitter.com/lazytweet</uri>
</author>
In the REST API:
<user>
  <id>10855142</id>
  <name>lazytweet</name>
  <screen_name>lazytweet</screen_name>
  <location>Everywhere</location>
  <description>Questions and Answers for Twitter - Get help from beyond your immediate followers</description>
  <profile_image_url>http://s3.amazonaws.com/twitter_production/profile_images/57595462/avatar_normal.jpg</profile_image_url>
  <url>http://www.lazytweet.com</url>
  <protected>false</protected>
  <followers_count>315</followers_count>
</user>
Originally, I constructed the LazyTweet data model in a very denormalized manner, storing the user information with every tweet, just as the APIs publish it. This meant that people who accumulated tweets in LazyTweet had their information stored over and over. The main reason was my unfamiliarity with the App Engine datastore; it was simply easier to keep moving forward with this approach.
I didn’t much mind the duplicate data; it wasn’t my server after all, it was Google’s. But as with probably any denormalized approach, maintenance is a bigger issue than disk space, and the first problem that came up was how often users’ profile pictures change. So, to avoid updating multiple records every time a user changed their picture, I properly learned about ReferenceProperty in App Engine and converted to distinct User and Post models.
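To make the difference concrete, here is a rough illustration of the normalized shape. It uses plain JavaScript objects rather than App Engine's ReferenceProperty, purely to match the other code in this post, and every field name in it is hypothetical rather than LazyTweet's actual schema.

```javascript
// Illustration only: plain objects standing in for datastore entities.
// Field names (screenName, profileImageUrl, userKey) are hypothetical.
var users = {
  lazytweet: {
    screenName: "lazytweet",
    profileImageUrl: "http://example.com/old_normal.jpg"
  }
};

// Each post stores a reference to its user instead of a copy of the user data.
var posts = [
  { statusId: 1, text: "first question", userKey: "lazytweet" },
  { statusId: 2, text: "second question", userKey: "lazytweet" }
];

// Resolve the reference at render time, so every post sees the current picture.
function imageUrlFor(post) {
  return users[post.userKey].profileImageUrl;
}

// A single write updates the picture for every post by this user.
function setProfileImage(userKey, url) {
  users[userKey].profileImageUrl = url;
}
```

With the denormalized layout, the same change would have meant rewriting every stored tweet by that user.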
OK, so what’s the big lesson/tip/trick/hack after all that buildup? It’s all in how you update those pictures. It’s easy enough to check whether the URL has changed when you pull a tweet into the system, but what about users who haven’t had a tweet come through in a while and still show up in searches, for example? Since the old images are actually removed from Twitter’s S3 storage, you end up with a lot of broken images on rendered pages. So we need an additional way to catch these. The trick is to use an HTML img tag’s onerror event attribute. Something like this:
<img onerror="updateProfileImage(this, 1138399932)" src="http://s3.amazonaws.com/twitter_production/profile_images/67192907/Ruth1_normal.JPG"/>
This invokes a simple function that uses jQuery’s AJAX support to call back to the server with the tweet’s status ID. The server looks up the new URL using the REST API’s show method, updates the stored value, and can return it to the client to update the page.
function updateProfileImage(image, status_id) {
  // Ask the server to look up the current picture for the author of this tweet.
  $.post("/user/update_profile_image", { status_id: status_id }, function(data) {
    // swap image.src with some url returned in data
  });
}
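One thing to watch for: if the replacement URL is also broken, the error event fires again and the handler would call the server a second time, possibly in a loop. Below is a minimal sketch of a guarded variant. It assumes the server replies with JSON shaped like { url: "..." }, which this post does not actually specify, and it takes the POST function as a parameter (a hypothetical arrangement) so the logic can be exercised without a browser.

```javascript
// Sketch: a loop-safe variant of updateProfileImage.
// `post` stands in for jQuery's $.post. The { url: ... } response shape
// is an assumption, not something the original endpoint is known to return.
function makeProfileImageUpdater(post) {
  return function updateProfileImage(image, statusId) {
    // Detach the handler first so a second failure cannot trigger another request.
    image.onerror = null;
    post("/user/update_profile_image", { status_id: statusId }, function (data) {
      if (data && data.url) {
        image.src = data.url; // swap in the fresh URL
      }
    });
  };
}

// In the browser, assuming jQuery is loaded:
//   var updateProfileImage = makeProfileImageUpdater(function (url, data, cb) {
//     return $.post(url, data, cb, "json");
//   });
```

The img tag itself stays exactly as shown earlier; only the function behind it changes.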
This way, no matter where the tweet is rendered in the application, stale images get noticed and fixed, and the only extra requests come from images that actually fail to load.
Now, I probably shouldn’t hardcode the event attribute in the tag and should instead attach an onerror handler using jQuery, but I haven’t yet tested how that would work, considering that images start loading early in page execution. I’m not sure which event to add the handlers in; $(document).ready() seems like it would be too late. I’ll report back if I get that cleaner.
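One possible way around the timing problem, untested against LazyTweet itself: image error events do not bubble, so delegated jQuery handlers never see them, but a plain capture-phase listener on the document does. Registered from an inline script in the head, it is in place before any img tag is parsed. The sketch below assumes each image carries its status ID in a data-status-id attribute; that attribute, the data-image-retried marker, and the onBroken callback are all hypothetical names.

```javascript
// Sketch: catch broken profile images with one capture-phase listener.
// `data-status-id`, `data-image-retried`, and `onBroken` are hypothetical names.
function watchBrokenImages(doc, onBroken) {
  doc.addEventListener("error", function (event) {
    var el = event.target;
    // Load errors for scripts, stylesheets, etc. also pass through here.
    if (!el || el.tagName !== "IMG") return;
    // Handle each image at most once, so a bad replacement URL cannot loop.
    if (el.getAttribute("data-image-retried")) return;
    var statusId = el.getAttribute("data-status-id");
    if (!statusId) return;
    el.setAttribute("data-image-retried", "1");
    onBroken(el, statusId);
  }, true); // true = capture phase, because image error events do not bubble
}

// In the browser, from an inline script in <head>:
//   watchBrokenImages(document, updateProfileImage);
```

Note that the status ID arrives as a string here, since it is read from an attribute.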