Avoiding rate limits when polling feeds

Posted by Kieren Pitts on 03 Feb 2010 | Tagged as: Web development

One of the cool things about the web is the ability to integrate content from a variety of sources. For example, you might want to pull in news data from an RSS feed and have the news items displayed on your web site.

This is easy to do with a script (for example, one written in Perl, PHP or Python) run from a cron job. The script requests the feed and stores a fragment of content that you can then include within your pages.
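As a rough illustration, a script of this kind might look something like the Python sketch below. It is only one way of doing it: the feed URL, the output filename and the function names are all made up for the example, and it assumes a plain RSS 2.0 feed whose items each have a title and a link.

```python
import html
import os
import tempfile
import urllib.request
import xml.etree.ElementTree as ET

# Both values are placeholders for this example, not real locations.
FEED_URL = "https://example.com/news/rss.xml"
FRAGMENT_PATH = "news_fragment.html"


def parse_items(rss_data, limit=5):
    """Return up to `limit` (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_data)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title and link:
            items.append((title, link))
        if len(items) >= limit:
            break
    return items


def render_fragment(items):
    """Build a small HTML list, escaping the feed's text before output."""
    lines = ["<ul>"]
    for title, link in items:
        lines.append('  <li><a href="%s">%s</a></li>'
                     % (html.escape(link, quote=True), html.escape(title)))
    lines.append("</ul>")
    return "\n".join(lines) + "\n"


def write_atomically(path, text):
    """Write to a temporary file, then rename it over the old fragment,
    so a page never includes a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp files are private by default
    os.replace(tmp_path, path)


def main():
    """Fetch the feed once and store the fragment; call this from cron."""
    with urllib.request.urlopen(FEED_URL, timeout=30) as response:
        data = response.read()
    write_atomically(FRAGMENT_PATH, render_fragment(parse_items(data)))
```

A cron entry would then run a small wrapper that calls main() at whatever interval suits the feed, and your pages would include the stored fragment file rather than requesting the feed themselves.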

However, many services limit the number of content requests you can make in a specific time frame. For example, Twitter limit the number of requests a client can make to their service each hour. This prevents the service being swamped. Exceeding this limit, depending on the service, might result in an error or the IP address of your server being temporarily blocked from accessing that service.

Hitting the buffers

Under normal circumstances you probably won’t need to request a feed so many times that you hit the limits yourself. However, imagine the scenario where your web site is hosted on a shared server. You may only be requesting the feed once every two hours but, if other users on the same server are requesting data from the same service at a much higher rate, the combined number of requests might push you over the limit.

In this situation you might find that your requests will fail even though you are making requests at a low frequency.
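One way to soften the effect of a failed request is to leave the previously stored fragment in place, so your pages keep showing the last good copy. The sketch below assumes a hypothetical fetch_fragment callable that returns the new fragment text and raises an OSError on failure (urllib's URLError and HTTPError, and socket timeouts, are all subclasses of OSError).

```python
def refresh_fragment(fetch_fragment, path):
    """Replace the stored fragment only when the fetch succeeds.

    `fetch_fragment` is a hypothetical zero-argument callable that
    returns the new fragment text. Returns True if the file was updated.
    """
    try:
        text = fetch_fragment()
    except OSError as exc:
        # The request failed (for example, the service refused it).
        # Leave the existing file untouched and try again next run.
        print("Feed request failed, keeping previous fragment:", exc)
        return False
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return True
```

The trade-off is that the content may go stale while the service is refusing requests, which is usually better than displaying an error or an empty block.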

Creating another source of the feed

If your access to a service is being limited by factors outside of your control then it’s often helpful to find an alternate source of the feed. This is rarely provided by the host service itself but can be easily set up using Yahoo! Pipes. Yahoo! Pipes are a great way to aggregate, filter and republish data from online sources.

However, at the most basic level, you can set up a simple pipe that requests the RSS feed from the external service and then republishes the data as RSS at a http://pipes.yahoo.com address (see the ‘Get as RSS’ link for your Pipe).

Screenshot of a simple pipe created in Yahoo! Pipes

Once you have created the Pipe you can then request the Pipes version of the RSS feed rather than the original (remembering that you should still keep the rate of requests at a low level).
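Keeping the request rate low can also be enforced in the script itself, in case the cron job is ever run more often than intended. The following is a minimal sketch that assumes the script stores its fragment in a file and uses that file's modification time as the record of the last successful request; the two-hour interval and the function name are illustrative only.

```python
import os
import time

# Assumed minimum gap between requests: two hours, in seconds.
MIN_INTERVAL_SECONDS = 2 * 60 * 60


def should_fetch(path, min_interval=MIN_INTERVAL_SECONDS, now=None):
    """Return True if the stored fragment is missing or old enough
    that requesting the feed again is reasonable."""
    if now is None:
        now = time.time()
    try:
        last_written = os.path.getmtime(path)
    except FileNotFoundError:
        return True  # nothing stored yet, so fetch
    return (now - last_written) >= min_interval
```

The script would call should_fetch() first and simply exit when it returns False.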

The result is that the request to the original service comes from the Yahoo! Pipes service rather than directly from your script, so you can circumvent the issues caused by other users on the same hosting. This works because services like Yahoo! Pipes are intended to be polled and are often white-listed.

Focusing on content migration

Posted by Kieren Pitts on 08 Jan 2010 | Tagged as: Project budgets, Project management, Web content

I recently posted a second article on the Internet Development blog. This time the intention of my article was to take a closer look at content migration.

Content migration is one of those strange areas that clients often overlook. The intention of my article was to highlight some of the pitfalls and encourage a more proactive approach.

Your website audience

Posted by Kieren Pitts on 21 Feb 2009 | Tagged as: Web statistics

I’ve not been able to find much time to write articles for my own blog recently. However, I have written an article for our Internet Development blog.

The article covers web stats and discusses what you can and can’t tell about your audience: “That most secretive of animals, your website audience”.

The death of email newsletters?

Posted by Kieren Pitts on 18 Dec 2007 | Tagged as: Accessibility, Marketing, Web development

I must admit I was surprised when the Email Standards Project (http://www.email-standards.org/) launched recently. The project aims to explain why Web standards are important for email. To any techie worth their salt the whole idea must grate horribly. After all, the Web is not email and email is not the Web.

I, and I can’t be alone, just want plain text emails rather than the bloated rubbish punted out by the marketing department (because they can, not because they should). I read my mail on a variety of devices, from my mobile to a desktop PC, and in a number of different clients, from Thunderbird to Pine. I don’t want to waste time and money (by paying for the extra bandwidth) downloading HTML email to my phone. And is my experience of reading an email really enriched by having it rendered in whatever colour or font the sender thinks I need to see?