Yahoo Pipe for new news sources

For a good while now, I’ve been spinning out the usual patter to folks I train about the virtues of Yahoo Pipes (and Feedrinse).  I’ve been waxing lyrical about the potential benefits, without having had a real-world problem to apply them to, and so test this blind faith.

Until this weekend that was.  Since I started work at the CIJ, I’ve been spending an ungodly amount of time scanning feeds relevant to that field (in developing the CIJ News Blog), and so neglecting my own feeds in the process.

Things came to a head yesterday when, settling down to go through a week’s worth of personal feeds, I ended up spending a good few hours I really didn’t (and don’t) have.

So this morning I rolled the sleeves up, and came up with a pipe for a very specific job.

I subscribe to a small number of high-turnover feeds which inform me about new sources in search, social media and news.  It takes a fair amount of time to make your way through all the TechCrunch, Mashable and ReadWriteWeb content, especially when you are looking for content applied in a particular field.

It occurs to me that if I made a feed out of these (and others), filtering for the term news, it could serve as a suitably light Tech News feed for new sources and information in that field.  Meanwhile, for the application of non-news engines and apps, I can still rely on those journalism sources I subscribe to (OJB, Andy Dickinson etc.) to a fair degree.

A while ago I posted up all my source-based feeds in an RSS Amnesty.

So basically, I lifted those relevant feeds from my Search Awareness and Technology folders, dumped them in Yahoo Pipes, created a filter (for the term news), then piped them out, here*
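For anyone who would rather see the logic than the Pipes canvas, here is a rough Python sketch of the same merge-then-filter idea. It is not how Yahoo Pipes works under the bonnet: the two feed snippets are made-up stand-ins for real feeds, the function names are hypothetical, and matching news as a whole word in the title or description is my own assumption about what a sensible filter would do.

```python
import re
import xml.etree.ElementTree as ET

# Two tiny made-up RSS documents standing in for real feeds (hypothetical data).
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>New news search engine launches</title>
<link>http://example.com/a1</link>
<description>A start-up indexes local papers.</description></item>
<item><title>Ten photo apps reviewed</title>
<link>http://example.com/a2</link>
<description>Cameras and filters.</description></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Social aggregator adds video</title>
<link>http://example.com/b1</link>
<description>Now with breaking news alerts.</description></item>
<item><title>Newsletter tips</title>
<link>http://example.com/b2</link>
<description>How to write better emails.</description></item>
</channel></rss>"""


def parse_items(rss_text):
    """Pull title, link and description out of every <item> in an RSS string."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def merge_and_filter(feeds, term):
    """Combine several feeds and keep items mentioning `term` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    kept = []
    for rss_text in feeds:
        for entry in parse_items(rss_text):
            if pattern.search(entry["title"]) or pattern.search(entry["description"]):
                kept.append(entry)
    return kept


matches = merge_and_filter([FEED_A, FEED_B], "news")
for entry in matches:
    print(entry["title"], "-", entry["link"])
```

The whole-word match means "Newsletter tips" is skipped, while an item that only mentions news in its description still gets through.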

Time will tell to what extent I’m missing out on content, or picking up too many articles along the lines of ‘Great news about…’.  Unfortunately Yahoo Pipes comes with all the usual language-based caveats associated with simple (or even Boolean) search.

And if it doesn’t work particularly well, I’ll just have to think a bit harder about how these stories might appear – or maybe use more sources (or the social web) to make the pipe more sophisticated (like this one).
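One way of thinking a bit harder, sketched in Python rather than in Pipes itself: strip out the stock phrases where news is an idiom rather than a topic, then check whether the word still appears. The sample titles and the list of noise phrases are assumptions of mine for illustration, not something taken from the pipe.

```python
import re

# Hypothetical headlines of the kind a plain "news" filter would let through.
TITLES = [
    "Great news about the new iPhone",
    "Startup launches local news search engine",
    "Good news for Twitter users",
    "News aggregator adds source ranking",
]

# The plain keyword match: "news" as a whole word, any case.
NEWS = re.compile(r"\bnews\b", re.IGNORECASE)

# Assumed list of idioms where "news" says nothing about the subject matter.
NOISE = re.compile(r"\b(great|good|bad)\s+news\b", re.IGNORECASE)


def looks_relevant(title):
    """Drop the idiomatic phrases first, then see whether 'news' is still there."""
    stripped = NOISE.sub("", title)
    return bool(NEWS.search(stripped))


kept = [title for title in TITLES if looks_relevant(title)]
for title in kept:
    print(title)
```

A title such as "Good news: a new news search engine" would still be kept, because the second mention survives once the idiom is removed.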



* I just spent almost as much time trying to embed the damn thing in this post as I did creating the Pipe in the first place.  I wonder how much time and effort is wasted by anally attentive folks like me trying to make sure the presentation of their blog doesn’t let the content down, and wondering whether there’s a way to automate it.

** Seems to be on the blink for now. Bloody technology…

*** Fixed now…



6 Responses to “Yahoo Pipe for new news sources”

  1. Posts about Mashable as of March 15, 2009 » The Daily Parr Says:

    […] I love it. I tried a few different methods of noting the posts I wanted to include in the Trip, Yahoo Pipe for new news sources – 03/15/2009 For a good while now, I’ve been spinning out the usual […]

  2. frontlineclub Says:

    I find Yahoo Pipes enormously useful for filtering a broad segment of the Internet for quite specific information, but I don’t understand why a Pipe should be slower than a raw keyword RSS feed.

    This is based on completely unscientific research, but Pipes seem to run in batches every few hours or so. That’s my impression at least.

    In my RSS reader I have a Pipe that collects a tonne of raw keyword feeds in one feed. I also have a folder with all those raw feeds as straight RSS, no filtering, just to compare the speed of delivery and the Pipe is generally slower, but pretty accurate at pulling out only the stuff I need.

    If you have any idea why this might be the case, I’d love to know…

    In addition, how do you compare Pipes to search sites like Kosmix? I find they miss too much, but are very user friendly.

    Lastly, I’ve been running a small-scale experiment for some journalists by designing custom Pipes for them. Just hope they bother to tell me if they’re any good or not.

  3. slewfootsnoop Says:

    Hi Graham – I have no idea why a pipe of a feed may take longer than an equivalent keyword feed – do you mean in terms of seconds/fractions of a second slower?

    I guess it would be interesting to see a comparison of raw keyword feeds from Google and Yahoo, as well as page-2-rss – all vs Yahoo Pipes.

    Is it possible that Pipes does some sort of analysis on the feed before making it available, whether you’ve set filters or not?

    The only other thing I can think of is that, for non-Yahoo RSS, Pipes has to access these indices in the same way the surfer does – according to whichever server we are directed to (and hence a possible delay) – whereas Google may provide a local network for its RSS systems to access its index at source. Though that could be bs, to be honest.

    I haven’t tried kosmix yet, but will give it a whirl this weekend – thanks for the tip.

    By the way, I just had a look at the pipe you built for Thomas Wiegold – very interesting.

    I took a look at the first pipe you are piping into that feed – by See-ming Lee – looks impressively complicated!

    The first thing that occurs to me is whether you can tell what can be tweaked and what can’t. For example, is it possible to tell (using stats, and a bit of editorial judgement, over time) which of these sources tends to produce better results than others, and so prioritise them accordingly?

    Great stuff though – a real eye opener.


  4. frontlineclub Says:

    Thanks for following up.

    I’ve done a few more little tests and am slightly more confused, or possibly enlightened.

    I have one big pipe I use every day to monitor news about journalism and journalists for the Frontline Club. I’ve just tested the exact same pipe feed in Netvibes, Google Reader, on Yahoo Pipes page itself and in NetNewsWire. All offer different result speeds. So is it an application/webtool thing and not a pipe thing?

    The Yahoo Pipes page is quickest to get stuff, which would make sense. However, even when I manually refresh the other three, the new feed I can see on the Pipes does not appear in the other destinations.

    I can see seven results in Netvibes that I can’t see in Google Reader – even with repeated refreshes while I write this. There are many more missing in NetNewsWire.

    I’ve taken some screenshots of this and may blog it because it has me very confused. However, it does remind me of a guy I was talking to a year or so ago who ran a niche news agency, who said they couldn’t rely on RSS as it was too slow; they used something else which offered real immediacy, a bespoke system I believe.

    I might have to rethink how I follow breaking news after this experiment.

  5. slewfootsnoop Says:

    Graham, I had a brief chat with Mike Shrenk at the Summer School last weekend on this issue, and he told me a possible cause of the type of delay you describe would be the caching of this content in the various systems you’ve mentioned.

    Unfortunately I missed Mike’s talk and lab on web spidering (CIJ admin is a harsh mistress), but he tells me he will put the notes up on his site anon, and judging by the buzz surrounding his sessions, they would be well worth a read.

  6. frontlineclub Says:

    Cheers, thanks for that. Glad I subscribed to the comments here…

    BTW – I may need a wee bit of help on a project I am working on. Might be in touch re: that in Sept/Oct.
