John Evans (johnevans) wrote,

The Problem With Pipes

I've been playing around with Yahoo Pipes recently. This has mostly been in the realm of seeing what it can do; I haven't found a really good use for it yet—at least not something that I am impressed by. But then, I don't work with feeds very often, so maybe I'm not the "intended audience". There's nothing wrong with that.

There is one interesting thing I've noticed, however. Pipes has a few obvious deficiencies in its language. The most obvious one is that it's impossible (or at least very awkward, more on that later) to extract text from items. "text" and "items" are two of the Pipes data types, representing a text string and a group of feed entries, respectively.

Here's an example to show what I mean. Let's say you have a feed with a certain number of items, and you want to choose one specific item from it. This is not too difficult; I've already written a pipe to choose one item from a feed. This pipe takes two inputs: a URL and a number. It uses Fetch Site Feed to get the feed from the URL, and it uses some math and filters to get only the specified item. Simple!
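
For readers who think better in code than in wiring diagrams, here is a rough Python sketch of what that pipe computes. It is only a model: the feed is a plain list of dictionaries, and the name choose_item is made up for illustration. The real pipe is built from Fetch Site Feed plus math and filter modules, not from anything like this code.

```python
from typing import Dict, List

Item = Dict[str, str]  # assumption: a feed entry modeled as a dict of fields


def choose_item(items: List[Item], number: int) -> List[Item]:
    """Return a feed holding only the number-th entry (counting from 1).

    Like a pipe, this hands back "items" (a list), not a bare entry.
    An out-of-range number gives an empty feed (an assumption of this sketch).
    """
    if 1 <= number <= len(items):
        return [items[number - 1]]
    return []


# Stand-in for what Fetch Site Feed would return for some URL.
feed = [
    {"title": "First post", "content": "alpha"},
    {"title": "Second post", "content": "beta"},
    {"title": "Third post", "content": "gamma"},
]

result = choose_item(feed, 3)
print(result[0]["title"])  # prints "Third post"
```

Note that the number arrives as a separate input of its own type. That detail is what causes trouble in the next example.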

The Example That Doesn't Work

But let's think of a slightly different example. Let's say that you want to choose a specific item from a feed, but the choice of item comes from a different webpage. Say you have a webpage that displays nothing but the number "3", and that means you want the third item from the feed. But you don't know the number until run-time; it could be 3, or 4, or 1600, or anything. You want the pipe to query the webpage to find out which entry to get from the feed. This turns out to be difficult.

The problem is that while you can get the webpage data with Fetch Page (or Fetch Data, or even Fetch CSV if you want), those modules don't return "number"; they return "items". The math and filter modules need a "number" parameter to do their thing. "items" is a collection of data entries; it can't be used as a string or a number, even if it holds only one entry.

How to Solve the Problem

First, I should note that one part of this problem is not actually a problem. Pipes will convert strings into numbers. You can create a String Input containing "3" and hook it into a "number" input, which will then be set to 3. So, the problem isn't converting the text; the problem is getting it out of the "items" type.

I propose a new module, perhaps called String Extractor. In its simplest form, it would take in "items" and return "text" representing the default content of the first entry in the feed. If we wanted to get more fancy, we could add a "number" parameter to denote which entry to extract, and also perhaps a field to choose which element of the entry gets extracted (like Rename or Regex lets you choose).
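
Here is a minimal Python sketch of how such a module might behave. Everything in it is hypothetical: the function name string_extractor, the parameter names, the choice of "content" as the default field, and the decision to return an empty string when the entry or field is missing are all my assumptions, since no such module exists in Pipes.

```python
from typing import Dict, List

Item = Dict[str, str]  # assumption: a feed entry modeled as a dict of fields


def string_extractor(items: List[Item], number: int = 1, field: str = "content") -> str:
    """Hypothetical String Extractor: takes "items", returns "text".

    number picks the entry (counting from 1) and field picks the element.
    A missing entry or field gives an empty string (an assumption).
    """
    if not 1 <= number <= len(items):
        return ""
    return str(items[number - 1].get(field, ""))


# Stand-in for the one-entry result of fetching a page that shows only "3".
page_items = [{"content": "3"}]

text = string_extractor(page_items)  # simplest form: first entry, default field
number = int(text)                   # the text-to-number step Pipes already handles
print(number)  # prints 3
```

The fancier form is the same call with the optional arguments filled in, for example string_extractor(feed_items, number=2, field="title").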

I don't know much about how Pipes is implemented, but I have a good reason to believe this module would be simple to add.

The Workaround

The truth is, it's already possible to create a pipe that performs this function, using a "trick" that some Pipes developers have come up with. It actually involves creating two pipes.

1. First create a pipe that chooses a specific item from a feed. That was my first example, above.

2. Make a second pipe that fetches a number from a page. Process it until you have a feed with one item, containing the number.

3. Bring in a Loop module and place the first pipe inside it as a sub-pipe. Hook the number-feed up to the Loop module's input. Set the sub-pipe's number input to be "item.content".

The way this works is that "for every item in the input feed", the sub-pipe will be run on it and the content field will be used as the sub-pipe's parameter. Of course, there's only one item in the input feed, and it contains the specified number, so the loop is run once to choose the item from the feed.
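
The same trick can be modeled in a few lines of Python. This is a sketch under assumptions, not how Pipes works internally: feeds are lists of dictionaries, choose_item stands in for the first pipe, and the loop function is my simplified model of a Loop module that emits whatever the sub-pipe returns.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]  # assumption: a feed entry modeled as a dict of fields


def choose_item(items: List[Item], number: int) -> List[Item]:
    """Stand-in for the first pipe: keep only the number-th entry (from 1)."""
    if 1 <= number <= len(items):
        return [items[number - 1]]
    return []


def loop(input_items: List[Item], sub_pipe: Callable[[Item], List[Item]]) -> List[Item]:
    """Simplified Loop module: run sub_pipe once per input item, emit all results."""
    output: List[Item] = []
    for item in input_items:
        output.extend(sub_pipe(item))
    return output


feed = [
    {"title": "First post", "content": "alpha"},
    {"title": "Second post", "content": "beta"},
    {"title": "Third post", "content": "gamma"},
]

# Stand-in for the second pipe's output: one item whose content is the number.
number_feed = [{"content": "3"}]

# The sub-pipe's number input is wired to item.content.
result = loop(number_feed, lambda item: choose_item(feed, int(item["content"])))
print(result[0]["title"])  # prints "Third post"
```

The loop body runs exactly once here, so the loop is not really looping over anything; it is only being used to pull a value out of an item.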

Why a New Module?

Some might ask why we should be able to do this with Pipes at all. The new module would open up all sorts of interesting data filtering, much more like programming than simply mashing feeds together. I think this would be well worth it. And it's obvious that Pipes developers want this functionality, because a lot of them discuss this "trick" on the Pipes discussion forums.

Some might also ask why a new module is needed if we can already perform this function. The answer is that the current way is awkward and far from obvious, which makes it hard for new Pipes developers to figure out. Maybe some of them have already been discouraged. Also, because this method requires a sub-pipe, it inflates the number of pipes you need.

And because the functionality of the module I'm proposing is already part of the Loop module, it's certain to be easy to implement.
Tags: mashup, programming, web applications, web programming, yahoo pipes