Magpie RSS FAQ
=============

General
-------

1.  What is MagpieRSS?

    Okay, this actually hasn't been asked much, but MagpieRSS (aka Magpie) is an RSS and Atom parser for PHP.

2.  What versions of RSS do you support?

    MagpieRSS parses RSS 0.9, RSS 1.0, and the various Userland RSS versions (0.9x and 2.0). Additionally, it supports Atom 0.3 and many custom RSS namespaces.

3.  Where can I get more info about MagpieRSS? Where can I get help?

    There is the [Magpie blog][] for news, tips and general arcana.

    There is the [Magpie links page](http://magpierss.sf.net), which includes links to tutorials, howtos, and open source projects using Magpie (a good place to start if you're looking for examples).

    Lastly, there is a [mailing list][], which can be a good place to get help.

4.  How should I ask a question? What is the best way to get help?

    Okay, no one asks this question, but they should at least ask themselves this question. When asking a question:

    1\. Check the feed at the [feed validator][].

    2\. Include the URL of the RSS feed that is causing problems. Without this, we can't help you, we just can't.

    3\. Explain what problem you're seeing.

    4\. Include which version of PHP and which version of Magpie you're using.

5.  What is RSS? What is Atom?

    ...

6.  Is the name Magpie or MagpieRSS?

    Officially the name is MagpieRSS, but Magpie is the affectionate nickname, and probably more accurate since Atom 0.3 support was added.

7.  My question wasn't answered. I've got a better answer.

    Use the [mailing list](http://sourceforge.net/mail/?group_id=55691).

8.  Can I donate to Magpie? How can I help?

    I swear, that is a *frequently* asked question.
    The other best way to help is to answer questions on the [mailing list][], and to submit question and answer pairs for the FAQ.

Installation
------------

1.  How do I install MagpieRSS?

    See: http://laughingmeme.org/magpie_blog/?p=80

MagpieRSS and Caching
---------------------

1.  How does Magpie caching work?

    When Magpie successfully fetches and parses a feed, it saves the resulting PHP object to a file in the "cache directory" (this is called "serializing"). The next time Magpie is asked to fetch that feed, it will check for a cached version first.

2.  Why is it important?

    1\. Pages will load much faster with caching enabled. Rather than having to fetch and parse the feed each time the page is served, you do these slow operations once per hour (for example), and everyone else will see the speed up.

    2\. Many sites will ban you if you fetch their RSS feed too frequently (or at least complain). Caching keeps you from doing this.

    3\. Conditional GETs are an important technique for reducing bandwidth consumption, and they only work if the cache system is enabled.

    4\. If the remote server is down, or slow enough to time out, Magpie can continue to serve the old (stale) version of the RSS until the remote server comes back.

3.  Is caching on?

    Magpie ships with caching on by default, so unless you turned it off, Magpie will *try* to use the cache system.

4.  Where is the cache directory?

    By default Magpie will attempt to create a directory named 'cache' in the *working directory* of the PHP script which invoked it.

    That is to say, if you have a script named blog.php that resides at /var/www/mysite/blog.php and uses Magpie, Magpie will attempt to create the cache directory /var/www/mysite/cache.

    You can override this default with:

        define('MAGPIE_CACHE_DIR', '/var/foo/magpie/cache/dir/for/example');

5.  How do I know if caching is working?

    Check inside your cache directory for files with names like '25cd55bbc2766c84b57a3302daa8ba2e'.

    Alternately, if you can't find a cache directory, try turning on debugging (see: How to debug Magpie), and look for the error message "Cache couldn't make dir ...."

6.  Caching doesn't seem to be working, what's wrong?

    This is a **very** frequent question.

    A number of things could be wrong. The most common is that your web server does not have permission to write to your working directory. In this case you'll want to manually create the cache directory and make it web writeable. How to do this varies from platform to platform, and host to host, but the basic idea is:

        mkdir /var/www/mysite/cache;
        chown _web-user_:_web-group_ /var/www/mysite/cache;

    * On Debian _web-user_ and _web-group_ by default will be www-data and www-data
    * On Redhat..... (I don't know, help?)
    * On BSD...... (I don't know, help?)
    * On OS X _web-user_ and _web-group_ are www and www

7.  I can't follow the above example because I don't have sufficient permissions (or I don't have a shell account, or I don't understand)

    Turning off caching is **never** a good idea, so I recommend figuring out some way to make it work. A few options are:

    * Put your cache in the /tmp directory, like so:

            define('MAGPIE_CACHE_DIR', '/tmp/magpie_cache');

      This approach has some security issues that will be addressed in a future version of Magpie.

    * Make your cache directory world writeable. This is always a bad idea, so I'm not going to cover how to do it.

    * Move your cache into a database. This won't be as fast, but it is one solution. I'll discuss it more in the future.

Troubleshooting
---------------

1.  Error: "Failed to load PHP's XML Extension."

    Magpie depends on PHP being compiled with XML support. If it hasn't been, you'll need to rebuild your PHP to support it (or get your ISP to).

    http://www.php.net/manual/en/ref.xml.php

2.  Warning: MagpieRSS: Failed to parse RSS file.
    (not well-formed (invalid token) at line x, column y)

    The RSS feed you're trying to parse contains an invalid character. Check it at: http://feedvalidator.org

    If the [feed validator][] doesn't find a problem, then send an email to the [mailing list][] with the problem you're experiencing and the URL of the feed which is causing the error.

    Some RSS parsers are based on regular expressions and can parse invalid RSS, but they have their own problems.

3.  Fatal error: Call to undefined function: array\_change\_key\_case()

    Magpie requires at least PHP 4.2.0 (released April, 2002), and has been tested to work on all versions of PHP including PHP5.

    If you must use an ancient version of PHP, download the following file, and include it in your scripts.

    http://cvs.php.net/pear/PHP\_Compat/Compat/Function/array\_change\_key\_case.php

4.  Error: MagpieRSS: Failed to fetch http://example.com/rss.xml. (HTTP Error: connection failed (1)

    A connection error of type 1 means "permission denied". This usually means that your ISP has configured PHP so that it can't open outgoing sockets (usually for security reasons). The only solution to this is to ask your ISP for help.

    Sometimes you'll also get the related `connection failed (11)` (e.g. on sourceforge.net), which also means PHP is configured in such a way that Magpie can't work.

In Use
------

1.  How do I display the full HTML of an item? How do I access the content:encoded field?

        echo $item['content']['encoded'];

2.  How do I find out what fields Magpie supports? Does Magpie support foo?

    The simplest way to find out if Magpie can parse a given field is to find a feed with that field and test it using scripts/magpie_debug.php from a recent version of Magpie. This will display a `var_dump()` of the parsed RSS object. Look for your fields.
    For example, if we dump the RSS feed from the [Magpie blog][], we could scroll down until we found:

        ["items"]=>
        array(10) {
          [0]=>
          array(9) {
            ["about"]=>
            string(41) "http://laughingmeme.org/magpie_blog/?p=83"
            ["title"]=>
            string(32) "Consumer Recall on MagpieRSS 0.7"
            ["link"]=>
            string(41) "http://laughingmeme.org/magpie_blog/?p=83"
            ["dc"]=>
            array(3) {
              ["date"]=>
              string(20) "2004-12-12T18:59:00Z"
              ["creator"]=>
              string(34) "kellan (mailto:kellan@protest.net)"
              ["subject"]=>
              string(8) "LM"
            }
            ["description"]=>
            string(302) "We have reports of certain..."
            ["content"]=>
            array(1) {
              ["encoded"]=>
              string(595) "We have reports of certain models of MagpieRSS 0.7..."
            }
            ["date_timestamp"]=>
            int(1102877940)
          }

    From this we can see that Magpie successfully found and parsed the [Dublin Core] and [content] modules, as well as the default fields.

    In general Magpie will support name field of the following form, whether or not it has ever heard of it: value or value

The Cookbook: Solutions to common programming challenges.
---------------------------------------------------------

1.  Limit the number of headlines (aka items) returned.

    ### Problem ###

    You want to display the 10 (or 3) most recent headlines, but the RSS feed contains 15.

    ### Solution ###

        $num_items = 10;
        $rss = fetch_rss($url);

        $items = array_slice($rss->items, 0, $num_items);

    ### Discussion ###

    Rather than trying to limit the number of items Magpie parses, a much simpler and more flexible approach is to take a "slice" of the array of items. And `array_slice()` is smart enough to do the right thing if the feed has fewer items than `$num_items`.

    See: http://www.php.net/array_slice

2.  Display a custom error message if something goes wrong

    ### Problem ###

    You don't want Magpie's error messages showing up if something goes wrong.

    ### Solution ###

        # Magpie throws USER_WARNINGS only
        # so you can cloak these, by only showing ERRORs
        error_reporting(E_ERROR);

        # check the return value of fetch_rss()
        $rss = fetch_rss($url);

        if ( $rss ) {
            # ...display rss feed...
        }
        else {
            echo "An error occurred! " .
                 "Consider donating more $$$ for restoration of services." .
                 "\nError Message: " . magpie_error();
        }

    ### Discussion ###

    MagpieRSS triggers a warning in a number of circumstances. The 2 most common circumstances are: if the specified RSS file isn't properly formed (usually because it includes illegal HTML), or if Magpie can't download the remote RSS file and there is no cached version.

    If you don't want your users to see these warnings, change your error_reporting settings to only display ERRORs. Another option is to turn off display_errors, so that WARNINGs and NOTICEs still go to the error_log but not to the webpages.

    You can do this with:

        ini_set('display_errors', 0);

    See:

    * http://www.php.net/error_reporting
    * http://www.php.net/ini_set
    * http://www.php.net/manual/en/ref.errorfunc.php

3.  Generate a new rss feed

    ### Problem ###

    Create an RSS feed for other people to use.

    ### Solution ###

    Use Useful Inc's [RSSWriter](http://usefulinc.com/rss/rsswriter/)

    ### Discussion ###

    An example of turning a Magpie parsed RSS object back into an RSS file is forthcoming. In the meantime RSSWriter has great documentation.

4.  Display headlines more recent than a given date

    ### PROBLEM ###

    You only want to display headlines that were published on, or after, a certain date.

    ### SOLUTION ###

        require 'rss_utils.inc';

        # get all headlines published today
        $today = getdate();

        # today, 12AM
        $date = mktime(0,0,0,$today['mon'], $today['mday'], $today['year']);

        $rss = fetch_rss($url);

        foreach ( $rss->items as $item ) {
            $published = parse_w3cdtf($item['dc']['date']);
            if ( $published >= $date ) {
                echo "Title: " . $item['title'];
                echo "Published: " . date("h:i:s A", $published);
                echo "\n\n";
            }
        }

    ### DISCUSSION ###

    This recipe only works for RSS 1.0 feeds that include the `dc:date` field (which is very good RSS style).

    `parse_w3cdtf()` is defined in rss_utils.inc, and parses RSS style dates into Unix epoch seconds.

    See: http://www.php.net/manual/en/ref.datetime.php

5.  Parse a Local File Containing RSS

    ### PROBLEM ###

    MagpieRSS provides fetch_rss(), which takes a URL and returns a parsed RSS object, but what if you want to parse a file stored locally that doesn't have a URL?

    ### SOLUTION ###

        require_once('rss_parse.inc');

        $rss_file = 'some_rss_file.rdf';
        $rss_string = read_file($rss_file);
        $rss = new MagpieRSS( $rss_string );

        if ( $rss and !$rss->ERROR) {
            # ...display rss...
        }
        else {
            echo "Error: " . $rss->ERROR;
        }

        # efficiently read a file into a string
        # in php >= 4.3.0 you can simply use file_get_contents()
        #
        function read_file($filename) {
            $fh = fopen($filename, 'r') or die($php_errormsg);
            $rss_string = fread($fh, filesize($filename) );
            fclose($fh);
            return $rss_string;
        }

    ### DISCUSSION ###

    Here we are using MagpieRSS's RSS parser directly, without the convenience wrapper of fetch_rss(). We read the contents of the RSS file into a string, and pass it to the parser constructor. Notice also that error handling is subtly different.

    See: http://www.php.net/manual/en/ref.filesystem.php

[mailing list]: http://sourceforge.net/mail/?group_id=55691
[Magpie blog]: http://laughingmeme.org/magpie_blog
[feed validator]: http://feedvalidator.org