HTML::RSSAutodiscovery - methods for retrieving RSS-ish information from an HTML document.
use HTML::RSSAutodiscovery;
use Data::Dumper;
my $url = "http://www.diveintomark.org/";
my $html = HTML::RSSAutodiscovery->new();
print Dumper($html->parse($url));
# Mark's gone a bit nuts with this and
# the list is too long to include here...
# see the POD for the 'parse' method for
# details of what it returns.
Methods for retrieving RSS-ish information from an HTML document.
new()
Object constructor. Returns an object. Woot!
parse($arg)
Parse an HTML document and return RSS-ish <link> information.
$arg may be either:
Returns an array reference of hash references whose keys are:
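Since parse() returns an array reference of hash references, the result can be walked generically. What follows is a minimal sketch, not part of the module: format_links is a hypothetical helper, and the sample data uses made-up key names ('href', 'type') and example.com URLs purely so the code runs without network access. The helper discovers keys at run time and assumes nothing about their names.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an array ref of hash refs (the shape that
# parse() is documented to return) into one printable line per link.
sub format_links {
    my ($links) = @_;
    my @lines;
    for my $link (@{ $links || [] }) {
        my @pairs;
        for my $key (sort keys %$link) {
            my $value = defined $link->{$key} ? $link->{$key} : "";
            push @pairs, "$key=$value";
        }
        push @lines, join(", ", @pairs);
    }
    return \@lines;
}

# Stand-in data. The key names here are assumptions for illustration only.
my $sample = [
    { href => "http://example.com/index.rdf", type => "application/rss+xml" },
    { href => "http://example.com/atom.xml",  type => "application/atom+xml" },
];

print "$_\n" for @{ format_links($sample) };
```

With the module installed, the same helper should accept the real result directly, as in format_links($html->parse($url)).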
locate($uri,\%args)
Like the parse method, but will perform additional lookups, if necessary or specified.
Valid arguments are:
$uri
String. A live, breathing URI to slurp and parse.
Required.
\%args
Hash ref whose keys may be:
Boolean. Don't bother parsing the document. This will also prevent you from checking for embedded links.
I don't know why you want to do this, but you can.
False, by default.
embedded
Boolean. Check all embedded links ending in '.xml', '.rss' or '.rdf' (and then 'xml', 'rss' or 'rdf') for RSS-ness.
False, by default, unless the initial parsing of the URI returns no RSS links.
embedded_and_remote
Boolean. Check all embedded links whose root is not the same as $uri for RSS-ness.
False, by default.
syndic8
Boolean. Check the syndic8 servers for sites matching $uri.
False, by default, unless the initial parsing of the URI and any embedded links returns no RSS links.
Returns an array reference of hash references whose keys are:
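A call to locate() takes the URI plus a hash ref of boolean options. The sketch below is one way to build that hash ref safely. It is not part of the module: locate_args and find_feeds are hypothetical helpers. Only the three option names spelled out above (embedded, embedded_and_remote, syndic8) are recognised. The option that skips parsing is left out because its key name is not given here.

```perl
use strict;
use warnings;

# Option names taken from the argument list above.
my %KNOWN = map { $_ => 1 } qw(embedded embedded_and_remote syndic8);

# Hypothetical helper: build the \%args hash ref for locate(),
# rejecting misspelled option names and normalising values to 1 or 0.
sub locate_args {
    my (%opts) = @_;
    my %args;
    for my $name (sort keys %opts) {
        die "Unknown locate() option: $name\n" unless $KNOWN{$name};
        $args{$name} = $opts{$name} ? 1 : 0;
    }
    return \%args;
}

# Hypothetical wrapper: load the module only when a lookup is requested.
sub find_feeds {
    my ($uri, %opts) = @_;
    require HTML::RSSAutodiscovery;
    my $html = HTML::RSSAutodiscovery->new();
    return $html->locate($uri, locate_args(%opts));
}

my $args = locate_args(embedded => 1, syndic8 => 0);
for my $name (sort keys %$args) {
    print "$name => $args->{$name}\n";
}

# Needs the module and network access, so it is not run here:
# my $links = find_feeds("http://www.diveintomark.org/", embedded => 1);
```

Validating the names up front means a typo such as 'embeded' fails loudly instead of being passed along unrecognised.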
1.21
$Date: 2004/10/17 04:13:06 $
Aaron Straup Cope
Because you shouldn't need all that white space to do cool stuff ;-)
http://diveintomark.org/archives/2002/05/30.html#rss_autodiscovery
http://diveintomark.org/archives/2002/08/15.html
http://diveintomark.org/projects/misc/rssfinder.py.txt
These packages are required to actually parse an HTML document or URI.
These packages are required to check the embedded links in a URI for RSS files. They are not loaded until run-time, so they are not required for basic parsing.
These packages are required to query the syndic8 servers for RSS files associated with a URI. They are not loaded until run-time, so they are not required for basic parsing.
Copyright (c) 2002-2004, Aaron Straup Cope. All Rights Reserved.
This is free software; you may use it and distribute it under the same terms as Perl itself.