I would like to be able to parse RSS and Atom feeds that contain non-valid XML. The errors I have encountered and would like to fix include "simple" things such as a `&gt` where the closing `;` is missing, missing closing tags, and closing tags that appear in the wrong order.
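
To make the three kinds of error concrete, here is a small made-up fragment. It is not taken from a real feed, and the element contents are invented purely for illustration:

```xml
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <!-- entity reference without its closing ";" -->
      <title>1 &gt 0</title>
      <!-- <b> is opened but never closed -->
      <description>Some <b>bold text</description>
    </item>
    <item>
      <!-- closing tags in the wrong order: </i> before </b> -->
      <title><i><b>Mixed up</i></b></title>
    </item>
  </channel>
</rss>
```

A strict XML parser would be expected to reject this document at the first of these errors, whereas I would like to recover both items.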
I would like to ignore the question of whether, in theory, it makes any sense to attempt parsing malformed XML documents at all. One "technical" term that seems to come rather close to what I want to do is "tag soup". What existing CPAN modules should I use to build a parser that is able to tolerate or correct simple errors like those described above?