Regex to return <a href> attributes
I can throw together regular expressions that extract dates and other easy stuff, but this one is pretty taxing, for me anyway.
Let's say I have the following HTML:
HTML Code:
<ul>
<li><a href="file1.html" title="This is a title for file1">file 1</a></li>
<li><a href="banana.html" title="">Banana</a></li>
<li><a href="tg.html">tg</a></li>
<li><a href="bfsog.html" id="bfsog" title="biffy">biffy</a></li>
</ul>
I want the href and the link text for each anchor. I have come up with:
Code:
([<a href=\".*\">.*</a>])
To be honest, I am not sure whether that actually works. I have been switching between doing this in PHP and C# (I do not think a PHP regex will work unchanged in a C# app).
So in short, from the above example I would like output like:
Quote:
file1.html file 1
banana.html Banana
tg.html tg
bfsog.html biffy
Thanks in advance
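For reference, here is a minimal sketch of one pattern that yields the href and link text for the sample list above. It is written in Python only to keep it short and runnable; the names `LINK_RE` and `extract_links` are made up for illustration and come from neither PHP nor C#. It assumes double-quoted href values and simple anchors, so it is not a general HTML parser.

```python
import re

# Hypothetical pattern (not from the original post).
# Group 1 is the href value, group 2 is the raw link text.
# Assumes href values are double-quoted; any other attributes are skipped.
LINK_RE = re.compile(
    r'<a\s+[^>]*?href="([^"]*)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def extract_links(html):
    """Return a list of (href, link_text) tuples found in html."""
    return LINK_RE.findall(html)


html = """
<ul>
<li><a href="file1.html" title="This is a title for file1">file 1</a></li>
<li><a href="banana.html" title="">Banana</a></li>
<li><a href="tg.html">tg</a></li>
<li><a href="bfsog.html" id="bfsog" title="biffy">biffy</a></li>
</ul>
"""

# Print one "href text" pair per line.
for href, text in extract_links(html):
    print(href, text)
```

The pattern uses only constructs common to most regex flavours (character classes, lazy quantifiers, capture groups), so it should port to PHP's `preg_match_all` or .NET's `Regex.Matches` with little more than quoting changes, though that is untested here. For messy real-world markup (single-quoted attributes, nested tags, comments), an HTML parser is likely the safer choice.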