Subject: Regexp-based HTML parsing is sucky.
Not specifically your implementation of it, but the entire idea of
parsing HTML using regexps is broken.
Particular problems in your implementation...
It can't identify the following <title> element:
<title lang="en">Hello World</title>
It can't find this link:
<a href="https://metacpan.org/">CPAN</a>
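
For comparison, here is a rough sketch of how a real parser copes with both snippets above. It's in Python, using the standard library's html.parser, purely as an illustration; the names TitleAndLinkParser and extract are made up for this example and have nothing to do with your module. The point is only that a tokenising parser doesn't care about extra attributes on a tag.

```python
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Hypothetical helper: collects the <title> text and every <a href>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, so extra attributes
        # such as lang="en" do not affect tag recognition.
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only text seen between <title> and </title> is kept.
        if self.in_title:
            self.title += data


def extract(html):
    """Return (title_text, list_of_hrefs) for an HTML string."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    return parser.title, parser.links


if __name__ == "__main__":
    sample = (
        '<title lang="en">Hello World</title>\n'
        '<a href="https://metacpan.org/">CPAN</a>'
    )
    print(extract(sample))
```

The same approach is available on the Perl side through any of the proper HTML parsing modules on CPAN, so there's no need to hand-roll the tokenising.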