Subject: TreeBuilder can break a tree with correct nesting
First, sorry for my English :p
My problem is: if we have HTML whose nesting is correct but which is
invalid according to the DTD, parsing it without implicit_tags=0 can
break the original nesting.
As an example, I have this HTML (correct when viewed simply as a tree):
<div id="some">
<div>
<font>
<div>
<h1>
<font>
<p>!</p>
</font>
</h1>
</div>
</font>
</div>
111
<table>
<tr>
<td>!!</td>
</tr>
</table>
</div>
Parse it with implicit_tags ON, use look_down on the tree to find
id='some', and output the result via as_HTML. What I get:
<div id="some"><div><font><div><h1><font> </font></h1><p>!</p></div>
</font></div></div>
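For anyone who wants to reproduce this, the steps above can be sketched as a short script. It is only a sketch: the variable names are mine, and passing an empty hash as the third argument of as_HTML is my assumption, made so that optional end tags such as </p> are printed. The exact output may differ between HTML::TreeBuilder versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# The HTML from this report: well nested as a tree, but invalid per the DTD
# (block elements placed inside <font>).
my $html = <<'HTML';
<div id="some">
<div>
<font>
<div>
<h1>
<font>
<p>!</p>
</font>
</h1>
</div>
</font>
</div>
111
<table>
<tr>
<td>!!</td>
</tr>
</table>
</div>
HTML

my $tree = HTML::TreeBuilder->new;
$tree->implicit_tags(1);    # ON is the default; set explicitly for clarity
$tree->parse($html);
$tree->eof;

# Find the outer div by its id attribute.
my $some = $tree->look_down(id => 'some');
die "div#some not found\n" unless $some;

# Empty hashref (assumption): treat no end tag as optional when printing.
print $some->as_HTML(undef, undef, {}), "\n";

$tree->delete;
```

Changing the argument of implicit_tags to 0 in the same script gives the OFF behaviour for comparison.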
This happens because, as I understand it, the block element 'p' cannot
be placed under the phrase element 'font'. So the inner 'font' is
force-closed, and its corresponding '/font' closing tag then closes the
'div' and the other, outer 'font'. As a result, the table is thrown out
of div#some.
Of course I can set implicit_tags OFF, but it does a lot of useful (and
safe) work, such as handling table tags.
Locally, I added an implicit_tags=2 setting, which means roughly 'ON,
but without altering the tree nesting'. Maybe this could be added to the
next release?