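I need to match the <dl> tags in an .htm file, but they can be nested. I want to match every <dl> together with its corresponding </dl>, so I tried the balancing groups from C# (.NET regex):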
$patternNobullet = [regex]"<dl[^>]*>[\s\S]*(((?'Open'<dl[^>]*>)[\s\S]*)+((?'-Open'</dl>)[\s\S]*)+)*(?(Open)(?!))</dl>"
[System.Collections.ArrayList]$tablecontents = @()
$match = $patternNobullet.Match($content)
while ($match.Success) {
    $tablecontents.Add($match.Value) | Out-Null
    $match = $match.NextMatch()
}
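The pattern above matches the outermost <dl> element, but only the outermost one; the <dl> elements nested inside it are not returned as matches.
After that I tried the following approach: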
$patternNobullet = [regex]"<dl[^>]*>[\s\S]*(((?'Open'<dl[^>]*>)[\s\S]*)+((?'-Open'</dl>)[\s\S]*)+)*(?(Open)(?!))</dl>"
$content | Select-String $patternNobullet -AllMatches | ForEach-Object {
    foreach ($v in $_.Matches) {...}
}
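But this approach does not seem to support balancing groups; it matches nothing at all. Does anyone have a good way to extract the content of every matching pair of tags?
This is the content of the .htm file: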
<dl>
<dd>[Scenario or Feature Name] (Entry Page)<dl>
<dd>Why [Do Scenario or Use Feature]? </dd>
<dd>What’s New for [Scenario or Feature] in [Product] [Version#]? </dd>
<dd>Getting Started with [Scenario or Feature]<dl>
<dd>Learning Path for [Scenario or Feature]</dd>
<dd>Prepare Your Development Environment for [Scenario or Feature]</dd>
<dd>Tutorial: Create your First [Scenario or Feature Application]</dd>
<dd>Community Resources for [Scenario or Feature]</dd>
</dl>
</dd>
<dd>How to [Complete Scenario or Use Feature]<dl>
<dd>Best Practices for [Scenario or Feature]</dd>
<dd>How to [Complete a Dev Scenario] (Scenario Portal)<dl>
<dd>Best Practices for [Scenario]</dd>
<dd>Design Considerations for [Scenario]</dd>
<dd>How to: [Complete Task 1 in Scenario]</dd>
<dd>How to: [Complete Task 2 in Scenario]</dd>
<dd>How to: [Complete Task N in Scenario]</dd>
<dd>Testing Your [Scenario]</dd>
<dd>Troubleshooting Your [Scenario]</dd>
</dl>
</dd>
<dd>How to: [Complete Some Task]</dd>
<dd>How to: [Complete Some Task]</dd>
</dl>
</dd>
<dd>[Scenario or Feature] Concepts<dl>
<dd>[Scenario or Feature] Overview</dd>
<dd>[Scenario or Feature] Architecture</dd>
<dd><other conceptual topics></dd>
</dl>
</dd>
<dd>[Scenario or Feature] Reference<dl>
<dd><standard reference topics or an index to WinRT reference topics></dd>
</dl>
</dd>
<dd>[Scenario or Feature] Tools</dd>
<dd>[Scenario or Feature] Samples</dd>
</dl>
</dd>
</dl>
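
One possible direction for getting the nested elements as well, offered as a minimal sketch rather than a definitive answer: successive regex matches never overlap, so once the outermost <dl>...</dl> has been consumed, NextMatch() resumes after its end and the inner elements are never reported. Wrapping a balanced pattern in a lookahead keeps every match zero-width, so the engine can start again at each inner <dl>. The group name Element, the variable names, and the shortened sample string are assumptions made for illustration; $content would normally hold the whole file as one string (for example read with Get-Content -Raw).

```powershell
# Hypothetical, shortened sample input; the real script would put the
# whole .htm file into $content as a single string.
$content = '<dl><dd>A<dl><dd>B</dd></dl></dd><dd>C<dl><dd>D</dd></dl></dd></dl>'

# One balanced <dl>...</dl> element.
#   (?<Open>...)   pushes a capture for every nested opening tag
#   (?<-Open>...)  pops one capture for every closing tag
#   (?!</?dl\b)[\s\S]  consumes any character that does not start a dl tag
#   (?(Open)(?!))  fails the match if an opening tag is still unclosed
# Single quotes keep PowerShell from interpreting anything in the pattern.
$balancedDl = '<dl\b[^>]*>(?>(?<Open><dl\b[^>]*>)|(?<-Open></dl>)|(?!</?dl\b)[\s\S])*(?(Open)(?!))</dl>'

# The lookahead makes each match zero-width, so a match at the outer <dl>
# does not swallow the inner ones; the element text is read from the
# named group Element (a name chosen here, not taken from the question).
$options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$pattern = [regex]::new('(?=(?<Element>' + $balancedDl + '))', $options)

# Collect the text of every <dl> element, outer and nested, in document order.
$tableContents = @(
    foreach ($m in $pattern.Matches($content)) {
        $m.Groups['Element'].Value
    }
)

$tableContents.Count   # 3 for the sample string above
```

On the sample string this yields three elements: the outer <dl> and the two inner ones. Matching tags with a regex stays fragile (comments, or "<dl" inside attribute values, would confuse it), so an HTML parser is probably the safer choice for real pages.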