Replace Links On A Web Page Using HTML Parser
This article describes how to use our HTML parser library, HTMLParser.Net, to
parse a web page and replace its links. You can fetch the page from a live site, or first save it to disk as an HTML file and then
parse it. After replacing the links, you can write the new HTML content to a file.
Here is a code snippet that does the trick: it rewrites links that contain ".asp" so that they point to ".aspx" instead. For the complete project, please go to the HTMLParser.Net download page.
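' oLexer, oFileStream and the other variables are declared and initialized earlier in the complete project.
oParser = New Winista.Text.HtmlParser.Parser(oLexer)
oNodeList = oParser.Parse(Nothing)
' Collect every anchor tag, searching nested nodes as well.
oLinkNodes = oNodeList.ExtractAllNodesThatMatch(New Winista.Text.HtmlParser.Filters.TagNameFilter("A"), True)
For idx As Int32 = 0 To oLinkNodes.Count - 1
    aTag = oLinkNodes(idx)
    ' Skip links that already contain ".aspx" so they do not become ".aspxx".
    If (aTag.Link.IndexOf(".asp") > 0 AndAlso aTag.Link.IndexOf(".aspx") < 0) Then
        aTag.Link = aTag.Link.Replace(".asp", ".aspx")
    End If
Next
oFileStream.Close()
Dim oNewFile As System.IO.FileStream
Dim oStreamWriter As IO.StreamWriter
' FileMode.Create truncates an existing file; OpenOrCreate would leave stale bytes after shorter output.
oNewFile = New System.IO.FileStream("NewLinkFile.html", IO.FileMode.Create)
oStreamWriter = New IO.StreamWriter(oNewFile)
oStreamWriter.Write(oNodeList.ToHtml())
oStreamWriter.Close()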
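
A plain substring replacement of ".asp" can also touch text outside the file extension, such as a host name or a query-string value that happens to contain ".asp". The sketch below shows one possible way to limit the change to the extension itself. RewriteAspLink and the LinkRewriter module are hypothetical names used only for illustration; they are not part of HTMLParser.Net, and the sketch assumes that only the path portion of the link (everything before "?" or "#") should be examined.

```vbnet
Imports System

Module LinkRewriter
    ' Hypothetical helper (not part of HTMLParser.Net): appends "x" to a
    ' trailing ".asp" extension and leaves every other link unchanged.
    Function RewriteAspLink(ByVal link As String) As String
        If String.IsNullOrEmpty(link) Then Return link

        ' Separate the path from any query string or fragment.
        Dim cut As Integer = link.IndexOfAny(New Char() {"?"c, "#"c})
        Dim path As String = If(cut >= 0, link.Substring(0, cut), link)
        Dim rest As String = If(cut >= 0, link.Substring(cut), String.Empty)

        ' Only the end of the path counts as the extension.
        If path.EndsWith(".asp", StringComparison.OrdinalIgnoreCase) Then
            Return path & "x" & rest
        End If
        Return link
    End Function

    Sub Main()
        Console.WriteLine(RewriteAspLink("products.asp"))
        Console.WriteLine(RewriteAspLink("list.asp?id=3"))
        Console.WriteLine(RewriteAspLink("default.aspx"))
    End Sub
End Module
```

Inside the loop, the assignment could then read aTag.Link = RewriteAspLink(aTag.Link), with no need for the surrounding If test. Note that the match is case-insensitive but the original casing is kept, so "PAGE.ASP" would become "PAGE.ASPx".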