Using C#, how can I manually validate an HTML tag?
I have an example image tag:
<img src="http://... .jpg" al="myimage" hhh="aaa" />
and I maintain, for example, a list of valid attributes for the image tag:
l1=(alt, src, width, height, align, border, hspace, longdesc, vspace)
I am parsing the img tag and getting the attributes it uses, like this:
l2=(src, al, hhh)
How can I programmatically validate the image tag? The 'al' attribute should become 'alt' ('alt' rather than 'align', based on how many characters match), and the 'hhh' attribute should disappear (because there is no attribute like it).
The resulting tag should be this:
<img src="http://... .jpg" alt="myimage" />
Thanks.
jeff
You can use LINQ to XML to parse the code:
XElement doc = XElement.Parse(...);
Then correct the wrong attributes using a best-match algorithm against an in-memory dictionary of valid attributes.
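As a minimal sketch of the parsing step, the snippet below reads the attribute names used on a single element (the l2 list from the question). The class and method names are hypothetical. Note that XElement.Parse only accepts well-formed XML, so this assumes the tag is self-closed and its attribute values are quoted, as in the question's example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeLister
{
    // Returns the names of the attributes used on one well-formed element.
    // Throws System.Xml.XmlException if the markup is not valid XML.
    public static string[] UsedAttributes(string markup)
    {
        XElement element = XElement.Parse(markup);
        return element.Attributes()
                      .Select(a => a.Name.LocalName)
                      .ToArray();
    }
}
```

Calling AttributeLister.UsedAttributes on the question's img tag should give src, al and hhh, in document order.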
Edit: I wrote and tested a simplified best-match algorithm (sorry, it's VB):
Dim validTags() As String = { "width", "height", "img" }
(This is simplified; you should create a more structured dictionary of tags and the possible attributes for each tag.)
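One possible shape for that more structured dictionary, sketched in C#: a map from tag name to the attribute names allowed on it. The class and member names are hypothetical, and only the img entry is filled in, using the attribute list from the question (with the HTML spelling vspace).

```csharp
using System;
using System.Collections.Generic;

public static class ValidAttributes
{
    // Hypothetical lookup table: tag name -> attribute names allowed on it.
    // Tag names are compared case-insensitively.
    public static readonly Dictionary<string, string[]> ByTag =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            ["img"] = new[]
            {
                "alt", "src", "width", "height", "align",
                "border", "hspace", "longdesc", "vspace"
            },
        };

    // Returns the allowed attributes for a tag, or an empty array if the tag is unknown.
    public static string[] ForTag(string tagName)
    {
        return ByTag.TryGetValue(tagName, out var attributes)
            ? attributes
            : Array.Empty<string>();
    }
}
```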
Dim maxMatch As Integer = 0
Dim matchedTag As String = Nothing
For Each tag As String In validTags
    Dim match As Integer = CheckMatch(tag, source)
    If match > maxMatch Then
        maxMatch = match
        matchedTag = tag
    End If
Next
Debug.WriteLine("matched tag {0} matched % {1}", matchedTag, maxMatch)
The code above calls a method that determines the percentage by which the source string matches a valid tag.
Private Function CheckMatch(ByVal tag As String, ByVal source As String) As Integer
    If tag = source Then Return 100
    Dim maxPercentage As Integer = 0
    For index As Integer = 0 To tag.Length - 1
        Dim tIndex As Integer = index
        Dim sIndex As Integer = 0
        Dim matchCounter As Integer = 0
        While True
            If tag(tIndex) = source(sIndex) Then
                matchCounter += 1
            End If
            tIndex += 1
            sIndex += 1
            If tIndex + 1 > tag.Length OrElse sIndex + 1 > source.Length Then
                Exit While
            End If
        End While
        Dim percentage As Integer = CInt(matchCounter * 100 / Math.Max(tag.Length, source.Length))
        If percentage > maxPercentage Then maxPercentage = percentage
    Next
    Return maxPercentage
End Function
Given a source string and a tag, the method above finds the best match percentage by comparing single characters.
Given "widt" as input, it finds "width" as the best match, with an 80% match value.
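Since the question asks for C#, here is a rough port of CheckMatch together with a sketch of how it could be applied to the img tag from the question. Several things here are assumptions rather than part of the answer: the TagFixer class and Fix method are hypothetical names, the empty-string guard is added, and the cut-off of 50 for dropping an attribute is an arbitrary threshold you would want to tune. The list of valid attributes is assumed to be non-empty.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class TagFixer
{
    // C# port of the VB CheckMatch: aligns the start of source with each
    // position in tag and returns the best percentage of matching characters.
    public static int CheckMatch(string tag, string source)
    {
        if (tag == source) return 100;
        if (source.Length == 0) return 0; // guard added here, not in the VB version
        int maxPercentage = 0;
        for (int index = 0; index < tag.Length; index++)
        {
            int matchCounter = 0;
            for (int t = index, s = 0; t < tag.Length && s < source.Length; t++, s++)
            {
                if (tag[t] == source[s]) matchCounter++;
            }
            int percentage = (int)Math.Round(
                matchCounter * 100.0 / Math.Max(tag.Length, source.Length));
            if (percentage > maxPercentage) maxPercentage = percentage;
        }
        return maxPercentage;
    }

    // Renames each attribute to its best-scoring valid name and drops it when
    // the best score is below minScore (an assumed threshold). If two attributes
    // map to the same valid name, only the higher-scoring one is kept.
    public static string Fix(string markup, string[] validAttributes, int minScore = 50)
    {
        XElement element = XElement.Parse(markup);
        var corrected = element.Attributes()
            .Select(a => new
            {
                Value = a.Value,
                Best = validAttributes
                    .Select(v => new { Name = v, Score = CheckMatch(v, a.Name.LocalName) })
                    .OrderByDescending(m => m.Score)
                    .First()
            })
            .Where(x => x.Best.Score >= minScore)
            .GroupBy(x => x.Best.Name)
            .Select(g => g.OrderByDescending(x => x.Best.Score).First())
            .Select(x => new XAttribute(x.Best.Name, x.Value))
            .ToList();
        element.ReplaceAttributes(corrected);
        return element.ToString();
    }
}
```

With the question's attribute list, 'al' scores 67 against 'alt' and 40 against 'align', so it is renamed to 'alt'. The best score for 'hhh' is 20 (against 'width'), which falls under the assumed threshold, so that attribute is dropped.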