regex - Adding http:// to all links without a protocol
I use VB.NET and would like to add http:// to all links that don't already start with http://, https://, ftp://, and so on.
I want to add http here <a href="www.google.com" target="_blank">google</a>, but not here <a href="http://www.google.com" target="_blank">google</a>.
It was easy when I just had the links, but I can't find a good solution for an entire string containing multiple links. I guess regex is the way to go, but I wouldn't know where to start.
I can find the regex myself; it's the parsing and prepending I'm having problems with. Could anyone give me an example with Regex.Replace() in C# or VB.NET?
Any help appreciated!
To quote RFC 1738:
"Scheme names consist of a sequence of characters. The lower case letters "a"--"z", digits, and the characters plus ("+"), period ("."), and hyphen ("-") are allowed. For resiliency, programs interpreting URLs should treat upper case letters as equivalent to lower case in scheme names (e.g., allow "HTTP" as well as "http")."
Excellent! A regex to match:
/^[a-zA-Z0-9+.-]+:\/\//
If that matches your href string, continue on. If not, prepend "http://". The remaining sanity checks are yours unless you ask for specific details. Do note the other commenters' thoughts about relative links.
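The check-then-prepend step can be sketched as follows. This is a minimal Python illustration rather than the VB.NET/C# the question asks for; the names `SCHEME_RE` and `ensure_scheme` and the `default` parameter are hypothetical, not part of the original answer.

```python
import re

# Scheme syntax from RFC 1738: letters, digits, "+", "." and "-",
# followed here by "://". Both cases are accepted, as the RFC recommends.
SCHEME_RE = re.compile(r"^[a-zA-Z0-9+.-]+://")


def ensure_scheme(href, default="http://"):
    """Return href unchanged if it already starts with a scheme,
    otherwise return it with the default scheme prepended."""
    if SCHEME_RE.match(href):
        return href
    return default + href


print(ensure_scheme("www.google.com"))         # gains the http:// prefix
print(ensure_scheme("http://www.google.com"))  # left unchanged
```

Note that a relative link such as "/about" has no scheme either, so this sketch would prefix it too; that is the relative-link caveat mentioned above.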
Edit: I'm starting to suspect you've asked the wrong question... that perhaps you don't have anything that splits the text into the individual tokens you need to handle it. See "Looking for C# HTML parser".
Edit: As a blind try at ignoring all that and just attacking the text, using case-insensitive matching:
/(<a +href *= *")(.*?)(" *>)/
If the second back-reference matches /^[a-zA-Z0-9+.-]+:\/\//, do nothing. If it does not match, replace the whole match with
$1 + "http://" + $2 + $3
This isn't C# syntax, but it should translate across without too much effort.
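Put together, the whole-string approach might look like the sketch below. It is written in Python as an illustration under assumptions; the function names `add_missing_schemes` and `_fix_link` are hypothetical. The two patterns are the ones given above, and the callback passed to `sub` plays the role that a MatchEvaluator delegate plays in .NET's Regex.Replace.

```python
import re

# Scheme check: letters, digits, "+", "." and "-", then "://".
SCHEME_RE = re.compile(r"^[a-zA-Z0-9+.-]+://")

# Link pattern from the answer, matched case-insensitively.
# Group 2 is lazy, so it may also swallow attributes that follow the
# href (e.g. target="_blank"); that is harmless because the scheme is
# only ever prepended at the start of the group.
LINK_RE = re.compile(r'(<a +href *= *")(.*?)(" *>)', re.IGNORECASE)


def _fix_link(match):
    """Rebuild one matched link, adding http:// when no scheme is present."""
    opening, href, closing = match.groups()
    if SCHEME_RE.match(href):
        return match.group(0)  # already has a scheme: leave untouched
    return opening + "http://" + href + closing


def add_missing_schemes(html):
    """Apply the fix to every link the pattern finds in the string."""
    return LINK_RE.sub(_fix_link, html)


sample = (
    'add http here <a href="www.google.com" target="_blank">google</a>, '
    'not here <a href="http://www.google.com" target="_blank">google</a>.'
)
print(add_missing_schemes(sample))
```

On the sample from the question, only the first link changes. The pattern assumes href is the first attribute and is double-quoted; anything else is better handled by a real HTML parser, as suggested in the first edit.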