java - JSoup not translating ampersand in links in HTML


In jsoup, the following test case should pass, but it does not.

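    // Imports needed by the test (JUnit 4 and jsoup).
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import org.junit.Assert;
    import org.junit.Test;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    @Test
    public void shouldPrintHrefCorrectly() {
        // Note the bare & characters (not &amp;) inside the href values.
        String content = "<li><a href=\"#\">good</a><ul><li><a href=\"article.php?boid=1865&sid=53&mid=1\">" +
                "boss</a></li><li><a href=\"article.php?boid=186&sid=53&mid=1\">" +
                "heavent</a></li><li><a href=\"article.php?boid=167&sid=53&mid=1\">" +
                "hellos</a></li><li><a href=\"article.php?boid=181&sid=53&mid=1\">" +
                "mr.jackson!</a></li>";

        Document document = Jsoup.parse(content, "http://www.google.co.in/");
        // Select every link whose href starts with "article".
        Elements links = document.select("a[href^=article]");
        Iterator<Element> iterator = links.iterator();
        List<String> urls = new ArrayList<String>();
        while (iterator.hasNext()) {
            urls.add(iterator.next().attr("href"));
        }

        Assert.assertTrue(urls.contains("article.php?boid=181&sid=53&mid=1"));
    }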

Could anyone please give me the reason why it is failing?

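There are 3 problems:

  1. You're asserting that there's a bovikatanid parameter present, while it's actually called boid.

  2. The HTML source is using & instead of &amp; in the href values. That is technically invalid.

  3. jsoup is somehow parsing &mid as |. It should have scanned until the ; before treating it as an entity (&mid; is a named character reference, which is presumably why the unterminated &mid was decoded).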

To fix #1, you have to do it yourself. To fix #2, you have to report the issue to the server admin in question (it's their fault; however, since the average browser is forgiving about this, I'd imagine Google is doing it to save bandwidth). To fix #3, I've reported an issue to the jsoup guy to see what he thinks about this.
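If the source cannot be corrected (#2) and the jsoup version in use still misreads &mid (#3), one possible client-side workaround is to escape the bare ampersands before handing the markup to the parser. The sketch below is only an illustration: the class AmpersandEscaper and the method escapeBareAmpersands are hypothetical names, not part of jsoup, and the regular expression assumes that anything of the form &name; or &#123; is an intentional entity and leaves it alone.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of jsoup.
class AmpersandEscaper {

    // Matches an '&' that does NOT start a complete, ';'-terminated entity
    // reference such as &amp;, &#38; or &#x26;.
    private static final Pattern BARE_AMPERSAND =
            Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    // Replaces every bare '&' with "&amp;" and leaves existing entities untouched.
    static String escapeBareAmpersands(String html) {
        return BARE_AMPERSAND.matcher(html).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        String href = "article.php?boid=181&sid=53&mid=1";
        // prints: article.php?boid=181&amp;sid=53&amp;mid=1
        System.out.println(escapeBareAmpersands(href));
    }
}
```

Running the whole document through such a helper is blunt: it would also rewrite ampersands inside script or style content, so it is probably best limited to the fragments whose links you care about.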


Update: see, Jonathan (the jsoup guy) has fixed it. It'll be there in the next release.

