Jsoup Is Giving A Different HTML Document Compared To My Browser

I made sure to use my browser's User-Agent, but Jsoup still returns different HTML. I also tried using Jsoup.parse(URL, int) instead of Jsoup.connect(String). The two attempts: Docume

Solution 1:

You can try sending the User-Agent as a header with jsoup:

Document doc = Jsoup.connect(url)
                    .userAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/601.7.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.7.7")
                    .get();

Solution 2:

You can try getting the page with:

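import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

URL u = new URL("https://www.google.com/"); // replace https://www.google.com/ with your url
StringBuilder result = new StringBuilder();
// try-with-resources closes the reader and the underlying stream
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(u.openStream(), StandardCharsets.UTF_8))) { // assumes the page is UTF-8
    String line;
    while ((line = reader.readLine()) != null) {
        result.append(line).append('\n'); // readLine() strips the line terminator, so add it back
    }
}
System.out.println(result);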

This doesn't require a library and returns the response body as the server sent it, which may be the exact page you expect.


Solution 3:

If you're viewing the live DOM in the browser's Elements tab, some of these classes may differ because JavaScript has modified the page after it loaded. Jsoup does not run JavaScript; it only fetches and parses the raw HTML the server returns.
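
One rough way to check this is to test whether the class names you see in the Elements tab occur anywhere in the raw HTML. The sketch below is a minimal, stdlib-only illustration: `RawHtmlCheck`, `missingFromRawHtml`, and the sample markup are hypothetical names and data (not part of Jsoup), and the plain substring match is a crude heuristic rather than a real HTML parse.

```java
import java.util.ArrayList;
import java.util.List;

class RawHtmlCheck {
    // Returns the class names that do not occur anywhere in the raw HTML text.
    // Hypothetical helper: a substring search, not a real HTML parse.
    static List<String> missingFromRawHtml(String rawHtml, List<String> classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            if (!rawHtml.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample data (assumption): what the server sends before any script runs.
        String rawHtml = "<div id=\"app\" class=\"placeholder\"></div>"
                + "<script src=\"app.js\"></script>";
        // Class names as they might appear in the browser's Elements tab.
        List<String> seenInElementsTab = List.of("placeholder", "loaded");
        System.out.println(missingFromRawHtml(rawHtml, seenInElementsTab)); // prints [loaded]
    }
}
```

In practice you would pass in the HTML that Jsoup fetched (for example `doc.outerHtml()`). Any name reported as missing was most likely added by JavaScript in the browser.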


Solution 4:

I faced the same problem when trying to get the content of url1: jsoup returned the content of another URL (possibly because jsoup was being redirected). Thanks to @Zendy's answer, I found the solution:

  • Open your browser and press F12 to open the developer tools, then go to the Network tab.
  • Navigate the browser to the URL whose content you need.
  • Copy the User-Agent value from the request headers and set it in your jsoup call.
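
To test the redirect theory, it can help to send the request with the copied User-Agent while redirects are switched off, then look at the status code and the Location header. This is only a sketch, and it uses the JDK's java.net.http client (Java 11+) rather than jsoup. `RedirectCheck`, `buildRequest`, `isRedirect`, the example.com URL, and the placeholder User-Agent string are all assumptions for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class RedirectCheck {
    // Builds a GET request carrying the User-Agent copied from the browser's Network tab.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    // True for the status codes that tell the client to go to another URL.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values (assumptions): pass your own URL and User-Agent as arguments.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        String userAgent = args.length > 1 ? args[1] : "Mozilla/5.0 (paste your browser's value here)";

        // Do not follow redirects, so the first response can be inspected directly.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        HttpResponse<Void> response =
                client.send(buildRequest(url, userAgent), HttpResponse.BodyHandlers.discarding());

        System.out.println("Status: " + response.statusCode());
        if (isRedirect(response.statusCode())) {
            System.out.println("Redirected to: "
                    + response.headers().firstValue("Location").orElse("(no Location header)"));
        }
    }
}
```

If the status is a redirect, the Location header shows where the server is sending this particular User-Agent. Running it once with the browser's value and once with a different one shows whether the header changes the result.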
