Skip to content Skip to sidebar Skip to footer

How To Programmatically Measure The Elements' Sizes In Html Source Code Using Python?

I'm doing webpage layout analysis in python. A fundamental task is to programmatically measure the elements' sizes given HTML source codes, so that we could obtain statistical data

Solution 1:

I suggest You to take a look at Ghost - webkit web client written in python. It has JavaScript support so you can easily call JavaScript functions and get its return value. Example shows how to find out google text box width:

>>>from ghost import Ghost>>>ghost = Ghost()>>>ghost.open('https://google.lt')>>>width, resources = ghost.evaluate("document.getElementById('gbqfq').offsetWidth;")>>>width
541.0  # google text box width 541px

Solution 2:

To properly get all the final sizes, you need to render the contents, taking in account all CSS style sheets, and possibly all javascript. Therefore, the only ways to get the sizes from a Python program are to have a full web browser implementation in Python, use a library that can do so, or pilot a browser off-process, remotely.

The later approach can be done with use of the Selenium tools - check how you can get the result of javascript expressions from within a Python program here: Can Selenium web driver have access to javascript global variables?

Post a Comment for "How To Programmatically Measure The Elements' Sizes In Html Source Code Using Python?"