Have you ever faced project requirements like these: expressive data reports, delivered in two forms? First, on demand through a web interface, with the ability to export to PDF files. Second, generated (as PDF as well) and sent on a schedule at the user’s request. The reports must contain charts (pie, line, bar, etc.), flows, non-trivial tables and other non-plain-text representations that make the data more readable and impressive.
You’ve got it, partially. The data is already visualized on the web page using the most popular web technologies (HTML, CSS, JS). The client wants to export what he sees in a similar fashion and style. He likes graphs, diagrams, colored arrows, lines, merged table cells, table captions and headers.
PDF is probably the most popular, most frequently chosen format for business documents. Thinking as a developer faced with scheduled PDF generation requirements, you will probably choose a server-side approach. That means reproducing the same data representation on the server side, mirroring a web-based solution that is already done and shining. You probably need yet another server-side PDF report design tool and, more importantly, additional time and developer resources (I bet you’re lucky and have a programmer waiting for tasks) to do the same thing again on the back end. Many hours (or weeks) later, the server-side solution is prepared and you are ready to go. In an agile process, requirements change from time to time, so now every change to the data visualization has to be made twice: on the server side and in the user interface. You’ve just started experiencing the disadvantages. Do things have to be so time-consuming and labor-intensive?
OK, now we’ve got:
- A web-based application with the data already visualized
- An automation tool to interact with browsers
What else do we need? The browser, of course. I’m not 100% sure which headless browser came first, but I believe it was PhantomJS. According to Wikipedia: “PhantomJS was released January 23, 2011 by Ariya Hidayat after several years in development.” Unfortunately it’s no longer maintained (see here for details), so we should probably look around for something that is not in a suspended state. A list of headless browsers can be found on Wikipedia and in other resources like this one. However, if you are using Selenium (Selenium WebDriver, I should say), you are free to choose any of the supported browsers. I believe you already know at least one from this list 😉
There is one last piece of the puzzle to complete this integration. You will need to download an additional driver for each browser you want to work with (for Chrome, Firefox and Edge these are all standalone executables). You don’t have to use every possible combination of this software. Just pick one pair, like the Chrome browser and the Chrome driver 🙂 And try it.
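To get a feel for letting the browser itself produce the PDF before wiring up a driver, here is a minimal sketch. Note that it does not go through Selenium WebDriver: it starts Chrome directly with its `--headless` and `--print-to-pdf` command-line switches, so it needs nothing beyond the JDK and an installed browser. The binary name `google-chrome`, the report URL and the output file name are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class HeadlessPdfExport {

    // Builds the command line for a one-shot headless Chrome run
    // that renders the page and prints it to a PDF file.
    static List<String> buildCommand(String chromeBinary, String reportUrl, Path pdfFile) {
        return List.of(
                chromeBinary,
                "--headless",
                "--print-to-pdf=" + pdfFile.toAbsolutePath(),
                reportUrl);
    }

    // Starts the browser and waits up to a minute for it to finish.
    // Returns true only if the process exited normally with code 0.
    static boolean export(String chromeBinary, String reportUrl, Path pdfFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(chromeBinary, reportUrl, pdfFile))
                .inheritIO()
                .start();
        if (!process.waitFor(60, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return false;
        }
        return process.exitValue() == 0;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical binary name, URL and file name; adjust to your environment.
        boolean ok = export(
                "google-chrome",
                "http://localhost:8080/reports/sales",
                Path.of("sales-report.pdf"));
        System.out.println(ok ? "PDF written" : "Export failed");
    }
}
```

The one-shot command line is enough for pages that need no interaction. As soon as the report requires clicking, waiting for charts to render or logging in, driving the same browser through an automation tool becomes the better fit.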
Let’s recap what we have achieved so far. Both requirements are satisfied: visualized data can be exported on demand and according to a schedule. More importantly, resources are saved and the work does not have to be duplicated on the server and client side.
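The scheduled half of the requirement can be sketched with nothing more than the JDK’s `ScheduledExecutorService`. This is only one possible approach under simple assumptions: a single daily run at a fixed local time, with the export job passed in as a `Runnable`. The class and method names (`ReportScheduler`, `delayUntilNextRun`, `scheduleDaily`) are made up for this example.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class ReportScheduler {

    // Time left from 'now' until the next occurrence of 'runAt'.
    // If today's run time has already passed (or is exactly now), use tomorrow's.
    static Duration delayUntilNextRun(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Runs the export job once a day, starting at the next occurrence of 'runAt'.
    // A fixed 24-hour period ignores daylight saving shifts; fine for a sketch.
    static ScheduledExecutorService scheduleDaily(Runnable exportJob, LocalTime runAt) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long initialDelaySeconds = delayUntilNextRun(LocalDateTime.now(), runAt).getSeconds();
        scheduler.scheduleAtFixedRate(
                exportJob, initialDelaySeconds, TimeUnit.DAYS.toSeconds(1), TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // Placeholder job: in a real setup this would open the report page
        // in the browser, save the PDF and send it to the user.
        scheduleDaily(() -> System.out.println("Exporting report..."), LocalTime.of(6, 0));
    }
}
```

Because the job only has to open the same page a user would see, a change to the visualization automatically shows up in the scheduled PDF as well.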
What about the authentication?
You can instruct the automation tool to authenticate first and then perform the required actions. Think of the tool as just another, dedicated user of the application.
Does it work?
A solution based on this article has been running on a couple of production instances for about a year. It definitely works.
Przemysław Fusik, Java Team Leader