How to Generate Automated PDF Documents with Python

Blog: Think Data Analytics Blog

When was the last time you grappled with a PDF document? You probably don’t have to look too far back to find the answer. We deal with a multitude of documents every day, and an overwhelmingly large number of them are PDFs. It is fair to say that many of these documents are tediously repetitive and painful to put together. It is about time we used Python to automate the tedious parts so that we can spend our time on more pressing tasks.

Mind you, there is no need to be tech savvy: what we are going to do here is simple enough for a complete beginner to tackle in short order. By the end of this tutorial you will know how to automatically generate PDF documents with your own data, charts and images, all bundled together with a polished look and structure.

Specifically, in this tutorial we will automate the following actions: creating PDF documents, inserting images, inserting text and numbers, and visualizing data.

Creating PDF Documents

 
For this tutorial, we will be using FPDF, which is one of the most versatile and intuitive packages for generating PDFs in Python. Before we proceed any further, open Anaconda Prompt or any other terminal of your choice and install FPDF:

pip install FPDF

Then import the stack of libraries that we’ll be using to render our document:

import numpy as np
import pandas as pd
from fpdf import FPDF
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter

Next, create the first page of your PDF document and set the font along with its style, size and color:

pdf = FPDF(orientation = 'P', unit = 'mm', format = 'A4')
pdf.add_page()
pdf.set_font('helvetica', 'B', 10)  # style is given as letters: 'B' bold, 'I' italic, 'U' underline
pdf.set_text_color(255, 255, 255)

You can, however, change the font at any point by calling set_font again, which is handy if the document needs several typefaces.
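As a minimal sketch of switching typefaces mid-document, the helper below sets the font before each line of text it writes. The name write_lines, the default positions and the line height are illustrative assumptions, not part of FPDF; the helper only relies on the set_font and text methods of the pdf object created above.

```python
def write_lines(pdf, lines, x=20, y=30, line_height=8):
    """Write each (text, family, style, size) entry on its own line.

    set_font is called before every line, so each line can use a
    different typeface, style or size. Returns the y position (in the
    document's unit) just below the last line written.
    """
    for text, family, style, size in lines:
        pdf.set_font(family, style, size)  # applies to all text written afterwards
        pdf.text(x, y, text)
        y += line_height
    return y


# Illustrative content: a bold heading, an italic subtitle and regular body text.
# 'helvetica' and 'times' are two of FPDF's built-in core fonts.
demo_lines = [
    ("Monthly Report", "helvetica", "B", 16),
    ("Prepared automatically", "times", "I", 11),
    ("Body text in regular Helvetica", "helvetica", "", 10),
]
```

Calling write_lines(pdf, demo_lines) on the document created above would write the three lines in three different fonts. Keep in mind that the text color set earlier is white, so on a page without a dark background you would first switch to something like pdf.set_text_color(0, 0, 0).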

Inserting Images

 
The next logical step is to give our document a background image that sets the structure for the rest of the page. For this tutorial I used Microsoft PowerPoint to lay out my background image. I simply used text boxes and other visuals to create the desired format, and once I was done, I grouped everything together by selecting all the elements and hitting Ctrl-G. Finally, I saved the grouped elements as a PNG image by right-clicking on them and selecting ‘save as picture’.

Background image. Image by author.

As you can see above, the background image sets the structure for our page and includes space for charts, figures, text and numbers that will be generated later on. The specific PowerPoint file used to generate this image can be downloaded here.

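Next, insert the background image into your PDF document and set its position and size. With the unit set to millimeters, a width of 210 and a height of 297 match the A4 page, so the image covers the whole page: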

pdf.image('C:/Users/.../image.png', x = 0, y = 0, w = 210, h = 297)

Please note that you can insert as many images as you like by extending the method shown above.
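One way to extend this to several images is to compute a position for each one and call pdf.image in a loop. The sketch below is only an illustration: grid_positions and place_images are hypothetical helper names, and the margins, gap and 4:3 aspect ratio are assumed values for an A4 page measured in millimeters.

```python
def grid_positions(count, cols=2, page_w=210, margin=10, gap=5, top=20, aspect=0.75):
    """Return one (x, y, w, h) tuple in mm per image, laid out row by row."""
    # Width left for each column after removing the side margins and the gaps.
    w = (page_w - 2 * margin - (cols - 1) * gap) / cols
    h = w * aspect
    positions = []
    for i in range(count):
        row, col = divmod(i, cols)
        x = margin + col * (w + gap)
        y = top + row * (h + gap)
        positions.append((x, y, w, h))
    return positions


def place_images(pdf, paths, **grid_options):
    """Insert every image in `paths` with pdf.image, one grid slot each."""
    slots = grid_positions(len(paths), **grid_options)
    for path, (x, y, w, h) in zip(paths, slots):
        # Passing both w and h stretches the image to fill the slot exactly.
        pdf.image(path, x=x, y=y, w=w, h=h)
```

For example, place_images(pdf, ['chart_1.png', 'chart_2.png', 'chart_3.png']) would put two images side by side and the third on the row below; the file names here are placeholders for your own images.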

Inserting Text and Numbers

 
Adding text and numbers can be done in two ways. We can either specify the exact location we want to place the text:

pdf.text(x, y, txt)

Or alternatively, we can create a cell and then place the text within it. This method would be more suitable for aligning or centering variable or dynamic text:

pdf.set_xy(x, y)
pdf.cell(w, h, txt, border = border, align = align, fill = fill)

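Please note that in the methods above, x and y specify the location on the page, measured from the top-left corner in the unit chosen when the document was created (millimeters here); w and h are the width and height of the cell; txt is the string to be displayed, so numbers are best converted to strings first; border indicates whether a line is drawn around the cell (0 for no, 1 for yes, or any combination of L, T, R and B for the left, top, right and bottom edges); align sets the alignment of the text within the cell (L for left, C for center, R for right); and fill indicates whether the cell background is painted (True or False).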
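To illustrate the cell approach with dynamic values, here is a small sketch that formats a number as text and centers it inside a cell. The helper names format_amount and centered_cell, the dollar sign and the two-decimal format are assumptions made for this example; only set_xy and cell are FPDF methods.

```python
def format_amount(value, currency="$"):
    """Format a number as text with thousands separators and two decimals."""
    return f"{currency}{value:,.2f}"


def centered_cell(pdf, x, y, w, h, value):
    """Draw `value` centered in a w-by-h cell whose top-left corner is (x, y)."""
    pdf.set_xy(x, y)
    # border, align and fill are passed by keyword so that none of them
    # can slip into the wrong positional slot of FPDF.cell.
    pdf.cell(w, h, str(value), border=0, align="C", fill=False)
```

A call such as centered_cell(pdf, 20, 60, 50, 8, format_amount(1234.5)) would print $1,234.50 centered in a 50 mm wide cell; the coordinates are illustrative and would normally be chosen to match a box in the background image.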

Visualizing Data

 
In this part we are going to create a bar chart that displays a time series of our credit, debit and balance values. For this we will use Matplotlib to render our figure as follows:

[Embedded code snippet: bar_chart (GitHub gist)]

In the snippet above, credit, debit and balance are two-dimensional lists holding the dates and the transaction amounts, respectively. Once the chart is generated and saved, it can be inserted into our PDF document with pdf.image, as shown in the previous sections.
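The original bar chart snippet is not reproduced in this copy of the post. As a rough stand-in rather than the author's code, the sketch below draws grouped bars with Matplotlib and saves them as a PNG. It assumes each series is a two-element list of the form [dates, amounts]; the function names, the sample figures and the output file name are all made up for illustration.

```python
def bar_offsets(n_series, bar_width):
    """Offsets that center n_series side-by-side bars on each tick."""
    start = -(n_series - 1) / 2
    return [(start + i) * bar_width for i in range(n_series)]


def save_bar_chart(series, path="bar_chart.png", bar_width=0.25):
    """Plot {label: [dates, amounts]} as grouped bars and save them to `path`."""
    # Imported here so bar_offsets can be used without Matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4))
    dates = next(iter(series.values()))[0]  # all series are assumed to share dates
    ticks = list(range(len(dates)))
    offsets = bar_offsets(len(series), bar_width)
    for offset, (label, (_, amounts)) in zip(offsets, series.items()):
        ax.bar([t + offset for t in ticks], amounts, width=bar_width, label=label)
    ax.set_xticks(ticks)
    ax.set_xticklabels(dates)
    ax.set_ylabel("Amount")
    ax.legend()
    # A transparent background lets the page's background image show through.
    fig.savefig(path, dpi=200, bbox_inches="tight", transparent=True)
    plt.close(fig)
    return path


# Illustrative values only.
sample = {
    "Credit": [["Jan", "Feb", "Mar"], [1200, 900, 1500]],
    "Debit": [["Jan", "Feb", "Mar"], [800, 1100, 700]],
    "Balance": [["Jan", "Feb", "Mar"], [400, 200, 1000]],
}
```

With the pdf object from earlier, pdf.image(save_bar_chart(sample), x = 15, y = 100, w = 180) would render the chart and place it on the page; the coordinates are placeholders to be matched to the chart area of your background image.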

Similarly, we can generate donut charts with the following snippet of code:

[Embedded code snippet: donut_chart (GitHub gist)]
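The donut chart snippet is likewise not reproduced in this copy. The following sketch is an assumption about what such a chart could look like, not the author's code: a Matplotlib pie chart whose wedges are narrower than the radius, which leaves a hole in the middle. The function names, categories and amounts are invented for the example.

```python
def share_labels(labels, values):
    """Append each value's share of the total to its label."""
    total = sum(values)
    return [f"{label} ({100 * value / total:.0f}%)"
            for label, value in zip(labels, values)]


def save_donut_chart(labels, values, path="donut_chart.png", ring_width=0.35):
    """Draw a donut chart of `values` and save it to `path`."""
    # Imported here so share_labels can be used without Matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 4))
    # Wedges with a width below 1 do not reach the center, forming a ring.
    ax.pie(values, labels=share_labels(labels, values), startangle=90,
           wedgeprops={"width": ring_width, "edgecolor": "white"})
    ax.set_aspect("equal")  # keep the ring circular
    fig.savefig(path, dpi=200, bbox_inches="tight", transparent=True)
    plt.close(fig)
    return path


# Illustrative categories and amounts only.
categories = ["Rent", "Food", "Other"]
amounts = [500, 300, 200]
```

Calling save_donut_chart(categories, amounts) writes donut_chart.png, which can then be placed on the page with pdf.image in the same way as the bar chart.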

And once you are all done, you can wrap it up by generating the automated PDF document as such:

pdf.output('Automated PDF Report.pdf')

Conclusion

 
And there you have it, your very own automatically generated PDF report! You’ve now learnt how to create PDF documents, insert text and images into them, and generate and embed charts and figures. But you are by no means limited to just that. In fact, you can extend these techniques to include other visuals and to build multi-page documents too. The sky is truly the limit.
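As one possible way to extend this to multi-page documents, the sketch below splits a list of rows into page-sized chunks and starts a new page for each chunk. The names paginate and add_pages, the 30 rows per page and the spacing are assumptions for illustration; add_page and text are the same FPDF methods used earlier, and a font is assumed to have been set already.

```python
def paginate(rows, rows_per_page):
    """Split rows into consecutive chunks of at most rows_per_page items."""
    return [rows[i:i + rows_per_page]
            for i in range(0, len(rows), rows_per_page)]


def add_pages(pdf, rows, rows_per_page=30, x=20, top=20, line_height=8):
    """Start a new page for every chunk and write one row of text per line."""
    for chunk in paginate(rows, rows_per_page):
        pdf.add_page()
        for i, row in enumerate(chunk):
            pdf.text(x, top + i * line_height, str(row))
```

The same loop could also insert a background image or a chart right after each pdf.add_page() call, so that every page follows the layout described in this tutorial.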

Image by author.

If you want to learn more about Python and data visualization, then feel free to check out the following (affiliate linked) courses: Python for Everybody Specialization and Data Visualization with Python. In addition, feel free to explore more of my tutorials here.

Original Source

The post How to Generate Automated PDF Documents with Python appeared first on ThinkDataAnalytics.
