How to Crack PDF Files in Python

Imagine standing at the threshold of digital knowledge, where a single password stands between you and the information you seek. In our digital world, PDFs are the gatekeepers to everything from revolutionary research to essential work docs. Ever wonder what to do when these treasures are sealed away by forgotten passwords? That’s where the power of Python and your curiosity converge.

In this article, you’ll learn how to crack PDF files in Python using a dictionary attack. We’ll walk you through setting up your Python environment, choosing the right tools and libraries, and writing a script that systematically tries to unlock your PDF with a list of possible passwords. By the end, you’ll clearly understand the process, the ethical boundaries, and how to apply these techniques responsibly in the real world.

Let’s get started!

Disclaimer

Please note: This guide on cracking PDF files with Python is for educational purposes only. Unauthorized use may violate legal and ethical standards. Always obtain explicit permission before proceeding.

Necessary Libraries

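Let’s set everything up before we dive into the code. tkinter is part of Python’s standard library, so it does not need to be installed with pip; if importing it fails on Linux, it is usually provided by a system package (python3-tk on Debian and Ubuntu). The only third-party library we need is pikepdf, which you can install from the terminal or command prompt:

$ pip install pikepdf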
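
Before moving on, it can help to confirm that Python can actually locate both modules. The snippet below is a small sketch using only the standard library; the helper name check_modules is our own and is not part of tkinter or pikepdf.

```python
import importlib.util


def check_modules(names=("tkinter", "pikepdf")):
    """Map each module name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report the status of the two modules this tutorial relies on.
for name, available in check_modules().items():
    print(f"{name}: {'found' if available else 'missing'}")
```

If either module is reported as missing, revisit the installation step before running the rest of the tutorial.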

Imports

Since we want our program to be user-friendly, we give it a graphical interface built with tkinter. From this library we import the filedialog module, which lets the user pick files, and the messagebox module, which displays message boxes.

Lastly, we import pikepdf, a Python library for reading and writing PDFs. It also handles encryption and decryption, which is how we will test passwords against a protected file.

import tkinter as tk
from tkinter import filedialog, messagebox
import pikepdf
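
To try the tool safely, you need a protected PDF that you are allowed to test against. The following is a minimal sketch, assuming pikepdf is installed, that writes a blank one-page PDF locked with a password of your choice. The helper name make_practice_pdf and the page size are our own choices for illustration, not something from the original script.

```python
def make_practice_pdf(path, user_password):
    """Write a one-page blank PDF that requires user_password to open."""
    # Imported here so the helper can be defined even before pikepdf is installed.
    import pikepdf

    with pikepdf.Pdf.new() as pdf:
        pdf.add_blank_page(page_size=(612, 792))  # US Letter, in PDF points
        # The same value is used for the user and owner passwords to keep the sketch simple.
        pdf.save(path, encryption=pikepdf.Encryption(user=user_password, owner=user_password))
    return path
```

For example, calling make_practice_pdf("practice_locked.pdf", "letmein") should leave a file in the current directory that you can point the finished tool at, together with a wordlist that contains "letmein".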

Crack PDF File Functions

Now, let’s define our functions – the heartbeat of our code:

browse_pdf Function

The browse_pdf function opens a file dialog titled “Choose PDF”. The filetypes parameter restricts the listing to PDF files (“*.pdf”). If the user picks a file, so that pdf_path is not empty, the function clears the pdf_entry widget (created later in the script) from position 0 to tk.END.

Then, it inserts the path of the selected PDF file at position 0. As a result, the entry field always shows only the most recently chosen file.

def browse_pdf():
    pdf_path = filedialog.askopenfilename(title="Choose PDF", filetypes=[("PDF Documents", "*.pdf")])
    if pdf_path:
        pdf_entry.delete(0, tk.END)
        pdf_entry.insert(0, pdf_path)

browse_wordlist Function

browse_wordlist works the same way, with one difference: the dialog lists text files (“*.txt”), and the user selects the wordlist that contains the candidate passwords, one per line. Once a file is selected, its path is placed in the wordlist_entry field, ready for use.

def browse_wordlist():
    wordlist_path = filedialog.askopenfilename(title="Password List", filetypes=[("Text Files", "*.txt")])
    if wordlist_path:
        wordlist_entry.delete(0, tk.END)
        wordlist_entry.insert(0, wordlist_path)

unlock_pdf Function

unlock_pdf begins by reading both entry fields and checking that the user has selected a PDF and a wordlist. If either is missing, a warning appears and the function returns. Otherwise, it opens the wordlist, reads one candidate password per line, and tries each candidate in turn with pikepdf.open. When a candidate is wrong, pikepdf raises a PasswordError and the loop moves on to the next one.

If a password from the wordlist successfully opens the PDF, a message displaying the found password appears and the function stops. If the list is exhausted without a match, a message indicates that no matching password was found.

def unlock_pdf():
    pdf_path = pdf_entry.get()
    wordlist_path = wordlist_entry.get()

    # Both inputs are required before the search can start.
    if not pdf_path:
        messagebox.showwarning("Warning", "Please select a PDF file.")
        return
    if not wordlist_path:
        messagebox.showwarning("Warning", "Please select a wordlist file.")
        return

    # Public wordlists are not always valid UTF-8, so undecodable bytes are skipped.
    with open(wordlist_path, "r", encoding="utf-8", errors="ignore") as wordlist_file:
        passwords = [line.strip() for line in wordlist_file]

    # Try each candidate; pikepdf raises PasswordError when one is wrong.
    for pwd in passwords:
        try:
            with pikepdf.open(pdf_path, password=pwd):
                messagebox.showinfo("Success", f"Correct Password: {pwd}")
                return
        except pikepdf.PasswordError:
            continue

    messagebox.showinfo("Attempt Over", "No matching password found.")
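
The search loop in unlock_pdf can also be looked at on its own, without the GUI. The sketch below separates "walk the candidates" from "check one candidate", which makes the logic easy to test. The names find_password and make_pikepdf_checker are hypothetical helpers introduced here for illustration; only pikepdf.open and pikepdf.PasswordError come from the library.

```python
from typing import Callable, Iterable, Optional


def find_password(candidates: Iterable[str], opens_with: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that opens_with accepts, or None if none does."""
    for candidate in candidates:
        if opens_with(candidate):
            return candidate
    return None


def make_pikepdf_checker(pdf_path: str) -> Callable[[str], bool]:
    """Build a checker that tries a single password against pdf_path with pikepdf."""
    import pikepdf  # imported lazily so find_password stays usable without it

    def opens_with(password: str) -> bool:
        try:
            with pikepdf.open(pdf_path, password=password):
                return True
        except pikepdf.PasswordError:
            return False

    return opens_with


# Stand-in checker so the sketch runs without a PDF: it accepts only "letmein".
demo_result = find_password(["123456", "letmein", "qwerty"], lambda p: p == "letmein")
print(demo_result)
```

With a real file, you would pass make_pikepdf_checker("practice_locked.pdf") as the second argument instead of the stand-in lambda. Keeping the checker separate is a design choice of this sketch, not of the tutorial's script, which does everything inside unlock_pdf.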

Creating the Main Window

Here, we created our main window and also set its title.

# Create main window
root = tk.Tk()
root.title("Unlock PDF Password - The Pycodes")

PDF Entry Field and Browse Button

Following that, we create a label that reads “PDF File:”, with an entry field for the PDF path directly beside it. Next to the entry field is a “Browse” button that calls the browse_pdf() function. All these elements are arranged in row 0 using the grid layout.

# PDF Entry Field
pdf_label = tk.Label(root, text="PDF File:")
pdf_label.grid(row=0, column=0, padx=10, pady=5, sticky="w")

pdf_entry = tk.Entry(root, width=50)
pdf_entry.grid(row=0, column=1, padx=10, pady=5)

pdf_browse_btn = tk.Button(root, text="Browse", command=browse_pdf)
pdf_browse_btn.grid(row=0, column=2, padx=5, pady=5)

Wordlist Entry Field and Browse Button

This row mirrors the previous one. It creates a label reading “Wordlist File:”, followed by an entry field for the wordlist path. Right next to that, there is a “Browse” button that triggers the browse_wordlist() function. Everything is placed in row 1 of the grid so it’s easy to find and use.

# Wordlist Entry Field
wordlist_label = tk.Label(root, text="Wordlist File:")
wordlist_label.grid(row=1, column=0, padx=10, pady=5, sticky="w")


wordlist_entry = tk.Entry(root, width=50)
wordlist_entry.grid(row=1, column=1, padx=10, pady=5)


wordlist_browse_btn = tk.Button(root, text="Browse", command=browse_wordlist)
wordlist_browse_btn.grid(row=1, column=2, padx=5, pady=5)

unlock_pdf Button

For this step, we created a button labeled “Unlock PDF” that triggers the unlock_pdf() function. It is positioned on the main window using the grid layout.

# Unlock PDF Button
unlock_pdf_btn = tk.Button(root, text="Unlock PDF", command=unlock_pdf)
unlock_pdf_btn.grid(row=2, column=1, pady=20)

Main Loop

Lastly, this line ensures that the main window remains active and responsive to user interactions until it is closed.

root.mainloop()

Example

In our example we used this Wordlist File:

https://github.com/danielmiessler/SecLists/blob/master/Passwords/Common-Credentials/10-million-password-list-top-1000000.txt

If you want to create your own wordlist, check out this tutorial.
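
A wordlist of this size does not have to be loaded into memory all at once. As a hedged alternative to the list comprehension used in unlock_pdf, the sketch below yields one candidate at a time. The function name iter_passwords is our own; the encoding settings are an assumption that suits wordlists which are mostly, but not entirely, valid UTF-8.

```python
from typing import Iterator


def iter_passwords(wordlist_path: str) -> Iterator[str]:
    """Yield candidate passwords one line at a time, skipping empty lines."""
    # errors="ignore" drops bytes that are not valid UTF-8 instead of raising.
    with open(wordlist_path, "r", encoding="utf-8", errors="ignore") as wordlist_file:
        for line in wordlist_file:
            candidate = line.rstrip("\r\n")  # remove only the line ending
            if candidate:
                yield candidate
```

Unlike line.strip(), this version removes only the line ending, so a candidate that ends with a space is kept as written. A loop such as for pwd in iter_passwords(wordlist_path) can then replace the passwords list.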

Full Code

import tkinter as tk
from tkinter import filedialog, messagebox
import pikepdf


def browse_pdf():
    pdf_path = filedialog.askopenfilename(title="Choose PDF", filetypes=[("PDF Documents", "*.pdf")])
    if pdf_path:
        pdf_entry.delete(0, tk.END)
        pdf_entry.insert(0, pdf_path)


def browse_wordlist():
    wordlist_path = filedialog.askopenfilename(title="Password List", filetypes=[("Text Files", "*.txt")])
    if wordlist_path:
        wordlist_entry.delete(0, tk.END)
        wordlist_entry.insert(0, wordlist_path)


def unlock_pdf():
    pdf_path = pdf_entry.get()
    wordlist_path = wordlist_entry.get()

    # Both inputs are required before the search can start.
    if not pdf_path:
        messagebox.showwarning("Warning", "Please select a PDF file.")
        return
    if not wordlist_path:
        messagebox.showwarning("Warning", "Please select a wordlist file.")
        return

    # Public wordlists are not always valid UTF-8, so undecodable bytes are skipped.
    with open(wordlist_path, "r", encoding="utf-8", errors="ignore") as wordlist_file:
        passwords = [line.strip() for line in wordlist_file]

    # Try each candidate; pikepdf raises PasswordError when one is wrong.
    for pwd in passwords:
        try:
            with pikepdf.open(pdf_path, password=pwd):
                messagebox.showinfo("Success", f"Correct Password: {pwd}")
                return
        except pikepdf.PasswordError:
            continue

    messagebox.showinfo("Attempt Over", "No matching password found.")


# Create main window
root = tk.Tk()
root.title("Unlock PDF Password - The Pycodes")


# PDF Entry Field
pdf_label = tk.Label(root, text="PDF File:")
pdf_label.grid(row=0, column=0, padx=10, pady=5, sticky="w")


pdf_entry = tk.Entry(root, width=50)
pdf_entry.grid(row=0, column=1, padx=10, pady=5)


pdf_browse_btn = tk.Button(root, text="Browse", command=browse_pdf)
pdf_browse_btn.grid(row=0, column=2, padx=5, pady=5)


# Wordlist Entry Field
wordlist_label = tk.Label(root, text="Wordlist File:")
wordlist_label.grid(row=1, column=0, padx=10, pady=5, sticky="w")


wordlist_entry = tk.Entry(root, width=50)
wordlist_entry.grid(row=1, column=1, padx=10, pady=5)


wordlist_browse_btn = tk.Button(root, text="Browse", command=browse_wordlist)
wordlist_browse_btn.grid(row=1, column=2, padx=5, pady=5)


# Unlock PDF Button
unlock_pdf_btn = tk.Button(root, text="Unlock PDF", command=unlock_pdf)
unlock_pdf_btn.grid(row=2, column=1, pady=20)


root.mainloop()

Happy Coding!
