How to Build a Text Summarizer with Transformers in Python

We live in a world full of information. Every day, we come across long articles, reports, and documents. It can be tough to find the time to read everything, and that’s where a text summarizer comes in handy. A text summarizer helps by pulling out the key points, making it easier to understand the main ideas quickly.

Transformers, a breakthrough in natural language processing (NLP), have made text summarization more accurate and context-aware. With the Hugging Face Transformers library, you can build a powerful text summarizer in Python.

Today, we’ll show you how to build a text summarizer using transformers in Python. You’ll learn how to leverage Hugging Face Transformers, create a user-friendly GUI with tkinter, and save your summaries. Let’s dive in and simplify how we process large texts!

Necessary Libraries

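Make sure to install the transformers and torch libraries from the terminal or command prompt so the code can run:

$ pip install transformers
$ pip install torch

Transformers needs a deep learning backend to run the model, and this tutorial uses PyTorch (torch), which is why we install it. tkinter is part of Python’s standard library, so it does not need a pip install; on some Linux distributions it comes as a separate system package (for example, python3-tk on Debian and Ubuntu).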
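
Before launching the app, it can help to confirm that the libraries are actually visible to the interpreter you are using. The snippet below is a minimal sketch of such a check; the helper name missing_modules() is made up for this illustration and is not part of any of the libraries.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that this interpreter cannot find."""
    # find_spec() looks a module up without importing it, so this stays fast
    # even for heavy packages such as torch.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["tkinter", "transformers", "torch"])
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```

If anything is reported as missing, install it and run the check again before moving on.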

Imports

Before starting any adventure, you need the right tools. Let’s gather the essential libraries for our journey:

  • tkinter: We’ll use this to create a smooth graphical user interface.
  • filedialog from tkinter: This allows us to open and save files easily.
  • messagebox from tkinter: It helps us display information and error messages.
  • ttk from tkinter: For adding sleek, themed widgets.
  • pipeline from the transformers library: This will power our text summarization model.
  • threading: Enables our program to handle multiple tasks without freezing.
  • Counter from collections: Keeps track of elements and their counts.
  • string: Provides useful constants and classes for string operations.

import tkinter as tk
from tkinter import filedialog, messagebox, ttk
from transformers import pipeline
import threading
from collections import Counter
import string

With these imports, we’re all set to build our text summarizer. Let’s get started!

Summarization Function

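Now that we have met the dream team, it is time to create the function that performs the text summarization: summarize_text(). How does it work? Let’s take a look:

  • First, the script builds the summarization pipeline once, at module level, so the model is loaded a single time when the program starts rather than on every call. Since no model name is passed, pipeline("summarization") uses the library’s default summarization model, which is downloaded the first time the script runs.
  • Then, summarize_text() uses the model to generate a summary of the input text, respecting the maximum and minimum lengths specified by the user. Both lengths are counted in tokens, not words, and do_sample=False turns off random sampling so the same input gives the same summary.

# Load the summarization pipeline once at startup (the first run downloads the default model).
summarizer = pipeline("summarization")

def summarize_text(text, max_length, min_length):
   # max_length and min_length are measured in tokens; do_sample=False disables random sampling.
   summary = summarizer(text, max_length=max_length, min_length=min_length, do_sample=False)
   # The pipeline returns a list with one dictionary per input text.
   return summary[0]['summary_text']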
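
Summarization models usually accept only a limited number of input tokens, so a very long document may not fit in a single call. One possible workaround, sketched below, is to split the text into smaller pieces, summarize each piece, and join the results. This is an illustration under assumptions, not part of the original program: split_into_chunks(), summarize_long_text(), and the 400-word limit are hypothetical, word count is only a rough stand-in for token count, and fake_summarize() replaces the real model so the snippet runs on its own.

```python
def split_into_chunks(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long_text(text, summarize_fn, max_words=400):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    partial_summaries = [summarize_fn(chunk) for chunk in split_into_chunks(text, max_words)]
    return " ".join(partial_summaries)


def fake_summarize(chunk):
    """Stand-in for the real model: keep only the first five words of a chunk."""
    return " ".join(chunk.split()[:5])


# 25 words split into chunks of 10 words gives three chunks (10, 10, and 5 words).
demo_text = " ".join(f"word{i}" for i in range(25))
demo_summary = summarize_long_text(demo_text, fake_summarize, max_words=10)
print(demo_summary)
```

In the app, summarize_fn could be something like lambda chunk: summarize_text(chunk, max_length, min_length), with the chunk size tuned to the model you use.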

Opening a File

Next, for convenience, users who already have their text in a file don’t need to copy and paste it. They can simply select the file with the open_file() function:

  • This function opens a file dialog that allows the user to select only text files with “.txt” extensions. When a file is selected, its content is read with file.read().
  • The function then inserts the content into the text widget using text_area.insert(), after first clearing any previous content with text_area.delete().
  • Finally, it displays the word count, character count, and common words of the selected text using display_text_info().

def open_file():
   file_path = filedialog.askopenfilename(filetypes=[("Text files", "*.txt")])
   if file_path:
       with open(file_path, 'r', encoding='utf-8') as file:
           text = file.read()
       text_area.delete("1.0", tk.END)
       text_area.insert(tk.END, text)
       display_text_info(text)
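
open_file() assumes the file is encoded as UTF-8, so a file saved in another encoding can raise a UnicodeDecodeError. The sketch below shows one way the reading step could fall back to a second encoding. It is an assumption-based illustration: read_text_file() and the choice of latin-1 as the fallback are hypothetical, and the right fallback depends on where your files come from.

```python
import os
import tempfile


def read_text_file(path, encodings=("utf-8", "latin-1")):
    """Return the file's text, trying each encoding in order."""
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as file:
                return file.read()
        except UnicodeDecodeError:
            # This encoding does not match the file; try the next one.
            continue
    raise ValueError(f"Could not decode {path} with any of: {encodings}")


# Demonstration: write a small latin-1 encoded file, then read it back.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write("café".encode("latin-1"))
    demo_path = tmp.name

demo_content = read_text_file(demo_path)
os.remove(demo_path)
print(demo_content)
```

Inside open_file(), the with open(...) block could be swapped for a call to a helper like this, wrapped in a try/except that shows a messagebox when decoding fails.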

Saving the Summary

Once the text is summarized, we may want to save it. That’s why we created the save_summary() function:

This function takes the summarized text and trims any whitespace using summary_area.get().strip(). It then opens a file dialog for the user to select where to save the summary. Next, with file.write(), it writes the summary to the selected file. If the save is successful, it displays a success message; if the summary is empty, a warning message is displayed.

def save_summary():
   summary = summary_area.get("1.0", tk.END).strip()
   if summary:
       file_path = filedialog.asksaveasfilename(defaultextension=".txt", filetypes=[("Text files", "*.txt")])
       if file_path:
           with open(file_path, 'w', encoding='utf-8') as file:
               file.write(summary)
           messagebox.showinfo("Success", "Summary saved successfully!")
   else:
       messagebox.showwarning("Warning", "Summary is empty!")

Performing Summarization

Following the save function, we come to the perform_summarization() function, which validates the user’s input and kicks off the summarization process. Here are the steps:

  • First, this function retrieves and trims the input text with text_area.get("1.0", tk.END).strip(). If the text is not empty, it then attempts to convert the max and min length inputs into integers. If the min length is greater than the max length, a warning message is displayed and the function returns early.
  • If the input lengths are valid, the function starts a new thread to run the run_summarization() function with the input text and the specified lengths. If there is a ValueError during the conversion of the lengths, a warning message is displayed indicating that valid numbers are required. If the input text is empty, a warning message is also displayed.

def perform_summarization():
   text = text_area.get("1.0", tk.END).strip()
   if text:
       try:
           max_length = int(max_length_entry.get())
           min_length = int(min_length_entry.get())
           if min_length > max_length:
               messagebox.showwarning("Warning", "Min Length cannot be greater than Max Length!")
               return
           threading.Thread(target=run_summarization, args=(text, max_length, min_length)).start()
       except ValueError:
           messagebox.showwarning("Warning", "Please enter valid numbers for Max Length and Min Length!")
   else:
       messagebox.showwarning("Warning", "Input text is empty!")
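
The length checks above can also be pulled out into a small function that does not touch the GUI, which makes them easier to test. The following is a sketch under assumptions: parse_lengths() is a hypothetical helper, and the extra check for positive values is an addition that the original function does not perform.

```python
def parse_lengths(max_raw, min_raw):
    """Convert the two entry strings to integers and check that they make sense.

    Raises ValueError with a readable message when the input is invalid.
    """
    try:
        max_length = int(max_raw.strip())
        min_length = int(min_raw.strip())
    except ValueError:
        raise ValueError("Max Length and Min Length must be whole numbers.") from None
    if min_length < 1 or max_length < 1:
        raise ValueError("Both lengths must be positive.")
    if min_length > max_length:
        raise ValueError("Min Length cannot be greater than Max Length.")
    return max_length, min_length


print(parse_lengths("150", "30"))  # prints (150, 30)
```

perform_summarization() could then call this helper inside a single try/except ValueError block and pass the error message straight to messagebox.showwarning().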

Running Summarization in a Thread

def run_summarization(text, max_length, min_length):
   progress_bar.start()
   try:
       summary = summarize_text(text, max_length, min_length)
       summary_area.delete("1.0", tk.END)
       summary_area.insert(tk.END, summary)
   finally:
       # Stop the progress bar even if the model raises an error.
       progress_bar.stop()

Now this is where the magic happens. The run_summarization() function handles the actual summarization process. How?

Well, when this function is triggered, it starts by activating the progress bar to indicate that the process has begun. Then, it calls the summarize_text() function to generate a summary of the input text. After that, it prepares the summary_area widget by deleting any previous content and inserting the new summary. Finally, it stops the progress bar to indicate that the process is complete.
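
A common alternative to updating widgets directly from the worker thread is to let the thread place its result on a queue and have the GUI thread pick it up with after(). The sketch below illustrates that pattern under assumptions: start_worker() and poll_results() are hypothetical helpers, and a simple uppercase function stands in for the summarization model so the snippet runs without a window.

```python
import queue
import threading


def start_worker(task, result_queue, *args):
    """Run task(*args) in a background thread and put the outcome on the queue."""
    def worker():
        try:
            result_queue.put(("ok", task(*args)))
        except Exception as error:
            # Report the failure to the GUI thread instead of losing it.
            result_queue.put(("error", error))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll_results(widget, result_queue, on_result, interval_ms=100):
    """Check the queue from the GUI thread and reschedule while it is empty."""
    try:
        status, value = result_queue.get_nowait()
    except queue.Empty:
        # Nothing yet: ask tkinter to call this function again later.
        widget.after(interval_ms, poll_results, widget, result_queue, on_result, interval_ms)
        return
    on_result(status, value)


# Demonstration without a GUI: a stand-in task and a direct read from the queue.
demo_queue = queue.Queue()
demo_thread = start_worker(lambda text: text.upper(), demo_queue, "short summary")
demo_thread.join()
demo_result = demo_queue.get_nowait()
print(demo_result)
```

In the app, task would be summarize_text, widget could be summary_area, and on_result would insert the summary (or show an error message) and stop the progress bar.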

Displaying Text Information

def display_text_info(text):
   word_count = len(text.split())
   char_count = len(text)
   # Lowercase the text and strip punctuation before counting word frequencies.
   cleaned_text = text.lower().translate(str.maketrans('', '', string.punctuation))
   most_common_words = Counter(cleaned_text.split()).most_common(5)
   info = f"Word Count: {word_count}\nCharacter Count: {char_count}\nMost Common Words: {most_common_words}"
   info_label.config(text=info)

This is where we reveal the secrets of our text. The display_text_info() function analyzes the text to show the word count, character count, and the most common words.
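
To see what these statistics look like on a concrete input, here is a small standalone sketch of the same calculation. The names text_stats() and format_stats() are hypothetical and exist only for this illustration; the label formatting is also a slight variation that lists each word with its count instead of printing the raw list of tuples.

```python
from collections import Counter
import string


def text_stats(text, top_n=5):
    """Return word count, character count, and the top_n most common words."""
    # Lowercase and remove punctuation so "The" and "the." count as the same word.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return {
        "words": len(text.split()),
        "characters": len(text),
        "most_common": Counter(cleaned.split()).most_common(top_n),
    }


def format_stats(stats):
    """Build the multi-line string shown in the info label."""
    common = ", ".join(f"{word} ({count})" for word, count in stats["most_common"])
    return (
        f"Word Count: {stats['words']}\n"
        f"Character Count: {stats['characters']}\n"
        f"Most Common Words: {common}"
    )


demo_stats = text_stats("The cat sat. The cat ran!", top_n=2)
print(format_stats(demo_stats))
# Word Count: 6
# Character Count: 25
# Most Common Words: the (2), cat (2)
```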

Creating the GUI

With all the components ready, it’s time to assemble everything into a user-friendly program. We achieve this with the create_gui() function. It starts by creating the main window for our program, setting its title, and defining its geometry.

Next, we create a label with the name of the application. We then create a frame and add an “Input Text” label along with a scrollable text_area widget for user input.

We proceed by creating a new frame called controls_frame. In this frame, we add the “Open Text File” button, which calls the open_file() function. We also add labels and entry boxes for the Min Length and Max Length. Additionally, we include two buttons: the “Summarize Text” button, which calls the perform_summarization() function, and the “Save Summary” button, which calls the save_summary() function.

Similar to how we created a frame for the input widget, we create a new frame for the summary display. This frame contains a label that says “Summary” and a scrollable summary_area widget to display the summary. We also add a label to display text information.

Finally, we create a progress bar and add the mainloop() command to ensure that the main window starts, keeps running, and remains responsive to the user.

def create_gui():
   global text_area, summary_area, max_length_entry, min_length_entry, info_label, progress_bar


   root = tk.Tk()
   root.title("Text Summarizer - The Pycodes")
   root.geometry("900x700")


   title_label = tk.Label(root, text="Text Summarizer", font=("Helvetica", 16))
   title_label.pack(pady=10)


   text_frame = tk.Frame(root)
   text_frame.pack(fill=tk.BOTH, expand=True, padx=10, pady=5)


   text_label = tk.Label(text_frame, text="Input Text:")
   text_label.pack(anchor=tk.W)


   text_area_scrollbar = tk.Scrollbar(text_frame)
   text_area_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)


   text_area = tk.Text(text_frame, wrap=tk.WORD, height=10, yscrollcommand=text_area_scrollbar.set)
   text_area.pack(fill=tk.BOTH, expand=True)
   text_area_scrollbar.config(command=text_area.yview)


   controls_frame = tk.Frame(root)
   controls_frame.pack(fill=tk.X, padx=10, pady=5)


   open_button = tk.Button(controls_frame, text="Open Text File", command=open_file)
   open_button.pack(side=tk.LEFT, padx=5)


   tk.Label(controls_frame, text="Max Length:").pack(side=tk.LEFT)
   max_length_entry = tk.Entry(controls_frame, width=5)
   max_length_entry.insert(0, "150")
   max_length_entry.pack(side=tk.LEFT, padx=5)


   tk.Label(controls_frame, text="Min Length:").pack(side=tk.LEFT)
   min_length_entry = tk.Entry(controls_frame, width=5)
   min_length_entry.insert(0, "30")
   min_length_entry.pack(side=tk.LEFT, padx=5)


   summarize_button = tk.Button(controls_frame, text="Summarize Text", command=perform_summarization)
   summarize_button.pack(side=tk.LEFT, padx=5)


   save_button = tk.Button(controls_frame, text="Save Summary", command=save_summary)
   save_button.pack(side=tk.LEFT, padx=5)


   summary_frame = tk.Frame(root)
   summary_frame.pack(fill=tk.BOTH, expand=True, padx=10, pady=5)


   summary_label = tk.Label(summary_frame, text="Summary:")
   summary_label.pack(anchor=tk.W)


   summary_area_scrollbar = tk.Scrollbar(summary_frame)
   summary_area_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)


   summary_area = tk.Text(summary_frame, wrap=tk.WORD, height=10, yscrollcommand=summary_area_scrollbar.set)
   summary_area.pack(fill=tk.BOTH, expand=True)
   summary_area_scrollbar.config(command=summary_area.yview)


   info_label = tk.Label(root, text="", font=("Helvetica", 10), justify=tk.LEFT)
   info_label.pack(fill=tk.BOTH, expand=True, padx=10, pady=5)


   progress_bar = ttk.Progressbar(root, mode="indeterminate", length=400)
   progress_bar.place(x=200, y=620)


   root.mainloop()

Running the Application

Lastly, the if __name__ == "__main__" check makes sure that create_gui() is called only when the script is executed directly, and not when the file is imported as a module.

if __name__ == "__main__":
   create_gui()

Example

I ran this code on a Linux system. First, I summarized this English paragraph as shown in the image below:

Then I clicked the “Open Text File” button to load a file named “example.txt”, which contains a French paragraph, and summarized it:

After that, I saved the summary as “summarized example” by clicking the “Save Summary” button:

Also, I ran this script on a Windows system as shown in the image below:

Full Code

import tkinter as tk
from tkinter import filedialog, messagebox, ttk
from transformers import pipeline
import threading
from collections import Counter
import string


summarizer = pipeline("summarization")




def summarize_text(text, max_length, min_length):
   summary = summarizer(text, max_length=max_length, min_length=min_length, do_sample=False)
   return summary[0]['summary_text']




def open_file():
   file_path = filedialog.askopenfilename(filetypes=[("Text files", "*.txt")])
   if file_path:
       with open(file_path, 'r', encoding='utf-8') as file:
           text = file.read()
       text_area.delete("1.0", tk.END)
       text_area.insert(tk.END, text)
       display_text_info(text)




def save_summary():
   summary = summary_area.get("1.0", tk.END).strip()
   if summary:
       file_path = filedialog.asksaveasfilename(defaultextension=".txt", filetypes=[("Text files", "*.txt")])
       if file_path:
           with open(file_path, 'w', encoding='utf-8') as file:
               file.write(summary)
           messagebox.showinfo("Success", "Summary saved successfully!")
   else:
       messagebox.showwarning("Warning", "Summary is empty!")




def perform_summarization():
   text = text_area.get("1.0", tk.END).strip()
   if text:
       try:
           max_length = int(max_length_entry.get())
           min_length = int(min_length_entry.get())
           if min_length > max_length:
               messagebox.showwarning("Warning", "Min Length cannot be greater than Max Length!")
               return
           threading.Thread(target=run_summarization, args=(text, max_length, min_length)).start()
       except ValueError:
           messagebox.showwarning("Warning", "Please enter valid numbers for Max Length and Min Length!")
   else:
       messagebox.showwarning("Warning", "Input text is empty!")




def run_summarization(text, max_length, min_length):
   progress_bar.start()
   try:
       summary = summarize_text(text, max_length, min_length)
       summary_area.delete("1.0", tk.END)
       summary_area.insert(tk.END, summary)
   finally:
       # Stop the progress bar even if the model raises an error.
       progress_bar.stop()




def display_text_info(text):
   word_count = len(text.split())
   char_count = len(text)
   # Lowercase the text and strip punctuation before counting word frequencies.
   cleaned_text = text.lower().translate(str.maketrans('', '', string.punctuation))
   most_common_words = Counter(cleaned_text.split()).most_common(5)
   info = f"Word Count: {word_count}\nCharacter Count: {char_count}\nMost Common Words: {most_common_words}"
   info_label.config(text=info)




def create_gui():
   global text_area, summary_area, max_length_entry, min_length_entry, info_label, progress_bar


   root = tk.Tk()
   root.title("Text Summarizer - The Pycodes")
   root.geometry("900x700")


   title_label = tk.Label(root, text="Text Summarizer", font=("Helvetica", 16))
   title_label.pack(pady=10)


   text_frame = tk.Frame(root)
   text_frame.pack(fill=tk.BOTH, expand=True, padx=10, pady=5)


   text_label = tk.Label(text_frame, text="Input Text:")
   text_label.pack(anchor=tk.W)


   text_area_scrollbar = tk.Scrollbar(text_frame)
   text_area_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)


   text_area = tk.Text(text_frame, wrap=tk.WORD, height=10, yscrollcommand=text_area_scrollbar.set)
   text_area.pack(fill=tk.BOTH, expand=True)
   text_area_scrollbar.config(command=text_area.yview)


   controls_frame = tk.Frame(root)
   controls_frame.pack(fill=tk.X, padx=10, pady=5)


   open_button = tk.Button(controls_frame, text="Open Text File", command=open_file)
   open_button.pack(side=tk.LEFT, padx=5)


   tk.Label(controls_frame, text="Max Length:").pack(side=tk.LEFT)
   max_length_entry = tk.Entry(controls_frame, width=5)
   max_length_entry.insert(0, "150")
   max_length_entry.pack(side=tk.LEFT, padx=5)


   tk.Label(controls_frame, text="Min Length:").pack(side=tk.LEFT)
   min_length_entry = tk.Entry(controls_frame, width=5)
   min_length_entry.insert(0, "30")
   min_length_entry.pack(side=tk.LEFT, padx=5)


   summarize_button = tk.Button(controls_frame, text="Summarize Text", command=perform_summarization)
   summarize_button.pack(side=tk.LEFT, padx=5)


   save_button = tk.Button(controls_frame, text="Save Summary", command=save_summary)
   save_button.pack(side=tk.LEFT, padx=5)


   summary_frame = tk.Frame(root)
   summary_frame.pack(fill=tk.BOTH, expand=True, padx=10, pady=5)


   summary_label = tk.Label(summary_frame, text="Summary:")
   summary_label.pack(anchor=tk.W)


   summary_area_scrollbar = tk.Scrollbar(summary_frame)
   summary_area_scrollbar.pack(side=tk.RIGHT, fill=tk.Y)


   summary_area = tk.Text(summary_frame, wrap=tk.WORD, height=10, yscrollcommand=summary_area_scrollbar.set)
   summary_area.pack(fill=tk.BOTH, expand=True)
   summary_area_scrollbar.config(command=summary_area.yview)


   info_label = tk.Label(root, text="", font=("Helvetica", 10), justify=tk.LEFT)
   info_label.pack(fill=tk.BOTH, expand=True, padx=10, pady=5)


   progress_bar = ttk.Progressbar(root, mode="indeterminate", length=400)
   progress_bar.place(x=200, y=620)


   root.mainloop()

if __name__ == "__main__":
   create_gui()

Happy Coding!
