In today’s interconnected world, language translation plays a crucial role in bridging communication gaps across different cultures and regions. With advancements in natural language processing (NLP) and machine learning, building a robust language translator has become more accessible than ever. In this tutorial, we will guide you through the process of creating a language translator using Transformers in Python.
We’ll be using the MarianMTModel from Hugging Face’s Transformers library, which is well regarded for its translation quality. To make things even better, we’ll create a user-friendly graphical user interface (GUI) with tkinter, so our translator will be easy for anyone to use. By the time we’re done, you’ll have a working language translator that can handle multiple languages.
Let’s get started!
Table of Contents
- Necessary Libraries
- Imports
- Translation Function
- Translate Button Click Handler
- Set Cursor Type
- GUI Setup
- Language Dictionary
- Main Entry Point
- Example
- Full Code
Necessary Libraries
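Tkinter ships with most Python installations, so it usually needs no separate install (on some Linux distributions it comes from the system package manager, for example the python3-tk package on Debian and Ubuntu). Don’t forget to install Transformers, together with PyTorch and SentencePiece, which the Marian model and tokenizer rely on, using the terminal or command prompt so the code works properly:
$ pip install transformers torch sentencepiece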
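Before moving on, it can help to confirm that the packages are visible to the interpreter you plan to use. The snippet below is a minimal sketch: the helper name missing_modules and the REQUIRED_MODULES list are illustrative, and the list assumes the PyTorch backend and SentencePiece tokenizer that the code in this tutorial relies on. Note that find_spec only confirms a module can be located, not that it imports cleanly.

```python
import importlib.util

# Assumed requirements for this tutorial (the list itself is illustrative).
REQUIRED_MODULES = ["tkinter", "transformers", "torch", "sentencepiece"]


def missing_modules(names):
    """Return the module names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "nothing")
```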
Imports
Just like any great adventure, we begin by gathering our essential tools. Here’s what we’ll need:
- MarianMTModel and MarianTokenizer: These are our heroes for machine translation, handling the heavy lifting of language processing.
- Tkinter: To create a graphical user interface (GUI).
- Messagebox: Think of this as our way to pop up helpful messages and alerts.
- OptionMenu: To create a drop-down menu for selecting an option.
- StringVar: This magical tool dynamically links variables with tkinter widgets, making sure everything stays in sync.
- ttk: Providing us with stylish, themed widgets to make our GUI look polished.
- Threading: Allowing our program to multitask like a pro, keeping the main window smooth and responsive.
- Platform: Helping us identify the operating system so we can set the cursor just right.
from transformers import MarianMTModel, MarianTokenizer
import tkinter as tk
from tkinter import messagebox, OptionMenu, StringVar, ttk
import threading
import platform
With these tools in hand, we’re ready to dive into our journey to build an awesome language translator!
Translation Function
Now that we have collected our tools, let’s move on to the engine that drives our program: the translate() function, which translates text from one language to another. How, you might wonder? Don’t worry, we’ll go through it step by step:
- First, the function constructs the model name and loads it along with the tokenizer based on the source and target languages. Then, it tokenizes the input text and prepares it for the model.
- Next, it uses beam search, which considers multiple possible translations to generate a translation with improved quality.
- Finally, once we get the generated tokens of the translation, they are decoded into a readable format.
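To make the beam search step above more concrete, here is a toy sketch. It is not how model.generate() is implemented (the real method also handles end-of-sequence tokens, length penalties, and batching); TOY_MODEL and its probabilities are made up purely for illustration. It shows why keeping more than one candidate can beat a greedy choice.

```python
import math

# Hypothetical next-token probabilities, keyed by the tokens generated so far.
TOY_MODEL = {
    (): {"le": 0.6, "un": 0.4},
    ("le",): {"chat": 0.55, "chien": 0.45},
    ("un",): {"chat": 0.9, "chien": 0.1},
}


def beam_search(model, num_beams, length):
    """Return the highest-scoring token sequence of the given length."""
    beams = [((), 0.0)]  # (tokens so far, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for tokens, score in beams:
            for token, prob in model[tokens].items():
                candidates.append((tokens + (token,), score + math.log(prob)))
        # Keep only the num_beams best partial sequences for the next step.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    return beams[0][0]


if __name__ == "__main__":
    print(beam_search(TOY_MODEL, num_beams=1, length=2))  # greedy choice
    print(beam_search(TOY_MODEL, num_beams=2, length=2))  # two beams
```

With a single beam, the search commits to "le" because it is the likeliest first token and ends with a sequence of probability 0.33. With two beams, "un" survives the first step and "un chat" wins with probability 0.36.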
def translate(texts, src_lang="en", tgt_lang="fr", num_beams=5, early_stopping=True):
    """Translate texts from src_lang to tgt_lang using MarianMTModel with beam search."""
    model_name = f'Helsinki-NLP/opus-mt-{src_lang}-{tgt_lang}'
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    # Tokenize the text
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    # Generate translation using the model with beam search
    translated = model.generate(**inputs, num_beams=num_beams, early_stopping=early_stopping)
    # Decode the translated text
    translated_texts = [tokenizer.decode(t, skip_special_tokens=True) for t in translated]
    return translated_texts
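Because translate() calls from_pretrained() on every invocation, the tokenizer and model are loaded again each time the function runs. One possible refinement, sketched below under assumptions, is to cache the loaded pair per language combination. The helper names build_model_name, load_model, and translate_cached are illustrative, and the cache size of 4 is an arbitrary choice.

```python
from functools import lru_cache


def build_model_name(src_lang, tgt_lang):
    """Build the checkpoint name using the same pattern as translate()."""
    return f"Helsinki-NLP/opus-mt-{src_lang}-{tgt_lang}"


@lru_cache(maxsize=4)  # arbitrary limit on how many language pairs stay in memory
def load_model(src_lang, tgt_lang):
    """Load the tokenizer and model once per language pair, then reuse them."""
    # Imported here so the sketch can be imported without the library installed.
    from transformers import MarianMTModel, MarianTokenizer

    model_name = build_model_name(src_lang, tgt_lang)
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    return tokenizer, model


def translate_cached(texts, src_lang="en", tgt_lang="fr", num_beams=5):
    """Same steps as translate(), but reusing the cached tokenizer and model."""
    tokenizer, model = load_model(src_lang, tgt_lang)
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    outputs = model.generate(**inputs, num_beams=num_beams, early_stopping=True)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

The first call for a given pair still downloads or loads the checkpoint. Later calls for the same pair skip that step, at the cost of keeping the model in memory.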
Translate Button Click Handler
With the engine of our program complete, it is time to build the control that steers it: the on_translate() function. When triggered, this function retrieves the input from the text widget and strips surrounding whitespace using text_entry.get("1.0", tk.END).strip(). It then ensures that the text is not empty; if it is, it prompts the user to enter some text with a message box. Next, it checks that the user has selected both the source and target languages. If not, it prompts the user again to select the languages from the drop-down menus with a message box.
After validating the inputs, it converts the language names to their corresponding codes using language_dict (e.g., English is “en” and French is “fr”). With all inputs ready, the function proceeds as follows:
- It starts a new thread that runs the nested function translate_thread().
- Inside that thread, it sets the cursor to a watch icon and starts the progress bar to indicate the process has begun.
- It calls the translate() function and updates the result text variable with the translation.
- Any error that occurs during this process is reported with a message box.
- Once the operation is over, the progress bar stops and the cursor is reset, showing that the translation is complete.
def on_translate():
    """Handle the translate button click."""
    source_text = text_entry.get("1.0", tk.END).strip()
    if not source_text:
        messagebox.showwarning("Input Error", "Please enter text to translate.")
        return
    src_lang_name = src_lang_var.get().strip()
    tgt_lang_name = tgt_lang_var.get().strip()
    if not src_lang_name or not tgt_lang_name:
        messagebox.showwarning("Input Error", "Please specify both source and target languages.")
        return
    src_lang = language_dict[src_lang_name]
    tgt_lang = language_dict[tgt_lang_name]

    def translate_thread():
        try:
            set_cursor("watch")
            progress_bar.start()
            translations = translate([source_text], src_lang, tgt_lang)
            result_text.set(translations[0])
        except Exception as e:
            messagebox.showerror("Translation Error", str(e))
        finally:
            progress_bar.stop()
            set_cursor("")

    # Run the translation in a separate thread
    threading.Thread(target=translate_thread).start()
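The handler above updates widgets directly from the worker thread. Tkinter is generally safest when widgets are touched only from the thread that runs mainloop(), so a more conservative variant hands the result back through a queue and lets the main loop pick it up with after(). The sketch below is one way to do that; run_in_background and poll_results are illustrative names, not part of the tutorial code, and the 100 ms polling interval is an arbitrary choice.

```python
import queue
import threading


def run_in_background(task, results):
    """Run task() in a worker thread and put ("ok", value) or ("error", exc) on results."""
    def worker():
        try:
            results.put(("ok", task()))
        except Exception as exc:
            results.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll_results(root, results, on_done, interval_ms=100):
    """Called from the Tk main loop: deliver a result if one is ready, else check again later."""
    try:
        status, payload = results.get_nowait()
    except queue.Empty:
        # Nothing yet: ask Tk to call this function again after interval_ms.
        root.after(interval_ms, poll_results, root, results, on_done, interval_ms)
    else:
        on_done(status, payload)
```

In on_translate(), this could be wired up by creating a queue.Queue(), calling run_in_background(lambda: translate([source_text], src_lang, tgt_lang), results), and then poll_results(root, results, on_done), where on_done is a callback that stops the progress bar and either sets result_text or shows the error. Every widget call then happens on the main thread.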
Set Cursor Type
Next, to keep things clear for our users, we need a way to show that something’s happening behind the scenes. That’s where our set_cursor() function comes in. It checks the operating system using the platform module and sets the cursor based on the cursor_type parameter, giving a visual cue that work is in progress. On Linux, any busy cursor is mapped to the watch cursor, and an empty string restores the default cursor on every platform.
def set_cursor(cursor_type):
    """Set the cursor type depending on the platform."""
    # An empty string restores the default cursor, so only remap busy cursors on Linux
    if cursor_type and platform.system() == 'Linux':
        cursor_type = 'watch'
    root.config(cursor=cursor_type)
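Since the busy cursor always has to be reset afterwards, the set-and-restore pair can also be packaged as a context manager. This is only a sketch of an alternative: busy_cursor is an illustrative name, and it assumes the widget accepts config(cursor=...) the way a Tk root window does.

```python
from contextlib import contextmanager


@contextmanager
def busy_cursor(widget, cursor="watch"):
    """Show a busy cursor on widget for the duration of the with-block."""
    widget.config(cursor=cursor)
    try:
        yield
    finally:
        # An empty string restores the default cursor, even if the block raised.
        widget.config(cursor="")
```

Used as `with busy_cursor(root): ...`, the cursor is restored whether the body finishes normally or raises an exception.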
GUI Setup
For this step, let’s bring all our components together and build the graphical interface with the setup_gui() function. We’ll start by creating the main window and setting its title. First, we add a label with the name of our blog. Next, we set up a multi-line text widget where you can type in the text you want to translate.
Below the text widget, we place a label that says “Source Language” and a drop-down menu to select the source language. We use a StringVar to hold the selected language, with English set as the default. We’ll do the same for the target language, adding a label, a drop-down menu, and another StringVar, with French as the default.
After that, we add a “Translate” button that will trigger the on_translate() function when clicked. To show that the translation process is ongoing, we’ll include a progress bar. Finally, we create a StringVar to hold the results, which will be displayed on the result label.
Feel free to customize this function however you like. For example, you could replace the result label with a widget that lets you select and copy the results, or even add a copy button.
def setup_gui():
    """Set up the GUI components."""
    global root
    root = tk.Tk()
    root.title("Translator - The Pycodes")
    # Blog name label
    blog_name_label = tk.Label(root, text="The Pycodes", font=("Helvetica", 16, "bold"))
    blog_name_label.pack(pady=10)
    # Source text entry
    global text_entry
    text_entry = tk.Text(root, height=10, width=50)
    text_entry.pack(pady=10)
    # Source language selection
    src_lang_label = tk.Label(root, text="Source Language:")
    src_lang_label.pack()
    global src_lang_var
    src_lang_var = StringVar(root)
    src_lang_var.set("English")
    src_lang_menu = OptionMenu(root, src_lang_var, *language_dict.keys())
    src_lang_menu.pack()
    # Target language selection
    tgt_lang_label = tk.Label(root, text="Target Language:")
    tgt_lang_label.pack()
    global tgt_lang_var
    tgt_lang_var = StringVar(root)
    tgt_lang_var.set("French")
    tgt_lang_menu = OptionMenu(root, tgt_lang_var, *language_dict.keys())
    tgt_lang_menu.pack()
    # Translate button
    translate_button = tk.Button(root, text="Translate", command=on_translate)
    translate_button.pack(pady=10)
    # Progress bar
    global progress_bar
    progress_bar = ttk.Progressbar(root, mode='indeterminate')
    progress_bar.pack(pady=10)
    # Result display
    global result_text
    result_text = tk.StringVar()
    result_label = tk.Label(root, textvariable=result_text, wraplength=400, justify="left", bg="lightgray", height=10, width=50)
    result_label.pack(pady=10)
    return root
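As one example of the customization mentioned above, a copy button only needs a small callback that writes the current result to the clipboard. The sketch below uses the clipboard_clear() and clipboard_append() methods that Tk windows provide; the function name copy_result and the button wiring shown in the comment are illustrative.

```python
def copy_result(window, text_var):
    """Replace the clipboard contents with the current value of text_var."""
    window.clipboard_clear()
    window.clipboard_append(text_var.get())


# Possible wiring inside setup_gui(), after result_text has been created:
#   copy_button = tk.Button(root, text="Copy",
#                           command=lambda: copy_result(root, result_text))
#   copy_button.pack(pady=5)
```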
Language Dictionary
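Since the model names are built from language codes, our program would be incomplete without a dictionary that maps the language names shown in the drop-down menus to those codes. That is the job of language_dict. Keep in mind that a model is downloaded by name for each source and target combination, so a combination without a published model will surface as a translation error.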
# Available languages
language_dict = {
    "English": "en",
    "French": "fr",
    "German": "de",
    "Spanish": "es",
    "Italian": "it",
    "Dutch": "nl",
    "Portuguese": "pt",
    "Russian": "ru",
    "Chinese": "zh",
    "Japanese": "ja",
    "Korean": "ko"
}
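The dictionary can also be used to catch a bad selection early, for instance when the same language is chosen on both sides. The following is a minimal sketch under assumptions: resolve_language_pair is an illustrative helper, not part of the tutorial code, and SAMPLE_LANGUAGES is a shortened stand-in for language_dict.

```python
# Shortened stand-in for the tutorial's language_dict.
SAMPLE_LANGUAGES = {"English": "en", "French": "fr", "German": "de"}


def resolve_language_pair(src_name, tgt_name, languages):
    """Return (src_code, tgt_code), or raise ValueError with a user-facing message."""
    if src_name not in languages or tgt_name not in languages:
        raise ValueError("Please choose both languages from the list.")
    if src_name == tgt_name:
        raise ValueError("Source and target languages must be different.")
    return languages[src_name], languages[tgt_name]


if __name__ == "__main__":
    print(resolve_language_pair("English", "French", SAMPLE_LANGUAGES))
```

A handler could call this helper inside a try block and pass the ValueError message to messagebox.showwarning().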
Main Entry Point
We have finally reached the grand finale. The if __name__ == "__main__" guard ensures that the GUI starts only when the script is run directly, not when it is imported as a module. Inside the guard, we call the setup_gui() function to build the graphical user interface and then start the main event loop, which keeps the main window running and responsive to the user.
if __name__ == "__main__":
    # Set up and run the GUI
    root = setup_gui()
    root.mainloop()
Example
I ran this code on a Windows system as shown in the image below:
Also on a Linux system:
Full Code
from transformers import MarianMTModel, MarianTokenizer
import tkinter as tk
from tkinter import messagebox, OptionMenu, StringVar, ttk
import threading
import platform

def translate(texts, src_lang="en", tgt_lang="fr", num_beams=5, early_stopping=True):
    """Translate texts from src_lang to tgt_lang using MarianMTModel with beam search."""
    model_name = f'Helsinki-NLP/opus-mt-{src_lang}-{tgt_lang}'
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    # Tokenize the text
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    # Generate translation using the model with beam search
    translated = model.generate(**inputs, num_beams=num_beams, early_stopping=early_stopping)
    # Decode the translated text
    translated_texts = [tokenizer.decode(t, skip_special_tokens=True) for t in translated]
    return translated_texts

def on_translate():
    """Handle the translate button click."""
    source_text = text_entry.get("1.0", tk.END).strip()
    if not source_text:
        messagebox.showwarning("Input Error", "Please enter text to translate.")
        return
    src_lang_name = src_lang_var.get().strip()
    tgt_lang_name = tgt_lang_var.get().strip()
    if not src_lang_name or not tgt_lang_name:
        messagebox.showwarning("Input Error", "Please specify both source and target languages.")
        return
    src_lang = language_dict[src_lang_name]
    tgt_lang = language_dict[tgt_lang_name]

    def translate_thread():
        try:
            set_cursor("watch")
            progress_bar.start()
            translations = translate([source_text], src_lang, tgt_lang)
            result_text.set(translations[0])
        except Exception as e:
            messagebox.showerror("Translation Error", str(e))
        finally:
            progress_bar.stop()
            set_cursor("")

    # Run the translation in a separate thread
    threading.Thread(target=translate_thread).start()

def set_cursor(cursor_type):
    """Set the cursor type depending on the platform."""
    # An empty string restores the default cursor, so only remap busy cursors on Linux
    if cursor_type and platform.system() == 'Linux':
        cursor_type = 'watch'
    root.config(cursor=cursor_type)

def setup_gui():
    """Set up the GUI components."""
    global root
    root = tk.Tk()
    root.title("Translator - The Pycodes")
    # Blog name label
    blog_name_label = tk.Label(root, text="The Pycodes", font=("Helvetica", 16, "bold"))
    blog_name_label.pack(pady=10)
    # Source text entry
    global text_entry
    text_entry = tk.Text(root, height=10, width=50)
    text_entry.pack(pady=10)
    # Source language selection
    src_lang_label = tk.Label(root, text="Source Language:")
    src_lang_label.pack()
    global src_lang_var
    src_lang_var = StringVar(root)
    src_lang_var.set("English")
    src_lang_menu = OptionMenu(root, src_lang_var, *language_dict.keys())
    src_lang_menu.pack()
    # Target language selection
    tgt_lang_label = tk.Label(root, text="Target Language:")
    tgt_lang_label.pack()
    global tgt_lang_var
    tgt_lang_var = StringVar(root)
    tgt_lang_var.set("French")
    tgt_lang_menu = OptionMenu(root, tgt_lang_var, *language_dict.keys())
    tgt_lang_menu.pack()
    # Translate button
    translate_button = tk.Button(root, text="Translate", command=on_translate)
    translate_button.pack(pady=10)
    # Progress bar
    global progress_bar
    progress_bar = ttk.Progressbar(root, mode='indeterminate')
    progress_bar.pack(pady=10)
    # Result display
    global result_text
    result_text = tk.StringVar()
    result_label = tk.Label(root, textvariable=result_text, wraplength=400, justify="left", bg="lightgray", height=10, width=50)
    result_label.pack(pady=10)
    return root

# Available languages
language_dict = {
    "English": "en",
    "French": "fr",
    "German": "de",
    "Spanish": "es",
    "Italian": "it",
    "Dutch": "nl",
    "Portuguese": "pt",
    "Russian": "ru",
    "Chinese": "zh",
    "Japanese": "ja",
    "Korean": "ko"
}

if __name__ == "__main__":
    # Set up and run the GUI
    root = setup_gui()
    root.mainloop()
Happy Coding!