Project Objective
To build a multi-functional PDF toolkit in Python that can merge, split, and rotate PDF files, extract their text, and add watermarks, similar to tools like SmallPDF or iLovePDF.
1. Overview
PDFs are one of the most used document formats in the world.
A PDF toolkit automates tasks like combining reports, splitting pages, extracting text, and adding watermarks, saving time and effort.
Real-World Uses
- Combine invoices or reports into one file
- Extract text for analysis
- Add company logos or "Confidential" watermarks
- Rotate or reorder scanned pages
2. Required Modules
We'll use PyPDF2 to handle PDFs and install it automatically if it is missing.
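# Auto-install required modules
try:
    from PyPDF2 import PdfReader, PdfWriter
except ModuleNotFoundError:
    import subprocess
    import sys

    # Call pip through the running interpreter so the package is installed
    # into the same environment as this script
    subprocess.check_call([sys.executable, "-m", "pip", "install", "PyPDF2"])
    from PyPDF2 import PdfReader, PdfWriter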
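The try/except pattern above installs the package only after an import has failed. A variation, sketched here on the assumption that you would rather check before importing, uses `importlib.util.find_spec` from the standard library. The helper name `ensure_module` is hypothetical and not part of PyPDF2.

```python
import importlib.util
import subprocess
import sys


def ensure_module(module_name, package_name=None):
    """Return True if the module was already importable, False if it was installed.

    ensure_module is a hypothetical helper written for this tutorial.
    """
    if importlib.util.find_spec(module_name) is not None:
        return True
    # Run pip through the interpreter executing this script so the
    # package lands in the same environment.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", package_name or module_name]
    )
    # Let the import system notice the newly installed package.
    importlib.invalidate_caches()
    return False
```

Calling `ensure_module("PyPDF2")` before the `from PyPDF2 import ...` line would give the same effect as the try/except block.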
3. Merge Multiple PDFs
def merge_pdfs(pdf_list, output_file):
    writer = PdfWriter()
    # Append every page of every input file, in the order given
    for pdf in pdf_list:
        reader = PdfReader(pdf)
        for page in reader.pages:
            writer.add_page(page)
    with open(output_file, "wb") as f:
        writer.write(f)
    print(f"✅ Merged {len(pdf_list)} PDFs into '{output_file}'")
Example:
merge_pdfs(["report1.pdf", "report2.pdf", "report3.pdf"], "merged_report.pdf")
4. Split PDF into Individual Pages
def split_pdf(input_file):
    reader = PdfReader(input_file)
    for i, page in enumerate(reader.pages):
        # Write each page to its own single-page PDF
        writer = PdfWriter()
        writer.add_page(page)
        output_filename = f"page_{i+1}.pdf"
        with open(output_filename, "wb") as f:
            writer.write(f)
    print(f"✅ Split '{input_file}' into {len(reader.pages)} pages.")
Example:
split_pdf("merged_report.pdf")
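Because `split_pdf` always writes `page_1.pdf`, `page_2.pdf`, and so on into the current directory, splitting a second document overwrites the pages of the first. One possible remedy is to derive each output name from the input file's name. The sketch below uses a hypothetical helper `page_output_name`, not part of the original toolkit or of PyPDF2, and needs only `pathlib`.

```python
from pathlib import Path


def page_output_name(input_file, page_number, output_dir="."):
    """Build a name such as 'merged_report_page_1.pdf' for one split page."""
    # stem is the file name without its directory or extension
    stem = Path(input_file).stem
    return str(Path(output_dir) / f"{stem}_page_{page_number}.pdf")
```

Inside `split_pdf`, the line that sets `output_filename` could then become `output_filename = page_output_name(input_file, i + 1)`.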
5. Rotate Pages
def rotate_pdf(input_file, output_file, rotation_angle=90):
    reader = PdfReader(input_file)
    writer = PdfWriter()
    for page in reader.pages:
        # Rotate clockwise by the given angle
        page.rotate(rotation_angle)
        writer.add_page(page)
    with open(output_file, "wb") as f:
        writer.write(f)
    print(f"✅ Rotated all pages in '{input_file}' by {rotation_angle}°")
Example:
rotate_pdf("page_1.pdf", "rotated_page.pdf", 180)
6. Add Watermark to Each Page
def add_watermark(input_file, watermark_file, output_file):
    reader = PdfReader(input_file)
    writer = PdfWriter()
    # Use the first page of the watermark PDF as the overlay
    watermark = PdfReader(watermark_file).pages[0]
    for page in reader.pages:
        page.merge_page(watermark)
        writer.add_page(page)
    with open(output_file, "wb") as f:
        writer.write(f)
    print(f"✅ Added watermark to '{input_file}' and saved as '{output_file}'")
Example:
add_watermark("merged_report.pdf", "watermark.pdf", "watermarked_output.pdf")
7. Extract Text from PDF
def extract_text(input_file):
    reader = PdfReader(input_file)
    all_text = ""
    for page in reader.pages:
        # Guard against pages with no extractable text (e.g. scanned images)
        all_text += (page.extract_text() or "") + "\n"
    with open("extracted_text.txt", "w", encoding="utf-8") as f:
        f.write(all_text)
    print("✅ Text extracted and saved as 'extracted_text.txt'")
Example:
extract_text("report.pdf")
8. Interactive Menu System
def main():
    print("=== PDF Toolkit ===")
    print("1. Merge PDFs")
    print("2. Split PDF")
    print("3. Rotate PDF")
    print("4. Add Watermark")
    print("5. Extract Text")
    print("6. Exit")
    choice = input("Enter your choice: ")
    if choice == "1":
        files = input("Enter PDF filenames (comma separated): ").split(",")
        output = input("Output file name: ")
        merge_pdfs([f.strip() for f in files], output)
    elif choice == "2":
        file = input("Enter PDF filename to split: ")
        split_pdf(file)
    elif choice == "3":
        file = input("Enter PDF filename: ")
        angle = int(input("Rotation angle (90/180/270): "))
        output = input("Output file name: ")
        rotate_pdf(file, output, angle)
    elif choice == "4":
        file = input("Enter PDF filename: ")
        watermark = input("Enter watermark PDF filename: ")
        output = input("Output file name: ")
        add_watermark(file, watermark, output)
    elif choice == "5":
        file = input("Enter PDF filename: ")
        extract_text(file)
    elif choice == "6":
        print("Goodbye")
    else:
        print("Invalid choice.")

if __name__ == "__main__":
    main()
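The menu passes user input straight to the toolkit functions, so a stray comma or an angle such as 45 only fails later, inside PyPDF2 (which documents the rotation angle as a multiple of 90). A minimal sketch of validating input up front follows. Both helper names are hypothetical additions, not part of the original menu.

```python
def parse_file_list(raw):
    """Split a comma-separated string into clean file names, dropping empty entries."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def parse_rotation_angle(raw):
    """Convert user input to a rotation angle, accepting only multiples of 90."""
    try:
        angle = int(raw.strip())
    except ValueError:
        raise ValueError(f"'{raw}' is not a whole number") from None
    if angle % 90 != 0:
        raise ValueError("Rotation angle must be a multiple of 90")
    return angle
```

In `main()`, option 1 could call `parse_file_list(...)` on the raw input and option 3 could wrap `parse_rotation_angle(...)` in a try/except that prints the error message and returns to the menu.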
9. Enhancement Ideas
| Feature | Description |
|---|---|
| GUI Toolkit | Add a Tkinter-based file selector |
| PDF Security | Add password protection or encryption |
| Metadata | Display or edit PDF metadata (title, author) |
| Batch Mode | Process entire folders automatically |
| Cloud Upload | Save output directly to Google Drive or Dropbox |
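As a starting point for the Batch Mode idea, the sketch below collects the PDF files in a folder so they can be passed to any of the toolkit functions. It uses only `pathlib`. The helper name `find_pdfs` is hypothetical, and it assumes only the top level of the folder should be scanned.

```python
from pathlib import Path


def find_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        # Match the extension case-insensitively so 'SCAN.PDF' is included
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

For example, `merge_pdfs([str(p) for p in find_pdfs("reports")], "all_reports.pdf")` would merge every PDF in a `reports` folder, where the folder name is only an illustration.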
Summary
| Feature | Function |
|---|---|
| Merge | Combine multiple PDFs |
| Split | Separate pages into files |
| Rotate | Rotate PDF pages |
| Watermark | Add watermark/logo |
| Extract | Extract text from PDFs |
| Extend | GUI or encryption features possible |
