{"id":161,"date":"2025-10-28T02:20:14","date_gmt":"2025-10-28T02:20:14","guid":{"rendered":"https:\/\/codetypingpro.com\/?p=161"},"modified":"2025-10-28T02:20:14","modified_gmt":"2025-10-28T02:20:14","slug":"23-real-world-python-projects-pdf-toolkit","status":"publish","type":"post","link":"https:\/\/codetypingpro.com\/?p=161","title":{"rendered":"23 &#8211; Real-World Python Projects &#8211; PDF Toolkit"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">\ud83c\udfaf <strong>Project Objective<\/strong><\/h3>\n\n\n\n<p>To build a <strong>multi-functional PDF Toolkit<\/strong> using Python that can <strong>merge, split, rotate, extract text, and add watermarks<\/strong> to PDF files \u2014 similar to tools like <em>SmallPDF<\/em> or <em>iLovePDF<\/em>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\udde9 1. <strong>Overview<\/strong><\/h2>\n\n\n\n<p>PDFs are one of the most used document formats in the world.<br>A <strong>PDF Toolkit<\/strong> automates tasks like combining reports, splitting pages, extracting text, and adding watermarks \u2014 saving time and effort.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83d\udcbc <strong>Real-World Uses<\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Combine invoices or reports into one file<\/li>\n\n\n\n<li>Extract text for analysis<\/li>\n\n\n\n<li>Add company logos or \u201cConfidential\u201d watermarks<\/li>\n\n\n\n<li>Rotate or reorder scanned pages<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\u2699\ufe0f 2. 
<strong>Required Modules<\/strong><\/h2>\n\n\n\n<p>We\u2019ll use <code>PyPDF2<\/code> to read and write PDFs, and install it automatically if it is missing.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Auto-install required modules\ntry:\n    from PyPDF2 import PdfReader, PdfWriter\nexcept ModuleNotFoundError:\n    import subprocess\n    import sys\n    # Call pip through the running interpreter so the package is installed\n    # into the same environment this script uses\n    subprocess.check_call(&#91;sys.executable, \"-m\", \"pip\", \"install\", \"PyPDF2\"])\n    from PyPDF2 import PdfReader, PdfWriter\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\udde0 3. <strong>Merge Multiple PDFs<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def merge_pdfs(pdf_list, output_file):\n    writer = PdfWriter()\n    for pdf in pdf_list:\n        reader = PdfReader(pdf)\n        for page in reader.pages:\n            writer.add_page(page)\n    with open(output_file, \"wb\") as f:\n        writer.write(f)\n    print(f\"\u2705 Merged {len(pdf_list)} PDFs into '{output_file}'\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\uddea Example:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>merge_pdfs(&#91;\"report1.pdf\", \"report2.pdf\", \"report3.pdf\"], \"merged_report.pdf\")\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\u2702\ufe0f 4. 
<strong>Split PDF into Individual Pages<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def split_pdf(input_file):\n    reader = PdfReader(input_file)\n    for i, page in enumerate(reader.pages):\n        writer = PdfWriter()\n        writer.add_page(page)\n        output_filename = f\"page_{i+1}.pdf\"\n        with open(output_filename, \"wb\") as f:\n            writer.write(f)\n    print(f\"\u2705 Split '{input_file}' into {len(reader.pages)} pages.\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\uddea Example:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>split_pdf(\"merged_report.pdf\")\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udd01 5. <strong>Rotate Pages<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def rotate_pdf(input_file, output_file, rotation_angle=90):\n    reader = PdfReader(input_file)\n    writer = PdfWriter()\n\n    for page in reader.pages:\n        page.rotate(rotation_angle)\n        writer.add_page(page)\n\n    with open(output_file, \"wb\") as f:\n        writer.write(f)\n    print(f\"\u2705 Rotated all pages in '{input_file}' by {rotation_angle}\u00b0\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\uddea Example:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>rotate_pdf(\"page_1.pdf\", \"rotated_page.pdf\", 180)\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udca7 6. 
<strong>Add Watermark to Each Page<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def add_watermark(input_file, watermark_file, output_file):\n    reader = PdfReader(input_file)\n    writer = PdfWriter()\n    watermark = PdfReader(watermark_file).pages&#91;0]\n\n    for page in reader.pages:\n        page.merge_page(watermark)\n        writer.add_page(page)\n\n    with open(output_file, \"wb\") as f:\n        writer.write(f)\n    print(f\"\u2705 Added watermark to '{input_file}' and saved as '{output_file}'\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\uddea Example:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>add_watermark(\"merged_report.pdf\", \"watermark.pdf\", \"watermarked_output.pdf\")\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udcdc 7. <strong>Extract Text from PDF<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def extract_text(input_file):\n    reader = PdfReader(input_file)\n    all_text = \"\"\n    for page in reader.pages:\n        # Fall back to an empty string if a page yields no text (e.g. scanned images)\n        all_text += (page.extract_text() or \"\") + \"\\n\"\n    with open(\"extracted_text.txt\", \"w\", encoding=\"utf-8\") as f:\n        f.write(all_text)\n    print(\"\u2705 Text extracted and saved as 'extracted_text.txt'\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83e\uddea Example:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>extract_text(\"report.pdf\")\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83e\uddf0 8. <strong>Interactive Menu System<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def main():\n    print(\"=== PDF Toolkit ===\")\n    print(\"1. Merge PDFs\")\n    print(\"2. Split PDF\")\n    print(\"3. Rotate PDF\")\n    print(\"4. Add Watermark\")\n    print(\"5. Extract Text\")\n    print(\"6. 
Exit\")\n\n    choice = input(\"Enter your choice: \")\n\n    if choice == \"1\":\n        files = input(\"Enter PDF filenames (comma separated): \").split(\",\")\n        output = input(\"Output file name: \")\n        merge_pdfs(&#91;f.strip() for f in files], output)\n    elif choice == \"2\":\n        file = input(\"Enter PDF filename to split: \")\n        split_pdf(file)\n    elif choice == \"3\":\n        file = input(\"Enter PDF filename: \")\n        angle = int(input(\"Rotation angle (90\/180\/270): \"))\n        output = input(\"Output file name: \")\n        rotate_pdf(file, output, angle)\n    elif choice == \"4\":\n        file = input(\"Enter PDF filename: \")\n        watermark = input(\"Enter watermark PDF filename: \")\n        output = input(\"Output file name: \")\n        add_watermark(file, watermark, output)\n    elif choice == \"5\":\n        file = input(\"Enter PDF filename: \")\n        extract_text(file)\n    elif choice == \"6\":\n        print(\"Goodbye \ud83d\udc4b\")\n    else:\n        print(\"Invalid choice.\")\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>if __name__ == \"__main__\":\n    main()\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udca1 9. 
<strong>Enhancement Ideas<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Feature<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td>\ud83d\uddbc GUI Toolkit<\/td><td>Add a Tkinter-based file selector<\/td><\/tr><tr><td>\ud83d\udd10 PDF Security<\/td><td>Add password protection or encryption<\/td><\/tr><tr><td>\ud83d\udcd1 Metadata<\/td><td>Display or edit PDF metadata (title, author)<\/td><\/tr><tr><td>\ud83d\udcc1 Batch Mode<\/td><td>Process entire folders automatically<\/td><\/tr><tr><td>\u2601\ufe0f Cloud Upload<\/td><td>Save output directly to Google Drive or Dropbox<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">\u2705 <strong>Summary<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Feature<\/th><th>Function<\/th><\/tr><\/thead><tbody><tr><td>\ud83d\udcce Merge<\/td><td>Combine multiple PDFs<\/td><\/tr><tr><td>\u2702\ufe0f Split<\/td><td>Separate pages into files<\/td><\/tr><tr><td>\ud83d\udd01 Rotate<\/td><td>Rotate PDF pages<\/td><\/tr><tr><td>\ud83d\udca7 Watermark<\/td><td>Add watermark\/logo<\/td><\/tr><tr><td>\ud83d\udcdc Extract<\/td><td>Extract text from PDFs<\/td><\/tr><tr><td>\ud83e\uddf0 Extend<\/td><td>GUI or encryption features possible<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\ud83c\udfaf Project Objective To build a multi-functional PDF Toolkit using Python that can merge, split, rotate, extract text, and add watermarks to PDF files \u2014 similar to tools like SmallPDF or iLovePDF. \ud83e\udde9 1. 
Overview PDFs are one of the most used document formats in the world. A PDF Toolkit automates tasks like combining reports, splitting [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-161","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/codetypingpro.com\/index.php?rest_route=\/wp\/v2\/posts\/161","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/codetypingpro.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/codetypingpro.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/codetypingpro.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/codetypingpro.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=161"}],"version-history":[{"count":1,"href":"https:\/\/codetypingpro.com\/index.php?rest_route=\/wp\/v2\/posts\/161\/revisions"}],"predecessor-version":[{"id":162,"href":"https:\/\/codetypingpro.com\/index.php?rest_route=\/wp\/v2\/posts\/161\/revisions\/162"}],"wp:attachment":[{"href":"https:\/\/codetypingpro.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=161"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/codetypingpro.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=161"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/codetypingpro.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=161"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}