🎯 Project Objective
To build an automated Python Image Downloader that fetches and saves images from a website or Google Image search results, useful for data collection, content management, and AI datasets.
Skills You'll Learn:
- Web scraping with `requests` & `BeautifulSoup`
- Working with URLs and file systems
- File I/O for saving images
- Error handling and rate limiting
- Automation and progress tracking
🔧 Project Overview
The Image Downloader App:
- Takes a keyword or URL from the user
- Finds and downloads all image files (`.jpg`, `.png`, `.gif`, etc.)
- Saves them in a structured local folder
- (Optional) Displays progress and handles duplicates
Real-Life Applications:
- Collecting product or art images
- Creating ML/AI image datasets
- Automating wallpaper downloads
- Archiving online photo galleries
⚙️ Technology Stack
| Library | Purpose |
|---|---|
| `requests` | Fetch HTML and image data |
| `BeautifulSoup` | Parse website content |
| `os` | File and directory handling |
| `re` | Regular expressions for filtering URLs |
| `tqdm` | Progress bar (optional, auto-installed) |
💻 Version 1: Console-Based Image Downloader
This script automatically installs missing dependencies, scrapes a URL, and saves images.
```python
import os
import re
import sys
import subprocess

# ✅ Auto-install missing packages
def install(module, package=None):
    # The import name and the pip package name can differ
    # (e.g. import "bs4", install "beautifulsoup4")
    try:
        __import__(module)
    except ImportError:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package or module])

install("requests")
install("bs4", "beautifulsoup4")
install("tqdm")

import requests
from bs4 import BeautifulSoup
from tqdm import tqdm
from urllib.parse import urljoin

def download_images(url, folder="downloaded_images"):
    # Create folder if it does not exist
    os.makedirs(folder, exist_ok=True)

    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    img_tags = soup.find_all("img")

    if not img_tags:
        print("❌ No images found.")
        return

    print(f"🖼️ Found {len(img_tags)} images. Downloading...")

    for img in tqdm(img_tags, desc="Downloading"):
        img_url = img.get("src")
        if not img_url:
            continue

        # Make absolute URL
        img_url = urljoin(url, img_url)
        img_name = os.path.basename(img_url.split("?")[0])

        # Only save image files
        if not re.search(r"\.(jpg|jpeg|png|gif)$", img_name, re.IGNORECASE):
            continue

        try:
            img_data = requests.get(img_url, timeout=10).content
            with open(os.path.join(folder, img_name), "wb") as f:
                f.write(img_data)
        except Exception as e:
            print(f"⚠️ Skipped {img_url}: {e}")

    print(f"\n✅ Download complete! Images saved in '{folder}' folder.")

# Example Usage
if __name__ == "__main__":
    target_url = input("Enter the website URL to scrape images from: ")
    download_images(target_url)
```
🧾 Example Output
```
Enter the website URL to scrape images from: https://books.toscrape.com
🖼️ Found 60 images. Downloading...
Downloading: 100%|██████████████████████| 60/60 [00:09<00:00, 6.52it/s]

✅ Download complete! Images saved in 'downloaded_images' folder.
```
🧩 Version 2: Search-Based Image Downloader
This one searches by keyword using the Bing Image Search API (or you can adapt it to another provider).
```python
import os
import requests

API_KEY = "your_bing_api_key"
SEARCH_URL = "https://api.bing.microsoft.com/v7.0/images/search"

def search_images(keyword, count=10):
    headers = {"Ocp-Apim-Subscription-Key": API_KEY}
    params = {"q": keyword, "count": count}
    response = requests.get(SEARCH_URL, headers=headers, params=params)
    data = response.json()

    folder = f"images_{keyword.replace(' ', '_')}"
    os.makedirs(folder, exist_ok=True)

    downloaded = 0
    for i, img in enumerate(data["value"]):
        # Read the URL outside the try block so the error message can show it
        img_url = img.get("contentUrl")
        if not img_url:
            continue
        try:
            img_data = requests.get(img_url, timeout=10).content
            with open(os.path.join(folder, f"{keyword}_{i+1}.jpg"), "wb") as f:
                f.write(img_data)
            downloaded += 1
        except Exception as e:
            print(f"⚠️ Error downloading {img_url}: {e}")

    # Report the actual number saved, which may be fewer than requested
    print(f"✅ Downloaded {downloaded} images for '{keyword}'")

search_images("sunsets", 15)
```
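Note that the search script saves every result with a `.jpg` extension even when the source is a PNG or GIF. A small helper (a hypothetical `pick_extension`, not part of the original script) can choose the extension from the response's `Content-Type` header instead:

```python
# Map common image media types to file extensions
EXT_BY_TYPE = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
}

def pick_extension(content_type, default=".jpg"):
    """Pick a file extension from a Content-Type header value."""
    # Strip parameters such as "; charset=utf-8" before the lookup
    media_type = content_type.split(";")[0].strip().lower()
    return EXT_BY_TYPE.get(media_type, default)

# In search_images, after the requests.get call:
#   ext = pick_extension(resp.headers.get("Content-Type", ""))
#   filename = f"{keyword}_{i+1}{ext}"
```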
🧰 Optional Add-Ons
| Feature | Description |
|---|---|
| ✅ Duplicate Filtering | Compare hashes to skip identical images |
| ⏱️ Delay/Throttle | Add `time.sleep()` between requests |
| 📁 Auto Categorization | Sort images by keyword/topic |
| 🧮 Progress Bar | Use `tqdm` for download visualization |
| 🧠 AI Integration | Use OpenAI or CLIP models to caption or tag images |
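The first two add-ons are easy to sketch. A hash-based duplicate check (the `is_duplicate` helper below is illustrative, not part of the scripts above) compares a digest of each image's bytes against those already saved, and a `time.sleep()` call throttles the loop:

```python
import hashlib

def is_duplicate(data, seen_hashes):
    """Return True if these image bytes were already downloaded.

    seen_hashes is a set shared across the whole download loop;
    the hash of any new image is recorded as a side effect.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False

# Inside the download loop of download_images:
#   seen = set()  # created once, before the loop
#   if is_duplicate(img_data, seen):
#       continue  # skip identical images
#   time.sleep(0.5)  # throttle between requests
```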
🌍 Real-Life Automation Use-Cases
- Building datasets for AI training (e.g., dogs, cars, food images)
- Downloading product photos from e-commerce platforms
- Backing up gallery or blog images
- Generating visual datasets for research
🧠 Learning Outcomes
After completing this project, you'll:
- Master website data extraction
- Automate repetitive download tasks
- Safely manage and structure large image datasets
- Learn the ethics & legality of scraping (robots.txt compliance)
⚠️ Ethical Scraping Tips
- Always check a site's robots.txt or terms of use before scraping.
- Use headers to mimic browsers:
  ```python
  headers = {"User-Agent": "Mozilla/5.0"}
  response = requests.get(url, headers=headers)
  ```
- Avoid sending too many requests in quick succession; respect server limits.
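The robots.txt check can even be automated with the standard library. This sketch (the `allowed_by_robots` helper is an assumption, not part of the scripts above) takes the robots.txt text and asks whether a URL may be fetched:

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(url, robots_txt, user_agent="*"):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Usage sketch: fetch the site's robots.txt with requests first, e.g.
#   robots_txt = requests.get("https://example.com/robots.txt", timeout=10).text
#   if allowed_by_robots(target_url, robots_txt):
#       download_images(target_url)
```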
