A simple public proxy parser in Python using Google search. We’ll use the googlesearch-python library to perform Google searches and BeautifulSoup to parse the HTML.
First, ensure you have the necessary libraries installed:
pip install beautifulsoup4 googlesearch-python requests
Now, let’s create the proxy parser:
from googlesearch import search
from bs4 import BeautifulSoup
import requests

def fetch_proxies():
    proxies = []
    # Perform a Google search for public proxy lists
    query = "public proxy list"
    # googlesearch-python yields result URLs; sleep_interval spaces out the requests
    for url in search(query, num_results=5, sleep_interval=2):
        # Fetch the HTML content of the search result
        try:
            response = requests.get(url, timeout=10)
            if response.status_code == 200:
                # Parse the HTML using BeautifulSoup
                soup = BeautifulSoup(response.text, 'html.parser')
                # Assume the first two table cells in each row hold the IP and port
                for row in soup.find_all('tr'):
                    cols = row.find_all('td')
                    if len(cols) >= 2:
                        proxy = cols[0].text.strip() + ':' + cols[1].text.strip()
                        proxies.append(proxy)
        except requests.RequestException as e:
            print(f"Error fetching proxies from {url}: {e}")
    return proxies

if __name__ == "__main__":
    for proxy in fetch_proxies():
        print(proxy)
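Note that the table-parsing step accepts any row with at least two cells, so the result list can pick up header text or other junk. A small validation helper can filter the candidates down to strings that at least look like IPv4:port pairs; `filter_proxies` below is a hypothetical helper sketched for illustration, using a deliberately loose regex (it does not reject octets above 255).

```python
import re

# Loose shape check for "IPv4:port" strings; does not validate octet ranges.
PROXY_RE = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}$")

def filter_proxies(candidates):
    """Keep only entries that look like IPv4:port pairs."""
    return [c for c in candidates if PROXY_RE.match(c)]

print(filter_proxies(["8.8.8.8:8080", "Country", "1.2.3.4:80", "n/a"]))
# → ['8.8.8.8:8080', '1.2.3.4:80']
```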
This script performs a Google search for public proxy lists, fetches each result page, parses its HTML, and extracts the IP addresses and ports of the proxies. Please note that the quality and reliability of proxies obtained this way can vary widely. Additionally, always use proxies responsibly and adhere to the terms of service of the websites you access through them.
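Since many scraped proxies are dead, it's worth testing each one before relying on it. The sketch below passes a proxy to requests via its `proxies` parameter and treats any connection failure as "not working"; the httpbin.org endpoint is just an example test URL, not something the script above depends on.

```python
import requests

def check_proxy(proxy, timeout=5):
    """Return True if the proxy can fetch a test URL within the timeout."""
    proxies = {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    try:
        response = requests.get("https://httpbin.org/ip",
                                proxies=proxies, timeout=timeout)
        return response.status_code == 200
    except requests.RequestException:
        return False

# Example (network-dependent, so results will vary):
# working = [p for p in fetch_proxies() if check_proxy(p)]
```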