
Proxies for LangChain Applications

Last updated: April 2026

By Hex Proxies Engineering Team

Learn how to configure LangChain document loaders, web research agents, and custom tools to route requests through Hex Proxies for reliable, unblocked data access.

Intermediate · 20 minutes · AI & Data Science

Prerequisites

  • Python 3.10+
  • LangChain 0.2+ installed
  • Hex Proxies account

Steps

1

Install dependencies

Install LangChain, httpx, and BeautifulSoup. Configure your Hex Proxies credentials as environment variables.
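As a sketch of the credentials step, you can read the proxy credentials from environment variables once and share the resulting URL across all loaders and tools. The variable names here (`HEXPROXIES_USERNAME`, `HEXPROXIES_PASSWORD`) are illustrative, not an official convention:

```python
import os

# Illustrative variable names -- use whatever your deployment standardizes on.
HEX_USER = os.environ.get("HEXPROXIES_USERNAME", "YOUR_USER")
HEX_PASS = os.environ.get("HEXPROXIES_PASSWORD", "YOUR_PASS")

# Build the proxy URL once so every loader and tool can reuse it.
PROXY_URL = f"http://{HEX_USER}:{HEX_PASS}@gate.hexproxies.com:8080"
```

Keeping credentials out of source code also lets you rotate them without redeploying.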

2

Configure document loaders

Set up WebBaseLoader and RecursiveUrlLoader with proxy configuration for reliable document ingestion.

3

Build proxy-aware tools

Create custom LangChain tools that route web requests through Hex Proxies with geo-targeting support.

4

Assemble the agent

Combine your proxy-aware tools with an LLM to create a research agent that can access web content reliably.

5

Optimize for production

Add caching, rate limiting, and error handling to your LangChain proxy integration for production reliability.
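One piece of the error-handling step can be sketched as a small retry helper with exponential backoff. This is a minimal stdlib-only version (the function name and signature are this guide's own, not a LangChain or Hex Proxies API); wrap your proxied fetch calls in it so a transient proxy or network error doesn't kill an agent chain:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Call fn, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the real error.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

For example, `with_retries(lambda: client.get(url).text)` retries a failed fetch up to twice before raising.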

LangChain is the dominant framework for building LLM-powered applications. Many LangChain workflows — document loaders, web research agents, retrieval chains — need to fetch data from external websites. Without proxy infrastructure, these requests hit rate limits, geographic blocks, and anti-bot defenses.

LangChain Components That Need Proxies

  • **WebBaseLoader**: Fetches HTML from URLs for document ingestion
  • **RecursiveUrlLoader**: Crawls entire sites for knowledge base construction
  • **WebResearchRetriever**: Searches the web and fetches results in real-time
  • **Custom Tools**: Agent tools that call external APIs or scrape data

Configuring WebBaseLoader with Proxies

LangChain's WebBaseLoader uses `requests` under the hood, so you can pass proxy configuration through a session via `requests_kwargs`:

```python
import requests
from langchain_community.document_loaders import WebBaseLoader

def create_proxied_session(username: str, password: str) -> requests.Session:
    """Create a requests session configured with Hex Proxies."""
    session = requests.Session()
    proxy_url = f"http://{username}:{password}@gate.hexproxies.com:8080"
    session.proxies = {"http": proxy_url, "https": proxy_url}
    session.headers.update({
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36"
    })
    return session

# Use with WebBaseLoader
session = create_proxied_session("YOUR_USER", "YOUR_PASS")
loader = WebBaseLoader(
    web_paths=["https://example.com/page1", "https://example.com/page2"],
    requests_kwargs={"proxies": session.proxies},
)
docs = loader.load()
```

Custom Proxy-Aware Tool for Agents

Build a LangChain tool that routes all web requests through proxies:

```python
import httpx
from langchain.tools import tool

PROXY_URL = "http://YOUR_USER:YOUR_PASS@gate.hexproxies.com:8080"

@tool
def fetch_webpage(url: str) -> str:
    """Fetch a webpage through proxy infrastructure. Use for any URL that needs scraping."""
    with httpx.Client(proxy=PROXY_URL, timeout=30, follow_redirects=True) as client:
        resp = client.get(url, headers={
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
            "Accept": "text/html,application/xhtml+xml",
        })
        resp.raise_for_status()
        return resp.text[:10000]  # Limit context size for the LLM

@tool
def fetch_api_data(url: str) -> str:
    """Fetch JSON data from an API through proxy. Use for structured data endpoints."""
    with httpx.Client(proxy=PROXY_URL, timeout=30) as client:
        resp = client.get(url, headers={"Accept": "application/json"})
        resp.raise_for_status()
        return resp.text[:5000]
```

Geo-Targeted Research Agent

Build an agent that can research topics from specific geographic perspectives:

```python
import httpx
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor

def build_geo_proxy(country: str) -> str:
    return f"http://YOUR_USER-country-{country.lower()}:YOUR_PASS@gate.hexproxies.com:8080"

@tool
def fetch_geo_content(url: str, country: str = "US") -> str:
    """Fetch content as seen from a specific country. Useful for regional pricing or localized content."""
    proxy = build_geo_proxy(country)
    with httpx.Client(proxy=proxy, timeout=30) as client:
        resp = client.get(url)
        resp.raise_for_status()
        return resp.text[:8000]

llm = ChatOpenAI(model="gpt-4o")
tools = [fetch_webpage, fetch_api_data, fetch_geo_content]
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research assistant with web access through proxies."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```

RecursiveUrlLoader for Knowledge Bases

When building a RAG knowledge base that requires crawling entire documentation sites:

```python
from bs4 import BeautifulSoup
from langchain_community.document_loaders import RecursiveUrlLoader

PROXY_URL = "http://YOUR_USER:YOUR_PASS@gate.hexproxies.com:8080"

def bs4_extractor(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text(separator="\n", strip=True)

loader = RecursiveUrlLoader(
    url="https://docs.example.com",
    max_depth=3,
    extractor=bs4_extractor,
    requests_kwargs={
        "proxies": {
            "http": PROXY_URL,
            "https": PROXY_URL,
        }
    },
)
docs = loader.load()
```

Performance Tips for LangChain + Proxies

LangChain agents can make many sequential web requests. Use ISP proxies for the lowest latency — their sub-50ms response time keeps agent chains fast. For broad web research that hits many different domains, residential rotating proxies provide the IP diversity needed to avoid blocks.

Hex Proxies' infrastructure handles 50 billion requests per week. Your LangChain agents will never be bottlenecked by proxy capacity.

Tips

  • Use ISP proxies for LangChain agents — the sub-50ms latency keeps multi-step chains responsive.
  • Limit fetched content size before passing it to the LLM to manage context window usage.
  • Cache frequently accessed documents to reduce proxy requests and improve agent speed.
  • Use country targeting for research tasks that need geographic perspectives.
  • Set explicit timeouts on all proxy requests to prevent hanging agent chains.
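The caching tip above can be implemented as a small in-memory TTL cache. This is a minimal sketch (the class name and `get_or_fetch` API are this guide's own invention, not a LangChain component); in production you might prefer a shared store such as Redis:

```python
import time
from typing import Callable, Dict, Tuple

class TTLCache:
    """Tiny in-memory cache so repeated agent fetches of the same URL skip the proxy."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, url: str, fetch: Callable[[str], str]) -> str:
        now = time.monotonic()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # Fresh entry: no network round trip.
        body = fetch(url)
        self._store[url] = (now, body)
        return body
```

Wrap your tool's fetch call, e.g. `cache.get_or_fetch(url, lambda u: client.get(u).text)`, so a multi-step agent that revisits a page pays for it only once per TTL window.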

Ready to Get Started?

Put this guide into practice with Hex Proxies.
