Concurrent Connection Limits: How They Affect Scraping Performance and Cost

9 min read

By Hex Proxies Engineering Team

Concurrent connection limits define how many simultaneous requests your proxy plan can handle. This single parameter often constrains your scraping throughput more than pool size or IP quality. Understanding how concurrent connections interact with request latency, target rate limits, and proxy cost is essential for designing efficient scraping architecture.

What Are Concurrent Connections?

A concurrent connection is an active, open request flowing through the proxy at a given instant. When your scraper sends a request through a proxy and waits for the response, that consumes one concurrent connection for the duration of the request-response cycle.

Timeline (ms) →     0    200   400   600   800   1000
                    │     │     │     │     │     │
Connection 1:       ├─────────────────┤                 (600ms request)
Connection 2:       ├───────────┤                       (400ms request)
Connection 3:             ├─────────────────────────┤   (800ms request)
Connection 4:                   ├───────────┤           (400ms request)
                    │     │     │     │     │     │
Peak concurrent:    2     3     3     3     2     1

At the peak (200-600ms), three connections are active simultaneously. If your plan allows only 2 concurrent connections, Connection 3 would either queue (adding latency) or fail (reducing success rate).

How Limits Affect Throughput

The Throughput Formula

Your maximum theoretical throughput is:

max_requests_per_second = concurrent_connections / avg_response_time_seconds

For example:


  • 50 concurrent connections with 500ms average response time: 50 / 0.5 = 100 requests/second

  • 50 concurrent connections with 2,000ms average response time: 50 / 2.0 = 25 requests/second

  • 10 concurrent connections with 500ms average response time: 10 / 0.5 = 20 requests/second


The critical insight: Doubling your concurrent connections doubles your throughput. Halving your average response time also doubles your throughput. Optimizing response time is free; buying more concurrent connections costs money.
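
The formula is easy to sanity-check in code. A minimal sketch of the calculation above (the function name is ours, not part of any proxy API):

```python
def max_requests_per_second(concurrent_connections, avg_response_time_seconds):
    """Theoretical throughput ceiling: concurrency divided by latency."""
    return concurrent_connections / avg_response_time_seconds


print(max_requests_per_second(50, 0.5))  # 100.0 req/s
print(max_requests_per_second(50, 2.0))  # 25.0 req/s
print(max_requests_per_second(10, 0.5))  # 20.0 req/s
```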

Real-World Throughput Measurements

We measured throughput at various concurrency levels against three target categories (source: Hex Proxies internal testing, April 2026, residential rotating proxies):

Concurrent Connections | Fast Targets (200ms avg) | Medium Targets (800ms avg) | Slow Targets (2,000ms avg)
10                     | 50 req/s                 | 12 req/s                   | 5 req/s
25                     | 125 req/s                | 31 req/s                   | 12 req/s
50                     | 245 req/s                | 60 req/s                   | 24 req/s
100                    | 470 req/s                | 115 req/s                  | 47 req/s
200                    | 880 req/s                | 210 req/s                  | 88 req/s

Note: Actual throughput at high concurrency is slightly below theoretical maximum due to proxy gateway overhead, TCP connection setup time, and queueing effects.

Diminishing Returns at High Concurrency

Adding concurrent connections has diminishing returns beyond a certain point, because:

  1. Per-domain rate limits. If you scrape one site, the target's rate limiter caps your effective throughput regardless of your proxy concurrency.
  2. Proxy gateway queueing. At very high concurrency (500+), the proxy gateway itself introduces queueing delay as it routes requests to available IPs.
  3. Target server capacity. Individual web servers have finite capacity. Sending 500 concurrent requests to a single origin server may slow down responses for all of them.

The practical ceiling for most single-target scraping is 50-100 concurrent connections. For multi-target scraping (thousands of different domains), higher concurrency is useful because each target independently handles a small fraction of the load.
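
These two ceilings can be combined into a back-of-the-envelope estimate. The sketch below is our own simplification; the per-domain rate limit is an assumed number you would measure per target. Effective throughput is the lower of the proxy-side ceiling and the target-side ceiling:

```python
def effective_throughput(concurrency, avg_response_time_s,
                         target_domains, per_domain_limit_rps):
    """Effective req/s: min of the proxy ceiling and the targets' combined limits."""
    proxy_ceiling = concurrency / avg_response_time_s
    target_ceiling = target_domains * per_domain_limit_rps
    return min(proxy_ceiling, target_ceiling)


# One target tolerating ~10 req/s: extra concurrency is wasted.
print(effective_throughput(200, 0.5, target_domains=1, per_domain_limit_rps=10))
# 1,000 targets: the proxy ceiling (400 req/s) binds instead.
print(effective_throughput(200, 0.5, target_domains=1_000, per_domain_limit_rps=10))
```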

Cost Implications

Residential Proxies: Concurrency Is Usually Unlimited

Most residential proxy providers, including Hex Proxies, do not limit concurrent connections on residential plans. You pay per GB of bandwidth, and concurrency is unlimited or set at generous levels (500-1,000+).

Cost optimization for residential: Since you pay per GB, the cost driver is total bandwidth, not concurrency. Optimize by:


  • Disabling images, CSS, and JavaScript when scraping (reduces page size 5-10x)

  • Using gzip/br compression

  • Requesting only necessary data (API endpoints instead of full pages when available)
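
Since residential cost is driven by bandwidth, the savings from trimming page weight are easy to quantify. A rough sketch using the $4.25/GB rate from this article; the 25KB lean-page figure is an illustrative assumption for a stripped-down fetch:

```python
def monthly_bandwidth_cost(requests_per_day, avg_page_kb, price_per_gb):
    """Monthly bandwidth spend on a per-GB residential plan."""
    gb_per_month = requests_per_day * 30 * avg_page_kb / 1_000_000
    return gb_per_month * price_per_gb


full_pages = monthly_bandwidth_cost(50_000, 200, 4.25)  # full pages, ~$1,275/mo
lean_pages = monthly_bandwidth_cost(50_000, 25, 4.25)   # assets blocked, ~$159/mo
print(f"${full_pages:,.2f} vs ${lean_pages:,.2f}")
```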


ISP Proxies: Concurrency Is Per-IP

ISP proxy concurrency depends on how many IPs you purchase. Each IP can handle multiple concurrent connections (typically 10-50), so your total concurrency scales with your IP count:

ISP IPs Purchased | Connections per IP | Total Concurrency | Monthly Cost (at $2.08/IP)
10                | 25                 | 250               | $20.80
25                | 25                 | 625               | $52.00
50                | 25                 | 1,250             | $104.00
100               | 25                 | 2,500             | $208.00

Cost optimization for ISP: Since bandwidth is unlimited, maximize the throughput per IP. Use persistent connections (HTTP keep-alive) to avoid connection setup overhead. Run multiple concurrent requests per IP to maximize utilization -- but stay below the detection threshold for the target site.
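
The table above follows directly from two multiplications. A sketch, assuming the 25-connections-per-IP and $2.08/IP figures used here:

```python
def isp_plan(ips, connections_per_ip=25, price_per_ip=2.08):
    """Total concurrency and monthly cost for an ISP proxy plan."""
    return {
        "total_concurrency": ips * connections_per_ip,
        "monthly_cost": round(ips * price_per_ip, 2),
    }


print(isp_plan(50))  # {'total_concurrency': 1250, 'monthly_cost': 104.0}
```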

Concurrency vs. Cost-per-Request

The most useful cost metric is cost per successful request, which factors in concurrency, success rate, and bandwidth:

from dataclasses import dataclass


@dataclass(frozen=True)
class CostAnalysis:
    """Immutable cost analysis for a proxy configuration."""
    monthly_cost: float
    requests_per_day: int
    success_rate: float  # 0.0 - 1.0
    avg_bandwidth_per_request_kb: float
    
    @property
    def monthly_successful_requests(self):
        return self.requests_per_day * 30 * self.success_rate
    
    @property
    def cost_per_successful_request(self):
        if self.monthly_successful_requests == 0:
            return float('inf')
        return self.monthly_cost / self.monthly_successful_requests
    
    @property
    def monthly_bandwidth_gb(self):
        total_requests = self.requests_per_day * 30
        return total_requests * self.avg_bandwidth_per_request_kb / 1_000_000


# Residential example: $4.25/GB, 50K requests/day, 200KB avg page
residential = CostAnalysis(
    monthly_cost=50_000 * 30 * 200 / 1_000_000 * 4.25,  # ~$1,275/mo
    requests_per_day=50_000,
    success_rate=0.93,
    avg_bandwidth_per_request_kb=200,
)

# ISP example: 50 IPs at $2.08/IP, 50K requests/day
isp = CostAnalysis(
    monthly_cost=50 * 2.08,  # $104/mo
    requests_per_day=50_000,
    success_rate=0.95,
    avg_bandwidth_per_request_kb=200,
)

print(f"Residential: ${residential.cost_per_successful_request:.5f}/request")
print(f"ISP: ${isp.cost_per_successful_request:.5f}/request")
# Residential: ~$0.00091/request
# ISP: ~$0.00007/request (13x cheaper for this workload)

This example illustrates why ISP proxies are dramatically cheaper for high-volume scraping of the same targets. The tradeoff is that ISP proxies have a finite IP pool (static IPs), while residential proxies rotate through a much larger pool. For workloads requiring high IP diversity, residential proxies justify the higher per-request cost.

Optimizing Connection Usage

Connection Pooling

Reuse TCP connections instead of opening new ones for each request. HTTP keep-alive reduces connection setup overhead from ~80ms to ~5ms per request.

import requests
from requests.adapters import HTTPAdapter


def create_optimized_session(proxy_url, pool_connections=25, pool_maxsize=50):
    """Create a requests Session with connection pooling.
    
    pool_connections: Number of urllib3 connection pools to cache.
    pool_maxsize: Maximum number of connections in each pool.
    """
    session = requests.Session()
    session.proxies = {
        "http": proxy_url,
        "https": proxy_url,
    }
    
    # Mount adapters with connection pooling
    adapter = HTTPAdapter(
        pool_connections=pool_connections,
        pool_maxsize=pool_maxsize,
        max_retries=0,  # Handle retries at application level
    )
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    
    return session


# Single session, reuses connections across requests
session = create_optimized_session(
    "http://USER:PASS@gate.hexproxies.com:8080",
    pool_connections=25,
    pool_maxsize=50,
)

# Each request reuses an existing connection when possible
response = session.get("https://example.com/page1")
response = session.get("https://example.com/page2")  # Reuses connection

Async Concurrency Control

Use a semaphore to limit concurrency to your plan's limits while maximizing utilization:

import asyncio
import aiohttp


async def scrape_with_concurrency_control(
    urls, proxy_url, max_concurrent=50, timeout_seconds=30
):
    """Scrape URLs with controlled concurrency.
    
    Args:
        urls: Tuple of URLs to scrape.
        proxy_url: Proxy URL string.
        max_concurrent: Maximum simultaneous connections.
        timeout_seconds: Per-request timeout.
    
    Returns:
        Tuple of result dicts.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    timeout = aiohttp.ClientTimeout(total=timeout_seconds)
    
    async def fetch_one(session, url):
        async with semaphore:
            try:
                async with session.get(
                    url, proxy=proxy_url, timeout=timeout
                ) as resp:
                    body = await resp.text()
                    return {
                        "url": url,
                        "status": resp.status,
                        "size": len(body),
                        "success": resp.status == 200,
                    }
            except Exception as exc:
                return {
                    "url": url,
                    "status": 0,
                    "size": 0,
                    "success": False,
                    "error": str(exc),
                }
    
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_one(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    
    return tuple(results)

Monitoring Connection Utilization

Track how many of your concurrent connections are actually in use:

import time
import threading


class ConnectionMonitor:
    """Monitor concurrent connection utilization."""
    
    def __init__(self, max_connections):
        self.max_connections = max_connections
        self._active = 0
        self._peak = 0
        self._lock = threading.Lock()
        self._samples = []
    
    def acquire(self):
        """Called when a new connection starts."""
        with self._lock:
            self._active += 1
            if self._active > self._peak:
                self._peak = self._active
            self._samples.append((time.time(), self._active))
    
    def release(self):
        """Called when a connection completes."""
        with self._lock:
            self._active -= 1
            self._samples.append((time.time(), self._active))
    
    def get_stats(self):
        """Get utilization statistics."""
        with self._lock:
            if not self._samples:
                return {
                    "current_active": 0,
                    "peak_active": 0,
                    "avg_active": 0,
                    "max_allowed": self.max_connections,
                    "utilization_pct": 0,
                }
            
            avg_active = sum(s[1] for s in self._samples) / len(self._samples)
            
            return {
                "current_active": self._active,
                "peak_active": self._peak,
                "avg_active": round(avg_active, 1),
                "max_allowed": self.max_connections,
                "utilization_pct": round(
                    avg_active / self.max_connections * 100, 1
                ),
            }

Interpreting utilization:


  • Below 50%: You are paying for more concurrency than you use. Reduce your plan or add more scraping tasks.

  • 70-85%: Optimal. Enough headroom for traffic spikes without waste.

  • Above 90%: Requests are likely queueing. Increase concurrency or optimize response times.

  • 100%: Requests are being dropped or severely delayed. Increase concurrency immediately.


For more on concurrent connection optimization, see our concurrent connections glossary entry, load balancing overview, and session calculator.

Frequently Asked Questions

What happens when I exceed my concurrent connection limit?

Behavior varies by provider. Some providers queue excess requests (adding latency), some return HTTP 429 (rate limit), and some drop the connection. Hex Proxies queues excess requests briefly and returns a 429 if the queue exceeds capacity. Design your application to handle 429 responses with exponential backoff.
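
A minimal backoff pattern for those 429 responses might look like the sketch below. The `fetch` callable and delay numbers are our own illustration, not a Hex Proxies API; `sleep` is injectable so the retry logic can be tested without actually waiting:

```python
import random
import time


def fetch_with_backoff(fetch, url, max_retries=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call fetch(url), retrying on HTTP 429 with exponential backoff and jitter.

    `fetch` is any callable returning an object with a `.status_code` attribute.
    """
    for attempt in range(max_retries + 1):
        resp = fetch(url)
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay * (0.5 + random.random() / 2))  # jitter: 50-100% of the delay
```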

Should I use more connections with shorter timeouts or fewer with longer timeouts?

More connections with shorter timeouts is generally better. A 30-second timeout on a connection that typically responds in 500ms wastes a connection slot for 29.5 seconds when a request hangs. Use a 10-15 second timeout for most scraping and retry failed requests, freeing the slot for a new request faster.

Do browser automation sessions use more concurrent connections?

Yes. A single Playwright or Puppeteer session opening a page may trigger 50-100 sub-requests (HTML, CSS, JavaScript, images, API calls). Each sub-request uses a concurrent connection through the proxy. A 50-connection limit may only support 1-2 browser sessions simultaneously. For browser-based scraping, plan for 50-100 connections per concurrent browser session.

How does Hex Proxies handle concurrent connections?

Residential plans include generous concurrent connection limits (typically 500-1,000+ depending on plan tier). ISP proxy connections scale with the number of IPs purchased. Contact support for enterprise concurrency needs above standard limits.


Concurrent connections are the throughput lever of proxy infrastructure. Understanding the relationship between concurrency, latency, and cost lets you design scraping systems that maximize data collection per dollar. Hex Proxies residential plans start at $4.25/GB with 500+ concurrent connections; ISP plans at $2.08/IP with unlimited bandwidth. Explore plans or use our concurrent session calculator to size your needs.
