Integration Guide
Build lead scraping automations on Pipedream with full code access. Pull Apollo contacts, Google Maps businesses, validate emails, find phone numbers, and route results to your CRM, spreadsheet, or cold email tool - all running in the cloud on a schedule.
Pipedream workflows connect ScraperCity's B2B data API to any downstream tool. Every workflow runs serverlessly in the cloud - no infrastructure to manage. Here are the most common pipelines teams build.
Cron trigger fires every morning. ScraperCity scrapes Apollo for contacts matching your ICP filters. Pagination code step collects all pages. Leads route to HubSpot, Pipedrive, or a Google Sheet automatically.
Trigger on webhook or schedule. ScraperCity pulls Google Maps listings by keyword and city - including phone, email, reviews, and website. Send new businesses to your outreach tool or Airtable CRM.
After any lead enrichment step, POST each address to ScraperCity's email validation API at $0.0036/email. Filter deliverable-only leads to a separate Google Sheet or CRM list. Reduce bounces before you hit send.
Receive a webhook from your sign-up form. Pass the company domain to ScraperCity's Email Finder or Website Finder. Append phone numbers with Mobile Finder. Push the enriched lead record back to your CRM automatically.
Scrape e-commerce stores by niche using the Store Leads endpoint. Filter by platform, revenue signals, or technology. Route qualified merchants to a Slack notification channel and a CRM pipeline stage.
Trigger from a new row added to Google Sheets. For each company name + person name, call ScraperCity's Email Finder API. Write the discovered email back to the same row. A complete no-touch enrichment loop.
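The Sheets enrichment loop above reduces to building one Email Finder request per row. Here is a minimal sketch of that row-to-payload mapping; the column names (`first_name`, `last_name`, `company_domain`) and request field names are illustrative assumptions - check ScraperCity's API docs for the exact fields the Email Finder expects.

```javascript
// Sketch: map a Google Sheets row to an Email Finder request body.
// Column and request field names here are assumptions, not the
// documented ScraperCity schema.
function buildEmailFinderPayload(row) {
  return {
    name: `${row.first_name} ${row.last_name}`.trim(),
    domain: row.company_domain,
  };
}

// Inside a Pipedream code step you would POST this payload to
// https://app.scrapercity.com/api/v1/email-finder with your Bearer key,
// then write the returned email back to the triggering row.
const payload = buildEmailFinderPayload({
  first_name: "Ada",
  last_name: "Lovelace",
  company_domain: "example.com",
});
// payload -> { name: "Ada Lovelace", domain: "example.com" }
```

Keeping the mapping in a small pure function makes it easy to adjust when your sheet's column names differ.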
Follow these steps to connect ScraperCity to Pipedream. The full workflow takes about 10 minutes.
Log in to ScraperCity and go to app.scrapercity.com/dashboard/api-docs to copy your API key. Then in Pipedream, navigate to Settings > Environment Variables and create a new secret variable named SCRAPERCITY_API_KEY. Storing the key as an environment variable keeps it out of your workflow code and prevents accidental exposure in Pipedream's execution logs.
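A cheap safeguard is to fail fast when the secret is missing, rather than sending requests with a literal "Bearer undefined" header. A sketch of such a guard, written to take an env object so it is easy to reuse:

```javascript
// Sketch: validate the secret before making any API calls.
// Pass in an env object (process.env in a real code step).
function requireScraperCityKey(env) {
  const key = env.SCRAPERCITY_API_KEY;
  if (!key) {
    throw new Error(
      "SCRAPERCITY_API_KEY is not set - add it under Settings > Environment Variables"
    );
  }
  return key;
}

// In a Pipedream Node.js code step you would call:
//   const key = requireScraperCityKey(process.env);
```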
Create a new workflow inside a Pipedream project. Choose the trigger type that matches your use case:
For a scheduled workflow, use a cron expression such as 0 8 * * 1-5 to run at 8 AM on weekdays. Note: Pipedream manages the servers for scheduled workflows, so there is no server or cron daemon to operate yourself.
Add a step and select HTTP Request (or use a Node.js code step with axios for more control). Configure the request as shown below. This example scrapes Apollo for Director-level contacts in SaaS companies with a verified email address.
Method: GET
URL: https://app.scrapercity.com/api/v1/database/leads
Headers:
Authorization: Bearer YOUR_SCRAPERCITY_KEY
Query Parameters:
title: Director of Sales
industry: computer software
hasEmail: true
limit: 100
page: 1

Replace YOUR_SCRAPERCITY_KEY with {{process.env.SCRAPERCITY_API_KEY}} when using Pipedream's object explorer, or reference process.env.SCRAPERCITY_API_KEY inside a code step. The Lead Database endpoint requires the $649/mo plan. All other scraper endpoints work on any plan.
The ScraperCity API paginates at 100 leads per page with a maximum of 100,000 leads per day. Add a Node.js code step to loop through all pages and return a flat array of leads for downstream steps:
```javascript
import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    const allLeads = [];
    let page = 1;
    let totalPages = 1;
    do {
      const response = await axios.get(
        "https://app.scrapercity.com/api/v1/database/leads",
        {
          headers: {
            Authorization: "Bearer " + process.env.SCRAPERCITY_API_KEY,
          },
          params: {
            title: "Director of Sales",
            industry: "computer software",
            hasEmail: "true",
            limit: "100",
            page: String(page),
          },
        }
      );
      // Collect this page, then read totalPages from the response
      // to know when to stop
      allLeads.push(...response.data.data);
      totalPages = response.data.pagination.totalPages;
      page++;
    } while (page <= totalPages);
    return allLeads;
  },
});
```

Using axios (rather than $.send.http()) is recommended here because you need to read the response body to extract pagination metadata and pass the full lead array to the next step. Pipedream makes any value you return from a code step available to all downstream steps via the steps object.
Add an optional email validation step after the pagination loop. For each lead returned, POST the email address to the ScraperCity Email Validator at $0.0036/email. This filters catch-all addresses and undeliverable contacts before they reach your CRM or outreach tool - keeping your sender reputation clean.
```javascript
import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    // Lead array returned by the pagination step
    const leads = steps.fetch_leads.$return_value;
    const validatedLeads = [];
    for (const lead of leads) {
      if (!lead.email) continue;
      try {
        const res = await axios.post(
          "https://app.scrapercity.com/api/v1/email-validator",
          { email: lead.email },
          {
            headers: {
              Authorization: "Bearer " + process.env.SCRAPERCITY_API_KEY,
              "Content-Type": "application/json",
            },
          }
        );
        // Keep only leads the validator marks deliverable
        if (res.data.deliverable === true) {
          validatedLeads.push(lead);
        }
      } catch (err) {
        console.log("Validation error for " + lead.email, err.message);
      }
    }
    return validatedLeads;
  },
});
```

Add a destination step after your data is collected and optionally validated. Pipedream has pre-built actions for common destinations - click the + button and search by app name.
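As one concrete destination, the e-commerce pipeline described earlier routes merchants to a Slack channel. A sketch of the formatting half of that step - the lead field names (name, email, company) are assumptions about your data shape, and the webhook URL would come from your own Slack app:

```javascript
// Sketch: format validated leads into a Slack incoming-webhook payload.
// Lead field names here are illustrative assumptions.
function toSlackMessage(leads) {
  const lines = leads.map(
    (l) => `• ${l.name} <${l.email}> - ${l.company}`
  );
  return {
    text: `${leads.length} new validated leads:\n${lines.join("\n")}`,
  };
}

// In a Pipedream code step you would POST this payload to your Slack
// incoming-webhook URL (e.g. stored as process.env.SLACK_WEBHOOK_URL).
const msg = toSlackMessage([
  { name: "Ada Lovelace", email: "ada@example.com", company: "Acme" },
]);
```

Pipedream's pre-built Slack action can replace the POST entirely; the function above only prepares the message body.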
Every ScraperCity scraper is accessible via the same base URL (https://app.scrapercity.com/api/v1) with Bearer token authentication. The table below shows the endpoints most commonly used in Pipedream workflows.
| Endpoint | What it returns | Cost | Delivery |
|---|---|---|---|
| GET /apollo | B2B contacts from Apollo by title, industry, location | $0.0039/lead | 11-48+ hrs |
| GET /database/leads | 3M+ B2B contacts, instant query (requires $649/mo plan) | Included in plan | Instant |
| GET /google-maps | Local businesses with phone, email, reviews, website | $0.01/place | 5-30 min |
| POST /email-validator | Deliverability, MX records, catch-all detection | $0.0036/email | 1-10 min |
| POST /email-finder | Business email from name + company domain | $0.05/contact | 1-10 min |
| POST /mobile-finder | Phone numbers from LinkedIn URL or email | $0.25/input | 1-5 min |
| GET /store-leads | Shopify/WooCommerce stores with contacts | $0.0039/lead | Instant |
| GET /status/:runId | Poll the status of an async scrape job | Free | Instant |
| GET /download/:runId | Download CSV results for a completed scrape | Free | Instant |
All endpoints use Authorization: Bearer YOUR_API_KEY in the request header. Apollo scrapes are asynchronous and delivered in 11-48+ hours. For async scrapers, use the Status endpoint to poll job completion or configure a webhook at app.scrapercity.com/dashboard/webhooks to receive a POST notification when results are ready.
Some ScraperCity scrapers - most notably Apollo - are asynchronous. When you POST a scrape request, the API returns a runId immediately but results are not available for 11-48+ hours. Pipedream cron-triggered workflows have a maximum execution time, so you cannot poll inside a single workflow run for an async scrape. There are two reliable patterns for handling this:
Configure a webhook URL in ScraperCity's dashboard (app.scrapercity.com/dashboard/webhooks). Set the URL to a Pipedream HTTP-triggered workflow. When your scrape completes, ScraperCity POSTs the results to that URL and Pipedream fires the workflow automatically - no polling needed.
```javascript
// Workflow A: Trigger the scrape (Cron trigger)
import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    const res = await axios.post(
      "https://app.scrapercity.com/api/v1/apollo",
      {
        title: "VP of Engineering",
        industry: "saas",
        limit: 500,
      },
      {
        headers: {
          Authorization: "Bearer " + process.env.SCRAPERCITY_API_KEY,
          "Content-Type": "application/json",
        },
      }
    );
    // Store the runId to track status if needed
    return { runId: res.data.runId };
  },
});
```

```javascript
// Workflow B: Receive webhook when complete (HTTP trigger)
// ScraperCity POSTs results to this workflow's URL;
// steps.trigger.event.body contains the lead data
export default defineComponent({
  async run({ steps, $ }) {
    const leads = steps.trigger.event.body;
    return leads; // available to downstream destination steps
  },
});
```

For shorter async scrapers (Google Maps, Email Finder - typically 1-30 minutes), you can use a second scheduled workflow that polls the GET /api/v1/status/:runId endpoint every few minutes. When the status returns completed, call the Download endpoint to retrieve results and route them to your destination.
```javascript
import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    // runId stored after triggering the scrape
    // (e.g. in a Pipedream data store)
    const runId = process.env.PENDING_RUN_ID;
    const status = await axios.get(
      `https://app.scrapercity.com/api/v1/status/${runId}`,
      {
        headers: { Authorization: "Bearer " + process.env.SCRAPERCITY_API_KEY },
      }
    );
    if (status.data.status !== "completed") {
      // End this run; the next cron tick will poll again
      return $.flow.exit("Scrape not ready yet - will retry on next cron tick");
    }
    const results = await axios.get(
      `https://app.scrapercity.com/api/v1/download/${runId}`,
      {
        headers: { Authorization: "Bearer " + process.env.SCRAPERCITY_API_KEY },
      }
    );
    return results.data;
  },
});
```

These are the most common issues when integrating ScraperCity with Pipedream, and how to fix each one.
401 Unauthorized
Why it happens: The Authorization header is missing or the API key is wrong.
Fix: Confirm the header is set to Authorization: Bearer YOUR_KEY with no typos. In a code step, verify process.env.SCRAPERCITY_API_KEY returns the correct value by logging it once (then remove the log - do not leave API keys printing to Pipedream's Inspector logs).
429 Too Many Requests
Why it happens: You are sending requests faster than the allowed rate.
Fix: The ScraperCity Lead Database endpoint allows up to 100,000 leads per day at 100 per page. Add a short delay between page requests in your loop if you are paginating at very high speed. Pipedream itself rate-limits HTTP triggers to an average of 10 requests per second - use throttle controls in Workflow Settings if fanning out large batches.
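The "short delay between page requests" fix is a one-line sleep inside the pagination loop. A sketch with the page fetch stubbed out so the shape is clear - in your workflow, `fetchPage` would be the axios.get call to /database/leads from the pagination step:

```javascript
// Sketch: throttle a pagination loop with a fixed delay between pages.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchAllPages(fetchPage, { delayMs = 500 } = {}) {
  const all = [];
  let page = 1;
  let totalPages = 1;
  do {
    // fetchPage stands in for the axios.get call to the paginated endpoint;
    // it must resolve to { data: [...], pagination: { totalPages } }
    const { data, pagination } = await fetchPage(page);
    all.push(...data);
    totalPages = pagination.totalPages;
    page++;
    if (page <= totalPages) await sleep(delayMs); // stay under the rate limit
  } while (page <= totalPages);
  return all;
}
```

A 500 ms delay costs under a minute per hundred pages and keeps bursts well below typical per-second limits; tune it to your plan's actual rate.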
Workflow timeout error (red in Inspector)
Why it happens: Your pagination loop takes longer than Pipedream's execution limit. Cron workflows default to 60 seconds.
Fix: Split the work across multiple workflow runs. Scrape one page range per cron tick, storing the current page in an external state store (e.g. a single-cell Google Sheet or a Pipedream data store). Alternatively, use the webhook pattern for async scrapers so no polling loop is needed inside a single run.
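The page-range-per-tick pattern only needs a small cursor helper. A sketch using an injected key-value store: in Pipedream you would attach a data store prop and pass it in place of the Map-backed stub, since both expose async get/set.

```javascript
// Sketch: process one page range per cron tick, persisting the cursor
// in a key-value store between runs. `store` is any object with async
// get/set - a Pipedream data store prop fits this shape.
async function nextPageRange(store, { pagesPerRun = 10, totalPages }) {
  const start = (await store.get("next_page")) ?? 1;
  if (start > totalPages) return null; // all pages done
  const end = Math.min(start + pagesPerRun - 1, totalPages);
  await store.set("next_page", end + 1); // cursor for the next tick
  return { start, end };
}

// Map-backed stub showing the store shape, for local testing only:
const memoryStore = () => {
  const m = new Map();
  return {
    get: async (k) => m.get(k),
    set: async (k, v) => void m.set(k, v),
  };
};
```

Each cron tick fetches pages start through end, and the run after the last range returns null, which your step can treat as "reset the cursor and stop".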
Duplicate request blocked (ScraperCity 30-second dedup)
Why it happens: ScraperCity blocks identical requests made within 30 seconds to prevent accidental double charges.
Fix: This is expected behavior. If you are retrying a failed step, wait at least 30 seconds before resending. Vary at least one query parameter (e.g. page number) if you need to send multiple requests quickly.
process.env.SCRAPERCITY_API_KEY returns undefined
Why it happens: The environment variable is being referenced outside of the defineComponent export function, or it was not saved correctly.
Fix: Confirm the variable was saved at Settings > Environment Variables in Pipedream. Ensure your code references process.env inside the run function body. process.env returns undefined when called at the module level outside of defineComponent.
Empty data array returned
Why it happens: The filter parameters returned no matching contacts, or the scrape is still processing.
Fix: For async scrapers (Apollo), the data will not be available until the scrape completes (11-48+ hours). Check the run status using GET /api/v1/status/:runId. For synchronous scrapers, loosen your filter criteria - try removing one filter at a time to identify which constraint is too narrow.
ScraperCity's API works with any HTTP-capable automation tool. Here is how the main options compare for lead scraping workflows specifically.
| Platform | Code steps | Pagination support | Hosting | Best for |
|---|---|---|---|---|
| Pipedream | Node.js + Python | Full loop control | Cloud (managed) | Devs who want code + 2,000+ integrations |
| n8n | JavaScript function node | Full loop control | Self-hosted or cloud | Teams wanting self-hosted control + visual builder |
| Zapier | Code step (limited) | No native pagination | Cloud (managed) | No-code single-step triggers, simple routing |
| Make (Integromat) | Limited | Iterator module | Cloud (managed) | Visual scenario builder, moderate complexity |
For bulk lead scraping with pagination, data transformation, and conditional routing, Pipedream or n8n are the strongest choices. Both give you the code access needed to loop through ScraperCity's paginated API responses and handle async scrape jobs correctly.