Web Scraping API
Abstract’s Web Scraping API is a simple yet powerful REST API used to extract data from a given URL. To make a request, you only need to submit your api_key and a target URL, and the API will return the data from that site.
Getting started
REST
The Web Scraping API, like all of Abstract’s APIs, is organized around REST. It is designed to use predictable, resource-oriented URLs and to use HTTP status codes to indicate errors.
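Because errors are signaled through HTTP status codes, a client can branch on the code before looking at the response body. The following is a minimal sketch of that idea. The helper names classify_status and describe_status are hypothetical, and the sketch relies only on the standard HTTP status classes; it does not assume which specific codes this API returns.

```python
from http import HTTPStatus


def classify_status(code: int) -> str:
    """Map an HTTP status code to a coarse category (standard HTTP classes)."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client_error"  # e.g. a problem with the request itself
    if 500 <= code < 600:
        return "server_error"  # the request may be worth retrying later
    return "other"


def describe_status(code: int) -> str:
    """Return a readable summary such as '401 Unauthorized (client_error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not a registered HTTP status in the standard library.
        phrase = "Unknown"
    return f"{code} {phrase} ({classify_status(code)})"
```

A caller would typically treat only the "success" category as a usable response and log the description otherwise.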
HTTPS
The Web Scraping API requires all communications to be secured with TLS 1.2 or greater.
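Most modern HTTP clients negotiate TLS 1.2 or newer by default, but the minimum can also be pinned explicitly on the client side. The sketch below uses Python's standard ssl module to do that. The function name make_tls12_context is hypothetical, and this is one possible client-side configuration rather than a requirement of the API.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Create a client SSL context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to urllib.request.urlopen through its context argument.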
API Versions
All of Abstract’s APIs are versioned. The Web Scraping API is currently on Version 1.
Your API Key
Your API key is the unique authentication key used to access Abstract’s Web Scraping API. Note that each of Abstract’s APIs has a unique API key, so you will need different keys to access the Web Scraping and Email Validation APIs, for example. To authenticate your requests, append your API key to the base URL as the api_key query parameter.
Base URL
https://scrape.abstractapi.com/v1/
Scraping endpoint
Abstract’s Web Scraping API simply requires your unique api_key and the target URL you’d like to scrape:
https://scrape.abstractapi.com/v1/?api_key=YOUR_UNIQUE_API_KEY&url=https://news.ycombinator.com
If the request is successful, the API returns the data from the provided website.
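The request above can be sketched in Python with only the standard library. The base URL and the api_key and url parameters come from this page. The function names build_request_url and scrape, the 30-second timeout, and decoding the body as UTF-8 text are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://scrape.abstractapi.com/v1/"


def build_request_url(api_key: str, target_url: str) -> str:
    """Build the full request URL; urlencode percent-encodes the target URL."""
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{BASE_URL}?{query}"


def scrape(api_key: str, target_url: str, timeout: float = 30.0) -> str:
    """Send the GET request and return the response body as text.

    Raises urllib.error.HTTPError when the API answers with an error status.
    """
    request_url = build_request_url(api_key, target_url)
    with urlopen(request_url, timeout=timeout) as response:
        # Assumption: the body is text that can be decoded as UTF-8.
        return response.read().decode("utf-8", errors="replace")
```

For example, scrape("YOUR_UNIQUE_API_KEY", "https://news.ycombinator.com") would send the request shown above, with the target URL percent-encoded.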
Request parameters
api_key: Your unique API key. Note that each user has a unique API key for each of Abstract’s APIs, so your Web Scraping API key will not work for the User Avatar API, for example.
url: The URL to extract the data from. Note that this parameter should include the full protocol (http:// or https://). If your URL has its own query parameters, you should encode it. For example, the & character would be encoded as %26.
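Encoding matters because an unencoded & inside the target URL would otherwise be read as the start of a new parameter of the API request itself. A minimal sketch using Python's standard urllib.parse module follows. The helper name encode_target_url and the example.com address are hypothetical, and encoding every reserved character (not only &) is a conservative choice rather than something this page requires.

```python
from urllib.parse import quote, unquote


def encode_target_url(target_url: str) -> str:
    """Percent-encode a target URL so it can be sent as a single parameter value."""
    # safe="" also encodes ':' and '/', so nothing in the value is left ambiguous.
    return quote(target_url, safe="")


# Hypothetical target URL that carries its own query parameters.
target = "https://example.com/search?q=scraping&page=2"
encoded = encode_target_url(target)
print(encoded)
```

Decoding the result with unquote returns the original URL unchanged.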
If true, the request will render JavaScript on the target site. Note that JavaScript is rendered via a Google Chrome headless browser. Defaults to false.
If true, the request will use a different IP address on each call. Defaults to false.
An array of cookie objects that can be used to make requests with preloaded cookies. The array can contain multiple objects, each with the following properties: path, value, name, and domain.
[
  {
    "path": "path",
    "value": "value",
    "name": "name",
    "domain": "domain"
  },
  {
    "path": "path",
    "value": "value",
    "name": "name",
    "domain": "domain"
  }
]
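An array in the shape shown above can be assembled programmatically and serialized to JSON. The sketch below only builds and serializes the array. The helper names make_cookie and cookies_to_json, the default path of "/", and the sample values are hypothetical, and how the resulting string is attached to the request is not covered here.

```python
import json
from typing import Dict, List


def make_cookie(name: str, value: str, domain: str, path: str = "/") -> Dict[str, str]:
    """Build one cookie object with the four documented properties."""
    return {"path": path, "value": value, "name": name, "domain": domain}


def cookies_to_json(cookies: List[Dict[str, str]]) -> str:
    """Serialize a list of cookie objects to a JSON array string."""
    return json.dumps(cookies)


# Hypothetical sample values, for illustration only.
cookies = [
    make_cookie("session_id", "abc123", "example.com"),
    make_cookie("theme", "dark", "example.com", path="/settings"),
]
cookies_json = cookies_to_json(cookies)
```

Building the objects through one helper keeps the four property names consistent across every entry in the array.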