HTML Scrape Engine
As the name suggests, the HTML Scrape Engine scrapes websites that are not dynamic, meaning the server delivers the complete HTML and no client-side rendering is needed. It is well suited to static websites and server-side rendered websites, including those built with frameworks such as Laravel, Django, and Ruby on Rails.
The HTML Scrape Model
The HTML Scrape Model contains all the information about your HTML Scrape request and response.
POST /v1/scrape/html
Perform an HTML scrape
This endpoint performs an HTML scrape of the target URL and returns the page's HTML as the response.
Required attributes
- Name: url
- Type: string
- Description: The URL to scrape.
Optional attributes
- Name: premium_proxy
- Type: boolean
- Description: Use a premium proxy to scrape the URL. This costs extra but gives a higher success rate and helps bypass anti-scraping measures.

- Name: country_code
- Type: string
- Description: An ISO 3166-1 alpha-2 country code that sets the geolocation of the proxy.
Request
POST /v1/scrape/html
curl https://scrapinglabs.com/api/v1/scrape/html \
  -H "Authorization: Bearer {token}" \
  -d url="https://www.example.com" \
  -d premium_proxy="true" \
  -d country_code="GE"
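For callers who prefer Python over curl, the same request can be sketched with the standard library alone. This is a minimal sketch rather than an official client: the function names `build_scrape_request` and `scrape_html` are hypothetical, sending a disabled `premium_proxy` as the string "false" and the 60-second timeout are assumptions, and the endpoint URL and form fields are copied from the curl example.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://scrapinglabs.com/api/v1/scrape/html"


def build_scrape_request(token, url, premium_proxy=None, country_code=None):
    """Build (but do not send) a form-encoded POST mirroring the curl call."""
    fields = {"url": url}
    if premium_proxy is not None:
        # Assumption: booleans are sent as "true"/"false" strings, as in curl.
        fields["premium_proxy"] = "true" if premium_proxy else "false"
    if country_code is not None:
        fields["country_code"] = country_code
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def scrape_html(token, url, **options):
    """Send the request and return the response body as text."""
    request = build_scrape_request(token, url, **options)
    # Assumption: a 60-second timeout and a UTF-8 response body.
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `scrape_html(token, "https://www.example.com", premium_proxy=True, country_code="GE")` would send the same fields as the curl command. Keeping request construction separate from sending makes it possible to inspect the request without touching the network.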
Response
<html>
  <head>
    <title>Example Domain</title>
    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
      body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
      }
      div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
      }
      a:link, a:visited {
        color: #38488f;
        text-decoration: none;
      }
      @media (max-width: 700px) {
        div {
          margin: 0 auto;
          width: auto;
        }
      }
    </style>
  </head>
  <body>
    <div>
      <h1>Example Domain</h1>
      <p>This domain is for use in illustrative examples in documents. You may use this
      domain in literature without prior coordination or asking for permission.</p>
      <p><a href="https://www.iana.org/domains/example">More information...</a></p>
    </div>
  </body>
</html>
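Because the endpoint returns the page's raw HTML, the response can be passed straight to any HTML parser. The following is a minimal sketch using Python's built-in `html.parser`; the class `TitleAndLinks`, the function `summarize`, and the shortened `SAMPLE` document are hypothetical names introduced for illustration, not part of the API.

```python
from html.parser import HTMLParser


class TitleAndLinks(HTMLParser):
    """Collect the <title> text and every <a href> value from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def summarize(html_text):
    """Return (title, links) for the HTML text of a scrape response."""
    parser = TitleAndLinks()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.links


# Shortened stand-in for the response shown above.
SAMPLE = (
    "<html><head><title>Example Domain</title></head><body><div>"
    '<p><a href="https://www.iana.org/domains/example">More information...</a></p>'
    "</div></body></html>"
)
```

Applied to the example response, `summarize` would return the page title together with the single IANA link. For pages with malformed markup or more involved extraction needs, a dedicated parsing library may be a better fit than this standard-library sketch.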