CSS Selectors
An example guide to CSS selectors for web scraping
What are CSS Selectors
A CSS selector is a pattern that describes which elements on a page to select. In a stylesheet, the matched elements receive a set of CSS rules; in web scraping, the same patterns identify the elements whose data you want to extract.
See usage in Scrape API
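As a rough illustration of what "a pattern of elements" means, the sketch below matches tag, class, and ID selectors against a small page using only the Python standard library. The sample markup and the `matches` and `select` helpers are hypothetical and written for this example; they are not part of the Scrape API, and a real scraper would normally rely on a full selector engine instead.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample markup. It must be well-formed because
# ElementTree is an XML parser, not a forgiving HTML parser.
HTML = """
<div class="card-body">
  <h4 class="card-heading card-title">Laptop</h4>
  <p id="card-description">A thin laptop.</p>
  <h4 class="card-title">Phone</h4>
</div>
"""


def matches(el, selector):
    """Return True if `el` matches a simple selector such as
    `*`, `h4`, `.a`, `.a.b`, `#id`, or `h4.a` (no combinators)."""
    tag = re.match(r"[A-Za-z][\w-]*|\*", selector)
    classes = re.findall(r"\.([\w-]+)", selector)
    ids = re.findall(r"#([\w-]+)", selector)
    if tag and tag.group() not in ("*", el.tag):
        return False
    # The class attribute is a whitespace-separated list of names,
    # and every class named in the selector must be present.
    el_classes = el.get("class", "").split()
    if any(c not in el_classes for c in classes):
        return False
    return all(el.get("id") == i for i in ids)


def select(root, selector):
    """Return every element under `root` (including `root`) that matches."""
    return [el for el in root.iter() if matches(el, selector)]


root = ET.fromstring(HTML)
titles = [el.text for el in select(root, ".card-title")]
both = [el.text for el in select(root, ".card-heading.card-title")]
desc = [el.text for el in select(root, "#card-description")]

print(titles)  # ['Laptop', 'Phone']
print(both)    # ['Laptop']
print(desc)    # ['A thin laptop.']
```

Note that `.card-title` matches both headings, while `.card-heading.card-title` matches only the element that carries both classes.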
Selector | Example | Use Case Scenario |
---|---|---|
* | * | This selector picks all elements within a page. |
.class | .card-title | The simplest CSS selector targets the class attribute. If only your target elements use that class, it may be all you need. |
.class1.class2 | .card-heading.card-title | Some elements carry several classes, as in class="card-heading card-title"; the space inside the attribute separates the class names. To select an element that has all of them, chain the class names with dots and no spaces. Keeping the space in the selector (.card-heading .card-title) changes its meaning: it selects a .card-title element nested inside a .card-heading element. |
#id | #card-description | What if the class is used by too many elements, or the element doesn’t have a class? Selecting by ID can be the next best thing. The catch is that an ID is meant to be unique within a page, so it won’t help when you want to scrape several elements at once. |
element | h4 | To pick an element, all we need to add to our parser is the HTML tag name. |
element.class | h4.card-title | Selects only the h4 elements that have the class card-title. Combining a tag name with a class is the most common pattern. |
parentElement > childElement | div > h4 | We can tell our scraper to extract an element that sits directly inside another. In this example, it finds every h4 element whose parent element is a div. |
parentElement.class > childElement | div.card-body > h4 | We can combine the previous logic to identify a parent element by its class and extract a specific child element. This is especially useful when the data we want doesn’t have a class or ID of its own but sits inside a parent element with a unique class or ID. |
[attribute] | [href] | Another great way to target an element with no clear class to choose from. Your scraper will extract all elements containing the specific attribute. |
[attribute=value] | [target=_blank] | We can tell our scraper to extract only the elements with a specific value inside its attribute. |
[attribute~=value] | [title~=rating] | This selector picks all elements whose title attribute contains ‘rating’ as a whole, whitespace-separated word. For example, title="user rating" matches, but title="ratings" does not. |
element,element | div, p | Selects all <div> elements and all <p> elements. |
element+element | div + p | Selects every <p> element that comes immediately after a <div> element with the same parent. |
[attribute^=value] | a[href^="https"] | Selects every <a> element whose href attribute value begins with “https”. |
[attribute*=value] | a[href*="jigsawstack"] | Selects every <a> element whose href attribute value contains the substring “jigsawstack”. |
:active | a:active | Selects a link while it is being activated, for example while it is being clicked. |
:link | a:link | Selects all unvisited links. |
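
The attribute operators and the `>` and `+` combinators in the table can be made concrete with a small standard-library sketch. The sample markup and the helper names (`attr_matches`, `select_attr`, `child_of`, `adjacent`) are assumptions made for illustration only; they mirror the matching rules rather than any particular scraping library, and the adjacent-sibling example uses `h4 + p` instead of `div + p` to fit the sample page.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, kept well-formed so the XML parser accepts it.
HTML = """
<div class="card-body">
  <h4>Laptop</h4>
  <p title="user rating">4.5</p>
  <a href="https://jigsawstack.com/docs" target="_blank">Docs</a>
  <a href="/local/page">Local</a>
  <section><h4>Nested</h4></section>
</div>
"""


def attr_matches(el, name, op=None, value=""):
    """Apply one attribute test: [name], [name=v], [name~=v], [name^=v], [name*=v]."""
    actual = el.get(name)
    if actual is None:
        return False
    if op is None:   # [name]: the attribute only has to exist
        return True
    if op == "=":    # exact match
        return actual == value
    if op == "~=":   # whole word in a whitespace-separated list
        return value in actual.split()
    if op == "^=":   # prefix match (an empty value matches nothing)
        return value != "" and actual.startswith(value)
    if op == "*=":   # substring match (an empty value matches nothing)
        return value != "" and value in actual
    raise ValueError(f"unsupported operator: {op}")


def select_attr(root, name, op=None, value="", tag=None):
    """Elements (optionally restricted to `tag`) that pass the attribute test."""
    return [el for el in root.iter(tag) if attr_matches(el, name, op, value)]


def child_of(root, parent_tag, child_tag):
    """parent > child: only direct children count, not deeper descendants."""
    return [c for p in root.iter(parent_tag) for c in p if c.tag == child_tag]


def adjacent(root, first_tag, second_tag):
    """first + second: `second` must be the very next sibling of `first`."""
    found = []
    for parent in root.iter():
        kids = list(parent)
        for prev, cur in zip(kids, kids[1:]):
            if prev.tag == first_tag and cur.tag == second_tag:
                found.append(cur)
    return found


def texts(elements):
    return [el.text for el in elements]


root = ET.fromstring(HTML)
has_href = texts(select_attr(root, "href"))                      # [href]
blank = texts(select_attr(root, "target", "=", "_blank"))        # [target=_blank]
rating = texts(select_attr(root, "title", "~=", "rating"))       # [title~=rating]
https = texts(select_attr(root, "href", "^=", "https", "a"))     # a[href^="https"]
jigsaw = texts(select_attr(root, "href", "*=", "jigsawstack", "a"))
direct = texts(child_of(root, "div", "h4"))                      # div > h4
after_h4 = texts(adjacent(root, "h4", "p"))                      # h4 + p

print(has_href)  # ['Docs', 'Local']
print(direct)    # ['Laptop']
print(after_h4)  # ['4.5']
```

Here `div > h4` returns only the first heading, because the second `h4` is a child of `section`, not of `div`.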
For more resources on using CSS selectors for scraping, visit ScrapingBee.