Initial requirements

  • Set up a JigsawStack account (if you don’t have one already)
  • Get your API key from the JigsawStack dashboard.
  • Install the Node.js SDK

Using file URL

Using file key

Pass the file key returned from the File Storage API.
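The two input modes (a public file URL or a File Storage key) differ only in the request body. The sketch below builds such a request without sending it. Everything about the wire format here is an assumption for illustration and should be checked against the API reference: the endpoint path, the `x-api-key` header, and the `url` / `file_store_key` field names. The helper `buildTranscribeRequest` is hypothetical and not part of the SDK.

```typescript
// ASSUMPTION: endpoint path, header name, and body field names are
// illustrative guesses; confirm them in the official API reference.
const ASSUMED_ENDPOINT = "https://api.jigsawstack.com/v1/ai/transcribe";

// The audio can be referenced either by public URL or by File Storage key.
type AudioSource =
  | { kind: "url"; url: string }
  | { kind: "fileKey"; fileKey: string };

interface TranscribeRequest {
  endpoint: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: turns an audio source into a ready-to-send request.
function buildTranscribeRequest(apiKey: string, source: AudioSource): TranscribeRequest {
  const payload =
    source.kind === "url"
      ? { url: source.url } // assumed field name for a file URL
      : { file_store_key: source.fileKey }; // assumed field name for a file key
  return {
    endpoint: ASSUMED_ENDPOINT,
    method: "POST",
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    body: JSON.stringify(payload),
  };
}
```

The returned object can be handed to `fetch(req.endpoint, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builder separate from the network call makes it easy to inspect the payload before sending anything.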

Sample result

{
  "success": true,
  "text": " Hey guys, I'm pretty excited to talk about a new API that we're going to be releasing in Jigsaw Stack. It's our AI Scrape API, which can basically scrape any website by just prompting and giving you back structured data consistently, no matter the site. So I mean, we can try something simple first. This will give you a short demo. So we're going to release this in beta and then improve it as we go. So let's try to scrape super base pricing plan, for example, something simple. So I think I have a small demo up here. So we have the URL of the page that we're going to scrape. We're going to type basically what we want to scrape. So we don't even need to inspect element. We don't need to understand how the website works. We just need to write what we want to script. So we're going to script the plan header. So we got a plan header. So right over here and the plan price. All right. So we want to script the plan price and that's it. So what it's going to do, it's going to take the website's HTML, try to understand how the HTML works and find the right places to script. So as you can see, it gets us all the relevant tax information that we need. So all the pricing plans over here. And if you scroll down, you'll get the next section to the actual prices. So you can see $0 $25 599 exactly as we got here. And you didn't have to write a single piece of code to scrape the website. Another more complex example would be Y Combinator. So if you go to Y Combinator's site over here, you have a bunch of information, article titles, points, and a bunch of other things. So same thing, right? You put the link of the website you would like to scrape, the information you can also post title right over here. So the script, the post title and post points. So the more descriptive you are, the more accurate information gets and it's similar to how you would use ChatGPT for messaging. It's all about prompt engineering. Same here, right? So as simple as it is, you just write what you need and you press send. You should get back consistent data every single time. So especially this is great for websites that are dynamic social media platforms where you want to scrape a set of content that always changes. So as you can see, you get the title as shown here. And each of it is the title is shown here for the entire page and content goal is pretty long. So let me show you two sets. It even tells you the selectors that you can use to extract a website. So if you want to scrape it directly yourself in the future, you can just use the selectors to consistently get the information over and over again. So let me show you the other piece. So right over here, you get the points for each of the items as well. So 12 points, 58 points, accordingly. So you scroll down, you'll get all the points. So you can use this for any site. Just write a prompt on what you want to scrape, and it'll consistently scrape the information and give it back to you in a structured manner. So you can use this data to build your next platform, to build your next app, or even to monitor different things on any website.",
  "chunks": [
    {
      "timestamp": [0.96, 7.2],
      "text": " Hey guys, I'm pretty excited to talk about a new API that we're going to be releasing in Jigsaw"
    },
    {
      "timestamp": [7.2, 14.56],
      "text": " Stack. It's our AI Scrape API, which can basically scrape any website by just prompting and giving"
    },
    {
      "timestamp": [14.56, 20.64],
      "text": " you back structured data consistently, no matter the site. So I mean, we can try something simple"
    },
    {
      "timestamp": [20.64, 29.12],
      "text": " first. This will give you a short demo. So we're going to release this in beta and then improve it as we go. So let's try to scrape super base pricing plan, for example,"
    },
    {
      "timestamp": [29.12, 34.8],
      "text": " something simple. So I think I have a small demo up here. So we have the URL of the page that we're"
    },
    {
      "timestamp": [34.8, 39.2],
      "text": " going to scrape. We're going to type basically what we want to scrape. So we don't even need"
    },
    {
      "timestamp": [39.2, 43.92],
      "text": " to inspect element. We don't need to understand how the website works. We just need to write what"
    },
    {
      "timestamp": [43.92, 45.76],
      "text": " we want to script. So we're going to script the plan header."
    },
    {
      "timestamp": [45.76, 46.76],
      "text": " So we got a plan header."
    },
    {
      "timestamp": [46.76, 49.24],
      "text": " So right over here and the plan price."
    },
    {
      "timestamp": [49.24, 49.56],
      "text": " All right."
    },
    {
      "timestamp": [49.56, 51.64],
      "text": " So we want to script the plan price and that's it."
    },
    {
      "timestamp": [52.32, 56],
      "text": " So what it's going to do, it's going to take the website's HTML,"
    },
    {
      "timestamp": [56, 60.36],
      "text": " try to understand how the HTML works and find the right places to script."
    },
    {
      "timestamp": [60.36, 65.12],
      "text": " So as you can see, it gets us all the relevant tax"
    },
    {
      "timestamp": [65.12, 68.34],
      "text": " information that we need. So all the pricing plans over here. And"
    },
    {
      "timestamp": [68.34, 70.82],
      "text": " if you scroll down, you'll get the next section to the actual"
    },
    {
      "timestamp": [70.82, 76.74],
      "text": " prices. So you can see $0 $25 599 exactly as we got here. And"
    },
    {
      "timestamp": [76.74, 79.4],
      "text": " you didn't have to write a single piece of code to scrape"
    },
    {
      "timestamp": [79.44, 83.32],
      "text": " the website. Another more complex example would be Y"
    },
    {
      "timestamp": [83.32, 86.54],
      "text": " Combinator. So if you go to Y Combinator's site over here,"
    },
    {
      "timestamp": [86.54, 93.24],
      "text": " you have a bunch of information, article titles, points, and a bunch of other things. So same thing,"
    },
    {
      "timestamp": [93.24, 98.04],
      "text": " right? You put the link of the website you would like to scrape, the information you"
    },
    {
      "timestamp": [98.04, 102.32],
      "text": " can also post title right over here. So the script, the post title and post points. So"
    },
    {
      "timestamp": [102.32, 105.2],
      "text": " the more descriptive you are, the more accurate information gets"
    },
    {
      "timestamp": [105.2, 109.86],
      "text": " and it's similar to how you would use ChatGPT for messaging."
    },
    {
      "timestamp": [110.18, 112],
      "text": " It's all about prompt engineering."
    },
    {
      "timestamp": [112.4, 113.24],
      "text": " Same here, right?"
    },
    {
      "timestamp": [113.28, 114.68],
      "text": " So as simple as it is,"
    },
    {
      "timestamp": [114.76, 116.54],
      "text": " you just write what you need and you press send."
    },
    {
      "timestamp": [117.06, 120.4],
      "text": " You should get back consistent data every single time."
    },
    {
      "timestamp": [121.04, 123.54],
      "text": " So especially this is great for websites"
    },
    {
      "timestamp": [123.54, 126],
      "text": " that are dynamic social media platforms where you"
    },
    {
      "timestamp": [126, 130.5],
      "text": " want to scrape a set of content that always changes. So as you can see, you get the title"
    },
    {
      "timestamp": [130.5, 136.22],
      "text": " as shown here. And each of it is the title is shown here for the entire page and content goal"
    },
    {
      "timestamp": [136.22, 141.76],
      "text": " is pretty long. So let me show you two sets. It even tells you the selectors that you can use to"
    },
    {
      "timestamp": [141.76, 147.94],
      "text": " extract a website. So if you want to scrape it directly yourself in the future, you can just use the selectors to consistently"
    },
    {
      "timestamp": [147.94, 152.06],
      "text": " get the information over and over again. So let me show you the"
    },
    {
      "timestamp": [152.06, 155.92],
      "text": " other piece. So right over here, you get the points for each of the items as well. So 12 points,"
    },
    {
      "timestamp": [155.98, 159.92],
      "text": " 58 points, accordingly. So you scroll down,"
    },
    {
      "timestamp": [159.98, 164.02],
      "text": " you'll get all the points. So you can use this for any site. Just write a prompt on what"
    },
    {
      "timestamp": [164.02, 166.48],
      "text": " you want to scrape, and it'll consistently scrape the information"
    },
    {
      "timestamp": [166.48, 168.86],
      "text": " and give it back to you in a structured manner."
    },
    {
      "timestamp": [169, 172.36],
      "text": " So you can use this data to build your next platform,"
    },
    {
      "timestamp": [172.46, 173.34],
      "text": " to build your next app,"
    },
    {
      "timestamp": [173.58, 177.22],
      "text": " or even to monitor different things on any website."
    }
  ]
}
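
Each entry in `chunks` pairs a `[start, end]` timestamp in seconds with the text spoken in that window, which maps naturally onto subtitle formats. As a minimal sketch, assuming only the response shape shown in the sample above, the helpers below turn the chunks into SubRip (SRT) captions. `Chunk`, `toSrtTime`, and `chunksToSrt` are hypothetical names for illustration, not part of the SDK.

```typescript
// Shape of one entry in the "chunks" array of the sample result.
interface Chunk {
  timestamp: [number, number]; // [start, end] in seconds
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000); // rounding avoids float drift
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "," + pad(ms, 3);
}

// Convert transcript chunks into SRT cues numbered from 1.
function chunksToSrt(chunks: Chunk[]): string {
  return chunks
    .map((chunk, i) => {
      const [start, end] = chunk.timestamp;
      return (
        String(i + 1) + "\n" +
        toSrtTime(start) + " --> " + toSrtTime(end) + "\n" +
        chunk.text.trim() // chunk text in the sample starts with a space
      );
    })
    .join("\n\n");
}
```

For example, `chunksToSrt(result.chunks)` on the sample above would start with a cue running from `00:00:00,960` to `00:00:07,200`.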