Before you get started you're going to want to install some dependencies. The ones I have used are request, cheerio and promise. If you don't have npm installed yet, follow the Node.js installation instructions to install node and npm first. Installing the dependencies works like this: `npm install --save request cheerio promise`

Once you have all the dependencies, you're going to need a webpage to scrape. You can pick any page you want, as long as it contains some data that you want to extract into a JSON file. I picked the Starcraft 2 units overview page as a starting point.

In order to start scraping the page, we're going to need to load it up. Note that this will simply pull down the HTML. For my use case that was enough, but if you need the webpage to execute JavaScript in order to get the content you need, you might want to have a look at PhantomJS. It's a headless browser that will allow JavaScript execution.
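To make the loading step concrete, here is a minimal sketch of how the page could be pulled down and handed to cheerio. It is not necessarily the exact code I ended up with: the function names `fetchHtml` and `loadPage` and the optional `get` parameter are made up for illustration, and it uses the built-in `Promise` (with the promise package you would add `var Promise = require('promise')` at the top and the rest should stay the same).

```javascript
// Sketch only: fetchHtml, loadPage and the `get` parameter are
// hypothetical names chosen for this example.

// Download the raw HTML of `url`. `get` is any function with the same
// (url, callback(error, response, body)) shape as the request package;
// when it is left out, request itself is used.
function fetchHtml(url, get) {
  return new Promise(function (resolve, reject) {
    get = get || require('request');
    get(url, function (error, response, body) {
      if (error) {
        // Network-level failure (DNS, refused connection, ...).
        return reject(error);
      }
      if (response.statusCode !== 200) {
        // The server answered, but not with the page we asked for.
        return reject(new Error('Unexpected status code: ' + response.statusCode));
      }
      resolve(body);
    });
  });
}

// Download a page and resolve with a cheerio handle ($) for querying it.
// No JavaScript on the page is executed; only the HTML is parsed.
function loadPage(url, get) {
  return fetchHtml(url, get).then(function (html) {
    return require('cheerio').load(html);
  });
}
```

Calling `loadPage(url)` with the address of the page you picked returns a promise; in its `then` callback you receive `$` and can query the document with jQuery-style selectors, for example `$('title').text()`. Passing your own `get` function is only there so the download step can be tried out without hitting the network.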