Scraping Cities

Hacks are always hilarious and brute force. One of the companies I work with needs a list of all businesses everywhere. To build that, we first need to know all the cities and their corresponding GPS coordinates.

I could geocode, but for fun, I’ll ruin somebody’s web server.

First we do a quick Google search to see if there’s a website that has a list of cities and their corresponding coordinates.

Ehhh the design needs work on this site :/… Whatever, next is to map the general structure of the site:

First page is a list of states:

image

One link deep we have all the cities:

image

Another link deep we have the coordinates:

image

Quick look at the source, and we get the corresponding CSS selectors for both coordinates (strictly speaking these are CSS selectors, not XPath):

Longitude: td div td:nth-child(2) strong

Latitude: td div td:nth-child(1) strong

Then some simple string substitution gets us the format we want (North and East are positive values, West and South are negative values).
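For illustration, here is a rough Python sketch of that sign conversion. It is not the code from the gist. The input format (a number followed by a hemisphere letter, like "79.3832° W") is an assumption about how the site prints its coordinates, and `to_signed` is a made-up name.

```python
import re

# Assumed input format: a number followed by a hemisphere letter,
# for example "43.6532° N" or "79.3832 W". The real site may differ.
_COORD = re.compile(r"(\d+(?:\.\d+)?)\s*°?\s*([NSEW])", re.IGNORECASE)


def to_signed(text):
    """Convert a coordinate like '79.3832° W' into a signed float."""
    match = _COORD.search(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    value = float(match.group(1))
    hemisphere = match.group(2).upper()
    # North and East stay positive, South and West become negative.
    return -value if hemisphere in ("S", "W") else value
```

So `to_signed("79.3832° W")` comes back negative and `to_signed("43.6532° N")` stays positive. Anything that doesn't match the assumed pattern raises a `ValueError` instead of silently writing junk into the database.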

Now all we have to do is loop it into a database, and we’re laughing. Below’s the code, and the video.
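As a rough idea of what "loop it into a database" can look like, here is a minimal Python sketch using SQLite. It is not the gist's code. The table layout and the names `crawl`, `save_cities`, `cities_in` and `coordinates_of` are all hypothetical, and the actual page fetching is left to whatever functions you pass in.

```python
import sqlite3


def crawl(states, cities_in, coordinates_of):
    """Walk the three levels of the site: states, then cities, then coordinates.

    cities_in(state) and coordinates_of(state, city) are placeholders for
    whatever code fetches and parses the real pages.
    """
    for state in states:
        for city in cities_in(state):
            latitude, longitude = coordinates_of(state, city)
            yield state, city, latitude, longitude


def save_cities(conn, rows):
    """Store (state, city, latitude, longitude) rows in a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cities ("
        "state TEXT NOT NULL, city TEXT NOT NULL, "
        "latitude REAL NOT NULL, longitude REAL NOT NULL, "
        "PRIMARY KEY (state, city))"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO cities VALUES (?, ?, ?, ?)", rows
        )
```

Usage would be something like `save_cities(sqlite3.connect("cities.db"), crawl(states, cities_in, coordinates_of))`. The primary key on state and city means re-running the scrape overwrites rows instead of duplicating them.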

https://gist.github.com/D3MZ/5198592