A hacker’s guide to getting a job
They call it networking. To you, that means meeting people who know other people who’ll help you get where you want to go. To a hacker? Meeting datasets that know other datasets… (IO networking).
Data is just information, and a dataset is just a collection of information. Ever since the social revolution, it’s been cool to post who you know and what you do.
So in the next series of tutorials, I’ll be developing an algorithm that maps out all the people you need to know to get where you want to go. If you’re a cook, think of this as a map of the people in between you and Gordon Ramsay.
Since I’m learning with you, we’ll keep it simple:
- Map out the steps involved in doing our objective.
- Leverage as much boilerplate code as possible. (Because hackers can’t program)
- We’ll start with something easy to mine - Twitter. Then we’ll get fancy.
STEP 1: MAP OUT PROCESS BEFORE CODING.
- First, we need a Twitter account that’s connected to at least a couple of people. Not you following people, but people following you.
- We’re working on the assumption that people who follow you are your friends. Or at minimum, they care about what you have to say. Close enough. Thus we will need to write code that generates a list of followers when you input some username (whether yours, your followers’, or their followers’).
- Whoever your target person is, we need to generate a list of the people they follow. Twitter’s API calls these “friends”. The goal is to know the people your target listens to, because if the people your target listens to also listen to you, then you have a really good chance of your target listening to you directly. Thus we will need to write code that generates a list of friends when you input some username.
- At this point you’ll need to generate lists of lists of lists of friends & followers. We’ll need to write code that checks whether the TargetUserName’s friends are following people who follow people who follow you (3 degrees of separation), or whether we have to run another degree of separation on either the target’s side or your own.
- Finally, make it human readable.
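The steps above boil down to a breadth-first search over "who follows whom". Here is a minimal sketch of that idea on made-up, in-memory data, so it runs without touching Twitter. The `FRIENDS_OF` hash, the usernames, the `chain_to_target` helper and the `max_hops` cutoff are all hypothetical names invented for this illustration. A real version would fill the hash from the friends and followers lists described above.

```ruby
require 'set'

# Hypothetical data: each key follows the accounts in its list
# (what Twitter's API calls that user's "friends").
FRIENDS_OF = {
  'gordon' => %w[alice bob],
  'alice'  => %w[carol],
  'bob'    => %w[dave],
  'carol'  => %w[me],
  'dave'   => []
}.freeze

# Breadth-first search from the target along "follows" edges until we
# reach `me`. Returns the chain ordered from me to the target, or nil if
# no chain exists within max_hops follow relationships.
def chain_to_target(friends_of, me, target, max_hops = 3)
  queue = [[target]]
  seen = Set.new([target])
  until queue.empty?
    path = queue.shift
    return path.reverse if path.last == me
    # A path of n names covers n - 1 hops; stop expanding at the cutoff.
    next if path.length > max_hops

    friends_of.fetch(path.last, []).each do |friend|
      next if seen.include?(friend)

      seen << friend
      queue << path + [friend]
    end
  end
  nil
end

chain = chain_to_target(FRIENDS_OF, 'me', 'gordon')
puts chain ? chain.join(' -> ') : 'no chain found, add another degree'
```

Because the search goes level by level, the first chain it finds is also a shortest one. Returning `nil` is the signal to "run another degree of separation" by raising `max_hops`.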
STEP 2: COPY AND PASTE
RubyGems is the first place to look for Ruby programmers. Conveniently, it has a couple of gems that connect to Twitter’s API. The twitter gem is the most popular, so we’ll go with that for now. Type this in your terminal:
$ sudo gem i twitter
Now we’re happy. If you go through both Twitter’s API docs and the gem docs, you’ll realize that this gem is a wrapper that makes Twitter’s output a little easier to handle. You can do without it, but it’s a nice starting point.
Get used to this:
require 'twitter'
That line loads the gem into your program.
A little Google mining and you get something like this from IBM:
Look at your friends (people you follow), and gather data to understand their popularity. In this case, you gather your friends and sort them in the order of their followers count.
#!/usr/bin/env ruby
require "rubygems"
require "twitter"
require 'google_chart'

name = String.new ARGV[0]
user = Hash.new

# Iterate friends, hash their followers
friends = Twitter.friend_ids(name)
friends.ids.each do |fid|
  f = Twitter.user(fid)
  # Only iterate if we can see their followers
  if (f.protected.to_s != "true")
    user[f.screen_name.to_s] = f.followers_count
  end
end

user.sort_by {|k,v| -v}.each { |user, count| puts "#{user}, #{count}" }
It’s nice because it ranks the people you follow by popularity. But Twitter limits how many times you can ask its servers for information, and this script spends one request on every single friend. It’s almost trying to do 2 degrees of separation in one program. With an OAuth limit of 350 requests/hour, IBM’s script is trash.
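To see why the limit bites, here is a back-of-the-envelope sketch of the request budget. It assumes the script makes one `friend_ids` call plus one `user` lookup per friend, and that the whole friend list comes back in a single `friend_ids` response. The helper names are made up for this example.

```ruby
REQUESTS_PER_HOUR = 350 # OAuth limit quoted in the post

# Assumption: one friend_ids call, then one user lookup per friend.
def requests_for_ranking(friend_count)
  1 + friend_count
end

# True if the whole ranking fits inside a single hourly budget.
def fits_in_one_hour?(friend_count)
  requests_for_ranking(friend_count) <= REQUESTS_PER_HOUR
end

[100, 349, 350, 2000].each do |n|
  puts "#{n} friends -> #{requests_for_ranking(n)} requests, fits in one hour: #{fits_in_one_hour?(n)}"
end
```

Under those assumptions, anyone following 350 or more accounts blows the hourly budget before the script finishes.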
A little more googling and sifting through people’s crap code, and you’ll get a broken version of this code (which I fixed, adding a couple of features for you):
#!/usr/bin/env ruby
require 'rubygems'
require 'twitter'

# Auth: fill in the credentials for your own Twitter app
Twitter.configure do |config|
  config.consumer_key = 'YOUR KEY'
  config.consumer_secret = 'YOUR SECRET'
  config.oauth_token = 'YOUR TOKEN'
  config.oauth_token_secret = 'YOUR TOKEN SECRET'
end

# True if s looks like a number (a user ID) rather than a screen name
def is_a_number?(s)
  !s.to_s.match(/\A[+-]?\d+(\.\d+)?\z/).nil?
end

abort "usage: #{$0} username" if ARGV.empty?
username = ARGV[0].to_s
username = username.to_i if is_a_number?(username)

# Page through the follower IDs; a next_cursor of 0 means no more pages
cursor = -1
follower_ids = []
while cursor != 0
  followers = Twitter.follower_ids(username, :cursor => cursor)
  cursor = followers.next_cursor
  follower_ids += followers.ids
end

# Write one follower ID per line
File.open("#{username}-followers.txt", 'w') do |outfile|
  outfile.puts(follower_ids)
end
You’re fucking welcome. You run it like this in your terminal:
$ chmod 775 scriptname.rb
$ ./scriptname.rb username
And it’ll output a file like this:
username-followers.txt
Unlike IBM’s code, mine actually gets a list of all your followers and then writes it to a file at the end. The upper limit for my code is, unfortunately, 1,750,000 people per hour (5,000 IDs per request * 350 requests). So if you’re Britney Spears or Justin Bieber, you’ll have to stay tuned until I write a fix next time.
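The arithmetic behind that ceiling can be sketched in a few lines. Both constants come straight from the numbers above. The helper names are made up for this example, and `windows_needed` assumes the limit resets cleanly every hour.

```ruby
IDS_PER_REQUEST = 5000   # follower IDs per follower_ids page, per the post
REQUESTS_PER_HOUR = 350  # OAuth rate limit, per the post

# Requests needed to page through every follower (integer ceiling division).
def requests_needed(follower_count)
  (follower_count + IDS_PER_REQUEST - 1) / IDS_PER_REQUEST
end

# Number of one-hour rate-limit windows those requests would span.
def windows_needed(follower_count)
  (requests_needed(follower_count) + REQUESTS_PER_HOUR - 1) / REQUESTS_PER_HOUR
end

[12_000, 1_750_000, 1_750_001].each do |n|
  puts "#{n} followers: #{requests_needed(n)} requests, #{windows_needed(n)} hourly window(s)"
end
```

One follower past 1,750,000 tips the script into a second hourly window, which is exactly where it would hit the rate limit.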
I added a degree of separation through Mathematica:
AbsoluteTiming[
 Run["./friends-lean.rb " <> ToString@#] & /@
  Import["jhiller12-followers.txt", "Lines"]]
But since it’s not urgent, I’ll eventually do everything in Ruby so you can copy and paste later.
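In the meantime, here is a rough sketch of what that same loop might look like in Ruby. It is only an illustration, not the finished version. `read_ids` and `run_for_each` are made-up helper names, and it assumes `./friends-lean.rb` takes a single ID as its argument, the way the Mathematica call uses it.

```ruby
require 'benchmark'

# Reads one follower ID per line, skipping blank lines.
def read_ids(path)
  File.readlines(path, chomp: true).map(&:strip).reject(&:empty?)
end

# Runs `script` once per ID. Returns a hash of id => result, where the
# result is what Kernel#system gives back: true, false, or nil.
def run_for_each(ids, script)
  ids.each_with_object({}) do |id, results|
    results[id] = system(script, id)
  end
end

# Only runs when called directly with a followers file as the argument.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  ids = read_ids(ARGV[0])
  elapsed = Benchmark.realtime { run_for_each(ids, './friends-lean.rb') }
  puts "#{ids.size} lookups in #{elapsed.round(2)}s"
end
```

You would run it with the followers file as its only argument, for example `jhiller12-followers.txt`. `Benchmark.realtime` stands in for `AbsoluteTiming`.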
Below is a follower graph (2 degrees) of my friend, jhiller12.
You can see that a lot of his followers know each other. Displaying information is the last step. We’ll work on that later.
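For the impatient, one possible way to get a picture like that is to dump the follower pairs as Graphviz DOT text and let Graphviz do the drawing. This is just a sketch under assumptions, not necessarily how the graph above was made: `to_dot` is a made-up helper, and the edge list is hypothetical sample data.

```ruby
# edges: array of [follower, followed] pairs.
# Returns Graphviz DOT source for a directed graph of those pairs.
def to_dot(edges, name = 'followers')
  lines = edges.uniq.map { |from, to| %(  "#{from}" -> "#{to}";) }
  (["digraph #{name} {"] + lines + ['}']).join("\n")
end

# Hypothetical sample: three accounts follow jhiller12, two follow each other.
sample = [
  %w[alice jhiller12],
  %w[bob jhiller12],
  %w[carol jhiller12],
  %w[alice bob],
  %w[bob alice]
]

puts to_dot(sample)
```

Save the output to a file and feed it to Graphviz, for example `dot -Tpng followers.dot -o followers.png`.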