The Facebook API is a powerful tool that gives you an interface for creating games, authorizing users for an application, or building utility applications such as a wall content analyser. But there are a couple of things you cannot do with the API. For example, you can easily send messages to your friends, but there is no way to receive messages from them. For this and other reasons I decided to scrape Facebook's web pages in order to create my own API. For this I used the low-end web interface available at http://mbasic.facebook.com. In this post I will describe how to implement a simple login in Python.
We will need two libraries for our purpose: requests for making calls to Facebook, and HTMLParser for scraping the HTML code and extracting useful information. We will also need to know how Facebook’s low-end interface works. To find out, go to http://mbasic.facebook.com (make sure to log out first) and use the web inspector to view the HTML code. We should see something like below:
As you can see, we need to find the form tag with the id “login_form” and extract every input field with its name and value attributes. To do that we will create a parser class which derives from HTMLParser. We should end up with something like this:
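from html.parser import HTMLParser  # Python 2: from HTMLParser import HTMLParser


class LoginParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.isLoginForm = False  # True while we are inside the login form
        self.action = None        # value of the form's "action" attribute
        self.data = {}            # input name -> input value

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a dict is easier to query
        attrs = dict(attrs)
        if tag == "form":
            # .get() avoids a KeyError for forms that have no id attribute
            if attrs.get("id") == "login_form":
                self.isLoginForm = True
                self.action = attrs.get("action")
        elif tag == "input" and self.isLoginForm:
            name = attrs.get("name")
            if name:
                # inputs without a value attribute are stored as empty strings
                self.data[name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self.isLoginForm:
            self.isLoginForm = False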
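To see why the parser converts attrs into a dictionary, it helps to look at what HTMLParser actually hands to handle_starttag. The short sketch below records every start tag it meets. The markup in SAMPLE is made up for illustration (the field name "token" and its value are hypothetical, not copied from Facebook), and the TagLogger class is only a helper for this demonstration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples,
        # so dict(attrs) gives convenient lookup by attribute name
        self.seen.append((tag, dict(attrs)))


# Hypothetical markup, loosely shaped like a login form
SAMPLE = (
    '<form id="login_form" action="/login" method="post">'
    '<input type="hidden" name="token" value="abc123">'
    '<input type="text" name="email">'
    '</form>'
)

logger = TagLogger()
logger.feed(SAMPLE)
for tag, attributes in logger.seen:
    print(tag, attributes)
```

Note that the second input has no value attribute at all, so a lookup on its dictionary has to cope with a missing key. This is the reason for starting from an empty default value when collecting the form fields.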
Now we need to connect to Facebook, retrieve the login form and send the login request. We can do that with the following code, using the requests library:
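A minimal sketch of that flow might look like the block below. It rests on several assumptions: the function name login and the constants BASE_URL and HEADERS are chosen here for illustration, the form field names "email" and "pass" are assumed and should be checked against the inputs you see in the web inspector, the User-Agent string is only an example, and the URL uses https. Instead of passing cookies by hand, the sketch expects a session object such as requests.Session(), which remembers the cookies from the first response.

```python
from urllib.parse import urljoin

BASE_URL = "https://mbasic.facebook.com"

# Headers imitating a desktop browser; this User-Agent is only an example
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
                  "Gecko/20100101 Firefox/115.0",
}


def login(session, parser, email, password):
    """Fetch the login form, fill it in and post it back.

    session: object with get() and post(), e.g. requests.Session(),
             which keeps the cookies from the first response for us
    parser:  object with feed(), data and action, e.g. a LoginParser
    """
    # Step 1: initial request to retrieve the page with the login form
    response = session.get(BASE_URL, headers=HEADERS)

    # Step 2: let the parser collect the form action and input fields
    parser.feed(response.text)

    # Step 3: add the credentials ("email" and "pass" are assumed names)
    form = dict(parser.data)
    form["email"] = email
    form["pass"] = password

    # Step 4: the action may be a relative path, so resolve it first
    url = urljoin(BASE_URL, parser.action)
    return session.post(url, data=form, headers=HEADERS)
```

With requests installed, a call could be login(requests.Session(), LoginParser(), "me@example.com", "secret"). Taking the session and parser as arguments keeps the function easy to try out with stand-in objects before pointing it at the real site.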
Here we send an initial request to get the form, then we feed the parser with the response and extract the login form data. After that we set the login and password in the data dictionary and send a POST request to Facebook with the login and form data, the cookies retrieved at the beginning, and headers which simulate a desktop browser. Setting headers is not mandatory, but we do not want to let Facebook know that we are using our own script to do the job. The whole application should look like the following: