Over the last three days, I tried to work out my probable friends, and it was not as difficult as I had expected.
I have around 340 friends, and if I can get the friends list pages of all of them, it is easy to work out the probable friends. wget is a very nice utility for downloading anything. To authenticate the session, we need to pass the username, password, and some other information to orkut as POST data. Finding all the form fields by hand would be time-consuming, so I used the Live HTTP Headers extension for Firefox, which records the HTTP traffic for whatever you do on a website. For example, if you start Live HTTP Headers and log in to orkut, it records every request sent during login; it takes around 4-5 requests before the home page is displayed.
We can use those HTTP headers to build the wget requests. wget has a few relevant options: --load-cookies, --save-cookies, and --keep-session-cookies. With them, wget can maintain a proper session.
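As a rough illustration of how those cookie options fit together, here is a minimal Python sketch that only builds the wget command lines and prints them. The login URL, the form field names, and the file names are placeholders I made up; the real values have to be copied from the requests captured by Live HTTP Headers, and the real login involved several requests rather than the single one shown here.

```python
import shlex
from urllib.parse import urlencode

# Hypothetical values: the real login URL and form fields must be copied
# from the requests captured by Live HTTP Headers.
LOGIN_URL = "https://example.com/login"
COOKIE_FILE = "cookies.txt"


def login_command(form_fields, login_url=LOGIN_URL, cookie_file=COOKIE_FILE):
    """Build a wget command that posts the login form and saves the cookies."""
    return [
        "wget",
        "--save-cookies", cookie_file,
        "--keep-session-cookies",  # also keep cookies that have no expiry date
        "--post-data", urlencode(form_fields),
        "-O", "login_response.html",
        login_url,
    ]


def fetch_command(url, output_file, cookie_file=COOKIE_FILE):
    """Build a wget command that reuses the saved cookies for a later request."""
    return [
        "wget",
        "--load-cookies", cookie_file,
        "--save-cookies", cookie_file,
        "--keep-session-cookies",
        "-O", output_file,
        url,
    ]


if __name__ == "__main__":
    # Placeholder field names; the commands are printed, not executed.
    fields = {"username": "me@example.com", "password": "secret"}
    print(shlex.join(login_command(fields)))
    print(shlex.join(fetch_command("https://example.com/profile", "profile.html")))
```

Each list can be passed to `subprocess.run` once the placeholders are replaced. Without --keep-session-cookies, wget does not write session cookies to the cookie file, so the later requests would not be logged in.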
Once authentication is done, everything else can be automated. I did it in the following way:
- The script takes my id and downloads my profile page with wget.
- VIM parses that file and finds the number of friends, and a C program generates the URLs for all the friends. (It is also possible to generate the URLs with VIM; I could not remember how to do it at the time.)
- wget downloads the friends list pages of all my friends.
- VIM parses all the files and stores all the friends of friends in plain text format.
- A program written in Java takes all the friends of friends and finds the probable friends.
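The last step can be sketched as follows. This is not the Java program itself; it is a small Python sketch that assumes a "probable friend" is anyone who appears on my friends' lists but is not already my friend, ranked by the number of mutual friends. The function name, the `min_mutual` parameter, and the sample ids are all made up for illustration.

```python
from collections import Counter


def probable_friends(my_id, friends_of, min_mutual=1):
    """Rank friends of friends by how many mutual friends they share with me.

    friends_of maps each of my friends' ids to the ids on that friend's list.
    """
    my_friends = set(friends_of)
    mutual = Counter()
    for friend, their_friends in friends_of.items():
        # set() guards against the same id appearing twice on one list
        for candidate in set(their_friends):
            if candidate != my_id and candidate not in my_friends:
                mutual[candidate] += 1
    ranked = [(c, n) for c, n in mutual.items() if n >= min_mutual]
    # Most mutual friends first; ties broken by id for a stable order
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


if __name__ == "__main__":
    # Made-up ids for illustration only.
    sample = {
        "alice": ["me", "bob", "carol", "dave"],
        "bob": ["me", "alice", "carol"],
        "erin": ["me", "carol", "dave"],
    }
    for candidate, count in probable_friends("me", sample, min_mutual=2):
        print(candidate, count)  # one line per friend of friend
```

Printing one candidate per line matches the single plain text file described in the post, and raising `min_mutual` narrows the list to the people with the most mutual friends.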
I have around 50,000 friends of friends. Around 1,000 of them have more than 5 mutual friends with me. That was many more than I expected.
After getting all this data, I wondered: if orkut added this feature, would I use it or not? Except for the "find your Gmail contacts in orkut" feature, orkut uses paging everywhere and shows 10-20 items per page. To see all 1,000 friends of friends, I might have to open 50 pages, which would be tedious. With my script, I got all the details in a single plain text file, with one line per friend of friend, so it was very easy for me to find my friends. If orkut had implemented this feature a week earlier, I would not have tried any of this; I would have browsed all 50 pages and would not have learned how to log in to a website programmatically. So, I should be thankful to orkut for delaying this feature. ;)
I would like to thank Deepak Manohar for teaching me a few VIM commands for parsing the HTML files.
Good to read that.
The concept of how far a person is from you in the chain of friends is a topic of research in another form. Check out the Erdos Number Project: http://www.oakland.edu/enp .
You may wish to look at the Orkut APIs to reduce some of the hacking: http://code.google.com/apis/orkut/docs/orkutdevguide.html .
-rupesh.
Did you develop anything with the Orkut APIs?
Immediately after Google released the orkut APIs, I requested access and tried to work with them. I got access to the orkut APIs (sandbox.orkut.com), but there are not enough APIs to develop any complex application. Even now, I have not seen any reasonable application (other than Hello World) built with the orkut APIs. Even a plain listing of friends does not return all the friends. I get only 4 friends, probably only those who have access to sandbox.orkut.com.
I could not do anything significant with the orkut APIs, so I did this with wget and VIM. If you have developed any application with the orkut APIs, please let me know.
I agree that one cannot do anything serious with the API right now. I thought the available APIs would be well suited to your application (friends of friends). But support will slowly be added, I think. We should understand that releasing an API for a feature means the task can be automated, which in turn means a malicious user can write a program to automate a problematic task.
I haven't written an application using the API, but I plan to do so soon using OpenSocial (http://code.google.com/apis/opensocial/articles/).
-rupesh.