yasemin's notes: November 2011

November 28, 2011

How to get android app store apps on kindle fire

It was not happy to realize that kindle fire does not allow access to android market place. Probably this is a well thought out marketing strategy of Amazon, but if you're tired of keep being redirected to "can not find app" message and don't want to root your kindle, keep reading :)

The default kindle fire browser (Silk) redirects all android market links to amazon app store. After googling a little bit, idea of getting another browser and using that to access android market seemed like to be a smart way. I tried accessing via dolphin - but failure, still redirect to amazon app store (btw - give a try to dolphin if you haven't done yet, it is a pretty cool browser.). So I'm left with the last option, sideloading apps through third parties such as freewarelovers.com and getjar.com. Just search for the apps online, download the apk file, and click on it to install. Before installing, make sure "allow applications from unknown sources" option is enabled through Settings->Device.

November 08, 2011

Scalable Manipulation of Archival Web Graphs

I was working on archival web graphs till last June. Actually, most of the Hadoop related posts on this blog are things I learned while working on this project.

We looked at the problem of processing large-scale archival web graphs and generating a simple representation for the raw input graph data. This representation should allow the end user to be able to analyze & query the graph efficiently. Also, the representation should be flexible enough - so it can be loaded into a database or can be processed using distributed computing frameworks, ie. Hadoop.
To achieve these goals, we developed a workflow for archival graph processing within Hadoop. This project is still going on and its current status has appeared at LSDS-IR '11 workshop at CIKM conference last month. I like to share the paper [acm] [pdf] for those who are interested in further details. The abstract is following:

In this paper, we study efficient ways to construct, represent and analyze large-scale archival web graphs. We first discuss details of the distributed graph construction algorithm implemented in MapReduce and the design of a space-efficient layered graph representation. While designing this representation, we consider both offline and online algorithms for the graph analysis. The offline algorithms, such as PageRank, can use MapReduce and similar large-scale, distributed frameworks for computation. On the other side, online algorithms can be implemented by tapping into a scalable repository (similar to DEC’s Connectivity Server or Scalable Hyperlink Store by Najork), in order to perform the computations. Moreover, we also consider updating the graph representation with the most recent information available and propose an efficient way to perform updates using MapReduce. We survey various storage options and outline essential API calls for the archival web graph specific real-time access repository. Finally, we conclude with a discussion of ideas for interesting archival web graph analysis that can lead us to discover novel patterns for designing state-of-art compression techniques.

Also, the source code for "graph construction algorithm" is open sourced at GitHub.