December 03, 2010

Make sure the start-dfs.sh script is executed on the master!

Everything looks fine when start-all.sh is executed, but no NameNode process appears in the jps output. Why? When I check the logs, I also see the following networking exception:

ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to hostname/ipaddress:host : Cannot assign requested address
        at org.apache.hadoop.ipc.Server.bind(Server.java:190)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:253)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:1026)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:488)

When I try to do an ls on DFS, I get the following:
ipc.Client: Retrying connect to server:  hostname/ipaddress:host
...
...
Bad connection to FS. command aborted

I spent a lot of time trying to figure out the "networking" problem: I checked whether the port was already in use, looked for an IPv4/IPv6 conflict, etc.

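In the end, I realized that I was running the start-all.sh script on a random node. The script starts the NameNode on whichever machine it is run on, and a non-master machine cannot bind to the master's address. When the script is executed on the master node, it works just fine! Simple fix.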
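
A quick way to catch this before digging into ports and IPv6 is to ask the machine you are logged into whether it can bind to the NameNode's address at all. The sketch below is my own illustration, not part of Hadoop: the function name can_bind_locally is hypothetical, and it assumes you pass in the host from your fs.default.name setting. If it returns False, you are most likely not on the master (or the hostname resolves to an address this machine does not own), which is the situation that produces "Cannot assign requested address".

```python
import socket
import sys


def can_bind_locally(host):
    """Return True if this machine can bind a TCP socket to `host`.

    `host` is assumed to be the NameNode host taken from your own
    configuration; the function name is hypothetical, not a Hadoop API.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            # Port 0 lets the OS pick any free port, so a failure here
            # points at the address itself, not at a port already in use.
            sock.bind((host, 0))
        return True
    except OSError:
        # Covers both "Cannot assign requested address" and
        # hostname resolution failures (socket.gaierror).
        return False


if __name__ == "__main__":
    # Default to this machine's own hostname if no argument is given.
    target = sys.argv[1] if len(sys.argv) > 1 else socket.gethostname()
    if can_bind_locally(target):
        print(target + ": this machine can bind to that address")
    else:
        print(target + ": cannot bind here, probably not the master")
```

Run it on the node where you are about to call start-all.sh, with the master's hostname as the only argument. It only tests the address, so a port conflict would still need a separate check.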