Saturday, November 26, 2011

Hadoop Lookouts

Some important checks for Hadoop.

Logs are written to the log directory ($HADOOP_HOME/logs by default).
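For example, assuming a default install where HADOOP_LOG_DIR has not been overridden, the daemon logs can be inspected like this (the exact file names include your user name and host name):

$ ls $HADOOP_HOME/logs
$ tail -f $HADOOP_HOME/logs/hadoop-*-namenode-*.log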

It's a good idea to start and stop all the nodes once. While stopping, look for the following output, which means all the Hadoop components were running and have now been stopped:

stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
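These lines come from stop-all.sh. A full restart, assuming you run the scripts from the Hadoop bin directory, looks like:

$ ./stop-all.sh
$ ./start-all.sh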


If you see the following message instead, the namenode was never started, and all jobs will fail:

stopping jobtracker
localhost: stopping tasktracker
no namenode to stop
localhost: stopping datanode
localhost: stopping secondarynamenode
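A quick way to confirm which daemons are actually running is the JDK's jps tool (assuming the JDK bin directory is on your PATH). On a healthy single-node setup it should list NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker:

$ jps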

For a minimal configuration, refer to http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ or http://hadoop.apache.org/common/docs/current/cluster_setup.html.
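As a rough sketch of what those tutorials arrive at for a single-node setup (Hadoop 0.20/1.x property names; the host names and ports here are placeholders, so adjust them to your environment), the three conf files end up containing something like:

<!-- conf/core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>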


Format the namenode (note that this erases all existing HDFS metadata, so only do it on a fresh setup):
$ ./hadoop namenode -format

Start DFS:
$ ./start-dfs.sh

Stop DFS:
$ ./stop-dfs.sh

Start MapReduce:
$ ./start-mapred.sh

Stop MapReduce:
$ ./stop-mapred.sh

Copy data from the local file system to HDFS:
$ ./hadoop fs -copyFromLocal /home/input /input
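To verify the copy went through (using the same paths as above):

$ ./hadoop fs -ls /input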

Create a directory in HDFS:
$ ./hadoop fs -mkdir input
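Note that a relative path such as input is created under your HDFS home directory, typically /user/<username>. Using an absolute path avoids the ambiguity:

$ ./hadoop fs -mkdir /input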

Copy data from HDFS to the local file system:
$ ./hadoop fs -get /output /home/output
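-get and -copyToLocal are equivalent here, just as -put is the counterpart of -copyFromLocal:

$ ./hadoop fs -copyToLocal /output /home/output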

Check whether the whole DFS file system is OK:
$ ./bin/hadoop fsck / -files -blocks -locations > dfs-v-old-fsck-1.log
Note that here we are checking "/", i.e., the DFS root. You can replace the root directory with any other directory you want to check.
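The overall verdict appears near the end of the fsck report as a Status line (HEALTHY or CORRUPT), so a quick check on the log written above is:

$ grep Status dfs-v-old-fsck-1.log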

List all the DFS files and directories of the system:
$ ./bin/hadoop dfs -lsr / > dfs-v-old-lsr-1.log

List all the nodes participating in the cluster:
$ ./bin/hadoop dfsadmin -report > dfs-v-old-report-1.log
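A related check is whether the namenode is still in safe mode; it leaves safe mode automatically once enough blocks have been reported by the datanodes:

$ ./bin/hadoop dfsadmin -safemode get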




