User:Lindenb/Notebook/UMR915/20110610: Difference between revisions
From OpenWetWare
Revision as of 11:50, 14 June 2011
Hadoop
download & unzip hadoop-0.20.203.0rc1.tar.gz
Single node setup
http://hadoop.apache.org/common/docs/current/single_node_setup.html
export JAVA_HOME=/usr/local/package/jdk1.6.0_26
cd hadoop-0.20.203.0
mkdir input
cp conf/*.xml input
bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
11/06/10 10:59:36 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-lindenb/mapred/staging/lindenb-1423012718/.staging/job_local_0001
java.net.UnknownHostException: srv-clc-04.u915.irt.univ-nantes.prive3: srv-clc-04.u915.irt.univ-nantes.prive3
    at java.net.InetAddress.getLocalHost(InetAddress.java:1354)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:815)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
    at java.security.AccessController.doPrivileged(Native Method)
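The UnknownHostException above is thrown by InetAddress.getLocalHost(), which usually means the machine's own hostname cannot be resolved to an address. A minimal sketch for checking whether the hostname has an entry in /etc/hosts; the helper name in_hosts_file is hypothetical, and the check only reads the hosts file, so a name that resolves through DNS would still be reported as not listed.

```shell
#!/bin/sh
# Return 0 if NAME is listed as a hostname or alias in a hosts-format FILE.
# in_hosts_file NAME FILE   (hypothetical helper, not part of Hadoop)
in_hosts_file() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # strip comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# Check the name that Java will try to look up for the local machine.
name=$(hostname)
if in_hosts_file "$name" /etc/hosts; then
    echo "$name is listed in /etc/hosts"
else
    echo "$name is not listed in /etc/hosts"
fi
```

If the name is not listed, adding a line for it to /etc/hosts (this needs root) is one common way to make the lookup succeed.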
Change the config
conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
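The three files above each hold a single property, so they can be generated from a small script. A minimal sketch, assuming it is run from the unpacked release directory; write_site and CONF_DIR are hypothetical names, and the script overwrites any existing site files in that directory.

```shell
#!/bin/sh
# Write a one-property Hadoop site file: write_site FILE NAME VALUE
# (hypothetical helper; it replaces FILE if it already exists)
write_site() {
    cat > "$1" <<EOF
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# CONF_DIR is an assumption: the conf/ directory of the unpacked release.
CONF_DIR=${CONF_DIR:-conf}
mkdir -p "$CONF_DIR"

# Same names and values as in the three files listed above.
write_site "$CONF_DIR/core-site.xml"   fs.default.name    hdfs://localhost:9000
write_site "$CONF_DIR/hdfs-site.xml"   dfs.replication    1
write_site "$CONF_DIR/mapred-site.xml" mapred.job.tracker localhost:9001
```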
Set up ssh to log in without a password
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
## Important, change chmod for ssh
$ chmod 700 ~/.ssh/
$ chmod 640 ~/.ssh/authorized_keys
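sshd can refuse key-based logins when ~/.ssh or authorized_keys are too permissive, which is why the chmod step matters. A minimal sketch for checking the modes set above; mode_of and check_mode are hypothetical helpers, and the script assumes a stat that accepts -c (GNU coreutils or BusyBox).

```shell
#!/bin/sh
# Print the octal permission bits of a path.
# Assumes a stat that supports -c '%a' (GNU coreutils, BusyBox).
mode_of() {
    stat -c '%a' "$1"
}

# Report whether a path has the expected mode: check_mode PATH EXPECTED
# Returns 0 on a match, 1 on a mismatch, 2 if the path cannot be read.
check_mode() {
    actual=$(mode_of "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "ok   $1 ($actual)"
    else
        echo "FIX  $1 is $actual, expected $2"
        return 1
    fi
}

# The two modes set by the chmod commands above.
check_mode "$HOME/.ssh" 700
check_mode "$HOME/.ssh/authorized_keys" 640
```

To confirm that the login itself works without a prompt, ssh -o BatchMode=yes localhost true fails instead of asking for a password.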
Format a new distributed-filesystem
$ bin/hadoop namenode -format
[lindenb@srv-clc-04 hadoop-0.20.203.0]$ bin/hadoop namenode -format
11/06/10 12:19:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = java.net.UnknownHostException: srv-clc-04.u915.irt.univ-nantes.prive3: srv-clc-04.u915.irt.univ-nantes.prive3
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.203.0
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
************************************************************/
Re-format filesystem in /tmp/hadoop-lindenb/dfs/name ? (Y or N) Y
11/06/10 12:19:07 INFO util.GSet: VM type = 64-bit
11/06/10 12:19:07 INFO util.GSet: 2% max memory = 19.1675 MB
11/06/10 12:19:07 INFO util.GSet: capacity = 2^21 = 2097152 entries
11/06/10 12:19:07 INFO util.GSet: recommended=2097152, actual=2097152
11/06/10 12:19:07 INFO namenode.FSNamesystem: fsOwner=lindenb
11/06/10 12:19:07 INFO namenode.FSNamesystem: supergroup=supergroup
11/06/10 12:19:07 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/06/10 12:19:07 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/06/10 12:19:07 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/06/10 12:19:07 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/06/10 12:19:07 INFO common.Storage: Image file of size 113 saved in 0 seconds.
11/06/10 12:19:08 INFO common.Storage: Storage directory /tmp/hadoop-lindenb/dfs/name has been successfully formatted.
11/06/10 12:19:08 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException: srv-clc-04.u915.irt.univ-nantes.prive3: srv-clc-04.u915.irt.univ-nantes.prive3
************************************************************/
Start the server
[lindenb@srv-clc-04 hadoop-0.20.203.0]$ ./bin/start-all.sh
namenode running as process 23788. Stop it first.
localhost: starting datanode, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-datanode-srv-clc-04.u915.irt.univ-nantes.prive3.out
localhost: starting secondarynamenode, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-secondarynamenode-srv-clc-04.u915.irt.univ-nantes.prive3.out
starting jobtracker, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-jobtracker-srv-clc-04.u915.irt.univ-nantes.prive3.out
localhost: starting tasktracker, logging to /home/lindenb/package/hadoop-0.20.203.0/bin/../logs/hadoop-lindenb-tasktracker-srv-clc-04.u915.irt.univ-nantes.prive3.out
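The "namenode running as process 23788. Stop it first." line means a namenode from an earlier start was still alive, so only the other four daemons were launched here. A minimal sketch for checking which daemons are up, assuming the JDK's jps tool is on the PATH; missing_daemons is a hypothetical helper that reads jps-style output ("PID ClassName" per line).

```shell
#!/bin/sh
# Read jps-style output on stdin and print every expected Hadoop daemon
# that is absent from it. (hypothetical helper, not part of Hadoop)
missing_daemons() {
    input=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the whole second field, so "NameNode" does not
        # match a "SecondaryNameNode" line.
        printf '%s\n' "$input" |
            awk -v d="$d" '$2 == d { f = 1 } END { exit !f }' ||
            echo "$d"
    done
}

# Only query jps when it is actually installed.
if command -v jps > /dev/null 2>&1; then
    jps | missing_daemons
fi
```

For a clean restart, running bin/stop-all.sh before bin/start-all.sh stops the leftover namenode first; an empty result from the check then means all five daemons are running.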
Copy cdina's data
server1:
scp Axiom_GW_Hu_SNP.r2.na31.annot.csv lindenb@172.18.254.164:
The authenticity of host '172.18.254.164 (172.18.254.164)' can't be established.
RSA key fingerprint is ad:67:03:8d:c1:20:d6:70:04:aa:c2:c8:9b:26:62:8f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.18.254.164' (RSA) to the list of known hosts.
lindenb@172.18.254.164's password:
Axiom_GW_Hu_SNP.r2.na31.annot.csv 100% 765MB 40.3MB/s 00:19
Create a directory on HDFS:
bin/hadoop fs -mkdir myfolder
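A relative HDFS path such as myfolder is resolved against the user's HDFS home directory, here /user/lindenb. A possible next step, not taken from the notebook, is to upload the copied file into that folder and list it. A minimal sketch; hdfs_upload and the HADOOP variable are hypothetical names, while fs -put and fs -ls are standard FS shell commands.

```shell
#!/bin/sh
# HADOOP is an assumption: the launcher script, relative to the
# unpacked release directory.
HADOOP=${HADOOP:-bin/hadoop}

# Upload a local file into an existing HDFS folder, then list the folder.
# hdfs_upload LOCAL_FILE HDFS_DIR   (hypothetical helper)
hdfs_upload() {
    $HADOOP fs -put "$1" "$2" && $HADOOP fs -ls "$2"
}
```

From the release directory this would be called as hdfs_upload ~/Axiom_GW_Hu_SNP.r2.na31.annot.csv myfolder, assuming the scp above left the file in the home directory.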