How to install Hadoop on Ubuntu 14.04 Server

First of all, sorry that this post isn't related to Ossn. I struggled for four hours to install Apache Hadoop for my brother, and none of the tutorials on the internet worked for me.

In this wiki I'll show you how to install Hadoop v2.6.0 on a fresh Ubuntu 14.04 x86 server with nothing else installed on it.

SSH SERVER

You need an SSH server with root access. Log in to your server using the root account.

INSTALLING NANO
Run the following commands:

sudo apt-get update
sudo apt-get install nano

INSTALLING JAVA RUNTIME

Run the following command:

sudo apt-get update

The above command makes Ubuntu update its package lists. Once the update completes, run the following command:

sudo apt-get install default-jre

The above command installs the Java Runtime Environment. Once it is installed, run the following command to install the JDK as well:

sudo apt-get install default-jdk

SET "JAVA_HOME" VARIABLE

Run the following command:

sudo update-alternatives --config java

It will give you output like this:

There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-i386/jre/bin/java

Copy the JVM path, in this example /usr/lib/jvm/java-7-openjdk-i386/ (the part before jre/bin/java).
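If you prefer not to trim the path by hand, here is a small sketch of deriving JAVA_HOME in the shell from the binary path that update-alternatives prints. The path below is the example value from above; yours may differ:

```shell
# Path printed by `update-alternatives --config java` (example value;
# yours may differ):
JAVA_BIN="/usr/lib/jvm/java-7-openjdk-i386/jre/bin/java"

# Strip the trailing jre/bin/java suffix to get the JVM root:
JAVA_HOME="${JAVA_BIN%/jre/bin/java}/"
echo "$JAVA_HOME"   # prints /usr/lib/jvm/java-7-openjdk-i386/
```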

Run the following command:

sudo nano /etc/environment

Add the following line at the top of the file:

JAVA_HOME="/usr/lib/jvm/java-7-openjdk-i386/"

Save the file by pressing CTRL+X, then Y, then ENTER.

Run the following command:

source /etc/environment

Now run the following command:

echo $JAVA_HOME

It should print /usr/lib/jvm/java-7-openjdk-i386/

Run the following command to verify that Java is installed:

java -version

INSTALLING HADOOP
Run the following commands one by one:

useradd -m -d /home/hadoop hadoop

passwd hadoop

Once you have created the user and set its password, run the following commands. Press ENTER at each ssh-keygen prompt to accept the defaults with an empty passphrase; the final ssh 127.0.0.1 checks that passwordless login works, so type yes when asked about the host key and then type exit to return to your shell:

su - hadoop

ssh-keygen

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh 127.0.0.1
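If ssh 127.0.0.1 still prompts for a password, sshd is usually rejecting the key because of loose permissions on ~/.ssh. These are standard OpenSSH requirements, not anything Hadoop-specific, so it is worth tightening them as the hadoop user:

```shell
# OpenSSH refuses authorized_keys files that are group/world writable,
# so lock down the key directory and file:
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```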

wget https://www.apache.org/dist/hadoop/core/hadoop-2.6.0/hadoop-2.6.0.tar.gz

tar -zxvf hadoop-2.6.0.tar.gz

mv hadoop-2.6.0 hadoop

EDIT THE .bashrc FILE

Run the following command:

nano ~/.bashrc

And add the following lines at the end of the file:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386/
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

Save it by pressing CTRL+X, then Y, then ENTER.

Run the following commands:

source ~/.bashrc  

cd $HADOOP_HOME/etc/hadoop

EDIT THE CONFIGURATION FILES

Run the following command:

nano core-site.xml

Change the configuration block to the following:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

Run the following command:

nano hdfs-site.xml

Change the configuration block to:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
    </property>

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
    </property>
</configuration>
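The two file:// paths above must exist and be writable by the hadoop user. Formatting the namenode will usually create them, but creating them up front avoids permission surprises. The paths below are assumed to match the XML above, with $HOME being /home/hadoop when you are logged in as the hadoop user:

```shell
# Create the local storage directories referenced by hdfs-site.xml:
mkdir -p "$HOME/hadoopdata/hdfs/namenode" "$HOME/hadoopdata/hdfs/datanode"
```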

Now run the following commands:

cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml

nano mapred-site.xml

Change the configuration block to:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Run the following command:

nano yarn-site.xml

Change the configuration block to:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Now run the following commands and follow the on-screen prompts:

hdfs namenode -format

cd $HADOOP_HOME/sbin/

start-dfs.sh

After running the commands, visit http://your-server-ip:50070 and you should see the HDFS web GUI. If you also want to run MapReduce jobs on YARN (which we configured above), run start-yarn.sh as well; its web GUI is on port 8088.
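A quick way to confirm the HDFS daemons actually came up is jps, which ships with the JDK and lists the JVMs running under the current user. A small check sketch, assuming the standard daemon names for a single-node Hadoop 2.6.0 setup:

```shell
# After start-dfs.sh these three HDFS daemons should be running;
# jps lists every JVM owned by the current user.
expected="NameNode DataNode SecondaryNameNode"
running="$(jps 2>/dev/null || true)"
for d in $expected; do
    case "$running" in
        *"$d"*) echo "$d is running" ;;
        *)      echo "$d is NOT running" ;;
    esac
done
```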