
Apache Pig Setup

Apache Pig is part of the Hadoop ecosystem. When we execute an Apache Pig script, Pig internally runs Hadoop MapReduce jobs. Apache Pig is a good fit when you do not have strong programming skills or much experience with SQL queries, because its scripting language, Pig Latin, is simpler than writing MapReduce code directly. With that usage in mind, let's see how to configure Apache Pig.

Prerequisites

To set up Apache Pig, some prerequisites must be installed on the system. Because Apache Pig is part of the Hadoop ecosystem, we need to configure Hadoop first. Hadoop in turn requires Java, because it uses Java libraries internally. So there are two prerequisites for Apache Pig: Java and Hadoop.
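As a quick sanity check before continuing, the prerequisites can be tested from a terminal. This is a minimal sketch that only checks whether the java and hadoop commands are on the PATH; the helper name check_prereq is hypothetical, and it does not verify versions or Hadoop's configuration.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_prereq() {
    # command -v returns non-zero when the program cannot be found.
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Check both prerequisites; keep going even if one is missing.
for tool in java hadoop; do
    check_prereq "$tool" || true
done
```

If either line reports "missing", install and configure that prerequisite before setting up Pig.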

How to set up the Apache Pig software

Let's walk through the steps to set up Apache Pig.

First, we need to download Apache Pig from its website at the following URL.

https://pig.apache.org/

Apache Pig homepage

Open the URL above and click the link to the release page in the News section. You will be redirected to the page below.

https://pig.apache.org/releases.html

Click the Download link on that page to move to the download page.

There, click the "Download a release now!" link. It redirects to a page that suggests which site to visit for the Apache Pig download link.

Clicking the link listed under HTTP takes you to a page that lists the Pig releases available for download. We need to select one of them.

We will select the latest release, so click the latest directory. This opens the page below.

https://dlcdn.apache.org/pig/latest/

From this page, download the binary archive, pig-0.17.0.tar.gz, and not the src file. It is around 220 MB in size, so the download may take a while.
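The same download can also be done from the command line. The sketch below assumes that wget is installed and that the file URL is simply the latest directory above followed by the archive name; the "latest" directory may point to a different version in the future, so check the listing first. The function name download_pig is hypothetical.

```shell
PIG_VERSION=0.17.0
PIG_TARBALL="pig-${PIG_VERSION}.tar.gz"
# Assumption: the archive sits directly inside the "latest" directory.
PIG_URL="https://dlcdn.apache.org/pig/latest/${PIG_TARBALL}"

# Hypothetical helper: download the archive unless it is already present.
download_pig() {
    if [ -f "$PIG_TARBALL" ]; then
        echo "$PIG_TARBALL already downloaded"
        return 0
    fi
    wget "$PIG_URL"
}
```

Run download_pig in the directory where you want the archive to be saved.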

Install Apache Pig

Extract the downloaded archive using the command below:

tar -xvzf pig-0.17.0.tar.gz

This extracts the pig-0.17.0 directory. Move it to the location where you want to set up Apache Pig. We moved it under /usr/local using the command below. Depending on the permissions of that directory, you may need to run it with sudo.

mv pig-0.17.0 /usr/local/
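Before moving on to configuration, it can help to confirm that the installation directory contains what the later steps rely on. The sketch below is one possible check, assuming the install path /usr/local/pig-0.17.0; the function name verify_pig_layout is hypothetical, and it only looks for the pig launcher in bin and the pig.properties file in conf.

```shell
# Hypothetical helper: confirm the expected files exist under the Pig directory.
verify_pig_layout() {
    pig_dir="$1"
    for entry in bin/pig conf/pig.properties; do
        if [ ! -e "$pig_dir/$entry" ]; then
            echo "missing: $pig_dir/$entry"
            return 1
        fi
    done
    echo "layout ok: $pig_dir"
}
```

For example, verify_pig_layout /usr/local/pig-0.17.0 reports the first missing file, if any. A missing bin/pig usually means the wrong archive or directory was moved.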

Apache Pig Configuration

After installing Apache Pig, we need to make some configuration changes in the .bashrc file.

We need to set PIG_HOME to the Pig root location, add the Pig bin directory to the PATH, and add the conf directory to the classpath.
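The three settings described above might look like the following lines in ~/.bashrc. This is a sketch, assuming Pig was moved to /usr/local/pig-0.17.0 and that PIG_CLASSPATH is the variable used for the classpath entry; adjust the paths to match your own installation.

```shell
# Assumed install location from the earlier step; change it if Pig lives elsewhere.
export PIG_HOME=/usr/local/pig-0.17.0
# Make the pig launcher script available on the PATH.
export PATH="$PATH:$PIG_HOME/bin"
# Add Pig's conf directory to the classpath Pig picks up at startup.
export PIG_CLASSPATH="$PIG_HOME/conf"
```

After saving the file, run source ~/.bashrc or open a new terminal so the changes take effect.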

Apart from this, if we need to change any Pig properties, we can update the pig.properties file in the conf directory under the Pig home location.

We have now made the required changes for the Apache Pig configuration, so we can verify the Apache Pig version and confirm that the configuration is correct. Let's check the Apache Pig version using the command below:

pig -version

The output shows the Pig version correctly as 0.17.0.

Next, let's try to start the Apache Pig Grunt shell using the command below.
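If you want to automate this check, for example in a setup script, the version output can be tested for the expected string. This is a rough sketch: the function name check_pig_version is hypothetical, and it only assumes that the output of pig -version contains the version number somewhere, without relying on the exact wording.

```shell
EXPECTED_PIG_VERSION=0.17.0

# Hypothetical helper: confirm that pig runs and reports the expected version.
check_pig_version() {
    # Capture stderr too, since the launcher may print warnings there.
    output=$(pig -version 2>&1) || { echo "pig could not be run"; return 1; }
    case "$output" in
        *"$EXPECTED_PIG_VERSION"*) echo "Pig $EXPECTED_PIG_VERSION detected" ;;
        *) echo "unexpected version output"; return 1 ;;
    esac
}
```

Calling check_pig_version after reloading .bashrc gives a quick pass or fail signal for the PATH and PIG_HOME settings.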

pig

Here we can see that the Grunt shell starts and connects to Apache Pig successfully.

Conclusion

In this article, we walked through the complete step-by-step flow for downloading and configuring Apache Pig. Hopefully it helps you configure the latest version of Pig, after which you can start working with Apache Pig through the Grunt shell. If anything is still unclear, feel free to let me know. I am happy to help resolve any confusion.

Happy Pig!
