Sunday, 16 October 2016

Pentaho CE ETL Installation Linux

1.  Download Pentaho DI Server

Follow the steps below to download the Pentaho DI server.
1.      Open the browser and access the below URL. You will find all the available versions of Pentaho DI. Click on the latest version available (6.1).





2.      You will be redirected to a page that contains the kettle-sdk and pdi-ce downloads. Click on “pdi-ce-6.1.0.1-196.zip” to download the data integration server



3.      This will download a zip file containing the kitchen, spoon and pan utilities

2.  Download Oracle JDK

1.      Navigate to the below URL and click on JDK download


2.      You will be redirected to a page where you need to accept the agreement and select the download. Choose the 64-bit tar file



3.      The above step will download the tar file to your system

3.  Copy the Installers

1.      Log in to the server where Pentaho DI is to be installed, using the proper credentials
2.      Create a directory “pentaho_installer” in the root directory and copy into it the Pentaho zip and the JDK tar file downloaded in the previous steps.

4.  Oracle JDK installation

1.      Create the directory /pentaho/ if it does not exist
2.      Navigate to /pentaho_installer/ and run the below command. It will install jdk1.8 to the /pentaho/ folder
Command: tar zxvf jdk-8u91-linux-x64.tar.gz -C /pentaho/
3.      Set the path variable by running the below command
export PATH=/pentaho/jdk1.8.0_91/bin/:$PATH
4.      Execute the below command to verify that Java is installed properly. It should print the installed Java version (1.8.0_91)
java -version
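The JDK steps above can be collected into one small function. This is only a sketch under assumptions: `install_jdk` is a hypothetical helper name, and the default folder name `jdk1.8.0_91` is taken from the export command in this section.

```shell
#!/bin/sh
# Sketch: unpack a JDK tarball and put its bin directory on PATH.
# install_jdk is a hypothetical helper, not part of the JDK or Pentaho.
install_jdk() {
    tarball=$1                 # e.g. /pentaho_installer/jdk-8u91-linux-x64.tar.gz
    target=${2:-/pentaho}      # directory to extract into
    jdk_dir=${3:-jdk1.8.0_91}  # top-level folder inside the tarball (assumed)

    [ -f "$tarball" ] || { echo "tarball not found: $tarball" >&2; return 1; }
    mkdir -p "$target" || return 1
    tar -xzf "$tarball" -C "$target" || return 1

    # Refuse to touch PATH if the expected java binary is not there.
    [ -x "$target/$jdk_dir/bin/java" ] || {
        echo "java binary not found under $target/$jdk_dir" >&2
        return 1
    }
    PATH="$target/$jdk_dir/bin:$PATH"
    export PATH
}
```

A call such as `install_jdk /pentaho_installer/jdk-8u91-linux-x64.tar.gz && java -version` would then cover steps 1 to 4 for the current shell session only.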
       

5.  Setting environment variables

1.      Edit the file /etc/environment so that PATH starts with the JDK bin directory and PENTAHO_JAVA_HOME is defined, as below

PATH="/pentaho/jdk1.8.0_91/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
PENTAHO_JAVA_HOME="/pentaho/jdk1.8.0_91/"

2.      You have to log in again for the above changes to take effect
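After logging in again, a quick check along the following lines can confirm that the new session picked up the settings. It is a sketch only; `check_pentaho_env` is a hypothetical helper name, and it assumes PENTAHO_JAVA_HOME points at the JDK root as in the file above.

```shell
#!/bin/sh
# Sketch: confirm that PENTAHO_JAVA_HOME is set, contains a java binary,
# and that its bin directory is on PATH. Hypothetical helper.
check_pentaho_env() {
    if [ -z "${PENTAHO_JAVA_HOME:-}" ]; then
        echo "PENTAHO_JAVA_HOME is not set" >&2
        return 1
    fi
    home=${PENTAHO_JAVA_HOME%/}   # drop a trailing slash, if any
    if [ ! -x "$home/bin/java" ]; then
        echo "no java binary under $home/bin" >&2
        return 1
    fi
    # Accept the PATH entry with or without a trailing slash.
    case ":$PATH:" in
        *":$home/bin:"*|*":$home/bin/:"*) ;;
        *) echo "$home/bin is not on PATH" >&2; return 1 ;;
    esac
    echo "environment looks consistent"
}
```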

6.  DI Installation

1.      Navigate to /pentaho_installer/ and run the below command
Command: unzip pdi-ce-6.1.0.1-196.zip -d /pentaho/
2.      This will create the data-integration folder under /pentaho/
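One way to confirm the extraction worked is to look for the three launcher scripts mentioned earlier. The following is a minimal sketch; `check_pdi_install` is a hypothetical helper name and the default path assumes the unzip target used above.

```shell
#!/bin/sh
# Sketch: check that the data-integration folder contains the
# kitchen, pan and spoon launcher scripts. Hypothetical helper.
check_pdi_install() {
    pdi_dir=${1:-/pentaho/data-integration}
    [ -d "$pdi_dir" ] || { echo "directory not found: $pdi_dir" >&2; return 1; }
    for script in kitchen.sh pan.sh spoon.sh; do
        if [ ! -f "$pdi_dir/$script" ]; then
            echo "missing $script in $pdi_dir" >&2
            return 1
        fi
    done
    echo "kitchen, pan and spoon found in $pdi_dir"
}
```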



7.  Configuring the repository

1.      Pentaho CE supports two types of repository
a.      FileRepository: The repository lives in the local file system. It is recommended for better performance. For clustering, the same repository must be mounted on all the nodes.
b.      DB Repository: The repository lives in the database. It has some performance issues but is suited to a multi-node cluster.

7.1.   Configuring DB Repository

7.1.1.        Creating Postgres Database

1.      Download the script file and copy it to the below location
/pentaho_installer/
2.      Make sure md5 authentication is enabled for all local connections by viewing the file /etc/postgresql/9.3/main/pg_hba.conf and looking for the below line. If the method is not md5, change it to md5 and restart the postgres service
# "local" is for Unix domain socket connections only
local   all             all                                     md5

3.      Run the below command to create the pentaho DB repository.
psql -U postgres -f  "/pentaho_installer/DBRepository_postgres.sql"
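The pg_hba.conf check in step 2 can also be done from the command line instead of reading the file by eye. This is a sketch, assuming the file location given above; `local_auth_method` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the authentication method of the "local all all" line
# in pg_hba.conf. Comment lines are skipped because their first field is "#".
local_auth_method() {
    hba_file=${1:-/etc/postgresql/9.3/main/pg_hba.conf}
    awk '$1 == "local" && $2 == "all" && $3 == "all" { print $NF; exit }' "$hba_file"
}
```

If `local_auth_method` prints anything other than `md5`, the line needs to be edited and the postgres service restarted before running the psql command.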

7.1.2.        Create repository.xml: Copy the repository.xml to the user .kettle folder (~/.kettle/) and change the configuration as required

7.1.3.        Configuring data integration with DB Repository

1.      Create the JNDI configuration file in the below location
/pentaho/data-integration/simple-jndi/

2.      Add the below configuration and change the connection details according to the environment
Repository/type=javax.sql.DataSource
Repository/driver=org.postgresql.Driver
Repository/user=pentahodi
Repository/password=xxxxxx
Repository/url=jdbc:postgresql://localhost:5432/pentaho_db_repo
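The configuration above can be written from a script so the connection details are easy to change per environment. This is a sketch under assumptions: `write_jndi_config` is a hypothetical helper, and the file name `jdbc.properties` is an assumption, since the text above does not name the file.

```shell
#!/bin/sh
# Sketch: append the Repository JNDI entries to a properties file in the
# simple-jndi folder. The file name jdbc.properties is an assumption.
write_jndi_config() {
    jndi_dir=$1      # e.g. /pentaho/data-integration/simple-jndi
    db_user=$2
    db_password=$3
    db_url=$4        # e.g. jdbc:postgresql://localhost:5432/pentaho_db_repo
    jndi_file="$jndi_dir/jdbc.properties"

    mkdir -p "$jndi_dir" || return 1
    # Do not add the entries twice if the function is run again.
    if [ -f "$jndi_file" ] && grep -q '^Repository/type=' "$jndi_file"; then
        echo "Repository entries already present in $jndi_file" >&2
        return 0
    fi
    cat >> "$jndi_file" <<EOF
Repository/type=javax.sql.DataSource
Repository/driver=org.postgresql.Driver
Repository/user=$db_user
Repository/password=$db_password
Repository/url=$db_url
EOF
}
```

Appending instead of overwriting keeps any entries that are already in the file.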

7.2.   Configuring File Repository: Copy the repository.xml to the user .kettle folder (~/.kettle/) and change the base_directory as required

8.  Verify Repository Configuration

8.1.   Verify FileRepository:

Run the below commands and make sure that the kitchen and pan commands run without any errors
Commands
i.      sh kitchen.sh -rep="FileRepository" -dir="" -listjobs -level=Debug
ii.     sh pan.sh -rep="FileRepository" -dir="" -listtrans -level=Debug

8.2.   Verify DBRepository:

Run the below commands and make sure that the kitchen and pan commands run without any errors
Commands
i.      sh kitchen.sh -rep="DBRepository" -user=admin -pass=admin -dir="" -listjobs -level=Debug
ii.     sh pan.sh -rep="DBRepository" -user=admin -pass=admin -dir="" -listtrans -level=Debug
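Both verifications follow the same pattern, so they can be wrapped in one function that runs kitchen and pan and reports any non-zero exit code. This is a sketch only; `verify_repository` is a hypothetical helper, and it assumes the repository names and options used in this section.

```shell
#!/bin/sh
# Sketch: run the kitchen and pan listing commands against a repository
# and fail if either one returns a non-zero exit code. Hypothetical helper.
verify_repository() {
    pdi_dir=$1       # e.g. /pentaho/data-integration
    repo=$2          # e.g. FileRepository or DBRepository
    user=${3:-}      # only needed for the DB repository
    pass=${4:-}
    status=0
    for pair in "kitchen.sh:-listjobs" "pan.sh:-listtrans"; do
        script=${pair%%:*}
        list_flag=${pair#*:}
        if [ -n "$user" ]; then
            (cd "$pdi_dir" && sh "./$script" -rep="$repo" -user="$user" -pass="$pass" -dir="" "$list_flag" -level=Debug)
        else
            (cd "$pdi_dir" && sh "./$script" -rep="$repo" -dir="" "$list_flag" -level=Debug)
        fi
        rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "$script failed for repository $repo (exit code $rc)" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `verify_repository /pentaho/data-integration FileRepository` checks the file repository, and `verify_repository /pentaho/data-integration DBRepository admin admin` checks the DB repository.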
