How to uninstall Amazon Linux AMI’s Pre-Installed Java JDK and Yum Install a Custom JDK

Amazon Linux (Amazon Linux AMI 2013.03.1) comes pre-installed with Java OpenJDK version 6. If you need Java version 7, you will need to install it yourself.

1. Uninstall the default OpenJDK 6 JRE

sudo yum remove java-1.6.0-openjdk

2. Install OpenJDK 7 JRE as per http://openjdk.java.net/install/

sudo yum install java-1.7.0-openjdk

3. Optionally, also install the OpenJDK 7 JDK (which is necessary for Tomcat to compile your JSPs into Servlets)

sudo yum install java-1.7.0-openjdk-devel

4. When you are done, you can verify the installed Java version with:

java -version
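
If more than one Java ends up installed side by side (for example if you skipped step 1), you can switch the system default with the alternatives tool that RPM-based distributions such as Amazon Linux provide; a minimal sketch:

# lists the installed java binaries and lets you pick the default interactively
sudo alternatives --config java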

Resolving a dependency directly from a project folder (Maven)

Disclaimer: First of all, I do not recommend bypassing an artifact repository manager; resolving dependencies from a repository is the way Maven is meant to work. But to refute the claim that this feature is unsupported in Maven while being easily done in ANT / Gradle / Ivy (a claim that might push you away from Maven, the wonderful build manager), this article shows how it can be done. Whether it should be done this way is a call for your own judgement and use case.


Usually we would put such a snippet (or similar) into the project object model (pom) XML to get Maven to download a JAR for us:

<dependency>
 <groupId>org.apache.commons</groupId>
 <artifactId>commons-lang3</artifactId>
 <version>3.1</version>
</dependency>

This triggers Maven to look for the mentioned JAR first in the local repository, then, if not found, in the central Maven repository, and if still not found, in any custom repositories defined.

Forcing Maven to resolve the dependency from within the project folder itself, for example from a /lib folder, takes just two additional lines of configuration:

<dependency>
 <groupId>org.apache.commons</groupId>
 <artifactId>commons-lang3</artifactId>
 <version>3.1</version>
 <scope>system</scope>
 <systemPath>${basedir}/src/main/resources/lib/commons-lang3-3.1.jar</systemPath>
</dependency>
<scope>system</scope> tells Maven that this particular dependency may not be a Maven-built JAR and is not located in a Maven-style repository.

<systemPath> tells Maven where to locate the JAR file.

${basedir} refers to the project's base directory (the directory containing the pom.xml). Because it is a built-in Maven property, it is portable between Linux and Windows. This is better than the relative directory ../../lib or hard-coded directory C:/location/lib or /home/myname/lib you might use in ANT or Gradle.

Next, just place the custom JAR (this could be a JAR built using ANT, a binary downloaded from the web, or any legacy JAR) into the project's /lib folder:
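
From a shell, this could look roughly like the following (the source path of the JAR is purely illustrative; use wherever your JAR actually lives):

# run from the project root, i.e. the directory containing pom.xml
mkdir -p src/main/resources/lib
cp ~/downloads/commons-lang3-3.1.jar src/main/resources/lib/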

As soon as this JAR is placed into the /lib folder, Maven picks it up as a dependency (in this example, C:\Users\Chin Boon\workspace\self-container-webapp\src\main\resources\lib).

Such a Maven project can be checked into SVN as a regular project, and when checked out by others (or exported as a project), its dependency resolves with no extra work such as re-installing the JAR or making Maven go to the central repository to get it.


Copying DynamoDB Table Contents Across Regions

There is no direct way of copying data of a DynamoDB table across regions (i.e. from one region to another) from the DynamoDB management console.

Why would you need such an operation, i.e. moving data across regions? One answer: sometimes you need to set up multiple environments, one for each development phase, for example development, SIT, UAT, pre-production, performance test and so on.

I will be deploying a solution which requires Amazon EMR (Elastic MapReduce). EMR is essentially the service recommended by Amazon for moving data between DynamoDB and S3, where it can then be picked up for further processing or use.

High Level Overview of the solution requiring EMR

  1. Start an EMR job from the EMR Management Console, SSH into the EMR Master Node, and start Hive.
  2. Set the DynamoDB endpoint to the source region.
  3. Set up a Hive table that references the DynamoDB table in the source region.
  4. Create an S3 bucket to store temporary data.
  5. Set up a Hive table that references the S3 location to which data from the DynamoDB table in the source region will be written.
  6. Issue the Hive command that does the actual copying from the DynamoDB table in the source region to the S3 location.
  7. Set the DynamoDB endpoint to the destination region.
  8. Issue the Hive command that does the actual copying from the S3 location to the DynamoDB table in the destination region.

Step 1: Start an EMR job from the EMR Management Console, SSH into the EMR Master Node, start Hive.

This step is well detailed in Amazon EMR's documentation at http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMRforDynamoDB.html, therefore I will not reiterate it here.

Follow these steps in sequence:

Step 1.1 (Create a Key pair): http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMR_SetUp_KeyPair.html

Step 1.2 (Create a Cluster): http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMR_CreateJobFlow.html

Step 1.3 (SSH into the Master Node): http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMR_SetUp_SSH.html

Once the above three steps are completed, you will have SSH'd into the EMR Master Node and started Hive.
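
For example, the connection could look roughly like this (the key file and host name are illustrative; use your own key pair and the Master Public DNS shown on your cluster's details page):

# SSH into the EMR master node (EMR clusters use the hadoop user)
ssh -i ~/mykeypair.pem hadoop@ec2-xx-xx-xx-xx.ap-southeast-1.compute.amazonaws.com

# once on the master node, start the Hive shell
hive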


Step 2: Set the DynamoDB endpoint to the source region

Now you will need to set the DynamoDB endpoint to the source region; this is the region where your source table resides. In this example, I am setting the source region endpoint to Singapore (ap-southeast-1).

SET dynamodb.endpoint=dynamodb.ap-southeast-1.amazonaws.com;


Step 3: Set up a Hive table that references the DynamoDB table in the source region

In the earlier step I pointed Hive to DynamoDB in the source region; I shall now create an external table in Hive that references the DynamoDB table.

CREATE EXTERNAL TABLE hive_dynamodb_customer_lines (cust_id string, lines bigint)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' 
TBLPROPERTIES ("dynamodb.table.name" = "customer_lines", 
"dynamodb.column.mapping" = "cust_id:cust_id,lines:lines");

This script creates an external table named ‘hive_dynamodb_customer_lines’; this Hive table references the DynamoDB table ‘customer_lines’ containing the actual data. The table ‘customer_lines’ has the attributes cust_id and lines.

Step 4: Create an S3 bucket to store temporary data

In this step I am creating an S3 bucket to store temporary data. This bucket and its contents can be deleted at the end of the data migration.

Create a bucket in S3; in this example, I have created a bucket named ‘dynamodb-migration’. You can name yours anything, subject to availability (i.e. no two S3 buckets can have the same name).


In the newly created S3 Bucket, create a folder named ‘temp’.
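
If you prefer the command line over the console, a rough AWS CLI equivalent for creating the bucket and the temp folder is sketched below (this assumes the AWS CLI is installed and configured with credentials; the bucket name is the one used in this example):

# create the bucket
aws s3 mb s3://dynamodb-migration

# create an empty temp/ key so the folder shows up in the console
aws s3api put-object --bucket dynamodb-migration --key temp/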


Step 5: Set up a Hive table that references the S3 location to which data from the DynamoDB table in the source region will be written

I now need to create a Hive table that references the newly created S3 location into which I will dump the data from the DynamoDB table.

CREATE EXTERNAL TABLE hive_s3_customer_lines (cust_id string, lines bigint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://dynamodb-migration/temp';

Step 6: Issue the Hive command that does the actual copying from the DynamoDB table in the source region to the S3 location

In this step I will issue the Hive command that copies the data from the DynamoDB table to S3.

INSERT OVERWRITE TABLE hive_s3_customer_lines SELECT * 
FROM hive_dynamodb_customer_lines;

Once this step executes, data from the DynamoDB table will be copied over into S3. You can go into your S3 bucket to see the resulting file(s).
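
You could also list the exported files from the command line with the AWS CLI (assuming it is installed and configured):

# list the files Hive wrote into the temp folder
aws s3 ls s3://dynamodb-migration/temp/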


Step 7: Set the DynamoDB endpoint to the destination region.

In this step I am setting the DynamoDB endpoint to the destination region, the region into which I want the contents of the DynamoDB table copied. In this example, my destination DynamoDB region is Ireland (eu-west-1).

SET dynamodb.endpoint=dynamodb.eu-west-1.amazonaws.com;

Step 8: Issue the Hive command that does the actual copying from the S3 location to the DynamoDB table in the destination region.

In this step I am issuing the Hive command that performs the actual copy from S3 to the DynamoDB table in the destination region. Note that the destination table ‘customer_lines’ must already exist in the destination region (with the same key schema) before this command is run; Hive will not create it for you.

INSERT OVERWRITE TABLE hive_dynamodb_customer_lines SELECT * FROM hive_s3_customer_lines;

When completed, you can go to the DynamoDB table in the destination region to verify that your data was copied successfully.
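
One quick sanity check is to compare item counts between the source and destination tables, for example via the AWS CLI (this performs a full scan, so use it sparingly on large tables; the table and region names are the ones from this example):

# count items in the source table (Singapore)
aws dynamodb scan --table-name customer_lines --select COUNT --region ap-southeast-1

# count items in the destination table (Ireland)
aws dynamodb scan --table-name customer_lines --select COUNT --region eu-west-1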

 

At the end of the exercise, remember to terminate the EMR cluster created in Step 1 and delete the S3 bucket created in Step 4, so that you stop incurring AWS charges.
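
A rough command-line sketch of the cleanup (the cluster id j-XXXXXXXXXXXXX is a placeholder for your own cluster id; terminating the cluster from the EMR console works just as well):

# remove the temporary bucket and everything in it
aws s3 rb s3://dynamodb-migration --force

# terminate the EMR cluster
aws emr terminate-clusters --cluster-ids j-XXXXXXXXXXXXX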
