CreateElasticsearchBackedHiveTable must specify LOCATION in CREATE TABLE

Description

CreateElasticsearchBackedHiveTable currently does not specify a LOCATION when it creates the table. Because creating a category also does not specify a LOCATION in its CREATE DATABASE command, Hive falls back to a default location, e.g. 'hdfs://<namenode_hostname>:8020/user/hive/warehouse/<categoryname>.db/<feedname>'. The default location embeds the namenode hostname, so with the S3 template it becomes a problem if the HDFS namenode is renamed, or on EMR, where nodes are redefined with each new cluster.
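To illustrate the requested behavior, here is a minimal sketch of a CREATE TABLE statement that carries an explicit LOCATION. It is not the statement CreateElasticsearchBackedHiveTable actually emits, which this issue does not show. The helper name `es_table_ddl`, the bucket `example-bucket`, the table and column names, and the `es.resource` value are all hypothetical; the storage handler class is the one shipped with elasticsearch-hadoop.

```python
def es_table_ddl(category, feed, columns, warehouse_root, es_resource):
    """Build a CREATE EXTERNAL TABLE statement with an explicit LOCATION.

    Hypothetical helper: warehouse_root is assumed to be an S3 URI that
    outlives any single cluster, so the table path never embeds a
    namenode hostname.
    """
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    location = f"{warehouse_root.rstrip('/')}/{category}.db/{feed}"
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{category}`.`{feed}` ({cols}) "
        "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' "
        f"LOCATION '{location}' "
        f"TBLPROPERTIES ('es.resource' = '{es_resource}')"
    )


# Example values are placeholders, not taken from the issue.
ddl = es_table_ddl(
    category="sales",
    feed="orders_index",
    columns=[("id", "string"), ("amount", "double")],
    warehouse_root="s3a://example-bucket/hive/warehouse/",
    es_resource="sales/orders",
)
print(ddl)
```

With a LOCATION like this, the table metadata stored in the metastore no longer points at 'hdfs://<namenode_hostname>:8020/...', which is the part that goes stale when the cluster is replaced.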

On EMR, the following steps reproduce the error shown below:
1. Create a Category
2. Create a Feed for that category
3. Run a feed input through the feed so that the one-time initialization section of the reusable flow is performed.
4. Stop the EMR cluster, start a new EMR cluster, and reconfigure to use the new cluster
5. Create a new Feed using the above category

The following error will appear in the NiFi logs and in hiveserver2.log on the cluster:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.net.NoRouteToHostException No Route to Host from ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal/XXX.XXX.XXX.XXX to ip-YYY-YYY-YYY-YYY.us-west-2.compute.internal:8020 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost)

Workaround: It may be possible to drop the database created for the category and recreate it with an S3 location before creating any feeds for the category.
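The workaround could look roughly like the sketch below, which only builds the two HiveQL statements and leaves running them (for example through beeline) to the operator. This is untested against the setup described in this issue. The helper name `recreate_category_db`, the bucket `example-bucket`, and the `<category>.db` path layout are assumptions chosen to mirror Hive's default warehouse naming.

```python
def recreate_category_db(category, s3_root):
    """Return HiveQL that drops a category database and recreates it on S3.

    Hypothetical helper. RESTRICT makes the DROP fail if the database
    still contains tables, which matches the precondition that no feeds
    have been created for the category yet.
    """
    location = f"{s3_root.rstrip('/')}/{category}.db"
    return [
        f"DROP DATABASE IF EXISTS `{category}` RESTRICT",
        f"CREATE DATABASE `{category}` LOCATION '{location}'",
    ]


# Placeholder values, not taken from the issue.
for statement in recreate_category_db("sales", "s3a://example-bucket/hive/warehouse"):
    print(statement + ";")
```

Tables created later in this database without their own LOCATION should then default to paths under the S3 location rather than under the namenode-specific HDFS warehouse directory.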

Environment

None

Status

Assignee

Tim Harsch

Reporter

Tim Harsch

Labels

Reviewer

None

Story point estimate

None

Epic Link

Components

Sprint

None

Fix versions

Affects versions

0.8.2

Priority

Highest