Verify file ingestion (AVRO format)

Description

Verify that AVRO format files can be ingested.

Activity

Jagrut Sharma
April 15, 2017, 11:30 PM

Verified ingestion of Avro-formatted files.
Mixed-case field names work. I think you may have hit an earlier bug related to mixed case; it was closed some time back.

Calvin Pietersen
April 4, 2017, 9:30 AM

Added a fix to schema discovery for Avro files. https://github.com/Teradata/kylo/pull/17

Calvin Pietersen
April 4, 2017, 4:44 AM

There appear to be issues when choosing an Avro sample file while defining a feed.

From the code, it looks like we generate a script that runs on the Spark shell service: https://github.com/Teradata/kylo/blob/f66903fe61e5f968856f8e159b50e190de4aa5ca/plugins/schema-discovery-default/src/main/java/com/thinkbiganalytics/discovery/parsers/hadoop/SparkFileSchemaParserService.java

The script:

Since the script calls sqlContext.read.avro without importing any Avro libraries, it should fail. However, SparkFileSchemaParserService.java captures no logs, which makes this difficult to confirm.
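A minimal sketch of the two points above, assuming the Databricks spark-avro package is on the Spark shell classpath (its `import com.databricks.spark.avro._` is what adds `avro` to `sqlContext.read`). The class `AvroSchemaScriptSketch`, the `ScriptRunner` interface, and both method names are hypothetical and are not taken from the Kylo code base or from the fix in the pull request; they only illustrate prepending the import to the generated script and logging a failure before rethrowing it.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: none of these names come from Kylo.
final class AvroSchemaScriptSketch {
    private static final Logger LOG =
            Logger.getLogger(AvroSchemaScriptSketch.class.getName());

    /** Stand-in for whatever submits the script to the Spark shell service. */
    interface ScriptRunner {
        String run(String script);
    }

    /** Builds a Scala snippet that imports spark-avro before calling read.avro. */
    static String buildScript(String path) {
        // Escape backslashes and double quotes so the path is a valid Scala string literal.
        String escaped = path.replace("\\", "\\\\").replace("\"", "\\\"");
        return "import com.databricks.spark.avro._\n"
                + "sqlContext.read.avro(\"" + escaped + "\")\n";
    }

    /** Runs the script and logs any failure with its stack trace before rethrowing. */
    static String runWithLogging(ScriptRunner runner, String path) {
        String script = buildScript(path);
        LOG.log(Level.FINE, "Submitting schema discovery script:\n{0}", script);
        try {
            return runner.run(script);
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Schema discovery script failed for " + path, e);
            throw e;
        }
    }
}
```

With the failure logged at the point of submission, a missing import or a missing jar would show up in the service log instead of failing silently.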

Douglas Moore
March 20, 2017, 2:11 PM

Try field names with mixed case as part of your test. I think we saw failures with mixed-case field names when they were processed by Spark (1.5.2).
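One way such a test could check the result is sketched below. The thread does not say how the mixed-case failures showed up, so this assumes one possible symptom: a field name coming back with different casing than the Avro schema declared. The class `MixedCaseFieldCheck`, its method, and the sample field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical test helper: not part of Kylo or Spark.
final class MixedCaseFieldCheck {
    /**
     * Returns one entry per expected field whose exact spelling is missing
     * from the discovered names but which matches when case is ignored.
     */
    static List<String> findCaseMismatches(List<String> expected, List<String> discovered) {
        List<String> mismatches = new ArrayList<>();
        for (String name : expected) {
            if (discovered.contains(name)) {
                continue; // exact match, casing preserved
            }
            for (String candidate : discovered) {
                if (candidate.equalsIgnoreCase(name)) {
                    mismatches.add(name + " -> " + candidate);
                    break;
                }
            }
        }
        return mismatches;
    }
}
```

An empty result means every expected field that was found kept its original casing; fields missing altogether are not reported by this helper and would need a separate check.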

Done

Assignee

Jagrut Sharma

Reporter

Jagrut Sharma

Labels

None

Reviewer

None

Components

Sprint

None

Fix versions

Priority

Medium