Unexpected methods referenced due to dependency conflicts on org.apache.poi:poi

Description

Hi, in kylo-0.10.0 (module kylo-0.10.0\core\file-metadata\file-metadata-core), there are multiple versions of org.apache.poi:poi:jar. However, under Maven's dependency management strategy, only org.apache.poi:poi:jar:3.15 is loaded, and org.apache.poi:poi:jar:3.17 is shadowed.

As shown in the dependency tree at the end of this report, org.apache.tika:tika-parsers:jar:1.18 expects to reference org.apache.poi:poi:jar:3.17. Because of the dependency conflict, Maven actually loads org.apache.poi:poi:jar:3.15. As a result, org.apache.tika:tika-parsers:jar:1.18 invokes methods from the unexpected version org.apache.poi:poi:jar:3.15, which may cause inconsistent semantic behavior.

For instance, the method <com.thinkbiganalytics.kylo.tika.detector.CSVDetector: org.apache.tika.mime.MediaType detect(java.io.InputStream,org.apache.tika.metadata.Metadata)> actually reaches the method <org.apache.poi.poifs.macros.VBAMacroReader: protected void readMacros(DirectoryNode macroDir, ModuleMap modules)> in the unexpected version org.apache.poi:poi:jar:3.15 via the following invocation path:

<com.thinkbiganalytics.kylo.tika.detector.CSVDetector: org.apache.tika.mime.MediaType detect(java.io.InputStream,org.apache.tika.metadata.Metadata)> D:\testcase\NewProject3\kylo-0.10.0\core\file-metadata\file-metadata-core\target\classes
<com.thinkbiganalytics.file.parsers.util.ParserUtil: void <clinit>()> D:\cEnvironment\repository\com\thinkbiganalytics\kylo\kylo-file-metadata-util\0.10.0\kylo-file-metadata-util-0.10.0.jar
<org.slf4j.LoggerFactory: org.slf4j.Logger getLogger(java.lang.Class)> D:\cEnvironment\repository\org\slf4j\slf4j-api\1.7.12\slf4j-api-1.7.12.jar
<org.slf4j.LoggerFactory: org.slf4j.Logger getLogger(java.lang.String)> D:\cEnvironment\repository\org\slf4j\slf4j-api\1.7.12\slf4j-api-1.7.12.jar
<org.slf4j.impl.Log4jLoggerFactory: org.slf4j.Logger getLogger(java.lang.String)> D:\cEnvironment\repository\org\slf4j\slf4j-log4j12\1.7.10\slf4j-log4j12-1.7.10.jar
<org.apache.log4j.LogManager: void <clinit>()> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.helpers.OptionConverter: void selectAndConfigure(java.net.URL,java.lang.String,org.apache.log4j.spi.LoggerRepository)> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.PropertyConfigurator: void doConfigure(java.net.URL,org.apache.log4j.spi.LoggerRepository)> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.PropertyConfigurator: void doConfigure(java.util.Properties,org.apache.log4j.spi.LoggerRepository)> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.PropertyConfigurator: void parseCatsAndRenderers(java.util.Properties,org.apache.log4j.spi.LoggerRepository)> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.config.PropertySetter: void setProperties(java.util.Properties,java.lang.String)> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.config.PropertySetter: void activate()> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.varia.ExternallyRolledFileAppender: void activateOptions()> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<org.apache.log4j.varia.HUP: void run()> D:\cEnvironment\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar
<junit.extensions.ActiveTestSuite$1: void run()> D:\cEnvironment\repository\junit\junit\4.12\junit-4.12.jar
<junit.framework.JUnit4TestAdapter: void run(junit.framework.TestResult)> D:\cEnvironment\repository\junit\junit\4.12\junit-4.12.jar
<org.junit.internal.runners.JUnit4ClassRunner: void run(org.junit.runner.notification.RunNotifier)> D:\cEnvironment\repository\junit\junit\4.12\junit-4.12.jar
<org.junit.internal.runners.ClassRoadie: void runProtected()> D:\cEnvironment\repository\junit\junit\4.12\junit-4.12.jar
<org.junit.internal.runners.ClassRoadie: void runUnprotected()> D:\cEnvironment\repository\junit\junit\4.12\junit-4.12.jar
<org.apache.tika.parser.ParsingReader$ParsingTask: void run()> D:\cEnvironment\repository\org\apache\tika\tika-core\1.14\tika-core-1.14.jar
<org.apache.tika.parser.microsoft.OfficeParser: void parse(java.io.InputStream,org.xml.sax.ContentHandler,org.apache.tika.metadata.Metadata,org.apache.tika.parser.ParseContext)> D:\cEnvironment\repository\org\apache\tika\tika-parsers\1.18\tika-parsers-1.18.jar
<org.apache.tika.parser.microsoft.OfficeParser: void extractMacros(org.apache.poi.poifs.filesystem.NPOIFSFileSystem,org.xml.sax.ContentHandler,org.apache.tika.extractor.EmbeddedDocumentExtractor)> D:\cEnvironment\repository\org\apache\tika\tika-parsers\1.18\tika-parsers-1.18.jar
<org.apache.poi.poifs.macros.VBAMacroReader: public Map<String, String> readMacros()>
<org.apache.poi.poifs.macros.VBAMacroReader: protected void findMacros(DirectoryNode dir, ModuleMap modules)>
<org.apache.poi.poifs.macros.VBAMacroReader: protected void readMacros(DirectoryNode macroDir, ModuleMap modules)>

Further analysis shows that the expected callee <org.apache.poi.poifs.macros.VBAMacroReader: protected void readMacros(DirectoryNode macroDir, ModuleMap modules)> has a different implementation from the actual callee with the same signature (same method name, same parameters) in the unexpected (but actually loaded) version org.apache.poi:poi:jar:3.15, which leads to different behavior.
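
To confirm at runtime which jar a class is actually loaded from, a small check along the following lines can help. This is only a sketch: the class name JarLocator and its locate method are hypothetical helpers, not part of Kylo, Tika, or POI, and it assumes it is run with the same classpath as file-metadata-core.

```java
import java.security.CodeSource;

public class JarLocator {

    /**
     * Returns the location (jar or directory) a class was loaded from.
     * Hypothetical helper for illustration only.
     */
    public static String locate(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> cls = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return src == null ? "<bootstrap or unknown>" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not on classpath>";
        }
    }

    public static void main(String[] args) {
        // Defaults to the POI class named in this report.
        String name = args.length > 0 ? args[0] : "org.apache.poi.poifs.macros.VBAMacroReader";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the printed path ends in poi-3.15.jar while tika-parsers is at 1.18, the conflict described above is present on that classpath.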

Solution:
Use the newer version org.apache.poi:poi:jar:3.17 in the project's dependency management so that the resolved version matches the one org.apache.tika:tika-parsers:jar:1.18 expects.
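
A possible shape for this change is sketched below. It assumes the 3.15 version is pinned through a dependencyManagement entry in one of the parent POMs (the tree reports "version managed from 3.17"). Which POM holds that entry, and whether the version is written inline or through a property, is not confirmed here.

```xml
<!-- Sketch only: the location of this entry in the Kylo POM hierarchy is an assumption. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
      <!-- Align with the version tika-parsers 1.18 was built against. -->
      <version>3.17</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

After the change, re-running mvn dependency:tree -Dverbose on kylo-file-metadata-core should show org.apache.poi:poi:jar:3.17 without a "version managed" note. Other modules that rely on the managed POI version would need to be checked as well.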

Thanks!
Best regards,
Coco

Dependency tree:

[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ kylo-file-metadata-core ---
[INFO] com.thinkbiganalytics.kylo:kylo-file-metadata-core:jar:0.10.0
[INFO] +- com.thinkbiganalytics.kylo:kylo-file-metadata-model:jar:0.10.0:compile
[INFO] |  +- (javax.inject:javax.inject:jar:1:compile - omitted for duplicate)
[INFO] |  +- (org.slf4j:slf4j-api:jar:1.7.12:compile - version managed from 1.7.10; omitted for duplicate)
[INFO] |  \- (org.slf4j:slf4j-log4j12:jar:1.7.10:compile - omitted for duplicate)
[INFO] +- com.thinkbiganalytics.kylo:kylo-file-metadata-util:jar:0.10.0:compile
[INFO] |  +- (com.thinkbiganalytics.kylo:kylo-file-metadata-model:jar:0.10.0:compile - omitted for duplicate)
[INFO] |  +- org.apache.commons:commons-csv:jar:1.4:compile
[INFO] |  +- (org.apache.commons:commons-lang3:jar:3.7:compile - omitted for duplicate)
[INFO] |  +- (commons-io:commons-io:jar:2.5:compile - omitted for duplicate)
[INFO] |  +- (org.slf4j:slf4j-ext:jar:1.7.12:compile - omitted for duplicate)
[INFO] |  \- (javax.inject:javax.inject:jar:1:compile - omitted for duplicate)
[INFO] +- org.apache.commons:commons-lang3:jar:3.7:compile
[INFO] +- commons-io:commons-io:jar:2.5:compile
[INFO] +- com.fasterxml.jackson.core:jackson-databind:jar:2.9.6:compile
[INFO] |  +- (com.fasterxml.jackson.core:jackson-annotations:jar:2.9.6:compile - version managed from 2.9.0; omitted for duplicate)
[INFO] |  \- com.fasterxml.jackson.core:jackson-core:jar:2.9.6:compile
[INFO] +- com.fasterxml.jackson.datatype:jackson-datatype-joda:jar:2.9.6:compile
[INFO] |  +- (com.fasterxml.jackson.core:jackson-annotations:jar:2.9.6:compile - version managed from 2.9.0; omitted for duplicate)
[INFO] |  +- (com.fasterxml.jackson.core:jackson-core:jar:2.9.6:compile - omitted for duplicate)
[INFO] |  +- (com.fasterxml.jackson.core:jackson-databind:jar:2.9.6:compile - omitted for duplicate)
[INFO] |  \- joda-time:joda-time:jar:2.9.2:compile (version managed from 2.7)
[INFO] +- com.fasterxml.jackson.core:jackson-annotations:jar:2.9.6:compile
[INFO] +- org.apache.tika:tika-core:jar:1.18:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.7.12:provided (scope not updated to compile)
[INFO] +- org.slf4j:slf4j-ext:jar:1.7.12:compile
[INFO] |  +- (org.slf4j:slf4j-api:jar:1.7.12:compile - version managed from 1.7.10; omitted for duplicate)
[INFO] |  \- ch.qos.cal10n:cal10n-api:jar:0.8.1:compile
[INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.10:provided (scope not updated to compile)
[INFO] |  +- (org.slf4j:slf4j-api:jar:1.7.12:provided - version managed from 1.7.10; omitted for duplicate)
[INFO] |  \- log4j:log4j:jar:1.2.17:provided
[INFO] +- org.apache.tika:tika-parsers:jar:1.18:compile
[INFO] |  +- (org.apache.tika:tika-core:jar:1.14:compile - version managed from 1.18; omitted for conflict with 1.18)
[INFO] |  +- org.apache.poi:poi:jar:3.15:compile (version managed from 3.17)
[INFO] |  |  \- org.apache.commons:commons-collections4:jar:4.1:compile
[INFO] |  \- com.googlecode.juniversalchardet:juniversalchardet:jar:1.0.3:compile
[INFO] +- javax.inject:javax.inject:jar:1:compile
[INFO] \- junit:junit:jar:4.12:test
[INFO]    \- org.hamcrest:hamcrest-core:jar:1.3:test

Environment: None
Assignee: Unassigned
Reporter: Hello Coco
Reviewer: None
Story point estimate: None
Affects versions: 0.10.0
Priority: Medium