Using Hive's UDF JAR with Impala

I tried to use UDFs from Hive's JAR with Impala, but a ClassCastException was thrown. I created a function reflect(string, string, string...) whose symbol is the class org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.

> create function reflect(string, string, string...) returns string location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect';
Query: create function reflect(string, string, string...) returns string location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect'
> select reflect('java.net.URLDecoder', 'decode', 'a', 'UTF-8') ;
Query: select reflect('java.net.URLDecoder', 'decode', 'a', 'UTF-8')
WARNINGS: ClassCastException: class org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect

The Impala daemon log shows the following:

I1112 15:42:13.068034  7056 UdfExecutor.java:415] Loading UDF 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect' from /var/lib/impala/udfs/hive-exec-1.1.0-cdh5.4.7.3221.0.jar
I1112 15:42:13.068990  7056 jni-util.cc:177] java.lang.ClassCastException: class org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect
at java.lang.Class.asSubclass(Class.java:3165)
at com.cloudera.impala.hive.executor.UdfExecutor.init(UdfExecutor.java:418)
at com.cloudera.impala.hive.executor.UdfExecutor.<init>(UdfExecutor.java:132)
I1112 15:42:13.071012  7056 status.cc:114] ClassCastException: class org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect

So I looked into com.cloudera.impala.hive.executor.UdfExecutor and found that the symbol class must extend org.apache.hadoop.hive.ql.exec.UDF. GenericUDFReflect extends GenericUDF, not UDF, so the asSubclass(UDF.class) call fails.

//
// From com.cloudera.impala.hive.executor.UdfExecutor
//
private void init(String jarPath, String udfPath,
    ColumnType retType, ColumnType... parameterTypes) throws
    ImpalaRuntimeException {
  ArrayList<String> signatures = Lists.newArrayList();
  try {
    LOG.debug("Loading UDF '" + udfPath + "' from " + jarPath);
    ClassLoader loader = getClassLoader(jarPath);
    Class<?> c = Class.forName(udfPath, true, loader);
    Class<? extends UDF> udfClass = c.asSubclass(UDF.class); // <--- ClassCastException is thrown here
    Constructor<? extends UDF> ctor = udfClass.getConstructor();
    ...
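
The failure mode can be reproduced without Hive or Impala. The following is a minimal sketch, not Impala's code: SimpleUdfBase and GenericUdfBase are hypothetical stand-ins for org.apache.hadoop.hive.ql.exec.UDF and GenericUDF, and LengthUdf, ReflectUdf, and canLoadAs are names invented for illustration. It only shows that Class.asSubclass throws ClassCastException when the loaded class does not descend from the requested base class.

```java
final class AsSubclassDemo {
    // Hypothetical stand-ins for Hive's UDF and GenericUDF base classes.
    // They are NOT the real Hive classes; only the inheritance shape matters here.
    static class SimpleUdfBase {}
    static abstract class GenericUdfBase {}

    // Plays the role of a class such as UDFLength (extends the simple base).
    static class LengthUdf extends SimpleUdfBase {
        public int evaluate(String s) { return s.length(); }
    }

    // Plays the role of GenericUDFReflect (extends the generic base instead).
    static class ReflectUdf extends GenericUdfBase {}

    // Loads a class by name and tries to narrow it to the given base type,
    // using the same Class.forName + asSubclass pattern as the excerpt above.
    static boolean canLoadAs(String className, Class<?> base) {
        try {
            Class<?> c = Class.forName(className);
            c.asSubclass(base); // throws ClassCastException if c is not a subtype of base
            return true;
        } catch (ClassCastException e) {
            return false;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("LengthUdf loads: "
            + canLoadAs(LengthUdf.class.getName(), SimpleUdfBase.class));
        System.out.println("ReflectUdf loads: "
            + canLoadAs(ReflectUdf.class.getName(), SimpleUdfBase.class));
    }
}
```

Running main prints "LengthUdf loads: true" and "ReflectUdf loads: false". The second case corresponds to what happens when GenericUDFReflect is given as the symbol.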

Instead of classes that extend GenericUDF, I used ones that extend UDF, and they appear to work well. Here are some examples.

> create function my_length(string) returns int location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFLength';
Query: create function my_length(string) returns int location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFLength'
Fetched 0 row(s) in 0.03s
> select my_length('abcde') ;
Query: select my_length('abcde')
+----------------------------+
| default.my_length('abcde') |
+----------------------------+
| 5                          |
+----------------------------+
Fetched 1 row(s) in 0.01s
> create function get_json_object(string, string) returns string location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFJson';
Query: create function get_json_object(string, string) returns string location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFJson'
Fetched 0 row(s) in 0.02s
> select get_json_object('{"a":"b"}', '$.a') ;
Query: select get_json_object('{"a":"b"}', '$.a')
+---------------------------------------------+
| default.get_json_object('{"a":"b"}', '$.a') |
+---------------------------------------------+
| b                                           |
+---------------------------------------------+
> create function my_base64(string) returns string location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFBase64';
Query: create function my_base64(string) returns string location '/tmp/hive-exec-1.1.0-cdh5.4.7.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFBase64'
Fetched 0 row(s) in 0.03s
> select my_base64('abcde') ;
Query: select my_base64('abcde')
+----------------------------+
| default.my_base64('abcde') |
+----------------------------+
| YWJjZGU=                   |
+----------------------------+
Fetched 1 row(s) in 0.15s

User-Defined Functions (UDFs)

byebyehaikikyou
