Error "org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector.changeCalendar(ZZ)V" while using Timestamp field in SQL query

Rajendra_Tamboli · June 15, 2022, 5:29am

Hi Team,
We are using Hopsworks 2.4 Community Edition and are trying to execute a SQL query that includes a timestamp attribute in order to create a feature, but it fails with the error below.

An error was encountered:
An error occurred while calling o180.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 765) (h02dyn.maersk.com executor 1): java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector.changeCalendar(ZZ)V

SQL
df = spark.sql("""select order_id,
order_year,
order_week,
Source_txn_commit,
source_txn_csn,
source_txn_rsn,
order_amount,
row_number() over (partition by order_id, order_year, order_week, order_amount order by Source_txn_commit desc, source_txn_csn desc, source_txn_rsn desc) as Rid
from order
where order_year >= '2018'
and COALESCE(Comments, 'Empty') NOT LIKE ('%sample request%')
group by 1,2,3,4,5,6,7
order by Source_txn_commit, source_txn_csn, source_txn_rsn desc""")

Fabio · June 16, 2022, 6:28pm

Hi @Rajendra_Tamboli,

In which format is the data stored? Could you also send the schema of the table you are trying to query?

--
Fabio

Rajendra_Tamboli · June 19, 2022, 7:43am

Hi Fabio,

The data is stored in ORC format in ADLS and fetched via a storage connector. Please find below the schema of the order table used in the SQL query: