One of the transformations performed converts two fields (epoch seconds and nanos) into a single timestamp field and casts it to the Spark logical type TimestampType.
The resulting Avro files are then uploaded to Redshift, where the corresponding column has the type TIMESTAMP.
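A minimal sketch of what this conversion can look like (the column names epochSeconds and nanos and the helper itself are illustrative assumptions, not the actual job code):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.TimestampType

// Combine epoch seconds and nanos into fractional seconds; casting a numeric
// value to TimestampType interprets it as seconds since the epoch.
def toTimestamp(df: DataFrame): DataFrame =
  df.withColumn("timestamp", (col("epochSeconds") + col("nanos") / 1e9).cast(TimestampType))
    .drop("epochSeconds", "nanos")
```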
Problem
During the upgrade of Spark from 2.2.0 to 2.4.4, uploading to Redshift stopped working.
What changed: in version 2.4.0, built-in Avro support was added to Spark (SPARK-24768), replacing the databricks spark-avro library. As a result, the new version stores a timestamp field as a long in microseconds, whereas the old version stored it in milliseconds. If the Avro files are processed only by Spark (old or new version), the change is invisible. However, if you read the data with a vanilla Avro library, the value will be 1000 times larger than expected.
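For a concrete instant, 2019-01-01T00:00:00Z, the long stored in the Avro file differs by a factor of 1000:

```scala
import java.time.Instant

val instant = Instant.parse("2019-01-01T00:00:00Z")
val millis = instant.toEpochMilli // 1546300800000     -- Spark 2.2 + databricks spark-avro
val micros = millis * 1000L       // 1546300800000000  -- Spark 2.4 built-in Avro support
```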
How to reproduce
Avro schema
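The input schema can be sketched as follows (the record and field names are assumptions for this example; a real schema would typically contain more fields):

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "epochSeconds", "type": "long"},
    {"name": "nanos", "type": "long"}
  ]
}
```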
For Spark 2.2, with the dependency com.databricks:spark-avro_2.11:4.0.0:
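A reduced sketch of the Spark 2.2 reproduction (the column names, output path and expected value are assumptions; the idea is to cast to TimestampType, write with the databricks library, and then re-read the file with the plain Avro library):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.TimestampType

val spark = SparkSession.builder().master("local[*]").appName("avro-timestamp-repro").getOrCreate()
import spark.implicits._

// Input matching the schema above: epoch seconds + nanos.
val df = Seq((1546300800L, 0L)).toDF("epochSeconds", "nanos")

// Cast to the Spark logical type TimestampType and write with the databricks library.
df.withColumn("epochSeconds_casted", col("epochSeconds").cast(TimestampType))
  .write
  .format("com.databricks.spark.avro")
  .save("/tmp/avro-timestamp-test")

// Read the written file back with the vanilla Avro library and inspect the raw long.
import java.io.File
import org.apache.avro.file.DataFileReader
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}

val avroFile = new File("/tmp/avro-timestamp-test")
  .listFiles()
  .filter(_.getName.endsWith(".avro"))
  .head
val reader = new DataFileReader[GenericRecord](avroFile, new GenericDatumReader[GenericRecord]())
val record = reader.next()

// With spark-avro 4.0.0 the timestamp is stored as a long in milliseconds.
assert(record.get("epochSeconds_casted").asInstanceOf[Long] == 1546300800000L)
```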
For Spark 2.4.0, only small changes to the code are required; the databricks library needs to be replaced with org.apache.spark:spark-avro_2.11:2.4.0:
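With the assumptions above, the only change to the write step is the format name (the rest of the snippet stays the same):

```scala
// Spark 2.4: the built-in Avro source is addressed by the short name "avro".
df.withColumn("epochSeconds_casted", col("epochSeconds").cast(TimestampType))
  .write
  .format("avro")
  .save("/tmp/avro-timestamp-test")
```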
After this change the last assertion fails, because the epochSeconds_casted column now holds a value in microseconds (not in milliseconds as before). The schema after the version upgrade also changes for the casted column.
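The relevant field declaration can be sketched roughly as follows (assuming Spark's default mapping of TimestampType to a long with the timestamp-micros logical type; the exact nullability wrapping may differ):

```json
{
  "name": "epochSeconds_casted",
  "type": {"type": "long", "logicalType": "timestamp-micros"}
}
```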
Workaround
We removed the cast to Timestamp for Avro output, but kept the cast to Timestamp for Parquet files.
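A minimal sketch of what this amounts to (the helper, its format parameter and the column names are assumptions about how such a switch could look):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.TimestampType

// For Avro output, keep the numeric epoch value as-is; cast to TimestampType only for Parquet.
def prepareForOutput(df: DataFrame, format: String): DataFrame =
  format match {
    case "avro"    => df
    case "parquet" => df.withColumn("epochSeconds_casted", col("epochSeconds").cast(TimestampType))
    case other     => throw new IllegalArgumentException(s"Unsupported format: $other")
  }
```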