Spark connector

Notifications

User guide:

Source code: starrocks-connector-for-apache-spark

Naming format of the JAR file: starrocks-spark-connector-${spark_version}_${scala_version}-${connector_version}.jar
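As a quick illustration of the naming format above, the following minimal Python sketch fills the three placeholders. The helper name `connector_jar_name` is hypothetical and not part of the connector. The sample values are taken from the version requirements listed on this page.

```python
def connector_jar_name(spark_version: str, scala_version: str, connector_version: str) -> str:
    """Build a JAR file name that follows the documented naming format.

    Hypothetical helper for illustration only; it is not shipped with the connector.
    """
    return (
        f"starrocks-spark-connector-{spark_version}"
        f"_{scala_version}-{connector_version}.jar"
    )


# Spark 3.2, Scala 2.12, connector 1.1.1
print(connector_jar_name("3.2", "2.12", "1.1.1"))
# prints: starrocks-spark-connector-3.2_2.12-1.1.1.jar
```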

Methods to obtain the JAR file:

Directly download the Spark connector JAR file from the Maven Central Repository.
Add the Spark connector as a dependency in your Maven project's pom.xml file and download it. For specific instructions, see the user guide.
Compile the source code into a Spark connector JAR file. For specific instructions, see the user guide.

Version requirements:

Spark connector	Spark	StarRocks	Java	Scala
1.1.1	3.2, 3.3, or 3.4	2.5 and later	8	2.12
1.1.0	3.2, 3.3, or 3.4	2.5 and later	8	2.12

1.1.1

This release mainly includes features and improvements for loading data into StarRocks.

NOTICE

Take note of some changes when you upgrade the Spark connector to this version. For details, see Upgrade Spark connector.

Features

Improvements

Remove unused dependencies to make the Spark connector JAR file more lightweight. #55 #57
Replace fastjson with jackson. #58
Add the missing Apache license header. #60
Do not package the MySQL JDBC driver in the Spark connector JAR file. #63
Support configuring the time zone parameter, and add compatibility with Spark's Java 8 datetime API. #64
Optimize row-string converter to reduce CPU costs. #68
The starrocks.fe.http.url parameter now accepts a value that includes an HTTP scheme. #71
Implement the interface BatchWrite#useCommitCoordinator so that the connector can run on Databricks 13.1. #79
Add a hint to the error log that prompts users to check privileges and parameters. #81
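To illustrate the starrocks.fe.http.url improvement in the list above, here is a minimal Python sketch that assembles connector options with the scheme included in the URL. Only starrocks.fe.http.url appears in these release notes. The other option keys, the host, the ports, the table name, and the credentials are assumptions for illustration and should be checked against the user guide. The helper `starrocks_options` is hypothetical.

```python
def starrocks_options(fe_http_url: str, fe_jdbc_url: str, table: str,
                      user: str, password: str) -> dict:
    """Collect connector options in a plain dict (hypothetical helper).

    Only "starrocks.fe.http.url" is named in the release notes; the other
    keys are assumptions to verify against the user guide.
    """
    return {
        "starrocks.fe.http.url": fe_http_url,
        "starrocks.fe.jdbc.url": fe_jdbc_url,
        "starrocks.table.identifier": table,
        "starrocks.user": user,
        "starrocks.password": password,
    }


options = starrocks_options(
    fe_http_url="http://127.0.0.1:8030",       # scheme included; host and port are placeholders
    fe_jdbc_url="jdbc:mysql://127.0.0.1:9030",  # placeholder
    table="test.score_board",                   # placeholder database.table
    user="root",                                # placeholder
    password="",                                # placeholder
)

# With a SparkSession and a DataFrame `df` available, the options could be
# passed to the writer roughly like this (not executed here):
# df.write.format("starrocks").options(**options).mode("append").save()
```

Keeping the options in a plain dict makes it easy to inspect the values before handing them to Spark.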

Bug fixes

Parse escape characters in the CSV-related parameters column_separator and row_delimiter. #85
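To show what parsing escape characters means for the CSV separator and delimiter parameters, the following Python sketch turns a configured string such as `\x01` or `\t` into the actual character. This is not the connector's implementation. It is a small stand-in that uses Python's built-in `unicode_escape` codec, and the helper name `decode_escapes` is hypothetical.

```python
def decode_escapes(raw: str) -> str:
    """Interpret backslash escapes in a configured separator string.

    Illustrative only: relies on Python's "unicode_escape" codec and
    assumes the input is plain ASCII.
    """
    return raw.encode("ascii").decode("unicode_escape")


# The configured value is the four characters: backslash, x, 0, 1
separator = decode_escapes("\\x01")
print(len(separator))  # prints: 1

# A value without escapes is returned unchanged
print(decode_escapes(","))  # prints: ,
```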

Doc

Refactor the docs. #66
Add examples of loading data into BITMAP and HLL columns. #70
Add examples of Spark applications written in Python. #72
Add examples of loading ARRAY-type data. #75
Add examples for performing partial updates and conditional updates on Primary Key tables. #80

1.1.0

Features
