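Initial Setup
Set the table name, the base path, and a data generator that produces the records used in the examples. The code is as follows: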
// spark-shell
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceReadOptions._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._

// Table name and storage location of the Hudi table
val tableName = "hudi_trips_cow"
val basePath = "hdfs://xueai8:8020/hudi/hudi_trips_cow"

// Generator for sample trip records
val dataGen = new DataGenerator

// Print a sample of the generated JSON records
convertToStringList(dataGen.generateInserts(2)).foreach(println)
Running the code above prints two JSON records produced by Hudi's data generator. The records have the following format:
{"ts": 1647196090688, "uuid": "421b6078-2d2c-4e23-a6f5-b64713bdf81d", "rider": "rider-284", "driver": "driver-284", "begin_lat": 0.7340133901254792, "begin_lon": 0.5142184937933181, "end_lat": 0.7814655558162802, "end_lon": 0.6592596683641996, ............
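To inspect the generated records in tabular form, the JSON strings can be loaded into a Spark DataFrame. The following is a minimal sketch, assuming a spark-shell session started with the Hudi Spark bundle on the classpath. The names sampleGen, sample, and sampleDF are illustrative and do not come from the original text. The selected columns are the fields visible in the sample record above.

```scala
// spark-shell (assumes the Hudi Spark bundle is on the classpath)
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._

// Illustrative names: sampleGen, sample, sampleDF
val sampleGen = new DataGenerator

// Generate two records and convert them to JSON strings
val sample = convertToStringList(sampleGen.generateInserts(2))

// Parse the JSON strings into a DataFrame; `spark` is provided by spark-shell
val sampleDF = spark.read.json(spark.sparkContext.parallelize(sample, 2))

// Show the inferred schema and a few of the fields seen in the sample record
sampleDF.printSchema()
sampleDF.select("uuid", "rider", "driver", "ts").show(false)
```

Because the generator produces random values, the printed rows differ from run to run. Only the field names should match the sample record.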