数据转换-project

Project函数从事件流中选择一组属性子集,并且只将选中的元素发送到下一个处理流(相当于SQL语句中的投影概念)。下面是进行project转换的示例代码:

Scala代码:

不支持。

Java代码:

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BatchJob {

	public static void main(String[] args) throws Exception {
		// 设置批处理执行环境
		final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

		// project转换
		Tuple3<Integer, String, Double> user01 = new Tuple3<>(1,"张三",12000.00);
		Tuple3<Integer, String, Double> user02 = new Tuple3<>(2,"李四",22000.00);
		Tuple3<Integer, String, Double> user03 = new Tuple3<>(3,"王老五",18000.00);

		DataStream<Tuple3<Integer, String, Double>> ds = env.fromElements(user01,user02,user03);

		// 选择第3列和第2列
		DataStream<Tuple3<Integer, String, Double>> ds_select = ds.project(2,1);		

		ds_select.print();
	}
}

输出结果如下:

(张三,12000.0)
(李四,22000.0)
(王老五,18000.0)

《PySpark原理深入与编程实战》