I loaded the JSON string below into a dataframe column.
    {
      "title": { "titleid": "222", "titlename": "ABCD" },
      "customer": {
        "customerDetail": {
          "customerid": 878378743,
          "customerstatus": "ACTIVE",
          "customersystems": { "customersystem1": "SYS01", "customersystem2": null },
          "sysid": null
        },
        "persons": [
          { "personid": "123", "personname": "IIISKDJKJSD" },
          { "personid": "456", "personname": "IUDFIDIKJK" }
        ]
      }
    }

    import org.apache.spark.sql.functions.from_json
    import spark.implicits._ // enables the $"..." column syntax

    // Infer the schema from the JSON file
    val js = spark.read.json("./src/main/resources/json/customer.txt")
    println(js.schema)

    // df is the dataframe whose string column "value" holds the JSON
    val newDF = df.select(from_json($"value", js.schema).as("parsed_value"))
    newDF.selectExpr("parsed_value.customer.*").show(false)
// Schema:

    StructType(
      StructField(customer,StructType(
        StructField(customerDetail,StructType(
          StructField(customerid,LongType,true),
          StructField(customerstatus,StringType,true),
          StructField(customersystems,StructType(
            StructField(customersystem1,StringType,true),
            StructField(customersystem2,StringType,true)),true),
          StructField(sysid,StringType,true)),true),
        StructField(persons,ArrayType(StructType(
          StructField(personid,StringType,true),
          StructField(personname,StringType,true)),true),true)),true),
      StructField(title,StructType(
        StructField(titleid,StringType,true),
        StructField(titlename,StringType,true)),true))
// Output:

    +------------------------------+---------------------------------------+
    |customerDetail                |persons                                |
    +------------------------------+---------------------------------------+
    |[878378743, ACTIVE, [SYS01,],]|[[123, IIISKDJKJSD], [456, IUDFIDIKJK]]|
    +------------------------------+---------------------------------------+
My question: is there a way to split the key values into separate dataframe columns, as shown below, while keeping the array columns as they are? I need exactly one record per JSON string.

Example for the customer column:
    customer.customerDetail.customerid,customer.customerDetail.customerstatus,customer.customerDetail.customersystems.customersystem1,customer.customerDetail.customersystems.customersystem2,customer.customerDetail.sysid,customer.persons
    878378743,ACTIVE,SYS01,null,null,{"persons": [ { "personid": "123", "personname": "IIISKDJKJSD" }, { "personid": "456", "personname": "IUDFIDIKJK" } ] }
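
One possible way to get this shape is to walk the schema recursively, expand every struct into its leaf fields, and leave arrays alone so that no rows are multiplied. The sketch below is only an illustration under a few assumptions: `leafPaths` and `flattenStructs` are hypothetical helper names that do not appear in the code above, field names are assumed not to contain dots themselves, and the snippet is meant to be pasted into `spark-shell` or wrapped in an object.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

// Hypothetical helper: collect the dotted path of every leaf field.
// Structs are expanded recursively; arrays and scalar types are kept whole.
def leafPaths(schema: StructType, prefix: String = ""): Seq[String] =
  schema.fields.toSeq.flatMap { field =>
    val path = if (prefix.isEmpty) field.name else s"$prefix.${field.name}"
    field.dataType match {
      case nested: StructType => leafPaths(nested, path) // descend into the struct
      case _                  => Seq(path)               // array or scalar: stop here
    }
  }

// Hypothetical helper: select each leaf and use its dotted path as the column name.
def flattenStructs(df: DataFrame): DataFrame =
  df.select(leafPaths(df.schema).map(p => col(p).as(p)): _*)

// Possible usage with the names from the question (assumes newDF exists):
// val flatDF = flattenStructs(newDF.selectExpr("parsed_value.*"))
// flatDF.show(false)
```

Because `persons` is an `ArrayType`, it is not expanded and stays a single column, so each JSON string should still yield one row. The resulting column names contain dots, so referring to them later needs backticks, for example ``flatDF.select("`customer.customerDetail.customerid`")``.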