# Here data is a pyspark.sql DataFrame and balance is one of its columns.

# get the mean value of a column
data.agg({'balance': 'mean'}).show()
# or, equivalently ('avg' is an alias for 'mean')
data.agg({'balance': 'avg'}).show()

------ other related functions ------
Other possible aggregate names include ['max', 'min', 'stddev', 'variance', 'count', 'skewness', 'kurtosis', 'sum'].

# get the max value of a column
data.agg({'balance': 'max'}).show()

# get the min value of a column
data.agg({'balance': 'min'}).show()

# get the standard deviation of a column
data.agg({'balance': 'stddev'}).show()

# get the variance of a column
data.agg({'balance': 'variance'}).show()
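Because a Python dict can hold only one entry per key, the dict form of agg gives one statistic per column per call. If you want several statistics for the same column as plain Python values, one option is to loop over the names. The helper below is only a sketch: column_stats is a hypothetical name, not part of PySpark, and it assumes data is a DataFrame with a numeric column. It launches one Spark job per statistic, so it is convenient rather than fast.

```python
def column_stats(data, column, stats=("mean", "min", "max", "stddev")):
    """Return {stat_name: value} for one column of a PySpark DataFrame.

    Hypothetical helper for illustration; `data` is assumed to be a
    pyspark.sql DataFrame and `column` a numeric column in it.
    """
    result = {}
    for stat in stats:
        # agg({...}) returns a one-row, one-column DataFrame;
        # first() brings that single row back to the driver.
        row = data.agg({column: stat}).first()
        result[stat] = row[0]
    return result
```

For example, column_stats(data, 'balance') would return a dict keyed by 'mean', 'min', 'max' and 'stddev', which can be printed or reused in later code instead of only being displayed with show().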