cudf.DataFrame.partition_by_hash

DataFrame.partition_by_hash(columns, nparts, keep_index=True)

Partition the dataframe by the hashed value of data in columns.

Parameters
columns : sequence of str

The names of the columns to be hashed. Must have at least one name.

nparts : int

Number of output partitions.

keep_index : boolean

Whether to keep the index (True) or drop it (False).

Returns
partitioned: list of DataFrame
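
Examples

The following is a minimal pure-Python sketch of the idea behind hash partitioning, not cuDF's implementation. It assumes rows are plain dictionaries, uses `zlib.crc32` as a stand-in hash (cuDF uses its own hash function, so the actual partition assignments will differ), and models `keep_index` by pairing each row with its original position. The function name and the sample data are hypothetical and exist only for illustration.

```python
import zlib


def partition_by_hash_sketch(rows, columns, nparts, keep_index=True):
    """Split rows into nparts buckets by hashing the values in `columns`.

    Illustrative stand-in only: rows are dicts, and CRC32 is an assumed
    hash that does not match the one cuDF uses.
    """
    if not columns:
        raise ValueError("columns must have at least one name")

    partitions = [[] for _ in range(nparts)]
    for index, row in enumerate(rows):
        # Build a deterministic byte key from the selected columns only.
        key = repr(tuple(row[name] for name in columns)).encode("utf-8")
        # Equal keys always hash to the same partition number.
        part = zlib.crc32(key) % nparts
        # keep_index=True is modelled as keeping the original row position.
        partitions[part].append((index, row) if keep_index else row)
    return partitions


# Hypothetical sample data.
rows = [
    {"key": "a", "value": 1},
    {"key": "b", "value": 2},
    {"key": "a", "value": 3},
    {"key": "c", "value": 4},
    {"key": "b", "value": 5},
]

parts = partition_by_hash_sketch(rows, ["key"], nparts=3)
for number, part in enumerate(parts):
    print(number, part)
```

The property to notice is that every row with the same value in the hashed columns lands in the same partition, while some partitions may be empty. With cuDF itself, the corresponding call on a DataFrame `df` with a `key` column would be `df.partition_by_hash(["key"], nparts=3)`, which returns a list of DataFrames.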