cudf.DataFrame.partition_by_hash

DataFrame.partition_by_hash(columns, nparts, keep_index=True)

Partition the dataframe by the hashed value of data in columns.

Parameters

columns (sequence of str): The names of the columns to be hashed. Must have at least one name.
nparts (int): Number of output partitions.
keep_index (boolean, default True): Whether to keep the index or drop it.

Returns

partitioned: list of DataFrame
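
Examples

The following is a minimal pure-Python sketch of the idea behind hash partitioning, not cuDF's implementation. The helper `partition_rows_by_hash` and the sample rows are hypothetical, and `zlib.crc32` is only a stand-in hash; the hash cuDF applies internally may differ, so the partition a given row lands in will not necessarily match. The sketch shows the property that matters: rows with equal values in the hashed columns always end up in the same partition.

```python
import zlib


def partition_rows_by_hash(rows, columns, nparts):
    """Split a list of dict rows into nparts lists by hashing the values in columns.

    Hypothetical helper for illustration; it mirrors the documented
    parameters (columns, nparts) but runs on plain Python dicts.
    """
    if not columns:
        raise ValueError("columns must contain at least one name")
    partitions = [[] for _ in range(nparts)]
    for row in rows:
        # Build a deterministic byte key from the selected column values.
        key = repr(tuple(row[c] for c in columns)).encode("utf-8")
        # crc32 is a stand-in hash chosen because it is stable across runs.
        partitions[zlib.crc32(key) % nparts].append(row)
    return partitions


rows = [
    {"key": "a", "value": 1},
    {"key": "b", "value": 2},
    {"key": "a", "value": 3},
    {"key": "c", "value": 4},
]

# With cuDF, the corresponding call on a DataFrame df would be
# df.partition_by_hash(["key"], nparts=3), returning a list of DataFrames.
parts = partition_rows_by_hash(rows, ["key"], nparts=3)

for i, part in enumerate(parts):
    print(i, part)
```

Some partitions may be empty when there are few distinct key values, so code that consumes the returned list should not assume every partition holds rows.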