cudf.to_numeric

cudf.to_numeric(arg, errors='raise', downcast=None)

Convert the argument to a numeric type.

Parameters
arg : column-convertible

The object to convert to a numeric type.

errors : {‘raise’, ‘ignore’, ‘coerce’}, default ‘raise’

Policy to handle errors during parsing.

  • ‘raise’ raises an error when a value cannot be parsed.

  • ‘ignore’ skips the error and returns arg unchanged.

  • ‘coerce’ replaces invalid values with nulls.
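The three policies can be illustrated with a small pure-Python sketch. This is not cuDF's implementation: parse_numeric is a hypothetical helper that uses the built-in float() as the parser and None as a stand-in for a null, purely to show how each policy reacts to an unparseable value.

```python
def parse_numeric(values, errors="raise"):
    """Pure-Python illustration of the three error policies (not cuDF code)."""
    if errors not in ("raise", "ignore", "coerce"):
        raise ValueError(f"invalid errors value: {errors!r}")
    parsed = []
    for v in values:
        try:
            parsed.append(float(v))
        except ValueError:
            if errors == "raise":
                raise  # surface the parsing failure to the caller
            if errors == "ignore":
                return list(values)  # give back the input unchanged
            parsed.append(None)  # 'coerce': mark the invalid entry as null
    return parsed


data = ["apple", "1.0", "3e3"]
ignored = parse_numeric(data, errors="ignore")  # input returned as-is
coerced = parse_numeric(data, errors="coerce")  # invalid entry becomes None
```

With ‘ignore’ a single bad value causes the whole input to come back unparsed, whereas ‘coerce’ still converts every value that can be parsed.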

downcast : {‘integer’, ‘signed’, ‘unsigned’, ‘float’}, default None

If set, tries to downcast the parsed results to the smallest possible dtype. For each downcast type, the smallest possible dtype is chosen from the following sets:

  • {‘integer’, ‘signed’}: all integer types greater than or equal to np.int8

  • {‘unsigned’}: all unsigned types greater than or equal to np.uint8

  • {‘float’}: all floating types greater than or equal to np.float32

Note that downcast behavior is decoupled from parsing. Errors encountered during downcasting are raised regardless of the errors parameter.
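The "smallest possible dtype" rule for the integer sets can be sketched in pure Python. This is a minimal illustration under stated assumptions, not cuDF's code: smallest_int_dtype and the two tables are hypothetical names, the ranges are the standard two's-complement limits of the NumPy integer types, and the ‘float’ case is left out.

```python
# Candidate dtypes, smallest first, with their representable ranges.
INT_TYPES = [
    ("int8", -2**7, 2**7 - 1),
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
    ("int64", -2**63, 2**63 - 1),
]
UINT_TYPES = [
    ("uint8", 0, 2**8 - 1),
    ("uint16", 0, 2**16 - 1),
    ("uint32", 0, 2**32 - 1),
    ("uint64", 0, 2**64 - 1),
]


def smallest_int_dtype(values, downcast):
    """Return the name of the narrowest candidate dtype that holds all values."""
    if downcast in ("integer", "signed"):
        candidates = INT_TYPES
    elif downcast == "unsigned":
        candidates = UINT_TYPES
    else:
        raise ValueError(f"unsupported downcast in this sketch: {downcast!r}")
    lo, hi = min(values), max(values)
    for name, type_min, type_max in candidates:
        if type_min <= lo and hi <= type_max:
            return name
    return None  # no candidate fits; this sketch does not model what cuDF does here


signed_choice = smallest_int_dtype([1, 2, 3000], "signed")
unsigned_choice = smallest_int_dtype([1, 2, 3000], "unsigned")
```

For the values 1, 2 and 3000 the signed search skips int8 (maximum 127) and stops at int16, which is consistent with the downcast='signed' example in the Examples section.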

Returns
Series or ndarray

If a Series is passed in, a Series is returned; otherwise an ndarray is returned.

Notes

An important difference from pandas is that this function does not accept sequences that mix numeric and non-numeric types, such as [1, 'a']. A TypeError is raised for such input, regardless of the errors parameter.
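Because mixed input fails regardless of the errors parameter, it can help to screen a plain Python sequence before passing it in. The check below is a hypothetical pre-validation helper, not part of cuDF; is_mixed_numeric is an assumed name, and it treats any numbers.Number instance (including bool) as numeric.

```python
from numbers import Number


def is_mixed_numeric(seq):
    """Return True if seq contains both numeric and non-numeric elements."""
    has_numeric = any(isinstance(x, Number) for x in seq)
    has_non_numeric = any(not isinstance(x, Number) for x in seq)
    return has_numeric and has_non_numeric


mixed = is_mixed_numeric([1, "a"])        # numbers and strings together
all_strings = is_mixed_numeric(["1", "a"])  # strings only, left to the errors policy
```

A sequence made up entirely of strings is not flagged, since parsing failures in that case are handled by the errors policy instead.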

Examples

>>> import cudf
>>> s = cudf.Series(['1', '2.0', '3e3'])
>>> cudf.to_numeric(s)
0       1.0
1       2.0
2    3000.0
dtype: float64
>>> cudf.to_numeric(s, downcast='float')
0       1.0
1       2.0
2    3000.0
dtype: float32
>>> cudf.to_numeric(s, downcast='signed')
0       1
1       2
2    3000
dtype: int16
>>> s = cudf.Series(['apple', '1.0', '3e3'])
>>> cudf.to_numeric(s, errors='ignore')
0    apple
1      1.0
2      3e3
dtype: object
>>> cudf.to_numeric(s, errors='coerce')
0      <NA>
1       1.0
2    3000.0
dtype: float64