20. Categorical Data
New in version 0.15.
Note
While there was pandas.Categorical in earlier versions, the ability to use categorical data in Series and DataFrame is new.
This is an introduction to the pandas categorical data type, including a short comparison with R's factor.
Categoricals are a pandas data type that corresponds to categorical variables in statistics: variables that can take on only a limited, and usually fixed, number of possible values (categories; levels in R). Examples are gender, social class, blood type, country affiliation, observation time, and ratings on Likert scales.
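As a minimal sketch of the idea above, the following creates a categorical Series from a handful of made-up Likert-style answers and inspects its categories. The variable names and the sample values are illustrative assumptions, and the `.cat.codes` accessor assumes a pandas release that provides it.

```python
import pandas as pd

# Hypothetical survey answers; the values are illustrative only.
ratings = pd.Series(["agree", "neutral", "agree", "disagree"], dtype="category")

# Categories are inferred from the unique values and sorted lexically.
categories = ratings.cat.categories.tolist()

# Each value is stored as a small integer code that indexes into the categories.
codes = ratings.cat.codes.tolist()

print(categories)  # ['agree', 'disagree', 'neutral']
print(codes)       # [0, 2, 0, 1]
```

Only the three distinct labels are stored as categories; the Series itself holds integer codes, which is what limits the variable to a fixed set of possible values.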
In contrast to statistical categorical variables, categorical data might