<P> Categorical data is the statistical data type consisting of categorical variables or of data that has been converted into that form, for example as grouped data. More specifically, categorical data may derive from observations made of qualitative data that are summarised as counts or cross tabulations, or from observations of quantitative data grouped within given intervals. Often, purely categorical data are summarised in the form of a contingency table. However, particularly when considering data analysis, it is common to use the term "categorical data" to apply to data sets that, while containing some categorical variables, may also contain non-categorical variables. </P> <P> A categorical variable that can take on exactly two values is termed a binary variable or dichotomous variable; an important special case is the Bernoulli variable. Categorical variables with more than two possible values are called polytomous variables; categorical variables are often assumed to be polytomous unless otherwise specified. Discretization is treating continuous data as if it were categorical. Dichotomization is treating continuous data or polytomous variables as if they were binary variables. Regression analysis often treats category membership with one or more quantitative dummy variables. </P> <P> Examples of values that might be represented in a categorical variable: </P> <Ul> <Li> The blood type of a person: A, B, AB or O. </Li> <Li> The state that a person lives in. </Li> <Li> The political party that a voter in a European country might vote for: Christian Democrat, Social Democrat, Green Party, etc. </Li> <Li> The type of a rock: igneous, sedimentary or metamorphic. </Li> <Li> The identity of a particular word (e.g., in a language model): one of V possible choices, for a vocabulary of size V. </Li> </Ul>
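<P> The ideas of discretization, dichotomization and dummy variables described above can be illustrated with a minimal Python sketch. The function names (<code>discretize</code>, <code>dichotomize</code>, <code>dummy_code</code>), the age cut points and the choice of blood type O as the reference level are all assumptions made for illustration; they do not come from any particular library or standard. </P>

```python
from bisect import bisect_right


def discretize(value, edges, labels):
    """Map a continuous value to the label of the interval containing it.

    `edges` are sorted interior cut points, so len(labels) == len(edges) + 1.
    A value equal to a cut point is assigned to the interval above it.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(edges, value)]


def dichotomize(value, threshold):
    """Collapse a continuous value into a binary (0/1) variable."""
    return 1 if value >= threshold else 0


def dummy_code(category, levels, reference):
    """Encode a category as 0/1 indicator (dummy) variables.

    The reference level gets no indicator of its own; it is represented
    by all indicators being 0, as is common in regression analysis.
    """
    if category not in levels:
        raise ValueError(f"unknown category: {category!r}")
    return {f"is_{lvl}": int(category == lvl) for lvl in levels if lvl != reference}


# Hypothetical example data, chosen only for illustration.
ages = [12, 35, 67]
age_groups = [discretize(a, [18, 65], ["child", "adult", "senior"]) for a in ages]
is_senior = [dichotomize(a, 65) for a in ages]

blood_levels = ["A", "B", "AB", "O"]
coded_ab = dummy_code("AB", blood_levels, reference="O")
coded_o = dummy_code("O", blood_levels, reference="O")

print(age_groups)  # ['child', 'adult', 'senior']
print(is_senior)   # [0, 0, 1]
print(coded_ab)    # {'is_A': 0, 'is_B': 0, 'is_AB': 1}
print(coded_o)     # {'is_A': 0, 'is_B': 0, 'is_AB': 0}
```

<P> In this sketch a polytomous variable with four levels yields three dummy variables, because the omitted reference level is implied when all indicators are zero. </P>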

The values of data measured on this scale can be a number or a name.