
Unlike egen's rowmean(), generate does not skip missing values:

    gen car_space2 = (headroom + length)/2

If either variable is missing for an observation, generate returns a missing value for that observation.

egen group_id = group(old_group_var) creates a new group id with numeric values for a categorical variable. For instance, applying group() to the categorical variable region generates a new group id with values from 1 to 4, which can then be converted to a string.

egen newvar = cut(var), at(#,#,…,#) provides one more method of recoding numeric variables into categorical ones. The #s specify the cut-offs, and each interval includes its left-hand cut-off.

    egen price3 = cut(price), at(3291,5000,15906)

recodes price into price3 with the intervals [3291, 5000) and [5000, 15906). Values at or above the last cut-off (15906) fall outside every interval and are set to missing.

egen newvar = cut(var), group(#) alternatively divides the variable into # groups of equal frequencies.

    egen price4 = cut(price), group(5)

divides price into 5 groups of approximately the same size and stores the group codes in price4.

Another way to convert numeric variables to categorical or factor variables is to use autocode, the automated version of recode.
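As a rough sketch, the group() and cut() calls can be tried on Stata's bundled auto dataset. Several things here are assumptions rather than part of the original example: foreign stands in for region (which is not in auto.dta), the names foreign_id and price5 are illustrative, and the autocode() arguments are chosen only to show the call.

```stata
sysuse auto, clear

* group(): one numeric id per category of the grouping variable
egen foreign_id = group(foreign)
tabulate foreign foreign_id

* cut() with at(): each bin is coded by its left-hand cut-off
egen price3 = cut(price), at(3291, 5000, 15906)
tabulate price3, missing

* cut() with group(): bins of roughly equal frequency
egen price4 = cut(price), group(5)
tabulate price4

* autocode(x, n, x0, x1) splits [x0, x1] into n equal-width intervals
* and returns the upper bound of the interval containing x
generate price5 = autocode(price, 3, 3291, 15906)
tabulate price5
```

Tabulating price3 with the missing option makes it easy to spot any observation that fell outside the at() cut-offs.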

    egen car_space = rowmean(headroom length)

creates an arbitrary measure of car space using the mean of headroom and car length. If one of the two variables, headroom or length, is missing for an observation, rowmean() ignores the missing value and uses the nonmissing one for the calculation. If both are missing, rowmean() returns a missing value. In this example neither variable contains missing values.

Compare this with the generate method, which returns a missing value whenever any variable in the expression is missing.
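The contrast between rowmean() and generate only shows up when a value is missing. A minimal sketch, assuming the auto dataset: one headroom value is blanked out purely for illustration, which is not a step in the original example.

```stata
sysuse auto, clear

* Illustration only: force a missing value in the first observation
replace headroom = . in 1

* rowmean() averages whatever is nonmissing in each row
egen car_space = rowmean(headroom length)

* generate propagates the missing value
generate car_space2 = (headroom + length)/2

list headroom length car_space car_space2 in 1/3
```

In the first observation car_space should equal length alone, while car_space2 should be missing.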

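Because egen newvar = total() treats missing values as 0, include only the nonmissing cases by adding an if qualifier:

    egen total_weight = total(weight) if !missing(weight), by(foreign)

Observations excluded by the if qualifier receive a missing value in total_weight.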
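A small sketch of what the if !missing() qualifier changes. It uses rep78 instead of weight because, in the auto dataset, rep78 contains missing values and weight does not. The names total_rep and total_rep_nm are hypothetical.

```stata
sysuse auto, clear

* Without the qualifier: missing rep78 counts as 0, and every row
* in the group receives the group total
egen total_rep = total(rep78), by(foreign)

* With the qualifier: rows where rep78 is missing are left out,
* so the new variable is missing on those rows
egen total_rep_nm = total(rep78) if !missing(rep78), by(foreign)

list foreign rep78 total_rep total_rep_nm if missing(rep78)
```

The group sums themselves are the same in both variables. The difference is which observations carry them.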

In a recode rule, min refers to the smallest value and max to the largest value. missing matches all missing values, nonmissing matches all nonmissing values, and else matches any value, missing or nonmissing, not covered by an earlier rule. Values not covered by any rule are carried over to the new variable unchanged. The text following the assigned value is its value label, and label() gives a name to the new value label.

To check that a transformation has worked as intended, it is always a good idea to cross-tabulate the newly defined variable against the variable it was created from.

[Cross-tabulation output: Repair Record 1978 by RECODE of rep78 (Repair Record 1978)]

egen requires a function to be specified to generate a new variable: egen newvar = function(). Functions include mean(), sd(), min(), max(), rowmean(), diff(), total(), std(), group(), etc. Type help egen to view a complete list and descriptions of the functions that go with egen.

Some common uses of egen are shown in this section.

    egen total_weight = total(weight), by(foreign)

creates the total car weight by car type. Note that egen newvar = total() treats missing values as 0.
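A few of the egen functions listed above can be sketched on the auto dataset. The new variable names (mean_price, sd_price, max_mpg, z_price) are illustrative choices, not from the original.

```stata
sysuse auto, clear

* Group-level statistics, repeated on every row of the group
egen mean_price = mean(price), by(foreign)
egen sd_price   = sd(price), by(foreign)

* Overall maximum, repeated on every row
egen max_mpg = max(mpg)

* Standardized price (mean 0, standard deviation 1)
egen z_price = std(price)

summarize mean_price sd_price max_mpg z_price
```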

Five commands are often used to create or modify variables: generate, replace, recode, egen, and autocode.

generate newvar = exp creates a new variable from existing variables through an expression.

replace changes the contents of a variable. It can be used in combination with generate to recode a newly created variable based on an existing numeric variable:

    replace price2 = price*0.8 if foreign == 1

recode var (rule), generate(newvar) changes the contents of numeric variables, usually to create categorical variables:

    recode rep78 … ), gen(rep78_scale) label(repair_record_scale)

Ranges specified inside the parentheses include both boundaries. The numbers to the left of = are the values to be recoded, and the number following = is the new value to be assigned.
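The generate, replace, and recode steps can be put together in one short sketch on the auto dataset. The recode rules and their labels ("Poor", "Average", "Good") are hypothetical, since the original rule list is not available. Only the variable names rep78_scale and repair_record_scale come from the text.

```stata
sysuse auto, clear

* Copy price, then discount it for foreign cars only
generate price2 = price
replace price2 = price*0.8 if foreign == 1

* Hypothetical rules: ranges include both boundaries
recode rep78 (min/2 = 1 "Poor") (3 = 2 "Average") (4/max = 3 "Good"), generate(rep78_scale) label(repair_record_scale)

* Cross-tabulate old against new to check the recode
tabulate rep78 rep78_scale, missing
```

Note the double equals sign in the if qualifier: Stata uses == to test equality and a single = for assignment.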
