Transcript Slide 1
STATA
Third group training course in application of information and communication technology to production and dissemination of official statistics
10 May – 11 July 2007
Gereltuya Altankhuyag, Lecturer/Statistician, UNSIAP
[email protected]
7/21/2015 1

Slide 2
Getting Started
There are three ways of executing commands:
  Using the menu bar
  Using a dialog box (db)
  Using syntax
It is preferable to use syntax.

Slide 3
Getting Started – dialog box
db is the command-line way to launch the dialog box for a Stata command.
Syntax:
  db commandname
For instance:
  db sum

Slide 4
Getting Started – dialog box

Slide 5
Basic commands to inspect datasets
The following commands are used to inspect datasets:
  codebook
  count
  describe
  list
  summarize
  table
  tabstat

Slide 6
Basic commands to inspect datasets
codebook
It examines the variable names, labels, and data to produce a codebook describing the dataset.
It also reports the standard missing values.
Syntax:
  codebook [varlist] [if] [in] [, options]
Example:
  codebook
  codebook region

Slide 7
Basic commands to inspect datasets
options:
  all - provides a complete report, excluding mv
  header - adds a header to the top of the output, giving the dataset name and date
  notes - lists any notes attached to the variables
  mv - determines the pattern of missing values
Examples:
  codebook region hhlandd famsize, all
  codebook region hhlandd famsize, header
  codebook region hhlandd famsize, notes
  codebook region hhlandd famsize, mv

Slide 8
Basic commands to inspect datasets
count
It counts the number of observations that satisfy the specified conditions. If no conditions are specified, count displays the number of observations in the data.
Syntax:
  count [if] [in]
For instance:
  count
  count if famsize>=5

Slide 9
Basic commands to inspect datasets
describe
It produces a summary of the dataset in memory, or of the data stored in a Stata-format dataset file.
Syntax:
  Data in memory:
    describe [varlist] [, describem_options]
  Data in file:
    describe [varlist] using filename [, describef_options]
Example:
  des
  des region famsize toilet

Slide 10
Basic commands to inspect datasets
options:
  simple - display only variable names
  short - display only general information
  detail - display additional details
  fullname - do not abbreviate variable names
  numbers - display variable number along with name

Slide 11
Basic commands to inspect datasets
list
It displays the values of variables.
Syntax:
  list [varlist] [if] [in] [, options]
Example:
  list
  list region famsize toilet
  list region famsize toilet in 1/15
  list region if famsize>5 in 1/15

Slide 12
Basic commands to inspect datasets
summarize
It calculates and displays a variety of summary statistics. If no varlist is specified, summary statistics are calculated for all the variables in the dataset.
Syntax:
  summarize [varlist] [if] [in] [weight] [, options]
Example:
  sum
  sum in 1/15
  sum region famsize toilet
  sum region famsize toilet [aw=weight]

Slide 13
Basic commands to inspect datasets
options:
  detail - produces additional statistics, including skewness, kurtosis, the four smallest and four largest values, and various percentiles.
  meanonly - allowed only when detail is not specified; suppresses the display of results and the calculation of the variance.
  format - requests that the summary statistics be displayed using the display formats associated with the variables.
  separator(#) - specifies how often to insert separation lines into the output. The default is separator(5), meaning that a line is drawn after every 5 variables. separator(10) would draw a line after every 10 variables. separator(0) suppresses the separation line.
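The inspection commands above can be tried out on any dataset. The sketch below is only an illustration: it assumes the auto dataset that ships with Stata (loaded with sysuse) in place of the course's household file, so make, price, mpg and foreign stand in for variables such as region and famsize.

```stata
* Load the example dataset shipped with Stata (an assumption; the
* course data file is not available here).
sysuse auto, clear

* count: all observations, then only those meeting a condition
count
count if mpg >= 25

* describe: full report, then only the names for a short varlist
describe
describe price mpg foreign, simple

* list: selected variables for the first 15 observations
list make price mpg in 1/15

* summarize: default statistics, then the detailed report
summarize price mpg
summarize price, detail

* summarize leaves its results in r(); display a few of them
display "N = " r(N) "  mean = " r(mean) "  median = " r(p50)
```

Typing these lines into a do-file rather than the Command window keeps a record of the session that can be re-run later.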
Slide 14
Basic commands to inspect datasets
NOTE: Commands and output are shown in the Results window. When the MORE message is shown, press the GO button to continue the display or the X button to stop it.

Slide 15
Basic commands to inspect datasets
NOTE: We may specify a range of variables in a variable list, using a hyphen between the first and last variable:
  des region-toilet
  sum region-hhlandd
  list thana-famsize

Slide 16
Basic commands to inspect datasets
NOTE: We may use the menus
for DESCRIBE:
  Data ► Describe Data ► Describe Variables in Memory
for SUMMARIZE:
  Statistics ► Summaries, Tables & Tests ► Summary Statistics ► Summary Statistics
  Data ► Describe Data ► Summary Statistics

Slide 17
Basic commands to inspect datasets
There are 5 types of “table” command:
  table
  tabstat
  tabulate one-way
  tabulate two-way
  tabulate summarize

Slide 18
Basic commands to inspect datasets
table
It calculates and displays tables of statistics.
Syntax:
  table rowvar [colvar [supercolvar]] [if] [in] [weight] [, options]
Main options:
  contents - specifies the contents of the table's cells; select up to 5 statistics
  by(superrowvarlist) - superrow variables; up to 4 variables

Slide 19
Basic commands to inspect datasets
Examples:
  table region, c(mean famsize median hhlandd)
  table region, by(sexhead) c(mean famsize median hhlandd)

Slide 20
Basic commands to inspect datasets
tabstat
It displays a table of summary statistics.
Syntax:
  tabstat varlist [if] [in] [weight] [, options]
Main options:
  by(varname) - group statistics by variable
  statistics(statname [...]) - report specified statistics

Slide 21
Basic commands to inspect datasets
Examples:
  tabstat region, stats(mean range)
  tabstat region, by(sexhead) stat(min mean max) col(stat)

Slide 22
Basic commands to inspect datasets
tabulate one-way (tab1)
It produces one-way tables of frequency counts.
Syntax:
  tabulate varname [if] [in] [weight] [, options]
  tab1 varlist [if] [in] [weight] [, tab1_options]
It produces a one-way tabulation for each variable specified in varlist.

Slide 23
Basic commands to inspect datasets
Examples:
  tabulate toilet
  tabulate region
  tabulate hhelec
  tabulate sexhead
  tab1 region toilet hhelec sexhead
Note: please see the differences!

Slide 24
Basic commands to inspect datasets
tabulate two-way (tab2)
It produces two-way tables of frequencies.
Syntax:
  tabulate varname1 varname2 [if] [in] [weight] [, options]
It produces two-way tables of frequency counts, along with various measures of association, including the common Pearson's chi-squared, the likelihood-ratio chi-squared, Cramér's V, Fisher's exact test, Goodman and Kruskal's gamma, etc.

Slide 25
Basic commands to inspect datasets
  tab2 varlist [if] [in] [weight] [, options]
It produces all possible two-way tabulations of the variables specified in varlist.
Examples:
  tabulate region toilet, row
  tabulate region sexhead, row col chi2
  tabulate region toilet, all exact
  tab2 region sexhead toilet
  tab2 region sexhead toilet, all exact

Slide 26
Basic commands to inspect datasets
tabulate summarize
It produces one- and two-way tables (breakdowns) of means and standard deviations.
Syntax:
  tabulate varname1 [varname2] [if] [in] [weight] [, summarize(varname3)]

Slide 27
Basic commands to inspect datasets
Examples:
One-way tables:
  tabulate region, summarize(hhlandd)
  tabulate region [aweight=weight], summarize(toilet)
Two-way tables:
  tabulate region sexhead, summarize(hhlandd)
  tabulate region sexhead [aweight=weight], summarize(hhlandd)

Slide 28
Basic commands to create and change variables, labels etc.
generate
It creates a new variable. The values of the variable are specified by =exp.
Syntax:
  generate [type] newvar[:lblname] =exp [if] [in]
Examples:
  gen agehead2=agehead*agehead
  gen agehead3=agehead*agehead if sexhead==1

Slide 29
Basic commands to create and change variables, labels etc.
replace
It changes the contents of an existing variable.
Because replace alters data, the command cannot be abbreviated.
Syntax:
  replace oldvar =exp [if] [in] [, nopromote]
Examples:
  replace agehead3=0 if region==2

Slide 30
Basic commands to create and change variables, labels etc.
egen
It creates newvar of the optionally specified storage type equal to fcn(arguments). Here fcn() is a function specifically written for egen.
Syntax:
  egen [type] newvar = fcn(arguments) [if] [in] [, options]

Slide 31
Basic commands to create and change variables, labels etc.
Examples:
  egen age4=mean(agehead)
  egen test=median(weight-d_bank)

Slide 32
To be continued …
END Introduction to STATA
Please perform EXERCISE 2
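As a recap of the tabulation and variable-creation commands, the following sketch may be useful before the exercise. It again assumes Stata's bundled auto dataset rather than the course data, and the new variable names (mpg2, mpg_dom, mean_mpg, dev_mpg) are hypothetical, chosen only for illustration.

```stata
* Example data shipped with Stata (an assumption, not the course file)
sysuse auto, clear

* One-way and two-way frequency tables
tabulate rep78
tabulate rep78 foreign, row chi2

* Means and standard deviations of mpg within each group
tabulate foreign, summarize(mpg)

* tabstat: chosen statistics by group, with statistics in columns
tabstat price mpg, by(foreign) statistics(min mean max) columns(statistics)

* generate: new variable from an expression, optionally restricted by if
generate mpg2 = mpg*mpg
generate mpg_dom = mpg if foreign == 0

* replace: change an existing variable (the command cannot be abbreviated)
replace mpg_dom = 0 if foreign == 1

* egen: a function evaluated over all observations, here the overall mean
egen mean_mpg = mean(mpg)
generate dev_mpg = mpg - mean_mpg
summarize dev_mpg
```

Note that generate with an if condition leaves the new variable missing for observations that do not satisfy the condition, which is why replace is used afterwards to fill them in.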