Stata Basics: foreach and forvalues
Another great tool in your coding tool belt is the loop. Loops allow you to run the same command for several variables at one time without having to write separate code for each variable. This discussion could go on for pages because there is so much you can do with a loop. Let me begin by explaining the loop I included at the end of an earlier article on macros and efficient coding in Stata. The two most common commands to begin a loop are foreach and forvalues.
The foreach command loops through a list of items, while forvalues loops through a range of numbers. The first line of that loop is very similar to how you would create a macro. It begins with the command foreach, followed by the name I want to use to represent each member of the group, exactly as with a macro.
In this situation, foreach var of local continuous is the same as foreach var in educat exper wage age. I could use either one in my loop. On the second line of the loop I asked Stata to create a box plot of each of the variables educat, exper, wage, and age and to save it.
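The loop described here can be sketched roughly as follows. This is a reconstruction from the description, not the original listing. The data are random placeholders generated only so the sketch runs on its own; the variable names and the macro name continuous come from the text.

```stata
* Placeholder data (an assumption, for illustration only)
clear
set obs 50
set seed 2024
foreach v in educat exper wage age {
    generate `v' = rnormal(20, 5)
}

* Store the variable names in a local macro called continuous
local continuous educat exper wage age

* Draw one box plot per variable and save each graph under the variable's name
foreach var of local continuous {
    graph box `var', saving(`var', replace)
}
```

Replacing the first line of the loop with foreach var in educat exper wage age would run the same commands without the macro.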
Inside the parentheses of saving is the name I want to use for the saved graph, along with the replace option, which overwrites any existing graph with the same name in the directory where I am saving it. Stata continues to do this until all variables have been used.
Time for one more example. So to analyze the data set you will have to fix this. There are at least two commands that can be used to do this, replace and recode. I will give you an example using the command replace.
Since we are working with variables, I need to start my loop with the command foreach. The next step is to decide on the name I want to use to represent the group. So my code so far looks like this: foreach var. If I were going to list all of the variables one by one, the next word in my code would be in. Here I want to refer to a run of adjacent variables in the data set instead, so the next part of my command is of varlist. In this case I type the first variable, followed by a dash, and end with the last variable.
Using the wages data set I would have educat-Race2. When I run this code, Stata will take the first variable from the variable list and replace the values that need fixing with a period, Stata's symbol for a missing value. It will then go to the next variable and work its way through the entire list. Loops can significantly reduce the number of lines of code that you have to write.
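A minimal sketch of such a loop is shown here. The code being replaced is not given in the text, so -999 is a purely hypothetical placeholder held in a local macro called badcode, and the small data set is invented so the sketch is self-contained.

```stata
* Invented data; -999 is a hypothetical code standing in for the value to fix
clear
input educat exper wage age Race2
12   5 -999   30 1
-999 10  15 -999 0
end

* Hypothetical placeholder for the code that should become missing
local badcode -999

* Work through every variable from educat to Race2 in data set order
foreach var of varlist educat-Race2 {
    replace `var' = . if `var' == `badcode'
}
```

The range educat-Race2 picks up every variable stored between those two, so the order of variables in the data set matters.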
Imagine how much time it would take you if you had a hundred variables and you had to write the code for each individual variable. The fewer lines of code you have the less time you have to spend writing the code and the fewer chances for making mistakes. Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor.
Ever needed to do the same thing to ten different variables and wished that you didn't have to write it out ten times? If so, then this article is for you. If not, someday you will, so you might as well keep reading anyway. Stata has all the tools required to write very sophisticated programs, but knowing just a few of them allows you to make everyday do files shorter and more efficient.
This article will focus on those programming tools that, in our experience, anyone who uses Stata heavily will eventually want to learn. To benefit from this article you'll need a solid understanding of basic Stata syntax, such as you can get from our Stata for Researchers series. The primary intended audience is Stata users with no other programming experience.
If you've done a lot of Stata programming already and are looking to expand your "bag of tricks" check out Stata Programming Tools. This article is best read at the computer with Stata running. Typing the commands in the examples yourself will help you notice and retain all the details, and prepare you to write your own code.
A Stata macro is a box you put text in. You then use what's in the box in subsequent commands. The real trick is getting a single command to run multiple times with a different bit of text in the box each time--we'll get there. The macros we'll use are "local" macros. If you're familiar with global and local variables from other languages, Stata's local macros are local in the same way.
If not, just trust us that local macros are the right ones to use. The local command creates a macro; for example, it can create a local macro called x and put the character '1' in it (not the value 1 as in "one unit to the right of zero on the number line").
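As a small sketch of defining and then using such a macro (the display commands are added here for illustration and are not taken from the original listing):

```stata
* Put the text 1 in a local macro named x
local x 1

* The macro processor swaps `x' for its contents, so Stata sees: display 1
display `x'

* Substitution is purely textual, so Stata sees: display 1+1
display `x'+`x'
```

Local macros only last for the do file or interactive session in which they are defined, so run these lines together.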
To use a macro, you put its name in a command, surrounded by a particular set of quotation marks. The quote before the x is the left single quote (`). On a standard US keyboard it is found in the upper left corner, under the tilde (~).
The quote after the x is the right single quote ('). It is found under the double quotation mark (") on the right side of the keyboard. Macros are handled by a macro processor that examines commands before passing them to Stata proper.
When it sees a macro denoted by that particular set of quotation marks, it replaces the macro with its contents.
This module illustrates (1) how to create and recode variables manually and (2) how to use foreach to ease the process of creating and recoding variables.
As you see, this requires entering a command computing the tax for each month of data for months 1 to 12 via the generate command. In the example below we use the foreach command to cycle through the variables inc1 to inc12 and compute the taxable income as taxinc1-taxinc12. The initial foreach statement tells Stata that we want to cycle through the variables inc1 to inc12 using the statements that are surrounded by the curly braces.
Each statement within the loop (in this case, just the one generate statement) is evaluated and executed, first for inc1. This is then repeated for inc2 and then inc3 and so on until inc12. So, this foreach loop is the equivalent of executing the 12 generate statements manually, but much easier and less error prone.
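A loop of this kind might look like the sketch below. The tax rate is not stated in the text, so the 10% rate is a hypothetical value, and the income figures are made up so that the example runs by itself.

```stata
* Made-up monthly income data (an assumption, for illustration only)
clear
set obs 3
forvalues i = 1/12 {
    generate inc`i' = 1000 + 100 * `i'
}

* Hypothetical tax rate; the original rate is not given
local rate 0.10

* One pass per variable: tax`var' expands to taxinc1, taxinc2, ..., taxinc12
foreach var of varlist inc1-inc12 {
    generate tax`var' = `var' * `rate'
}
```

Because the new name is built by putting tax in front of the loop variable, the results land in taxinc1 through taxinc12 without any of the names being typed out.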
Often one needs to sum across variables (also known as collapsing across variables). For instance, to obtain income by quarter rather than by month, four quarterly variables incqtr1-incqtr4 need to be computed. Again, this can be achieved manually or by using the foreach command. Done manually, the four quarterly income variables incqtr1-incqtr4 are computed by simply adding together the months that make up each quarter. The same result can be achieved using the foreach command.
In this case, instead of cycling across variables, the foreach command cycles across the numbers 1, 2, 3 and 4, which we refer to as qtr and which represent the 4 quarterly variables that we wish to create.
The trick is to find the relationship between a quarter and the month numbers that compose it, and to create a kind of formula that relates the quarters to the months. This is what the statements inside the foreach loop do: they relate the quarter to its months. Then, imagine all of those values being substituted into the generate statement in the loop. In this example, with only 4 quarters of data, it would probably be easier to simply write out the 4 generate statements manually. However, if you had 40 quarters of data, the foreach loop could save you considerable time and effort and help you avoid mistakes.
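One way to write the formula relating quarters to months is sketched here. The macro names m1, m2 and m3 are illustrative choices, and the income data are made up so the block is self-contained.

```stata
* Made-up monthly income data (an assumption, for illustration only)
clear
set obs 3
forvalues i = 1/12 {
    generate inc`i' = 1000 + 100 * `i'
}

foreach qtr of numlist 1/4 {
    * The last month of quarter q is 3*q; the other two come just before it
    local m3 = `qtr' * 3
    local m2 = `m3' - 1
    local m1 = `m3' - 2
    generate incqtr`qtr' = inc`m1' + inc`m2' + inc`m3'
}
```

For qtr equal to 2, for example, m1, m2 and m3 become 4, 5 and 6, so the generate statement reads generate incqtr2 = inc4 + inc5 + inc6.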
The foreach command can also be used to identify patterns across variables of a dataset. To obtain this kind of information, dummy indicators can be created to indicate in which months the pattern occurred. Note that only 11 dummy indicators are needed for a 12-month period because the interest is in the change from one month to the next.
This program is illustrated below (note that for simplicity we assume no missing data on income). We can list out the original values of inc and lowinc and verify that this worked properly. So, for the first pass through the foreach loop the value for curmon is 2 and the value for lastmon is 1, and the generate and replace statements are built from those values. The process is repeated until curmon is 12, at which point the generate and replace statements are built from months 12 and 11.
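The following is a sketch of such a program, under the assumption that lowinc flags a month whose income is lower than the previous month's. The exact condition is not stated in the text, and the income data are randomly generated placeholders.

```stata
* Placeholder income data with no missing values (an assumption)
clear
set obs 3
set seed 12345
forvalues i = 1/12 {
    generate inc`i' = round(1000 + 500 * runiform())
}

* Months 2 to 12 each get an indicator, giving 11 dummies in total
foreach curmon of numlist 2/12 {
    local lastmon = `curmon' - 1
    * Assumed rule: 1 if income fell relative to the previous month, else 0
    generate lowinc`curmon' = 1 if inc`curmon' < inc`lastmon'
    replace lowinc`curmon' = 0 if inc`curmon' >= inc`lastmon'
}

* Check the first indicator against the values it was built from
list inc1 inc2 lowinc2
```

On the first pass curmon is 2 and lastmon is 1, so the two statements compare inc2 with inc1 and create lowinc2.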
Loop over string values of a variable
Navid Asgari, 14 Nov

Hi all, I wonder if Stata has the capability for looping over the string values of a variable. Thanks, David.

Nick Cox:
Yes, certainly, and this is well documented. See e.g. ... Many active members here are reluctant to support people with anonymous or cryptic identifiers.

Tobias Wendler:
I have a string variable that looks like the following (overall it may take 20 or so different labels):

input str Indicator
"Domestic Material Input Non-metallic minerals Tonnes "
"Domestic Material Input Non-metallic minerals Tonnes "
"Domestic extraction Total Tonnes "
"Domestic extraction Total Tonnes "

What I now want to do is write a loop which in each step keeps only those observations where the string variable has an identical "value".
Then I want to save the dataset in which only these values are kept.
Then save it, naming it by the first letters of each word of the value in Indicator.
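One possible way to loop over the distinct string values of a variable is sketched below. It is not the answer given in the thread. For simplicity the files are named with a running number (the hypothetical names subset_1.dta, subset_2.dta, and so on) rather than with the initials of each word, and the str60 width is an assumption.

```stata
* Toy data standing in for the poster's Indicator variable
clear
input str60 Indicator
"Domestic Material Input Non-metallic minerals Tonnes"
"Domestic Material Input Non-metallic minerals Tonnes"
"Domestic extraction Total Tonnes"
"Domestic extraction Total Tonnes"
end

* Running number that is the same for every observation sharing a label
egen id = group(Indicator)

* Collect the distinct string values in a local macro and loop over them
levelsof Indicator, local(levels)
local i = 0
foreach lev of local levels {
    local ++i
    preserve
    * Compound double quotes keep values containing spaces intact
    keep if Indicator == `"`lev'"'
    save "subset_`i'.dta", replace
    restore
}
```

The preserve and restore pair brings the full data set back after each subset is saved, so every pass starts from the complete data.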
Generating a new variable which gets running numbers, which are the same for each unique label in the string variable.

For loop over varlist -replace
Sayli Javadekar, 22 Dec

Hello, I am trying to run a for loop over a varlist where I want to replace an observation of a variable with the subsequent one if the value of the original observation is 0.
Could someone help me? Thank you, Sayli J.

Matt Warkentin:
I don't believe your code will execute, but in sentiment, you would be replacing an individual's value with the value for the next observation, which I don't believe is your intent. For any specific individual you want to replace their value for mv1 with the value for mv2, and so forth. For this type of loop, check out the forval command in Stata.
This will allow you to loop over values for variables with the same base name, but with numerical distinction.

William Lisowski:
So for example you want

Sayli Javadekar:
William Lisowski, the code you suggested worked and I understood it too. I agree that Stata is better with long layout; however, I needed the data in wide to use the nwcommands for network analysis.

William Lisowski:
Sayli Javadekar, the code I provided was not how I would have approached the problem, but my intent was to show how to modify your code suitably.
But in creating it, I realized, and worked around, the problem I described in post 3. Unfortunately, I did not realize that your code, and mine, has the problem that
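The forval approach suggested in this thread could be sketched as follows. This is not the code posted by William Lisowski, which is not preserved here. The names mv1 and mv2 come from the thread; the number of variables (four) and the data values are assumptions.

```stata
* Invented wide-layout data: one row per individual
clear
input mv1 mv2 mv3 mv4
0 5 0 2
3 0 7 0
end

* Stop one short of the last variable, which has no successor
forvalues i = 1/3 {
    local j = `i' + 1
    * Within each row, take the next variable's value when this one is 0
    replace mv`i' = mv`j' if mv`i' == 0
}
```

One caveat of this simple version: if the next variable is also 0, the value stays 0, so runs of consecutive zeros would need additional handling.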
Stata: looping over observations

Question (Rodrigo):
My data set looks like this:

x1
1
0
0
1
0
0
1
1

In this data set the values following a 1 belong to the same group. For example, the first 2 zeros belong to group 1 and the second 2 zeros belong to the second group, and so on. I would like to get a final output similar to this. Note that the delta between the two 1's is arbitrary:

x1 x2
1 1
0 1
0 1
1 2
0 2
0 2
1 3
1 4

I think I need to write a loop that goes over the observations. But I cannot figure out the logical statements that will accomplish this.

Answer (Nick Cox):
There is a loop over observations tacit as usual there, but you don't need an explicit loop. In detail, sum here is a cumulative or running sum. In your case, the first solution is simple and adequate. The reason for mentioning the second solution is because it's more general: we can tag the first observation in each block or spell with 1 and then create a running sum to form blocks of 1s, 2s, and so forth.
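The running-sum idea in the answer can be sketched as below. The two generate lines are a reconstruction of the two solutions it mentions, not a verbatim copy, and the name x2b is an illustrative choice.

```stata
* The x1 values from the question
clear
input x1
1
0
0
1
0
0
1
1
end

* First solution: the running sum of x1 goes up by one at every 1
generate x2 = sum(x1)

* More general form: tag the first observation of each block, then sum the tags
generate x2b = sum(x1 == 1)

list
```

Both new variables take the values 1 1 1 2 2 2 3 4, matching the x2 column requested in the question.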
There are times we need to do some repetitive tasks in the process of data preparation, analysis or presentation, for instance, computing a set of variables in the same manner, renaming or creating a series of variables, or repeatedly recoding the values of a number of variables.
Now the mean temperatures of each month are in Centigrade. If we want to convert them to Fahrenheit, we could do the computation for each of the 12 variables separately. However, this takes a lot of typing. Alternatively, we can use the -foreach- command to achieve the same goal. Note that braces must be specified with -foreach-.
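A sketch of the conversion loop follows. The source variable names mtemp1-mtemp12 are an assumption, chosen so that the results match the fmtemp1-fmtemp12 names used later, and the Centigrade values are made up.

```stata
* Made-up monthly mean temperatures in Centigrade (names are an assumption)
clear
set obs 3
forvalues i = 1/12 {
    generate mtemp`i' = 5 + `i'
}

* Fahrenheit = Centigrade * 9/5 + 32; f`v' expands to fmtemp1, fmtemp2, ...
foreach v of varlist mtemp1-mtemp12 {
    generate f`v' = `v' * 9 / 5 + 32
}
```

The single generate line inside the braces does the work of twelve separate commands.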
The open brace has to be on the same line as the foreach, and the close brace must be on a line by itself. This was a rather simple repetitive task which can be handled solely by the foreach command. Here we introduce another command, -local-, which is used a lot with commands like foreach to deal with repetitive tasks that are more complex. The -local- command is a way of defining a macro in Stata. A Stata macro can contain multiple elements; it has a name and contents.
Consider the following two examples:
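The original examples are not preserved here, so the two below are stand-ins chosen for illustration: one macro with a single element and one with several. The macro names i and month are assumptions.

```stata
* Example 1: a macro named i whose contents are a single element
local i 1
display "`i'"

* Example 2: a macro named month whose contents are several elements
local month jan feb mar
display "`month'"

* Count how many elements the macro holds
local n : word count `month'
display `n'
```

The last display shows 3, the number of words stored in month.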
Take the temperature dataset we created as an example. We can do so by just tweaking the code in the previous example a bit. We can obtain the same results in a slightly different way.
This time we use another 12 variables, fmtemp1-fmtemp12, as examples. Again, we will rename them as fmtempjan-fmtempdec. Define the local macro month, then define the local macro monthII inside the foreach loop, using the string function word() to reference the contents of the local macro month. The -forvalues- command is another command that is used a lot in handling repetitive work.
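The renaming step with word() might be sketched as follows. The macro names month and monthII are those used in the text; the loop variable n and the temperature values are illustrative assumptions.

```stata
* Made-up Fahrenheit data named fmtemp1-fmtemp12
clear
set obs 3
forvalues i = 1/12 {
    generate fmtemp`i' = 40 + `i'
}

* The month abbreviations, stored in order
local month jan feb mar apr may jun jul aug sep oct nov dec

foreach n of numlist 1/12 {
    * word() picks the n-th word out of the contents of month
    local monthII = word("`month'", `n')
    rename fmtemp`n' fmtemp`monthII'
}
```

After the loop, fmtemp1 has become fmtempjan, fmtemp2 has become fmtempfeb, and so on through fmtempdec.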
Consider the same temperature dataset we created. Suppose we would like to generate twelve dummy variables, warm1-warm12, to reflect whether each monthly average temperature is higher than the one in the previous year.
For example, I will code warm1 for a given year as 1 if the value of fmtemp1 for that year is higher than the value for the previous year. We could do this by writing the code for one variable and then repeating it twelve times to create the twelve variables warm1-warm12.
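With -forvalues-, the twelve repetitions can be collapsed into one loop. The sketch below assumes one observation per year and a variable called year, which is a hypothetical name; the temperature values are made up.

```stata
* Made-up data: one row per year, twelve monthly temperature variables
clear
set obs 4
generate year = 2000 + _n
forvalues i = 1/12 {
    generate fmtemp`i' = 40 + `i' + mod(_n, 2)
}

* The comparison uses the previous row, so the data must be in year order
sort year
forvalues i = 1/12 {
    * 1 if warmer than the previous year, 0 otherwise;
    * left missing for the first year, which has nothing to compare against
    generate warm`i' = fmtemp`i' > fmtemp`i'[_n-1] if _n > 1
}
```

This version assumes no missing temperatures. Stata treats a missing value as larger than any number, so missing data would need to be handled before the comparison.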