ST 555 – Programming HW #6 Solved

Description

This assignment has two distinct parts. Section 1 has you work with DO loops to generate data; in particular, you will investigate how to randomize a designed experiment using the DATA step. Section 2 asks you to work with real data by employing DO loops and arrays. While the examples are unrelated, the SAS programming techniques are related. Be sure to clearly delineate in your code (via a comment) where Section 1 ends and Section 2 begins. GPP would dictate that we use two separate programs for this, but that makes grading more difficult for the TAs, so we’ll put all our code into a single file. As always, final data sets should be placed in a permanent library, in this case named HW6.
Section 1: These tasks are enumerated on purpose – you need to complete Item 1 before you move on to Item 2 as they build upon the previous items.
1. Write a DATA step to generate a data set named DESIGN with three variables: Block, Treatment, and Replicate. Block can take on the values 1, 2, 3, 4, and 5. Within each block the treatments are A, B, C, and D. Within each treatment there should be three replicates. You should have 60 observations (5 blocks × 4 treatments × 3 replicates) and three variables in your final data set. This data set represents a designed experiment (specifically a balanced Complete Block Design) with four treatments and three replications per treatment.
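As a minimal sketch of the nested DO loop idea in Item 1 (one possible layout, not necessarily the intended solution), the step below writes one record per Block/Treatment/Replicate combination. It writes to the WORK library so that it runs as-is; in the assignment the data set would go in the permanent HW6 library, whose LIBNAME path is not given here.

```sas
/* Sketch: build the 5 x 4 x 3 design with nested DO loops.          */
/* Assumption: written to WORK so the step runs without a LIBNAME;   */
/* the assignment itself asks for the permanent library HW6.         */
data Design;
  do Block = 1 to 5;                    /* 5 blocks                   */
    do Treatment = 'A', 'B', 'C', 'D';  /* 4 treatments per block     */
      do Replicate = 1 to 3;            /* 3 replicates per treatment */
        output;                         /* one record per combination */
      end;
    end;
  end;
run;

/* Optional check: every Block*Treatment cell should hold 3 records */
proc freq data=Design;
  tables Block*Treatment / norow nocol nopercent;
run;
```

The explicit OUTPUT statement sits in the innermost loop because the DATA step itself iterates only once here; without it, only a single record would be written.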
2. One of the fundamental principles of a designed experiment is randomization. To randomize your design via the DATA step you can use the RAND() function. Copy the code you used to create the DESIGN data set and use it to create a data set named DESIGN_RAND_0 that includes an additional variable named RANNUM. To create RANNUM include the following assignment statement:
rannum = rand('uniform');
Your goal is to generate a random number for each of the 60 records. (Note: I’m not telling you where the statement goes in your DATA step – that is up to you to figure out.)
3. Once you’ve created your DESIGN_RAND_0 data set successfully (check the log!), sort it such that all records for a single block remain together but within a block the records are arranged in ascending order by RANNUM. Name this data set RCBD_0. Open this data set and compare it to our original data set, DESIGN. Can you see that sorting DESIGN_RAND_0 by RANNUM has shuffled the order of the records from DESIGN? (Also, if you’ve learned about designed experiments before, hopefully you can explain why this is so important!)
4. Now that your code for DESIGN_RAND_0 successfully randomizes the data, and since you are so proud of your work, you decide to show it to a colleague at work. Run your code several times (including the sort!) – after each run open the data set and take a look at the records. What stays the same? What changes? Place your answers in a comment immediately after your PROC SORT code for RCBD_0.
5. In a professional setting (such as consulting) your RCBD_0 code presents a problem – it isn’t repeatable. The ideal program would generate a random data set in a repeatable way! (This is not nearly as difficult as it sounds!) Let’s again copy your code – this time take the code from DESIGN_RAND_0 and use it to create a new data set called DESIGN_RAND_SEED. As before, you will need to update your code with a new statement:
call streaminit(12345);
Place this statement anywhere in your DATA step – as long as it’s before the assignment statement that creates RANNUM (FYI – it makes the most sense after the DATA statement and before your first DO statement). As before, sort your data and store it in a data set named RCBD_SEED. With this new statement in place (which you are not expected to learn unless you want to!) what happens when you run your code multiple times (including the PROC SORT)? What stays the same and what changes in your data sets? What happens if you change the number 12345 to a different positive number? What stays the same and what changes? Again, place your answers in a comment. (The positive integer you are changing is called the seed, which explains why I asked you to name your data set DESIGN_RAND_SEED.)
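The seeded randomization in Items 2 through 5 can be sketched as follows. This is one possible arrangement under stated assumptions: the data set names are written with underscores (SAS names cannot contain spaces), the output goes to WORK rather than HW6 so the code runs as-is, and the placement of the RANNUM assignment inside the innermost loop is a choice made for this sketch.

```sas
/* Sketch: repeatable randomization of the design.                  */
/* Assumptions: underscore names and the WORK library (see lead-in) */
data Design_Rand_Seed;
  call streaminit(12345);          /* set the seed before any RAND call  */
  do Block = 1 to 5;
    do Treatment = 'A', 'B', 'C', 'D';
      do Replicate = 1 to 3;
        rannum = rand('uniform');  /* a fresh draw for each of 60 records */
        output;
      end;
    end;
  end;
run;

/* Keep blocks together; shuffle the records within each block */
proc sort data=Design_Rand_Seed out=RCBD_Seed;
  by Block rannum;
run;
```

Removing the CALL STREAMINIT statement gives the unseeded version, where the stream is initialized from the system clock and the RANNUM values, and hence the within-block order, differ from run to run. With a fixed positive seed the same values come back on every run; a different seed gives a different, but equally repeatable, order. In both cases the set of Block, Treatment, and Replicate combinations is unchanged.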
Section 2: These tasks are enumerated on purpose – you need to complete Item 1 before you move on to Item 2 as they build upon the previous items. I’ve included a screenshot of what part of my data set looks like if you find it necessary. Your data set should look similar.
1. Use the provided FISH data set and sort it by lake type and dam status (LT and DAM). You won’t need the latitude or longitude variables for this assignment, so you can drop them.
3. Combine these summary statistics with the original data set using an appropriate join technique to create a data set named ALL. This data set will be required to complete the following items.
4. After the data sets are joined, but in the same DATA step, do the following. You must use arrays to get credit for this portion of the assignment. (Hopefully you can see why, since there are nine variables to work with here…)
• For each of the nine variables that have a mean and median computed, correctly compute the following. (That means no warnings or errors – syntax or otherwise!)
– Difference from the mean [variable – mean.of.variable]
– Percent difference from the mean [(variable – mean.of.variable)/mean.of.variable]
– Difference from the median [variable – median.of.variable]
– Percent difference from the median [(variable – median.of.variable)/median.of.variable]
• Apply reasonable formats and labels to all the variables that weren’t originally in the FISH data set.
• Drop any irrelevant variables.
5. Sort your ALL data set by NAME.
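The Section 2 workflow can be sketched on a toy stand-in for the FISH data. Everything specific here is an assumption made for illustration: the data set Fish_Toy, its values, the three measurement variables (Hg, Elv, Z, used instead of the nine real ones), and all derived variable names are hypothetical. The summary step is also assumed, since the item describing it does not appear in the text above; PROC MEANS with a BY statement is used as one plausible way to get a mean and median per LT/DAM group.

```sas
/* Toy stand-in for FISH: all names and values are made up */
data Fish_Toy;
  input Name $ LT DAM Hg Elv Z;
  datalines;
LakeE 2 1 1.5 210 12
LakeA 1 0 1.2 100 10
LakeC 1 1 0.5 300 20
LakeB 1 0 0.8 120 14
LakeD 2 1 0.9 250 8
;

proc sort data=Fish_Toy out=Fish_Sorted;
  by LT DAM;
run;

/* Assumed summary step: one row of means and medians per LT*DAM group */
proc means data=Fish_Sorted noprint;
  by LT DAM;
  var Hg Elv Z;
  output out=Stats(drop=_type_ _freq_)
         mean=Hg_Mean Elv_Mean Z_Mean
         median=Hg_Med Elv_Med Z_Med;
run;

/* Many-to-one match-merge, then array processing in the same DATA step */
data All;
  merge Fish_Sorted Stats;
  by LT DAM;
  array x[*]     Hg Elv Z;                      /* original values      */
  array xmean[*] Hg_Mean Elv_Mean Z_Mean;       /* group means          */
  array xmed[*]  Hg_Med Elv_Med Z_Med;          /* group medians        */
  array dmean[*] Hg_DMean Elv_DMean Z_DMean;    /* value - mean         */
  array pmean[*] Hg_PMean Elv_PMean Z_PMean;    /* (value - mean)/mean  */
  array dmed[*]  Hg_DMed Elv_DMed Z_DMed;       /* value - median       */
  array pmed[*]  Hg_PMed Elv_PMed Z_PMed;       /* (value - median)/med */
  do i = 1 to dim(x);
    dmean[i] = x[i] - xmean[i];
    dmed[i]  = x[i] - xmed[i];
    /* skip the division when the denominator is missing or zero */
    if xmean[i] not in (., 0) then pmean[i] = dmean[i] / xmean[i];
    if xmed[i]  not in (., 0) then pmed[i]  = dmed[i]  / xmed[i];
  end;
  format Hg_DMean Elv_DMean Z_DMean Hg_DMed Elv_DMed Z_DMed 8.2
         Hg_PMean Elv_PMean Z_PMean Hg_PMed Elv_PMed Z_PMed percent8.1;
  label Hg_DMean = 'Hg: difference from group mean'
        Hg_PMean = 'Hg: percent difference from group mean';
        /* remaining labels would follow the same pattern */
  drop i;                                       /* loop index not needed */
run;

proc sort data=All;
  by Name;
run;
```

Parallel arrays keep the four calculations in a single loop, so moving from three variables to nine only changes the variable lists, not the logic. Storing the percent differences as proportions and displaying them with a PERCENT format matches the bracketed formulas in the item.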
