Arrays And Looping Techniques

The looping concept is extremely powerful, and programming languages such as VB offer several ways to zip through sets of numbers. Most frequently, the numbers are stored in indexed sets known as arrays. After defining and illustrating the array, we will explore efficient ways to process some or all of the numbers in a set.

An array is a subscripted variable. Each member of the array is uniquely identified by the subscript attached to it. For example, we might have five children in a family, and we could identify each one by a number representing the child's birth order, with the array holding each child's age. A(1), the age of the oldest child, might be twelve. A(2), the age of the second child, might be ten. The array could be given in a list as follows:

    A(1) = 12
    A(2) = 10
    A(3) = 7
    A(4) = 4
    A(5) = 2

Note that the subscripts serve to identify the score, but do not convey the score directly.

The advantage conveyed by an array is that we can refer to the members either singly or as a group. For example, we can very succinctly direct the program to square all of the A's, or to sum all of the A's. We can operate on some but not all of the A's without having to name each value being transformed.

It is often useful to think of arrays as being associated via their connection to a common set of indices. Suppose we have a group of people for each of whom we wish to keep track of several attributes such as height, weight, age, number of children, and the like. We would define an array for each attribute, and place the value for each person in the same spot in each of the arrays. Thus, HEIGHT(4), WEIGHT(4), AGE(4), and NUMKIDS(4) would all contain values pertaining to person 4. The names of our individuals might be placed in another array, so NAME(4) would refer to person 4's name. This scheme would make for convenient access to all of the information we had about any particular member of the group.
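As a rough sketch of the associated-array scheme just described, the fragment below keeps three attributes for three made-up people and then pulls out everything known about one of them. The names PERSONNAME, WHO, and SHOWPERSON and all of the data values are hypothetical, chosen only for illustration. The DIM and OPTION BASE statements it relies on are explained in the paragraphs that follow, and DEBUG.PRINT is assumed here simply as a convenient way to write to VB's Immediate window.

```vb
OPTION BASE 1

' Hypothetical illustration: three associated arrays sharing one set of indices.
SUB SHOWPERSON()
    DIM PERSONNAME(3) AS STRING
    DIM HEIGHT(3) AS SINGLE     ' inches (made-up values)
    DIM AGE(3) AS INTEGER       ' years (made-up values)
    DIM WHO AS INTEGER

    ' The same subscript always refers to the same person.
    PERSONNAME(1) = "Ann"
    HEIGHT(1) = 64
    AGE(1) = 31

    PERSONNAME(2) = "Ben"
    HEIGHT(2) = 70
    AGE(2) = 45

    PERSONNAME(3) = "Cal"
    HEIGHT(3) = 68
    AGE(3) = 28

    ' One index retrieves every attribute of person 2.
    WHO = 2
    DEBUG.PRINT PERSONNAME(WHO) & " is " & AGE(WHO) & " years old and " & HEIGHT(WHO) & " inches tall."
END SUB
```

Changing WHO to 1 or 3 reports on a different person without touching any other line, which is the convenience the shared index is meant to provide.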
Within a VB program, an array is defined by a special dimensioning statement incorporating a limit which tells the compiler to set aside storage space for the set. DIM A(15) AS SINGLE means that the array called A has a largest subscript of 15, so, counting from 1, it may have up to 15 single precision scores in it. This statement must precede the appearance of the array.

The DIM statement works beautifully if you know the maximum size of the array. However, if the number of scores depends on values that may change as the program is running, as would be the case if an inputted value determined the array size, then you need a two-step process. First, use a DIM statement with empty parentheses to convey the idea that an array is coming (e.g., DIM NAMES() AS STRING). Then, before the array is used, a REDIM statement should be employed as well, such as REDIM NAMES(N). When the REDIM statement is executed, all of the values in the array are initialized to zeros (for a numeric variable) or blanks (for a string variable).

Microsoft's BASIC languages, including VB, have a quirk I've never understood. The default setup for arrays includes zero as the smallest subscript. Thus a 20 element array B goes from B(0) to B(19), and the statement DIM B(20) actually sets aside 21 elements, B(0) through B(20). This is not the way I learned to think algebraically. Fortunately there is an easy patch. The statement OPTION BASE 1 makes the smallest subscript one, and then DIM B(20) produces a 20 element array that goes from 1 to 20. I recommend that all programs using arrays include this statement at the top of each module, ahead of any DIM statements.

There is also a more complex patch you can use if for some reason you need arrays which begin with a zero subscript. The limits on each array can be expressed during the dimensioning, with a statement such as DIM A(1 TO 20) AS SINGLE; writing DIM A(0 TO 19) AS SINGLE instead gives a zero-based array even when OPTION BASE 1 is in effect. This seems to me the hard way, but perhaps people who know more math than I have good reasons.

For routine use in loop construction, the FOR ... NEXT loop is the most likely choice. Its structure clearly marks off the instructions contained within the loop.
The limits of the looping are specified in the FOR statement, and the NEXT statement signifies the end of the cycle.

Adding the scores in an array is an operation called for frequently. The way it is done is worth emphasizing. An accumulator variable is established, whose initial value is zero. The first time through the loop, the value of the first score is placed into the accumulator. Each subsequent pass through the loop adds the current score to the running total held by the accumulator. In our looping illustrations, the variable SUM is initialized prior to entering the loop. Then the variable MEAN is computed after the looping has concluded, when all of the scores have been incorporated into the sum.

    'input NSCORES using a text box
    DIM SCORES() AS SINGLE
    REDIM SCORES(NSCORES)
    SUM = 0
    FOR J = 1 TO NSCORES
        LABEL1.CAPTION = "What is the value of score number " & STR(J)
        'input the score value using another text box
        SUM = SUM + SCORES(J)
    NEXT J
    MEAN = SUM/NSCORES
    LABEL2.CAPTION = "Mean Value of Scores is " & MEAN

Another looping variation is the DO ... loop. It has two variants, DO WHILE and DO UNTIL. In either case, the loop is concluded with a LOOP statement.

    'input NSCORES using a text box
    DIM SCORES() AS SINGLE
    REDIM SCORES(NSCORES)
    SUM = 0
    J = 0
    DO WHILE J < NSCORES
        J = J + 1
        LABEL1.CAPTION = "What is the value of score number " & STR(J)
        'input the score value using another text box
        SUM = SUM + SCORES(J)
    LOOP
    MEAN = SUM/NSCORES
    LABEL2.CAPTION = "Mean Value of scores is " & MEAN

The DO UNTIL variant simply replaces the DO WHILE statement with a rather similar DO UNTIL J = NSCORES. You can, if you prefer, express individuality by having the first statement in the loop simply say DO, with the last containing the limit: LOOP WHILE J < NSCORES.

Having so many ways to accomplish the same operation is a feature of VB that can sometimes drive one crazy. You don't really have to know them all. My personal preference is to use the FOR ... NEXT loop when working with arrays, because the looping limits are explicitly specified within the loop. Also, one less line needs to be typed (the starting value of the index) than with the DO forms. DO ... loops seem to me to be most useful for looping operations in which only one boundary is specified, such as cases in which the loop goes on until a stopping value is achieved.

Another possibly excessive flexibility in FOR ... NEXT loops permits the use of oddball step sizes and limits. Loops normally go from 1 TO N, stepping through the integers in an orderly, expected fashion. This may be thought of as the default structure. The default implicitly sets a parameter of the FOR statement called the STEP size to the value 1. When used explicitly, the step size may be any nonzero value; e.g., FOR J = 1 TO 15 STEP 2. The STEP size of 2 would set the indices successively to 1, 3, 5, 7, 9, 11, 13, 15. Limits need not begin at 1 either; any value is allowed. We can loop FOR L5 = 3 TO 12. Looping may even go in reverse order, but then the STEP value will be negative, as in FOR LOOPINDEX = 100 TO 10 STEP -10.

It may occur to you that none of these variations is really necessary for the looping structure to achieve the desired result. It would be possible to do a little algebra within the default form of the loop that would generate the same steps and limits. But why bother, when the language makes it so easy? In any case, the looping is terminated when the current index goes past the second limit; it need not land evenly on it. FOR LOOPER = 10 TO 25 STEP 10 will make two stops: the first at 10, the second at 20; then the looping will terminate. For clarity in my examples, I have made the STEP sizes constants. They need not be; just like the limits, the step size may be either a constant or a variable.

If you prefer to loop with simple IF and GOTO statements, and are prepared to face sneers from expert programmers, here's how.

    'input NSCORES using a text box
    DIM SCORES() AS SINGLE
    REDIM SCORES(NSCORES)
    SUM = 0
    J = 0
    10 J = J + 1
    LABEL1.CAPTION = "What is the value of score number " & STR(J)
    'input the score value using another text box
    SUM = SUM + SCORES(J)
    IF J < NSCORES THEN GOTO 10
    MEAN = SUM/NSCORES
    LABEL2.CAPTION = "Mean Value of scores is " & MEAN

Nested Loops And Multidimensional Arrays

Although the term "multidimensional array" sounds complex, it simply expresses the concept of data arranged in tables. When we process the numbers in a table, we work through the rows and columns. Typically what is done for one column is repeated for all of the columns. For example, in an annual report we might need to cumulate the monthly sales figures for each of several salespersons. We might want to identify the good staff members and to see which months need special advertising campaigns. The data are two-dimensional because each number is attached to, or indexed by, both a person and a month. The scores are doubly indexed.

From the programming perspective, accessing the scores calls for two looping operations. There will be a loop within a loop. The outer loop will cycle through the persons. Then, for each person, the inner loop will cycle through the months. When all of the months for the person have been dealt with, it is time to move on to the next person. It may be seen that the inner loop cycles quickly and the outer loop slowly.

Let's write a program to accomplish this task. We will place the sales figures in a two-dimensional array. As the scores come in, they are summed in one-dimensional arrays keeping track of salesperson totals and month totals respectively. Finally, single loops are used to print out the desired marginal means. In this simplistic report, these averages themselves do not need to be stored as variables since they are computed on the fly during the printing.

    'input number of salespersons (NSELLERS) using a text box
    DIM SELLERNAMES() AS STRING
    REDIM SELLERNAMES(NSELLERS)
    DIM SALES() AS SINGLE
    REDIM SALES(NSELLERS, 12)
    DIM SELLERSUMS() AS SINGLE
    REDIM SELLERSUMS(NSELLERS)
    DIM MONTHLYTOTALS(12) AS SINGLE
    FOR I = 1 TO NSELLERS
        LABEL1.CAPTION = "What is salesperson " & STR(I) & "'s name?"
        'input SELLERNAMES(I) from a text box
        FOR J = 1 TO 12
            LABEL1.CAPTION = "For Month " & STR(J) & ", please enter the sales for " & SELLERNAMES(I) & " in dollars:"
            'input SALES(I, J) from a text box
            SELLERSUMS(I) = SELLERSUMS(I) + SALES(I, J)
            MONTHLYTOTALS(J) = MONTHLYTOTALS(J) + SALES(I, J)
        NEXT J
    NEXT I
    FOR I = 1 TO NSELLERS
        'display SELLERNAMES(I) and that seller's monthly mean, SELLERSUMS(I)/12
    NEXT I
    FOR J = 1 TO 12
        'display the mean sales for month J, MONTHLYTOTALS(J)/NSELLERS
    NEXT J

Although one and two dimensions are likely to satisfy your array cravings, VB allows you to have up to 60 dimensions. My mind is incapable of picturing such complexity, but tagging scores with several indices is not unusual in social science endeavors. A score may be identified in terms of the subject generating it, the trial number, each of several stimulus conditions, and so on. When the program processes the scores, deeply nested loops will be needed.

Those of us with one-track minds sometimes find the distinction between a multidimensional array and multiple associated arrays less than crystal clear. A multidimensional array consists of one set of scores, each of which is identified by multiple indices. Each sales figure is identified with a seller and a month. Associated arrays refer to several sets of scores, each identified with one person. Each seller has a name, stored in one array, and also an annual sales total, stored in another. For readers who speak statistics language, a multidimensional array corresponds to multiple independent variables combined in a factorial design, while associated arrays are like multiple dependent variables.

Jumping Out of a Loop

Normally a loop cycles smoothly through all of the defining indices. Occasionally, though, a conditional exit is desired before the looping is finished.
When the condition is satisfied, the loop is abandoned. The program will usually need to keep track of the indices at the point of departure. For example, suppose we have a set of scores representing monthly incomes, and our interest is in whether anyone has an income over $5000.

    OPTION BASE 1
    DIM WORKER(10) AS STRING
    DIM SALARY(10) AS SINGLE
    WORKER(1) = "Jim Smith"
    SALARY(1) = 2800
    WORKER(2) = "Melissa Washington"
    SALARY(2) = 3300
    WORKER(3) = "Hillary Chu"
    SALARY(3) = 4800
    WORKER(4) = "James Hurley"
    SALARY(4) = 1950
    WORKER(5) = "Shirley Katz"
    SALARY(5) = 5300
    WORKER(6) = "Stanley Hernandez"
    SALARY(6) = 3400
    FOR I = 1 TO 6
        IF SALARY(I) > 5000 THEN
            LABEL1.CAPTION = WORKER(I) & " makes more than $5000."
            EXIT FOR
        END IF
    NEXT I
    END

The departure statement is EXIT FOR. There is also a corresponding EXIT DO. I don't recommend using EXIT FOR in nested loops, because it leaves only the innermost loop that contains it, and it's too hard to figure out where the program goes. The alternative way to jump out of a loop is with a GOTO linenumber, and that's easy to follow. Once a loop has been left, attempts to get back in are not advised, because the indexing gets messed up.
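To illustrate the EXIT DO counterpart and the advice about keeping track of the index at the point of departure, here is a minimal sketch that reuses the salary figures from the example above. The names FINDHIGHEARNER and FOUNDAT are hypothetical, and DEBUG.PRINT is assumed as a stand-in for whatever display method the program actually uses; this is one possible arrangement, not the only one.

```vb
OPTION BASE 1

' Hypothetical sketch: leave a DO loop early and remember where we left it.
SUB FINDHIGHEARNER()
    DIM SALARY(6) AS SINGLE
    DIM I AS INTEGER
    DIM FOUNDAT AS INTEGER      ' 0 means no qualifying salary was found

    SALARY(1) = 2800
    SALARY(2) = 3300
    SALARY(3) = 4800
    SALARY(4) = 1950
    SALARY(5) = 5300
    SALARY(6) = 3400

    FOUNDAT = 0
    I = 0
    DO WHILE I < 6
        I = I + 1
        IF SALARY(I) > 5000 THEN
            FOUNDAT = I         ' save the index before departing
            EXIT DO
        END IF
    LOOP

    ' After the loop, FOUNDAT tells us whether and where the exit occurred.
    IF FOUNDAT > 0 THEN
        DEBUG.PRINT "Worker number " & FOUNDAT & " makes more than $5000."
    ELSE
        DEBUG.PRINT "Nobody makes more than $5000."
    END IF
END SUB
```

Saving the index in a separate variable means the code after the loop does not have to guess whether the loop ended early or simply ran out of scores.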