Data
APL owes a considerable amount of its power and conciseness to the way it handles data. Many lines of code in non-APL programs are devoted to 'dimensioning' the data the program will use and to setting up loops and counts to control data structure.

With APL you can create variables dynamically as you need them, and the structure you give a data item when you create it determines how it will be treated when it's processed.

Data is an important subject in APL. The rest of this chapter is a survey of its main characteristics.

Variables

As in most programming languages, data can be directly quoted in a statement, for example:

          234.98 × 3409÷12.4

or it can be 'assigned' to a name by the symbol ←, in which case it's called a variable:

          VAR ← 183.6

We concentrate on variables in this chapter, but the comments on data type, size and shape are equally applicable to directly quoted numbers and characters.

Names

Variables, user-defined functions and user-defined operators have names which are composed of letters and digits. The full rules are in the APLX Language Manual, but here are some examples:

          PRICE   A   albert   A999   ITEM∆1   THIS_ONE   That¯One

APL uses upper-case and lower-case characters. APL regards the symbol

Note: On some non-GUI implementations of APLX, the lower-case characters may be replaced by underlined upper-case characters. Please check in the Supplement which covers your implementation of APLX.

Types of data

Data can be numbers, characters or a mixture of the two. Characters are enclosed in single quotes and include any letter, number or symbol you can type on the keyboard, plus other, non-printing characters. The space counts as a character.
Numeric digits, if enclosed in quotes, have no numeric significance and can't be involved in arithmetic.
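
A short session sketch of these two points. It assumes the one-argument form of ⍴ (described later in this chapter) to report the number of elements; the strings and numbers are illustrative only.

```apl
      ⍴'TWO WORDS'            ⍝ the space is counted as a character
9
      ⍴'123'                  ⍝ three characters, not the number 123
3
      2 × 123                 ⍝ arithmetic works on the unquoted number
246
```
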

Size, shape and depth

An array in APLX can be anything from a single letter or number to a sixty-three dimensional array. Elements within the item may themselves be arrays. Here are some examples of data items:
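
A few data items of different kinds, as a minimal sketch. The names S, V, C, M and N are hypothetical, and the two-argument form of ⍴ used to build M is explained later in this chapter.

```apl
      S ← 7                   ⍝ a single number
      V ← 3 1 4 1 5           ⍝ a five-element numeric vector
      C ← 'HELLO'             ⍝ a five-element character vector
      M ← 2 3⍴1 2 3 4 5 6     ⍝ a matrix of 2 rows and 3 columns
      N ← (1 2 3) 'AB' 9      ⍝ a vector whose first two elements are themselves vectors
      M
1 2 3
4 5 6
```
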
As you'll have gathered, data is considered to have dimensions. A single number or character scalar (like a point) has no dimensions. A vector has one dimension, length. A matrix has two dimensions, height and length. The word 'array' is a general term applicable to a data structure of any dimension. Arrays of up to sixty-three dimensions are possible in APLX.

An array which contains other arrays is called nested. An array which does not is called simple.

This is how APL displays a three-dimensional array:

    23 30 11  8
    30 22 23 20
     3 19 27  9

    14 23 15  8
     9 11  5 15
    27 28  2 28

    16 16 10 30
    15  8  3 29
     3 16 12  9

Each of the three blocks of numbers has two dimensions represented by the rows and columns. The three blocks form three planes which constitute another dimension, depth.

You will notice that the array is displayed on the screen in such a way that you can identify the different dimensions. No spaces are left between the rows of each plane. One blank line is left between each plane. A four dimensional array would be displayed with two blank lines between each set of planes.

More complicated arrays, where some of the elements are themselves arrays, will also have a 'depth' which measures the degree of complexity of the structure. Thus a simple scalar has a depth of 0 and a structure whose elements are purely simple scalars (such as the array shown above) has a depth of 1. If any element of an array is itself an array, the array has a depth of 2. The depth will go on increasing with the complexity of the structure. An array which has an element which in turn has a non-scalar element has a depth of 3, and so on.

Setting up data structures

It isn't always necessary to explicitly define the size or shape of data:

          X ← 23 9 144 12 5 0

In the case above, X is a six-element vector, by virtue of the fact that six elements are assigned to it.
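
The implied length of X, and the depth rules given earlier in this section, can be checked in a session. This is a sketch that assumes the one-argument form of ⍴ (covered later in this chapter) and the depth function ≡ found in nested-array APLs such as APLX; neither has been introduced at this point in the text.

```apl
      X ← 23 9 144 12 5 0
      ⍴X                      ⍝ length follows from the six values assigned
6
      ≡X                      ⍝ a simple vector has depth 1
1
      ≡5                      ⍝ a simple scalar has depth 0
0
      ≡(1 2) (3 4)            ⍝ each element is itself a vector: depth 2
2
```
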

Vectors which contain both characters and numbers may be set up by enclosing the characters in quotes:

          X ← 1 2 'A' 'B' 3 4

Explicit instructions would be necessary if we wanted the six elements to be rearranged as rows and columns. The two-argument form of the function ⍴ gives these instructions:

          2 3 ⍴ 23 9 144 12 5 0
    23 9 144
    12 5   0

The left argument specifies the number of rows (in this case 2) and the number of columns (in this case 3). The right argument defines the data to be arranged in rows and columns.

Notice that the dimensions are always specified in this order, that is:

- columns are the last dimension
- rows precede columns and, if there are only two dimensions, are the first dimension.

In the case of data with more than two dimensions, the highest dimension comes first. So in the three-dimensional example used earlier, the plane dimension is the first dimension followed by the rows, then the columns. (The ordering of dimensions is an important point and will be discussed again later in this chapter.)

To return to the ⍴ function: arrays of three or more dimensions are set up in a similar way to matrices. The following statement specifies that the data in a variable called NUMS is to be arranged as three planes, each of three rows and four columns:

          3 3 4⍴NUMS

The result would look like the three-dimensional array shown in the previous section.

The ⍴ function repeats the data in its right argument as often as needed to fill the structure:

          6⍴9
    9 9 9 9 9 9

Arrays of arrays (or 'nested arrays') may be set up by a combination of these rules. Here we set up another vector, some of whose elements are themselves vectors or matrices. Note the use of parentheses to indicate those elements which are actually arrays.

          VAR ← (2 3⍴9) (1 2 3) 'A' 'ABCD' 88 16.1

The variable VAR is a six-element vector whose first, second and fourth elements are themselves arrays.

Data structure versus data value

A data structure has certain attributes, regardless of the specific data it contains. For example, a vector has one dimension while a single number has no dimensions.

You can take advantage of this fact. If you intend to use a single number for certain purposes, it may be convenient to set it up as a one-element vector.
In this next example X is defined as a one-element vector:

          X ← 1 ⍴ 22

For contrast, here Y is defined as a single number:

          Y ← 22

The difference between X and Y shows when you ask for the size of each:

          ⍴X
    1
          ⍴Y
                        empty response

Both variables contain the value 22, but X is a vector with one dimension, while Y is a single number with no dimensions, so the enquiry about Y returns an empty result.

Similarly, it may be convenient in certain situations to define a vector as a one-row matrix. Here Z is defined as a matrix of one row and five columns:

          Z ← 1 5 ⍴ 12 5 38 3 6

It looks like a vector when displayed:

          Z
    12 5 38 3 6

But an enquiry about its size returns information about both its dimensions:

          ⍴Z
    1 5

Empty data structures

Variables which have a structure but no content may also be useful, for example as predefined storage areas to which elements can be added.

An 'empty vector' is a variable which has been defined as a vector, but which has no elements. Similarly, an 'empty matrix' has the appropriate structure, but no elements.

There are many ways of creating empty data structures. To take one example, the function ⍳ applied to 0 returns an empty vector:

          X ← ⍳0
          X

But it is a vector (albeit an empty one) and does have the dimension of length. If the one-argument form of ⍴ is applied to X, the result is:

          ⍴X
    0

This indicates that its length is zero elements. Contrast this with the answer returned if you apply ⍴ to a single number:

          ⍴ 45

An empty answer is displayed since the item has no dimensions.

An empty matrix can be created in the same way as an empty vector. In the following example, an empty matrix is created consisting of 3 rows and no columns:

          TAB ← 3 0⍴⍳0

Dimension ordering

When a function is applied to an item with more than one dimension, you need to know which dimension the function will operate on. If you apply an add operation to a matrix, for example, will it produce the sums of the rows or the sums of the columns?

               COL 1   COL 2   COL 3   COL 4
    ROW 1      1   +   2   +   3   +   4   =  10
    ROW 2      5   +   6   +   7   +   8   =  26
    ROW 3      9   +  10   +  11   +  12   =  42
              ==      ==      ==      ==
              15      18      21      24

The rule is that unless you specify otherwise, operations take place on the last dimension.

The 'last' dimension is the one specified last in the size statement:

          TABLE ← 3 4⍴DATA

In TABLE the last dimension is the columns. An add operation 'on' the columns adds each element in column 1 to the corresponding element in columns 2, 3 and 4.
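
A session sketch of this rule. It assumes that DATA holds the numbers 1 to 12, as in the table above (⍳12 with the default index origin of 1), and it uses +/ (plus reduction) as the 'add operation', which this chapter has not introduced by name. The bracketed axis in the last statement applies the same addition to the first dimension (the rows), the case taken up in the following paragraphs.

```apl
      DATA ← ⍳12              ⍝ assumed contents: 1 2 3 ... 12
      TABLE ← 3 4⍴DATA
      +/TABLE                 ⍝ default: add along the last dimension (columns)
10 26 42
      +/[1]TABLE              ⍝ axis 1: add along the first dimension (rows)
15 18 21 24
```
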

So, as can be seen, an add operation 'on' the columns produces the sum of the elements in each row.

Similarly, if you were to apply the add operation to the first dimension of the matrix, that is to the rows, it would add all the items in row 1 to the corresponding items in rows 2 and 3:

    ROW 1      1    2    3    4
               ↓    ↓    ↓    ↓
    ROW 2      5    6    7    8
               ↓    ↓    ↓    ↓
    ROW 3      9   10   11   12
               ↓    ↓    ↓    ↓
              15   18   21   24

So an add operation applied to the rows produces the sum of each column.

As already described, by default operations are applied to the last dimension (the columns). If you want to specify a different dimension, you can do so by using the axis operator ([ ]).

Indexing

To select elements from a vector or matrix a technique called indexing is used. For example, if you have a ten-element vector like this:

          X ← 1 45 6 3 9 33 6 0 1 22

the following expression selects the fourth element and adds it to the tenth element:

          X[4] + X[10]

Note that square brackets are used to enclose the index.

To index a matrix, two numbers are necessary, the row number and the column number:

          TABLE
    12 34 27
     9 28 14
    66  0 31
          TABLE[3;2]
    0

In the last example the index selected the element in row 3, column 2. Note the semicolon used as a separator between the rows and columns. Note also the order in which dimensions are specified. This corresponds to the order used in the ⍴ function.

Items can be selected from data with three or more dimensions in exactly the same way:

          DATA[2;1;4]

selects the item in plane 2, row 1, column 4 of a three-dimensional data structure.

To select an entire row from the matrix above you could type:

          TABLE[1;1 2 3]

That is, you could specify all three columns in row 1. A shorter way of specifying this is:

          TABLE[1;]

Similarly, to select a column, say column 2, you would enter:

          TABLE[;2]

The expression you put in square brackets doesn't have to be a direct reference to the item you want to select. It can be a variable name which contains the number which identifies the item.
Or it can be an expression which when evaluated yields the number of the item:

          (3 8 4)[1+2]
    4

The above statement selects item 3. The item selected by the following statement depends on the value of P:

          'ABCDE'[P]
    B

You can also use indexing to re-arrange elements of a vector or matrix:

          'ABCDE'[4 5 1 4]
    DEAD

Finally note that the data or variables used within an indexing expression may be of a higher dimension than the object being indexed. Thus:

          'ABCDE'[2 2⍴4 5 1 4]
    DE
    AD

For more details on this point check the entry for

          2⌷ 'ABCD'

selects the second element from
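
The bracket-indexing forms described in this section can be collected into one short session sketch. It reuses the vector X and the 3 by 3 TABLE shown earlier, builds TABLE with the two-argument ⍴, and assumes that P holds 2 (the value implied by the result B above) and that the index origin is the default of 1.

```apl
      X ← 1 45 6 3 9 33 6 0 1 22
      X[4] + X[10]            ⍝ 3 + 22
25
      TABLE ← 3 3⍴12 34 27 9 28 14 66 0 31
      TABLE[1;]               ⍝ all of row 1
12 34 27
      TABLE[;2]               ⍝ all of column 2
34 28 0
      P ← 2                   ⍝ assumed value of P
      'ABCDE'[P]
B
```
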
Copyright © 1996-2008 MicroAPL Ltd