Per Erik Strandberg

Abstract

This entry in Min Blogg deals with some curious behavior of the increment operator. In particular, for lines like a[c++] = b[c]; I found inconsistencies that depend on both the language (C or C#) and the compiler (cl, gcc, csc or mcs).

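WARNING: In C, "a[c++] = b[c];" has undefined behavior, since c is modified and also read elsewhere in the same expression with no sequence point in between. In C#, on the other hand, it is well defined: operands are evaluated from left to right.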

Headlines are:

  1. ++ = Increment
  2. Pointer Arithmetic
  3. Use of increment on regular integers
  4. The Problem
  5. The solution
  6. Compilation
  7. Observations
  8. Conclusions
  9. See Also

1. ++ = Increment

Some time in ancient history (at the latest when the first version of C was released in 1972) the increment operator was invented. In general this operator is used to add one to something. For example, if we increment an integer its value is increased by one:

#include <stdio.h>
int main(void)
{
  int c = 5;
  c++;
  printf("c is now %d\n", c); // prints "c is now 6"
  return 0;
}

Also note that the "inverse" of increment is called decrement and is written --.

All of this would be fairly pointless if it weren't for pointers, arrays and pointer arithmetic.

2. Pointer Arithmetic

Pointers have some form of built-in intelligence (remember that we are talking about the computer science of the 1970s: I do not mean the GPS, MP3, self-destruct-button kind of intelligence, I mean a cruder form). If, for example, we have an array of some complex data type and assign a pointer to the start of it, then the increment operator (++) makes the pointer point to the next item in the array, however large each item is.

#include <stdio.h>
int main(void)
{
  int arr[10] = {0};

  // c points at item in position 4 (the fifth item)
  int* c = &arr[4];

  // sets item in position 4 to 1337, then c points at the next item
  *c++ = 1337;

  printf("arr[4] is now %d\n", arr[4]); // prints "arr[4] is now 1337"
  return 0;
}

Also note that the increment can be done in another way, by putting ++ before the pointer (pre-increment) instead of after it (post-increment). The comments in the next example explain the difference:

#include <stdio.h>
int main(void)
{
  int arr[10] = {0};

  // d points at item in position 4 (the fifth item)
  int* d = &arr[4];

  // d first moves to the next item (position 5), then that item is set to 1337
  *++d = 1337;

  printf("arr[5] is now %d\n", arr[5]); // prints "arr[5] is now 1337"
  return 0;
}

Not all languages have built-in support for pointers. C#, for example, allows you to use them only under certain conditions (in code marked unsafe). Python has no increment operator but can, from my perspective, be considered to use pointers anyway.

3. Use of increment on regular integers

Today I was confronted with the following situation: I was given a very long input vector and a quite short output vector. I was to add N zeros to the beginning of the output vector and then fill the remaining positions with values from the input vector. Pretty much like this:

double[] A = new double[10] { 12, 23, 24, 34, 45, 35, 56, 56, 57, 53};
double[] B = new double[6];

int n = 3;

for (int i = 0; i < n; i++)
  B[i] = 0;

for (int i = n; i < B.Length; i++)
  B[i] = A[i];

// values of the array
// A: 12 23 24 34 45 35 56 56 57 53
// B:  0  0  0 34 45 35

Since I had a number of variables and wanted to use as few silly counters as possible, I tried to keep the code minimal, pretty much like this:

int c = 0;

for (int i = 0; i < n; i++)
{
  B[c] = 0;
  c++;
}

for (int i = n; i < B.Length; i++)
{
  B[c] = A[c];
  c++;
}

The advantage of this is not obvious in this example. But imagine that the loops require a lot of computation and so on to determine the number of items to set in B. Also: perhaps we need to call other functions where we pass c as a parameter to know where in the arrays to insert values. Anyway: the last loop annoyed me and I wanted to lower the number of lines in it from two to one, to something like this:

for (int i = n; i < B.Length; i++)
{
  B[c] = A[c++];
}

Please note that the cleanest way to do exactly what I mean in the minimal loop above is to move the increment into the loop header, something like this:

for (int i = n; i < B.Length; i++, c++)
{
  B[c] = A[c];
}

4. The Problem

The thing to think about here is what the line B[c] = A[c++]; is decomposed to. Questions that ran through my mind were:

5. The solution

I created two simple code files, one in C and one in C#, that test many possible cases:

They both first contain some declarations creating eight arrays to be filled with values from a ninth array. They also contain this horrible for-loop:

// c1-c8 are all 1 before this loop
for (i = 2; i < 5; i++, c3++, ++c7)
{
  b1[c1] = a[c1++];
  b2[c2++] = a[c2];
  b3[c3] = a[c3];
  b4[c4++] = a[c4++];

  b5[c5] = a[++c5];
  b6[++c6] = a[c6];
  b7[c7] = a[c7];
  b8[++c8] = a[++c8];
}

6. Compilation

I compiled using:

  1. cl, producing a native ANSI C executable.
  2. gcc, likewise.
  3. the built-in csc compiler in .NET 2.x (I later tested csc from 1.1 and the results are the same). mcs from the Mono platform also produces the same result. (Perhaps since they all use the Common Intermediate Language (CIL)?)

Output from the CIL versions

b1: 0 2 3 4 0 0 0 0 0 0
b2: 0 3 4 5 0 0 0 0 0 0
b3: 0 2 3 4 0 0 0 0 0 0
b4: 0 3 0 5 0 7 0 0 0 0
b5: 0 3 4 5 0 0 0 0 0 0
b6: 0 0 3 4 5 0 0 0 0 0
b7: 0 2 3 4 0 0 0 0 0 0
b8: 0 0 4 0 6 0 8 0 0 0

Output from the cl version

b1: 0 2 3 4 0 0 0 0 0 0
b2: 0 2 3 4 0 0 0 0 0 0
b3: 0 2 3 4 0 0 0 0 0 0
b4: 0 2 0 4 0 6 0 0 0 0
b5: 0 0 3 4 5 0 0 0 0 0
b6: 0 0 3 4 5 0 0 0 0 0
b7: 0 2 3 4 0 0 0 0 0 0
b8: 0 0 0 4 0 6 0 8 0 0

Output from the gcc version

b1: 0 2 3 4 0 0 0 0 0 0
b2: 0 2 3 4 0 0 0 0 0 0
b3: 0 2 3 4 0 0 0 0 0 0
b4: 0 2 0 4 0 6 0 0 0 0
b5: 0 3 4 5 0 0 0 0 0 0
b6: 0 0 3 4 5 0 0 0 0 0
b7: 0 2 3 4 0 0 0 0 0 0
b8: 0 0 4 0 6 0 8 0 0 0

7. Observations

Comparing the three outputs above:

  1. b1, b3, b6 and b7 are identical in all three versions.
  2. b2 and b4 differ between C# and C: the CIL versions write to the same positions as cl and gcc, but store the value one step further along in a.
  3. b5 and b8 differ between the two C compilers: gcc agrees with the CIL versions, while cl stores the same values one position further to the right.

8. Conclusions

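My guess here is that the increment operator is not defined well enough in C: the language does not fix the order in which the two sides of the assignment are evaluated, so each compiler is free to choose.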

My interpretation, going by the outputs above, is that b2[c2++] = a[c2]; in C# is converted to:

int i = c2;
c2++;
b2[i] = a[c2];

and in C (ANSI), at least with the cl and gcc versions I tested, to:

int p = a[c2];
b2[c2] = p;
c2++;

I used Reflector on the C# version and the loop there looked like this:

while (i < 5)
{
  b1[c1] = a[c1++];
  b2[c2++] = a[c2];
  b3[c3] = a[c3];
  b4[c4++] = a[c4++];
  b5[c5] = a[++c5];
  b6[++c6] = a[c6];
  b7[c7] = a[c7];
  b8[++c8] = a[++c8];
  i++;
  c3++;
  c7++;
}

So that does not help much. Perhaps some details could be found by reading the CLI, but I don't understand that yet.

9. See Also


This page belongs in Kategori Programmering.