Part I - Additions to the C language I would like to have

April 2021

Most of these features have been implemented in my transpiler, which can be tried online here:

cprime V2 online
cake V3 online

Struct member initializer

struct members can be annotated with their respective initialization value.


struct X {
   int i = 1;
   struct Point pt = { .x = 1, .y = 1 };
};

This information is used for empty initialization and empty compound literals.

int main() {
   
   struct X x = {};       
   x = (struct X) {};       
}
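
For comparison, here is a minimal sketch of how the same defaults can be approximated in standard C today. The macro name X_DEFAULT, the helper sum_of_defaults and the layout of struct Point are assumptions made for this illustration; they are not part of the proposal.

```c
struct Point { int x; int y; };

struct X {
    int i;
    struct Point pt;
};

/* Hypothetical macro carrying the defaults that the proposal would
   attach directly to the members of struct X. */
#define X_DEFAULT ((struct X){ .i = 1, .pt = { .x = 1, .y = 1 } })

/* Returns i + pt.x + pt.y of a default-initialized struct X. */
int sum_of_defaults(void)
{
    struct X x = X_DEFAULT;   /* stands in for: struct X x = {};   */
    x = X_DEFAULT;            /* stands in for: x = (struct X) {}; */
    return x.i + x.pt.x + x.pt.y;
}
```

The drawback of the macro is that the defaults live apart from the struct declaration, which is what the member initializers are meant to avoid.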

C++ has this feature, but with some unexpected design choices.


struct point {
    int x = 1;
    int y = 2;
};

struct line {
  struct point point = { .x= 3, .y = 4 };
};

int main() {
  struct line line = { .point = { .x = 5 } };

  // Is line.point.y 2 or 4?

  printf("%d\n", line.point.y);
}

This sample prints 2 in C++.

I was expecting 4.

C++ also accepts non-constant initializers.

if with initializer

This is the same as in C++17. See https://en.cppreference.com/w/cpp/language/if


  if (struct X* pX = malloc(sizeof * pX); pX)
  {    
    ...
    free(pX);
  }
  
  //pX out of scope
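
In standard C, a similar scoping effect can be obtained with an extra block. The following is only a sketch; struct X with a single value member and the function name use_scoped_pointer are assumptions for the example.

```c
#include <stdlib.h>

struct X { int value; };

/* Returns 42 on success, -1 if the allocation failed. */
int use_scoped_pointer(void)
{
    int result = -1;
    {   /* extra block: limits the scope of pX, like the if-initializer */
        struct X *pX = malloc(sizeof *pX);
        if (pX) {
            pX->value = 42;
            result = pX->value;
            free(pX);
        }
    }
    /* pX is out of scope here */
    return result;
}
```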

if with initializer and defer-expression

Considering the interesting pattern above (which is very useful to avoid bugs), we also have an option with 'defer' that puts everything on the same line.

  if (FILE* f = fopen("file.txt", "r"); f; fclose(f))
  {        
     
  }

When jumps like continue, break or goto are used, the deferred expression is executed before the jump.

When return is used, the result is first copied to a local variable, then the deferred expression is executed, and finally the copied value is returned.
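
The order of operations on return can be written out by hand in standard C. This is a sketch of the described behavior, not the code the transpiler generates; first_byte is a hypothetical function that returns the first byte of a file, or -1 if the file cannot be opened.

```c
#include <stdio.h>

/* Hypothetical example: first byte of the file, EOF if the file is
   empty, or -1 if it cannot be opened. */
int first_byte(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f) {
        int result = fgetc(f);  /* 1. the result is copied to a local   */
        fclose(f);              /* 2. the deferred expression runs      */
        return result;          /* 3. the copied value is returned      */
    }
    return -1;                  /* fopen failed: nothing to close       */
}
```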

try-block statement and throw

The try-block statement creates a region where we can use throw to jump either to the end of the statement or into the catch block, if there is one.


   try {
      throw; /*jump*/
   }
   /*here*/
   
   try {
      throw; /*jump*/
   }
   catch
   {
     /*here*/
   }
      

The difference from C++ is that throw can only be used inside try-blocks, which makes the jump always local.
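
Because the jump is always local, it can be expressed with plain goto in standard C. The sketch below shows one possible mapping; the function checked_div and the label names are assumptions for the example.

```c
/* Returns a / b, or -1 when the "throw" path is taken. */
int checked_div(int a, int b)
{
    int result;
    /* try */
    {
        if (b == 0)
            goto catch_block;   /* throw: always a local jump */
        result = a / b;
        goto end_try;           /* normal exit skips the catch block */
    }
catch_block:
    {
        result = -1;            /* catch body */
    }
end_try:
    return result;
}
```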

defer

With defer, the statement is executed at the end of the scope, or before leaving the scope through jumps such as return, break, etc.

  defer statement
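
Standard C has no defer, but GCC and Clang offer a cleanup variable attribute that gives a comparable effect for a single variable. The sketch below relies on that compiler extension; the names free_buffer, use_buffer and cleanup_calls are assumptions, and the counter exists only to make the calls observable.

```c
#include <stdlib.h>

int cleanup_calls = 0;  /* hypothetical counter, for illustration only */

/* Called automatically with the address of the annotated variable. */
static void free_buffer(char **p)
{
    free(*p);
    cleanup_calls++;
}

/* Returns -1 on early exit and 0 otherwise; buf is released either way. */
int use_buffer(int fail_early)
{
    char *buf __attribute__((cleanup(free_buffer))) = malloc(16);
    if (buf == NULL || fail_early)
        return -1;      /* free_buffer(&buf) runs before leaving */
    buf[0] = 'a';
    return 0;           /* and here as well */
}
```

Unlike the proposed defer, the attribute is tied to a variable declaration and is not portable to every compiler.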

Function Literal

Syntax:

   int (*f) (int arg1, int arg2) = 
          (int (int arg1, int arg2)) { return arg1 + arg2; };

Grammar changes:

  postfix-expression:
    primary-expression
    postfix-expression [ expression ]
    postfix-expression ( argument-expression-list_opt )
    postfix-expression . identifier
    postfix-expression -> identifier
    postfix-expression ++
    postfix-expression --
    ( type-name ) { initializer-list }
    ( type-name ) { initializer-list , }
    ( type-name ) compound-statement           <---- if typename is function type
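
For reference, the function literal shown in the syntax example could be written in standard C as a named file-scope function. This is a sketch only; the name lambda_0 is hypothetical and says nothing about the names the transpiler actually generates.

```c
/* Hypothetical name for the function the literal would stand for. */
static int lambda_0(int arg1, int arg2)
{
    return arg1 + arg2;
}

/* Calls the function through a pointer, as in the syntax example. */
int call_literal(void)
{
    int (*f)(int arg1, int arg2) = lambda_0;
    return f(2, 3);
}
```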

Literal string copy to fixed array


   char s[3];
   s = "ab";//OK
   s = "abc";//compile time error
   s = "abcd";//compile time error

(not implemented yet in cprime)
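
A comparable compile-time check can be approximated in C11 with a macro. The macro COPY_LITERAL and the function copy_demo are assumptions made for this sketch.

```c
#include <string.h>

/* Copies a string literal (including its terminator) into a fixed array
   and rejects, at compile time, literals that do not fit. */
#define COPY_LITERAL(dst, lit)                                   \
    do {                                                         \
        _Static_assert(sizeof(lit) <= sizeof(dst),               \
                       "string literal does not fit in array");  \
        memcpy((dst), (lit), sizeof(lit));                       \
    } while (0)

/* Returns the length of the copied string. */
size_t copy_demo(void)
{
    char s[3];
    COPY_LITERAL(s, "ab");   /* OK: 3 bytes including the terminator */
    /* COPY_LITERAL(s, "abc"); would fail to compile: it needs 4 bytes */
    return strlen(s);
}
```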

Overloaded functions

Overloaded functions are functions with name mangling. They were created to support destructors, polymorphism and parametrization.

void draw(struct Box* p) overload;
void draw(struct Circle* p) overload;

See reference : https://clang.llvm.org/docs/AttributeReference.html#overloadable

We can think of it as an inverse of extern "C".
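
In standard C11, a similar call syntax can be approximated with hand-mangled names and a _Generic selection. This is a sketch under that assumption; the names draw_Box, draw_Circle and draw_demo, the struct members, and the string return values are chosen for the example.

```c
struct Box    { int width; };
struct Circle { int radius; };

/* Hand-mangled names; with the proposal they would be generated. */
const char *draw_Box(struct Box *p)       { (void)p; return "Box"; }
const char *draw_Circle(struct Circle *p) { (void)p; return "Circle"; }

/* Selects the function from the static type of the argument. */
#define draw(p) _Generic((p),           \
        struct Box *:    draw_Box,      \
        struct Circle *: draw_Circle)(p)

/* Returns 1 when each call was routed to the matching function. */
int draw_demo(void)
{
    struct Box b = { 1 };
    struct Circle c = { 2 };
    return draw(&b)[0] == 'B' && draw(&c)[0] == 'C';
}
```

The _Generic approach needs every overload listed in one place, whereas the overload annotation lets each function be declared independently.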

New operator

(This may be removed because of the different possible allocation strategies.)

 postfix-expression:
   new (type-name)
   new (type-name) { }
   new (type-name) { initializer-list }

The objective of the new operator is to allocate memory and then copy the default compound literal, or a user-provided compound literal, into it.

The allocation is done using malloc.


struct X {
    char * name;
};

int main() {

  struct X* pX = new (struct X) {};
  if (pX != NULL)
  {
    free(pX->name);
    free(pX);
  }
}
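
Written by hand in standard C, the allocate-then-copy behavior described above might look like the sketch below. The helper name new_X is hypothetical, and the zero-valued compound literal stands in for the default one.

```c
#include <stdlib.h>

struct X {
    char *name;
};

/* Hypothetical helper: roughly what "new (struct X) {}" describes. */
struct X *new_X(void)
{
    struct X *p = malloc(sizeof *p);   /* the allocation uses malloc */
    if (p != NULL)
        *p = (struct X){0};            /* copy the default compound literal */
    return p;                          /* NULL if the allocation failed */
}
```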

Comparison with C++: there is no constructor here, so there is no need for exceptions.

So far there is no way to customize the allocator. I am considering other alternatives to make it generic without adding C++ complexity.

struct X* p = malloc(sizeof * p) *= {struct X}{};

Destroy operator

The destroy operator instantiates a special function that is used to destroy the parts of an object recursively.

The user can optionally provide a destructor (by overloading the destroy function) that is called just before the object is destroyed.


struct X {
    char * name = NULL;
};

void destroy(struct X* pX) overload {
   free(pX->name);
}

struct Y {
    struct X x;
};

int main()
{
  struct Y y = {};
  destroy(y);
}
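
To make the recursion concrete, here is a sketch of what the instantiated functions could look like if written by hand in standard C. The names destroy_X and destroy_Y are assumptions, and resetting name to NULL is added only so the effect can be observed.

```c
#include <stdlib.h>

struct X { char *name; };
struct Y { struct X x; };

/* User-provided destructor for struct X. */
void destroy_X(struct X *pX)
{
    free(pX->name);
    pX->name = NULL;   /* assumption: cleared only to make the effect visible */
}

/* What a generated destroy for struct Y might do:
   visit each member that has something to destroy. */
void destroy_Y(struct Y *pY)
{
    destroy_X(&pY->x);
}
```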

Auto pointers

(This may be removed because of the different possible allocation strategies.)

Pointers can be qualified with auto.

This tells the type system that this pointer is the owner of the pointed object. When the lifetime ends it should call a function to free the resource it points to.

At this moment auto is used to generate destructors but it also could be used for static analysis in the future.

struct X {
    char * auto name = NULL;
};

int main()
{
  struct X x = {};
  destroy(x);
}

When a pointer qualified with auto is destroyed, it calls the destructor of the pointed object and then frees the memory.

By default it calls free.

Alternatively we can do,

struct X {
    char * name = NULL;
};

void destroy(struct X* p) overload { free(p->name); }

int main() {
  struct X x = {};
  destroy(x);
}

struct X {
    char * auto name = NULL;
};

int main()
{
  struct X* pX = new (struct X);
  destroy(pX); //warning does nothing
  destroy(*pX); //destroy the object but does not free memory

  struct X* auto pAuto = new (struct X);
  destroy(pAuto); //destroy the pointed object and calls free
  destroy(*pAuto); //destroy the object but does not free memory
}

Specifying a free function.

This is not implemented yet.

struct X {
    char * auto(customFree) name = NULL;
};


This is useful for custom allocators. Sometimes a pointer to the allocator is also required. In that case the best solution is an empty auto(), which says the pointer is the owner but that this information should not be used to free it; the memory is then released manually, for instance in the destructor.

Polymorphism

Pointers that can point to a specific set of struct types.

Syntax:

 struct <tag-list> 
 
 tag-list:
  identifier
  identifier , tag-list

 struct <X | Y>* p; // points to struct X or struct Y (or equivalent typedef) 

Structs must have a common discriminant that is used at runtime to select the appropriate type.

For instance:

 struct X {  int type = 1;  };
 struct Y {  int type = 2;  };

We can define a name for this type without typedef:


 struct <Box | Circle> Shape;

 struct Shape* pShape;

This name can be used in other pointer declarations, and the result is the union of the sets of types.

 struct <Shape | Data> Serializable;

The definition of the struct is automatic (auto generated). We provide only the declaration.

Sample:

 
struct Box {
    const int id = 1; //discriminant
};

void draw(struct Box* pBox) overload {
    printf("Box");
}

struct Circle {
    const int id = 2; //discriminant
};

void draw(struct Circle* pCircle) overload {
    printf("Circle");
}

struct <Box | Circle> Shape;

int main()
{
  struct Shape * auto shapes[2] = {};
  
  shapes[0] = new (struct Box);
  shapes[1] = new (struct Circle);
  
  for (int i = 0; i < 2; i++)
  {    
    draw(shapes[i]); 

    printf("%d", shapes[i]->id);
  }
  
  destroy(shapes);
}
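
The runtime selection in the sample can be pictured as a switch on the discriminant. The following standard C sketch is one way to write it by hand; draw_Shape is a hypothetical name for the dispatcher, the draw functions return strings instead of printing, and the initializers are dropped because standard C does not have them.

```c
struct Box    { int id; };   /* discriminant 1 */
struct Circle { int id; };   /* discriminant 2 */

const char *draw_Box(struct Box *p)       { (void)p; return "Box"; }
const char *draw_Circle(struct Circle *p) { (void)p; return "Circle"; }

/* Hypothetical dispatcher for a "struct <Box | Circle>" pointer. */
const char *draw_Shape(void *pShape)
{
    /* A pointer to a struct also points to its first member,
       so the discriminant can be read through an int pointer. */
    switch (*(const int *)pShape) {
    case 1:  return draw_Box(pShape);
    case 2:  return draw_Circle(pShape);
    default: return "unknown";
    }
}
```

This relies on the discriminant being the first member of every struct in the set, which is an assumption of the sketch rather than a stated rule of the proposal.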

The discriminant can be a constant string, an enum, etc.


struct Box {
    const char * type= "box";
};

struct Circle {
    const char * type= "circle";
};

(strings are not implemented yet)

We also need a way to cast. The syntax is an open question, but the C++ syntax could be used.

struct Box *pBox = dynamic_cast<struct Box*>(pShape);

Alternatives:

struct Box *pBox = pShape as Box;

if (pShape is Box)
{

}

Casts from Box to Shape, for instance, are normal casts, but in the future the compiler could check whether the types belong to the same set.
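
Whatever syntax is chosen, the checked cast amounts to testing the discriminant. A minimal standard C sketch follows, assuming the discriminant is the first member; is_Box, as_Box, BOX_ID and CIRCLE_ID are hypothetical names.

```c
#include <stddef.h>

struct Box    { int id; };
struct Circle { int id; };

enum { BOX_ID = 1, CIRCLE_ID = 2 };  /* assumed discriminant values */

/* Plays the role of "pShape is Box". */
int is_Box(const void *pShape)
{
    return pShape != NULL && *(const int *)pShape == BOX_ID;
}

/* Plays the role of "pShape as Box": NULL when the type does not match. */
struct Box *as_Box(void *pShape)
{
    return is_Box(pShape) ? pShape : NULL;
}
```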