My 5 minutes Object Pool in c++

Here it is


template<typename T, int SIZE>
struct Pool {
    
  struct Node {
    char data[ sizeof(T) ];
    Node * next = nullptr;
  };

  Pool() {
    free_node = &objects[ 0 ];
    for ( int i = objects.size() - 2; i >= 0 ; i-- ) { 
      objects[ i ].next = &objects[ i + 1 ];
    }
  }

  template<typename ... Args>
  T* construct( Args... args) { 
    if ( free_node ) {
      T * node = new ( free_node->data ) T( std::forward<Args>( args )... );
      auto old_node = free_node;
      free_node = free_node->next;
      old_node->next = nullptr;
      return node;
    }
    if ( !next_pool ) next_pool.reset( new Pool<T,SIZE> );
    if ( next_pool ) return next_pool->construct(  std::forward<Args>( args )... );
    return nullptr;
  }

  void remove( T * node ) {
    node->~T();
    auto next_free_node = reinterpret_cast<Node*>( node );
    next_free_node->next = free_node;
    free_node = next_free_node;
  }

  std::array< Node, SIZE> objects;
  std::unique_ptr< Pool<T,SIZE> > next_pool;
  Node * free_node = nullptr;
};

performance comparision between stl vector and list

It´s funny how people don’t realize about the power of continuos memory space and how is easy is to traverse it for the cpu if you do it in a predictible way.

For instance, when someone ask you what is the best container to perform random insertion of elements at any point in the container, most of people would say that linked list should fit well. And is true, the complexity of an insertion in a linked list is O(1) becuase it just needs to re-arrange some pointers to add the new element in the chain of elements.

In a vector, in theory, the operation to add an element in the middle would be slower, because you need to move all elements after the new element to make space for it, so that´s O(n).

Well, actually that is not true. See the below program, in that, a random element is generated and inserted in a list and a vector, keeping these 2 containers sorted. The output of the application shows that these operation is always faster in a vector.

The way to compile this is doing:

g++ -O3 -std=c++11 main.cpp getCPUTime.c

And you need this file to measure the CPU time:
getCPUTime

#include <iostream>
#include <iomanip>  // setprecision
#include <cassert>
#include <string>   // getline
#include <typeinfo>
#include <stdio.h>      /* printf, scanf, puts, NULL */
#include <stdlib.h>     /* srand, rand */
#include <time.h>       /* time */
#include <list> 
#include <algorithm> 

extern double getCPUTime( );

template< typename T >
void insert( T & l, unsigned int data )
{
  typename T::iterator it = l.begin();
  while ( it != l.end() && data > *it ) it ++;  
  typename T::iterator inserted = l.insert( it, data );
}

unsigned int MAX = 100;

template< typename T >
double test_insert( T & container )
{
  unsigned int iSecret;
  double startTime, endTime;
  startTime = getCPUTime( );
  for (int i = 0; i < MAX; i++ ) {      
    iSecret = rand() % 1000000 + 1;
    insert( container, iSecret );       
  }
  endTime = getCPUTime( );
  return (endTime - startTime);
}

int main() 
{
  unsigned int iSecret, iGuess;
  double perf_vector, perf_list;
  srand (time(NULL));

  while ( 1 ) {    
    {
      std::list<unsigned int> list;  
      perf_list = test_insert( list );
    }    
    {
      std::vector<unsigned int> v;
      v.reserve(MAX);
      perf_vector = test_insert( v );
    }
    std::cout << MAX << " l:" << perf_list << " v:" << perf_vector << "\n";
    MAX *=2;
  }
}

See the output


100 l:0 v:0
200 l:0 v:0
400 l:0 v:0
800 l:0 v:0
1600 l:0 v:0
3200 l:0.015625 v:0.015625
6400 l:0.0625 v:0.046875
12800 l:0.375 v:0.171875
25600 l:2.03125 v:0.6875
51200 l:11.0469 v:2.8125
102400 l:59.3906 v:11.0781
204800 l:504.922 v:44.6406
409600 l:2672.73 v:176.516

Stack of plates (Stack of stacks)

Imagine a (literal) stack of plates. If the stack gets too high, it might topple. Therefore, in real life, we would likely start a new stack when the previous stack exceeds some threshold. Implement a data structure SetOfStacks that mimics this. SetOfStacks should be composed of several stacks, and should create a new stack once the previous one exceeds capacity. SetOfStacks.push() and SetOfStacks.pop() should behave identically to a single stack (that is, pop() should return the same values as it would if there were just a single stack).


const unsigned int capacity = 10;
template< class T >
class Stack {
private:
  unsigned int size_;
  T v[ capacity ];
public:
  Stack(): size_(0) {}
  bool empty() const{ return !size_; }
  bool full()  const{ return size_ >= capacity; }
  T   pop()         { return v[ --size_ ]; }
  T & top()         { return v[ size_ -1 ]; }
  void push(T e)    { v[ size_ ++ ] = e; }
};

template< class T >
class SetOfStack {
private:
  Stack< Stack < T > > s;  
public:
  T pop() {     
    T ret = s.top().pop();
    if ( s.top().empty() ) { s.pop(); }
    return ret;
  }
  void push(T e) {
    if ( s.top().full() )  { s.push( Stack< T >() ); }
    s.top().push( e );
  }
  bool empty()             { return ( s.empty() ); }
};

int main() 
{
  SetOfStack< int > s;
  for (int i = 0; i < 55; i++ ) {
    s.push( i );
  }
  while ( !s.empty() ) {
    cout << s.pop() << " ";
  }
  cout << "\n";

HeapTree in C++ using templates and gtest (google test framework for c++)

Here is a implementation of a HeapTree, it has the following features:

  • It uses templates, so you can define what you want to store in the tree
  • It uses the common STL interface, it is not a full compatible STL container but it keeps some of the most uses interfaces.
  • I used gTest, the google test framework in c++ to test the container and verify that works as expected
using namespace std;

template<class T>
class HeapTree {
 public:
  HeapTree(int max):a(NULL), cur(0) {
    a = new T[max];   
  }
    ~HeapTree() {
      delete a;
    }
  void push(const T & elem) {
    a[cur] = elem;
    
    int pos = cur;
    while (hasParent(pos) && elem < parent(pos)) {
      std::swap(a[pos], a[parent_pos(pos)]);
      pos = parent_pos(pos);
    }
    cur++;
  }

  T& top() {
    return a[0];
  }
  void pop() {
    if (cur == 1) { 
      cur = 0;
      return;
    }

    swap(a[0], a[cur-1]);
    cur --;      

    int pos = 0;
    while (1) {
      int new_pos = -1;
      if (hasChildRight(pos) &&  hasChildLeft(pos))
	{
	  if ( a[pos] < a[childRight(pos)] && 
	       a[pos] < a[childLeft(pos)] )
	    break;
	  else if (a[childRight(pos)] < a[childLeft(pos)])
	    new_pos = childRight(pos);
	  else 
	    new_pos = childLeft(pos);
	}
      else if (hasChildRight(pos) && a[childRight(pos)] < a[pos])
	new_pos = childRight(pos);
      else if (hasChildLeft(pos)  && a[childLeft(pos)] < a[pos])
	new_pos = childLeft(pos);
      else break;

      std::swap(a[pos], a[new_pos]);	
      pos = new_pos;      
    }


  }
  
  int childLeft(const int & i) {
    return 2*i + 1;
  }

  int childRight(const int & i) {
    return 2*i + 2;
  }

  int hasChildLeft(const int & i) {
    return (cur > childLeft(i)); 
  }

  int hasChildRight(const int & i) {
    return (cur > childRight(i)); 
  }

  int parent_pos(const int & i ) {
    return (i-1)/2;
  }
  T& parent(const int & i) {
    return a[parent_pos(i)];
  }
  bool hasParent(const int & i) {
    return (i > 0);
  }
  bool empty() {return cur == 0;}
  void print () {
    for (int i = 0; i<cur; i++) {
      cout << a[i] << "-";
    }
    cout << endl;
  }
  int size() {
    return cur;
  }

  T * a;
  int cur;

};

template<class T>
ostream& operator<<(ostream & out, const HeapTree<T> & h){
  for (int i = 0; i<h.cur; i++) {
    out << h.a[i] << "-";
  }
  return out;
}

And the unit test:


#include "heaptree.h"
#include "gtest/gtest.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

using namespace std;

TEST(heaptree, trivial) {
  HeapTree<int> h(100);
  h.push(1);
  EXPECT_TRUE(h.top() == 1);
}

TEST(heaptree, TestParent) {
  HeapTree<int> h(100);
  h.push(1);
  h.push(2);
  h.push(3);
  h.push(4);
  h.push(5);
  h.push(6);
  EXPECT_EQ(h.parent(1), 1);
  EXPECT_EQ(h.parent(2), 1);
  EXPECT_EQ(h.parent(3), 2);
  EXPECT_EQ(h.parent(4), 2);
  EXPECT_EQ(h.parent(5), 3);
}
TEST(heaptree, PushTopPop) {
  HeapTree<int> h(100);
  EXPECT_TRUE(h.empty());  
  h.push(1);
  EXPECT_FALSE(h.empty());    
  EXPECT_EQ(h.top(), 1);  
  h.pop();
  EXPECT_TRUE(h.empty());  
} 

TEST(heaptree, GoodTreeOne) {
  HeapTree<int> h(100);
  h.push(1);  
  EXPECT_EQ(h.top(), 1);
  h.push(2);
  EXPECT_EQ(h.top(), 1);
  h.push(3);    
  EXPECT_EQ(h.top(), 1);
 }

TEST(heaptree, GoodTreeTwo) {
  HeapTree<int> h(100); 
  h.push(3);  
  EXPECT_EQ(h.top(), 3);
  h.push(2);
  EXPECT_EQ(h.top(), 2);  
  h.push(1);
  EXPECT_EQ(h.top(), 1);
 }

TEST(heaptree, PopTest1) {
  HeapTree<int> h(100);
  h.push(1);  
  h.push(2);
  h.push(3);
  
  EXPECT_EQ(h.top(), 1);
  h.pop();
  EXPECT_EQ(h.top(), 2);        
  h.pop();
  EXPECT_EQ(h.top(), 3);        
  h.pop();
  EXPECT_TRUE(h.empty());    
 }

TEST(heaptree, LoopTest1) {
  HeapTree<int> h(100);
   
  for (int i = 0; i < 100; i++) {
    h.push(i);        
  }

  for (int i = 0; i < 100; i++) {
    ASSERT_EQ(h.top(), i) << "i is " << i << endl << h << endl;
    h.pop();      
  }

}

TEST(heaptree, LoopTest2) {
  HeapTree<int> h(100);
  for (int i = 99; i > -1; i--) {
    h.push(i);      
  }
  
  for (int i = 0; i < 100; i++) {
    EXPECT_EQ(h.top(), i);        
    h.pop();         
  }   
}

TEST(heaptree, LoopRandom) {
  HeapTree<int> h(100);
  for (int i = 0; i < 100; i++) {    
    h.push(rand() % 1000);      
  }

  int last = -1;
  while (!h.empty()){    
    EXPECT_TRUE(h.top() >= last);    
    last = h.top();
    h.pop();
  }
}

int main(int argc, char **argv) {
  /* initialize random seed: */
  srand ( time(NULL) );
  ::testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS(); 
}

C++ 0x lambda functions versus Erlang FP

Ok, so here is a small comparison between the new and exciting lambda functions brought in C++ 0x and the functional functions defined in Erlang. Remember to compile all c++ code with the new flag for 0x:
/usr/local/bin/i686-pc-linux-gnu-gcc-4.6.0 -Wall -std=gnu++0x -lstdc++ test_lambdas.cpp

Let’s go for a simple function that will double a list in C++.

void test_double()
{
  vector v1,v2;
  for (long long i = 0; i<100000000 ;i++) {
    v1.push_back(i);
  }
  v2.resize(v1.size());
  transform(v1.begin(), v1.end(), v2.begin(),
	    [](int a){return 2*a;}
	    );
}

Here is the same version of the code in Erlang.

double()->
    Seq = lists:seq(0,100000000),
    Seq2 = lists:map(fun(A)->2*A end, Seq).

Still the code in Erlang looks shorter and neat. Let write a function that sums all the elements in an array.

void test_sum()
{
  vector<int> v;
  int sum = 0;

  v.assign (1000,1);
  for_each (v.begin(), v.end(),[&](int a){sum += a;});

  cout << "sum:" << sum << endl;
}

Note that we have to capture the local variable “sum” by reference and not by value as we are change its value while traversing the array.

Now the Erlang version of the same algorithm.

sum() ->
    V = [1 || X<-lists:seq(1,1000)],
    lists:foldl(fun(A, Acc)-> A+Acc end,0, V).

Another example with classes, we have an array of pointers that points to objects of the class Student. In this function we just sort the students by the mark they got, so the best student will be at the end of the array.

class Student
{
public:
  string name;
  int id;
  double mark;
};

void test_sort()
{
  vector<Student*> v;

  for (long long i = 0; i < 1000000; i ++) {
    Student * s1 = new Student();
    s1->name = "test";
    s1->id   = i;
    s1->mark = (rand() % 100 + 1)+ (double)(rand() % 100 + 1)/100;
    v.push_back(s1);
  }
  
  sort(v.begin(), v.end(), [](Student * a, Student * b) { return a->mark < b->mark ; });
  
  for (auto & i:v){
    cout << "student="<< i->name << " id=" << i->id << " mark=" << i->mark << endl;
  }
}

Note the nice loop at the end that prints a student, it uses a new syntax introduced in 0x similar to Perl and avoid the long code needed to declare iterators when traversing a std container.
Now the same in Erlang:

-record(student, {name, id, mark}).

sort()->    
    {A1,A2,A3} = now(),
    random:seed(A1, A2, A3),
    
    V = [#student{id = X, name = "test", mark = random:uniform() * 100} 
	 || X<-lists:seq(1,1000000)],
    
    lists:sort(fun(A,B)-> A#student.mark < B#student.mark end,V).

With the last example I am doing a performance tests with 10million objects. I am measuring only the sorting bit and ignoring the creation of the objects itself.

It turns out that Erlang took around 50sec to do the job while c++ took 11 secs. Probably this is not a fair test anyway so you might want to ignore the line I just wrote. The reason is that a list in Erlang is implemented as a “linked-list”, so we are comparing sorts of two different containers which doesn’t make sense.

Changing “vector” to “list” gives c++ down to 20 seconds which is still behind Erlang. Is there any Erlanger out there that wants to suggest a better algorithm for the Erlang version?

Just to finish this post, I want to give attention to one of the lambda function behavior that is very prone to errors.

class Test {
public:
  void do_something(){
    int n = 10;

    auto lambda = [=] {return n;};
    cout << lambda() << endl;
    
    n = 11;

    cout << lambda() << endl;

  }
};
class Test2 {
public:
  int n;
  void do_something(){
    n = 10;

    auto lambda = [=] {return n;};
    cout << lambda() << endl;
    
    n = 11;

    cout << lambda() << endl;

  }
};

The first function will give us 10 and 10 as expected because we are catching “n” by value so the lambda function is going to have “n” to 10 because thats the value it had when the function was created.

However the last class Test2 will give us 10 and 11 as an output. And this is because “n” is a member of the class, and inside the lambda function you should read “return n” as “return this->n”. So in reality, what we are caching here is the pointer to “this”.

Double polymorphism or double dispatch in C++

Polymorphism has its limitations. Check this example:

class A {};
class B: public A {};

void f(B* b) {
  cout<< "hi" << endl;
}

int main() {
  B b;
  A *a = &b;
  f(&b);
  f(a); // It won't compile
}

To solve this issue we have double dispatch or double polymorphism. This is a quick example of double polymorphism in C++ using the Design Pattern Visitor:

class Visitor;

class Visitable {
public:
  virtual void accept(Visitor &v) = 0;
};

class Visitable1: public Visitable {
public:
  virtual void accept(Visitor &v);
};

class Visitable2: public Visitable {
public:
  virtual void accept(Visitor v);
};

class Visitor {
public:
  virtual void visit(Visitable1 & v) = 0;
  virtual void visit(Visitable2 & v) = 0;
};

class CVisitor1:public Visitor {
public:
  virtual void visit(Visitable1 & v) {cout<<"1 visiting 1..."<<endl;}
  virtual void visit(Visitable2 & v) {cout<<"1 visiting 2..."<<endl;}
};

class CVisitor2:public Visitor {
public:
  virtual void visit(Visitable1 & v) {cout<<"2 visiting 1..."<<endl;}
  virtual void visit(Visitable2 & v) {cout<<"2 visiting 2..."<<endl;}
};

void Visitable1::accept(Visitor &v) {v.visit(*this);}
void Visitable2::accept(Visitor &v) {v.visit(*this);}

int main() {

vector<Visitable> v;
 v.push_back(new Visitable1());
 v.push_back(new Visitable2());
 
 vector<Visitor> visitors;
 visitors.push_back(new CVisitor1());
 visitors.push_back(new CVisitor2());

 for (int i = 0; i<2 ; i++)
    for (int j = 0; j<2 ; j++)
        v[i]->accept(*visitors[j]);
}

Fibonacci using C++ STL Pthreads

Here is a basic implementation of the fibonacci algorithm using Pthreads. Note that the long type is not big enough to store the current value of the fib function. This code uses 3 threads:

  • The main one, for the user input.
  • One to compute the fibonacci algorithm.
  • One thread to show every second the state of the last value of the fibonacci sequence.

To compile, just:

g++ -pthread fib.cpp && ./a.out

 

using namespace std;


class MyFib {

private:
  vector m_fibonacci_values;
  volatile bool m_running;
  pthread_mutex_t m_mutex;
  pthread_t m_thread,m_progress_thread;
  volatile long last;

  static void* start_thread(void *obj)
  {
    //All we do here is call the do_work() function
    MyFib *ptr = reinterpret_cast(obj);      
    ptr->do_work();
  }

  static void* start_thread_progress(void *obj)
  {
    //All we do here is call the do_work() function
    MyFib *ptr = reinterpret_cast(obj);      
    ptr->do_work_progress();
  }
 

  long fibonacci_number(long num)
  {
    last = num;
    switch(num)
      {
      case 0:
      case 1:
	return 1;
      default:
	return m_fibonacci_values[(num -1)%3] + m_fibonacci_values[(num-2)%3];
      };
   } 
  
public:
  volatile bool m_stoprequested; 
  MyFib():m_stoprequested(false), m_running(false) {
    pthread_mutex_init(&m_mutex,NULL);
  }

  ~MyFib() {
    pthread_mutex_destroy(&m_mutex);
  }

  void go() {
    m_fibonacci_values.assign(3,0);
    assert(m_running == false);
    m_running = true;
    pthread_create(&m_thread, 0, &MyFib::start_thread, this);
    pthread_create(&m_progress_thread, 0, &MyFib::start_thread_progress, this);
  }
  

  void do_work() {
    long iteration = 0;
    while (!m_stoprequested ) {
      pthread_mutex_lock(&m_mutex);
      
      m_fibonacci_values[iteration % 3] = fibonacci_number(iteration);
      last = iteration;
      pthread_mutex_unlock(&m_mutex); 
      iteration ++;
    }
  }

  void do_work_progress() {
    while (!m_stoprequested ) {
      get_fibonacci_value(),
      sleep(1);
    }
  }

  void stop() 
  {
    assert(m_running == true);
    m_running = false;
    m_stoprequested = true;
    (void) pthread_join(m_progress_thread, 0);
    (void) pthread_join(m_thread, 0);
  }


  void get_fibonacci_value()
  {
    pthread_mutex_lock(&m_mutex); 
    cout << "fib(" << last << ")=";
    cout << m_fibonacci_values.back() << endl;
    pthread_mutex_unlock(&m_mutex);
  }


};

int main(){

  MyFib fib;
  char key;
  
  cout << "press any key to stop...\n";
  fib.go();
  cin.get(key);

  fib.stop();

  fib.get_fibonacci_value();

  
}

Quick example of hiberl

One of my tiny projects reads jobs from different’s datasources and push them into a small database which I is running on my computer.

I am sure this idea is quite crazy and useless in other countries but it perfectly make sense in Spain. First, we don’t have a good job website aggregator. Second, almost all the job offers are old although they update their date everyday and they are shown in “today’s job”. And finally, normally there is no way to control how many job offers you can see in a single page, so normally you have to see them in chunks of 30s which is quite annoying.

Anyway what I want to show here is how easy is to get all the job offers from today in a single website, with no ads and not old jobs using hiberl

%% generate modules and header files
4> hiberl:render(hiberl_drv_mysql, "jobs", "DSN=jobs;UID=username", ["job"]).
created file jobs.erl
created file job.erl, got 8 columns []
ok

%%get the fields of a job
5> job:columns().
["id","position","url","city","company","date","created", "new",[]]

%% Start a session
6> hiberl_session:start_register("DSN=jobs;UID=username").
{ok,}

%% get jobs from today
7>Jobs = job:select("date like '27/10' ").
...

%% A function to create a link using the url and the description
8>F = fun(Job) ->
		Job#job{
                     position = ["<A HREF=\"http:",Job#job.url,"\">",Job#job.position,"</A>"],
                     url = ""}
	end.
...

%% generate html
9> hiberl:render_html(lists:map(F,Jobs)).
ok

%% finish session
10>hiberl_session:terminate().
ok