[docker][cassandra] Reaching mixed load – 750,000 op\sec

The cart goes nowhere because the swan wants to fly in the air, the pike wants to swim underwater and the crawfish wants to crawl backward.

Cassandra performance tuning - challengeCassandra is one of powerhorse of modern high load landscape – nosql database that can be pluged to Spark for distributed computing, Titan for playing with graph data representation and even to Elastic as a search backend.
And if you really care about pure write performance – this is de-facto choice in world of open source solutions: production proof, with big community that already generated many outdated controversial answers at SO & mailing lists.


No single point of failure, scalability, high availability, retention periods, … , but those marketing claims hide few principal caveats… Actually, cassandra has only single drawback(*) – it will not reach its limits with default settings on your hardware. Lonely, single node configuration – it is not use case for cassandra, it will shine in multinoded clustered setup.

If you really want to see full utilization of endless cores and crazy amount of RAM, you have to use some virtualisation technology to manage hardware resource.

Let me start first with some conclusions and recommendations, based on extensive two monthes testing and observartion of trouble tickets after migration of this approach to production. With those considerations in mind I was managed to configure it such way to tolerate 750 k mixed operations per seconds. It was generated for more than 8 hours to check pressure tolerance and emulate peak loads. .It was mixed execution of async inserts, without future processing and synced inserts as well as read requests.

Frankly speaking, I am sure it is still far from its limit.

Bear in mind that I am talking about Cassandra 2.1.*.

About disks

  1. Use ssd disks as mapped volume to docker container. Single container = single dedicated disk.
    It is possible to use multiple disks per containers, but it will lead to 5-15 % of slowdown.
  2. If you use ssd disk you can map all casandra directories to it (saved_cache, data, commit_logs) and adjust casandra.yaml with higher values of throughput, in particularly: compaction_throughput_mb_per_sec, trickle_fsync
  3. It is really depends on data distribution and your data model, but be ready that disk utilization will vary from one node to another up to 20%
  4. Docker should be configured to NOT use host’s root partitions. Don’t be mean and allocate single drive for logs and choose proper storage driver – docker-lvm.
  5. In practice, cluster start strugling when any of nodes come out of space. Surprisingly, in my experiments it was stable even with 3% free, but in real life better to configure your monitoring system to give alert at 15-20%.
  6. Choose compaction and compresion strategies wisely when you design your db
  7. Be careful with column naming – it wil be added for every god damn row!
  8. Do sizing when you think about number of nodes (and disks).

About cpus:

  1. More cpu per node is NOT always good. Stick with 8 cores per node.
    I’ve experimenting with single fat supernode per physical server = 48 cores, 4×12, 6×8.
    6 node with 8 cpu cores outperform all others in 6 kind of stress load scenarious.
  2. If you play with core number you have to adjust few settings at cassandra.yaml to reflect that number: concurent_compactors, concurent_reads, concurent_writes.
  3. Cassandra in most cases endup to be cpu-bound, don’t forget to left for host system 8-16 cores, and allocate cpu exclusivly for containers using –cpuset-cpus

About RAM:

  1. cassandra-env.sh have builtin calculation of free memory to adjust jvm settings using analysing results of command free. Ofcourse it is not for docker based setup. Bear this in mind and tweak your startup scripts to substitue values there.
  2. Disable swap within docker using –memory-swappiness=0
  3. Effectiveness of memory usage depend on cpu amount, how effective multithreaded compaction is implemented at Cassandra and what settings for reader\writer\compactors you have at your cassandra.yaml, i.e. you can have hundreds of RAM but endup in OOM. But even with 8 Gb of RAM per node you already can see benefits. More RAM – mean more memtables, bigger key cache, and more effective OS-based file caching. I would recommend have 24 Gb RAM per node.
  4. Disable huge page at host system or at least tune your jvm settings:
 echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag 

About network

  1. Mandatory to use network stack of host Os using flag at docker –net=host
  2. Most likely network should not be bottleneck for your load, so you can stick with virtual interfaces on top of single real one.


  • 3 physical server: each have 72 cores, 400gb ram
  • Cassandra 2.1.15
  • Docker: 1.10
  • Host Os: Centos 7.5
  • Guest Os: Centos 7.5
  • java 8 from oracle with jna

Cassandra 3.* this is competely another story – in my opinion, mainly, because of storage engine changing, but here is a huge list.

DB overview:

  • Dozen keyspaces, each have up to 20(?) tables.
  • Few indexes – just do not use indexes, design schema properly
  • Data replication = 3, gossip through file
  • Each physical server represent dedicated rack within single datacenter.
  • Row cache were disabled at cassandra.yaml i.e. first priority was to focur on write oriented workload


  1. datastax stresstool, artificial table – very intresting, but useless, using your schema is very important
  2. Datastax stresstool + your own table definition – nice, give hints of production performance. But you still testing single table – usually it is not the case in real life.
  3. Self written in-house stress tool that generate data according to our data model in randomized fasion + set of dedicated servers for ddos with ability to switch between async inserts (just do not use batches) with and without acknowledgment.
    Once again: no batch inserts as they should not be used in productions.
  4. Probably, you can adapt Yahoo! Cloud Serving Benchmark. I haven’t played with it.


That’s it folks, all craps below is my working notes and bookmarks.

How to get c++11 at Centos7 for stress tool compilation:

Install recent version of compiler on centos7: devtoolset-4
update gcc version 4.8 at centos 6: https://gist.github.com/stephenturner/e3bc5cfacc2dc67eca8b

scl enable devtoolset-2 bash

RAM & swap:

How to clean buffer os cache

echo 3 > /proc/sys/vm/drop_caches
check system wide swappiness settings:
more /proc/sys/vm/swappiness

Docker related:

If you are not brave enough to play with DC\OS or Openstack you can find docker-compose to be usefull for manipulation of homogeneous set of containers


sudo rpm -iUvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
rpm -qa | grep docker

If you fucked up with partition settings:

wipefs -a /dev/sda1

Docker and mapped volumes: http://container-solutions.com/understanding-volumes-docker/

If docker info | grep loopback show you something – you already screw up configuration of storage driver.

How to check journal what is happening:

journalctl -u docker.service --since "2017-01-01 00:00:00"

Full flags described here.

Usefull commands for check docker images:

docker inspect
dmsetup info /dev/dm-13


How to check heap memory consumption per node:

nodetool cfstats | grep 'off heap memory used' | awk 'NR > 3 {sum += $NF } END { print sum }'

How to check what neighbors we can see from nodes:

nodetool -h ring

How to find processes that use swap:

for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | awk '$2 {print $1 FS $2}'

Check how many disk memory we used, from Cassandra perspective

nodetool cfstats | grep 'Space used (total)' | awk '{s+=$NF} END{print s}'

Determine disk usage, OS point of view:

du -ch /var/lib/cassandra/data/

cassandra check health:

 ssh nodetool status


How to open port at centos 7:

firewall-cmd --zone=public --add-port 9160/tcp --permanent
firewall-cmd --zone=public --add-port 9042/tcp --permanent
firewall-cmd --zone=public --add-port 7200/tcp --permanent
firewall-cmd --zone=public --add-port 7000/tcp --permanent

Open ports for spark, master:

firewall-cmd --zone=public --add-port 7077/tcp --permanent
firewall-cmd --zone=public --add-port 8081/tcp --permanent

Apply changes in ip tables:

firewall-cmd --reload

or do this, usefull in case network manager behave badly:

systemctl restart network.service
systemctl status network.service

And as bonus points –

How to migrate data from old cluster to bright new one

  1. sstableloader
  2. cassandra snapshots
  3. For tiny dataset to get cql file with inserts: – cassandradump

First two approaches represent standard way of data migration.
Limitation of first is speed and necessity to stop old node.
Limitation of second is necessity to manualy deal with token_ring on per node basis.

If life was really cruel to you, you can play with data folders per node.

NOTE: if you can replicate exact same setup – in terms of assigned ip, it will be enough to just copy cassandra.yaml from old nodes to new one, and use exact same mapping folder within docker as it were at old cluster.

If not – you still can do it with copying data folder follow steps below, but better just use sstableloader.

  1. In order to do it you have to run following command on every node to drain node from cluster and flush all data into filesystem:
nodetool drain</pre>

NOTE: this is unofficial, not recommended way to deal with data migration.
NOTE 1: it require you to have similar amount of nodes in both clusters
NOTE 2: no need for the same datacenter\rack cfg\ip address

2. Deploy docker based setup according to HW configuration. Total amount of nodes should be equal to total amount of nodes at old cluster. On new cluster deploy exact schema that were deployed on old cluster.

3. Stop new cluster.
within every node data folder of OLD cluster you would have following folders:

NOTE: do not touch system* tables.

4. Under folder /your/cassandra/data-folder/your-keyspace
you should have set of folders corresponding to that keyspace under which data is stored.

5. You have to copy content of this folder (*.db, *.sha1, *.txt) for every node from OLD cluster to corresponding folder of NEW node cluster in. UUID WILL be different.
I.e. old cluster, node 1 to new cluster, node 2:
data copy example
scp /old/cluster/cassandra-folder/data/your-keyspace/your-table-e31522b0e2d511e6967a67ec03b4d2b5/*.* user@:ip/new/cluster/cassandra-folder/data/your-keyspace/your-table-c56f4dd0e61011e6af481f6740589611/

6. Migrated node of OLD cluster must be stopped OR you have to use `nodetool drain` for processed node to have all data within sstables ~ data folders.

Performance monitoring:

  • general system overview: atop or htop
  • Be sure that you understand memory reporting.
  • JMX based monitoring: jconsole
  • Jconsole connection string: service:jmx:rmi:///jndi/rmi://:/jmxrmi
  • dstat network & disk io
  • strace – show every system call. slow down. can connect to running.
  • netstat -tunapl | lsof -i -P – network\ports per process
  • docker stats – reports cpu\mem\io for container
  • perf + perf-map-agent for java monitoring:
    for example cache miss, more there:
perf stat -e L1-dcache-load-misses

Articles that I find usefull:


*cassandra has only single drawback – it have no idea of your data model, whether you configure your data schema correctly, what is your load patterns. That why you have to dive in wonderland of controversial recommendations in blogposts like that one instead of thoroughly read documentations first.


P. S. do not believe anyone, measure!



How to stop being a junior – 7 hints of programmer productivity

0) If you don’t know something – don’t afraid to ask.

Especially if you already checked first page of google search and pretty sure that no one ask that question at stackoverflow.
Reinventing the wheel and breaking the stalemate can be a good exercise for your home projects, but in production environment better to check idea with your mentor before diving into implementation details.

1) Don’t afraid to show that you have no idea.

real programming - do not left open questions ever

do not left open questions ever

Just add note for yourself to figure out it later.
If you do not understand how something works – obviously this is gap in your knowledge.
And you can’t just skip it – you are software engineer – this is your obligation to be aware what is happening under the hood.

And yes, sometime you have to do it at your own time.

Professional growth is very simple:

  • step 1 – find something that you don’t know,
  • step 2 – start investigation, discover bunch of additional mysteries and repeat step 1

2) Split problem to several simple questions and address them one by one.

Don’t try to troubleshoot huge multi-component system within real data to reproduce the problem.
Forget for a minute for overall complexity of your enormous project, analyze suspicious functions one by one independently of each others.
Use online fiddle for your language to check obscure part of language with fake data returned by mock api.

3) stop wasting your gorgeous time

If you find yourself googling the same commands over and over again – start making notes.
Create txt file with most useful commands and update it whenever you find yourself googling again.
Search for ready to use cheat sheet or even put wallpaper on desktop.

4) Do not stop to invest time in reading proper books. Ever.

Pressure, deadlines, laziness.
This is not for company, boss or to bragging about.
This is your main and primary investment to the future – your knowledge is treasure.
30 minutes of reading every day – not too much,
but in long run – you will be noticed that you become more capable and be able to tackle previously hard to solve problems.

from junior to senior - bite by bite

from junior to senior – bite by bite

5) You should properly digesting advice, articles and opinions.

Always will be people who are picky about your code.
Always will be deadlines where you have to make compromise.
Always will be people who haven’t seen big picture or just too stubborn to like ideas of other.
Be wise and pragmatic:
first and foremost you earn money for get your job done.
focus on that target along the way try to do it concise and efficient but at the same time meet your own deadlines.
New technologies, new languages, new paradigms and new patterns give it a try only when this shit is working.

6) Do not ever ever ever do work that you do not like or not interested in.

You will do it mediocre at best.
Your work should be your passion, your pride and not amount of hours behind the desk exchanged for paycheck.
If you are not happy at the morning before the workday – you have to change something. Urgently.
Excitement, challenge and satisfaction should be your main motivation for every day.
Money and career opportunities always follows three guys above 🙂

db mix – postgres, sqlite, cassandra, aerospike & redis


How to delete all tables from sqlite:

SELECT 'DROP TABLE ' || name || ';' FROM sqlite_master WHERE type = 'table';

Find duplicates by field “field_name”:

SELECT field_name, COUNT(field_name) AS cnt FROM some_table GROUP BY field_name HAVING(cnt &gt; 1 ) FIXME - use cnt?

Find records changed in last 5 days:

SELECT * FROM some_table WHERE created_at &gt;= NOW() - '5 day'::INTERVAL;

Get table definitions:

pragma table_info(mass_connections);

Export select query to csv:

.mode csv
.output result_of_query.csv
select * from my_table;
.output stdout 

Import data from csv into fresh new table:

.mode csv
.import /path/to/your/all_data.csv new_table


How to show all tables with sizes within database

SELECT schema_name, relname, pg_size_pretty(table_size) AS size, table_size FROM ( 
SELECT pg_catalog.pg_namespace.nspname AS schema_name, relname, pg_relation_size(pg_catalog.pg_class.oid) AS table_size 
FROM pg_catalog.pg_class 
JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid) t 
WHERE schema_name NOT LIKE 'pg_%' ORDER BY table_size DESC;

Show average amount of records per table

SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;

How to create data-only dump:

pg_dump -U your_pg_user -h pg_ip_address -p pg_port -a --column-inserts db_name > file_name.sql
pg_dump -U your_pg_user -h pg_ip_address -p pg_port -a --column-inserts --table=table_name db_name > file_name.sql

Useful pg_dump flags:

  • – C adds the CREATE statements
  • – s dump schema only
  • – a dump schema & data
  • – D dump using inserts (to simplify uploading data from PG into another db engine)

How to restore data:

psql dbname < infile.sql

PG stop/start:

$(PG_HOME)/bin/pg_ctl -D /data stop -m immediate
$(PG_HOME)/bin/pg_ctl start -D /data -l logfile


Get settings:

asinfo -v 'get-config:'

Set particular settings:

asinfo -v 'set-config:context=service;batch-max-requests=10000000'
asinfo -v 'set-config:context=network;timeout=10000000'
asinfo -v 'set-config:context=service;batch-index-threads==100'

How to register LUA-script:

register module 'your_script_name.lua'

more at http://www.aerospike.com/docs/guide/aggregation.html

How to build secondary index based on bin

CREATE INDEX _idx ON . (bin_name) NUMERIC

bin_name ~ field name

How to delete all records within set:


How to register lua script:

redis-cli script load "$(cat /YOUR/PATH/script_name.lua)"


How to save results of query to the file:

cqlsh -e"select * from table_name where some_txt_attr='very tricky string';" &gt; cassandra_file_query_result.txt

How to check node health:

nodetool status | awk '/^(U|D)(N|L|J|M)/{print $2}'

How to check compression ratio for particular table:

nodetool -h cassandra_ip cfhistograms some_keyspace some_table

How to check the dropped tasks count (at the bottom) at particular node:

watch -n 1 -d "nodetool tpstats"

How to do a “backup: of  cassandra:

nodetool snapshot

It will generate snapshot of data at /<your path from yaml file>/data/snapshot.

How to do a “restore” from snapshot:

      stop cassandra


      delete content of every keyspace table at /<your path from yaml file>/data/


      copy data from snapshot to the respective keyspace folder


    restart the server

7 sins of blatant ignorance and how to avoid them

…You produce software for some time. In most cases it even works.
Other developers tend to care about your opinion.
Damn, you even wear some fancy title like senior\principal\architect.
And here it is – suddenly you were offered to wear really posh title – CTO…

This is moment when real troubles get started.

I did mistakes by myself. I fought with others to prevent them from repeating of my past vicious moves.
And I wish list below appear every time I was about to make game changing decision.


A lot of challenges are awaiting for new CTO. How they can be avoided?

A lot of challenges are awaiting for new CTO. How they can be avoided?

1. Claim-to-be-universal tools suck. Always.

Do not make assumption based on bright hello-world example at promo web site.
In your case – it would be necessary to have some tricky functionality that this mega-framework does not support by design.

2. Be suspicious to any black-box like solution that promise all and everything at no price.

Never-ever expect that you are aware of the deep technical details.

In production, during important demo or pitch – impossible cases tend to jump out regardless of claims from probabilistic theory.
Listen to your team – they are your experts who (should) know about tiny nuances about setup, implementations and limitations.

3. Start simple. Focus on creating _working_ prototype fast.

Forget about speed, scalability, cluster, gpu-computing or “best practices” declared by yet another guru.
In 99.99999 percents of cases you do not need load balancing or advanced caching strategy.
You will iterate if it necessary.

4. Trendy stuff sucks.

Do not waste your time in fighting with bugs of another pre-alpha release of some looks like promising tool.

New database engine \ fresh from research lab language \ trendy paradigm – should be out of your list of consideration.

You need get stuff done. That’s it.
Good old bullet-proof solution are your choice.

Especially if you and team have experience of delivering some product with it.
Otherwise, year later you will realize that your repositories contains complicated workarounds and dirty hacks in desperate efforts to build initial DSL.

5. Listen to your team.

Measure & prototype. Be open-minded for their approach for solution.

Do not abandon idea only because you are not author of it. Encourage them for think out of box (even if it mean be contradicted to your opinion).

6. Books are your friends.

Inspire your team to learn new things and professional growth.  Make your habit to read every day – in long run it will make a huge difference.
Short articles from HN do not help you to build foundation of knowledge – it is just a tip of iceberg.
You never can be sure that you know enough – treat with suspicious any “undisputed” statements (and those who dare to make them).

7. Take it easy.

World of IT and software development is hilariously small.
You do not know whom you will be interviewed by next time.
Who will be contacted for additional reference.

All makes mistakes. Not everyone learn from them.

Avoid any illusions from QA department – in most cases software will not work as you expected in first version.

Positive vibes during stern fuck-ups is what makes our profession truly awesome and memorable.
Humor is your best way to deal with burnout, pressure and broken coffee machine.

Sprinkle usual working day of your team with few bits of laugh to remind everyone that programming is fun! 😀

Information retrieval, Search and recommendation engine:

Natural language processing:

https://class.coursera.org/nlp/lecture – Processing of texts written in ordinal languages
https://company.yandex.com/technologies/matrixnet.xml – search algorithm by Yandex

Information retrieval:

http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html – Introduction to Information Retrieval, Cambridge
http://machinelearning.wustl.edu/mlpapers/paper_files/BleiNJ03.pdf – Latent Dirichlet Allocation (LDA)

Few examples of applications for the above:


Recommender systems:

https://www.coursera.org/learn/recommender-systems/ – video lectures – 101 for Recommendation System
http://www.ibm.com/developerworks/library/os-recommender1/ – introduction to approach and algorithms
http://www.cs.bme.hu/nagyadat/Recommender_systems_handbook.pdf – “Encyclopedia” of recommender systems
http://www.slideshare.net/xamat/kdd-2014-tutorial-the-recommender-problem-revisited – overview of recommendation algorithms
http://www.machinelearning.org/proceedings/icml2007/papers/407.pdf – Restricted Boltzmann Machines for Collaborative Filtering

Ready for use recommendation engine:

https://cloud.google.com/prediction/ – Google recommendation engine
https://mahout.apache.org – Apache recommendation and general purpose machine learning framework

Elastic Search – just a few useful snippets

Install head plugin or rely on oldschool curl utility in order to test your queries:




Q: Show me example of query for complex, nested document?


{ "query": 
    { "bool":
        { "must": [
                 {"path": "document.sub_document",
                           {"must": [
                               { "match":
                                   { "document.sub_document.attribute": "PUT_YOUR_SEARCH_VALUE_HERE" }

NOTE: if what are you searching for in sub-sub-sub document – just add proper number of nested chains of “bool” “must” “nested” elements.

Q: I need full text search and aggregations (aka facets) by attribute in nested document.


{ "query": 
    { "query_string": 
        { "query": "PUT_YOUR_SEARCH_STRING_HERE" }
                {"path": "document.SUB_DOCUMENT"},
                            {"field": "document.SUB_DOCUMENT.ATTRIBUTE_NAME"}

Q: I need a full text search and aggregations by geo positions aka distance range.

NOTE: put proper values for “origin”, “field”, “ranges” fields.

{ "query":
    { "query_string":
        { "query": "PUT_YOUR_SEARCH_STRING_HERE" }
                {"path": "document.SUB_DOCUMENT"},
                            {"origin": "100500, 100500",
                             "ranges": [{"to": 1000}, {"to": 3000, "from": 1000}, {"from": 3000}]

Q: I have fields in document that contains multiple words, I want them to be be aggregated not as separate single terms, but as a whole string.

A.1 put proper mapping for such field – “multi_field” or in most recent version of elasticsearch – just “fields”.

... document mapping, ...

YOUR_FIELD: {   "type": "string",
                    { "type": "string", "index": "not_analyzed" }

... document mapping, ...

A.2 use such kind of queries for nested faceting:

        {"query": "PUT_YOUR_SEARCH_STRING_HERE"}
                {"path": "document.SUB_DOCUMENT"},
                             {"field": "document.SUB_DOCUMENT.ATTRIBUTE_NAME.raw"}

Q: I want return all documents sorted by distance and my geo_point field in nested document.

NOTE: ATTRIBUTE_NAME should be mapped as geo_point at moment of writing – it can be done only via manually created mapping.


    {"match_all": {}},
    "sort": [
                {"lat": 25,"lon": 55},
                 "order": "asc",
                 "unit": "km",
                 "distance_type": "plane"

Q: I want to return aggregation only?


{"size": 0,
    "aggs": {
        "name_of_1st_parent_aggregation": {
            "nested": {"path": "document.SUB_DOCUMENT"},
            "aggs": {
                "name_of_1st_aggregation": {
                    "terms": {
        "name_of_2nd_parent_aggregation": {
            "nested": {"path": "document.SUB_DOCUMENT_1"},
            "aggs": {
                "name_of_2nd_aggregation": {
                    "terms": {
        "name_of_3rd_parent_aggregation": {
            "nested": {"path": "document.SUB_DOCUMENT_2"},
            "aggs": {
                "name_of_3rd_aggregation": {
                            {"origin": "100500, 100500",
                             "ranges": [{"to": 1000}, {"to": 3000, "from": 1000}, {"from": 3000}]

Q: I want autocomplete?

A-0. NOTE better to use n-gramm based approach

A-1. Prefix approach for autocomplete:

    "query":{"query_string" : {
        "default_field" : "field.name",
        "query" : "start_of_phrase*"

A-2. by adding to document mapping additional suggest field:

"mappings" : {
    "document_type": {
             "suggest" : {
                        "type" : "completion",
                        "index_analyzer" :"simple",
                        "search_analyzer" :"simple",

When you add document for indexing you have to specify this additional information and use special endpoint _suggest for request suggestions:

    "suggest_name" : {
        "text" : "k",
        "completion" : {
            "field" : "suggest"

Q: I want filtering of a search result by nested attribute!

  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      "filter": {
        "or": [
            "nested": {
              "path": "document.nested_attribute",
              "filter": {
                "bool": {
                  "must": [
                      "terms": {
                        "document.nested_attribute.attribute_value": [
            "nested": {
              "path": "document.nested_attribute_1",
              "filter": {
                "bool": {
                  "must": [
                      "terms": {
                        "document.nested_attribute_1.attribute_value": [
                          "some string value"

Heterogeneous vector in c++ – overview of common approaches

So, you are wondering about heterogeneous vector in c++?
Maybe even dare to dream about any suitable substitution of such non-existent container?
In another word, you need a generic-like container that can store different datatypes.

If you just need a quick answer – stick with std::vector < boost::any> approach or read this,
If you need more technical-rich overview of purely templated solution – scroll down to links part,
If you are wondering about other options and don’t mind to dive in a world of bad English grammar and details about one of my recent task – read on.

heterogeneous container in c++

So, you want a heterogeneous container in c++…

First ask yourself – do you really need a heterogeneous vector?
If your answer is – yes, I have a bad news for you – in 99.9 percent it’s just consequences of messy design.
However, I am sure you are here for the sake of that one exceptional case: for example – your task is providing intermediate peace of software for interaction with some 3rd party old-fashion engine.

In my case – I was trying to implement convenient way of operating variable length list of parameters for OpenCL kernel wrapper.

If you are not familiar with mechanism of interaction of OpenCL code with C++, there are only one problem (sarcasm!) – it is too verbose. Off course there are numerous third party wrappers – http://streamcomputing.eu/knowledge/for-developers/opencl-wrappers/ but in situation where even NVidia drivers sometimes do not support all features from those-before-the-last standard,
I am afraid to think about cases when you have to deal with additional layer of external api.
Yeah, lets think about your own implementation because development of your own bugs is very entertaining and educational. It’s time consuming as well as terrible error-prone but you can narrow desired functional for your needs and be sure that all issues are made by you.

OpenCL kernels are compiled independently of host code, so you do not have any standard approaches for checking whether arguments provided to kernel have appropriate types and whether their amount corresponds to definition of kernel. It means that:
1) In case you are too lazy to somehow analyze every OpenCL kernels you can’t check how many arguments is necessary for particular kernel
2) You can’t check whether provided arguments have proper data type without any external parser

As a consequence, in general, my wrapper should be able to deal with variable length array of arbitrary any-type parameters.

(NOTE: Yeah, I’m familiar with undocumented c++ wrappers based on variadic templates, but it force you to follow their low-level nature by falling down from level of domain-specific objects to operating in terms of POD types.)

From that brief idea, I conclude that my goal was:

vector < gpu_arg > kernel_args;

where gpu_arg is a class that can encapsulate any data types – built-in as well as a user-defined.
Who have mentioned templates? How to create vector that can hold any templated parameter? (we discuss it a bit latter)

I approach – return to the ancient times
The most straightforward way – forget about C++ and rely on encapsulation of data into void* pointers with numerous C-way casting:

struct gpu_arg
  void* data;
  size_t size;
  // numerous helper methods here
  // NOTE: you have to add some kind of type_id to deal with data in proper way

I.e. kernel parser report that there should be following parameter set:

float, int, custom_class *

and when you start adding parameters it treat them as predefined types (with or without your own additional datatypes checks).

As an advantage of such idea – we can easily use it in run-time.
In addition, this solution is error-prone and can lead to cruel punishment during any code review.

II approach – std::tuple and variadic templates
On the other hand – in many cases when you are not keen to find a perfect silver bullet you may find helpful to simplify task. In order to check argument’s types you have to preprocess kernel, so you can form an expected parameter list. If you perform this operation during compilation of host-side code, you may use acquired parameter list to simplify code-generation task.
I started investigation of possible approaches and find out that the most obvious solution would be based on std::tuple:

// this C++11 container allows creation custom container like this:
std::tuple < int, int, bool, float *, unsigned short * > parameters_set;

or encapsulate it in a class with some syntax sugar for convenience:

template < typename ...T >
class arguments_set

    std::tuple<T...> data;
    template< size_t I >
    using data_type = typename std::tuple_element<I, decltype(data)>::type;

     *      variadic-based routine for initialize every element of tuple
     * */
   	template < std::size_t I = 0, typename TT >
	init ( TT & arg )
        std::get<I>( data ) = arg;

	template < std::size_t I = 0, typename TT, typename ...Args>
	init ( TT & arg, Args ... args )
		init <I,TT> ( arg );
		init <I+1,Args...>( args ... );


    template<typename... Ts>
    arguments_set(Ts... args) {
       init <0,Ts...> ( args... );

    template < std::size_t I = 0>
    auto get ( ) -> data_type<I>
        return std::get<I>(data);


Argh templates, well, who care about compiler’s effort to parse all this fluff?
Despite of convenience of variadic templates constructor, template metaprogramming is not easy to deal with during maintenance phase (as well as during developing).
On the other hand, it provides a desired result – strong type-checking combined with ability to generate variable length list of parameters.

So, solution was:
1) run pre-processor for OpenCL kernels in order to generate proper tuple for method invocation
2) compile the whole module

It is can be a solution in situation where you are not intend to run this mechanism in run-time.
(because it mean you have to dynamically extend templated class tuple objects)
Not my case.
Idea of pre-compiled kernels (PTX) sound great, but reality is sad – mess of drivers, hardware and vendor’s ambitions lead to incompatibility of generated binaries in general case. Not usable for me :(.
(But hope springs eternal – if you are lucky enough you can play with CL_CONTEXT_OFFLINE_DEVICES_AMD, http://clusterchimps.org/ocltools.php, http://www.browndeertechnology.com/coprthr.htm )

III approach – type erasure

Ok, let’s return to my preconditions once again:
1) I need type checking – template?
2) I need it in run-time – maybe some virtual stuff?!

What if I declare interface of argument as an abstract class and inherit it as a templated child with proper data fields


// pure abstract class
// in case of necessity can be further divided on pure interface\data-fields parts

class gpu_arg_base {
   *  interface part that depend on child's template parameter = a lot of virtual functions
   *  common data fields with ordinary setters\getters methods

template < typename T>
class gpu_arg : gpu_arg_base {
   * explicit override of interface with possible overloading
  gpu_arg ( T* init_data);
    T *data; // NOTE: just a reference to data, no allocation\de-allocation!

It allow me to use them in following way:

class kernel_wrapper {
    vector <gpu_array_base*> kernel_params;
     /* some stuff here */
        // variadic template functions to deal with any number of parameters
        template < typename T>
	add_kernel_args ( T * arg )
		add_kernel_arg ( arg );

	template <typename T, typename ...Args>
	add_kernel_args( T * arg, Args ... args )
		add_kernel_arg ( arg );
		add_kernel_args ( args ... );

        // generating of proper function for particular type
        template<class T>
        void add_kernel_arg( T * host_data )
	        gpu_arg_base* new_arg = new gpu_arg<T> ( host_data );
		kernel_params.push_back( new_arg );

     /* interface part */

But be careful with this easy-looking approach – there are two main issues which can affect your mood and calmness.
First, read about differences between overriding and hiding of methods in inheritance hierarchy here or here. It is a great source of confusion, especially during investigation of fresh bug-reports.
Second, do not forget about “covariant return type” rules – http://aycchen.wordpress.com/2009/08/17/covariant-return-type-in-cpp/.
Great article about possible caveats and workaround can be found there:

After reviewing all solutions described above, you may find that you accept additional dependency in exchange for absence of disastrous side-effects of your implementation.
boost::any or boost::variant can be a proper choice.

PS. Actually, I suspect that using tuples and dark magic of template metaprogramming, you can save few ticks of processor’s time by abandoning inheritance and virtual table, but as usual during development we have to balance between concept of the perfect code and requirements of too fussy world.


Example of heterogeneous container ( deeply nested approach)

Interesting practical example of tuple usage for ORM-like engine:
Some practical aspects of using tuples:

Using variadic templates to initialize tuples or other way round:

Any class in c++:

Illustration of variadic templates usages for generating C++11 variant class:

Hands-on experience with tuples:

How to convert png pair of RGB and Depth frames into Pointcloud library PCD format

There are a lot of accessible dataset of RGB-D data:


But usually it stored in PNG format and unfortunately Pointcloud library do not provide built-in function neither for treating it as a PointCloud nor for conversion it to PCD. For my experiments I need to test few points using data with ground truth estimation, thats why I have to code small utility for conversion purpose.

Due to sudden leisure I decided to share that peace of code – probably you can find it useful.


From readme:

png2pcd_batch – simple command line utility to convert depth and rgb frames
from png format to PCL pointcloud.

There are 2 execution mode:
using file with association information i.e. containing rgb-depth png files correspondence
or just providing folders that contain depth and rgb frames ( not reccommended ).

In 1st case you should anyhow create associate file by yourself
(for further details check description of parse_freiburg function)
In 2nd case – correspondence strictly depends on file names, and you should check it twice,
to avoid situation when selected depth frame is not appropriate for rgb frame.
( add sorting to filenames vector using custom predicate )

All dataset related parameters are incapsulated in Intr structure ( intrinsics ).
There are: width, height, fx, fy, cx, cy, scale_factor.
Usually depth data is saved as unsigned short ( 16 bit ),
but in pcl::PointXYZ you have to re-scale it to float – metric measurment.

Appropriate intrinsics should be written to file cam_params.cfg otherwise
default values will be used ( which may lead to invalid output data ).

There are exist two opportunity for compiling:

using classical make:
edit WORK_DIR in Makefile to point in directory contained pcl-trunk & opencv
it produce more lightweighted version by avoiding linkage with unnecessary libraries

or using cmake:

mkdir build; cd build
cmake ..

NOTE 1: There are only two dependencies:
PCL and OpenCV.
NOTE 2: in case of builded-but-not-properly-installed OpenCV libraries you have to
manually create symlink to ipp lib:
sudo ln -s /path-to-opencv/3rdparty/ippicv/unpack/ippicv_lnx/lib/intel64/libippicv.a /usr/lib/libippicv.a

Note 3: it have built-in support for 16bit unsigned depth and 3-channel RGB data only, in case your data has another format you have to change code a bit
Note 4: do not forget to provide appropriate intrinsics for proper calculation of XYZ vertex

Templates in plain C

Templates in ANSI C – simple and convenient method for emulating c++ like templates in plain c. Sample project, which demonstrate this technics can be found at github.

So, it is our constraints:

  • ANSI C (no templates, inheritance, overloading, default params etc.)
  • set of almost the same user-defined structures (the common difference – is types of internal fields)
  • set of the functions, which operates on user-defined structures and provide a common interface used in the whole app

The most straightforward way to solve such task is just hard coded all necessary routine by hand:

/*		first type - type_int							*/
typedef struct type_int {
	int data;
} type_int;

make_type_int(int init_val) {
	type_int return_value;
	return_value.data = init_val;
	return return_value;

subtract_type_int (type_int A, type_int B) {
	return make_type_int ( A.data - B.data );

 *		and a lot of different functions here

 /*		second type - type_float						*/
typedef struct type_float {
	float data;
} type_float;

 *		etc.

This leads to a huge amount of copy-paste and increase chances of errors, especially in case of a large set of functions and vicious habit of the compiler to use implicit type conversion.
But what is most important – such way a bit annoying and leads to impression of bad “smell” of your own code.
So I decided to google around (all helpful link are located at the end of articles) and find out that indeed – the better way for emulating templates in plain C is exist!

Here is my how-to for generating declaration of structures and implementation of methods operating on them, which can be used further in the whole project.
The main trick here is to refresh the basis of C preprocessor and macros.

Firstly, lets define several helpful macros in file my_types.h:

#define CAT(X,Y) X##_##Y
#define TYPE_NAME(X,Y) CAT(X,Y)

They will be used to generate names of your structures and methods using simple rule – merge two params, using underscore as delimiter.
Using macros above lets create our simple structure in file my_type_templates.h:

typedef struct TYPE_NAME(TYPE, SUB_TYPE) {
	SUB_TYPE data;

And add declaration and implementation of all necessary functions:


/* if this file is included in any header - just add there definition of interface */


 *		long list of supported functions


/* if this file is included in implementation file, where defined flag INCLUDED_IN_IMPLEMENTATION_FILE than generate implementaion of functions

/*	add implementain make_* functions 		*/
TYPE_NAME(make, TYPE_NAME(TYPE, SUB_TYPE) ) ( SUB_TYPE init_value ) {
	TYPE_NAME(TYPE, SUB_TYPE) return_value;

	return_value.data = init_value;

	return return_value;

/*	add implementain subtract_* functions 		*/
	return TYPE_NAME(make, TYPE_NAME(TYPE, SUB_TYPE) ) ( A.data - B.data );


NOTE: you should not use any global ifdefs in file my_type_templates.h, because we have to include it multiple times, for every new custom type. Preprocessor will generate actual struct’s names and appropriate functions for manipulating them.

After that lets specify all types, which should be used in our project in usual header file my_types.h. For every type, which we want to generate – just add define/undef command like this:

#define TYPE type
#define SUB_TYPE int
    #include <my_type_templates.h>
#undef TYPE
#undef SUB_TYPE

and add implementation of all functions to a source file – my_types.c – just define flag showing that it is actual implementation and include header file, containing all defined types:

#include <my_types.h>

In your project you should use my_types.h header as usual – just include it in all dependent sources. We have used ifdef for implementation part of our template, so header doesn’t contain any functions implementation – therefore there is no ambiguity during linking and all necessary function would be compile only once – during compilation of file my_types.c.

So the final files should looks like following:


#ifndef MY_TYPES
#define MY_TYPES

#define CAT(X,Y) X##_##Y
#define TYPE_NAME(X,Y) CAT(X,Y)

#define TYPE type

#define SUB_TYPE int
	#include <my_type_templates.h>
#undef SUB_TYPE

#define SUB_TYPE float
	#include <my_type_templates.h>
#undef SUB_TYPE

#undef TYPE

#endif /* MY_TYPES */


 *		Add all includes necessary for you implementations
#include <my_types.h>

That’s it. Using this trick you will achieve compile-time type-checks, decrease amount of boring hand-coded routine and avoid possible errors using another approach, based on void pointers and multiple run-time casts.

Additional resources:

Что почитать для проф развития программисту

Давно хотел как-то упорядочить свой список книг для внеклассного чтения для повышения проф-пригодности. Эти книжки для тех программистов, которые уже не совсем новички. Возможно уже и не совсем программисты – техлиды/архитекты. И хотят данную ситуацию усугубить.
Я из них прочитал еще не все 🙁
Но галочки уже расставляю 🙂

зы. Не думаю, что это нужно/интересно/полезно вот прям всем – это уже лично мой список отражающий текущие или прошлые профессиональные интересы. Я ж наверняка что-то позабыл включить, а что-то мне разонравится и я буду безжалостно это вычеркивать – так что в статье ожидаются правки 🙂


  1. Язык программирования С++, Страуструп
  2. Эффективное использование C++. 55 верных способов улучшить структуру и код ваших программ, Мейерс Скотт
  3. Наиболее эффективное использование С++. 35 новых рекомендаций по улучшению ваших программ и проектов, Мейерс Скотт
  4. C++: Библиотека программиста, Джефф Элджер
  5. Веревка достаточной длины, чтобы… выстрелить себе в ногу. Правила программирования на Си и Си++. Ален Голуб
  6. Стандарты программирования на С++. 101 правило и рекомедакция. Герб Саттер, Андрей Александреску
  7. C++. Практический подход к решению проблем программирования, Мэтью Уилсон
  8. Современное проектирование на С++: Обобщенное программирование и прикладные шаблоны проектирования, Андрей Александреску
  9. Advanced C++ Metaprogramming, Davide Di Gennaro
  10. Introduction to the Boost C++ Libraries, by Robert Demming


  1. Algorithms in a Nutshell, George T. Heineman, Gary Pollice, Stanley Selkow
  2. Алгоритмы на C++, Роберт Седжвик
  3. Алгоритмы. Построение и анализ, Томас Кормен
  4. Искусство программирования, Дональд Э. Кнут


  1. Эффективное программирование TCP-IP, Снейдер Й.
  2. UNIX. Разработка сетевых приложений,  У. Р. Стивенс

Функциональный подход к программированию

  1. Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp, Peter Norvig
  2. Learn You Some Erlang for Great Good!: A Beginner’s Guide, Fred Hebert
  3. ERLANG Programming, Francesco Cesarini, Simon Thompson
  4. Purely Functional Data Structures, Chris Okasaki
  5. Learn You a Haskell for Great Good!: A Beginner’s Guide, Miran Lipovaca

Проектирование ООП программ

  1. Head First Object-Oriented Analysis and Design, Brett D. McLaughlin, Gary Pollice, Dave West
  2. Head First Design Patterns, Elisabeth Freeman, Eric Freeman, Bert Bates, Kathy Sierra, Elisabeth Robson
  3. Head First Software Development, Dan Pilone, Russ Miles
  4. Domain-Driven Design: Tackling Complexity in the Heart of Software, Eric Evans
  5. An Introduction to Object-Oriented Analysis and Design and Iterative Development, Craig Larman

Компьютерное зрение

  1. Computer Vision: Models, Learning, and Inference, Simon J. D. Prince
  2. Multiple View Geometry in Computer Vision Richard Hartley, Andrew Zisserman
  3. Computer Vision: A Modern Approach, David A. Forsyth
  4. Компьютерное зрение, Шапиро, Стокман


  1. Чистый код. Создание, анализ и рефакторинг, Роберт Мартин
  2. Совершенный код, Стив Макконнелл
  3. 97 этюдов для архитекторов программных систем, Нил Форд, Майкл Найгард, Билл де Ора
  4. Защищённый код, Майкл Ховард, Дэвид Леблан
  5. Рефакторинг. Улучшение существующего кода, Мартин Фаулер
  6. Шаблоны корпоративных приложений, Мартин Фаулер
  7. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, Bobby Woolf, Gregor Hohpe
  8. How to Design Programs: An Introduction to Programming and Computing, http://htdp.org/, Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, Shriram Krishnamurthi
  9. Structure and Interpretation of Computer Programs (SICP), http://mitpress.mit.edu/sicp, Harold Abelson, Gerald Jay Sussman, Julie Sussman
  10. The Pragmatic Programmer: From Journeyman to Master, Andrew Hunt
  11. Writing Solid Code, Steve Maguire
  12. Hacker’s Delight, Henry S. Warren
  13. The Software Architect’s Profession: An Introduction, Marc Sewel
  14. 19 смертных грехов, угрожающих безопасности программ, Ховард М., Лебланк Д., Виега Д.
  15. Компиляторы: принципы, технологии и инструменты, “книга дракона”, Альфреда В. Ахо, Рави Сети, Джеффри Д. Ульмана,
  16. Паттерны проектирования, Гамма, Хелм, Джонсон, Влиссидес
  17. Test Driven Development: By Example, Kent Beck
  18. Code Craft: The Practice of Writing Excellent Code, Pete Goodliffe
  19. The Art of Multiprocessor Programming, Maurice Herlihy
  20. The Architecture of Open Source Applications, Amy Brown, Greg Wilson

VIM (emacs – 😛)

  1. Practical Vim: Edit Text at the Speed of Thought, Drew Neil
  2. Learning the vi and Vim Editors, Arnold Robbins, Elbert Hannah, Linda Lamb

Project Managment

  1. Искусство войны, Сунь-Цзы
  2. Мифический человеко-месяц, или Как создаются программные системы, Фредерик Брукс
  3. The Psychology of Computer Programming, Gerald M. Weinberg
  4. Extreme Programming Explained, Kent Beck
  5. Agile Software Development: The Cooperative Game, Alistar Cockburn
  6. Peopleware: Productive Projects and Teams, Tom DeMarco
  7. Adaptive Software Development: A Collaborative Approach to Managing Complex Systems, James A. Highsmith
  8. Software Craftsmanship: The New Imperative, Pete McBreen
  9. Dynamics of Software Development, Jim McCarthy
  10. Antipatterns: Managing Software Organizations and People, Colin J. Neill, Philip A. Laplante, Joanna F. DeFranco
  11. AntiPatterns in Project Management, William J. Brown
  12. Beyond Chaos: The Expert Edge in Managing Software Development, Larry L. Constantine
  13. The Manager Pool: Patterns for Radical Leadership (Software Patterns Series), by Don Sherwood Olson
  14. Death March, Edward Yourdon
  15. Leading a Software Development Team: A developer’s guide to successfully leading people, Richard Whitehead
  16. Head First PMP, Jennifer Greene, Andrew Stellman
  17. Agile Software Development, Principles, Patterns, and Practices, Robert C. Martin
  18. Цель. Процесс непрерывного совершенствования, Элияху М. Голдрат, Джефф Кокс
  19. Как пасти котов. Наставление для программистов, руководящих другими программистами, Дж. Ханк Рейнвотер

UX design

  1. A Project Guide to UX Design: For user experience designers in the field or in the making (2nd Edition) (Voices That Matter), Russ Unger, Carolyn Chandler

и, на посошок

Некоторые любопытные обсуждения литературы для проф развития:

http://mrelusive.com/books/books.html – список книг для разработчика игр