Chronicles of one crypto-arbitrage bot


Chronicles of one crypto-arbitrage bot

Some time ago friend of mine, innocent in terms of exposure to bloody details of executing mid-term IT projects, sold me yet another brilliant idea of how to effectively decrease amount of leisure time – by diving (again) in horror-land of startups.

It was excited journey with up, downs and sort of Challenger Deeps that empowered some of my beliefs in regards of general principles of software development, made me treat another “rules” in a more complimentary way, and definitely allow to grasp over few interesting tricks.

With this article I want to highlight curious constraints that we have to overcome; real-life uncertainties, shaping my architectural approach and illustrate, how project evolved during growth of code base and changing requirements.

As years went by I tend to think that there is two main approaches for starting your own startup:

  • you try to foresee any possible outcomes, address all edge cases, horrified by volume of work and end up with “Oh, hell, fuck it, no!”
  • you start with “Fuck it!” jump in with some awkward prototype written during the lunch to see whether this shit work at all.


Bright Idea

It was early 2017, term ‘crypto’ starts appears in mass media, video cards were out of stocks almost everywhere. ICOs, promising various kind of digital revolutions, wide spreading as bubonic plague during ancient times, raising crazy money in exchange for vaguely compiled pdfs. Gold rush of modern century as it is.

There was bunch of crypto-exchanges offering to trade alt-coins: different subset of trading pairs,
complex rules of fees, volumes of digital “commodities”. It was distributed and (almost) non-regulated market where price at the same coin may be different among exchanges.

Idea was pretty simple:

Deposit money at supported exchange, monitor price difference for some coin, as soon as it exceeds profitability threshold – sell at one exchange, buy at another exchange – just to maximise absolute volume of all coins in all our account’s wallets. Approximated opportunities window (based on manual trade’s execution) sometimes reach up to 15 minutes – i.e. it was possible to send missing coins from another wallet to exchange and trigger necessary order.

All of this still sounded pretty easy – so we agreed on investigation stage: collect fluctuation of prices along the day and analyse them to better understand prospects of project.

At that time I can’t distinct ask from bid, how order different from trade, what is the candle or ticker and how price regulated at all.

Very brief intro to terminology:

Order it is when you register within exchange your desire to buy or sell particular crypto currency. If you want to buy – it called bid, sell – ask.
When someone else decided to put order for the same coin and price be matched – trade(s) will be executed – i.e. coins travel from wallet to wallet, exchange charge their fee, order’s volume will be updated to reflect executed trade, when it become zero it mean that order fully filled.
Volume (amount) it is exactly what you want to trade, but price – depending on exchange, can encapsulate several strategies, most common of them is to define exact price.
It has one issue though – price is changed based on current state of order book – i.e. if no one want to sell something for price that you set – you can’t buy anything.
On practice it mean that if you set price according to current top order, then during period since you click submit – till the moment exchange noticed it – someone else may purchase everything and the next lot in order book would have another price.
That means no matching order – your order may hang, and, depending on exchange rules, may be expired in couple of weeks, depending on price movement.
To overcome this inconvenience another common option is to use ‘market’ price – when you definitely want to participate in trade on best possible real cost (if order matching engine implemented properly).
Ticker – summary, that highlight changes of trades for fixed time period: highest ask, lowest bid for coin pair, and other details that varies from exchange to exchange.
Candle – usually have more verbose information – open-close-high-low prices for time period and volume related info.

First blood

So, yeah, returning to the topic of getting the data from exchange.
I have started looking at public api for chosen exchanges. Can be a great illustration for Noah’s Ark – a lot of very different beasts. In the world of classical exchanges Fix protocol is a common way to get updates from exchanges but even now it is almost not supported in crypto world. Web sockets were complex to use and not available at majority of exchanges. Orkay – it meant we are working through REST. Get data from tickers and save it to look later. A piece of cake!

Initial tech stack and reasoning behind it:

  • I do not expect to have some crazy data not in terms of volume not in terms of operations.
  • Database? But maybe this thing will not fly? Csv files are still very convenient to deal with! (yeah, right)
  • Language – speed of implementation that was what matter the most – so no exotic I-want-to-try something new, no complex power of c++ or scala, no verbosity of java – I just stick with python. 2.7. Because, umm, well, it was installed by default and it has everything as 3? (yeah, right)

Those initial code several python modules and single ugly ipython notebooks probably not even committed, but looking at tickers I can’t believe my eyes – this is too good to be true. Either there were some very core errors in our reasoning (or in the code), or we should abandon all other life activities and dive in implementation of remaining functionality asap.

We will be rich soon!

After discussion decision had been made to start iteratively.

During the first stage I will create prototype application that

  • collect data for further analysis – Arima combined with the most advanced tricks from technical analysis should give us ideal prediction (yeah, right!)
  • trigger notification if observed difference noticeable enough to execute manual trades.


  • Notifications. It is strange but not all people find amusing digging through log files or read emails. On the other hand telegram has free bot api, mobile and desktop clients with nice UI to simplify analysis.
  • Every exchanges choose to have their own unique naming convention for coins. All right we will create intermediate naming layer and exchange specific conversion routines.
  • Database. Yeah. Probably it is time to have it as collected csv starts exceeds Gb in size after couple of days of running. What about constraints?
    • Hundred writes per second seems to be very safe upper bound (yeah, right).
    • Reads? Only when analysing data (yeah, right).
    • I do know about normal forms, so with proper approach un-voidable schema changes should be cheep.
    • I do not yet know which data dimensions will be the most important for us – so I want flexibility to tweak data tuples based on every dimensions (columns).
    • Implementation language may be changed in future (yeah, right) – so wide choice of _mature_ client’s libraries is definitely an advantage.
    • I have worked with postgres – which seems to satisfy all the above and have a lot of room to go above – thats another reason to use known technology.
    • Just to keep in mind that things still may be changed so some simple abstraction layer that allow to switch database are always good.

When plan is ready it is very easy to type code – no matter that it is after intense working or family hours – pick issue and execute. But… yeah it would be to easy without them:

First obstacles

There is famous quote «Data is modern oil» but exchanges not really keen to share their data – and in case you need granular timed snapshot of market state the single way to acquire it is to collect itself (or buy). Order book may have dozen thousands of bids and asks. If you request it every second – majority of data will be duplicative and its retrieval will be just annoying overhead for exchanges. If every second you want to get info about hundreds of coins – it can be considered as some kind of ddosing. There is a lot of policies how to control end-point abusing: weighted cost for every API invocation and maintaining client’s balance, renewable every quant of time; share quotes of request – per IP or api key; specify global hard limits and as soon as client exceed allowed number of connections\requests throw 429, 503 or even disrupt all communications with offender. It was very interesting to learn such diversity of load balancing approaches by practice – troubleshooting data retrieval processes by playing with timeouts between invocations and spreading applications among several VPS.

All logging related activities are under constant threat of becoming too noisy and be ignored. And it is always very tricky to choose balanced strategy to prevent recipient from creation of special rule for incoming mails to automatically move everything to the trash. As I mention earlier time gap for opportunities window sometimes used to reach 15 minutes, we analyse price every second and arbitrage event may occurs for multiple currencies. As result our telegram channel were bombarded with identical messages making it un-usable (and telegram doesn’t hesitate to ban us for cool down period as well so we have to add some back-off mechanism as well). In order to make navigation through alerts history in telegram channel more simple we introduce naming convention to have pair of exchange and name of currencies as tags. Additionally we have to went through a lot of iterations in regards of alert attributes: severity, depending on size of possible profit; time – our local? (but we were in different timezones), exchange time (but due to their geographic distribution it also be different), UTC from server where we run process. And, yeah, this channel at telegram were muted. For the same reason of huge amount of notification we disregarded idea of adding simple controls to be able to manually confirm order placements.

We decided to store details about observed arbitrage events as well, history retrieval were ready with some precautions for timeout handling and restarting. Now question where actually store data. If you are lean startup and your free EC2 micro instance were expired, but you do not wish to invest time in investigating of alternative options from competitors your option is to have «in-house» infrastructure. I.e. just run `docker pull postgres`, map volume for persistency and to speed up things a bit (just in case for so cold premature optimisation(C)) disable network virtualisation. Now how to access it? Classical setup is following – you have router from internet provider with dynamically allocated public ip address. Depending on your luck and location (country of residence) ip address will be periodically changed by ISP provider. Behind router – in your home private network ip addresses usually assigned by dhcp protocol – i.e. you restart your host where database server is running and it may get another private ip address. In addition to this it is your home network – i.e. NAS with private photos or laptop with work’s projects. There is a lot of various services offering dynamic dns – you have to run special daemon at your server that will update bonded ip address for chosen domain name – I have chosen Second simple step is to configure your workstation to have statically assigned ip.  Many young hacker’s bundles offer convenient interface over nmap to scan for open ports and check whether by any chance you allow ssh access using login and password.  In order to have at least some protection layer port forwarding were setup using not standard port, ssh was configured to provide access by key file only and old good ip tables allow you to configure banning ip address for failed authorisation attempts. With all this pre-cautions – you can initiate a tunnel with port-forwarding in such way that remote postgres process will be bind to your localhost:1234.

Real programmers test in production

Reactive asynchronous program dealing with real world not so easy to test. And the main reason for it – infinite number of possible combinations of weird situations to happen. Blackout in your home office, un-noticed maintenance of exchange due to recent hacking, issues of internet provider – you name it I saw them all!

First launch with real money went in following way:

  1. we run multiple processes
  2. telegram channel were bursted with endless messages with details of order placed
  3. I saw that telegram started exponentially increase time outs to forward messages from the queue to the channel – i.e. my code do something and it is not visible for us (Oh, shiiiit)
  4. my partner was trying to keep up with message flow – but it was impossible as for every message he have to check two different exchanges filtering by currency.
  5. We need to stop it! Now! Side note: Let’s add to backlog new feature request – we need some mechanism to stop all of them once. At that time I just run pkill to everything with python.

After 6 months of development first 20 minutes of running reveals various issues ranging from annoying to critical.  And in order to investigate what was happening and find out root causes we have to dive in logs and history of orders and trades – situation from bots perspective were quite different in comparison to what exchanges thoughts.

Exchange performance

Kraken – it just doesn’t work. I am still have old script to issue placement of thousand orders and compute ratio of success. At that time it was like 30 something from the whole thousand. At least it was easy to verify and reproduce.

Maintenance of up to date balance state

You have to know how many money you have – as it define scale to what you can trade. In crypto-exchange world there are primary coins – BTC, USDT, ETH – they used for trading as base currency; and secondary – myriads of alt-coins that traded for base. If there are many processes using same wallet it mean that any of them may change remaining volume of base currencies. So issue related to maintaining up to date state of balance was much more trickier. Initially, every arbitrage process sync it independently with every exchange before computing details of order to place. In case of any network issues during balance retrieval – rely on previously retrieved state.

Some of exchanges distinct total and available balance by introducing freezed balance i.e. portion of coins were allocated to registered orders, that does not yet executed. When you deal with response you have to pick proper field to be sure to what extend you can go crazy with trades. Sad thing that not all exchanges actually have it. (Kraken) But anyway I completely missed this bit during implementation – so we have several cases of «Not enough balance» errors during order placement but paired order at another exchange successfully placed and executed – i.e. direct loss. 

Another issue related to balance – burst of requests from the same ip. Balance API is specific for user, and considered as private API with authorisation i.e. more sensitive to load. When price on some coins fluctuating we experience situation that many processes requested it within same time window and response may return with delay or even timeouted. Exchanges start completely ban ip address which was actually fine as no trade were possible in this case because even order book can’t be retrieved. Throwing timeouts, on another hand, was catastrophic as bot have to wait for timeout period (i.e. some one else can place this order) or rely on outdated balance state to place order on one of exchanges, failing to do it at first because of not enough balance.

As we already started experimenting with multi-node deployment – solution to issues above were inject balance retrieval to dedicated process that every second update balance state in redis cache. Every arbitrage bot may access it and check, if last date of update became too old – immediately shutdown itself. After order placement – process forcefully update balance at cache. This approach was optimistic as well – even after execution – it take some (tiny but still may be crucial) time to reflect changes – just humble attempt to minimise time window of uncertainty.

Order book analysis

Initial approach for analysis of current state of market was take the top bid and ask, compute price difference, compare with issue orders, take a look at second pair of bid and ask. But at the same time other players may affect order book – so whatever we are dealing with is already out of sync, and we saw situation when we try to bet on already missing bids, So we decided to do processing of order book only once. Second issue – sometimes price were different, but volume wasn’t enough to get profit. It varies from coin to coin, so we have to pre-compute and maintain volume cap, dependent on most recent coin price and exchange fee, to prevent placement of order leading to loss and use for analysis not only first bid\ask but median of topN of them to approximate possible delay of processing and activity of other players.

Reverse coins movement

Initial idea was just to maximise total balance across all exchanges, but in this case it will be possible (and it happens to us as well) to use all coins from one exchange so just can’t trade it anymore. So we decided to add another mechanism to re-distribute balance evenly across all exchanges by using the same logic as for arbitrage but with lower threshold – i.e. even if it will be zero profit at the end – we still meet our goal. Those logic were triggered for particular currency only if observed dis-balance on pair of exchanges exceed configured threshold.

Busy days

On the rise we have 5 EC2 instances, each running up to 80 processes, history data were stored at Postgres at RDS (bills up to 600$ for this one? Hell no, lets migrate back on self-hosted solutions!). Deals channel at telegram beeped every day. It doesn’t mean we don’t have issues but situation were stable enough to check it every evening for confirmation that everything were fine or re-prioritise backlog.

However first days of live trading brings few obstacles:

  • Logs tend to consume all disk space (yeah, it was very inefficient text file logging) so processes tend to dies on file write (simple logrotate configuration with cron solve it)
  • Deployment also become an issue – manually starting hundreds of processes is a bit overwhelming. Another thing that processes are not daemonized (yeah, I know) in order to have some kind of heartbeat within console as opposite to search in logs by PID and time, so to avoid any issue with SIGHUP they have to be deployed under screen sessions. First we will create new screen, named by pair of exchanges, inside of it we will have dedicated console named by trading pairs to simplify troubleshooting. Same things for other supporting services: history retrieval, balance updating, telegram notifiers, etc. So I drafted few scripts and middleware to automate it within single server. And bash alias to have command to stop processes without screen shutdown.
  • Almost every exchange required to have nonce as part of payload for trading related requests – incrementing sequence of integer number. I have experimented with various approaches but we end up with relying on redis as well to share it across many processes relying on the same api key.
  • Re-balancing initially were run as independent processes and sometimes they compete with direct arbitrage processes, so decision have been made to run them sequentially within the same process

Conquer new heights

  • When we start adding more coins and new exchanges – errors handling become more and more important. On every request we may get in response timeouts or errors (and it actually may mean that order placement were failed, but also it may succeeded!). Errors may also be returned as part of payload – i.e. http response was 200, but payload said error something. Additionally what bot was doing so far was just order placement – i.e. no any guarantee that trades were executed – i.e. manual work required to review open orders and process them. Our solution to both these issue were introducing two priority queue backed up by redis to keep track all placed orders sorted by time. Every queue may be read by one or multiple consumer process that check status of order with the exchange and properly process it.
  • From time to time speed become an issues – we noticed that not always can issue order in time python 2.7 doesn’t have support for async so I have added gevent and pool request, not so convenient to deal with and it doesn’t completely eliminate issue – as we are still operate on top of snapshot of order book
  • Architecture of module, probably, partly a bit verbose, but proof itself to be flexible enough to fuse new exchanges with their quirks in terms of new exchanges’s API integration and addition of new currencies.

Problem of floating point arithmetic in python.

Rounding is a known pain in the ass in many programming language ( and python not an exception here. Apart of situations when we sell more or less than expected, there were other cases when it affected us. Some exchanges are very strict in cases when you submit volume value with more than expected precision – when API expect up to 6 numbers but receive number with 7 digits after decimal point – the whole order will be rejected (and paired order at another exchange may be executed as they may have more relaxed rules). Those rules are specific to exchange, vary from currency to currency – i.e. you can have 0.00001 of BTC but not for some obscure coins. And, guess what, those rules may be changed – so you have to monitor for any ‘precision’ errors on order placement. Another annoying thing – implicit fallback for scientific notation for string representation for small numbers. I do understand meaning of 1e-6 but usual api is not so smart. In order to overcome all those issues it wasn’t enough just to switch on Decimal class but the whole new layer of conversions have to be implemented to cut redundant part of string representation based on exchange specific rules.

Brightest prospects

Many development activities happens outside core module:

As size of database grows we start thinking about encapsulating news feed and various forum’s rumours to checks whether it correlated with fluctuation of prices. Inspired by this aim independent project were started to create web-crawlers for twitter accounts, telegram groups, medium posts, reddit threads and few more specific resources to get all information within some date-time range. Every new source of information brings new surprise – for example in order to work with telegram you have to build native extension to use java wrapper through JNI. In order to efficiently get tweets with comments for curated accounts we have to maintain set of keys and rely on key rotation mechanism as soon as api start throwing timeouts. Important aspect there – time handling. World are big, complex and have many time zones which sometimes may be reflected with web-page or response in all its cheer differences, so we have to adjust everything to common denominator – UTC – as we have done for market data.

Another direction was to simplify bot management & deployment to decrease operational toil. As we already started use redis for sharing common information (nonce) and maintain alerting messages and order’s queue, next steps was to extend this mechanism to the next level – build sort of message bus for inter-process communication. Other important functions are: automation of deployment of services to a new node in a bit more advanced way that just cloning the whole repo, update configuration for particular server, shutdown all processes at all registered nodes at once. In order to flexibly manage newly added EC2 instances to the system we add ability to deploy agents. It is a special daemon that control arbitrage and supplementary processes at node, listen commands from UI of command centre. From monitoring perspective – it would be much more handy to review state of processes by skim through colour of trading pairs per servers reflecting delays of heartbeat, than ssh’ing inside screen of every server to watch for heartbeat messages in console. Many challenges were overcome on this path as well: flask & dash integration, dynamic UI creation (I am not the biggest fan of frontend) with ability to forward input from newly created forms into redis.

The fall of Rome

And suddenly something changed – just in one single day we no longer were able to place orders in time for few currencies. Either some one plays with scalping (which, in theory, exchanges tries to prevent), or someone were able to prepare efficient ML model to predict price movements within second intervals, or someone just can do it faster. Even now it is not the case for all currencies for all exchanges – as individual or institution have to pull a lot of money to cover everything, but it affects profitability so direct arbitrage approach probably will not bring you too many money. The last attempt that I have made was to switch trading on web-socket basis. Idea was very simple – you pull snapshot of orderbook and subscribe for updates to sync and then maintain actual state locally. Subscription – means event loop in dedicated threads. Asynchronous code with threads not ideal use case for python 2.7 so I devoted good amount of time to add option to show in real-time top bid and asks from both order books for debug purpose. But several trial sessions with this approach doesn’t reveal too many opportunities for profit and show general instability of subscription itself that may lead to re-syncing and potentially error-prone.

Lessons learned

  1. Exchanges – just another piece of software with all related issues of big complex systems – reliability, consistency, performance. So you have to deal with it with all possible precaution: stale orderbook, delays with balance updating and order placement, errors management.
  2. Vps is not real hardware – you still share resources with other processes. First time I saw at htop metrics steal time!
  3. Python for mid size projects – great speed of development, rich ecosystem – google it and use. Be careful as you never know how that guy implemented those function that you need inside library – if you care about performance – try to minimise external dependencies as much as you can.
  4. Implementation of some builtin algorithms may be not as you expected – check set intersection or insort.
  5. It is useful to use some weird default values: -1, 100500 – when you stumble across them in logs you (almost) 100% sure that something went wrong.
  6. If not sure – better just shutdown process.
  7. Shell programming – ancient mortal art that can save a lot of time for supplementary task for example to analyse logs.
  8. Bugs related to lack of types — stupid and annoying errors that requires long time to troubleshoot
  9. Code indentation – when by mistake block of code moved inside or outside of cycle or condition – unique «feature» of python – also require thorough investigation and not so easy to find
  10. python and date
  11. python and float
  12. cost of function call may be high – inlining may lead to code duplication but may be faster

On interview preparation: nuances of python

From my perspective mastering particular language is still secondary skills for programmer.
For sure it will affects architecture of software and may add advantages for some use cases of handling data flows, but overall, in the long run, decent design of system coupled with carefully catered algorithms much more important.

Nonetheless within this article I want to highlight several topics that usually distinct python veteran from novice. Part of these questions I was asked about during recent interviews, some of them I had asked myself to distinct people, who name themselves python expert from those who really are, based on long exciting hours of debugging sessions from real products.

Why I want to start with python and not serious languages c++ or scala?
Currently its de-facto standard second language accompanied more performant one and
in many domains it is considered to be a primary language because of simplicity of entry,
rich ecosystem and variety of libraries.


  • CPython – default one
  • Pypy – If you need a bit faster python (JIT compiler with faster version of garbage collector)
  • Jython\IronPython – in case you need integration with java or .Net world
  • Cython – you need more speed, numpy, pandas and scikit can not fullfil what you need,
    but you still afraid to accept that c++ may be your best friend.

Dependencies management:

Standard practice is to isolate dependencies within dedicated sandbox of virtual environment with all required packages and python interpreter itself installed in such way that it doesn’t confronting with system wide installed libraries.

virtualenv is mainly used for it, but virtualenvwrappers may be more handy. In case it will be necessary to maintain several python versions simultaneously Pyenv can help.

Requirements.txt – should contains list of all dependencies.
Using –index-url you can specify your own private package servers instead of default one – pypi – including private repos with access through ssh or just local file path:


Distribution and Installation:

  • – during automatic package installation requirements.txt are not analysed
    anyhow – what really matter is file used by setuptools. Its worth to be aware about install_requires directive, that should contain list of all dependencies.
  • wheel (or egg) as common distribution formats.

Popular language’s caveats:

Mutable vs immutable datatypes

You can’t change int or string value  of some variables – you have to replace it with new value. But you can add element to list or remove key from dict. Custom objects usually mutable – you can go crazy with altering their fields or adding new methods.

Why it maters?

Mutating default arguments – perfect case for hard to find bugs:

def some_method(b=[]):     # WRONG, use b=None instead
    return sum(b)

for x in xrange(3):

Arguments are passed inside methods by reference i.e. it is not copy of object.
In case they immutable and you try to modify them inside of function – new local instance will be created – those procedure named rebinding, if they are mutable – you can end up modifying originally submitted value.

How to make sure that it is actually mutating?

x = 100500
y = 100500

id(x) == id(y) # False, values are the same but objects are different

Method id return identity of object (sort of virtual address of variable if you are used to C terminology)

z = x
id(x) == id(y) # True, equal!
z = 127
id(z) == id(y) # False, as we make re-assign z to another value

Note however:

x = 250
y = 250

id(x) == id(y) # will be True 
"""because of optimisation of interpreter for small numbers:"""

x = "STR1"
y = "STR1"

id(x) == id(y) # may be True as well
interpreter's implementation specific
for immutable types, operations that compute new values may 
actually return a reference to any existing object with 
the same type and value:

Operator is  is doing exactly this – comparing ids of two objects – whether they both reference exactly same memory cell.

So with brief ideas about value and reference, we can tackle equality operation:

class A():
    def __init__(self, a):
        self.a = a

x1 = A(1)
x2 = A(1)

# Both will be False
x1 == x2
x1 is x2 # equal to id(x1) == id(x2)

By default, comparison operator == invoke __eq__ method and it use ids of objects for comparison.

Truthiness vs “is None”

class A:
    def __init__(self, a):
        self.a = a

class B():
  __bool__(self): return False

l = [float('nan'), 0, 13, 0.0001, "", "False", [], A(None), B(), False]

for entry in l:
    if entry:
        print("{} - True".format(entry))
        print("{} - False".format(entry))

Is None – is nothing else as straightforward check whether some variable reference the same address as None value. But what checks are performed during invocation truthiness check for some_object? Under the hood it just evaluate objects – check for existence of implementation of __bool__ or len method of object, if they are not present – treat object as True.

Hashing in python

Common question to confuse people – why by default hash is defined for immutable objects only:
reasons will be very obvious if you think about example below:

class A():
    def __init__(self, a):
        self.a = a

    def __hash__(self):
        return hash(self.a)

    def __eq__(self, other):
        print("__eq__ invoked!")
        return self.__class__ == other.__class__ and self.a == other.a

container = set()
c = A(1)
d = A(10)
c in container # True
d in container # False
c.a = 123
c in container # False

I.e.  after we change value of object’s attribute – we can’t really check whether it was in container. That make sense! But what if we do this one:

d.a = 1
A(1) in container # False, and call of eq method
d in container # False, and call of eq method

I.e.  such kind of checks always invoke __eq__ for cases when hashes are similar in order to deal with possible hash collisions.

What it mean from practical point of view:

  • if object1 equals to object2 their hashes also should be the same
  • in theory it is possible to use identity of objects for hash computation and compare values in case hash collisions

What will happen if we try to compare lists?

a = [1, 2, 3]
b = [3, 2, 1]
c = [1, 2, 3]

a == b # False
a == c # True

Check equality for every pair of elements with the same index – so order matters!

Copy – shallow and deep

What if we add this to our equation?

d = a
a == d # True
print("{} = {}".format(id(a), id(b)))

I.e. we have similar address that d & a referenced.
Due to it – if we add element to a, d will also have this element.

It mean that in case we need two independent instances of similar list we need to use
deepcopy. And it will be necessary for every mutable type (or collection).

Python sort of names mangling:

Python offer interesting approach to prevent name collisions in case of inheritance.
Knowledge of this trick can save your time while working with someone’s code.

class B():
    __attr1 = None

u = B()
u.____attr1 # AttributeError: B instance has no attribute '__attr1'
u._B__attr1 = 123 # All right then

Slicing tricks:

Reversing string, pythonic way:

a = "Acapulco"

And familiarity with slice notation


may be beneficial by providing very compact code:

g = [x for x in xrange(10)] # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
g[::2] # Even only [0, 2, 4, 6, 8]
g[-3:] # Last three elements [7, 8, 9]

Init list with similar values:

b = [-13] * 10

But don’t get in trap of using:

b2d = [[-13]*10]*10 - # WRONG! will create 10 reference to the same list

this is valid way of doing this:

b2d = [[-13] * n_cols for i in range(n_rows)]

How to work with resources using with notation:

with open("in.txt", "r") as input, open("out.txt", "w") as output:

Benefits – implicit handling of resource release in case of any exceptions

And yeah, nothing will stop you from adding to your own objects similar functionality
just add to them __enter__ and __exit__ methods.

Talking about exceptions:

Keep in mind:

  • possibility to rely on exception’s inheritance hierarchy
  • using else when no exception is raised within try block
  • finally is similar to what you may seen in other languages – will be executed anyway.
import random

class MySimpleException(Exception):
    """Base class"""

class MyMediumException(MySimpleException):

class MyCriticalException(MySimpleException):

class MyCustomException(Exception):

def some_method():
    res = random.randint(0, 10)

    if res == 0:
        return res

    if 1 <= res < 3:
        raise MySimpleException("MySimpleException")
    elif 3 <= res < 6:
        raise MyMediumException("MyMediumException")
    elif 6 <= res < 9:
        raise MyCriticalException("MyCriticalException")

    raise MyCustomException("Bad luck today!")

def exception_handling_example():
    except MySimpleException as e:
        print("Any exception within class hierarchy: " + e.message)
    except Exception as e:
        print("Any exception" + e.message)
        print("No exception at all")
        print("This will be executed no matter what")

for x in xrange(10):

Iterators, generators & yield:

x = range(1000000) # Python 2.7.*
import sys
sys.getsizeof(x) # 8000072

def my_generator(n):
    num = 0
    while num < n:
        yield num
        num += 1

x = my_generator(1000000)
sys.getsizeof(x) # 80

Range will create list with all elements in memory.
my_generator(1000000) will create special object – generator – that can be used in loop –
in our example it will produce values from 0 up to 1000000 – 1 in lazy fashion:

next(x) # 0
next(x) # 1
next(x) # 2

Many familiar with syntax of list comprehension but may find this syntax a bit confusing:

a = (u * 2 for u in g)

but this is comprehension as well – it just return generator

This is not:

x = xrange(1000000) # &amp;gt;&amp;gt; type 'xrange'
next(x)             # TypeError: xrange object is not an iterator
x = iter(x)
next(x)             # 0

Iterator – is object that have implemented both __next__ and __iter__ method.

Generators is just one particular case of coroutine:

  • within method we have list of instruction,
  • when we hit return, all subsequent calls of this method will re-execute all those instructions.
  • when we hit yield – current state of method (last executed instructions and values of local variables) will be saved and subsequent calls to this method continues from the last statement.

Few words about import in python:

First of all a bit of terminology:

  • module – is just *.py file (or C-extensions – *.pyd or *.so, built-in usually are C-extensions as well)
  • package – set of modules under some folder (for python 2.7.* – it must contains __init__ file)

When we write:

import X
# Just a shortcut to use: xyz.method(args)
import X.Y.Z as xyz
# Will bring to global scope method, variable or complete module Y
from . import Y
# Will bring to global scope EVERYTHING from module X, including its sub-imports
from X import *
  1. Import search module in sys.modules cache that contains previously imported modules.
  2. If not found – search among built-in modules.
  3. And as the last hope – search by filename in every directory mention in sys.path in order of appearance (different version of the same modules in two paths – will load first one)

When module to be loaded is finally found – it will be parsed, interpreter set few variables before execution – for example __name__, and finally execute – hence those famous check for main:

# executed when we imported this module from other code:
# import x.y.z.name_of_this_script, 
# __name__ = "x.y.z.name_of_this_script"
print("module loaded")

if __name__ == '__main__':
    # executed when we run "python ./x/y/z/"
    print("module executed as main")

Parsing may be slow – it first check for pre-cached versions – *.pyc files (with all related last time changed vs timestamp of cached version).

Importing will change global variables and therefore perform some blocking to ensure that nothing
will screw up – even if modules to be imported already in sys.modules cache.

Usually two types of import are distinguished:

from a.b.c.d import Z   # absolute - can be a bit verbose
from . import Z         # relative - add Y from current package

These is just syntax sugar for more compact code:

import X.Y.Z as xyz

from X.Y import Z

It still will be loaded fully – not just Z but X, all imports from X, Y, all imports from Z, and finally Z (with all imports as well).

Circular imports

Imagine following project structure:

    module1 – contains line from module2.script2 import method2
    module2 - contains line from module1.script1 import method1

python -m module1.script1

When you try to run script1 python will try to resolve all imports first:

  • ok, we need import module2.script2 – lets load all imports for it,
  • within script we have to import module1.script1 – lets load all imports for it,
  • within script we have to import module2.script2 – lets load all imports for it,

In most cases it is matter of proper design, but nonetheless changing imports to be

import module2.script2

will load it once.

Another common approach to fight import’s issues is to use deferred importing:

# imports
# some method

def very_important_method():
    from another_module import smth
    # code


f = lambda x: x.some_attribute
type(f) # type "function"

Single expression – i.e. it should return something and fit to single expression – i.e. sequence of statements that return some value. Example of usage: map, filter, sort.

Memory optimisation

Python is not the most efficient when we talk about performance or RAM consumption:

import sys

class A(object):

a = A()

sys.getsizeof(A) # 904 0_o
dict_proxy({'__dict__': &amp;lt;attribute '__dict__' of 'A' objects&amp;gt;, '__module__': '__main__', '__weakref__': &amp;lt;attribute '__weakref__' of 'A' objects&amp;gt;, '__doc__': None})

sys.getsizeof(a) # 64
sys.getsizeof(a.__dict__)   # 280
# but we have freedom to add object and method of object's instance
a.new_int_attribute = 123

Main approach is to explicitly define attributes that should be available for object using __slots__ specifier:

class B(object):
    __slots__ = ['a']
    def __init__(self, a):
        self.a = a

sys.getsizeof(B) # 944
dict_proxy({'a': &amp;lt;member 'a' of 'B' objects&amp;gt;, '__module__': '__main__', '__slots__': ['a'], '__init__': &amp;lt;function __init__ at 0x109a287d0&amp;gt;, '__doc__': None})

b = B(1)
sys.getsizeof(a)  # 56
sys.getsizeof(a.__dict__) # AttributeError: 'B' object has no attribute '__dict__'

a.new_attr = 23 # B object has no attribute new_attr

Note, that apart of __dict__, another attribute will be missing from objects with implicit definition of __slots____weakref__ – object containing ALL weak references to current object.

Additionally it is possible to use something like namedlist – to create objects that not participate in garbage collections – full comparison of various approach for save a bit of memory is discussed here.  Usually for profiling it is sufficient to use dedicated library like profilehooks.

But nothing is stoping you from diving into byte code as well:

def a():
print(a.__code__)    # code object a at 0x109b2bc30
import dis
  2           0 LOAD_CONST               1 ('')
              3 PRINT_ITEM
              4 PRINT_NEWLINE
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE

compile("print('Hello world')", '', 'single')   # code object &amp;lt;module&amp;gt; at 0x10db313b0
exec compile("print('Hello world')", '', 'single')   # Hello world

Python is a language with automatic garbage collection. It has two key components: reference counting and generation based garbage collections. When del method is invoked or variable goes out of scope – nothing is actually deleted right away – first number of references is decremented, when it become zero – then memory occupied by object will be available for interpreter again.  Programmer can disable completely generational GC or disable it for particular class by defining __del__ method:

import gc
gc.disable()   # For everything
# OR
class A():
    # ...
    def __del__(self):
        # details
    # ...

gc module allow to manually trigger garbage collection or hunt down possible leaks. It is implementation specific but generally memory will not be returned to OS.


Python interpreter is single threaded itself – famous GIL. So it may not be the perfect choice for compute-bound application. But for I\O based load – threads in python good enough, especially with gevent in python2.7 or asyncio in 3.7 Alternative is to use multiprocessing module. Or some more suitable language!

Function definitions:

Positional and named arguments – that should be simple:

def method(a, b, c):
    print("a:{a}b:{b}c:{c}".format(a=a, b=b, c=c))

method(b=1,c=5, a="whatever")   # named: will print a:whateverb:35c:1
method(1,5,"whatever")          # positional: will print a:1b:5c:whatever

args  and  kwargs used to specify variable number of arguments. kwargs is named and args is positional

def method(*args, **kwargs):
    for _ in args:

    for k, v in kwargs.items():
        print("{} = {}".format(k, v))

method(1,2,3, a="a", b="b", c="c")
a = a
c = c
b = b


Few useful built-in modules to be aware of and not re-invent the wheel:

  • argparser, configparser
  • itertools – permutations, combinations, product, imap
  • collections – defaultdict, Counter, ordered dict, namedtuples
  • re – regular expresions
  • os, sys, random

Not so standard but handy:

  • requests – to talk with web via POST & GET
  • scapy – for playing with tcp packets
  • scrappy\beautiful soup – to crawl web
  • numpy\panda for memory optimised collections
  • nose – if standard unittest is not enough
  • dash – visualisation

P.S.: Which version to use 2 vs 3:

As for now – in the middle of 2019 – more historical remark:

  • chained exceptions stack trace
  • unordered exceptions during comparison 1 > ‘2’
  • yield from generator_method()
  • faulthandler – tracing kills (except SIGKILL)
  • More handy modules: ip_address, lru_cache, enum, pathlibs, dataclasses import dataclass
  • unicode strings
  • type hints & annotations
  • iterable unpacking
  • dict items order guaranteed 3.7+
  • speed

On interview preparation: algorithms and data structures recap

With this article I want to go a little bit further than widespread memos “X most common data structures for interviews”. I am not only interested in information about asymptotic complexity. The real value – understanding whether particular algorithms or data structure can be beneficial in particular case: real life tasks rarely given with hints like “Use directed graphs, Luke!” or which comparison function should you implement to have your favorite sort algorithm work accordingly to upper bound complexity from textbooks. On this page you can find not just theoretical high level summary but also few facts that I am considering important, practical or just afraid to forget.


Few considerations worth to notice before speculating about “Best” sorting algorithms:

  • How much data we are dealing with?
    • Dozens numbers or billions?
  • What is nature of data – i.e. what is cost of element comparison?
    • Integer\strings\complex objects? For example it may involve disks seeks and we should aim for algorithms that minimize comparison or even stick with non-comparison based algorithms?
  • What is the cost of elements transfer?
    • Should we aim to algorithms with minimum number of swaps/data copying
  • What is current state of data? Does it partially sorted?
    • In case of partially sorted data some algorithms may touch they upper bounds but adaptive class of algorithms can deal with this input in efficient manner
  • Are data allow random access?
    • For example: linked list vs arrays. Classical quicksort on linked list always be inefficient.
  • Are data fit to the memory?
    • Otherwise we need modifications of algorithms that support external sorting.
  • Are we are dealing with memory constrained environment?
    • Can allow auxiliary space or have to stick with in-place algorithms?
  • Is there possible duplicative values in array to be sorted?
    • Do we need stable sorting – i.e. should they keep their initial order from input array?

Comparative based algorithms – i.e. we compare the whole values of input array (as opposite to distinct bits of those values).  Any comparative based sorting algorithms lower bound worst case complexity – N log (N).

Comparative based algorithms – out of our league – not used in reality, but sometimes can be asked during interview

Algorithm Time complexity Space complexity Comments
Best Average Worst Auxiliary
Bubble sort O ( N ) O ( N ^ 2 ) O ( N ^ 2 ) O ( 1 )


Select sort O ( N ^ 2 ) O ( N ^ 2 ) O ( N ^ 2 ) O ( 1 ) At i-th iteration find index MIN of the smallest remaining entry and swap it i-th element with MIN.
N Exchange. (Stable for list.)


Practical comparative based algorithms

Algorithm Time complexity Space complexity Comments
Best Average Worst Auxiliary
Quicksort O (N Log N) O (N Log N) O ( N ^ 2) O ( Log N ) O( Log N ) probabilistic guarantee.
Pick a pivot, partition array in order to group all elems less than pivot in the left subarray, and those which are bigger – to the right.Recursively apply.
Pivot may be chosen randomly but there are better strategies like median of medians.
3-Way Quicksort O (K * N) ? O ( (2 ^ k) * N) O ( Log N ) Used in case of duplicate keys.
Pick a pivot element and partitions elements into three sets: less, equal or greater than pivot.
There are variants with two pivots.

Mergesort O (N Log N) O (N Log N) O (N Log N) O ( N )


NOTE: For linked lists:
O ( 1 )

Divide array into two sub-arrays and sort them recursively and merge in sorted order.
In simplest version – till sub-array have size 1, in more optimised – till it reach size of threshold (10-20 elements) and another sorting algorithms can by applied (insertion sort for example).
Easy modification for parallel and external sorting (aka not-fit in memory).
Less comparison than quicksort.
More element transfer than quicksort.
Not In-place.

Heapsort O (N Log N) O (N Log N) O (N Log N) O ( 1 ) Construct a max heap mapping array’s indexes to binary search tree.
After constructing recursively delete root from tree until it is not empty.

Insertion sort O ( N ) O ( N ^ 2) O ( N ^ 2 ) O ( 1 ) At i-th iteration swap i-th element with each larger entry to its left.
Useful for partially sorted array or small arrays – less than 10 elements.

Shell sort O ( N ) ? O (N ^ (3 / 2))

O(N * (Log N) ^ 2)
O ( 1 ) Like insertion sort but we rely on D-order sort, where D – gap between sorted entries.
All estimations depend on choosing gap sequence.
Used in embedded devices due to non-recursive nature – no need for support deep and memory hungry call stack).

Timsort O ( N ) O (N Log N) O (N Log N) O ( N ) Based on Merge sort and Insert sort:
tries to find a consecutive sequence of ordered elements and merge them to decrease necessary processing to final results.
Used in Java and Python as default sorting algorithm.

Not In-place.
Introsort O ( N ) O (N Log N) O (N Log N) O ( N ) Hybrid between quicksort & heap sort.
Used in c++ as default implementation of sorting.

Not In-place.
Treesort O (N Log N) O (N Log N) O (N Log N) O ( N ) Build binary search tree from elements and traverse it in-order.

Variations with self balancing binary search tree or Splay tree (Adaptive).
Estimations based on usage of balanced trees.

Not In-place.
Cubesort ? ? O (N Log N) O ( N ) Parallel sorting algorithm

Not In-place.

Noncomparison based sorting algorithms. For such kind of algorithms we do not compare the whole keys (values) but using individual bits of values (for example characters for strings or digits for numbers) – it allows to achieve linear time O(N) complexity. Price for it – necessity to tune it for every new type.

Algorithm Time complexity Space complexity Comments
Best Average Worst Auxiliary
Radix sort O ( N * K ) O ( N * K ) O ( N * K ) O ( N + K ) Examines individual bits of keys.
K – max number of digits in array’s elements.
Applicable for integer numbers, string and float numbers in special formatting.
Worth to use in situation when N >> K and K is fixed – famous question about sorting of million of 32-bit numbers.
MSD vs LSD – most significant vs least significant digits.
Bucket sort O ( N + K ) O ( N + K ) O ( N ^ 2 ) O ( N * K ) Partitioning an array into a number of buckets.
Each bucket is sorted either by recursive application of bucket sort or using other algorithms.
Can be used as external sorting algorithms.
Asymptotic given for cases when K ~ N.
Applicable in case data evenly distributed over a range.
Counting sort O ( N + K ) O ( N + K ) O ( N + K ) O ( N + K ) Count number of keys related to item, use this count to determine position of elements.
Asymptotic given for cases when K ~ N.
Applicable in case N ~ K.

N-th biggest\smallest element

just a variations of partial sort:

  • QuickSelect – worst case O(n^2) best case and average – O(n) – as quicksort with pivot based partitioning but move in one direction only
  • Median of medians – improve worst case of quickselect to O(n) – partition input to group of several(5) elements, find median within every group, and return median of n/5 medians
  • Introselect – back up nth_element at C++ STL – start as classical quick select and switch on median of median method in case depth of recursion is too deep and at the same time sizes of sub-partitions observed so far not leading to halving.


Family of hashing functions: H(X) = ((a * x + b) mod p) mod N) a,b < p, a!= 0, p – prime number used to reduce number of collisions (it depend on your keys distribution).

Two main approaches: separate chaining vs open addressing (linear probing) with double hashing.

  • Separate chaining – array of lists, at the very worst case – O(N) to retrieve record!
  • Open addressing – circular buffer – i.e. in case of collisions try to insert record to next un-occupied cell – which lead to poor performance in case of bad choice of hashing function. Load factor – how it will perform if 25\50\80 % of cells of array will be occupied. Used for hashtable implementation in python.

Separate chaining advantages:

  • Easy implementation of delete method
  • Performance degrades gracefully
  • Clustering is less sensitive to poorly designed hash functions

Linear probing advantages:

  • Less wasted space (pointers)
  • Better cache performance

There are various use cases & requirements for hashing that affect characteristics of good hash function:

  • Quickly insert\return records with mapping Key to Value – cheap cost of computing hash and low number of collisions
  • Find similar or close entries – need to maximise collisions if keys meet some criteria – locality sensitive caching (for example GeoHash) with rolling hash to speed up computations, another famous example: Rabin Karp string search algorithm.
  • Make it difficult to find key corresponding to particular hash value – one way function – Cryptography related hashing – fixed size output for any input, no collisions and at the same time almost impossible (time and computation consuming wise) to find input that produce particular hash
  • Cornerstone of more advanced data structures: bloom filter, merkle tree, distributed hash table.
  • Interesting overview of details of hash implementation in various languages can be found here


Root always contains element that satisfy heap property: min-heap or max-heap. Elements not sorted but partially ordered.  Heaps may be constructed in O(N) using Floyd algorithm. Usually used as underneath structure for priority queues. Leftist heap as example of pure functional data structure.

Heap’s underneath datastructure Time complexity of operations Comments
Find Max Extract Max Increase Key Insert Delete Merge
Linked List (unsorted) O ( N ) O ( N ) O ( 1 ) O ( 1 ) O ( 1 ) O ( 1 )  
Array (sorted) O ( 1 ) O ( 1 ) O ( N ) O ( N ) O ( 1 ) O ( N + M )  
Binary Heap O ( 1 ) O (Log N) O (Log N) O (Log N) O (Log N) O (N + M) Heapify: O(N)


Binary Heap (array representation) – ordered complete binary tree – parent’s key no smaller than children’s key. Complete binary tree.
Binomial Heap O (Log N) O (Log N) O (Log N) O (Log N) O (Log N) O (Log N) A binomial heap of order K has a root whose children are roots of binomial trees of orders K-1, K-2, … , 2, 1, 0 – in this order.
Each binomial tree is a heap obeys the minimum-heap property: the key of a node is greater than or equal to the key of its parent.
There can only be either one or zero binomial trees for each order, including zero order.

Support quick merging of two heaps.
Fibonacci Heap O ( 1 ) O (Log N) O ( 1 ) O ( 1 ) O (Log N) O ( 1 ) Collections of trees.
Used for priority queue.
Degree of nodes (degree means the number of children) are kept low;
every node has a degree at most O(Log N) and the size of a subtree rooted in a node of degree K is at least F(K) + 2, where F(K) is K-th Fibonacci number.

Amortised constant factor affect practical usage. High memory consumption.


Full tree – every node except leafs have non-null children. Complete tree – full tree that have all nodes at the last level are lean left.

Tree traversal:

  • Pre-order – via dfs
  • In-order – via dfs
  • Post-order – via dfs
  • Level-order – via bfs

Search data structures

  • Size of element (read as cost of access\copy\compare operations)
  • Volume of data: can it fit to RAM?
  • Usage patterns:
    • evenly mixed read & writes or skew for particular operations?
    • access single key or range?
    • access to value using the whole key or interested to find all values where even parts of key matches?
Data structure Average time complexity Worst time complexity Auxiliary Space Comments
Indexing Searching Insertion Deletion Indexing Searching Insertion Deletion
Arrays O ( 1 ) O ( N ) O ( N ) O ( N ) O ( 1 ) O ( N ) O ( N ) O ( N ) O ( N ) Cache friendly, low memory overhead.
Pre-allocated vs dynamic.
Linked list O ( N ) O ( N ) O ( 1 ) O ( 1 ) O ( N ) O ( N ) O ( 1 ) O ( 1 ) O ( N ) Size of data vs size of pointer(s).
One directional vs bi-directional vs circular.
Sequential access vs random.
Internal or external storage (i.e. node contains only reference to actual data)
Foundation of stack, queue and dequeue.
Skip list O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( N ) O ( N ) O ( N ) O ( N ) O ( N Log N) Allow fast search using additional references hierarchy: next, the next after next, every 4th, i-th level – should contain 2 ^ i elements, but it used probabilistics skips – i.e. at i-th level, next reference may point to 2^i element or further.
Alternative for hash table and balanced tree – theoretical worst case performance can be worse, but in practice usually faster – depend on nature of data and cost of comparison operator.
Foundation of various lock-free implementations for dictionaries and priority queues. Used in Redis, LevelDb and Lucene.
Hash table (symbol table, associative array) ? O ( 1 ) O ( 1 ) O ( 1 ) ? O ( N ) O ( N ) O ( N ) O ( N ) Not best in case of small number of keys in comparison to direct mapping key -> array index. In practice, particular operation can be slow because of array resizing and cache un-friendly nature may lead to poor performance in comparison even with O(N) brute force lookup of simple arrays. C++: unordered_map, multimap
Python: dict
Java: HashMap
Binary Search Tree O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( N ) O ( N ) O ( N ) O ( N ) O ( N ) Complete if it perfectly balanced, except for a bottom level. In worst case (depend on insertion order) it can be N-height. Constructing as quick sort. Application of problem: 1D-range search.
Cartesian Tree ? O ( Log N ) O ( Log N ) O ( Log N ) ? O ( N ) O ( N ) O ( N ) O ( N ) It is heap-ordered and symmetric (in-order) traversal of the tree returns the original sequence. Used for RMQ (range minimum query) kind of tasks.
B-Tree O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( N )

Not just two child anymore => heigh of tree lower => lookup faster.

Various modifications.

B+-tree – actual data stored within leafs, intermediate nodes contains copy of keys.
Core data structure for relational databases as it allow to minimise disk seeks –
implemented in Berkeley DB.

2-3 Tree O (C * Lg(N)) O (C * Lg(N)) O (C * Lg(N)) O (C * Lg(N)) O (C * Lg(N)) O (C * Lg(N)) O (C * Lg(N)) O (C * Lg(N)) ? Maintain symmetric order and perfect balance.
Red-Black Tree O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O( N )

Balance is preserved via painting different nodes in Red\Black colours to satisfy some predefined properties. Re-arranging and re-colouring can be performed efficiently.

For example: no node has 2 red links connected to it. Red links lean left. Every path from root to null link has the same number of black links – perfect black balance.

C++: set and map.

Java: TreeMap and TreeSet (and in some situation HashMap fallback on their usage as well!)

Splay Tree O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O( N )

Splay Tree is a self-adjusting binary search tree with additional property that recently accessed elements are quick to access again.

Applications: cache, garbage collectors, memory allocators.

AVL Tree O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O ( Log N ) O( N ) Balance in AVL Tree – the heights of two child subtrees of any node differ by at most one. AVL Tree better for look-up intensive applications than Red-Black Trees
Kd Tree ? O ( Log N ) O ( Log N ) O ( Log N ) O(N Log N) O ( N ) O ( N ) O ( N ) O( N )

Space partition Binary Tree. Used for range search and nearest neighbour search. Constructing N Log N. Not suitable for high dimensional space – better to use when N >> 2^k.

Octree for 3D space.

More advanced structures worth to be aware of:

  • Trie (Prefix Tree) and its variations, mainly Radix Tree for lookup intensive applications. Worst case – O(M), where M is the length of string to search. DAFSA – can be a great alternative for cases with limited memory.
  • Generalized Suffix Tree – search efficiently with mismatches, applications for various string related tasks: longest common substrings, longest repeated substrings, longest palindromic substring,
  • Fenwick Tree (Binary indexed tree)– common use case: modify element of array and return result of some inverse operations for range of elements (for example partial sum but not max). Simpler and faster in comparison with segment trees. O(N) build time.
  • Segment Tree – classical approach for RMQ kind of tasks, operation not necessary to be inverse. Double memory requirement in comparison to fenwick tree. Constant factor affect speed in practice, especially in cases of multiple dimensions.
  • Implicit Treap – sequence of pairs, heap ordered by 1st entry, BST ordered by the 2nd. If we use index of element in array as first entry – than we can
    • insert\remove element at any position
    • move subset of elements in any position
    • computation of some function for elements within some interval [i, j] – sum, min – including range updates using lazy propagations
    • foundation for Rope implementation – to have similar asymptotic for very long strings
    • and all of the above in O(Log N) – with huge constant factor though
  • vEB Tree – ordered dictionary or associative array much faster (exponentially) than BST, memory overhead
  • Heavy-Light Decomposition: split tree on multiple not-crossing paths. Edge from A to B is considered heavy if number of vertexes reachable from B is at least half as big as number of vertexes reachable from A. Heavy-Light Decomposition – union of light vertexes from bottom of tree to the root. Such decomposition allows us to serve queries like “Get some metric on the path between A and B” – for example sum or max. Not needed in case we do not have update queries. Usually used with segment tree and LCA.

Union Find (Disjoint set)

Approaches: colouring, trees on top of arrays using ranks\randomisation\path compression & merging either randomly or using ranks of trees. Additional information can be stored with leaders of set.

Applications: offline LCA, offline RMQ, dynamic connectivity variables dependencies during program analysis, variations of percolation problem – connections in social network, fluids or electricity flow.


Representation: incidence matrix vs adjacency list. Generally adjacency list is preferable for sparse graph, but there are variation of incidence matrix based on associative array where pair of vertexes used as a key.

Possible variations:

  • Complete
  • Directed
  • Weighted
  • Planar
  • Bipartite 

Cut – set of vertexes from graph, if they will be removed – graph become disconnected. Size of min cut – connectivity or K-connectivity.

Graph related tasks

  • Find cycles in graph – union find (complexity – Akkerman function from number of vertexes) or DFS – proportional to number of vertexes
  • Determine whether graph is bipartite – dfs
  • Eulerian path – route visiting every edge exactly once (exist if all vertexes has even degree and it is connected) – search of such path can be done in linear time O(|E|)
  • Hamiltonian path – route including every vertex exactly once (as in Travelling salesman problem)
  • Comparison of graph – graph’s isomorphism (still open question N or NP)
  • Graph representation in such way that there are no edge crossing, except in vertexes (upward planar graph)  – there are some special cases where such checks can be done in O(N) but generally it is NP-hard problem. 
  • Minimum Spanning Tree (MST) – application is design of network (For example to avoid cycle of packets in Ethernet) or clustering, with custom distance function.
    • Prim – adding edge to current vertex with lowest weight
    • Kruskal – gradually merge trees from all vertexes using edge with lowest weight for connection


For directed graphs:

  • Topological sorting – tasks with ordering limitations, schedule related tasks (DFS) – create schedule in case we do not have cycle – re-draw graph in such way that all path follow from bottom to top – reverse dfs postorder (additional stack) = result
  • path finding – Dijkstra, if weights are negative – Bellman-Ford or its optimised version – Shortest Path Faster Algorithm (SPFA)
  • Shortest path for multiple sources – dense graph with negative weights – Floyd-Warshall or not dense – Johnson’s algorithms or Dijkstra 
  • Search for cycle – DFS – detection of deadlock in concurrent systems or topological sort
  • Maximum flowFord–Fulkerson algorithm , Bipartite matching problem, baseball elimination
  • Dynamic connectivity
  • 2-SAT – satisfy system of constraints of boolean variables – scheduling, medicine testing. Can be represented as directed implication graph. Checking whether two vertexes are within the same connected component allow to answer question whether we can satisfy all conditions.

Substring search

  • Knuth-Morris-Pratt (KMP) – based on DFA (deterministic finite state automaton) that have to be pre-computed for pattern. O(R + N) where R – length of pattern and N – length of text. O(RN) space.
  • Boyer-Moore – scan from the end of pattern and in case of mismatch skip up to R characters – worst case is as brute force O (R * N) but on average considered the most efficient algorithms – O (N / R).  O(R) space.
  • Rabin-Karp – compute sliding hash for every substring of length R for hashing can use modular hashing with some prime number. For efficiency sake Horner’s method can be utilised to re-use computation from previous steps – O(R). Hash match – still can be a collisions so there are two variations in case of hash matches. 2D search or search for multiple patterns using pre-computed table of hashes. O(1) space.
  • Aho-Corasick – in case we are dealing with multiple patterns and text where we search them are different. Build trie for all words, extend it to special finite state machine without backtracking – for failed patterns using suffix links. Used in grep – -F flag.
  • Z-functions: Given string A, consider array of all possible suffixes: A[len(A)], A[len(A) – 1], … , A[1]. Compare symbols from the start of string and every suffix – number of matched symbols – represent particular component of Z-function. For sub-string search: concatenate pattern, unique delimiter and original string and check if values of Z-function equal to length of pattern.

Playgrounds for preparations

Собеседование на программиста в 2019 – к чему готовиться

Осень-зима 2018-2019 года выдались у меня на редкость насыщенными: ориентировочно, с октября по февраль я поучаствовал в доброй сотне собеседований на позиции уровня Senior-Principal-Lead. Четыре из них были из России (ну мне честно было интересно как тут у нас сейчас), остальные Европа и Англия. Куда-то дальше по разным причинам не хотелось – хотя и пообщался с парой компанией из солнечной Австралии. В этой заметке хочется поделиться накопленными наблюдениями для тех кто задумывается о смене работы, в первую очередь, ориентируясь на иностранные фирмы. Сначала немного сухой статистики, а потом личные выводы.

Первый контакт: процентов 70 представители компаний или рекрутеры находили по профилю в linkedin. С остальными связывался сам – преимущественно откликаясь на любопытные мне вакансии со stackoveflow jobs\hacker news who is hiring\

Этапы собеседований


Опциональная беседа с ичаром компании или рекрутером. Цель определить что вы:

  • можете связать пару слов по английски (здесь будет полезно подготовить кратенький монолог на тему “последние 5 лет моей карьеры”)
  • прояснить круг обязанностей по позиции – надобно только писать, помогать писать другим или перед всем этим сначала еще обговорить с клиентами что собственно делаем
  • обсудить желаемую локацию – были случаи когда указаная страна внезапно менялась на другую, например по причинам визы или так как самые hardcore разработчики исторически осели там
  • удостовериться что вы не страдаете особыми закидонами в плане общения.
  • в ряде случаев – определить уровень компенсации дабы не было в конце взаимного разочарования из-за бездарно потраченного времени. Это бывает удобно когда у компании строго определенный бюджет и выйти за него они не могут, а вам меньше тоже не особо интересно. После того как первый раз мне выкатили офер процентов на 40 ниже того что я ищу – в не очень именитые фирмы этот вопрос поднимал всегда до каких либо технических собеседований.
  • обсудить релокацию – что по визам (особенно сейчас актуально для Штатов), оплачивают ли жилье на первое время – и сколько его этого первого времени, мед-страховка для членов семьи, предоставляется ли контейнер на случай если вы желаете переехать с любимым диваном и коллекцией пивных кружек, помогают ли с сопутствующей бумажной волокитой и поиском жилья.

Tech screening – первая кровь

Компании высшего эшелона – гуглы, фейсбуки, амазоны и подобные им, на первом этапе презренные деньги не обсуждают, однако въедливо скринят по нескольким тематикам (в зависимости от позиции, если программист – то обязательно алгоритмы + предметная область – специфика вашего любимого языка, математика, машинное обучение, сети, потроха линуксов и т.п.) – дабы удостовериться что стоит на вас спускать инженеров. Встречаются и небольшие амбициозные компании которые норовят устроить экзамен на знание акронимов из вашей предметной области.

Подавляющие же большинство предпочитают полагаться на автоматизированные тесты на, . Обычно это 1-3 задачки которые надо решить (написать код, запустить и удостоверится что проходят все тесты) в условиях неотвратимо цокающего таймера. Задачки варьируются уровнем от чисто алгоритмических (легкие\средние с, до а вот запилите нам простейший блокчейн или каталогизатор фоток. Корректность решения проверяется заготовленными тестами на самой платформе. А как ваш код будет работать если вместо int придет long? А что если в последовательность положительных чисел вклинится одно отрицательное? Тесты вам могут быть доступны все (т. е. вы запустили – система показала какой тест завален – вы скачали данные для проверки), доступны частично (т. е. вы видите что из 100 тестов, 99 failed а почему и что там может быть – додумывайте сами), а могут быть и не показаны вовсе (считаете что все красиво – жмакайте кнопку “я сделалъ”). Таймеры у всех разные – у Amazon было два часа на две нудные длинные задачи с регэкспами, у Yelp 20 минут на одну. Могут быть и вопросы с вариантами ответов как у Optiver или даже практикум на живой виртуалке с линуксом как у Booking.

Некоторые предлагают заполнить форму с вопросами – аля как мы можем использовать технологию X, или программа на языке Y работает медленно – опишите как вы будете ускорять ее. Здесь убивают сразу двух зайцев – проверяют на базовое знание матчасти (вопросы обычно классом поглубже чем поверхностное гугление) и ваш талант кратко и по делу донести свои мысли деловым английским.

Online assessment – онлайн собеседование

Если удалось справится с тремором по поводу гонок со временем – то второй этап либо онлайн кодинг, либо так называемый take home (задачку на дом) и его последующее обсуждение.

В первом случае вас ожидает несколько сессий где вас попросят написать что-то на алгоритмы и обсудить вариации решения, подкидывая на ходу дополнительные условия, чтобы оценить насколько расширяемо вы пишете код. Большинство не парится на тему конкретного языка и заинтересовано в первую очередь в верном алгоритме, исходя из теории был бы человек хороший – а принятый в компании стек можно и освоить. Примеры задач – в добавок к упомянутым,, еще можно посмотреть на Сложность и количество задач всегда зависит от двух факторов: позиция и ваша синьорность – вполне могут дать одну посложнее вместо пары-тройки легких.

Задачка на дом представляет собой уже что-то более-менее приближенное к реальности – как правило это уже тот стек на котором вам предстоит ударно трудиться. Здесь приветствуются авто-тесты, авто-документация и все best practices применимые к языку системе сборки и документации которые так уютно существуют в самых смелых мечтах разработчиков. Код обычно размещается в приватном репозитории чтобы ваши конкуренты не повторяли ваши ошибки, а делали свои, а работодатели не ломали голову над очередной дилеммой: как бы и кандидата проверить в около-боевых условиях, и, при этом, ему не надо убивать на это более двух вечеров. Далее назначается дата и обсуждаются результаты – то ли вы вообще решали (а у меня был случай что я в попыхах не верно интерпретировал условие), решили ли (опять же был весьма досадный опыт когда поверхностно проверил не на всех наборах данных и пропустил фатальную жирную багу) насколько код готов быть развернут на боевом сервере – много обращений, много данных, низкое качество данных – в параллели заодно затрагиваются детали языка\фреймворка\стиля.

Onsite interview aka face to face

В случае успеха остается последний рубеж. Компании попроще на этом этапе ограничиваются общением с непосредственным руководителем и снова HR – behavioral interview, cultural fit – дефицит кадров еще не значит, что кто-то хочет попрать священное правило “No asshole!“. Некоторые примеры вопросов:

  • Опишите ситуацию когда вы облажались – какие уроки вы извлекли из этого
  • Как вы считаете какие ваши сильные и слабые стороны назвал ваш предыдущий начальник\коллеги\подчиненные
  • Что вы будете делать если почувствуете что у вас напряженные отношения с кем то в команде
  • Что вы будете делать если не согласны с каким-то техническим решением? Что если оно уже утверждено?

Те же у кого финансы позволяют – могут пригласить вас в офис чтобы вы пообщались на отвлеченные темы с представителями из разных команд, воочию посмотрели как проходит обычный рабочий день – как происходит взаимодействие между людьми. Это может внести некоторую заминку из-за необходимости получения визы – особенно в случае если вы не из Москвы.

Hard mode: on

Вдобавок к выше перечисленному если компания большая, а имя ее заставляет рекрутеров делать стойку и дает вам возможность добавлять себе в титул приставку ex-Имеряк’ер – тогда тут-то и начинается самое интересное. Целый день вы проведете в офисе компании. С вами будут говорить такие же технари – представители нескольких смежных отделов. Несколько секций с решением уже знакомых алгоритмических задачек – подтвердить продемонстрированный ранее уровень, но уже в окружении скучающих интервьюров на доске, листе бумаге или хромбуке. Сессия вопросов и ответов (с вас) по специфичным для позиции темам. Специализированные интерактивные секции – (тут вы должны постоянно задавать наводящие вопросы определяя ограничения задач) – спроектировать архитектуру некоторого приложения, затраблешутить багу в проде. Это будет насыщенный, но не самый легкий день в вашей практике.

Далее уже не так интересно – торг по зарплате, обсуждение условий релокации и бесконечная бумажная волокита.

Из забавного

Из-за перепутанных часовых поясов пришлось отвечать на вопросы по телефону во время пробежки с псом, между заходами в парилку в бане и на первой попавшейся парковке из машины.

В ОАЭ всех претендующих на работу на госслужбу, кроме всего прочего, тестируют с помощью платформы TalentQ – первая попытка прохождения заставила меня крепко зауважать всех госслужащих этой страны. Похожий тест – Criteria Cognitive Aptitude Test (CCAT) – просят пройти, например, в Crossover до любого рода технических тестов. Вопросы там разбиты на три подгруппы: логические – найти лишнюю крякозябру в списке, подставить недостающий петроглиф в строку; около математические – на основе таблицы с кучей цифр подсчитать арифметическое среднее или разницу в процентах чего-нибудь с чем нибудь; прочитав пару абзацев текста выбрать основную мысль или чью-то точку зрения – в простых случаях найти антоним или синоним к какому-нибудь заковыристому слову.

В одном из заданий на дом требовалось вытаскивать некоторые публичные данные и, не смотря на тривиальность задачи, сервер упорно плевался таймаутами. Я вообразил хитрую защиту от пауков и с вдохновением принялся рандомизировать агента, развернув сервис в локали, благо он был заопенсурсшен, проверял CORS и куки. Апофеозом был анализ tcp дампа в wireshark‘e и запоздалая проверка traceroute к сайту. Выяснилось что сайт хостился на айпишнике из забаненных диапазонов, а во всех браузерах были vpn плагины.

Иногда по результатам рассмотрения резюме присылали отказ, а спустя пару недель рекрутер той же самой компании зазывал пообщаться.

На дату публикации – самая длительная пауза между откликом и ответом составляет три месяца. Победители в этой номинации запросили целый набор документов (ASAP!11) для того чтобы запланировать интервью.

Самопозиционирование как architect – иногда вводило рекрутеров в заблуждение и предлагались позиции связанные с очередной стройкой века в некоторых странах Персидского залива.

Некоторые выводы и наблюдения – ваш друг и помощник – когда самая главная головная боль – понять когда у кого сколько времени

Растет число компаний предлагающих удаленную работу по рейтам не зависящим от вашей локации. Задачи наконец-то начинают отличаться от “in-house сотрудники не хотят возиться с унылым легаси” до вполне cutting edge – дизайнить дата пайплайны для финансовых платформ, SRE\DevOps для вполне серьезных хайлоадов, ML – для предсказания цен и рекомендаций в разных отраслях.

Планирование сроков – собеседований и рассмотрения предложений – позволяет задать и себе и компаниям осязаемые дедлайны. Это дисциплинирует.

График собеседований лучше составить так, чтобы сперва для разминки вы пообщались с мало-интересными вам компаниями, а уже потом, на пике формы, переходили к компаниям мечты (но еще не успели выгореть от бесконечных “есть два целочисленных массива … “).

Звонков и вообще всяческой суеты от рекрутеров будет много – они будут пропадать, внезапно звонить спозаранку или ночью, вы вполне можете путать их имена и компании которые они представляют. Собеседование это тоже работа и тут вы, чем сможете, должны им помогать, понять и простить. Хороший рекрутер может помочь вам с согласованием предлагаемого работодателем пакета и выбить комфортные условия релокации, или предложить что-нибудь интересное через год когда вы уже напрочь о нем забыли.

Российские компании (из тех с кем я общался) куда более зациклены на конкретном стеке кроме, пожалуй, Яндекса. Уровень вопросов вполне сопоставим с общемировым. С учетом налогов и даже без учета стоимости жизни в Москве и Питера можно получать больше чем в Европах, даже без менеджерских лычек.

Как ни странно – надо готовится. И тут чудес не бывает – брать Кормэна, смотреть лекции Седжвика и в виме простом блокноте писать на том языке на котором планируется проходить собеседование. Для философских разговоров хорошо заходят Architecture of open source applications про хайлоад – DDIA, для SRE – Sire Reliability Engineering и моя любимая книжка с черепашкой – Операционная система Unix Робачевский.

P.S. И просто побрюзжать

Современные IDE отупляют разработчиков – иначе я не могу объяснить причину появления сонма фундаментальных серьезных руководств как писать программы в текстовых редакторах (без авто-комплита, подсветки, подсказок и возможности скомпилировать или запустить программу чтобы высветился номер строчки с ошибкой).

Я еще помню времена, когда вопросы частенько задавались заведомо не правильные, на знание стандартов C++, undefined behaviour, depend on compiler implementation. Сейчас, к моему удивлению, ряд вопросов служат тому, чтобы выяснить что вы будете делать если нет интернета или стековерфлоу лежит, а с кодом надо что-то решать.

Иногда формализм побеждает – когда спрашивающий общий вопрос хочет услышать свой любимый ответ или занимается проверкой вашей памяти, задавая вопросы на терминологию (многие бы сильно удивились узнав что они используют какие-то из известных паттернов проектирования или определенные нормальные формы в БД). Из запомнившихся вопросов подобного плана – CAP теорема, что значит каждая буква в SOLID или перечислить те самые двенадцать факторов.

Коррелирует ли как-то способность решить задачу про кроликов или свести проблему Пингвин-Авиа к поиску минимального остовного дерева за полчаса интервью с тем как кандидат будет работать – не знаю. Поможет ли вам знание о существовании неявного декартого дерева в работе – сильно зависит от того над чем вы будете работать. Расширит ли алгоритмическая подготовка ваше сознание – несомненно!

Собеседования с вменяемыми компаниями (не ленящихся прочитать резюме, без подсказок дающих обратную связь) – прекрасный способ держать себя в тонусе. 🙂

[docker][cassandra] Reaching mixed load – 750,000 op\sec

The cart goes nowhere because the swan wants to fly in the air, the pike wants to swim underwater and the crawfish wants to crawl backward.

Cassandra performance tuning - challengeCassandra is one of powerhorse of modern high load landscape – nosql database that can be pluged to Spark for distributed computing, Titan for playing with graph data representation and even to Elastic as a search backend.
And if you really care about pure write performance – this is de-facto choice in world of open source solutions: production proof, with big community that already generated many outdated controversial answers at SO & mailing lists.


No single point of failure, scalability, high availability, retention periods, … , but those marketing claims hide few principal caveats… Actually, cassandra has only single drawback(*) – it will not reach its limits with default settings on your hardware. Lonely, single node configuration – it is not use case for cassandra, it will shine in multinoded clustered setup.

If you really want to see full utilization of endless cores and crazy amount of RAM, you have to use some virtualisation technology to manage hardware resource.

Let me start first with some conclusions and recommendations, based on extensive two monthes testing and observartion of trouble tickets after migration of this approach to production. With those considerations in mind I was managed to configure it such way to tolerate 750 k mixed operations per seconds. It was generated for more than 8 hours to check pressure tolerance and emulate peak loads. .It was mixed execution of async inserts, without future processing and synced inserts as well as read requests.

Frankly speaking, I am sure it is still far from its limit.

Bear in mind that I am talking about Cassandra 2.1.*.

About disks

  1. Use ssd disks as mapped volume to docker container. Single container = single dedicated disk.
    It is possible to use multiple disks per containers, but it will lead to 5-15 % of slowdown.
  2. If you use ssd disk you can map all casandra directories to it (saved_cache, data, commit_logs) and adjust casandra.yaml with higher values of throughput, in particularly: compaction_throughput_mb_per_sec, trickle_fsync
  3. It is really depends on data distribution and your data model, but be ready that disk utilization will vary from one node to another up to 20%
  4. Docker should be configured to NOT use host’s root partitions. Don’t be mean and allocate single drive for logs and choose proper storage driver – docker-lvm.
  5. In practice, cluster start strugling when any of nodes come out of space. Surprisingly, in my experiments it was stable even with 3% free, but in real life better to configure your monitoring system to give alert at 15-20%.
  6. Choose compaction and compresion strategies wisely when you design your db
  7. Be careful with column naming – it wil be added for every god damn row!
  8. Do sizing when you think about number of nodes (and disks).

About cpus:

  1. More cpu per node is NOT always good. Stick with 8 cores per node.
    I’ve experimenting with single fat supernode per physical server = 48 cores, 4×12, 6×8.
    6 node with 8 cpu cores outperform all others in 6 kind of stress load scenarious.
  2. If you play with core number you have to adjust few settings at cassandra.yaml to reflect that number: concurent_compactors, concurent_reads, concurent_writes.
  3. Cassandra in most cases endup to be cpu-bound, don’t forget to left for host system 8-16 cores, and allocate cpu exclusivly for containers using –cpuset-cpus

About RAM:

  1. have builtin calculation of free memory to adjust jvm settings using analysing results of command free. Ofcourse it is not for docker based setup. Bear this in mind and tweak your startup scripts to substitue values there.
  2. Disable swap within docker using –memory-swappiness=0
  3. Effectiveness of memory usage depend on cpu amount, how effective multithreaded compaction is implemented at Cassandra and what settings for reader\writer\compactors you have at your cassandra.yaml, i.e. you can have hundreds of RAM but endup in OOM. But even with 8 Gb of RAM per node you already can see benefits. More RAM – mean more memtables, bigger key cache, and more effective OS-based file caching. I would recommend have 24 Gb RAM per node.
  4. Disable huge page at host system or at least tune your jvm settings:
 echo never &amp;amp;gt; /sys/kernel/mm/transparent_hugepage/enabled
echo never &amp;amp;gt; /sys/kernel/mm/transparent_hugepage/defrag 

About network

  1. Mandatory to use network stack of host Os using flag at docker –net=host
  2. Most likely network should not be bottleneck for your load, so you can stick with virtual interfaces on top of single real one.


  • 3 physical server: each have 72 cores, 400gb ram
  • Cassandra 2.1.15
  • Docker: 1.10
  • Host Os: Centos 7.5
  • Guest Os: Centos 7.5
  • java 8 from oracle with jna

Cassandra 3.* this is competely another story – in my opinion, mainly, because of storage engine changing, but here is a huge list.

DB overview:

  • Dozen keyspaces, each have up to 20(?) tables.
  • Few indexes – just do not use indexes, design schema properly
  • Data replication = 3, gossip through file
  • Each physical server represent dedicated rack within single datacenter.
  • Row cache were disabled at cassandra.yaml i.e. first priority was to focur on write oriented workload


  1. datastax stresstool, artificial table – very intresting, but useless, using your schema is very important
  2. Datastax stresstool + your own table definition – nice, give hints of production performance. But you still testing single table – usually it is not the case in real life.
  3. Self written in-house stress tool that generate data according to our data model in randomized fasion + set of dedicated servers for ddos with ability to switch between async inserts (just do not use batches) with and without acknowledgment.
    Once again: no batch inserts as they should not be used in productions.
  4. Probably, you can adapt Yahoo! Cloud Serving Benchmark. I haven’t played with it.


That’s it folks, all craps below is my working notes and bookmarks.

How to get c++11 at Centos7 for stress tool compilation:

Install recent version of compiler on centos7: devtoolset-4
update gcc version 4.8 at centos 6:

scl enable devtoolset-2 bash

RAM & swap:

How to clean buffer os cache

echo 3 &amp;amp;gt; /proc/sys/vm/drop_caches
check system wide swappiness settings:
more /proc/sys/vm/swappiness

Docker related:

If you are not brave enough to play with DC\OS or Openstack you can find docker-compose to be usefull for manipulation of homogeneous set of containers


sudo rpm -iUvh
rpm -qa | grep docker

If you fucked up with partition settings:

wipefs -a /dev/sda1

Docker and mapped volumes:

If docker info | grep loopback show you something – you already screw up configuration of storage driver.

How to check journal what is happening:

journalctl -u docker.service --since "2017-01-01 00:00:00"

Full flags described here.

Usefull commands for check docker images:

docker inspect
dmsetup info /dev/dm-13


How to check heap memory consumption per node:

nodetool cfstats | grep 'off heap memory used' | awk 'NR &amp;amp;gt; 3 {sum += $NF } END { print sum }'

How to check what neighbors we can see from nodes:

nodetool -h ring

How to find processes that use swap:

for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | awk '$2 {print $1 FS $2}'

Check how many disk memory we used, from Cassandra perspective

nodetool cfstats | grep 'Space used (total)' | awk '{s+=$NF} END{print s}'

Determine disk usage, OS point of view:

du -ch /var/lib/cassandra/data/

cassandra check health:

 ssh nodetool status


How to open port at centos 7:

firewall-cmd --zone=public --add-port 9160/tcp --permanent
firewall-cmd --zone=public --add-port 9042/tcp --permanent
firewall-cmd --zone=public --add-port 7200/tcp --permanent
firewall-cmd --zone=public --add-port 7000/tcp --permanent

Open ports for spark, master:

firewall-cmd --zone=public --add-port 7077/tcp --permanent
firewall-cmd --zone=public --add-port 8081/tcp --permanent

Apply changes in ip tables:

firewall-cmd --reload

or do this, usefull in case network manager behave badly:

systemctl restart network.service
systemctl status network.service

And as bonus points –

How to migrate data from old cluster to bright new one

  1. sstableloader
  2. cassandra snapshots
  3. For tiny dataset to get cql file with inserts: – cassandradump

First two approaches represent standard way of data migration.
Limitation of first is speed and necessity to stop old node.
Limitation of second is necessity to manualy deal with token_ring on per node basis.

If life was really cruel to you, you can play with data folders per node.

NOTE: if you can replicate exact same setup – in terms of assigned ip, it will be enough to just copy cassandra.yaml from old nodes to new one, and use exact same mapping folder within docker as it were at old cluster.

If not – you still can do it with copying data folder follow steps below, but better just use sstableloader.

  1. In order to do it you have to run following command on every node to drain node from cluster and flush all data into filesystem:
nodetool drain</pre>

NOTE: this is unofficial, not recommended way to deal with data migration.
NOTE 1: it require you to have similar amount of nodes in both clusters
NOTE 2: no need for the same datacenter\rack cfg\ip address

2. Deploy docker based setup according to HW configuration. Total amount of nodes should be equal to total amount of nodes at old cluster. On new cluster deploy exact schema that were deployed on old cluster.

3. Stop new cluster.
within every node data folder of OLD cluster you would have following folders:

NOTE: do not touch system* tables.

4. Under folder /your/cassandra/data-folder/your-keyspace
you should have set of folders corresponding to that keyspace under which data is stored.

5. You have to copy content of this folder (*.db, *.sha1, *.txt) for every node from OLD cluster to corresponding folder of NEW node cluster in. UUID WILL be different.
I.e. old cluster, node 1 to new cluster, node 2:
data copy example
scp /old/cluster/cassandra-folder/data/your-keyspace/your-table-e31522b0e2d511e6967a67ec03b4d2b5/*.* user@:ip/new/cluster/cassandra-folder/data/your-keyspace/your-table-c56f4dd0e61011e6af481f6740589611/

6. Migrated node of OLD cluster must be stopped OR you have to use `nodetool drain` for processed node to have all data within sstables ~ data folders.

Performance monitoring:

  • general system overview: atop or htop
  • Be sure that you understand memory reporting.
  • JMX based monitoring: jconsole
  • Jconsole connection string: service:jmx:rmi:///jndi/rmi://:/jmxrmi
  • dstat network & disk io
  • strace – show every system call. slow down. can connect to running.
  • netstat -tunapl | lsof -i -P – network\ports per process
  • docker stats – reports cpu\mem\io for container
  • perf + perf-map-agent for java monitoring:
    for example cache miss, more there:
perf stat -e L1-dcache-load-misses

Articles that I find usefull:


*cassandra has only single drawback – it have no idea of your data model, whether you configure your data schema correctly, what is your load patterns. That why you have to dive in wonderland of controversial recommendations in blogposts like that one instead of thoroughly read documentations first.


P. S. do not believe anyone, measure!



How to stop being a junior – 7 hints of programmer productivity

0) If you don’t know something – don’t afraid to ask.

Especially if you already checked first page of google search and pretty sure that no one ask that question at stackoverflow.
Reinventing the wheel and breaking the stalemate can be a good exercise for your home projects, but in production environment better to check idea with your mentor before diving into implementation details.

1) Don’t afraid to show that you have no idea.

real programming - do not left open questions ever

do not left open questions ever

Just add note for yourself to figure out it later.
If you do not understand how something works – obviously this is gap in your knowledge.
And you can’t just skip it – you are software engineer – this is your obligation to be aware what is happening under the hood.

And yes, sometime you have to do it at your own time.

Professional growth is very simple:

  • step 1 – find something that you don’t know,
  • step 2 – start investigation, discover bunch of additional mysteries and repeat step 1

2) Split problem to several simple questions and address them one by one.

Don’t try to troubleshoot huge multi-component system within real data to reproduce the problem.
Forget for a minute for overall complexity of your enormous project, analyze suspicious functions one by one independently of each others.
Use online fiddle for your language to check obscure part of language with fake data returned by mock api.

3) stop wasting your gorgeous time

If you find yourself googling the same commands over and over again – start making notes.
Create txt file with most useful commands and update it whenever you find yourself googling again.
Search for ready to use cheat sheet or even put wallpaper on desktop.

4) Do not stop to invest time in reading proper books. Ever.

Pressure, deadlines, laziness.
This is not for company, boss or to bragging about.
This is your main and primary investment to the future – your knowledge is treasure.
30 minutes of reading every day – not too much,
but in long run – you will be noticed that you become more capable and be able to tackle previously hard to solve problems.

from junior to senior - bite by bite

from junior to senior – bite by bite

5) You should properly digesting advice, articles and opinions.

Always will be people who are picky about your code.
Always will be deadlines where you have to make compromise.
Always will be people who haven’t seen big picture or just too stubborn to like ideas of other.
Be wise and pragmatic:
first and foremost you earn money for get your job done.
focus on that target along the way try to do it concise and efficient but at the same time meet your own deadlines.
New technologies, new languages, new paradigms and new patterns give it a try only when this shit is working.

6) Do not ever ever ever do work that you do not like or not interested in.

You will do it mediocre at best.
Your work should be your passion, your pride and not amount of hours behind the desk exchanged for paycheck.
If you are not happy at the morning before the workday – you have to change something. Urgently.
Excitement, challenge and satisfaction should be your main motivation for every day.
Money and career opportunities always follows three guys above 🙂

db mix – postgres, sqlite, cassandra, aerospike & redis


How to delete all tables from sqlite:

SELECT 'DROP TABLE ' || name || ';' FROM sqlite_master WHERE type = 'table';

Find duplicates by field “field_name”:

SELECT field_name, COUNT(field_name) AS cnt FROM some_table GROUP BY field_name HAVING(cnt &gt; 1 ) FIXME - use cnt?

Find records changed in last 5 days:

SELECT * FROM some_table WHERE created_at &gt;= NOW() - '5 day'::INTERVAL;

Get table definitions:

pragma table_info(mass_connections);

Export select query to csv:

.mode csv
.output result_of_query.csv
select * from my_table;
.output stdout 

Import data from csv into fresh new table:

.mode csv
.import /path/to/your/all_data.csv new_table


How to show all tables with sizes within database

SELECT schema_name, relname, pg_size_pretty(table_size) AS size, table_size FROM ( 
SELECT pg_catalog.pg_namespace.nspname AS schema_name, relname, pg_relation_size(pg_catalog.pg_class.oid) AS table_size 
FROM pg_catalog.pg_class 
JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid) t 
WHERE schema_name NOT LIKE 'pg_%' ORDER BY table_size DESC;

Show average amount of records per table

SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;

How to create data-only dump:

pg_dump -U your_pg_user -h pg_ip_address -p pg_port -a --column-inserts db_name > file_name.sql
pg_dump -U your_pg_user -h pg_ip_address -p pg_port -a --column-inserts --table=table_name db_name > file_name.sql

Useful pg_dump flags:

  • – C adds the CREATE statements
  • – s dump schema only
  • – a dump schema & data
  • – D dump using inserts (to simplify uploading data from PG into another db engine)

How to restore data:

psql dbname < infile.sql

PG stop/start:

$(PG_HOME)/bin/pg_ctl -D /data stop -m immediate
$(PG_HOME)/bin/pg_ctl start -D /data -l logfile


Get settings:

asinfo -v 'get-config:'

Set particular settings:

asinfo -v 'set-config:context=service;batch-max-requests=10000000'
asinfo -v 'set-config:context=network;timeout=10000000'
asinfo -v 'set-config:context=service;batch-index-threads==100'

How to register LUA-script:

register module 'your_script_name.lua'

more at

How to build secondary index based on bin

CREATE INDEX _idx ON . (bin_name) NUMERIC

bin_name ~ field name

How to delete all records within set:


How to register lua script:

redis-cli script load "$(cat /YOUR/PATH/script_name.lua)"


How to save results of query to the file:

cqlsh -e"select * from table_name where some_txt_attr='very tricky string';" &gt; cassandra_file_query_result.txt

How to check node health:

nodetool status | awk '/^(U|D)(N|L|J|M)/{print $2}'

How to check compression ratio for particular table:

nodetool -h cassandra_ip cfhistograms some_keyspace some_table

How to check the dropped tasks count (at the bottom) at particular node:

watch -n 1 -d "nodetool tpstats"

How to do a “backup: of  cassandra:

nodetool snapshot

It will generate snapshot of data at /<your path from yaml file>/data/snapshot.

How to do a “restore” from snapshot:

      stop cassandra


      delete content of every keyspace table at /<your path from yaml file>/data/


      copy data from snapshot to the respective keyspace folder


    restart the server

7 sins of blatant ignorance and how to avoid them

…You produce software for some time. In most cases it even works.
Other developers tend to care about your opinion.
Damn, you even wear some fancy title like senior\principal\architect.
And here it is – suddenly you were offered to wear really posh title – CTO…

This is moment when real troubles get started.

I did mistakes by myself. I fought with others to prevent them from repeating of my past vicious moves.
And I wish list below appear every time I was about to make game changing decision.


A lot of challenges are awaiting for new CTO. How they can be avoided?

A lot of challenges are awaiting for new CTO. How they can be avoided?

1. Claim-to-be-universal tools suck. Always.

Do not make assumption based on bright hello-world example at promo web site.
In your case – it would be necessary to have some tricky functionality that this mega-framework does not support by design.

2. Be suspicious to any black-box like solution that promise all and everything at no price.

Never-ever expect that you are aware of the deep technical details.

In production, during important demo or pitch – impossible cases tend to jump out regardless of claims from probabilistic theory.
Listen to your team – they are your experts who (should) know about tiny nuances about setup, implementations and limitations.

3. Start simple. Focus on creating _working_ prototype fast.

Forget about speed, scalability, cluster, gpu-computing or “best practices” declared by yet another guru.
In 99.99999 percents of cases you do not need load balancing or advanced caching strategy.
You will iterate if it necessary.

4. Trendy stuff sucks.

Do not waste your time in fighting with bugs of another pre-alpha release of some looks like promising tool.

New database engine \ fresh from research lab language \ trendy paradigm – should be out of your list of consideration.

You need get stuff done. That’s it.
Good old bullet-proof solution are your choice.

Especially if you and team have experience of delivering some product with it.
Otherwise, year later you will realize that your repositories contains complicated workarounds and dirty hacks in desperate efforts to build initial DSL.

5. Listen to your team.

Measure & prototype. Be open-minded for their approach for solution.

Do not abandon idea only because you are not author of it. Encourage them for think out of box (even if it mean be contradicted to your opinion).

6. Books are your friends.

Inspire your team to learn new things and professional growth.  Make your habit to read every day – in long run it will make a huge difference.
Short articles from HN do not help you to build foundation of knowledge – it is just a tip of iceberg.
You never can be sure that you know enough – treat with suspicious any “undisputed” statements (and those who dare to make them).

7. Take it easy.

World of IT and software development is hilariously small.
You do not know whom you will be interviewed by next time.
Who will be contacted for additional reference.

All makes mistakes. Not everyone learn from them.

Avoid any illusions from QA department – in most cases software will not work as you expected in first version.

Positive vibes during stern fuck-ups is what makes our profession truly awesome and memorable.
Humor is your best way to deal with burnout, pressure and broken coffee machine.

Sprinkle usual working day of your team with few bits of laugh to remind everyone that programming is fun! 😀

Information retrieval, Search and recommendation engine:

Natural language processing: – Processing of texts written in ordinal languages – search algorithm by Yandex

Information retrieval: – Introduction to Information Retrieval, Cambridge – Latent Dirichlet Allocation (LDA)

Few examples of applications for the above:

Recommender systems: – video lectures – 101 for Recommendation System – introduction to approach and algorithms – “Encyclopedia” of recommender systems – overview of recommendation algorithms – Restricted Boltzmann Machines for Collaborative Filtering

Ready for use recommendation engine: – Google recommendation engine – Apache recommendation and general purpose machine learning framework

Elastic Search – just a few useful snippets

Install head plugin or rely on oldschool curl utility in order to test your queries:




Q: Show me example of query for complex, nested document?


{ "query": 
    { "bool":
        { "must": [
                 {"path": "document.sub_document",
                           {"must": [
                               { "match":
                                   { "document.sub_document.attribute": "PUT_YOUR_SEARCH_VALUE_HERE" }

NOTE: if what are you searching for in sub-sub-sub document – just add proper number of nested chains of “bool” “must” “nested” elements.

Q: I need full text search and aggregations (aka facets) by attribute in nested document.


{ "query": 
    { "query_string": 
        { "query": "PUT_YOUR_SEARCH_STRING_HERE" }
                {"path": "document.SUB_DOCUMENT"},
                            {"field": "document.SUB_DOCUMENT.ATTRIBUTE_NAME"}

Q: I need a full text search and aggregations by geo positions aka distance range.

NOTE: put proper values for “origin”, “field”, “ranges” fields.

{ "query":
    { "query_string":
        { "query": "PUT_YOUR_SEARCH_STRING_HERE" }
                {"path": "document.SUB_DOCUMENT"},
                            {"origin": "100500, 100500",
                             "ranges": [{"to": 1000}, {"to": 3000, "from": 1000}, {"from": 3000}]

Q: I have fields in document that contains multiple words, I want them to be be aggregated not as separate single terms, but as a whole string.

A.1 put proper mapping for such field – “multi_field” or in most recent version of elasticsearch – just “fields”.

... document mapping, ...

YOUR_FIELD: {   "type": "string",
                    { "type": "string", "index": "not_analyzed" }

... document mapping, ...

A.2 use such kind of queries for nested faceting:

        {"query": "PUT_YOUR_SEARCH_STRING_HERE"}
                {"path": "document.SUB_DOCUMENT"},
                             {"field": "document.SUB_DOCUMENT.ATTRIBUTE_NAME.raw"}

Q: I want return all documents sorted by distance and my geo_point field in nested document.

NOTE: ATTRIBUTE_NAME should be mapped as geo_point at moment of writing – it can be done only via manually created mapping.


    {"match_all": {}},
    "sort": [
                {"lat": 25,"lon": 55},
                 "order": "asc",
                 "unit": "km",
                 "distance_type": "plane"

Q: I want to return aggregation only?


{"size": 0,
    "aggs": {
        "name_of_1st_parent_aggregation": {
            "nested": {"path": "document.SUB_DOCUMENT"},
            "aggs": {
                "name_of_1st_aggregation": {
                    "terms": {
        "name_of_2nd_parent_aggregation": {
            "nested": {"path": "document.SUB_DOCUMENT_1"},
            "aggs": {
                "name_of_2nd_aggregation": {
                    "terms": {
        "name_of_3rd_parent_aggregation": {
            "nested": {"path": "document.SUB_DOCUMENT_2"},
            "aggs": {
                "name_of_3rd_aggregation": {
                            {"origin": "100500, 100500",
                             "ranges": [{"to": 1000}, {"to": 3000, "from": 1000}, {"from": 3000}]

Q: I want autocomplete?

A-0. NOTE better to use n-gramm based approach

A-1. Prefix approach for autocomplete:

    "query":{"query_string" : {
        "default_field" : "",
        "query" : "start_of_phrase*"

A-2. by adding to document mapping additional suggest field:

"mappings" : {
    "document_type": {
             "suggest" : {
                        "type" : "completion",
                        "index_analyzer" :"simple",
                        "search_analyzer" :"simple",

When you add document for indexing you have to specify this additional information and use special endpoint _suggest for request suggestions:

    "suggest_name" : {
        "text" : "k",
        "completion" : {
            "field" : "suggest"

Q: I want filtering of a search result by nested attribute!

  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      "filter": {
        "or": [
            "nested": {
              "path": "document.nested_attribute",
              "filter": {
                "bool": {
                  "must": [
                      "terms": {
                        "document.nested_attribute.attribute_value": [
            "nested": {
              "path": "document.nested_attribute_1",
              "filter": {
                "bool": {
                  "must": [
                      "terms": {
                        "document.nested_attribute_1.attribute_value": [
                          "some string value"