db mix – postgres, sqlite, cassandra, aerospike, redis, kafka, rabbitmq & hdfs


SQLite

How to delete all tables from SQLite:

SELECT 'DROP TABLE ' || name || ';' FROM sqlite_master WHERE type = 'table';
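This is a two-step pattern: the query only generates the DROP statements, which you then execute. A minimal sketch of the same pattern using Python's stdlib sqlite3 (the table names here are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")          # throwaway in-memory database
con.executescript("CREATE TABLE a(x); CREATE TABLE b(y);")

# Step 1: generate one DROP TABLE statement per user table.
drops = [row[0] for row in con.execute(
    "SELECT 'DROP TABLE ' || name || ';' FROM sqlite_master WHERE type = 'table'")]

# Step 2: execute the generated statements.
for stmt in drops:
    con.execute(stmt)

remaining = con.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'table'").fetchone()[0]
print(remaining)   # 0
```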

Find duplicates by field “field_name”:

SELECT field_name, COUNT(field_name) AS cnt FROM some_table GROUP BY field_name HAVING COUNT(field_name) > 1;

(Referencing the cnt alias in HAVING works in SQLite and MySQL but not in PostgreSQL, so repeating the aggregate is the portable form.)
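A quick check of the duplicate-finding query against SQLite, with a made-up some_table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE some_table(field_name TEXT)")
con.executemany("INSERT INTO some_table VALUES (?)",
                [("a",), ("a",), ("b",), ("c",), ("c",), ("c",)])

# Portable form: repeat the aggregate in HAVING instead of the alias.
dupes = con.execute("""
    SELECT field_name, COUNT(field_name) AS cnt
    FROM some_table
    GROUP BY field_name
    HAVING COUNT(field_name) > 1
    ORDER BY field_name
""").fetchall()
print(dupes)   # [('a', 2), ('c', 3)]
```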

Find records changed in the last 5 days (PostgreSQL syntax):

SELECT * FROM some_table WHERE created_at >= NOW() - '5 day'::INTERVAL;
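The query above uses PostgreSQL's NOW() and ::INTERVAL; in SQLite the equivalent filter can be written with datetime(). A sketch, with a made-up some_table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE some_table(id INTEGER, created_at TEXT)")
con.execute("INSERT INTO some_table VALUES (1, datetime('now', '-2 days'))")
con.execute("INSERT INTO some_table VALUES (2, datetime('now', '-10 days'))")

# SQLite equivalent of: created_at >= NOW() - '5 day'::INTERVAL
recent = con.execute(
    "SELECT id FROM some_table WHERE created_at >= datetime('now', '-5 days')"
).fetchall()
print(recent)   # [(1,)]
```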

Get table definitions:

pragma table_info(mass_connections);
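The same pragma is also queryable from application code; each returned row is (cid, name, type, notnull, dflt_value, pk). A sketch with a made-up table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mass_connections(id INTEGER PRIMARY KEY, name TEXT)")

# Column name is the second field of each table_info row.
cols = [row[1] for row in con.execute("PRAGMA table_info(mass_connections)")]
print(cols)   # ['id', 'name']
```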

Export a select query to CSV:

.mode csv
.output result_of_query.csv
select * from my_table;
.output stdout 

Import data from a CSV file into a new table:

.mode csv
.import /path/to/your/all_data.csv new_table
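Outside the sqlite3 shell, the export half of this workflow can be scripted; a minimal sketch with Python's stdlib csv and sqlite3 modules (my_table and its rows are made up):

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table(id INTEGER, name TEXT)")
con.executemany("INSERT INTO my_table VALUES (?, ?)", [(1, "a"), (2, "b")])

# Write query results as CSV (what .mode csv / .output does in the shell);
# swap io.StringIO for open("result_of_query.csv", "w", newline="") to hit a file.
buf = io.StringIO()
csv.writer(buf).writerows(con.execute("SELECT * FROM my_table"))
print(buf.getvalue())
```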


PostgreSQL

How to show all tables with sizes within a database:

SELECT schema_name, relname, pg_size_pretty(table_size) AS size, table_size FROM ( 
SELECT pg_catalog.pg_namespace.nspname AS schema_name, relname, pg_relation_size(pg_catalog.pg_class.oid) AS table_size 
FROM pg_catalog.pg_class 
JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid) t 
WHERE schema_name NOT LIKE 'pg_%' ORDER BY table_size DESC;

Show the number of live rows per table:

SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;

How to create data-only dump:

pg_dump -U your_pg_user -h pg_ip_address -p pg_port -a --column-inserts db_name > file_name.sql
pg_dump -U your_pg_user -h pg_ip_address -p pg_port -a --column-inserts --table=table_name db_name > file_name.sql

Useful pg_dump flags:

  • -C adds the CREATE statements
  • -s dump the schema only
  • -a dump the data only (no schema)
  • -D dump using INSERT statements (to simplify uploading data from PG into another db engine)

How to restore data:

psql dbname < infile.sql

PG stop/start:

${PG_HOME}/bin/pg_ctl -D /data stop -m immediate
${PG_HOME}/bin/pg_ctl start -D /data -l logfile


Aerospike

Get settings:

asinfo -v 'get-config:'

Set particular settings:

asinfo -v 'set-config:context=service;batch-max-requests=10000000'
asinfo -v 'set-config:context=network;timeout=10000000'
asinfo -v 'set-config:context=service;batch-index-threads=100'

How to register a Lua script:

register module 'your_script_name.lua'

more at http://www.aerospike.com/docs/guide/aggregation.html

How to build a secondary index based on a bin:

CREATE INDEX your_idx ON your_namespace.your_set (bin_name) NUMERIC

bin_name ~ field name; your_namespace and your_set are the target namespace and set

How to delete all records within a set (Aerospike 3.12+):

asinfo -v 'truncate:namespace=your_namespace;set=your_set'

Redis

How to register a Lua script:

redis-cli script load "$(cat /YOUR/PATH/script_name.lua)"


Cassandra

How to save the results of a query to a file:

cqlsh -e "select * from table_name where some_txt_attr='very tricky string';" > cassandra_file_query_result.txt

How to check node health:

nodetool status | awk '/^(U|D)(N|L|J|M)/{print $2}'
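The awk filter keeps the lines whose two-letter status code starts with U/D (up/down) followed by N/L/J/M (normal/leaving/joining/moving) and prints the second column, the node address. The same filter sketched in Python against a made-up nodetool status excerpt:

```python
import re

# Hypothetical `nodetool status` excerpt; addresses and host IDs are made up.
sample = """Datacenter: dc1
==============
Status=Up/Down |/ State=Normal/Leaving/Joining/Moving
--  Address    Load    Tokens  Owns   Host ID  Rack
UN  10.0.0.1   1 GiB   256     33.3%  aaaa     rack1
DN  10.0.0.2   1 GiB   256     33.3%  bbbb     rack1
"""

# Same condition as awk's /^(U|D)(N|L|J|M)/, printing field 2.
ips = [line.split()[1] for line in sample.splitlines()
       if re.match(r"^(U|D)(N|L|J|M)", line)]
print(ips)   # ['10.0.0.1', '10.0.0.2']
```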

How to check compression ratio for particular table:

nodetool -h cassandra_ip cfhistograms some_keyspace some_table

How to check the dropped tasks count (at the bottom) at particular node:

watch -n 1 -d "nodetool tpstats"

How to do a “backup” of Cassandra:

nodetool snapshot

It will generate snapshot of data at /<your path from yaml file>/data/snapshot.

How to do a “restore” from snapshot:

  • stop cassandra
  • delete content of every keyspace table at /<your path from yaml file>/data/
  • copy data from snapshot to the respective keyspace folder
  • restart the server


Kafka

How to show groups:
For old consumers:

./bin/kafka-consumer-groups.sh --list --zookeeper zookeeper_host:2181

For new consumers:

./bin/kafka-consumer-groups.sh --list --new-consumer --bootstrap-server broker_host:9092

How to show how many messages are in a topic (per partition):

./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list broker_host:9092 --topic topic_name --time -1 --offsets 1

How to show how many messages in ALL partitions:

./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list broker_host:9092 --topic topic_name --time -1 --offsets 1 | awk -F ":" '{sum += $3} END {print sum}'
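GetOffsetShell prints one topic:partition:offset line per partition, and the awk stage sums the third field. The same aggregation sketched in Python over made-up output:

```python
# Hypothetical GetOffsetShell output: topic:partition:offset per line.
sample = """my_topic:0:120
my_topic:1:95
my_topic:2:85
"""

# Same aggregation as awk -F ":" '{sum += $3} END {print sum}'
total = sum(int(line.split(":")[2]) for line in sample.splitlines())
print(total)   # 300
```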

How to show topics:

./bin/kafka-topics.sh --list --zookeeper zookeeper_host:2181

How to show consumer’s lag:
For old consumers:

./bin/kafka-consumer-groups.sh --describe --zookeeper zookeeper_host:2181 --group group_name | column -s , -t

For new consumers:

./bin/kafka-consumer-groups.sh --describe --new-consumer --bootstrap-server broker_host:9092 --group group_name | column -s , -t

How to run a test consumer:

./bin/kafka-console-consumer.sh --from-beginning --new-consumer --bootstrap-server broker_host:9092 --topic topic_name


RabbitMQ

How to create an exchange and a queue:

rabbitmqadmin declare exchange --vhost=/ name=test123 type=direct
rabbitmqadmin declare queue --vhost=/ name=test_queue.123 durable=true
rabbitmqadmin --vhost="/" declare binding source="test123" destination_type="queue" destination="test_queue.123" routing_key="test_routing_key"


HDFS

Updating file access control:

hdfs dfs -setfacl -m user:user_name:rwx /Some/Path
hadoop fs -chown user_name:group_name /Some/Path

Updating file access control recursively:

hdfs dfs -chown -R user_name:group_name /Some/Path

In case of the following error:
>> java.io.IOException: Cannot obtain block length for LocatedBlock
Root cause: improperly closed files. How to fix them:

hdfs fsck /Some/Path -files -openforwrite | grep OPENFORWRITE
hdfs debug recoverLease -path /Some/Path/corrupt.json -retries 5
