Cassandra

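Cassandra is a good fit when reads far outnumber writes, the data set is very large, and most of it is cold. If most of the data is hot, Redis is the better choice.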

Resources

People in the community:

teddyma, https://legacy.gitbook.com/@teddyma
flyml, http://www.flyml.net/

Tutorials:

- http://www.tutorialspoint.com/cassandra/cassandra_data_model.htm
- https://www.slideshare.net/DataStax/understanding-how-cql3-maps-to-cassandras-internal-data-structure

Installation

$ vi /etc/yum.repos.d/cassandra.repo

[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS

$ yum install cassandra

Configuration

cluster_name
listen_address
data_file_directories

seed_provider

alter table xxx.xxx with min_index_interval=131;

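Operations

Adding a new node

Install Cassandra
Configure
- cluster_name: same as on the existing nodes
- listen_address: an address of this machine that the other nodes can reach
- data_file_directories
- seed_provider
- endpoint_snitch: GossipingPropertyFileSnitch
- the contents of cassandra-rackdc.properties must match those on the seed nodes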
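
The last item can be checked mechanically. The following is a minimal sketch, not part of the original notes: `parse_properties` and `rackdc_differences` are hypothetical helper names, and the sketch assumes the text of cassandra-rackdc.properties has already been copied from a seed node and from the new node. It only understands plain `key=value` lines and `#` comments.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def rackdc_differences(seed_text: str, new_text: str) -> dict:
    """Return {key: (seed_value, new_value)} for every key that differs.

    A key missing on one side shows up with None on that side.
    """
    seed = parse_properties(seed_text)
    new = parse_properties(new_text)
    return {
        key: (seed.get(key), new.get(key))
        for key in sorted(set(seed) | set(new))
        if seed.get(key) != new.get(key)
    }
```

An empty result means the two files agree on every key; anything else lists the settings to review before starting the new node.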

Rebalance Data

https://www.datastax.com/dev/blog/balancing-your-cassandra-cluster
https://serverfault.com/questions/135437/how-can-i-get-cassandra-to-automatically-rebalance?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa
https://stackoverflow.com/questions/16343142/how-to-rebalance-cassandra-cluster-after-adding-new-node?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa

https://stackoverflow.com/questions/23294298/replication-in-cassandra

Common operations

select peer,release_version,schema_version from system.peers;
select key,release_version, host_id, schema_version from system.local;
alter KEYSPACE md5 with replication={'class': 'SimpleStrategy', 'replication_factor':2};

# Remove a dead node
nodetool removenode 17601d18-83ed-491b-b4a0-793f74026ef7

# Find which nodes hold a given key
nodetool -h localhost getendpoints <keyspace> <cf> <key>

# Show statistics for a keyspace

nodetool cfstats <keyspace>

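Node failure

When a disk fails on a Cassandra node, there are two cases: the node can still start normally, or the node cannot start at all.

Case 1: the node can still start

1. Replace the failed disk. If you have no spare disk, comment out the failed disk's entry in cassandra.yaml.

2. Start Cassandra. If startup fails with errors such as a keyspace not being found, follow the procedure for case 2 instead.

3. Run nodetool repair to restore the data this node lost.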

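Case 2: the node cannot start

1. Replace the failed disk. If you have no spare disk, comment out the failed disk's entry in cassandra.yaml.

2. On a healthy node, run: $ nodetool ring | grep ip_address_of_node | awk '{print $NF ","}' | xargs

This prints the tokens of the failed node. Set them, comma-separated, as the initial_token option in cassandra.yaml.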

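3. In cassandra.yaml, set: auto_bootstrap: false

The official documentation omits this step; see https://issues.apache.org/jira/browse/CASSANDRA-11365

4. Delete the system directory under every data directory.

rm -fr /mnt1/cassandra/data/system
rm -fr /mnt2/cassandra/data/system
...

5. Start Cassandra. Errors during startup saying that a schema does not exist are expected: the system keyspace is rebuilt automatically. As long as the node rejoins the cluster, everything is fine.

6. As in case 1, run nodetool repair to restore the data this node lost.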
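
Step 2 of the second case can also be done in Python instead of the grep/awk pipeline. This is a minimal sketch under a few assumptions: `tokens_for_node` is a hypothetical helper name, `SAMPLE_RING` is made-up text that only approximates the column layout of `nodetool ring`, and the sketch relies on the address being the first column and the token the last, as the awk command in step 2 does. Unlike a plain grep, it matches the address exactly, so 10.0.0.1 does not also pick up 10.0.0.10.

```python
# Made-up sample for illustration only; real `nodetool ring` output has
# more rows and different values.
SAMPLE_RING = """\
Datacenter: dc1
==========
Address    Rack   Status  State   Load      Owns    Token
                                                    6000000000000000000
10.0.0.1   rack1  Up      Normal  1.2 GiB   50.0%   -9000000000000000000
10.0.0.10  rack1  Up      Normal  1.1 GiB   50.0%   -3000000000000000000
10.0.0.1   rack1  Up      Normal  1.2 GiB   50.0%   3000000000000000000
10.0.0.10  rack1  Up      Normal  1.1 GiB   50.0%   6000000000000000000
"""


def tokens_for_node(ring_output: str, address: str) -> str:
    """Collect the last column of every row whose first column is `address`.

    Returns the tokens joined by commas, ready to paste into initial_token.
    """
    tokens = []
    for line in ring_output.splitlines():
        fields = line.split()
        if fields and fields[0] == address:
            tokens.append(fields[-1])
    return ",".join(tokens)


if __name__ == "__main__":
    print(tokens_for_node(SAMPLE_RING, "10.0.0.1"))
```

In practice the first argument would be the captured output of `nodetool ring` from a healthy node, and the second the address of the failed node.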

Tuning

Goal: keep as many keys as possible in the cache.

ALTER TABLE keyspace.table
with min_index_interval = 512 
AND max_index_interval = 2048;

key_cache_size_in_mb: 1000

cqlsh --request-timeout=3600

cqlshrc

http://cassandra.apache.org/doc/latest/tools/cqlsh.html

The default location is ~/.cassandra/cqlshrc. See conf/cqlshrc.sample for an example of its contents.
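
To make the long request timeout permanent rather than passing it on every invocation, the setting can go into the cqlshrc file. The following is a minimal sketch, assuming that `request_timeout` belongs in the `[connection]` section as in conf/cqlshrc.sample (check the sample shipped with your version). `write_cqlshrc` is a hypothetical helper name. Note that rewriting an existing file this way drops any comments it contained.

```python
import configparser
from pathlib import Path


def write_cqlshrc(path: Path, request_timeout: int) -> None:
    """Set [connection] request_timeout in a cqlshrc file, creating it if needed."""
    config = configparser.ConfigParser()
    if path.exists():
        config.read(path)  # keep the settings that are already there
    if not config.has_section("connection"):
        config.add_section("connection")
    config.set("connection", "request_timeout", str(request_timeout))
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        config.write(handle)
```

A call such as `write_cqlshrc(Path.home() / ".cassandra" / "cqlshrc", 3600)` would mirror the 3600-second timeout used above.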

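Miscellaneous

Check the file permissions under /var/log/cassandra/
Cassandra does not work well on btrfs subvolumes
Turn off the firewall (or else open the ports that C* uses)
cassandra -f starts C* in the foreground, which is handy for watching the logs

Bulk data import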

salt -E 'py00[1345]' cmd.run "nodetool disableautocompaction"

Compaction strategies: SizeTieredCompactionStrategy, LeveledCompactionStrategy, TimeWindowCompactionStrategy

CREATE TABLE use_view (
    uid bigint PRIMARY KEY,
    ids map<int, int>
) WITH compaction = { 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy' };

ALTER TABLE use_view WITH compaction = { 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy' };

https://www.cnblogs.com/didda/p/4728588.html
https://my.oschina.net/u/2449787/blog/1583615

Daimon Blog