====== Installing HA OpenNebula on CentOS 7 with Ceph as a datastore and IPoIB as the backend network ======
==== Introduction. ====
//This article walks through installing HA OpenNebula with Ceph as the datastore on three nodes (disks: 6x SSD 240 GB, backend network: IPoIB, OS: CentOS 7), plus one additional node used for backup.//\\
The equipment scheme is shown below:\\
{{:ru:jobs:opennebula_forart.gif}}\\
We use this solution to virtualize our imagery processing servers.
==== Preparing. ====
All actions should be performed on all nodes. On kosmo-arch everything applies except bridge-utils and the %%FrontEnd%% network.\\
  yum install bridge-utils
**%%FrontEnd%% network.**\\
Configure bond0 (mode 0) and run the script below to create the frontend bridge interface for the VMs (%%OpenNebula%%):\\
  #!/bin/bash
  Device=bond0
  cd /etc/sysconfig/network-scripts
  if [ ! -f ifcfg-nab1 ]; then
    cp -p ifcfg-$Device bu-ifcfg-$Device
    echo -e "DEVICE=$Device\nTYPE=Ethernet\nBOOTPROTO=none\nNM_CONTROLLED=no\nONBOOT=yes\nBRIDGE=nab1" > ifcfg-$Device
    grep ^HW bu-ifcfg-$Device >> ifcfg-$Device
    echo -e "DEVICE=nab1\nNM_CONTROLLED=no\nONBOOT=yes\nTYPE=bridge" > ifcfg-nab1
    egrep -v "^#|^DEV|^HWA|^TYP|^UUI|^NM_|^ONB" bu-ifcfg-$Device >> ifcfg-nab1
  fi
**%%BackEnd%% network.**\\
Configuring %%IPoIB%%:\\
  yum groupinstall -y "Infiniband Support"
  yum install opensm
Enable %%IPoIB%% and switch InfiniBand to connected mode. This [[https://www.kernel.org/doc/Documentation/infiniband/ipoib.txt|link]] explains the differences between connected and datagram modes.
  cat /etc/rdma/rdma.conf
  # Load IPoIB
  IPOIB_LOAD=yes
  # Setup connected mode
  SET_IPOIB_CM=yes
Start the InfiniBand services:
  systemctl enable rdma opensm
  systemctl start rdma opensm
Check that everything is working:
  ibv_devinfo
  hca_id: mlx4_0
        transport:            InfiniBand (0)
        fw_ver:               2.7.000
        node_guid:            0025:90ff:ff07:3368
        sys_image_guid:       0025:90ff:ff07:336b
        vendor_id:            0x02c9
        vendor_part_id:       26428
        hw_ver:               0xB0
        board_id:             SM_1071000001000
        phys_port_cnt:        2
            port:   1
                state:        PORT_ACTIVE (4)
                max_mtu:      4096 (5)
                active_mtu:   4096 (5)
                sm_lid:       8
                port_lid:     4
                port_lmc:     0x00
                link_layer:   InfiniBand
            port:   2
                state:        PORT_ACTIVE (4)
                max_mtu:      4096 (5)
                active_mtu:   4096 (5)
                sm_lid:       4
                port_lid:     9
                port_lmc:     0x00
                link_layer:   InfiniBand
and
  iblinkinfo
  CA: kosmo-virt1 mlx4_0:
     0x002590ffff073385   13  1[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>  2  10[ ] "Infiniscale-IV Mellanox Technologies" ( )
  Switch: 0x0002c90200482d08 Infiniscale-IV Mellanox Technologies:
     2   1[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   2[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   3[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   4[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   5[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   6[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   7[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   8[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2   9[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  10[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>  13  1[ ] "kosmo-virt1 mlx4_0" ( )
     2  11[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>   4  1[ ] "kosmo-virt2 mlx4_0" ( )
     2  12[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  13[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  14[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  15[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  16[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  17[ ] ==( Down/ Polling)==>   [ ] "" ( )
     2  18[ ] ==( Down/ Polling)==>   [ ] "" ( )
  CA: kosmo-virt2 mlx4_0:
     0x002590ffff073369    4  1[ ] ==( 4X 10.0 Gbps Active/ LinkUp)==>  2  11[ ] "Infiniscale-IV Mellanox Technologies" ( )
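To confirm that connected mode actually took effect once the rdma service is up, you can look at the IPoIB interface state in sysfs. A quick check, assuming the IPoIB interfaces are named ib0 and ib1:
  # should print "connected" when SET_IPOIB_CM=yes has been applied
  cat /sys/class/net/ib0/mode
  cat /sys/class/net/ib1/mode
  # in connected mode the interface can carry a large MTU (up to 65520)
  ip link show ib0 | grep mtu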
Set up bond1 (mode 1) over the two IB interfaces and assign the IP address 172.19.254.X, where X is the node number.
Example below:
  cat /etc/modprobe.d/bonding.conf
  alias bond0 bonding
  alias bond1 bonding
  cat /etc/sysconfig/network-scripts/ifcfg-bond1
  DEVICE=bond1
  TYPE=bonding
  BOOTPROTO=static
  USERCTL=no
  ONBOOT=yes
  IPADDR=172.19.254.x
  NETMASK=255.255.255.0
  BONDING_OPTS="mode=1 miimon=500 primary=ib0"
  MTU=65520
**Disable the firewall.**

**Tuning sysctl.**\\
Add the following to /etc/sysctl.conf (or a file under /etc/sysctl.d/) and apply it with sysctl -p:
  net.core.rmem_max=16777216
  net.core.wmem_max=16777216
  net.core.rmem_default=16777216
  net.core.wmem_default=16777216
  net.core.optmem_max=16777216
  net.ipv4.tcp_mem=16777216 16777216 16777216
  net.ipv4.tcp_rmem=4096 87380 16777216
  net.ipv4.tcp_wmem=4096 65536 16777216
==== Installing Ceph. ====
**Preparation**\\
Configure passwordless SSH access between the nodes for the root user. The key should be created on one node and then copied to the others into /root/.ssh/.
  ssh-keygen -t dsa   # creation of a passwordless key
  cd /root/.ssh
  cat id_dsa.pub >> authorized_keys
  chown root.root authorized_keys
  chmod 600 authorized_keys
  echo "StrictHostKeyChecking no" > config
Disable SELinux on all nodes. In /etc/selinux/config set:
  SELINUX=disabled
and switch it off for the running system:
  setenforce 0
Add the maximum number of open files to /etc/security/limits.conf (adjust to your requirements) on all nodes:
  * hard nofile 1000000
  * soft nofile 1000000
Set up /etc/hosts on all nodes:\\
  172.19.254.1 kosmo-virt1
  172.19.254.2 kosmo-virt2
  172.19.254.3 kosmo-virt3
  172.19.254.150 kosmo-arch
  192.168.14.42 kosmo-virt1
  192.168.14.43 kosmo-virt2
  192.168.14.44 kosmo-virt3
  192.168.14.150 kosmo-arch
**Installing**
Install a kernel newer than 3.15 on all nodes (this is needed for the CephFS kernel client):
  rpm -ivh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
  yum --enablerepo=elrepo-kernel install kernel-ml -y
Set the new kernel as the default for booting:
  grep ^menuentry /boot/grub2/grub.cfg
  grub2-set-default 0   # number of our kernel
  grub2-editenv list
  grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot.
Set up the repository (on all nodes). Note the quoted 'EOT' so that $basearch is not expanded by the shell:
  cat << 'EOT' > /etc/yum.repos.d/ceph.repo
  [ceph]
  name=Ceph packages for $basearch
  baseurl=http://ceph.com/rpm/el7/$basearch
  enabled=1
  gpgcheck=1
  type=rpm-md
  gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
  [ceph-noarch]
  name=Ceph noarch packages
  baseurl=http://ceph.com/rpm/el7/noarch
  enabled=1
  gpgcheck=1
  type=rpm-md
  gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
  EOT
Import the GPG key (on all nodes):
  rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
Set up ntpd (on all nodes):
  yum install ntp
Edit /etc/ntp.conf and start ntpd.
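A minimal /etc/ntp.conf sketch; the pool servers below are only placeholders, so point them at your own time sources if you have them:
  driftfile /var/lib/ntp/drift
  server 0.centos.pool.ntp.org iburst
  server 1.centos.pool.ntp.org iburst
  server 2.centos.pool.ntp.org iburst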
(on all nodes)
  systemctl enable ntpd
  systemctl start ntpd
Install (on all nodes):
  yum install libunwind -y
  yum install -y ceph-common ceph ceph-fuse ceph-deploy
**Deploying.**
(on kosmo-virt1)
  cd /etc/ceph
  ceph-deploy new kosmo-virt1 kosmo-virt2 kosmo-virt3
MON deploying (on kosmo-virt1):
  ceph-deploy mon create-initial
OSD deploying:
(on kosmo-virt1)
  cd /etc/ceph
  ceph-deploy gatherkeys kosmo-virt1
  ceph-deploy disk zap kosmo-virt1:sdb
  ceph-deploy osd prepare kosmo-virt1:sdb
  ceph-deploy disk zap kosmo-virt1:sdc
  ceph-deploy osd prepare kosmo-virt1:sdc
  ceph-deploy disk zap kosmo-virt1:sdd
  ceph-deploy osd prepare kosmo-virt1:sdd
  ceph-deploy disk zap kosmo-virt1:sde
  ceph-deploy osd prepare kosmo-virt1:sde
  ceph-deploy disk zap kosmo-virt1:sdf
  ceph-deploy osd prepare kosmo-virt1:sdf
  ceph-deploy disk zap kosmo-virt1:sdg
  ceph-deploy osd prepare kosmo-virt1:sdg
(on kosmo-virt2)
  cd /etc/ceph
  ceph-deploy gatherkeys kosmo-virt2
  ceph-deploy disk zap kosmo-virt2:sdb
  ceph-deploy osd prepare kosmo-virt2:sdb
  ceph-deploy disk zap kosmo-virt2:sdc
  ceph-deploy osd prepare kosmo-virt2:sdc
  ceph-deploy disk zap kosmo-virt2:sdd
  ceph-deploy osd prepare kosmo-virt2:sdd
  ceph-deploy disk zap kosmo-virt2:sde
  ceph-deploy osd prepare kosmo-virt2:sde
  ceph-deploy disk zap kosmo-virt2:sdf
  ceph-deploy osd prepare kosmo-virt2:sdf
  ceph-deploy disk zap kosmo-virt2:sdg
  ceph-deploy osd prepare kosmo-virt2:sdg
(on kosmo-virt3)
  cd /etc/ceph
  ceph-deploy gatherkeys kosmo-virt3
  ceph-deploy disk zap kosmo-virt3:sdb
  ceph-deploy osd prepare kosmo-virt3:sdb
  ceph-deploy disk zap kosmo-virt3:sdc
  ceph-deploy osd prepare kosmo-virt3:sdc
  ceph-deploy disk zap kosmo-virt3:sdd
  ceph-deploy osd prepare kosmo-virt3:sdd
  ceph-deploy disk zap kosmo-virt3:sde
  ceph-deploy osd prepare kosmo-virt3:sde
  ceph-deploy disk zap kosmo-virt3:sdf
  ceph-deploy osd prepare kosmo-virt3:sdf
  ceph-deploy disk zap kosmo-virt3:sdg
  ceph-deploy osd prepare kosmo-virt3:sdg
where sd[b-g] are the SSD disks.
MDS deploying:\\
The new Giant release of Ceph no longer ships with the default data and metadata pools.\\
Use ceph osd lspools to check.\\
  ceph osd pool create data 1024
  ceph osd pool set data min_size 1
  ceph osd pool set data size 2
  ceph osd pool create metadata 1024
  ceph osd pool set metadata min_size 1
  ceph osd pool set metadata size 2
Check the pool ids of data and metadata with\\
  ceph osd lspools
Configuring the FS:
  ceph mds newfs 4 3 --yes-i-really-mean-it
where 4 is the id of the metadata pool and 3 is the id of the data pool.\\
Configure MDS:
(on kosmo-virt1)
  cd /etc/ceph
  ceph-deploy mds create kosmo-virt1
(on kosmo-virt2)
  cd /etc/ceph
  ceph-deploy mds create kosmo-virt2
(on all nodes)
  chkconfig ceph on
Configuring for kosmo-arch.\\
Copy /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring from any of the kosmo-virt nodes to kosmo-arch.
==== Preparing Ceph for OpenNebula. ====
Create the pool:
  ceph osd pool create one 4096
  ceph osd pool set one min_size 1
  ceph osd pool set one size 2
Set up authorization to the pool one:
  ceph auth get-or-create client.oneadmin mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=one' > /etc/ceph/ceph.client.oneadmin.keyring
Get the key from the keyring:
  cat /etc/ceph/ceph.client.oneadmin.keyring | grep key | awk '{print $3}' >> /etc/ceph/oneadmin.key
Checking:
  ceph auth list
Copy /etc/ceph/ceph.client.oneadmin.keyring and /etc/ceph/oneadmin.key to the other nodes.
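Before wiring this user into OpenNebula it is worth checking that the oneadmin credentials actually work against the pool. A quick sanity check; the image name oneadmin-test is arbitrary:
  # list the pool with the oneadmin user (uses /etc/ceph/ceph.client.oneadmin.keyring)
  rbd ls -p one --id oneadmin
  # create and remove a small test image
  rbd create -p one --id oneadmin --size 128 oneadmin-test
  rbd rm -p one --id oneadmin oneadmin-test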
==== Preparing for OpenNebula HA ====
=== Configuring the MariaDB cluster ===
On all nodes except kosmo-arch.
Set up the repo:
  cat << EOT > /etc/yum.repos.d/mariadb.repo
  [mariadb]
  name = MariaDB
  baseurl = http://yum.mariadb.org/10.0/centos7-amd64
  gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
  gpgcheck=1
  EOT
Installing:
  yum install MariaDB-Galera-server MariaDB-client rsync galera
Start the service:
  service mysql start
  chkconfig mysql on
  mysql_secure_installation
Preparing for the cluster:
  mysql -p
  GRANT USAGE ON *.* to sst_user@'%' IDENTIFIED BY 'PASS';
  GRANT ALL PRIVILEGES on *.* to sst_user@'%';
  FLUSH PRIVILEGES;
  exit
  service mysql stop
Configuring the cluster:
(for kosmo-virt1)
  cat << EOT > /etc/my.cnf
  [mysqld]
  collation-server = utf8_general_ci
  init-connect = 'SET NAMES utf8'
  character-set-server = utf8
  binlog_format=ROW
  default-storage-engine=innodb
  innodb_autoinc_lock_mode=2
  innodb_locks_unsafe_for_binlog=1
  query_cache_size=0
  query_cache_type=0
  bind-address=0.0.0.0
  datadir=/var/lib/mysql
  innodb_log_file_size=100M
  innodb_file_per_table
  innodb_flush_log_at_trx_commit=2
  wsrep_provider=/usr/lib64/galera/libgalera_smm.so
  wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
  wsrep_cluster_name='scanex_galera_cluster'
  wsrep_node_address='172.19.254.1'   # setup real node ip
  wsrep_node_name='kosmo-virt1'       # setup real node name
  wsrep_sst_method=rsync
  wsrep_sst_auth=sst_user:PASS
  EOT
(for kosmo-virt2)
  cat << EOT > /etc/my.cnf
  [mysqld]
  collation-server = utf8_general_ci
  init-connect = 'SET NAMES utf8'
  character-set-server = utf8
  binlog_format=ROW
  default-storage-engine=innodb
  innodb_autoinc_lock_mode=2
  innodb_locks_unsafe_for_binlog=1
  query_cache_size=0
  query_cache_type=0
  bind-address=0.0.0.0
  datadir=/var/lib/mysql
  innodb_log_file_size=100M
  innodb_file_per_table
  innodb_flush_log_at_trx_commit=2
  wsrep_provider=/usr/lib64/galera/libgalera_smm.so
  wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
  wsrep_cluster_name='scanex_galera_cluster'
  wsrep_node_address='172.19.254.2'   # setup real node ip
  wsrep_node_name='kosmo-virt2'       # setup real node name
  wsrep_sst_method=rsync
  wsrep_sst_auth=sst_user:PASS
  EOT
(for kosmo-virt3)
  cat << EOT > /etc/my.cnf
  [mysqld]
  collation-server = utf8_general_ci
  init-connect = 'SET NAMES utf8'
  character-set-server = utf8
  binlog_format=ROW
  default-storage-engine=innodb
  innodb_autoinc_lock_mode=2
  innodb_locks_unsafe_for_binlog=1
  query_cache_size=0
  query_cache_type=0
  bind-address=0.0.0.0
  datadir=/var/lib/mysql
  innodb_log_file_size=100M
  innodb_file_per_table
  innodb_flush_log_at_trx_commit=2
  wsrep_provider=/usr/lib64/galera/libgalera_smm.so
  wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
  wsrep_cluster_name='scanex_galera_cluster'
  wsrep_node_address='172.19.254.3'   # setup real node ip
  wsrep_node_name='kosmo-virt3'       # setup real node name
  wsrep_sst_method=rsync
  wsrep_sst_auth=sst_user:PASS
  EOT
(on kosmo-virt1)
  /etc/init.d/mysql start --wsrep-new-cluster
(on kosmo-virt2)
  /etc/init.d/mysql start
(on kosmo-virt3)
  /etc/init.d/mysql start
Check on all nodes:
  mysql -p
  show status like 'wsrep%';
  | Variable_name                | Value                                |
  +------------------------------+--------------------------------------+
  | wsrep_local_state_uuid       | 739895d5-d6de-11e4-87f6-3a3244f26574 |
  | wsrep_protocol_version       | 7 |
  | wsrep_last_committed         | 0 |
  | wsrep_replicated             | 0 |
  | wsrep_replicated_bytes       | 0 |
  | wsrep_repl_keys              | 0 |
  | wsrep_repl_keys_bytes        | 0 |
  | wsrep_repl_data_bytes        | 0 |
  | wsrep_repl_other_bytes       | 0 |
  | wsrep_received               | 6 |
  | wsrep_received_bytes         | 425 |
  | wsrep_local_commits          | 0 |
  | wsrep_local_cert_failures    | 0 |
  | wsrep_local_replays          | 0 |
  | wsrep_local_send_queue       | 0 |
  | wsrep_local_send_queue_max   | 1 |
  | wsrep_local_send_queue_min   | 0 |
  | wsrep_local_send_queue_avg   | 0.000000 |
  | wsrep_local_recv_queue       | 0 |
  | wsrep_local_recv_queue_max   | 1 |
  | wsrep_local_recv_queue_min   | 0 |
  | wsrep_local_recv_queue_avg   | 0.000000 |
  | wsrep_local_cached_downto    | 18446744073709551615 |
  | wsrep_flow_control_paused_ns | 0 |
  | wsrep_flow_control_paused    | 0.000000 |
  | wsrep_flow_control_sent      | 0 |
  | wsrep_flow_control_recv      | 0 |
  | wsrep_cert_deps_distance     | 0.000000 |
  | wsrep_apply_oooe             | 0.000000 |
  | wsrep_apply_oool             | 0.000000 |
  | wsrep_apply_window           | 0.000000 |
  | wsrep_commit_oooe            | 0.000000 |
  | wsrep_commit_oool            | 0.000000 |
  | wsrep_commit_window          | 0.000000 |
  | wsrep_local_state            | 4 |
  | wsrep_local_state_comment    | Synced |
  | wsrep_cert_index_size        | 0 |
  | wsrep_causal_reads           | 0 |
  | wsrep_cert_interval          | 0.000000 |
  | wsrep_incoming_addresses     | 172.19.254.1:3306,172.19.254.3:3306,172.19.254.2:3306 |
  | wsrep_evs_delayed            | |
  | wsrep_evs_evict_list         | |
  | wsrep_evs_repl_latency       | 0/0/0/0/0 |
  | wsrep_evs_state              | OPERATIONAL |
  | wsrep_gcomm_uuid             | 7397d6d6-d6de-11e4-a515-d3302a8c2342 |
  | wsrep_cluster_conf_id        | 2 |
  | wsrep_cluster_size           | 2 |
  | wsrep_cluster_state_uuid     | 739895d5-d6de-11e4-87f6-3a3244f26574 |
  | wsrep_cluster_status         | Primary |
  | wsrep_connected              | ON |
  | wsrep_local_bf_aborts        | 0 |
  | wsrep_local_index            | 0 |
  | wsrep_provider_name          | Galera |
  | wsrep_provider_vendor        | Codership Oy |
  | wsrep_provider_version       | 25.3.9(r3387) |
  | wsrep_ready                  | ON |
  | wsrep_thread_count           | 2 |
  +------------------------------+--------------------------------------+
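The most important values are wsrep_local_state_comment (Synced), wsrep_cluster_status (Primary) and wsrep_cluster_size. A quick one-liner to confirm that all three nodes have joined (run on any node):
  mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
  # expected: wsrep_cluster_size | 3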
Creating the user and the database:
  mysql -p
  create database opennebula;
  GRANT USAGE ON opennebula.* to oneadmin@'%' IDENTIFIED BY 'PASS';
  GRANT ALL PRIVILEGES on opennebula.* to oneadmin@'%';
  FLUSH PRIVILEGES;
**Remember: if all nodes go down, the cluster must be bootstrapped again from the most up-to-date node with /etc/init.d/mysql start --wsrep-new-cluster. You have to find that node first. If you bootstrap from a node with an outdated view, the other nodes will report an error in their logs such as: [ERROR] WSREP: gcs/src/gcs_group.cpp:void group_post_state_exchange(gcs_group_t*)():319: Reversing history: 0 -> 0, this member has applied 140536161751824 more events than the primary component. Data loss is possible. Aborting.**
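To find the most up-to-date node after a full outage, compare the seqno values in the Galera state file on each node and bootstrap from the one with the highest value (a simple check; -1 means the node was not shut down cleanly):
  # run on every node and compare the "seqno:" values
  cat /var/lib/mysql/grastate.dat
  # then bootstrap only the node with the highest seqno:
  # /etc/init.d/mysql start --wsrep-new-cluster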
=== Configuring the HA cluster ===
Unfortunately, pcs cluster conflicts with the OpenNebula server, which is why we go with pacemaker, corosync and crmsh.
**Installing HA**
Set up the repo on all nodes except kosmo-arch:
  cat << EOT > /etc/yum.repos.d/network\:ha-clustering\:Stable.repo
  [network_ha-clustering_Stable]
  name=Stable High Availability/Clustering packages (CentOS_CentOS-7)
  type=rpm-md
  baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
  gpgcheck=1
  gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
  enabled=1
  EOT
Install on all nodes except kosmo-arch:
  yum install corosync pacemaker crmsh resource-agents -y
On kosmo-virt1 create the configuration:
  vi /etc/corosync/corosync.conf
  totem {
      version: 2
      secauth: off
      cluster_name: cluster
      transport: udpu
  }
  nodelist {
      node {
          ring0_addr: kosmo-virt1
          nodeid: 1
      }
      node {
          ring0_addr: kosmo-virt2
          nodeid: 2
      }
      node {
          ring0_addr: kosmo-virt3
          nodeid: 3
      }
  }
  quorum {
      provider: corosync_votequorum
  }
  logging {
      to_syslog: yes
  }
and create the authkey on kosmo-virt1:
  cd /etc/corosync
  corosync-keygen
Copy corosync.conf and authkey to kosmo-virt2 and kosmo-virt3.
Enabling (on all nodes except kosmo-arch):
  systemctl enable pacemaker corosync
Starting (on all nodes except kosmo-arch):
  systemctl start pacemaker corosync
Checking:
  crm status
  Last updated: Mon Mar 30 18:33:14 2015
  Last change: Mon Mar 30 18:23:47 2015 via crmd on kosmo-virt2
  Stack: corosync
  Current DC: kosmo-virt2 (2) - partition with quorum
  Version: 1.1.10-32.el7_0.1-368c726
  3 Nodes configured
  0 Resources configured
  Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3 ]
Add the properties:
  crm configure property stonith-enabled=false
  crm configure property no-quorum-policy=stop
==== Installing OpenNebula ====
**Installing**
Set up the repo on all nodes except kosmo-arch:
  cat << EOT > /etc/yum.repos.d/opennebula.repo
  [opennebula]
  name=opennebula
  baseurl=http://downloads.opennebula.org/repo/4.12/CentOS/7/x86_64/
  enabled=1
  gpgcheck=0
  EOT
Installing (on all nodes except kosmo-arch):
  yum install -y opennebula-server opennebula-sunstone opennebula-node-kvm qemu-img qemu-kvm
Ruby runtime installation:
  /usr/share/one/install_gems
Change the oneadmin password:
  passwd oneadmin
Create passwordless SSH access for oneadmin (on kosmo-virt1):
  su oneadmin
  cd ~/.ssh
  ssh-keygen -t dsa
  cat id_dsa.pub >> authorized_keys
  chown oneadmin:oneadmin authorized_keys
  chmod 600 authorized_keys
  echo "StrictHostKeyChecking no" > config
Copy it to the other nodes (remember that the oneadmin home directory is /var/lib/one).\\
Change the listen address for the sunstone server (on all nodes):
  sed -i 's/host:\ 127\.0\.0\.1/host:\ 0\.0\.0\.0/g' /etc/one/sunstone-server.conf
On kosmo-virt1:\\
copy all /var/lib/one/.one/*.auth files and one.key to OTHER_NODES:/var/lib/one/.one/
Start the services on kosmo-virt1:
  systemctl start opennebula opennebula-sunstone
Try to connect to http://node:9869.\\
Check the logs for errors (/var/log/one/oned.log, /var/log/one/sched.log, /var/log/one/sunstone.log).\\
If there are no errors:
  systemctl stop opennebula opennebula-sunstone
**Add Ceph support to qemu-kvm on all nodes except kosmo-arch**
  qemu-img -h | grep rbd
  /usr/libexec/qemu-kvm --drive format=? | grep rbd
If there is no rbd support, then you have to compile and install qemu-kvm-rhev, qemu-kvm-common-rhev and qemu-img-rhev.
Download:
  yum groupinstall -y "Development Tools"
  yum install -y yum-utils rpm-build
  yumdownloader --source qemu-kvm
  rpm -ivh qemu-kvm-1.5.3-60.el7_0.11.src.rpm
Compiling:
  cd ~/rpmbuild/SPECS
  vi qemu-kvm.spec
Change %define rhev 0 to %define rhev 1.
  rpmbuild -ba qemu-kvm.spec
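If the build succeeds, the rebuilt packages should appear under ~/rpmbuild/RPMS/x86_64/. A quick check before installing (the exact file names depend on the qemu-kvm source version you downloaded):
  ls ~/rpmbuild/RPMS/x86_64/ | grep -E 'rhev|libcacard'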
Installing (for all nodes except kosmo-arch):
  rpm -e --nodeps libcacard-1.5.3-60.el7_0.11.x86_64
  rpm -e --nodeps qemu-img-1.5.3-60.el7_0.11.x86_64
  rpm -e --nodeps qemu-kvm-common-1.5.3-60.el7_0.11.x86_64
  rpm -e --nodeps qemu-kvm-1.5.3-60.el7_0.11.x86_64
  rpm -ivh libcacard-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
  rpm -ivh qemu-img-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
  rpm -ivh qemu-kvm-common-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
  rpm -ivh qemu-kvm-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
Check for Ceph support:
  qemu-img -h | grep rbd
  Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify blkdebug
  /usr/libexec/qemu-kvm --drive format=? | grep rbd
  Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify blkdebug
Try to write an image (for all nodes except kosmo-arch):
  qemu-img create -f rbd rbd:one/test-virtN 10G
where N is the node number.
**Add Ceph support to libvirt**
On all nodes:
  systemctl enable messagebus.service
  systemctl start messagebus.service
  systemctl enable libvirtd.service
  systemctl start libvirtd.service
On kosmo-virt1 create a uuid:
  uuidgen
  cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5
Create secret.xml:
  cat > secret.xml <<EOF
  <secret ephemeral='no' private='no'>
    <uuid>cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5</uuid>
    <usage type='ceph'>
      <name>client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==</name>
    </usage>
  </secret>
  EOF
where AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q== is the contents of /etc/ceph/oneadmin.key.\\
Copy secret.xml to the other nodes.\\
Add the key to libvirt (for all nodes except kosmo-arch):
  virsh secret-define --file secret.xml
  virsh secret-set-value --secret cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 --base64 $(cat /etc/ceph/oneadmin.key)
Check:
  virsh secret-list
  UUID                                 Usage
  -----------------------------------------------------------
  cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 ceph client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==
Restart libvirtd:
  systemctl restart libvirtd.service
**Converting the database to MySQL:**
Downloading the script:
  wget http://www.redmine.org/attachments/download/6239/sqlite3-to-mysql.py
Converting:
  sqlite3 /var/lib/one/one.db .dump | ./sqlite3-to-mysql.py > mysql.sql
  mysql -u oneadmin -p opennebula < mysql.sql
Change /etc/one/oned.conf from
  DB = [ backend = "sqlite" ]
to
  DB = [ backend = "mysql",
         server  = "localhost",
         port    = 0,
         user    = "oneadmin",
         passwd  = "PASS",
         db_name = "opennebula" ]
Copy oned.conf as root to the other nodes except kosmo-arch.\\
Check the kosmo-virt2 and kosmo-virt3 nodes in turn:
  systemctl start opennebula opennebula-sunstone
check the logs for errors (/var/log/one/oned.log, /var/log/one/sched.log, /var/log/one/sunstone.log), and if everything is fine stop the services again:
  systemctl stop opennebula opennebula-sunstone
==== Creating HA resources ====
On all nodes except kosmo-arch:
  systemctl disable opennebula opennebula-sunstone opennebula-novnc
From any of the nodes except kosmo-arch:
  crm configure
  primitive ClusterIP ocf:heartbeat:IPaddr2 params ip="192.168.14.41" cidr_netmask="24" op monitor interval="30s"
  primitive opennebula_p systemd:opennebula \
   op monitor interval=60s timeout=20s \
   op start interval="0" timeout="120s" \
   op stop interval="0" timeout="120s"
  primitive opennebula-sunstone_p systemd:opennebula-sunstone \
   op monitor interval=60s timeout=20s \
   op start interval="0" timeout="120s" \
   op stop interval="0" timeout="120s"
  primitive opennebula-novnc_p systemd:opennebula-novnc \
   op monitor interval=60s timeout=20s \
   op start interval="0" timeout="120s" \
   op stop interval="0" timeout="120s"
  group Opennebula_HA ClusterIP opennebula_p opennebula-sunstone_p opennebula-novnc_p
  exit
Check:
  crm status
  Last updated: Tue Mar 31 16:43:00 2015
  Last change: Tue Mar 31 16:40:22 2015 via cibadmin on kosmo-virt1
  Stack: corosync
  Current DC: kosmo-virt2 (2) - partition with quorum
  Version: 1.1.10-32.el7_0.1-368c726
  3 Nodes configured
  4 Resources configured
  Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3 ]
  Resource Group: Opennebula_HA
      ClusterIP              (ocf::heartbeat:IPaddr2):       Started kosmo-virt1
      opennebula_p           (systemd:opennebula):           Started kosmo-virt1
      opennebula-sunstone_p  (systemd:opennebula-sunstone):  Started kosmo-virt1
      opennebula-novnc_p     (systemd:opennebula-novnc):     Started kosmo-virt1
==== Configuring OpenNebula ====
http://active_node:9869 is the web management interface.
Via the web interface:
1. Create a cluster.
2. Add the hosts (using the 192.168.14.0 network).
From the console:
3. Add a network (su oneadmin):
  cat << EOT > def.net
  NAME = "Shared LAN"
  TYPE = RANGED
  # Now we'll use the host private network (physical)
  BRIDGE = nab1
  NETWORK_SIZE = C
  NETWORK_ADDRESS = 192.168.14.0
  EOT
  onevnet create def.net
4. Create the rbd image datastore (su oneadmin):
  cat << EOT > rbd.conf
  NAME = "cephds"
  DS_MAD = ceph
  TM_MAD = ceph
  DISK_TYPE = RBD
  POOL_NAME = one
  BRIDGE_LIST = "192.168.14.42 192.168.14.43 192.168.14.44"
  CEPH_HOST = "172.19.254.1:6789 172.19.254.2:6789 172.19.254.3:6789"
  CEPH_SECRET = "cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5"   # uuid key, see the libvirt authentication for Ceph above
  CEPH_USER = oneadmin
  EOT
  onedatastore create rbd.conf
5. Create the system ceph datastore. Check the last datastore id number, N:
  onedatastore list
On all nodes create the directory and mount CephFS:
  mkdir /var/lib/one/datastores/N+1
  echo "172.19.254.K:6789:/ /var/lib/one/datastores/N+1 ceph rw,relatime,name=admin,secret=AQB4jxJV8PuhJhAAdsdsdRBkSFrtr0VvnQNljBw==,nodcache 0 0 # see secret in /etc/ceph/ceph.client.admin.keyring" >> /etc/fstab
  mount /var/lib/one/datastores/N+1
where N+1 is the id the new datastore will get and K is the IB address of the current node.
From one node change the permissions:
  chown oneadmin:oneadmin /var/lib/one/datastores/N+1
Create the system ceph datastore (su oneadmin):
  cat << EOT > sys_fs.conf
  NAME = system_ceph
  TM_MAD = shared
  TYPE = SYSTEM_DS
  EOT
  onedatastore create sys_fs.conf
6. Add the hosts, vnets and datastores to the created cluster via the web interface.
==== HA VM ====
Here is the official [[http://docs.opennebula.org/4.12/advanced_administration/high_availability/ftguide.html|doc]].\\
One comment: I am using migrate instead of the recreate command.
/etc/one/oned.conf:
  HOST_HOOK = [
      name      = "error",
      on        = "ERROR",
      command   = "host_error.rb",
      arguments = "$HID -m",
      remote    = no ]
==== BACKUP ====
A few words about backup.\\
**Use the persistent image type for this scheme to work.**
For backup a single Linux server, kosmo-arch (a Ceph client), is used with [[http://zfsonlinux.org/|ZFS on Linux]] installed. Deduplication is turned on for the zpool (remember that deduplication requires about 2 GB of RAM per 1 TB of stored data).
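A minimal sketch of the pool setup on kosmo-arch; the pool name rbdback and the device /dev/sdb are assumptions, and the mountpoint /rbdback matches the path used by the backup script below:
  # create the pool mounted at /rbdback and switch deduplication on
  zpool create -m /rbdback rbdback /dev/sdb
  zfs set dedup=on rbdback
  # verify
  zfs get dedup rbdback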
An example of a simple script started by cron:\\
  #!/bin/sh
  currdate=`/bin/date +%Y-%m-%0e`
  olddate=`/bin/date --date="60 days ago" +%Y-%m-%0e`
  imagelist="one-21"   # space delimited list of images to back up
  for i in $imagelist
  do
    # check whether today's snapshot and the 60-day-old snapshot already exist
    snapcurchk=`/usr/bin/rbd snap ls one/$i | grep $currdate`
    snapoldchk=`/usr/bin/rbd snap ls one/$i | grep $olddate`
    if test -z "$snapcurchk"
    then
      /usr/bin/rbd snap create --snap $currdate one/$i
      /usr/bin/rbd export one/$i@$currdate /rbdback/$i-$currdate
    else
      echo "current snapshot exists"
    fi
    if test -z "$snapoldchk"
    then
      echo "old snapshot doesn't exist"
    else
      /usr/bin/rbd snap rm one/$i@$olddate
      /bin/rm -f /rbdback/$i-$olddate
    fi
  done
Use the onevm utility or the web interface (see the template) to find out which image is assigned to a VM:
  onevm list
  onevm show "VM_ID" -a | grep IMAGE_ID
==== PS ====
Don't forget to change the VM storage driver to vda (virtio) ([[http://www.linux-kvm.org/page/WindowsGuestDrivers/Download_Drivers|drivers for Windows]]). Without it you will face low IO performance (no more than 100 MB/s).\\
I saw 415 MB/s with the virtio drivers.
==== About the author ====
[[https://www.linkedin.com/pub/alexey-vyrodov/59/976/16b|Profile]] of the author.
==== Links. ====
1. [[http://docs.opennebula.org/4.12/|Official OpenNebula documentation]]\\
2. [[http://ceph.com/docs/master/|Official Ceph documentation]]\\
3. [[http://fabianpeter.de/cloud/owncloud-migrating-from-sqlite-to-mysql/|Converting SQLite to MySQL]]\\
4. [[http://vadikgo.tumblr.com/post/34325489321/convert-an-opennebula-db-from-sqlite-to-mysql|Converting an OpenNebula DB from SQLite to MySQL]]\\
5. [[https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/ch-pcsd-HAAR.html|HA on RHEL 7]]\\
6. [[http://clusterlabs.org/doc/|Clusters with crmsh]]