MySQL Cluster Setup (beyond the sanity)



alexander


Posted 22 December 2009 - 02:41 PM

Yes, it is time for another crazy tutorial from alex.

Now this is not yet an ideal cluster setup; hell, technically it's not even fully HA (Highly Available), but it's close, and it will get there, everything in due time. This was, however, a large step forward, and a rather unusual setup which I should document :lol:

So what we have is 3 boxes: 2 dual quad-cores with 16GB of RAM each, which will be used as DATA/API nodes, and one dual dual-core utility box (blade) for the management server (remember that in a MySQL cluster the management server is needed to start up, shut down and back up the cluster). Let's give them initial IPs in our network: management - 10.0.0.2, data/api - 10.0.0.10 and 10.0.0.11.

Those, of course, are the internally available IPs on your internal network, and while they will be used, that network is not the ideal place to set up your cluster, if only for security reasons. MySQL cluster, you see, does not encrypt its management traffic, so anyone who gained access to your internal network (it may be easier or harder, but it is nearly never undoable) would be able to spoof commands to your server, and that is, let's say, less than ideal.

So we have a couple of options: we can separate them into a different network segment via VLANs and port security; we could throw another switch in the middle, but remember that for full failover this means we should throw in two switches; or we could use direct links for the back end, something like a crossover cable. "But that doesn't solve your problem," you say, "you can connect 3 boxes that way, but it's not very practical, and only if you have enough interfaces." Well, we do on the big boxes: at 6 gigabit interfaces apiece they are a force to be reckoned with. But the blade has 2 ports total, and ideally you would want both of them connected to the main network in an automatic failover configuration (remember we are looking for performance and redundancy). And what if you really don't want to shell out another 4 grand to get 2 Cisco switches for just these boxes?

Well, there is a solution: if you could secure communication between the 3 boxes on the 10 network and connect the data nodes over a crossover, you could get away with not using the switches. The question remains, how does one secure it? This is where I had an idea about using a VPN tunnel (well, rather two). It won't be a performance hit on any of the 3 boxes, it will allow me to use the 10 network and create a LAN that nobody will be able to see or use, and it encrypts my management traffic... Wondrous, let's do it ;)

So the proposed setup looks like this

We have 3 boxes on 10/8 network

10.0.0.2 - management node, will run the vpn server creating a 192.168.0.0/24 network and will thus become 192.168.0.1
10.0.0.10 - first data/api node, will be connecting to the 192.168.0.0/24 network and will become 192.168.0.whatever, probably 6 (vpns work kinda weird like that). And it will be crossover plugged into the .11 box using a 172.16.0.2 ip
10.0.0.11 - second data/api node, will be connecting to the 192.168.0.0/24 network and will become 192.168.0.whatever, probably 10. And it will be crossover plugged into the .10 box using a 172.16.0.3 ip

Each box has CentOS installed (you can use your distro of choice, but bear in mind that some locations or processes will change slightly based on your distro).


So, lets break this task into 3 steps and go from the least to the most involved one

Step one, setting up the crossover network.
Step two, setting up the vpn
Step three, setting up MySQL cluster across all that.

Note: for any change notation later on, if a line starts with -- it means remove it, if a line starts with ++ it means add it, and lines with nothing preceding them are there as guides to show where the change goes (this is not unlike diff)
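
To make the notation concrete, here is a minimal sketch (an illustration only, not a tool this tutorial depends on) that reads a snippet written in the --/++ notation and prints what the file should contain afterwards. The function name show_result is made up for the example.

```shell
#!/bin/bash
# show_result: hypothetical helper. Reads the --/++ notation on stdin and
# prints the lines as they should look once the change has been made:
# "--" lines are dropped, the "++" prefix is stripped, guide lines pass through.
show_result() {
    sed -e '/^--/d' -e 's/^++//'
}

printf '%s\n' 'DEVICE=bond0' '--IPADDR=172.16.0.2' '++IPADDR=172.16.0.3' | show_result
```

It is only meant for eyeballing a change before you type it in; it does not touch any file.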

Step one:


Let's set up a redundant crossover link between the 10.0.0.10 and 10.0.0.11 servers. Plug 2 network cables into NIC ports of your choice; I would recommend using 2 different NICs, and if you are choosing them, go with the Intel ones, they have proven to me to be usually the most reliable on the nix platform (for drivers, support, complying with standards). That said, here's how you can set up bonding:

in /etc/modprobe.conf add a line that reads
alias bond0 bonding

# cd /etc/sysconfig/network-scripts


lets edit ifcfg-bond0 (use your favorite editor) on 10.0.0.10
# emacs ifcfg-bond0

DEVICE=bond0
IPADDR=172.16.0.2
NETMASK=255.255.255.0
NETWORK=172.16.0.0
BROADCAST=172.16.0.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="mode=active-backup arp_interval=500 arp_ip_target=172.16.0.3"
for 10.0.0.11 change:
--IPADDR=172.16.0.2
++IPADDR=172.16.0.3
--BONDING_OPTS="mode=active-backup arp_interval=500 arp_ip_target=172.16.0.3"
++BONDING_OPTS="mode=active-backup arp_interval=500 arp_ip_target=172.16.0.2"
(in emacs to save press Ctrl+x Ctrl+s, to exit press Ctrl+x Ctrl+c)

now then we need to modify the ifcfg interface files for the interfaces, but first a side-note, if you want to see the status of a link on the interface use ethtool to get information, so something like
# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbag
Wake-on: d
Current message level: 0x00000001 (1)
Link detected: yes

From this we can tell that we do indeed have a link (trust me, this is handy when you are working over ssh and aren't sure which interfaces got plugged in exactly).
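
If you have a handful of ports to check, the same information can be pulled out in a loop. This is a rough sketch: the interface names eth0 through eth5 are an assumption based on the six-port boxes described here, and link_state is a made-up helper name.

```shell
#!/bin/bash
# link_state: read `ethtool ethX` output on stdin and print the value
# of the "Link detected" line (yes or no).
link_state() {
    awk -F': ' '/Link detected/{print $2}'
}

# Report the link state of each port (needs root and ethtool installed).
if command -v ethtool >/dev/null 2>&1; then
    for dev in eth0 eth1 eth2 eth3 eth4 eth5; do
        echo "$dev: $(ethtool "$dev" 2>/dev/null | link_state)"
    done
fi
```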

Let's use eth0 and eth5 for this link; for me they are the first port on the dual Intel card and the last port on the quad Broadcom card
# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbag
Wake-on: d
Current message level: 0x00000001 (1)
Link detected: yes

# ethtool eth5
Settings for eth5:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: d
Link detected: yes

# ethtool -i eth0
driver: e1000e
version: 1.0.2-k2
firmware-version: 5.6-2
bus-info: 0000:04:00.0

# ethtool -i eth5
driver: bnx2
version: 2.0.1
firmware-version: 4.6.4 NCSI 1.0.6
bus-info: 0000:02:00.1

So from here lets edit ifcfg-eth0 and ifcfg-eth5
# emacs ifcfg-eth5

original:
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth5
ONBOOT=no
BOOTPROTO=none
USERCTL=no
HWADDR=00:11:22:33:44:55
finished:
# Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
DEVICE=eth5
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
HWADDR=00:11:22:33:44:55
ETHTOOL_OPTS="speed 1000 duplex full autoneg on"
So you will need MASTER=bond0 and SLAVE=yes, as well as ONBOOT set to yes (otherwise the card won't come up). The ETHTOOL_OPTS line at the end makes sure the link between the cards runs at 1Gig full duplex; that part is VERY optional. Do the same for ifcfg-eth0. Also make sure the hardware address there is correct; if it's missing or wrong, you should consider using this script:

#!/bin/bash

function show_help
{
    echo -e "This will change mac addresses in /etc/sysconfig/network-scripts/ifcfg-ethX to the corresponding hardware interfaces\n\nOptions:\n-t  test run, will display the changes\n-f  will actually write the changes\n-h  will display this help\n-j  the just do it option, no confirmation, outputs done when done"
    exit
}

# safe, but doesnt fix things
function safe_run
{
    for a in `ifconfig -a | awk '/eth/{print $1}'`; do
    echo "Found ${a} device"
    if [ -f /etc/sysconfig/network-scripts/ifcfg-${a} ]; then
        echo -e "Found ifcfg script correlating to the ${a} device\nCorrected file:\n"
        if grep -qi hwaddr /etc/sysconfig/network-scripts/ifcfg-${a}; then
            echo -e "\nFound section to correct"
            # print the file with the stored MAC swapped for the one the kernel reports
            sed "s/\([0-9A-F]\{2\}:\)\{5\}[0-9A-F]\{2\}/$(ifconfig ${a} | awk '/HWaddr/{print $5}')/" /etc/sysconfig/network-scripts/ifcfg-${a}
            echo -e "\n\n"
        else
            echo -e "\nDidn't find the HWADDR section, generating one"
            echo -e "$(cat /etc/sysconfig/network-scripts/ifcfg-${a})\nHWADDR=$(ifconfig ${a} | awk '/HWaddr/{print $5}')"
            echo -e "\n\n"
        fi
    else
        ethdevice=`ethtool -i ${a} | awk '/bus-info/{split($2,b,":"); print b[2]":"b[3]}'`
        echo -e "\nWarning: Could not find an ifcfg file correlating to the ${a} device\nPlease make sure your distro uses these files if you want to run the fix\n\nThe script will create a default config file containing:\n"
        echo -e "# $(lspci | grep ${ethdevice} | sed "s/${ethdevice} Ethernet controller: //")\nDEVICE=${a}\nBOOTPROTO=dhcp\nHWADDR=$(ifconfig ${a} | awk '/HWaddr/{print $5}')\nONBOOT=no"
        echo -e "\n\n"
    fi
    done
exit
}

# dangerous, but does fix things
function fix_run
{
    while :
    do
    echo "Are you sure, this could break things, you know! Continue? [n]:";
    read yn;
    case $yn in
        "y" ) break;;
        "Y" ) break;;
        * ) exit;;
    esac
    done
       while :
    do
    echo "No i mean like really sure, this really can break things! Continue? [n]:";
    read yn;
    case $yn in
        "y" ) break;;
        "Y" ) break;;
        * ) exit;;
    esac
    done
    for a in `ifconfig -a | awk '/eth/{print $1}'`; do
    echo "Found ${a} device"
    if [ -f /etc/sysconfig/network-scripts/ifcfg-${a} ]; then
        if grep -qi hwaddr /etc/sysconfig/network-scripts/ifcfg-${a}; then
            echo "Correcting the ifcfg script"
            # rewrite the stored MAC in place with the one the kernel reports
            sed -i "s/\([0-9A-F]\{2\}:\)\{5\}[0-9A-F]\{2\}/$(ifconfig ${a} | awk '/HWaddr/{print $5}')/" /etc/sysconfig/network-scripts/ifcfg-$a
        else
            echo "Could not find HWADDR section, adding one"
            echo "HWADDR=$(ifconfig ${a} | awk '/HWaddr/{print $5}')" >> /etc/sysconfig/network-scripts/ifcfg-$a
        fi
    else
        echo -e "\nWarning: ifcfg file not found, generating a new ifcfg file"
        ethdevice=`ethtool -i ${a} | awk '/bus-info/{split($2,b,":"); print b[2]":"b[3]}'`
        echo -e "# $(lspci | grep ${ethdevice} | sed "s/${ethdevice} Ethernet controller: //")\nDEVICE=${a}\nBOOTPROTO=dhcp\nHWADDR=$(ifconfig ${a} | awk '/HWaddr/{print $5}')\nONBOOT=no" > /etc/sysconfig/network-scripts/ifcfg-$a
    fi
    done
exit
}

# very dangerous, but not quite as annoying
function just_do_it
{
    for a in `ifconfig -a | awk '/eth/{print $1}'`; do
    if [ -f /etc/sysconfig/network-scripts/ifcfg-${a} ]; then
        if grep -qi hwaddr /etc/sysconfig/network-scripts/ifcfg-${a}; then
            sed -i "s/\([0-9A-F]\{2\}:\)\{5\}[0-9A-F]\{2\}/$(ifconfig ${a} | awk '/HWaddr/{print $5}')/" /etc/sysconfig/network-scripts/ifcfg-$a
        else
            echo "HWADDR=$(ifconfig ${a} | awk '/HWaddr/{print $5}')" >> /etc/sysconfig/network-scripts/ifcfg-$a
        fi
    else
        ethdevice=`ethtool -i ${a} | awk '/bus-info/{split($2,b,":"); print b[2]":"b[3]}'`
        echo -e "# $(lspci | grep ${ethdevice} | sed "s/${ethdevice} Ethernet controller: //")\nDEVICE=${a}\nBOOTPROTO=dhcp\nHWADDR=$(ifconfig ${a} | awk '/HWaddr/{print $5}')\nONBOOT=no" > /etc/sysconfig/network-scripts/ifcfg-$a
    fi
    done
    echo "Done"
exit
}

if [ $# != 1 ]; then
    show_help
fi

case $1
  in
    -t)
      safe_run
    ;;

    -f)
      fix_run
    ;;
    
    -j)
        just_do_it
    ;;

    -h)
    show_help
    ;;

    *)
      show_help
    ;;
  esac
save it to a file (optionally ending in .sh, so you know it's a shell script); I'd recommend using root's home directory, /root
# chmod 700 /root/file.sh
# /root/./file.sh -t

this will tell you all the changes the script would perform, without actually performing them... use -f option to run it if you are satisfied with changes...
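
If you only want to check a single interface by hand, something along these lines would do. It is a sketch, not part of the script above: the helper names are made up, eth5 is just the example interface from this step, and it assumes your kernel exposes the MAC under /sys/class/net/<dev>/address.

```shell
#!/bin/bash
# hwaddr_in_file: print the HWADDR value from an ifcfg file (prints nothing if absent).
hwaddr_in_file() {
    sed -n 's/^[Hh][Ww][Aa][Dd][Dd][Rr]=//p' "$1"
}

# same_mac: succeed if two MAC addresses are equal, ignoring case.
same_mac() {
    [ "$(echo "$1" | tr 'a-f' 'A-F')" = "$(echo "$2" | tr 'a-f' 'A-F')" ]
}

dev=eth5
cfg=/etc/sysconfig/network-scripts/ifcfg-$dev
if [ -f "$cfg" ] && [ -r "/sys/class/net/$dev/address" ]; then
    if same_mac "$(hwaddr_in_file "$cfg")" "$(cat "/sys/class/net/$dev/address")"; then
        echo "$dev: HWADDR matches the hardware"
    else
        echo "$dev: HWADDR differs or is missing"
    fi
fi
```

Run the comparison before the bond is brought up, since an enslaved interface may report the bond's MAC instead of its own.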

the only thing that is left to do is to bring this bond up:

# modprobe bonding
# /etc/init.d/network restart
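
After the restart, a quick way to see which slave the bond is actually using is to read the kernel's bonding status file. A minimal sketch, assuming the bonding driver exposes /proc/net/bonding/bond0 (it normally does once the module is loaded); the function name active_slave is made up for this example.

```shell
#!/bin/bash
# active_slave: print the "Currently Active Slave" value from a bonding
# status file. Defaults to the live bond0 status; a path can be passed instead.
active_slave() {
    awk -F': ' '/^Currently Active Slave/{print $2}' "${1:-/proc/net/bonding/bond0}"
}

# Only query the live bond if the status file exists on this box.
if [ -r /proc/net/bonding/bond0 ]; then
    echo "active slave: $(active_slave)"
fi
```

Pull one cable and run it again: in active-backup mode the reported slave should flip to the other interface.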

Hopefully everything comes up correctly. You should now be able to ping from 172.16.0.2 to 172.16.0.3 and vice versa; if you can't, make sure your bond came up correctly, dmesg hasn't spewed errors, and lastly that the routes are set correctly (or are set at all)

# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.0.0.0 * 255.255.255.0 U 0 0 0 eth2
172.16.0.0 * 255.255.255.0 U 0 0 0 bond0
169.254.0.0 * 255.255.0.0 U 0 0 0 eth2
default 10.0.0.1 0.0.0.0 UG 0 0 0 eth2

Note if the 172.16.0.0 route is not displayed, add it
# route add -net 172.16.0.0 netmask 255.255.255.0 bond0
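
The check-then-add logic can be scripted so it is safe to run more than once. This is a sketch under the assumption that `route -n` prints the network in the first column and the interface in the last, as in the table above; has_route is a made-up helper name, and the script only prints the command rather than running it.

```shell
#!/bin/bash
# has_route: read `route -n` output on stdin; succeed if a route for
# network $1 via interface $2 is listed.
has_route() {
    awk -v net="$1" -v dev="$2" '$1 == net && $NF == dev {found=1} END {exit !found}'
}

if command -v route >/dev/null 2>&1; then
    if route -n | has_route 172.16.0.0 bond0; then
        echo "172.16.0.0 route is present"
    else
        echo "missing, run: route add -net 172.16.0.0 netmask 255.255.255.0 bond0"
    fi
fi
```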

If you are still not able to ping, make sure that your firewall rule allows it, but since i will deal with this particular bit later on, lets do
# iptables -F
# iptables -t nat -F


and if the pings still don't work, I'd say you did something wrong (like set a gateway for the 172.16 network, or your bond is not working correctly, or something...)


Step two:


The vpn stage. Actually its not very dissimilar to the above, but its a little bit more involved

first lets set up 10.0.0.2

# yum install openvpn
# cd /etc/openvpn/
# cp -R /usr/share/doc/openvpn-2.0.9/easy-rsa/ /etc/openvpn/
# cd /etc/openvpn/easy-rsa/
# chmod 770 *
# mkdir /etc/openvpn/keys
# emacs ../vars

edit the last bit to match your specifications at the bottom of the file and also change:
--export KEY_DIR=$D/keys
++export KEY_DIR=/etc/openvpn/keys
NOTE: this next step is very important
# . ../vars

# ./clean-all
# ./build-ca
# ./build-key-server server

yes you want to add the password and all that fun stuff

finally build a client certificate for each client (in our case we need 2 certs, one for each data/api box)
# ./build-key client1
# ./build-key client2
# ./build-dh
# cd /etc/openvpn
# emacs openvpn.conf


port 1194
proto tcp
dev tun
ca keys/ca.crt
cert keys/server.crt
key keys/server.key
dh keys/dh1024.pem
server 192.168.0.0 255.255.255.0
client-config-dir ccd
#########
######### Put your Public DNS Servers here
#########
push "dhcp-option DNS 10.0.0.100"
push "dhcp-option DNS 10.0.0.101"

ifconfig-pool-persist ipp.txt
push "redirect-gateway"
keepalive 10 120
comp-lzo
persist-key
persist-tun
status server-tcp.log
now lets create a couple of empty files for proper operation
# touch server-tcp.log
# touch ipp.txt

now lets start this sucker up
# /etc/init.d/openvpn start

make sure it starts cleanly

now lets scp the necessary files to the necessary servers:
# cd /etc/openvpn/keys
# scp ca.crt client1.csr client1.key client1.crt user@10.0.0.10:/tmp
# scp ca.crt client2.csr client2.key client2.crt user@10.0.0.11:/tmp
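
Before copying, it can save a round trip to confirm that the files each client actually needs (the CA cert, its own cert and its own key) exist and are not empty. A small sketch; missing_files is a made-up helper, and the file names are the ones produced by the build-key commands above.

```shell
#!/bin/bash
# missing_files: print every file name from the argument list that is
# absent or empty in directory $1.
missing_files() {
    local dir=$1 f
    shift
    for f in "$@"; do
        [ -s "$dir/$f" ] || echo "$f"
    done
}

if [ -d /etc/openvpn/keys ]; then
    missing_files /etc/openvpn/keys ca.crt client1.crt client1.key client2.crt client2.key
fi
```

No output means everything is in place. The .csr files are not used by the client config further down, so they are left out of the check.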


good, now lets go into .10 and make it connect to this tunnel
# yum install openvpn
# cd /tmp
# mv ca.crt client1.csr client1.key client1.crt /etc/openvpn/
# cd !$

Note: thats a history command, but if you did something in between, use cd /etc/openvpn

# emacs client.conf

client
dev tun
proto tcp

#Change to your public domain or IP address of the server
remote 10.0.0.2 1194
resolv-retry infinite
nobind
persist-key
persist-tun


ca ca.crt
cert client1.crt
key client1.key

ns-cert-type server

#DNS Options here, CHANGE THESE !!
push "dhcp-option DNS 10.0.0.100"
push "dhcp-option DNS 10.0.0.101"

comp-lzo

verb 3
Now to prevent you from a lot of frustration later, lets do this too:

# emacs /etc/init.d/openvpn

change:
openvpn=""
openvpn_locations="/usr/sbin/openvpn /usr/local/sbin/openvpn"
++openvpn_opts="--route-noexec"

for c in `/bin/ls *.conf 2>/dev/null`; do
bn=${c%%.conf}
if [ -f "$bn.sh" ]; then
. $bn.sh
fi
rm -f $piddir/$bn.pid
--$openvpn --daemon --writepid $piddir/$bn.pid --config $c --cd $work
++$openvpn --daemon $openvpn_opts --writepid $piddir/$bn.pid --config $c --cd $work
After those changes, which will prevent openvpn from automagically creating routes (which will work, but not how we want) you should be ok to start the client
# /etc/init.d/openvpn start

Make sure it starts, if you see the tun0 interface in ifconfig, then hopefully its all good. To use this correctly, let's add a route:
# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.0.9 * 255.255.255.255 UH 0 0 0 tun0
10.0.0.0 * 255.255.255.0 U 0 0 0 eth2
172.16.0.0 * 255.255.255.0 U 0 0 0 bond0
169.254.0.0 * 255.255.0.0 U 0 0 0 eth2
default 10.0.0.1 0.0.0.0 UG 0 0 0 eth2

Notice the 192.168.0.9 route and do
# route add -net 192.168.0.0 netmask 255.255.255.0 gw 192.168.0.9 dev tun0

Now your routes should look like this
# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.0.9 * 255.255.255.255 UH 0 0 0 tun0
192.168.0.0 192.168.0.9 255.255.255.0 UG 0 0 0 tun0
10.0.0.0 * 255.255.255.0 U 0 0 0 eth2
172.16.0.0 * 255.255.255.0 U 0 0 0 bond0
169.254.0.0 * 255.255.0.0 U 0 0 0 eth2
default 10.0.0.1 0.0.0.0 UG 0 0 0 eth2

And finally you should be able to ping 192.168.0.1, if not, please refer to the openvpn documentation. Once you have both servers working it is time to proceed to the next step.
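
Since the peer address differs per client (192.168.0.9 in the example above), it can be read from the tun0 interface instead of typed by hand. This sketch assumes the older net-tools ifconfig output that CentOS ships, where the point-to-point peer shows up as P-t-P:<address>; tun_peer is a made-up name, and the script only prints the route command for you to review.

```shell
#!/bin/bash
# tun_peer: read `ifconfig tun0` output on stdin and print the P-t-P peer address.
tun_peer() {
    sed -n 's/.*P-t-P:\([0-9.]*\).*/\1/p'
}

if ifconfig tun0 >/dev/null 2>&1; then
    peer=$(ifconfig tun0 | tun_peer)
    echo "route add -net 192.168.0.0 netmask 255.255.255.0 gw $peer dev tun0"
fi
```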


Step three:

Actually I lied in the beginning, this was the hardest stage to figure out, but its quite simple to implement :) Yaaay, aren't you happy?

OK, so firstly we need a MySQL binary distribution. Why a binary distribution? Well, source works, but I am too lazy to compile it, and binary works almost as well; plus MySQL certifies and stands behind their binary distribution, and it's easy to install, upgrade and downgrade...

So you need to download the latest MySQL Cluster package. NOTE: the normal MySQL distribution will NOT work, remove it from the box if you have it installed prior to proceeding! Navigate with your browser of choice (as long as it's not IE; if you are using IE, give up on this tutorial now and spare me the frustration of thinking about someone using IE to execute this pretty hardcore Nix-based tutorial) to MySQL :: Download MySQL Cluster, and download the latest release :) (note: when the login page comes up, just hit the "take me to the download" link). After the download completes, scp the archive over to the /tmp folder of .10 and .11 (note: you don't need to actually copy it to .2, just keep on reading)

Now get into .10 (and you'll have to just repeat these steps for .11)
# cd /usr/local
# mv /tmp/mysql-cluster-gpl-7.0.9-linux-x86_64-glibc23.tar.gz .
# tar xvzf mysql-cluster-gpl-7.0.9-linux-x86_64-glibc23.tar.gz
# ln -s mysql-cluster-gpl-7.0.9-linux-x86_64-glibc23 mysql
# cd mysql
# groupadd mysql
# useradd -g mysql mysql
# scripts/mysql_install_db --user=mysql
# chown -R root .
# chown -R mysql data
# chgrp -R mysql .
# cp support-files/mysql.server /etc/rc.d/init.d/mysql
# chmod +x /etc/rc.d/init.d/mysql
# mkdir /var/lib/mysql-cluster


This you should do only on one of the servers (.10 or .11)
# scp bin/ndb_mg* user@10.0.0.2:/tmp

For the .2 server
# mv /tmp/ndb_mg* /usr/bin
# mkdir /var/lib/mysql-cluster
# cd !$ (or optionally cd /var/lib/mysql-cluster)
# emacs config.ini



# Options affecting ndbd processes on all data nodes:
[ndbd default]
NoOfReplicas=2    # Number of replicas
DataMemory=12G    # How much memory to allocate for data storage
IndexMemory=2G    # How much memory to allocate for index storage

# Management Server
[ndb_mgmd]
Id=1
HostName=192.168.0.1

# Data Nodes
[ndbd]
Id=10
HostName=192.168.0.6
[ndbd]
Id=11
HostName=192.168.0.10

#API Nodes
[mysqld]
Id=20
HostName=192.168.0.6
[mysqld]
Id=21
HostName=192.168.0.10
[mysqld]
Id=22
HostName=192.168.0.6
[mysqld]
Id=23
HostName=192.168.0.10

#TCP/IP Connections
[tcp]
NodeId1=10
NodeId2=11
HostName1=172.16.0.3
HostName2=172.16.0.2
[tcp]
NodeID1=20
NodeID2=11
HostName1=172.16.0.3
HostName2=172.16.0.2
[tcp]
NodeID1=21
NodeID2=10
HostName1=172.16.0.2
HostName2=172.16.0.3
[tcp]
NodeID1=22
NodeID2=11
HostName1=172.16.0.3
HostName2=172.16.0.2
[tcp]
NodeID1=23
NodeID2=10
HostName1=172.16.0.2
HostName2=172.16.0.3
[tcp]
NodeID1=20
NodeID2=10
HostName1=172.16.0.3
HostName2=172.16.0.3
[tcp]
NodeID1=21
NodeID2=11
HostName1=172.16.0.2
HostName2=172.16.0.2
[tcp]
NodeID1=22
NodeID2=10
HostName1=172.16.0.3
HostName2=172.16.0.3
[tcp]
NodeID1=23
NodeID2=11
HostName1=172.16.0.2
HostName2=172.16.0.2
OK, it's time for an explanation. The first part is straightforward: change the values to reflect your hardware and your needs; don't set the values too high, remember mysqld and the OS need some resources too ;). The ndb_mgmd section is straightforward as well, define the management server IP in there. The ndbd sections describe the data nodes (ndbd processes). That part is here primarily to tell the rest of the cluster how to reach them, but all in due time.

Next is the mysqld part. Why are there 4 sections there? Because if you want to do connection pooling, and say use 2 for the pool, each mysqld will hold 2 connections to the cluster, each capable of processing its queries nearly in parallel with the other, which lets your cluster put through more queries in parallel. Each of those connections needs its own section and node id :)

Then the conflustering TCP section. A tcp section can define an alternate channel for two nodes to communicate over, so the first entry tells the data nodes how to talk to each other. The rest (8) tell the mysqld processes how to talk to each one of the data nodes (ndbd), even the data node located on the local box, otherwise it will try to use tun0, and you don't want that :) HostName1 and HostName2 have to be the bond0 addresses of the boxes that NodeId1 and NodeId2 actually run on, so check them against your own layout. I am still thinking that I can probably point the local ones to 127.0.0.1, but I have not tested it, so no clue how it will actually work...
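
Typing nine nearly identical [tcp] blocks by hand is an easy place to swap an address, so here is a sketch that prints them instead. The node-to-address mapping in host_of is copied from the config.ini above and is an assumption about your layout; both function names are made up for the example.

```shell
#!/bin/bash
# tcp_section: print one [tcp] block for a pair of nodes.
tcp_section() {
    printf '[tcp]\nNodeId1=%s\nNodeId2=%s\nHostName1=%s\nHostName2=%s\n' "$1" "$2" "$3" "$4"
}

# host_of: bond0 address of the box each node id runs on.
# Hypothetical mapping taken from the config above; adjust to your setup.
host_of() {
    case $1 in
        10|20|22) echo 172.16.0.3 ;;
        11|21|23) echo 172.16.0.2 ;;
    esac
}

# One block for the data node pair, then one per (API node, data node) pair.
tcp_section 10 11 "$(host_of 10)" "$(host_of 11)"
for api in 20 21 22 23; do
    for data in 10 11; do
        tcp_section "$api" "$data" "$(host_of "$api")" "$(host_of "$data")"
    done
done
```

The output can be pasted at the end of config.ini; compare it against what you have before trusting either.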

So at this point .2 is pretty much all set (see the notes on firewall config)
# cd /var/lib/mysql-cluster
# ndb_mgmd -f config.ini --initial


NOTE: only use initial when you make changes to the config file
# ndb_mgm

Useful commands for the management console:
help - use it
show - show status
shutdown - proper way to shut down the cluster
10 status - show status of an individual node

Now lets navigate back to .10 (and .11)
# emacs /etc/my.cnf

# Options for mysqld process:
[mysqld]
ndbcluster                      # run NDB storage engine
ndb-connectstring=192.168.0.1  # location of management server
ndb-cluster-connection-pool=2  # number of cluster connections this mysqld opens (each needs its own [mysqld] section)
port=3306
log-error=/var/log/mysql.err.log
log-bin

# Options for ndbd process:
[mysql_cluster]
ndb-connectstring=192.168.0.1  # location of management server
Now lets start the data nodes:
# cd /var/lib/mysql-cluster
# ndbd --initial


NOTE: only use initial when actually initializing nodes or changing configuration, ideally this should be the only time you use --initial for this part

If everything is good, in the ndb_mgm console, show should after a while (less than a minute) show you that it successfully started nodes 10 and 11.
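
If you would rather not stare at the console, the same check can be done from the shell on .2. A sketch, assuming the show output marks absent nodes with the words "not connected" on the id= line; not_connected is a made-up helper name.

```shell
#!/bin/bash
# not_connected: read `ndb_mgm -e show` output on stdin and print the ids
# of nodes reported as "not connected".
not_connected() {
    sed -n 's/^id=\([0-9]*\).*not connected.*/\1/p'
}

if command -v ndb_mgm >/dev/null 2>&1; then
    ndb_mgm -e show | not_connected
fi
```

An empty result means every configured node has at least connected; it does not tell you whether a data node has finished starting, so still look at show itself for that.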

Places to look for clues: the ndb log in the .2 server in the /var/lib/mysql-cluster directory

Once both servers start correctly, you can start the mysqld processes on .10 and .11
# /etc/init.d/mysql start

Now in your management node show command should output something like:
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=10 @192.168.0.6 (mysql-5.1.39 ndb-7.0.9, Nodegroup: 0, Master)
id=11 @192.168.0.10 (mysql-5.1.39 ndb-7.0.9, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.0.1 (mysql-5.1.39 ndb-7.0.9)

[mysqld(API)] 4 node(s)
id=20 @172.16.0.3 (mysql-5.1.39 ndb-7.0.9)
id=21 @172.16.0.2 (mysql-5.1.39 ndb-7.0.9)
id=22 @172.16.0.3 (mysql-5.1.39 ndb-7.0.9)
id=23 @172.16.0.2 (mysql-5.1.39 ndb-7.0.9)

If it's so, then wonderful, if not, you messed up the config somewhere, or some network link isn't up, or some route is set incorrectly, start searching. You do have more logs to look at:
on .2 there is the ndb cluster log
on .10 and .11 there are data node logs in /var/lib/mysql-cluster and there is a mysql error log in /var/log

Use them to your advantage to track down issues.

When this is all set, you can use mysql (at this point there is no password for root user)

# mysql -u root
mysql> SHOW STATUS LIKE 'NDB%';
+--------------------------------+-------------+
| Variable_name | Value |
+--------------------------------+-------------+
| Ndb_cluster_node_id | 20 |
| Ndb_config_from_host | 192.168.0.1 |
| Ndb_config_from_port | 1186 |
| Ndb_number_of_data_nodes | 2 |
| Ndb_number_of_ready_data_nodes | 2 |
| Ndb_connect_count | 0 |
| Ndb_execute_count | 0 |
| Ndb_scan_count | 0 |
| Ndb_pruned_scan_count | 0 |
| Ndb_cluster_connection_pool | 2 |
| Ndb_conflict_fn_max | 0 |
| Ndb_conflict_fn_old | 0 |
+--------------------------------+-------------+
12 rows in set (0.00 sec)

mysql> create database foo;

mysql> use foo;


That needs to happen on both boxes, now to test the cluster on one server do:

mysql> create table test1 (i int) engine=ndbcluster;
Query OK, 0 rows affected (0.54 sec)

Now go to the other server and do
mysql> show tables;
+---------------+
| Tables_in_foo |
+---------------+
| test1 |
+---------------+
1 row in set (0.03 sec)

And that pretty much tells you that your cluster is operating correctly, you can from the second box do
mysql> insert into test1 (i) values (123);
Query OK, 1 row affected (0.01 sec)

now go back to the server you started from
mysql> select * from test1;
+------+
| i |
+------+
| 123 |
+------+
1 row in set (0.00 sec)

If such is the case, your cluster is operating properly, hooray, you are almost done, just read the firewall note.


Firewall note:

The easiest way I can suggest setting up the firewalls is to block most incoming traffic from 10.* other than exactly what you need, and allow everything from the tun0 and bond0 interfaces and the 192.168 and 172.16 subnets.

If you want to lock it down more, you will need port 1186 open to and from the management node, and in addition you will need 2202 and 2203 for data nodes to communicate, and obviously 3306 for mysqld access from the rest of your network.
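
As a starting point for the easy variant on a data/api box, the rules could look roughly like the output of the sketch below. It only prints iptables commands for review and applies nothing. The SSH port 22 is my assumption and not part of the note above; on the management box there is no bond0, and you would also need the OpenVPN port (1194/tcp in the config from step two) open on the 10 network.

```shell
#!/bin/bash
# print_rules: emit iptables commands for review; nothing is applied.
# Arguments are the TCP ports to leave open to the 10.* network.
print_rules() {
    # trust loopback and the two private links entirely
    echo "iptables -A INPUT -i lo -j ACCEPT"
    echo "iptables -A INPUT -i tun0 -j ACCEPT"
    echo "iptables -A INPUT -i bond0 -j ACCEPT"
    # let replies to our own outgoing connections back in
    echo "iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT"
    for port in "$@"; do
        echo "iptables -A INPUT -s 10.0.0.0/8 -p tcp --dport $port -j ACCEPT"
    done
    # everything else from the 10.* network is dropped
    echo "iptables -A INPUT -s 10.0.0.0/8 -j DROP"
}

print_rules 22 3306
```

Read the printed rules, adjust them, and only then paste them into a root shell; a wrong DROP rule over ssh is a quick way to lock yourself out.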


Enjoy