Linux/Desktop + SDS/MACH11 + iSCSI

This was posted on comp.databases.informix and I thought it would be useful to present it on a blog. I hope the author doesn't mind!

By: cesar_inacio_martins@yahoo.com.br

This is a little cookbook/step-by-step which I tested successfully and want to share with everybody.

It focuses mostly on how to configure Linux; the IDS part is too easy to need a cookbook :)
But if you need information on how to configure Informix SDS, look for the Redbook "Informix Dynamic Server 11: Extending Availability and Replication" on IBM's support site.

Why this?

I don't like to run and study Informix on limited VMware machines, which take a lot of disk space, memory and processor; I prefer real machines. Since I already use Linux on my work desktop and my notebook, this makes everything easier.

Objective:

To configure SDS on IDS 11.50 UC3DE to work across different Linux desktop machines over Ethernet using software iSCSI. (About iSCSI)

Environment:

  • IDS 11.50 UC3 Developer
  • 3 Desktop machines
  • OpenSUSE 11 x86 32-bit on server (disk storage) and node machines (IDS Primary/Secondary)
  • Server OpenSuse + Target iSCSI 0.4.16
  • Node OpenSuse + Open-iSCSI 2.0.869 (initiator only)
  • linux_Sun : The Server machine where the disks are shared
  • lnxEarth : The client/node machine where the SDS Primary runs
  • lnx_Moon : The client/node machine where the SDS Secondary runs

Considerations and limitations:

  • This example is just a development environment, NOT PRODUCTION.
  • I don't know if this environment is supported by IBM for production (Editor's note: almost certainly not!) and I do not recommend it as a production environment.
  • Do not use Target iSCSI version 0.4.15; it has a lot of bugs.
  • On the Server machine, the Target iSCSI (version 0.4.16) does not support RAW access to the physical disks; access is via block devices ( /dev/sda ).
  • For node machines, the I/O throughput on disk access is limited by the Ethernet speed. E.g. over 100 Mbit Ethernet you will get roughly 10 MBytes/sec of I/O transfer.
  • In this example, there has been no tuning and no security configuration of iSCSI. I used the default configuration.
  • On node machines, IDS can access the remote disks as RAW devices (Open-iSCSI + RAW). But remember, this RAW access is local to the iSCSI initiator; the hop over Ethernet still exists.
  • I have not run any performance tests yet to discover which performs better, KAIO or AIO VPs.
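
As a rough check on the Ethernet limit mentioned above: the theoretical ceiling is just the link speed divided by 8 bits per byte, and protocol overhead (TCP/IP and iSCSI headers) pushes the usable figure somewhat lower. Here is a minimal sketch of that arithmetic. The function name link_ceiling_mbytes is only illustrative and is not part of any tool used in this post.

```shell
#!/bin/sh
# Theoretical ceiling in MBytes/s for a link speed given in Mbit/s.
# This ignores TCP/IP and iSCSI overhead, so real throughput will be lower.
link_ceiling_mbytes() {
    awk -v mbit="$1" 'BEGIN { printf "%.1f\n", mbit / 8 }'
}

link_ceiling_mbytes 100     # Fast Ethernet
link_ceiling_mbytes 1000    # Gigabit Ethernet
```

For 100 Mbit Ethernet this gives 12.5 MBytes/s, which is consistent with the 11.5 MB/s that the dd read test later in this post reports.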

Configuring Server Disk Machine:

  • My server machine information:
    | linux_Sun:~ # hostname
    | linux_Sun
    |
    | linux_Sun:~ # ifconfig eth1 | egrep "eth1|inet addr"
    | eth1 Link encap:Ethernet HWaddr 00:02:B3:xx:xx:xx
    | inet addr:172.30.252.222 Bcast:172.30.255.255 Mask:255.255.0.0
    |
    | linux_Sun:~ # lsb_release -dr
    | Description: openSUSE 11.0 (i586)
    | Release: 11.0
    |
    | linux_Sun:~ # ietd --version
    | iscsid version 0.4.16
  • Here, all steps must be executed as root.
  • Create the partitions or cooked files that will serve as the storage area.
    Here I will use the disk partitions sda10 (1GB), sda11 (1GB), sda12 (3GB), sda13 (3GB)

    | linux_Sun:~ # fdisk -l /dev/sda
    |
    | Disk /dev/sda: 80.0 GB, 80026361856 bytes
    | 255 heads, 63 sectors/track, 9729 cylinders
    | Units = cylinders of 16065 * 512 = 8225280 bytes
    | Disk identifier: 0x0000fb3f
    |
    | Device Boot Start End Blocks Id System
    | /dev/sda1 1 262 2104483+ 82 Linux swap / Solaris
    | /dev/sda2 * 263 288 208845 83 Linux
    | /dev/sda3 289 8265 64075252+ f W95 Ext'd (LBA)
    | /dev/sda5 289 941 5245191 83 Linux
    | /dev/sda6 942 1594 5245191 83 Linux
    | /dev/sda7 1595 2247 5245191 83 Linux
    | /dev/sda8 2248 4859 20980858+ 83 Linux
    | /dev/sda9 4860 7291 19535008+ 83 Linux
    | /dev/sda10 7292 7413 979933+ da Non-FS data
    | /dev/sda11 7414 7535 979933+ da Non-FS data
    | /dev/sda12 7536 7900 2931831 da Non-FS data
    | /dev/sda13 7901 8265 2931831 da Non-FS data

  • Configure the Target iSCSI daemon in the file /etc/ietd.conf.
    For more information about the syntax of ietd.conf, see "man ietd.conf".
    This is my ietd.conf:
    | Target iqn.2008-10.galaxy.solar_system:informix.disks
    | Lun 0 Sectors=1000000,Type=nullio
    | Lun 1 Path=/dev/sda10,Type=blockio,ScsiId=0
    | Lun 2 Path=/dev/sda11,Type=blockio,ScsiId=1
    | Lun 3 Path=/dev/sda12,Type=blockio,ScsiId=2
    | Lun 4 Path=/dev/sda13,Type=blockio,ScsiId=3
    | Alias Disks_Informix

    "Lun 0" is used only for test and tuning purposes.

  • Start the Target iSCSI Daemon, and check if it is running :

    | linux_Sun:/etc # /etc/init.d/iscsi-target start
    |
    | linux_Sun:/etc # netstat -nltp | grep iet
    | tcp 0 0 0.0.0.0:3260 0.0.0.0:* LISTEN 12586/ietd
    | tcp 0 0 :::3260 :::* LISTEN 12586/ietd
    |
    | linux_Sun:/etc # tail /var/log/messages
    | .
    | .
    | Nov 7 13:56:15 linux_Sun kernel: iSCSI Enterprise Target Software - version 0.4.16
    | Nov 7 13:56:15 linux_Sun kernel: iscsi_trgt: Registered io type fileio
    | Nov 7 13:56:15 linux_Sun kernel: iscsi_trgt: Registered io type blockio
    | Nov 7 13:56:15 linux_Sun kernel: iscsi_trgt: Registered io type nullio

  • Finish: At this point, the disks are shared over Ethernet
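
The ietd.conf shown above follows a simple pattern: one nullio test LUN, then one blockio LUN per partition, with ScsiId counting from 0 while the Lun number counts from 1. The sketch below only prints that pattern to stdout for a list of devices. The function name gen_ietd_conf is hypothetical, and the output should be reviewed by hand before it goes anywhere near /etc/ietd.conf.

```shell
#!/bin/sh
# Print an ietd.conf-style target definition on stdout.
# usage: gen_ietd_conf TARGET_IQN DEVICE...
gen_ietd_conf() {
    target="$1"
    shift
    echo "Target $target"
    # Lun 0 is the nullio test device, as in the configuration above.
    echo "Lun 0 Sectors=1000000,Type=nullio"
    lun=1
    for dev in "$@"; do
        # ScsiId counts from 0 while the Lun number counts from 1.
        echo "Lun $lun Path=$dev,Type=blockio,ScsiId=$((lun - 1))"
        lun=$((lun + 1))
    done
    echo "Alias Disks_Informix"
}

gen_ietd_conf iqn.2008-10.galaxy.solar_system:informix.disks \
    /dev/sda10 /dev/sda11 /dev/sda12 /dev/sda13
```

Assuming 512-byte sectors, the 1000000 sectors of the nullio LUN come to 512,000,000 bytes, which matches the 512 MB disk the node machines later see as sdb.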

Configuring Node Machine (primary/secondary):

  • My node machine information:
    | lnxEarth:/etc/iscsi # hostname
    | lnxEarth
    |
    | lnxEarth:/etc/iscsi # ifconfig eth0 | egrep "eth0|inet addr"
    | eth0 Link encap:Ethernet HWaddr 00:11:D8:xx:xx:xx
    | inet addr:172.30.252.224 Bcast:172.30.255.255 Mask:255.255.0
    |
    | lnxEarth:/etc/iscsi # iscsid --version
    | iscsid version 2.0-869
  • You will need to make a small change to the config file /etc/iscsi/iscsi.conf :
    ATTENTION: You must make this configuration change before "discovering" the disks.
    Comment out this line:
    | node.startup = manual
    and activate this :
    | node.startup = automatic
  • Start the Open-iSCSI Daemon:
    | lnxEarth:/etc/iscsi # /etc/init.d/open-iscsi start
    | Starting iSCSI initiator service: done
    | iscsiadm: no records found!
    | Setting up iSCSI targets: unused
  • Detect remote disks :
    Don't worry about login errors
    | lnxEarth:/etc/init.d # iscsi_discovery 172.30.252.222
    | iscsiadm: No active sessions.
    | iscsiadm: Cannot modify iface.transport_name. Use iface mode to update this value.
    | Cannot login over tcp to portal 172.30.252.222:3260
    | iscsiadm: Cannot modify iface.transport_name. Use iface mode to update this value.
    | Cannot login over tcp to portal 172.30.252.222:3260
    | iscsiadm: Cannot modify iface.transport_name. Use iface mode to update this value.
    | Cannot login over tcp to portal 172.30.252.222:3260
    | iscsiadm: Cannot modify iface.transport_name. Use iface mode to update this value.
    | Cannot login over tcp to portal 172.30.252.222:3260
    | discovered 2 targets at 172.30.252.222
    |
    | lnxEarth:/etc/init.d # iscsiadm -m node
    | 172.30.252.222:3260,1 iqn.2008-10.galaxy.solar_system:informix.disks
    | 172.30.252.222:3260,1 iqn.2008-10.galaxy.solar_system:informix.disks
  • Now the 5 disks shared by the server machine are visible on the node: sdb ... sdf
    | lnxEarth:/etc/init.d # fdisk -l 2>/dev/null |grep Disk
    | Disk /dev/sda: 80.0 GB, 80060424192 bytes
    | Disk identifier: 0x9fa99fa9
    | Disk /dev/sdb: 512 MB, 512000000 bytes
    | Disk identifier: 0x00000000
    | Disk /dev/sdc: 1003 MB, 1003451904 bytes
    | Disk identifier: 0x00000000
    | Disk /dev/sdd: 1003 MB, 1003451904 bytes
    | Disk identifier: 0x00000000
    | Disk /dev/sde: 3002 MB, 3002194944 bytes
    | Disk identifier: 0x00000000
    | Disk /dev/sdf: 3002 MB, 3002194944 bytes
    | Disk identifier: 0x00000000
  • Repeat all steps on the other node machines (Secondary).
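
Since the node.startup change has to be repeated on every node before discovery, it can be scripted. The following is a minimal sketch, assuming GNU sed (for the -i option). The function name set_node_startup_automatic is hypothetical, and the config file path is passed in as an argument rather than hard-coded. The demonstration runs against a scratch file, so nothing under /etc is touched.

```shell
#!/bin/sh
# Comment out "node.startup = manual" and make sure an active
# "node.startup = automatic" line exists in the given config file.
set_node_startup_automatic() {
    conf="$1"
    sed -i 's/^[[:space:]]*node\.startup[[:space:]]*=[[:space:]]*manual/# &/' "$conf"
    # Append the automatic setting unless an active one is already present.
    grep -Eq '^[[:space:]]*node\.startup[[:space:]]*=[[:space:]]*automatic' "$conf" ||
        echo 'node.startup = automatic' >> "$conf"
}

# Demonstration on a scratch file standing in for the real config.
demo=$(mktemp)
printf '%s\n' '# startup settings' 'node.startup = manual' > "$demo"
set_node_startup_automatic "$demo"
cat "$demo"
rm -f "$demo"
```

On a real node you would call the function as root with the path of the Open-iSCSI config file, after taking a backup copy of it.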

Linux basic tests

  • CAUTION: be careful! Make sure you are writing over the right disk!
  • First test the I/O throughput over ethernet, without disk writes using the "nullio" configured in the server machine.
    | lnx_Moon:/etc # dd if=/dev/sdb of=/dev/null bs=2k count=100000
    | 100000+0 records in
    | 100000+0 records out
    | 204800000 bytes (205 MB) copied, 17.7421 s, 11.5 MB/s
    | lnx_Moon:/etc # dd if=/dev/zero of=/dev/sdb bs=2k count=100000
    | 100000+0 records in
    | 100000+0 records out
    | 204800000 bytes (205 MB) copied, 77.9 s, 2.6 MB/s
  • On the primary machine, clear the first 2k on the first disk of 1GB; in my case it is the "sdc" device.
    | lnxEarth:/etc/iscsi # dd if=/dev/zero of=/dev/sdc bs=2k count=1
    | 1+0 records in
    | 1+0 records out
    | 2048 bytes (2.0 kB) copied, 0.00122305 s, 1.7 MB/s
  • Read the first 1k on the Primary machine; you should see nothing there:
    | lnxEarth:/etc/iscsi # dd if=/dev/sdc bs=1k count=1
    | 1+0 records in
    | 1+0 records out
    | 1024 bytes (1.0 kB) copied, 0.0176213 s, 58.1 kB/s
  • Read the first 1k on the Secondary machine; you should see nothing there either:
    Note: coincidentally the first disk with 1GB is "sdc" on this machine too.
    | lnx_Moon:/etc/iscsi # dd if=/dev/sdc bs=1k count=1
    | 1+0 records in
    | 1+0 records out
    | 1024 bytes (1.0 kB) copied, 0.0029615 s, 346 kB/s
  • Now, write something on this disk and read this information on the other machine:
    | lnx_Moon:/etc/iscsi # dd if=/etc/services of=/dev/sdc bs=2k
    | 373+1 records in
    | 373+1 records out
    | 764358 bytes (764 kB) copied, 0.08247 s, 9.3 MB/s

    | lnxEarth:/etc/iscsi # dd if=/dev/sdc bs=1k count=1
    | #
    | # Network services, Internet style
    | #
    | # Note that it is presently the policy of IANA to assign a single well-known
    | # port number for both TCP and UDP; hence, most entries here have two entries
    | # even if the protocol doesn't support UDP operations.
    | #
    | # This list could be found on:
    | # http://www.iana.org/assignments/port-numbers
    | #
    | # See also: services(5), http://www.sethwklein.net/projects/iana-etc/
    | #
    | # PORT NUMBERS
    | #
    | #

  • There you are, the disks are already configured!
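
The write-on-one-node, read-on-the-other test above compares the output by eye. A byte-wise comparison can do the same job. This is a minimal sketch, assuming GNU cmp (for the -n option); the function name same_prefix is hypothetical. The demonstration uses two scratch files as stand-ins for /etc/services and the shared disk.

```shell
#!/bin/sh
# Succeeds when the first BYTES bytes of FILE and DEVICE are identical.
# usage: same_prefix FILE DEVICE BYTES
same_prefix() {
    cmp -s -n "$3" "$1" "$2"
}

# Demonstration with scratch files standing in for the source file and the disk.
src=$(mktemp)
disk=$(mktemp)
printf 'Network services, Internet style\n' > "$src"
cp "$src" "$disk"
if same_prefix "$src" "$disk" 16; then echo "match"; else echo "differ"; fi
rm -f "$src" "$disk"
```

On the real setup the equivalent call on the reading node would be something like same_prefix /etc/services /dev/sdc 1024, provided /etc/services is the same file on both machines.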

Configure the RAW devices:

  • I configured this manually:
    | lnx_Moon:/etc # mknod /dev/raw/raw1 c 162 1
    | lnx_Moon:/etc # mknod /dev/raw/raw2 c 162 2
    | lnx_Moon:/etc # mknod /dev/raw/raw3 c 162 3
    | lnx_Moon:/etc # mknod /dev/raw/raw4 c 162 4
    | lnx_Moon:/etc # raw /dev/raw/raw1 /dev/sdc
    | /dev/raw/raw1: bound to major 8, minor 32
    | lnx_Moon:/etc # raw /dev/raw/raw2 /dev/sdd
    | /dev/raw/raw2: bound to major 8, minor 48
    | lnx_Moon:/etc # raw /dev/raw/raw3 /dev/sde
    | /dev/raw/raw3: bound to major 8, minor 64
    | lnx_Moon:/etc # raw /dev/raw/raw4 /dev/sdf
    | /dev/raw/raw4: bound to major 8, minor 80
  • Test reads with the first raw device
    | lnx_Moon:/etc # dd if=/dev/raw/raw1 bs=2k count=1
    | #
    | # Network services, Internet style
    | #
    | # Note that it is presently the policy of IANA to assign a single well-known
    | # port number for both TCP and UDP; hence, most entries here have two entries
    | # even if the protocol doesn't support UDP operations.
  • To configure the raw devices to start on boot, see the files /etc/raw and /etc/init.d/raw. In this version of OpenSuse there is a bug where the raw nodes are not created automatically, so I made a small change to my /etc/init.d/raw :
    | lnxEarth:/etc/init.d # diff raw.old raw
    | 39c39
    | <
    | ---
    | > x=0
    | 40a41
    | > x=$(( x + 1 ))
    | 43a45
    | > [ ! -e /dev/raw/$rawdev ] && mknod /dev/raw/$rawdev c 162 $x
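
The manual mknod and raw commands above follow a fixed pattern, so they can be generated for any list of disks. The sketch below is a dry run: it only prints the commands and runs nothing. The function name plan_raw_bindings is hypothetical. It numbers the raw devices from 1 because minor 0 belongs to the rawctl control device, and it uses major number 162 as in the commands above.

```shell
#!/bin/sh
# Print (not run) the mknod and raw commands for a list of block devices.
# Raw minors start at 1 because minor 0 is the rawctl control device.
plan_raw_bindings() {
    n=1
    for dev in "$@"; do
        echo "mknod /dev/raw/raw$n c 162 $n"
        echo "raw /dev/raw/raw$n $dev"
        n=$((n + 1))
    done
}

plan_raw_bindings /dev/sdc /dev/sdd /dev/sde /dev/sdf
```

After checking that each printed line points at the intended disk, the output could be run as root. Remember that device names such as sdc may differ between nodes.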

Configure the RAW devices to automatically have informix as owner after any reboot:

  • Create a new rule for the udev daemon in /etc/udev/rules.d:
    | lnxEarth:/etc/udev/rules.d # echo 'SUBSYSTEM=="raw", KERNEL=="raw[0-9]*", OWNER="informix", GROUP="informix"' > 99-informix.rules
  • Reload the rule:
    | lnxEarth:/etc/udev/rules.d # udevadm control --reload_rules
  • Apply the new rule:
    | lnxEarth:/etc/udev/rules.d # udevadm trigger
  • Check that the owner changed:
    | lnxEarth:/etc/udev/rules.d # ls -l /dev/raw
    | total 0
    | crw-rw---- 1 informix informix 162, 1 2008-11-18 16:34 raw1
    | crw-rw---- 1 informix informix 162, 2 2008-11-18 16:34 raw2
    | crw-rw---- 1 informix informix 162, 3 2008-11-18 16:34 raw3
    | crw-rw---- 1 informix informix 162, 4 2008-11-18 16:34 raw4
    | crw-rw---- 1 root disk 162, 0 2008-11-18 16:34 rawctl
  • Configure all services to start automatically on boot:
    | lnxEarth:/ # chkconfig raw on open-iscsi on
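
After a reboot it is worth confirming that the udev rule really took effect before starting IDS. This is a minimal sketch of such a check, assuming GNU stat (for the -c option); the function name owned_by is hypothetical.

```shell
#!/bin/sh
# Succeeds when FILE is owned by USER and GROUP.
# usage: owned_by FILE USER GROUP
owned_by() {
    [ "$(stat -c '%U:%G' "$1")" = "$2:$3" ]
}

# Report any raw device that is not owned by informix:informix.
for dev in /dev/raw/raw[1-4]; do
    [ -e "$dev" ] || continue    # skip silently where the devices do not exist
    owned_by "$dev" informix informix || echo "$dev has the wrong owner"
done
```

The loop prints nothing when every raw device has the expected owner. It also prints nothing on a machine without /dev/raw devices.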

Now, the easy part, just install Informix and configure the SDS

I executed the test with IDS 11.50 UC3 Developer using Primary and Secondary SDS.

All tests worked fine. I have not executed any heavy process (yet), but I don't see any problem with using this configuration in a test or development environment.

And to finish, I also used BTS 2.00 (Basic Text Search) with Smart Blob Spaces in this configuration and it worked too!

Very cool!

Source

What about nbd

Hi there,

this sounds really nice. I already read about the same thing in the German IBM Informix Newsletter some days ago (http://www.informix-zone.com/node/637) and a similar approach came to my mind: as far as I know, iSCSI comes with a lot of overhead. So why not share block devices via nbd, enbd or (perhaps) drbd? They use a specialized protocol at the network level, and there is a fair chance that this results in better performance. Any suggestions? Maybe I'll try it out, but at the moment I have no time left.

Cheers
Ralf

A business partner of my

A business partner of my acquaintance has used drbd with some success, but it is obviously an unsupported approach. :-)