TCPHA-0.2

What is it?
-----------
TCPHA implements an architecture for scalable content-aware request
distribution in cluster-based servers. It implements TCP Handoff inside the
kernel of the Linux operating system. TCPHA can be used to build a
high-performance, highly available server from a cluster of Linux servers.

What's the architecture of TCPHA?
---------------------------------
TCPHA is composed of tcpha_fe (the dispatcher) and tcpha_be (the real server).
The dispatcher (FE) performs content-aware distribution of requests to the
real servers (BEs). The real servers serve the requests and send responses
directly to clients.

The software needed by TCPHA
----------------------------
Linux with kernel 2.4.20, GCC

COPYRIGHT
---------
Copyright (c) 2004-2006 Li Wang. It is released under the GNU GPL (General
Public License). Please see the file called COPYING.

Fastest Installation
--------------------
1. Patch the Linux 2.4.20 kernel:
     cp patch/linux-2.4.20/linux_net_netsyms.c.patch /usr/src
     cd /usr/src
     patch -p0 < linux_net_netsyms.c.patch

2. Rebuild the kernel:
     make mrproper; make menuconfig; make dep; make clean; make bzImage;
     make modules; make modules_install; make install; depmod -a

   Note: the following rules must be followed when running 'make menuconfig'
   ([ ] means NOT included, [*] means included):
     [ ] Loadable module support ---> Set version information on all module symbols
     [*] Networking options ---> Network packet filtering (replaces ipchains)
     [*] Networking options ---> IP: tunneling

3. Reboot with the new kernel:
     reboot

Let's assume your cluster has the following configuration: the workstation
with address 192.168.1.23 runs as the dispatcher (FE: front end), and the
workstations with addresses 192.168.1.33 and 192.168.1.38 run as real servers
(BE: back end). They all share the address 192.168.1.36, which is the address
seen by clients.
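
Before unpacking TCPHA, it may be worth confirming that the kernel was built
according to the three 'make menuconfig' rules in step 2, since two of the FAQ
errors trace back to them. The following is only a sketch: the function name
check_tcpha_kconfig is made up for illustration, and the .config symbol names
CONFIG_MODVERSIONS and CONFIG_NET_IPIP are assumptions about how those menu
entries are stored in a 2.4 kernel .config (CONFIG_NETFILTER is the name used
in the FAQ).

```shell
#!/bin/sh
# Sketch: check a kernel .config against the three menuconfig rules.
# Symbol names are assumptions, see the note above.
check_tcpha_kconfig() {
    cfg=$1
    rc=0
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg"
        return 2
    fi
    # [ ] Set version information on all module symbols
    if grep -q '^CONFIG_MODVERSIONS=y' "$cfg"; then
        echo "CONFIG_MODVERSIONS is set; it must be disabled"
        rc=1
    fi
    # [*] Network packet filtering and [*] IP: tunneling must be built in
    for sym in CONFIG_NETFILTER CONFIG_NET_IPIP; do
        if ! grep -q "^${sym}=y" "$cfg"; then
            echo "${sym} is not built in; it must be enabled"
            rc=1
        fi
    done
    return $rc
}

# Example call (the path is an assumption about where the kernel tree lives):
# check_tcpha_kconfig /usr/src/linux/.config && echo "kernel config looks OK"
```

The function returns 0 when all three rules hold, 1 when at least one is
violated, and 2 when the file cannot be read.
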
First, you should execute the following commands in the shell:
  Shell>tar -zxvf tcpha-0.2.tar.gz
  Shell>cd tcpha

Then choose which component to install. In this case, you should install
tcpha_fe on 192.168.1.23 and tcpha_be on 192.168.1.33 and 192.168.1.38.
All paths below are relative to 'tcpha/'.

Installing tcpha_fe
  1. Shell>chmod u+x fe/regex/configure
  2. Shell>cd fe/regex;aclocal
  3. Shell>cd fe;make
  4. Modify the file fe/tcphafe.conf according to your cluster configuration.
     Generally speaking, you only need to change the following:
       'raddr = x.x.x.x'
           the FE real address; here, change the line to
           'raddr = 192.168.1.23'.
       'vaddr = x.x.x.x'
           the virtual address shared by the cluster; here,
           'vaddr = 192.168.1.36'.
       'server = x.x.x.x port'
           a BE address and port; here, 'server = 192.168.1.33 666' and
           'server = 192.168.1.38 666'.
       'rule = x.x.x.x pattern'
           a request matching 'pattern' is handled by BE x.x.x.x. The pattern
           must be a regular expression, such as 'rule = 192.168.1.38 jpg$'
           and 'rule = 192.168.1.33 .*'.
  5. Shell>mkdir /etc/tcpha;cp fe/tcphafe.conf /etc/tcpha/

Installing tcpha_be
  1. Shell>cd be;make
  2. Modify the file be/tcphabe.conf according to your cluster configuration.
     Generally speaking, you only need to change the following:
       'raddr = x.x.x.x'
           the BE real address; here, on the first BE, change the line to
           'raddr = 192.168.1.33'.
       'uaddr = x.x.x.x'
           an address NOT used in the cluster; for example, if 192.168.1.39
           is not used in your cluster, change the line to
           'uaddr = 192.168.1.39'.
       'vaddr = x.x.x.x'
           the virtual address shared by the cluster; here,
           'vaddr = 192.168.1.36'.
  3. Shell>mkdir /etc/tcpha;cp be/tcphabe.conf /etc/tcpha/

Note: if your kernel is an SMP kernel:
  tcpha_fe
    1. Uncomment the following line in fe/Makefile:
         SMPFLAGS= -D__SMP__
    2. Uncomment the following line in fe/regex/Makefile.am:
         SMPFLAGS= -D__SMP__
    3. Shell>cd fe/regex;aclocal
  tcpha_be
    1. Uncomment the following line in be/Makefile:
         SMPFLAGS= -D__SMP__
  Then run 'make' as usual.

How to use
----------
Startup
  Start the components in the following order (assuming your shell is in the
  'tcpha/' directory):
  Start BE:
    1. Shell>insmod be/ktcphabe.o
    2. Shell>ifconfig tunl0 192.168.1.36 up
    3. Shell>httpd
  Start FE:
    1. Shell>ifconfig tunl0 192.168.1.36 up
    2. Shell>insmod fe/ktcphafe.o
  As you can see, setup and startup are very simple. Enjoy exploring it!

Stop
  Stop the components in the following order (assuming your shell is in the
  'tcpha/' directory):
  Stop FE:
    1. Shell>ifconfig tunl0 down
    2. Shell>echo 1 >> /proc/sys/net/ktcphafe/unload
    3. Shell>rmmod ktcphafe
  Stop BE:
    1. Shell>ifconfig tunl0 down
    2. Shell>echo 1 >> /proc/sys/net/ktcphabe/unload
    3. Shell>rmmod ktcphabe

More details
------------
FE configuration options
  raddr: the FE's real address, typically the eth0 address. Default is
      0.0.0.0, which represents INADDR_ANY.
  vaddr: the virtual address shared by the cluster. Must be specified.
  port: the FE listening port. Default is 80.
  startservers: number of server threads started at startup. Default is 5.
  maxspareservers: maximum number of idle server threads. Default is 5.
  minspareservers: minimum number of idle server threads. Default is 2.
  maxclients: maximum number of concurrent clients permitted. Default is 256.
  connperbe: number of control connections per BE. Default is 2.
  redirectaddr: local redirect address, used by the Local Node mechanism
      (see the Glossary). Default is 0.0.0.0.
  redirectport: local redirect port, used by the Local Node mechanism.
      Default is 8080.
  server: BE address.
  rule: scheduling rule.

BE configuration options
  raddr: the BE's real address, typically the eth0 address. Must be
      specified.
  uaddr: an address NOT used in the cluster, used for ARP filtering (see the
      Glossary). Must be specified.
  vaddr: the virtual address shared by the cluster. Must be specified.
  port: the BE listening port. Default is 666.
  startservers: number of server threads started at startup. Default is 5.
  maxspareservers: maximum number of idle server threads. Default is 5.
  minspareservers: minimum number of idle server threads. Default is 2.
  maxclients: maximum number of concurrent clients permitted. Default is 256.
  redirectaddr: address of the user-space server application. Default is
      0.0.0.0, which represents INADDR_ANY.
  redirectport: listening port of the user-space server application.
      Default is 80.
  dregister: set to 0 to disable BE dynamic register, or to 1 to enable it
      (see the Glossary). Default is 0.
  feaddr: the FE real address, only used for BE dynamic register. Ignored if
      dregister is disabled; must be specified if dregister is enabled.
  feport: the FE listening port, only used for BE dynamic register. Default
      is 80. Ignored if dregister is disabled.
  rule: the BE scheduling rule, only used for BE dynamic register. Ignored if
      dregister is disabled; must be specified if dregister is enabled.

You can look at the files fe/tcphafe.conf.detail and be/tcphabe.conf.detail
for reference.

Debug option
  Uncomment the line
    DEBUGFLAGS = -g -DCONFIG_TCPHA_FE_DEBUG
  in the Makefile to enable FE debugging. BE is similar. All debug messages
  are recorded in /var/log/messages. In addition, you can set the debug level
  to control which debug messages are printed, by typing the following line
  in the shell:
    echo value >> /proc/sys/net/ktcphafe/dbglevel
  The value should be an integer from 0 to 6. BE is similar.
  You can cat /proc/net/tcpha_fe_conn to see the connection messages handled
  by the FE.

Detection option
  Since tcpha-0.1.4, the FE checks each BE's status periodically. To turn
  this function off, type the following line in the shell:
    echo 0 >> /proc/sys/net/ktcphafe/dtcinterval
  The detection interval can also be set; the unit is seconds.
  For example, typing the following line in the shell sets the detection
  interval to 5 seconds:
    echo 5 >> /proc/sys/net/ktcphafe/dtcinterval

Log option
  Uncomment the line
    LOGFLAGS = -g -DCONFIG_TCPHA_FE_USELOG
  in the Makefile to enable FE logging. If the log option is enabled, all
  TCPHA messages (including debugging messages, if debugging is enabled) are
  written to ./tcphafe.log. BE is similar.

FAQ
---
The following error is encountered when compiling modules:
    gcc -D__KERNEL__ -DMODULE -DEXPORT_SYMTAB -DMODVERSIONS -g -DCONFIG_TCPHA_FE_DEBUG
        /usr/src/linux/include/linux/modversions.h -c -o tcpha_fe.o tcpha_fe.c
    /usr/src/linux/include/linux/modversions.h: No such file or directory
    In file included from tcpha_fe.c:21:
    tcpha_fe.h:33:67: net/dst.h: No such file or directory
    In file included from /usr/include/sys/uio.h:24
  This happens because '/usr/src/linux' does not point to the right kernel
  source installation directory. Assuming the installation directory is
  /usr/src/linux-2.4.20, type these commands in the shell:
    Shell>rm -f /usr/src/linux;ln -s /usr/src/linux-2.4.20 /usr/src/linux

The following error is encountered when compiling modules:
    gcc -D__KERNEL__ -DMODULE -DEXPORT_SYMTAB -DMODVERSIONS -g -DCONFIG_TCPHA_FE_USELOG
        -g -DCONFIG_TCPHA_FE_DEBUG -O2 -Wall -Wstrict-prototypes -I/usr/src/linux/include
        -include /usr/src/linux/include/linux/modversions.h -c -o tcpha_fe_bh.o tcpha_fe_bh.c
    tcpha_fe_bh.c:290: variable `tcpha_fe_in_ops' has initializer but incomplete type
    tcpha_fe_bh.c:291: extra brace group at end of initializer
    tcpha_fe_bh.c:291: (near initialization for `tcpha_fe_in_ops')
    tcpha_fe_bh.c:291: warning: excess elements in struct initializer
    tcpha_fe_bh.c:291: warning: (near initialization for `tcpha_fe_in_ops')
    tcpha_fe_bh.c:292: warning: excess elements in struct initializer
    tcpha_fe_bh.c:292: warning: (near initialization for `tcpha_fe_in_ops')
    tcpha_fe_bh.c:292: warning: excess elements in struct initializer
    tcpha_fe_bh.c:292: warning: (near initialization for `tcpha_fe_in_ops')
    tcpha_fe_bh.c:292: warning: excess elements in struct initializer
    tcpha_fe_bh.c:292: warning: (near initialization for `tcpha_fe_in_ops')
  This happens because packet filter support was not selected when compiling
  the kernel. Defining CONFIG_NETFILTER, or selecting packet filter support
  when running 'make menuconfig', solves it.

The following error is encountered when loading modules:
    ktcphafe.o: ktcphafe.o: unresolved symbol tcp_destroy_sock
    ktcphafe.o: ktcphafe.o: unresolved symbol tcp_clear_xmit_timers
    ktcphafe.o: ktcphafe.o: unresolved symbol tcp_openreq_cachep
    ktcphafe.o: ktcphafe.o: unresolved symbol tcp_statistics
    ktcphafe.o: ktcphafe.o: unresolved symbol tcp_v4_lookup_listener
    ktcphafe.o: ktcphafe.o: unresolved symbol tcp_put_port
  This happens because the option
    Loadable module support ---> Set version information on all module symbols
  was selected when running 'make menuconfig'. Deselecting the option and
  rebuilding the kernel solves it.

Glossary
--------
FE
  Front End: the dispatcher of the cluster, which is visible to clients.

BE
  Back End: the real servers of the cluster, which actually serve the
  clients' requests.

TCP Handoff
  In TCP Handoff, the connection endpoint in the FE is passed to the chosen
  BE, which then processes the request and sends the response directly to
  the client.

Local Node Mechanism
  The Local Node mechanism lets the FE handle requests itself. This is useful
  when the load on the cluster is light, when there are few BEs, or when some
  requests cannot be scheduled. It works by redirecting the requests to the
  user-space server application. For Apache, for example, modify the config
  file /etc/httpd/conf/httpd.conf so that httpd listens on port 8080, start
  httpd, and set 'redirectport = 8080' in the FE config file.

BE dynamic register
  A BE can register with the FE dynamically, so a new BE can join without
  restarting the system. Make sure the FE is already running before starting
  the new BE.
  Otherwise, simply add the BE address to the FE's config file beforehand.

ARP filtering
  ARP filtering is used to solve the ARP problem
  (http://www.linuxvirtualserver.org/docs/arp.html).

Notes
-----
1. When you stop a BE, step 1 must be done before steps 2 and 3.
2. The rules must cover all cases, so it is best to add a catch-all rule
   'rule = x.x.x.x .*' to the config file.

Feedback
--------
Comments, bug reports, bug fixes, and ideas are welcome.

Thanks,
Li Wang
dragonfly@linux-vs.org