Tuesday, October 13, 2015

10-13-2015 Misc things for kivy (status: done)

There are a few things I need to do before I finish the kivy project. I was sidelined doing a bunch of jiujitsu documentation, and now I will make a second application for that.

SUMMARY: Most of this was a waste of time, but it still holds promise. Mongo and Redis are not needed right away. The API will be covered in the next article.

  1. MariaDB + maxscale: will store tests.
    1. Turns out MaxScale can only shard by database, so you have to put knowledge of the sharding into the software. We will install it for testing purposes, but will get rid of it fast. HAProxy plus your own scripts already seems to work fine. I don't know whether, when a failover occurs, MaxScale will redo the slaving like the HAProxy setup does.
    2. CONCLUSION: Not doing maxscale. Hoping shard query is better. 
  2. MariaDB + ShardQuery
    1. BTW, not doing mariadb cluster for now, just master/slave for simplicity and testing. 
    2. https://mariadb.com/kb/en/mariadb/shard-query/
    3. http://shardquery.com/
  3. MongoDB sharded: will store answers.
  4. Redis
  5. An API will be used to upload tests, do answers, and anything else. Use REDIS. 
  6. First on laptops, then AWS. 
As extras:
  1. memcache --- nosql to mysql. 
  2. 3rd caching alternative. 
  3. Instead of Hadoop: VoltDB, Vertica, and Spark (I like Spark), also R
    1. Spark
      1. https://www.xplenty.com/blog/2014/11/apache-spark-vs-hadoop-mapreduce/
      2. http://spark.apache.org/
      3. https://en.wikipedia.org/wiki/Apache_Spark
      4. Python and R
      5. Unfortunately it is still JVM dependent, but written in Scala.
      6. Is it brain dead to manage? Hadoop is a nightmare. 
    2. R stats language
      1. To replace SAS. It works well with Spark and VerticaDB. 
  4. Mongo from Percona with the two other engines --- necessary? Are the engines useful?
    1. WiredTiger (will be the default from Mongo)
    2. RocksDB and PerconaFT will have to be tested to see if they are any good or fluff. 

MariaDB and MaxScale on laptops


      • You may have to add the repository manually. 
        mark5 apt # more sources.list.d/additional-repositories.list 
        deb http://ftp.kaist.ac.kr/mariadb/repo/10.0/ubuntu trusty main
        mark5 apt # pwd
        /etc/apt
        
    • MaxScale
      • We assume server "mark2" is the master, and mark4 and mark5 are slaves. 
        • mark2:
          • mysql client:
            GRANT REPLICATION SLAVE ON *.* TO 'rep'@'%' IDENTIFIED BY 'dumb_password';
            flush privileges;
            
          • /etc/mysql/my.cnf
            server-id = 2
            log_bin = /var/log/mysql/mysql-bin.log
            #bind-address           = 127.0.0.1
            
            
        • mark4:
          • mysql client (since we are starting fresh we can use the first binlog):
            
            CHANGE MASTER TO MASTER_HOST='192.168.1.209',
              MASTER_USER='rep', 
              MASTER_PASSWORD='dumb_password', 
              MASTER_LOG_FILE='mysql-bin.000003',
              MASTER_LOG_POS=  365;
            start slave;
            show slave status\G
            
            
            
          • /etc/mysql/my.cnf
            server-id = 4
            relay-log  = /var/log/mysql/mysql-relay-bin.log
            #bind-address           = 127.0.0.1
            
        • mark5:
          • mysql client (since we are starting fresh we can use the first binlog):
            
            CHANGE MASTER TO MASTER_HOST='192.168.1.209',
              MASTER_USER='rep', 
              MASTER_PASSWORD='dumb_password', 
              MASTER_LOG_FILE='mysql-bin.000003',
              MASTER_LOG_POS=  365;
            start slave;
            show slave status\G
            
            
          • /etc/mysql/my.cnf
            server-id = 5
            relay-log   = /var/log/mysql/mysql-relay-bin.log
            #bind-address           = 127.0.0.1
            
      • You have to log in or send your email to get it. Annoying.
      • wget https://downloads.mariadb.com/enterprise/wd3x-hx24/mariadb-maxscale/1.2.1/ubuntu/dists/trusty/main/binary-amd64/maxscale-1.2.1-1.ubuntu_trusty.x86_64.deb
        • I had Linux Mint 17.1, which is based on Ubuntu 14.04.
      • Read the docs: 
      • dpkg -i maxscale-1.2.1-1.ubuntu_trusty.x86_64.deb 
      • maxkeys /var/lib/maxscale/
      • maxpasswd /var/lib/maxscale/ `openssl rand -base64 32`
        • Remember the password this creates. You will need it for your config files. 
      • mkdir -p /data/maxscale/data
        mkdir -p /data/maxscale/cache
        
        # In my client on master
        create user 'maxscale'@'192.168.1.%' identified by 'dumbpassword';
        grant SELECT on mysql.user to 'maxscale'@'192.168.1.%';
        grant SELECT on mysql.db to 'maxscale'@'192.168.1.%';
        grant SHOW DATABASES on *.* to 'maxscale'@'192.168.1.%';
        flush privileges;
        
        create database if not exists maxscale_test;
        grant all privileges on maxscale_test.* to rw@'192.168.1.%' identified by 'rw';
        grant select on maxscale_test.* to ro@'192.168.1.%' identified by 'ro';
        grant all privileges on maxscale_test.* to rw@localhost identified by 'rw';
        grant select on maxscale_test.* to ro@localhost identified by 'ro';
        
        
        
      • Use the MaxScale config file below, but change the hostnames and passwords. I have two slaves.
      • Results:
        • I was able to insert data through the RR session using a mysql account that had write permission. I thought this was wrong.
        • When connecting to RW, I thought it was supposed to split reads across every server and send writes to the master only, yet I was able to insert data on a slave.
        • When the master was brought down, no new sessions could be made. Good. As expected.
        • When I removed router_options and shut down the master, only RW sessions couldn't be made, but I could still insert data through the RR session.
        • To get failover to work, you have to write external scripts.
        • Conclusion: Not better than HAProxy, and either I was doing something wrong or the RW and RR sessions weren't bulletproof. Will have to do this again. Moving on to Shard Query.
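
Before blaming the router for odd RW/RR behavior, it helps to confirm that mark4 and mark5 are actually replicating. Below is a minimal Python sketch that parses the text printed by `show slave status\G` and decides whether a slave looks healthy. The sample output is hand-written for illustration (only the field names are real), and the function names and the 30-second lag threshold are my own assumptions, not anything MaxScale or MariaDB prescribes.

```python
def parse_slave_status(text):
    """Turn the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        # Field lines look like "   Slave_IO_Running: Yes"; skip the row banner.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_ok(status, max_lag=30):
    """True when both replication threads run and lag is at most max_lag seconds.

    max_lag=30 is an arbitrary assumption for this sketch.
    """
    lag = status.get("Seconds_Behind_Master", "NULL")
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and lag.isdigit()
        and int(lag) <= max_lag
    )


# Hand-written sample, not captured from a real server.
SAMPLE = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.209
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
"""

if __name__ == "__main__":
    print(replication_ok(parse_slave_status(SAMPLE)))
```

To use it against a real slave, pipe the output of the mysql client into a file and pass the file contents to `parse_slave_status`. A slave whose SQL thread has stopped reports `Seconds_Behind_Master: NULL`, which this check treats as unhealthy.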

Contents of /etc/maxscale.cnf
A lot of this was stolen from: http://www.severalnines.com/blog/deploy-and-configure-maxscale-sql-load-balancing
[maxscale]
threads=4
logdir=/tmp/
datadir=/data/maxscale/data
cachedir=/data/maxscale/cache
piddir=/data/maxscale



# This is the Galera monitor, which monitors the MySQL servers. Note that galeramon is written for Galera clusters; for plain master/slave replication MaxScale also ships mysqlmon.
[Galera Monitor]
type=monitor
module=galeramon
servers=mark2,mark4,mark5
user=maxscale
passwd=2235D8D837C1E22E8043D73ABED6B0B7
monitor_interval=10000
disable_master_failback=1

[qla]
type=filter
module=qlafilter
options=/tmp/QueryLog

[fetch]
type=filter
module=regexfilter
match=fetch
replace=select

# Specify which servers get the write queries and which ones are the slaves. This will split the queries so that read only queries will go to the slaves. I believe the first listed server has to be the master but I am guessing it auto detects this.  
[RW]
type=service
router=readwritesplit
servers=mark2,mark4,mark5
user=rw
passwd=rw
max_slave_connections=100%
router_options=slave_selection_criteria=LEAST_CURRENT_OPERATIONS

# Specify which servers can do selects. We include the master. "synced" isn't explained very well. I think if the slaves are in sync then it uses them.
[RR]
type=service
router=readconnroute
router_options=synced
servers=mark2,mark4,mark5
user=ro
passwd=ro

[Debug Interface]
type=service
router=debugcli

[CLI]
type=service
router=cli


# The write listener you should point your applications at.
[RWlistener]
type=listener
service=RW
protocol=MySQLClient
address=192.168.1.209
port=3307

# The read-only listener you should point your applications at.
[RRlistener]
type=listener
service=RR
protocol=MySQLClient
address=192.168.1.209
port=3308

[Debug Listener]
type=listener
service=Debug Interface
protocol=telnetd
address=127.0.0.1
port=4442

[CLI Listener]
type=listener
service=CLI
protocol=maxscaled
address=127.0.0.1
port=6603


# We define the servers here. The first one is the master. 
[mark2]
type=server
address=mark2
port=3306
protocol=MySQLBackend

[mark4]
type=server
address=mark4
port=3306
protocol=MySQLBackend

[mark5]
type=server
address=mark5
port=3306
protocol=MySQLBackend
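
A config like this has several places where names must line up: every entry in a `servers=` list needs its own `type=server` section, and every listener needs an existing service. Here is a rough Python sketch that checks this with the standard `configparser` module. It is only a sanity check I am assuming is useful, not something MaxScale provides; the `check_config` name and the sample fragment are made up for illustration.

```python
import configparser

# Made-up fragment for illustration; [mark5] is deliberately left undefined.
SAMPLE = """
[RW]
type=service
router=readwritesplit
servers=mark2,mark5
max_slave_connections=100%

[RWlistener]
type=listener
service=RW

[mark2]
type=server
address=mark2
"""


def check_config(text):
    """Return a list of naming problems found in a MaxScale-style ini file."""
    # interpolation=None so values such as "100%" are read literally;
    # strict=True makes a repeated section name an error.
    parser = configparser.ConfigParser(strict=True, interpolation=None)
    try:
        parser.read_string(text)
    except configparser.DuplicateSectionError as err:
        return ["section [%s] is defined twice" % err.section]

    servers = {n for n in parser.sections() if parser[n].get("type") == "server"}
    problems = []
    for name in parser.sections():
        section = parser[name]
        # Monitors and services list their backends as "servers=a,b,c".
        for ref in section.get("servers", "").split(","):
            ref = ref.strip()
            if ref and ref not in servers:
                problems.append("[%s] refers to undefined server %s" % (name, ref))
        # A listener must name a section that exists.
        if section.get("type") == "listener":
            service = section.get("service", "")
            if not parser.has_section(service):
                problems.append("[%s] refers to undefined service %s" % (name, service))
    return problems


if __name__ == "__main__":
    for problem in check_config(SAMPLE):
        print(problem)
```

Running it on the real file would be `check_config(open("/etc/maxscale.cnf").read())`. An empty list only means the names are consistent; it says nothing about whether the monitor or router options are right for the topology.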

Shard Query

This will have to work if MariaDB is going to compete with Mongo as an easy, shardable service. For my two immediate projects I can use Mongo, so either this works or it gets dropped. It may be possible to use MariaDB cluster, but it still has a scaling limitation, though it's better.

Summary: Couldn't get it to work smoothly. Still seems beta after all these years. Aborting for now because it is unlikely we will need it right away. Don't mind sharding later.

Non-detailed instructions.
https://github.com/greenlion/swanhart-tools/wiki/Installing-Shard-Query

  1. http://www.hostingadvice.com/how-to/install-gearman-ubuntu-14-04/
    1. sudo apt-get install python-software-properties # already installed
    2. sudo add-apt-repository ppa:gearman-developers/ppa
    3. sudo apt-get update
    4. sudo apt-get install gearman-job-server libgearman-dev
    5. sudo apt-get upgrade
    6. sudo apt-get install php-pear php5-dev gearman
    7. Edit /etc/php5/cli/php.ini
      1. extension=gearman.so # under “Dynamic Extensions”
    8. edit /etc/php5/apache2/php.ini
      1. extension=gearman.so # under dynamic extensions
      2. sudo service apache2 restart
    9. Try tests: http://gearman.org/getting-started/#client
      1. The first test just stalled. 
      2. These two commands should give the same result:
        1. gearman -f wc < /etc/passwd
        2. wc -l < /etc/passwd
      3. echo '<?php
        print gearman_version() . "\n";
        ?>
        ' > /var/www/html/test1.php
        
      4. php /var/www/html/test1.php
      5. Add to file and reexecute.
        <?php
        $worker= new GearmanWorker();
        $worker->addServer();
        $worker->addFunction("reverse", "my_reverse_function");
        while ($worker->work());
        
        function my_reverse_function($job)
        {
          return strrev($job->workload());
        }
        ?>
      6. Do the other examples on this page. 
    10. Considering it is very unlikely that a good server would need to be sharded for what I want, we will skip this.

Mongo 3.0

My main concern here is what happens when a shard cluster or a single secondary goes offline. Does it force a reconnect for all the Mongo connections? The answer should be no. In any event, never trust the software: there should be one cluster for login, one cluster for people to work, one cluster for results, one for reports, etc. If any cluster goes down, you can still work on the others. Clusters should be tied to a product/feature, and features/products should be as independent of each other as possible. Separate super critical, critical, and non-critical data into their own clusters too, for Mongo or MySQL.
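
The one-cluster-per-feature idea can be sketched as a simple lookup. This is only an illustration in Python: the cluster names and the `usable_features` helper are hypothetical, and a real application would hold one connection pool per cluster instead of a string.

```python
# Hypothetical mapping of each feature to its own cluster (placeholder names).
FEATURE_CLUSTER = {
    "login": "cluster-login",
    "work": "cluster-work",
    "results": "cluster-results",
    "reports": "cluster-reports",
}


def usable_features(down):
    """Features whose own cluster is still up, given the set of clusters that are down."""
    return sorted(f for f, c in FEATURE_CLUSTER.items() if c not in down)


if __name__ == "__main__":
    # Losing the reports cluster leaves login, results, and work untouched.
    print(usable_features({"cluster-reports"}))
```

The point of the design is that no feature ever looks up more than one cluster, so an outage takes out exactly the features mapped to it and nothing else.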

ABORTING FOR NOW. We will store test results here. API needs to get done.


Redis

ABORTING FOR NOW. No point until the product is up and running.

API

This requires a backend to be built for the API, but I know what I want. This will be in the next article.
