MEN Projects Blog: 10-13-2015 Misc things for kivy (status: done)

There are a few things I need to do before I finish the kivy project. I was sidelined doing a bunch of jiujitsu documentation, and now I will make a second application for that.

SUMMARY: Most of this was a waste of time, but it still holds promise. Mongo and Redis are not needed right away. The API will be covered in the next article.

MariaDB + maxscale: will store tests.

Turns out MaxScale can only shard by database. Thus, you have to put the sharding knowledge into the software. We will install it for testing purposes, but will get rid of it fast. HAProxy with your own scripts already seems to work fine. I don't know whether, when a failover occurs, MaxScale will redo the slaving like HAProxy does.
CONCLUSION: Not doing maxscale. Hoping shard query is better.
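
"Putting sharding knowledge in the software" can be as small as a function that maps a key to a shard. The sketch below is only an illustration under my own assumptions: the shard names are hypothetical and a plain hash-modulo scheme is used, which is not something MaxScale or Shard Query does for you.

```python
import hashlib

# Hypothetical shard names; in practice these would be database hosts or DSNs.
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key, shards=SHARDS):
    """Map a key (for example a user id) to one shard using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, "->", shard_for(user))
```

The catch with hash-modulo is that changing the number of shards moves most keys, which is part of why a router that hides this from the application is attractive.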

MariaDB + ShardQuery

BTW, not doing mariadb cluster for now, just master/slave for simplicity and testing.
https://mariadb.com/kb/en/mariadb/shard-query/
http://shardquery.com/

MongoDB sharded: will store answers.
Redis
An API will be used to upload tests, do answers, and anything else. Use REDIS.
First on laptops, then AWS.

As extras:

memcache --- nosql to mysql.
3rd caching alternative.
Instead of Hadoop: VoltDB, Vertica, and Spark (I like Spark), also R

Spark

https://www.xplenty.com/blog/2014/11/apache-spark-vs-hadoop-mapreduce/
http://spark.apache.org/
https://en.wikipedia.org/wiki/Apache_Spark
Python and R
Unfortunately it is still JVM dependent, but written in Scala.
Is it brain dead to manage? Hadoop is a nightmare.

R stats language

To replace SAS. It works well with Spark and VerticaDB.

Mongo from Percona with the two other engines --- necessary? Are the engines useful?

WiredTiger (will be the default from Mongo)
RocksDB and PerconaFT will have to be tested to see if they are any good or fluff.

MariaDB and MaxScale on laptops

These links are old and the repositories don't exist anymore. Good read still.

https://mariadb.com/resources/downloads

MariaDB 10 --- might want to set it up as a repository.
MaxScale
Connectors

Install MariaDB 10 as root on 3 laptops (in AWS it will be 2 MariaDB clusters):

apt-get -y install software-properties-common
apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xcbcb082a1bb943db
add-apt-repository 'deb http://ftp.kaist.ac.kr/mariadb/repo/10.0/ubuntu trusty main'
apt-get update
apt-get install mariadb-server

You may have to add the repository manually.

mark5 apt # more sources.list.d/additional-repositories.list 
deb http://ftp.kaist.ac.kr/mariadb/repo/10.0/ubuntu trusty main
mark5 apt # pwd
/etc/apt

MaxScale

We assume server "mark2" is the master, and mark4 and mark5 are slaves.

mark2:

mysql client:

GRANT REPLICATION SLAVE ON *.* TO 'rep'@'%' IDENTIFIED BY 'dumb_password';
flush privileges;

/etc/mysql/my.cnf

server-id = 2
log_bin = /var/log/mysql/mysql-bin.log
#bind-address           = 127.0.0.1

mark4:

mysql client (since we are starting fresh we can use the first binlog):


CHANGE MASTER TO MASTER_HOST='192.168.1.209',
  MASTER_USER='rep', 
  MASTER_PASSWORD='dumb_password', 
  MASTER_LOG_FILE='mysql-bin.000003',
  MASTER_LOG_POS=  365;
start slave;
show slave status\G

/etc/mysql/my.cnf

server-id = 4
relay-log  = /var/log/mysql/mysql-relay-bin.log
#bind-address           = 127.0.0.1

mark5:

mysql client (since we are starting fresh we can use the first binlog):


CHANGE MASTER TO MASTER_HOST='192.168.1.209',
  MASTER_USER='rep', 
  MASTER_PASSWORD='dumb_password', 
  MASTER_LOG_FILE='mysql-bin.000003',
  MASTER_LOG_POS=  365;
start slave;
show slave status\G

/etc/mysql/my.cnf

server-id = 5
relay-log   = /var/log/mysql/mysql-relay-bin.log
#bind-address           = 127.0.0.1
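
After "start slave" it is worth checking that both replication threads are actually running. This is a small sketch that parses the text printed by "show slave status\G". The sample text is abbreviated and made up for illustration, and the helper names are my own.

```python
def parse_slave_status(text):
    """Turn the 'Key: value' lines of SHOW SLAVE STATUS output into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # the row separator line has no colon and is skipped
            status[key.strip()] = value.strip()
    return status


def replication_ok(status):
    """Both replication threads must be running for the slave to be healthy."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes")


# Abbreviated, made-up sample of what the mysql client prints.
SAMPLE_STATUS = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.209
                  Master_User: rep
              Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
"""

if __name__ == "__main__":
    status = parse_slave_status(SAMPLE_STATUS)
    print("replication healthy:", replication_ok(status))
```

In a real script the text would come from running the mysql client on mark4 and mark5 rather than from a constant.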

You have to log in or give your email address to download MaxScale. Annoying.
wget https://downloads.mariadb.com/enterprise/wd3x-hx24/mariadb-maxscale/1.2.1/ubuntu/dists/trusty/main/binary-amd64/maxscale-1.2.1-1.ubuntu_trusty.x86_64.deb

I had Linux Mint 17.1, which is based on Ubuntu 14.04.

Read the docs:

https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale/

dpkg -i maxscale-1.2.1-1.ubuntu_trusty.x86_64.deb
maxkeys /var/lib/maxscale/
maxpasswd /var/lib/maxscale/ `openssl rand -base64 32`

Remember the password this creates. You will need it for your config files.

mkdir -p /data/maxscale/data
mkdir -p /data/maxscale/cache

# In the mysql client on the master
create user 'maxscale'@'192.168.1.%' identified by 'dumbpassword';
grant SELECT on mysql.user to 'maxscale'@'192.168.1.%';
grant SELECT on mysql.db to 'maxscale'@'192.168.1.%';
grant SHOW DATABASES on *.* to 'maxscale'@'192.168.1.%';
flush privileges;

create database if not exists maxscale_test;
grant all privileges on maxscale_test.* to rw@'192.168.1.%' identified by 'rw';
grant select on maxscale_test.* to ro@'192.168.1.%' identified by 'ro';
grant all privileges on maxscale_test.* to rw@localhost identified by 'rw';
grant select on maxscale_test.* to ro@localhost identified by 'ro';

Use the maxscale config file below, but change the hostnames and passwords. I have two slaves.
Results:

I was able to insert data into the RR session using a mysql account that had write permission. I thought this was wrong.
When connecting to RW, I thought it was supposed to split reads across every server and send writes to the master only. I was able to insert data on a slave.
When the master was brought down, no new sessions could be made. Good. As expected.
When I removed router_options and shut down the master, only RW sessions couldn't be made, but I could still insert data through the RR session.
To get failover to work, you have to write external scripts.
Conclusion: Not better than HAProxy, and either I was doing something wrong or the RW and RR sessions weren't bulletproof. Will have to do this again. Moving on to Shard Query.
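
Since the RR session accepted inserts, one workaround is for the application itself to decide which listener a statement goes to. The sketch below is a rough illustration of that idea and not how MaxScale classifies queries. The ports are the ones from my config; looking only at the first keyword is a simplifying assumption, and a real check would need a proper SQL parser.

```python
# Listener ports taken from the maxscale.cnf in this post.
RW_PORT = 3307  # readwritesplit service
RR_PORT = 3308  # readconnroute (read-only) service

# Statements treated as read-only in this sketch.
READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def is_read_only(sql):
    """Very rough check: look only at the first keyword of the statement."""
    words = sql.strip().split(None, 1)
    return bool(words) and words[0].upper() in READ_VERBS


def port_for(sql):
    """Send reads to the RR listener and everything else to the RW listener."""
    return RR_PORT if is_read_only(sql) else RW_PORT


if __name__ == "__main__":
    for stmt in ("select * from tests", "INSERT INTO tests VALUES (1)"):
        print(port_for(stmt), stmt)
```

Anything the check does not recognize falls through to the RW listener, which is the safe default.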

Contents of /etc/maxscale.cnf
A lot of this was stolen from: http://www.severalnines.com/blog/deploy-and-configure-maxscale-sql-load-balancing

[maxscale]
threads=4

logdir=/tmp/
datadir=/data/maxscale/data
cachedir=/data/maxscale/cache
piddir=/data/maxscale

# This is the galera monitor which monitors the mysql services (servers) and may do a failover. 
[Galera Monitor]
type=monitor
module=galeramon
servers=mark2,mark4,mark5
user=maxscale
passwd=2235D8D837C1E22E8043D73ABED6B0B7
monitor_interval=10000
disable_master_failback=1

[qla]
type=filter
module=qlafilter
options=/tmp/QueryLog

[fetch]
type=filter
module=regexfilter
match=fetch
replace=select

# Specify which servers get the write queries and which ones are the slaves. This will split the queries so that read only queries will go to the slaves. I believe the first listed server has to be the master but I am guessing it auto detects this.  
[RW]
type=service
router=readwritesplit
servers=mark2,mark4,mark5
user=rw
passwd=rw
max_slave_connections=100%
router_options=slave_selection_criteria=LEAST_CURRENT_OPERATIONS

# Specify which servers can do selects. We include the master. The synced option isn't explained very well; I think it uses the slaves only if they are in sync.
[RR]
type=service
router=readconnroute
router_options=synced
servers=mark2,mark4,mark5
user=ro
passwd=ro

[Debug Interface]
type=service
router=debugcli

[CLI]
type=service
router=cli


# The write listener you should point your applications at.
[RWlistener]
type=listener
service=RW
protocol=MySQLClient
address=192.168.1.209
port=3307

# The read-only listener you should point your applications at.
[RRlistener]
type=listener
service=RR
protocol=MySQLClient
address=192.168.1.209
port=3308

[Debug Listener]
type=listener
service=Debug Interface
protocol=telnetd
address=127.0.0.1
port=4442

[CLI Listener]
type=listener
service=CLI
protocol=maxscaled
address=127.0.0.1
port=6603


# We define the servers here. The first one is the master. 
[mark2]
type=server
address=mark2
port=3306
protocol=MySQLBackend

[mark4]
type=server
address=mark4
port=3306
protocol=MySQLBackend

[mark5]
type=server
address=mark5
port=3306
protocol=MySQLBackend
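
A config like this is easy to break by copy and paste, for example with a duplicated section name or a server listed under servers= that has no section of its own. Here is a minimal sketch of a sanity check using Python's configparser. The function name and the embedded sample are my own, and it assumes the file parses as ordinary ini syntax.

```python
import configparser


def check_maxscale_config(text):
    """Return a list of problems found in a maxscale.cnf-style text."""
    parser = configparser.ConfigParser(
        strict=True,         # duplicate section names raise an error
        interpolation=None,  # values such as "100%" must stay literal
        delimiters=("=",),
    )
    try:
        parser.read_string(text)
    except configparser.DuplicateSectionError as exc:
        return ["duplicate section: [%s]" % exc.section]

    # Sections with type=server are the ones other sections may refer to.
    defined = {name for name in parser.sections()
               if parser.get(name, "type", fallback="") == "server"}
    problems = []
    for name in parser.sections():
        listed = parser.get(name, "servers", fallback="")
        for server in filter(None, (s.strip() for s in listed.split(","))):
            if server not in defined:
                problems.append(
                    "[%s] references undefined server %s" % (name, server))
    return problems


# Cut-down sample: mark5 is listed but never defined.
SAMPLE_CNF = """
[RW]
type=service
router=readwritesplit
servers=mark2,mark4,mark5
max_slave_connections=100%

[mark2]
type=server
address=mark2

[mark4]
type=server
address=mark4
"""

if __name__ == "__main__":
    for problem in check_maxscale_config(SAMPLE_CNF):
        print(problem)
```

To check the real file, pass the contents of /etc/maxscale.cnf instead of the sample.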

Shard Query

This will have to work to compete with Mongo in making an easy shardable service. For my immediate two projects, I can use Mongo, so I hope this works or this will fail. It may be possible to use MariaDB cluster, but there is still a limitation in scaling for it, though it's better.

Summary: Couldn't get it to work smoothly. Still seems beta after all these years. Aborting for now because it is unlikely we will need it right away. Don't mind sharding later.

Non-detailed instructions.
https://github.com/greenlion/swanhart-tools/wiki/Installing-Shard-Query

http://www.hostingadvice.com/how-to/install-gearman-ubuntu-14-04/

sudo apt-get install python-software-properties # already installed
sudo add-apt-repository ppa:gearman-developers/ppa
sudo apt-get update
sudo apt-get install gearman-job-server libgearman-dev
sudo apt-get upgrade
sudo apt-get install php-pear php5-dev gearman
Edit /etc/php5/cli/php.ini

extension=gearman.so ; under "Dynamic Extensions"

Edit /etc/php5/apache2/php.ini

extension=gearman.so ; under "Dynamic Extensions"
sudo service apache2 restart

Try tests: http://gearman.org/getting-started/#client

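The first test just stalled (a gearman client blocks until a worker has registered the function, so a wc worker has to be running first).
These two commands should give the same result: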

gearman -f wc < /etc/passwd
wc -l < /etc/passwd

echo '<?php
print gearman_version() . "\n";
?>
' > /var/www/html/test1.php

```
php /var/www/html/test1.php
```

Add this to the file and re-execute.

<?php
$worker= new GearmanWorker();
$worker->addServer();
$worker->addFunction("reverse", "my_reverse_function");
while ($worker->work());

function my_reverse_function($job)
{
  return strrev($job->workload());
}
?>

Do the other examples on this page.

Considering it is very unlikely that a good server would need to be sharded for what I want, we will skip this.

Mongo 3.0

My main concern here is what happens when a shard cluster or a single secondary goes offline. Does it force a reconnect for all the Mongo connections? The answer should be no. In any event, never trust the software: there should be one cluster for login, one cluster for people to work, one cluster for results, one for reports, etc. If any cluster goes down, you can still work on the others. Clusters should be tied to a product/feature, and features/products should be as independent of each other as possible. Separate super critical, critical, and non-critical data into their own clusters too, for Mongo or MySQL.
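
The one-cluster-per-feature idea can be written down as a simple mapping, which also makes it easy to see what keeps working during an outage. This is only a sketch: the feature and cluster names are hypothetical, and how a cluster is detected as down is left out.

```python
# Hypothetical feature-to-cluster layout; names are for illustration only.
FEATURE_CLUSTER = {
    "login": "cluster_login",
    "work": "cluster_work",
    "results": "cluster_results",
    "reports": "cluster_reports",
}


def available_features(down_clusters):
    """List the features that still work when some clusters are offline."""
    down = set(down_clusters)
    return sorted(feature for feature, cluster in FEATURE_CLUSTER.items()
                  if cluster not in down)


if __name__ == "__main__":
    # With the reports cluster offline, everything else stays usable.
    print(available_features({"cluster_reports"}))
```

Because each feature depends on exactly one cluster, an outage takes down only the feature tied to it.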

ABORTING FOR NOW. We will store test results here. API needs to get done.

Redis

ABORTING FOR NOW. Until the product is up and running, there is no point.

API

This requires a backend to be built for the API, but I know what I want. This will be in the next article.

MEN Projects Blog

Tuesday, October 13, 2015
