Friday, April 22, 2016

Cassandra Certification, and others.

Update2: I need to merge this with the other certification page I made. 

UPDATE: After the 1st test for Cassandra, do AWS certification, then the 2nd test for Cassandra. Another thing to do with Redshift and MySQL on AWS: learn to make an instance, use it for only an hour, then destroy it. Perhaps script it to do something and then save the output.
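A rough sketch of what that script could look like for MySQL on RDS. It assumes `rds` is a boto3 RDS client (for example `boto3.client("rds")`) with credentials already configured. The function name, the instance class, the storage size, and the `work` callback are my own placeholders, not anything I have settled on. The point is the try/finally: the instance gets deleted even if the work step blows up, so it never sits around costing money.

```python
def run_scratch_mysql(rds, identifier, password, work, out_path=None):
    """Create a small MySQL RDS instance, run work(host), then always delete it.

    rds      -- assumed to be a boto3 RDS client, e.g. boto3.client("rds")
    work     -- hypothetical callback that receives the endpoint hostname
    out_path -- optional file where the result of work() is saved
    """
    rds.create_db_instance(
        DBInstanceIdentifier=identifier,
        DBInstanceClass="db.t2.micro",  # assumption: smallest class is enough
        Engine="mysql",
        MasterUsername="admin",         # placeholder account name
        MasterUserPassword=password,
        AllocatedStorage=20,            # GiB, placeholder size
    )
    try:
        # Block until the instance is reachable.
        rds.get_waiter("db_instance_available").wait(
            DBInstanceIdentifier=identifier
        )
        info = rds.describe_db_instances(DBInstanceIdentifier=identifier)
        host = info["DBInstances"][0]["Endpoint"]["Address"]

        result = work(host)
        if out_path is not None:
            with open(out_path, "w") as handle:
                handle.write(str(result))
        return result
    finally:
        # Destroy the instance no matter what happened above.
        rds.delete_db_instance(
            DBInstanceIdentifier=identifier,
            SkipFinalSnapshot=True,
        )
```

Redshift would follow the same create, wait, work, delete shape with its own client calls.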

You have to pass two $300 tests to get the basic certification. There are a lot of classes and study guides I need to put up here. I don't know if there are cheap or free tests to take.

I have one Cassandra test to take in May, so I might as well finish it out. After that, I think the order is Vertica, Hadoop, CouchDB (though I consider this a minor certification), and maybe VoltDB if it looks economically worth it. If I had known Vertica was cheap, I would have done it instead of starting on Cassandra.

I just started this, so I apologize there isn't much here. I chose Cassandra because I have reservations about Hadoop. VoltDB and Vertica also interest me and I might want certifications in those, as well as CouchDB.

Some quick stuff:

Preparation
There has to be more preparation stuff, like YouTube.


Getting certified
  • https://academy.datastax.com/Certifications
Cassandra Tutorials (didn't take these yet) from Udemy
  • Learn Apache Cassandra from Scratch
  • There are about 3 or 4 more courses at Udemy for Cassandra.

------------------------------------------------
Other misc crap that needs its own blog
CouchDB
I am not thrilled by these options. I might look for more. 
  • https://www.udemy.com/learn-nosql-database-design-with-couchdb/
  • http://www.kerneltraining.com/couchdb
    • Not sure if this is expensive. 
  • http://www.vskills.in/certification/Certified-Apache-CouchDB-Professional
    • I have no idea if we can even do it. It's in India, for the government.
VoltDB
  • https://university.voltdb.com/
    • $2000 YIKES!!!!
Hadoop
  • http://hortonworks.com/training/certification/hdpca-certification/
    • $250 makes this attractive. Maybe more than cassandra, or both. 
Vertica
  • From HP. I suspect these exams are not too expensive. Need to check. This also is very attractive. 
    • https://my.vertica.com/resources/certification/
    • https://my.vertica.com/resources/certification/client-certification/
AWS https://aws.amazon.com/certification/
With the direction the cloud is going, it is important as a DBA to know AWS inside and out to best make recommendations and to handle cloud environments. If you get this, it also works in your favor for other cloud environments.
  • From AWS -- The exams are $150 each I think, but whatever, they are cheap. 
    • https://aws.amazon.com/certification/certified-solutions-architect-associate/
    • https://aws.amazon.com/certification/certified-solutions-architect-professional/
    • https://aws.amazon.com/certification/certified-sysops-admin-associate/
    • https://aws.amazon.com/certification/certified-devops-engineer-professional/
    • Also, they have practice exams for $20, I heavily recommend these. 
  • Udemy -- just cheap classes
    • https://www.udemy.com/aws-certified-developer-associate/
    • https://www.udemy.com/aws-solutions-architect-associate-certification-course2016/
    • https://www.udemy.com/aws-certified-solution-architect-associate-level-training/
    • https://www.udemy.com/aws-certified-solutions-architect-associate/
    • https://www.udemy.com/aws-certified-sysops-administrator-associate/
  • Online documentation
    • https://aws.amazon.com/documentation/vpc/
    • https://aws.amazon.com/documentation/
Chef certification --- This is good to learn. Chef is, I think, the 2nd most popular one out there.
  •  https://training.chef.io/certification
    • there are free tutorials
Puppet certification --- the most popular one
  •  https://puppet.com/support-services/certification



DAD

Tasks finished.

  1. Installed redis
    1. sudo apt-get -y install redis-server
    2. The default configuration, with no password and bound to the loopback interface, is fine. In production environments, of course, add a password.
    3. You may want to download and install the latest version. 
    4. https://redislabs.com/python-redis
      1. pip install redis


Well, now that I have Mongo and PostgreSQL certification, I am ready to start my DAD project again. It will initially support MySQL, Mongo, and PostgreSQL without replication. Then I will add in replication, Vertica, VoltDB, and Cassandra. There are different types of replication, and it can be a pain. I will post progress on DAD to this blog.

Project list:

Part 1
1. Load from YAML options file what database and location to run DAD from. Initially MySQL, Postgresql, or Mongo.
2. Basic connection for database.
3. Initial DAD setup scripts.
4. Store process information
5. Store OS information
6. Store misc information
7. Store diskspace according to database and according to OS.
8. One module for each database type with the same functions to get data.
9. Each module will have a support module.
10. Modules are dynamically loaded when needed and otherwise abort. Aborting one type of database doesn't kill the scripts from running with the other databases.
11. Log file for errors which rotates.
12. Global module is database independent.
13. All data transfer, inserting and selecting data, will be done by either JSON or a Python dictionary (probably a dictionary). Also, printing and inserting data will be checked on each field, so that a print won't abort a server if there is no info, and inserts won't abort if one field is missing. There should never be NULLs in fields. Use empty spaces. NULLs mean no data was inserted.
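
To make items 10, 11, and 13 concrete, here is a minimal sketch using only the standard library. Nothing here is final: the function names, the log size limits, and the backend module names in the usage note are hypothetical, and I am reading "empty spaces" as the empty string, which is an assumption.

```python
import importlib
import logging
from logging.handlers import RotatingFileHandler


def make_logger(path, max_bytes=1_000_000, backups=3):
    """Item 11: error log that rotates once it reaches max_bytes."""
    logger = logging.getLogger("dad")
    logger.setLevel(logging.ERROR)
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        logger.addHandler(
            RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        )
    return logger


def load_backends(module_names, logger):
    """Item 10: import each database module only when asked for.

    A module that fails to import is logged and skipped, so one missing
    driver does not stop the other databases from running.
    """
    loaded = {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as exc:
            logger.error("skipping backend %s: %s", name, exc)
    return loaded


def clean_row(row, fields):
    """Item 13: return a dict with every expected field present.

    Missing or None values become "" (assumed meaning of "empty spaces"),
    so a print or insert never trips over an absent field.
    """
    return {f: "" if row.get(f) is None else row[f] for f in fields}
```

Usage would be something like `load_backends(["dad_mysql", "dad_postgresql", "dad_mongo"], make_logger("dad.log"))`, where those three module names are placeholders for the per-database modules from item 8.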

Part 1-2.
1. Make global available dashboard on AWS for systems.

Part 2
1. Automatic deb packages. Dependency checks and not override later versions.
2. Automatic rpm packages. Dependency checks and not override later versions.
3. Everything in YAML will have variables that will not collide between versions, or minimally. It should work by default if no options are given and state what the default options are.
4. Encrypt data option which is needed at startup by a config file or a loaded program. Have program run only on localhost.
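
A small sketch of the "work by default and state what the defaults are" idea from item 3. It works on a plain dictionary, which is what parsing the YAML options file would hand back. The option names and default values are made up for illustration. Ignoring unknown keys is one possible way to keep options files from colliding between versions, not a decided design.

```python
# Hypothetical defaults; the real option names are not decided yet.
DEFAULTS = {
    "database": "mysql",
    "host": "127.0.0.1",  # item 4: run only on localhost by default
    "port": 3306,
    "encrypt": False,
}


def effective_options(user_options=None):
    """Merge user options over DEFAULTS.

    Keys that this version does not know about are ignored, so an options
    file written for a later version still loads.
    """
    options = dict(DEFAULTS)
    for key, value in (user_options or {}).items():
        if key in DEFAULTS:
            options[key] = value
    return options


def describe_defaults():
    """Return the defaults as text so the program can state them."""
    return "\n".join(f"{key}: {value!r}" for key, value in sorted(DEFAULTS.items()))
```

Calling `effective_options()` with nothing returns a copy of the defaults, so the program runs even with an empty options file.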

Part 3
1. voltdb
2. couchdb
3. cassandra
4. vertica

Part 4
1. Add in replication and cluster columns for all databases.

Part 5
1. Run DAD from other databases. (Have to choose which ones.)

Part 6
1. SQLite
2. bsd files with replication

-----------------------------------------------------
Misc stuff I needed to do:


  1. Install Percona MySQL 5.7. I used Linux Mint instead of Ubuntu because Ubuntu hasn't figured out in 10 years their stupid Unity desktop sucks rocks. 
    1. wget https://repo.percona.com/apt/percona-release_0.1-3.trusty_all.deb
    2. sudo dpkg -i percona-release_0.1-3.trusty_all.deb 
    3. sudo apt-get update
    4. sudo apt-get install percona-server-server-5.7
    5. It uses a plugin system for passwords now, and you have to be root to log in when you don't have a password. You have to change the plugin used for the account and set a password. Overall it's a good move.
Mongo and PostgreSQL were already installed. For now, I am just doing the dashboard without replication.

The memory overhead for these 3 systems is VERY low. So I just leave them on.

Driver installation:

  • postgresql
    • http://initd.org/psycopg/docs/install.html
      • sudo apt-get install python-psycopg2
      • OR
      • pip install psycopg2
  • MySQL
    • sudo apt-get install python-mysqldb
    • I didn't use the pip version. 
  • Mongo
    • python -m pip install pymongo
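
A quick way to check that the three drivers above are importable, without connecting to anything. This is only a sketch. The import names (`psycopg2`, `MySQLdb`, `pymongo`) are the ones those packages provide, and the function name is my own.

```python
from importlib.util import find_spec

# Database type -> Python module name provided by the driver package.
DRIVERS = {
    "postgresql": "psycopg2",
    "mysql": "MySQLdb",
    "mongo": "pymongo",
}


def installed_drivers():
    """Return {database type: True/False} depending on whether the driver imports."""
    return {db: find_spec(module) is not None for db, module in DRIVERS.items()}


if __name__ == "__main__":
    for db, ok in installed_drivers().items():
        print(db, "ok" if ok else "MISSING")
```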