Cron Job Note - 2
Refresh the database (NoSQL) - Cassandra
The sample scripts back up the data from the production database and use it to refresh the staging or test database. They are not intended for restoring data after database corruption.
Back up the Cassandra database nightly
- The keyspace hho_ks on the Cassandra nodes stores the production data.
- Every night the staging Cassandra server is refreshed with the production snapshot.
- This solution does not use incremental snapshots; each run takes a full snapshot.
- The production and staging nodes are running within the same subnet.
- The backup script runs nightly on the production server.
Cron job setting
00 20 * * * <user> /home/<user>/bin/cass_snapshot.sh >> /home/<user>/bin/refresh.log 2>&1
This entry runs the script at 20:00 every day and appends its output to refresh.log. The <user> field is only valid in the system crontab (/etc/crontab or a file under /etc/cron.d); omit it in a per-user crontab edited with crontab -e.
The script to create a snapshot:
cass_snapshot.sh
#!/bin/bash
# The log file is /home/<user>/bin/refresh.log

# SET staging Cassandra IP
CASS_STG_IP=0.0.0.0

echo "$(date): Beginning refresh of staging Cassandra ${CASS_STG_IP}"
START=$(date +%s)
cd /data/cassandra

# Prepare a schema script of keyspace hho_ks
echo "$(date): Prepare a schema script of keyspace hho_ks"
sudo bash -c "cqlsh -e 'DESC KEYSPACE hho_ks' > hho_ks.cql"

# Remove old snapshots folder and create new snapshots folder with some sub-folders
echo "$(date): Remove old snapshots folder and create new snapshots folder with some sub-folders"
sudo rm -rf snapshots
sudo mkdir snapshots
cd snapshots
sudo mkdir -p service interval imported_file market_file meter_config
cd /data/cassandra

# Clear snapshot hho_ks
echo "$(date): Clear snapshot hho_ks"
# find /data/cassandra/data/hho_ks/ -name snapshots -type d
nodetool clearsnapshot hho_ks > /dev/null 2>&1
# find /data/cassandra/data/hho_ks/ -name snapshots -type d

# Create new snapshot (cut out the snapshot id into a variable)
echo "$(date): Create new snapshot"
SNAP_ID=$(nodetool snapshot hho_ks | cut -c 66-78)
# echo SNAP_ID=$SNAP_ID

# Copy snapshot data to new snapshot folder
# ~7mins
echo "$(date): Copy snapshot data to new snapshot folder"
for d in $(ls /data/cassandra/snapshots); do
  sudo cp -R /data/cassandra/data/hho_ks/$d*/snapshots/$SNAP_ID/. /data/cassandra/snapshots/$d/
done

# Create a tarball of snapshots
echo "$(date): Create a tarball of snapshots"
sudo rm -rf snapshots.tar.gz
sudo tar -zcvf snapshots.tar.gz snapshots/ > /dev/null 2>&1 # ~30mins

# Copy tarball and schema script to staging Cassandra
echo "$(date): Copy tarball and schema script to staging Cassandra ${CASS_STG_IP}"
scp snapshots.tar.gz <user>@${CASS_STG_IP}:/home/<user>/ # ~5mins
scp hho_ks.cql <user>@${CASS_STG_IP}:/home/<user>/

# Refresh snapshots on staging Cassandra
echo "$(date): SSH to staging Cassandra ${CASS_STG_IP}"
sudo ssh <user>@${CASS_STG_IP} 'bash -s' < /home/<user>/bin/cass_refresh.sh

echo "$(date): Completed refresh of staging Cassandra ${CASS_STG_IP}"
END=$(date +%s)
echo "Refresh duration: $(( $END - $START ))s"
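The snapshot script recovers the snapshot id by cutting fixed character columns (66-78) out of the nodetool snapshot output, which can break if the message wording or the keyspace name length changes. The following is a minimal sketch of one possible alternative, not the method used above: pick the snapshot name up front and pass it with the -t option of nodetool snapshot, so nothing has to be parsed. The function names make_snapshot_tag and take_snapshot and the refresh_ prefix are hypothetical, and the sketch assumes the installed nodetool accepts -t.

```shell
#!/bin/bash
# Hypothetical helper: build a snapshot tag from the current time,
# e.g. refresh_20240131200000 (the "refresh_" prefix is an assumption).
make_snapshot_tag() {
  echo "refresh_$(date +%Y%m%d%H%M%S)"
}

# Hypothetical helper: take a snapshot of a keyspace under a known tag.
# Assumes nodetool accepts: nodetool snapshot -t <tag> <keyspace>
take_snapshot() {
  local keyspace="$1" tag="$2"
  nodetool snapshot -t "$tag" "$keyspace" > /dev/null
}

# Possible use in cass_snapshot.sh (not executed here):
#   SNAP_ID=$(make_snapshot_tag)
#   take_snapshot hho_ks "$SNAP_ID"
```

With a tag chosen this way, the copy loop could keep using $SNAP_ID unchanged, because the snapshot directory is named after the tag.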
Refresh the staging Cassandra node
The script to refresh the staging node:
cass_refresh.sh
#!/bin/bash

# Unzip tarball
echo "$(date): Unzip tarball"
cd
rm -rf snapshots
tar -zxvf snapshots.tar.gz > /dev/null 2>&1 # ~5mins

# Attempt to drop keyspace hho_ks
# It will likely throw a java.lang.RuntimeException
echo "$(date): Drop keyspace hho_ks"
cqlsh -e "drop keyspace hho_ks" > /dev/null 2>&1
sleep 1m

# Start/restart Cassandra service
echo "$(date): Restart Cassandra service"
sudo systemctl start cassandra.service
sudo systemctl restart cassandra.service

# Wait for 5 minutes to ensure the Cassandra service is running
echo "$(date): Wait for 5 minutes to ensure the Cassandra service is running"
sleep 5m

# Check Cassandra service status
# sudo systemctl status cassandra

# Attempt to drop keyspace hho_ks again
# It will likely complain: Cannot drop non existing keyspace 'hho_ks'
cqlsh -e "drop keyspace hho_ks" > /dev/null 2>&1
sleep 1m

# Remove data folder hho_ks
echo "$(date): Remove data folder hho_ks"
sudo rm -r -f /var/lib/cassandra/data/hho_ks

# Recreate keyspace hho_ks
echo "$(date): Recreate keyspace hho_ks"
cqlsh --file="hho_ks.cql" > /dev/null 2>&1

# Copy snapshots to new data folder hho_ks
# ~5mins
echo "$(date): Copy snapshots to new data folder hho_ks"
for d in $(ls snapshots); do
  sudo cp -R snapshots/$d/. /var/lib/cassandra/data/hho_ks/$d*/
done

# Refresh keyspace
# ~10mins
echo "$(date): Refresh keyspace"
for d in $(ls snapshots); do
  nodetool refresh -- hho_ks $d
done
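The refresh script waits a fixed five minutes after restarting the service, whether or not Cassandra is ready sooner (or still not ready afterwards). The following is a minimal sketch of a polling alternative, offered as an illustration and not as part of the original script: it retries a probe command until it succeeds or a timeout elapses. The name wait_for is hypothetical, and using cqlsh -e 'DESCRIBE KEYSPACES' as the readiness probe is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds or a timeout elapses.
# Usage: wait_for <timeout_seconds> <interval_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if the timeout is reached first.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@" > /dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# Possible use in cass_refresh.sh (not executed here), in place of "sleep 5m":
#   wait_for 300 10 cqlsh -e 'DESCRIBE KEYSPACES' || exit 1
```

Polling this way lets the script continue as soon as the node answers queries, and fail with a non-zero status when it never comes up, which the fixed sleep cannot signal.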