Thursday, February 13, 2020

Netapp: Excessive DNS queries by Netapp Harvest Server against monitored Netapp Cluster


Synopsis: NetApp NAbox, if deployed from the 2.5 OVA image (the GA release), is good in terms of its use case. However, the server generates thousands of DNS queries against the monitored cluster in less than 24 hours.

I ran into a problem where the DNS server got choked: the NAbox server generated about 70K DNS requests (both A and AAAA lookups) in less than 24 hours.

Resolution:
1.     Use the IP address of the cluster (i.e., the cluster management IP) as the source for the cluster to be monitored.
2.     Use the beta of the newer 2.6 release, which bundles dnsmasq as a local resolver.
I went with option 2, as it was seamless and worth it to keep using hostnames rather than IP addresses; see the sketch below.
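
To verify the query volume yourself (and to confirm the local resolver absorbs it after the upgrade), a packet capture on the DNS server is a quick check. A minimal sketch, assuming the NAbox server's IP is 10.0.0.50 and the DNS server's capture interface is eth0 (both placeholders):

# Capture DNS queries arriving from the NAbox server over a sample window
tcpdump -nn -i eth0 'udp port 53 and src host 10.0.0.50' -w nabox-dns.pcap
# Afterwards, tally the captured queries
tcpdump -nn -r nabox-dns.pcap | wc -l

With the 2.6 bundled dnsmasq acting as a local caching resolver, repeated lookups of the cluster hostname should be answered on the NAbox itself, and this count should drop dramatically.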

You are Welcome :)

Wednesday, February 12, 2020

Cohesity: Basics of System Internals, Key Concepts, and Services of the Cohesity Solution


Problem: It is important to use common terminology while administering a solution, and it's even more important to know what each service or term actually does in meeting what is expected of the solution. It helps connect the dots.

What are the system internals of Cohesity, and what is each internal service responsible for?

Magneto Service:
Magneto is the heart of the Cohesity solution and powers Cohesity's integrated backup software, DataProtect. It is the backup engine that integrates the backup service with VMware vCenter (VADP), Oracle, SQL Server, Pure Storage, NetApp, and more.

IRIS Service:
This service provides the user interface for interacting with Cohesity resources by allowing access to the cluster and its services, and it facilitates REST API orchestration. It is responsible for the web GUI for system management and the CLI for advanced commands; a sample API call is sketched below.
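
Since IRIS fronts the REST API, most automation starts by requesting an access token and then calling the public endpoints with it. A minimal sketch using curl, assuming a cluster reachable at cohesity-cluster.domain.com and a local admin account (both placeholders; the exact payloads are documented in the REST API reference on your cluster):

# Request an access token from the cluster (IRIS services this call)
curl -sk -X POST https://cohesity-cluster.domain.com/irisservices/api/v1/public/accessTokens \
  -H 'Content-Type: application/json' \
  -d '{"username": "admin", "password": "changeme", "domain": "LOCAL"}'
# Use the returned accessToken to query cluster details
curl -sk https://cohesity-cluster.domain.com/irisservices/api/v1/public/cluster \
  -H 'Authorization: Bearer <accessToken>'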

Gandalf Service:
This is the cluster health and state manager service. It holds the distributed lock manager and the binary configuration used by other processes.

Nexus Service:
This manages platform-level operations and starts/stops OS-level services. Common tasks it performs include node discovery, cluster initialization, VIP management, and IP/DNS/NTP configuration. This service also manages Non-Disruptive Upgrades (NDU).

Scribe Service:
This service is responsible for distributed metadata and journaling. It uses the Paxos algorithm for data consistency and holds file system metadata such as files, snapshots, segments, dedupe chunks, and data locations.

Bridge Service:
This is the main engine controller for I/O operations. It exposes file system types (Views: SMB, NFS, and S3) to clients. It is responsible for updating Scribe with data locations, chunking byte streams, and sending them to the blob store.

What is Blob?
A blob is a system object that is simply a logical allocation of bricks; a brick, in turn, is the entity that makes up small files (<=8MB), whereas a MegaFile (256GB+) is built from multiple blobs spread across different nodes. The hierarchy therefore looks like this: inodes are the smallest entities within an inode SnapTree and define file system objects and hierarchy; inodes point to blobs, and blobs are made up of bricks.

Hydra Service:
This is a front-end write cache for Bridge. When data is written to Cohesity, it first lands in Hydra in native format, and reads can be served directly from Hydra if needed. Hydra then flushes the data to the blob store on SSD or HDD, depending on the I/O profile.

Yoda Service:
Yoda is the indexing engine, responsible for indexing files within VMs and/or other backed-up objects, depending on whether the protection policy has indexing enabled. Yoda kicks in once the backup job is completed.

Genie Service:
This is essentially the service that enables the remote tunnel for support. It also handles some monitoring functions, such as checking hardware heartbeat and health.

Eagle Service:
This service enables data reporting to support (call-home) and supports dark-site collection/upload/analysis. It also analyzes support data for install base, support, and analytics in terms of statistics and alerts, and it further includes sensors that monitor hardware and software.


Note: All the information here is cumulative knowledge gathered from working with field engineers and account systems engineers, from Cohesity documentation on the support site, and from personal understanding built through conversations with support techs and hands-on experience. I have tried my best to reflect accuracy. This is strictly for audiences who are new to Cohesity and/or current Cohesity users.

You Are Welcome

VMWare: Restore a VM using a Netapp snapshot in an NFS-exported Datastore Environment

Problem: Restore a VM that resides on an NFS-exported datastore using a NetApp native snapshot, NOT a VMware snapshot. The use case: a virtual disk on a Windows server cannot be expanded while that VM has a snapshot on it. This method adds one more way of restoring a VM if you don't have a solid backup in place, and/or it can be used as a secondary plan should the primary backup become unavailable.
1.    Find which ESXi host this VM resides on.
a.   This can be done either through the vCenter GUI or through the VMware scripts.
b.   Look at each disk of the VM; EVERY virtual disk MUST be on the NFS datastore to be recoverable.
c.   In our example, the disks are all on NetApp volumes (netappdatastore2); see the sketch below for a quick way to confirm this from the ESXi shell.
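
A quick way to double-check the registration and disk locations from the ESXi shell; a minimal sketch, assuming the VM is named VMName (a placeholder):

# List registered VMs along with their datastore and .vmx path
vim-cmd vmsvc/getallvms | grep -i VMName
# Confirm every virtual disk in the VM definition lives on the NFS datastore
grep -i vmdk /vmfs/volumes/netappdatastore2/VMName/VMName.vmx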
2.     Shut down/power off the VM.
3.     Unregister the VM from vCenter (Remove from Inventory). DO NOT DELETE FROM DISK.
4.     Log in to the ESXi host as user root; in this case, ESX-Server-01.domain.com.
a.   ssh root@ESX-Server-01.domain.com
b.   Find the NetApp volume once you log in.
c.   [root@ESX-Server-01:~] df -h
Filesystem  Size   Used   Available  Use%  Mounted on
NFS         20.0T  15.0T  5.0T       75%   /vmfs/volumes/netappdatastore2
5.    cd into that file system.
[root@ESX-Server-01:~] cd /vmfs/volumes/netappdatastore2
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2] pwd
/vmfs/volumes/netappdatastore2
6.    Make sure there is a .snapshot directory and that it contains snapshots for the VM.
7.   cd into the .snapshot directory and locate the desired snapshot based on its timestamp.
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2] cd .snapshot/
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2/.snapshot] ls
Snapshot.1
Snapshot.2
Snapshot.3
8.   Find the existing VM directory and move it to a backup directory to save the existing state.
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2] mv VMName VMName.backup
9.   Recover the VM directory from the .snapshot directory; you MUST choose the right snapshot directory by time frame. cd into the desired snapshot:
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2/.snapshot] cd Snapshot.3
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2/.snapshot/Snapshot.3] ls -ld VMName
10. The rsync command is recommended over cp (rsync has the -S option to handle sparse files more efficiently, and thin-provisioned vmdk files are usually sparse files, i.e., files with holes in the blocks). If rsync is not present in your ESXi shell, the copy can instead be run from a Linux host that mounts the same NFS export.
[root@ESX-Server-01:/vmfs/volumes/netappdatastore2/.snapshot/Snapshot.3] rsync -av -S VMName/ ../../VMName
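
Before moving on to re-registration, it may be worth a quick sanity check that the copy is complete; a small sketch (VMName is still a placeholder):

# Compare the apparent sizes of the snapshot copy and the restored directory
du -sh VMName ../../VMName
# The restored directory should contain the .vmx file and all .vmdk files
ls -l ../../VMName/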
11. Re-register the VM in vCenter with the newly recovered directory.
a.   Log in to vCenter.
b.   Browse to the ESXi host the VM was running on.
c.   Go to Storage for the ESXi host and browse the storage file systems on the ESXi host.
d.   Browse to the directory where you have the recovered VM files.
e.   Go into that directory and register the VM (right-click the VM definition file, the .vmx, and choose Register VM).
12.  Power the VM back on.
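
If you prefer to stay in the ESXi shell, steps 11 and 12 can also be done with vim-cmd; a minimal sketch, again using the placeholder VMName:

# Register the recovered VM directly on the host; this prints the new vmid
vim-cmd solo/registervm /vmfs/volumes/netappdatastore2/VMName/VMName.vmx
# Power the VM on using the vmid returned above (e.g., 42)
vim-cmd vmsvc/power.on 42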


 You are Welcome :)