Network Architecture and Design Protocol for Distributed Storage: in-Kernel or CVM | vSAN


Hello! I’m Elver Sena Sosa, and I would like to talk about something I get asked quite a bit: when you have a distributed storage architecture, do you go in-kernel or CVM? What’s the difference? Here’s how I see it.

On one side we have three hosts running a CVM solution, a controller VM solution, where the controller VM gets its storage from the local drives on the host. It then turns around and presents a datastore, typically to ESXi, and that datastore is used to store the VMs that are local to the host. On the other side we have a solution like vSAN, where the hosts are connected over the network and each VM has its VMDK objects stored in a disk group somewhere in the cluster.
The main difference, as I see it, comes down to how efficiently the kernel, the hypervisor, handles the load. In the case of vSAN, when a VM issues an IO command, the hypervisor receives it and hands it over to the vSAN process, which then sends the IO request down to wherever the disk group is located. That is basically the whole path. So vSAN is going to consume what it needs and no more, since the hypervisor is the one handling everything and the hypervisor is always there. In terms of compute, CPU and memory, the consumption is fairly minimal because it all stays inside the hypervisor.
Now when you come to the CVM, the thing about it when it comes to performance is that when a VM sends IO to its hard drive, to its VMDK, the hypervisor comes in and sends that traffic to the internal datastore presented by the CVM, which then goes up to the CVM. The CVM then has to issue that same command down to its own VMDK, which means the hypervisor has to come back in, take that traffic, and send it down to the local datastore. So you have the same processes in that hypervisor doing multiple rounds of work to complete a single IO command. There is that part of the efficiency.
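To make that difference concrete, here is a minimal sketch, purely my own illustration and not something from the video, that counts the handoffs a single IO goes through in each model. The step names are labels I made up for the two paths Elver describes, not actual product components.

```python
# Rough sketch of the two IO paths described above.
# Step names are illustrative labels, not real product components.

IN_KERNEL_PATH = [
    "guest VM issues IO to its VMDK",
    "hypervisor receives the IO",
    "in-kernel vSAN module forwards it to the disk group over the network",
]

CVM_PATH = [
    "guest VM issues IO to its VMDK",
    "hypervisor receives the IO",
    "hypervisor passes it up to the CVM via the presented datastore",
    "CVM (running its own guest OS) reissues the IO to its own VMDK",
    "hypervisor comes back in and sends it down to the local disks",
]

def describe(name, path):
    """Print how many handoffs one IO takes in a given model."""
    print(f"{name}: {len(path)} steps")
    for i, step in enumerate(path, 1):
        print(f"  {i}. {step}")

describe("In-kernel (vSAN)", IN_KERNEL_PATH)
describe("CVM", CVM_PATH)
```

The point of the sketch is simply that the CVM path re-enters the hypervisor for the same IO, which is the extra work discussed next.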
That efficiency question by itself is one thing; then there is the consumption of resources. How much does it take when the same hypervisor, the same processes, are doing multiple rounds of work on the same server? And the CVM itself has its own OS, which by itself consumes resources that are not strictly needed for the execution of the IO. The IO itself is still going to run, but there is that overhead. So overall I think that, when it comes to design, in-kernel will give you better performance for the same IO than something that is not running in-kernel. Now that is one part; there is also the part about data locality.
I’ve heard some customers come and say, “Hey, the CVM gives me better data locality,” but I happen to believe that data locality, with an appropriately configured network, is going to give you no real value, because the port-to-port latency in data center switches today is negligible for most applications out there.
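As a rough back-of-the-envelope check on that claim, here is a small sketch using ballpark figures I am assuming myself, roughly a microsecond of port-to-port switch latency against around a hundred microseconds for a flash read. The numbers are not from the video and will vary with your hardware.

```python
# Ballpark figures (assumed, not measured): adjust for your own hardware.
switch_port_to_port_us = 1.0   # ~1 microsecond through a data center switch
flash_read_us = 100.0          # ~100 microseconds for a typical flash device read

# A remote read crosses the switch out and back on top of the device read.
total_remote_read_us = flash_read_us + 2 * switch_port_to_port_us
overhead_pct = (total_remote_read_us - flash_read_us) / flash_read_us * 100

print(f"Local read:  ~{flash_read_us:.0f} us")
print(f"Remote read: ~{total_remote_read_us:.0f} us ({overhead_pct:.0f}% added by the network)")
# With these assumptions the network adds on the order of a couple of percent,
# which is why data locality buys little on a properly configured network.
```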
But let’s assume for a second that data locality does matter to you, to your application, and the CVM has your data locally. Well, what happens when a VM, say this one, vMotions to another host? Before that happens, that VM’s VMDK, its data, its files, are local to the host, so performance looks like this, and remember we are assuming data locality matters. But when the VM moves over here, the CVM on the destination host has to send requests to the CVM on the host the VM came from, so that it can send the blocks the VM wants across the network. Which means that if data locality matters, your performance is going to take a dip when the vMotion happens and then slowly pick back up as the data is migrated over to the new host. And notice that I said the data is migrated, which means that any time a VM vMotions to another host, its data has to be copied from the source host, and that is going to take network resources away.
Now when you have this solution, the distributed one, where data locality is assumed not to matter, and a VM vMotions to another host, all that happens is that the host starts pointing over here; there is no data migration. And because the IO is going over the network anyway, and the assumption here is that data locality doesn’t matter, the performance stays constant.
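To picture those two curves, here is a small sketch, my own illustration with made-up numbers, that models relative read performance over time around a vMotion event for both approaches.

```python
# Toy model of relative read performance around a vMotion at t = 5.
# All numbers are made up purely to illustrate the shape of the two curves.

def cvm_with_locality(t, vmotion_at=5, recovery=10):
    """Dips when the VM moves, then recovers as data migrates to the new host."""
    if t < vmotion_at:
        return 1.0                   # data is local: full performance
    migrated = min(1.0, (t - vmotion_at) / recovery)
    return 0.6 + 0.4 * migrated      # dip, then climb back as blocks are copied

def distributed_no_locality(t):
    """IO already goes over the network, so the vMotion changes nothing."""
    return 0.95                      # constant; the value itself is arbitrary

for t in range(0, 20, 2):
    print(f"t={t:2d}  CVM+locality={cvm_with_locality(t):.2f}  "
          f"distributed={distributed_no_locality(t):.2f}")
```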
And with that, that’s my two cents when it comes to distributed storage and in-kernel versus a CVM, one or the other. Thank you for watching! I’m Elver Sena Sosa. Have a good day!
