Introduction
I have been working with virtualization and container systems for over a decade and a half. My first experience with putting multiple machines on a single host was likely OpenVZ. I have worked, to one degree or another, with systems such as OpenVZ, Xen, KVM, Docker, and VMware ESXi, and with management systems such as vCenter, Proxmox VE, Nutanix Prism, and XenServer. I have also run storage arrays from NetApp, Pure, EMC, HPE, and IBM, as well as hyper-converged solutions such as Ceph and Nutanix AVS. I have designed and built enterprise storage and virtualization solutions for over a decade. What follows is an action plan based on that experience.
This document makes my case for supplementing Proxmox VE’s storage capabilities. I believe the time is right, and that an opportunity exists now that will not exist again in the near future.
I-Executive Summary
Proxmox VE had its 1.0 release in late 2008 and, as of this writing, is on version 9. It supports both traditional virtual machines (VMs) and container-based infrastructure. The underlying operating system is Debian Linux, and it uses a web-based UI for operational management.
While Proxmox is an excellent choice for SMB systems, expanding the features of its back-end storage infrastructure would make integration into enterprises much simpler. This integration would expand its customer base and give enterprises more choice in the software that they consume.
The shift in the relationship that customers have with VMware is a key driver of the need for these options. This shift has threatened the consumers of this infrastructure, and likely some of its producers as well.
Within these shifts lies opportunity, however. For a consumer, the opportunity is to have greater control over their infrastructure and to reduce the impact of market shifts. For infrastructure producers, the opportunities are the ability to catch a transitional wave and to have redundancy in the upstream software that is critical for driving revenue.
II-The Virtualization Landscape Has Shifted
In 2023, Broadcom closed a deal to buy VMware for $69 billion USD, including taking on $8 billion USD in debt. Since the sale, Broadcom has made multiple changes to the structure of contracts and the options for purchasing VMware products. These include:
- Typical subscription increases of 300-1000%
- Transition to subscription licensing instead of perpetual licensing
- Forcing users to relinquish perpetual licenses when obtaining a subscription
- Audits of users who do not renew
- “Bundling” of products that may or may not be consumed by a user
- Shifting pricing with no assurance of future consistency
- Shift from a per CPU socket to a per CPU core pricing model
- Elimination of many VMware resellers, disrupting supply chain continuity and eliminating advocates for the consumers of VMware products
These changes present a direct threat to businesses not only in terms of cost, but also in terms of control of their own infrastructure. If a business transitions to a subscription-based model and surrenders its perpetual licenses, it is legally obligated to stop using VMware if it does not pay for renewals. This gives VMware extensive leverage in renewal negotiations and an opportunity for predatory pricing. The exposure could be significant, not only in fiscal and legal cost but also in system availability and data access.
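To make the cost exposure concrete, the move from per-socket to per-core pricing listed above can be sketched as simple arithmetic. This is a minimal illustration only: the `Host` type, the function names, the optional per-socket core floor, and every price and core count are hypothetical placeholders, not vendor figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    sockets: int
    cores_per_socket: int


def per_socket_cost(hosts, price_per_socket):
    """Cost when each populated CPU socket needs one license."""
    return sum(h.sockets for h in hosts) * price_per_socket


def per_core_cost(hosts, price_per_core, min_cores_per_socket=0):
    """Cost when each core is licensed, with an optional per-socket floor."""
    total = 0
    for h in hosts:
        billable_cores = max(h.cores_per_socket, min_cores_per_socket)
        total += h.sockets * billable_cores * price_per_core
    return total


# All figures are hypothetical placeholders, not real list prices.
cluster = [Host(sockets=2, cores_per_socket=32)] * 4
old_cost = per_socket_cost(cluster, price_per_socket=1000)
new_cost = per_core_cost(cluster, price_per_core=100, min_cores_per_socket=16)
increase_pct = (new_cost - old_cost) / old_cost * 100
print(f"per-socket: {old_cost}, per-core: {new_cost}, increase: {increase_pct:.0f}%")
```

The point of the sketch is that under per-core pricing the bill scales with core density, so hosts that were consolidated onto high-core-count CPUs see the largest change.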
For companies that provide storage arrays, this shift is likely to be most disruptive to block (SAN) storage portfolios. VMware is a critical part of what makes enterprise SAN storage viable in its modern form. Instead of a large number of lower-powered servers connecting to a SAN, servers with more resources (CPU, RAM) are connected to the SAN and each hosts multiple VMs. This reduces data center footprint and significantly decreases the overall port count for Ethernet and Fibre Channel. Without virtualization, the cost/benefit equation shifts significantly, and storage islands may become the standard architecture for cost reasons.
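The port count argument can be illustrated with a rough calculation. This is a sketch under stated assumptions: the function name, the workload count, the consolidation ratio, and the two ports per host (one per fabric) are all hypothetical values chosen for illustration.

```python
import math


def san_ports(workloads, workloads_per_host, ports_per_host=2):
    """Host-side SAN ports needed when each host carries a fixed number of workloads."""
    hosts = math.ceil(workloads / workloads_per_host)
    return hosts * ports_per_host


# Hypothetical environment: 200 workloads, dual-fabric connectivity.
bare_metal_ports = san_ports(200, workloads_per_host=1)
virtualized_ports = san_ports(200, workloads_per_host=25)
print(f"bare metal: {bare_metal_ports} ports, virtualized: {virtualized_ports} ports")
```

With these placeholder numbers the same workloads need a small fraction of the switch ports once consolidated, which is the economy that makes a shared SAN attractive in the first place.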
III-Making Proxmox VE Storage Enterprise Ready
Proxmox currently has multiple mechanisms to consume storage. These include ZFS, Ceph/RBD, LVM, and local file systems. These mechanisms work well in single-node or hyper-converged implementations, but are not readily integrated into an enterprise SAN environment. iSCSI support does exist, but it requires manual configuration, and the approach Proxmox recommends is shared LVM configured via the command line. This increases complexity and potential variability, and it limits certain features within Proxmox itself.
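To give a sense of the manual work involved, the sketch below assembles one plausible command sequence for shared LVM on an iSCSI LUN. It only builds and prints the commands; nothing is executed. The function name, portal address, IQN, device path, volume group name, and storage ID are hypothetical, and the sequence is an assumption about a typical setup rather than Proxmox's documented procedure. Multipath configuration is omitted entirely.

```python
import shlex


def shared_lvm_commands(portal, target_iqn, device, vg_name, storage_id):
    """Return the manual steps as argv lists; nothing is executed here."""
    return [
        # Discover and log in to the iSCSI target (needed on every node).
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
        # Create the volume group on the LUN (on one node only).
        ["pvcreate", device],
        ["vgcreate", vg_name, device],
        # Register the volume group with Proxmox as shared LVM storage.
        ["pvesm", "add", "lvm", storage_id, "--vgname", vg_name, "--shared", "1"],
    ]


# Hypothetical portal, IQN, device path, and names; substitute your own.
steps = shared_lvm_commands(
    portal="192.0.2.10",
    target_iqn="iqn.2001-05.com.example:lun0",
    device="/dev/mapper/mpatha",
    vg_name="vg_san01",
    storage_id="san01",
)
for argv in steps:
    print(shlex.join(argv))
```

Even in this simplified form, an administrator has to coordinate per-node and once-per-cluster steps by hand, which is where the variability comes from.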
The proposed solution is to use GFS2 as a replacement for VMware’s VMFS. VMFS is a critical element: it is the clustered file system that makes Logical Units (LUNs) exported to systems usable across multiple hosts. The addition of GFS2 would also include elements that reduce the barrier to entry for administrators by giving them a graphical interface for daily operations without command line use.
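As a rough picture of what a graphical workflow would need to automate, the sketch below builds the commands to format and mount a shared LUN as GFS2. It only constructs the commands and does not run them. The function name, cluster name, file system name, device path, and mount point are hypothetical. It assumes a working cluster with the distributed lock manager (DLM) already running on each node, and it allocates one journal per node, since each node that mounts a GFS2 file system needs its own journal.

```python
def gfs2_commands(cluster_name, fs_name, node_count, device, mount_point):
    """Return mkfs and mount steps for a shared GFS2 volume as argv lists."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # The lock table ties the file system to one cluster: "cluster:fsname".
    lock_table = f"{cluster_name}:{fs_name}"
    return [
        # Format once, from a single node, with one journal per node.
        ["mkfs.gfs2", "-p", "lock_dlm", "-t", lock_table,
         "-j", str(node_count), device],
        # Mount on every node that should see the volume.
        ["mount", "-t", "gfs2", device, mount_point],
    ]


# Hypothetical names and paths for a three-node cluster.
commands = gfs2_commands(
    cluster_name="pvecluster",
    fs_name="vmstore01",
    node_count=3,
    device="/dev/mapper/mpatha",
    mount_point="/mnt/vmstore01",
)
for argv in commands:
    print(" ".join(argv))
```

How Proxmox would then register the mounted file system as VM storage is left open here, since that integration is the subject of the proposal.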
IV-Risks and Obstacles
Risks include potential brand harm due to implementation of a previously untested stack. This risk should be mitigated via a strong beta program. GFS2 has been and is being used by other hypervisors as an underlying storage subsystem, which means a great deal of testing has already been done.
Lack of knowledge within the support structure of Proxmox is a hazard. This may be reduced via partner relationships or via a longer testing and documentation phase to build up reference material and skills.
Another risk is that Fibre Channel based storage systems go away in the long term. This is, in my opinion, a minimal risk, because the technology is easily adapted to other block storage technologies such as iSCSI or NVMe/TCP. It may be valuable to explore expanding into these protocols as part of the initial release anyway.
The biggest obstacle is lack of suitable test equipment. Storage systems are a significant capital expenditure across product lines. It may be worth investigating relationships with storage vendors to qualify hardware. These vendors likely have a vested interest in a VMware alternative for their customers, since VMware itself is a key consumer of block storage in many organizations.
V-Costs
Key costs include development, testing, documentation, and test equipment. Initially, the development equipment could be fairly minimal and fully virtualized. After the initial phase, a full SAN and multiple servers would need to be deployed for full performance and reliability testing.
Since the elements exist in Proxmox, the work should largely be an alteration of and addition to existing workflows. The exceptions are the implementation of the GFS2 subsystems and array integration, the latter of which could be done at a later time.
VI-Benefits and Opportunities
The biggest potential benefit for Proxmox is a chance to transition into the enterprise data center. This could lead to significant revenue increases and efficiencies from working with larger customers who have strategic teams. It could also present an opportunity to work with SMB customers who have existing storage systems they want to maintain and capitalize on.
A tertiary benefit is that Proxmox could find itself used in OEM implementations, much like Dell EMC did with VxRail or Nutanix has done with partners previously. These partnerships could improve hardware validation and broaden supported hardware via a growing install base.
VII-Adoption and Timeline
This should be looked at as soon as possible. It is likely that 3-6 months would be adequate for development, followed by another 12-18 months for proper testing and customer feedback.
VIII-Governance
Governance of the project should be handled by Proxmox themselves. There may be an arrangement between Proxmox and other server/storage vendors about the development path. Since Proxmox VE is an open source product, this governance should already exist and require minimal to no modification.