Abhishek Singh, OpenStack Best Practices
This article sheds some light on community OpenStack best practices for Ceph storage. It aims to assist storage administrators and operations engineers who are engaged in deploying multi-node OpenStack clusters.
The article first introduces Ceph and its benefits, and then covers Ceph best practices for different scenarios, looking at the periphery of the infrastructure. Readers are assumed to have a fundamental knowledge of OpenStack and its deployment.
Ceph is a software-defined storage solution for OpenStack. It aggregates different storage devices, including commodity storage, into an intelligent storage pool for various end users. A properly designed Ceph cluster can also provide high availability. Within OpenStack, Cinder provides volumes and Glance provides the image service. Like other object stores, Ceph needs a gateway, an intelligent service that categorizes the defined data and places it into object storage; in Ceph this is RadosGW. Ceph integrates with OpenStack Nova and Cinder through RADOS Block Devices (RBD). One benefit of Cinder with Ceph over the default volume back end, local volumes managed by LVM, is that Ceph is a distributed, network-available solution.
Another advantage of Ceph is copy-on-write, which allows an existing volume to serve as the source of unmodified data for another volume. New virtual machines based on templates and snapshots therefore consume significantly less space. Because the storage is distributed and available over the network, live migration is possible even for ephemeral disks. This proves handy when dealing with failing hosts and during infrastructure upgrades. Ceph’s integration with QEMU also makes it possible to use the Cinder QoS feature to prevent virtual machines from consuming all IOPS and storage resources.
The purpose of this article is to emphasise that a cloud deployment is more exposed to threats than a traditional environment. This is because the storage is accessible over the internet and, in addition, Ceph and other OpenStack services are installed on servers mostly with default options.
The article first discusses securing block and object storage with Ceph and then turns to securing connectivity between OpenStack and SAN/NAS solutions. RadosGW is a vulnerable component of object storage because it is exposed to RESTful HTTP requests. One suggestion made at the OpenStack Summit in Vancouver was to place a proxy appliance on a separate network between the virtual machines and RadosGW, with SSL termination, proxy forwarding and web authentication filtering enabled. Ceph has no centralized mechanism for managing object storage; access is managed with CephX on each device. This means that clients can interact directly with the object storage devices (OSDs). CephX works like Kerberos. Here’s the catch: CephX authentication applies only between the Ceph client and the Ceph server hosts. It does not extend beyond the Ceph client, so CephX policy does not apply when someone accesses the Ceph client from a remote host.
To control which functions of the monitors, OSDs and metadata servers a user may exercise, Ceph has another mechanism called “caps” (capabilities). Caps also restrict access to pools, so a user can be given access to certain pools and no access to others. In other words, this mechanism helps in building authorization policies.
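As an illustration of pool-scoped caps, the following Python sketch assembles a `ceph auth get-or-create` command line that gives a client read access to the monitors and read/write/execute access to a single pool. The client name `client.cinder` and the pool name `volumes` are assumptions chosen for illustration, and `build_caps_command` is a hypothetical helper. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_caps_command(client, pool):
    """Return the argv for a `ceph auth get-or-create` call that limits
    `client` to read-only monitor access and rwx access on one pool."""
    return [
        "ceph", "auth", "get-or-create", client,
        "mon", "allow r",
        "osd", f"allow rwx pool={pool}",
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute the client and pool of your deployment.
    argv = build_caps_command("client.cinder", "volumes")
    print(shlex.join(argv))
```

A client created this way has no caps on any other pool, which is the kind of per-pool policy described above.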
It is very important to understand how Ceph authenticates and the vulnerability that comes with it. Ceph uses keys for communication. The keys that authenticate a Ceph client are stored on the server in plaintext files, which is a vulnerability in any environment: if the server is compromised, the keys are exposed. To limit this risk, arbitrary users, portable machines and laptops should not be configured to talk to Ceph directly, because that would require storing the plaintext key files on more vulnerable machines and would compromise security. As a best practice, users can log in to a trusted, properly hardened machine and keep the plaintext authentication files there. So far Ceph does not include options to encrypt user data in object storage, so an out-of-the-box solution for encrypting the data is needed. Apart from these measures, one can apply best practices against denial-of-service (DoS) attacks, for example: limit the load from each client using QEMU I/O throttling features, limit the maximum number of open sockets per object storage device (OSD), limit the maximum number of open sockets per source IP, and use per-client throttling.
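The idea of throttling per client can be pictured with a token bucket. The sketch below is not Ceph or QEMU code; it is a minimal, generic illustration in Python, and the class names, the rate and the burst capacity are all hypothetical. Each client identifier (for example a source IP) gets its own bucket, so one noisy client cannot use up the allowance of the others.

```python
import time


class TokenBucket:
    """Allow about `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class ClientThrottle:
    """Keep one bucket per client identifier (for example a source IP)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, client_id):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[client_id] = bucket
        return bucket.allow()


if __name__ == "__main__":
    throttle = ClientThrottle(rate=5, capacity=10)  # hypothetical limits
    for _ in range(12):
        print(throttle.allow("192.0.2.10"))
```

The clock is injectable only so that the behaviour can be checked without waiting for real time to pass.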
Moving on to the second section, the article now discusses securing connectivity between OpenStack and SAN/NAS solutions, which is as important as securing block and object storage. Cinder and the storage back end communicate over a management path, using SSH or REST over SSL. It is advisable to keep the management interface on secure LANs and to use strong passwords for management accounts. Avoid default vendor passwords; role-based security and accountability can also be helpful forensic tools. Readers might now be wondering what can be done to secure the data path. There are many ways to do it. A strict checklist specifying hardening parameters can be applied to devices and components. For NFS, for example, this means stricter configuration options for exports and user management, proper access control lists (ACLs) so that only authenticated users can see the IP SAN, and any other settings that reduce the list of vulnerabilities. Proper ACLs matter because any server on the same IP segment as the iSCSI storage can, in theory, access the storage and perform read/write operations. Keeping files owned by root with permissions 600 is also advisable.
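A hardening checklist of this kind can be partly automated. As a minimal sketch, the Python function below checks whether a file is owned by root and has mode 600. The function name `permission_problems` is hypothetical, and which files to check is left to the reader, since the article does not name them.

```python
import os
import stat
import sys


def permission_problems(path, expected_uid=0, expected_mode=0o600):
    """Return a list of human-readable problems; empty means the file passes."""
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append(f"owner uid is {info.st_uid}, expected {expected_uid}")
    mode = stat.S_IMODE(info.st_mode)  # keep only the permission bits
    if mode != expected_mode:
        problems.append(f"mode is {mode:o}, expected {expected_mode:o}")
    return problems


if __name__ == "__main__":
    # Pass the files to audit on the command line.
    for path in sys.argv[1:]:
        for problem in permission_problems(path):
            print(f"{path}: {problem}")
```

The expected owner and mode are parameters so that the same check can be reused for files that are meant to belong to a service account instead of root.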
There are other ways of securing communications, such as CHAP, which helps identify the client through a username and password. When Cinder communicates with the storage it generates a random key, which Nova then uses when it connects over iSCSI, and thus a secure connection is established. Another important area to consider is encrypting exposed traffic with IPsec, which offers two modes: transport mode and tunnel mode. On the transmitting side, transport mode encrypts only the data portion and not the header, whereas tunnel mode encrypts both the header and the data portion. On the receiving side, an IPsec-compliant device must be able to decrypt the data packets, and for that to work the transmitter and receiver must share a key. This gives secure connectivity, although it can put some load on the network in return. Volumes that use block storage over Fibre Channel should have a zone manager, which can be configured in cinder.conf, for proper control.
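To make the cinder.conf remark concrete, the Python sketch below renders a small configuration fragment that turns on fabric zoning. It assumes the `zoning_mode` option and the `[fc-zone-manager]` section with `fc_fabric_names` as they appear in the Cinder Fibre Channel zone manager documentation; option names and placement should be checked against the release in use. The fabric names are hypothetical, and a real deployment needs further vendor-specific driver settings that are not shown here.

```python
import configparser
import io


def render_cinder_fragment(fabric_names):
    """Render a cinder.conf fragment enabling the Fibre Channel zone manager.

    Assumed options: zoning_mode and [fc-zone-manager]/fc_fabric_names.
    """
    cfg = configparser.ConfigParser()
    cfg["DEFAULT"] = {"zoning_mode": "fabric"}
    cfg["fc-zone-manager"] = {"fc_fabric_names": ",".join(fabric_names)}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical fabric names; use the names defined for your SAN.
    print(render_cinder_fragment(["fabric_a", "fabric_b"]))
```

Generating the fragment from code rather than editing it by hand makes it easier to keep the zoning settings identical across several Cinder volume hosts.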
To conclude, the OpenStack ecosystem is quite vulnerable in a typical installation, and there is considerable room for improvement in terms of security.