CEPH deployment model and solution - Highly Scalable, Open Source, Distributed File System
CEPH is perhaps one of the most talked-about open-source object storage projects, especially now that it can be used in OpenStack for primary (Cinder - volume) storage as well as secondary (Glance - image) storage. This technology summary attempts to provide you with as much relevant information as possible in the fewest words. Here are some of the most important features of this up-and-coming object store.
For further information, please refer to the Credits and Links section below, where you will find links to documentation and to companies that offer support services for this product. At a later stage I will write a similar piece on GlusterFS, which is, rightly or wrongly, often compared to CEPH.
Highlights
- Highly scalable (petabytes) distributed file system
- No single point of failure in architecture
- The project is around 8 years old
- Designed to be used on commodity hardware - written to cope with multiple hardware failures
- Self-managing - CEPH rebalances data automatically in response to hardware changes - failures, additions, upgrades
- A configurable replication policy governs the number of data copies
- A configurable 'CRUSH map' gives CEPH awareness of the physical environment, so data can be better protected against failures in different failure domains - disk, server, network, power, data centre
- Integrates with OpenStack
- Integrates with CloudStack for secondary storage, though there are caveats when using it for primary storage
Components
- Three ways to access the object store:
- Objects - the CEPH Object Gateway provides S3- and Swift-compatible APIs
- Host/VM - the CEPH block device (RBD) provides distributed block storage (see the RBD sketch after this list)
- File system - the CEPH file system is distributed, scale-out and POSIX-compliant
- A fourth way to access the object store is directly, via calls to the librados library (a minimal sketch follows this list)
- The object store is called RADOS - 'Reliable Autonomic Distributed Object Store'. It is made up of two components:
- OSDs - Object Storage Daemons - handle the storage and retrieval of objects on disk
- Monitors - maintain cluster membership and state, and provide consensus for distributed decision making
- Monitors and OSDs run on top of standard Linux operating system
- Each OSD writes objects to a local file system
- OSDs, monitors and gateways all run as user-mode code
- Additional architectural concepts are:
- Pools - logical partitions for storing objects. Parameters include ownership/access rules, replication count, CRUSH ruleset and number of placement groups
- CRUSH - data distribution algorithm - Controlled, Scalable, Decentralised Placement of Replicated Data
- CRUSH map - a manually created map of the 'failure domain hierarchy'. It describes the physical layout of the devices in the object store; each pool references a ruleset within it
- Placement groups - groupings of objects that CRUSH maps onto a set of OSDs; the number of placement groups is set per pool
- CEPH components should ideally be run on 3.0+ Linux kernels (these contain relevant bug fixes, the syncfs system call and the best available OSD-suitable file system support)
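As a concrete illustration of the librados access path mentioned above, here is a minimal sketch using the python-rados bindings that ship with CEPH. The configuration file path, the 'data' pool name and the object name are assumptions chosen for the example, not values mandated by CEPH.

```python
import rados

# Connect using the standard client configuration file.
# The conffile path and the 'data' pool name are assumptions for this sketch.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

try:
    # Create the example pool if it does not already exist.
    if not cluster.pool_exists('data'):
        cluster.create_pool('data')

    # Open an I/O context on the pool, then store and retrieve an object.
    ioctx = cluster.open_ioctx('data')
    try:
        ioctx.write_full('hello-object', b'stored directly via librados')
        print(ioctx.read('hello-object'))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```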
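Similarly, the CEPH block device can be driven programmatically through the python-rbd bindings. This is only a sketch: the 'rbd' pool and 'demo-image' names are assumptions for illustration.

```python
import rados
import rbd

# The 'rbd' pool and the image name are assumptions for this sketch.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')

try:
    # Create a 1 GiB image, then write and read back a small block of data.
    rbd.RBD().create(ioctx, 'demo-image', 1024 ** 3)
    image = rbd.Image(ioctx, 'demo-image')
    try:
        image.write(b'block device payload', 0)   # data, offset
        print(image.read(0, 20))                  # offset, length
    finally:
        image.close()
finally:
    ioctx.close()
    cluster.shutdown()
```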
Deployment Architecture
(Varies according to the intended access method and throughput requirements)
- A sensible minimum system (one that provides data redundancy as well as respectable performance) should have:
- 3 nodes (servers) running the monitor service
- 3 nodes (these could be the same nodes as the monitor service) running OSDs (one per data path, i.e. one per disk or array). This would provide 2-copy replication
- 2 nodes running the object gateway (in an active/passive configuration)
- The underlying file system can be ext4 or XFS for production, or btrfs or ZFS for development/testing
- Disks can be presented as JBODs, or as single-drive RAID0 arrays where a battery-backed write-back cache on the array controller improves performance
- An optional, higher-performance configuration would use SSDs for the file system journals
- 10Gb Ethernet for the data network and a separate 1Gb network for management
- RAM sizing should take into account the number of OSDs on each node, as well as whether the node is also running a monitor service or a gateway
- Ubuntu 12.04 is the recommended OS; Debian and CentOS are also supported
- An example configuration for a more sophisticated deployment:
- 12 x 2U, 12-disk servers for storage
- 3 monitors running on the OSD nodes
- 3 separate S3/Swift gateway servers (see the gateway access sketch after this list)
- A pair of load-balancer appliances to spread the gateway load
- Separate front-end and back-end storage networks so that replication traffic traverses an isolated network
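To illustrate how clients would reach the S3/Swift gateway servers behind the load balancers, the sketch below uses the boto S3 library against the gateway's S3-compatible API. The endpoint hostname and the credentials are placeholders; in practice the host would be the load-balanced gateway address.

```python
import boto
import boto.s3.connection

# The gateway hostname and credentials are placeholders; in the deployment
# above the host would be the load-balanced address in front of the
# RADOS gateway servers.
conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    host='objects.example.com',
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

# Create a bucket and upload a small object through the S3-compatible API.
bucket = conn.create_bucket('demo-bucket')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('hello from the CEPH object gateway')

# List the buckets visible to this user.
for b in conn.get_all_buckets():
    print(b.name, b.creation_date)
```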
Reference: http://www.openclouddesign.org/artic...ed-file-system