Ceph Ansible baremetal deployment

How many times have you tried to install Ceph? How many times has it failed for no apparent reason?

Most Ceph operators will agree with me when I say that the stock Ceph installer doesn't really work as expected so far.

Yes, I'm talking about ceph-deploy, and it's the main reason I'm posting this guide about deploying Ceph with Ansible.

In this post, I will show how to install a Ceph cluster with Ansible on bare-metal servers.

My configuration is as follows:

  1. 3 x Ceph monitors with 8 GB of RAM each

  2. 3 x OSD nodes with 16 GB of RAM and 3 x 100 GB disks each

  3. 1 x RadosGateway node with 8 GB of RAM

First, download the ceph-ansible playbooks:

git clone https://github.com/ceph/ceph-ansible/
Cloning into 'ceph-ansible'...
remote: Counting objects: 5764, done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 5764 (delta 7), reused 0 (delta 0), pack-reused 5726
Receiving objects: 100% (5764/5764), 1.12 MiB | 1.06 MiB/s, done.
Resolving deltas: 100% (3465/3465), done.
Checking connectivity... done.

Move into the newly created folder called ceph-ansible:

cd ceph-ansible/

Copy the sample variable files; we will configure our environment in these files:

cp site.yml.sample site.yml
cp group_vars/all.sample group_vars/all
cp group_vars/mons.sample group_vars/mons
cp group_vars/osds.sample group_vars/osds
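
Since the inventory below also defines an rgws group, you may want a dedicated variable file for that role as well. Assuming your checkout ships an rgws sample (it did at the time of writing), the copy looks the same:

cp group_vars/rgws.sample group_vars/rgws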

The next step is to configure the inventory with our servers. I don't really like using the /etc/ansible/hosts file; I prefer to create a new inventory file per environment inside the playbook's folder.

Create a file with the following content, using your own IPs to match your servers to the desired roles inside the cluster:

[root@ansible ~]# vi inventory_hosts

[mons]
192.168.1.48
192.168.1.49
192.168.1.52

[osds]
192.168.1.50
192.168.1.53
192.168.1.54

[rgws]
192.168.1.55
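
If Ansible needs a non-default SSH user or key to reach these hosts, you can set them as inventory variables (standard Ansible inventory syntax; the root user and key path here are assumptions matching the prompts in this post):

[all:vars]
ansible_ssh_user=root
ansible_ssh_private_key_file=~/.ssh/id_rsa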

Test connectivity to your servers by pinging them through Ansible's ping module:

[root@ansible ~]# ansible -m ping -i inventory_hosts all
192.168.1.48 | success >> {
    "changed": false,
    "ping": "pong"
}

192.168.1.50 | success >> {
    "changed": false,
    "ping": "pong"
}

192.168.1.55 | success >> {
    "changed": false,
    "ping": "pong"
}

192.168.1.53 | success >> {
    "changed": false,
    "ping": "pong"
}

192.168.1.49 | success >> {
    "changed": false,
    "ping": "pong"
}

192.168.1.54 | success >> {
    "changed": false,
    "ping": "pong"
}

192.168.1.52 | success >> {
    "changed": false,
    "ping": "pong"
}

Edit the site.yml file. I will remove/comment out the mds hosts since I'm not going to use them.

[root@ansible ~]# vi site.yml

- hosts: mons
  become: True
  roles:
  - ceph-mon

- hosts: agents
  become: True
  roles:
  - ceph-agent

- hosts: osds
  become: True
  roles:
  - ceph-osd

#- hosts: mdss
#  become: True
#  roles:
#  - ceph-mds

- hosts: rgws
  become: True
  roles:
  - ceph-rgw

- hosts: restapis
  become: True
  roles:
  - ceph-restapi

Edit the main variable file; here we are going to configure our environment.

[root@ansible ~]# vi group_vars/all

Here we configure where the Ceph packages will be installed from. For now we use upstream code with the stable release, Infernalis.

## Configure package origin
ceph_origin: upstream
ceph_stable: true
ceph_stable_release: infernalis

Configure the interface on which the monitors will be listening:

## Monitor options
monitor_interface: eth2
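
If you are unsure of the interface name on your monitor nodes, a quick check with iproute2 confirms it exists and carries the expected address (eth2 is just my environment's naming):

[root@ceph-mon1 ~]# ip addr show eth2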

Here we configure some OSD options, such as the journal size and which networks will be used for public traffic and cluster data replication. Note that journal_size is expressed in MB, so 1024 gives each OSD a 1 GB journal:

## OSD options
journal_size: 1024
public_network: 192.168.1.0/24
cluster_network: 192.168.200.0/24
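
For reference, the playbooks render these variables into each node's /etc/ceph/ceph.conf, roughly as follows (a sketch; the exact template output depends on the ceph-ansible version):

[global]
public network = 192.168.1.0/24
cluster network = 192.168.200.0/24

[osd]
osd journal size = 1024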

Edit the osds variable file:

[root@ansible ~]# vi group_vars/osds

I will use the auto-discovery option to let ceph-ansible select empty or unused devices in my servers to create the OSDs.

# Declare devices
osd_auto_discovery: True
journal_collocation: True
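
If you prefer to control exactly which disks become OSDs, the same file also accepts an explicit device list instead of auto discovery (a sketch; /dev/sdb through /dev/sdd are assumed device names, adjust them to your servers):

# Declare devices explicitly instead of auto discovery
osd_auto_discovery: False
journal_collocation: True
devices:
  - /dev/sdb
  - /dev/sdc
  - /dev/sdd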

Of course you can use other options; I highly suggest reading the variable comments, as they provide valuable information about usage.

We're ready to deploy Ceph with Ansible using our custom inventory_hosts file.

[root@ansible ~]# ansible-playbook site.yml -i inventory_hosts
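
If a run fails partway through, the playbook can simply be re-run, since the tasks are largely idempotent. For debugging, the standard ansible-playbook flags apply, for example restricting a run to one group or raising verbosity:

[root@ansible ~]# ansible-playbook site.yml -i inventory_hosts --limit osds
[root@ansible ~]# ansible-playbook site.yml -i inventory_hosts -vvv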

After a while, you will have a fully functional Ceph cluster.

You can check your cluster status with ceph -s. We can see all OSDs are up and all placement groups are active+clean:

[root@ceph-mon1 ~]# ceph -s
    cluster 5ff692ab-2150-41a4-8b6d-001a4da21c9c
     health HEALTH_OK
     monmap e1: 3 mons at {ceph-mon1=192.168.200.141:6789/0,ceph-mon2=192.168.200.180:6789/0,ceph-mon3=192.168.200.232:6789/0}
            election epoch 6, quorum 0,1,2 ceph-mon1,ceph-mon2,ceph-mon3
     osdmap e10: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v32: 64 pgs, 1 pools, 0 bytes data, 0 objects
            102256 kB used, 896 GB / 896 GB avail
                  64 active+clean
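
To see how the nine OSDs are distributed across the three OSD nodes, ceph osd tree prints the CRUSH hierarchy (output omitted here):

[root@ceph-mon1 ~]# ceph osd tree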

We are going to do some tests. Create a pool:

[root@ceph-mon1 ~]# ceph osd pool create test 128 128
pool 'test' created
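
The two 128 arguments are the pool's pg_num and pgp_num (placement group counts). You can verify the value after creation:

[root@ceph-mon1 ~]# ceph osd pool get test pg_num
pg_num: 128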

Create a big file:

[root@ceph-mon1 ~]# dd if=/dev/zero of=/tmp/sample.txt bs=2M count=1000
1000+0 records in
1000+0 records out
2097152000 bytes (2.1 GB) copied, 16.7386 s, 125 MB/s

Upload the file to RADOS:

[root@ceph-mon1 ~]# rados -p test put sample /tmp/sample.txt 
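
To confirm the object landed in the pool, you can list the pool's objects and stat the new one:

[root@ceph-mon1 ~]# rados -p test ls
sample
[root@ceph-mon1 ~]# rados -p test stat sample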

Check which placement group your file is stored in:

[root@ceph-mon1 ~]# ceph osd map test sample
osdmap e13 pool 'test' (1) object 'sample' -> pg 1.bddbf0b9 (1.39) -> up ([1,0], p1) acting ([1,0], p1)

The mapping above shows that the object sample hashed to placement group 1.39 and is stored on OSDs 1 and 0, with OSD 1 acting as the primary. Query that placement group, and output similar to the following will be printed:

[root@ceph-mon1 ~]# ceph pg 1.39 query
{
    "state": "active+clean",
    "snap_trimq": "[]",
    "epoch": 13,
    "up": [
        1,
        0
    ],
    "acting": [
        1,
        0
    ],
    "actingbackfill": [
        "0",
        "1"
    ],
    "info": {
        "pgid": "1.39",
        "last_update": "13'500",
        "last_complete": "13'500",
        "log_tail": "0'0",
        "last_user_version": 500,
        "last_backfill": "MAX",
        "last_backfill_bitwise": 0,
        "purged_snaps": "[]",
        "history": {
            "epoch_created": 11,
            "last_epoch_started": 12,
            "last_epoch_clean": 13,
            "last_epoch_split": 0,
            "last_epoch_marked_full": 0,
            "same_up_since": 11,
            "same_interval_since": 11,
            "same_primary_since": 11,
            "last_scrub": "0'0",
            "last_scrub_stamp": "2016-03-16 21:13:08.883121",
            "last_deep_scrub": "0'0",
            "last_deep_scrub_stamp": "2016-03-16 21:13:08.883121",
            "last_clean_scrub_stamp": "0.000000"
        },
        "stats": {
            "version": "13'500",
            "reported_seq": "505",
            "reported_epoch": "13",
            "state": "active+clean",
            "last_fresh": "2016-03-16 21:24:40.930724",
            "last_change": "2016-03-16 21:14:09.874086",
            "last_active": "2016-03-16 21:24:40.930724",
            "last_peered": "2016-03-16 21:24:40.930724",
            "last_clean": "2016-03-16 21:24:40.930724",
            "last_became_active": "0.000000",
            "last_became_peered": "0.000000",
            "last_unstale": "2016-03-16 21:24:40.930724",
            "last_undegraded": "2016-03-16 21:24:40.930724",
            "last_fullsized": "2016-03-16 21:24:40.930724",
            "mapping_epoch": 11,
            "log_start": "0'0",
            "ondisk_log_start": "0'0",
            "created": 11,
            "last_epoch_clean": 13,
            "parent": "0.0",
            "parent_split_bits": 0,
            "last_scrub": "0'0",
            "last_scrub_stamp": "2016-03-16 21:13:08.883121",
            "last_deep_scrub": "0'0",
            "last_deep_scrub_stamp": "2016-03-16 21:13:08.883121",
            "last_clean_scrub_stamp": "0.000000",
            "log_size": 500,
            "ondisk_log_size": 500,
            "stats_invalid": "0",
            "stat_sum": {
                "num_bytes": 2097152000,
                "num_objects": 1,
                "num_object_clones": 0,
                "num_object_copies": 2,
                "num_objects_missing_on_primary": 0,
                "num_objects_degraded": 0,
                "num_objects_misplaced": 0,
                "num_objects_unfound": 0,
                "num_objects_dirty": 1,
                "num_whiteouts": 0,
                "num_read": 0,
                "num_read_kb": 0,
                "num_write": 500,
                "num_write_kb": 2048000,
                "num_scrub_errors": 0,
                "num_shallow_scrub_errors": 0,
                "num_deep_scrub_errors": 0,
                "num_objects_recovered": 0,
                "num_bytes_recovered": 0,
                "num_keys_recovered": 0,
                "num_objects_omap": 0,
                "num_objects_hit_set_archive": 0,
                "num_bytes_hit_set_archive": 0,
                "num_flush": 0,
                "num_flush_kb": 0,
                "num_evict": 0,
                "num_evict_kb": 0,
                "num_promote": 0,
                "num_flush_mode_high": 0,
                "num_flush_mode_low": 0,
                "num_evict_mode_some": 0,
                "num_evict_mode_full": 0
            },
            "up": [
                1,
                0
            ],
            "acting": [
                1,
                0
            ],
            "blocked_by": [],
            "up_primary": 1,
            "acting_primary": 1
        },
        "empty": 0,
        "dne": 0,
        "incomplete": 0,
        "last_epoch_started": 12,
        "hit_set_history": {
            "current_last_update": "0'0",
            "history": []
        }
    },
    "peer_info": [
        {
            "peer": "0",
            "pgid": "1.39",
            "last_update": "13'500",
            "last_complete": "13'500",
            "log_tail": "0'0",
            "last_user_version": 0,
            "last_backfill": "MAX",
            "last_backfill_bitwise": 0,
            "purged_snaps": "[]",
            "history": {
                "epoch_created": 11,
                "last_epoch_started": 12,
                "last_epoch_clean": 13,
                "last_epoch_split": 0,
                "last_epoch_marked_full": 0,
                "same_up_since": 0,
                "same_interval_since": 0,
                "same_primary_since": 0,
                "last_scrub": "0'0",
                "last_scrub_stamp": "2016-03-16 21:13:08.883121",
                "last_deep_scrub": "0'0",
                "last_deep_scrub_stamp": "2016-03-16 21:13:08.883121",
                "last_clean_scrub_stamp": "0.000000"
            },
            "stats": {
                "version": "0'0",
                "reported_seq": "0",
                "reported_epoch": "0",
                "state": "inactive",
                "last_fresh": "0.000000",
                "last_change": "0.000000",
                "last_active": "0.000000",
                "last_peered": "0.000000",
                "last_clean": "0.000000",
                "last_became_active": "0.000000",
                "last_became_peered": "0.000000",
                "last_unstale": "0.000000",
                "last_undegraded": "0.000000",
                "last_fullsized": "0.000000",
                "mapping_epoch": 0,
                "log_start": "0'0",
                "ondisk_log_start": "0'0",
                "created": 0,
                "last_epoch_clean": 0,
                "parent": "0.0",
                "parent_split_bits": 0,
                "last_scrub": "0'0",
                "last_scrub_stamp": "0.000000",
                "last_deep_scrub": "0'0",
                "last_deep_scrub_stamp": "0.000000",
                "last_clean_scrub_stamp": "0.000000",
                "log_size": 0,
                "ondisk_log_size": 0,
                "stats_invalid": "0",
                "stat_sum": {
                    "num_bytes": 0,
                    "num_objects": 0,
                    "num_object_clones": 0,
                    "num_object_copies": 0,
                    "num_objects_missing_on_primary": 0,
                    "num_objects_degraded": 0,
                    "num_objects_misplaced": 0,
                    "num_objects_unfound": 0,
                    "num_objects_dirty": 0,
                    "num_whiteouts": 0,
                    "num_read": 0,
                    "num_read_kb": 0,
                    "num_write": 0,
                    "num_write_kb": 0,
                    "num_scrub_errors": 0,
                    "num_shallow_scrub_errors": 0,
                    "num_deep_scrub_errors": 0,
                    "num_objects_recovered": 0,
                    "num_bytes_recovered": 0,
                    "num_keys_recovered": 0,
                    "num_objects_omap": 0,
                    "num_objects_hit_set_archive": 0,
                    "num_bytes_hit_set_archive": 0,
                    "num_flush": 0,
                    "num_flush_kb": 0,
                    "num_evict": 0,
                    "num_evict_kb": 0,
                    "num_promote": 0,
                    "num_flush_mode_high": 0,
                    "num_flush_mode_low": 0,
                    "num_evict_mode_some": 0,
                    "num_evict_mode_full": 0
                },
                "up": [],
                "acting": [],
                "blocked_by": [],
                "up_primary": -1,
                "acting_primary": -1
            },
            "empty": 0,
            "dne": 0,
            "incomplete": 0,
            "last_epoch_started": 12,
            "hit_set_history": {
                "current_last_update": "0'0",
                "history": []
            }
        }
    ],
    "recovery_state": [
        {
            "name": "Started\/Primary\/Active",
            "enter_time": "2016-03-16 21:13:36.769083",
            "might_have_unfound": [],
            "recovery_progress": {
                "backfill_targets": [],
                "waiting_on_backfill": [],
                "last_backfill_started": "MIN",
                "backfill_info": {
                    "begin": "MIN",
                    "end": "MIN",
                    "objects": []
                },
                "peer_backfill_info": [],
                "backfills_in_flight": [],
                "recovering": [],
                "pg_backend": {
                    "pull_from_peer": [],
                    "pushing": []
                }
            },
            "scrub": {
                "scrubber.epoch_start": "0",
                "scrubber.active": 0,
                "scrubber.waiting_on": 0,
                "scrubber.waiting_on_whom": []
            }
        },
        {
            "name": "Started",
            "enter_time": "2016-03-16 21:13:09.216260"
        }
    ],
    "agent_state": {}
}
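
Once you are done testing, you can remove the test pool. Ceph asks you to repeat the pool name and pass a confirmation flag as a safety measure:

[root@ceph-mon1 ~]# ceph osd pool delete test test --yes-i-really-really-mean-it
pool 'test' removed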

That's all for now.

Regards, Eduardo Gonzalez


You may run into some issues or bugs when running the playbooks. There is a lot of effort going into fixing issues in the upstream repository. If you encounter a new bug, please open an issue here:

https://github.com/ceph/ceph-ansible/issues