Add Cloudlab docs (#250)
* docs: add CloudLab getting started guide

* docs: make instructions for Ansible consistent here and there

* docs: install Ansible with apt

* docs: update cloudlab.md to reflect recent changes elsewhere
whentojump authored Aug 4, 2023
1 parent 142995f commit ffe0371
Showing 3 changed files with 138 additions and 19 deletions.
127 changes: 127 additions & 0 deletions docs/cloudlab.md
@@ -0,0 +1,127 @@
# CloudLab getting started guide

## 1 Prerequisites

First, submit an account request to CloudLab (https://www.cloudlab.us/). Things to note:

- If you're an internal member, select "Join Existing Project" and type: `Sieve-Acto`. Otherwise, you'll have to join other existing projects or create a new one, which is not detailed here.
- The username and key you provide will be used for SSH login.

Wait for the admin to approve your account. Once you are able to log in, familiarize yourself with the web-based dashboard and [the concept of *profiles* and *experiments*](https://docs.cloudlab.us/basic-concepts.html).

Although you should be able to log in to any machine instantiated by your project collaborators (i.e. a Linux user will be automatically created for you on every machine with `authorized_keys` set up), for us (`Sieve-Acto`), the current practice is to let everyone run code **on their own experiments**.

Next you'll prepare the dependencies either manually ([section 2](#2-manually-set-up-the-dependencies)) or automatically ([section 3](#3-automatically-set-up-the-dependencies), recommended).

## 2 Manually set up the dependencies

<details><summary>Click to show details</summary>

### 2.1 Create CloudLab experiments

Launch an experiment via the web dashboard:

1. Click "Experiments" -- "Start Experiment". The default selected profile should be `small-lan`. "Next".
2. Enter/Choose parameters:
   - "Select OS image": `UBUNTU 20.04`
   - "Optional physical node type": `c6420`
   - Leave other parameters as default. (Especially those regarding temporary filesystem -- this will be handled after provisioning using Ansible.)
3. "Next". Give your experiment a name. "Next". "Finish".

Wait for the provisioning to finish. The web dashboard will show you the server address, in the form of `<node>.<cluster>.cloudlab.us`. E.g. `clnode123.clemson.cloudlab.us`.
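
As a small illustration of the address format above, the following sketch splits a server address into its node and cluster parts with plain POSIX parameter expansion. The function name `split_cloudlab_host` and the sample address are illustrative assumptions, not part of any CloudLab tooling.

```shell
# Hypothetical helper: split "<node>.<cluster>.cloudlab.us" into its parts.
split_cloudlab_host() {
    host="$1"
    case "$host" in
        *.*.cloudlab.us) ;;
        *) echo "unexpected address format: $host" >&2; return 1 ;;
    esac
    rest="${host%.cloudlab.us}"   # e.g. clnode123.clemson
    node="${rest%%.*}"            # text before the first dot
    cluster="${rest#*.}"          # text after the first dot
    echo "$node $cluster"
}

split_cloudlab_host "clnode123.clemson.cloudlab.us"   # prints: clnode123 clemson
```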

### 2.2 Install the dependencies with Ansible

You are going to manage CloudLab machines with Ansible from a controller node. This "controller" can be your local machine, or one of the CloudLab machines themselves.

**On your controller node**:

Install Ansible:

```shell
sudo apt update
sudo apt -y install software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt -y install ansible
ansible-galaxy collection install ansible.posix
ansible-galaxy collection install community.general
```

Clone the Ansible scripts:

```shell
git clone https://github.com/xlab-uiuc/acto-cloudlab.git /tmp/acto-cloudlab
```

Set up the `ansible_hosts` file (**remember to replace the placeholders with your real domain and user name**):

```shell
domain="clnodeXXX.clemson.cloudlab.us"
user="alice"

cd /tmp/acto-cloudlab/scripts/ansible/
echo "$domain ansible_connection=ssh ansible_user=$user ansible_port=22" > ansible_hosts
```
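
If an experiment has several nodes, the same one-line-per-host format can be generated in a loop. The sketch below is a minimal example under stated assumptions: the `write_inventory` function and the host names are hypothetical, and it writes to whatever path you pass instead of assuming the repository layout.

```shell
# Hypothetical helper: write one inventory line per host.
# Usage: write_inventory <output-file> <user> <host>...
write_inventory() {
    out="$1"; user="$2"; shift 2
    : > "$out"                      # truncate or create the file
    for host in "$@"; do
        echo "$host ansible_connection=ssh ansible_user=$user ansible_port=22" >> "$out"
    done
}

# Demonstration on a temporary file with two made-up node names.
inventory="$(mktemp)"
write_inventory "$inventory" alice \
    clnode123.clemson.cloudlab.us clnode124.clemson.cloudlab.us
cat "$inventory"
```

In the real workflow you would pass `ansible_hosts` as the output file from inside `/tmp/acto-cloudlab/scripts/ansible/`.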

> If the controller is a CloudLab machine too, this step can be automated:
>
> ```shell
> sudo apt -y install xmlstarlet
>
> component_name=$( geni-get portalmanifest | xmlstarlet sel -N x="http://www.geni.net/resources/rspec/3" -t -v "//x:node/@component_id" )
> cluster_domain=$( echo $component_name | cut -d '+' -f 2 )
> node_subdomain=$( echo $component_name | cut -d '+' -f 4 )
> domain="${node_subdomain}.${cluster_domain}"
> user=$( geni-get user_urn | rev | cut -d '+' -f -1 | rev )
>
> cd /tmp/acto-cloudlab/scripts/ansible/
> echo "$domain ansible_connection=ssh ansible_user=$user ansible_port=22" > ansible_hosts
> ```
>
> Or even simpler, use `127.0.0.1` directly:
>
> ```shell
> cd /tmp/acto-cloudlab/scripts/ansible/
> echo 127.0.0.1 > ansible_hosts
> ```
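
To see what the `cut` calls in the automated variant extract, here is a sketch that runs the same field splitting on sample values. The two URNs are made-up examples in the GENI URN layout (`urn:publicid:IDN+<authority>+<type>+<name>`); on a real node the values come from `geni-get` and may differ.

```shell
# Sample values, assumed for illustration only.
component_name="urn:publicid:IDN+clemson.cloudlab.us+node+clnode123"
user_urn="urn:publicid:IDN+emulab.net+user+alice"

# Fields are separated by '+': field 2 is the cluster, field 4 is the node.
cluster_domain=$(echo "$component_name" | cut -d '+' -f 2)   # clemson.cloudlab.us
node_subdomain=$(echo "$component_name" | cut -d '+' -f 4)   # clnode123
domain="${node_subdomain}.${cluster_domain}"

# Keeps the text after the last '+', which is what the
# `rev | cut -d '+' -f -1 | rev` pipeline does as well.
user="${user_urn##*+}"

echo "$user@$domain"   # prints: alice@clnode123.clemson.cloudlab.us
```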

(Only if the controller is a CloudLab machine too) work around the key authentication:

```shell
ssh-keygen -b 2048 -t rsa -f ~/.ssh/id_rsa -q -N "" && cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
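
If you expect to repeat this step, a re-runnable variant may be convenient, since appending with `>>` adds the same key again on every run. The sketch below is one possible approach, not part of the acto-cloudlab scripts; `authorize_key` is a hypothetical helper, and the demonstration uses throwaway files and a fake key string.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only if
# that exact line is not already present.
authorize_key() {
    pub="$1"; auth="$2"
    touch "$auth"
    grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
}

# Demonstration on throwaway files (not your real ~/.ssh).
demo_dir="$(mktemp -d)"
echo "ssh-rsa AAAAexample alice@controller" > "$demo_dir/id_rsa.pub"
authorize_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"
authorize_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"   # second call adds nothing
wc -l < "$demo_dir/authorized_keys"
```

On a real controller the call would be `authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys`, run after the key has been generated.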

Finally, run the Ansible scripts to install dependencies:

```shell
ansible-playbook -i ansible_hosts configure.yaml
```

</details>

(Only if the controller is a CloudLab machine too) log out and log back in before moving on to section 4.

Go to [section 4](#4-run-acto).

## 3 Automatically set up the dependencies

Everything needed to install the dependencies (see section 2) is included in [a CloudLab profile](https://github.com/xlab-uiuc/acto-cloudlab), which sets up the same environment without requiring you to enter any commands manually.

Launch an experiment via the web dashboard:

1. Open this link: https://www.cloudlab.us/p/Sieve-Acto/acto-cloudlab?refspec=refs/heads/main. The default selected profile should be `acto-cloudlab`. "Next".
2. "Next". Give your experiment a name. "Next". "Finish".

Wait for provisioning and startup both to finish (i.e. under the "List View" tab, "Status" is `ready` and "Startup" is `Finished`). The web dashboard will show you the server address, in the form of `<node>.<cluster>.cloudlab.us`. E.g. `clnode123.clemson.cloudlab.us`.
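
If you script the login instead of watching the dashboard, one option is to poll until a check succeeds. The sketch below shows a generic retry loop; `retry` and `flaky_check` are hypothetical names, and the check here is a local stand-in so the example runs without a CloudLab node.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry <max-attempts> <delay-seconds> <command> [args...]
retry() {
    max="$1"; delay="$2"; shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Stand-in for a real check; it succeeds on the third call.
calls_file="$(mktemp)"
flaky_check() {
    echo x >> "$calls_file"
    [ "$(wc -l < "$calls_file")" -ge 3 ]
}

retry 5 0 flaky_check && echo "check passed"
```

With a real node, the check could be something like `ssh -o BatchMode=yes -o ConnectTimeout=5 "$user@$domain" true`. That only shows SSH is reachable; the "Startup" column on the dashboard is still what tells you the profile's setup has finished.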

## 4 Run Acto

Log in to the CloudLab machine, and run:

<!-- TODO this is now sosp-ae because of Ansible scripts in acto-cloudlab -->

```shell
cd ~/workdir/acto
make
python3 reproduce_bugs.py --bug-id rdoptwo-287
```

17 changes: 11 additions & 6 deletions scripts/ansible/README.md
@@ -1,23 +1,28 @@
# Ansible playbook to automatically configure environment for Acto to run a baremetal machine
# Ansible playbook to automatically configure environment for Acto to run on a baremetal machine

To run the script, you first need an `ansible_hosts` file. Each line in the file should contain
a worker in your cluster. See https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html
for details.

An example:

```ini
c220g5-110417.wisc.cloudlab.us ansible_connection=ssh ansible_user=tylergu ansible_port=22
c220g5-110418.wisc.cloudlab.us ansible_connection=ssh ansible_user=tylergu ansible_port=22
```

If you haven't installed `ansible playbook` on your control node, run
If you haven't installed `ansible-playbook` on your control node, run

```sh
pip3 install ansible
ansible-galaxy collection install ansible.posix
ansible-galaxy collection install community.general
```

Then just run

```
bash configure.sh
```

Then just run

```sh
ansible-playbook -i ansible_hosts configure.yaml
```

and the proper environment will be set up on the workers.
13 changes: 0 additions & 13 deletions scripts/ansible/configure.sh

This file was deleted.

1 comment on commit ffe0371

@github-actions
Coverage

Coverage Report
| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| **acto** | | | | |
| `__main__.py` | 86 | 86 | 0% | 1–173 |
| `common.py` | 354 | 96 | 73% | 97–98, 102–116, 120, 122, 124, 129, 132, 136, 138, 140, 142, 144, 148, 153–162, 203, 236, 249, 291, 315–316, 325, 327, 330, 334–343, 363–366, 415, 418, 432, 460–461, 498–503, 505–507, 514–517, 521, 537–538, 544–555, 614–623, 627, 634–639 |
| `deploy.py` | 152 | 121 | 20% | 36–39, 42–58, 62, 65–75, 83–113, 116–123, 129–151, 154–167, 179–202, 209–218, 226–244 |
| `engine.py` | 526 | 526 | 0% | 1–906 |
| `oracle_handle.py` | 24 | 13 | 46% | 15–18, 26–27, 39–46, 54 |
| `reproduce.py` | 110 | 110 | 0% | 1–210 |
| `serialization.py` | 30 | 16 | 47% | 19–27, 33–39 |
| `snapshot.py` | 25 | 2 | 92% | 17, 34 |
| **acto/checker** | | | | |
| `checker.py` | 13 | 2 | 85% | 12, 19 |
| `checker_set.py` | 57 | 12 | 79% | 41–42, 67–78 |
| `test_checker.py` | 162 | 152 | 6% | 13–246 |
| **acto/checker/impl** | | | | |
| `crash.py` | 30 | 2 | 93% | 14, 16 |
| `health.py` | 49 | 3 | 94% | 36, 65, 81 |
| `kubectl_cli.py` | 29 | 4 | 86% | 32–33, 44–46 |
| `operator_log.py` | 24 | 2 | 92% | 24, 32 |
| `state.py` | 227 | 52 | 77% | 55, 63–64, 69–93, 104–105, 133, 140, 147–149, 166, 265, 288, 317, 319–323, 330–338 |
| `state_compare.py` | 76 | 3 | 96% | 62, 73, 80 |
| `state_condition.py` | 39 | 13 | 67% | 18, 24, 29–37, 47, 53 |
| **acto/input** | | | | |
| `get_matched_schemas.py` | 54 | 22 | 59% | 12, 47–51, 55–74 |
| `input.py` | 588 | 469 | 20% | 29–36, 73, 91, 111, 122–125, 128, 158–170, 173–184, 188, 192, 199, 205, 209–388, 409–423, 431–436, 440–453, 462, 478, 481–488, 500, 502, 504–506, 509–522, 527, 535–546, 549–555, 563–590, 594–874, 897–910, 914–961 |
| `testcase.py` | 55 | 20 | 64% | 40–50, 53, 56, 59, 62–66, 95–96, 100, 103, 109, 116, 119 |
| `testplan.py` | 183 | 137 | 25% | 14–24, 27–29, 32, 35–42, 45, 48–67, 70–74, 78, 82, 85–91, 99–107, 110–121, 124, 127, 130, 133, 136–138, 141–151, 154–164, 167, 174–185, 188–194, 200, 203–219, 222, 225, 231, 239–244, 247, 250, 253, 259–260, 263–269, 272, 275, 278 |
| `value_with_schema.py` | 337 | 218 | 35% | 16, 20, 24, 28, 32, 36, 40, 54, 58, 61–67, 70–76, 86–114, 117–124, 128–137, 141–149, 152–156, 159, 162, 166, 180, 183, 186–192, 195–201, 211–230, 233–240, 243, 247–256, 260–272, 275–279, 282, 285, 289, 301, 307, 312, 314, 316, 320, 324, 329–333, 337–340, 349–359, 362–369, 373–376, 380–385, 388–391, 405, 408–412, 416, 421–428, 431–434, 437–439, 442–445, 448–451, 462, 465, 468, 485–498 |
| `valuegenerator.py` | 620 | 389 | 37% | 20, 24, 28, 32, 43, 47, 51, 55, 76–86, 90–104, 107, 110, 114, 117, 121–130, 133–139, 142, 145, 148, 151, 154, 157–167, 194–199, 202–213, 216, 219, 223, 226–231, 234–237, 240, 243–248, 251–254, 257, 260, 263, 266, 269, 273–282, 285, 288, 291, 294–304, 331–343, 346–347, 350, 353, 357, 360–365, 368–371, 374, 377–382, 385–388, 391, 394, 397, 400, 403, 406, 409–419, 450–482, 485–496, 499–502, 505–508, 512–520, 523, 526, 529, 532, 535, 538–548, 573–589, 592–609, 612, 615, 619–621, 624–628, 631–632, 635–644, 647–653, 656–657, 660–670, 673, 676, 679, 682, 685, 688–698, 710–711, 714–724, 727–730, 733–736, 740, 748, 751–752, 755–765, 768–771, 774–777, 781, 794–797, 800–814, 817, 820, 824, 827–832, 835, 838, 841–846, 849, 852, 855, 858, 861–871, 881, 884, 887, 890, 894, 913, 919, 931, 933, 936–940, 943–946, 951, 955, 957, 961–962 |
| **acto/input/known_schemas** | | | | |
| `base.py` | 53 | 13 | 75% | 17–18, 28, 37, 46–47, 56–57, 66, 75, 84–85, 93 |
| `cronjob_schemas.py` | 76 | 33 | 57% | 13, 16–19, 22, 25, 36–39, 42–47, 50, 53, 59, 62–65, 68, 71, 82, 85–90, 93, 96, 113, 117–119, 131, 137, 140 |
| `deployment_schemas.py` | 59 | 25 | 58% | 16, 22–27, 30–32, 35, 38, 54–57, 65–67, 70, 78–81, 91, 94 |
| `known_schema.py` | 75 | 39 | 48% | 28, 31–34, 37, 43, 46–48, 51, 54, 81–84, 102–113, 117–135 |
| `pod_disruption_budget_schemas.py` | 56 | 22 | 61% | 14–17, 21, 25–27, 30, 41–44, 48, 54, 57, 68–71, 81, 84 |
| `pod_schemas.py` | 797 | 271 | 66% | 16–19, 23, 28, 32, 40–43, 47, 51–53, 61–64, 68, 73, 83, 92, 151, 156, 160, 167, 171, 178, 182, 238, 242, 247, 251, 294, 298, 303, 307, 335, 338, 341, 344, 347, 350, 353, 356, 359, 362, 365, 368, 393, 397, 400–405, 408, 414, 417, 420, 428, 431, 434, 437, 445, 453, 481, 488–494, 503, 507, 539, 543, 572, 575, 578, 581, 584, 587, 590, 619, 623, 626–631, 634, 642, 645, 648, 664–667, 670–672, 675, 684, 690, 693–696, 699, 702, 713–714, 717–719, 722, 725, 736, 739, 742, 745–748, 752, 756–758, 761–765, 768, 774, 777, 780, 783, 786, 789, 792, 795, 798, 801, 804, 807, 810, 813, 816, 840, 845, 849–857, 860, 866, 869, 872, 875, 878, 881, 884, 887, 890, 893, 896, 899, 902, 905, 908, 929, 933–935, 938–943, 946, 959, 964, 974, 980, 986, 989, 992, 1032, 1036–1038, 1041, 1054, 1059, 1065, 1068–1071, 1074, 1077, 1088–1091, 1094–1096, 1102, 1108, 1111–1114, 1117, 1120, 1133–1136, 1139–1141, 1147, 1153, 1156–1159, 1162, 1165, 1176–1179, 1182–1184, 1190, 1196, 1199–1202, 1205, 1212, 1215–1217, 1223, 1238, 1243, 1247, 1274, 1278, 1291, 1296, 1302, 1305, 1308, 1314, 1317, 1320–1322, 1325, 1342, 1345, 1348, 1374, 1378–1381, 1384, 1402, 1408, 1411, 1414, 1421, 1427, 1469, 1473 |
| `resource_schemas.py` | 149 | 50 | 66% | 15–19, 22, 25, 28–32, 35, 38, 45, 49, 54, 57–59, 62, 76, 78, 82, 96, 102–105, 108, 111, 121, 134, 147, 154, 159–162, 165, 173–176, 179, 187, 194, 199, 213, 231, 236 |
| `service_schemas.py` | 178 | 67 | 62% | 13, 16, 19, 25–30, 33, 36, 42, 45–48, 51, 54, 64–67, 70–73, 79, 85, 88–91, 94, 101–102, 105–108, 111, 114, 127, 142, 180, 184, 208, 214, 217, 220, 228–231, 235, 240, 244–246, 249, 257–258, 261, 264, 277–280, 284, 288–290, 293 |
| `statefulset_schemas.py` | 186 | 61 | 67% | 15–18, 21, 31–36, 42, 49–53, 56–58, 61–63, 66–70, 73–75, 78–80, 83, 90–93, 99–102, 105, 132, 149, 154, 158, 164, 167–170, 173, 176, 190–193, 196–201, 207, 225, 230, 234, 262, 266, 290 |
| `storage_schemas.py` | 179 | 79 | 56% | 13, 16, 25–30, 33, 36, 42, 48–53, 59, 67–70, 74, 79–80, 83, 89, 92–95, 98, 104–105, 108–111, 114–116, 122, 130–131, 135, 140, 145–148, 154–155, 158, 164, 181–184, 193, 197, 203–205, 211, 214, 228–231, 244, 250–252, 255, 258, 266–269, 273, 277–279, 282 |
| **acto/kubectl_client** | | | | |
| `kubectl.py` | 23 | 18 | 22% | 8–14, 23–29, 37–44 |
| **acto/kubernetes_engine** | | | | |
| `base.py` | 48 | 30 | 38% | 12, 16, 20, 24, 28, 31–49, 56–70 |
| `k3d.py` | 85 | 85 | 0% | 1–139 |
| `kind.py` | 88 | 71 | 19% | 18, 22–51, 57, 66–99, 102–114, 117–130, 137–151 |
| `minikube.py` | 3 | 3 | 0% | 1–5 |
| **acto/monkey_patch** | | | | |
| `monkey_patch.py` | 79 | 24 | 70% | 9, 18, 36, 39, 41, 45–56, 76, 90–95, 106 |
| **acto/parse_log** | | | | |
| `parse_log.py` | 78 | 19 | 76% | 71–74, 84–86, 89–91, 96–98, 116–124 |
| **acto/post_process** | | | | |
| `post_diff_test.py` | 380 | 223 | 41% | 39, 53–54, 159, 169, 173, 178–187, 222, 224, 226–230, 236–253, 261–269, 272–294, 301–310, 313–360, 371–375, 395, 403–433, 436–453, 457–505, 518, 522, 528–529, 531–552, 562–598 |
| `post_process.py` | 104 | 22 | 79% | 51, 55, 67, 77–88, 100, 103, 138–142, 146, 150, 158, 162 |
| `test_post_process.py` | 28 | 1 | 96% | 53 |
| **acto/runner** | | | | |
| `runner.py` | 286 | 258 | 10% | 24–44, 72–117, 120–129, 132–163, 170–197, 204–227, 230–236, 239–258, 261–266, 277–282, 296–301, 314–390, 395–399, 405–415, 426–449, 454–487 |
| **acto/schema** | | | | |
| `anyof.py` | 44 | 20 | 55% | 26, 30–38, 41, 44, 47–49, 52, 55–60 |
| `array.py` | 70 | 29 | 59% | 31, 43–52, 57, 59–62, 71–78, 81–83, 86–88, 91, 94, 97, 103 |
| `base.py` | 102 | 57 | 44% | 13–15, 18–20, 23, 26–37, 40, 43, 46–48, 51–61, 64–74, 77, 80–86, 95, 100, 105, 110, 114, 118, 142, 145–149 |
| `boolean.py` | 27 | 10 | 63% | 16, 20, 23, 26, 29–32, 35, 38 |
| `integer.py` | 26 | 9 | 65% | 17, 21–23, 26, 29, 32, 35, 38 |
| `number.py` | 30 | 11 | 63% | 30–32, 35–37, 40, 43, 46, 49, 52 |
| `object.py` | 117 | 48 | 59% | 44, 46, 51, 66–75, 80, 83, 94–109, 112–120, 123–126, 129, 132, 148, 151–158, 168 |
| `oneof.py` | 44 | 30 | 32% | 13–19, 22, 25–27, 30–38, 41, 44, 47–49, 52, 55–60 |
| `opaque.py` | 17 | 6 | 65% | 13, 16, 19, 22, 25, 28 |
| `schema.py` | 42 | 8 | 81% | 21, 25, 31–34, 49–50 |
| `string.py` | 27 | 9 | 67% | 25, 29–31, 34, 37, 40, 43, 46 |
| **acto/utils** | | | | |
| `__init__.py` | 14 | 1 | 93% | 12 |
| `acto_timer.py` | 31 | 22 | 29% | 10–15, 19, 22–33, 38–40, 44–47 |
| `config.py` | 34 | 1 | 97% | 63 |
| `error_handler.py` | 43 | 33 | 23% | 13–30, 38–52, 56–74 |
| `k8s_helper.py` | 64 | 51 | 20% | 21–27, 39–45, 57–61, 73–79, 83–91, 95–102, 106–127 |
| `preprocess.py` | 69 | 59 | 14% | 16–62, 74–123, 129–174 |
| `process_with_except.py` | 9 | 9 | 0% | 1–13 |
| `thread_logger.py` | 15 | 3 | 80% | 9, 18, 28 |
| **TOTAL** | 8010 | 4300 | 46% | |

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 124 | 0 💤 | 0 ❌ | 0 🔥 | 1m 9s ⏱️ |
