-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathGRID5000_EXAMPLE
195 lines (122 loc) · 7.79 KB
/
GRID5000_EXAMPLE
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
First connect to grid5000
ssh access.grid5000.fr
the connect to sophia
ssh sophia
then copy the necessary files to your home directory
cp -r ../tcrain/public/* ~/
Link your ssh key that will be used for the experiments
ln ./.ssh/id_rsa key
Link your public key
ln ./.ssh/id_rsa.pub ./.ssh/exp_key.pub
Reserve some nodes
oargridsub -t deploy -w '2:00:00' sophia:rdef="/nodes=2" -v
This will reply something like:
[OAR_GRIDSUB] [lille] Reservation success on sophia : batchId = 1507949
[OAR_GRIDSUB] Grid reservation id = 54894
Now you need to edit the config file for this experiment,
so here are the contents that I will have for this experiment in the file
./basho_bench/script/testconfig (I described in a previous email what each line does):
54894
master
1
0
1
1
4
1
grid5000newpb
antidote_pb.config
Meaning:
gridjobid - the id of the job running on grid5000
branch - the git branch of antidote to run the experiment on
dodeploy - 1 or 0, 1 means it will boot the machines and load the os image, 0 means the machines should already be running
secondrun - 1 or 0, 0 means it will download and recompile all the code, 0 means it should already be compiled
computeCount - the number of machines running BashoBench per DC
benchCount - the number of machines running Antidote per DC
benchParallel - the number of instances of BashoBench to run per machine
dcspercluster - the number of DCs per grid5000 cluster (for example if we want to run 2 DCs within the Rennes site)
benchBranch - the git branch of BashoBench to use
benchFile - the name of the benchmark configuration file to use
Now you should be able to run the experiment
./basho_bench/script/grid5000start.sh ./basho_bench/script/testconfig
Hopefully if there are no errors, once it finishes it will give you the name of a tar file. You need to download this file to your local computer to compile the results. I do this from the basho bench folder on my local computer:
cd ~/basho_bench
scp [email protected]:~/sophia/antidote_bench-2016-01-29-1454081924.tar ./
Then to compile the results you run:
./script/makeGraph.sh antidote_bench-2016-01-29-1454081924
Now you will have a folder with all the results.
In this example it is in the folder called antidote_bench-2016-01-29-1454081924/results-pubsub_weak_meta_data-2dcs-8nodes-2benchNodes
First you need to get an account on Grid5000:
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Get_an_account
Initial setup for the first time:
First you need benchmark images.
Ask me for images if you need them or you can make them yourself by doing the following:
Instructions for getting an images deploying them can be found here
https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2
To configure an image you need to:
1. Install erlang as described here https://github.com/SyncFree/antidote
2. Install antidote in /root/antidote
3. Install basho_bench (https://github.com/SyncFree/basho_bench) in /root/basho_bench[1-6]/basho_bench
4. Put your benchmark key in the /root/.ssh folder
5. Put a file named key that links to your key in the /root/ directory
Now you need to have the antidote image and env file in each site you want to run the experiment
~/antidote_images/antidote_image_init.tgz
~/antidote_images/mywheezy-x64-base.env
In ~/ make a file called key that links to your benchmar key
Also in each site create a ~/basho_bench/script/ folder and place the following files it in:
grid5000start.sh
grid5000start-createnodes.sh
parallel_command.sh
mergeResults.awk
mergeResults.sh
mergeResultsSummary.awk
Thats all for setup!!!
Now how to run a benchmark:
First you need to reserve some nodes (note there are rules for how many nodes you can reserve at a time, see the grid5000 website).
This gives an 30 min reservation with 1 node at rennes and one at lille:
oargridsub -t deploy -w '0:30:00' rennes:rdef="/nodes=1",lille:rdef="/nodes=1"
The following command reserves 22 nodes in the suno cluster at the sophia site, and 22 nodes in the chinqchint cluster at the lille site, for 6 hours, with the reservation set to start at 20h 2015-07-26
oargridsub -t deploy -w '6:00:00' -s '2015-07-26 20:00:00' nancy:rdef="/nodes=22":prop="cluster='grisou'",lille:rdef="/nodes=22":prop="cluster='chinqchint'"
If you wanted to start the reservation immediately remove the -s option.
Note that each site will be created into a DC, for the example above you will have 2 DCs, lille and sophia.
There is an exception, if you set the number of nodes for a certain site to be equal to the number of BenchNodes (see below), then a DC will not be created at this site and will only be a site running basho_bench. Otherwise the reservation should reserve the same number of nodes per site.
Then run this from the frontend to deploy the nodes and run the benchmark:
~/basho_scripts/grid5000start.sh JobId Branch Deploy SecondRun AntidoteNodes BenchNodes BenchInstances BenchConfigFile
The options are the following:
JobId:
This is the id assigned to the reservation. You can see information about the reservations by running oargridstat
Branch:
This is the git branch of antidote to benchmark.
Deploy:
This can be 1 or 0. This should be 1 the first time you run the benchmark on a reservation and 0 for subsequent runs. If it is 1, then the image will be loaded on the nodes.
SecondRun:
This can be 1 or 0. When this is 1 the source code for antidote and basho_bench will be pulled from git and compiled. It should be 0 every time the code should be compiled, for example when you changed something, or the number of benchmark nodes. It should always be zero on the first run of a reservation.
AntidoteNodes:
This is how many nodes per DC will be running as part of the antidote cluster/ring in that DC.
BenchNodes:
This is how many nodes per DC will be running the basho_bench. The sum of AntidoteNodes and BenchNodes should be no greater than the number of nodes reserved per site (but can be less).
BenchInstances:
This is a value 1 through 6. It is the number of basho_bench instances that will be run per basho_bench node. Note that I put this here just because I noticed that a single instance of bash_bench wansn't using all resources on that node, but I didn't look into detail why this happens.
DCs per Cluster:
Number of DCs of antidote that will run per G5K cluster.
BenchConfigFile:
This is the name of the basho_bench configuration file for the benchmark in the examples directory.
An example first run might be:
~/basho_bench/script/grid5000start.sh 53738 weak_meta_data 1 0 5 6 antidote_pb.config
Watch the output for errors, likely the first time you didn't get the configuration completely correct (these instructions aren't super clear).
The benchmark should take some time to run, it will run several tests with read(update) ratios of ("99.99(.01)", "99(1)", "90(10)", "75(25)", "50(50)", "1(99)")
At the end you will get an message saying the name of the result file with all the results tarred up.
Scp this file to your local machine in the basho_bench/ folder (of this git clone in which this readme exists).
Now from this folder run:
./script/makeGraph.sh result_file_name
(Note you should not include the tar extenstion as part of the input).
NOTE: If you want to run multiple benchmarks on different reservations at the same time you can, but run them from different fronends as they may overwrite temporary files from eachother (note I should fix this)
Some other notes:
This doesnt work, don't have the rights to run kdeploy directly from the deploy job:
oargridsub -t deploy -w '0:30:00' rennes:rdef="/nodes=1",lille:rdef="/nodes=1" -p ~/basho_scripts/grid5000start.sh
A grid job that you can ssh to:
oargridsub -t deploy -w '0:30:00' rennes:rdef="/nodes=1",lille:rdef="/nodes=1"
oargridstat
oargridstat -w -l 53584 | sed '/^$/d' > ~/machines
OAR_JOB_ID=53584 oarsh -i /tmp/oargrid//oargrid_ssh_key_tcrain_53579 `head -n 1 machines`
oargriddel 53577