-
Take this short interactive regex tutorial.
-
Find the number of words (in `/usr/share/dict/words`) that contain at least three `a`s and don't have a `'s` ending. What are the three most common last two letters of those words? `sed`'s `y` command, or the `tr` program, may help you with case insensitivity. How many of those two-letter combinations are there? And for a challenge: which combinations do not occur?
Solutions:
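To count the words, run the following command in your terminal. It lowercases the list, keeps the words with at least three `a`s, drops those ending in `'s`, and counts what is left.
tr "[:upper:]" "[:lower:]" < /usr/share/dict/words | awk '/a.*a.*a/' | grep -v "'s$" | wc -l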
Or use `grep` to achieve the same result.
grep -i "a.*a.*a" /usr/share/dict/words | grep -vc "'s$"
List the three most common last two letters with the following command. To see how often each of them occurs, drop the trailing `awk '{print $2}'` stage.
grep -i "a.*a.*a" /usr/share/dict/words | grep -v "'s$" | tr "[:upper:]" "[:lower:]" | awk '{print substr($0, length($0)-1)}' | sort | uniq -c | sort -rnk1,1 | head -n3 | awk '{print $2}'
To count the distinct combinations, replace everything after the first `awk` stage with `sort -u | wc -l`.
To find the combinations that do not occur, run the following bash script.
[todo]
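One possible approach, as a minimal sketch rather than a definitive solution: collect the two-letter endings that do occur, then print every pair from `aa` to `zz` that is absent from that set. The function names `seen_pairs` and `missing_pairs` are hypothetical, and the sketch assumes that only the 26 ASCII letters count as letters.

```bash
#!/usr/bin/env bash

# seen_pairs (hypothetical name): read words on stdin, keep those with at
# least three a's and no 's ending, and print each distinct two-letter ending.
seen_pairs() {
  tr '[:upper:]' '[:lower:]' \
    | grep 'a.*a.*a' \
    | grep -v "'s\$" \
    | awk '{ print substr($0, length($0) - 1) }' \
    | sort -u
}

# missing_pairs (hypothetical name): print every pair aa..zz that is not
# among the endings produced by seen_pairs.
missing_pairs() {
  seen_pairs | awk '
    { seen[$0] = 1 }
    END {
      letters = "abcdefghijklmnopqrstuvwxyz"
      for (i = 1; i <= 26; i++)
        for (j = 1; j <= 26; j++) {
          pair = substr(letters, i, 1) substr(letters, j, 1)
          if (!(pair in seen)) print pair
        }
    }'
}
```

With these defined, `missing_pairs < /usr/share/dict/words` lists the combinations that never occur, and piping that into `wc -l` counts them.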
-
To do in-place substitution it is quite tempting to do something like `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However, this is a bad idea. Why? Is this particular to `sed`? Use `man sed` to find out how to accomplish this.
Solutions:
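The pitfall in the question, and two ways around it, can be sketched as follows. This is a hedged sketch: it assumes the installed `sed` accepts a backup suffix attached directly to `-i` (GNU and BSD `sed` both do), and the helper names `replace_in_place` and `replace_via_temp` are hypothetical.

```bash
#!/usr/bin/env bash

# Note: `sed EXPR input.txt > input.txt` lets the shell truncate input.txt
# before sed gets to read it, so the file ends up empty.

# replace_in_place (hypothetical name): apply a sed expression to a file in
# place, keeping the original as FILE.bak.
# Usage: replace_in_place 's/REGEX/SUBSTITUTION/' input.txt
replace_in_place() {
  sed -i.bak "$1" "$2"
}

# replace_via_temp (hypothetical name): the same idea without -i. Write the
# result to a temporary file first and only then overwrite the original, so
# the input is never truncated while sed is still reading it.
replace_via_temp() {
  tmp=$(mktemp) || return 1
  if sed "$1" "$2" > "$tmp"; then
    cat "$tmp" > "$2"
  fi
  rm -f "$tmp"
}
```

The second helper copies the result back with `cat` rather than `mv` so that the original file keeps its permissions; the trade-off is that the final overwrite is not atomic.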
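The shell sets up the `> input.txt` redirection, truncating the file, before `sed` starts, so `sed` reads an empty input and the original content is lost. This is not particular to `sed`: it happens with any command whose output is redirected to its own input file. We can use `-i extension` to edit files in place while saving backups with the specified extension. If a zero-length extension is given, no backup will be saved. Giving a zero-length extension when editing files in place is not recommended, as you risk corruption or partial content in situations where disk space is exhausted, etc.
-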
Find your average, median, and max system boot time over the last ten boots. Use `journalctl` on Linux and `log show` on macOS, and look for log timestamps near the beginning and end of each boot. On Linux, they may look something like:
Logs begin at ...
and
systemd[577]: Startup finished in ...
On macOS, look for:
=== system boot:
and
Previous shutdown cause: 5
Solutions: (on Linux)
journalctl | grep -E 'Startup finished in (\w\.?)+ \(kernel\) \+' | tail -n10 | sed -E 's/(.*) = (.*)s./\2/' | R --slave -e 'x <- scan(file="stdin", quiet=TRUE); summary(x)'
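If R is not installed, the same summary can be approximated with `sed`, `sort`, and `awk`. This is a sketch under assumptions: the function names `boot_seconds` and `boot_stats` are hypothetical, and each total is assumed to appear as something like `= 13.7s.` at the end of the line, so boots that took a minute or longer (printed with a `min` component) are skipped.

```bash
#!/usr/bin/env bash

# boot_seconds (hypothetical name): read journal lines on stdin and print the
# total boot time in seconds for each "Startup finished" line that includes a
# kernel phase. Totals of a minute or more are not matched by this pattern.
boot_seconds() {
  sed -nE 's/.*Startup finished in .*\(kernel\).* = ([0-9.]+)s\.$/\1/p'
}

# boot_stats (hypothetical name): read one number per line on stdin and print
# the average, median, and maximum.
boot_stats() {
  sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) median = v[(NR + 1) / 2]
      else        median = (v[NR / 2] + v[NR / 2 + 1]) / 2
      printf "avg=%.3f median=%.3f max=%.3f\n", sum / NR, median, v[NR]
    }'
}
```

A possible invocation is `journalctl | boot_seconds | tail -n10 | boot_stats`, which summarizes the ten most recent matching boots.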
-
Look for boot messages that are not shared between your past three reboots (see `journalctl`'s `-b` flag). Break this task down into multiple steps. First, find a way to get just the logs from the past three boots. There may be an applicable flag on the tool you use to extract the boot logs, or you can use `sed '0,/STRING/d'` to remove all lines previous to one that matches `STRING`. Next, remove any part of the line that always varies (like the timestamp). Then, de-duplicate the input lines and keep a count of each one (`uniq` is your friend). And finally, eliminate any line whose count is 3 (since it was shared among all the boots).
-
Find an online data set like this one, this one, or maybe one from here. Fetch it using `curl` and extract just two columns of numerical data. If you're fetching HTML data, `pup` might be helpful. For JSON data, try `jq`. Find the min and max of one column in a single command, and the sum of the difference between the two columns in another.
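Once the two numeric columns are extracted, the arithmetic can be done with `awk`. A minimal sketch, assuming the input is whitespace-separated with one pair of numbers per line; the function names `min_max` and `sum_diff` are hypothetical.

```bash
#!/usr/bin/env bash

# min_max (hypothetical name): print the minimum and maximum of the first
# column of stdin, separated by a space.
min_max() {
  awk '
    NR == 1 { min = max = $1 + 0 }
    { x = $1 + 0; if (x < min) min = x; if (x > max) max = x }
    END { if (NR > 0) print min, max }'
}

# sum_diff (hypothetical name): print the sum over all lines of
# (first column - second column).
sum_diff() {
  awk '{ sum += $1 - $2 } END { print sum + 0 }'
}
```

For JSON data, something like `curl -s URL | jq -r '.[] | "\(.x) \(.y)"' | min_max` would feed these functions, where `URL` and the field names `x` and `y` are placeholders for whatever the chosen data set provides.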