Software Development Methods and Tools—CSCI-3308

Lab 2—Regular expressions

Objectives

Exercises

For each step, please record the commands (and options) that you used to complete the task. At the end, you will receive credit for the lab by showing your TA these commands.

Step 1 - Download practice files

For today’s lab we will be using the following data files:

Download the provided zip file to your home directory with curl and decompress it using the unzip command:

curl -L https://gist.github.com/dgraham/acfdc4ffc2d6e74fd587/archive/f6f52f1d2a89d627cdee9f3ae76f23f4eefa24ce.zip > lab2.zip
unzip lab2.zip -d lab2
cd lab2

Check to make sure each of the above files was correctly unzipped into the lab2 directory.
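
One possible way to do this check is sketched below. It assumes you are still inside the lab2 directory from the cd command above, and it searches recursively because an archive may place its files in a subdirectory rather than directly in lab2. The variable name file_count is only illustrative.

```shell
# Run from inside the lab2 directory created in Step 1.
# List every regular file below the current directory, including any
# that sit inside a subdirectory created by the archive.
find . -type f

# Count the files (tr strips the padding some wc implementations add).
file_count=$(find . -type f | wc -l | tr -d ' ')
echo "files found: $file_count"
```

If the data files show up one level down, cd into that subdirectory before starting the remaining steps.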

Step 2 - Use the diff command

Step 3 - Use the wc command

Step 4 - Use the cut command

Step 5 - Practice using pipes

Step 6 - Use the sed command

Step 7 - Use the awk command

Step 8 - More practice with regular expressions

For the following problems, use grep or egrep (equivalent to grep -E) with the regex_practice_data.txt file.

  1. How many phone numbers are in the dataset?
  2. How many City of Boulder phone numbers (e.g., starting with 303-441-…) are there?
  3. How many email addresses?
  4. How many email addresses are from government domains (e.g. ‘.gov’)?
  5. How many email addresses are in ‘first.last’ name format AND involve someone whose first name starts with a letter in the first half of the alphabet?
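
Because every question above asks "how many", it helps to know what grep is actually counting. The sketch below uses a small made-up sample file and a five-digit pattern (neither comes from the lab data) to show that grep -c counts matching lines, while grep -o piped into wc -l counts individual matches. The names sample, pattern, line_count, and match_count are illustrative only.

```shell
# Hypothetical sample data; this is NOT the lab's regex_practice_data.txt.
sample=$(mktemp)
printf '%s\n' \
  'order 12345 shipped to 80302 and 80309' \
  'no digits here' \
  'zip 80301' > "$sample"

# A run of five digits, written as an extended regular expression.
pattern='[0-9]{5}'

# -c counts matching LINES, so a line with several matches counts once.
line_count=$(grep -E -c "$pattern" "$sample")

# -o prints each match on its own line; wc -l then counts MATCHES.
# tr strips the padding some wc implementations add.
match_count=$(grep -E -o "$pattern" "$sample" | wc -l | tr -d ' ')

echo "lines: $line_count, matches: $match_count"
rm -f "$sample"
```

On this sample the two counts differ (2 lines, 4 matches), so decide which one each question is asking for before you record your command.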

Credit

To get credit for this lab exercise, show the TA the commands you recorded for each step and run them.

Lab material by Liz Boese.