Assignment 1—Bash shell scripts
Due date
February 2, 2017 at 6:00 pm
Objectives
- Write a bash script with bash shell commands to loop through the file
- Write a bash script using Unix commands like awk
- Practice regular expressions
- Use Google to figure out what you don’t know
- [optional] Experience pair programming (or work alone)
Part 1 - Bash shell scripts
Submit two separate bash shell scripts that perform these steps:
- Accept a file name as the first command line argument
- Read in the data file
- Calculate the average of the scores
- Sort the output by last name then first name
- Format the output as shown below
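As a minimal sketch of the sorting step only, sort can order lines on more than one field. The sample lines and their column layout are made up for illustration and are not the assignment data; LC_ALL=C is set here only to make the ordering predictable across machines.

```shell
#!/bin/bash
# Hypothetical sample with three whitespace-separated columns:
# <code> <given name> <family name>
sample=$(mktemp)
cat > "$sample" <<'EOF'
3 Robin Vale
1 Alex Vale
2 Robin Adams
EOF

# -k3,3 sorts on field 3 only; -k2,2 breaks ties using field 2.
sorted=$(LC_ALL=C sort -k3,3 -k2,2 "$sample")
echo "$sorted"

rm -f "$sample"
```

Each -k option names a start and end field, so -k3,3 compares only the third column rather than everything from the third column to the end of the line.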
The objective of writing two scripts is to see that there are multiple correct
solutions to such problems. One solution should use awk, and the other should
use bash commands.
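To illustrate how the two styles differ, here is a small sketch that processes the same file once with a bash read loop and once with awk. The file contents (an item name followed by two quantities) are hypothetical and unrelated to the grades data.

```shell
#!/bin/bash
# Hypothetical input: "<item> <qty1> <qty2>" on each line.
data=$(mktemp)
printf '%s\n' 'apples 3 4' 'pears 10 5' > "$data"

# Pure bash: read splits each line on whitespace into named variables.
bash_out=$(while read -r item a b; do
    echo "$item $((a + b))"
done < "$data")

# awk: each line is split into $1, $2, ... automatically.
awk_out=$(awk '{ print $1, $2 + $3 }' "$data")

echo "$bash_out"
echo "$awk_out"

rm -f "$data"
```

Both approaches print the same two lines here; the bash version spells out the loop, while awk runs its block once per input line on its own.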
An example data file is shown below.
123456789 Lee Johnson 72 85 90
999999999 Jaime Smith 90 92 91
888111818 JC Forney 100 81 97
290010111 Terry Lee 100 99 100
199144454 Tracey Camp 77 84 84
299226663 Laney Camp 70 74 71
434401929 Skyler Camp 78 81 82
928441032 Jess Forester 85 80 82
928441032 Chris Forester 97 94 89
If the data above is stored in a file named grades.txt, we would
execute the two scripts with:
$ grades.sh grades.txt
$ grades-awk.sh grades.txt
Output
71 [299226663] Camp, Laney
80 [434401929] Camp, Skyler
81 [199144454] Camp, Tracey
93 [928441032] Forester, Chris
82 [928441032] Forester, Jess
92 [888111818] Forney, JC
82 [123456789] Johnson, Lee
99 [290010111] Lee, Terry
91 [999999999] Smith, Jaime
Note: we will accept both rounded and truncated averages.
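The difference between the two accepted forms can be seen in a short sketch. The total 215 is the sum of the three scores for Laney Camp in the example data (70 + 74 + 71); the variable names are illustrative only.

```shell
#!/bin/bash
total=215   # 70 + 74 + 71 from the example data
count=3

# Bash arithmetic is integer-only, so the division truncates.
truncated=$(( total / count ))

# awk computes in floating point; printf "%.0f" rounds to the nearest integer.
rounded=$(awk -v t="$total" -v n="$count" 'BEGIN { printf "%.0f", t / n }')

echo "truncated: $truncated"
echo "rounded: $rounded"
```

The sample output above shows 71 for this student, which matches the truncated form.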
Part 2 - Regular expressions
Copy the provided zip file to your home directory and decompress it using the unzip command:
curl -L https://gist.github.com/dgraham/acfdc4ffc2d6e74fd587/archive/f6f52f1d2a89d627cdee9f3ae76f23f4eefa24ce.zip > hw1.zip
unzip hw1.zip -d hw1
cd hw1
Only the regex_practice_data.txt file is required for the assignment.
Write a shell script that uses regular expressions to answer the following questions.
Hint: grep and egrep are your friends (egrep treats { } differently than grep).
Be sure to check for word boundaries in your answers using \b where appropriate.
Pipe answers to wc -l to get the count.
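The sketch below shows these three hints on a made-up sample file, not on the assignment data. It assumes GNU grep, where \b is available as a word-boundary extension and grep -E behaves like egrep.

```shell
#!/bin/bash
# Hypothetical sample lines, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
cat
concatenate
the cat sat
aaa
aa
EOF

# Without \b, "cat" also matches inside "concatenate".
loose=$(grep -E 'cat' "$sample" | wc -l)

# With \b on both sides, only the whole word "cat" matches.
strict=$(grep -E '\bcat\b' "$sample" | wc -l)

# In grep -E (egrep), {3} is a repetition count; plain grep needs \{3\}.
ere=$(grep -E '^a{3}$' "$sample" | wc -l)
bre=$(grep '^a\{3\}$' "$sample" | wc -l)

echo "loose=$loose strict=$strict ere=$ere bre=$bre"

rm -f "$sample"
```

On this sample the unanchored pattern counts three lines and the word-bounded one counts two, which is the kind of difference that word boundaries are meant to catch.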
There are multiple correct solutions for each of the questions below.
- How many lines end with a number?
- How many lines do not start with a vowel?
- How many 12-letter (alphabet only) lines are there?
- How many phone numbers are in the dataset?
  format: _ _ _-_ _ _-_ _ _ _
- How many city of Boulder phone numbers are there?
  format: 303-_ _ _-_ _ _ _
- How many lines begin with a vowel and end with a number?
- How many email addresses are from geocities (e.g. end with geocities.com)?
- How many incorrect email addresses are there (lines with an @ in them but formatted
  incorrectly)? An email address has a user id, and domain names can consist of
  letters, numbers, periods, and dashes. An email address has to have a
  top-level domain (e.g. something.com).
Suggestions
- Talk to each other about the numerical answers that you get.
- You can redirect output to a file then run diff between two files to see what the differences are to help figure out where one solution may be wrong.
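A minimal sketch of that comparison follows. The two files stand in for answers that would normally be produced by redirecting each person's script output; the numbers in them are invented.

```shell
#!/bin/bash
# Stand-ins for two saved answer files; the counts are made up.
mine=$(mktemp)
partner=$(mktemp)
printf '%s\n' 12 7 30 > "$mine"
printf '%s\n' 12 9 30 > "$partner"

# diff exits with 0 when the files match and 1 when they differ.
if diff "$mine" "$partner" > /dev/null; then
    verdict="same"
else
    verdict="different"
fi

diff "$mine" "$partner"   # prints the lines that differ (line 2 here)
echo "$verdict"

rm -f "$mine" "$partner"
```

Because each output line maps to one question, the line number that diff reports points directly at the question where the two solutions disagree.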
Requirements
- Scripts must be bash files named grades.sh, grades-awk.sh, and regex-answers.sh
- The first line must be #!/bin/bash
- The second line in each file is a comment with your name (and your partner’s name if you pair program).
- For all scripts, read in the name of the data file from command line arguments. We will test with additional data files that have different names!
- Grades scripts
- The data files for the grades scripts will be in the same format as shown, though they may have more or fewer lines. All students have three grades in the data files.
- Print out the data with the average, the ID in square brackets, then the last name, comma, space, first name.
- The output also needs to be sorted, first based on the last name. If the last name is the same, sort then on the first name. If the person has the same last name and first name, then sort based on the ID. All IDs are unique in the file.
- If the program is run without a file name as the single command line argument, print out the usage statement for that script:
Usage: grades.sh filename
Usage: grades-awk.sh filename
You may notice that this is how most Unix commands are set up.
- Regex program
- Each line of output should map to the question. There are eight questions,
so you should have exactly eight lines of output, each of which is the output from
calling wc -l. If you do not know how to answer one of the questions, print out this placeholder so that the rest of your answers align in the output: echo "0"
- If the program is run without a file name as the single command line argument, print out the usage statement:
Usage: regex-answers.sh filename
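One possible way to check the argument count is sketched below. The function name check_args is hypothetical, and the non-zero return status is an assumption, since the requirements only specify the message text. In a real script the same test would typically sit near the top and call exit instead of return.

```shell
#!/bin/bash
# check_args is a hypothetical helper; it succeeds only when given
# exactly one argument, and prints the usage statement otherwise.
check_args() {
    if [ "$#" -ne 1 ]; then
        echo "Usage: grades.sh filename"
        return 1   # non-zero status is an assumption, not a stated requirement
    fi
    return 0
}

check_args                 # prints the usage statement
check_args grades.txt && echo "one argument: ok"
```

The special parameter $# holds the number of arguments, so the check also rejects calls with two or more arguments, matching the "single command line argument" wording.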
Credit
To receive credit for this assignment:
- You may pair program, but groups are limited to two people (work in a pair or alone). If you do pair program, only one of you submits.
- Zip all three files, save the archive as Lastname_HW1.zip, and submit it. If you are pair programming, name the file Lastname1_Lastname2_HW1.zip; only one of you may submit it online to the Homework 1 Submission.
Assignment material by Liz Boese.