Utilities Shell /Unix/: March 2017

Tuesday, 14 March 2017

Most Useful Linux Command Line Tricks

We run many Linux commands every day. We pick up tricks from the web, but if we don't practice them, we forget them. I've decided to make a list of tips and tricks that you may have forgotten or that may be entirely new to you.

Display Output as a Table

Sometimes the output of a command is hard to read because the fields run together (for example, the output of the mount command). How about viewing it as a table? This is easy to do!

mount | column -t

In this example, the output is well-formatted because of the spaces. What if the separators were something else, like colons? (For example, in the output of cat /etc/passwd.)

Just specify the separator with the -s parameter, like below.

cat /etc/passwd | column -t -s:

Repeat a Command Until It Runs Successfully

If you search Google for this, you will find that a lot of people have asked how to repeat a command until it succeeds. Typical uses include pinging a server until it comes up, checking whether a file with a specific extension has been uploaded to a specific directory, checking whether a specific URL exists yet, and so on.

You can use the while true loop to achieve that:

In this example, >/dev/null 2>&1 redirects the output of your program to /dev/null, including both standard output and standard error.
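A loop of the kind described above might look like the following sketch. The helper name retry_until_success and the RETRY_DELAY variable are assumptions made for illustration; they do not come from the original post.

```shell
# retry_until_success: hypothetical helper that reruns a command until it exits 0.
# RETRY_DELAY is an assumed variable: seconds to wait between attempts (default 1).
retry_until_success() {
    while true; do
        # Discard stdout and stderr; leave the loop once the command succeeds.
        "$@" >/dev/null 2>&1 && break
        sleep "${RETRY_DELAY:-1}"
    done
}

# Example (commented out because it needs network access):
# retry_until_success ping -c 1 example.com
```

The pause between attempts keeps the loop from spinning at full speed while the command keeps failing.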

This is one of the coolest Linux command line tricks to me.

Sort Processes by Memory Usage

ps aux | sort -rnk 4

Sort Processes by CPU Usage

ps aux | sort -nk 3

To check whether your system is 32-bit or 64-bit, run getconf LONG_BIT.

Watch Multiple Log Files at the Same Time

You can use the tail command to watch a log without problems, but sometimes you may want to watch multiple log files at once. The multitail command does that, and it supports text highlighting, filtering, and many more features that you may need.

You can install it if it is not found on your system with apt-get install multitail.

Return to Your Previous Directory

Just type cd - and you will return to the previous directory.

Make a Non-Interactive Shell Session Interactive

To do this, move the settings from ~/.bashrc to ~/.bash_profile. Bash reads ~/.bash_profile for login shells and ~/.bashrc for interactive non-login shells.

Monitor Command Output at Regular Intervals

Using the watch command (watch df -h), you can rerun any command at regular intervals and watch its output. For example, you can watch the free disk space and how it changes.

You can imagine what you can do with changing data by using the watch command.

Run Program After Session Killing

When you run a program in the background and close your shell, the program is killed along with it. How can you keep the program running after closing the shell?

This can be done using the nohup command, which stands for no hang-up:

nohup wget site.com/file.zip

This command is one of the most forgotten Linux command line tricks because many of us use another command, like screen.

A file will be generated in the same directory with the name nohup.out, which contains the output of the running program.

Cool command, right?

Automatically Answer Yes or No to Any Command

If you want to automate a process that requires the user to answer yes, you can do that with the yes command: yes | apt-get update.

Maybe you want to automate saying "no" instead. This can be done using yes no | command.

Create File With a Specific Size

You can create a file with a specific size using the dd command: dd if=/dev/zero of=out.txt bs=1M count=10.

This will create a file of 10 megabytes filled with zeros.
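To confirm the size, a sketch like the one below creates the file in a temporary directory and prints its byte count. The helper name make_zero_file is an assumption for illustration, and the bs=1M spelling assumes GNU dd.

```shell
# make_zero_file: hypothetical helper; writes COUNT blocks of 1 MiB of zeros to FILE.
make_zero_file() {
    # dd reports its statistics on stderr, so silence them here.
    dd if=/dev/zero of="$1" bs=1M count="$2" 2>/dev/null
}

workdir=$(mktemp -d)
make_zero_file "$workdir/out.txt" 10
# wc -c prints the size in bytes: 10 * 1024 * 1024 = 10485760.
wc -c < "$workdir/out.txt"
```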

Run Your Last Command as Root

Sometimes, you forget to type sudo before a command that requires root privileges. You don’t have to retype it; just type sudo !! and the shell reruns your last command with sudo in front.

Record Your Command Line Session

If you want to record what you’ve typed on your shell screen, you can use the script command. Run script with no arguments and it saves your session to a file named typescript.

Once you type exit, all of your commands will be written to that file so you can review it later.

Replace Spaces With Tabs

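You can replace any character with any other using the tr command, which is very handy: cat geeks.txt | tr '[:space:]' '\t' > out.txt. Note that [:space:] also matches newlines; use tr ' ' '\t' if you want to replace only the space character.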

Convert a File to Upper or Lower Case

You can do this using: cat myfile | tr a-z A-Z > output.txt.
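The same conversion can be written with POSIX character classes, which follow the current locale instead of relying on the a-z range. The helper names to_upper and to_lower below are assumptions for illustration.

```shell
# to_upper / to_lower: hypothetical helpers that read stdin and convert its case.
to_upper() { tr '[:lower:]' '[:upper:]'; }
to_lower() { tr '[:upper:]' '[:lower:]'; }

echo "Hello World" | to_upper   # prints HELLO WORLD
echo "Hello World" | to_lower   # prints hello world
```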

Powerful Xargs Command

The xargs command is one of the most important Linux command line tricks. You can use this command to pass the output of one command to another command as arguments. For example, you may search for PNG files and compress them or do anything else with them:

find . -name "*.png" -type f -print | xargs tar -cvzf images.tar.gz

Or, maybe you have a list of URLs in a file and you want to download them or process them in a different way:

cat urls.txt | xargs wget

Keep in mind that the output of the first command is passed at the end of the xargs command.

What if your command needs the output in the middle? Easy!

Just use {} combined with the -i parameter, like below, to replace arguments in the place where the output of the first command should go (GNU xargs marks -i as deprecated and recommends the equivalent -I {}):

ls /etc/*.conf | xargs -i cp {} /home/likegeeks/Desktop/out
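To see the substitution without copying anything, here is a small sketch that uses echo as a stand-in for the real command. The file names and the /tmp/out path are made up for the example.

```shell
# Each input line replaces {} in the command; echo stands in for a real command such as cp.
printf '%s\n' a.conf b.conf | xargs -I {} echo "copy {} to /tmp/out"
```

With -I, xargs runs the command once per input line, so this prints one "copy ... to /tmp/out" line for each name.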

These are only a few Linux command line tricks. There are some more geeky things that you can do using other commands, like the awk command and the sed command!

35+ Examples of Regex Patterns Using sed and awk in Linux

In order to work successfully with the Linux sed editor and the awk command in your shell scripts, you have to understand regular expressions, or regex for short (and to be accurate in our case, it is bash regex). Though there are many regex engines you can use, I've decided that in this tutorial we will use the shell regex, so you can see bash working with regex.

First, we need to understand what regex is, then we will dive deep into using it.

Our Main Points Are:

1. What is regex?

2. Types of regex.

3. Define BRE Patterns.

4. Special characters.

5. Anchor characters.

6. The dot character.

7. Character classes.

8. Negating character classes.

9. Using ranges.

10. Special character classes.

11. The asterisk.

12. Extended Regular Expressions.

13. Grouping expressions.

14. Practical examples.

What is Regex

When some people see regular expressions for the first time, they say, 'What are those ASCII buggers?' Well, a regular expression, or regex, is in general a pattern of text you define that a Linux program like sed or awk (in our case) uses to filter text.

The regex pattern makes use of wildcard characters to represent one or more characters in the data stream. We’ve seen some of those wildcard characters when introducing basic Linux commands and saw how the ls command uses wildcard characters to filter output.

Types of Regex

Many different applications use different types of regex in Linux. These include programming languages (Java, Perl, Python), Linux programs (such as sed, awk, and grep), and many other applications.

A regex is implemented using a regular expression engine. A regular expression engine is the underlying software that interprets regular expression patterns and uses those patterns to match the text.

Linux has two regular expression engines:

The POSIX Basic Regular Expression (BRE) engine
The POSIX Extended Regular Expression (ERE) engine

Most Linux programs at a minimum conform to the POSIX BRE engine specifications, recognizing all the pattern symbols it defines. Unfortunately, some utilities (such as sed) conform only to a subset of the BRE engine specifications. This is due to speed constraints, because sed attempts to process text as quickly as possible.

The POSIX ERE engine is often found in programming languages. It provides advanced pattern symbols as well as special symbols for common patterns, such as matching digits and words. The awk command uses the ERE engine to process its regular expression patterns.

And because there are so many different ways to implement regex, it’s hard to write patterns that work on all engines. Hence we will focus on the most commonly found regex and demonstrate how to use them in sed and awk.

Define BRE Patterns

The most basic BRE pattern is matching text characters in a data stream, and we’ve seen that using sed and awk, but let’s refresh our memory.

$ echo "This is a test" | sed -n '/test/p'

$ echo "This is a test" | awk '/test/{print $0}'

You may notice that the regex doesn’t care where in the data stream the pattern occurs. It also doesn’t matter how many times the pattern occurs. As soon as the regex matches the pattern anywhere in the text string, it passes the string along to the Linux program that’s using it.

The first rule to remember is that regular expression patterns are case sensitive.

$ echo "This is a test" | awk '/Test/{print $0}'

$ echo "This is a test" | awk '/test/{print $0}'

The first regex found no match because the word “Test” with an uppercase T doesn’t appear in the text string, while the second line, which uses only lowercase letters in the pattern, worked just fine.

You also don’t have to limit yourself to single text words in the regular expression. You can include spaces and numbers in your text string as well.

$ echo "This is a test 2 again" | awk '/test 2/{print $0}'

Spaces are treated just like any other character in regex.

Special Characters

There are a few exceptions when defining text characters in a regex. Regex patterns assign a special meaning to a few characters. If you try to use these characters in your text pattern, you won’t get the results you were expecting.

The following special characters are recognized by regex:

.*[]^${}\+?|()

If you want to use one of the special characters as a text character, you need to escape it. The special character that does this is the backslash character (\). For example, if you want to search for a dollar sign in your text, just precede it with a backslash character like this:

$ cat myfile
There is 10$ on my pocket
$ awk '/\$/{print $0}' myfile

Also, backslash itself is a special character, so if you need to use it in a regex pattern, you need to escape it as well, producing a double backslash.

$ echo "\ is a special character" | awk '/\\/{print $0}'

Although the forward slash isn’t a regular expression special character, you get an error if you use it in your regular expression pattern in sed or awk, because both use it as the pattern delimiter.

$ echo "3 / 2" | awk '///{print $0}'

So you need to escape it like this:

$ echo "3 / 2" | awk '/\//{print $0}'
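Escaping is not the only way around the delimiter problem. As a sketch of some alternatives: sed lets you pick a different delimiter, and awk accepts a regex written as a string. The sample inputs are made up for the example.

```shell
# In a sed address, \# starts a regex delimited by # instead of /.
echo "3 / 2" | sed -n '\#/#p'
# In awk, a regex given as a string avoids the slash delimiters entirely.
echo "3 / 2" | awk '$0 ~ "/" {print $0}'
# The s command of sed accepts another delimiter directly.
echo "/usr/local/bin" | sed 's#/#:#g'
```

The first two commands print the input line unchanged, and the third prints :usr:local:bin.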

Anchor Characters

You can use two special characters to anchor a pattern to either the beginning or the end of lines in the text. The caret character (^) defines a pattern that starts at the beginning of a line of text. If the pattern is located at any place other than the start of the line, the regex pattern fails.

You can use it like this:

$ echo "welcome to likegeeks website" | awk '/^likegeeks/{print $0}'
$ echo "likegeeks website" | awk '/^likegeeks/{print $0}'

The caret anchor character checks for the pattern at the beginning of each new line of data.

$ awk '/^this/{print $0}' myfile

Great!! When using sed, if you position the caret character in any place other than at the beginning of the pattern, it acts like a normal character and not as a special character.

$ echo "This ^ is a test" | sed -n '/s ^/p'

But if you use awk you have to escape it like this:

$ echo "This ^ is a test" | awk '/s \^/{print $0}'

The above code pertains to the beginning of the text, but what about looking at the end?

The dollar sign ($) special character defines the end anchor:

$ echo "This is a test" | awk '/test$/{print $0}'

You can combine both the start and end anchor on the same line like this:

$ cat myfile
this is a test
This is another test
And this is one more
$ awk '/^this is a test$/{print $0}' myfile

As you can see, it prints only the line that has the matching pattern.

You can filter blank lines with the following pattern:

$ awk '!/^$/{print $0}' myfile

Here we introduce the negation which is done by the exclamation mark (!).

The pattern looks for lines that have nothing between the start and end of the line and negates that to print only the lines that have text.
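Here is a self-contained sketch of the blank-line filter, using made-up sample text instead of myfile. The function name strip_blank_lines is an assumption for illustration, and the sed line shows an equivalent filter.

```shell
# strip_blank_lines: hypothetical helper; prints only the lines that are not empty.
strip_blank_lines() { awk '!/^$/{print $0}'; }

printf 'first\n\nsecond\n\nthird\n' | strip_blank_lines
# The same filter in sed: delete every line that matches ^$.
printf 'first\n\nsecond\n\nthird\n' | sed '/^$/d'
```

Both commands print first, second, and third on consecutive lines. Lines that contain only spaces or tabs are not empty, so they pass through.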

The Dot Character

The dot character is used to match any single character except a newline character.

Look at the following example to get the idea:

$ cat myfile
this is a test
This is another test
And this is one more
start with this
$ awk '/.st/{print $0}' myfile

You can see from the result that it prints only the first two lines because they contain the st pattern while the third line does not have that pattern, and the fourth line starts with st so that also doesn’t match our pattern.

Character Classes

You can match any character with the dot special character, but what if you want to limit what characters to match? This is called a character class.

You can define a set of characters that would match a position in a text pattern. If one of the characters from the character set is in the text, it matches the pattern.

To define a character class, you use square brackets [] like this:

$ awk '/[oi]th/{print $0}' myfile

Here we search for any th that has the character o or i before it.

This comes in handy when you are searching for words that may contain upper or lower case letters, and you are not sure about that.

$ echo "this is a test" | awk '/[Tt]his is a test/{print $0}'
$ echo "This is a test" | awk '/[Tt]his is a test/{print $0}'

Of course, it is not limited to letters; you can use digits or any other characters in the class.

Negating Character Classes

You can also reverse the effect of a character class. Instead of looking for a character contained in the class, you can look for any character that’s not in the class. To do that, just place a caret character at the beginning of the character class range.

$ awk '/[^oi]th/{print $0}' myfile

By negating the character class, the regex pattern matches th preceded by any character that’s neither an o nor an i.

Using Ranges

You can use a range of characters within a character class by using the dash symbol like this:

$ awk '/[e-p]st/{print $0}' myfile

This matches any single character between e and p that is followed by st. You can also use ranges for numbers.

$ echo "123" | awk '/[0-9][0-9][0-9]/'
$ echo "12a" | awk '/[0-9][0-9][0-9]/'
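Without anchors, the three-digit pattern also matches longer input such as 1234, because the three digits only need to appear somewhere in the line. A sketch that combines the ranges with the ^ and $ anchors follows; the helper name is_three_digits is an assumption for illustration.

```shell
# is_three_digits: hypothetical helper; exit status 0 only for exactly three digits.
is_three_digits() {
    echo "$1" | awk '/^[0-9][0-9][0-9]$/ {found = 1} END {exit !found}'
}

is_three_digits 123  && echo "123 matches"
is_three_digits 1234 || echo "1234 does not match"
is_three_digits 12a  || echo "12a does not match"
```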

You can also specify multiple, non-continuous ranges in a single character class.

$ awk '/[a-fm-z]st/{print $0}' myfile

The character class allows the ranges a through f, and m through z to appear before the st text.

Special Character Classes

The BRE contains special character classes you can use to match against specific types of characters.

And this is the list:

[[:alpha:]] - Matches any alphabetical character, either upper or lower case.

[[:alnum:]] - Matches any alphanumeric character 0–9, A–Z, or a–z.

[[:blank:]] - Matches a space or Tab character.

[[:digit:]] - Matches a numerical digit from 0 through 9.

[[:lower:]] - Matches any lowercase alphabetical character a–z.

[[:print:]] - Matches any printable character.

[[:punct:]] - Matches a punctuation character.

[[:space:]] - Matches any whitespace character: space, Tab, NL, FF, VT, CR.

[[:upper:]] - Matches any uppercase alphabetical character A–Z.

You can use them like this:

$ echo "abc" | awk '/[[:alpha:]]/{print $0}'
$ echo "abc" | awk '/[[:digit:]]/{print $0}'
$ echo "abc123" | awk '/[[:digit:]]/{print $0}'
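These classes also work in sed substitutions, and they can be negated like any other bracket expression. A short sketch with made-up input strings:

```shell
# Remove every digit.
echo "abc123" | sed 's/[[:digit:]]//g'
# Remove everything that is NOT a digit (negated class).
echo "abc123" | sed 's/[^[:digit:]]//g'
# Remove punctuation.
echo "Hello, World!" | sed 's/[[:punct:]]//g'
```

The three commands print abc, 123, and Hello World respectively.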

The Asterisk

Placing an asterisk after a character signifies that the character must appear zero or more times in the text to match the pattern.

$ echo "test" | awk '/tes*t/{print $0}'
$ echo "tessst" | awk '/tes*t/{print $0}'

This pattern symbol is commonly used for handling words that have a common misspelling or variations in language spellings.

$ echo "I like green color" | awk '/colou*r/{print $0}'
$ echo "I like green colour " | awk '/colou*r/{print $0}'

In these examples, the pattern matches whether you spell it color or colour, because the asterisk lets the u character appear zero times or many times.

Another handy feature is combining the dot character with the asterisk character. This combination provides a pattern to match any number of any characters.

$ awk '/this.*test/{print $0}' myfile

It doesn’t matter how many words are between the words this and test; any such line will match and be printed.

The asterisk can also be applied to a character class.

$ echo "st" | awk '/s[ae]*t/{print $0}'
$ echo "sat" | awk '/s[ae]*t/{print $0}'
$ echo "set" | awk '/s[ae]*t/{print $0}'

All three examples match because the asterisk allows the a or e character to appear zero or more times.

Extended Regular Expressions

The POSIX ERE patterns include a few additional symbols that are used by some Linux applications and utilities. The awk command recognizes the ERE patterns, but sed doesn’t by default (GNU sed and BSD sed accept them when run with the -E option). We will discuss the commonly used ERE pattern symbols that you can use in your awk program scripts.
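The sed limitation applies to its default mode. As a sketch, assuming a sed that accepts the -E option (GNU sed and BSD sed do), the same ERE symbols become available:

```shell
# Default (BRE) mode: the plus sign is a literal character, so nothing is printed.
echo "tesst" | sed -n '/tes+t/p'
# With -E the plus sign means "one or more", so the line is printed.
echo "tesst" | sed -E -n '/tes+t/p'
```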

The Question Mark

The question mark indicates that the preceding character can appear zero times or once so no repeating here.

$ echo "tet" | awk '/tes?t/{print $0}'
$ echo "test" | awk '/tes?t/{print $0}'
$ echo "tesst" | awk '/tes?t/{print $0}'

You can use the question mark symbol along with a character class.

$ echo "tst" | awk '/t[ae]?st/{print $0}'
$ echo "test" | awk '/t[ae]?st/{print $0}'
$ echo "tast" | awk '/t[ae]?st/{print $0}'
$ echo "taest" | awk '/t[ae]?st/{print $0}'
$ echo "teest" | awk '/t[ae]?st/{print $0}'

If zero characters or one character from the character class appears, the pattern match passes.

But if both characters appear, or if one of the characters appears twice, the pattern match fails.

The Plus Sign

The plus sign indicates that the preceding character can appear one or more times, but must be present at least once.

$ echo "test" | awk '/te+st/{print $0}'
$ echo "teest" | awk '/te+st/{print $0}'
$ echo "tst" | awk '/te+st/{print $0}'

If the e character is not present, the pattern match fails. The plus sign also works with character classes, the same way as the asterisk and question mark.

$ echo "tst" | awk '/t[ae]+st/{print $0}'
$ echo "test" | awk '/t[ae]+st/{print $0}'
$ echo "teast" | awk '/t[ae]+st/{print $0}'
$ echo "teeast" | awk '/t[ae]+st/{print $0}'

The Pipe Symbol

The pipe symbol allows you to specify two or more patterns that the regex engine combines with a logical OR when examining the data stream. If any of the patterns match the text, the text passes. If none of the patterns match, the pattern will fail. Here is an example:

$ echo "This is a test" | awk '/test|exam/{print $0}'
$ echo "This is an exam" | awk '/test|exam/{print $0}'
$ echo "This is something else" | awk '/test|exam/{print $0}'

This example looks for the regular expression test or exam in the text. Keep in mind that you can’t place any spaces between the regular expressions and the pipe symbol, because a space would become part of the pattern.

Grouping Expressions

Regex patterns can also be grouped by using parentheses. When you group a regex pattern, the group is treated like a standard character. You can apply a special character to the group just as you would to a regular character.

$ echo "Like" | awk '/Like(Geeks)?/{print $0}'
$ echo "LikeGeeks" | awk '/Like(Geeks)?/{print $0}'
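Grouping combines naturally with the pipe symbol and the question mark. The sketch below, run on a made-up word list, keeps only cat or dog with an optional trailing s:

```shell
# ^( cat OR dog ) followed by an optional s, then end of line.
printf '%s\n' cat dogs bird cats catss | awk '/^(cat|dog)s?$/{print $0}'
```

This prints cat, dogs, and cats. The words bird and catss are rejected because the anchors require the whole line to fit the pattern.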

Practical Examples

We’ve seen some simple demonstrations of using regular expression patterns. It’s time to put them into action, just for practice.

Counting Directory Files

Let’s look at a bash script that counts the executable files that are present in the directories defined in your PATH environment variable. To do that, you need to parse out the PATH variable into separate directory names.

$ echo $PATH

To get a listing of directories that you can use in a script, you must replace each colon with a space.

$ echo $PATH | sed 's/:/ /g'

Now let’s iterate through each directory using a for loop like this:

mypath=$(echo $PATH | sed 's/:/ /g')
for directory in $mypath
do
    :   # placeholder body; the real work goes here
done

Great!!

Now we can use the ls command to list each file in each directory and save the count in a variable.

#!/bin/bash
# Count the entries in each directory listed in PATH.
mypath=$(echo $PATH | sed 's/:/ /g')
count=0
for directory in $mypath
do
    check=$(ls $directory)
    for item in $check
    do
        count=$((count + 1))
    done
    echo "$directory - $count"
    count=0
done

You may notice some directories don't exist, no problem there.
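If you would rather skip the missing directories than see error messages for them, a variant along these lines could be used. It is a sketch under stated assumptions: the function name count_dir_entries is hypothetical, directory names are assumed to contain no spaces, and ls -A also counts hidden entries, unlike the plain ls above.

```shell
# count_dir_entries: hypothetical helper; prints "dir - count" for every existing
# directory in a colon-separated list (defaults to $PATH).
count_dir_entries() {
    list=${1:-$PATH}
    # Replace each colon with a space, as in the script above, then loop over the words.
    for directory in $(echo "$list" | sed 's/:/ /g'); do
        [ -d "$directory" ] || continue          # skip directories that do not exist
        count=$(ls -A "$directory" | wc -l)      # one output line per entry
        echo "$directory - $((count))"           # arithmetic expansion trims padding from wc
    done
}

# Example: count_dir_entries "$PATH"
```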

Cool!! This is the power of regex. Those few lines of code count all files in all directories. Of course, there is a Linux command to do that very easily, but the point here is to show regex at work in something practical; with some creativity, you can come up with more useful ideas.

Validating e-mail Addresses

There are a ton of websites that use regex patterns to validate e-mail addresses, phone numbers, and similar input. An e-mail address has this form:

username@hostname.com

The username can use any alphanumeric characters combined with dot, dash, plus sign, and underscore. The hostname can use any alphanumeric characters combined with dot and underscore.

Let’s start building the regular expression pattern from the left side. We know that there can be multiple valid characters in the username. This should be very easy.

^([a-zA-Z0-9_\-\.\+]+)@

This grouping specifies the allowed characters in the username, and the plus sign indicates that at least one character must be present before the @ sign.

Then the hostname pattern should be like this:

([a-zA-Z0-9_\-\.]+)

There are special rules for the top-level domain. Top-level domains are only alphabetic characters, and they must be no fewer than two characters (used in country codes) and no more than five characters in length. The following is the regex pattern for the top-level domain:

\.([a-zA-Z]{2,5})$

Now we put them all together:

^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Let’s test that regex against an email:

$ echo "name@host.com" | awk '/^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$/{print $0}'
$ echo "name@host.com.us" | awk '/^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$/{print $0}'

Awesome! Works great!
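For use inside a script, the same pattern can be wrapped in a function that reports the result through its exit status. This is a sketch, not the original author's code: the name is_valid_email is hypothetical, and grep -E is swapped in for awk because its support for the {2,5} interval is more uniform across systems. Inside a bracket expression the backslashes are not needed as long as the dash comes last.

```shell
# is_valid_email: hypothetical helper; exit status 0 when the address fits the pattern.
is_valid_email() {
    printf '%s\n' "$1" |
        grep -Eq '^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9_.-]+\.[a-zA-Z]{2,5}$'
}

is_valid_email "name@host.com" && echo "name@host.com looks valid"
is_valid_email "name@host"     || echo "name@host is rejected"
```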

This is just the beginning of a regex world that never ends. I hope that after this post you understand these ASCII buggers and can use them more professionally.

In the grouping example, Like(Geeks)?, grouping the “Geeks” ending together with the question mark allows the pattern to match either the full name LikeGeeks or the word Like only.