# AWK

awk is more of a scripting language than a single command. It is heavily used for text processing and data manipulation, and for producing formatted reports, typically as a data extraction and reporting tool. It is a standard feature of most Linux operating systems and is useful when handling text files that are formatted in a predictable way. awk parses and manipulates tabular data line by line, iterating through the entire file. By default, awk uses whitespace (spaces and tabs) as the delimiter that separates fields.
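As a minimal sketch of the default field splitting, the following pipes one line into awk and prints its second field. The sample line and the `line` variable are made up for illustration.

```shell
# Hypothetical sample line: fields separated by several spaces and a tab.
line=$(printf 'alpha   beta\tgamma')

# With the default separator, a run of spaces and tabs counts as one delimiter.
printf '%s\n' "$line" | awk '{print $2}'   # prints: beta
```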

The syntax is as follows:

```awk
awk GNU_OR_POSIX_OPTIONS 'pattern_selection_criteria {action}' input-file
```
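To make the pattern and action parts of that syntax concrete, here is a small sketch. The sample data and the `data` variable are assumptions for illustration, and the input comes from a pipe rather than an input file.

```shell
# Hypothetical sample data: a name and an age, separated by spaces.
data='alice 30
bob 25
carol 41'

# The pattern `$2 > 28` selects records; the action prints the first field.
printf '%s\n' "$data" | awk '$2 > 28 {print $1}'
# prints:
# alice
# carol

# With a pattern and no action, awk prints each matching line in full.
printf '%s\n' "$data" | awk '/bob/'
# prints: bob 25
```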

<figure><img src="https://275986271-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FPaRFhO7J6sRJrjn8Haee%2Fuploads%2FXbUgjB1cKu9aBEOAciEq%2Fimage.png?alt=media&#x26;token=a9e67a0e-237e-496b-a8d5-458b158d94fe" alt=""><figcaption></figcaption></figure>

As shown above, the `-F` option tells awk which delimiter separates the fields.
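A short sketch of `-F` on colon-delimited text follows. The records are made-up sample lines in the style of `/etc/passwd`, and the `records` variable is only there for illustration.

```shell
# Hypothetical colon-delimited records in the style of /etc/passwd.
records='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# -F: makes ":" the field separator, so $1 is the user name and $7 the shell.
printf '%s\n' "$records" | awk -F: '{print $1, $7}'
# prints:
# root /bin/bash
# daemon /usr/sbin/nologin
```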

awk provides field references and built-in variables such as:

* **`$0`**. The whole current line (record).
* **`$1`**. The first field, or first column, for example `awk '{print $1}'`.
* **`$2`**. The second field.
* **`NR`**. The number of input records (usually lines) read so far. awk runs the pattern/action statements once for each record in a file.
* **`FS`**. The input field separator. Setting `FS` has the same effect as the `-F` command-line option.
* **`RS`**. The input record separator. Because an input record is a line by default, the default record separator is a newline.
* **`OFS`**. The output field separator, which separates the fields when awk prints them. The default is a single space. When `print` is given several arguments separated by commas, it writes the value of `OFS` between them.
* **`ORS`**. The output record separator, which separates the output lines when awk prints them. The default is a newline. `print` appends the value of `ORS` to whatever it is given to print.
* **`FNR`**. The record number within the current file. With a single input file, `NR` equals `FNR`. With several files, `FNR` resets to 1 at the start of each file while `NR` keeps increasing.
* **`NF`**. The number of fields in the current record.
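The variables above can be tried out with a sketch like the following. The sample data, the `csv` and `tmpdir` variables, and the file names `one.txt` and `two.txt` are all assumptions made for the demonstration.

```shell
# Hypothetical comma-separated sample data.
csv='id,name,city
1,alice,paris
2,bob,rome'

# FS and OFS are set in a BEGIN block, which runs before any input is read.
# NR is the record number, NF the field count, and $NF the last field.
printf '%s\n' "$csv" | awk 'BEGIN {FS = ","; OFS = " | "} {print NR, NF, $NF}'
# prints:
# 1 | 3 | city
# 2 | 3 | paris
# 3 | 3 | rome

# ORS replaces the newline that print normally appends to each record.
printf '%s\n' "$csv" | awk -F, 'BEGIN {ORS = ";"} {print $2}'
# prints: name;alice;bob;

# RS changes what counts as a record: here ";" instead of a newline.
printf 'a;b;c' | awk 'BEGIN {RS = ";"} {print NR, $0}'
# prints:
# 1 a
# 2 b
# 3 c

# NR keeps counting across files, while FNR restarts at 1 for each file.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\n' > "$tmpdir/two.txt"
awk '{print NR, FNR, $0}' "$tmpdir/one.txt" "$tmpdir/two.txt"
# prints:
# 1 1 a
# 2 2 b
# 3 1 c
# 4 2 d
rm -r "$tmpdir"
```

Setting the separators in a `BEGIN` block keeps them in effect from the first record onward. Assigning `FS` inside the main action would only apply from the following record.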

<figure><img src="https://275986271-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FPaRFhO7J6sRJrjn8Haee%2Fuploads%2FA7PzjQjCmSZ3B1g1usTk%2Fimage.png?alt=media&#x26;token=96db7ead-8cbe-4f48-a815-0697fcc29560" alt=""><figcaption></figcaption></figure>

<figure><img src="https://275986271-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FPaRFhO7J6sRJrjn8Haee%2Fuploads%2FbsquZx98gzZcClxYaOlW%2Fimage.png?alt=media&#x26;token=2177c920-8d7b-43b7-b32f-0b1de1c67130" alt=""><figcaption></figcaption></figure>
