Regex
A regular expression (regex) is an advanced search pattern that can be used to match both specific and non-specific data, and it can also improve the quality of your code.
Please note that if something simpler than regex can do the job, use the simpler option. Using regex where it is not required can break things.
https://regexr.com and https://regex101.com are very helpful online tools to build regex queries.
To identify a pattern, regex uses the following pattern-forming elements (regex structures):
Character Classes: A list of characters that can appear in the pattern. Character classes are defined by square brackets around the list, for example [abc]
Meta Characters, Anchors and Escape Characters: They have a special meaning within regex. Many of them start with \ (for example \d or \.), while anchors such as ^ and $ do not
Occurrences: They tell how many times to match, with the help of wildcards such as *, + and ?
Quantifiers: Combining occurrences with the previous two regex structures gives quantifiers, for example [0-9]+ or \d{4}
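The structures listed above can be tried out with a minimal Python sketch. The sample strings and variable names are made up for illustration and are not part of the original page.

```python
import re

text = "cat bat rat 42 hat"

# Character class: one of c, b or r, followed by the literal "at".
char_class = re.findall(r"[cbr]at", text)

# Meta character \d (a digit) combined with the occurrence marker + (one or more).
digits = re.findall(r"\d+", text)

# Anchors: ^ pins the match to the start of the string, $ to the end.
first_word = re.findall(r"^\w+", text)
last_word = re.findall(r"\w+$", text)

# Escape character: \. matches a literal dot instead of "any character".
dots = re.findall(r"\.", "a.b.c")

# Quantifier with an explicit count: exactly two digits in a row.
pairs = re.findall(r"\d{2}", "1 22 333")

print(char_class, digits, first_word, last_word, dots, pairs)
```

Note that "hat" is not matched by [cbr]at, because h is not in the character class.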
Note that a capture group is written with (). For example, to capture whatever lies between two words in WORD1 aasdkjaslkj WORD2 asdka WORD3, say between WORD1 and WORD2, the regex can use the capture group (.*?). Here . matches any character except a newline, * allows it any number of times including zero, and the trailing ? makes the match lazy, so it stops at the first point where the rest of the pattern can match.
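The difference between the lazy group (.*?) and the greedy group (.*) can be seen in a short Python sketch. The surrounding patterns (a leading "WORD1 " and a trailing " WORD") are assumptions chosen to make the contrast visible.

```python
import re

text = "WORD1 aasdkjaslkj WORD2 asdka WORD3"

# Lazy: (.*?) captures as little as possible, so it stops before the first " WORD".
lazy = re.search(r"WORD1 (.*?) WORD", text).group(1)

# Greedy: (.*) captures as much as possible, so it runs up to the last " WORD".
greedy = re.search(r"WORD1 (.*) WORD", text).group(1)

print(lazy)    # aasdkjaslkj
print(greedy)  # aasdkjaslkj WORD2 asdka
```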
Examples:
To search for the Gmail ID of, let's say, raghav, when there are many raghavs (like raghav1@gmail.co, raghav5@gmail.com) and there may be additional emails from Yahoo, Outlook etc., the regex query can be as follows:
Here, raghav is matched literally. It may or may not be followed by digits, i.e. there can be zero or more digits, hence *. Next comes the @ of the email address, then the domain, which can be any word of more than 2 characters followed by a dot, and lastly a word of one or more characters for the top-level domain.
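The description above can be sketched in Python as follows. The exact pattern is a reconstruction from the prose, not necessarily the original query: "more than 2 characters" is read here as \w{3,}, and the sample addresses beyond the two in the text are invented for illustration.

```python
import re

text = (
    "raghav1@gmail.co, raghav5@gmail.com, raghav@yahoo.com, "
    "rahul@gmail.com, raghav77@outlook.com"
)

# raghav, zero or more digits, @, a domain word of 3+ characters, a dot, a TLD word.
any_domain = re.findall(r"raghav\d*@\w{3,}\.\w+", text)

# Variant restricted to Gmail: the domain word is replaced by the literal "gmail".
gmail_only = re.findall(r"raghav\d*@gmail\.\w+", text)

print(any_domain)
print(gmail_only)
```

The first pattern also returns the Yahoo and Outlook addresses, so the Gmail-only variant is the one to use when other providers must be excluded.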
To search for an Aadhaar card number, which is a set of 12 digits with spaces in between, the regex query can be:
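One possible query, sketched in Python, assumes the 12 digits are written as three groups of four separated by single spaces. The number in the sample text is a dummy value.

```python
import re

text = "Aadhaar: 1234 5678 9012, phone: 98765 43210"

# Three groups of four digits separated by a space; \b keeps the match
# from starting or ending in the middle of a longer run of digits.
aadhaar_pattern = r"\b\d{4} \d{4} \d{4}\b"

aadhaar_numbers = re.findall(aadhaar_pattern, text)
print(aadhaar_numbers)  # ['1234 5678 9012']
```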
To search for a credit card number separated by either spaces or dashes, the regex query can be:
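A minimal Python sketch of such a query is shown here, assuming a 16-digit number written as four groups of four. The card numbers are well-known dummy test values, and the pattern is one reasonable choice rather than the only one.

```python
import re

text = "cards: 4111 1111 1111 1111 and 5500-0000-0000-0004, not 4111111111111111"

# Three times "four digits plus a space or dash", then a final group of four digits.
# (?:...) groups without capturing, so findall returns the whole match.
card_pattern = r"\b(?:\d{4}[ -]){3}\d{4}\b"

cards = re.findall(card_pattern, text)
print(cards)  # ['4111 1111 1111 1111', '5500-0000-0000-0004']
```

The character class [ -] accepts a space or a dash at each position independently, so a number that mixes both separators would also match.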
To extract IP addresses from an IIS source file, one can use regex101.com to form the query, grep -Po to extract the matches and awk to print only the required information, as shown below:
Note: awk uses whitespace (spaces and tabs) as the field separator by default
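The same idea can be sketched in Python instead of grep and awk. The log lines are hypothetical, the field order (client IP as the ninth field) is an assumption about the log layout, and the pattern only checks the shape of an IPv4 address, not that each octet is at most 255.

```python
import re

# Hypothetical IIS-style log lines; the field order is an assumption.
log_lines = [
    "2023-01-15 10:22:31 10.0.0.5 GET /index.html - 80 - 192.168.1.20 Mozilla/5.0 - 200 0 0 15",
    "2023-01-15 10:22:35 10.0.0.5 POST /login - 80 - 203.0.113.7 curl/8.0 - 302 0 0 31",
]

# Four groups of 1-3 digits separated by literal dots.
ip_pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Comparable to grep -o: every IP-looking token on each line.
all_ips = [ip_pattern.findall(line) for line in log_lines]

# Comparable to awk '{print $9}': split on whitespace and keep the ninth field.
client_ips = [line.split()[8] for line in log_lines]

print(all_ips)
print(client_ips)
```

Like awk, str.split() with no arguments splits on runs of whitespace, which is why the ninth field is at index 8.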