In the previous lessons, we've learnt about character classes, and negated character classes. In this lesson, we'll look at Quantifiers.
Let's say you wanted to create a pattern that matches "b", followed by 5 "a"s, that is "aaaaa", you could write the pattern like this:
But what if you want to match 16 "a"s, now that could make your regex construct longer. /baaaaaaaaaaaaaaaa/. But you can make your repetitive patterns more concise, using Quantifiers.
What are Quantifiers?
Quantifiers specify the quantity of a set of characters. Quantity here means the number of occurrences. With quantifiers, you can create repeating characters in your patterns.
Quantifiers are created with curly braces. The syntax is:
X can be any character. Then you specify the minimum and maximum occurrences separated by comma. The maximum is optional.
For example, in our baaaaa example, you can create a pattern like this:
This means, "b" followed by five "a" characters (5 occurrences of "a").
Let's say we apply this pattern to this string: “The baby says babaaaaa baaaaba babaabaaa and baaaaaa”:
You can see the substrings with a b followed by five as matching the pattern. You see that the other substrings with b followed by as do not match, because the as are not up to 5.
To specify a range of occurrences, you can add the maximum parameter along with the minimum parameter like this:
Here, we have specified a pattern that matches "b" followed by either 2 as, 3 as, 4 as, or 5 as. 2 is the minimum, 5 is the maximum.
Applying this to our string again:
You see that:
- b, followed by 5as is matched
- b, followed by 4as is matched
- b, followed by 2as is matched
- and b, followed by 3as is matched
b, followed by 1a is not matched, because our quantifier range specifies a minimum of 2as.
You can also have the comma, but leave the maximum empty like this:
This means the pattern will match "b" followed by 2 or more as. Applying this on our string again:
You see that we have the same result from before, except that the last part of our string now matches b, followed by 6as also.
Quantifier with Character Set
A quantifier defines a repetition of the character preceding it in the pattern. The preceding character can be a literal character like "a" (as we saw above), or it can also be a Character Set. For example, a character set with a range of a to z, and a quantifier specifying three repetitions:
Starting from the character set, this specifies a range of a to z. So this can match any letter from a to z. Then by adding the quantifier it means, any letter from a to z (1), followed by a letter from a to z (2), and followed by a letter from a to z (3). That is, three times.
This quantifier example, can be written without quantifier like this:
Which means any three letters between a and z.
Let's change the pattern back and test this on “His name is Deee, and he was a bad dev”:
Remember Flags right? In this pattern, the global flag g is added so the pattern returns multiple matches. The case insensitive flag i is also added so that there's no strictness with the casing of characters.
As you can see from the results, this pattern matches “His”, “nam”, Dee, “and”, “was”, “bad” and “dev”. These are three letters which match our pattern. Our pattern says, any letter between a to z, for three times.
Note: that this does not mean that the particular character is repeated three times. For example, let's say the letter “g” from the a-z range matches, this pattern does not mean g, g, g. It means any letter from a to z (1), followed by any letter from a to z (2), and followed by any letter from a to z (3).
A character class can contain a mix of letters, numbers and even symbols. So we can use a quantifier with a mixed character class. For example:
Which means match any substrings with 9 or more letters or numbers.
We can use this pattern for example to verify if the password a user enters contains only letters and numbers and has minimum of 9 characters.
Let's say we use this pattern to validate a password like "jha8302kdlst":
As you can see, "jha8302kdlst" is matched because it has letters and numbers with 12 characters.
But a password like "$my_password" would fail: