Mastering Python Regular Expressions http://www.packtpub.com/ Authors : Félix López
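Chapter 4: Look Around
Look around is a more powerful kind of zero-width assertion: it checks that a position in the input satisfies a condition without consuming or matching any characters.
It only reports whether the sub-pattern matches (positive) or does not match (negative) at that position; no characters are consumed.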
Look ahead
Positive look ahead
import re
expr1 = r'fox'
expr2 = r'(?=fox)'
data = "The quick brown fox jumps over the lazy dog"
pattern = re.compile(expr1)
result = pattern.search(data)
print(result.start(), result.end())  # 16 19: "fox" is consumed, so the end position advances
pattern = re.compile(expr2)
result = pattern.search(data)
print(result.start(), result.end())  # 16 16: nothing is consumed, so the end position does not move
print(result)  # <re.Match object; span=(16, 16), match=''> => only a position is returned, no content
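Because the look ahead consumes nothing, the rest of the pattern continues from the very same position. The following is a minimal sketch of that behaviour; the combined pattern `(?=fox)\w+` is an illustration added here, not an example from the book.

```python
import re

data = "The quick brown fox jumps over the lazy dog"

# (?=fox) only asserts that "fox" starts here; \w+ then matches
# from the same position, because the assertion consumed nothing.
match = re.search(r'(?=fox)\w+', data)
print(match.group())  # fox
print(match.span())   # (16, 19)
```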
import re
expr1 = r'\w+,'
expr2 = r'\w+(?=,)'
expr3 = r'\w+(?=\,|\.)'
data = "They were three: Felix, Victor, and Carlos."
pattern = re.compile(expr1)
result1 = pattern.findall(data)
print(result1) # ['Felix,', 'Victor,']
pattern = re.compile(expr2)
result2 = pattern.findall(data)
print(result2) # ['Felix', 'Victor']
pattern = re.compile(expr3)
result2 = pattern.findall(data)
print(result2) # ['Felix', 'Victor', 'Carlos']
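Since both alternatives inside the look ahead are single characters, the same condition can presumably be written with a character class. The sketch below compares the two forms on the same sentence; the variable names are illustrative only.

```python
import re

data = "They were three: Felix, Victor, and Carlos."

# Alternation inside the look ahead, as in expr3 above.
with_alternation = re.findall(r'\w+(?=\,|\.)', data)

# Equivalent character class: a comma or a period must follow.
with_char_class = re.findall(r'\w+(?=[,.])', data)

print(with_char_class)                      # ['Felix', 'Victor', 'Carlos']
print(with_alternation == with_char_class)  # True
```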
Negative look ahead
import re
expr = r'John(?!\sSmith)'
data = "I would rather go out with John McLane than with John Smith or John Bon Jovi"
pattern = re.compile(expr)
result = pattern.finditer(data)
for i in result:
    print(i.start(), i.end())
'''
27 31 # the John in "John McLane"
63 67 # the John in "John Bon Jovi"
'''
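A negative look ahead can be followed by ordinary pattern elements that do consume text. As a small sketch (the trailing `\s(\w+)` group is an addition for illustration), this captures the word after every "John" that is not followed by "Smith":

```python
import re

data = "I would rather go out with John McLane than with John Smith or John Bon Jovi"

# The look ahead rejects "John Smith"; the group then captures
# the word that actually follows the accepted "John".
following_words = re.findall(r'John(?!\sSmith)\s(\w+)', data)
print(following_words)  # ['McLane', 'Bon']
```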
Look around and substitutions
import re
expr = r'\d{1,3}'
expr2 = r'\d{1,3}(?=(\d{3})+(?!\d))'
data = "The number is: 1234567890"
pattern = re.compile(expr)
result = pattern.findall(data)
print(result) # ['123', '456', '789', '0']
pattern = re.compile(expr2)
results = pattern.finditer(data)
for result in results:
    print(result.group())
'''
1
234
567
'''
pattern = re.compile(expr2)
result = pattern.sub(r'\g<0>,', "1234567890")
print(result) # 1,234,567,890
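The substitution above can be wrapped in a small helper. This is only a sketch: the function name `add_thousands_separator` is hypothetical, and the pattern is the same one used above.

```python
import re

# Same pattern as expr2: 1 to 3 digits that are followed by one or
# more complete groups of 3 digits and then by no further digit.
GROUPING = re.compile(r'\d{1,3}(?=(\d{3})+(?!\d))')

def add_thousands_separator(text):
    """Insert a comma after each matched digit group (hypothetical helper)."""
    return GROUPING.sub(r'\g<0>,', text)

print(add_thousands_separator("The number is: 1234567890"))  # The number is: 1,234,567,890
print(add_thousands_separator("1000"))                       # 1,000
print(add_thousands_separator("123"))                        # 123
```

The helper treats every run of digits the same way, so digits after a decimal point would be grouped as well; real number formatting would need extra handling.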
Look behind
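In the standard re module, look behind only supports fixed-width patterns. For variable-width patterns (variable-length quantifiers, back references), use the third-party regex module: https://pypi.python.org/pypi/regex. Alternation is allowed inside a look behind only when every alternative has the same length.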
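The fixed-width restriction can be observed at compile time, because the standard re module raises `re.error` for a variable-width look behind. The helper `compiles` below is a hypothetical name used only for this sketch.

```python
import re

def compiles(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r'(?<=John\s)McLane'))   # True: fixed width
print(compiles(r'(?<=John\s+)McLane'))  # False: + makes the width variable
print(compiles(r'(?<=ab|cd)x'))         # True: both alternatives have length 2
print(compiles(r'(?<=a|bc)x'))          # False: alternatives differ in length
```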
Positive look behind
import re
expr = r'(?<=John\s)McLane'
data = "I would rather go out with John McLane than with John Smith or John Bon Jovi"
pattern = re.compile(expr)
result = pattern.finditer(data)
for i in result:
    print(i.start(), i.end())  # 32 38
    print(i)  # <re.Match object; span=(32, 38), match='McLane'>
Negative look behind
import re
expr = r'(?<!John\s)Doe'
data = "John Doe, Calvin Doe, Hobbes Doe"
pattern = re.compile(expr)
result = pattern.finditer(data)
for i in result:
    print(i.start(), i.end())
    print(i)
'''
17 20
<re.Match object; span=(17, 20), match='Doe'>
29 32
<re.Match object; span=(29, 32), match='Doe'>
'''
# Twitter scanning: extract @mentions
import re
pattern = re.compile(r'(?<=\B@)[\w_]+')
result = pattern.findall("Know your Big Data = 5 for $50 on eBooks and 40% off all eBooks until Friday #bigdata #hadoop @HadoopNews packtpub.com/ bigdataoffers")
print(result) # ['HadoopNews']
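The same look behind idea should carry over to hashtags, and the `\B` in front of `@` is what keeps e-mail addresses from being picked up as mentions. The sketch below assumes a shortened tweet and a made-up address (`someone@example.com`); neither comes from the original example.

```python
import re

tweet = "40% off all eBooks until Friday #bigdata #hadoop @HadoopNews"

# Hashtags: a '#' that is not glued to a preceding word character.
hashtags = re.findall(r'(?<=\B#)\w+', tweet)
print(hashtags)  # ['bigdata', 'hadoop']

# The '@' in an e-mail address follows a word character, so there is
# a word boundary before it and \B fails; only the real mention matches.
text = "mail someone@example.com or ping @HadoopNews"
mentions = re.findall(r'(?<=\B@)\w+', text)
print(mentions)  # ['HadoopNews']
```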
Look around and groups
import re
pattern = re.compile(r'\w+\s[\d-]+\s[\d:,]+\s(.*(?<!authentication\s)failed)')
a=pattern.findall("INFO 2013-09-17 12:13:44,487 authentication failed")
print(a) # []
b=pattern.findall("INFO 2013-09-17 12:13:44,487 something else failed")
print(b) # ['something else failed']
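Applied to several log lines at once, the pattern above acts as a filter: the look behind decides whether a line matches, while only the capturing group ends up in the result. A minimal sketch, assuming the log lines are available as a list named `log_lines`:

```python
import re

pattern = re.compile(r'\w+\s[\d-]+\s[\d:,]+\s(.*(?<!authentication\s)failed)')

log_lines = [
    "INFO 2013-09-17 12:13:44,487 authentication failed",
    "INFO 2013-09-17 12:13:44,487 something else failed",
]

messages = []
for line in log_lines:
    match = pattern.search(line)
    if match:
        # group(1) holds the message; the look behind adds nothing to it.
        messages.append(match.group(1))

print(messages)  # ['something else failed']
```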