Assignment

You have to read text files and extract strings in specific
formats, such as (390 P.2d 724, 726), (§ 20-2-308), and
(16-3-107(A-Z)), then export them to an Excel sheet in order.

f = open("WY 2011.txt", "r", encoding="utf-8")  # opens the file named "WY 2011.txt"

print(f.read())

For example, the following formats are extracted from many text files in one directory:

258 P.3d 704, 708  (a 1 to 3 digit number, then P.3d, then 1 to 4 digit numbers)
920 P.2d 632, 635  (a 1 to 3 digit number, then P.2d, then 1 to 4 digit numbers)
13 Wyo. 408, 80 P. 664  (a 1 to 3 digit number, then Wyo., a number, another number, P., and a final number)
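
The list above describes only the reporter citations. For the two statute-style formats named in the assignment, (§ 20-2-308) and (16-3-107(A-Z)), a pattern might look like the minimal sketch below. It assumes three hyphen-separated numbers, an optional leading § sign, and an optional single letter in parentheses standing in for "(A-Z)". The name find_statute_cites is hypothetical, and the unrestricted digit counts are a guess that should be tightened against the real files.

```python
import re

# Assumed shape: optional "§", three hyphen-separated numbers,
# optional one-letter subsection in parentheses, e.g. "16-3-107(a)".
STATUTE = re.compile(r'(?:§\s*)?\d+-\d+-\d+(?:\([A-Za-z]\))?')

def find_statute_cites(text):
    """Return statute-style citations in the order they appear."""
    # No capturing groups are used, so findall returns whole matches.
    return STATUTE.findall(text)

sample = "See § 20-2-308 and 16-3-107(a); compare 390 P.2d 724, 726."
print(find_statute_cites(sample))
```

Note that anything shaped like 2011-01-05 would also match this pattern, so dates in the source files may need to be filtered out.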

Solution

code.py

import os
import re
import csv

def get_matches(fname):
    # One regular expression per citation format.
    matches = [
        r'\d{1,3} P\.3d (?:\d{1,4}, )*\d{1,4}',
        r'\d{1,3} P\.2d (?:\d{1,4}, )*\d{1,4}',
        r'\d{1,3} Wyo\. \d+, \d+ P\. \d+',
    ]
    with open(fname, 'r', encoding='utf-8', errors='ignore') as fp:
        text = fp.read()
    # Collect every match; results are grouped by pattern.
    items = []
    for pattern in matches:
        found = re.findall(pattern, text)
        for item in found:
            items.append(item)
    return items

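path = '/users/documents/python'

# Sort the names so the output order is repeatable, and skip
# subdirectories, which open() cannot read.
files = [os.path.join(path, name) for name in sorted(os.listdir(path))]
files = [fname for fname in files if os.path.isfile(fname)]

all_matches = []
for fname in files:
    for item in get_matches(fname):
        all_matches.append(item)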

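# Use csv.writer so a citation containing commas stays in one cell
# when the file is opened in Excel.
with open('output.csv', 'w', newline='', encoding='utf-8') as fp:
    writer = csv.writer(fp)
    for item in all_matches:
        writer.writerow([item])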
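
The script collects matches one pattern at a time, so within each file the output is grouped by pattern rather than by position in the text. If "in order" in the assignment means order of appearance, one option is to join the patterns into a single alternation and scan once with re.finditer. This is a minimal sketch under that assumption, not the only way to do it. The name extract_in_order and the leading \b are additions that are not in the original solution.

```python
import re

# The three reporter formats joined into one pattern.
# The leading \b is an added assumption so that "1258 P.3d 1" is not
# partially matched as "258 P.3d 1".
CITATION = re.compile(
    r'\b\d{1,3} Wyo\. \d+, \d+ P\. \d+'
    r'|\b\d{1,3} P\.[23]d (?:\d{1,4}, )*\d{1,4}'
)

def extract_in_order(text):
    """Return all citations in the order they occur in the text."""
    # finditer scans left to right, so matches come out by position.
    return [m.group(0) for m in CITATION.finditer(text)]

sample = ("Under 13 Wyo. 408, 80 P. 664 and later 258 P.3d 704, 708, "
          "the rule in 920 P.2d 632, 635 applies.")
for cite in extract_in_order(sample):
    print(cite)
```

Replacing the body of get_matches with a call like this would keep the rest of the script unchanged while preserving each file's citation order.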